Mike Galbraith [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: cleanup, remove the TASK_NONINTERACTIVE flag
Here's another piece of low hanging obsolete fruit.
Remove obsolete TASK_NONINTERACTIVE.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: cleanup, make dequeue_entity() and update_stats_wait_end() similar
make dequeue_entity() / enqueue_entity() and update_stats_dequeue() /
update_stats_enqueue() look similar, structure-wise.
zero effect, functionality-wise:
text data bss dec hex filename
34550 3026 100 37676 932c sched.o.before
34550 3026 100 37676 932c sched.o.after
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: cleanup, remove calc_weighted()
remove obsolete code -- calc_weighted()
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: tidy up SCHED_RR
- make timeslices of SCHED_RR tasks constant and not
dependent on task's static_prio [1] ;
- remove obsolete code (timeslice related bits);
- make sched_rr_get_interval() return something more
meaningful [2] for SCHED_OTHER tasks.
[1] according to the following link, it's not compliant with SUSv3
(not sure though, what is the reference for us :-)
http://lkml.org/lkml/2007/3/7/656
[2] the interval is dynamic and can be depicted as follows "should a
task be one of the runnable tasks at this particular moment, it would
expect to run for this interval of time before being re-scheduled by the
scheduler tick".
(i.e. it's more precise if a task is runnable at the moment)
yeah, this seems to require task_rq_lock/unlock() but this is not a hot
path.
results:
(SCHED_FIFO)
dimm@earth:~/storage/prog$ sudo chrt -f 10 ./rr_interval
time_slice: 0 : 0
(SCHED_RR)
dimm@earth:~/storage/prog$ sudo chrt 10 ./rr_interval
time_slice: 0 : 99984800
(SCHED_NORMAL)
dimm@earth:~/storage/prog$ ./rr_interval
time_slice: 0 : 19996960
(SCHED_NORMAL + a cpu_hog of similar 'weight' on the same CPU --- so should be a half of the previous result)
dimm@earth:~/storage/prog$ taskset 1 ./rr_interval
time_slice: 0 : 9998480
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
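The source of the rr_interval test program used for the results above is not part of this log. A minimal sketch of what such a program might do, assuming it simply wraps the POSIX sched_rr_get_interval() call and prints seconds and nanoseconds in the "time_slice: <sec> : <nsec>" form (the helper names query_rr_interval and format_time_slice are hypothetical):

```c
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: ask the kernel for the round-robin interval of
 * 'pid' (0 means the calling process). Returns 0 on success, -1 on error. */
static int query_rr_interval(pid_t pid, struct timespec *ts)
{
    return sched_rr_get_interval(pid, ts);
}

/* Hypothetical helper: render the interval in the same shape as the
 * results quoted in the commit message, "time_slice: <sec> : <nsec>". */
static int format_time_slice(char *buf, size_t len, const struct timespec *ts)
{
    return snprintf(buf, len, "time_slice: %ld : %ld",
                    (long)ts->tv_sec, (long)ts->tv_nsec);
}
```

Run under chrt or taskset as in the results above, a program built from these helpers would report the interval for whichever policy it was started with.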
Alexey Dobriyan [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: uninline scheduler
* save ~300 bytes
* activate_idle_task() was moved to avoid a warning
bloat-o-meter output:
add/remove: 6/0 grow/shrink: 0/16 up/down: 438/-733 (-295) <===
function old new delta
__enqueue_entity - 165 +165
finish_task_switch - 110 +110
update_curr_rt - 79 +79
__load_balance_iterator - 32 +32
__task_rq_unlock - 28 +28
find_process_by_pid - 24 +24
do_sched_setscheduler 133 123 -10
sys_sched_rr_get_interval 176 165 -11
sys_sched_getparam 156 145 -11
normalize_rt_tasks 482 470 -12
sched_getaffinity 112 99 -13
sys_sched_getscheduler 86 72 -14
sched_setaffinity 226 212 -14
sched_setscheduler 666 642 -24
load_balance_start_fair 33 9 -24
load_balance_next_fair 33 9 -24
dequeue_task_rt 133 67 -66
put_prev_task_rt 97 28 -69
schedule_tail 133 50 -83
schedule 682 594 -88
enqueue_entity 499 366 -133
task_new_fair 317 180 -137
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: tweak wakeup granularity
tweak wakeup granularity.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: optimize schedule() a bit on SMP
optimize schedule() a bit on SMP, by moving the rq-clock update
outside the rq lock.
code size is the same:
text data bss dec hex filename
25725 2666 96 28487 6f47 sched.o.before
25725 2666 96 28487 6f47 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: fix __pick_next_entity()
The thing is that __pick_next_entity() must never be called when
first_fair(cfs_rq) == NULL. It wouldn't be a problem, should 'run_node'
be the very first field of 'struct sched_entity' (and it's the second).
The 'nr_running != 0' check is _not_ enough, due to the fact that
'current' is not within the tree. Generic paths are ok (e.g. schedule()
as put_prev_task() is called previously)... I'm more worried about e.g.
migration_call() -> CPU_DEAD_FROZEN -> migrate_dead_tasks()... if
'current' == rq->idle, no problems.. if it's one of the SCHED_NORMAL
tasks (or imagine, some other use-cases in the future -- i.e. we should
not make outer world dependent on internal details of sched_fair class)
-- it may be "Houston, we've got a problem" case.
it's +16 bytes to the ".text". Another variant is to make 'run_node' the
first data member of 'struct sched_entity' but an additional check
(se != NULL) is still needed in pick_next_entity().
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
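Why the position of 'run_node' matters in the __pick_next_entity() fix above: converting a node pointer back to its containing entity subtracts the member offset, so a NULL node only maps to a NULL entity when that offset is zero. A minimal userspace sketch, with hypothetical stand-in types rather than the kernel's rb_node and sched_entity:

```c
#include <stddef.h>

/* Stand-in for the kernel's rb_node (hypothetical, for illustration). */
struct node_sketch {
    struct node_sketch *left, *right;
};

/* Stand-in for sched_entity: 'run_node' is the second field, so its
 * offset inside the struct is non-zero. */
struct entity_sketch {
    unsigned long load;
    struct node_sketch run_node;
};

/* Same idea as rb_entry()/container_of(): step back from the member
 * to the start of the containing structure. */
#define entity_of(ptr) \
    ((struct entity_sketch *)((char *)(ptr) - offsetof(struct entity_sketch, run_node)))

/* Check for an empty tree before converting; without this check a
 * NULL leftmost node would be turned into a bogus non-NULL entity. */
static struct entity_sketch *pick_next_sketch(struct node_sketch *leftmost)
{
    if (!leftmost)
        return NULL;
    return entity_of(leftmost);
}
```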
Ingo Molnar [Mon, 15 Oct 2007 15:00:13 +0000 (17:00 +0200)]
sched: vslice fixups for non-0 nice levels
Make vslice accurate wrt nice levels, and add some comments
while we're at it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: whitespace cleanups
more whitespace cleanups. No code changed:
text data bss dec hex filename
26553 2790 288 29631 73bf sched.o.before
26553 2790 288 29631 73bf sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: mark scheduling classes as const
mark scheduling classes as const. This speeds up the code
a bit and shrinks it:
text data bss dec hex filename
40027 4018 292 44337 ad31 sched.o.before
40190 3842 292 44324 ad24 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: group scheduler, fix latency
There is a possibility that, because a task of a group moves from one
cpu to another, it may gain more cpu time than desired. See
http://marc.info/?l=linux-kernel&m=119073197730334 for details.
This is an attempt to fix that problem. Basically it simulates dequeue
of higher level entities as if they are going to sleep. Similarly it
simulates wakeup of higher level entities as if they are waking up from
sleep.
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: group scheduler, fix bloat
Recent fix to check_preempt_wakeup() to check for preemption at higher
levels caused a size bloat for !CONFIG_FAIR_GROUP_SCHED.
Fix the problem.
42277 10598 320 53195 cfcb kernel/sched.o-before_this_patch
42216 10598 320 53134 cf8e kernel/sched.o-after_this_patch
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: group scheduler, fix coding style issues
Fix coding style issues reported by Randy Dunlap and others
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: cleanup, remove stale comment
cleanup, remove stale comment.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: speed up and simplify vslice calculations
speed up and simplify vslice calculations.
[ From: Mike Galbraith <efault@gmx.de>: build fix ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: clean up min_vruntime use
clean up min_vruntime use.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: group scheduler SMP migration fix
group scheduler SMP migration fix: use task_cfs_rq(p) to get
to the relevant fair-scheduling runqueue of a task, rq->cfs
is not the right one.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: clean up schedstats, cnt -> count
rename all 'cnt' fields and variables to the less yucky 'count' name.
yuckage noticed by Andrew Morton.
no change in code, other than the /proc/sched_debug bkl_count string got
a bit larger:
text data bss dec hex filename
38236 3506 24 41766 a326 sched.o.before
38240 3506 24 41770 a32a sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: yield fix
fix yield bugs due to the current-not-in-rbtree changes: the task is
not in the rbtree so rbtree-removal is a no-no.
[ From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>: build fix. ]
also, nice code size reduction:
kernel/sched.o:
text data bss dec hex filename
38323 3506 24 41853 a37d sched.o.before
38236 3506 24 41766 a326 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:12 +0000 (17:00 +0200)]
sched: group scheduler wakeup latency fix
group scheduler wakeup latency fix: when checking for preemption
we must check cross-group too, not just intra-group.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: remove set_leftmost()
Lee Schermerhorn noticed that set_leftmost() contains dead code,
remove this.
Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Hiroshi Shimamoto [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: clean up sched_fork()
Adjusting sched_class is the missing part of the already existing "do
not leak PI boosting priority to the child" logic in sched_fork(). This
patch moves the sched_class adjustment from wake_up_new_task() to
sched_fork().
this also shrinks the code a bit:
text data bss dec hex filename
40111 4018 292 44421 ad85 sched.o.before
40102 4018 292 44412 ad7c sched.o.after
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: max_vruntime() simplification
max_vruntime() simplification.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: fix sched_fork()
fix sched_fork(): large latencies at new task creation time because
the ->vruntime was not fixed up cross-CPU, if the parent got migrated
after the child's CPU got set up.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: fix sign check error in place_entity()
fix sign check error in place_entity() - we'd get excessive
latencies due to negatives being converted to large u64's.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
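The kind of sign-check error described above can be reproduced in a few lines. This is a generic sketch of the pitfall, not the place_entity() code itself, and the function names are hypothetical: a difference of two u64 values is itself unsigned, so a "negative" result shows up as a very large number unless it is reinterpreted as signed first.

```c
#include <stdint.h>

/* Buggy pattern: the difference stays unsigned, so when a < b the
 * result wraps around to a huge value instead of going negative. */
static uint64_t diff_unsigned(uint64_t a, uint64_t b)
{
    return a - b;
}

/* Fixed pattern: reinterpret the difference as signed (two's
 * complement, as on all Linux targets) and clamp negatives to zero. */
static uint64_t diff_clamped(uint64_t a, uint64_t b)
{
    int64_t delta = (int64_t)(a - b);

    if (delta < 0)
        delta = 0;
    return (uint64_t)delta;
}
```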
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: undo some of the recent changes
undo some of the recent changes that are not needed after all,
such as last_min_vruntime.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: remove last_min_vruntime effect
remove last_min_vruntime use - prepare to remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: remove condition from set_task_cpu()
remove condition from set_task_cpu(). Now that ->vruntime
is not global anymore, it should (and does) work fine without
it too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:11 +0000 (17:00 +0200)]
sched: entity_key() fix
entity_key() fix - we'd occasionally end up with a 0 vruntime
in the !initial case.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched debug: check spread
debug feature: check how well we schedule within a reasonable
vruntime 'spread' range. (note that CPU overload can increase
the spread, so this is not a hard condition, but normal loads
should be within the spread.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched debug: more width for parameter printouts
more width for parameter printouts in /proc/sched_debug.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched: add vslice
add vslice: the load-dependent "virtual slice" a task should
run ideally, so that the observed latency stays within the
sched_latency window.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched debug: print settings
print the current value of all tunables in /proc/sched_debug output.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched: remove unneeded tunables
remove unneeded tunables.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
S.Caglar Onur [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched debug: BKL usage statistics, fix
build fix for the SCHED_DEBUG && !SCHEDSTATS case.
Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:10 +0000 (17:00 +0200)]
sched debug: BKL usage statistics
add per task and per rq BKL usage statistics.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: enable CONFIG_FAIR_GROUP_SCHED=y by default
enable CONFIG_FAIR_GROUP_SCHED=y by default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: fair-group sched, cleanups
fair-group sched, cleanups.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: add fair-user scheduler
Enable user-id based fair group scheduling. This is useful for anyone
who wants to test the group scheduler w/o having to enable
CONFIG_CGROUPS.
A separate scheduling group (i.e. struct task_grp) is automatically created for
every new user added to the system. Upon uid change for a task, it is made to
move to the corresponding scheduling group.
A /proc tunable (/proc/root_user_share) is also provided to tune root
user's quota of cpu bandwidth.
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: clean up code under CONFIG_FAIR_GROUP_SCHED
With the view of supporting user-id based fair scheduling (and not just
container-based fair scheduling), this patch renames several functions
and makes them independent of whether they are being used for container
or user-id based fair scheduling.
Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating
undersized arrays for tg->cfs_rq[] and tg->se[]).
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: print &rq->cfs stats
- Print &rq->cfs statistics as well (useful for group scheduling)
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:09 +0000 (17:00 +0200)]
sched: print nr_running and load in /proc/sched_debug
- print nr_running and load information for cfs_rq in /proc/sched_debug
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: fix minor bug in yield
- fix a minor bug in yield (seen for CONFIG_FAIR_GROUP_SCHED),
group scheduling would skew when yield was called.
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: revert recent removal of set_curr_task()
Revert removal of set_curr_task.
Use put_prev_task/set_curr_task when changing groups/policies
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Ingo Molnar [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: kernel/sched_fair.c whitespace cleanups
some trivial whitespace cleanups.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Mike Galbraith [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: fix formatting of /proc/sched_debug
fix formatting of /proc/sched_debug
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: enhance debug output
enhance debug output by changing
12345678 nsecs to 12.345678 output,
this is more human-readable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
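The conversion described above (12345678 nsecs printed as 12.345678) amounts to splitting the value at 10^6. A minimal userspace sketch, assuming an unsigned input and a hypothetical helper name; the kernel's own printing code is not shown in this log:

```c
#include <stdio.h>

/* Print a nanosecond count as "<whole>.<6-digit fraction>", where the
 * whole part is nsec / 10^6 and the fraction is the zero-padded
 * remainder. Returns the number of characters written, as snprintf(). */
static int format_nsec_sketch(char *buf, size_t len, unsigned long long nsec)
{
    return snprintf(buf, len, "%llu.%06llu",
                    nsec / 1000000ULL, nsec % 1000000ULL);
}
```

The zero padding matters: without "%06", a value such as 5 nsecs would print as 0.5 rather than 0.000005.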
Ingo Molnar [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: prettify /proc/sched_debug output
print the correct amount of dashes in /proc/sched_debug.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: rework enqueue/dequeue_entity() to get rid of set_curr_task()
rework enqueue/dequeue_entity() to get rid of
sched_class::set_curr_task(). This simplifies sched_setscheduler(),
rt_mutex_setprio() and sched_move_task().
text data bss dec hex filename
24330 2734 20 27084 69cc sched.o.before
24233 2730 20 26983 6967 sched.o.after
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: simplify sched_class::yield_task()
the 'p' (task_struct) parameter in the sched_class :: yield_task() is
redundant as the caller is always the 'current'. Get rid of it.
text data bss dec hex filename
24341 2734 20 27095 69d7 sched.o.before
24330 2734 20 27084 69cc sched.o.after
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: optimize task_new_fair()
due to the fact that we no longer keep the 'current' within the tree,
dequeue/enqueue_entity() is useless for the 'current' in
task_new_fair(). We are about to reschedule and
sched_class->put_prev_task() will put the 'current' back into the tree,
based on its new key.
text data bss dec hex filename
24388 2734 20 27142 6a06 sched.o.before
24341 2734 20 27095 69d7 sched.o.after
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:08 +0000 (17:00 +0200)]
sched: fix delay accounting performance regression
fix delay accounting performance regression - those sched_clock()
calls are not needed.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: do not keep current in the tree and get rid of sched_entity::fair_key
Get rid of 'sched_entity::fair_key'.
As a side effect, 'current' is not kept within the tree for
SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code
(e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes
them (e.g. a single update_curr() now vs. dequeue/enqueue() before in
entity_tick()).
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: add set_curr_task() calls
p->sched_class->set_curr_task() has to be called before
activate_task()/enqueue_task() in rt_mutex_setprio(),
sched_setscheduler() and sched_move_task() in order to set up
'cfs_rq->curr'. The logic of enqueueing depends on whether a task to be
inserted is 'current' or not.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: sched_setscheduler() fix
Fix a problem in the 'sched-group' patch for !CONFIG_FAIR_GROUP_SCHED.
description:
sched_setscheduler()
{
	...
	if (task_running()) p->sched_class->put_prev_entity();
	[ this one sets cfs_rq->curr to NULL ]
	...
	if (task_running()) p->sched_class->set_curr_task();
	[ and this one is a _NOP_ (empty) for !CONFIG_FAIR_GROUP_SCHED ]
	...
}
As a result, the task continues to run with cfs_rq->curr == NULL... no
crashes (due to checks for !NULL in place) but e.g. update_curr()
effectively becomes a NOP... i.e. runtime statistics for this task are
not accounted until it's rescheduled anew.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Srivatsa Vaddagiri [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: group-scheduler core
Add interface to control cpu bandwidth allocation to task-groups.
(not yet configurable, due to missing CONFIG_CONTAINERS)
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Mike Galbraith [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: fix SMP migration latencies
fix SMP migration latencies: the vruntimes of different CPUs are
at incompatible offsets so they have to be fixed up when migrating
a task across CPUs.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
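The fix-up described above can be pictured as re-basing a task's vruntime from the old CPU's baseline to the new one. A minimal sketch, assuming each runqueue's min_vruntime serves as that baseline (the function and parameter names are hypothetical, and the actual patch is not reproduced in this log):

```c
#include <stdint.h>

/* Keep the task's distance from the queue minimum unchanged while
 * moving it between queues whose vruntime clocks have drifted apart.
 * Unsigned arithmetic wraps, so the order of operations is safe. */
static uint64_t migrate_vruntime_sketch(uint64_t vruntime,
                                        uint64_t old_min_vruntime,
                                        uint64_t new_min_vruntime)
{
    return vruntime - old_min_vruntime + new_min_vruntime;
}
```

A task that was 50 units ahead of the minimum on the old CPU stays 50 units ahead on the new one, so it is neither starved nor unduly favoured after the move.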
Peter Zijlstra [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: better min_vruntime tracking
Better min_vruntime tracking: update it every time 'curr' is
updated - not just when a task is enqueued into the tree.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:07 +0000 (17:00 +0200)]
sched: x86: allow single-depth wchan output
sched.o gets smaller and faster if we compile it with -fomit-frame-pointer,
so make this a config option. The cost is the loss of multi-depth wchan
lookups - but SysRq-T is a sufficient replacement for them anyway, so their
utility is much lower these days.
the size difference is significant:
text data bss dec hex filename
34005 3462 24 37491 9273 sched.o.before
33470 3462 24 36956 905c sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: clean up schedstat block in dequeue_entity()
Better placement of #ifdef CONFIG_SCHEDSTATS block in dequeue_entity().
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: remove wait_runtime fields and features
remove wait_runtime based fields and features, now that the CFS
math has been changed over to the vruntime metric.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: remove wait_runtime limit
remove the wait_runtime-limit fields and the code depending on it, now
that the math has been changed over to rely on the vruntime metric.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Dmitry Adamushko [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: clean up struct load_stat
'struct load_stat' is redundant now so let's get rid of it.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: debug: update exec_clock only when SCHED_DEBUG
micro-optimization: update cfs_rq->exec_clock only if
CONFIG_SCHED_DEBUG=y.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:06 +0000 (17:00 +0200)]
sched: add more vruntime statistics
add more vruntime statistics.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: handle vruntime 64-bit overflow
Handle vruntime overflow by centering the key space around min_vruntime.
( otherwise we could overflow 64-bit vruntime in a few days with SCHED_IDLE
tasks - or in a few years with nice +19. )
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
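Centering the key space around min_vruntime, as described above, means comparing entities by their signed distance from min_vruntime rather than by the raw 64-bit value. A minimal sketch with hypothetical names, assuming two's-complement conversion as on all Linux targets:

```c
#include <stdint.h>

/* Key of an entity: signed distance of its vruntime from min_vruntime.
 * The unsigned subtraction wraps, so the distance stays small and
 * correct even after vruntime itself has overflowed 64 bits. */
static int64_t key_sketch(uint64_t vruntime, uint64_t min_vruntime)
{
    return (int64_t)(vruntime - min_vruntime);
}

/* Returns 1 if entity a should sort before entity b in the tree. */
static int before_sketch(uint64_t a, uint64_t b, uint64_t min_vruntime)
{
    return key_sketch(a, min_vruntime) < key_sketch(b, min_vruntime);
}
```

For example, with min_vruntime just below the 64-bit limit, an entity whose vruntime has already wrapped to a small number still sorts after one that has not wrapped yet, which a raw comparison would get backwards.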
Peter Zijlstra [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: add tree based averages
add support for tree based vruntime averages.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: remove SCHED_FEAT_SKIP_INITIAL
remove SCHED_FEAT_SKIP_INITIAL - it was off by default and even
when enabled it never made any real difference.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: add se->vruntime debugging
debug se->vruntime fields.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: clean up new task placement
clean up new task placement.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: wakeup granularity increase
increase wakeup granularity - we were overscheduling a bit.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: simplify check_preempt() methods
simplify the check_preempt() methods.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:05 +0000 (17:00 +0200)]
sched: simplify adaptive latency
simplify adaptive latency.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: new task placement for vruntime
add proper new task placement for the vruntime based math too.
( note: introduces a swap() macro, but the swap token is too
widely used in the kernel namespace for a generic version
to be added without changing non-scheduler code - so this
cleanup will be done separately. )
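For illustration only, a type-generic swap() of the kind the note mentions
could be sketched as below. This is not the definition from the patch; it
assumes the GNU C __typeof__ extension, and order_pair() is a hypothetical
helper added just to show a call site.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a type-generic swap(); relies on the GNU C __typeof__ extension. */
#define swap(a, b) \
	do { __typeof__(a) swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } while (0)

/* Hypothetical helper: reorder two values so that *lo <= *hi afterwards. */
void order_pair(unsigned long long *lo, unsigned long long *hi)
{
	assert(lo != NULL && hi != NULL);
	if (*lo > *hi)
		swap(*lo, *hi);
}
```

The do/while (0) wrapper keeps the macro usable as a single statement, for
example in an unbraced if.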
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: optimize vruntime based scheduling
optimize vruntime based scheduling.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: move sched_feat() definitions
move the sched_feat() definitions so that they can be used earlier by
generic code too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: introduce se->vruntime
introduce se->vruntime as a sum of weighted delta-exec's, and use that
as the key into the tree.
the idea of using absolute virtual time as the basic metric of scheduling
was first raised by William Lee Irwin, advanced by Tong Li and first
prototyped by Roman Zippel in the "Really Fair Scheduler" (RFS) patchset.
also see:
http://lkml.org/lkml/2007/9/2/76
for a simpler variant of this patch.
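To make "a sum of weighted delta-exec's" concrete, here is a minimal
user-space sketch, not the patch itself. It assumes a nice-0 reference
weight of 1024 and scales each slice of execution time by
NICE_0_LOAD / weight; struct entity, charge_runtime() and runs_before()
are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_LOAD 1024ULL	/* assumed reference weight of a nice-0 task */

struct entity {			/* illustrative stand-in for a sched entity */
	uint64_t weight;	/* load weight; larger means a bigger CPU share */
	uint64_t vruntime;	/* accumulated weighted runtime, in ns */
};

/* Charge delta_exec ns of real runtime to the entity's virtual clock. */
void charge_runtime(struct entity *se, uint64_t delta_exec)
{
	assert(se->weight != 0);
	se->vruntime += delta_exec * NICE_0_LOAD / se->weight;
}

/* Tree key comparison: the entity with the smaller vruntime sorts first. */
int runs_before(const struct entity *a, const struct entity *b)
{
	return (int64_t)(a->vruntime - b->vruntime) < 0;
}
```

With these assumptions a task of twice the weight accumulates vruntime at
half the rate, so it sorts earlier in the tree and is picked more often.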
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: clean up calc_weighted()
clean up calc_weighted() - we always use the normalized shift, so
there is no need to pass it in. Also, push the non-nice0 branch
into the function.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: speed up update_load_add/_sub()
speed up update_load_add/_sub() by not delaying the division - this
reduces CPU pipeline dependencies.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:04 +0000 (17:00 +0200)]
sched: uninline __enqueue_entity()/__dequeue_entity()
suggested by Roman Zippel: uninline __enqueue_entity() and
__dequeue_entity().
this reduces code size:
text data bss dec hex filename
25385 2386 16 27787 6c8b sched.o.before
25257 2386 16 27659 6c0b sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: simplify SCHED_FEAT_* code
Peter Zijlstra suggested to simplify SCHED_FEAT_* checks via the
sched_feat(x) macro.
No code changed:
text data bss dec hex filename
38895 3550 24 42469 a5e5 sched.o.before
38895 3550 24 42469 a5e5 sched.o.after
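A sched_feat(x)-style check can be sketched with token pasting, as below.
This is a hedged illustration: the feature names, the sched_features
variable and feature_count() are assumptions, not taken from the patch.

```c
#include <assert.h>

/* Illustrative feature bits; the real set of SCHED_FEAT_* flags differs. */
enum {
	SCHED_FEAT_EXAMPLE_A = 1 << 0,
	SCHED_FEAT_EXAMPLE_B = 1 << 1,
};

/* Assumed name for the bitmask that holds the enabled features. */
unsigned int sched_features = SCHED_FEAT_EXAMPLE_A;

/* sched_feat(EXAMPLE_A) expands to a test of the SCHED_FEAT_EXAMPLE_A bit. */
#define sched_feat(x) (sched_features & SCHED_FEAT_##x)

/* Count how many of the two illustrative features are switched on. */
int feature_count(void)
{
	int n = 0;

	if (sched_feat(EXAMPLE_A))
		n++;
	if (sched_feat(EXAMPLE_B))
		n++;
	assert(n <= 2);
	return n;
}
```

Because the macro expands to the same bit test a caller would write by
hand, identical object code before and after is what one would expect.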
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: cleanup: simplify cfs_rq_curr() methods
cleanup: simplify cfs_rq_curr() methods - now that the cfs_rq->curr
pointer is unconditionally present, remove the wrappers.
kernel/sched.o:
text data bss dec hex filename
11784 224 2012 14020 36c4 sched.o.before
11784 224 2012 14020 36c4 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: track cfs_rq->curr on !group-scheduling too
Noticed by Roman Zippel: use cfs_rq->curr in the !group-scheduling
case too. Small micro-optimization and cleanup effect:
text data bss dec hex filename
36269 3482 24 39775 9b5f sched.o.before
36177 3486 24 39687 9b07 sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: remove precise CPU load calculations #2
continued removal of precise CPU load calculations.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: remove precise CPU load
CPU load calculations are statistical anyway, and there's little benefit
from having it calculated on every scheduling event. So remove this code,
it gets rid of a divide from the scheduler wakeup and context-switch
fastpath.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:03 +0000 (17:00 +0200)]
sched: remove stat_gran
remove the stat_gran code - it was disabled by default and it causes
unnecessary overhead.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: use constants if !CONFIG_SCHED_DEBUG
use constants if !CONFIG_SCHED_DEBUG.
this speeds up the code and reduces code-size:
text data bss dec hex filename
27464 3014 16 30494 771e sched.o.before
26929 3010 20 29959 7507 sched.o.after
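One way to get this effect is a qualifier macro that makes tunables
compile-time constants unless debugging is enabled. The sketch below is an
assumption about the general technique, not the patch's code; the
const_debug name, the tunable and should_preempt() are illustrative.

```c
#include <assert.h>

/* Build with -DCONFIG_SCHED_DEBUG to get the runtime-adjustable variant. */
#ifdef CONFIG_SCHED_DEBUG
# define const_debug			/* plain variable, writable at run time */
#else
# define const_debug static const	/* lets the compiler fold the value */
#endif

/* Hypothetical tunable; the name and value are for illustration only. */
const_debug unsigned int example_granularity_ns = 10000000U;

/* Returns nonzero when delta_ns exceeds the granularity. */
int should_preempt(unsigned int delta_ns)
{
	return delta_ns > example_granularity_ns;
}
```

In the constant variant the comparison is against an immediate value, which
avoids a memory load and can shrink the generated code.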
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: uniform tunings
use the same defaults on both UP and SMP.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: debug: track maximum 'slice'
track the maximum amount of time a task has executed while
the CPU load was at least 2x. (i.e. at least two nice-0
tasks were runnable)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: small sched_debug cleanup
small kernel/sched_debug.c cleanup - break up
multi-variable assignment.
no code changed:
text data bss dec hex filename
38869 3550 24 42443 a5cb sched.o.before
38869 3550 24 42443 a5cb sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Matthias Kaehlcke [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: use list_for_each_entry_safe() in __wake_up_common()
Use list_for_each_entry_safe() instead of list_for_each_safe() in
__wake_up_common()
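The difference is that list_for_each_entry_safe() iterates over the
containing structures directly, so no separate list_entry() call is needed
in the loop body. The following user-space sketch re-implements a minimal
subset of the list helpers to show the pattern; struct waiter and
wake_up_some() are hypothetical and only loosely modelled on a wait queue.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Iterate over entries; n holds the next entry so pos may be unlinked. */
#define list_for_each_entry_safe(pos, n, head, member)			\
	for (pos = list_entry((head)->next, __typeof__(*pos), member),	\
	     n = list_entry(pos->member.next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

void list_init(struct list_head *h) { h->next = h->prev = h; }

void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Hypothetical waiter type embedding a list node. */
struct waiter {
	int woken;
	struct list_head task_list;
};

/* Wake and unlink up to nr waiters; returns how many were woken. */
int wake_up_some(struct list_head *queue, int nr)
{
	struct waiter *curr, *next;
	int woken = 0;

	assert(queue != NULL);
	list_for_each_entry_safe(curr, next, queue, task_list) {
		if (woken == nr)
			break;
		curr->woken = 1;
		list_del(&curr->task_list);	/* safe: next was fetched already */
		woken++;
	}
	return woken;
}
```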
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:02 +0000 (17:00 +0200)]
sched: resched task in task_new_fair()
to get full child-runs-first semantics make sure the parent is
rescheduled.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Mon, 15 Oct 2007 15:00:01 +0000 (17:00 +0200)]
sched: fix sysctl_sched_child_runs_first flag
fix the sched_child_runs_first flag: always call into ->task_new()
if we are on the same CPU, as SCHED_OTHER tasks depend on it for
correct initial setup.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
David Brownell [Sun, 14 Oct 2007 21:50:25 +0000 (14:50 -0700)]
Fix build of drivers/mmc/host/mmc_spi.o with !BLOCK
Make sure the mmc_spi driver can build without CONFIG_BLOCK.
Issue noted by "Avuton Olrich" <avuton@gmail.com> and randconfig.
While that won't be a common configuration, sometimes embedded
boards use SDIO to interface WLAN or Bluetooth chips (vs some
parallel interface), and don't provide an MMC/SD socket for use
with flash memory cards.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 14 Oct 2007 23:47:05 +0000 (16:47 -0700)]
Merge git://git./linux/kernel/git/tglx/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86:
x86: force timer broadcast on late AMD C1E detection
x86: move local APIC timer init to the end of start_secondary()
clockevents: introduce force broadcast notifier
x86: fix missing include for vsyscall
Stephen Hemminger [Sun, 14 Oct 2007 20:25:22 +0000 (13:25 -0700)]
sky2: reboot fix
The call to napi_disable() in the PCI shutdown handler is problematic,
and is aggravated by the new NAPI.
Also, make sure watchdog timer doesn't go off.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Sun, 14 Oct 2007 20:57:45 +0000 (22:57 +0200)]
x86: force timer broadcast on late AMD C1E detection
The 64bit SMP bootup is slightly different to the 32bit one. It enables
the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
systems have the C1E feature flag only set in the secondary CPU. Due to
the early enable of the boot CPU local APIC timer the APIC timer is
registered as a fully functional device. When we detect the wreckage during
the bringup of the secondary CPU, we need to force the boot CPU into
broadcast mode.
Check for the C1E-caused APIC timer disable when the secondary APIC timer
is initialized. If the boot CPU APIC timer was registered as a functional
clock event device, then fix this up and use the
CLOCK_EVT_NOTIFY_BROADCAST_FORCE mechanism to force the already
registered boot CPU APIC timer into broadcast mode.
Tested by force injecting the failure mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 14 Oct 2007 20:57:45 +0000 (22:57 +0200)]
x86: move local APIC timer init to the end of start_secondary()
Preparatory patch for the AMD C1E wreckage fixup.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 14 Oct 2007 20:57:45 +0000 (22:57 +0200)]
clockevents: introduce force broadcast notifier
The 64bit SMP bootup is slightly different to the 32bit one. It enables
the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
systems have the C1E feature flag only set in the secondary CPU. Due to
the early enable of the boot CPU local APIC timer the APIC timer is
registered as a fully functional device. When we detect the wreckage during
the bringup of the secondary CPU, we need to force the boot CPU into
broadcast mode.
Add a new notifier reason and implement the force broadcast in the clock
events layer.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>