[PATCH] sched: reduce overhead of calc_load
author Jack Steiner <steiner@sgi.com>
Fri, 31 Mar 2006 10:31:21 +0000 (02:31 -0800)
committer Linus Torvalds <torvalds@g5.osdl.org>
Fri, 31 Mar 2006 20:18:58 +0000 (12:18 -0800)
Currently, count_active_tasks() calls both nr_running() &
nr_uninterruptible().  Each of these functions does a "for_each_cpu" & reads
values from the runqueue of each cpu.  Although this is not a lot of
instructions, each runqueue may be located on a different node.  Depending on
the architecture, a unique TLB entry may be required to access each
runqueue.

Since there may be more runqueues than cpu TLB entries, a scan of all
runqueues can thrash the TLB.  Each runqueue reference then incurs a TLB
miss & refill.

In addition, the runqueue cacheline that contains nr_running &
nr_uninterruptible may be evicted from the cache between the two passes.
This causes unnecessary cache misses.

Combining nr_running() & nr_uninterruptible() into a single function,
nr_active(), substantially reduces the TLB & cache misses on large systems.
This should have no measurable effect on smaller systems.
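
The difference between the two shapes can be sketched in user space.  This
is only an illustration, not the kernel code: struct rq_sketch, NCPUS_SKETCH,
the sample values & the sketch_* function names are all hypothetical
stand-ins for the per-cpu runqueues.  The old shape walks every runqueue
twice; the new shape reads both fields while each runqueue is being visited.

```c
#include <stddef.h>

#define NCPUS_SKETCH 4	/* hypothetical cpu count, for illustration only */

/* Hypothetical stand-in for the two runqueue fields that are summed. */
struct rq_sketch {
	unsigned long nr_running;
	unsigned long nr_uninterruptible;
};

/* Sample data (assumed); one entry wraps below zero to exercise the clamp. */
struct rq_sketch rq_sketch[NCPUS_SKETCH] = {
	{ 2, 1 },
	{ 0, (unsigned long)-1 },
	{ 3, 2 },
	{ 1, 0 },
};

/* Old shape, pass 1: visit every runqueue to sum nr_running. */
unsigned long sketch_nr_running(void)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < NCPUS_SKETCH; i++)
		sum += rq_sketch[i].nr_running;

	return sum;
}

/* Old shape, pass 2: visit every runqueue again for nr_uninterruptible. */
unsigned long sketch_nr_uninterruptible(void)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < NCPUS_SKETCH; i++)
		sum += rq_sketch[i].nr_uninterruptible;

	/* Do not let the total go below zero. */
	if ((long)sum < 0)
		sum = 0;

	return sum;
}

/* New shape: a single pass reads both fields of each runqueue. */
unsigned long sketch_nr_active(void)
{
	unsigned long running = 0, uninterruptible = 0;
	size_t i;

	for (i = 0; i < NCPUS_SKETCH; i++) {
		running += rq_sketch[i].nr_running;
		uninterruptible += rq_sketch[i].nr_uninterruptible;
	}

	if ((long)uninterruptible < 0)
		uninterruptible = 0;

	return running + uninterruptible;
}
```

Both shapes return the same total for the sample data; only the number of
trips across the runqueues differs, which is where the TLB & cache savings
described above would come from.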

On a 128p IA64 system running a memory stress workload, the new function
reduced the overhead of calc_load() from 605 usec/call to 324 usec/call.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
include/linux/sched.h
kernel/sched.c
kernel/timer.c

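diff --git a/include/linux/sched.h b/include/linux/sched.h
index d04186d8cc685d9be0ddea01fc94a23b7a6828a7..ab84adf5bb9af8f3a36e8445880a37f0794abded 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h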
@@ -100,6 +100,7 @@ DECLARE_PER_CPU(unsigned long, process_counts);
 extern int nr_processes(void);
 extern unsigned long nr_running(void);
 extern unsigned long nr_uninterruptible(void);
+extern unsigned long nr_active(void);
 extern unsigned long nr_iowait(void);
 
 #include <linux/time.h>
diff --git a/kernel/sched.c b/kernel/sched.c
index a9ecac398bb9b979a4457ea0ade69259c9e7dc53..6e52e0adff80dfc04fa921dd88a0df7bce8a4a2a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1658,6 +1658,21 @@ unsigned long nr_iowait(void)
        return sum;
 }
 
+unsigned long nr_active(void)
+{
+       unsigned long i, running = 0, uninterruptible = 0;
+
+       for_each_online_cpu(i) {
+               running += cpu_rq(i)->nr_running;
+               uninterruptible += cpu_rq(i)->nr_uninterruptible;
+       }
+
+       if (unlikely((long)uninterruptible < 0))
+               uninterruptible = 0;
+
+       return running + uninterruptible;
+}
+
 #ifdef CONFIG_SMP
 
 /*
diff --git a/kernel/timer.c b/kernel/timer.c
index 9062a82ee8ec40d9a6ef985c625bd8daaf5778c5..6b812c04737b5b611c9d40257f8c3b28292955c4 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -825,7 +825,7 @@ void update_process_times(int user_tick)
  */
 static unsigned long count_active_tasks(void)
 {
-       return (nr_running() + nr_uninterruptible()) * FIXED_1;
+       return nr_active() * FIXED_1;
 }
 
 /*