sched, doc: Beef up load balancing description

author Borislav Petkov <bp@alien8.de>

Sun, 27 Mar 2011 15:57:13 +0000 (17:57 +0200)

committer Ingo Molnar <mingo@elte.hu>

Thu, 31 Mar 2011 11:00:35 +0000 (13:00 +0200)
diff --git a/Documentation/scheduler/sched-domains.txt b/Documentation/scheduler/sched-domains.txt

index 373ceacc367eb9a98cb6bbcad6f7ea4e83c21982..b7ee379b651bea7d962fcb51789b40689bb0e112 100644
--- a/Documentation/scheduler/sched-domains.txt
+++ b/Documentation/scheduler/sched-domains.txt
@@ -1,8 +1,7 @@
-Each CPU has a "base" scheduling domain (struct sched_domain). These are
-accessed via cpu_sched_domain(i) and this_sched_domain() macros. The domain
+Each CPU has a "base" scheduling domain (struct sched_domain). The domain
  hierarchy is built from these base domains via the ->parent pointer. ->parent
-MUST be NULL terminated, and domain structures should be per-CPU as they
-are locklessly updated.
+MUST be NULL terminated, and domain structures should be per-CPU as they are
+locklessly updated.
  
  Each scheduling domain spans a number of CPUs (stored in the ->span field).
  A domain's span MUST be a superset of it child's span (this restriction could
@@ -26,11 +25,26 @@ is treated as one entity. The load of a group is defined as the sum of the
  load of each of its member CPUs, and only when the load of a group becomes
  out of balance are tasks moved between groups.
  
-In kernel/sched.c, rebalance_tick is run periodically on each CPU. This
-function takes its CPU's base sched domain and checks to see if has reached
-its rebalance interval. If so, then it will run load_balance on that domain.
-rebalance_tick then checks the parent sched_domain (if it exists), and the
-parent of the parent and so forth.
+In kernel/sched.c, trigger_load_balance() is run periodically on each CPU
+through scheduler_tick(). It raises a softirq after the next regularly scheduled
+rebalancing event for the current runqueue has arrived. The actual load
+balancing workhorse, run_rebalance_domains()->rebalance_domains(), is then run
+in softirq context (SCHED_SOFTIRQ).
+
+The latter function takes two arguments: the current CPU and whether it was idle
+at the time the scheduler_tick() happened and iterates over all sched domains
+our CPU is on, starting from its base domain and going up the ->parent chain.
+While doing that, it checks to see if the current domain has exhausted its
+rebalance interval. If so, it runs load_balance() on that domain. It then checks
+the parent sched_domain (if it exists), and the parent of the parent and so
+forth.
+
+Initially, load_balance() finds the busiest group in the current sched domain.
+If it succeeds, it looks for the busiest runqueue of all the CPUs' runqueues in
+that group. If it manages to find such a runqueue, it locks both our initial
+CPU's runqueue and the newly found busiest one and starts moving tasks from it
+to our runqueue. The exact number of tasks amounts to an imbalance previously
+computed while iterating over this sched domain's groups.
  
  *** Implementing sched domains ***
  The "base" domain will "span" the first level of the hierarchy. In the case
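
The control flow described by the new text can be sketched in a few lines of
userspace C. This is a toy model, not kernel code: every toy_* name, the
structure layouts, and the plain tick counter are assumptions made for
illustration. The real struct sched_domain and load_balance() carry far more
state, skip the local group when looking for the busiest one, and go on to
lock both runqueues and move tasks. Only the walk up the ->parent chain, the
per-domain interval check, and the "group load is the sum of its CPUs' loads"
rule are taken from the text.

```c
#include <stddef.h>

#define TOY_MAX_CPUS 4

/* Hypothetical stand-in for a sched group: a few CPUs and their loads. */
struct toy_group {
	int nr_cpus;
	unsigned long cpu_load[TOY_MAX_CPUS];
};

/* Hypothetical stand-in for struct sched_domain. */
struct toy_domain {
	struct toy_domain *parent;   /* NULL terminates the hierarchy */
	unsigned long last_balance;  /* tick of the last balance run */
	unsigned long interval;      /* rebalance interval, in ticks */
	struct toy_group *groups;
	int nr_groups;
	int balance_runs;            /* how often this domain was balanced */
};

/* The load of a group is the sum of the loads of its member CPUs. */
static unsigned long toy_group_load(const struct toy_group *g)
{
	unsigned long sum = 0;

	for (int i = 0; i < g->nr_cpus; i++)
		sum += g->cpu_load[i];
	return sum;
}

/* Index of the most loaded group in the domain, or -1 if it has none. */
static int toy_find_busiest_group(const struct toy_domain *sd)
{
	unsigned long max = 0;
	int busiest = -1;

	for (int i = 0; i < sd->nr_groups; i++) {
		unsigned long load = toy_group_load(&sd->groups[i]);

		if (busiest < 0 || load > max) {
			max = load;
			busiest = i;
		}
	}
	return busiest;
}

/* Index of the most loaded CPU in the group, or -1 if it is empty. */
static int toy_find_busiest_cpu(const struct toy_group *g)
{
	int busiest = -1;

	for (int i = 0; i < g->nr_cpus; i++)
		if (busiest < 0 || g->cpu_load[i] > g->cpu_load[busiest])
			busiest = i;
	return busiest;
}

/*
 * Simplified balance pass: locate the busiest group and CPU and count
 * the run. Locking and task migration are deliberately left out.
 */
static void toy_load_balance(struct toy_domain *sd)
{
	int group = toy_find_busiest_group(sd);

	if (group >= 0)
		(void)toy_find_busiest_cpu(&sd->groups[group]);
	sd->balance_runs++;
}

/*
 * Walk from the base domain up the ->parent chain and balance every
 * domain whose interval has expired. Returns the number of domains
 * that were balanced at tick 'now'.
 */
static int toy_rebalance_domains(struct toy_domain *base, unsigned long now)
{
	int ran = 0;

	for (struct toy_domain *sd = base; sd != NULL; sd = sd->parent) {
		if (now - sd->last_balance >= sd->interval) {
			toy_load_balance(sd);
			sd->last_balance = now;
			ran++;
		}
	}
	return ran;
}
```

With a base domain whose interval is 4 ticks and a parent whose interval is
10, calling toy_rebalance_domains() at tick 5 balances only the base domain,
while a call at tick 10 balances both. Shorter intervals near the base of the
hierarchy are a choice made for this example, not something the text states.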