sched/fair: Add detailed description to the sched load avg metrics

author Yuyang Du <yuyang.du@intel.com>

Tue, 5 Apr 2016 04:12:28 +0000 (12:12 +0800)

committer Ingo Molnar <mingo@kernel.org>

Thu, 5 May 2016 07:41:08 +0000 (09:41 +0200)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7d779d70a3a59b25be1995609369b200b09751c3..57faf789c88f8d86576138053f6e1b85b672159b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1211,18 +1211,56 @@ struct load_weight {
  };
  
  /*
- * The load_avg/util_avg accumulates an infinite geometric series.
- * 1) load_avg factors frequency scaling into the amount of time that a
- * sched_entity is runnable on a rq into its weight. For cfs_rq, it is the
- * aggregated such weights of all runnable and blocked sched_entities.
- * 2) util_avg factors frequency and cpu capacity scaling into the amount of time
- * that a sched_entity is running on a CPU, in the range [0..SCHED_CAPACITY_SCALE].
- * For cfs_rq, it is the aggregated such times of all runnable and
+ * The load_avg/util_avg accumulates an infinite geometric series
+ * (see __update_load_avg() in kernel/sched/fair.c).
+ *
+ * [load_avg definition]
+ *
+ *   load_avg = runnable% * scale_load_down(load)
+ *
+ * where runnable% is the time ratio that a sched_entity is runnable.
+ * For cfs_rq, it is the aggregated load_avg of all runnable and
   * blocked sched_entities.
- * The 64 bit load_sum can:
- * 1) for cfs_rq, afford 4353082796 (=2^64/47742/88761) entities with
- * the highest weight (=88761) always runnable, we should not overflow
- * 2) for entity, support any load.weight always runnable
+ *
+ * load_avg may also take frequency scaling into account:
+ *
+ *   load_avg = runnable% * scale_load_down(load) * freq%
+ *
+ * where freq% is the CPU frequency normalized to the highest frequency.
+ *
+ * [util_avg definition]
+ *
+ *   util_avg = running% * SCHED_CAPACITY_SCALE
+ *
+ * where running% is the time ratio that a sched_entity is running on
+ * a CPU. For cfs_rq, it is the aggregated util_avg of all runnable
+ * and blocked sched_entities.
+ *
+ * util_avg may also factor frequency scaling and CPU capacity scaling:
+ *
+ *   util_avg = running% * SCHED_CAPACITY_SCALE * freq% * capacity%
+ *
+ * where freq% is the same as above, and capacity% is the CPU capacity
+ * normalized to the greatest capacity (due to uarch differences, etc).
+ *
+ * N.B., the above ratios (runnable%, running%, freq%, and capacity%)
+ * themselves are in the range of [0, 1]. To do fixed-point arithmetic,
+ * we therefore scale them to as large a range as necessary. This is for
+ * example reflected by util_avg's SCHED_CAPACITY_SCALE.
+ *
+ * [Overflow issue]
+ *
+ * The 64-bit load_sum can have 4353082796 (=2^64/47742/88761) entities
+ * with the highest load (=88761), always runnable on a single cfs_rq,
+ * and should not overflow as the number already hits PID_MAX_LIMIT.
+ *
+ * For all other cases (including 32-bit kernels), struct load_weight's
+ * weight will overflow first before we do, because:
+ *
+ *    Max(load_avg) <= Max(load.weight)
+ *
+ * Then it is the load_weight's responsibility to consider overflow
+ * issues.
   */
  struct sched_avg {
         u64 last_update_time, load_sum;