cpufreq: intel_pstate: Disable energy efficiency optimization
author Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fri, 3 Feb 2017 22:18:39 +0000 (14:18 -0800)
committer Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fri, 3 Feb 2017 23:11:08 +0000 (00:11 +0100)

Some Kabylake desktop processors may not reach max turbo when running in
HWP mode, even if running under sustained 100% utilization.

This occurs when the HWP.EPP (Energy Performance Preference) is set to
"balance_power" (0x80) -- the default on most systems.

It occurs because the platform BIOS may erroneously enable an
energy-efficiency setting -- MSR_IA32_POWER_CTL BIT_EE -- which is not
recommended to be enabled on this SKU.

On the failing systems, this BIOS issue was not discovered when the
desktop motherboard was tested with Windows. The BIOS also neglects to
provide the ACPI/CPPC table that Windows requires to enable HWP, so
Windows runs in legacy P-state mode, where this setting has no effect.

The Linux intel_pstate driver does not require ACPI/CPPC to enable HWP,
and so it runs in HWP mode, exposing this incorrect BIOS configuration.

There are several ways to address this problem.

First, Linux can also run in legacy P-state mode on this system.
As intel_pstate is how Linux enables HWP, booting with
"intel_pstate=disable"
will run in acpi-cpufreq/ondemand legacy P-state mode.

Or second, the "performance" governor can be used with intel_pstate,
which will modify HWP.EPP to 0.

Or third, starting in 4.10, the
/sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference
attribute can be updated from "balance_power" to "performance".
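
As a rough illustration of the third option, the user-space sketch below
writes a preference string to one policy's energy_performance_preference
attribute. The helper names write_attr() and set_epp() are hypothetical
and not part of this patch; only the sysfs path and the "performance"
string come from the text above. Writing to sysfs normally requires root.

```c
#include <stdio.h>

/* Write a string to a sysfs-style attribute file. Returns 0 on success. */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	/* The write may only reach the kernel on flush, so check fclose() too. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical helper: set the EPP string for one cpufreq policy. */
int set_epp(int policy, const char *pref)
{
	char path[128];

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpufreq/policy%d/energy_performance_preference",
		 policy);
	return write_attr(path, pref);
}
```

A caller would invoke set_epp(n, "performance") once for each policy
directory present on the system and treat a nonzero return as "attribute
missing or write rejected".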

Or fourth, apply this patch, which fixes the erroneous setting of
MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default
configuration to function as designed.
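
To check from user space whether a system is affected, the same bit test
the patch performs can be sketched as below. This is only an
illustration under assumptions: the register address 0x1fc is taken from
the kernel's msr-index.h, the function names are hypothetical, and
reading /dev/cpu/<n>/msr requires root and the msr driver. Bit 19 is the
value defined by the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#define MSR_IA32_POWER_CTL        0x1fc /* address as in msr-index.h */
#define MSR_IA32_POWER_CTL_BIT_EE 19    /* bit number from the patch */

/* True when the bit the patch sets is already set in power_ctl. */
bool ee_already_disabled(uint64_t power_ctl)
{
	return (power_ctl >> MSR_IA32_POWER_CTL_BIT_EE) & 1;
}

/* Value the patch would write back: the same register with bit 19 set. */
uint64_t ee_disable_value(uint64_t power_ctl)
{
	return power_ctl | (UINT64_C(1) << MSR_IA32_POWER_CTL_BIT_EE);
}

/* Hypothetical helper: read the MSR through the msr character device. */
int read_power_ctl(int cpu, uint64_t *val)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	/* The msr device uses the file offset as the register address. */
	if (lseek(fd, MSR_IA32_POWER_CTL, SEEK_SET) < 0) {
		close(fd);
		return -1;
	}
	n = read(fd, val, sizeof(*val));
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

If read_power_ctl() succeeds and ee_already_disabled() returns false on
an affected CPU, the patch would log its message and write
ee_disable_value() back to the register.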

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
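
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index f91c25718d164c9d9339acf671d67937995fe076..86e36544925f6f452425ae1d0634743d8ada54a8 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c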
@@ -1235,6 +1235,25 @@ static void intel_pstate_hwp_enable(struct cpudata *cpudata)
                cpudata->epp_default = intel_pstate_get_epp(cpudata, 0);
 }
 
+#define MSR_IA32_POWER_CTL_BIT_EE      19
+
+/* Disable energy efficiency optimization */
+static void intel_pstate_disable_ee(int cpu)
+{
+       u64 power_ctl;
+       int ret;
+
+       ret = rdmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, &power_ctl);
+       if (ret)
+               return;
+
+       if (!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE))) {
+               pr_info("Disabling energy efficiency optimization\n");
+               power_ctl |= BIT(MSR_IA32_POWER_CTL_BIT_EE);
+               wrmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, power_ctl);
+       }
+}
+
 static int atom_get_min_pstate(void)
 {
        u64 value;
@@ -1845,6 +1864,11 @@ static const struct x86_cpu_id intel_pstate_cpu_oob_ids[] __initconst = {
        {}
 };
 
+static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[] = {
+       ICPU(INTEL_FAM6_KABYLAKE_DESKTOP, core_params),
+       {}
+};
+
 static int intel_pstate_init_cpu(unsigned int cpunum)
 {
        struct cpudata *cpu;
@@ -1875,6 +1899,12 @@ static int intel_pstate_init_cpu(unsigned int cpunum)
        cpu->cpu = cpunum;
 
        if (hwp_active) {
+               const struct x86_cpu_id *id;
+
+               id = x86_match_cpu(intel_pstate_cpu_ee_disable_ids);
+               if (id)
+                       intel_pstate_disable_ee(cpunum);
+
                intel_pstate_hwp_enable(cpu);
                pid_params.sample_rate_ms = 50;
                pid_params.sample_rate_ns = 50 * NSEC_PER_MSEC;