cpuidle: ACPI: do not overwrite name and description of C0 commit c7e8bdf5872c5a8f5a6494e16fe839c38a0d3d3d upstream. Fix a bug that leads to showing the name and description of C-state C0 as "<null>" in sysfs after the ACPI C-states changed (e.g. after AC->DC or DC->AC transition). The function poll_idle_init() in drivers/cpuidle/driver.c initializes the state 0 during cpuidle_register_driver(), so we better do not overwrite it again with '\0' during acpi_processor_cst_has_changed(). Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock commit 6726655dfdd2dc60c035c690d9f10cb69d7ea075 upstream. There is a following AB-BA dependency between cpu_hotplug.lock and cpuidle_lock: 1) cpu_hotplug.lock -> cpuidle_lock enable_nonboot_cpus() _cpu_up() cpu_hotplug_begin() LOCK(cpu_hotplug.lock) cpu_notify() ... acpi_processor_hotplug() cpuidle_pause_and_lock() LOCK(cpuidle_lock) 2) cpuidle_lock -> cpu_hotplug.lock acpi_os_execute_deferred() workqueue ... acpi_processor_cst_has_changed() cpuidle_pause_and_lock() LOCK(cpuidle_lock) get_online_cpus() LOCK(cpu_hotplug.lock) Fix this by reversing the order acpi_processor_cst_has_changed() does thigs -- let it first execute the protection against CPU hotplug by calling get_online_cpus() and obtain the cpuidle lock only after that (and perform the symmentric change when allowing CPUs hotplug again and dropping cpuidle lock). Spotted by lockdep. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sched, idle: Fix the idle polling state logic commit ea8117478918a4734586d35ff530721b682425be upstream. Mike reported that commit 7d1a9417 ("x86: Use generic idle loop") regressed several workloads and caused excessive reschedule interrupts. The patch in question failed to notice that the x86 code had an inverted sense of the polling state versus the new generic code (x86: default polling, generic: default !polling). Fix the two prominent x86 mwait based idle drivers and introduce a few new generic polling helpers (fixing the wrong smp_mb__after_clear_bit usage). Also switch the idle routines to using tif_need_resched() which is an immediate TIF_NEED_RESCHED test as opposed to need_resched which will end up being slightly different. Reported-by: Mike Galbraith <bitbucket@online.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: lenb@kernel.org Cc: tglx@linutronix.de Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ACPI / PM: Move processor suspend/resume to syscore_ops The system suspend routine of the ACPI processor driver saves the BUS_MASTER_RLD register and its resume routine restores it. However, there can be only one such register in the system and it really should be saved after non-boot CPUs have been offlined and restored before they are put back online during resume. For this reason, move the saving and restoration of BUS_MASTER_RLD to syscore suspend and syscore resume, respectively, and drop the no longer necessary suspend/resume callbacks from the ACPI processor driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpuidle: remove en_core_tk_irqen flag The en_core_tk_irqen flag is set in all the cpuidle driver which means it is not necessary to specify this flag. Remove the flag and the code related to it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Kevin Hilman <khilman@linaro.org> # for mach-omap2/* Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpuidle / ACPI: recover percpu ACPI processor cstate Commit ac3ebafa81af76d6 "ACPI / idle: remove usage of the statedata" changed the percpu processor cstate to a unified cstate in ACPI idle. That caused all our NHM boxes to boot hang or panic. 2178751 Task dump for CPU 1: 2178752 swapper/1 R running task 6736 0 1 0x00000000 2178753 ffff8801e8029dc8 ffffffff8101cf96 ffff8801e8029e28 ffffffff813d294b 2178754 0000000000000f99 0000000000000003 00000000003cf654 0000000025c17d03 2178755 ffff8801e8029e38 ffff8801e74fc000 00000002590dc5c4 ffffffff8163cdb0 2178756 Call Trace: 2178757 [<ffffffff8101cf96>] ? acpi_processor_ffh_cstate_enter+0x2d/0x2f 2178758 [<ffffffff813d294b>] acpi_idle_enter_bm+0x1b1/0x236 2178759 [<ffffffff8163cdb0>] ? disable_cpuidle+0x10/0x10 2178760 [<ffffffff8163cdc2>] cpuidle_enter+0x12/0x14 2178761 [<ffffffff8163d286>] cpuidle_wrap_enter+0x2f/0x6d 2178762 [<ffffffff8163d2d4>] cpuidle_enter_tk+0x10/0x12 2178763 [<ffffffff8163cdd6>] cpuidle_enter_state+0x12/0x3a 2178764 [<ffffffff8163d4a7>] cpuidle_idle_call+0xe8/0x161 2178765 [<ffffffff81008d99>] cpu_idle+0x5e/0xa4 2178766 [<ffffffff8174c6c1>] start_secondary+0x1a9/0x1ad 2178767 Task dump for CPU 2: In fact, the ACPI idle is based on the assumption of difference percpu cstate structures that are necessary for the implementation to work cprrectly. A unique acpi_processor_cx is not sifficient by far. This patch is just a quick fix re-introducing the percpu cstates. If someone really wants to unify the ACPI cstates, please make sure that the whole software infrastructure is changed and take hardware as well as many different kinds of BIOS settings into account. [rjw: Changelog] Reported-by: LKP project <lkp@linux.intel.com> Reported-by: Xie ChanglongX <changlongx.xie@intel.com> Tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
x86 idle: remove mwait_idle() and "idle=mwait" cmdline param mwait_idle() is a C1-only idle loop intended to be more efficient than HLT, starting on Pentium-4 HT-enabled processors. But mwait_idle() has been replaced by the more general mwait_idle_with_hints(), which handles both C1 and deeper C-states. ACPI processor_idle and intel_idle use only mwait_idle_with_hints(), and no longer use mwait_idle(). Here we simplify the x86 native idle code by removing mwait_idle(), and the "idle=mwait" bootparam used to invoke it. Since Linux 3.0 there has been a boot-time warning when "idle=mwait" was invoked saying it would be removed in 2012. This removal was also noted in the (now removed:-) feature-removal-schedule.txt. After this change, kernels configured with (CONFIG_ACPI=n && CONFIG_INTEL_IDLE=n) when run on hardware that supports MWAIT will simply use HLT. If MWAIT is desired on those systems, cpuidle and the cpuidle drivers above can be enabled. Signed-off-by: Len Brown <len.brown@intel.com> Cc: x86@kernel.org
ACPI / idle: remove usage of the statedata Len Brown sent a patch to remove this field in the intel_idle driver. The other user of this field is the davinci cpuidle driver and a patch has been sent to remove the usage of it. This patch removes the last user of this field. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI / idle: pass the cpuidle_device parameter The cpuidle_device is retrieved in the function by using directly the global variable. But the caller of this function already have this device and it can be passed as a parameter. That is one small step to encapsulate the code more. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Len Brown <len.brown@intel.com>
ACPI / processor: Get power info before updating the C-states acpi_processor_get_power_info() has to be called before acpi_processor_setup_cpuidle_states() to have the latest information available. This fixes the missing C-state information after AC-->DC transition. Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de> Cc: <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled If cpuidle is disabled, that means that: per_cpu(acpi_cpuidle_device, pr->id) is set to NULL as the acpi_processor_power_init ends up failing at retval = cpuidle_register_driver(&acpi_idle_driver) (in acpi_processor_power_init) and never sets the per_cpu idle device. So when acpi_processor_hotplug on CPU online notification tries to reference said device it crashes: cpu 3 spinlock event irq 62 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [<ffffffff81381013>] acpi_processor_setup_cpuidle_cx+0x3f/0x105 PGD a259b067 PUD ab38b067 PMD 0 Oops: 0002 [#1] SMP odules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c nouveau mxm_wmi wmi radeon ttm sg sr_mod sd_mod cdrom ata_generic ata_piix libata crc32c_intel scsi_mod atl1c i915 fbcon tileblit font bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf CPU 1 Pid: 3047, comm: bash Not tainted 3.8.0-rc3upstream-00250-g165c029 #1 MSI MS-7680/H61M-P23 (MS-7680) RIP: e030:[<ffffffff81381013>] [<ffffffff81381013>] acpi_processor_setup_cpuidle_cx+0x3f/0x105 RSP: e02b:ffff88001742dca8 EFLAGS: 00010202 RAX: 0000000000010be9 RBX: ffff8800a0a61800 RCX: ffff880105380000 RDX: 0000000000000003 RSI: 0000000000000200 RDI: ffff8800a0a61800 RBP: ffff88001742dce8 R08: ffffffff81812360 R09: 0000000000000200 R10: aaaaaaaaaaaaaaaa R11: 0000000000000001 R12: ffff8800a0a61800 R13: 00000000ffffff01 R14: 0000000000000000 R15: ffffffff81a907a0 FS: 00007fd6942f7700(0000) GS:ffff880105280000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000004 CR3: 00000000a6773000 CR4: 0000000000042660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process bash (pid: 3047, threadinfo ffff88001742c000, task ffff880017944000) Stack: 0000000000000150 ffff880100f59e00 ffff88001742dcd8 ffff8800a0a61800 0000000000000000 00000000ffffff01 0000000000000000 ffffffff81a907a0 ffff88001742dd18 ffffffff813815b1 ffff88001742dd08 ffffffff810ae336 Call Trace: [<ffffffff813815b1>] acpi_processor_hotplug+0x7c/0x9f [<ffffffff810ae336>] ? schedule_delayed_work_on+0x16/0x20 [<ffffffff8137ee8f>] acpi_cpu_soft_notify+0x90/0xca [<ffffffff8166023d>] notifier_call_chain+0x4d/0x70 [<ffffffff810bc369>] __raw_notifier_call_chain+0x9/0x10 [<ffffffff81094a4b>] __cpu_notify+0x1b/0x30 [<ffffffff81652cf7>] _cpu_up+0x103/0x14b [<ffffffff81652e18>] cpu_up+0xd9/0xec [<ffffffff8164a254>] store_online+0x94/0xd0 [<ffffffff814122fb>] dev_attr_store+0x1b/0x20 [<ffffffff81216404>] sysfs_write_file+0xf4/0x170 This patch fixes it. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpuidle: Measure idle state durations with monotonic clock Many cpuidle drivers measure their time spent in an idle state by reading the wallclock time before and after idling and calculating the difference. This leads to erroneous results when the wallclock time gets updated by another processor in the meantime, adding that clock adjustment to the idle state's time counter. If the clock adjustment was negative, the result is even worse due to an erroneous cast from int to unsigned long long of the last_residency variable. The negative 32 bit integer will zero-extend and result in a forward time jump of roughly four billion milliseconds or 1.3 hours on the idle state residency counter. This patch changes all affected cpuidle drivers to either use the monotonic clock for their measurements or make use of the generic time measurement wrapper in cpuidle.c, which was already working correctly. Some superfluous CLIs/STIs in the ACPI code are removed (interrupts should always already be disabled before entering the idle function, and not get reenabled until the generic wrapper has performed its second measurement). It also removes the erroneous cast, making sure that negative residency values are applied correctly even though they should not appear anymore. Signed-off-by: Julius Werner <jwerner@chromium.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpuidle / ACPI: fix potential NULL pointer dereference The dereference should be moved below the NULL test. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Len Brown <len.brown@intel.com>
cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure Currently we have the cpuidle_device field in the acpi_processor_power structure. This adds a dependency between processor.h and cpuidle.h Although it is not a real problem, removing this dependency has the benefit of separating a bit more the cpuidle code from the rest of the acpi code. Also, the compilation should be a bit improved because we do no longer include cpuidle.h in processor.h. The preprocessor was generating 30418 loc and with this patch it generates 30256 loc for processor_thermal.c, a file which is not concerned at all by cpuidle, like processor_perflib.c and processor_throttling.c. That may sound ridiculous, but "small streams make big rivers" :P This patch moves this field into a static global per cpu variable like what is done in the intel_idle driver. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
ACPI / processor: remove unused function parameter The 'device' parameter is not used neither in acpi_processor_power_init and acpi_processor_power_exit. This patch removes it. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
cpuidle / ACPI : remove power from acpi_processor_cx structure Remove the unused power field from struct struct acpi_processor_cx. [rjw: Modified changelog.] Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux Pull ACPI & power management update from Len Brown: "Re-write of the turbostat tool. lower overhead was necessary for measuring very large system when they are very idle. IVB support in intel_idle It's what I run on my IVB, others should be able to also:-) ACPICA core update We have found some bugs due to divergence between Linux and the upstream ACPICA base. Most of these patches are to reduce that divergence to reduce the risk of future bugs. Some cpuidle updates, mostly for non-Intel More will be coming, as they depend on this part. Some thermal management changes needed by non-ACPI systems. Some _OST (OS Status Indication) updates for hot ACPI hot-plug." * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits) Thermal: Documentation update Thermal: Add Hysteresis attributes Thermal: Make Thermal trip points writeable ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check tools/power: turbostat: fix large c1% issue tools/power: turbostat v2 - re-write for efficiency ACPICA: Update to version 20120711 ACPICA: AcpiSrc: Fix some translation issues for Linux conversion ACPICA: Update header files copyrights to 2012 ACPICA: Add new ACPI table load/unload external interfaces ACPICA: Split file: tbxface.c -> tbxfload.c ACPICA: Add PCC address space to space ID decode function ACPICA: Fix some comment fields ACPICA: Table manager: deploy new firmware error/warning interfaces ACPICA: Add new interfaces for BIOS(firmware) errors and warnings ACPICA: Split exception code utilities to a new file, utexcep.c ACPI: acpi_pad: tune round_robin_time ACPICA: Update to version 20120620 ACPICA: Add support for implicit notify on multiple devices ACPICA: Update comments; no functional change ...