Rafael J. Wysocki [Wed, 22 Aug 2012 10:27:24 +0000 (12:27 +0200)]
ARM: shmobile: Add A4S cpuidle state on sh7372
Add a "C5" cpuidle state to the SH7372 SoC connected to the A4S power
domain in such a way that A4S may be turned off by cpuidle if all
I/O devices in that domain have been suspended (or do not have
attached drivers).
This requires some reorganization of the initialization of SH7372
power management, which affects the boards based on it, Mackerel
and AP4EVB.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 3 Sep 2012 23:45:50 +0000 (01:45 +0200)]
Merge branch 'pm-cpuidle' into pm-shmobile
* pm-cpuidle:
PM / cpuidle: Make ladder governor use the "disabled" state flag
Honor state disabling in the cpuidle ladder governor
Rafael J. Wysocki [Wed, 15 Aug 2012 18:58:19 +0000 (20:58 +0200)]
ARM: shmobile: Make sh7372 cpuidle handling more straightforward
The sh7372 cpuidle code uses the same artificially designed routine
shmobile_cpuidle_enter() as the .enter() callback for all of its
cpuidle states. However, shmobile_cpuidle_enter() calls a different
"enter" function for each state using an array of function pointers
populated by the sh7372 PM initialization code. Moreover, the
states[] array of the shmobile cpuidle driver is populated by that
code as well, although in principle it just might have been filled
with static data.
All of that complexity goes away if the sh7372 cpuidle code is
allowed to define its own cpuidle driver structure that can be passed
for registration to the common shmobile cpuidle initialization
routine, so modify the code accordingly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:57:27 +0000 (20:57 +0200)]
ARM: shmobile: Move definition of shmobile_init_late() to header
The role of the only function in the common.c file in
arch/arm/mach-shmobile, shmobile_init_late(), is to call two
initializers whose definitions depend on kernel configuration
options. Those initializers may very well be called from a static
inline function in arm/mach-shmobile/include/mach/common.h,
though, which makes the code a bit easier to read. Moreover,
common.c may then be dropped entirely.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
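The header pattern described in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the actual contents of common.h: the DEMO_CONFIG_* macros stand in for kernel configuration options, and every demo_* identifier is hypothetical.

```c
/* Hypothetical stand-ins for kernel configuration options. */
#define DEMO_CONFIG_SUSPEND 1
/* DEMO_CONFIG_CPU_IDLE is deliberately left undefined. */

static int demo_calls;	/* counts the initializers that really ran */

#ifdef DEMO_CONFIG_SUSPEND
static inline int demo_suspend_init(void) { demo_calls++; return 0; }
#else
static inline int demo_suspend_init(void) { return 0; }	/* empty stub */
#endif

#ifdef DEMO_CONFIG_CPU_IDLE
static inline int demo_cpuidle_init(void) { demo_calls++; return 0; }
#else
static inline int demo_cpuidle_init(void) { return 0; }	/* empty stub */
#endif

/*
 * The late initializer sits in the header next to the stubs it calls,
 * so no separate .c file is needed just to hold it.
 */
static inline void demo_init_late(void)
{
	demo_suspend_init();
	demo_cpuidle_init();
}
```

With only DEMO_CONFIG_SUSPEND defined, a call to demo_init_late() runs one real initializer and one empty stub.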
Rafael J. Wysocki [Wed, 15 Aug 2012 18:57:06 +0000 (20:57 +0200)]
ARM: shmobile: Remove the console check from sh7372_enter_suspend()
The !console_suspend_enabled check in sh7372_enter_suspend() seems
to be reversed and the condition it is supposed to catch (console
clock enabled) should be detected by the sh7372_sysc_valid() check
anyway, so remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:56:41 +0000 (20:56 +0200)]
ARM: shmobile: Rework adding devices to PM domains on AP4EVB
Use the function rmobile_add_devices_to_domains() introduced
previously for adding devices to PM domains during the AP4EVB
initialization instead of a series of rmobile_add_device_to_domain*()
calls. This also causes the default device PM QoS latencies to be
used on that board in analogy with Mackerel.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:56:26 +0000 (20:56 +0200)]
ARM: shmobile: Rework adding devices to PM domains on Mackerel
On SH7372 and Mackerel, devices are added to PM domains through a
series of rmobile_add_device_to_domain_td() calls where the last
argument is always the same. This is quite inefficient, so add
a common function for adding devices to PM domains that reads the
domain-device pairs information from a table and use it during SH7372
and Mackerel initialization.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
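The table-driven approach from the commit above might look roughly like the following. This is a simplified userspace sketch: the structure layout, the demo_* names and the return convention are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device model: remembers the domain it was added to. */
struct demo_device {
	const char *name;
	const char *domain;
};

/* One row of the table: which device goes into which domain. */
struct demo_pm_domain_device {
	const char *domain_name;
	struct demo_device *dev;
};

/* Adds a single device; returns 0 on success, -1 on bad input. */
static int demo_add_device_to_domain(const char *domain_name,
				     struct demo_device *dev)
{
	if (!domain_name || !dev)
		return -1;
	dev->domain = domain_name;
	return 0;
}

/* Walks the table once and returns the number of devices added. */
static int demo_add_devices_to_domains(struct demo_pm_domain_device *table,
				       size_t size)
{
	size_t i;
	int added = 0;

	for (i = 0; i < size; i++)
		if (!demo_add_device_to_domain(table[i].domain_name,
					       table[i].dev))
			added++;
	return added;
}
```

A board file would then declare one static table of domain/device pairs and make a single call, instead of repeating the same call with the same trailing argument for every device.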
Rafael J. Wysocki [Tue, 7 Aug 2012 22:29:16 +0000 (00:29 +0200)]
ARM: shmobile: Specify device latencies for Mackerel devices directly
The results of adaptive latency computations in
GENPD_DEV_TIMED_CALLBACK() show that the start/stop and save/restore
state latencies of all devices on the Mackerel board I have tried are
a little below 250 us. Therefore, if 250 us is used as the
common initial value of the latency fields in struct gpd_timing_data
for all devices on Mackerel, the latency values will never have to
change at run time and there won't be any overhead related to
re-computation of the corresponding PM QoS data.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Tue, 7 Aug 2012 22:28:36 +0000 (00:28 +0200)]
ARM: shmobile: Specify device latencies for SH7372 devices directly
The results of adaptive latency computations in
GENPD_DEV_TIMED_CALLBACK() show that the start/stop and save/restore
state latencies of all devices on SH7372 I have tried are a little
below 250 us. Therefore, if 250 us is used as the common initial
value of the latency fields in struct gpd_timing_data for all devices
on SH7372, the latency values will never have to change at run time
and there won't be any overhead related to re-computation of the
corresponding PM QoS data.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Tue, 7 Aug 2012 22:27:52 +0000 (00:27 +0200)]
ARM: shmobile: Allow device latencies to be specified directly
Make it possible to specify device start/stop and save/restore
state latencies directly when adding devices to PM domains. For
this purpose, introduce rmobile_add_device_to_domain_td() whose
third argument is a pointer to a struct gpd_timing_data object
containing device latency data.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Tue, 7 Aug 2012 22:27:10 +0000 (00:27 +0200)]
ARM: shmobile: Set PM domain on/off latencies directly
The results of adaptive latency computations in __pm_genpd_poweron()
and pm_genpd_poweroff() show that the power on/power off latencies
of all power domains in SH7372 are a little below 250 us. Therefore,
if 250 us is used as the common initial value of the latency fields
in struct generic_pm_domain for all domains, the latency values
will never have to change at run time and there won't be any overhead
related to re-computation of the corresponding PM QoS data.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:15:02 +0000 (01:15 +0200)]
ARM: shmobile: Make rmobile_init_pm_domain() static
Since rmobile_init_pm_domain() is not called anywhere outside of
arch/arm/mach-shmobile/pm-rmobile.c any more, it can be made static
and its header may be removed from pm-rmobile.h. Modify the code
accordingly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:14:14 +0000 (01:14 +0200)]
ARM: shmobile: Move r8a7779's PM domain objects to a table
Instead of giving a name to every r8a7779's PM domain object, put
them all into a table and initialize them all together in a loop.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:13:37 +0000 (01:13 +0200)]
ARM: shmobile: Move r8a7740's PM domain objects to a table
Instead of giving a name to every r8a7740's PM domain object, put
them all into a table and use rmobile_init_domains(), introduced by a
previous patch, for initializing them all together. Also, use
pm_genpd_add_subdomain_names() for adding A3SP as a subdomain of A4S.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:12:56 +0000 (01:12 +0200)]
ARM: shmobile: Move sh7372's PM domain objects to a table
Instead of giving a name to every sh7372's PM domain object, put them
all into a table and use rmobile_init_domains(), introduced by a
previous patch, for initializing them all together. Also, use
pm_genpd_add_subdomain_names() for adding subdomains to the PM
domains and pm_genpd_name_poweron() for turning on the A4S domain
when preparing for system suspend.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:54:15 +0000 (20:54 +0200)]
ARM: shmobile: Do not access sh7372 A4S domain internals directly
The sh7372_enter_suspend() routine checks the status field of the
generic PM domain object corresponding to the A4S domain in order to
check if it can turn that domain off when entering system sleep.
However, it shouldn't rely on the specific values of the generic
data structures this way, so make it use its own mechanism to
recognize when it is safe to turn that domain off.
For this purpose, introduce a boolean variable a4s_suspend_ready
that will be set by the A4S' suspend routine and unset by its
resume routine executed by rmobile_pd_power_down() and
__rmobile_pd_power_up(), respectively.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:10:22 +0000 (01:10 +0200)]
ARM: shmobile: Add routine for automatic PM domains initialization
Add a new routine, rmobile_init_domains(), allowing the caller
to initialize all generic PM objects stored in a table in one
operation.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:09:31 +0000 (01:09 +0200)]
ARM: shmobile: Use domain names when adding subdomains to power domains
Make the power management code under arch/arm/mach-shmobile/
use pm_genpd_add_subdomain_names() for adding subdomains to power
domains, which makes it possible to drop
rmobile_pm_add_subdomain() and will allow us to carry out those
operations for domain objects stored in tables in a straightforward
way.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:07:46 +0000 (01:07 +0200)]
ARM: shmobile: Drop r8a7779_add_device_to_domain()
If the r8a7779's PM domains are given names, this SoC and its boards
will be able to use rmobile_add_device_to_domain() for adding devices
to those domains and r8a7779_add_device_to_domain(), which is not
used anywhere at the moment anyway, may be dropped.
Accordingly, give names to the r8a7779's PM domains and drop
r8a7779_add_device_to_domain().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:07:01 +0000 (01:07 +0200)]
ARM: shmobile: Use names of power domains for adding devices to them
Make the power management code under arch/arm/mach-shmobile/ use
names of power domains instead of pointers to domain objects for
adding devices to the domains. This will allow us to put the
domain objects into tables and register them all in one shot
going forward.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:32:59 +0000 (20:32 +0200)]
PM / Domains: Operations related to cpuidle using domain names
Make it possible to use domain names in operations connecting cpuidle
to and disconnecting it from a PM domain. This is useful on
platforms where PM domain objects are organized in such a way that
the names of the domains are easier to use than the addresses of
those objects.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:32:43 +0000 (20:32 +0200)]
PM / Domains: Document cpuidle-related functions and change their names
The names of the cpuidle-related functions in
drivers/base/power/domain.c are inconsistent with the names of the
other exported functions in that file (the "pm_" prefix is missing
from them) and they are missing kerneldoc comments.
Fix that by adding the missing "pm_" prefix to the names of those
functions and add kerneldoc comments documenting them.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:11:14 +0000 (01:11 +0200)]
PM / Domains: Add power-on function using names to identify domains
It sometimes is necessary to turn on a given PM domain when only
the name of it is known and the domain pointer is not readily
available. For this reason, add a new helper function,
pm_genpd_name_poweron(), allowing the caller to turn on a PM domain
using its name for identification. To avoid code duplication,
move the domain lookup code to a separate function.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
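The structure described in the commit above, a shared lookup helper plus a thin power-on wrapper that takes a name, can be sketched as follows. This is a userspace model under stated assumptions: the static array replaces the kernel's list of registered domains, the demo_* names are hypothetical, and -1 stands in for whatever error code the kernel returns for an unknown name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_domain {
	const char *name;
	bool powered;
};

/* Stand-in for the set of registered domains. */
static struct demo_domain demo_domains[] = {
	{ "A4S", false },
	{ "A3SP", false },
};

/* Shared lookup helper, so that name-based operations do not duplicate it. */
static struct demo_domain *demo_lookup_domain(const char *name)
{
	size_t i;

	if (!name)
		return NULL;
	for (i = 0; i < sizeof(demo_domains) / sizeof(demo_domains[0]); i++)
		if (!strcmp(demo_domains[i].name, name))
			return &demo_domains[i];
	return NULL;
}

/* Power on a domain identified by name only. */
static int demo_name_poweron(const char *name)
{
	struct demo_domain *domain = demo_lookup_domain(name);

	if (!domain)
		return -1;	/* unknown domain name */
	domain->powered = true;
	return 0;
}
```

Other name-based helpers (adding a device, adding a subdomain) can reuse the same lookup function in the same way.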
Rafael J. Wysocki [Mon, 6 Aug 2012 23:08:37 +0000 (01:08 +0200)]
PM / Domains: Make it possible to use names when adding subdomains
Add a new helper function, pm_genpd_add_subdomain_names(), allowing
the caller to add a subdomain to a generic PM domain using names for
domain identification (both domains have to be initialized before).
This function is useful for adding subdomains to PM domains whose
representations are stored in tables, when the caller doesn't know
the indices of the domain to add the subdomain to and of the
subdomain itself, but it knows the domains' names.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Mon, 6 Aug 2012 23:06:11 +0000 (01:06 +0200)]
PM / Domains: Make it possible to use domain names when adding devices
Add a new helper function __pm_genpd_name_add_device() allowing
a device to be added to a (registered) generic PM domain identified
by name. Add a wrapper around it, pm_genpd_name_add_device(),
passing NULL as the last argument and reorganize pm_domains.h for the
new functions to be defined consistently with the existing ones.
These functions are useful for adding devices to PM domains whose
representations are stored in tables, when the caller doesn't know
the index of the domain to add the device to, but it knows the
domain's name.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Mon, 13 Aug 2012 12:00:25 +0000 (14:00 +0200)]
PM: Do not use the syscore flag for runtime PM
The syscore device PM flag used to mark the devices (belonging to
PM domains) that should never be turned off, except for the system
core (syscore) suspend/hibernation and resume stages, need not be
accessed by the runtime PM core functions, because all of the devices
it is set for need to be marked as "irq safe" anyway and are
protected from being turned off by runtime PM by ensuring that their
usage counters are always set.
For this reason, make the syscore flag system-wide PM-specific
and simplify the code used for manipulating it, because it need not
acquire the device's power.lock any more.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Mon, 13 Aug 2012 12:00:16 +0000 (14:00 +0200)]
sh: MTU2: Basic runtime PM support
Modify the SH MTU2 clock event device driver to support runtime PM at
a basic level (i.e. device clocks can be disabled and enabled, but
domain power must be on, because the device has to be marked as
"irq safe").
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:48:57 +0000 (01:48 +0200)]
sh: CMT: Basic runtime PM support
Modify the SH CMT clock source/clock event device driver to support
runtime PM at a basic level (i.e. device clocks can be disabled and
enabled, but domain power must be on, because the devices have to
be marked as "irq safe").
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:48:17 +0000 (01:48 +0200)]
sh: TMU: Basic runtime PM support
Modify the SH TMU clock source/clock event device driver to support
runtime PM at a basic level (i.e. device clocks can be disabled and
enabled, but domain power must be on, because the devices have to
be marked as "irq safe").
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:47:29 +0000 (01:47 +0200)]
PM / Domains: Do not measure start time for "irq safe" devices
The genpd_start_dev() routine used by pm_genpd_runtime_resume()
to put "irq safe" devices into the full power state measures the
time necessary to "start" the device and updates its PM QoS timing
data if necessary. This may lead to a deadlock if the given device
is a clock source and genpd_start_dev() is invoked from within the
clock source's .enable() routine, which will happen if that routine
uses pm_runtime_get_sync(), for example, to ensure that the device
is operational.
For this reason, introduce a special routine analogous to
genpd_start_dev(), called genpd_start_dev_no_timing(), that doesn't
carry out the time measurement, and make pm_genpd_runtime_resume()
use it instead of genpd_start_dev() to power up "irq safe" devices.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
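The difference between the two start routines in the commit above can be modelled in userspace as shown below. This is only a sketch: the clock is passed in as a function pointer so that the example is deterministic, and all demo_* names and fields are hypothetical. The point is that the no-timing variant never reads the clock, which is what makes it safe to call from a clock source's own enable path.

```c
#include <stdint.h>

struct demo_dev {
	int started;
	int64_t start_latency_ns;	/* worst start latency seen so far */
	int qos_updates;		/* how often the latency was revised */
};

typedef int64_t (*demo_clock_fn)(void);

/* Deterministic fake clock for the example: advances 100 ns per read. */
static int64_t demo_fake_clock(void)
{
	static int64_t now;

	now += 100;
	return now;
}

/* Starts the device without touching any clock. */
static int demo_start_dev_no_timing(struct demo_dev *dev)
{
	dev->started = 1;
	return 0;
}

/* Starts the device, measures how long it took, and updates the latency. */
static int demo_start_dev(struct demo_dev *dev, demo_clock_fn now)
{
	int64_t t0 = now();
	int ret = demo_start_dev_no_timing(dev);
	int64_t elapsed = now() - t0;

	if (!ret && elapsed > dev->start_latency_ns) {
		dev->start_latency_ns = elapsed;
		dev->qos_updates++;
	}
	return ret;
}
```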
Rafael J. Wysocki [Sun, 5 Aug 2012 23:46:39 +0000 (01:46 +0200)]
PM / Domains: Move syscore flag from subsys data to struct device
The syscore device PM flag is used to mark the devices (belonging to
a PM domain) that should never be turned off, except for the system
core (syscore) suspend/hibernation and resume stages. That flag is
stored in the device's struct pm_subsys_data object whose address is
available from struct device. However, in some situations it may be
convenient to set that flag before the device is added to a PM
domain, so it is better to move it directly to the "power" member of
struct device. Then, it can be checked by the routines in
drivers/base/power/runtime.c and drivers/base/power/main.c, which is
more straightforward.
This also reduces the number of dev_gpd_data() invocations in the
generic PM domains framework, so the overhead related to the syscore
flag is slightly smaller.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:45:54 +0000 (01:45 +0200)]
PM / Domains: Rename the always_on device flag to syscore
The always_on device flag is used to mark the devices (belonging to
a PM domain) that should never be turned off, except for the system
core (syscore) suspend/hibernation and resume stages. Change the name
of that flag to "syscore" to better reflect its purpose.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:45:11 +0000 (01:45 +0200)]
PM / Runtime: Allow helpers to be called by early platform drivers
Runtime PM helper functions, like pm_runtime_get_sync(), cannot be
called by early platform device drivers, because the devices' power
management locks are not initialized at that time. This is quite
inconvenient, so modify early_platform_add_devices() to initialize
the devices' power management locks as appropriate and make sure that
they won't be initialized more than once if an early platform
device is going to be used as a regular one later.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
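The "initialize at most once" requirement from the commit above can be illustrated with a guard flag. This is a minimal sketch, assuming a per-device flag; the structure, the flag name and the counter that stands in for the real lock initialization are all hypothetical.

```c
#include <stdbool.h>

struct demo_dev_pm {
	bool early_init;	/* set once the common PM fields are ready */
	int lock_inits;		/* stands in for initializing the PM lock */
};

/*
 * May be called both from the early platform path and again from the
 * regular device registration path; only the first call does the work.
 */
static void demo_pm_init_common(struct demo_dev_pm *pm)
{
	if (!pm->early_init) {
		pm->lock_inits++;
		pm->early_init = true;
	}
}
```

Calling demo_pm_init_common() twice on the same object therefore leaves lock_inits at 1.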
Rafael J. Wysocki [Sun, 5 Aug 2012 23:44:28 +0000 (01:44 +0200)]
PM: Reorganize device PM initialization
Make the device power management initialization more straightforward
by moving the initialization of common (i.e. used by both runtime PM
and system suspend) fields to a separate routine.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:43:41 +0000 (01:43 +0200)]
sh: MTU2: Introduce clock events suspend/resume routines
Introduce suspend/resume routines for SH MTU2 clock event devices
such that if those devices belong to a PM domain, the generic PM
domains framework will be notified that the given domain may be
turned off (during system suspend) or that it has to be turned on
(during system resume).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:43:03 +0000 (01:43 +0200)]
sh: CMT: Introduce clocksource/clock events suspend/resume routines
Introduce suspend/resume routines for SH CMT clock event devices and
modify the suspend/resume routines for SH CMT clock sources such that
if those devices belong to a PM domain, the generic PM domains
framework will be notified that the given domain may be turned off
(during system suspend) or that it has to be turned on (during system
resume).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:41:20 +0000 (01:41 +0200)]
sh: TMU: Introduce clocksource/clock events suspend/resume routines
Introduce suspend/resume routines for SH TMU clock source and
clock event device such that if those devices belong to a PM domain,
the generic PM domains framework will be notified that the given
domain may be turned off (during system suspend) or that it has to
be turned on (during system resume).
This change allows the A4R domain on SH7372 to be turned off during
system suspend (tested on the Mackerel board) if the TMU clock source
and/or clock event device is in use.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
Rafael J. Wysocki [Sun, 5 Aug 2012 23:40:41 +0000 (01:40 +0200)]
timekeeping: Add suspend and resume of clock event devices
Some clock event devices, for example those that belong to PM domains,
need to be handled in a special way during the timekeeping suspend
and resume (which takes place in the system core, or "syscore",
stages of system power transitions) in analogy with clock sources.
Introduce .suspend() and .resume() callbacks for clock event devices
that will be executed by timekeeping_suspend/_resume(), respectively,
next to the clock sources' .suspend() and .resume() callbacks.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
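The optional-callback scheme from the commit above can be sketched as follows. This is a userspace model, not the kernel implementation: the device structure is reduced to the two new callbacks, the devices are kept in a plain array, and the demo_* names are hypothetical.

```c
#include <stddef.h>

struct demo_clock_event_device {
	const char *name;
	int suspended;
	/* Optional callbacks; devices that need no special care leave them NULL. */
	void (*suspend)(struct demo_clock_event_device *dev);
	void (*resume)(struct demo_clock_event_device *dev);
};

static void demo_timer_suspend(struct demo_clock_event_device *dev)
{
	dev->suspended = 1;
}

static void demo_timer_resume(struct demo_clock_event_device *dev)
{
	dev->suspended = 0;
}

/* Called from the timekeeping suspend path in this model. */
static void demo_clockevents_suspend(struct demo_clock_event_device *devs,
				     size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (devs[i].suspend)
			devs[i].suspend(&devs[i]);
}

/* Called from the timekeeping resume path in this model. */
static void demo_clockevents_resume(struct demo_clock_event_device *devs,
				    size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (devs[i].resume)
			devs[i].resume(&devs[i]);
}
```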
Rafael J. Wysocki [Sun, 5 Aug 2012 23:39:57 +0000 (01:39 +0200)]
PM / Domains: Add power off/on function for system core suspend stage
Introduce function pm_genpd_syscore_switch() and two wrappers around
it, pm_genpd_syscore_poweroff() and pm_genpd_syscore_poweron(),
allowing the callers to let the generic PM domains framework know
that the given device is not necessary any more and its PM domain
can be turned off (the former) or that the given device will be
required immediately, so its PM domain has to be turned on (the
latter) during the system core (syscore) stage of system suspend
(or hibernation) and resume.
These functions will be used for handling devices registered as
clock sources and clock event devices that belong to PM domains.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
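The "one switch function, two wrappers" layout from the commit above can be modelled like this. The bookkeeping is a deliberate simplification and an assumption of the sketch: a domain is powered off once all of its devices have reported that they are no longer needed, and it is powered on again as soon as any device is required. The demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_domain {
	int device_count;	/* devices that belong to the domain */
	int suspended_count;	/* devices that said "not needed any more" */
	bool powered;
};

struct demo_device {
	struct demo_domain *domain;	/* NULL if not in any domain */
};

static void demo_syscore_switch(struct demo_device *dev, bool suspend)
{
	struct demo_domain *domain = dev->domain;

	if (!domain)
		return;

	if (suspend) {
		domain->suspended_count++;
		if (domain->suspended_count == domain->device_count)
			domain->powered = false;
	} else {
		domain->powered = true;
		domain->suspended_count--;
	}
}

/* The device is not necessary any more; its domain may be turned off. */
static inline void demo_syscore_poweroff(struct demo_device *dev)
{
	demo_syscore_switch(dev, true);
}

/* The device is needed immediately; its domain has to be turned on. */
static inline void demo_syscore_poweron(struct demo_device *dev)
{
	demo_syscore_switch(dev, false);
}
```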
Rafael J. Wysocki [Sun, 5 Aug 2012 23:39:16 +0000 (01:39 +0200)]
PM / Domains: Introduce simplified power on routine for system resume
Introduce function pm_genpd_sync_poweron() for restoring domain power
during resume from system suspend and hibernation. It can be much
simpler than pm_genpd_poweron(), because it doesn't have to care
about locking and it can skip many checks done by the latter.
Modify pm_genpd_resume_noirq() and pm_genpd_restore_noirq() to use
the new function.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rafael J. Wysocki [Wed, 15 Aug 2012 18:28:52 +0000 (20:28 +0200)]
PM / cpuidle: Make ladder governor use the "disabled" state flag
For the mechanism introduced by commit cbc9ef0 (PM / Domains: Add
preliminary support for cpuidle, v2) to work with the ladder
governor, that governor should respect the "disabled" state flag
added by that commit. Change the ladder governor accordingly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Carsten Emde [Thu, 19 Jul 2012 20:34:10 +0000 (20:34 +0000)]
Honor state disabling in the cpuidle ladder governor
There are two cpuidle governors, ladder and menu. While the ladder
governor is always available if CONFIG_CPU_IDLE is selected, the
menu governor additionally requires CONFIG_NO_HZ.
A particular C state can be disabled by writing to the sysfs file
/sys/devices/system/cpu/cpuN/cpuidle/stateN/disable, but this mechanism
is only implemented in the menu governor. Thus, in a system where
CONFIG_NO_HZ is not selected, the ladder governor becomes the default and
will always walk through all sleep states - irrespective of whether the
C state was disabled via sysfs or not. The only way to select a specific
C state was to write the related latency to /dev/cpu_dma_latency and
keep the file open as long as this setting was required - not very
practical and not suitable for setting a single core in an SMP system.
With this patch, the ladder governor will promote to the next
C state only if it has not been disabled, and it will demote if the
current C state was disabled.
Note that the patch does not make the setting of the sysfs variable
"disable" coherent, i.e. if one is disabling a light state, then all
deeper states are disabled as well, but the "disable" variable does not
reflect it. Likewise, if one enables a deep state but a lighter state
still is disabled, then this has no effect. A related section has been
added to the documentation.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
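The promote/demote rule described in the commit above can be sketched as a single decision step. This is a simplified model, not the governor's code: latency and residency checks are collapsed into one want_deeper argument, the order of the two checks is an assumption of the sketch, and the demo_* names are hypothetical.

```c
#include <stdbool.h>

struct demo_state {
	bool disabled;	/* mirrors the per-state sysfs "disable" setting */
};

/*
 * One ladder step. "cur" is the current state index, "want_deeper" says
 * whether the usual promotion criteria are met. Returns the next index.
 */
static int demo_ladder_step(const struct demo_state *states, int count,
			    int cur, bool want_deeper)
{
	/* Promote only if the next deeper state has not been disabled. */
	if (want_deeper && cur < count - 1 && !states[cur + 1].disabled)
		return cur + 1;

	/* Demote if the current state has been disabled in the meantime. */
	if (cur > 0 && states[cur].disabled)
		return cur - 1;

	return cur;
}
```

Because the ladder only ever moves one rung at a time, a disabled state also blocks every state deeper than it, which is the non-coherent behaviour the commit message points out.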
Linus Torvalds [Sat, 1 Sep 2012 17:39:58 +0000 (10:39 -0700)]
Linux 3.6-rc4
John Stultz [Fri, 31 Aug 2012 17:30:06 +0000 (13:30 -0400)]
time: Move ktime_t overflow checking into timespec_valid_strict
Andreas Bombe reported that the ktime_t overflow checking added to
timespec_valid in commit 4e8b14526ca7 ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger than KTIME_T to be invalid.
Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.
This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping code's
internal checking to use this stricter function.
Reported-and-tested-by: Andreas Bombe <aeb@debian.org>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
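The split described in the commit above can be modelled in userspace as a basic check plus a stricter one layered on top. This is a sketch under stated assumptions: the demo_timespec type uses 64-bit fields, the DEMO_* limits assume a signed 64-bit nanosecond counter, and none of the names are taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC	1000000000LL
#define DEMO_KTIME_MAX		INT64_MAX	/* assumed 64-bit ns counter */
#define DEMO_KTIME_SEC_MAX	(DEMO_KTIME_MAX / DEMO_NSEC_PER_SEC)

struct demo_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Basic sanity: no negative seconds, nanoseconds within one second. */
static bool demo_timespec_valid(const struct demo_timespec *ts)
{
	if (ts->tv_sec < 0)
		return false;
	if (ts->tv_nsec < 0 || ts->tv_nsec >= DEMO_NSEC_PER_SEC)
		return false;
	return true;
}

/* Strict variant: additionally rejects values that would overflow ktime_t. */
static bool demo_timespec_valid_strict(const struct demo_timespec *ts)
{
	if (!demo_timespec_valid(ts))
		return false;
	if (ts->tv_sec >= DEMO_KTIME_SEC_MAX)
		return false;
	return true;
}
```

A very large timeout thus passes the basic check (and can be clamped by the caller), while code that must never overflow uses the strict variant.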
Linus Torvalds [Sat, 1 Sep 2012 00:02:58 +0000 (17:02 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix KVM_GET_MSR for PV EOI
kvm: Fix nonsense handling of compat ioctl
Linus Torvalds [Sat, 1 Sep 2012 00:02:20 +0000 (17:02 -0700)]
Merge tag 'parisc-fixes' of git://git./linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is a set of two bug fixes. One is the ATOMIC problem which is
now causing a compile failure in certain situations. The other is
mishandling of PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix personality flag check in copy_thread()
[PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Linus Torvalds [Sat, 1 Sep 2012 00:01:31 +0000 (17:01 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of s390 bug fixes for 3.5-rc4"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/32: Don't clobber personality flags on exec
s390/smp: add missing smp_store_status() for !SMP
s390/dasd: fix ioctl return value
s390: Always use "long" for ssize_t to match size_t
Linus Torvalds [Thu, 30 Aug 2012 16:11:33 +0000 (09:11 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A bunch of scattered fixes ati/intel/nouveau, couple of core ones,
nothing too shocking or different."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222S
gma500: Consider CRTC initially active.
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm: Check for invalid cursor flags
drm: Initialize object type when using DRM_MODE() macro
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions
Heiko Carstens [Tue, 28 Aug 2012 08:02:08 +0000 (10:02 +0200)]
s390/32: Don't clobber personality flags on exec
In native 32 bit mode the personality flags were not correctly inherited.
This is the s390 version of 59e4c3a2 "powerpc/32: Don't clobber personality
flags on exec".
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Paul Menzel [Wed, 8 Aug 2012 21:12:19 +0000 (23:12 +0200)]
drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222S
When an ASUS VW222S [1] is connected over VGA, a garbled screen is shown with
vertical stripes in the top half.
In commit bc42aabc [2]
commit bc42aabc6a01b92b0f961d65671564e0e1cd7592
Author: Adam Jackson <ajax@redhat.com>
Date: Wed May 23 16:26:54 2012 -0400
drm/edid/quirks: ViewSonic VA2026w
Adam Jackson added the quirk `EDID_QUIRK_FORCE_REDUCED_BLANKING` which
is also needed for this ASUS monitor.
All log files and output from `xrandr` are included in the referenced
Bugzilla report #17629.
Please note that this monitor only has a VGA (D-Sub) connector [1].
[1] http://www.asus.com/Display/LCD_Monitors/VW222S/
[2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=bc42aabc6a01b92b0f961d65671564e0e1cd7592
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17629
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Cc: <dri-devel@lists.freedesktop.org>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Ian Pilcher <arequipeno@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 30 Aug 2012 00:35:34 +0000 (10:35 +1000)]
Merge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Alex writes:
Highlights:
- fix a gart regression on older IGP chips
- more MSAA fixes
- fix a double free in gpu reset code
- modesetting fixes
- trinity dig encoder fix.
* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
Forest Bond [Mon, 13 Aug 2012 16:31:24 +0000 (16:31 +0000)]
gma500: Consider CRTC initially active.
[this one ideally should make 3.6 - it fixes the very annoying mode setting bug]
This causes the pipe to be forced off prior to initial mode set, which
roughly mirrors the behavior of the i915 driver. It fixes initial mode
setting on my Intel DN2800MT (Cedarview) board. Without it, mode
setting triggers an out-of-range error from the monitor for most modes,
but only on initial configuration (i.e. they can be configured
successfully from userspace after that).
Signed-off-by: Forest Bond <forest.bond@rapidrollout.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Wed, 29 Aug 2012 23:48:26 +0000 (19:48 -0400)]
drm/radeon: fix dig encoder selection on DCE61
Was using the DCE41 code which was wrong. Fixes
blank displays on a number of Trinity systems.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Linus Torvalds [Wed, 29 Aug 2012 18:36:22 +0000 (11:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"I've split out the big send/receive update from my last pull request
and now have just the fixes in my for-linus branch. The send/recv
branch will wander over to linux-next shortly though.
The largest patches in this pull are Josef's patches to fix DIO
locking problems and his patch to fix a crash during balance. They
are both well tested.
The rest are smaller fixes that we've had queued. The last rc came
out while I was hacking new and exciting ways to recover from a
misplaced rm -rf on my dev box, so these missed rc3."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
Btrfs: fix that repair code is spuriously executed for transid failures
Btrfs: fix ordered extent leak when failing to start a transaction
Btrfs: fix a dio write regression
Btrfs: fix deadlock with freeze and sync V2
Btrfs: revert checksum error statistic which can cause a BUG()
Btrfs: remove superblock writing after fatal error
Btrfs: allow delayed refs to be merged
Btrfs: fix enospc problems when deleting a subvol
Btrfs: fix wrong mtime and ctime when creating snapshots
Btrfs: fix race in run_clustered_refs
Btrfs: don't run __tree_mod_log_free_eb on leaves
Btrfs: increase the size of the free space cache
Btrfs: barrier before waitqueue_active
Btrfs: fix deadlock in wait_for_more_refs
btrfs: fix second lock in btrfs_delete_delayed_items()
Btrfs: don't allocate a seperate csums array for direct reads
Btrfs: do not strdup non existent strings
Btrfs: do not use missing devices when showing devname
Btrfs: fix that error value is changed by mistake
Btrfs: lock extents as we map them in DIO
...
Linus Torvalds [Wed, 29 Aug 2012 18:35:00 +0000 (11:35 -0700)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
"This will fix a warning for watchdog-test.c and it will remove a
duplicate include of delay.h"
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: da9052: Remove duplicate inclusion of delay.h
watchdog: fix watchdog-test.c build warning
David Rientjes [Wed, 29 Aug 2012 02:57:21 +0000 (19:57 -0700)]
mm, slab: lock the correct nodelist after reenabling irqs
cache_grow() can reenable irqs so the cpu (and node) can change, so ensure
that we take list_lock on the correct nodelist.
This fixes an issue with commit 072bb0aa5e06 ("mm: sl[au]b: add
knowledge of PFMEMALLOC reserve pages") where list_lock for the wrong
node was taken after growing the cache.
Reported-and-tested-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian König [Wed, 29 Aug 2012 11:24:15 +0000 (13:24 +0200)]
drm/radeon: fix double free in radeon_gpu_reset
radeon_ring_restore is freeing the memory for the saved
ring data. We need to remember that, otherwise we try to
restore the ring data again on the next try. In addition,
it shouldn't retry the reset indefinitely if we have
saved ring data.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Jerome Glisse [Tue, 28 Aug 2012 20:50:22 +0000 (16:50 -0400)]
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
It seems some of those IGPs dislike non-dma32 pages despite what the
documentation says. Fix the regression introduced when we allowed
non-dma32 pages. It seems to affect only some revisions of those IGP
chips; as we don't know which ones, just force dma32 for all of them.
https://bugzilla.redhat.com/show_bug.cgi?id=785375
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 27 Aug 2012 21:48:18 +0000 (17:48 -0400)]
drm/radeon: rework panel mode setup
Adjust the panel mode setup to match the behavior
of the vbios. Rather than checking for specific
bridge chip ids, just check the eDP configuration register.
This saves extra aux transactions and works across
DP bridge chips without requiring additional per chip
id checking.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 24 Aug 2012 22:21:21 +0000 (18:21 -0400)]
drm/radeon/atom: powergating fixes for DCE6
Power gating is per crtc pair, but the powergating registers
should be called individually. The hw handles power up/down
properly. The pair is powered up if either crtc in the pair
is powered up and the pair is not powered down until both
crtcs in the pair are powered down. This simplifies
programming and should save additional power as the previous
code never actually power gated the crtc pair.
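The pairing rule described above can be sketched in a few lines. This is only a model of the stated hardware behaviour, not radeon driver code; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of one power-gated crtc pair. */
struct demo_crtc_pair {
    bool on[2];   /* requested power state of each crtc in the pair */
};

/* Each crtc is programmed individually. */
void demo_set_crtc_power(struct demo_crtc_pair *pair, unsigned int which, bool on)
{
    pair->on[which & 1] = on;
}

/* The pair stays powered while either crtc is up and is gated only
 * once both crtcs are down. */
bool demo_pair_is_powered(const struct demo_crtc_pair *pair)
{
    return pair->on[0] || pair->on[1];
}
```

Under this model the driver never has to track the pair itself; it only reports the state of each crtc.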
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Wed, 22 Aug 2012 13:54:56 +0000 (09:54 -0400)]
drm/radeon/atom: rework DIG modesetting on DCE3+
The ordering is important and the current drm code
wasn't cutting it for modern DIG encoders. We need
to have information about crtc before setting up
the encoders so I've shifted the ordering a bit.
Probably we'll need a full rework akin to danvet's
recent intel patches. This patch fixes numerous
issues with DP bridge chips and makes link training
much more reliable.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Tue, 21 Aug 2012 23:06:21 +0000 (19:06 -0400)]
drm/radeon: don't disable plls that are in use by other crtcs
Some plls are shared for DP.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 24 Aug 2012 12:27:36 +0000 (14:27 +0200)]
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
Checking of the second colorbuffer was skipped on r700, because
CB_TARGET_MASK was 0xf. With r600, CB_TARGET_MASK is changed to 0xff,
so we must set the number of samples of the second colorbuffer to 1 in order
to pass the CS checker.
The DRM version is bumped, because RESOLVE_BOX is always rejected without this
fix on r600.
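A minimal sketch of why the mask value decides whether the second colorbuffer is checked. It assumes CB_TARGET_MASK carries one 4-bit enable field per colorbuffer, so 0xf covers only colorbuffer 0 and 0xff covers colorbuffers 0 and 1. The helper name is hypothetical and is not taken from the radeon CS checker.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if colorbuffer `index` has any channel
 * enabled in `target_mask`, assuming 4 enable bits per colorbuffer. */
bool demo_cb_is_enabled(uint32_t target_mask, unsigned int index)
{
    if (index >= 8)   /* a 32-bit mask holds at most eight 4-bit fields */
        return false;
    return ((target_mask >> (4 * index)) & 0xf) != 0;
}
```

With a mask of 0xf the helper reports colorbuffer 1 as disabled, which matches the message's description of the check being skipped.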
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 22 Aug 2012 15:02:43 +0000 (17:02 +0200)]
drm/radeon: initialize tracked CS state
This should help catch uninitialized registers and reject commands
because of that.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 22 Aug 2012 15:02:42 +0000 (17:02 +0200)]
drm/radeon: fix reading CB_COLORn_MASK from the CS
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sachin Kamat [Tue, 7 Aug 2012 09:44:12 +0000 (15:14 +0530)]
watchdog: da9052: Remove duplicate inclusion of delay.h
delay.h header file was included twice.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Randy Dunlap [Mon, 23 Jul 2012 17:46:11 +0000 (10:46 -0700)]
watchdog: fix watchdog-test.c build warning
Fix compiler warning by making the function static:
Documentation/watchdog/src/watchdog-test.c:34:6: warning: no previous prototype for 'term'
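A sketch of the pattern behind this fix: a file-local signal handler declared static has internal linkage, so GCC's -Wmissing-prototypes check does not ask for a separate prototype. The handler body and the driver function below are placeholders for illustration, not the contents of watchdog-test.c.

```c
#include <signal.h>

static volatile sig_atomic_t demo_got_signal;

/* File-local handler: `static` gives it internal linkage. */
static void term(int sig)
{
    (void)sig;
    demo_got_signal = 1;
}

/* Hypothetical driver: installs the handler, raises SIGINT, and
 * reports whether the handler ran (1), or -1 if signal() failed. */
int demo_raise_sigint(void);

int demo_raise_sigint(void)
{
    demo_got_signal = 0;
    if (signal(SIGINT, term) == SIG_ERR)
        return -1;
    raise(SIGINT);
    return demo_got_signal;
}
```

The non-static driver carries its own prototype, which is the other way to satisfy the same warning.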
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Dave Airlie [Wed, 29 Aug 2012 10:09:23 +0000 (20:09 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Daniel writes:
"Just a few smaller things:
- Fix up a pipe vs. plane confusion from a refactoring, fixes a regression
from 3.1 (Anhua Xu).
- Fix ivb sprite pixel formats (Vijay).
- Fixup ppgtt pde placement for machines where the BIOS artificially limits
the available gtt space in the name of ... product differentiation
(Chris). This fixes an oops.
- Yet another no_lvds quirk entry."
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions
Dave Airlie [Wed, 29 Aug 2012 10:05:40 +0000 (20:05 +1000)]
Merge branch 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
Ben says it's just a single fix to avoid the wrong pcopy units being used.
* 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
Ben Skeggs [Mon, 27 Aug 2012 06:22:49 +0000 (16:22 +1000)]
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
On some Fermi chipsets (NVCE particularly) PCOPY1 doesn't exist. And if
what I've seen on Kepler is true of Fermi too, chipsets of the same type
can have different PCOPY units available.
This should fix a v3.5 regression reported by a number of people affecting
suspend/resume on NVC8/NVCE chipsets.
Cc: stable@vger.kernel.org [3.5]
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Stefan Behrens [Fri, 10 Aug 2012 14:58:21 +0000 (08:58 -0600)]
Btrfs: fix that repair code is spuriously executed for transid failures
If verify_parent_transid() fails for all mirrors, the current code
calls repair_io_failure() anyway which means:
- that the disk block is rewritten without repairing anything and
- that a kernel log message is printed which misleadingly claims
that a read error was corrected.
This is an example:
parent transid verify failed on 615015833600 wanted 110423 found 110424
parent transid verify failed on 615015833600 wanted 110423 found 110424
btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...)
It is wrong to ignore the results from verify_parent_transid() and to
call repair_eb_io_failure() when the verification of the transids failed.
This commit fixes the issue.
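The intended decision can be sketched as follows. This is a simplified model under stated assumptions: each mirror is represented only by the transid found on it, and a repair is worthwhile only when an earlier mirror failed and a later one verified. The function name is hypothetical and does not correspond to a btrfs function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: found[i] is the transid read from mirror i.
 * Returns true only if some mirror failed verification and a later
 * mirror matched `wanted`, so there is good data to repair from. */
bool demo_should_repair(const uint64_t *found, size_t mirrors, uint64_t wanted)
{
    bool saw_failure = false;

    for (size_t i = 0; i < mirrors; i++) {
        if (found[i] == wanted)
            return saw_failure;
        saw_failure = true;
    }
    return false;   /* every mirror failed: nothing to repair from */
}
```

With the values from the log excerpt (wanted 110423, found 110424 on both mirrors) the model returns false, so no rewrite and no "read error corrected" message would follow.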
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Liu Bo [Wed, 22 Aug 2012 03:13:25 +0000 (21:13 -0600)]
Btrfs: fix ordered extent leak when failing to start a transaction
We cannot just return an error before freeing the ordered extent and releasing
reserved space when we fail to start a transaction.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Liu Bo [Thu, 23 Aug 2012 02:10:38 +0000 (20:10 -0600)]
Btrfs: fix a dio write regression
This bug was introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9
(Btrfs: lock extents as we map them in DIO).
In dio write, we should unlock the section which we didn't do IO on in case that
we fall back to buffered write. But we need to not only unlock the section
but also cleanup reserved space for the section.
This bug was found while running xfstests 133; with this patch, 133 no longer complains.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Fri, 24 Aug 2012 18:53:03 +0000 (12:53 -0600)]
Btrfs: fix deadlock with freeze and sync V2
We can deadlock with freeze right now because we unconditionally start a
transaction in our ->sync_fs() call. To fix this just check and see if we
have a running transaction to commit. This saves us from the deadlock
because at this point we'll have the umount sem for the sb so we're safe
from freezes coming in after we've done our check. With this patch the
freeze xfstests no longer deadlocks. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Stefan Behrens [Mon, 27 Aug 2012 14:30:03 +0000 (08:30 -0600)]
Btrfs: revert checksum error statistic which can cause a BUG()
Commit 442a4f6308e694e0fa6025708bd5e4e424bbf51c added btrfs device
statistic counters for detected IO and checksum errors to Linux 3.5.
The statistic part that counts checksum errors in
end_bio_extent_readpage() can cause a BUG() in a subfunction:
"kernel BUG at fs/btrfs/volumes.c:3762!"
That part is reverted with the current patch.
However, the counting of checksum errors in the scrub context remains
active, and the counting of detected IO errors (read, write or flush
errors) in all contexts remains active.
Cc: stable <stable@vger.kernel.org> # 3.5
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Stefan Behrens [Wed, 1 Aug 2012 11:45:52 +0000 (05:45 -0600)]
Btrfs: remove superblock writing after fatal error
With commit acce952b0, btrfs was changed to flag the filesystem with
BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
error happened, such as write I/O errors on all mirrors.
In such situations, on unmount, the superblock is written in
btrfs_error_commit_super(). This is done with the intention to be able
to evaluate the error flag on the next mount. A warning is printed
in this case during the next mount and the log tree is ignored.
The issue is that it is possible that the superblock points to a root
that was not written (due to write I/O errors).
The result is that the filesystem cannot be mounted. btrfsck also does
not start and all the other btrfs-progs tools fail to start as well.
However, mount -o recovery is working well and does the right things
to recover the filesystem (i.e., don't use the log root, clear the
free space cache and use the next mountable root that is stored in the
root backup array).
This patch removes the writing of the superblock when
BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
flag in the mount function.
These lines can be used to reproduce the issue (using /dev/sdm):
SCRATCH_DEV=/dev/sdm
SCRATCH_MNT=/mnt
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
ls -alLF /dev/mapper/foo
mkfs.btrfs /dev/mapper/foo
mount /dev/mapper/foo $SCRATCH_MNT
echo bar > $SCRATCH_MNT/foo
sync
echo 0 25165824 error | dmsetup reload foo
dmsetup resume foo
ls -alF $SCRATCH_MNT
touch $SCRATCH_MNT/1
ls -alF $SCRATCH_MNT
sleep 35
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
dmsetup resume foo
sleep 1
umount $SCRATCH_MNT
btrfsck /dev/mapper/foo
dmsetup remove foo
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Josef Bacik [Tue, 7 Aug 2012 20:00:32 +0000 (16:00 -0400)]
Btrfs: allow delayed refs to be merged
Daniel Blueman reported a bug with fio+balance on a ramdisk setup.
Basically what happens is the balance relocates a tree block which will drop
the implicit refs for all of its children and adds a full backref. Once the
block is relocated we have to add the implicit refs back, so when we cow the
block again we add the implicit refs for its children back. The problem
comes when the original drop ref doesn't get run before we add the implicit
refs back. The delayed ref stuff will specifically prefer ADD operations
over DROP to keep us from freeing up an extent that will have references to
it, so we try to add the implicit ref before it is actually removed and we
panic. This worked fine before because the add would have just canceled the
drop out and we would have been fine. But the backref walking work needs to
be able to freeze the delayed ref stuff in time so we have this ever
increasing sequence number that gets attached to all new delayed ref updates
which makes us not merge refs and we run into this issue.
So to fix this we need to merge delayed refs. So every time we run a
clustered ref we need to try and merge all of its delayed refs. The backref
walking stuff locks the delayed ref head before processing, so if we have it
locked we are safe to merge any refs inside of the sequence number. If
there is no sequence number we can merge all refs. Doing this not only
fixes our bug but keeps the delayed ref code from adding and removing
useless refs and batching together multiple refs into one search instead of
one search per delayed ref, which will really help our commit times. I ran
this with Daniel's test and 276 and I haven't seen any problems. Thanks,
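The effect of merging can be illustrated with a deliberately small model: each delayed ref is reduced to +1 (ADD) or -1 (DROP), and merging collapses a run of them into one net change. The sketch ignores sequence numbers and locking entirely, and its names are hypothetical rather than taken from the btrfs delayed ref code.

```c
#include <stddef.h>

/* Hypothetical encoding: +1 for an ADD, -1 for a DROP. */
enum demo_ref_action { DEMO_REF_DROP = -1, DEMO_REF_ADD = 1 };

/* Collapse a run of delayed refs on one extent into a single net
 * reference count change. */
int demo_merge_refs(const enum demo_ref_action *actions, size_t count)
{
    int net = 0;

    for (size_t i = 0; i < count; i++)
        net += actions[i];
    return net;
}
```

In this model a DROP followed by an ADD merges to zero, so neither operation has to touch the extent tree.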
Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Wed, 8 Aug 2012 16:12:59 +0000 (10:12 -0600)]
Btrfs: fix enospc problems when deleting a subvol
Subvol delete is a special kind of awful where we use the global reserve to
cover the ENOSPC requirements. The problem is once we're done removing
everything we do a btrfs_update_inode(), which by default will try to do the
delayed update stuff which will use its own reserve. There will be no
space in this reserve and we'll return ENOSPC. So instead use
btrfs_update_inode_fallback() which will just fall back to updating the inode
item in the case of enospc. This is fine because the global reserve covers
the space requirements for this. With this patch I can now delete a subvol
on a problem image Dave Sterba sent me. Thanks,
Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Miao Xie [Thu, 9 Aug 2012 03:39:36 +0000 (21:39 -0600)]
Btrfs: fix wrong mtime and ctime when creating snapshots
When we created a new snapshot, the mtime and ctime of its parent directory
were not updated. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Arne Jansen [Thu, 9 Aug 2012 06:16:53 +0000 (00:16 -0600)]
Btrfs: fix race in run_clustered_refs
With commit
commit d1270cd91f308c9d22b2804720c36ccd32dbc35e
Author: Arne Jansen <sensille@gmx.net>
Date: Tue Sep 13 15:16:43 2011 +0200
Btrfs: put back delayed refs that are too new
I added a window where the delayed_ref's head->ref_mod code can diverge
from the sum of the remaining refs, because we release the head->mutex
in the middle. This leads to btrfs_lookup_extent_info returning wrong
numbers. This patch fixes this by adjusting the head's ref_mod with each
delayed ref we run.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Tue, 7 Aug 2012 19:34:49 +0000 (15:34 -0400)]
Btrfs: don't run __tree_mod_log_free_eb on leaves
When we split a leaf, we may end up inserting a new root on top of that
leaf. The reflog code was incorrectly assuming the old root was always
a node. This makes sure we skip over leaves.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Mon, 6 Aug 2012 19:46:38 +0000 (13:46 -0600)]
Btrfs: increase the size of the free space cache
Arne was complaining about the space cache having mismatching generation
numbers when debugging a deadlock. This is because we can run out of space
in our preallocated range for our space cache if you have a pretty
fragmented amount of space in your pinned space. So just increase the
amount of space we preallocate for space cache so we can be sure to have
enough space. This will only really affect data ranges since they're the only
chunks that end up larger than 256MB. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 1 Aug 2012 19:36:24 +0000 (15:36 -0400)]
Btrfs: barrier before waitqueue_active
We need a barrier before calling waitqueue_active otherwise we will miss
wakeups. So in places that do atomic_dec(); then atomic_read() use
atomic_dec_return() which implies a memory barrier (see memory-barriers.txt)
and then add an explicit memory barrier everywhere else that needs one.
Thanks,
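As a userspace analogy only, the difference between a separate decrement and read and a single decrement-and-return can be sketched with C11 atomics. The functions below are hypothetical stand-ins, not the kernel's atomic_t API, and the kernel's barrier rules are those documented in memory-barriers.txt.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for atomic_dec_return(): decrement and return the new
 * value in one sequentially consistent read-modify-write, so no other
 * update can slip in between the decrement and the read. */
int demo_dec_return(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) - 1;
}

/* True when this caller dropped the counter to zero, which is the
 * point at which a waiter would have to be woken. */
bool demo_put_was_last(atomic_int *pending)
{
    return demo_dec_return(pending) == 0;
}
```

Only the caller that observes zero proceeds to check for waiters, and the ordering of that check comes from the same atomic operation.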
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Arne Jansen [Mon, 6 Aug 2012 20:18:51 +0000 (14:18 -0600)]
Btrfs: fix deadlock in wait_for_more_refs
Commit a168650c introduced a waiting mechanism to prevent busy waiting in
btrfs_run_delayed_refs. This can deadlock with btrfs_run_ordered_operations,
where a tree_mod_seq is held while waiting for the io to complete, while
the end_io calls btrfs_run_delayed_refs.
This whole mechanism is unnecessary. If not enough runnable refs are
available to satisfy count, just return as count is more like a guideline
than a strict requirement.
In case we have to run all refs, commit transaction makes sure that no
other threads are working in the transaction anymore, so we just assert
here that no refs are blocked.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Fengguang Wu [Sat, 4 Aug 2012 07:45:02 +0000 (01:45 -0600)]
btrfs: fix second lock in btrfs_delete_delayed_items()
Fix a real bug caught by coccinelle.
fs/btrfs/delayed-inode.c:1013:1-11: second lock on line 1013
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Josef Bacik [Fri, 3 Aug 2012 20:49:19 +0000 (16:49 -0400)]
Btrfs: don't allocate a seperate csums array for direct reads
We've been allocating a big array for csums instead of storing them in the
io_tree like we do for buffered reads because previously we were locking the
entire range, so we didn't have an extent state for each sector of the
range. But now that we do the range locking as we map the buffers we can
limit the mapping length to sectorsize and use the private part of the
io_tree for our csums. This allows us to avoid an extra memory allocation
for direct reads which could incur latency. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Thu, 2 Aug 2012 14:23:59 +0000 (10:23 -0400)]
Btrfs: do not strdup non existent strings
When we close devices we add back empty devices for some reason that escapes
me. In the case of a missing dev we don't allocate an rcu_string for its
name, so check to see if the device has a name and if it doesn't don't
bother strdup()'ing it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Thu, 2 Aug 2012 14:22:20 +0000 (10:22 -0400)]
Btrfs: do not use missing devices when showing devname
If you do the following
mkfs.btrfs /dev/sdb /dev/sdc
rmmod btrfs
dd if=/dev/zero of=/dev/sdb bs=1M count=1
mount -o degraded /dev/sdc /mnt/btrfs-test
the box will panic trying to deref the name for the missing dev since it is
the lower numbered devid. So fix show_devname to not use missing devices.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Stefan Behrens [Wed, 1 Aug 2012 10:28:01 +0000 (04:28 -0600)]
Btrfs: fix that error value is changed by mistake
In iterate_inodes_from_logical() the error result from
extent_from_logical() is patched by mistake. Typically ENOENT is
patched to EINVAL because (-ENOENT & BTRFS_EXTENT_FLAG_TREE_BLOCK)
evaluates to true.
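The mistake is easy to reproduce in isolation. The sketch below assumes the flag is bit 1 (value 2) and that ENOENT is 2, as on Linux; the constant and function names are hypothetical and only model the order of the two checks.

```c
#include <errno.h>

/* Value assumed for illustration; the kernel defines the real flag. */
#define DEMO_EXTENT_FLAG_TREE_BLOCK (1 << 1)

/* Buggy order: the flag test runs on a negative errno value first. */
int demo_check_buggy(int ret)
{
    if (ret & DEMO_EXTENT_FLAG_TREE_BLOCK)
        ret = -EINVAL;
    return ret < 0 ? ret : 0;
}

/* Fixed order: return errors untouched, then look at the flags. */
int demo_check_fixed(int ret)
{
    if (ret < 0)
        return ret;
    if (ret & DEMO_EXTENT_FLAG_TREE_BLOCK)
        return -EINVAL;
    return 0;
}
```

In two's complement -2 has bit 1 set, so the buggy version turns -ENOENT into -EINVAL while the fixed version returns -ENOENT unchanged.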
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Josef Bacik [Tue, 31 Jul 2012 20:28:48 +0000 (16:28 -0400)]
Btrfs: lock extents as we map them in DIO
A deadlock in xfstests 113 was uncovered by commit
d187663ef24cd3d033f0cbf2867e70b36a3a90b8.
This is because we would not return EIOCBQUEUED for short AIO reads, instead
we'd wait for the DIO to complete and then return the amount of data we
transferred, which would allow our stuff to unlock the remaining amount. But
with this change this no longer happens, so if we have a short AIO read (for
example if we try to read past EOF), we could leave the section from EOF to
the end of where we tried to read locked. Fixing this is tricky since there
is no clear way to know exactly how much data DIO truly submitted for IO, so
to make this less hard on ourselves and less cumbersome we need to lock the
extents as we try to map them, and then we unlock any areas we didn't
actually map. This makes us completely safe from deadlocks and reliance on
a particular behavior of the DIO code. This also lays the groundwork for
allowing us to use the normal csum storage method for reads which means we
can remove an allocation. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Dan Carpenter [Mon, 30 Jul 2012 08:10:44 +0000 (02:10 -0600)]
Btrfs: fix some endian bugs handling the root times
"trans->transid" is cpu endian but we want to store the data as little
endian. "item->ctime.nsec" is only 32 bits, not 64.
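A userspace sketch of what storing a value as little endian means, independent of the host byte order. The helpers are hypothetical stand-ins for the effect of the kernel's cpu_to_le64() and cpu_to_le32(); they are not the kernel implementation.

```c
#include <stdint.h>

/* Write a 64-bit value as little-endian bytes on any host. */
void demo_put_le64(uint8_t out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* 32-bit counterpart, the width the message gives for ctime.nsec. */
void demo_put_le32(uint8_t out[4], uint32_t value)
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}
```

Using the 64-bit helper on a 32-bit field would write four bytes too many, which is the second kind of bug the message mentions.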
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Dan Carpenter [Mon, 30 Jul 2012 08:15:15 +0000 (02:15 -0600)]
Btrfs: unlock on error in btrfs_delalloc_reserve_metadata()
We should release this mutex before returning the error code.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Dan Carpenter [Mon, 30 Jul 2012 08:15:43 +0000 (02:15 -0600)]
Btrfs: checking for NULL instead of IS_ERR
add_qgroup_rb() never returns NULL, only error pointers.
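Why a NULL check misses an error pointer can be shown with a userspace re-creation of the error-pointer convention. The helpers below are hypothetical stand-ins modelled on the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); the 4095 limit is assumed to mirror the kernel's MAX_ERRNO.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
bool demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}
```

An error pointer is never NULL, so a caller that only tests for NULL treats the failure as success and dereferences an invalid address.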
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Dan Carpenter [Mon, 30 Jul 2012 08:16:10 +0000 (02:16 -0600)]
Btrfs: fix some error codes in btrfs_qgroup_inherit()
These return zero when they should return a negative error
code.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Stefan Behrens [Thu, 26 Jul 2012 09:40:35 +0000 (03:40 -0600)]
Btrfs: fix a misplaced address operator in a condition
This should obviously not be "if (&flag)" but "if (flag)".
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Heiko Carstens [Mon, 27 Aug 2012 13:18:45 +0000 (15:18 +0200)]
s390/smp: add missing smp_store_status() for !SMP
Fix this compile error:
arch/s390/kernel/machine_kexec.c: In function ‘setup_regs’:
arch/s390/kernel/machine_kexec.c:63:3: error: implicit declaration
of function ‘smp_store_status’ [-Werror=implicit-function-declaration]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 27 Aug 2012 08:59:42 +0000 (10:59 +0200)]
s390/dasd: fix ioctl return value
For unimplemented ioctls the dasd driver should return -ENOTTY.
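The convention can be sketched with a tiny dispatcher. The command numbers and function name are hypothetical and unrelated to the real dasd ioctl table; only the default branch is the point.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { DEMO_CMD_GET_INFO = 1, DEMO_CMD_RESET = 2 };

int demo_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case DEMO_CMD_GET_INFO:
    case DEMO_CMD_RESET:
        return 0;         /* pretend the command succeeded */
    default:
        return -ENOTTY;   /* unimplemented ioctl */
    }
}
```

Callers can then tell "this device does not know the command" apart from a failure of a command it does implement.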
Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael S. Tsirkin [Sun, 26 Aug 2012 15:00:29 +0000 (18:00 +0300)]
KVM: x86: fix KVM_GET_MSR for PV EOI
KVM_GET_MSR was missing support for PV EOI,
which is needed for migration.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Linus Torvalds [Sun, 26 Aug 2012 20:02:51 +0000 (13:02 -0700)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"Fix sensor readings for Asus M5A78L in asus_atk0110 driver."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (asus_atk0110) Add quirk for Asus M5A78L
Alan Cox [Wed, 22 Aug 2012 13:34:11 +0000 (14:34 +0100)]
kvm: Fix nonsense handling of compat ioctl
KVM_SET_SIGNAL_MASK passed a NULL argument leaves the on stack signal
sets uninitialized. It then passes them through to
kvm_vcpu_ioctl_set_sigmask.
We should be passing a NULL in this case not translated garbage.
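A sketch of the intended behaviour, under simplifying assumptions: the compat mask is modelled as a single 32-bit word and the native one as a 64-bit word, and both function names are hypothetical. The real code translates a compat sigset structure; only the NULL handling is illustrated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the native handler: reports whether it
 * was given a mask (1) or NULL (0). */
int demo_set_sigmask(const uint64_t *mask)
{
    return mask != NULL;
}

/* Compat path: translate only when the caller supplied a mask;
 * otherwise forward NULL instead of an uninitialized stack value. */
int demo_compat_set_sigmask(const uint32_t *compat_mask)
{
    uint64_t native_mask;

    if (compat_mask == NULL)
        return demo_set_sigmask(NULL);

    native_mask = *compat_mask;   /* widen the 32-bit compat value */
    return demo_set_sigmask(&native_mask);
}
```

The stack variable is written before it is ever passed on, and the NULL case never reaches the translation step.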
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>