Srivatsa S. Bhat [Fri, 6 Sep 2013 19:53:27 +0000 (01:23 +0530)]
cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
__cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going
offline. But because we destroy the kobject towards the end of the CPU offline
phase, there are certain race windows where a task can try to write to a
cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking
that CPU offline, and this can bump up the kobject refcount, which in turn might
hinder the CPU offline task from running to completion. (It can also cause
other more serious problems such as trying to acquire a destroyed timer-mutex
etc., depending on the exact stage of the cleanup at which the task managed to
take a new refcount).
To fix the race window, we will need to synchronize those store_*() call-sites
with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that
in turn can cause a total deadlock because it can end up waiting for the
CPU offline task to complete, with incremented refcount!
Write to sysfs                        CPU offline task
--------------                        ----------------
kobj_refcnt++
                                      Acquire cpu_hotplug.lock
get_online_cpus();
                                      Wait for kobj_refcnt to drop to zero

                      **DEADLOCK**
A simple way to avoid this problem is to perform the kobject cleanup in the
CPU offline path, with the cpu_hotplug.lock *released*. That is, we can
perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup
in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock
released. Doing this helps us avoid deadlocks due to holding kobject refcounts
and waiting on each other on the cpu_hotplug.lock.
(Note: We can't move all of the cpufreq CPU offline steps to the
CPU_POST_DEAD stage, because certain things such as stopping the governors
have to be done before the outgoing CPU is marked offline. So retain those
parts in the CPU_DOWN_PREPARE stage itself).
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
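The two-stage teardown described in the commit above can be modeled in user space. This is a minimal sketch, not the kernel code: `struct offline_model`, `model_cpu_down()` and the `EV_*` events are hypothetical stand-ins for the hotplug notifier stages, and a boolean stands in for cpu_hotplug.lock. It only illustrates the ordering: governor-related work happens while the lock is held, and the kobject cleanup runs after the lock has been dropped.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CPU_DOWN_PREPARE and CPU_POST_DEAD stages. */
enum model_event { EV_DOWN_PREPARE, EV_POST_DEAD };

struct offline_model {
	bool hotplug_lock_held;   /* models cpu_hotplug.lock */
	bool governor_stopped;    /* work that must precede marking the CPU offline */
	bool kobject_cleaned;     /* wait-for-refcount plus kobject cleanup */
	bool cleaned_under_lock;  /* set if the cleanup ran with the lock held */
};

/* First half: everything except the kobject cleanup. */
static void model_remove_dev_prepare(struct offline_model *m)
{
	m->governor_stopped = true;
}

/* Second half: the kobject cleanup, which may wait on sysfs writers. */
static void model_remove_dev_finish(struct offline_model *m)
{
	if (m->hotplug_lock_held)
		m->cleaned_under_lock = true;
	m->kobject_cleaned = true;
}

static void model_cpu_callback(struct offline_model *m, enum model_event ev)
{
	switch (ev) {
	case EV_DOWN_PREPARE:
		model_remove_dev_prepare(m);
		break;
	case EV_POST_DEAD:
		model_remove_dev_finish(m);
		break;
	}
}

/* Models one CPU offline operation: the late stage runs with the lock released. */
static void model_cpu_down(struct offline_model *m)
{
	m->hotplug_lock_held = true;
	model_cpu_callback(m, EV_DOWN_PREPARE);
	m->hotplug_lock_held = false;
	model_cpu_callback(m, EV_POST_DEAD);
}
```

Because the cleanup never runs with the modeled lock held, a sysfs writer that holds a kobject reference and then calls get_online_cpus() can make progress and drop its reference.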
Srivatsa S. Bhat [Fri, 6 Sep 2013 19:53:09 +0000 (01:23 +0530)]
cpufreq: Split __cpufreq_remove_dev() into two parts
During CPU offline, the cpufreq core invokes __cpufreq_remove_dev()
to perform work such as stopping the cpufreq governor, clearing the
CPU from the policy structure etc, and finally cleaning up the
kobject.
There are certain subtle issues related to the kobject cleanup, and
it would be much easier to deal with them if we separate that part
from the rest of the cleanup-work in the CPU offline phase. So split
the __cpufreq_remove_dev() function into 2 parts: one that handles
the kobject cleanup, and the other that handles the rest of the work.
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Andreas Schwab [Sat, 7 Sep 2013 16:35:08 +0000 (18:35 +0200)]
cpufreq: Fix wrong time unit conversion
The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.
This is what is displayed in the right column of the following file:
cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 19835820
2300000 3172
[...]
Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().
Although it accidentally works on traditional tick based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preferred by the architecture.
For example we get a buggy zero delta on full dyntick configurations:
cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 0
2300000 0
[...]
Fix this by using the proper jiffies to clock_t conversion, jiffies_64_to_clock_t().
Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
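The unit mismatch in the commit above can be illustrated with a small user-space sketch. The tick rates below (`MODEL_HZ`, `MODEL_USER_HZ`) and both helper names are assumptions chosen for illustration; they are not the kernel's implementation. The second helper shows roughly why a jiffies count collapses toward zero when it is converted as if it were a nanosecond-granularity cputime value.

```c
#include <stdint.h>

#define MODEL_HZ            1000u        /* assumed kernel tick rate */
#define MODEL_USER_HZ        100u        /* assumed clock_t rate seen by user space */
#define MODEL_NSEC_PER_SEC  1000000000u

/* Jiffies to clock_t; assumes MODEL_HZ is a multiple of MODEL_USER_HZ. */
static uint64_t model_jiffies_to_clock_t(uint64_t jiffies)
{
	return jiffies / (MODEL_HZ / MODEL_USER_HZ);
}

/* The same count misread as nanoseconds and then converted to clock_t. */
static uint64_t model_nsecs_to_clock_t(uint64_t nsecs)
{
	return nsecs / (MODEL_NSEC_PER_SEC / MODEL_USER_HZ);
}
```

With these assumed rates, 5000 jiffies (five seconds) convert to 500 clock ticks, while the same number treated as nanoseconds rounds down to 0.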
Viresh Kumar [Sat, 31 Aug 2013 12:18:23 +0000 (17:48 +0530)]
cpufreq: serialize calls to __cpufreq_governor()
We can't take a big lock around __cpufreq_governor() as this causes
recursive locking for some cases. But calls to this routine must be
serialized for every policy. Otherwise we can see some unpredictable
events.
For example, consider the following scenario:
__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
  policy->governor->governor(policy, CPUFREQ_GOV_STOP);
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     mutex_destroy(&cpu_cdbs->timer_mutex)
     cpu_cdbs->cur_policy = NULL;
 <PREEMPT>
store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
   policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
    case CPUFREQ_GOV_LIMITS:
     mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
     if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL
And so store() will eventually result in a crash if cur_policy is
NULL at this point.
Introduce an additional variable which would guarantee serialization
here.
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Sat, 31 Aug 2013 12:23:40 +0000 (17:53 +0530)]
cpufreq: don't allow governor limits to be changed when it is disabled
__cpufreq_governor() returns -EBUSY when the governor is already
stopped and we try to stop it again. While the governor is stopped,
CPUFREQ_GOV_LIMITS events must be rejected as well.
This patch adds this check in __cpufreq_governor().
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
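The state check described in the two commits above can be sketched as a small user-space model. The type and function names (`struct model_policy`, `model_governor_event()`, `MODEL_GOV_*`) are hypothetical, and the rejection of a second START is an assumption added for symmetry. In the kernel the flag would be read and updated under a lock; this single-threaded sketch leaves the lock out and only shows which events are refused.

```c
#include <errno.h>
#include <stdbool.h>

enum model_gov_event { MODEL_GOV_START, MODEL_GOV_STOP, MODEL_GOV_LIMITS };

struct model_policy {
	bool governor_enabled;  /* the additional variable tracking governor state */
};

/* Returns 0 if the event may be forwarded to the governor, -EBUSY otherwise. */
static int model_governor_event(struct model_policy *p, enum model_gov_event ev)
{
	/* A stopped governor must see neither a second STOP nor a LIMITS event. */
	if (!p->governor_enabled &&
	    (ev == MODEL_GOV_STOP || ev == MODEL_GOV_LIMITS))
		return -EBUSY;

	/* Assumption: starting an already running governor is refused too. */
	if (p->governor_enabled && ev == MODEL_GOV_START)
		return -EBUSY;

	if (ev == MODEL_GOV_START)
		p->governor_enabled = true;
	else if (ev == MODEL_GOV_STOP)
		p->governor_enabled = false;

	return 0;
}
```

In this model a store() that races with governor teardown gets -EBUSY back instead of reaching governor code that would touch a destroyed mutex or a NULL cur_policy.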
Stephen Boyd [Wed, 28 Aug 2013 21:24:45 +0000 (14:24 -0700)]
cpufreq: Don't use smp_processor_id() in preemptible context
Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's use raw_smp_processor_id() here to silence
the warning.
BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G W 3.10.0 #30
Workqueue: events od_dbs_timer
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c03885a4>] (debug_smp_processor_id+0xbc/0xf0)
[<c03885a4>] (debug_smp_processor_id+0xbc/0xf0) from [<c0635864>] (gov_queue_work+0x28/0xb0)
[<c0635864>] (gov_queue_work+0x28/0xb0) from [<c0635618>] (od_dbs_timer+0x108/0x134)
[<c0635618>] (od_dbs_timer+0x108/0x134) from [<c01aa8f8>] (process_one_work+0x25c/0x444)
[<c01aa8f8>] (process_one_work+0x25c/0x444) from [<c01aaf88>] (worker_thread+0x200/0x344)
[<c01aaf88>] (worker_thread+0x200/0x344) from [<c01b03bc>] (kthread+0xa0/0xb0)
[<c01b03bc>] (kthread+0xa0/0xb0) from [<c01061b8>] (ret_from_fork+0x14/0x3c)
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stratos Karafotis [Mon, 26 Aug 2013 18:42:21 +0000 (21:42 +0300)]
cpufreq: governor: Fix typos in comments
- 'Governer' should be 'Governor'.
- 'S' is used for Siemens (electrical conductance) in SI units,
so use small 's' for seconds.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stratos Karafotis [Mon, 26 Aug 2013 18:37:28 +0000 (21:37 +0300)]
cpufreq: governors: Remove duplicate check of target freq in supported range
Function __cpufreq_driver_target() checks if target_freq is within
policy->min and policy->max range. generic_powersave_bias_target() also
checks if target_freq is valid via a cpufreq_frequency_table_target()
call. So, drop the unnecessary duplicate check in *_check_cpu().
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
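A small user-space sketch shows why the caller-side check in the commit above is redundant. The names are hypothetical, and the model assumes that the range check in the target function clamps the request to the policy limits (the commit text only says the range is checked). Under that assumption, the caller with its own pre-check and the caller without it always request the same frequency.

```c
struct model_limits {
	unsigned int min;  /* models policy->min */
	unsigned int max;  /* models policy->max */
};

/* Models the range check performed by the target function (assumed to clamp). */
static unsigned int model_driver_target(const struct model_limits *l,
					unsigned int target_freq)
{
	if (target_freq > l->max)
		target_freq = l->max;
	if (target_freq < l->min)
		target_freq = l->min;
	return target_freq;
}

/* Caller that still carries the duplicate lower-bound check. */
static unsigned int model_check_cpu_old(const struct model_limits *l,
					unsigned int freq_next)
{
	if (freq_next < l->min)
		freq_next = l->min;
	return model_driver_target(l, freq_next);
}

/* Caller after the duplicate check has been dropped. */
static unsigned int model_check_cpu_new(const struct model_limits *l,
					unsigned int freq_next)
{
	return model_driver_target(l, freq_next);
}
```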
Stephen Boyd [Tue, 27 Aug 2013 18:47:29 +0000 (11:47 -0700)]
cpufreq: Fix timer/workqueue corruption due to double queueing
When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancel a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs' works to run. For example:
CPU0                                 CPU1
----                                 ----
cpu_down()
 ...
 __cpufreq_remove_dev()
  cpufreq_governor_dbs()
   case CPUFREQ_GOV_STOP:
    gov_cancel_work(dbs_data, policy);
     cpu0 work is canceled
      timer is canceled
     cpu1 work is canceled           <work runs>
     <waits for cpu1>                od_dbs_timer()
                                      gov_queue_work(*, *, true);
                                       cpu0 work queued
                                       cpu1 work queued
                                       cpu2 work queued
                                       ...
     cpu1 work is canceled
     cpu2 work is canceled
     ...
At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:
WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G W 3.10.0 #19
[<c010c178>] (unwind_backtrace+0x0/0x11c) from [<c0109dec>] (show_stack+0x10/0x14)
[<c0109dec>] (show_stack+0x10/0x14) from [<c01904cc>] (warn_slowpath_common+0x4c/0x6c)
[<c01904cc>] (warn_slowpath_common+0x4c/0x6c) from [<c019056c>] (warn_slowpath_fmt+0x2c/0x3c)
[<c019056c>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0388a7c>] (debug_print_object+0x94/0xbc)
[<c0388a7c>] (debug_print_object+0x94/0xbc) from [<c0388e34>] (__debug_object_init+0x2d0/0x340)
[<c0388e34>] (__debug_object_init+0x2d0/0x340) from [<c019e3b0>] (init_timer_key+0x14/0xb0)
[<c019e3b0>] (init_timer_key+0x14/0xb0) from [<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8)
[<c0635f78>] (cpufreq_governor_dbs+0x3e8/0x5f8) from [<c06325a0>] (__cpufreq_governor+0xdc/0x1a4)
[<c06325a0>] (__cpufreq_governor+0xdc/0x1a4) from [<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
[<c0633704>] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [<c08989f4>] (cpufreq_cpu_callback+0x60/0x80)
[<c08989f4>] (cpufreq_cpu_callback+0x60/0x80) from [<c08a43c0>] (notifier_call_chain+0x38/0x68)
[<c08a43c0>] (notifier_call_chain+0x38/0x68) from [<c01938e0>] (__cpu_notify+0x28/0x40)
[<c01938e0>] (__cpu_notify+0x28/0x40) from [<c0892ad4>] (_cpu_down+0x7c/0x2c0)
[<c0892ad4>] (_cpu_down+0x7c/0x2c0) from [<c0892d3c>] (cpu_down+0x24/0x40)
[<c0892d3c>] (cpu_down+0x24/0x40) from [<c0893ea8>] (store_online+0x2c/0x74)
[<c0893ea8>] (store_online+0x2c/0x74) from [<c04519d8>] (dev_attr_store+0x18/0x24)
[<c04519d8>] (dev_attr_store+0x18/0x24) from [<c02a69d4>] (sysfs_write_file+0x100/0x148)
[<c02a69d4>] (sysfs_write_file+0x100/0x148) from [<c0255c18>] (vfs_write+0xcc/0x174)
[<c0255c18>] (vfs_write+0xcc/0x174) from [<c0255f70>] (SyS_write+0x38/0x64)
[<c0255f70>] (SyS_write+0x38/0x64) from [<c0106120>] (ret_fast_syscall+0x0/0x30)
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Tue, 27 Aug 2013 00:37:54 +0000 (02:37 +0200)]
Merge branch 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux into pm-cpufreq
Pull cpufreq fixes for v3.12 from Viresh Kumar.
* 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux:
cpufreq: imx6q: Fix clock enable balance
cpufreq: tegra: fix the wrong clock name
Sascha Hauer [Mon, 26 Aug 2013 11:48:36 +0000 (13:48 +0200)]
cpufreq: imx6q: Fix clock enable balance
For changing the cpu frequency the i.MX6q has to be switched to some
intermediate clock during the PLL reprogramming. The driver tries
to be clever to keep the enable count correct but gets it wrong. If
the cpufreq is increased it calls clk_disable_unprepare twice
on pll2_pfd2_396m. This puts all other devices which get their clock
from pll2_pfd2_396m into a nonworking state.
Fix this by removing the clk enabling/disabling altogether since the
clk core will do this automatically during a reparent.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Joseph Lo [Fri, 23 Aug 2013 01:43:58 +0000 (09:43 +0800)]
cpufreq: tegra: fix the wrong clock name
The "cpu" and "pclk_p_cclk" clock names were virtual clock names used in
the legacy Tegra clock framework. They are no longer used after the
conversion to CCF. Replace them with the clock names that are actually in use.
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Rafael J. Wysocki [Thu, 22 Aug 2013 22:57:19 +0000 (00:57 +0200)]
Merge branch 'cpu_of_node' of git://linux-arm.org/linux-skn into pm-cpufreq-next
Pull DT/core/cpufreq cpu_ofnode updates for v3.12 from Sudeep KarkadaNagesha.
* 'cpu_of_node' of git://linux-arm.org/linux-skn:
cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
cpufreq: arm_big_little: remove device tree parsing for cpu nodes
cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
ARM: mvebu: remove device tree parsing for cpu nodes
ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
of/device: add helper to get cpu device node from logical cpu index
driver/core: cpu: initialize of_node in cpu's device struture
ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
of: move of_get_cpu_node implementation to DT core library
powerpc: refactor of_get_cpu_node to support other architectures
openrisc: remove undefined of_get_cpu_node declaration
microblaze: remove undefined of_get_cpu_node declaration
Rafael J. Wysocki [Thu, 22 Aug 2013 22:55:13 +0000 (00:55 +0200)]
Merge back earlier 'pm-cpufreq' material.
Sudeep KarkadaNagesha [Wed, 17 Jul 2013 12:52:17 +0000 (13:52 +0100)]
cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes DT parsing and uses cpu->of_node instead.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Wed, 17 Jul 2013 12:42:56 +0000 (13:42 +0100)]
cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Wed, 17 Jul 2013 11:39:29 +0000 (12:39 +0100)]
cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 14:51:44 +0000 (15:51 +0100)]
cpufreq: arm_big_little: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 14:09:51 +0000 (15:09 +0100)]
cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Cc: Jason Cooper <jason@lakedaemon.net>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 14:09:15 +0000 (15:09 +0100)]
cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Cc: Deepak Sikri <sikrid@qti.qualcomm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 14:07:28 +0000 (15:07 +0100)]
cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Cc: Mark Langsdorf <mark.langsdorf@calxeda.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 14:04:19 +0000 (15:04 +0100)]
cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 13:58:48 +0000 (14:58 +0100)]
cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.
This patch removes all DT parsing and uses cpu->of_node instead.
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 13:51:48 +0000 (14:51 +0100)]
drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
Since the CPU device nodes can be retrieved using arch_of_get_cpu_node,
we can use it to avoid parsing the cpus node, searching for the cpu nodes
and mapping them to logical indices.
This patch removes parsing DT for cpu nodes by using of_get_cpu_node.
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Wed, 3 Jul 2013 15:01:42 +0000 (16:01 +0100)]
ARM: mvebu: remove device tree parsing for cpu nodes
Currently set_secondary_cpus_clock assumes that the CPU logical ordering
and the MPIDR in DT are the same, which is incorrect.
Since the CPU device nodes can be retrieved in the logical ordering
using the DT helper, we can remove the device tree parsing.
This patch removes DT parsing by making use of of_get_cpu_node.
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Jason Cooper <jason@lakedaemon.net>
Acked-by: Gregory Clement <gregory.clement@free-electrons.com>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 13:20:00 +0000 (14:20 +0100)]
ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
Currently the topology code computes cpu capacity and stores it in
the list along with hwid (which is MPIDR) as it parses the CPU nodes
in the device tree. This is required as it needs to be mapped to the
logical CPU later.
Since the CPU device nodes can be retrieved in the logical ordering
using DT/OF helpers, it's possible to store cpu_capacity also in logical
ordering and avoid storing hwid for each entry.
This patch removes hwid by making use of of_get_cpu_node.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Thu, 18 Jul 2013 10:22:04 +0000 (11:22 +0100)]
of/device: add helper to get cpu device node from logical cpu index
Multiple drivers need to get the cpu device node from the cpu logical
index and then access the of_node.
This patch adds a helper function to fetch the device node directly.
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 11:58:45 +0000 (12:58 +0100)]
driver/core: cpu: initialize of_node in cpu's device struture
CPUs are also registered as devices but the of_node in these cpu
devices is not initialized. Currently, drivers that need to access
the cpu device node parse the nodes themselves and initialise the
of_node in the cpu device.
The of_node in all the cpu devices needs to be initialized properly
and in one place. The best place to do this is the CPU subsystem
driver, when registering the cpu devices.
The OF/DT core library now provides of_get_cpu_node to retrieve a cpu
device node for a given logical index by abstracting the architecture
specific details.
This patch uses of_get_cpu_node to assign of_node when registering the
cpu devices.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Mon, 17 Jun 2013 12:11:29 +0000 (13:11 +0100)]
ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
OF/DT core library now provides architecture specific hook to match the
logical cpu index with the corresponding physical identifier. Most of the
cpu DT node parsing and initialisation is contained in devtree.c. So it's
better to define ARM specific arch_match_cpu_phys_id there.
This mainly helps to avoid replication of the code doing CPU node parsing
and physical (MPIDR) to logical mapping.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Thu, 15 Aug 2013 13:01:40 +0000 (14:01 +0100)]
of: move of_get_cpu_node implementation to DT core library
This patch moves the generalized implementation of of_get_cpu_node from
PowerPC to DT core library, thereby adding support for retrieving cpu
node for a given logical cpu index on any architecture.
The CPU subsystem can now use this function to assign of_node in the
cpu device while registering CPUs.
It is recommended to use this helper function only in pre-SMP/early
initialisation stages to retrieve CPU device node pointers in logical
ordering. Once the cpu devices are registered, the node can be retrieved easily
from the cpu device of_node, which avoids unnecessary parsing and matching.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Thu, 15 Aug 2013 12:34:18 +0000 (13:34 +0100)]
powerpc: refactor of_get_cpu_node to support other architectures
Currently, drivers that need to access the cpu device node parse
the device tree themselves. Since the ordering in the DT need
not match the logical cpu ordering, the parsing logic needs to consider
that. However, this has resulted in lots of code duplication and in some
cases even incorrect logic.
It's better to consolidate them by adding support for getting cpu
device node for a given logical cpu index in DT core library. However
logical to physical index mapping can be architecture specific.
PowerPC has its own implementation to get the cpu node for a given
logical index.
This patch refactors the current implementation of of_get_cpu_node.
This is in preparation to move the implementation to DT core library.
It separates out the logical to physical mapping so that a default
matching of the physical id to the logical cpu index can be added
when moved to common code. Architecture specific code can override it.
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Grant Likely <grant.likely@linaro.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
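The separation of node lookup from the logical-to-physical match, as described in the commit above, can be sketched in user space. Everything here is a hypothetical model: the function names, the `model_logical_map` table and its values are invented for illustration, and device tree nodes are reduced to an array of physical ids. The lookup takes the matcher as a parameter, so a default identity match can be swapped for an architecture-specific one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef bool (*model_match_fn)(int cpu, uint64_t phys_id);

/* Default: the physical id is assumed to equal the logical cpu index. */
static bool model_default_match(int cpu, uint64_t phys_id)
{
	return cpu >= 0 && phys_id == (uint64_t)cpu;
}

/* Hypothetical architecture override: logical cpu to physical id table. */
static const uint64_t model_logical_map[] = { 0x100, 0x101, 0x0, 0x1 };
#define MODEL_NR_CPUS (sizeof(model_logical_map) / sizeof(model_logical_map[0]))

static bool model_arch_match(int cpu, uint64_t phys_id)
{
	if (cpu < 0 || (size_t)cpu >= MODEL_NR_CPUS)
		return false;
	return model_logical_map[cpu] == phys_id;
}

/* Returns the index of the node matching the logical cpu, or -1 if none does. */
static int model_get_cpu_node(const uint64_t *node_ids, size_t n, int cpu,
			      model_match_fn match)
{
	for (size_t i = 0; i < n; i++)
		if (match(cpu, node_ids[i]))
			return (int)i;
	return -1;
}
```

With nodes listed in the order 0x0, 0x1, 0x100, 0x101, the default matcher resolves logical cpu 0 to the first node, while the override resolves it to the third. This is the kind of ordering mismatch that per-driver parsing could get wrong.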
Sudeep KarkadaNagesha [Thu, 15 Aug 2013 09:19:51 +0000 (10:19 +0100)]
openrisc: remove undefined of_get_cpu_node declaration
This patch removes the declaration of the function 'of_get_cpu_node'
which is not defined for openrisc. This is in preparation to move
its definition from PPC to DT common code.
The declaration was probably there only because the code was originally copied from powerpc.
Acked-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Sudeep KarkadaNagesha [Thu, 15 Aug 2013 09:07:43 +0000 (10:07 +0100)]
microblaze: remove undefined of_get_cpu_node declaration
This patch removes the declaration of the function 'of_get_cpu_node'
which is not defined for microblaze. This is in preparation to move
its definition from PPC to DT common code.
Michal Simek says: "it was just there because Microblaze
was based on powerpc code"
Acked-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Li Zhong [Tue, 20 Aug 2013 23:31:08 +0000 (01:31 +0200)]
cpufreq: fix bad unlock balance on !CONFIG_SMP
This patch tries to fix the lockdep complaint attached below.
It seems that we should always read acquire the cpufreq_rwsem,
whether CONFIG_SMP is enabled or not. And CONFIG_HOTPLUG_CPU
depends on CONFIG_SMP, so it seems we don't need CONFIG_SMP for the
code enabled by CONFIG_HOTPLUG_CPU.
[ 0.504191] =====================================
[ 0.504627] [ BUG: bad unlock balance detected! ]
[ 0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted
[ 0.504627] -------------------------------------
[ 0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at:
[ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] but there are no more locks to release!
[ 0.504627]
[ 0.504627] other info that might help us debug this:
[ 0.504627] 1 lock held by swapper/1:
[ 0.504627] #0: (subsys mutex#4){+.+.+.}, at: [<ffffffff8134a7bf>] subsys_interface_register+0x4f/0xe0
[ 0.504627]
[ 0.504627] stack backtrace:
[ 0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1
[ 0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 0.504627] ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8
[ 0.504627] ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a
[ 0.504627] 00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006
[ 0.504627] Call Trace:
[ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [<ffffffff814c062b>] dump_stack+0x19/0x1b
[ 0.504627] [<ffffffff81098bce>] print_unlock_imbalance_bug+0xfe/0x110
[ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [<ffffffff8109d0be>] lock_release_non_nested+0x1ee/0x310
[ 0.504627] [<ffffffff81099d0e>] ? mark_held_locks+0xae/0x120
[ 0.504627] [<ffffffff811510cb>] ? kfree+0xcb/0x1d0
[ 0.504627] [<ffffffff813d77ea>] ? cpufreq_policy_free+0x4a/0x60
[ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [<ffffffff8109d2a4>] lock_release+0xc4/0x250
[ 0.504627] [<ffffffff8106c9f3>] up_read+0x23/0x40
[ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [<ffffffff8134a809>] subsys_interface_register+0x99/0xe0
[ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12
[ 0.504627] [<ffffffff813d7f0d>] cpufreq_register_driver+0x9d/0x1d0
[ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12
[ 0.504627] [<ffffffff81b1a039>] acpi_cpufreq_init+0xfe/0x1f8
[ 0.504627] [<ffffffff810002ba>] do_one_initcall+0xda/0x180
[ 0.504627] [<ffffffff81ae301e>] kernel_init_freeable+0x12c/0x1bb
[ 0.504627] [<ffffffff81ae2841>] ? do_early_param+0x8c/0x8c
[ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140
[ 0.504627] [<ffffffff814b4dde>] kernel_init+0xe/0xf0
[ 0.504627] [<ffffffff814d029a>] ret_from_fork+0x7a/0xb0
[ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-and-tested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
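The imbalance reported in the commit above can be reproduced in a user-space model. This is a hedged sketch only: `struct model_rwsem` is a plain counter rather than a real reader-writer semaphore, the `smp` parameter stands in for a build-time CONFIG_SMP guard, and the exact shape of the old code is an assumption based on the commit text. The old variant takes the read lock only in the SMP case but always releases it; the new variant always pairs the two.

```c
#include <stdbool.h>

struct model_rwsem {
	int readers;     /* number of current read holders */
	bool imbalance;  /* set when a release has no matching acquire */
};

static void model_down_read(struct model_rwsem *s)
{
	s->readers++;
}

static void model_up_read(struct model_rwsem *s)
{
	if (s->readers == 0)
		s->imbalance = true;  /* what lockdep reports as a bad unlock balance */
	else
		s->readers--;
}

/* Old shape (assumed): acquire only on SMP builds, release unconditionally. */
static void model_add_dev_old(struct model_rwsem *s, bool smp)
{
	if (smp)
		model_down_read(s);
	model_up_read(s);
}

/* New shape: acquire and release are always paired. */
static void model_add_dev_new(struct model_rwsem *s)
{
	model_down_read(s);
	model_up_read(s);
}
```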
Viresh Kumar [Tue, 20 Aug 2013 06:38:26 +0000 (12:08 +0530)]
cpufreq: Use cpufreq_policy_list for iterating over policies
To iterate over all policies we currently iterate over all online
CPUs and then get the policy for each of them which is suboptimal.
Use the newly created cpufreq_policy_list for this purpose instead.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
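The difference between the two iteration strategies in the commit above can be sketched with a small user-space model. The structure and function names are hypothetical, the per-CPU table is a plain array, and any duplicate filtering the kernel code may have done is not modeled. Walking the per-CPU table touches a shared policy once per CPU, whereas walking the policy list touches each policy exactly once.

```c
#include <stddef.h>

struct model_policy_node {
	int cpu;                         /* CPU that manages this policy */
	int visits;                      /* how often the policy was touched */
	struct model_policy_node *next;  /* link in the policy list */
};

/* Old approach: one lookup per online CPU; shared policies are hit repeatedly. */
static int model_visit_per_cpu(struct model_policy_node *const *cpu_data,
			       size_t nr_cpus)
{
	int n = 0;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (!cpu_data[cpu])
			continue;
		cpu_data[cpu]->visits++;
		n++;
	}
	return n;
}

/* New approach: walk the policy list, touching each policy once. */
static int model_visit_list(struct model_policy_node *head)
{
	int n = 0;

	for (struct model_policy_node *p = head; p; p = p->next) {
		p->visits++;
		n++;
	}
	return n;
}
```

For four CPUs sharing two policies, the per-CPU walk performs four visits and the list walk performs two.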
Viresh Kumar [Tue, 20 Aug 2013 06:38:25 +0000 (12:08 +0530)]
cpufreq: remove cpufreq_policy_cpu per-cpu variable
cpufreq_policy_cpu per-cpu variables are used for storing the ID of
the CPU that manages the given CPU's policy. However, we also store
a policy pointer for each cpu in cpufreq_cpu_data, so the
cpufreq_policy_cpu information is simply redundant.
It is better to use cpufreq_cpu_data to retrieve a policy and get
policy->cpu from there, so make that happen everywhere and drop the
cpufreq_policy_cpu per-cpu variables which aren't necessary any more.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 20 Aug 2013 06:38:24 +0000 (12:08 +0530)]
cpufreq: remove unnecessary check in __cpufreq_governor()
We don't need to check if event is CPUFREQ_GOV_POLICY_INIT and put
governor module as we are sure event can only be START/STOP here.
Remove the useless check.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 20 Aug 2013 06:38:23 +0000 (12:08 +0530)]
cpufreq: remove policy from cpufreq_policy_list during suspend
cpufreq_policy_list is a list of active policies. We do remove
policies from this list when all CPUs belonging to that policy are
removed. But during system suspend we don't really free a policy
struct as it will be used again during resume, so we didn't remove
it from cpufreq_policy_list as well.
However, this is incorrect. We are saying this policy isn't valid
anymore and must not be referenced (though we haven't freed it), but
it can still be used by code that iterates over cpufreq_policy_list.
Remove policy from this list during system suspend as well.
Of course, we must add it back whenever the first CPU belonging to
that policy shows up.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 20 Aug 2013 06:38:22 +0000 (12:08 +0530)]
cpufreq: Fix white space in __cpufreq_remove_dev()
Align closing brace '}' of an if block.
[rjw: Subject and changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Sun, 18 Aug 2013 21:36:53 +0000 (14:36 -0700)]
Linux 3.11-rc6
Linus Torvalds [Sun, 18 Aug 2013 15:51:28 +0000 (08:51 -0700)]
Merge branch 'for-3.11-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"This contains one patch to fix the return value of cpuset's cgroups
interface function, which used to always return -ENODEV for the writes
on the 'memory_pressure_enabled' file"
* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: fix the return value of cpuset_write_u64()
Rafael J. Wysocki [Sun, 18 Aug 2013 13:35:59 +0000 (15:35 +0200)]
Revert "cpufreq: Use cpufreq_policy_list for iterating over policies"
Revert commit eb60852 (cpufreq: Use cpufreq_policy_list for iterating
over policies), because it breaks system suspend/resume on multiple
machines.
It either causes resume to block indefinitely or causes the BUG_ON()
in lock_policy_rwsem_##mode() to trigger on sysfs accesses to cpufreq
attributes.
Conflicts:
drivers/cpufreq/cpufreq.c
Linus Torvalds [Sat, 17 Aug 2013 17:43:19 +0000 (10:43 -0700)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull jbd2 bug fixes from Ted Ts'o:
"Two jbd2 bug fixes, one of which is a regression fix"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd2: Fix oops in jbd2_journal_file_inode()
jbd2: Fix use after free after error in jbd2_journal_dirty_metadata()
Guenter Roeck [Sat, 17 Aug 2013 03:50:55 +0000 (20:50 -0700)]
s390: Fix broken build
Fix this build error:
In file included from fs/exec.c:61:0:
arch/s390/include/asm/tlb.h:35:23: error: expected identifier or '(' before 'unsigned'
arch/s390/include/asm/tlb.h:36:1: warning: no semicolon at end of struct or union [enabled by default]
arch/s390/include/asm/tlb.h: In function 'tlb_gather_mmu':
arch/s390/include/asm/tlb.h:57:5: error: 'struct mmu_gather' has no member named 'end'
Broken due to commit 2b047252d0 ("Fix TLB gather virtual address range
invalidation corner cases").
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[ Oh well. We had build testing for ppc and um, but no s390 - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Fri, 16 Aug 2013 23:01:42 +0000 (18:01 -0500)]
MAINTAINERS: Change ownership for SGI specific modules.
I have taken a different job. I am removing myself as maintainer of
GRU. Dimitri will continue to maintain the SGI GRU driver, changing the
XP/XPC/XPNET maintainer to Cliff Whickman, but leaving behind my
personal email address to answer any questions about the design or
operation of the XP family of drivers.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Sat, 17 Aug 2013 01:19:41 +0000 (21:19 -0400)]
jbd2: Fix oops in jbd2_journal_file_inode()
Commit 0713ed0cde76438d05849f1537d3aab46e099475 added a
jbd2_journal_file_inode() call into ext4_block_zero_page_range().
However, that function gets called from the truncate path, so the inode
need not have a jinode attached: that happens in ext4_file_open(), but
the file need not have been opened since mount. Calling
jbd2_journal_file_inode() without a jinode attached results in the oops.
We fix the problem by attaching jinode to inode also in ext4_truncate()
and ext4_punch_hole() when we are going to zero out partial blocks.
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Linus Torvalds [Fri, 16 Aug 2013 23:52:29 +0000 (16:52 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"The usual collection of random fixes. Also some further fixes to the
last set of security fixes, and some more from Will (which you may
already have in a slightly different form)"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7807/1: kexec: validate CPU hotplug support
ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lock
ARM: 7811/1: locks: use early clobber in arch_spin_trylock
ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()
ARM: 7809/1: perf: fix event validation for software group leaders
ARM: Fix FIQ code on VIVT CPUs
ARM: Fix !kuser helpers case
ARM: Fix the world famous typo with is_gate_vma()
Linus Torvalds [Fri, 16 Aug 2013 23:49:06 +0000 (16:49 -0700)]
Merge branch 'for-3.11' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
"These are two critical fixes, needed by distro kernels, and thus also
destined for stable:
- The do_div() commit fixes a crash in mounting btrfs volumes, which
was a regression from 3.2,
- The ARAnyM fix allows NatFeat drivers to be loadable modules,
which is needed for initrds"
* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Truncate base in do_div()
m68k/atari: ARAnyM - Fix NatFeat module support
Linus Torvalds [Fri, 16 Aug 2013 17:00:18 +0000 (10:00 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull clock controller fixes from Michael Turquette:
"Two small fixes for the Zynq clock controller introduced in 3.11-rc1
and another Exynos clock patch which fixes a regression that prevents
the video pipeline from functioning on that platform"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks
clk/zynq/clkc: Add CLK_SET_RATE_PARENT flag to ethernet muxes
clk/zynq/clkc: Add dedicated spinlock for the SWDT
Linus Torvalds [Fri, 16 Aug 2013 16:59:00 +0000 (09:59 -0700)]
Merge tag 'pm-3.11-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"The removal of delayed_work_pending() checks from kernel/power/qos.c
done in 3.9 introduced a deadlock in pm_qos_work_fn().
Fix from Stephen Boyd"
* tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
Linus Torvalds [Fri, 16 Aug 2013 16:58:21 +0000 (09:58 -0700)]
Merge tag 'sound-3.11' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This batch contains a few USB audio fixes, a couple of HD-audio
quirks, various small ASoC driver fixes in addition to an ASoC core
fix that may lead to memory corruption.
Unfortunately slightly more volume than the previous pull request, but
all are reasonable regression fixes"
* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add a fixup for Gateway LT27
ASoC: tegra: fix Tegra30 I2S capture parameter setup
ALSA: usb-audio: Fix invalid volume resolution for Logitech HD Webcam C525
ALSA: hda - Fix missing mute controls for CX5051
ALSA: usb-audio: fix automatic Roland/Yamaha MIDI detection
ALSA: 6fire: make buffers DMA-able (midi)
ALSA: 6fire: make buffers DMA-able (pcm)
ALSA: hda - Add pinfix for LG LW25 laptop
ASoC: cs42l52: Add new TLV for Beep Volume
ASoC: cs42l52: Reorder Min/Max and update to SX_TLV for Beep Volume
ASoC: dapm: Fix empty list check in dapm_new_mux()
ASoC: sgtl5000: fix buggy 'Capture Attenuate Switch' control
ASoC: sgtl5000: prevent playback to be muted when terminating concurrent capture
Linus Torvalds [Fri, 16 Aug 2013 16:57:38 +0000 (09:57 -0700)]
Merge tag 'usb-3.11-rc6' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 3.11-rc6 that have accumulated.
Nothing huge, an EHCI fix that solves a much-reported audio USB
problem, some usb-serial driver endian fixes and other minor fixes, a
wireless USB oops fix, and two new quirks"
* tag 'usb-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: keyspan: fix null-deref at disconnect and release
USB: mos7720: fix broken control requests
usb: add two quirky touchscreen
USB: ti_usb_3410_5052: fix big-endian firmware handling
USB: adutux: fix big-endian device-type reporting
USB: usbtmc: fix big-endian probe of Rigol devices
USB: mos7840: fix big-endian probe
USB-Serial: Fix error handling of usb_wwan
wusbcore: fix kernel panic when disconnecting a wireless USB->serial device
USB: EHCI: accept very late isochronous URBs
Linus Torvalds [Fri, 16 Aug 2013 16:35:29 +0000 (09:35 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix SKB leak in 8139cp, from Dave Jones.
2) Fix use of *_PAGES interfaces with mlx5 firmware, from Moshe Lazer.
3) RCU conversion of macvtap introduced two races, fixes by Eric
Dumazet
4) Synchronize statistic flows in bnx2x driver to prevent corruption,
from Dmitry Kravkov
5) Undo optimization in IP tunneling, we were using the inner IP header
in some cases to inherit the IP ID, but that isn't correct in some
circumstances. From Pravin B Shelar
6) Use correct struct size when parsing netlink attributes in
rtnl_bridge_getlink(). From Asbjoern Sloth Toennesen
7) Length verifications in tun_get_user() are bogus, from Weiping Pan
and Dan Carpenter
8) Fix bad merge resolution during 3.11 networking development in
openvswitch, albeit a harmless one which added some unreachable
code. From Jesse Gross
9) Wrong size used in flexible array allocation in openvswitch, from
Pravin B Shelar
10) Clear out firmware capability flags the be2net driver isn't ready to
handle yet, from Sarveshwar Bandi
11) Revert DMA mapping error checking addition to cxgb3 driver, it's
buggy. From Alexey Kardashevskiy
12) Fix regression in packet scheduler rate limiting when working with a
link layer of ATM. From Jesper Dangaard Brouer
13) Fix several errors in TCP Cubic congestion control, in particular
overflow errors in timestamp calculations. From Eric Dumazet and
Van Jacobson
14) In ipv6 routing lookups, we need to backtrack if subtree traversal
doesn't result in a match. From Hannes Frederic Sowa
15) ipgre_header() returns incorrect packet offset. Fix from Timo Teräs
16) Get "low latency" out of the new MIB counter names. From Eliezer
Tamir
17) State check in ndo_dflt_fdb_del() is inverted, from Sridhar
Samudrala
18) Handle TCP Fast Open properly in netfilter conntrack, from Yuchung
Cheng
19) Wrong memcpy length in pcan_usb driver, from Stephane Grosjean
20) Fix deadlock in TIPC, from Wang Weidong and Ding Tianhong
21) call_rcu() call to destroy SCTP transport is done too early and
might result in an oops. From Daniel Borkmann
22) Fix races in genetlink family dumps, from Johannes Berg
23) Flags passed into macvlan by the user need to be validated properly,
from Michael S Tsirkin
24) Fix skge build on 32-bit, from Stephen Hemminger
25) Handle malformed TCP headers properly in xt_TCPMSS, from Pablo Neira
Ayuso
26) Fix handling of stacked vlans in vlan_dev_real_dev(), from Nikolay
Aleksandrov
27) Eliminate MTU calculation overflows in esp{4,6}, from Daniel
Borkmann
28) neigh_parms need to be setup before calling the ->ndo_neigh_setup()
method. From Veaceslav Falico
29) Kill out-of-bounds prefetch in fib_trie, from Eric Dumazet
30) Don't dereference MLD query message if the length isn't valid in the
bridge multicast code, from Linus Lüssing
31) Fix VXLAN IGMP join regression due to an inverted check, from Cong
Wang
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (70 commits)
net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes
tun: signedness bug in tun_get_user()
qlcnic: Fix diagnostic interrupt test for 83xx adapters
qlcnic: Fix beacon state return status handling
qlcnic: Fix set driver version command
net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset
net_sched: restore "linklayer atm" handling
drivers/net/ethernet/via/via-velocity.c: update napi implementation
Revert "cxgb3: Check and handle the dma mapping errors"
be2net: Clear any capability flags that driver is not interested in.
openvswitch: Reset tunnel key between input and output.
openvswitch: Use correct type while allocating flex array.
openvswitch: Fix bad merge resolution.
tun: compare with 0 instead of total_len
rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
ethernet/arc/arc_emac - fix NAPI "work > weight" warning
ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
bnx2x: prevent crash in shutdown flow with CNIC
bnx2x: fix PTE write access error
bnx2x: fix memory leak in VF
...
Linus Torvalds [Thu, 15 Aug 2013 18:42:25 +0000 (11:42 -0700)]
Fix TLB gather virtual address range invalidation corner cases
Ben Tebulin reported:
"Since v3.7.2 on two independent machines a very specific Git
repository fails in 9/10 cases on git-fsck due to an SHA1/memory
failures. This only occurs on a very specific repository and can be
reproduced stably on two independent laptops. Git mailing list ran
out of ideas and for me this looks like some very exotic kernel issue"
and bisected the failure to the backport of commit 53a59fc67f97 ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").
That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.
The real bug is that the TLB gather virtual memory range setup is subtly
buggered. It was introduced in commit 597e1c3580b7 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.
The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB. And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.
Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.
This makes the patch larger, but conceptually much simpler. And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.
Ben verified that this fixes his problem.
Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Fri, 16 Aug 2013 06:17:05 +0000 (08:17 +0200)]
ALSA: hda - Add a fixup for Gateway LT27
Gateway LT27 needs a fixup for the inverted digital mic.
Reported-by: "Nathanael D. Noblet" <nathanael@gnat.ca>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Moshe Lazer [Wed, 14 Aug 2013 14:46:48 +0000 (17:46 +0300)]
net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes
In the previous QUERY_PAGES command version we used one command to get the
required amount of boot, init and post init pages. The new version uses the
op_mod field to specify whether the query is for the required amount of boot,
init or post init pages. In addition the output field size for the required
amount of pages increased from 16 to 32 bits.
In MANAGE_PAGES command the input_num_entries and output_num_entries fields
sizes changed from 16 to 32 bits and the PAS tables offset changed to 0x10.
In the pages request event the num_pages field also changed to 32 bits.
In the HCA-capabilities-layout the size and location of max_qp_mcg field has
been changed to support 24 bits.
This patch isn't compatible with firmware versions < 5; however, it turns out that the
first GA firmware we will publish will not support previous versions so this should be OK.
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 15 Aug 2013 12:52:57 +0000 (15:52 +0300)]
tun: signedness bug in tun_get_user()
The recent fix d9bf5f1309 "tun: compare with 0 instead of total_len" is
not totally correct. Because "len" and "sizeof()" are size_t type, that
means they are never less than zero.
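The pitfall can be illustrated with a minimal user-space sketch. This is not the tun driver code: HDR_LEN and both helper names are hypothetical, chosen only for illustration. Subtracting from a size_t wraps around instead of going negative, so a "< 0" test on the result can never fire; comparing before subtracting avoids the problem.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical header size, for illustration only. */
#define HDR_LEN ((size_t)10)

/* Unsigned subtraction wraps: for len < HDR_LEN the result is a huge
 * positive value, so testing it with "< 0" could never succeed. */
static size_t payload_len_unchecked(size_t len)
{
	return len - HDR_LEN;
}

/* Compare first, subtract second. Returns false when the buffer is
 * too short to hold the header; *out is only written on success. */
static bool payload_len_checked(size_t len, size_t *out)
{
	if (len < HDR_LEN)
		return false;
	*out = len - HDR_LEN;
	return true;
}
```

Calling payload_len_unchecked() with a length smaller than HDR_LEN yields a value larger than the input, which is the wrap-around that a signed-style check would miss.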
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Thu, 15 Aug 2013 12:29:29 +0000 (08:29 -0400)]
qlcnic: Fix diagnostic interrupt test for 83xx adapters
o Do not allow interrupt test when adapter is resetting.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sucheta Chakraborty [Thu, 15 Aug 2013 12:29:28 +0000 (08:29 -0400)]
qlcnic: Fix beacon state return status handling
o Driver was misinterpreting the return status for beacon
state query leading to incorrect interpretation of beacon
state and logging an error message for successful status.
Fixed the driver to properly interpret the return status.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Himanshu Madhani [Thu, 15 Aug 2013 12:29:27 +0000 (08:29 -0400)]
qlcnic: Fix set driver version command
Driver was issuing set driver version command through all
functions in the adapter. Fix the driver to issue set driver
version once per adapter, through function 0.
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 13 Aug 2013 18:45:13 +0000 (11:45 -0700)]
net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset
Commit d8af4dfd8 ("net/tg3: Fix kernel crash") introduced a possible
NULL pointer dereference in tg3 driver when !netdev || !netif_running(netdev)
condition is met and netdev is NULL. Then, the jump to the 'done' label
calls dev_close() with a netdevice that is NULL. Therefore, only call
dev_close() when we have a netdevice, but one that is not running.
[ Add the same checks in tg3_io_slot_reset() per Gavin Shan - by Nithin
Nayak Sujir ]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Takashi Iwai [Thu, 15 Aug 2013 18:43:46 +0000 (20:43 +0200)]
Merge tag 'asoc-v3.11-rc5' of git://git./linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.11
A few driver specific fixes here plus one core fix for a memory
corruption issue in DAPM initialisation which could lead to crashes.
Mark Brown [Thu, 15 Aug 2013 10:37:54 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/tegra' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:53 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/sgtl5000' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:53 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:52 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/cs42l52' into asoc-linus
Stephen Warren [Wed, 14 Aug 2013 20:24:16 +0000 (14:24 -0600)]
ASoC: tegra: fix Tegra30 I2S capture parameter setup
The Tegra30 I2S driver was writing the AHUB interface parameters to the
playback path register rather than the capture path register. This
caused the capture parameters not to be configured at all, so if
capturing using non-HW-default parameters (e.g. 16-bit stereo rather
than 8-bit mono) the audio would be corrupted.
With this fixed, audio capture from an analog microphone works correctly
on the Cardhu board.
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Jesper Dangaard Brouer [Wed, 14 Aug 2013 21:47:11 +0000 (23:47 +0200)]
net_sched: restore "linklayer atm" handling
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "linklayer atm" handling.
tc class add ... htb rate X ceil Y linklayer atm
The linklayer setting is implemented by modifying the rate table
which is sent to the kernel. No direct parameter was
transferred to the kernel indicating the linklayer setting.
The commit 56b765b79 ("htb: improved accuracy at high rates")
removed the use of the rate table system.
To keep compatible with older iproute2 utils, this patch detects
the linklayer by parsing the rate table. It also supports future
versions of iproute2 to send this linklayer parameter to the
kernel directly. This is done by using the __reserved field in
struct tc_ratespec, to convey the chosen linklayer option, but
only using the lower 4 bits of this field.
Linklayer detection is limited to speeds below 100Mbit/s, because
at high rates the rtab gets too inaccurate: several fields contain
the same value, thus resembling what the ATM detection looks for.
Fields even start to contain a "0" time to send, e.g. at
1000Mbit/s sending a 96-byte packet costs "0", so the rtab has
been more broken than we first realized.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 15 Aug 2013 08:41:10 +0000 (01:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/jesse/openvswitch
Jesse Gross says:
====================
Three bug fixes that are fairly small either way but resolve obviously
incorrect code. For net/3.11.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Wed, 14 Aug 2013 14:26:53 +0000 (16:26 +0200)]
drivers/net/ethernet/via/via-velocity.c: update napi implementation
Drivers supporting NAPI should use a NAPI-specific function for receiving
packets. Hence netif_rx is changed to netif_receive_skb.
Furthermore netif_napi_del should be used in the probe and remove function
to clean up the NAPI resource information.
Thanks to Francois Romieu, David Shwatrz and Rami Rosen for their help on
this patch.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kardashevskiy [Wed, 14 Aug 2013 09:19:01 +0000 (19:19 +1000)]
Revert "cxgb3: Check and handle the dma mapping errors"
This reverts commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9.
As tests on PPC64 (powernv platform) show, IOMMU pages leak
when transferring a large number of small packets (<=64 bytes).
Running "ping -f" and waiting for 15 seconds is the simplest way to confirm the bug.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Santosh Rastapur <santosh@chelsio.com>
Cc: Jay Fenlason <fenlason@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Divy Le ray <divy@chelsio.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sarveshwar Bandi [Wed, 14 Aug 2013 07:51:47 +0000 (13:21 +0530)]
be2net: Clear any capability flags that driver is not interested in.
It is possible for some versions of firmware to advertise capabilities that the
driver is not ready to handle. This may lead to a controller stall. Since the driver
is interested only in a subset of the flags, clear the rest.
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Gross [Wed, 14 Aug 2013 22:50:36 +0000 (15:50 -0700)]
openvswitch: Reset tunnel key between input and output.
It doesn't make sense to output a tunnel packet using the same
parameters that it was received with since that will generally
just result in the packet going back to us. As a result, userspace
assumes that the tunnel key is cleared when transitioning through
the switch. In the majority of cases this doesn't matter since a
packet is either going to a tunnel port (in which the key is
overwritten with new values) or to a non-tunnel port (in which
case the key is ignored). However, it's theoretically possible that
userspace could rely on the documented behavior, so this corrects
it.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Pravin B Shelar [Tue, 30 Jul 2013 22:44:14 +0000 (15:44 -0700)]
openvswitch: Use correct type while allocating flex array.
A flex array is used to allocate the hash buckets, which are of type struct
hlist_head, but we use `struct hlist_head *` to calculate the
array size. Since hlist_head is the size of a pointer, it works fine.
This patch uses the correct type.
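Why the wrong type happened to work can be seen in a small user-space sketch. The two structs below are simplified stand-ins that mirror the kernel's list types, and the helper names are hypothetical; the sketch only compares the two element sizes and does not use the flex array API.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's hash list types. The bucket
 * head holds exactly one pointer. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Element size as originally written: a pointer to a bucket. */
static size_t bucket_size_old(void)
{
	return sizeof(struct hlist_head *);
}

/* Element size with the correct type: the bucket itself. */
static size_t bucket_size_new(void)
{
	return sizeof(struct hlist_head);
}
```

Both helpers return the same value on common platforms, so the allocation size was right by coincidence. It would silently become wrong if struct hlist_head ever grew a second member.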
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Jesse Gross [Mon, 13 May 2013 15:41:06 +0000 (08:41 -0700)]
openvswitch: Fix bad merge resolution.
git silently included an extra hunk in vport_cmd_set() during
automatic merging. This code is unreachable so it does not actually
introduce a problem but it is clearly incorrect.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Jingoo Han [Fri, 9 Aug 2013 07:14:51 +0000 (16:14 +0900)]
cpufreq: unicore2: Staticize local symbol
This local symbol is used only in this file.
Fix the following sparse warnings:
drivers/cpufreq/unicore2-cpufreq.c:27:5: warning: symbol 'ucv2_verify_speed' was not declared. Should it be static?
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Hanjun Guo [Tue, 13 Aug 2013 10:20:10 +0000 (18:20 +0800)]
cpufreq / s3c24xx: Fix s3c_cpufreq_initclks() __init attribute location
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Hanjun Guo [Tue, 13 Aug 2013 10:20:09 +0000 (18:20 +0800)]
cpufreq / pxa2xx: Fix pxa_cpufreq_init_voltages() __init attribute location
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Hanjun Guo [Tue, 13 Aug 2013 10:20:08 +0000 (18:20 +0800)]
cpufreq / gx: Fix gx_detect_chipset() __init attribute location
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Julia Lawall [Tue, 13 Aug 2013 13:02:25 +0000 (15:02 +0200)]
pxa3xx-cpufreq.c: Avoid using ARRAY_AND_SIZE(e) as a function argument
Replace ARRAY_AND_SIZE(e) in function argument position to avoid
hiding the arity of the called function.
The semantic match that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e,f;
@@
f(...,
- ARRAY_AND_SIZE(e)
+ e,ARRAY_SIZE(e)
,...)
// </smpl>
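To show what "hiding the arity" means, here is a self-contained sketch. The macro definitions are simplified stand-ins for the kernel ones, and sum(), freqs and the two wrapper functions are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel macros. */
#define ARRAY_SIZE(a)     (sizeof(a) / sizeof((a)[0]))
#define ARRAY_AND_SIZE(a) (a), ARRAY_SIZE(a)

/* Hypothetical consumer that takes two arguments. */
static int sum(const int *vals, size_t n)
{
	int total = 0;

	for (size_t i = 0; i < n; i++)
		total += vals[i];
	return total;
}

static const int freqs[] = { 104, 208, 416 };

/* Looks like a one-argument call, but expands to two arguments. */
static int sum_hidden_arity(void)
{
	return sum(ARRAY_AND_SIZE(freqs));
}

/* Same call with both arguments spelled out at the call site. */
static int sum_explicit_arity(void)
{
	return sum(freqs, ARRAY_SIZE(freqs));
}
```

Both wrappers compute the same result; the second form simply makes the number of arguments visible to a reader and to tools that match on call shapes.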
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Mark Brown [Tue, 13 Aug 2013 12:58:24 +0000 (14:58 +0200)]
cpufreq: cpufreq-cpu0: NULL is a valid regulator
Since NULL could in theory be a valid regulator we ought to check for
IS_ERR() rather than for NULL. In practice this is unlikely to be an
issue but it's better for neatness.
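The difference between the two checks can be sketched in user space. The helpers below are a simplified re-creation of the kernel's error-pointer convention, not the kernel headers themselves; is_err() and err_ptr() are illustrative lower-case names, and the value 4095 for MAX_ERRNO is assumed to match the kernel's definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel: the top MAX_ERRNO addresses of the
 * address space encode negative errno values. */
#define MAX_ERRNO 4095

/* True only for pointers that encode an error code. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Encode a negative errno value as a pointer. */
static void *err_ptr(intptr_t error)
{
	return (void *)error;
}
```

Under this convention NULL is not an error value, so a "!ptr" test and an is_err() test answer different questions: the first rejects a pointer that may be legitimate, and it lets encoded errors through.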
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lan Tianyu [Tue, 13 Aug 2013 02:05:53 +0000 (10:05 +0800)]
acpi-cpufreq: Use cpufreq_freq_attr_rw to define the cpb attribute
Standardise the definition of the cpb (Core Performance Boost)
attribute in the acpi-cpufreq driver via the cpufreq_freq_attr_rw
macro.
[rjw: Subject and changelog]
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Wed, 14 Aug 2013 20:22:57 +0000 (22:22 +0200)]
Merge branch 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux into pm-cpufreq
Pull ARM cpufreq fixes from Viresh Kumar.
* 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux:
cpufreq: fix EXYNOS drivers selection
cpufreq: exynos5440: Fix to skip when new frequency same as current
Rafael J. Wysocki [Wed, 14 Aug 2013 20:21:16 +0000 (22:21 +0200)]
Merge back earlier 'pm-cpufreq' material
Johan Hovold [Tue, 13 Aug 2013 11:27:35 +0000 (13:27 +0200)]
USB: keyspan: fix null-deref at disconnect and release
Make sure to fail properly if the device is not accepted during attach
in order to avoid null-pointer derefs (of missing interface private
data) at disconnect or release.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Tue, 13 Aug 2013 11:27:34 +0000 (13:27 +0200)]
USB: mos7720: fix broken control requests
The parallel-port code of the drivers used a stack allocated
control-request buffer for asynchronous (and possibly deferred) control
requests. This not only violates the no-DMA-from-stack requirement but
could also lead to corrupt control requests being submitted.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Neukum [Wed, 14 Aug 2013 09:01:46 +0000 (11:01 +0200)]
usb: add two quirky touchscreen
These devices tend to become unresponsive after S3
Signed-off-by: Oliver Neukum <oneukum@suse.de>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Wed, 14 Aug 2013 17:04:43 +0000 (10:04 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge a bunch of fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
x86 get_unmapped_area(): use proper mmap base for bottom-up direction
ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
ocfs2: Revert 40bd62e to avoid regression in extended allocation
drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
hugetlb: fix lockdep splat caused by pmd sharing
aoe: adjust ref of head for compound page tails
microblaze: fix clone syscall
mm: save soft-dirty bits on file pages
mm: save soft-dirty bits on swapped pages
memcg: don't initialize kmem-cache destroying work for root caches
Andreas Schwab [Fri, 9 Aug 2013 13:14:08 +0000 (15:14 +0200)]
m68k: Truncate base in do_div()
Explicitly truncate the second operand of do_div() to 32 bits to guard
against bogus code calling it with a 64-bit divisor.
[Thorsten]
After upgrading from 3.2 to 3.10, mounting a btrfs volume fails with:
btrfs: setting nodatacow, compression disabled
btrfs: enabling auto recovery
btrfs: disk space caching is enabled
*** ZERO DIVIDE *** FORMAT=2
Current process id is 722
BAD KERNEL TRAP:
00000000
Modules linked in: evdev mac_hid ext4 crc16 jbd2 mbcache btrfs xor lzo_compress zlib_deflate raid6_pq crc32c libcrc32c
PC: [<319535b2>] __btrfs_map_block+0x11c/0x119a [btrfs]
SR: 2000 SP: 30c1fab4 a2: 30f0faf0
d0: 00000000 d1: 00001000 d2: 00000000 d3: 00000000
d4: 00010000 d5: 00000000 a0: 3085c72c a1: 3085c72c
Process mount (pid: 722, task=30f0faf0)
Frame format=2 instr addr=319535ae
Stack from 30c1faec:
00000000 00000020 00000000 00001000 00000000 01401000 30253928 300ffc00
00a843ac 3026f640 00000000 00010000 0009e250 00d106c0 00011220 00000000
00001000 301c6830 0009e32a 000000ff 00000009 3085c72c 00000000 00000000
30c1fd14 00000000 00000020 00000000 30c1fd14 0009e26c 00000020 00000003
00000000 0009dd8a 300b0b6c 30253928 00a843ac 00001000 00000000 00000000
0000a008 3194e76a 30253928 00a843ac 00001000 00000000 00000000 00000002
Call Trace: [<00001000>] kernel_pg_dir+0x0/0x1000
[...]
Code: 222e ff74 2a2e ff5c 2c2e ff60 4c45 1402 <2d40> ff64 2d41 ff68 2205 4c2e 1800 ff68 4c04 0800 2041 d1c0 2206 4c2e 1400 ff68
[Geert]
As diagnosed by Andreas, fs/btrfs/volumes.c:__btrfs_map_block()
calls
do_div(stripe_nr, stripe_len);
with stripe_len u64, while do_div() assumes the divisor is a 32-bit number.
Due to the lack of truncation in the m68k-specific implementation of
do_div(), the division is performed using the upper 32-bit word of
stripe_len, which is zero.
This was introduced by commit 53b381b3abeb86f12787a6c40fee9b2f71edc23b
("Btrfs: RAID5 and RAID6"), which changed the divisor from
map->stripe_len (struct map_lookup.stripe_len is int) to a 64-bit temporary.
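The failure mode described above can be modelled in user space. This is only a sketch under stated assumptions: effective_divisor() and checked_div() are hypothetical helpers, not the m68k do_div() implementation, and 0x10000 is just an example divisor value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: which 32-bit value a "32-bit divisor" helper ends up
 * using when it is handed a 64-bit divisor.
 *   use_upper_word != 0: the upper word is picked up (the bug described above)
 *   use_upper_word == 0: the divisor is explicitly truncated to 32 bits
 */
static uint32_t effective_divisor(uint64_t base, int use_upper_word)
{
    return use_upper_word ? (uint32_t)(base >> 32) : (uint32_t)base;
}

/*
 * Divide n by a 32-bit divisor. Returns 0 and stores the quotient on
 * success, or -1 where real hardware would raise a divide-by-zero trap.
 */
static int checked_div(uint64_t n, uint32_t divisor, uint64_t *quotient)
{
    assert(quotient != NULL);

    if (divisor == 0)
        return -1;

    *quotient = n / divisor;
    return 0;
}
```

With a divisor such as 0x10000 the upper word is zero, so the buggy variant ends up dividing by zero, while the truncated variant divides by 0x10000 as intended.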
Reported-by: Thorsten Glaser <tg@debian.org>
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Thorsten Glaser <tg@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
Geert Uytterhoeven [Thu, 25 Jul 2013 22:08:25 +0000 (00:08 +0200)]
m68k/atari: ARAnyM - Fix NatFeat module support
As pointed out by Andreas Schwab, pointers passed to ARAnyM NatFeat calls
should be physical addresses, not virtual addresses.
Fortunately on Atari, physical and virtual kernel addresses are the same,
as long as normal kernel memory is concerned, so this usually worked fine
without conversion.
But for modules, pointers to literal strings are located in vmalloc()ed
memory. Depending on the version of ARAnyM, this causes the nf_get_id()
call to just fail, or worse, crash ARAnyM itself with e.g.
Gotcha! Illegal memory access. Atari PC = $968c
This is a big issue for distro kernels, who want to have all drivers as
loadable modules in an initrd.
Add a wrapper for nf_get_id() that copies the literal to the stack to
work around this issue.
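The copy-to-stack idea can be sketched in plain C as follows. Everything here is illustrative: nf_lookup_fn, NAME_COPY_MAX, get_id_via_stack_copy() and demo_lookup() are hypothetical names, and the 32-byte bound is an assumption rather than a value taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the low-level call that needs a buffer it can
 * address directly. */
typedef long (*nf_lookup_fn)(const char *name);

#define NAME_COPY_MAX 32    /* assumed bound for feature names */

/* Copy the name into a stack buffer before handing it to the low-level
 * call, so the callee never sees a pointer into the caller's string
 * literal. Returns 0 ("not found") if the name does not fit. */
static long get_id_via_stack_copy(nf_lookup_fn lookup, const char *feature_name)
{
    char name_copy[NAME_COPY_MAX];
    size_t n;

    assert(lookup != NULL && feature_name != NULL);

    n = strlen(feature_name);
    if (n >= sizeof(name_copy))
        return 0;

    memcpy(name_copy, feature_name, n + 1);    /* includes the NUL byte */
    return lookup(name_copy);
}

/* Demo backend: knows a single feature name. */
static long demo_lookup(const char *name)
{
    return strcmp(name, "NF_NAME") == 0 ? 42 : 0;
}
```

Rejecting over-long names instead of truncating them avoids silently looking up a different feature.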
Reported-by: Thorsten Glaser <tg@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
Weiping Pan [Tue, 13 Aug 2013 13:46:56 +0000 (21:46 +0800)]
tun: compare with 0 instead of total_len
Since we set "len = total_len" at the beginning of tun_get_user(),
we should compare the new len with 0 instead of total_len;
otherwise the condition in the if statement is always false.
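A small user-space sketch of the two comparisons, assuming the length is held in a signed type (the helper names are hypothetical, and this is not the tun_get_user() code itself). With an unsigned length a "< 0" test could never fire, so the signed type is a deliberate assumption here.

```c
#include <assert.h>

/* Old-style check: subtract the header, then compare against the starting
 * value. With a signed length and a positive header size the result can
 * never exceed total_len, so this never reports a short buffer. */
static int too_short_old(long total_len, long hdr_len)
{
    long len = total_len;

    assert(hdr_len > 0);
    return (len -= hdr_len) > total_len;
}

/* New-style check: a negative remainder means the buffer was shorter than
 * the header. */
static int too_short_new(long total_len, long hdr_len)
{
    long len = total_len;

    assert(hdr_len > 0);
    return (len -= hdr_len) < 0;
}
```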
Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Asbjoern Sloth Toennesen [Mon, 12 Aug 2013 16:30:09 +0000 (16:30 +0000)]
rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
Fix the iproute2 command `bridge vlan show`, after switching from
rtgenmsg to ifinfomsg.
Let's start with a little history:
Feb 20: Vlad Yasevich got his VLAN-aware bridge patchset included in
the 3.9 merge window.
In the kernel commit 6cbdceeb, he added attribute support to
bridge GETLINK requests sent with rtgenmsg.
Mar 6th: Vlad got this iproute2 reference implementation of the bridge
vlan netlink interface accepted (iproute2 9eff0e5c)
Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
http://patchwork.ozlabs.org/patch/239602/
http://marc.info/?t=136680900700007
Apr 28th: Linus released 3.9
Apr 30th: Stephen released iproute2 3.9.0
The `bridge vlan show` command hasn't been working since the switch to
ifinfomsg, and so has never worked in a released version of iproute2,
because the kernel side only supports rtgenmsg, which iproute2 switched
away from just prior to the iproute2 3.9.0 release.
I haven't been able to find any documentation about either rtgenmsg
or ifinfomsg, or about which to use in which situation, but kernel commit
88c5b5ce seems to suggest that ifinfomsg should be used.
Fixing this in the kernel will break compatibility, but I doubt that anybody
has been using it, given this bug in the user space reference
implementation, at least not without noticing the bug. That said, the
functionality is still fully functional in 3.9 when reverting iproute2
commit 63338dca.
This could also be fixed in iproute2, but that's an ugly patch that would
reintroduce rtgenmsg in iproute2, and from searching netdev it seems
that rtgenmsg usage is discouraged. I'm assuming that the only reason
that Vlad implemented the kernel side to use rtgenmsg was that
iproute2 was using it at the time.
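To illustrate why the header type passed to nlmsg_find_attr() matters, here is a sketch of where attribute parsing starts for each header. The demo_* structs are local mirrors of struct rtgenmsg and struct ifinfomsg declared only for this example, and attr_offset() is a hypothetical helper that applies the usual 4-byte netlink alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of struct rtgenmsg (a single family byte). */
struct demo_rtgenmsg {
    unsigned char rtgen_family;
};

/* Local mirror of struct ifinfomsg. */
struct demo_ifinfomsg {
    unsigned char  ifi_family;
    unsigned char  ifi_pad;
    unsigned short ifi_type;
    int            ifi_index;
    unsigned int   ifi_flags;
    unsigned int   ifi_change;
};

#define DEMO_ALIGNTO 4u    /* netlink messages are 4-byte aligned */

/* Offset (from the start of the payload) at which attributes are expected,
 * given the size of the family-specific header. */
static size_t attr_offset(size_t hdrlen)
{
    assert(hdrlen > 0);
    return (hdrlen + DEMO_ALIGNTO - 1) & ~(size_t)(DEMO_ALIGNTO - 1);
}
```

If user space sends an ifinfomsg header but the kernel skips only an rtgenmsg-sized header, attribute parsing starts in the middle of the ifinfomsg header, so the attribute being looked for is not found where expected.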
Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net>
Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
yonghua zheng [Tue, 13 Aug 2013 23:01:03 +0000 (16:01 -0700)]
fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
Recently we hit quite a lot of random kernel panics after enabling
CONFIG_PROC_PAGE_MONITOR. After debugging, we found that this is caused
by the following bug in pagemap:
In struct pagemapread:
struct pagemapread {
int pos, len;
pagemap_entry_t *buffer;
bool v2;
};
pos is the number of PM_ENTRY_BYTES-sized entries in buffer, but len is
the size of the buffer in bytes. Comparing pos with len in add_page_map()
to check whether the buffer is full is therefore a mistake, and it can
lead to a buffer overflow and random kernel panics.
Correct len to be the total number of PM_ENTRY_BYTES-sized entries in buffer.
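A user-space sketch of the corrected bookkeeping, with pos and len both counted in entries. The demo_* names are hypothetical and only mirror the shape of struct pagemapread; this is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint64_t pme; } demo_entry_t;

struct demo_pagemapread {
    int pos, len;            /* both counted in entries, not bytes */
    demo_entry_t *buffer;
};

#define DEMO_END_OF_BUFFER 1

static int demo_init(struct demo_pagemapread *pm, int nr_entries)
{
    if (nr_entries <= 0)
        return -1;
    pm->pos = 0;
    pm->len = nr_entries;    /* entry count; the byte size is used only for malloc */
    pm->buffer = malloc((size_t)nr_entries * sizeof(*pm->buffer));
    return pm->buffer ? 0 : -1;
}

/* Store one entry. Callers must stop once DEMO_END_OF_BUFFER is returned. */
static int demo_add_entry(struct demo_pagemapread *pm, demo_entry_t e)
{
    assert(pm->pos < pm->len);
    pm->buffer[pm->pos++] = e;
    return pm->pos >= pm->len ? DEMO_END_OF_BUFFER : 0;
}

/* Fill a buffer of nr_entries and report how many entries were stored
 * before the "full" condition was signalled. */
static int demo_fill(int nr_entries)
{
    struct demo_pagemapread pm;
    demo_entry_t e = { 0 };
    int stored = 0;

    if (demo_init(&pm, nr_entries) != 0)
        return -1;
    for (;;) {
        int ret = demo_add_entry(&pm, e);
        stored++;
        if (ret == DEMO_END_OF_BUFFER)
            break;
    }
    free(pm.buffer);
    return stored;
}
```

Had len been set to the byte size instead, the "full" check would only trigger after writing several times more entries than the allocation can hold.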
[akpm@linux-foundation.org: document pagemapread.pos and .len units, fix PM_ENTRY_BYTES definition]
Signed-off-by: Yonghua Zheng <younghua.zheng@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Tue, 13 Aug 2013 23:01:02 +0000 (16:01 -0700)]
arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
All architectures except three include "kernel/Kconfig.freezer", so make
those three include it too; otherwise 'allmodconfig' reports errors.
The related errors: (with allmodconfig for openrisc):
CC kernel/cgroup_freezer.o
kernel/cgroup_freezer.c: In function 'freezer_css_online':
kernel/cgroup_freezer.c:133:15: error: 'system_freezing_cnt' undeclared (first use in this function)
kernel/cgroup_freezer.c:133:15: note: each undeclared identifier is reported only once for each function it appears in
kernel/cgroup_freezer.c: In function 'freezer_css_offline':
kernel/cgroup_freezer.c:157:15: error: 'system_freezing_cnt' undeclared (first use in this function)
kernel/cgroup_freezer.c: In function 'freezer_attach':
kernel/cgroup_freezer.c:200:4: error: implicit declaration of function 'freeze_task'
kernel/cgroup_freezer.c: In function 'freezer_apply_state':
kernel/cgroup_freezer.c:371:16: error: 'system_freezing_cnt' undeclared (first use in this function)
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Liu [Tue, 13 Aug 2013 23:01:01 +0000 (16:01 -0700)]
ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
Fix a NULL pointer dereference while removing an empty directory, which
was introduced by commit 3704412bdbf3 ("[readdir] convert ocfs2").
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
PGD 6da85067 PUD 6da89067 PMD 0
Oops: 0010 [#1] SMP
CPU: 0 PID: 6564 Comm: rmdir Tainted: G O 3.11.0-rc1 #4
RIP: 0010:[<0000000000000000>] [< (null)>] (null)
Call Trace:
ocfs2_dir_foreach+0x49/0x50 [ocfs2]
ocfs2_empty_dir+0x12c/0x3e0 [ocfs2]
ocfs2_unlink+0x56e/0xc10 [ocfs2]
vfs_rmdir+0xd5/0x140
do_rmdir+0x1cb/0x1e0
SyS_rmdir+0x16/0x20
system_call_fastpath+0x16/0x1b
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP <ffff88006daddc10>
CR2: 0000000000000000
[dan.carpenter@oracle.com: fix pointer math]
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: David Weber <wb@munzinger.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Radu Caragea [Tue, 13 Aug 2013 23:00:59 +0000 (16:00 -0700)]
x86 get_unmapped_area(): use proper mmap base for bottom-up direction
When the stack is set to unlimited, the bottom-up direction is used for
mmappings, but mmap_base is not used, which effectively renders ASLR
for mmappings, along with PIE, useless.
Cc: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Sendroiu <molecula2788@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tiger Yang [Tue, 13 Aug 2013 23:00:58 +0000 (16:00 -0700)]
ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
Since ocfs2_cow_file_pos invokes ocfs2_refcount_icow with NULL as
the struct file pointer, this eventually results in a NULL pointer
dereference in ocfs2_duplicate_clusters_by_page.
This patch replaces the file pointer with an inode pointer in
cow_duplicate_clusters to fix this issue.
[jeff.liu@oracle.com: rebased patch against linux-next tree]
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Tao Ma <tm@tao.ma>
Tested-by: David Weber <wb@munzinger.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jie Liu [Tue, 13 Aug 2013 23:00:57 +0000 (16:00 -0700)]
ocfs2: Revert 40bd62e to avoid regression in extended allocation
Revert commit 40bd62eb7fb8 ("fs/ocfs2/journal.h: add bits_wanted while
calculating credits in ocfs2_calc_extend_credits").
Unfortunately this change broke fallocate even when there is sufficient
disk space for the preallocation, which is a serious problem.
# df -h
/dev/sda8 22G 1.2G 21G 6% /ocfs2
# fallocate -o 0 -l 200M /ocfs2/testfile
fallocate: /ocfs2/test: fallocate failed: No space left on device
and a kernel warning:
CPU: 3 PID: 3656 Comm: fallocate Tainted: G W O 3.11.0-rc3 #2
Call Trace:
dump_stack+0x77/0x9e
warn_slowpath_common+0xc4/0x110
warn_slowpath_null+0x2a/0x40
start_this_handle+0x6c/0x640 [jbd2]
jbd2__journal_start+0x138/0x300 [jbd2]
jbd2_journal_start+0x23/0x30 [jbd2]
ocfs2_start_trans+0x166/0x300 [ocfs2]
__ocfs2_extend_allocation+0x38f/0xdb0 [ocfs2]
ocfs2_allocate_unwritten_extents+0x3c9/0x520
__ocfs2_change_file_space+0x5e0/0xa60 [ocfs2]
ocfs2_fallocate+0xb1/0xe0 [ocfs2]
do_fallocate+0x1cb/0x220
SyS_fallocate+0x6f/0xb0
system_call_fastpath+0x16/0x1b
JBD2: fallocate wants too many credits (51216 > 4381)
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lothar Waßmann [Tue, 13 Aug 2013 23:00:56 +0000 (16:00 -0700)]
drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
It's always a bad idea to poll on HW bits without a timeout.
The i.MX28 RTC can be easily brought into a state in which the RTC is
not running (until after a power-on-reset) and thus the status bits
which are polled in the driver won't ever change.
This patch prevents the kernel from getting stuck in this case.
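A bounded polling loop of the kind described above might look like the following sketch. The names (read_status_fn, wait_bits_clear(), the demo status sources) and the poll budget are hypothetical; a real driver would read a hardware register and delay briefly between reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor for a status register. */
typedef uint32_t (*read_status_fn)(void *ctx);

/* Poll until all bits in mask are clear, giving up after max_polls reads.
 * Returns 0 once the bits are clear, -1 on timeout. */
static int wait_bits_clear(read_status_fn read_status, void *ctx,
                           uint32_t mask, int max_polls)
{
    assert(read_status != NULL && max_polls > 0);

    do {
        if (!(read_status(ctx) & mask))
            return 0;
        /* a driver would insert a short delay here */
    } while (--max_polls > 0);

    return -1;
}

/* Demo status source: a bit that never clears (device not running). */
static uint32_t stuck_status(void *ctx)
{
    (void)ctx;
    return 0x1;
}

/* Demo status source: the bit clears after *ctx busy reads. */
static uint32_t clears_after(void *ctx)
{
    int *remaining = ctx;

    return (*remaining)-- > 0 ? 0x1 : 0x0;
}
```

The caller gets an error instead of hanging when the status bit is stuck, and can then report the failure.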
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Tue, 13 Aug 2013 23:00:55 +0000 (16:00 -0700)]
hugetlb: fix lockdep splat caused by pmd sharing
Dave has reported the following lockdep splat:
=================================
[ INFO: inconsistent lock state ]
3.11.0-rc1+ #9 Not tainted
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&mapping->i_mmap_mutex){+.+.?.}, at: [<c114971b>] page_referenced+0x87/0x5e3
{RECLAIM_FS-ON-W} state was registered at:
mark_held_locks+0x81/0xe7
lockdep_trace_alloc+0x5e/0xbc
__alloc_pages_nodemask+0x8b/0x9b6
__get_free_pages+0x20/0x31
get_zeroed_page+0x12/0x14
__pmd_alloc+0x1c/0x6b
huge_pmd_share+0x265/0x283
huge_pte_alloc+0x5d/0x71
hugetlb_fault+0x7c/0x64a
handle_mm_fault+0x255/0x299
__do_page_fault+0x142/0x55c
do_page_fault+0xd/0x16
error_code+0x6c/0x74
irq event stamp: 3136917
hardirqs last enabled at (3136917): _raw_spin_unlock_irq+0x27/0x50
hardirqs last disabled at (3136916): _raw_spin_lock_irq+0x15/0x78
softirqs last enabled at (3136180): __do_softirq+0x137/0x30f
softirqs last disabled at (3136175): irq_exit+0xa8/0xaa
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&mapping->i_mmap_mutex);
<Interrupt>
lock(&mapping->i_mmap_mutex);
*** DEADLOCK ***
no locks held by kswapd0/49.
stack backtrace:
CPU: 1 PID: 49 Comm: kswapd0 Not tainted 3.11.0-rc1+ #9
Hardware name: Dell Inc. Precision WorkStation 490 /0DT031, BIOS A08 04/25/2008
Call Trace:
dump_stack+0x4b/0x79
print_usage_bug+0x1d9/0x1e3
mark_lock+0x1e0/0x261
__lock_acquire+0x623/0x17f2
lock_acquire+0x7d/0x195
mutex_lock_nested+0x6c/0x3a7
page_referenced+0x87/0x5e3
shrink_page_list+0x3d9/0x947
shrink_inactive_list+0x155/0x4cb
shrink_lruvec+0x300/0x5ce
shrink_zone+0x53/0x14e
kswapd+0x517/0xa75
kthread+0xa8/0xaa
ret_from_kernel_thread+0x1b/0x28
which is a false positive caused by the hugetlb pmd sharing code, which
allocates a new pmd from within mapping->i_mmap_mutex. If this
allocation causes reclaim, then lockdep complains that we
might self-deadlock.
This is not correct though, because hugetlb pages are not reclaimable, so
their mapping will never be touched from the reclaim path.
The patch tells lockdep that the hugetlb i_mmap_mutex is special by
assigning it a separate lockdep class, so it won't report possible
deadlocks on unrelated mappings.
[peterz@infradead.org: comment for annotation]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>