Paul E. McKenney [Wed, 27 Jun 2012 00:00:35 +0000 (17:00 -0700)]
rcu: Prevent force_quiescent_state() memory contention
Large systems running RCU_FAST_NO_HZ kernels see extreme memory
contention on the rcu_state structure's ->fqslock field. This
can be avoided by disabling RCU_FAST_NO_HZ, either at compile time
or at boot time (via the nohz kernel boot parameter), but large
systems will no doubt become sensitive to energy consumption.
This commit therefore uses a combining-tree approach to spread the
memory contention across new cache lines in the leaf rcu_node structures.
This can be thought of as a tournament lock that has only a try-lock
acquisition primitive.
The effect on small systems is minimal, because such systems have
an rcu_node "tree" consisting of a single node. In addition, this
functionality is not used on fastpaths.
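The funnel described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: struct node, node_init(), node_trylock(), node_unlock() and funnel_trylock() are hypothetical names, and C11 atomic_flag stands in for the per-node try-lock. A caller climbs from its leaf toward the root, try-locking each level and dropping the lock below it; losing at any level means some other caller is already ahead, so the loser simply gives up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rcu_node: one try-lock per tree node. */
struct node {
	atomic_flag fqslock;	/* clear means unlocked */
	struct node *parent;	/* NULL at the root */
};

static void node_init(struct node *n, struct node *parent)
{
	atomic_flag_clear(&n->fqslock);
	n->parent = parent;
}

static bool node_trylock(struct node *n)
{
	/* test_and_set returns the old value: false means we took the lock. */
	return !atomic_flag_test_and_set_explicit(&n->fqslock,
						  memory_order_acquire);
}

static void node_unlock(struct node *n)
{
	atomic_flag_clear_explicit(&n->fqslock, memory_order_release);
}

/*
 * Climb from a leaf toward the root, holding at most two locks at once.
 * Returns the root with its lock held if the caller won every round,
 * or NULL (holding nothing) if it lost at some level.
 */
static struct node *funnel_trylock(struct node *leaf)
{
	struct node *held = NULL;

	for (struct node *n = leaf; n != NULL; n = n->parent) {
		bool won = node_trylock(n);

		if (held != NULL)
			node_unlock(held);	/* release the level below */
		if (!won)
			return NULL;		/* someone else is ahead */
		held = n;
	}
	return held;
}
```

Because losers back off at the first contended node, most callers on a large system touch only their own leaf's cache line, and only one caller per leaf ever reaches the next level up.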
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Tue, 26 Jun 2012 21:00:48 +0000 (14:00 -0700)]
rcu: Adjust debugfs tracing for kthread-based quiescent-state forcing
Moving quiescent-state forcing into a kthread dispenses with the need
for the ->n_rp_need_fqs field, so this commit removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Mon, 25 Jun 2012 15:41:11 +0000 (08:41 -0700)]
rcu: Allow RCU quiescent-state forcing to be preempted
RCU quiescent-state forcing is currently carried out without preemption
points, which can result in excessive latency spikes on large systems
(many hundreds or thousands of CPUs). This patch therefore inserts
a voluntary preemption point into force_qs_rnp(), which should greatly
reduce the magnitude of these spikes.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Sat, 23 Jun 2012 00:06:26 +0000 (17:06 -0700)]
rcu: Move quiescent-state forcing into kthread
As the first step towards allowing quiescent-state forcing to be
preemptible, this commit moves RCU quiescent-state forcing into the
same kthread that is now used to initialize and clean up after grace
periods. This is yet another step towards keeping scheduling
latency down to a dull roar.
Updated to change from raw_spin_lock_irqsave() to raw_spin_lock_irq()
and to remove the now-unused rcu_state structure fields as suggested by
Peter Zijlstra.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Dimitri Sivanich [Fri, 29 Jun 2012 21:17:29 +0000 (14:17 -0700)]
rcu: Segregate rcu_state fields to improve cache locality
The fields in the rcu_state structure that are protected by the
root rcu_node structure's ->lock can share a cache line with the
fields protected by ->onofflock. This can result in excessive
memory contention on large systems, so this commit applies
____cacheline_internodealigned_in_smp to the ->onofflock field in
order to segregate them.
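The effect of the alignment annotation can be illustrated in portable C11. This is a sketch only: struct state_sketch and its field names are made up for the example, and CACHE_LINE is an assumed 64-byte line size, whereas the kernel macro uses the configured internode cache-line size. Aligning the first field of the second group pushes that group onto its own cache line.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size for this sketch */

/* Hypothetical layout: two groups of fields guarded by different locks. */
struct state_sketch {
	int root_protected_a;	/* guarded by the root node's lock */
	int root_protected_b;
	alignas(CACHE_LINE) int onofflock;	/* starts a new cache line */
	int onoff_protected;	/* guarded by onofflock */
};

/* Index of the cache line that a given structure offset falls into. */
static size_t line_of(size_t offset)
{
	return offset / CACHE_LINE;
}
```

With this layout, line_of(offsetof(struct state_sketch, root_protected_b)) and line_of(offsetof(struct state_sketch, onofflock)) differ, so updates to one group no longer invalidate the line holding the other.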
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Tue, 12 Jun 2012 00:39:43 +0000 (17:39 -0700)]
rcu: Provide OOM handler to motivate lazy RCU callbacks
In kernels built with CONFIG_RCU_FAST_NO_HZ=y, CPUs can accumulate a
large number of lazy callbacks, which as the name implies will be slow
to be invoked. This can be a problem on small-memory systems, where the
default 6-second sleep for CPUs having only lazy RCU callbacks could well
be fatal. This commit therefore installs an OOM handler that ensures that
every CPU with lazy callbacks has at least one non-lazy callback, in turn
ensuring timely advancement for these callbacks.
Updated to fix bug that disabled OOM killing, noted by Lai Jiangshan.
Updated to push the for_each_rcu_flavor() loop into rcu_oom_notify_cpu(),
thus reducing the number of IPIs, as suggested by Steven Rostedt. Also
to make the for_each_online_cpu() loop be preemptible. (Later, it might
be good to use smp_call_function(), as suggested by Peter Zijlstra.)
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 16:54:10 +0000 (09:54 -0700)]
rcu: Prevent offline CPUs from executing RCU core code
Earlier versions of RCU invoked the RCU core from the CPU_DYING notifier
in order to note a quiescent state for the outgoing CPU. Because the
CPU is marked "offline" during the execution of the CPU_DYING notifiers,
the RCU core had to tolerate being invoked from an offline CPU. However,
commit b1420f1c (Make rcu_barrier() less disruptive) left only tracing
code in the CPU_DYING notifier, so the RCU core need no longer execute
on offline CPUs. This commit therefore enforces this restriction.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Fri, 22 Jun 2012 18:08:41 +0000 (11:08 -0700)]
rcu: Break up rcu_gp_kthread() into subfunctions
The rcu_gp_kthread() function is too large and furthermore needs to
have the force_quiescent_state() code pulled in. This commit therefore
breaks up rcu_gp_kthread() into rcu_gp_init() and rcu_gp_cleanup().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 15:19:05 +0000 (08:19 -0700)]
rcu: Allow RCU grace-period cleanup to be preempted
RCU grace-period cleanup is currently carried out with interrupts
disabled, which can result in excessive latency spikes on large systems
(many hundreds or thousands of CPUs). This patch therefore makes the
RCU grace-period cleanup be preemptible, including voluntary preemption
points, which should eliminate those latency spikes. Similar spikes from
forcing of quiescent states will be dealt with similarly by later patches.
Updated to replace uses of spin_lock_irqsave() with spin_lock_irq(), as
suggested by Peter Zijlstra.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Thu, 21 Jun 2012 00:07:14 +0000 (17:07 -0700)]
rcu: Move RCU grace-period cleanup into kthread
As a first step towards allowing grace-period cleanup to be preemptible,
this commit moves the RCU grace-period cleanup into the same kthread
that is now used to initialize grace periods. This is needed to keep
scheduling latency down to a dull roar.
[ paulmck: Get rid of stray spin_lock_irqsave() calls. ]
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 20 Jun 2012 00:18:20 +0000 (17:18 -0700)]
rcu: Allow RCU grace-period initialization to be preempted
RCU grace-period initialization is currently carried out with interrupts
disabled, which can result in 200-microsecond latency spikes on systems
on which RCU has been configured for 4096 CPUs. This patch therefore
makes the RCU grace-period initialization be preemptible, which should
eliminate those latency spikes. Similar spikes from grace-period cleanup
and the forcing of quiescent states will be dealt with similarly by later
patches.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Mon, 17 Sep 2012 21:32:58 +0000 (14:32 -0700)]
rcu: Prevent initialization-time quiescent-state race
The next step in reducing RCU's grace-period initialization latency on
large systems will make this initialization preemptible. Unfortunately,
making the grace-period initialization subject to interrupts (let alone
preemption) exposes the following race on systems whose rcu_node tree
contains more than one node:
1. CPU 31 starts initializing the grace period, including the
first leaf rcu_node structures, and is then preempted.
2. CPU 0 refers to the first leaf rcu_node structure, and notes
that a new grace period has started. It passes through a
quiescent state shortly thereafter, and informs the RCU core
of this rite of passage.
3. CPU 0 enters an RCU read-side critical section, acquiring
a pointer to an RCU-protected data item.
4. CPU 31 takes an interrupt whose handler removes the data item
referenced by CPU 0 from the data structure, and registers an
RCU callback in order to free it.
5. CPU 31 resumes initializing the grace period, including its
own rcu_node structure. It invokes rcu_start_gp_per_cpu(),
which advances all callbacks, including the one registered
in #4 above, to be handled by the current grace period.
6. The remaining CPUs pass through quiescent states and inform
the RCU core, but CPU 0 remains in its RCU read-side critical
section, still referencing the now-removed data item.
7. The grace period completes and all the callbacks are invoked,
including the one that frees the data item that CPU 0 is still
referencing. Oops!!!
One way to avoid this race is to remove grace-period acceleration from
rcu_start_gp_per_cpu(). Now, the only reason for this acceleration was
to allow CPUs bringing RCU out of idle state to have their callbacks
invoked after only one grace period, rather than the two grace periods
that would otherwise be required. But this acceleration does not
work when RCU grace-period initialization is moved to a kthread because
the CPU posting the callback is no longer necessarily the CPU that is
initializing the resulting grace period.
This commit therefore removes this now-pointless (and soon to be dangerous)
grace-period acceleration, thus avoiding the above race.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Paul E. McKenney [Tue, 19 Jun 2012 01:36:08 +0000 (18:36 -0700)]
rcu: Move RCU grace-period initialization into a kthread
As the first step towards allowing grace-period initialization to be
preemptible, this commit moves the RCU grace-period initialization
into its own kthread. This is needed to keep large-system scheduling
latency at reasonable levels.
Also change raw_spin_lock_irqsave() to raw_spin_lock_irq() as suggested
by Peter Zijlstra in review comments.
Reported-by: Mike Galbraith <mgalbraith@suse.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Sat, 22 Sep 2012 20:55:30 +0000 (13:55 -0700)]
rcu: Fix day-one dyntick-idle stall-warning bug
Each grace period is supposed to have at least one callback waiting
for that grace period to complete. However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally. In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period. Given that nothing is waiting on this grace
period, this is also not a problem.
That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels. In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result. The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.
All this begs the question of exactly how a callback-free grace period
gets started in the first place. This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU still believes that the grace period that just completed is
still ongoing, it will believe that it has callbacks that need to wait for
another grace period, never mind the fact that the grace period that they
were waiting for just completed. This CPU can therefore erroneously
decide to start a new grace period. Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.
Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks. It then won't have any callbacks left. If no
other CPU has any callbacks, we now have a callback-free grace period.
This commit therefore makes CPUs check more carefully before starting a
new grace period. This new check relies on an array of tail pointers
into each CPU's list of callbacks. If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment. The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.
This change is to cpu_needs_another_gp(), which is called in a number
of places. The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.
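The tail-pointer check can be modelled in user-space C. This is a simplified sketch under assumptions, not the kernel implementation: struct cpu_cbs, cbs_init(), cbs_enqueue(), cbs_assign_to_wait() and needs_another_gp() are hypothetical names, and the segment names drop the RCU_ prefix. Each tail[i] points at the ->next pointer that ends segment i, so dereferencing it shows whether any callback follows that segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NSEGS };

struct cb {
	struct cb *next;
};

/* Hypothetical per-CPU callback list, split into segments by tail pointers. */
struct cpu_cbs {
	struct cb *head;
	struct cb **tail[NSEGS];	/* tail[i]: ->next pointer ending segment i */
	unsigned long completed;	/* last grace period this CPU knows ended */
};

static void cbs_init(struct cpu_cbs *c, unsigned long completed)
{
	c->head = NULL;
	for (int i = 0; i < NSEGS; i++)
		c->tail[i] = &c->head;
	c->completed = completed;
}

/* New callbacks always go at the very end of the list. */
static void cbs_enqueue(struct cpu_cbs *c, struct cb *cb)
{
	cb->next = NULL;
	*c->tail[NEXT_TAIL] = cb;
	c->tail[NEXT_TAIL] = &cb->next;
}

/* Mark everything queued so far as waiting on the current grace period. */
static void cbs_assign_to_wait(struct cpu_cbs *c)
{
	c->tail[WAIT_TAIL] = c->tail[NEXT_TAIL];
	c->tail[NEXT_READY_TAIL] = c->tail[NEXT_TAIL];
}

/*
 * An up-to-date CPU looks past the DONE segment. A CPU that has not yet
 * noticed the latest completion looks past the WAIT segment instead,
 * because those callbacks are about to be promoted to DONE.
 */
static bool needs_another_gp(const struct cpu_cbs *c,
			     unsigned long global_completed,
			     bool gp_in_progress)
{
	int seg = (c->completed == global_completed) ? DONE_TAIL : WAIT_TAIL;

	return *c->tail[seg] != NULL && !gp_in_progress;
}
```

A CPU whose only callback sits in the WAIT segment, and which has not yet heard that the grace period ended, no longer asks for another grace period; it does so only once a later callback is queued behind that segment.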
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Reported-by: Subodh Nijsure <snijsure@grid-net.com>
Reported-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430
Cc: stable@vger.kernel.org
Josh Triplett [Wed, 5 Sep 2012 06:23:06 +0000 (23:23 -0700)]
trace: Don't declare trace_*_rcuidle functions in modules
Tracepoints declare a static inline trace_*_rcuidle variant of the trace
function, to support safely generating trace events from the idle loop.
Module code never actually uses that variant of trace functions, because
modules don't run code that needs tracing with RCU idled. However, the
declaration of those otherwise unused functions causes the module to
reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
modules.
To avoid this, don't generate trace_*_rcuidle functions for tracepoints
declared in module code.
Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Sat, 8 Sep 2012 23:43:45 +0000 (16:43 -0700)]
Linux 3.6-rc5
Linus Torvalds [Sat, 8 Sep 2012 23:22:43 +0000 (16:22 -0700)]
Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping fixes from Marek Szyprowski:
"Another set of fixes for ARM dma-mapping subsystem.
Commit e9da6e9905e6 replaced custom consistent buffer remapping code
with generic vmalloc areas. It however introduced some regressions
caused by limited support for allocations in atomic context. This
series contains fixes for those regressions.
For some subplatforms the default, pre-allocated pool for atomic
allocations turned out to be too small, so a function for setting its
size has been added.
Another set of patches adds support for atomic allocations to
IOMMU-aware DMA-mapping implementation.
The last part of this pull request contains two fixes for Contiguous
Memory Allocator, which relax too strict requirements."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: Kirkwood: increase atomic coherent pool size
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: relax conditions required for enabling Contiguous Memory Allocator
mm: cma: fix alignment requirements for contiguous regions
Linus Torvalds [Sat, 8 Sep 2012 23:20:59 +0000 (16:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - add support for EMR on Cintiq 24HD touch
Input: i8042 - add Gigabyte T1005 series netbooks to noloop table
Input: imx_keypad - reset the hardware before enabling
Input: edt-ft5x06 - fix build error when compiling wthout CONFIG_DEBUG_FS
Linus Torvalds [Fri, 7 Sep 2012 19:29:38 +0000 (12:29 -0700)]
Merge branch 'upstream-fixes' of git://git./linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
"It contains a fix for Eaton Ellipse MAX UPS from Alan Stern,
performance improvement (not processing debug data if no one is
interested), by Henrik Rydberg, and allowing tpkbd-driven devices to
work even with generic driver in a crippled mode, by Andres Freund."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
HID: Only dump input if someone is listening
HID: add NOGET quirk for Eaton Ellipse MAX UPS
Andres Freund [Thu, 30 Aug 2012 12:37:14 +0000 (14:37 +0200)]
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
c1dcad2d32d0252e8a3023d20311b52a187ecda3 added a new driver configured by
HID_LENOVO_TPKBD but made the hid_have_special_driver entry non-optional which
led to a recognized but non-working device if the new driver wasn't
configured (which is the correct default).
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Linus Torvalds [Fri, 7 Sep 2012 00:16:42 +0000 (17:16 -0700)]
Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git./linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Fix proper FLR steps.
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
Linus Torvalds [Fri, 7 Sep 2012 00:15:49 +0000 (17:15 -0700)]
Merge tag '3.6-pci-fixes' of git://git./linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Power management
- PCI/PM: Enable D3/D3cold by default for most devices
- PCI/PM: Keep parent bridge active when probing device
- PCI/PM: Fix config reg access for D3cold and bridge suspending
- PCI/PM: Add ABI document for sysfs file d3cold_allowed
Core
- PCI: Don't print anything while decoding is disabled"
* tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Don't print anything while decoding is disabled
PCI/PM: Add ABI document for sysfs file d3cold_allowed
PCI/PM: Fix config reg access for D3cold and bridge suspending
PCI/PM: Keep parent bridge active when probing device
PCI/PM: Enable D3/D3cold by default for most devices
Linus Torvalds [Thu, 6 Sep 2012 17:23:58 +0000 (10:23 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: gemini: fix the gemini build
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
ARM: shmobile: marzen: fixup smsc911x id for regulator
ARM: at91/feature-removal-schedule: delay at91_mci removal
ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
ARM: at91/clock: fix PLLA overclock warning
ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
ARM: at91: fix system timer irq issue due to sparse irq support
ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
Linus Torvalds [Thu, 6 Sep 2012 16:39:47 +0000 (09:39 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
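To see why negative dividends need special handling, consider a plain-function sketch. The names div_round_naive() and div_round_closest() are hypothetical, the divisor is assumed positive, and this is not the kernel macro itself. Adding half the divisor before a truncating division rounds correctly only for non-negative dividends; for negative ones the half must be subtracted.

```c
#include <assert.h>

/* Rounds correctly only when x >= 0, since C division truncates toward 0. */
static int div_round_naive(int x, int d)
{
	return (x + d / 2) / d;
}

/* Sketch of the corrected idea: bias away from zero on either side. */
static int div_round_closest(int x, int d)
{
	return (x >= 0) ? (x + d / 2) / d : (x - d / 2) / d;
}
```

For example, -8 / 3 is about -2.67: div_round_naive(-8, 3) yields -2, while div_round_closest(-8, 3) yields -3. Both return 3 for 8 / 3.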
Linus Torvalds [Thu, 6 Sep 2012 16:38:25 +0000 (09:38 -0700)]
Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6. The link-vmlinux.sh one
is obvious.
The other one fixes make firmware_install with certain configurations,
where a file in the toplevel firmware tree gets installed first, and
$(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
confuses make 3.82 for some reason."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
firmware: fix directory creation rule matching with make 3.82
link-vmlinux.sh: Fix stray "echo" in error message
Dave Jones [Thu, 6 Sep 2012 16:01:00 +0000 (12:01 -0400)]
Remove user-triggerable BUG from mpol_to_str
Trivially triggerable, found by trinity:
kernel BUG at mm/mempolicy.c:2546!
Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670)
Call Trace:
show_numa_map+0xd5/0x450
show_pid_numa_map+0x13/0x20
traverse+0xf2/0x230
seq_read+0x34b/0x3e0
vfs_read+0xac/0x180
sys_pread64+0xa2/0xc0
system_call_fastpath+0x1a/0x1f
RIP: mpol_to_str+0x156/0x360
Cc: stable@vger.kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konrad Rzeszutek Wilk [Wed, 5 Sep 2012 20:35:20 +0000 (16:35 -0400)]
xen/pciback: Fix proper FLR steps.
When we do FLR and save PCI config we did it in the wrong order.
The end result was that if a PCI device was unbound from
its driver, then bound to xen-pciback, and then back to its
driver we would get:
> lspci -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
13:42:12 # 4 :~/
> echo "0000:04:00.0" > /sys/bus/pci/drivers/pciback/unbind
> modprobe e1000e
e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
e1000e 0000:04:00.0: Disabling ASPM L0s L1
e1000e 0000:04:00.0: enabling device (0000 -> 0002)
xen: registering gsi 48 triggering 0 polarity 1
Already setup the GSI :48
e1000e 0000:04:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e: probe of 0000:04:00.0 failed with error -2
This fixes it by first saving the PCI configuration space, then
doing the FLR.
Reported-by: Ren, Yongjie <yongjie.ren@intel.com>
Reported-and-Tested-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
Linus Torvalds [Thu, 6 Sep 2012 02:41:58 +0000 (19:41 -0700)]
Merge tag 'mmc-fixes-for-3.6-rc5' of git://git./linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap: fix broken PIO mode
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
mmc: dw_mmc: fix error handling in PIO mode
mmc: dw_mmc: correct mishandling error interrupt
mmc: dw_mmc: amend using error interrupt status
mmc: atmel-mci: not busy flag has also to be used for read operations
mmc: sdhci-esdhc: break out early if clock is 0
mmc: mxs-mmc: fix deadlock caused by recursion loop
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
mmc: bfin_sdh: fix dma_desc_array build error
Miklos Szeredi [Wed, 5 Sep 2012 16:38:50 +0000 (18:38 +0200)]
uml: fix compile error in deliver_alarm()
Fix the following compile error on UML.
arch/um/os-Linux/time.c: In function 'deliver_alarm':
arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
arch/um/os-Linux/internal.h:1:6: note: declared here
The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 4 Sep 2012 14:10:08 +0000 (15:10 +0100)]
dj: memory scribble in logi_dj
Allocate a structure, not a pointer to it!
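The class of bug is easy to reproduce in user-space C. This is an illustrative sketch, not the driver code: struct dev_sketch and dev_alloc() are hypothetical names. Sizing an allocation with sizeof of the pointer type reserves only a few bytes, so later writes to the structure's fields scribble past the end of the buffer; sizing it from the pointed-to object avoids that.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device structure, much larger than a pointer. */
struct dev_sketch {
	int id;
	char name[32];
	unsigned long flags;
};

static struct dev_sketch *dev_alloc(void)
{
	/*
	 * sizeof(*d) is the size of the structure. Writing
	 * sizeof(struct dev_sketch *) here would allocate only
	 * pointer-sized storage.
	 */
	struct dev_sketch *d = calloc(1, sizeof(*d));

	return d;
}
```

Using sizeof(*d) rather than naming the type also keeps the allocation correct if the variable's type changes later.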
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 6 Sep 2012 01:41:32 +0000 (18:41 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
busy (I was mostly MIA a week or two before San Diego).
Some fixes from Anton fixing up issues with our relatively new DSCR
control feature, and a few other fixes that are either regressions or
bugs nasty enough to warrant not waiting."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
Linus Torvalds [Thu, 6 Sep 2012 01:40:12 +0000 (18:40 -0700)]
Merge tag 'gpio-fixes-for-v3.6' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma Mobile GPIO
driver.
- Assign the owner field for the rdc321x driver so the module won't
be removed if it has active GPIOs."
* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
gpio: em: Fix checking return value of irq_alloc_descs
gpio: mc9s08dz60: Fix build error if I2C=m
gpio: Fix debug message in of_get_named_gpio_flags()
Linus Torvalds [Thu, 6 Sep 2012 01:38:52 +0000 (18:38 -0700)]
Merge tag 'sound-3.6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There is nothing scary here; it contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5
kernel"
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix cross-interface streaming devices
ALSA: snd-usb: fix calls to next_packet_size
ALSA: snd-usb: restore delay information
ALSA: snd-usb: use list_for_each_safe for endpoint resources
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
ALSA: hda - Avoid unnecessary parameter read for EPSS
ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
Linus Torvalds [Thu, 6 Sep 2012 01:38:02 +0000 (18:38 -0700)]
Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6
Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
OMAPFB: fix framebuffer console colors
OMAPDSS: Fix SDI PLL locking
video: mb862xxfb: prevent divide by zero bug
drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
fbcon: prevent possible buffer overflow.
Linus Torvalds [Thu, 6 Sep 2012 01:37:16 +0000 (18:37 -0700)]
Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi
Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
cases. Now we fix this by using 'kmem_cache_free()' instead."
* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
UBI: fix a horrible memory deallocation bug
Mikulas Patocka [Sat, 1 Sep 2012 16:34:07 +0000 (12:34 -0400)]
Fix order of arguments to compat_put_time[spec|val]
Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).
Because of the user address range check, that in turn causes an
EFAULT, because the user pointer range check fails for the kernel
address. The result is an incorrectly failed system call for 32-bit
processes with a 64-bit kernel.
On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used to read kernel memory.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ronny Hegewald [Fri, 31 Aug 2012 09:57:52 +0000 (09:57 +0000)]
xen: Use correct masking in xen_swiotlb_alloc_coherent.
When running a 32-bit pvops dom0, if a driver tries to allocate coherent
DMA memory, the Xen swiotlb implementation returns memory beyond 4GB.
The underlying reason is that if the driver passes in
DMA_BIT_MASK(64) (hwdev->coherent_dma_mask is set to 0xffffffffffffffff),
our dma_mask will be a u64 set to 0xffffffffffffffff even if we set it to
DMA_BIT_MASK(32) previously, meaning we do not reset the upper bits.
Using the dma_alloc_coherent_mask function does the proper casting,
and we get 0xffffffff.
This broke sound on a system with 4 GB of memory and a 64-bit capable
sound card which sets the DMA mask to 64 bits.
On bare metal and with the forward-ported xen-dom0 patches from OpenSuse,
coherent DMA memory is always allocated inside the 32-bit address range by
calling dma_alloc_coherent_mask.
This patch adds the same functionality to Xen swiotlb and is a rebase of the
original patch from Ronny Hegewald, which never got upstream because the
underlying reason was not understood until now.
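A small sketch of the casting effect described above. It assumes the
narrowing comes from a 32-bit wide return type on a 32-bit kernel; the
macro and function names are hypothetical, not the kernel's:

```c
#include <stdint.h>

/* Illustrative equivalent of a "bit mask of n bits" macro. */
#define SKETCH_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Buggy pattern: the driver's 64-bit mask is kept in a u64 unchanged. */
uint64_t sketch_mask_unclamped(uint64_t coherent_dma_mask)
{
    return coherent_dma_mask ? coherent_dma_mask : SKETCH_DMA_BIT_MASK(32);
}

/* Models a 32-bit kernel, where the helper's return type is 32 bits wide. */
uint32_t sketch_mask_32bit(uint64_t coherent_dma_mask)
{
    uint64_t mask = coherent_dma_mask ? coherent_dma_mask
                                      : SKETCH_DMA_BIT_MASK(32);
    return (uint32_t)mask;   /* the narrowing drops the upper 32 bits */
}
```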
The original email with the original patch is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
the original thread from where the discussion started is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.html
Signed-off-by: Ronny Hegewald <ronny.hegewald@online.de>
Signed-off-by: Stefano Panella <stefano.panella@citrix.com>
Acked-By: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
Alex Shi [Fri, 24 Aug 2012 08:55:13 +0000 (08:55 +0000)]
xen: fix logical error in tlb flushing
While TLB_FLUSH_ALL gets passed as the 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may set op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI, so some pages may not
be flushed from the TLB.
This patch fixes the issue.
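The logical error can be sketched as follows. The sentinel value, the
enum, and both function names are hypothetical; the real selection logic
may include further conditions not modelled here:

```c
/* Hypothetical stand-in for the TLB_FLUSH_ALL sentinel. */
#define SKETCH_FLUSH_ALL (~0UL)

enum sketch_cmd { SKETCH_INVLPG_MULTI, SKETCH_TLB_FLUSH_MULTI };

/* Buggy selection: tests 'start', although the sentinel arrives in 'end'. */
enum sketch_cmd sketch_pick_cmd_buggy(unsigned long start, unsigned long end)
{
    (void)end;
    return start == SKETCH_FLUSH_ALL ? SKETCH_TLB_FLUSH_MULTI
                                     : SKETCH_INVLPG_MULTI;
}

/* Fixed selection: tests 'end', where the sentinel is actually passed. */
enum sketch_cmd sketch_pick_cmd_fixed(unsigned long start, unsigned long end)
{
    (void)start;
    return end == SKETCH_FLUSH_ALL ? SKETCH_TLB_FLUSH_MULTI
                                   : SKETCH_INVLPG_MULTI;
}
```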
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 5 Sep 2012 14:22:45 +0000 (10:22 -0400)]
Merge commit '4cb38750d49010ae72e718d46605ac9ba5a851b4' into stable/for-linus-3.6
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
bcma: fix invalid PMU chip control masks
[libata] pata_cmd64x: whitespace cleanup
libata-acpi: fix up for acpi_pm_device_sleep_state API
sata_dwc_460ex: device tree may specify dma_channel
ahci, trivial: fixed coding style issues related to braces
ahci_platform: add hibernation callbacks
libata-eh.c: local functions should not be exposed globally
libata-transport.c: local functions should not be exposed globally
sata_dwc_460ex: support hardreset
ata: use module_pci_driver
drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
[libata] Prevent interface errors with Seagate FreeAgent GoFlex
drivers/acpi/glue: revert accidental license-related 6b66d95895c bits
libata-acpi: add missing inlines in libata.h
i2c-omap: Add support for I2C_M_STOP message flag
i2c: Fall back to emulated SMBus if the operation isn't supported natively
i2c: Add SCCB support
i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
...
Konrad Rzeszutek Wilk [Tue, 4 Sep 2012 19:45:17 +0000 (15:45 -0400)]
xen/p2m: Fix one-off error in checking the P2M tree directory.
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.
This meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us, p2m_top_index
has a check for this:
BUG_ON(pfn >= MAX_P2M_PFN);
which we hit and saw this:
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000
(XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000
(XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000661795000 cr2: 0000000000000000
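The off-by-one can be illustrated with a small loop. This is a sketch
with a hypothetical directory size and function name; it only counts
visited indices rather than touching any real P2M structure:

```c
#define SKETCH_TOP_ENTRIES 512   /* hypothetical number of valid entries */

/*
 * Walks the directory indices and returns how many were visited.
 * With 'inclusive' set, the upper bound itself is visited as well,
 * which is one past the last valid index.
 */
int sketch_scan(int inclusive, int *out_of_range)
{
    int visited = 0;
    int last = inclusive ? SKETCH_TOP_ENTRIES : SKETCH_TOP_ENTRIES - 1;

    *out_of_range = 0;
    for (int i = 0; i <= last; i++) {
        if (i >= SKETCH_TOP_ENTRIES)
            *out_of_range = 1;   /* where a bounds check would trip */
        visited++;
    }
    return visited;
}
```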
Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Benjamin Herrenschmidt [Tue, 4 Sep 2012 15:08:28 +0000 (15:08 +0000)]
powerpc: Don't use __put_user() in patch_instruction
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at its linked address. That can cause the
!is_kernel_addr() test in __put_user() to trip and call might_sleep(),
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Tue, 4 Sep 2012 18:33:08 +0000 (18:33 +0000)]
powerpc: Make sure IPI handlers see data written by IPI senders
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_ipi_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_ipi_message_pass() is a full barrier to ensure
that there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
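The ordering requirement can be sketched in portable C11 atomics. This
is an analogy under stated assumptions, not the powerpc code: the
variables stand in for rq->wake_list and ipi_message, and the function
names are hypothetical:

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for rq->wake_list and the per-CPU ipi_message. */
static int sketch_wake_list;
static atomic_uint sketch_ipi_message;

/* Sender: publish the payload, then set the message bit. */
void sketch_send(int payload, unsigned int bit)
{
    sketch_wake_list = payload;
    /* Full barrier: the payload store is ordered before the message store. */
    atomic_thread_fence(memory_order_seq_cst);
    atomic_fetch_or_explicit(&sketch_ipi_message, 1u << bit,
                             memory_order_relaxed);
}

/* Receiver: fetch and clear all pending bits with an ordered exchange. */
unsigned int sketch_demux(int *payload_out)
{
    unsigned int all = atomic_exchange_explicit(&sketch_ipi_message, 0u,
                                                memory_order_acq_rel);
    if (all)
        *payload_out = sketch_wake_list;   /* ordered after the exchange */
    return all;
}
```

Without the fence on the sender and the ordered exchange on the
receiver, the receiver could observe the message bit but still read a
stale payload.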
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:51:10 +0000 (16:51 +0000)]
powerpc: Restore correct DSCR in context switch
During a context switch we always restore the per-thread DSCR value.
If we aren't doing explicit DSCR management
(i.e. thread.dscr_inherit == 0) and the default DSCR changed while
the process was sleeping, we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per-thread
DSCR as required.
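The selection described above amounts to the following. This is a
sketch with a hypothetical struct and function name; only the two
fields named in the text are modelled:

```c
/* Hypothetical, reduced view of the per-thread state. */
struct sketch_thread {
    unsigned long dscr;   /* per-thread DSCR value */
    int dscr_inherit;     /* non-zero: thread manages its DSCR explicitly */
};

/* Returns the DSCR value to load when switching to thread 't'. */
unsigned long sketch_dscr_to_restore(const struct sketch_thread *t,
                                     unsigned long dscr_default)
{
    return t->dscr_inherit ? t->dscr : dscr_default;
}
```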
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
With the four patches applied I can run a combination of all
test cases successfully at the same time:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:49:47 +0000 (16:49 +0000)]
powerpc: Fix DSCR inheritance in copy_thread()
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
This was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:48:46 +0000 (16:48 +0000)]
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
This issue was found with the following testcase:
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:47:56 +0000 (16:47 +0000)]
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 26 Jul 2012 18:51:09 +0000 (18:51 +0000)]
powerpc/powernv: Always go into nap mode when CPU is offline
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 26 Jul 2012 13:56:11 +0000 (13:56 +0000)]
powerpc: Give hypervisor decrementer interrupts their own handler
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jesse Larrew [Thu, 7 Jun 2012 21:04:34 +0000 (16:04 -0500)]
powerpc/vphn: Fix arch_update_cpu_topology() return value
arch_update_cpu_topology() should only return 1 when the topology has
actually changed, and should return 0 otherwise.
This patch fixes a potential bug where rebuild_sched_domains() would
reinitialize the sched domains even when the topology hasn't changed.
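A sketch of the intended return-value contract, using a hypothetical
fixed-size CPU-to-node table rather than the real VPHN data:

```c
#define SKETCH_NR_CPUS 4   /* hypothetical CPU count */

/*
 * Applies new CPU-to-node assignments and returns 1 only if at least
 * one CPU actually moved, 0 otherwise.
 */
int sketch_update_topology(int node_of_cpu[SKETCH_NR_CPUS],
                           const int new_node[SKETCH_NR_CPUS])
{
    int changed = 0;

    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        if (node_of_cpu[cpu] != new_node[cpu]) {
            node_of_cpu[cpu] = new_node[cpu];
            changed = 1;
        }
    }
    return changed;
}
```

A caller that rebuilds scheduler domains only on a non-zero return then
skips the rebuild when nothing moved.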
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Walleij [Thu, 30 Aug 2012 17:22:36 +0000 (19:22 +0200)]
ARM: gemini: fix the gemini build
While test-compiling obscure machines I noticed that the gemini (which
by the way lacks a defconfig) has been broken for some time.
Adding a simple missing include makes it build again.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Wed, 5 Sep 2012 04:41:35 +0000 (21:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/horms/renesas into fixes
Two regression fixes and one boot-loader compatibility fix from Simon Horman.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
Paul Walmsley [Fri, 24 Aug 2012 06:00:18 +0000 (06:00 +0000)]
mmc: omap: fix broken PIO mode
After commit 26b88520b80695a6fa5fd95b5d97c03f4daf87e0 ("mmc:
omap_hsmmc: remove private DMA API implementation"), the Nokia N800
here stopped booting:
[ 2.086181] Waiting for root device /dev/mmcblk0p1...
[ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[ 2.331451] Internal error: : 406 [#1] ARM
[ 2.335784] Modules linked in:
[ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 #60)
[ 2.344146] PC is at default_idle+0x28/0x30
[ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0
...
This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c. (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)
The PIO code, added with the rest of the driver in commit
730c9b7e6630f786fcec026fb11d2e6f2c90fdcb ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words. This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes. The driver also did not increment the buffer
pointer after the transfer occurred. This bug resulted in data
corruption during any transfer larger than 64 bytes.
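Both bugs can be illustrated with a userspace sketch of a PIO read
loop. The FIFO model and all names are hypothetical; the point is the
byte-to-word conversion and the advancing buffer pointer:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical FIFO model: each read returns the next 16-bit word. */
struct sketch_fifo {
    const uint16_t *words;
    size_t pos;
};

static uint16_t sketch_fifo_read(struct sketch_fifo *f)
{
    return f->words[f->pos++];
}

/*
 * Reads 'nbytes' (assumed even) from the FIFO into 'buf' and returns
 * the advanced buffer pointer for the next chunk.
 */
uint16_t *sketch_pio_read(struct sketch_fifo *f, uint16_t *buf, size_t nbytes)
{
    size_t nwords = nbytes / 2;   /* bytes converted to 16-bit words */

    for (size_t i = 0; i < nwords; i++)
        buf[i] = sketch_fifo_read(f);
    return buf + nwords;          /* the next chunk lands after this one */
}
```

Treating nbytes as a word count would write twice as far as the buffer
allows, and reusing the same buf for every chunk would overwrite data
already transferred.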
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Ian Chen [Wed, 29 Aug 2012 06:05:36 +0000 (15:05 +0900)]
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
For several MoviNAND eMMC parts, there are known issues with secure
erase and secure trim. For these specific MoviNAND devices, we skip
these operations.
Specifically, there is a bug in the eMMC firmware that causes
unrecoverable corruption when the MMC is erased with MMC_CAP_ERASE
enabled.
References:
http://forum.xda-developers.com/showthread.php?t=1644364
https://plus.google.com/111398485184813224730/posts/21pTYfTsCkB#111398485184813224730/posts/21pTYfTsCkB
Signed-off-by: Ian Chen <ian.cy.chen@samsung.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable <stable@vger.kernel.org> [3.0+]
Signed-off-by: Chris Ball <cjb@laptop.org>
Doug Anderson [Wed, 25 Jul 2012 15:33:17 +0000 (08:33 -0700)]
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
The documentation for the dw_mmc part says that the low power
mode should normally only be set for MMC and SD memory and should
be turned off for SDIO cards that need interrupts detected.
The best place I could find to do this is when the SDIO interrupt
was first enabled. I rely on the fact that dw_mci_setup_bus()
will be called when it's time to reenable.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:46 +0000 (09:30 +0900)]
mmc: dw_mmc: fix error handling in PIO mode
Data transfer continues until all the bytes are transmitted,
even if a data CRC error occurs during a multiple-block data transfer.
This means RXDR/TXDR interrupts will occur until the data transfer is
terminated. Setting host->sg to NULL early prevents entry into the
xxx_data_pio functions, so RXDR/TXDR interrupts are left permanently
unhandled. Checking the error interrupt status in the xxx_data_pio
functions is also unnecessary because dw_mci_interrupt already does the
same, so this patch removes it.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:40 +0000 (09:30 +0900)]
mmc: dw_mmc: correct mishandling error interrupt
The SYNOPSYS datasheet says that the DTO (Data Transfer Over) interrupt
will be raised even if some error interrupts occur; however, in practice
DTO does not occur. SYNOPSYS has confirmed this issue.
The current implementation defers the call to tasklet_schedule until DTO
when an error interrupt has happened. This patch fixes the error handling.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Seungwon Jeon [Wed, 1 Aug 2012 00:30:30 +0000 (09:30 +0900)]
mmc: dw_mmc: amend using error interrupt status
RINTSTS includes masked interrupts as well as unmasked ones.
data_status and cmd_status are set from the value of RINTSTS in the
interrupt handler, and the tasklet finally uses them to decide whether an
error has happened.
In addition, MINTSTS is used for setting data_status in PIO.
A masked error interrupt will not be handled, and that status can be
considered a non-error case.
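The distinction between raw and masked status can be sketched as
follows. The bit position and names are hypothetical, and the sketch
assumes a set bit in the enable register means the interrupt is
delivered:

```c
#include <stdint.h>

/* Hypothetical bit assignment, for illustration only. */
#define SKETCH_INT_DATA_CRC (1u << 7)

/* Masked status: only interrupts that are both pending and enabled. */
uint32_t sketch_masked_status(uint32_t raw_status, uint32_t int_enable)
{
    return raw_status & int_enable;
}

/* Error decision made on whichever status word the caller supplies. */
int sketch_has_error(uint32_t status)
{
    return (status & SKETCH_INT_DATA_CRC) != 0;
}
```

Deciding on the raw status reports an error even when the error
interrupt is disabled; deciding on the masked status does not.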
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed-by: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Ludovic Desroches [Tue, 24 Jul 2012 09:42:04 +0000 (11:42 +0200)]
mmc: atmel-mci: not busy flag has also to be used for read operations
Even though the datasheet says that the not busy flag has to be used only
for write operations, this is true only for versions lower than v2xx.
Not waiting on the not busy flag for read operations can cause the
controller to hang during the initialization of some SD cards
with DMA after the first CMD6, because the next command is sent too early.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: stable <stable@vger.kernel.org> [3.5, 3.6]
Signed-off-by: Chris Ball <cjb@laptop.org>
Shawn Guo [Wed, 22 Aug 2012 15:10:01 +0000 (23:10 +0800)]
mmc: sdhci-esdhc: break out early if clock is 0
Since commit 30832ab56 ("mmc: sdhci: Always pass clock request value
zero to set_clock host op") was merged, esdhc_set_clock reaches
"if (clock == 0)" only after ESDHC_SYSTEM_CONTROL has been modified. This
breaks the SDHCI card-detection function. Fix the regression
by moving "if (clock == 0)" above the ESDHC_SYSTEM_CONTROL operation.
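A sketch of the reordering, with a hypothetical host struct and
arbitrary illustrative register bits in place of the real
ESDHC_SYSTEM_CONTROL layout:

```c
#include <stdint.h>

/* Hypothetical model of the controller state touched by set_clock. */
struct sketch_esdhc {
    uint32_t sysctl;      /* stand-in for the system control register */
    int sysctl_writes;    /* counts modifications, for illustration */
};

void sketch_set_clock(struct sketch_esdhc *h, unsigned int clock)
{
    if (clock == 0)
        return;           /* bail out before the register is touched */

    /* Arbitrary divider bits; not the real register layout. */
    h->sysctl = (h->sysctl & ~0xfff0u) | 0x0100u;
    h->sysctl_writes++;
}
```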
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Lauri Hintsala [Tue, 17 Jul 2012 14:16:10 +0000 (17:16 +0300)]
mmc: mxs-mmc: fix deadlock caused by recursion loop
Release the lock before mmc_signal_sdio_irq is called by
mxs_mmc_enable_sdio_irq.
Backtrace:
[ 65.470000] =============================================
[ 65.470000] [ INFO: possible recursive locking detected ]
[ 65.470000] 3.5.0-rc5 #2 Not tainted
[ 65.470000] ---------------------------------------------
[ 65.470000] ksdioirqd/mmc0/73 is trying to acquire lock:
[ 65.470000] (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[ 65.470000]
[ 65.470000] but task is already holding lock:
[ 65.470000] (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[ 65.470000]
[ 65.470000] other info that might help us debug this:
[ 65.470000] Possible unsafe locking scenario:
[ 65.470000]
[ 65.470000] CPU0
[ 65.470000] ----
[ 65.470000] lock(&(&host->lock)->rlock#2);
[ 65.470000] lock(&(&host->lock)->rlock#2);
[ 65.470000]
[ 65.470000] *** DEADLOCK ***
[ 65.470000]
[ 65.470000] May be due to missing lock nesting notation
[ 65.470000]
[ 65.470000] 1 lock held by ksdioirqd/mmc0/73:
[ 65.470000] #0: (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[ 65.470000]
[ 65.470000] stack backtrace:
[ 65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98)
[ 65.470000] [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98) from [<c005d3f8>] (lock_acquire+0xa0/0x108)
[ 65.470000] [<c005d3f8>] (lock_acquire+0xa0/0x108) from [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c)
[ 65.470000] [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[ 65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[ 65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[ 65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[ 65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)
[ 65.470000] BUG: spinlock lockup suspected on CPU#0, ksdioirqd/mmc0/73
[ 65.470000] lock: 0xc3358724, .magic: dead4ead, .owner: ksdioirqd/mmc0/73, .owner_cpu: 0
[ 65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c01b46b0>] (do_raw_spin_lock+0x100/0x144)
[ 65.470000] [<c01b46b0>] (do_raw_spin_lock+0x100/0x144) from [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c)
[ 65.470000] [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[ 65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[ 65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[ 65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[ 65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)
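The recursion reported above can be modelled in userspace with a
non-recursive lock that records a second acquisition instead of
spinning. All names are hypothetical, and the sketch assumes the signal
routine calls back into the driver's enable routine, as the backtrace
suggests:

```c
/* Toy non-recursive lock that records a recursive acquisition. */
struct sketch_lock {
    int held;
    int recursion_detected;
};

struct sketch_mmc {
    struct sketch_lock lock;
    int sdio_irq_enabled;
    int signalled;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
    if (l->held)
        l->recursion_detected = 1;   /* a real spinlock would deadlock here */
    l->held = 1;
}

static void sketch_lock_release(struct sketch_lock *l)
{
    l->held = 0;
}

/* Driver callback: takes the host lock to change the IRQ enable state. */
static void sketch_enable_sdio_irq(struct sketch_mmc *h, int enable)
{
    sketch_lock_acquire(&h->lock);
    h->sdio_irq_enabled = enable;
    sketch_lock_release(&h->lock);
}

/* Stand-in for the core's signal routine, which calls back into the driver. */
static void sketch_signal_sdio_irq(struct sketch_mmc *h)
{
    h->signalled++;
    sketch_enable_sdio_irq(h, 0);
}

/* With release_first set, the lock is dropped before signalling (the fix). */
void sketch_irq_path(struct sketch_mmc *h, int release_first)
{
    sketch_lock_acquire(&h->lock);
    if (release_first) {
        sketch_lock_release(&h->lock);
        sketch_signal_sdio_irq(h);
    } else {
        sketch_signal_sdio_irq(h);
        sketch_lock_release(&h->lock);
    }
}
```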
Reported-by: Attila Kinali <attila@kinali.ch>
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Lauri Hintsala [Tue, 17 Jul 2012 14:16:09 +0000 (17:16 +0300)]
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
Release the lock before mmc_signal_sdio_irq is called by mxs_mmc_irq_handler.
Backtrace:
[ 79.660000] =============================================
[ 79.660000] [ INFO: possible recursive locking detected ]
[ 79.660000] 3.4.0-00009-g3e96082-dirty #11 Not tainted
[ 79.660000] ---------------------------------------------
[ 79.660000] swapper/0 is trying to acquire lock:
[ 79.660000] (&(&host->lock)->rlock#2){-.....}, at: [<c026ea3c>] mxs_mmc_enable_sdio_irq+0x18/0xd4
[ 79.660000]
[ 79.660000] but task is already holding lock:
[ 79.660000] (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8
[ 79.660000]
[ 79.660000] other info that might help us debug this:
[ 79.660000] Possible unsafe locking scenario:
[ 79.660000]
[ 79.660000] CPU0
[ 79.660000] ----
[ 79.660000] lock(&(&host->lock)->rlock#2);
[ 79.660000] lock(&(&host->lock)->rlock#2);
[ 79.660000]
[ 79.660000] *** DEADLOCK ***
[ 79.660000]
[ 79.660000] May be due to missing lock nesting notation
[ 79.660000]
[ 79.660000] 1 lock held by swapper/0:
[ 79.660000] #0: (&(&host->lock)->rlock#2){-.....}, at: [<c026f744>] mxs_mmc_irq_handler+0x1c/0xe8
[ 79.660000]
[ 79.660000] stack backtrace:
[ 79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c005f9c0>] (__lock_acquire+0x1948/0x1d48)
[ 79.660000] [<c005f9c0>] (__lock_acquire+0x1948/0x1d48) from [<c005fea0>] (lock_acquire+0xe0/0xf8)
[ 79.660000] [<c005fea0>] (lock_acquire+0xe0/0xf8) from [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58)
[ 79.660000] [<c03a8460>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4)
[ 79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8)
[ 79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254)
[ 79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c)
[ 79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110)
[ 79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50)
[ 79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84)
[ 79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60)
[ 79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40)
[ 79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc)
[ 79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8)
[ 79.660000] BUG: spinlock lockup on CPU#0, swapper/0
[ 79.660000] lock: c398cb2c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
[ 79.660000] [<c0014bd0>] (unwind_backtrace+0x0/0xf4) from [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144)
[ 79.660000] [<c01ddb1c>] (do_raw_spin_lock+0xf0/0x144) from [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58)
[ 79.660000] [<c03a8468>] (_raw_spin_lock_irqsave+0x4c/0x58) from [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4)
[ 79.660000] [<c026ea3c>] (mxs_mmc_enable_sdio_irq+0x18/0xd4) from [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8)
[ 79.660000] [<c026f7fc>] (mxs_mmc_irq_handler+0xd4/0xe8) from [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254)
[ 79.660000] [<c006bdd8>] (handle_irq_event_percpu+0x70/0x254) from [<c006bff8>] (handle_irq_event+0x3c/0x5c)
[ 79.660000] [<c006bff8>] (handle_irq_event+0x3c/0x5c) from [<c006e6d0>] (handle_level_irq+0x90/0x110)
[ 79.660000] [<c006e6d0>] (handle_level_irq+0x90/0x110) from [<c006b930>] (generic_handle_irq+0x38/0x50)
[ 79.660000] [<c006b930>] (generic_handle_irq+0x38/0x50) from [<c00102fc>] (handle_IRQ+0x30/0x84)
[ 79.660000] [<c00102fc>] (handle_IRQ+0x30/0x84) from [<c000f058>] (__irq_svc+0x38/0x60)
[ 79.660000] [<c000f058>] (__irq_svc+0x38/0x60) from [<c0010520>] (default_idle+0x2c/0x40)
[ 79.660000] [<c0010520>] (default_idle+0x2c/0x40) from [<c0010a90>] (cpu_idle+0x64/0xcc)
[ 79.660000] [<c0010a90>] (cpu_idle+0x64/0xcc) from [<c04ff858>] (start_kernel+0x244/0x2c8)
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Sonic Zhang [Mon, 30 Jul 2012 07:03:02 +0000 (15:03 +0800)]
mmc: bfin_sdh: fix dma_desc_array build error
The descriptor array structure has been moved into the blackfin dma.h header file.
This patch fixes the error below:
drivers/mmc/host/bfin_sdh.c:52:8: error: redefinition of 'struct dma_desc_array'
make[4]: *** [drivers/mmc/host/bfin_sdh.o] Error 1
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Artem Bityutskiy [Mon, 3 Sep 2012 14:12:29 +0000 (17:12 +0300)]
UBI: fix a horrible memory deallocation bug
UBI was mistakenly using 'kfree()' instead of 'kmem_cache_free()' when
freeing "attach eraseblock" structures in vtbl.c. Thankfully, this happened
only when we were doing auto-format, so many systems were unaffected. However,
there are still many users affected.
It is strange, but the system did not crash and nothing bad happened when
the SLUB memory allocator was used. However, in case of SLOB we observed a
crash right away.
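The pairing rule can be illustrated with a toy userspace object cache.
This is only an analogy with hypothetical names: it uses malloc()
underneath and tracks live objects, whereas the kernel's slab
allocators manage memory quite differently:

```c
#include <stdlib.h>

/* Toy object cache: fixed object size plus a count of live objects. */
struct sketch_cache {
    size_t obj_size;
    int live;
};

/* Allocates one object from the cache and accounts for it. */
void *sketch_cache_alloc(struct sketch_cache *c)
{
    void *p = malloc(c->obj_size);

    if (p)
        c->live++;
    return p;
}

/* The matching release routine: keeps the cache's accounting consistent. */
void sketch_cache_free(struct sketch_cache *c, void *p)
{
    if (p) {
        c->live--;
        free(p);
    }
}
```

Releasing an object through any other routine bypasses the cache's own
bookkeeping, which is the class of mismatch fixed here.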
This problem was introduced in 2.6.39 by commit
"6c1e875 UBI: add slab cache for ubi_scan_leb objects"
A note for stable trees:
Because variables were renamed, this won't cleanly apply to older kernels.
Changing names like this should help:
1. ai -> si
2. aeb_slab_cache -> seb_slab_cache
3. new_aeb -> new_seb
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org [v2.6.39+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Kuninori Morimoto [Mon, 3 Sep 2012 06:06:52 +0000 (23:06 -0700)]
ARM: shmobile: armadillo800eva: enable rw rootfs mount
The armadillo800eva default boot loader is "hermit",
and its tag->u.core.flags has a flag set when the kernel boots.
Because of this, ${LINUX}/arch/arm/kernel/setup.c :: parse_tag_core()
doesn't remove the MS_RDONLY flag from root_mountflags.
Thus, the rootfs is mounted as "readonly".
This patch adds the "rw" kernel parameter
to enable read/write mounts for rootfs.
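The effect of the boot loader flag can be sketched as follows. The
sketch assumes that bit 0 of the core tag flags requests a read-only
root; the constant and function name are hypothetical:

```c
/* Hypothetical stand-in for the MS_RDONLY mount flag. */
#define SKETCH_MS_RDONLY 1u

/*
 * Returns the root mount flags after applying the boot loader's core
 * tag flags. Read-only is dropped only when bit 0 is clear.
 */
unsigned int sketch_apply_core_flags(unsigned int root_mountflags,
                                     unsigned int core_flags)
{
    if ((core_flags & 1u) == 0)
        root_mountflags &= ~SKETCH_MS_RDONLY;
    return root_mountflags;
}
```

When the boot loader sets the flag, the read-only bit survives, which is
why an explicit "rw" on the kernel command line is needed.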
Cc: Masahiro Nakai <nakai@atmark-techno.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Linus Torvalds [Sun, 2 Sep 2012 18:30:10 +0000 (11:30 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French.
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix cifs_do_create error hadnling
cifs: print error code if smb signature verification fails
CIFS: Fix log messages in packet checking for SMB2
CIFS: Protect i_nlink from being negative
Linus Torvalds [Sun, 2 Sep 2012 18:28:00 +0000 (11:28 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) NLA_PUT* --> nla_put_* conversion got one case wrong in
nfnetlink_log, fix from Patrick McHardy.
2) Missed error return check in ipw2100 driver, from Julia Lawall.
3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix
from Eric Dumazet.
4) SFC driver erroneously reversed src and dst when reporting filters
via ethtool.
5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in
sja1000 can platform driver, from Alexey Khoroshilov and Sven
Schmitt.
6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only
take the lock when we really need to. From Eric Dumazet.
7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso.
8) CWND reduction in TCP is done incorrectly during non-SACK recovery,
fix from Yuchung Cheng.
9) Revert netpoll change, and fix what was actually a driver specific
problem. From Amerigo Wang. This should cure bootup hangs with
netconsole some people reported.
10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page
pointer. From Ian Campbell.
11) SIP NAT fix for expectation creation, from Pablo Neira Ayuso.
12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet.
13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this
situation. From Oliver Neukum.
14) The davinci ethernet driver triggers an OOPS on removal because it
frees an MDIO object before unregistering it. Fix from Bin Liu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
net: qmi_wwan: add several new Gobi devices
fddi: 64 bit bug in smt_add_para()
net: ethernet: fix kernel OOPS when remove davinci_mdio module
net/xfrm/xfrm_state.c: fix error return code
net: ipv6: fix error return code
net: qmi_wwan: new device: Foxconn/Novatel E396
usbnet: fix deadlock in resume
cs89x0 : packet reception not working
netfilter: nf_conntrack: fix racy timer handling with reliable events
bnx2x: Correct the ndo_poll_controller call
bnx2x: Move netif_napi_add to the open call
ipv4: must use rcu protection while calling fib_lookup
bnx2x: fix 57840_MF pci id
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
e1000e: DoS while TSO enabled caused by link partner with small MSS
l2tp: avoid to use synchronize_rcu in tunnel free function
gianfar: fix default tx vlan offload feature flag
netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
netfilter: nfnetlink_log: fix error return code in init path
...
Olof Johansson [Sun, 2 Sep 2012 15:22:58 +0000 (08:22 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/horms/renesas into fixes
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: marzen: fixup smsc911x id for regulator
Olof Johansson [Sun, 2 Sep 2012 15:21:25 +0000 (08:21 -0700)]
Merge branch 'fixes-for-v3.6-v2' of git://git.infradead.org/users/jcooper/linux into fixes
* 'fixes-for-v3.6-v2' of git://git.infradead.org/users/jcooper/linux:
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
Henrik Rydberg [Sat, 1 Sep 2012 19:47:11 +0000 (21:47 +0200)]
HID: Only dump input if someone is listening
Going through the motions of printing the debug message information
takes a long time; using the keyboard can lead to a 160 us irqsoff
latency. This patch skips hid_dump_input() when there are no open
handles, which brings latency down to 100 us.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Bjørn Mork [Sat, 1 Sep 2012 03:47:26 +0000 (03:47 +0000)]
net: qmi_wwan: add several new Gobi devices
Gobi devices are composite, needing both the qcserial and
qmi_wwan drivers to support all functions. Re-syncing the
list of supported devices with qcserial.
Cc: Aleksander Morgado <aleksander@lanedo.com>
Cc: Thomas Tuttle <ttuttle@chromium.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@tempietto.lan>
Dan Carpenter [Sat, 1 Sep 2012 09:57:40 +0000 (09:57 +0000)]
fddi: 64 bit bug in smt_add_para()
The intent was to set 4 bytes of data so that's why the sp_len is set
to 4 on the next line. The cast to u_long pointer clears 8 bytes
on 64 bit arches.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@tempietto.lan>
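The width problem described in the fddi commit above can be illustrated outside the kernel. The following is only a sketch: `struct demo_para`, `clear_four_bytes()` and `long_store_width()` are hypothetical names chosen for illustration and do not reflect the driver's actual data layout or the exact change made by the patch. It shows a clear that always touches exactly four bytes, in contrast to a store through an `unsigned long` pointer, whose width follows the platform (8 bytes on typical 64-bit Linux ABIs).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a parameter buffer; not the driver's layout. */
struct demo_para {
	uint8_t data[4];	/* the 4 bytes that are meant to be cleared */
	uint8_t next[4];	/* neighbouring bytes that must stay intact */
};

/* Clears exactly four bytes, independent of the width of long. */
static void clear_four_bytes(struct demo_para *p)
{
	memset(p->data, 0, sizeof(uint32_t));
}

/*
 * Number of bytes a store such as "*(unsigned long *)ptr = 0" would write.
 * On LP64 platforms this is 8, which is how the neighbouring bytes
 * would get clobbered.
 */
static size_t long_store_width(void)
{
	return sizeof(unsigned long);
}
```

Using a fixed-width type (or an explicit byte count) keeps the amount of memory written in step with the length field that is set on the following line.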
Guenter Roeck [Sat, 25 Aug 2012 00:25:01 +0000 (17:25 -0700)]
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
DIV_ROUND_CLOSEST returns a bad result for negative dividends:
DIV_ROUND_CLOSEST(-2, 2) = 0
Most of the time this does not matter. However, in the hardware monitoring
subsystem, DIV_ROUND_CLOSEST is sometimes used on integers which can be
negative (such as temperatures).
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
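The rounding problem in the DIV_ROUND_CLOSEST commit above comes from C integer division truncating toward zero. The sketch below uses plain functions with hypothetical names rather than the kernel macro, and assumes a positive divisor. The naive form adds half the divisor before dividing, which is only right for non-negative dividends; the sign-aware form subtracts half the divisor when the dividend is negative.

```c
/* Naive form: correct only for non-negative dividends. */
static int div_round_closest_naive(int x, int d)
{
	return (x + d / 2) / d;
}

/*
 * Sign-aware form (sketch, assumes d > 0): bias away from zero in the
 * direction of the dividend's sign before the truncating division.
 */
static int div_round_closest_signed(int x, int d)
{
	if (x >= 0)
		return (x + d / 2) / d;
	return (x - d / 2) / d;
}
```

With these definitions, `div_round_closest_naive(-2, 2)` evaluates to 0 as in the report, while `div_round_closest_signed(-2, 2)` evaluates to -1.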
Linus Torvalds [Sat, 1 Sep 2012 17:39:58 +0000 (10:39 -0700)]
Linux 3.6-rc4
John Stultz [Fri, 31 Aug 2012 17:30:06 +0000 (13:30 -0400)]
time: Move ktime_t overflow checking into timespec_valid_strict
Andreas Bombe reported that the ktime_t overflow checking added to
timespec_valid in commit 4e8b14526ca7 ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger than KTIME_MAX to be invalid.
Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.
This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping code's
internal checking to use this stricter function.
Reported-and-tested-by: Andreas Bombe <aeb@debian.org>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
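The split described in the timespec_valid_strict commit above can be sketched in user space as follows. All names carry a `demo_` or `DEMO_` prefix because they are hypothetical stand-ins, and the overflow limit is an assumption (the number of whole seconds that fit in a signed 64-bit nanosecond count), not a value copied from the kernel headers. The basic check accepts very large timeouts, which callers can then clamp; the strict check additionally rejects values that would overflow the nanosecond representation.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC	1000000000L
/* Assumed limit: whole seconds that fit in a signed 64-bit ns count. */
#define DEMO_KTIME_SEC_MAX	(INT64_MAX / DEMO_NSEC_PER_SEC)

struct demo_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/* Basic sanity check: non-negative seconds, nanoseconds in range. */
static bool demo_timespec_valid(const struct demo_timespec *ts)
{
	if (ts->tv_sec < 0)
		return false;
	if (ts->tv_nsec < 0 || ts->tv_nsec >= DEMO_NSEC_PER_SEC)
		return false;
	return true;
}

/* Strict check: also reject values that would overflow when converted. */
static bool demo_timespec_valid_strict(const struct demo_timespec *ts)
{
	if (!demo_timespec_valid(ts))
		return false;
	if (ts->tv_sec >= DEMO_KTIME_SEC_MAX)
		return false;
	return true;
}
```

Keeping the two checks separate lets timeout paths keep accepting oversized values while the timekeeping core applies the tighter rule.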
Axel Lin [Sat, 1 Sep 2012 06:11:22 +0000 (14:11 +0800)]
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
This driver can be built as a module, so set the missing owner field of
struct gpio_chip to prevent removal of modules exporting active GPIOs.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Torvalds [Sat, 1 Sep 2012 00:02:58 +0000 (17:02 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix KVM_GET_MSR for PV EOI
kvm: Fix nonsense handling of compat ioctl
Linus Torvalds [Sat, 1 Sep 2012 00:02:20 +0000 (17:02 -0700)]
Merge tag 'parisc-fixes' of git://git./linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is a set of two bug fixes. One is the ATOMIC problem which is
now causing a compile failure in certain situations. The other is
mishandling of PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix personality flag check in copy_thread()
[PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Linus Torvalds [Sat, 1 Sep 2012 00:01:31 +0000 (17:01 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of s390 bug fixes for 3.5-rc4"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/32: Don't clobber personality flags on exec
s390/smp: add missing smp_store_status() for !SMP
s390/dasd: fix ioctl return value
s390: Always use "long" for ssize_t to match size_t
Axel Lin [Tue, 28 Aug 2012 11:30:44 +0000 (19:30 +0800)]
gpio: em: Fix checking return value of irq_alloc_descs
irq_alloc_descs() returns negative error code on failure.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
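The convention mentioned in the gpio-em commit above (a negative error code on failure, a usable base number otherwise) can be shown with a small user-space sketch. `demo_alloc_descs()` and `demo_probe()` are hypothetical stand-ins, not the kernel's irq_alloc_descs() or the driver's probe function, and the sketch makes no claim about what the original check looked like. The point is only that the failure test has to be `ret < 0`.

```c
#include <errno.h>

/*
 * Hypothetical allocator following the same convention: returns a
 * base number on success, or a negative errno value on failure.
 */
static int demo_alloc_descs(int simulate_failure)
{
	return simulate_failure ? -ENOMEM : 32;
}

/* Propagates the negative error code, otherwise stores the base. */
static int demo_probe(int simulate_failure, int *irq_base)
{
	int ret = demo_alloc_descs(simulate_failure);

	if (ret < 0)	/* failure is signalled by a negative value */
		return ret;

	*irq_base = ret;
	return 0;
}
```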
Axel Lin [Wed, 29 Aug 2012 01:35:24 +0000 (09:35 +0800)]
gpio: mc9s08dz60: Fix build error if I2C=m
Make GPIO_MC9S08DZ60 depend on I2C=y; this fixes the build error below:
LD init/built-in.o
drivers/built-in.o: In function `mc9s08dz60_get_value':
clk-fixed-factor.c:(.text+0x7214): undefined reference to `i2c_smbus_read_byte_data'
drivers/built-in.o: In function `mc9s08dz60_set':
clk-fixed-factor.c:(.text+0x727c): undefined reference to `i2c_smbus_read_byte_data'
clk-fixed-factor.c:(.text+0x72bc): undefined reference to `i2c_smbus_write_byte_data'
drivers/built-in.o: In function `mc9s08dz60_i2c_driver_init':
clk-fixed-factor.c:(.init.text+0x290): undefined reference to `i2c_register_driver'
drivers/built-in.o: In function `mc9s08dz60_i2c_driver_exit':
clk-fixed-factor.c:(.exit.text+0x2c): undefined reference to `i2c_del_driver'
make: *** [vmlinux] Error 1
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Bin Liu [Thu, 30 Aug 2012 06:37:32 +0000 (06:37 +0000)]
net: ethernet: fix kernel OOPS when remove davinci_mdio module
The davinci mdio device is not unregistered from the mdiobus when removing
the module, which triggers a BUG_ON() when freeing the device from the mdiobus.
Calling mdiobus_unregister() before mdiobus_free() fixes the issue.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Wed, 29 Aug 2012 06:49:15 +0000 (06:49 +0000)]
net/xfrm/xfrm_state.c: fix error return code
Initialize return variable before exiting on an error path.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
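The class of bug that the semantic match above looks for can be reduced to a few lines of ordinary C. This is a sketch with hypothetical function names, not code from xfrm_state.c: after an earlier step has succeeded, `ret` holds 0, so a later error path that returns `ret` without assigning it reports success to the caller.

```c
#include <errno.h>

/* Hypothetical earlier step that succeeds and returns 0. */
static int demo_first_step(void)
{
	return 0;
}

/* Buggy shape: the second error path returns a stale 0. */
static int demo_init_buggy(int second_step_ok)
{
	int ret = demo_first_step();

	if (ret < 0)
		return ret;
	if (!second_step_ok)
		return ret;	/* bug: ret is still 0 here */
	return 0;
}

/* Fixed shape: set the return variable before leaving on error. */
static int demo_init_fixed(int second_step_ok)
{
	int ret = demo_first_step();

	if (ret < 0)
		return ret;
	if (!second_step_ok) {
		ret = -ENOMEM;
		return ret;
	}
	return 0;
}
```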
Julia Lawall [Wed, 29 Aug 2012 06:49:12 +0000 (06:49 +0000)]
net: ipv6: fix error return code
Initialize return variable before exiting on an error path.
The initial initialization of the return variable is also dropped, because
that value is never used.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Aleksander Morgado [Tue, 28 Aug 2012 02:30:32 +0000 (02:30 +0000)]
net: qmi_wwan: new device: Foxconn/Novatel E396
Foxconn-branded Novatel E396, Gobi3k modem.
Cc: Dan Williams <dcbw@redhat.com>
Cc: Bjørn Mork <bjorn@mork.no>
Cc: Ben Chan <benchan@google.com>
Signed-off-by: Aleksander Morgado <aleksander@lanedo.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Sun, 26 Aug 2012 20:41:38 +0000 (20:41 +0000)]
usbnet: fix deadlock in resume
A usbnet device can share a multifunction device
with a storage device. If the storage device is autoresumed,
the usbnet device also needs to be autoresumed. Allocating
memory with GFP_KERNEL can deadlock in this case.
This should go back into all kernels that have
commit 65841fd5132c3941cdf5df09e70df3ed28323212.
That is 3.5.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
CC: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Jaccon Bastiaansen [Mon, 27 Aug 2012 11:53:51 +0000 (11:53 +0000)]
cs89x0 : packet reception not working
The RxCFG register of the CS89x0 could be configured incorrectly
(because of misplaced parentheses), resulting in the disabling
of packet reception.
Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Mack [Thu, 30 Aug 2012 16:52:31 +0000 (18:52 +0200)]
ALSA: snd-usb: fix cross-interface streaming devices
Commit 68e67f40b ("ALSA: snd-usb: move calls to usb_set_interface")
saved us some unnecessary calls to snd_usb_set_interface() but ignored
the fact that there is at least one device out there which operates on
two endpoints in different interfaces simultaneously.
Take care of this by catching the case where data and sync endpoints
are located on different interfaces and calling snd_usb_set_interface()
between the start of the two endpoints.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-by: Robert M. Albrecht <linux@romal.de>
Cc: stable@kernel.org [v3.5+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel Mack [Thu, 30 Aug 2012 16:52:30 +0000 (18:52 +0200)]
ALSA: snd-usb: fix calls to next_packet_size
In order to support devices with implicit feedback streaming models,
packet sizes are now stored with each individual urb, and the PCM
handling code which fills the buffers purely relies on the size fields
now.
However, calling snd_usb_audio_next_packet_size() for all possible
packets in an URB at once, prior to letting the PCM code do its job,
does not in fact lead to the same behaviour as the old code:
the PCM code will break its loop once a period boundary is reached,
consequently using up fewer packets than it really could.
As snd_usb_audio_next_packet_size() implements a feedback mechanism to
the endpoint's phase accumulator, the number of calls to that function
matters, and when called too often, the data rate runs out of bounds.
Fix this by making the next_packet function public, and calling it from the
PCM code as before if the packet data sizes are not defined.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: stable@kernel.org [v3.5+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel Mack [Thu, 30 Aug 2012 16:52:29 +0000 (18:52 +0200)]
ALSA: snd-usb: restore delay information
Parts of commit 294c4fb8 ("ALSA: usb: refine delay information with USB
frame counter") were unfortunately lost during the refactoring of the
snd-usb driver in 3.5.
This patch adds them back, restoring the correct delay information
behaviour.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: stable@kernel.org [3.5+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
David S. Miller [Fri, 31 Aug 2012 17:06:37 +0000 (13:06 -0400)]
Merge branch 'master' of git://1984.lsi.us.es/nf
Pavel Roskin [Thu, 30 Aug 2012 21:11:17 +0000 (17:11 -0400)]
ALSA: snd-usb: use list_for_each_safe for endpoint resources
snd_usb_endpoint_free() frees the structure that contains its argument.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
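The reason a "safe" iterator is needed in the snd-usb commit above is that the loop body frees the node the iterator is standing on. The following user-space sketch uses a hypothetical singly linked list (`struct demo_node`) instead of the kernel's list_head, but follows the same idea as list_for_each_safe(): the successor is saved before the current node is freed, so the loop never reads freed memory.

```c
#include <stdlib.h>

/* Hypothetical list node; the kernel code uses struct list_head instead. */
struct demo_node {
	struct demo_node *next;
	int id;
};

/* Pushes a new node at the front; returns the new head. */
static struct demo_node *demo_push(struct demo_node *head, int id)
{
	struct demo_node *n = malloc(sizeof(*n));

	if (!n)
		return head;	/* allocation failed, list unchanged */
	n->id = id;
	n->next = head;
	return n;
}

/*
 * Frees every node and returns how many were freed. The successor is
 * saved in tmp before free(), so advancing the loop never dereferences
 * a node that has already been released.
 */
static int demo_free_all(struct demo_node *head)
{
	struct demo_node *pos, *tmp;
	int freed = 0;

	for (pos = head; pos != NULL; pos = tmp) {
		tmp = pos->next;
		free(pos);
		freed++;
	}
	return freed;
}
```

A plain iteration that evaluates `pos->next` after `free(pos)` would be the unsafe counterpart of this loop.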
Andrew Lunn [Thu, 30 Aug 2012 05:39:12 +0000 (07:39 +0200)]
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
Linux-next has failed to compile for kirkwood since 23 August with:
arch/arm/mach-kirkwood/db88f6281-bp-setup.c:29: error: 'SZ_1M' undeclared here (not in a function)
arch/arm/mach-kirkwood/db88f6281-bp-setup.c:33: error: 'SZ_4M' undeclared here (not in a function)
Add missing <linux/sizes.h>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Pablo Neira Ayuso [Wed, 29 Aug 2012 16:25:49 +0000 (16:25 +0000)]
netfilter: nf_conntrack: fix racy timer handling with reliable events
Existing code assumes that del_timer returns true for alive conntrack
entries. However, this is not true if reliable events are enabled.
In that case, del_timer may return true for entries that were
just inserted in the dying list. Note that packets / ctnetlink may
hold references to conntrack entries that were just inserted to such
list.
This patch fixes the issue by adding an independent timer for
event delivery. This increases the size of the ecache extension.
Still we can revisit this later and use variable size extensions
to allocate this area on demand.
Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Kuninori Morimoto [Mon, 6 Aug 2012 05:47:00 +0000 (22:47 -0700)]
ARM: shmobile: mackerel: fixup usb module order
The renesas_usbhs driver can act as both Host and Gadget.
In the Gadget case, it requires not only renesas_usbhs
but also a usb gadget module (like g_ether).
So, the renesas_usbhs driver calls usb_add_gadget_udc() at probe time.
Because of this behavior, a Host port also plays the Gadget role
if the kernel has both Host and Gadget support.
In the mackerel case, since commit 0ada2da51800a4914887a9bcf22d563be80e50be
(ARM: mach-shmobile: mackerel: use renesas_usbhs instead of r8a66597_hcd)
usb0 plays the Gadget role and usb1 plays the Host role,
and the current mackerel board probes in the order usb1 -> usb0.
Thus, the 1st installed usb gadget module (like g_ether) will be
assigned to usb1 (= usb Host port), and the 2nd module to usb0 (= usb Gadget port).
This is very confusing for the user.
This patch fixes the usb module probing order to be usb0 -> usb1.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Kuninori Morimoto [Thu, 9 Aug 2012 06:03:07 +0000 (23:03 -0700)]
ARM: shmobile: armadillo800eva: fixup: sound card detection order
Since armadillo800eva has 2 sound cards
and was affected by the reversed deferred probe order issue,
they were purposely registered in reverse order.
But that issue was solved by commit 1d29cfa57471a5e4b8a7c2a7433eeba170d3ad92
(driver core: fixup reversed deferred probe order).
The armadillo800eva board expects
FSI-WM8978 to be the 1st and FSI-HDMI the 2nd sound card.
This patch fixes the registration order accordingly.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Merav Sicron [Mon, 27 Aug 2012 03:26:20 +0000 (03:26 +0000)]
bnx2x: Correct the ndo_poll_controller call
This patch corrects poll_bnx2x (the ndo_poll_controller call), which was not
functioning well with MSI-X.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Mon, 27 Aug 2012 03:26:19 +0000 (03:26 +0000)]
bnx2x: Move netif_napi_add to the open call
Move netif_napi_add for all queues from the probe call to the open call, to
avoid the case that napi objects are added for queues that may eventually not
be initialized and activated. With the former behavior, the driver could crash
when netpoll was calling ndo_poll_controller.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 28 Aug 2012 12:33:07 +0000 (12:33 +0000)]
ipv4: must use rcu protection while calling fib_lookup
The following lockdep splat was reported by Pavel Roskin:
[ 1570.586223] ===============================
[ 1570.586225] [ INFO: suspicious RCU usage. ]
[ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
[ 1570.586229] -------------------------------
[ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
[ 1570.586233]
[ 1570.586233] other info that might help us debug this:
[ 1570.586233]
[ 1570.586236]
[ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
[ 1570.586238] 2 locks held by Chrome_IOThread/4467:
[ 1570.586240] #0: (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
[ 1570.586253] #1: (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
[ 1570.586260]
[ 1570.586260] stack backtrace:
[ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
[ 1570.586265] Call Trace:
[ 1570.586271] [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
[ 1570.586275] [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
[ 1570.586278] [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
[ 1570.586282] [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
[ 1570.586285] [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
[ 1570.586290] [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
[ 1570.586293] [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
[ 1570.586296] [<ffffffff814f2c35>] release_sock+0x55/0xa0
[ 1570.586300] [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
[ 1570.586305] [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
[ 1570.586308] [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
[ 1570.586312] [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
[ 1570.586315] [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
[ 1570.586320] [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
[ 1570.586323] [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
[ 1570.586328] [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
[ 1570.586332] [<ffffffff8114c5a5>] vfs_write+0x165/0x180
[ 1570.586335] [<ffffffff8114c805>] sys_write+0x45/0x90
[ 1570.586340] [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 26 Aug 2012 00:35:45 +0000 (00:35 +0000)]
bnx2x: fix 57840_MF pci id
Commit c3def943c7117d42caaed3478731ea7c3c87190e added support for
new pci ids of the 57840 board, but failed to change the obsolete value
in 'pci_ids.h'.
This patch does so, allowing the probe of such devices.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Francesco Ruggeri [Fri, 24 Aug 2012 07:38:35 +0000 (07:38 +0000)]
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait for in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronization when needed, and they both only modify mrt.
Tested in Linux 3.4.8.
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>