Linus Torvalds [Thu, 26 Jun 2014 03:06:06 +0000 (20:06 -0700)]
Merge tag 'nfs-for-3.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Highlights include:
- Stable fix for a data corruption case due to incorrect cache
validation
- Fix a couple of false positive cache invalidations
- Fix NFSv4 security negotiation issues"
* tag 'nfs-for-3.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: test SECINFO RPC_AUTH_GSS pseudoflavors for support
NFS Return -EPERM if no supported or matching SECINFO flavor
NFS check the return of nfs4_negotiate_security in nfs4_submount
NFS: Don't mark the data cache as invalid if it has been flushed
NFS: Clear NFS_INO_REVAL_PAGECACHE when we update the file size
nfs: Fix cache_validity check in nfs_write_pageuptodate()
Linus Torvalds [Wed, 25 Jun 2014 19:19:01 +0000 (12:19 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"A new set of bug fixes for 3.16, containing patches for seven
platforms:
at91:
- drivers/misc fix for Kconfig PWM symbol
- correction of several values in DT after conversion to CCF
- fix at91sam9261/at91sam9261ek mistake in slow crystal vs. slow RC osc
imx:
- Use GPIO for card CD/WP on imx51-babbage and eukrea-mbimxsd51,
because controller base CD/WP is not working in esdhc driver due to
runtime PM support
- A couple of random ventana gw5xxx board fixes
- Add IMX_IPUV3_CORE back to defconfig, which gets lost when moving
IPUv3 driver out of staging tree
- Fix enet/fec clock selection on imx6sl
- Fix display node on imx53-m53evk board
- A couple of Cubox-i updates from Russell, which were omitted from
the merge window due to dependency
integrator:
- fix an OF-related regression against 3.15
mvebu:
- mvebu (v7)
- Fix broken SoC ID detection
- Select ARM_CPU_SUSPEND for v7
- Remove armada38x compatible string (no users yet)
- Enable Dove SoC in mvebu_v7_defconfig
- kirkwood
- Fix phy-connection-type on GuruPlug board
qcom:
- enable gsbi driver in defconfig
- fix section mismatch warning in serial driver
samsung:
- use WFI macro in platform_do_lowpower because exynos cpuhotplug
includes a hardcoded WFI instruction and it causes compile error
in Thumb-2 mode.
- fix GIC reg sizes for exynos4 SoCs
- remove reset timer counter value during boot and resume for mct
to fix a big jump in printk timestamps
- fix pm code to check cortex-A9 for other exynos SoCs
- don't rely on firmware's secondary_cpu_start for mcpm
sti:
- Ethernet clocks were wrongly defined for STiH415/416 platforms
- STiH416 B2020 revision E DTS file name contained uppercase, change to
lowercase"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (33 commits)
ARM: at91/dt: sam9261: remove slow RC osc
ARM: at91/dt: define sam9261ek slow crystal frequency
ARM: at91/dt: sam9261: correctly define mainck
ARM: at91/dt: sam9n12: correct PLLA ICPLL and OUT values
ARM: at91/dt: sam9x5: correct PLLA ICPLL and OUT values
misc: atmel_pwm: fix Kconfig symbols
ARM: integrator: fix OF-related regression
ARM: mvebu: Fix the improper use of the compatible string armada38x using a wildcard
ARM: dts: kirkwood: fix phy-connection-type for Guruplug
ARM: EXYNOS: Don't rely on firmware's secondary_cpu_start for mcpm
ARM: dts: imx51-eukrea-mbimxsd51-baseboard: unbreak esdhc.
ARM: dts: imx51-babbage: Fix esdhc setup
ARM: dts: mx5: Move the display out of soc {} node
ARM: dts: mx5: Fix IPU port node placement
ARM: mvebu: select ARM_CPU_SUSPEND for Marvell EBU v7 platforms
ARM: mvebu: Fix broken SoC ID detection
ARM: imx_v6_v7_defconfig: Enable CONFIG_IMX_IPUV3_CORE
ARM: multi_v7_defconfig: Add QCOM GSBI driver
ARM: stih41x: Rename stih416-b2020-revE.dts to stih416-b2020e.dts
tty: serial: msm: Fix section mismatch warning
...
Arnd Bergmann [Wed, 25 Jun 2014 18:27:15 +0000 (20:27 +0200)]
Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
Merge "First AT91 fixes batch for 3.16" from Nicolas Ferre:
- drivers/misc fix for Kconfig PWM symbol
- correction of several values in DT after conversion to CCF
- fix at91sam9261/at91sam9261ek mistake in slow crystal vs. slow RC osc
* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
ARM: at91/dt: sam9261: remove slow RC osc
ARM: at91/dt: define sam9261ek slow crystal frequency
ARM: at91/dt: sam9261: correctly define mainck
ARM: at91/dt: sam9n12: correct PLLA ICPLL and OUT values
ARM: at91/dt: sam9x5: correct PLLA ICPLL and OUT values
misc: atmel_pwm: fix Kconfig symbols
Arnd Bergmann [Wed, 25 Jun 2014 13:34:00 +0000 (15:34 +0200)]
Merge tag 'mvebu-fixes-3.16' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for v3.16" from Jason Cooper:
- mvebu
- Fix broken SoC ID detection
- Select ARM_CPU_SUSPEND for v7
- Remove armada38x compatible string (no users yet)
- kirkwood
- Fix phy-connection-type on GuruPlug board
* tag 'mvebu-fixes-3.16' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: Fix the improper use of the compatible string armada38x using a wildcard
ARM: dts: kirkwood: fix phy-connection-type for Guruplug
ARM: mvebu: select ARM_CPU_SUSPEND for Marvell EBU v7 platforms
ARM: mvebu: Fix broken SoC ID detection
Linus Torvalds [Wed, 25 Jun 2014 17:34:17 +0000 (10:34 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fix from Sage Weil:
"This fixes a corner case for cloned RBD images"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: handle parent_overlap on writes correctly
Alexandre Belloni [Mon, 23 Jun 2014 06:51:41 +0000 (08:51 +0200)]
ARM: at91/dt: sam9261: remove slow RC osc
The at91sam9261 doesn't actually have a slow RC oscillator, remove it from the
dtsi.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Alexandre Belloni [Sat, 14 Jun 2014 00:10:43 +0000 (02:10 +0200)]
ARM: at91/dt: define sam9261ek slow crystal frequency
Define at91sam9261ek's slow crystal frequencies.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Alexandre Belloni [Fri, 13 Jun 2014 12:02:29 +0000 (14:02 +0200)]
ARM: at91/dt: sam9261: correctly define mainck
mainck (CKGR_MCFR register) is actually using main_osc (CKGR_MOR register).
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Alexandre Belloni [Fri, 13 Jun 2014 11:28:12 +0000 (13:28 +0200)]
ARM: at91/dt: sam9n12: correct PLLA ICPLL and OUT values
ICPLL can only take 0 or 1, it got mixed with OUT which can be in the [0-3]
range.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Alexandre Belloni [Fri, 13 Jun 2014 11:25:34 +0000 (13:25 +0200)]
ARM: at91/dt: sam9x5: correct PLLA ICPLL and OUT values
ICPLL can only take 0 or 1, it got mixed with OUT which can be in the [0-3]
range.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Linus Torvalds [Wed, 25 Jun 2014 12:44:17 +0000 (05:44 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes and cleanups from Ben Herrenschmidt:
"Here are a handful or two of powerpc fixes and simple/trivial
cleanups. A bunch of them fix ftrace with the new ABI v2 in Little
Endian, the rest is a scattering of fairly simple things"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't skip ePAPR spin-table CPUs
powerpc/module: Fix TOC symbol CRC
powerpc/powernv: Remove OPAL v1 takeover
powerpc/kmemleak: Do not scan the DART table
selftests/powerpc: Use the test harness for the TM DSCR test
powerpc/cell: cbe_thermal.c: Cleaning up a variable is of the wrong type
powerpc/kprobes: Fix jprobes on ABI v2 (LE)
powerpc/ftrace: Use pr_fmt() to namespace error messages
powerpc/ftrace: Fix nop of modules on 64bit LE (ABIv2)
powerpc/ftrace: Fix inverted check of create_branch()
powerpc/ftrace: Fix typo in mask of opcode
powerpc: Add ppc_global_function_entry()
powerpc/macintosh/smu.c: Fix closing brace followed by if
powerpc: Remove __arch_swab*
powerpc: Remove ancient DEBUG_SIG code
powerpc/kerenl: Enable EEH for IO accessors
Linus Torvalds [Wed, 25 Jun 2014 12:30:20 +0000 (05:30 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull vhost cleanups from Michael S Tsirkin:
"Two cleanup patches removing code duplication that got introduced by
changes in rc1. Not fixing crashes, but I'd rather not carry the
duplicate code until the next merge window"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost-scsi: don't open-code kvfree
vhost-net: don't open-code kvfree
Linus Torvalds [Wed, 25 Jun 2014 12:08:09 +0000 (05:08 -0700)]
Merge tag 'trace-fixes-v3.16-rc1-v2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing cleanups and fixes from Steven Rostedt:
"This includes three patches from Oleg Nesterov. The first is a fix to
a race condition that happens between enabling/disabling syscall
tracepoints and new process creations (the check to go into the ptrace
path for a process can be set when it shouldn't, or not set when it
should). Not a major bug but one that should be fixed and even
applied to stable.
The other two patches are cleanup/fixes that are not that critical,
but for an -rc1 release would be nice to have. They both deal with
syscall tracepoints.
It also includes a patch to introduce a new macro for the
TRACE_EVENT() format called __field_struct(). Originally, __field()
was used to record any variable into a trace event, but with the
addition of setting the "is signed" attribute, the check causes
anything but a primitive variable to fail to compile. That is,
structs and unions can't be used as they once were. When the "is
signed" check was introduced there were only primitive variables being
recorded. But that will change soon and it was reported that
__field() causes build failures.
To solve the __field() issue, __field_struct() is introduced to allow
trace_events to be able to record complex types too"
* tag 'trace-fixes-v3.16-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Add __field_struct macro for TRACE_EVENT()
tracing: syscall_regfunc() should not skip kernel threads
tracing: Change syscall_*regfunc() to check PF_KTHREAD and use for_each_process_thread()
tracing: Fix syscall_*regfunc() vs copy_process() race
Nicolas Ferre [Wed, 25 Jun 2014 09:33:44 +0000 (11:33 +0200)]
misc: atmel_pwm: fix Kconfig symbols
AT91 symbols AT91SAM9263, AT91SAM9RL, and AT91SAM9G45 do not exist and this
patch changes them to their correct ARCH_* version.
These symbols are chosen instead of the SOC_* ones because this driver is not
converted to DT.
Anyway, the ATMEL_PWM symbol and the associated driver will be removed soon,
during the move to the PWM sub-system.
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Scott Wood [Wed, 25 Jun 2014 01:15:51 +0000 (20:15 -0500)]
powerpc: Don't skip ePAPR spin-table CPUs
Commit 59a53afe70fd530040bdc69581f03d880157f15a "powerpc: Don't setup
CPUs with bad status" broke ePAPR SMP booting. ePAPR says that CPUs
that aren't presently running shall have status of disabled, with
enable-method being used to determine whether the CPU can be enabled.
Fix by checking for spin-table, which is currently the only supported
enable-method.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Emil Medve <Emilian.Medve@Freescale.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Laurent Dufour [Tue, 24 Jun 2014 08:53:59 +0000 (10:53 +0200)]
powerpc/module: Fix TOC symbol CRC
The commit 71ec7c55ed91 introduced the magic symbol ".TOC." for ELFv2 ABI.
This symbol is built manually and has no CRC value computed. A zero value
is put in the CRC section to avoid modpost complaining about a missing CRC.
Unfortunately, this breaks the kernel module loading when the kernel is
relocated (kdump case for instance) because of the relocation applied to
the kcrctab values.
This patch computes a CRC value for the TOC symbol which will match the one
computed by the kernel when it is relocated - aka '0 - relocate_start' done in
maybe_relocated called by check_version (module.c).
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Tue, 24 Jun 2014 07:17:47 +0000 (17:17 +1000)]
powerpc/powernv: Remove OPAL v1 takeover
In commit 27f4488872d9 "Add OPAL takeover from PowerVM" we added support
for "takeover" on OPAL v1 machines.
This was a mode of operation where we would boot under pHyp, and query
for the presence of OPAL. If detected we would then do a special
sequence to take over the machine, and the kernel would end up running
in hypervisor mode.
OPAL v1 was never a supported product, and was never shipped outside
IBM. As far as we know no one is still using it.
Newer versions of OPAL do not use the takeover mechanism. Although the
query for OPAL should be harmless on machines with newer OPAL, we have
seen a machine where it causes a crash in Open Firmware.
The code in early_init_devtree() to copy boot_command_line into cmd_line
was added in commit 817c21ad9a1f "Get kernel command line accross OPAL
takeover", and AFAIK is only used by takeover, so should also be
removed.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Andy Adamson [Thu, 12 Jun 2014 19:02:32 +0000 (15:02 -0400)]
NFSv4: test SECINFO RPC_AUTH_GSS pseudoflavors for support
Fix nfs4_negotiate_security to create an rpc_clnt used to test each SECINFO
returned pseudoflavor. Check credential creation (and gss_context creation)
which is important for RPC_AUTH_GSS pseudoflavors which can fail for multiple
reasons including mis-configuration.
Don't call nfs4_negotiate in nfs4_submount as it was just called by
nfs4_proc_lookup_mountpoint (nfs4_proc_lookup_common)
Signed-off-by: Andy Adamson <andros@netapp.com>
[Trond: fix corrupt return value from nfs_find_best_sec()]
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Andy Adamson [Mon, 9 Jun 2014 19:33:20 +0000 (15:33 -0400)]
NFS Return -EPERM if no supported or matching SECINFO flavor
Do not return RPC_AUTH_UNIX if SECINFO reply tests fail. This
prevents an infinite loop of NFS4ERR_WRONGSEC for non RPC_AUTH_UNIX mounts.
Without this patch, a mount with no sec= option to a server
that does not include RPC_AUTH_UNIX in the
SECINFO return can be presented with an attempt to use RPC_AUTH_UNIX
which will result in an NFS4ERR_WRONGSEC which will prompt the SECINFO
call which will again try RPC_AUTH_UNIX....
Signed-off-by: Andy Adamson <andros@netapp.com>
Tested-By: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
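The behaviour change described in "NFS Return -EPERM if no supported or matching SECINFO flavor" can be modelled roughly as follows. This is only an illustrative Python sketch, not the kernel code: the function name pick_flavor, the flavor strings and the fallback_to_unix switch are all hypothetical.

```python
import errno


def pick_flavor(secinfo_flavors, supported, fallback_to_unix=False):
    """Return the first server-offered flavor that the client supports.

    secinfo_flavors: flavors listed in the server's SECINFO reply, in order.
    supported: flavors the client is able to use.
    fallback_to_unix: models the old behaviour of defaulting to AUTH_UNIX.
    """
    for flavor in secinfo_flavors:
        if flavor in supported:
            return flavor
    if fallback_to_unix:
        # Old behaviour: the server never offered this flavor, so it answers
        # NFS4ERR_WRONGSEC, which prompts another SECINFO call, and so on.
        return "unix"
    # New behaviour: report the failure instead of looping.
    return -errno.EPERM


server_reply = ["krb5p"]          # server does not list AUTH_UNIX
client = {"unix", "krb5"}         # client cannot do krb5p
old = pick_flavor(server_reply, client, fallback_to_unix=True)
new = pick_flavor(server_reply, client)
```

In this model the old path hands back a flavor the server never advertised, which is what feeds the NFS4ERR_WRONGSEC loop; the new path surfaces -EPERM to the caller.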
Andy Adamson [Mon, 9 Jun 2014 19:33:19 +0000 (15:33 -0400)]
NFS check the return of nfs4_negotiate_security in nfs4_submount
Signed-off-by: Andy Adamson <andros@netapp.com>
Tested-By: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 20 Jun 2014 17:11:01 +0000 (13:11 -0400)]
NFS: Don't mark the data cache as invalid if it has been flushed
Now that we have functions such as nfs_write_pageuptodate() that use
the cache_validity flags to check if the data cache is valid or not,
it is a little more important to keep the flags in sync with the
state of the data cache.
In particular, we'd like to ensure that if the data cache is empty, we
don't start marking it as needing revalidation.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 20 Jun 2014 17:16:38 +0000 (13:16 -0400)]
NFS: Clear NFS_INO_REVAL_PAGECACHE when we update the file size
In nfs_update_inode(), if the change attribute is seen to change on
the server, then we set NFS_INO_REVAL_PAGECACHE in order to make
sure that we check the file size.
However, if we also update the file size in the same function, we
don't need to check it again. So make sure that we clear the
NFS_INO_REVAL_PAGECACHE that was set earlier.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Scott Mayhew [Fri, 20 Jun 2014 12:44:42 +0000 (08:44 -0400)]
nfs: Fix cache_validity check in nfs_write_pageuptodate()
NFS_INO_INVALID_DATA cannot be ignored, even if we have a delegation.
We're still having some problems with data corruption when multiple
clients are appending to a file and those clients are being granted
write delegations on open.
To reproduce:
Client A:
vi /mnt/`hostname -s`
while :; do echo "XXXXXXXXXXXXXXX" >>/mnt/file; sleep $(( $RANDOM % 5 )); done
Client B:
vi /mnt/`hostname -s`
while :; do echo "YYYYYYYYYYYYYYY" >>/mnt/file; sleep $(( $RANDOM % 5 )); done
What's happening is that in nfs_update_inode() we're recognizing that
the file size has changed and we're setting NFS_INO_INVALID_DATA
accordingly, but then we ignore the cache_validity flags in
nfs_write_pageuptodate() because we have a delegation. As a result,
in nfs_updatepage() we're extending the write to cover the full page
even though we've not read in the data to begin with.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: <stable@vger.kernel.org> # v3.11+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
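The effect of the nfs_write_pageuptodate() fix can be sketched with a simplified Python model. It assumes the check is structured as described in the message above; the flag values and function names are hypothetical and do not match the kernel's constants.

```python
# Hypothetical flag values; the kernel's real constants differ.
INVALID_DATA = 0x1
REVAL_PAGECACHE = 0x2


def page_uptodate_old(cache_validity, have_delegation, page_uptodate):
    # Before the fix: holding a delegation skips every cache_validity check.
    if not have_delegation and cache_validity & (INVALID_DATA | REVAL_PAGECACHE):
        return False
    return page_uptodate


def page_uptodate_fixed(cache_validity, have_delegation, page_uptodate):
    # After the fix: only the revalidation check is skipped under a delegation.
    if not have_delegation and cache_validity & REVAL_PAGECACHE:
        return False
    # INVALID_DATA is honoured unconditionally.
    if cache_validity & INVALID_DATA:
        return False
    return page_uptodate
```

With a delegation held and INVALID_DATA set, the old model still reports the page as up to date, which is the condition under which the write would be extended over data that was never read in.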
Linus Torvalds [Tue, 24 Jun 2014 21:00:13 +0000 (14:00 -0700)]
Merge git://git.kvack.org/~bcrl/aio-fixes
Pull aio fixes from Ben LaHaise:
"These fix a kernel memory disclosure issue (arbitrary kmap() &
copy_to_user()) revealed in CVE-2014-0206 by changes that were
introduced in v3.10"
* git://git.kvack.org/~bcrl/aio-fixes:
aio: fix kernel memory disclosure in io_getevents() introduced in v3.10
aio: fix aio request leak when events are reaped by userspace
Linus Torvalds [Tue, 24 Jun 2014 20:59:00 +0000 (13:59 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A number of low impact fixes, the most noticeable one is the thumb2
frame pointer fix. We also fix a regression caused during this merge
window with ARM925 CPUs running with caches disabled, and fix a number
of warnings"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: arm925: ensure assembly sets up writethrough mapping
ARM: perf: fix compiler warning with gcc 4.6.4 (and tidy code)
ARM: l2c: fix dependencies on PL310 errata symbols
ARM: 8069/1: Make thread_save_fp macro aware of THUMB2 mode
ARM: 8068/1: scoop: Remove unused variable
Benjamin LaHaise [Tue, 24 Jun 2014 17:32:51 +0000 (13:32 -0400)]
aio: fix kernel memory disclosure in io_getevents() introduced in v3.10
A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
by commit a31ad380bed817aa25f8830ad23e1a0480fef797. The changes made to
aio_read_events_ring() failed to correctly limit the index into
ctx->ring_pages[], allowing an attacker to cause the subsequent kmap() of
an arbitrary page with a copy_to_user() to copy the contents into userspace.
This vulnerability has been assigned CVE-2014-0206. Thanks to Mateusz and
Petr for disclosing this issue.
This patch applies to v3.12+. A separate backport is needed for 3.10/3.11.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
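The class of bug described in the io_getevents() commit, an index into ring_pages[] that is not limited to the ring size, can be illustrated with a toy Python model. This is a sketch under assumptions: it supposes the index is derived from a head value that userspace can influence, it ignores the ring header, and EVENTS_PER_PAGE and both function names are hypothetical.

```python
EVENTS_PER_PAGE = 4  # hypothetical; kept small for illustration


def page_index_unchecked(head):
    # The head value is used as-is to select a page.
    return head // EVENTS_PER_PAGE


def page_index_bounded(head, nr_events):
    # Reduce the value modulo the ring size before the page lookup.
    return (head % nr_events) // EVENTS_PER_PAGE


ring_pages = ["page0", "page1"]                 # two pages, eight event slots
nr_events = len(ring_pages) * EVENTS_PER_PAGE
hostile_head = 1000                             # far beyond the ring
```

Here page_index_unchecked(hostile_head) points well past the end of ring_pages, while the bounded variant always stays inside it.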
Benjamin LaHaise [Tue, 24 Jun 2014 17:12:55 +0000 (13:12 -0400)]
aio: fix aio request leak when events are reaped by userspace
The aio cleanups and optimizations by kmo that were merged into the 3.10
tree added a regression for userspace event reaping. Specifically, the
reference counts are not decremented if the event is reaped in userspace,
leading to the application being unable to submit further aio requests.
This patch applies to 3.12+. A separate backport is required for 3.10/3.11.
This issue was uncovered as part of CVE-2014-0206.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Linus Walleij [Tue, 24 Jun 2014 12:08:07 +0000 (14:08 +0200)]
ARM: integrator: fix OF-related regression
Commit 07e461cd7e73a84f0e3757932b93cc80976fd749
"of: Ensure unique names without sacrificing determinism"
caused a boot failure regression on the Integrator machines.
The problem is probably caused by fiddling too much with
the device tree population in the OF init function, such
as passing the SoC bus device as parent when populating
the device tree.
This patch fixes the problem by:
- Not looking up the tree root explicitly
- Looking up the devices needed before device population from
the match only, passing NULL as root
- Passing NULL as root and parent when calling
of_platform_populate()
After this the Integrators boot again. Tested on
Integrator/AP and Integrator/CP.
Cc: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Gregory CLEMENT [Mon, 23 Jun 2014 14:16:51 +0000 (16:16 +0200)]
ARM: mvebu: Fix the improper use of the compatible string armada38x using a wildcard
Wildcards in compatible strings should be avoided. "marvell,armada38x"
was recently introduced but was not yet used.
The armada 385 SoC is a superset of the armada 380 SoC (with more CPUs
and more PCIe slots). So this patch replaces the use of
"marvell,armada38x" by the "marvell,armada380" string.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Link: https://lkml.kernel.org/r/1403533011-21339-1-git-send-email-gregory.clement@free-electrons.com
Acked-by: Andrew Lunn <andrew@lunn.ch>
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Catalin Marinas [Fri, 13 Jun 2014 08:44:21 +0000 (09:44 +0100)]
powerpc/kmemleak: Do not scan the DART table
The DART table allocation is registered to kmemleak via the
memblock_alloc_base() call. However, the DART table is later unmapped
and the dart_tablebase VA is no longer accessible. This patch tells kmemleak
not to scan this block, avoiding an unhandled paging request.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Fri, 13 Jun 2014 11:16:04 +0000 (21:16 +1000)]
selftests/powerpc: Use the test harness for the TM DSCR test
This gives us standardised success/failure output and also handles
killing the test if it runs forever (2 minutes).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rickard Strandqvist [Sat, 14 Jun 2014 16:25:11 +0000 (18:25 +0200)]
powerpc/cell: cbe_thermal.c: Cleaning up a variable is of the wrong type
This variable is of the wrong type: everywhere it is used it
should be an unsigned int rather than an int.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Mon, 23 Jun 2014 03:23:31 +0000 (13:23 +1000)]
powerpc/kprobes: Fix jprobes on ABI v2 (LE)
In commit 721aeaa9 "Build little endian ppc64 kernel with ABIv2", we
missed some updates required in the kprobes code to make jprobes work
when the kernel is built with ABI v2.
Firstly update arch_deref_entry_point() to do the right thing. Now that
we have added ppc_global_function_entry() we can just always use that, it
will do the right thing for 32 & 64 bit and ABI v1 & v2.
Secondly we need to update the code that sets up the register state before
calling the jprobe handler. On ABI v1 we setup r2 to hold the TOC, on ABI
v2 we need to populate r12 with the function entry point address.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Tue, 17 Jun 2014 06:15:36 +0000 (16:15 +1000)]
powerpc/ftrace: Use pr_fmt() to namespace error messages
The printks() in our ftrace code have no prefix, so they appear on the
console with very little context, eg:
Branch out of range
Use pr_fmt() & pr_err() to add a prefix. While we're at it, collapse a
few split lines that don't need to be, and add a missing newline to one
message.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Tue, 17 Jun 2014 06:15:35 +0000 (16:15 +1000)]
powerpc/ftrace: Fix nop of modules on 64bit LE (ABIv2)
There is a bug in the handling of the function entry when we are nopping
out a branch from a module in ftrace.
We compare the result of module_trampoline_target() with the value of
ppc_function_entry(), and expect them to be equal. But they never will
be.
module_trampoline_target() will always return the global entry point of
the function, whereas ppc_function_entry() will always return the local.
Fix it by using the newly added ppc_global_function_entry().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Tue, 17 Jun 2014 06:15:34 +0000 (16:15 +1000)]
powerpc/ftrace: Fix inverted check of create_branch()
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton
changed the logic that creates and patches the branch, and added a
thinko in the check of create_branch(). create_branch() returns the
instruction that was generated, so a zero return means it failed.
The result is we can't ftrace modules:
Branch out of range
WARNING: at ../kernel/trace/ftrace.c:1638
ftrace failed to modify [<d000000004ba001c>] fuse_req_init_context+0x1c/0x90 [fuse]
We should probably fix patch_instruction() to do that check and make the
API saner, but that's a separate patch. For now just invert the test.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
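The inverted test in "powerpc/ftrace: Fix inverted check of create_branch()" can be illustrated with a small Python model. This is a sketch, not the kernel source: it assumes a relative branch encoder that returns the instruction word, or 0 when the target is out of range, and the patch_inverted and patch_fixed helpers are hypothetical.

```python
BRANCH_SET_LINK = 0x1


def create_branch(addr, target, flags=0):
    """Model of a relative branch encoder: instruction word, or 0 on failure."""
    offset = target - addr
    # A relative branch holds a signed, word-aligned 26-bit displacement.
    if offset < -0x2000000 or offset > 0x1fffffc or offset & 0x3:
        return 0
    return 0x48000000 | (flags & 0x3) | (offset & 0x03FFFFFC)


def patch_inverted(addr, target):
    # Buggy check: a valid (non-zero) instruction is treated as an error.
    if create_branch(addr, target, BRANCH_SET_LINK):
        return "Branch out of range"
    return "ok"


def patch_fixed(addr, target):
    # Corrected check: only a zero return is an error.
    if not create_branch(addr, target, BRANCH_SET_LINK):
        return "Branch out of range"
    return "ok"
```

In this model every reachable target trips the inverted check, which matches the "Branch out of range" message appearing for ordinary module functions.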
Michael Ellerman [Tue, 17 Jun 2014 06:15:33 +0000 (16:15 +1000)]
powerpc/ftrace: Fix typo in mask of opcode
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton
changed the logic that checks for the expected code sequence when
patching a module.
We missed the typo in the mask, 0xffff00000 should be 0xffff0000, which
has the effect of making the test always true.
That makes it impossible to ftrace against modules, eg:
Unexpected call sequence:
48000008 e8410018
WARNING: at ../kernel/trace/ftrace.c:1638
ftrace failed to modify [<d000000007cf001c>] rng_dev_open+0x1c/0x70 [rng_core]
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
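Why the extra zero in the mask from "powerpc/ftrace: Fix typo in mask of opcode" makes the test always true can be checked with a few lines of Python. This is a sketch under assumptions: the expected pattern 0xe8410000 and the helper is_unexpected are illustrative and not copied from the kernel source.

```python
def is_unexpected(op1, mask):
    """True when the masked instruction word differs from the expected pattern."""
    return (op1 & mask) != 0xe8410000


op1 = 0xe8410018            # second word from the log output above
typo_mask = 0xffff00000     # nine hex digits: covers bits 20..35
fixed_mask = 0xffff0000     # eight hex digits: covers bits 16..31
```

With the nine-digit mask, bits 16..19 are always cleared, so the result can never equal a pattern that has a bit set in that range and every 32-bit word is reported as unexpected. The eight-digit mask keeps the whole upper half-word and the comparison behaves as intended.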
Michael Ellerman [Tue, 17 Jun 2014 06:15:32 +0000 (16:15 +1000)]
powerpc: Add ppc_global_function_entry()
ABIv2 has the concept of a global and local entry point to a function.
In most cases we are interested in the local entry point, and so that is
what ppc_function_entry() returns.
However we have a case in the ftrace code where we want the global entry
point, and there may be other places we need it too. Rather than special
casing each, add an accessor.
For ABIv1 and 32-bit there is only a single entry point, so we return
that. That means it's safe for the caller to use this without also
checking the ABI version.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rasmus Villemoes [Fri, 20 Jun 2014 19:44:27 +0000 (21:44 +0200)]
powerpc/macintosh/smu.c: Fix closing brace followed by if
A closing brace followed by "if" is almost certainly a mistake. Maybe
"else if" was meant, but in this case it doesn't really matter.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Tue, 24 Jun 2014 02:28:56 +0000 (12:28 +1000)]
powerpc: Remove __arch_swab*
The generic code uses gcc built-ins which work fine so there's no benefit
in implementing our own anymore.
We can't completely remove the ld/st_le* functions as some historical
cruft still uses them, but that's next on the radar
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Mon, 23 Jun 2014 04:17:47 +0000 (14:17 +1000)]
powerpc: Remove ancient DEBUG_SIG code
We have some compile-time disabled debug code in signal_xx.c. It's from
some ancient time BG, almost certainly part of the original port, given
the very similar code on other arches.
The show_unhandled_signal logic, added in d0c3d534a438 (2.6.24) is
cleaner and prints more useful information, so drop the debug code.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Gavin Shan [Mon, 23 Jun 2014 00:56:22 +0000 (10:56 +1000)]
powerpc/kerenl: Enable EEH for IO accessors
In arch/powerpc/kernel/iomap.c, many of the IO read accessors fail to
check for EEH errors, as Ben pointed out. The patch fixes it.
For the write accessors, we change the called functions only to
make them look similar to their read counterparts.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Tue, 24 Jun 2014 00:05:28 +0000 (17:05 -0700)]
Merge tag 'compress-3.16-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull compress bugfixes from Greg KH:
"Here are two bugfixes for some compression functions that resolve some
errors when uncompressing some pathological data. Both were found by
Don A Bailey"
* tag 'compress-3.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
lz4: ensure length does not wrap
lzo: properly check for overruns
Linus Torvalds [Mon, 23 Jun 2014 23:48:14 +0000 (16:48 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge fixes from Andrew Morton:
"The nmi patch and watchdog patch aren't actually fixes - they're
features which needed a few last-minute touchups.
Otherwise, a rather large batch of fixes - ocfs2 review takes a while
and I got distracted and missed last week's batch"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (31 commits)
ocfs2/dlm: do not purge lockres that is queued for assert master
ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount
ocfs2: manually do the iput once ocfs2_add_entry failed in ocfs2_symlink and ocfs2_mknod
ocfs2: fix a tiny race when running dirop_fileop_racer
ocfs2/dlm: fix misuse of list_move_tail() in dlm_run_purge_list()
ocfs2: refcount: take rw_lock in ocfs2_reflink
ocfs2: revert "ocfs2: fix NULL pointer dereference when dismount and ocfs2rec simultaneously"
ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn
ocfs2: should add inode into orphan dir after updating entry in ocfs2_rename()
mm: fix crashes from mbind() merging vmas
checkpatch: reduce false positives when checking void function return statements
ia64: arch/ia64/include/uapi/asm/fcntl.h needs personality.h
DMA, CMA: fix possible memory leak
slab: fix oops when reading /proc/slab_allocators
shmem: fix faulting into a hole while it's punched
mm: let mm_find_pmd fix buggy race with THP fault
mm: thp: fix DEBUG_PAGEALLOC oops in copy_page_rep()
kernel/watchdog.c: print traces for all cpus on lockup detection
nmi: provide the option to issue an NMI back trace to every cpu but current
Documentation/accounting/getdelays.c: add missing null-terminate after strncpy call
...
Xue jiufei [Mon, 23 Jun 2014 20:22:09 +0000 (13:22 -0700)]
ocfs2/dlm: do not purge lockres that is queued for assert master
When the workqueue is delayed, a lockres may be purged while it is still
queued for master assert, which may trigger BUG() as follows.
N1                              N2
dlm_get_lockres()
->dlm_do_master_requery
                                is the master of lockres,
                                so queue assert_master work
                                dlm_thread() start running
                                and purge the lockres
                                dlm_assert_master_worker()
                                send assert master message
                                to other nodes
receiving the assert_master
message, set master to N2
dlmlock_remote() sends a create_lock message to N2 but receives DLM_IVLOCKID;
if it is a RECOVERY lockres, this triggers the BUG().
Another BUG() is triggered when N3 becomes the new master and sends
assert_master to N1: N1 triggers the BUG() because the owner doesn't
match. So we should not purge a lockres while it is queued for assert
master.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jiangyiwen [Mon, 23 Jun 2014 20:22:09 +0000 (13:22 -0700)]
ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless loop during umount
The following case may lead to endless loop during umount.
node A node B node C node D
umount volume,
migrate lockres1
to B
want to lock lockres1,
send
MASTER_REQUEST_MSG
to C
init block mle
send
MIGRATE_REQUEST_MSG
to C
find a block
mle, and then
return
DLM_MIGRATE_RESPONSE_MASTERY_REF
to B
set C in refmap
umount successfully
try to umount, endless
loop occurs when migrate
lockres1 since C is in
refmap
So we can fix this endless loop case by only returning
DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving
MIGRATE_REQUEST_MSG.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Xue jiufei <xuejiufei@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jiangyiwen [Mon, 23 Jun 2014 20:22:09 +0000 (13:22 -0700)]
ocfs2: manually do the iput once ocfs2_add_entry failed in ocfs2_symlink and ocfs2_mknod
When the call to ocfs2_add_entry() fails in ocfs2_symlink() or
ocfs2_mknod(), iput() is not called during dput(dentry) because
d_instantiate() was never called, and this leads to a hung umount.
Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yiwen Jiang [Mon, 23 Jun 2014 20:22:09 +0000 (13:22 -0700)]
ocfs2: fix a tiny race when running dirop_fileop_racer
When running dirop_fileop_racer we found a deadlock case.
Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
/race/16/1 in the filesystem, and let the inode number of dir 16 be less
than the inode number of dir race.
Node A                          Node B
mv /race/16/1 /race/
                                right after Node A has got the
                                EX mode of /race/16/, and tries to
                                get EX mode of /race
                                ls /race/16/
In this case, Node A has got the EX mode of /race/16/, and wants to get EX
mode of /race/. Node B has got the PR mode of /race/, and wants to get
the PR mode of /race/16/. Since EX and PR are mutually exclusive, a
deadlock occurs.
This patch fixes this case by locking in ancestor order before trying
inode number order.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
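The ordering rule described in the dirop_fileop_racer fix above (ancestor
first, inode number only as a fallback) can be sketched in userspace C. This
is a minimal illustration, not the ocfs2 code: struct dir, is_ancestor() and
lock_first() are hypothetical names, and the parent pointer stands in for
whatever ancestry check the filesystem really performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified directory inode: a number and a parent link. */
struct dir {
    unsigned long ino;
    const struct dir *parent;   /* NULL for the root */
};

/* True if 'a' is a proper ancestor of 'd'. */
static bool is_ancestor(const struct dir *a, const struct dir *d)
{
    for (d = d->parent; d != NULL; d = d->parent)
        if (d == a)
            return true;
    return false;
}

/*
 * Pick which of two directories to lock first: an ancestor always goes
 * before its descendant; only unrelated directories fall back to
 * inode-number order.
 */
static const struct dir *lock_first(const struct dir *x, const struct dir *y)
{
    if (is_ancestor(x, y))
        return x;
    if (is_ancestor(y, x))
        return y;
    return x->ino <= y->ino ? x : y;
}
```

With /race as the parent of /race/16, lock_first() returns /race even when
dir 16 has the smaller inode number, so both nodes in the scenario above
would take the two locks in the same order.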
Xue jiufei [Mon, 23 Jun 2014 20:22:08 +0000 (13:22 -0700)]
ocfs2/dlm: fix misuse of list_move_tail() in dlm_run_purge_list()
When a lockres on the purge list is still in use, it should be moved to
the tail of the purge list so that dlm_thread continues with the next
lockres. However, the call list_move_tail(&dlm->purge_list,
&lockres->purge) does *no* movement, so dlm_thread will try to purge the same
lockres in this loop again and again. If it is in use for a long time,
other lockres will not be processed.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
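The argument-order problem in the list_move_tail() fix above can be
reproduced with a small userspace model of a circular doubly linked list.
This is a sketch under assumptions: the helpers below mimic the kernel's
argument order (entry to move, then list head) but are reimplemented here,
and struct lockres and first_id() are hypothetical stand-ins.

```c
#include <stddef.h>

/* Minimal userspace stand-in for a circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_del_entry(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

static void list_add_tail(struct list_head *e, struct list_head *head)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

/* Argument order: (entry to move, head of the list to append it to). */
static void list_move_tail(struct list_head *e, struct list_head *head)
{
    list_del_entry(e);
    list_add_tail(e, head);
}

/* Hypothetical stand-in for a lock resource sitting on the purge list. */
struct lockres {
    int id;
    struct list_head purge;
};

/* Id of the first lockres on a non-empty list. */
static int first_id(struct list_head *head)
{
    struct lockres *r = (struct lockres *)
        ((char *)head->next - offsetof(struct lockres, purge));
    return r->id;
}
```

With two entries queued, calling list_move_tail(&purge_list, &a.purge)
(arguments swapped) leaves entry a at the front, while
list_move_tail(&a.purge, &purge_list) moves it behind entry b.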
Wengang Wang [Mon, 23 Jun 2014 20:22:08 +0000 (13:22 -0700)]
ocfs2: refcount: take rw_lock in ocfs2_reflink
This patch tries to fix this crash:
#5 [ffff88003c1cd690] do_invalid_op at ffffffff810166d5
#6 [ffff88003c1cd730] invalid_op at ffffffff8159b2de
[exception RIP: ocfs2_direct_IO_get_blocks+359]
RIP: ffffffffa05dfa27 RSP: ffff88003c1cd7e8 RFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88003c1cdaa8 RCX: 0000000000000000
RDX: 000000000000000c RSI: ffff880027a95000 RDI: ffff88003c79b540
RBP: ffff88003c1cd858 R8: 0000000000000000 R9: ffffffff815f6ba0
R10: 00000000000001c9 R11: 00000000000001c9 R12: ffff88002d271500
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000001000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88003c1cd860] do_direct_IO at ffffffff811cd31b
#8 [ffff88003c1cd950] direct_IO_iovec at ffffffff811cde9c
#9 [ffff88003c1cd9b0] do_blockdev_direct_IO at ffffffff811ce764
#10 [ffff88003c1cdb80] __blockdev_direct_IO at ffffffff811ce7cc
#11 [ffff88003c1cdbb0] ocfs2_direct_IO at ffffffffa05df756 [ocfs2]
#12 [ffff88003c1cdbe0] generic_file_direct_write_iter at ffffffff8112f935
#13 [ffff88003c1cdc40] ocfs2_file_write_iter at ffffffffa0600ccc [ocfs2]
#14 [ffff88003c1cdd50] do_aio_write at ffffffff8119126c
#15 [ffff88003c1cddc0] aio_rw_vect_retry at ffffffff811d9bb4
#16 [ffff88003c1cddf0] aio_run_iocb at ffffffff811db880
#17 [ffff88003c1cde30] io_submit_one at ffffffff811dc238
#18 [ffff88003c1cde80] do_io_submit at ffffffff811dc437
#19 [ffff88003c1cdf70] sys_io_submit at ffffffff811dc530
#20 [ffff88003c1cdf80] system_call_fastpath at ffffffff8159a159
It crashes at
BUG_ON(create && (ext_flags & OCFS2_EXT_REFCOUNTED));
in ocfs2_direct_IO_get_blocks.
ocfs2_direct_IO_get_blocks expects the OCFS2_EXT_REFCOUNTED flag to have been
removed in ocfs2_prepare_inode_for_write() if it was set. But no cluster lock
is held in the window after ocfs2_prepare_inode_for_write() and before
ocfs2_direct_IO_get_blocks().
It can happen in this case:
Node A(which crashes)                   Node B
------------------------                ---------------------------
ocfs2_file_aio_write
  ocfs2_prepare_inode_for_write
    ocfs2_inode_lock
    ...
    ocfs2_inode_unlock
  #no refcount found
....                                    ocfs2_reflink
                                          ocfs2_inode_lock
                                          ...
                                          ocfs2_inode_unlock
                                          #now, refcount flag set on extent
                                        ...
                                        flush change to disk
ocfs2_direct_IO_get_blocks
  ocfs2_get_clusters
    #extent map miss
    #buffer_head miss
    read extents from disk
  found refcount flag on extent
  crash..
Fix:
Take rw_lock in ocfs2_reflink path
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Mon, 23 Jun 2014 20:22:08 +0000 (13:22 -0700)]
ocfs2: revert "ocfs2: fix NULL pointer dereference when dismount and ocfs2rec simultaneously"
75f82eaa502c ("ocfs2: fix NULL pointer dereference when dismount and
ocfs2rec simultaneously") may cause umount hang while shutting down
truncate log.
The situation is as follows:
ocfs2_dismount_volume
-> ocfs2_recovery_exit
-> free osb->recovery_map
-> ocfs2_truncate_shutdown
-> lock global bitmap inode
-> ocfs2_wait_for_recovery
-> check whether osb->recovery_map->rm_used is zero
Because osb->recovery_map is already freed, rm_used can hold an arbitrary
value, so umount may hang.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tariq Saeed [Mon, 23 Jun 2014 20:22:08 +0000 (13:22 -0700)]
ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn
Orabug: 18639535
In a two-node cluster, both nodes hold a lock at PR level and both want to
convert to EX at the same time. Master node 1 has sent a BAST and then
closes the connection due to idle timeout. Node 0 receives the BAST and sends
an unlock request with the cancel flag, but gets error -ENOTCONN. The problem
is that this error is ignored in dlm_send_remote_unlock_request() on the
**incorrect** assumption that the master is dead. See the NOTE in the comment
explaining why it returns DLM_NORMAL. Upon getting DLM_NORMAL, node 0 proceeds
to send a convert request (without the cancel flag), which fails with
-ENOTCONN. It waits 5 seconds and resends.
This time it gets DLM_IVLOCKID from the master because the lock is not found
on the grant queue: it had been moved to the converting queue in response to
the PR->EX convert request. No way out.
Node 1 (master)                         Node 0
==============                          ======
lock mode PR                            PR
convert PR -> EX
mv grant -> convert and que BAST
...
                     <-------- convert PR -> EX
convert que looks like this: ((node 1, PR -> EX) (node 0, PR -> EX))
...
                     BAST (want PR -> NL)
                     ------------------>
...
idle timeout, conn closed
                                        ...
                                        In response to BAST,
                                        sends unlock with cancel convert flag
                                        gets -ENOTCONN. Ignores and
                                        sends remote convert request
                                        gets -ENOTCONN, waits 5 Sec, retries
...
reconnects
                     <----------------- convert req goes through on next try
does not find lock on grant que
status DLM_IVLOCKID
                     ------------------>
...
No way out. Fix is to keep retrying unlock with cancel flag until it
succeeds or the master dies.
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alex chen [Mon, 23 Jun 2014 20:22:07 +0000 (13:22 -0700)]
ocfs2: should add inode into orphan dir after updating entry in ocfs2_rename()
There are two files a and b in dir /mnt/ocfs2.
node A node B
mv a b
In ocfs2_rename(), after calling
ocfs2_orphan_add(), the inode of
file b will be added into orphan
dir.
If ocfs2_update_entry() fails,
ocfs2_rename return error and mv
operation fails. But file b still
exists in the parent dir.
ocfs2_queue_orphan_scan
-> ocfs2_queue_recovery_completion
-> ocfs2_complete_recovery
-> ocfs2_recover_orphans
The inode of the file b will be
put with iput().
ocfs2_evict_inode
-> ocfs2_delete_inode
-> ocfs2_wipe_inode
-> ocfs2_remove_inode
OCFS2_VALID_FL in the inode
i_flags will be cleared.
The file b still can be accessed
on node B.
ls /mnt/ocfs2
When first read the file b with
ocfs2_read_inode_block(). It will
validate the inode using
ocfs2_validate_inode_block().
Because OCFS2_VALID_FL not set in
the inode i_flags, so the file
system will be readonly.
So we should add inode into orphan dir after updating entry in
ocfs2_rename().
Signed-off-by: alex.chen <alex.chen@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 23 Jun 2014 20:22:07 +0000 (13:22 -0700)]
mm: fix crashes from mbind() merging vmas
In v2.6.34 commit 9d8cebd4bcd7 ("mm: fix mbind vma merge problem")
introduced vma merging to mbind(), but it should have also changed the
convention of passing start vma from queue_pages_range() (formerly
check_range()) to new_vma_page(): vma merging may have already freed
that structure, resulting in BUG at mm/mempolicy.c:1738 and probably
worse crashes.
Fixes: 9d8cebd4bcd7 ("mm: fix mbind vma merge problem")
Reported-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: <stable@vger.kernel.org> [2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 23 Jun 2014 20:22:07 +0000 (13:22 -0700)]
checkpatch: reduce false positives when checking void function return statements
The previous patch had a few too many false positives on styles that
should be acceptable.
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Mon, 23 Jun 2014 20:22:07 +0000 (13:22 -0700)]
ia64: arch/ia64/include/uapi/asm/fcntl.h needs personality.h
fs/notify/fanotify/fanotify_user.c: In function 'SYSC_fanotify_init':
fs/notify/fanotify/fanotify_user.c:726: error: implicit declaration of function 'personality'
fs/notify/fanotify/fanotify_user.c:726: error: 'PER_LINUX32' undeclared (first use in this function)
fs/notify/fanotify/fanotify_user.c:726: error: (Each undeclared identifier is reported only once
fs/notify/fanotify/fanotify_user.c:726: error: for each function it appears in.)
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Will Woods <wwoods@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@vger.kernel.org> [3.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Mon, 23 Jun 2014 20:22:07 +0000 (13:22 -0700)]
DMA, CMA: fix possible memory leak
We should free memory for bitmap when we find zone mismatch, otherwise
this memory will leak.
Additionally, I copy code comment from PPC KVM's CMA code to inform why
we need to check zone mis-match.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Reviewed-by: Michal Nazarewicz <mina86@mina86.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Alexander Graf <agraf@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
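The error-path cleanup described in the CMA fix above (free the bitmap when
a zone mismatch is found) follows a common C pattern. The sketch below is a
userspace illustration only: activate_area(), the page_zone array and the
one-byte-per-page bitmap are hypothetical simplifications, not the kernel's
CMA data structures.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Allocate a tracking bitmap for 'count' pages and check that every page
 * lies in the same zone as the first one. On a mismatch the bitmap is
 * freed before returning, so the error path does not leak it.
 * On success the caller owns *bitmap_out and must free() it.
 */
static int activate_area(const int *page_zone, size_t count,
                         unsigned char **bitmap_out)
{
    unsigned char *bitmap;
    size_t i;

    if (count == 0)
        return -EINVAL;

    bitmap = calloc(count, 1);
    if (bitmap == NULL)
        return -ENOMEM;

    for (i = 1; i < count; i++)
        if (page_zone[i] != page_zone[0])
            goto err;

    *bitmap_out = bitmap;
    return 0;

err:
    free(bitmap);               /* the step that was missing */
    return -EINVAL;
}
```

Keeping the release in a single err label means any later check added to
the loop inherits the cleanup automatically.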
Joonsoo Kim [Mon, 23 Jun 2014 20:22:06 +0000 (13:22 -0700)]
slab: fix oops when reading /proc/slab_allocators
Commit b1cb0982bdd6 ("change the management method of free objects of
the slab") introduced a bug in the slab leak detector
('/proc/slab_allocators'). This detector works as described
below.
1. traverse all objects on all the slabs.
2. determine whether it is active or not.
3. if active, print who allocated this object.
But that commit changed how free objects are managed, so the logic
determining whether an object is active also changed. Before, we
regarded objects in the cpu caches as inactive, but with this commit we
mistakenly regard objects in the cpu caches as active.
This introduces a kernel oops if DEBUG_PAGEALLOC is enabled. With
DEBUG_PAGEALLOC enabled, kernel_map_pages() is used to detect who
corrupts free memory in the slab. It unmaps the page table mapping if an
object is free and maps it if the object is active. When the slab leak
detector checks an object in the cpu caches, it mistakenly thinks the object
is active and tries to access the object's memory to retrieve the caller of
the allocation. At this point, the page table mapping for this object doesn't
exist, so an oops occurs.
The following is the oops message reported by Dave.
It blew up when something tried to read /proc/slab_allocators
(Just cat it, and you should see the oops below)
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
[snip...]
CPU: 1 PID: 9386 Comm: trinity-c33 Not tainted 3.14.0-rc5+ #131
task: ffff8801aa46e890 ti: ffff880076924000 task.ti: ffff880076924000
RIP: 0010:[<ffffffffaa1a8f4a>] [<ffffffffaa1a8f4a>] handle_slab+0x8a/0x180
RSP: 0018:ffff880076925de0 EFLAGS: 00010002
RAX: 0000000000001000 RBX: 0000000000000000 RCX: 000000005ce85ce7
RDX: ffffea00079be100 RSI: 0000000000001000 RDI: ffff880107458000
RBP: ffff880076925e18 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffff8801e6f84000
R13: ffffea00079be100 R14: ffff880107458000 R15: ffff88022bb8d2c0
FS: 00007fb769e45740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8801e6f84ff8 CR3: 00000000a22db000 CR4: 00000000001407e0
DR0: 0000000002695000 DR1: 0000000002695000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000070602
Call Trace:
leaks_show+0xce/0x240
seq_read+0x28e/0x490
proc_reg_read+0x3d/0x80
vfs_read+0x9b/0x160
SyS_read+0x58/0xb0
tracesys+0xd4/0xd9
Code: f5 00 00 00 0f 1f 44 00 00 48 63 c8 44 3b 0c 8a 0f 84 e3 00 00 00 83 c0 01 44 39 c0 72 eb 41 f6 47 1a 01 0f 84 e9 00 00 00 89 f0 <4d> 8b 4c 04 f8 4d 85 c9 0f 84 88 00 00 00 49 8b 7e 08 4d 8d 46
RIP handle_slab+0x8a/0x180
To fix the problem, I introduce an object status buffer on each slab.
With this, we can track object status precisely, so the slab leak detector
will not access inactive objects and no kernel oops will occur. The memory
overhead caused by this fix is imposed only on CONFIG_DEBUG_SLAB_LEAK,
which is mainly used for debugging, so the overhead isn't a big
problem.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 23 Jun 2014 20:22:06 +0000 (13:22 -0700)]
shmem: fix faulting into a hole while it's punched
Trinity finds that mmap access to a hole while it's punched from shmem
can prevent the madvise(MADV_REMOVE) or fallocate(FALLOC_FL_PUNCH_HOLE)
from completing, until the reader chooses to stop; with the puncher's
hold on i_mutex locking out all other writers until it can complete.
It appears that the tmpfs fault path is too light in comparison with its
hole-punching path, lacking an i_data_sem to obstruct it; but we don't
want to slow down the common case.
Extend shmem_fallocate()'s existing range notification mechanism, so
shmem_fault() can refrain from faulting pages into the hole while it's
punched, waiting instead on i_mutex (when safe to sleep; or repeatedly
faulting when not).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 23 Jun 2014 20:22:05 +0000 (13:22 -0700)]
mm: let mm_find_pmd fix buggy race with THP fault
Trinity has reported:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1))
CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G W 3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398
lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
_raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
remove_migration_pte (mm/migrate.c:137)
rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
remove_migration_ptes (mm/migrate.c:224)
migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126)
migrate_misplaced_page (mm/migrate.c:1733)
__handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925)
handle_mm_fault (mm/memory.c:3948)
__get_user_pages (mm/memory.c:1851)
__mlock_vma_pages_range (mm/mlock.c:255)
__mm_populate (mm/mlock.c:711)
SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791)
I believe this comes about because, whereas collapsing and splitting THP
functions take anon_vma lock in write mode (which excludes concurrent
rmap walks), faulting THP functions (write protection and misplaced
NUMA) do not - and mostly they do not need to.
But they do use a pmdp_clear_flush(), set_pmd_at() sequence which, for
an instant (indeed, for a long instant, given the inter-CPU TLB flush in
there), leaves *pmd neither present nor trans_huge.
Which can confuse a concurrent rmap walk, as when removing migration
ptes, seen in the dumped trace. Although that rmap walk has a 4k page
to insert, anon_vmas containing THPs are in no way segregated from
4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to cope with
that instant when a trans_huge pmd is temporarily absent.
I don't think we need strengthen the locking at the THP end: it's easily
handled with an ACCESS_ONCE() before testing both conditions.
And since mm_find_pmd() had only one caller who wanted a THP rather than
a pmd, let's slightly repurpose it to fail when it hits a THP or
non-present pmd, and open code split_huge_page_address() again.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Dave Jones <davej@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
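The "ACCESS_ONCE() before testing both conditions" idea in the mm_find_pmd
commit above can be illustrated in userspace: read the shared entry once
into a local, then apply both tests to that snapshot. This is a toy model
under assumptions: the flag bits, READ_ONCE_UL() and find_small_pmd() are
hypothetical and do not reflect real page table layouts.

```c
#include <stddef.h>

/* Hypothetical flag bits for this toy model of a pmd entry. */
#define PMD_PRESENT    0x1UL
#define PMD_TRANS_HUGE 0x2UL

/* Read a shared word exactly once, in the spirit of ACCESS_ONCE(). */
#define READ_ONCE_UL(x) (*(volatile unsigned long *)&(x))

/*
 * Return 'pmd' only if the entry is present and is not a huge page.
 * Both tests use the same snapshot, so a concurrent update between the
 * two checks cannot make them see different values.
 */
static unsigned long *find_small_pmd(unsigned long *pmd)
{
    unsigned long snapshot = READ_ONCE_UL(*pmd);

    if (!(snapshot & PMD_PRESENT) || (snapshot & PMD_TRANS_HUGE))
        return NULL;
    return pmd;
}
```

A caller that gets NULL back simply skips the entry, which covers both the
huge page case and the transient not-present window described in the commit
message.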
Hugh Dickins [Mon, 23 Jun 2014 20:22:05 +0000 (13:22 -0700)]
mm: thp: fix DEBUG_PAGEALLOC oops in copy_page_rep()
Trinity has for over a year been reporting a CONFIG_DEBUG_PAGEALLOC oops
in copy_page_rep() called from copy_user_huge_page() called from
do_huge_pmd_wp_page().
I believe this is a DEBUG_PAGEALLOC false positive, due to the source
page being split, and a tail page freed, while copy is in progress; and
not a problem without DEBUG_PAGEALLOC, since the pmd_same() check will
prevent a miscopy from being made visible.
Fix by adding get_user_huge_page() and put_user_huge_page(): reducing to
the usual get_page() and put_page() on head page in the usual config;
but get and put references to all of the tail pages when
DEBUG_PAGEALLOC.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaron Tomlin [Mon, 23 Jun 2014 20:22:05 +0000 (13:22 -0700)]
kernel/watchdog.c: print traces for all cpus on lockup detection
A 'softlockup' is defined as a bug that causes the kernel to loop in
kernel mode for more than a predefined period of time, without giving
other tasks a chance to run.
Currently, upon detection of this condition by the per-cpu watchdog
task, debug information (including a stack trace) is sent to the system
log.
On some occasions, we have observed that the "victim" rather than the
actual "culprit" (i.e. the owner/holder of the contended resource) is
reported to the user. Often this information has proven to be
insufficient to assist debugging efforts.
To avoid loss of useful debug information, for architectures which
support NMI, this patch makes it possible to improve soft lockup
reporting. This is accomplished by issuing an NMI to each cpu to obtain
a stack trace.
If NMI is not supported we just fall back to the old method. A sysctl
and a boot-time parameter are available to toggle this feature.
[dzickus@redhat.com: add CONFIG_SMP in certain areas]
[akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
[mq@suse.cz: fix warning]
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaron Tomlin [Mon, 23 Jun 2014 20:22:05 +0000 (13:22 -0700)]
nmi: provide the option to issue an NMI back trace to every cpu but current
Sometimes it is preferable not to use the trigger_all_cpu_backtrace()
routine, because one wants to avoid capturing a back trace for the current
cpu, for instance if one was captured recently.
This patch provides a new routine namely
trigger_allbutself_cpu_backtrace() which offers the flexibility to issue
an NMI to every cpu but current and capture a back trace accordingly.
Patch x86 and sparc to support new routine.
[dzickus@redhat.com: add stub in #else clause]
[dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion]
[sfr@canb.auug.org.au: undo C99ism]
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
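The difference between the "all cpus" and "all but current" variants
described above comes down to which CPUs are selected as NMI targets. The
sketch below models that selection with a 64-bit mask; backtrace_targets()
and the mask representation are hypothetical and are not the kernel's
cpumask API.

```c
#include <stdint.h>

/*
 * Build the set of CPUs that should receive the backtrace request.
 * Bit n of 'online' is set when CPU n is online (at most 64 CPUs in this
 * toy model); 'self' is the requesting CPU, in the range 0..63.
 */
static uint64_t backtrace_targets(uint64_t online, unsigned int self,
                                  int include_self)
{
    uint64_t mask = online;

    if (!include_self)
        mask &= ~(UINT64_C(1) << self);   /* drop the current CPU */
    return mask;
}
```

On a single-CPU system the "all but current" mask is empty, so there is
nothing to signal and nothing worth printing.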
Rickard Strandqvist [Mon, 23 Jun 2014 20:22:04 +0000 (13:22 -0700)]
Documentation/accounting/getdelays.c: add missing null-terminate after strncpy call
Added a guaranteed null-terminate after call to strncpy.
This was partly found using a static code analysis program called
cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
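The strncpy() pitfall behind the getdelays.c fix above: when the source is
at least as long as the limit, strncpy() writes no terminating NUL. A
minimal sketch of the usual remedy follows; copy_terminated() is a
hypothetical helper, not the code from the patch.

```c
#include <string.h>

/*
 * Copy 'src' into 'dst' (capacity 'size' bytes, size > 0) and always
 * leave 'dst' NUL-terminated, truncating if necessary.
 */
static void copy_terminated(char *dst, const char *src, size_t size)
{
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';   /* strncpy() adds no terminator on truncation */
}
```

Copying "abcdef" into a 4-byte buffer this way leaves "abc" followed by a
NUL, whereas a bare strncpy(dst, src, 4) would leave the buffer
unterminated.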
Micky Ching [Mon, 23 Jun 2014 20:22:04 +0000 (13:22 -0700)]
drivers/memstick/host/rtsx_pci_ms.c: add cancel_work when remove driver
Add cancel_work_sync() in rtsx_pci_ms_drv_remove() to cancel pending
request work when removing the driver.
Signed-off-by: Micky Ching <micky_ching@realsil.com.cn>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Roger Tseng <rogerable@realtek.com>
Cc: Wei WANG <wei_wang@realsil.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Mon, 23 Jun 2014 20:22:04 +0000 (13:22 -0700)]
mm, pcp: allow restoring percpu_pagelist_fraction default
Oleg reports a division by zero error on zero-length write() to the
percpu_pagelist_fraction sysctl:
divide error: 0000 [#1] SMP DEBUG_PAGEALLOC
CPU: 1 PID: 9142 Comm: badarea_io Not tainted 3.15.0-rc2-vm-nfs+ #19
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff8800d5aeb6e0 ti: ffff8800d87a2000 task.ti: ffff8800d87a2000
RIP: 0010: percpu_pagelist_fraction_sysctl_handler+0x84/0x120
RSP: 0018:ffff8800d87a3e78 EFLAGS: 00010246
RAX: 0000000000000f89 RBX: ffff88011f7fd000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000010
RBP: ffff8800d87a3e98 R08: ffffffff81d002c8 R09: ffff8800d87a3f50
R10: 000000000000000b R11: 0000000000000246 R12: 0000000000000060
R13: ffffffff81c3c3e0 R14: ffffffff81cfddf8 R15: ffff8801193b0800
FS: 00007f614f1e9740(0000) GS:ffff88011f440000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f614f1fa000 CR3: 00000000d9291000 CR4: 00000000000006e0
Call Trace:
proc_sys_call_handler+0xb3/0xc0
proc_sys_write+0x14/0x20
vfs_write+0xba/0x1e0
SyS_write+0x46/0xb0
tracesys+0xe1/0xe6
However, if the percpu_pagelist_fraction sysctl is set by the user, it
is also impossible to restore it to the kernel default since the user
cannot write 0 to the sysctl.
This patch allows the user to write 0 to restore the default behavior.
It still requires a fraction equal to or larger than 8, however, as
stated by the documentation for sanity. If a value in the range [1, 7]
is written, the sysctl will return EINVAL.
This successfully solves the divide by zero issue at the same time.
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Oleg Drokin <green@linuxhacker.ru>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
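The acceptance rule stated in the percpu_pagelist_fraction commit above (0
restores the default, 1 through 7 are rejected, 8 and above are accepted)
can be written down as a small validation function. This is a sketch only:
check_pagelist_fraction(), pcp_high() and MIN_PAGELIST_FRACTION are
hypothetical names, and the division in pcp_high() merely illustrates why a
zero divisor must never reach the arithmetic.

```c
#include <errno.h>

#define MIN_PAGELIST_FRACTION 8   /* smallest non-zero value, per the text */

/* Validate a value written to the sysctl: 0 if acceptable, else -EINVAL. */
static int check_pagelist_fraction(int value)
{
    if (value == 0)
        return 0;                 /* 0 means "restore the default" */
    if (value < MIN_PAGELIST_FRACTION)
        return -EINVAL;           /* 1..7 (and negatives) are rejected */
    return 0;
}

/*
 * Hypothetical consumer of the setting: the divide only happens for a
 * non-zero fraction, so zero can never be used as a divisor.
 */
static unsigned long pcp_high(unsigned long managed_pages, int fraction,
                              unsigned long default_high)
{
    if (fraction == 0)
        return default_high;
    return managed_pages / (unsigned long)fraction;
}
```

Treating 0 as an explicit "use the default" value handles both problems
from the report with one check: the default becomes restorable and the
divide error disappears.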
Chen Gang [Mon, 23 Jun 2014 20:22:04 +0000 (13:22 -0700)]
lib/Kconfig.debug: let FRAME_POINTER exclude SCORE, just like exclude most of other architectures
The related warning:
scripts/kconfig/conf --allmodconfig Kconfig
warning: (FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && KMEMCHECK && LOCKDEP) selects FRAME_POINTER which has unmet direct dependencies (DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN || MN10300 || METAG) || ARCH_WANT_FRAME_POINTERS)
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don Zickus [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
kernel/watchdog.c: remove preemption restrictions when restarting lockup detector
Peter Wu noticed the following splat on his machine when updating
/proc/sys/kernel/watchdog_thresh:
BUG: sleeping function called from invalid context at mm/slub.c:965
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
3 locks held by init/1:
#0: (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180
#1: (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110
#2: (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80
Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110
CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x4e/0x7a
__might_sleep+0x11d/0x190
kmem_cache_alloc_trace+0x4e/0x1e0
perf_event_alloc+0x55/0x440
perf_event_create_kernel_counter+0x26/0xe0
watchdog_nmi_enable+0x75/0x140
update_timers_all_cpus+0x53/0xa0
proc_dowatchdog+0xe4/0x110
proc_sys_call_handler+0xb3/0xc0
proc_sys_write+0x14/0x20
vfs_write+0xad/0x180
SyS_write+0x49/0xb0
system_call_fastpath+0x16/0x1b
NMI watchdog: disabled (cpu0): hardware events not enabled
What happened is after updating the watchdog_thresh, the lockup detector
is restarted to utilize the new value. Part of this process involved
disabling preemption. Once preemption was disabled, perf tried to
allocate a new event (as part of the restart). This caused the above
BUG_ON as you can't sleep with preemption disabled.
The preemption restriction seemed aggressive as we are not doing anything
on that particular cpu, but with all the online cpus (which are
protected by the get_online_cpus lock). Remove the restriction and the
BUG_ON goes away.
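The failure mode can be illustrated with a toy user-space model. This is only a sketch: the ToyCpu class, SleepInAtomicError and restart_detector() are names invented for illustration. They mirror the idea of a preempt counter and a might-sleep check, and are not kernel code.

```python
class SleepInAtomicError(RuntimeError):
    """Raised when a possibly-sleeping call runs with preemption disabled."""


class ToyCpu:
    """Hypothetical model of a per-CPU preempt counter (not a kernel API)."""

    def __init__(self):
        self.preempt_count = 0

    def preempt_disable(self):
        self.preempt_count += 1

    def preempt_enable(self):
        self.preempt_count -= 1

    def might_sleep(self):
        # Sleeping is only legal while the preempt counter is zero.
        if self.preempt_count > 0:
            raise SleepInAtomicError(
                "sleeping function called from invalid context")

    def alloc_event(self):
        # Stands in for an allocation that is allowed to sleep.
        self.might_sleep()
        return object()


def restart_detector(cpu, disable_preemption):
    """Restart the modelled detector, optionally with preemption disabled."""
    if disable_preemption:
        cpu.preempt_disable()
    try:
        return cpu.alloc_event()
    finally:
        if disable_preemption:
            cpu.preempt_enable()
```

In this model, restart_detector(ToyCpu(), True) raises SleepInAtomicError, while restart_detector(ToyCpu(), False) completes, which corresponds to dropping the preemption restriction.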
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Peter Wu <peter@lekensteyn.nl>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org> [3.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Lameter [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
MAINTAINERS: SLAB maintainer update
As discussed in various threads on the side:
Remove one inactive maintainer, add two new ones and update my email
address. Plus add Andrew. And fix the glob to include files like
mm/slab_common.c
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
hugetlb: fix copy_hugetlb_page_range() to handle migration/hwpoisoned entry
There's a race between fork() and hugepage migration; as a result we try
to "dereference" a swap entry as a normal pte, causing kernel panic.
The cause of the problem is that copy_hugetlb_page_range() can't handle
"swap entry" family (migration entry and hwpoisoned entry) so let's fix
it.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: <stable@vger.kernel.org> [2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
tmpfs: ZERO_RANGE and COLLAPSE_RANGE not currently supported
I was well aware of FALLOC_FL_ZERO_RANGE and FALLOC_FL_COLLAPSE_RANGE
support being added to fallocate(); but didn't realize until now that I
had been too stupid to future-proof shmem_fallocate() against new
additions. Return -EOPNOTSUPP instead of going on to ordinary fallocation.
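A minimal sketch of the future-proofing idea, written as a user-space model rather than the actual shmem_fallocate() code. The flag values are the ones defined in linux/falloc.h. The SUPPORTED set and the toy_fallocate() name are assumptions made for this example.

```python
import errno

# Flag values as defined in the Linux UAPI header linux/falloc.h.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02
FALLOC_FL_COLLAPSE_RANGE = 0x08
FALLOC_FL_ZERO_RANGE = 0x10

# Modes this toy filesystem claims to handle (assumption for the sketch).
SUPPORTED = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE


def toy_fallocate(mode):
    """Return 0 on pretend success, or a negative errno in kernel style."""
    # Reject any bit outside the supported set, including flags that
    # did not exist when this check was written.
    if mode & ~SUPPORTED:
        return -errno.EOPNOTSUPP
    return 0
```

Checking against the complement of the supported mask, instead of testing for specific unsupported flags, means a flag added later is rejected without further changes.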
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: <stable@vger.kernel.org> [3.15]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
mm, hotplug: probe interface is available on several platforms
Documentation/memory-hotplug.txt incorrectly states that the memory
driver "probe" interface is only supported on powerpc and is vague about
its application on x86. Clarify the platforms that make this interface
available if memory hotplug is enabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Petr Tesarik [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
kexec: save PG_head_mask in VMCOREINFO
To allow filtering of huge pages, makedumpfile must be able to identify
them in the dump. This can be done by checking the appropriate page
flag, so communicate its value to makedumpfile through the VMCOREINFO
interface.
There's only one small catch. Depending on how many page flags are
available on a given architecture, this bit can be called PG_head or
PG_compound.
I sent a similar patch back in 2012, but Eric Biederman did not like
using an #ifdef. So, this time I'm adding a common symbol
(PG_head_mask) instead.
See https://lkml.org/lkml/2012/11/28/91 for the previous version.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Srivatsa S. Bhat [Mon, 23 Jun 2014 20:22:02 +0000 (13:22 -0700)]
CPU hotplug, smp: flush any pending IPI callbacks before CPU offline
There is a race between the CPU offline code (within stop-machine) and
the smp-call-function code, which can lead to getting IPIs on the
outgoing CPU, *after* it has gone offline.
Specifically, this can happen when using
smp_call_function_single_async() to send the IPI, since this API allows
sending asynchronous IPIs from IRQ disabled contexts. The exact race
condition is described below.
During CPU offline, in stop-machine, we don't enforce any rule in the
_DISABLE_IRQ stage, regarding the order in which the outgoing CPU and
the other CPUs disable their local interrupts. Due to this, we can
encounter a situation in which an IPI is sent by one of the other CPUs
to the outgoing CPU (while it is *still* online), but the outgoing CPU
ends up noticing it only *after* it has gone offline.
CPU 1                                 CPU 2
(Online CPU)                          (CPU going offline)
Enter _PREPARE stage                  Enter _PREPARE stage
                                      Enter _DISABLE_IRQ stage
                                    =
Got a device interrupt, and         | Didn't notice the IPI
the interrupt handler sent an       | since interrupts were
IPI to CPU 2 using                  | disabled on this CPU.
smp_call_function_single_async()    |
                                    =
Enter _DISABLE_IRQ stage
Enter _RUN stage                      Enter _RUN stage
                                    =
Busy loop with interrupts           | Invoke take_cpu_down()
disabled.                           | and take CPU 2 offline
                                    =
Enter _EXIT stage                     Enter _EXIT stage
Re-enable interrupts                  Re-enable interrupts
                                      The pending IPI is noted
                                      immediately, but alas,
                                      the CPU is offline at
                                      this point.
This of course, makes the smp-call-function IPI handler code running on
CPU 2 unhappy and it complains about "receiving an IPI on an offline
CPU".
One real example of the scenario on CPU 1 is the block layer's
complete-request call-path:
__blk_complete_request() [interrupt-handler]
raise_blk_irq()
smp_call_function_single_async()
However, if we look closely, the block layer does check that the target
CPU is online before firing the IPI. So in this case, it is actually
the unfortunate ordering/timing of events in the stop-machine phase that
leads to receiving IPIs after the target CPU has gone offline.
In reality, getting a late IPI on an offline CPU is not too bad by
itself (this can happen even due to hardware latencies in IPI
send-receive). It is a bug only if the target CPU really went offline
without executing all the callbacks queued on its list. (Note that a
CPU is free to execute its pending smp-call-function callbacks in a
batch, without waiting for the corresponding IPIs to arrive for each one
of those callbacks).
So, fixing this issue can be broken up into two parts:
1. Ensure that a CPU goes offline only after executing all the
callbacks queued on it.
2. Modify the warning condition in the smp-call-function IPI handler
code such that it warns only if an offline CPU got an IPI *and* that
CPU had gone offline with callbacks still pending in its queue.
Achieving part 1 is straight-forward - just flush (execute) all the
queued callbacks on the outgoing CPU in the CPU_DYING stage[1],
including those callbacks for which the source CPU's IPIs might not have
been received on the outgoing CPU yet. Once we do this, an IPI that
arrives late on the CPU going offline (either due to the race mentioned
above, or due to hardware latencies) will be completely harmless, since
the outgoing CPU would have executed all the queued callbacks before
going offline.
Overall, this fix (parts 1 and 2 put together) additionally guarantees
that we will see a warning only when the *IPI-sender code* is buggy -
that is, if it queues the callback _after_ the target CPU has gone
offline.
[1]. The CPU_DYING part needs a little more explanation: by the time we
execute the CPU_DYING notifier callbacks, the CPU would have already
been marked offline. But we want to flush out the pending callbacks at
this stage, ignoring the fact that the CPU is offline. So restructure
the IPI handler code so that we can by-pass the "is-cpu-offline?" check
in this particular case. (Of course, the right solution here is to fix
CPU hotplug to mark the CPU offline _after_ invoking the CPU_DYING
notifiers, but this requires a lot of audit to ensure that this change
doesn't break any existing code; hence let's go with the solution
proposed above until that is done).
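The two-part fix can be sketched as a small user-space model. Everything here (ToyHotplugCpu and its methods) is hypothetical and only imitates the ordering described above. It is not the smp-call-function code.

```python
from collections import deque


class ToyHotplugCpu:
    """Hypothetical model of one CPU's smp-call-function queue."""

    def __init__(self):
        self.online = True
        self.queue = deque()   # callbacks queued by other CPUs
        self.ran = []          # callbacks that were executed
        self.warnings = 0      # "IPI on an offline CPU" warnings

    def queue_callback(self, name):
        self.queue.append(name)

    def flush_queue(self, warn_if_offline):
        # Part 2: warn only if the CPU is offline *and* work is pending.
        if warn_if_offline and not self.online and self.queue:
            self.warnings += 1
        while self.queue:
            self.ran.append(self.queue.popleft())

    def handle_ipi(self):
        # Normal IPI path: the offline check applies.
        self.flush_queue(warn_if_offline=True)

    def go_offline(self):
        # As in footnote [1], the CPU is already marked offline when the
        # dying-stage work runs, so the flush bypasses the offline check.
        self.online = False
        # Part 1: execute everything queued so far before going away.
        self.flush_queue(warn_if_offline=False)
```

With this model, a callback queued before go_offline() runs exactly once and a late handle_ipi() is silent. A callback queued after go_offline() triggers the warning, which corresponds to a buggy IPI sender.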
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sachin Kamat <sachin.kamat@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Miao [Mon, 23 Jun 2014 20:22:02 +0000 (13:22 -0700)]
mm: nommu: per-thread vma cache fix
mm may already have been removed from the current task struct, so use vma->vm_mm instead.
Blackfin crashes after updating to Linux 3.15. The commit "mm:
per-thread vma caching" caused the crash: mm can be removed from the
current task struct before
mmput()->
exit_mmap()->
delete_vma_from_mm()
the detailed fault information:
NULL pointer access
Kernel OOPS in progress
Deferred Exception context
CURRENT PROCESS:
COMM=modprobe PID=278 CPU=0
invalid mm
return address: [0x000531de]; contents of:
0x000531b0: c727 acea 0c42 181d 0000 0000 0000 a0a8
0x000531c0: b090 acaa 0c42 1806 0000 0000 0000 a0e8
0x000531d0: b0d0 e801 0000 05b3 0010 e522 0046 [a090]
0x000531e0: 6408 b090 0c00 17cc 3042 e3ff f37b 2fc8
CPU: 0 PID: 278 Comm: modprobe Not tainted 3.15.0-ADI-2014R1-pre-00345-gea9f446 #25
task: 0572b720 ti: 0569e000 task.ti: 0569e000
Compiled for cpu family 0x27fe (Rev 0), but running on:0x0000 (Rev 0)
ADSP-BF609-0.0 500(MHz CCLK) 125(MHz SCLK) (mpu off)
Linux version 3.15.0-ADI-2014R1-pre-00345-gea9f446 (steven@steven-OptiPlex-390) (gcc version 4.3.5 (ADI-trunk/svn-5962) ) #25 Tue Jun 10 17:47:46 CST 2014
SEQUENCER STATUS: Not tainted
SEQSTAT: 00000027 IPEND: 8008 IMASK: ffff SYSCFG: 2806
EXCAUSE : 0x27
physical IVG3 asserted : <0xffa00744> { _trap + 0x0 }
physical IVG15 asserted : <0xffa00d68> { _evt_system_call + 0x0 }
logical irq 6 mapped : <0xffa003bc> { _bfin_coretmr_interrupt + 0x0 }
logical irq 7 mapped : <0x00008828> { _bfin_fault_routine + 0x0 }
logical irq 11 mapped : <0x00007724> { _l2_ecc_err + 0x0 }
logical irq 13 mapped : <0x00008828> { _bfin_fault_routine + 0x0 }
logical irq 39 mapped : <0x00150788> { _bfin_twi_interrupt_entry + 0x0 }
logical irq 40 mapped : <0x00150788> { _bfin_twi_interrupt_entry + 0x0 }
RETE: <0x00000000> /* Maybe null pointer? */
RETN: <0x0569fe50> /* kernel dynamic memory (maybe user-space) */
RETX: <0x00000480> /* Maybe fixed code section */
RETS: <0x00053384> { _exit_mmap + 0x28 }
PC : <0x000531de> { _delete_vma_from_mm + 0x92 }
DCPLB_FAULT_ADDR: <0x00000008> /* Maybe null pointer? */
ICPLB_FAULT_ADDR: <0x000531de> { _delete_vma_from_mm + 0x92 }
PROCESSOR STATE:
R0 : 00000004 R1 : 0569e000 R2 : 00bf3db4 R3 : 00000000
R4 : 057f9800 R5 : 00000001 R6 : 0569ddd0 R7 : 0572b720
P0 : 0572b854 P1 : 00000004 P2 : 00000000 P3 : 0569dda0
P4 : 0572b720 P5 : 0566c368 FP : 0569fe5c SP : 0569fd74
LB0: 057f523f LT0: 057f523e LC0: 00000000
LB1: 0005317c LT1: 00053172 LC1: 00000002
B0 : 00000000 L0 : 00000000 M0 : 0566f5bc I0 : 00000000
B1 : 00000000 L1 : 00000000 M1 : 00000000 I1 : ffffffff
B2 : 00000001 L2 : 00000000 M2 : 00000000 I2 : 00000000
B3 : 00000000 L3 : 00000000 M3 : 00000000 I3 : 057f8000
A0.w: 00000000 A0.x: 00000000 A1.w: 00000000 A1.x: 00000000
USP : 056ffcf8 ASTAT: 02003024
Hardware Trace:
0 Target : <0x00003fb8> { _trap_c + 0x0 }
Source : <0xffa006d8> { _exception_to_level5 + 0xa0 } JUMP.L
1 Target : <0xffa00638> { _exception_to_level5 + 0x0 }
Source : <0xffa004f2> { _bfin_return_from_exception + 0x6 } RTX
2 Target : <0xffa004ec> { _bfin_return_from_exception + 0x0 }
Source : <0xffa00590> { _ex_trap_c + 0x70 } JUMP.S
3 Target : <0xffa00520> { _ex_trap_c + 0x0 }
Source : <0xffa0076e> { _trap + 0x2a } JUMP (P4)
4 Target : <0xffa00744> { _trap + 0x0 }
FAULT : <0x000531de> { _delete_vma_from_mm + 0x92 } P0 = W[P2 + 2]
Source : <0x000531da> { _delete_vma_from_mm + 0x8e } P2 = [P4 + 0x18]
5 Target : <0x000531da> { _delete_vma_from_mm + 0x8e }
Source : <0x00053176> { _delete_vma_from_mm + 0x2a } IF CC JUMP pcrel
6 Target : <0x0005314c> { _delete_vma_from_mm + 0x0 }
Source : <0x00053380> { _exit_mmap + 0x24 } JUMP.L
7 Target : <0x00053378> { _exit_mmap + 0x1c }
Source : <0x00053394> { _exit_mmap + 0x38 } IF !CC JUMP pcrel (BP)
8 Target : <0x00053390> { _exit_mmap + 0x34 }
Source : <0xffa020e0> { __cond_resched + 0x20 } RTS
9 Target : <0xffa020c0> { __cond_resched + 0x0 }
Source : <0x0005338c> { _exit_mmap + 0x30 } JUMP.L
10 Target : <0x0005338c> { _exit_mmap + 0x30 }
Source : <0x0005333a> { _delete_vma + 0xb2 } RTS
11 Target : <0x00053334> { _delete_vma + 0xac }
Source : <0x0005507a> { _kmem_cache_free + 0xba } RTS
12 Target : <0x00055068> { _kmem_cache_free + 0xa8 }
Source : <0x0005505e> { _kmem_cache_free + 0x9e } IF !CC JUMP pcrel (BP)
13 Target : <0x00055052> { _kmem_cache_free + 0x92 }
Source : <0x0005501a> { _kmem_cache_free + 0x5a } IF CC JUMP pcrel
14 Target : <0x00054ff4> { _kmem_cache_free + 0x34 }
Source : <0x00054fce> { _kmem_cache_free + 0xe } IF CC JUMP pcrel (BP)
15 Target : <0x00054fc0> { _kmem_cache_free + 0x0 }
Source : <0x00053330> { _delete_vma + 0xa8 } JUMP.L
Kernel Stack
Stack info:
SP: [0x0569ff24] <0x0569ff24> /* kernel dynamic memory (maybe user-space) */
Memory from 0x0569ff20 to 056a0000
0569ff20: 00000001 [04e8da5a] 00008000 00000000 00000000 056a0000 04e8da5a 04e8da5a
0569ff40: 04eb9eea ffa00dce 02003025 04ea09c5 057f523f 04ea09c4 057f523e 00000000
0569ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000
0569ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0569ffa0: 0566f5bc 057f8000 057f8000 00000001 04ec0170 056ffcf8 056ffd04 057f9800
0569ffc0: 04d1d498 057f9800 057f8fe4 057f8ef0 00000001 057f928c 00000001 00000001
0569ffe0: 057f9800 00000000 00000008 00000007 00000001 00000001 00000001 <00002806>
Return addresses in stack:
address : <0x00002806> { _show_cpuinfo + 0x2d2 }
Modules linked in:
Kernel panic - not syncing: Kernel exception
---[ end Kernel panic - not syncing: Kernel exception
Signed-off-by: Steven Miao <realmz6@gmail.com>
Acked-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org> [3.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Hesselbarth [Mon, 23 Jun 2014 20:25:15 +0000 (22:25 +0200)]
ARM: dts: kirkwood: fix phy-connection-type for Guruplug
Commit eeb845459a72e792a959278b858f9c417e9995bd
("ARM: dts: kirkwood: set Guruplug phy-connection-type to rgmii-id")
added phy-connection-type properties to ethernet PHY nodes.
Actually, the property has to be set for the ethernet port node instead.
Fix it by moving the corresponding properties to the correct nodes.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: https://lkml.kernel.org/r/1403555115-13111-1-git-send-email-sebastian.hesselbarth@gmail.com
Fixes: eeb845459a72: ('ARM: dts: kirkwood: set Guruplug phy-connection-type to rgmii-id')
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Greg Kroah-Hartman [Sat, 21 Jun 2014 05:01:41 +0000 (22:01 -0700)]
lz4: ensure length does not wrap
Given some pathologically compressed data, lz4 could possibly decide to
wrap a few internal variables, causing unknown things to happen. Catch
this before the wrapping happens and abort the decompression.
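A rough illustration of the kind of check involved, assuming a 32-bit unsigned word. The function names and the 32-bit width are assumptions for this sketch, and Python integers are masked to imitate C unsigned arithmetic. It is not the lz4 decompressor itself.

```python
WORD_MASK = 0xFFFFFFFF  # model a 32-bit unsigned word (assumption)


def end_of_copy(op, length):
    """Compute op + length with 32-bit wraparound, as C unsigned math would."""
    return (op + length) & WORD_MASK


def copy_is_safe(op, length, oend):
    """Return True only if [op, op + length) fits below oend without wrapping."""
    cpy = end_of_copy(op, length)
    # If the sum wrapped, the end lands *before* the start. Without this
    # test a wrapped end could pass the bounds check below.
    if cpy < op:
        return False
    return cpy <= oend
```

For example, with op = 0xFFFFFF00 and length = 0x200 the wrapped end is 0x100, which would look "in bounds" to a naive comparison against oend. The explicit wrap test rejects it.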
Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sat, 21 Jun 2014 05:00:53 +0000 (22:00 -0700)]
lzo: properly check for overruns
The lzo decompressor can, if given some really crazy data, possibly
overrun some variable types. Modify the checking logic to properly
detect overruns before they happen.
Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Tested-by: "Don A. Bailey" <donb@securitymouse.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Mon, 23 Jun 2014 12:12:48 +0000 (14:12 +0200)]
Merge tag 'imx-fixes-3.16' of git://git./linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 3.16" from Shawn Guo:
- Use GPIO for card CD/WP on imx51-babbage and eukrea-mbimxsd51,
because controller base CD/WP is not working in esdhc driver due to
runtime PM support
- A couple of random ventana gw5xxx board fixes
- Add IMX_IPUV3_CORE back to defconfig, which gets lost when moving
IPUv3 driver out of staging tree
- Fix enet/fec clock selection on imx6sl
- Fix display node on imx53-m53evk board
- A couple of Cubox-i updates from Russell, which were omitted from
the merge window due to dependency
* tag 'imx-fixes-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx51-eukrea-mbimxsd51-baseboard: unbreak esdhc.
ARM: dts: imx51-babbage: Fix esdhc setup
ARM: dts: mx5: Move the display out of soc {} node
ARM: dts: mx5: Fix IPU port node placement
ARM: imx_v6_v7_defconfig: Enable CONFIG_IMX_IPUV3_CORE
ARM: dts: hummingboard/cubox-i: move usb otg configuration to platform level
ARM: dts: cubox-i: add support for PWM-driven front panel LED
ARM: dts: imx6: ventana: correct gw52xx sgtl5000 clock source
ARM: dts: imx6qdl-gw5xxx: Fix Linear Technology vendor prefix
ARM: dts: imx6: ventana: fix include typo
ARM: dts: imx6sl: correct the fec ipg clock source
ARM: imx6sl: add missing enet clock for imx6sl
Ilya Dryomov [Tue, 10 Jun 2014 09:53:29 +0000 (13:53 +0400)]
rbd: handle parent_overlap on writes correctly
The following check in rbd_img_obj_request_submit()
rbd_dev->parent_overlap <= obj_request->img_offset
allows the fall through to the non-layered write case even if both
parent_overlap and obj_request->img_offset belong to the same RADOS
object. This leads to data corruption, because the area to the left of
parent_overlap ends up unconditionally zero-filled instead of being
populated with parent data. Suppose we want to write 1M to offset 6M
of image bar, which is a clone of foo@snap; object_size is 4M,
parent_overlap is 5M:
rbd_data.<id>.0000000000000001
 ---------------------|----------------------|------------
| should be copyup'ed | should be zeroed out | write ...
 ---------------------|----------------------|------------
4M                    5M                     6M
                parent_overlap     obj_request->img_offset
4..5M should be copyup'ed from foo, yet it is zero-filled, just like
5..6M is.
Given that the only striping mode kernel client currently supports is
chunking (i.e. stripe_unit == object_size, stripe_count == 1), round
parent_overlap up to the next object boundary for the purposes of the
overlap check.
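The arithmetic from the example above can be worked through in a short sketch. The helper names are hypothetical and only model the comparison quoted from rbd_img_obj_request_submit(). The sizes are the ones given in the text (object_size 4M, parent_overlap 5M, write at 6M).

```python
MIB = 1 << 20


def round_up(value, multiple):
    """Round value up to the next multiple (ceiling division on integers)."""
    return -(-value // multiple) * multiple


def is_layered_write_old(parent_overlap, img_offset):
    # Old check: fall through to the non-layered path whenever
    # parent_overlap <= img_offset.
    return not (parent_overlap <= img_offset)


def is_layered_write_new(parent_overlap, img_offset, object_size):
    # New check: compare against parent_overlap rounded up to the next
    # object boundary, so a write inside the same object stays layered.
    return not (round_up(parent_overlap, object_size) <= img_offset)
```

With these numbers the old check treats the write as non-layered, because 5M <= 6M. After rounding, 5M becomes 8M, so the write at 6M is still handled as a layered write and 4..5M can be copied up from the parent.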
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Michael S. Tsirkin [Thu, 12 Jun 2014 16:00:01 +0000 (19:00 +0300)]
vhost-scsi: don't open-code kvfree
Now that we have kvfree, use it in vhost-scsi instead of
the open-coded version.
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Romain Francoise [Thu, 12 Jun 2014 08:42:34 +0000 (10:42 +0200)]
vhost-net: don't open-code kvfree
Commit 23cc5a991c ("vhost-net: extend device allocation to vmalloc")
added another open-coded version of kvfree (which is available since
v3.15-rc5), nuke it.
Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Arnd Bergmann [Sun, 22 Jun 2014 18:46:52 +0000 (20:46 +0200)]
Merge tag 'samsung-fixes-1' of git://git./linux/kernel/git/kgene/linux-samsung into fixes
Merge Samsung fixes for 3.16 from Kukjin Kim:
- use WFI macro in platform_do_lowpower because exynos cpuhotplug
includes a hardcoded WFI instruction and it causes compile error
in Thumb-2 mode.
- fix GIC reg sizes for exynos4 SoCs
- remove reset timer counter value during boot and resume for mct
to fix a big jump in printk timestamps
- fix pm code to check cortex-A9 for other exynos SoCs
- don't rely on firmware's secondary_cpu_start for mcpm
* tag 'samsung-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: EXYNOS: Don't rely on firmware's secondary_cpu_start for mcpm
ARM: EXYNOS: fix pm code to check for cortex A9 rather than the SoC
clocksource: exynos_mct: Don't reset the counter during boot and resume
ARM: dts: fix reg sizes of GIC for exynos4
ARM: EXYNOS: Use wfi macro in platform_do_lowpower
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Sun, 22 Jun 2014 05:02:54 +0000 (19:02 -1000)]
Linux 3.16-rc2
Linus Torvalds [Sun, 22 Jun 2014 05:01:15 +0000 (19:01 -1000)]
Merge branch 'i2c/for-next' of git://git./linux/kernel/git/wsa/linux
Pull i2c new drivers from Wolfram Sang:
"Here is a pull request from i2c hoping for the "new driver" rule.
Originally, I wanted to send this request during the merge window, but
code checkers with very recent additions complained, so a few fixups
were needed. So, some more time went by and I merged rc1 to get a
stable base"
So the "new driver" rule is really about drivers that people absolutely
need for the kernel to work on new hardware, which is not so much the
case for i2c. So I considered not pulling this, but eventually
relented.
Just for FYI: the whole (and only) point of "new drivers" is not that
new drivers cannot regress things (they can, and they have - by
triggering badly tested code on machines that never triggered that code
before), but because they can bring to life machines that otherwise
wouldn't be useful at all without the drivers.
So the new driver rule is for essential things that actual consumers
would care about, ie devices like networking or disk drivers that matter
to normal people (not server people - they run old kernels anyway, so
mainlining new drivers is irrelevant for them).
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sun6-p2wi: fix call to snprintf
i2c: rk3x: add NULL entry to the end of_device_id array
i2c: sun6i-p2wi: use proper return value in probe
i2c: sunxi: add P2WI (Push/Pull 2 Wire Interface) controller support
i2c: sunxi: add P2WI DT bindings documentation
i2c: rk3x: add driver for Rockchip RK3xxx SoC I2C adapter
Linus Torvalds [Sun, 22 Jun 2014 02:40:30 +0000 (16:40 -1000)]
Merge tag 'locks-v3.16-2' of git://git.samba.org/jlayton/linux
Pull file locking fixes from Jeff Layton:
"File locking related bugfixes
Nothing too earth-shattering here. A fix for a potential regression
due to a patch in pile #1, and the addition of a memory barrier to
prevent a race condition between break_deleg and generic_add_lease"
* tag 'locks-v3.16-2' of git://git.samba.org/jlayton/linux:
locks: set fl_owner for leases back to current->files
locks: add missing memory barrier in break_deleg
Linus Torvalds [Sun, 22 Jun 2014 02:38:16 +0000 (16:38 -1000)]
Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"There are three fixes for regressions caused by the relative paths
series: deb-pkg, tar-pkg and *docs did not work with O=.
Plus, there is a fix for the linux-headers deb package and a fixed
typo. These are not regression fixes but are safe enough"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: fix a typo in a kbuild document
builddeb: fix missing headers in linux-headers package
Documentation: Fix DocBook build with relative $(srctree)
kbuild: Fix tar-pkg with relative $(objtree)
deb-pkg: Fix for relative paths
Linus Torvalds [Sun, 22 Jun 2014 00:21:43 +0000 (14:21 -1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This fixes some lockups in btrfs reported with rc1. It probably has
some performance impact because it is backing off our spinning locks
more often and switching to a blocking lock. I'll be able to nail
that down next week, but for now I want to get the lockups taken care
of.
Otherwise some more stack reduction and assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix wrong error handle when the device is missing or is not writeable
Btrfs: fix deadlock when mounting a degraded fs
Btrfs: use bio_endio_nodec instead of open code
Btrfs: fix NULL pointer crash when running balance and scrub concurrently
btrfs: Skip scrubbing removed chunks to avoid -ENOENT.
Btrfs: fix broken free space cache after the system crashed
Btrfs: make free space cache write out functions more readable
Btrfs: remove unused wait queue in struct extent_buffer
Btrfs: fix deadlocks with trylock on tree nodes
Linus Torvalds [Sun, 22 Jun 2014 00:20:38 +0000 (14:20 -1000)]
Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
"Fixes for a new regression from the xdr encoding rewrite, and a
delegation problem we've had for a while (made somewhat more annoying
by the vfs delegation support added in 3.13)"
* 'for-3.16' of git://linux-nfs.org/~bfields/linux:
NFSD: fix bug for readdir of pseudofs
NFSD: Don't hand out delegations for 30 seconds after recalling them.
Linus Torvalds [Sat, 21 Jun 2014 17:07:17 +0000 (07:07 -1000)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This is larger than usual: the main reason are the ARM symbol lookup
speedups that came in late and were hard to resist.
There's also a kprobes fix and various tooling fixes, plus the minimal
re-enablement of the mmap2 support interface"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
x86/kprobes: Fix build errors and blacklist context_track_user
perf tests: Add test for closing dso objects on EMFILE error
perf tests: Add test for caching dso file descriptors
perf tests: Allow reuse of test_file function
perf tests: Spawn child for each test
perf tools: Add dso__data_* interface descriptons
perf tools: Allow to close dso fd in case of open failure
perf tools: Add file size check and factor dso__data_read_offset
perf tools: Cache dso data file descriptor
perf tools: Add global count of opened dso objects
perf tools: Add global list of opened dso objects
perf tools: Add data_fd into dso object
perf tools: Separate dso data related variables
perf tools: Cache register accesses for unwind processing
perf record: Fix to honor user freq/interval properly
perf timechart: Reflow documentation
perf probe: Improve error messages in --line option
perf probe: Improve an error message of perf probe --vars mode
perf probe: Show error code and description in verbose mode
perf probe: Improve error message for unknown member of data structure
...
Linus Torvalds [Sat, 21 Jun 2014 17:06:02 +0000 (07:06 -1000)]
Merge branch 'locking-urgent-for-linus.patch' of git://git./linux/kernel/git/tip/tip
Pull rtmutex fixes from Thomas Gleixner:
"Another three patches to make the rtmutex code more robust. That's
the last urgent fallout from the big futex/rtmutex investigation"
* 'locking-urgent-for-linus.patch' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Plug slow unlock race
rtmutex: Detect changes in the pi lock chain
rtmutex: Handle deadlock detection smarter
Linus Torvalds [Sat, 21 Jun 2014 16:47:01 +0000 (06:47 -1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
"A couple of bug fixes, a debug change for qdio, an update for the
default config, and one small extension.
The watchdog module based on diagnose 0x288 is converted to the
watchdog API and it now works under LPAR as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ccwgroup: use ccwgroup_ungroup wrapper
s390/ccwgroup: fix an uninitialized return code
s390/ccwgroup: obtain extra reference for asynchronous processing
qdio: Keep device-specific dbf entries
s390/compat: correct ucontext layout for high gprs
s390/cio: set device name as early as possible
s390: update default configuration
s390: avoid format strings leaking into names
s390/airq: silence lockdep warning
s390/watchdog: add support for LPAR operation (diag288)
s390/watchdog: use watchdog API
s390/sclp_vt220: Enable ASCII console per default
s390/qdio: replace shift loop by ilog2
s390/cio: silence lockdep warning
s390/uaccess: always load the kernel ASCE after task switch
s390/ap_bus: Make modules parameters visible in sysfs
Linus Torvalds [Sat, 21 Jun 2014 16:45:54 +0000 (06:45 -1000)]
Merge tag 'for-linus' of git://github.com/gxt/linux
Pull UniCore32 bug fixes from Guan Xuetao:
"This includes bugfixes to make unicore32 successfully build under
defconfig, and some changes for allmodconfig (though not finished)"
* tag 'for-linus' of git://github.com/gxt/linux:
unicore32: Remove ARCH_HAS_CPUFREQ config option
UniCore32: Change git tree location information in MAINTAINERS
arch: unicore32: ksyms: export '__cpuc_coherent_kern_range' to avoid compiling failure
arch: unicore32: ksyms: export 'pm_power_off' to avoid compiling failure.
arch: unicore32: ksyms: export additional find_first_*() to avoid compiling failure
arch:unicore32:mm: add devmem_is_allowed() to support STRICT_DEVMEM
unicore32: include: asm: add missing ')' for PAGE_* macros in pgtable.h
arch/unicore32/kernel/setup.c: add generic 'screen_info' to avoid compiling failure
drivers: scsi: mvsas: fix compiling issue by adding 'MVS_' for "enum pci_interrupt_cause"
arch: unicore32: kernel: ksyms: remove 'bswapsi2' and 'muldi3' to avoid compiling failure
arch/unicore32/kernel/ksyms.c: remove 2 export symbols to avoid compiling failure
drivers/rtc/rtc-puv3.c: remove "&dev->" for typo issue
drivers/rtc/rtc-puv3.c: use dev_dbg() instead of dev_debug() for typo issue
arch/unicore32/include/asm/io.h: add readl_relaxed() generic definition
arch/unicore32/include/asm/ptrace.h: add generic definition for profile_pc()
arch/unicore32/mm/alignment.c: include "asm/pgtable.h" to avoid compiling error
arch/unicore32/kernel/clock.c: add readl() and writel() for 'PM_' macros
arch/unicore32/kernel/module.c: use __vmalloc_node_range() instead of __vmalloc_area()
arch/unicore32/kernel/ksyms.c: remove several undefined exported symbols
Linus Torvalds [Sat, 21 Jun 2014 16:43:19 +0000 (06:43 -1000)]
Merge tag 'char-misc-3.16-rc2' of git://git./linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are 3 patches, one a revert of the UIO patch you objected to in
3.16-rc1 and that no one wanted to defend, a w1 driver bugfix, and a
MAINTAINERS update for the vmware balloon driver.
All of these, except for the MAINTAINERS update which just got added,
have been in linux-next just fine"
* tag 'char-misc-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
MAINTAINERS: add entry for VMware Balloon driver
w1: mxc_w1: Fix incorrect "presence" status
Revert "uio: fix vma io range check in mmap"
Linus Torvalds [Sat, 21 Jun 2014 16:42:40 +0000 (06:42 -1000)]
Merge tag 'staging-3.16-rc2' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a few fixes for staging and iio drivers that resolve issues
reported in 3.16-rc1.
All have been in linux-next just fine"
* tag 'staging-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
imx-drm: parallel-display: Fix DPMS default state.
staging: android: timed_output: fix use after free of dev
staging: comedi: addi_apci_1564: add addi_watchdog dependency
staging: rtl8723au: Reference correct firmwarefiles with MODULE_FIRMWARE()
staging: rtl8723au: Request correct firmware file for A-cut parts
iio: adc: checking for NULL instead of IS_ERR() in probe
iio: adc: at91: signedness bug in at91_adc_get_trigger_value_by_name()
iio: mxs-lradc: fix divider
iio: Fix endianness issue in ak8975_read_axis()
staging/iio: IIO_SIMPLE_DUMMY_BUFFER neds IIO_BUFFER
twl4030-madc: Request processed values in twl4030_get_madc_conversion
staging: iio: tsl2x7x_core: fix proximity treshold
iio: Fix two mpl3115 issues in measurement conversion
iio: hid-sensors: Get feature report from sensor hub after changing power state
Linus Torvalds [Sat, 21 Jun 2014 16:41:42 +0000 (06:41 -1000)]
Merge tag 'tty-3.16-rc2' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial bugfixes from Greg KH:
"Here are some tty / serial driver bugfixes for 3.16-rc2 that resolve
some reported issues. The samsung driver build error itself has been
reported by a bunch of people, sorry about that one. The others are
all tiny and everyone seems to like them in linux-next so far"
* tag 'tty-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty/serial: fix 8250 early console option passing to regular console
tty: Correct INPCK handling
serial: Fix IGNBRK handling
serial: samsung: Fix build error
Linus Torvalds [Sat, 21 Jun 2014 16:41:07 +0000 (06:41 -1000)]
Merge tag 'usb-3.16-rc2' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB fixes for 3.16-rc2 that resolve some reported
issues. All of these have been in linux-next for a while with no
problems"
* tag 'usb-3.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: usbtest: add a timeout for scatter-gather tests
USB: EHCI: avoid BIOS handover on the HASEE E200
usb: fix hub-port pm_runtime_enable() vs runtime pm transitions
usb: quiet peer failure warning, disable poweroff
usb: improve "not suspended yet" message in hub_suspend()
xhci: Fix sleeping with IRQs disabled in xhci_stop_device()
usb: fix ->update_hub_device() vs hdev->maxchild
Doug Anderson [Sat, 21 Jun 2014 10:30:53 +0000 (19:30 +0900)]
ARM: EXYNOS: Don't rely on firmware's secondary_cpu_start for mcpm
On exynos mcpm systems the firmware is hardcoded to jump to an address
in SRAM (0x02073000) when secondary CPUs come up. By default the
firmware puts a bunch of code at that location. That code expects the
kernel to fill in a few slots with addresses that it uses to jump back
to the kernel's entry point for secondary CPUs.
Originally (on prerelease hardware) this firmware code contained a
bunch of workarounds to deal with boot ROM bugs. However on all
shipped hardware we simply use this code to redirect to a kernel
function for bringing up the CPUs.
Let's stop relying on the code provided by the bootloader and just
plumb in our own (simple) code to jump to the kernel. This has the nice
benefit of fixing problems due to the fact that older bootloaders
(like the one shipped on the Samsung Chromebook 2) might have put
slightly different code into this location.
Once suspend/resume is implemented for systems using exynos-mcpm we'll
need to make sure we reinstall our fixed up code after resume. ...but
that's not anything new since IRAM (and thus the address of the
mcpm_entry_point) is lost across suspend/resume anyway.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
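To illustrate the redirect described in the commit message above, here is a minimal user-space sketch. It is not the patch itself. It assumes the stub is two ARM instructions followed by a literal word holding the kernel entry address. The function name install_entry_stub, the plain buffer standing in for SRAM, and any address passed to it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARM (A32) encodings used by the sketch. */
#define ARM_LDR_R0_PC_0 0xe59f0000u /* ldr r0, [pc, #0] */
#define ARM_BX_R0       0xe12fff10u /* bx  r0           */

/*
 * Write a three-word jump stub at 'sram' (hypothetical helper).
 *
 * In ARM state, reading pc yields the address of the current
 * instruction plus 8, so the ldr in word 0 loads word 2, and the
 * bx in word 1 then branches to the address stored there.
 */
static void install_entry_stub(volatile uint32_t *sram, uint32_t entry_phys)
{
	assert(sram != NULL);

	sram[0] = ARM_LDR_R0_PC_0; /* r0 = contents of word 2 */
	sram[1] = ARM_BX_R0;       /* jump to the address in r0 */
	sram[2] = entry_phys;      /* literal: physical entry address */
}
```

In a kernel the same three words would go to the SRAM location the firmware jumps to, written with the platform's I/O accessors rather than plain stores. As the message notes, they would have to be written again after resume because the SRAM contents are lost.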
Denis Carikli [Wed, 18 Jun 2014 12:56:56 +0000 (14:56 +0200)]
ARM: dts: imx51-eukrea-mbimxsd51-baseboard: unbreak esdhc.
The following commit:
89d7e5c mmc: sdhci-esdhc-imx: add runtime pm support
has the effect of also disabling the hardware card detect
in runtime pm.
We switch to GPIO based detection to avoid this issue.
This patch is based on:
ARM: dts: imx51-babbage: Fix esdhc setup
Signed-off-by: Denis Carikli <denis@eukrea.com>
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Sascha Hauer [Fri, 23 May 2014 12:33:04 +0000 (14:33 +0200)]
ARM: dts: imx51-babbage: Fix esdhc setup
Since commit 89d7e5c13122 (mmc: sdhci-esdhc-imx: add runtime pm
support), controller based card detection / write protection is no
longer supported by the esdhc driver. Let's use GPIO for CD/WP on esdhc1
instead.
While at it, fix the cd gpio polarity for esdhc2. The existing polarity is
wrong and currently only works because the imx esdhc driver ignores it.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>