Dave Chinner [Mon, 12 Oct 2015 07:38:11 +0000 (18:38 +1100)]
Merge branch 'xfs-io-fixes' into for-next
Dave Chinner [Mon, 12 Oct 2015 07:37:58 +0000 (18:37 +1100)]
Merge branch 'xfs-logging-fixes' into for-next
Eric Sandeen [Mon, 12 Oct 2015 07:21:22 +0000 (18:21 +1100)]
xfs: simplify /proc teardown & error handling
remove_proc_subtree() was added in 3.9, and can be
used to simplify our procfile creation error handling
and cleanup, removing the nested gotos. It simply
removes fs/xfs and everything created under it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Mon, 12 Oct 2015 07:21:22 +0000 (18:21 +1100)]
xfs: per-filesystem stats counter implementation
This patch modifies the stats counting macros and the callers
to those macros to properly increment, decrement, and add-to
the xfs stats counts. The counts for global and per-fs stats
are correctly advanced, and cleared by writing a "1" to the
corresponding clear file.
global counts: /sys/fs/xfs/stats/stats
per-fs counts: /sys/fs/xfs/sda*/stats/stats
global clear: /sys/fs/xfs/stats/stats_clear
per-fs clear: /sys/fs/xfs/sda*/stats/stats_clear
[dchinner: cleaned up macro variables, removed CONFIG_FS_PROC around
stats structures and macros. ]
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Mon, 12 Oct 2015 07:21:19 +0000 (18:21 +1100)]
xfs: per-filesystem stats in sysfs
This patch implements per-filesystem stats objects in sysfs. It
depends on the application of the previous patch series that
develops the infrastructure to support both xfs global stats and
xfs per-fs stats in sysfs.
Stats objects are instantiated when an xfs filesystem is mounted
and deleted on unmount. With this patch, the stats directory is
created and populated with the familiar stats and stats_clear files.
Example:
/sys/fs/xfs/sda9/stats/stats
/sys/fs/xfs/sda9/stats/stats_clear
With this patch, the individual counts within the new per-fs
stats file(s) remain at zero. Functions that use the the macros
to increment, decrement, and add-to the per-fs stats counts will
be covered in a separate new patch to follow this one. Note that
the counts within the global stats file (/sys/fs/xfs/stats/stats)
advance normally and can be cleared as it was prior to this patch.
[dchinner: move setup/teardown to xfs_fs_{fill|put}_super() so
it is down before/after any path that uses the per-mount stats. ]
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Eric Sandeen [Mon, 12 Oct 2015 05:04:15 +0000 (16:04 +1100)]
xfs: avoid null *src in memcpy call in xlog_write
The gcc undefined behavior sanitizer caught this; surely
any sane memcpy implementation will no-op if size == 0,
but behavior with a *src of NULL is technically undefined
(declared nonnull), so avoid it here.
We are actually in this situation frequently via
xlog_commit_record(), because:
struct xfs_log_iovec reg = {
.i_addr = NULL,
.i_len = 0,
.i_type = XLOG_REG_TYPE_COMMIT,
};
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 05:04:13 +0000 (16:04 +1100)]
xfs: pass total block res. as total xfs_bmapi_write() parameter
The total field from struct xfs_alloc_arg is a bit of an unknown
commodity. It is documented as the total block requirement for the
transaction and is used in this manner from most call sites by virtue of
passing the total block reservation of the transaction associated with
an allocation. Several xfs_bmapi_write() callers pass hardcoded values
of 0 or 1 for the total block requirement, which is a historical oddity
without any clear reasoning.
The xfs_iomap_write_direct() caller, for example, passes 0 for the total
block requirement. This has been determined to cause problems in the
form of ABBA deadlocks of AGF buffers due to incorrect AG selection in
the block allocator. Specifically, the xfs_alloc_space_available()
function incorrectly selects an AG that doesn't actually have sufficient
space for the allocation. This occurs because the args.total field is 0
and thus the remaining free space check on the AG doesn't actually
consider the size of the allocation request. This locks the AGF buffer,
the allocation attempt proceeds and ultimately fails (in
xfs_alloc_fix_minleft()), and xfs_alloc_vexent() moves on to the next
AG. In turn, this can lead to incorrect AG locking order (if the
allocator wraps around, attempting to lock AG 0 after acquiring AG N)
and thus deadlock if racing with another operation. This problem has
been reproduced via generic/299 on smallish (1GB) ramdisk test devices.
To avoid this problem, replace the undocumented hardcoded total
parameters from the iomap and utility callers to pass the block
reservation used for the associated transaction. This is consistent with
other xfs_bmapi_write() callers throughout XFS. The assumption is that
the total field allows the selection of an AG that can handle the entire
operation rather than simply the allocation/range being requested (e.g.,
resulting btree splits, etc.). This addresses the aforementioned
generic/299 hang by ensuring AG selection only occurs when the
allocation can be satisfied by the AG.
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 05:02:08 +0000 (16:02 +1100)]
xfs: add an xfs_zero_eof() tracepoint
Add a tracepoint in xfs_zero_eof() to facilitate tracking and debugging
EOF zeroing events. This has proven useful in the context of other
direct I/O tracepoints to ensure EOF zeroing occurs within appropriate
file ranges.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 05:02:05 +0000 (16:02 +1100)]
xfs: always drain dio before extending aio write submission
XFS supports and typically allows concurrent asynchronous direct I/O
submission to a single file. One exception to the rule is that file
extending dio writes that start beyond the current EOF (e.g.,
potentially create a hole at EOF) require exclusive I/O access to the
file. This is because such writes must zero any pre-existing blocks
beyond EOF that are exposed by virtue of now residing within EOF as a
result of the write about to be submitted.
Before EOF zeroing can occur, the current file i_size must be stabilized
to avoid data corruption. In this scenario, XFS upgrades the iolock to
exclude any further I/O submission, waits on in-flight I/O to complete
to ensure i_size is up to date (i_size is updated on dio write
completion) and restarts the various checks against the state of the
file. The problem is that this protection sequence is triggered only
when the iolock is currently held shared. While this is true for async
dio in most cases, the caller may upgrade the lock in advance based on
arbitrary circumstances with respect to EOF zeroing. For example, the
iolock is always acquired exclusively if the start offset is not block
aligned. This means that even though the iolock is already held
exclusive for such I/Os, pending I/O is not drained and thus EOF zeroing
can occur based on an unstable i_size.
This problem has been reproduced as guest data corruption in virtual
machines with file-backed qcow2 virtual disks hosted on an XFS
filesystem. The virtual disks must be configured with aio=native mode
and the must not be truncated out to the maximum file size (as some virt
managers will do).
Update xfs_file_aio_write_checks() to unconditionally drain in-flight
dio before EOF zeroing can occur. Rather than trigger the wait based on
iolock state, use a new flag and upgrade the iolock when necessary. Note
that this results in a full restart of the inode checks even when the
iolock was already held exclusive when technically it is only required
to recheck i_size. This should be a rare enough occurrence that it is
preferable to keep the code simple rather than create an alternate
restart jump target.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 04:59:25 +0000 (15:59 +1100)]
xfs: validate metadata LSNs against log on v5 superblocks
Since the onset of v5 superblocks, the LSN of the last modification has
been included in a variety of on-disk data structures. This LSN is used
to provide log recovery ordering guarantees (e.g., to ensure an older
log recovery item is not replayed over a newer target data structure).
While this works correctly from the point a filesystem is formatted and
mounted, userspace tools have some problematic behaviors that defeat
this mechanism. For example, xfs_repair historically zeroes out the log
unconditionally (regardless of whether corruption is detected). If this
occurs, the LSN of the filesystem is reset and the log is now in a
problematic state with respect to on-disk metadata structures that might
have a larger LSN. Until either the log catches up to the highest
previously used metadata LSN or each affected data structure is modified
and written out without incident (which resets the metadata LSN), log
recovery is susceptible to filesystem corruption.
This problem is ultimately addressed and repaired in the associated
userspace tools. The kernel is still responsible to detect the problem
and notify the user that something is wrong. Check the superblock LSN at
mount time and fail the mount if it is invalid. From that point on,
trigger verifier failure on any metadata I/O where an invalid LSN is
detected. This results in a filesystem shutdown and guarantees that we
do not log metadata changes with invalid LSNs on disk. Since this is a
known issue with a known recovery path, present a warning to instruct
the user how to recover.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 04:40:24 +0000 (15:40 +1100)]
xfs: log local to remote symlink conversions correctly on v5 supers
A local format symlink inode is converted to extent format when an
extended attribute is set on an inode as part of the attribute fork
creation. This means a block is allocated, the local symlink target name
is copied to the block and the block is logged. Currently,
xfs_bmap_local_to_extents() handles logging the remote block data based
on the size of the data fork prior to the conversion. This is not
correct on v5 superblock filesystems, which add an additional header to
remote symlink blocks that is nonexistent in local format inodes.
As a result, the full length of the remote symlink block content is not
logged. This can lead to corruption should a crash occur and log
recovery replay this transaction.
Since a callout is already used to initialize the new remote symlink
block, update the local-to-extents conversion mechanism to make the
callout also responsible for logging the block. It is already required
to set the log buffer type and format the block appropriately based on
the superblock version. This ensures the remote symlink is always logged
correctly. Note that xfs_bmap_local_to_extents() is only called for
symlinks so there are no other callouts that require modification.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Mon, 12 Oct 2015 04:34:20 +0000 (15:34 +1100)]
xfs: add missing ilock around dio write last extent alignment
The iomap codepath (via get_blocks()) acquires and release the inode
lock in the case of a direct write that requires block allocation. This
is because xfs_iomap_write_direct() allocates a transaction, which means
the ilock must be dropped and reacquired after the transaction is
allocated and reserved.
xfs_iomap_write_direct() invokes xfs_iomap_eof_align_last_fsb() before
the transaction is created and thus before the ilock is reacquired. This
can lead to calls to xfs_iread_extents() and reads of the in-core extent
list without any synchronization (via xfs_bmap_eof() and
xfs_bmap_last_extent()). xfs_iread_extents() assert fails if the ilock
is not held, but this is not currently seen in practice as the current
callers had already invoked xfs_bmapi_read().
What has been seen in practice are reports of crashes down in the
xfs_bmap_eof() codepath on direct writes due to seemingly bogus pointer
references from xfs_iext_get_ext(). While an explicit reproducer is not
currently available to confirm the cause of the problem, crash analysis
and code inspection from David Jeffrey had identified the insufficient
locking.
xfs_iomap_eof_align_last_fsb() is called from other contexts with the
inode lock already held, so we cannot acquire it therein.
__xfs_get_blocks() acquires and drops the ilock with variable flags to
cover the event that the extent list must be read in. The common case is
that __xfs_get_blocks() acquires the shared ilock. To provide locking
around the last extent alignment call without adding more lock cycles to
the dio path, update xfs_iomap_write_direct() to expect the shared ilock
held on entry and do the extent alignment under its protection. Demote
the lock, if necessary, from __xfs_get_blocks() and push the
xfs_qm_dqattach() call outside of the shared lock critical section.
Also, add an assert to document that the extent list is always expected
to be present in this path. Otherwise, we risk a call to
xfs_iread_extents() while under the shared ilock. This is safe as all
current callers have executed an xfs_bmapi_read() call under the current
iolock context.
Reported-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Zhaohongjiang [Mon, 12 Oct 2015 04:28:39 +0000 (15:28 +1100)]
cancel the setfilesize transation when io error happen
When I ran xfstest/073 case, the remount process was blocked to wait
transactions to be zero. I found there was a io error happened, and
the setfilesize transaction was not released properly. We should add
the changes to cancel the io error in this case.
Reproduction steps:
1. dd if=/dev/zero of=xfs1.img bs=1M count=2048
2. mkfs.xfs xfs1.img
3. losetup -f ./xfs1.img /dev/loop0
4. mount -t xfs /dev/loop0 /home/test_dir/
5. mkdir /home/test_dir/test
6. mkfs.xfs -dfile,name=image,size=2g
7. mount -t xfs -o loop image /home/test_dir/test
8. cp a file bigger than 2g to /home/test_dir/test
9. mount -t xfs -o remount,ro /home/test_dir/test
[ dchinner: moved io error detection to xfs_setfilesize_ioend() after
transaction context restoration. ]
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Sun, 11 Oct 2015 18:19:45 +0000 (05:19 +1100)]
xfs: pass xfsstats structures to handlers and macros
This patch is the next step toward per-fs xfs stats. The patch makes
the show and clear routines able to handle any stats structure
associated with a kobject.
Instead of a single global xfsstats structure, add kobject and a pointer
to a per-cpu struct xfsstats. Modify the macros that manipulate the stats
accordingly: XFS_STATS_INC, XFS_STATS_DEC, and XFS_STATS_ADD now access
xfsstats->xs_stats.
The sysfs functions need to get from the kobject back to the xfsstats
structure which contains it, and pass the pointer to the ->xs_stats
percpu structure into the show & clear routines.
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Sun, 11 Oct 2015 18:18:45 +0000 (05:18 +1100)]
xfs: consolidate sysfs ops
As a part of the series to move xfs global stats from procfs to sysfs,
this patch consolidates the sysfs ops functions and removes redundancy.
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Sun, 11 Oct 2015 18:17:45 +0000 (05:17 +1100)]
xfs: remove unused procfs code
As a part of the work to move xfs global stats from procfs to sysfs,
this patch removes the now unused procfs code that was xfs stat specific.
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Sun, 11 Oct 2015 18:16:45 +0000 (05:16 +1100)]
xfs: create symlink proc/fs/xfs/stat to sys/fs/xfs/stats
As a part of the work to move xfs global stats from procfs to sysfs,
this patch creates the symlink from proc/fs/xfs/stat to sys/fs/xfs/stats.
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Bill O'Donnell [Sun, 11 Oct 2015 18:15:45 +0000 (05:15 +1100)]
xfs: create global stats and stats_clear in sysfs
Currently, xfs global stats are in procfs. This patch introduces
(replicates) the global stats in sysfs. Additionally a stats_clear file
is introduced in sysfs.
Signed-off-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Linus Torvalds [Sun, 20 Sep 2015 21:32:34 +0000 (14:32 -0700)]
Linux 4.3-rc2
Linus Torvalds [Sun, 20 Sep 2015 04:05:02 +0000 (21:05 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Three fixes and a resulting cleanup for -rc2:
- Andre Przywara reported that he was seeing a warning with the new
cast inside DMA_ERROR_CODE's definition, and fixed the incorrect
use.
- Doug Anderson noticed that kgdb causes a "scheduling while atomic"
bug.
- OMAP5 folk noticed that their Thumb-2 compiled X servers crashed
when enabling support to cover ARMv6 CPUs due to a kernel bug
leaking some conditional context into the signal handler"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints
ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition
ARM: get rid of needless #if in signal handling code
ARM: fix Thumb2 signal handling when ARMv6 is enabled
Linus Torvalds [Sun, 20 Sep 2015 03:57:45 +0000 (20:57 -0700)]
Merge tag 'linux-kselftest-4.3-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"This update contains 7 fixes for problems ranging from build failurs
to incorrect error reporting"
* tag 'linux-kselftest-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: exec: revert to default emit rule
selftests: change install command to rsync
selftests: mqueue: simplify the Makefile
selftests: mqueue: allow extra cflags
selftests: rename jump label to static_keys
selftests/seccomp: add support for s390
seltests/zram: fix syntax error
Linus Torvalds [Sun, 20 Sep 2015 03:41:31 +0000 (20:41 -0700)]
Merge tag 'pm+acpi-4.3-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"Included are: a somewhat late devfreq update which however is mostly
fixes and cleanups with one new thing only (the PPMUv2 support on
Exynos5433), an ACPI cpufreq driver fixup and two ACPI core cleanups
related to preprocessor directives.
Specifics:
- Fix a memory allocation size in the devfreq core (Xiaolong Ye).
- Fix a mistake in the exynos-ppmu DT binding (Javier Martinez
Canillas).
- Add support for PPMUv2 ((Platform Performance Monitoring Unit
version 2.0) on the Exynos5433 SoCs (Chanwoo Choi).
- Fix a type casting bug in the Exynos PPMU code (MyungJoo Ham).
- Assorted devfreq code cleanups and optimizations (Javi Merino,
MyungJoo Ham, Viresh Kumar).
- Fix up the ACPI cpufreq driver to use a more lightweight way to get
to its private data in the ->get() callback (Rafael J Wysocki).
- Fix a CONFIG_ prefix bug in one of the ACPI drivers and make the
ACPI subsystem use IS_ENABLED() instead of #ifdefs in function
bodies (Sudeep Holla)"
* tag 'pm+acpi-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
ACPI: int340x_thermal: add missing CONFIG_ prefix
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
Linus Torvalds [Sun, 20 Sep 2015 03:17:40 +0000 (20:17 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A few driver fixes for tegra, rockchip, and st SoCs and a two-liner in
the framework to avoid oops when get_parent ops return out of range
values on tegra platforms"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
drivers: clk: st: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x
clk: check for invalid parent index of orphans in __clk_init()
clk: tegra: dfll: Properly protect OPP list
clk: rockchip: add critical clock for rk3368
Linus Torvalds [Sun, 20 Sep 2015 03:10:30 +0000 (20:10 -0700)]
Merge tag 'led-fixes-for-v4.3-rc2' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED fixes from Jacek Anaszewski:
- fix module autoload for six OF platform drivers (aat1290, bcm6328,
bcm6358, ktd2692, max77693, ns2)
- aat1290: add missing static modifier
- ipaq-micro: add missing LEDS_CLASS dependency
- lp55xx: correct Kconfig dependecy for f/w user helper
* tag 'led-fixes-for-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds:lp55xx: Correct Kconfig dependency for f/w user helper
leds: leds-ipaq-micro: Add LEDS_CLASS dependency
leds: aat1290: add 'static' modifier to init_mm_current_scale
leds: leds-ns2: Fix module autoload for OF platform driver
leds: max77693: Fix module autoload for OF platform driver
leds: ktd2692: Fix module autoload for OF platform driver
leds: bcm6358: Fix module autoload for OF platform driver
leds: bcm6328: Fix module autoload for OF platform driver
leds: aat1290: Fix module autoload for OF platform driver
Linus Torvalds [Sun, 20 Sep 2015 03:04:11 +0000 (20:04 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"The new hfi1 driver in staging/rdma has had a number of fixup patches
since being added to the tree. This is the first batch of those fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hfi: Properly set permissions for user device files
IB/hfi1: mask vs shift confusion
IB/hfi1: clean up some defines
IB/hfi1: info leak in get_ctxt_info()
IB/hfi1: fix a locking bug
IB/hfi1: checking for NULL instead of IS_ERR
IB/hfi1: fix sdma_descq_cnt parameter parsing
IB/hfi1: fix copy_to/from_user() error handling
IB/hfi1: fix pstateinfo from returning improperly byteswapped value
Linus Torvalds [Sun, 20 Sep 2015 02:13:03 +0000 (19:13 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
- a boot regression (since v4.2) fix for some ARM configurations from
Tyler
- regression (since v4.1) fixes for mkfs.xfs on a DAX enabled device
from Jeff. These are tagged for -stable.
- a pair of locking fixes from Axel that are hidden from lockdep since
they involve device_lock(). The "btt" one is tagged for -stable, the
other only applies to the new "pfn" mechanism in v4.3.
- a fix for the pmem ->rw_page() path to use wmb_pmem() from Ross.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
mm: fix type cast in __pfn_to_phys()
pmem: add proper fencing to pmem_rw_page()
libnvdimm: pfn_devs: Fix locking in namespace_store
libnvdimm: btt_devs: Fix locking in namespace_store
blockdev: don't set S_DAX for misaligned partitions
dax: fix O_DIRECT I/O to the last block of a blockdev
Linus Torvalds [Sun, 20 Sep 2015 01:57:09 +0000 (18:57 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"This is a bit bigger than it should be, but I could (did) not want to
send it off last week due to both wanting extra testing, and expecting
a fix for the bounce regression as well. In any case, this contains:
- Fix for the blk-merge.c compilation warning on gcc 5.x from me.
- A set of back/front SG gap merge fixes, from me and from Sagi.
This ensures that we honor SG gapping for integrity payloads as
well.
- Two small fixes for null_blk from Matias, fixing a leak and a
capacity propagation issue.
- A blkcg fix from Tejun, fixing a NULL dereference.
- A fast clone optimization from Ming, fixing a performance
regression since the arbitrarily sized bio's were introduced.
- Also from Ming, a regression fix for bouncing IOs"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix bounce_end_io
block: blk-merge: fast-clone bio when splitting rw bios
block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg
block: Copy a user iovec if it includes gaps
block: Refuse adding appending a gapped integrity page to a bio
block: Refuse request/bio merges with gaps in the integrity payload
block: Check for gaps on front and back merges
null_blk: fix wrong capacity when bs is not 512 bytes
null_blk: fix memory leak on cleanup
block: fix bogus compiler warnings in blk-merge.c
Chris Mason [Fri, 18 Sep 2015 17:35:08 +0000 (13:35 -0400)]
fs-writeback: unplug before cond_resched in writeback_sb_inodes
Commit
505a666ee3fc ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during writeback_sb_inodes,
which increases the merge rate when relatively contiguous small files
are written by the filesystem. It helps both on flash and spindles.
For an fs_mark workload creating 4K files in parallel across 8 drives,
this commit improves performance ~9% more by unplugging before calling
cond_resched(). cond_resched() doesn't trigger an implicit unplug, so
explicitly getting the IO down to the device before scheduling reduces
latencies for anyone waiting on clean pages.
It also cuts down on how often we use kblockd to unplug, which means
less work bouncing from one workqueue to another.
Many more details about how we got here:
https://lkml.org/lkml/2015/9/11/570
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tyler Baker [Sat, 19 Sep 2015 07:58:10 +0000 (03:58 -0400)]
mm: fix type cast in __pfn_to_phys()
The various definitions of __pfn_to_phys() have been consolidated to
use a generic macro in include/asm-generic/memory_model.h. This hit
mainline in the form of
012dcef3f058 "mm: move __phys_to_pfn and
__pfn_to_phys to asm/generic/memory_model.h". When the generic macro
was implemented the type cast to phys_addr_t was dropped which caused
boot regressions on ARM platforms with more than 4GB of memory and
LPAE enabled.
It was suggested to use PFN_PHYS() defined in include/linux/pfn.h
as provides the correct logic and avoids further duplication.
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Rafael J. Wysocki [Fri, 18 Sep 2015 21:07:46 +0000 (23:07 +0200)]
Merge branch 'acpi-bus'
* acpi-bus:
ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
ACPI: int340x_thermal: add missing CONFIG_ prefix
Rafael J. Wysocki [Fri, 18 Sep 2015 21:05:28 +0000 (23:05 +0200)]
Merge branches 'pm-cpufreq' and 'pm-devfreq'
* pm-cpufreq:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
* pm-devfreq:
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
Linus Torvalds [Fri, 18 Sep 2015 16:28:20 +0000 (09:28 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fixes and cleanups from Michael Tsirkin:
"This fixes the virtio-test tool, and improves the error handling for
virtio-ccw"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: handle failures of READ_VQ_CONF ccw
tools/virtio: propagate V=X to kernel build
vhost: move features to core
tools/virtio: fix build after 4.2 changes
Linus Torvalds [Fri, 18 Sep 2015 16:23:08 +0000 (09:23 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Mostly stable material, a lot of ARM fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
sched: access local runqueue directly in single_task_running
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
arm64: KVM: Remove all traces of the ThumbEE registers
arm: KVM: Disable virtual timer even if the guest is not using it
arm64: KVM: Disable virtual timer even if the guest is not using it
arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources
KVM: s390: Replace incorrect atomic_or with atomic_andnot
arm: KVM: Fix incorrect device to IPA mapping
arm64: KVM: Fix user access for debug registers
KVM: vmx: fix VPID is 0000H in non-root operation
KVM: add halt_attempted_poll to VCPU stats
kvm: fix zero length mmio searching
kvm: fix double free for fast mmio eventfd
kvm: factor out core eventfd assign/deassign logic
kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd
KVM: make the declaration of functions within 80 characters
KVM: arm64: add workaround for Cortex-A57 erratum #852523
KVM: fix polling for guest halt continued even if disable it
arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores
arm64: KVM: set {v,}TCR_EL2 RES1 bits
...
Ira Weiny [Thu, 17 Sep 2015 17:47:49 +0000 (13:47 -0400)]
IB/hfi: Properly set permissions for user device files
Some of the device files are required to be user accessible for PSM while
most should remain accessible only by root.
Add a parameter to hfi1_cdev_init which controls if the user should have access
to this device which places it in a different class with the appropriate
devnode callback.
In addition set the devnode call back for the existing class to be a bit more
explicit for those permissions.
Finally remove the unnecessary null check before class_destroy
Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Haralanov, Mitko (mitko.haralanov@intel.com)
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Wed, 16 Sep 2015 16:03:45 +0000 (19:03 +0300)]
IB/hfi1: mask vs shift confusion
We are shifting by the _MASK macros instead of the _SHIFT ones.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Wed, 16 Sep 2015 16:02:54 +0000 (19:02 +0300)]
IB/hfi1: clean up some defines
I added spaces around operators so it matches kernel style because
normally "-1ULL" is a number and " - 1" is a subtract operation. Also
removed some superflous "ULL" types so "1ULL" becomes "1".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Wed, 16 Sep 2015 06:42:25 +0000 (09:42 +0300)]
IB/hfi1: info leak in get_ctxt_info()
The cinfo struct has a hole after the last struct member so we need to
zero it out. Otherwise we disclose some uninitialized stack data.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Wed, 16 Sep 2015 06:22:51 +0000 (09:22 +0300)]
IB/hfi1: fix a locking bug
mutex_trylock() returns zero on failure, not EBUSY.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Wed, 16 Sep 2015 06:22:20 +0000 (09:22 +0300)]
IB/hfi1: checking for NULL instead of IS_ERR
__get_txreq() returns an ERR_PTR() but this checks for NULL so it would
oops on failure.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Tue, 15 Sep 2015 14:19:27 +0000 (10:19 -0400)]
IB/hfi1: fix sdma_descq_cnt parameter parsing
The boolean tests should have been or-ed.
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Tue, 15 Sep 2015 10:35:25 +0000 (13:35 +0300)]
IB/hfi1: fix copy_to/from_user() error handling
copy_to/from_user() returns the number of bytes which we were not able
to copy. It doesn't return an error code.
Also a couple places had a printk() on error and I removed that because
people can take advantage of it to fill /var/log/messages with spam.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ira Weiny [Wed, 9 Sep 2015 05:28:21 +0000 (01:28 -0400)]
IB/hfi1: fix pstateinfo from returning improperly byteswapped value
Byteswap link_width_downgrade_*_active values before sending on the wire. In
addition properly define the Port State Info structure.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Christian Gomez <christian.gomez@intel.com>
Signed-off-by: Rimmer, Todd <todd.rimmer@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Linus Torvalds [Fri, 18 Sep 2015 15:11:42 +0000 (08:11 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"This is a rather large update post rc1 due to the final steps of
cleanups and API changes which had to wait for the preparatory patches
to hit your tree.
- Regression fixes for ARM GIC irqchips
- Regression fixes and lockdep anotations for renesas irq chips
- The leftovers of the cleanup and preparatory patches which have
been ignored by maintainers
- Final conversions of the newly merged users of obsolete APIs
- Final removal of obsolete APIs
- Final removal of ARM artifacts which had been introduced during the
conversion of ARM to the generic interrupt code.
- Final split of the irq_data into chip specific and common data to
reflect the needs of hierarchical irq domains.
- Treewide removal of the first argument of interrupt flow handlers,
i.e. the irq number, which is not used by the majority of handlers
and simple to retrieve from the other argument the irq descriptor.
- A few comment updates and build warning fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
arm64: Remove ununsed set_irq_flags
ARM: Remove ununsed set_irq_flags
sh: Kill off set_irq_flags usage
irqchip: Kill off set_irq_flags usage
gpu/drm: Kill off set_irq_flags usage
genirq: Remove irq argument from irq flow handlers
genirq: Move field 'msi_desc' from irq_data into irq_common_data
genirq: Move field 'affinity' from irq_data into irq_common_data
genirq: Move field 'handler_data' from irq_data into irq_common_data
genirq: Move field 'node' from irq_data into irq_common_data
irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
genirq: Simplify irq_data_to_desc()
genirq: Remove __irq_set_handler_locked()
pinctrl/pistachio: Use irq_set_handler_locked
gpio: vf610: Use irq_set_handler_locked
powerpc/mpc8xx: Use irq_set_handler_locked()
powerpc/ipic: Use irq_set_handler_locked()
powerpc/cpm2: Use irq_set_handler_locked()
...
Linus Torvalds [Fri, 18 Sep 2015 15:06:28 +0000 (08:06 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"A single regression fix for the x86 dma allocator which got wreckaged
in the merge window"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pci/dma: Fix gfp flags for coherent DMA memory allocation
Linus Torvalds [Fri, 18 Sep 2015 15:01:06 +0000 (08:01 -0700)]
Merge tag 'powerpc-4.3-2' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix 32-bit TCE table init in kdump kernel from Nish
- Fix kdump with non-power-of-2 crashkernel= from Nish
- Abort cxl_pci_enable_device_hook() if PCI channel is offline from
Andrew
- Fix to release DRC when configure_connector() fails from Bharata
- Wire up sys_userfaultfd()
- Fix race condition in tearing down MSI interrupts from Paul
- Fix unbalanced pci_dev_get() in cxl_probe() from Daniel
- Fix cxl build failure due to -Wunused-variable gcc behaviour change
from Ian
- Tell the toolchain to use ABI v2 when building an LE boot wrapper
from Benh
- Fix THP to recompute hash value after a failed update from Aneesh
- 32-bit memcpy/memset: only use dcbz once cache is enabled from
Christophe
* tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc32: memset: only use dcbz once cache is enabled
powerpc32: memcpy: only use dcbz once cache is enabled
powerpc/mm: Recompute hash value after a failed update
powerpc/boot: Specify ABI v2 when building an LE boot wrapper
cxl: Fix build failure due to -Wunused-variable behaviour change
cxl: Fix unbalanced pci_dev_get in cxl_probe
powerpc/MSI: Fix race condition in tearing down MSI interrupts
powerpc: Wire up sys_userfaultfd()
powerpc/pseries: Release DRC when configure_connector fails
cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline
powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=
powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel
Dominik Dingel [Fri, 18 Sep 2015 09:27:45 +0000 (11:27 +0200)]
sched: access local runqueue directly in single_task_running
Commit
2ee507c47293 ("sched: Add function single_task_running to let a task
check if it is the only task running on a cpu") referenced the current
runqueue with the smp_processor_id. When CONFIG_DEBUG_PREEMPT is enabled,
that is only allowed if preemption is disabled or the currrent task is
bound to the local cpu (e.g. kernel worker).
With commit
f78195129963 ("kvm: add halt_poll_ns module parameter") KVM
calls single_task_running. If CONFIG_DEBUG_PREEMPT is enabled that
generates a lot of kernel messages.
To avoid adding preemption in that cases, as it would limit the usefulness,
we change single_task_running to access directly the cpu local runqueue.
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Fixes:
2ee507c472939db4b146d545352b8a7c79ef47f8
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Fri, 18 Sep 2015 04:41:02 +0000 (21:41 -0700)]
Merge tag 'platform-drivers-x86-v4.3-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
"Fix an issue introduced by the previous major toshiba rework. Add a
quirk. Workaround a few platform specific firmware items. One
cleanup to wmi I inadvertently dropped from a previous pull request.
Details:
hp-wmi:
- limit hotkey enable
toshiba_acpi:
- Fix hotkeys registration on some toshiba models
- Fix USB Sleep and Music always disabled
wmi:
- Remove private %pUL implementation
asus-nb-wmi:
- Add wapf=4 quirk for X456UA/X456UF"
* tag 'platform-drivers-x86-v4.3-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
hp-wmi: limit hotkey enable
toshiba_acpi: Fix hotkeys registration on some toshiba models
toshiba_acpi: Fix USB Sleep and Music always disabled
wmi: Remove private %pUL implementation
asus-nb-wmi: Add wapf=4 quirk for X456UA/X456UF
Linus Torvalds [Fri, 18 Sep 2015 04:16:47 +0000 (21:16 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from ANdrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
revert "mm: make sure all file VMAs have ->vm_ops set"
MAINTAINERS: update LTP mailing list
userfaultfd: add missing mmput() in error path
lib/string_helpers.c: fix infinite loop in string_get_size()
alpha: lib: export __delay
alpha: io: define ioremap_uc
kasan: fix last shadow judgement in memory_is_poisoned_16()
zram: fix possible use after free in zcomp_create()
Andrew Morton [Thu, 17 Sep 2015 23:02:00 +0000 (16:02 -0700)]
revert "mm: make sure all file VMAs have ->vm_ops set"
Revert commit
6dc296e7df4c "mm: make sure all file VMAs have ->vm_ops
set".
Will Deacon reports that it "causes some mmap regressions in LTP, which
appears to use a MAP_PRIVATE mmap of /dev/zero as a way to get anonymous
pages in some of its tests (specifically mmap10 [1])".
William Shuman reports Oracle crashes.
So revert the patch while we work out what to do.
Reported-by: William Shuman <wshuman3@gmail.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyril Hrubis [Thu, 17 Sep 2015 23:01:57 +0000 (16:01 -0700)]
MAINTAINERS: update LTP mailing list
[akpm@linux-foundation.org: Wanlong Gao has moved]
Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Stanislav Kholmanskikh <stanislav.kholmanskikh@oracle.com>
Cc: Alexey Kodanev <alexey.kodanev@oracle.com>
Cc: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Thu, 17 Sep 2015 23:01:54 +0000 (16:01 -0700)]
userfaultfd: add missing mmput() in error path
This fixes a memleak if anon_inode_getfile() fails in userfaultfd().
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vitaly Kuznetsov [Thu, 17 Sep 2015 23:01:51 +0000 (16:01 -0700)]
lib/string_helpers.c: fix infinite loop in string_get_size()
Some string_get_size() calls (e.g.:
string_get_size(1, 512, STRING_UNITS_10, ..., ...)
string_get_size(15, 64, STRING_UNITS_10, ..., ...)
) result in an infinite loop. The problem is that if size is equal to
divisor[units]/blk_size and is smaller than divisor[units] we'll end
up with size == 0 when we start doing sf_cap calculations:
For string_get_size(1, 512, STRING_UNITS_10, ..., ...) case:
...
remainder = do_div(size, divisor[units]); -> size is 0, remainder is 1
remainder *= blk_size; -> remainder is 512
...
size *= blk_size; -> size is still 0
size += remainder / divisor[units]; -> size is still 0
The caller causing the issue is sd_read_capacity(), the problem was
noticed on Hyper-V, such weird size was reported by host when scanning
collides with device removal. This is probably a separate issue worth
fixing, this patch is intended to prevent the library routine from
infinite looping.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: James Bottomley <JBottomley@Odin.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sudip Mukherjee [Thu, 17 Sep 2015 23:01:49 +0000 (16:01 -0700)]
alpha: lib: export __delay
__delay was not exported as a result while building with allmodconfig we
were getting build error of undefined symbol. __delay is being used by:
drivers/net/phy/mdio-octeon.c
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sudip Mukherjee [Thu, 17 Sep 2015 23:01:46 +0000 (16:01 -0700)]
alpha: io: define ioremap_uc
ioremap_uc was not defined and as a result while building with
allmodconfig were getting build error of: implicit declaration of
function 'ioremap_uc'.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xishi Qiu [Thu, 17 Sep 2015 23:01:43 +0000 (16:01 -0700)]
kasan: fix last shadow judgement in memory_is_poisoned_16()
The shadow which correspond 16 bytes memory may span 2 or 3 bytes. If
the memory is aligned on 8, then the shadow takes only 2 bytes. So we
check "shadow_first_bytes" is enough, and need not to call
"memory_is_poisoned_1(addr + 15);". But the code "if
(likely(!last_byte))" is wrong judgement.
e.g. addr=0, so last_byte = 15 & KASAN_SHADOW_MASK = 7, then the code
will continue to call "memory_is_poisoned_1(addr + 15);"
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.cz>
Cc: <zhongjiang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Luis Henriques [Thu, 17 Sep 2015 23:01:40 +0000 (16:01 -0700)]
zram: fix possible use after free in zcomp_create()
zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
through comp->stream, which can potentially be pointing to memory that
was freed if these functions returned an error.
While at it, replace a 'ERR_PTR(-ENOMEM)' by a more generic
'ERR_PTR(error)' as in the future zcomp_strm_{multi,siggle}_create()
could return other error codes. Function documentation updated
accordingly.
Fixes:
beca3ec71fe5 ("zram: add multi stream functionality")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kyle Evans [Fri, 11 Sep 2015 15:40:17 +0000 (10:40 -0500)]
hp-wmi: limit hotkey enable
Do not write initialize magic on systems that do not have
feature query 0xb. Fixes Bug #82451.
Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd
for code clearity.
Add a new test function, hp_wmi_bios_2008_later() & simplify
hp_wmi_bios_2009_later(), which fixes a bug in cases where
an improper value is returned. Probably also fixes Bug #69131.
Add missing __init tag.
Signed-off-by: Kyle Evans <kvans32@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Linus Torvalds [Thu, 17 Sep 2015 19:32:40 +0000 (12:32 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"These are both fixes to the new and improved keepalive2 behavior"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: advertise support for keepalive2
libceph: don't access invalid memory in keepalive2 path
Linus Torvalds [Thu, 17 Sep 2015 19:25:42 +0000 (12:25 -0700)]
Merge tag 'for-v4.3-rc' of git://git./linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
"twl4030-charger fixes"
* tag 'for-v4.3-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
twl4030_charger: fix another compile error
Revert "twl4030_charger: correctly handle -EPROBE_DEFER from devm_usb_get_phy_by_node"
Gabriel Fernandez [Wed, 16 Sep 2015 07:42:59 +0000 (09:42 +0200)]
drivers: clk: st: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x
Use a generic name for this kind of PLL
Correction in dts files are already done here:
commit
5eb26c605909 ("ARM: STi: DT: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x")
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Linus Torvalds [Thu, 17 Sep 2015 18:28:17 +0000 (11:28 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"This addresses some problems with filesystem writeback due to the
recently merged hardware DBM patches, which caused us to treat some
read-only pages as dirty.
There are also some other, less significant fixes that are described
in the summary below:
A mixture of fixes for regressions introduced during the merge window,
some longer standing problems that we spotted and a couple of hardware
errata. The main changes are:
- Fix fallout from the h/w DBM patches, causing filesystem writeback
issues on both v8 and v8.1 CPUs
- Workaround for Cortex-A53 erratum #843419 in the module loader
- Fix for long-standing issue with compat big-endian signal handlers
using the saved floating point state"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: errata: add module build workaround for erratum #843419
arm64: compat: fix vfp save/restore across signal handlers in big-endian
arm64: cpu hotplug: ensure we mask out CPU_TASKS_FROZEN in notifiers
arm64: head.S: initialise mdcr_el2 in el2_setup
arm64: enable generic idle loop
arm64: pgtable: use a single bit for PTE_WRITE regardless of DBM
arm64: Fix pte_modify() to preserve the hardware dirty information
arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled
arm64: dma-mapping: check whether cma area is initialized or not
Linus Torvalds [Thu, 17 Sep 2015 18:01:34 +0000 (11:01 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- misc fixes all around the map
- block non-root vm86(old) if mmap_min_addr != 0
- two small debuggability improvements
- removal of obsolete paravirt op
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform: Fix Geode LX timekeeping in the generic x86 build
x86/apic: Serialize LVTT and TSC_DEADLINE writes
x86/ioapic: Force affinity setting in setup_ioapic_dest()
x86/paravirt: Remove the unused pv_time_ops::get_tsc_khz method
x86/ldt: Fix small LDT allocation for Xen
x86/vm86: Fix the misleading CONFIG_VM86 Kconfig help text
x86/cpu: Print family/model/stepping in hex
x86/vm86: Block non-root vm86(old) if mmap_min_addr != 0
x86/alternatives: Make optimize_nops() interrupt safe and synced
x86/mm/srat: Print non-volatile flag in SRAT
x86/cpufeatures: Enable cpuid for Intel SHA extensions
Linus Torvalds [Thu, 17 Sep 2015 17:55:25 +0000 (10:55 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"A fix for an abs()/abs64() bug that caused too slow NTP convergence on
32-bit kernels, plus a removal of an obsolete clockevents driver
facility after all users got converted during the merge window"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Remove unused set_mode() callback
time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()
Linus Torvalds [Thu, 17 Sep 2015 17:49:42 +0000 (10:49 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A migrate_tasks() locking fix, and a late-coming nohz change plus a
nohz debug check"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: 'Annotate' migrate_tasks()
nohz: Assert existing housekeepers when nohz full enabled
nohz: Affine unpinned timers to housekeepers
Linus Torvalds [Thu, 17 Sep 2015 17:37:46 +0000 (10:37 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo MOlnar:
"Mostly tooling fixes, but also two x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tests: Fix software clock events test setting maps
perf tests: Fix task exit test setting maps
perf evlist: Fix create_syswide_maps() not propagating maps
perf evlist: Fix add() not propagating maps
perf evlist: Factor out a function to propagate maps for a single evsel
perf evlist: Make create_maps() use set_maps()
perf evlist: Make set_maps() more resilient
perf evsel: Add own_cpus member
perf evlist: Fix missing thread_map__put in propagate_maps()
perf evlist: Fix splice_list_tail() not setting evlist
perf evlist: Add has_user_cpus member
perf evlist: Remove redundant validation from propagate_maps()
perf evlist: Simplify set_maps() logic
perf evlist: Simplify propagate_maps() logic
perf top: Fix segfault pressing -> with no hist entries
perf header: Fixup reading of HEADER_NRCPUS feature
perf/x86/intel: Fix constraint access
perf/x86/intel/bts: Set event->hw.itrace_started in pmu::start to match the new logic
perf tools: Fix use of wrong event when processing exit events
perf tools: Fix parse_events_add_pmu caller
Ilya Dryomov [Mon, 14 Sep 2015 09:44:22 +0000 (12:44 +0300)]
libceph: advertise support for keepalive2
We are the client, but advertise keepalive2 anyway - for consistency,
if nothing else. In the future the server might want to know whether
its clients support keepalive2.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Ilya Dryomov [Mon, 14 Sep 2015 13:01:05 +0000 (16:01 +0300)]
libceph: don't access invalid memory in keepalive2 path
This
struct ceph_timespec ceph_ts;
...
con_out_kvec_add(con, sizeof(ceph_ts), &ceph_ts);
wraps ceph_ts into a kvec and adds it to con->out_kvec array, yet
ceph_ts becomes invalid on return from prepare_write_keepalive(). As
a result, we send out bogus keepalive2 stamps. Fix this by encoding
into a ceph_timespec member, similar to how acks are read and written.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Ming Lei [Thu, 17 Sep 2015 16:06:28 +0000 (00:06 +0800)]
block: fix bounce_end_io
When bio bounce is involved, one new bio and its biovecs are
cloned from the comming bio, which can be one fast-cloned bio
from upper layer(such as dm).
So it is obviously wrong to assume the start index of the coming(
original) bio's io vector is zero, which can be any value between
0 and (bi_max_vecs - 1), especially in case of bio split.
This patch fixes Fedora's booting oops on i386, often with the
following kernel log together:
> [ 9.026738] systemd[1]: Switching root.
> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1
> (systemd).
> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> [ 9.083989] page:
f3d32ae0 count:0 mapcount:0 mapping:
f2252178
> index:0x16a
> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> [ 9.087284] page dumped because: page still charged to cgroup
> [ 9.088772] bad because of flags:
> [ 9.089731] flags: 0x21(locked|lru)
> [ 9.090818] page->mem_cgroup:
f2c3e400
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Tested-by: Adam Williamson <awilliam@redhat.com>
Cc: Ming Lin <mlin@kernel.org>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Thu, 17 Sep 2015 15:58:38 +0000 (09:58 -0600)]
block: blk-merge: fast-clone bio when splitting rw bios
biovecs has become immutable since v3.13, so it isn't necessary
to allocate biovecs for the new cloned bios, then we can save
one extra biovecs allocation/copy, and the allocation is often
not fixed-length and a bit more expensive.
For example, if the 'max_sectors_kb' of null blk's queue is set
as 16(32 sectors) via sysfs just for making more splits, this patch
can increase throught about ~70% in the sequential read test over
null_blk(direct io, bs: 1M).
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Ming Lin <ming.l@ssi.samsung.com>
Cc: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
This fixes a performance regression introduced by commit
54efd50bfd,
and allows us to take full advantage of the fact that we have immutable
bio_vecs. Hand applied, as it rejected violently with commit
5014c311baa2.
Signed-off-by: Jens Axboe <axboe@fb.com>
Ross Zwisler [Wed, 16 Sep 2015 20:52:21 +0000 (14:52 -0600)]
pmem: add proper fencing to pmem_rw_page()
pmem_rw_page() needs to call wmb_pmem() on writes to make sure that the
newly written data is durable. This flow was added to pmem_rw_bytes()
and pmem_make_request() with this commit:
commit
61031952f4c8 ("arch, x86: pmem api for ensuring durability of
persistent memory updates")
...the pmem_rw_page() path was missed.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Axel Lin [Wed, 16 Sep 2015 13:25:38 +0000 (21:25 +0800)]
libnvdimm: pfn_devs: Fix locking in namespace_store
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Torvalds [Thu, 17 Sep 2015 15:45:23 +0000 (08:45 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Spinlock performance regression fix, plus documentation fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_keys: Fix up the static keys documentation
locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support
locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
locking/static_keys: Fix a silly typo
Linus Torvalds [Thu, 17 Sep 2015 15:44:27 +0000 (08:44 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RCU fix from Ingo Molnar:
"Fix a false positive warning"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
security/device_cgroup: Fix RCU_LOCKDEP_WARN() condition
Axel Lin [Wed, 16 Sep 2015 13:24:47 +0000 (21:24 +0800)]
libnvdimm: btt_devs: Fix locking in namespace_store
Always take device_lock() before nvdimm_bus_lock() to prevent deadlock.
Cc: <stable@vger.kernel.org>
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Paolo Bonzini [Thu, 17 Sep 2015 14:51:59 +0000 (16:51 +0200)]
Merge tag 'kvm-arm-for-4.3-rc2-2' of git://git./linux/kernel/git/kvmarm/kvmarm into kvm-master
Second set of KVM/ARM changes for 4.3-rc2
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
Junichi Nomura [Mon, 14 Sep 2015 07:38:36 +0000 (07:38 +0000)]
x86/pci/dma: Fix gfp flags for coherent DMA memory allocation
Commit
6894258eda2f reversed the order of gfp_flags adjustment in
dma_alloc_attrs() for x86 [arch/x86/kernel/pci-dma.c] As a result,
relevant flags set by dma_alloc_coherent_gfp_flags() are just
discarded and cause coherent DMA memory allocation failure on some
devices.
Fixes:
6894258eda2f ("dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20150914073834.GA13077@xzibit.linux.bs1.fc.nec.co.jp
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ming Lei [Wed, 2 Sep 2015 06:31:21 +0000 (14:31 +0800)]
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Will Deacon [Tue, 15 Sep 2015 16:15:33 +0000 (17:15 +0100)]
arm64: KVM: Remove all traces of the ThumbEE registers
Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.
Whilst this breaks migrating a guest started on a previous version of
the kernel, it is much better to kill these (non existent) registers
as soon as possible.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[maz: added commend about migration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Marc Zyngier [Wed, 16 Sep 2015 15:18:59 +0000 (16:18 +0100)]
arm: KVM: Disable virtual timer even if the guest is not using it
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Marc Zyngier [Wed, 16 Sep 2015 15:18:59 +0000 (16:18 +0100)]
arm64: KVM: Disable virtual timer even if the guest is not using it
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Cc: stable@vger.kernel.org
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Will Deacon [Tue, 17 Mar 2015 12:15:02 +0000 (12:15 +0000)]
arm64: errata: add module build workaround for erratum #843419
Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
lead to a memory access using an incorrect address in certain sequences
headed by an ADRP instruction.
There is a linker fix to generate veneers for ADRP instructions, but
this doesn't work for kernel modules which are built as unlinked ELF
objects.
This patch adds a new config option for the erratum which, when enabled,
builds kernel modules with the mcmodel=large flag. This uses absolute
addressing for all kernel symbols, thereby removing the use of ADRP as
a PC-relative form of addressing. The ADRP relocs are removed from the
module loader so that we fail to load any potentially affected modules.
Cc: <stable@vger.kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 15 Sep 2015 11:07:06 +0000 (12:07 +0100)]
arm64: compat: fix vfp save/restore across signal handlers in big-endian
When saving/restoring the VFP registers from a compat (AArch32)
signal frame, we rely on the compat registers forming a prefix of the
native register file and therefore make use of copy_{to,from}_user to
transfer between the native fpsimd_state and the compat_vfp_sigframe.
Unfortunately, this doesn't work so well in a big-endian environment.
Our fpsimd save/restore code operates directly on 128-bit quantities
(Q registers) whereas the compat_vfp_sigframe represents the registers
as an array of 64-bit (D) registers. The architecture packs the compat D
registers into the Q registers, with the least significant bytes holding
the lower register. Consequently, we need to swap the 64-bit halves when
converting between these two representations on a big-endian machine.
This patch replaces the __copy_{to,from}_user invocations in our
compat VFP signal handling code with explicit __put_user loops that
operate on 64-bit values and swap them accordingly.
Cc: <stable@vger.kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 11 Sep 2015 14:31:24 +0000 (15:31 +0100)]
arm64: cpu hotplug: ensure we mask out CPU_TASKS_FROZEN in notifiers
We have a couple of CPU hotplug notifiers for resetting the CPU debug
state to a sane value when a CPU comes online.
This patch ensures that we mask out CPU_TASKS_FROZEN so that we don't
miss any online events occuring due to suspend/resume.
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Takashi Iwai [Mon, 7 Sep 2015 12:25:01 +0000 (14:25 +0200)]
leds:lp55xx: Correct Kconfig dependency for f/w user helper
The commit [
b67893206fc0: leds:lp55xx: fix firmware loading error]
tries to address the firmware file handling with user helper, but it
sets a wrong Kconfig CONFIG_FW_LOADER_USER_HELPER_FALLBACK. Since the
wrong option was enabled, the system got a regression -- it suffers
from the unexpected long delays for non-present firmware files.
This patch corrects the Kconfig dependency to the right one,
CONFIG_FW_LOADER_USER_HELPER. This doesn't change the fallback
behavior but only enables UMH when needed.
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=944661
Fixes:
b67893206fc0 ('leds:lp55xx: fix firmware loading error')
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Jacek Anaszewski [Mon, 7 Sep 2015 15:06:05 +0000 (17:06 +0200)]
leds: leds-ipaq-micro: Add LEDS_CLASS dependency
Fix missing Kconfig LEDS_CLASS dependency.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Jacek Anaszewski [Fri, 4 Sep 2015 10:27:09 +0000 (12:27 +0200)]
leds: aat1290: add 'static' modifier to init_mm_current_scale
Function init_mm_current_scale is used only locally. Make it static then.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:36:59 +0000 (23:36 +0200)]
leds: leds-ns2: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:36:41 +0000 (23:36 +0200)]
leds: max77693: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:36:15 +0000 (23:36 +0200)]
leds: ktd2692: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:35:55 +0000 (23:35 +0200)]
leds: bcm6358: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:35:38 +0000 (23:35 +0200)]
leds: bcm6328: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Luis de Bethencourt [Tue, 1 Sep 2015 21:35:07 +0000 (23:35 +0200)]
leds: aat1290: Fix module autoload for OF platform driver
This platform driver has a OF device ID table but the OF module
alias information is not created so module autoloading won't work.
Signed-off-by: Luis de Bethencourt <luis@debethencourt.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
LEROY Christophe [Wed, 16 Sep 2015 10:04:53 +0000 (12:04 +0200)]
powerpc32: memset: only use dcbz once cache is enabled
memset() uses instruction dcbz to speed up clearing by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memset(). Allthough no part of the code
explicitly uses memset(), GCC may make calls to it.
This patch modifies memset() such that at startup, memset()
unconditionally skip the optimised bloc that uses dcbz instruction.
Once the initial MMU is set up, in machine_init() we patch memset()
by replacing this inconditional jump by a NOP
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
LEROY Christophe [Wed, 16 Sep 2015 10:04:51 +0000 (12:04 +0200)]
powerpc32: memcpy: only use dcbz once cache is enabled
memcpy() uses instruction dcbz to speed up copy by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memcpy(). Allthough no part of the code
explicitly uses memcpy(), GCC makes calls to it.
This patch modifies memcpy() such that at startup, memcpy()
unconditionally jumps to generic_memcpy() which doesn't use
the dcbz instruction.
Once the initial MMU is set up, in machine_init() we patch memcpy()
by replacing this inconditional jump by a NOP
Reported-by: Michal Sojka <sojkam1@fel.cvut.cz>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Doug Anderson [Wed, 26 Aug 2015 17:26:49 +0000 (18:26 +0100)]
ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints
In (
23a4e40 arm: kgdb: Handle read-only text / modules) we moved to
using patch_text() to set breakpoints so that we could handle the case
when we had CONFIG_DEBUG_RODATA. That patch used patch_text().
Unfortunately, patch_text() assumes that we're not in atomic context
when it runs since it needs to grab a mutex and also wait for other
CPUs to stop (which it does with a completion).
This would result in a stack crawl if you had
CONFIG_DEBUG_ATOMIC_SLEEP and tried to set a breakpoint in kgdb. The
crawl looked something like:
BUG: scheduling while atomic: swapper/0/0/0x00010007
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.2.0-rc7-00133-geb63b34 #1073
Hardware name: Rockchip (Device Tree)
(unwind_backtrace) from [<
c00133d4>] (show_stack+0x20/0x24)
(show_stack) from [<
c05400e8>] (dump_stack+0x84/0xb8)
(dump_stack) from [<
c004913c>] (__schedule_bug+0x54/0x6c)
(__schedule_bug) from [<
c054065c>] (__schedule+0x80/0x668)
(__schedule) from [<
c0540cfc>] (schedule+0xb8/0xd4)
(schedule) from [<
c0543a3c>] (schedule_timeout+0x2c/0x234)
(schedule_timeout) from [<
c05417c0>] (wait_for_common+0xf4/0x188)
(wait_for_common) from [<
c0541874>] (wait_for_completion+0x20/0x24)
(wait_for_completion) from [<
c00a0104>] (__stop_cpus+0x58/0x70)
(__stop_cpus) from [<
c00a0580>] (stop_cpus+0x3c/0x54)
(stop_cpus) from [<
c00a06c4>] (__stop_machine+0xcc/0xe8)
(__stop_machine) from [<
c00a0714>] (stop_machine+0x34/0x44)
(stop_machine) from [<
c00173e8>] (patch_text+0x28/0x34)
(patch_text) from [<
c001733c>] (kgdb_arch_set_breakpoint+0x40/0x4c)
(kgdb_arch_set_breakpoint) from [<
c00a0d68>] (kgdb_validate_break_address+0x2c/0x60)
(kgdb_validate_break_address) from [<
c00a0e90>] (dbg_set_sw_break+0x1c/0xdc)
(dbg_set_sw_break) from [<
c00a2e88>] (gdb_serial_stub+0x9c4/0xba4)
(gdb_serial_stub) from [<
c00a11cc>] (kgdb_cpu_enter+0x1f8/0x60c)
(kgdb_cpu_enter) from [<
c00a18cc>] (kgdb_handle_exception+0x19c/0x1d0)
(kgdb_handle_exception) from [<
c0016f7c>] (kgdb_compiled_brk_fn+0x30/0x3c)
(kgdb_compiled_brk_fn) from [<
c00091a4>] (do_undefinstr+0x1a4/0x20c)
(do_undefinstr) from [<
c001400c>] (__und_svc_finish+0x0/0x34)
It turns out that when we're in kgdb all the CPUs are stopped anyway
so there's no reason we should be calling patch_text(). We can
instead directly call __patch_text() which assumes that CPUs have
already been stopped.
Fixes:
23a4e4050ba9 ("arm: kgdb: Handle read-only text / modules")
Reported-by: Aapo Vienamo <avienamo@nvidia.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Andre Przywara [Mon, 14 Sep 2015 16:49:02 +0000 (17:49 +0100)]
ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition
Commit
96231b2686b5: ("ARM: 8419/1: dma-mapping: harmonize definition
of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use
dma_addr_t, which makes the compiler barf on assigning this to an
"int" variable on ARM with LPAE enabled:
*************
In file included from /src/linux/include/linux/dma-mapping.h:86:0,
from /src/linux/arch/arm/mm/dma-mapping.c:21:
/src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
/src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning:
overflow in implicit constant conversion [-Woverflow]
#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
^
/src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of
macro DMA_ERROR_CODE'
int i, ret = DMA_ERROR_CODE;
^
*************
Remove the actually unneeded initialization of "ret" in
__iommu_create_mapping() and move the variable declaration inside the
for-loop to make the scope of this variable more clear.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 16 Sep 2015 10:08:49 +0000 (11:08 +0100)]
ARM: get rid of needless #if in signal handling code
Remove the #if statement which caused trouble for kernels that support
both ARMv6 and ARMv7. Older architectures do not implement these bits,
so it should be safe to always clear them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mans Rullgard [Sun, 15 Feb 2015 12:33:49 +0000 (12:33 +0000)]
clk: check for invalid parent index of orphans in __clk_init()
If a mux clock is initialised (by hardware or firmware) with an
invalid parent, its ->get_parent() can return an out of range
index. For example, the generic mux clock attempts to return
-EINVAL, which due to the u8 return type ends up a rather large
number. Using this index with the parent_names[] array results
in an invalid pointer and (usually) a crash in the following
strcmp().
This patch adds a check for the parent index being in range,
ignoring clocks reporting invalid values.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Tested-by: Rhyland Klein <rklein@nvidia.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Thierry Reding [Thu, 10 Sep 2015 13:55:21 +0000 (15:55 +0200)]
clk: tegra: dfll: Properly protect OPP list
The OPP list needs to be protected against concurrent accesses. Using
simple RCU read locks does the trick and gets rid of the following
lockdep warning:
===============================
[ INFO: suspicious RCU usage. ]
4.2.0-next-
20150908 #1 Not tainted
-------------------------------
drivers/base/power/opp.c:460 Missing rcu_read_lock() or dev_opp_list_lock protection!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
4 locks held by kworker/u8:0/6:
#0: ("%s""deferwq"){++++.+}, at: [<
c0040d8c>] process_one_work+0x118/0x4bc
#1: (deferred_probe_work){+.+.+.}, at: [<
c0040d8c>] process_one_work+0x118/0x4bc
#2: (&dev->mutex){......}, at: [<
c03b8194>] __device_attach+0x20/0x118
#3: (prepare_lock){+.+...}, at: [<
c054bc08>] clk_prepare_lock+0x10/0xf8
stack backtrace:
CPU: 2 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.0-next-
20150908 #1
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
Workqueue: deferwq deferred_probe_work_func
[<
c001802c>] (unwind_backtrace) from [<
c00135a4>] (show_stack+0x10/0x14)
[<
c00135a4>] (show_stack) from [<
c02a8418>] (dump_stack+0x94/0xd4)
[<
c02a8418>] (dump_stack) from [<
c03c6f6c>] (dev_pm_opp_find_freq_ceil+0x108/0x114)
[<
c03c6f6c>] (dev_pm_opp_find_freq_ceil) from [<
c0551a3c>] (dfll_calculate_rate_request+0xb8/0x170)
[<
c0551a3c>] (dfll_calculate_rate_request) from [<
c0551b10>] (dfll_clk_round_rate+0x1c/0x2c)
[<
c0551b10>] (dfll_clk_round_rate) from [<
c054de2c>] (clk_calc_new_rates+0x1b8/0x228)
[<
c054de2c>] (clk_calc_new_rates) from [<
c054e44c>] (clk_core_set_rate_nolock+0x44/0xac)
[<
c054e44c>] (clk_core_set_rate_nolock) from [<
c054e4d8>] (clk_set_rate+0x24/0x34)
[<
c054e4d8>] (clk_set_rate) from [<
c0512460>] (tegra124_cpufreq_probe+0x120/0x230)
[<
c0512460>] (tegra124_cpufreq_probe) from [<
c03b9cbc>] (platform_drv_probe+0x44/0xac)
[<
c03b9cbc>] (platform_drv_probe) from [<
c03b84c8>] (driver_probe_device+0x218/0x304)
[<
c03b84c8>] (driver_probe_device) from [<
c03b69b0>] (bus_for_each_drv+0x60/0x94)
[<
c03b69b0>] (bus_for_each_drv) from [<
c03b8228>] (__device_attach+0xb4/0x118)
ata1: SATA link down (SStatus 0 SControl 300)
[<
c03b8228>] (__device_attach) from [<
c03b77c8>] (bus_probe_device+0x88/0x90)
[<
c03b77c8>] (bus_probe_device) from [<
c03b7be8>] (deferred_probe_work_func+0x58/0x8c)
[<
c03b7be8>] (deferred_probe_work_func) from [<
c0040dfc>] (process_one_work+0x188/0x4bc)
[<
c0040dfc>] (process_one_work) from [<
c004117c>] (worker_thread+0x4c/0x4f4)
[<
c004117c>] (worker_thread) from [<
c0047230>] (kthread+0xe4/0xf8)
[<
c0047230>] (kthread) from [<
c000f7d0>] (ret_from_fork+0x14/0x24)
Signed-off-by: Thierry Reding <treding@nvidia.com>
Fixes:
c4fe70ada40f ("clk: tegra: Add closed loop support for the DFLL")
[vince.h@nvidia.com: Unlock rcu on error path]
Signed-off-by: Vince Hsu <vince.h@nvidia.com>
[sboyd@codeaurora.org: Dropped second hunk that nested the rcu
read lock unnecessarily]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Pavel Fedin [Wed, 5 Aug 2015 10:53:57 +0000 (11:53 +0100)]
arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources
Until
b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops"),
kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
and vgic_v2_map_resources() still has it.
But now vm_ops are not initialized until we call kvm_vgic_create().
Therefore kvm_vgic_map_resources() can being called without a VGIC,
and we die because vm_ops.map_resources is NULL.
Fixing this restores QEMU's kernel-irqchip=off option to a working state,
allowing to use GIC emulation in userspace.
Fixes:
b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
[maz: reworked commit message]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>