Theodore Ts'o [Thu, 4 Sep 2014 22:07:25 +0000 (18:07 -0400)]
ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED
The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
because it was too hard to make sure the mballoc and get_block flags
could be reliably passed down through all of the codepaths that end up
calling ext4_mb_new_blocks().
Since then, we have mb_flags passed down through most of the code
paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
as it used to.
This commit plumbs in the last of what is required, and then adds a
WARN_ON check to make sure we haven't missed anything. If this passes
a full regression test run, we can then drop
EXT4_STATE_DELALLOC_RESERVED.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Theodore Ts'o [Thu, 4 Sep 2014 22:06:25 +0000 (18:06 -0400)]
ext4: pass allocation_request struct to ext4_(alloc,splice)_branch
Instead of initializing the allocation_request structure in
ext4_alloc_branch(), set it up in ext4_ind_map_blocks(), and then pass
it to ext4_alloc_branch() and ext4_splice_branch().
This allows ext4_ind_map_blocks to pass flags in the allocation
request structure without having to add Yet Another argument to
ext4_alloc_branch().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Zheng Liu [Tue, 2 Sep 2014 02:26:49 +0000 (22:26 -0400)]
ext4: track extent status tree shrinker delay statictics
This commit adds some statictics in extent status tree shrinker. The
purpose to add these is that we want to collect more details when we
encounter a stall caused by extent status tree shrinker. Here we count
the following statictics:
stats:
the number of all objects on all extent status trees
the number of reclaimable objects on lru list
cache hits/misses
the last sorted interval
the number of inodes on lru list
average:
scan time for shrinking some objects
the number of shrunk objects
maximum:
the inode that has max nr. of objects on lru list
the maximum scan time for shrinking some objects
The output looks like below:
$ cat /proc/fs/ext4/sda1/es_shrinker_info
stats:
28228 objects
6341 reclaimable objects
5281/631 cache hits/misses
586 ms last sorted interval
250 inodes on lru list
average:
153 us scan time
128 shrunk objects
maximum:
255 inode (255 objects, 198 reclaimable)
125723 us max scan time
If the lru list has never been sorted, the following line will not be
printed:
586ms last sorted interval
If there is an empty lru list, the following lines also will not be
printed:
250 inodes on lru list
...
maximum:
255 inode (255 objects, 198 reclaimable)
0 us max scan time
Meanwhile in this commit a new trace point is defined to print some
details in __ext4_es_shrink().
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zheng Liu [Tue, 2 Sep 2014 02:22:13 +0000 (22:22 -0400)]
ext4: improve extents status tree trace point
This commit improves the trace point of extents status tree. We rename
trace_ext4_es_shrink_enter in ext4_es_count() because it is also used
in ext4_es_scan() and we can not identify them from the result.
Further this commit fixes a variable name in trace point in order to
keep consistency with others.
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Seunghun Lee [Tue, 2 Sep 2014 02:15:30 +0000 (22:15 -0400)]
ext4: fix comments about get_blocks
get_blocks is renamed to get_block.
Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 2 Sep 2014 01:34:09 +0000 (21:34 -0400)]
ext4: enable block_validity by default
Enable by default the block_validity feature, which checks for
collisions between newly allocated blocks and critical system
metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Tue, 2 Sep 2014 01:26:09 +0000 (21:26 -0400)]
jbd2: fold __wait_cp_io into jbd2_log_do_checkpoint()
__wait_cp_io() is only called by jbd2_log_do_checkpoint(). Fold it in
to make it a bit easier to understand.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Tue, 2 Sep 2014 01:19:01 +0000 (21:19 -0400)]
jbd2: fold __process_buffer() into jbd2_log_do_checkpoint()
__process_buffer() is only called by jbd2_log_do_checkpoint(), and it
had a very complex locking protocol where it would be called with the
j_list_lock, and sometimes exit with the lock held (if the return code
was 0), or release the lock.
This was confusing both to humans and to smatch (which erronously
complained that the lock was taken twice).
Folding __process_buffer() to the caller allows us to simplify the
control flow, making the resulting function easier to read and reason
about, and dropping the compiled size of fs/jbd2/checkpoint.c by 150
bytes (over 4% of the text size).
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Theodore Ts'o [Mon, 1 Sep 2014 18:43:09 +0000 (14:43 -0400)]
ext4: rename ext4_ext_find_extent() to ext4_find_extent()
Make the function name less redundant.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:42:09 +0000 (14:42 -0400)]
ext4: reuse path object in ext4_move_extents()
Reuse the path object in ext4_move_extents() so we don't unnecessarily
free and reallocate it.
Also clean up the get_ext_path() wrapper so that it has the same
semantics of freeing the path object on error as ext4_ext_find_extent().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:41:09 +0000 (14:41 -0400)]
ext4: reuse path object in ext4_ext_shift_extents()
Now that the semantics of ext4_ext_find_extent() are much cleaner,
it's safe and more efficient to reuse the path object across the
multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:40:09 +0000 (14:40 -0400)]
ext4: teach ext4_ext_find_extent() to realloc path if necessary
This adds additional safety in case for some reason we end reusing a
path structure which isn't big enough for current depth of the inode.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:39:09 +0000 (14:39 -0400)]
ext4: allow a NULL argument to ext4_ext_drop_refs()
Teach ext4_ext_drop_refs() to accept a NULL argument, much like
kfree(). This allows us to drop a lot of checks to make sure path is
non-NULL before calling ext4_ext_drop_refs().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:38:09 +0000 (14:38 -0400)]
ext4: call ext4_ext_drop_refs() from ext4_ext_find_extent()
In nearly all of the calls to ext4_ext_find_extent() where the caller
is trying to recycle the path object, ext4_ext_drop_refs() gets called
to release the buffer heads before the path object gets overwritten.
To simplify things for the callers, and to avoid the possibility of a
memory leak, make ext4_ext_find_extent() responsible for dropping the
buffers.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:37:09 +0000 (14:37 -0400)]
ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code
Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(),
ext4_split_extent(), ext4_convert_unwritten_extents_endio().
This requires fixing all of their callers to potentially
ext4_ext_find_extent() to free the struct ext4_ext_path object in case
of an error, and there are interlocking dependencies all the way up to
ext4_ext_map_blocks(), ext4_swap_extents(), and
ext4_ext_remove_space().
Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it
is no longer necessary.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:36:09 +0000 (14:36 -0400)]
ext4: drop EXT4_EX_NOFREE_ON_ERR in convert_initialized_extent()
Transfer responsibility of freeing struct ext4_ext_path on error to
ext4_ext_find_extent().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:35:09 +0000 (14:35 -0400)]
ext4: collapse ext4_convert_initialized_extents()
The function ext4_convert_initialized_extents() is only called by a
single function --- ext4_ext_convert_initalized_extents(). Inline the
code and get rid of the unnecessary bits in order to simplify the code.
Rename ext4_ext_convert_initalized_extents() to
convert_initalized_extents() since it's a static function that is
actually only used in a single caller, ext4_ext_map_blocks().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:34:09 +0000 (14:34 -0400)]
ext4: teach ext4_ext_find_extent() to free path on error
Right now, there are a places where it is all to easy to leak memory
on an error path, via a usage like this:
struct ext4_ext_path *path = NULL
while (...) {
...
path = ext4_ext_find_extent(inode, block, path, 0);
if (IS_ERR(path)) {
/* oops, if path was non-NULL before the call to
ext4_ext_find_extent, we've leaked it! :-( */
...
return PTR_ERR(path);
}
...
}
Unfortunately, there some code paths where we are doing the following
instead:
path = ext4_ext_find_extent(inode, block, orig_path, 0);
and where it's important that we _not_ free orig_path in the case
where ext4_ext_find_extent() returns an error.
So change the function signature of ext4_ext_find_extent() so that it
takes a struct ext4_ext_path ** for its third argument, and by
default, on an error, it will free the struct ext4_ext_path, and then
zero out the struct ext4_ext_path * pointer. In order to avoid
causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
ext4_ext_find_extent() to use the original behavior of forcing the
caller to deal with freeing the original path pointer on the error
case.
The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
allows for a gentle transition and makes the patches easier to verify.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:33:09 +0000 (14:33 -0400)]
ext4: fix accidental flag aliasing in ext4_map_blocks flags
Commit
b8a8684502a0f introduced an accidental flag aliasing between
EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN.
Fortunately, this didn't introduce any untorward side effects --- we
got lucky. Nevertheless, fix this and leave a warning to hopefully
avoid this from happening in the future.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Mon, 1 Sep 2014 18:32:09 +0000 (14:32 -0400)]
ext4: fix ZERO_RANGE bug hidden by flag aliasing
We accidently aliased EXT4_EX_NOCACHE and EXT4_GET_CONVERT_UNWRITTEN
falgs, which apparently was hiding a bug that was unmasked when this
flag aliasing issue was addressed (see the subsequent commit). The
reproduction case was:
fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk
... which would cause fsx to report corruption in the data file.
The fix we have is a bit of an overkill, but I'd much rather be
conservative for now, and we can optimize ZERO_RANGE_FL handling
later. The fact that we need to zap the extent_status cache for the
inode is unfortunate, but correctness is far more important than
performance.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Theodore Ts'o [Sun, 31 Aug 2014 19:03:14 +0000 (15:03 -0400)]
ext4: fix ext4_swap_extents() error handling
If ext4_ext_find_extent() returns an error, we have to clear path1 or
path2 or else we would end up trying to free an ERR_PTR, which would
be bad.
Also eliminate some redundant code and mark the error paths as unlikely()
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Dmitry Monakhov [Sun, 31 Aug 2014 03:52:19 +0000 (23:52 -0400)]
ext4: refactor ext4_move_extents code base
ext4_move_extents is too complex for review. It has duplicate almost
each function available in the rest of other codebase. It has useless
artificial restriction orig_offset == donor_offset. But in fact logic
of ext4_move_extents is very simple:
Iterate extents one by one (similar to ext4_fill_fiemap_extents)
->Iterate each page covered extent (similar to generic_perform_write)
->swap extents for covered by page (can be shared with IOC_MOVE_DATA)
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Dmitry Monakhov [Sun, 31 Aug 2014 03:50:56 +0000 (23:50 -0400)]
ext4: use ext4_ext_next_allocated_block instead of mext_next_extent
This allows us to make mext_next_extent static and potentially get rid
of it.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Dmitry Monakhov [Sun, 31 Aug 2014 03:34:06 +0000 (23:34 -0400)]
ext4: use ext4_update_i_disksize instead of opencoded ones
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Wang Shilong [Sat, 30 Aug 2014 03:20:44 +0000 (23:20 -0400)]
ext4: remove a duplicate call in ext4_init_new_dir()
ext4_journal_get_write_access() has just been called in ext4_append()
calling it again here is duplicated.
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 30 Aug 2014 00:52:18 +0000 (20:52 -0400)]
ext4: convert do_split() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 30 Aug 2014 00:52:17 +0000 (20:52 -0400)]
ext4: convert dx_probe() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 30 Aug 2014 00:52:15 +0000 (20:52 -0400)]
ext4: convert ext4_bread() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 30 Aug 2014 00:51:32 +0000 (20:51 -0400)]
ext4: convert ext4_getblk() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 30 Aug 2014 00:49:51 +0000 (20:49 -0400)]
ext4: convert ext4_dx_find_entry() to use the ERR_PTR convention
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Linus Torvalds [Fri, 29 Aug 2014 18:52:46 +0000 (11:52 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"Ext4 bug fixes for 3.17, to provide better handling of memory
allocation failures, and to fix some journaling bugs involving
journal checksums and FALLOC_FL_ZERO_RANGE"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix same-dir rename when inline data directory overflows
jbd2: fix descriptor block size handling errors with journal_csum
jbd2: fix infinite loop when recovering corrupt journal blocks
ext4: update i_disksize coherently with block allocation on error path
ext4: fix transaction issues for ext4_fallocate and ext_zero_range
ext4: fix incorect journal credits reservation in ext4_zero_range
ext4: move i_size,i_disksize update routines to helper function
ext4: fix BUG_ON in mb_free_blocks()
ext4: propagate errors up to ext4_find_entry()'s callers
Linus Torvalds [Fri, 29 Aug 2014 18:49:10 +0000 (11:49 -0700)]
Merge tag 'dm-3.17-fix' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
"Fix a 3.17-rc1 regression introduced by switching the DM crypt target
to using per-bio data"
* tag 'dm-3.17-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm crypt: fix access beyond the end of allocated space
Linus Torvalds [Fri, 29 Aug 2014 18:21:48 +0000 (11:21 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"A smaller collection of fixes that have come up since the initial
merge window pull request. This contains:
- error handling cleanup and support for larger than 16 byte cdbs in
sg_io() from Christoph. The latter just matches what bsg and
friends support, sg_io() got left out in the merge.
- an option for brd to expose partitions in /proc/partitions. They
are hidden by default for compat reasons. From Dmitry Monakhov.
- a few blk-mq fixes from me - killing a dead/unused flag, fix for
merging happening even if turned off, and correction of a few
comments.
- removal of unnecessary ->owner setting in systemace. From Michal
Simek.
- two related fixes for a problem with nesting freezing of queues in
blk-mq. One from Ming Lei removing an unecessary freeze operation,
and another from Tejun fixing the nesting regression introduced in
the merge window.
- fix for a BUG_ON() at bio_endio time when protection info is
attached and the IO has an error. From Sagi Grimberg.
- two scsi_ioctl bug fixes for regressions with scsi-mq from Tony
Battersby.
- a cfq weight update fix and subsequent comment update from Toshiaki
Makita"
* 'for-linus' of git://git.kernel.dk/linux-block:
cfq-iosched: Add comments on update timing of weight
cfq-iosched: Fix wrong children_weight calculation
block: fix error handling in sg_io
fix regression in SCSI_IOCTL_SEND_COMMAND
scsi-mq: fix requests that use a separate CDB buffer
block: support > 16 byte CDBs for SG_IO
block: cleanup error handling in sg_io
brd: add ram disk visibility option
block: systemace: Remove .owner field for driver
blk-mq: blk_mq_freeze_queue() should allow nesting
blk-mq: correct a few wrong/bad comments
block: Fix BUG_ON when pi errors occur
blk-mq: don't allow merges if turned off for the queue
blk-mq: get rid of unused BLK_MQ_F_SHOULD_SORT flag
blk-mq: fix WARNING "percpu_ref_kill() called more than once!"
Will Deacon [Fri, 25 Jul 2014 00:53:54 +0000 (17:53 -0700)]
alpha: io: implement relaxed accessor macros for writes
write{b,w,l,q}_relaxed are implemented by some architectures in order to
permit memory-mapped I/O writes with weaker barrier semantics than the
non-relaxed variants.
This patch implements these write macros for Alpha, in the same vein as
the relaxed read macros, which are already implemented.
Acked-by: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Cree [Fri, 25 Jul 2014 00:53:53 +0000 (17:53 -0700)]
alpha: Wire up sched_setattr, sched_getattr, and renameat2 syscalls.
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darrick J. Wong [Wed, 27 Aug 2014 22:40:09 +0000 (18:40 -0400)]
ext4: fix same-dir rename when inline data directory overflows
When performing a same-directory rename, it's possible that adding or
setting the new directory entry will cause the directory to overflow
the inline data area, which causes the directory to be converted to an
extent-based directory. Under this circumstance it is necessary to
re-read the directory when deleting the old dirent because the "old
directory" context still points to i_block in the inode table, which
is now an extent tree root! The delete fails with an FS error, and
the subsequent fsck complains about incorrect link counts and
hardlinked directories.
Test case (originally found with flat_dir_test in the metadata_csum
test program):
# mkfs.ext4 -O inline_data /dev/sda
# mount /dev/sda /mnt
# mkdir /mnt/x
# touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian
# sync
# for i in /mnt/x/*; do mv $i $i.longer; done
# ls -la /mnt/x/
total 0
-rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer
(Hey! Why are there four files now??)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Darrick J. Wong [Wed, 27 Aug 2014 22:40:07 +0000 (18:40 -0400)]
jbd2: fix descriptor block size handling errors with journal_csum
It turns out that there are some serious problems with the on-disk
format of journal checksum v2. The foremost is that the function to
calculate descriptor tag size returns sizes that are too big. This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.
Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.
Add a few function helpers so we don't have to open-code quite so
many pieces.
Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Darrick J. Wong [Wed, 27 Aug 2014 22:40:05 +0000 (18:40 -0400)]
jbd2: fix infinite loop when recovering corrupt journal blocks
When recovering the journal, don't fall into an infinite loop if we
encounter a corrupt journal block. Instead, just skip the block and
return an error, which fails the mount and thus forces the user to run
a full filesystem fsck.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Dmitry Monakhov [Wed, 27 Aug 2014 22:40:03 +0000 (18:40 -0400)]
ext4: update i_disksize coherently with block allocation on error path
In case of delalloc block i_disksize may be less than i_size. So we
have to update i_disksize each time we allocated and submitted some
blocks beyond i_disksize. We weren't doing this on the error paths,
so fix this.
testcase: xfstest generic/019
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Mikulas Patocka [Thu, 28 Aug 2014 15:09:31 +0000 (11:09 -0400)]
dm crypt: fix access beyond the end of allocated space
The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.
This bug is very old (it dates back to 2.6.25 commit
3a7f6c990ad04 "dm
crypt: use async crypto"). However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two. This bug
wasn't exposed until 3.17-rc1 commit
298a9fa08a ("dm crypt: use per-bio
data"). By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.
To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc. The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.
The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block. dm-crypt allocates the block with this size:
cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.
When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).
dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
structure. However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector. If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.
Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.
Also correct the alignment of dm_crypt_request. struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).
Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
aligned as if the block was allocated with kmalloc.
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Thu, 28 Aug 2014 17:47:10 +0000 (10:47 -0700)]
Merge tag 'backlight-fixes-3.17' of git://git./linux/kernel/git/lee/backlight
Pull backlight fix from Lee Jones:
"One simple fix to invalidate GPIO non-request"
* tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
pwm-backlight: Fix bogus request for GPIO#0 when instantiated from DT
Linus Torvalds [Thu, 28 Aug 2014 17:46:25 +0000 (10:46 -0700)]
Merge tag 'mfd-fixes-3.17' of git://git./linux/kernel/git/lee/mfd
Pull mfd fixes from Lee Jones:
"Couple of simple fixes due for the 3.17 rcs
(and a sneaky document addition that slipped from the previous
pull-request)"
* tag 'mfd-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: twl4030-power: Fix PM idle pin configuration to not conflict with regulators
mfd: tc3589x: Add device tree bindings
mfd: ab8500-core: Use 'ifdef' for config options
mfd: htc-i2cpld: Fix %d confusingly prefixed with 0x in format string
mfd: omap-usb-host: Fix %d confusingly prefixed with 0x in format string
Linus Torvalds [Thu, 28 Aug 2014 17:31:29 +0000 (10:31 -0700)]
Merge tag 'pinctrl-v3.17-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin-control fixes from Linus Walleij:
"My first (a bit delayed) pack of pin control fixes for the v3.17
series, only driver fixes:
- SH-PFC (Renesas) r8a7791 CAN bus pin group problem
- Rockchip (GPIO0 configuration)
- Tegra-xusb (interrupt handling)
- Exynos (GPIO interrupt locking)
- Qualcomm (fix misleading example interrupts)
- minor non-critical fixes for abx500 and AT91 also sneaked in,
because I initially intended this pull for post RC-1, hope it's
still OK"
* tag 'pinctrl-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: apq8064: Correct interrupts in example
pinctrl: exynos: Lock GPIOs as interrupts when used as EINTs
pinctrl: pinctrl-at91.c: fix decimal printf format specifiers prefixed with 0x
pinctrl: abx500: remove useless check
pinctrl: tegra-xusb: testing wrong variable in probe()
pinctrl: tegra-xusb: fix an off by one test
pinctrl: rockchip: fix rk3288 gpio0 configuration
sh-pfc: r8a7791: fix CAN pin groups
Linus Torvalds [Thu, 28 Aug 2014 17:30:25 +0000 (10:30 -0700)]
Merge tag 'for-3.17-rc3' of git://git./linux/kernel/git/sumits/dma-buf
Pull dma-buf fixes from Sumit Semwal:
"The major changes for 3.17 already went via Greg-KH's tree this time
as well; this is a small pull request for dma-buf - all documentation
related"
* tag 'for-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf:
dma-buf/fence: Fix one more kerneldoc warning
dma-buf/fence: Fix a kerneldoc warning
Documentation/dma-buf-sharing.txt: update API descriptions
Linus Torvalds [Thu, 28 Aug 2014 16:44:25 +0000 (09:44 -0700)]
Merge tag 'sound-3.17-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here contains not many exciting changes but just a few minor ones: An
off-by-one proc write fix, a couple of trivial incldue guard fixes,
Acer laptop pinconfig fix, and a fix for DSD formats that are still
rarely used"
* tag 'sound-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Set up initial pins for Acer Aspire V5
ALSA: pcm: Fix the silence data for DSD formats
ALSA: ctxfi: ct20k1reg: Fix typo in include guard
ALSA: hda: ca0132_regs.h: Fix typo in include guard
ALSA: core: fix buffer overflow in snd_info_get_line()
Linus Torvalds [Thu, 28 Aug 2014 16:40:37 +0000 (09:40 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nothing major, one core oops fixes, some radeon oops fixes, some sti
driver fixups, msm driver fixes and a minor Kconfig update for the ww
mutex debugging"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/ast: Add missing entry to dclk_table[]
drm: fix division-by-zero on dumb_create()
ww-mutex: clarify help text for DEBUG_WW_MUTEX_SLOWPATH
radeon: Test for PCI root bus before assuming bus->self
drm/radeon: handle broken disabled rb mask gracefully (6xx/7xx) (v2)
drm/radeon: save/restore the PD addr on suspend/resume
drm/msm: Fix missing unlock on error in msm_fbdev_create()
drm/msm: fix compile error for non-dt builds
drm/msm/mdp4: request vblank during modeset
drm/msm: avoid flood of kernel logs on faults
drm: sti: Add missing dependency on RESET_CONTROLLER
drm: sti: Make of_device_id array const
drm: sti: Fix return value check in sti_drm_platform_probe()
drm: sti: hda: fix return value check in sti_hda_probe()
drm: sti: hdmi: fix return value check in sti_hdmi_probe()
drm: sti: tvout: fix return value check in sti_tvout_probe()
Tony Lindgren [Tue, 19 Aug 2014 15:24:05 +0000 (08:24 -0700)]
mfd: twl4030-power: Fix PM idle pin configuration to not conflict with regulators
Commit
43fef47f94a1 (mfd: twl4030-power: Add a configuration to turn
off oscillator during off-idle) added support for configuring the PMIC
to cut off resources during deeper idle states to save power.
This however caused regression for n900 display power that needed the
PMIC configuration to be disabled with commit
d937678ab625 (ARM: dts:
Revert enabling of twl configuration for n900).
Turns out the root cause of the problem is that we must use
TWL4030_RESCONFIG_UNDEF instead of DEV_GRP_NULL to avoid disabling
regulators that may have been enabled before the init function
for twl4030-power.c runs. With TWL4030_RESCONFIG_UNDEF we let the
regulator framework control the regulators like it should. Here we
need to only configure the sys_clken and sys_off_mode triggers for
the regulators that cannot be done by the regulator framework as
it's not running at that point.
This allows us to enable the PMIC configuration for n900.
Fixes:
43fef47f94a1 (mfd: twl4030-power: Add a configuration to turn off oscillator during off-idle)
Cc: stable@vger.kernel.org # v3.16
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Linus Walleij [Wed, 9 Apr 2014 07:28:00 +0000 (09:28 +0200)]
mfd: tc3589x: Add device tree bindings
This defines the device tree bindings for the Toshiba TC3589x
series of multi-purpose expanders. Only the stuff I can test
is defined: GPIO and keypad. Others may implement more
subdevices further down the road.
This is to complement
commit
a435ae1d51e2f18414f2a87219fdbe068231e692
"mfd: Enable the tc3589x for Device Tree" which left off
the definition of the device tree bindings.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Toshiaki Makita [Thu, 28 Aug 2014 08:14:58 +0000 (17:14 +0900)]
cfq-iosched: Add comments on update timing of weight
Explain that weight has to be updated on activation.
This complements previous fix
e15693ef18e1 ("cfq-iosched: Fix wrong
children_weight calculation").
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@fb.com>
Thierry Reding [Fri, 8 Aug 2014 11:06:30 +0000 (13:06 +0200)]
dma-buf/fence: Fix one more kerneldoc warning
The seqno_fence_init() function's cond argument isn't described in the
kerneldoc comment. Fix that to silence a warning when building DocBook
documentation.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Thierry Reding [Fri, 8 Aug 2014 10:42:32 +0000 (12:42 +0200)]
dma-buf/fence: Fix a kerneldoc warning
kerneldoc doesn't know how to parse variables, so don't let it try.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Gioh Kim [Tue, 13 May 2014 23:49:43 +0000 (08:49 +0900)]
Documentation/dma-buf-sharing.txt: update API descriptions
Update some descriptions for API arguments and descriptions.
Signed-off-by: Gioh Kim <gioh.kim@lge.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Y.C. Chen [Fri, 22 Aug 2014 03:00:17 +0000 (11:00 +0800)]
drm/ast: Add missing entry to dclk_table[]
This avoid reading past the end of the list for certain modes
Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Reviewed-by: Egbert Eich <eich@freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 28 Aug 2014 01:48:58 +0000 (11:48 +1000)]
Merge branch 'drm-3.17-rc2-sti-fixes' of git://git.linaro.org/people/benjamin.gaignard/kernel into drm-fixes
I have tested the 6 patches send on mailing list since you merge the sti driver.
I haven't seen issue with those patches except for the missing
dependency on Kconfig
where I have change "depends on" to "select".
* 'drm-3.17-rc2-sti-fixes' of git://git.linaro.org/people/benjamin.gaignard/kernel:
drm: sti: Add missing dependency on RESET_CONTROLLER
drm: sti: Make of_device_id array const
drm: sti: Fix return value check in sti_drm_platform_probe()
drm: sti: hda: fix return value check in sti_hda_probe()
drm: sti: hdmi: fix return value check in sti_hdmi_probe()
drm: sti: tvout: fix return value check in sti_tvout_probe()
Dave Airlie [Thu, 28 Aug 2014 01:48:05 +0000 (11:48 +1000)]
Merge branch 'msm-fixes-3.17' of git://people.freedesktop.org/~robclark/linux into drm-fixes
misc msm fixes from Rob.
* 'msm-fixes-3.17' of git://people.freedesktop.org/~robclark/linux:
drm/msm: Fix missing unlock on error in msm_fbdev_create()
drm/msm: fix compile error for non-dt builds
drm/msm/mdp4: request vblank during modeset
drm/msm: avoid flood of kernel logs on faults
David Herrmann [Sun, 24 Aug 2014 17:23:26 +0000 (19:23 +0200)]
drm: fix division-by-zero on dumb_create()
Kinda unexpected, but DIV_ROUND_UP() can overflow if passed an argument
bigger than UINT_MAX - DIVISOR. Fix this by testing for "!cpp" before
using it in the following division.
Note that DIV_ROUND_UP() is defined as:
#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
..this will obviously overflow if (n + d - 1) is bigger than UINT_MAX.
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Wed, 27 Aug 2014 15:19:26 +0000 (11:19 -0400)]
ww-mutex: clarify help text for DEBUG_WW_MUTEX_SLOWPATH
We really don't want distro's enabling this in their kernels. Try and
make that more clear.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 28 Aug 2014 01:32:20 +0000 (11:32 +1000)]
Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Just a few more radeon fixes for 3.17.
* 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux:
radeon: Test for PCI root bus before assuming bus->self
drm/radeon: handle broken disabled rb mask gracefully (6xx/7xx) (v2)
drm/radeon: save/restore the PD addr on suspend/resume
Linus Torvalds [Thu, 28 Aug 2014 00:32:37 +0000 (17:32 -0700)]
Merge branch 'sec-v3.17-rc2' of git://git./linux/kernel/git/sergeh/linux-security
Pull tomoyo fix from Serge Hallyn.
* 'sec-v3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
tomoyo: Fix pathname calculation breakage.
Dmitry Monakhov [Wed, 27 Aug 2014 22:40:00 +0000 (18:40 -0400)]
ext4: fix transaction issues for ext4_fallocate and ext_zero_range
After commit
f282ac19d86f we use different transactions for
preallocation and i_disksize update which result in complain from fsck
after power-failure. spotted by generic/019. IMHO this is regression
because fs becomes inconsistent, even more 'e2fsck -p' will no longer
works (which drives admins go crazy) Same transaction requirement
applies ctime,mtime updates
testcase: xfstest generic/019
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Dmitry Monakhov [Wed, 27 Aug 2014 22:33:49 +0000 (18:33 -0400)]
ext4: fix incorect journal credits reservation in ext4_zero_range
Currently we reserve only 4 blocks but in worst case scenario
ext4_zero_partial_blocks() may want to zeroout and convert two
non adjacent blocks.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Alex Williamson [Wed, 27 Aug 2014 19:01:35 +0000 (13:01 -0600)]
radeon: Test for PCI root bus before assuming bus->self
If we assign a Radeon device to a virtual machine, we can no longer
assume a fixed hardware topology, like the GPU having a parent device.
This patch simply adds a few pci_is_root_bus() tests to avoid passing
a NULL pointer to PCI access functions, allowing the radeon driver to
work in a QEMU 440FX machine with an assigned HD8570 on the emulated
PCI root bus.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Wed, 27 Aug 2014 16:38:06 +0000 (09:38 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fixes for potential memory corruption problems in magicmouse and
picolcd drivers (the HW would have to be manufactured to be
deliberately evil to trigger those) which were found by Steven
Vittitoe
- fix for false error message appearing in dmesg from logitech-dj
driver, from Benjamin Tissoires
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: picolcd: sanity check report size in raw_event() callback
HID: magicmouse: sanity check report size in raw_event() callback
HID: logitech-dj: prevent false errors to be shown
Linus Torvalds [Wed, 27 Aug 2014 16:14:17 +0000 (09:14 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"The biggest of these comes from Liu Bo, who tracked down a hang we've
been hitting since moving to kernel workqueues (it's a btrfs bug, not
in the generic code). His patch needs backporting to 3.16 and 3.15
stable, which I'll send once this is in.
Otherwise these are assorted fixes. Most were integrated last week
during KS, but I wanted to give everyone the chance to test the
result, so I waited for rc2 to come out before sending"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (24 commits)
Btrfs: fix task hang under heavy compressed write
Btrfs: fix filemap_flush call in btrfs_file_release
Btrfs: fix crash on endio of reading corrupted block
btrfs: fix leak in qgroup_subtree_accounting() error path
btrfs: Use right extent length when inserting overlap extent map.
Btrfs: clone, don't create invalid hole extent map
Btrfs: don't monopolize a core when evicting inode
Btrfs: fix hole detection during file fsync
Btrfs: ensure tmpfile inode is always persisted with link count of 0
Btrfs: race free update of commit root for ro snapshots
Btrfs: fix regression of btrfs device replace
Btrfs: don't consider the missing device when allocating new chunks
Btrfs: Fix wrong device size when we are resizing the device
Btrfs: don't write any data into a readonly device when scrub
Btrfs: Fix the problem that the replace destroys the seed filesystem
btrfs: Return right extent when fiemap gives unaligned offset and len.
Btrfs: fix wrong extent mapping for DirectIO
Btrfs: fix wrong write range for filemap_fdatawrite_range()
Btrfs: fix wrong missing device counter decrease
Btrfs: fix unzeroed members in fs_devices when creating a fs from seed fs
...
Linus Torvalds [Wed, 27 Aug 2014 16:12:36 +0000 (09:12 -0700)]
Merge tag 'trace-fixes-v3.17-rc1-2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull trace buffer epoll hang fix from Steven Rostedt:
"Josef Bacik found a bug in the ring_buffer_poll_wait() where the
condition variable (waiters_pending) was set before being added to the
poll queue via poll_wait(). This allowed for a small race window to
happen where an event could come in, check the condition variable see
it set to true, clear it, and then wake all the waiters. But because
the waiter set the variable before adding itself to the queue, the
waker could have cleared the variable after it was set and then miss
waking it up as it wasn't added to the queue yet.
Discussing this bug, we realized that a memory barrier needed to be
added too, for the rare case that something polls for a single trace
event to happen (and just one, no more to come in), and miss the
wakeup due to memory ordering. Ideally, a memory barrier needs to be
added on the writer side too, but as that will kill tracing
performance and this is for a situation that tracing wasn't even
designed for (who traces one instance of an event, use a printk
instead!), this isn't worth adding the barrier. But we can in the
future add the barrier for when the buffer goes from empty to the
first event, as that would cover this case"
* tag 'trace-fixes-v3.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
trace: Fix epoll hang when we race with new entries
Jiri Kosina [Wed, 27 Aug 2014 07:13:15 +0000 (09:13 +0200)]
HID: picolcd: sanity check report size in raw_event() callback
The report passed to us from transport driver could potentially be
arbitrarily large, therefore we better sanity-check it so that raw_data
that we hold in picolcd_pending structure are always kept within proper
bounds.
Cc: stable@vger.kernel.org
Reported-by: Steven Vittitoe <scvitti@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Kosina [Wed, 27 Aug 2014 07:12:24 +0000 (09:12 +0200)]
HID: magicmouse: sanity check report size in raw_event() callback
The report passed to us from transport driver could potentially be
arbitrarily large, therefore we better sanity-check it so that
magicmouse_emit_touch() gets only valid values of raw_id.
Cc: stable@vger.kernel.org
Reported-by: Steven Vittitoe <scvitti@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Takashi Iwai [Wed, 27 Aug 2014 06:19:05 +0000 (08:19 +0200)]
ALSA: hda - Set up initial pins for Acer Aspire V5
Acer Aspire V5 doesn't set up the pins correctly at the cold boot
while the pins are corrected after the warm reboot. This patch gives
the proper pin configs statically in the driver as a workaround.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81561
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tetsuo Handa [Wed, 20 Aug 2014 05:14:04 +0000 (14:14 +0900)]
tomoyo: Fix pathname calculation breakage.
Commit
7177a9c4b509 ("fs: call rename2 if exists") changed
"struct inode_operations"->rename == NULL if
"struct inode_operations"->rename2 != NULL .
TOMOYO needs to check for both ->rename and ->rename2 , or
a system on (e.g.) ext4 filesystem won't boot.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Linus Torvalds [Tue, 26 Aug 2014 20:50:23 +0000 (13:50 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- wire up the system calls seccomp, getrandom and memfd_create
- use static system information as input to add_device_randomness
- .. and three bug fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/sclp: remove unnecessary XTABS flag
s390/3215: fix tty output containing tabs
s390: wire up memfd_create syscall
s390: add system information as device randomness
s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL
s390: wire up seccomp and getrandom syscalls
Pranith Kumar [Mon, 25 Aug 2014 01:17:32 +0000 (18:17 -0700)]
Documentation: this_cpu_ops.txt: Update description of this_cpu_ops
Update the description for per cpu operations to clarify use cases of
this_cpu operations and add considerations for remote access.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Mon, 25 Aug 2014 01:17:17 +0000 (18:17 -0700)]
scripts/kernel-doc: recognize __meminit
Fix scripts/kernel-doc to recognize __meminit in a function prototype
and to strip it, as done with many other attributes.
Fixes this warning:
Warning(..//mm/page_alloc.c:2973): cannot understand function prototype: 'void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alex Deucher [Mon, 25 Aug 2014 18:52:15 +0000 (14:52 -0400)]
drm/radeon: handle broken disabled rb mask gracefully (6xx/7xx) (v2)
This is a port of
cedb655a3a7764c3fd946077944383c9e0e68dd4
to older asics. Fixes a possible divide by 0 if the harvest
register is invalid.
v2: drop some additional harvest munging.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Christian König [Tue, 26 Aug 2014 12:45:54 +0000 (14:45 +0200)]
drm/radeon: save/restore the PD addr on suspend/resume
This fixes a problem with GPU resets and TLB flushes on SI/CIK.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Toshiaki Makita [Tue, 26 Aug 2014 11:56:36 +0000 (20:56 +0900)]
cfq-iosched: Fix wrong children_weight calculation
cfq_group_service_tree_add() is applying new_weight at the beginning of
the function via cfq_update_group_weight().
This actually allows weight to change between adding it to and subtracting
it from children_weight, and triggers WARN_ON_ONCE() in
cfq_group_service_tree_del(), or even causes oops by divide error during
vfr calculation in cfq_group_service_tree_add().
The detailed scenario is as follows:
1. Create blkio cgroups X and Y as a child of X.
Set X's weight to 500 and perform some I/O to apply new_weight.
This X's I/O completes before starting Y's I/O.
2. Y starts I/O and cfq_group_service_tree_add() is called with Y.
3. cfq_group_service_tree_add() walks up the tree during children_weight
calculation and adds parent X's weight (500) to children_weight of root.
children_weight becomes 500.
4. Set X's weight to 1000.
5. X starts I/O and cfq_group_service_tree_add() is called with X.
6. cfq_group_service_tree_add() applies its new_weight (1000).
7. I/O of Y completes and cfq_group_service_tree_del() is called with Y.
8. I/O of X completes and cfq_group_service_tree_del() is called with X.
9. cfq_group_service_tree_del() subtracts X's weight (1000) from
children_weight of root. children_weight becomes -500.
This triggers WARN_ON_ONCE().
10. Set X's weight to 500.
11. X starts I/O and cfq_group_service_tree_add() is called with X.
12. cfq_group_service_tree_add() applies its new_weight (500) and adds it
to children_weight of root. children_weight becomes 0. Calcularion of
vfr triggers oops by divide error.
weight should be updated right before adding it to children_weight.
Reported-by: Ruki Sekiya <sekiya.ruki@lab.ntt.co.jp>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Wei Yongjun [Thu, 14 Aug 2014 01:01:40 +0000 (09:01 +0800)]
drm/msm: Fix missing unlock on error in msm_fbdev_create()
Add the missing unlock before return from function msm_fbdev_create()
in the error handling case.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 4 Aug 2014 19:45:16 +0000 (15:45 -0400)]
drm/msm: fix compile error for non-dt builds
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 4 Aug 2014 14:44:53 +0000 (10:44 -0400)]
drm/msm/mdp4: request vblank during modeset
This avoids a problem seen with weston (for example) where the display
gets stuck in "black screen" if starting weston first thing after boot.
Possibly mdp5 needs something similar. The downstream android fbdev
driver always requests DMA_E (or DMA_P) when display is active, rather
than only enabling it on-demand as the drm driver does, which I believe
has the same end result.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 9 Aug 2014 13:07:25 +0000 (09:07 -0400)]
drm/msm: avoid flood of kernel logs on faults
87e956e9 changed the fault handler to return -ENOSYS, which causes the
iommu driver to print out a huge splat. Which wouldn't be quite so bad
if nothing ever faulted. But seems like some EXA composite operations
generate quite a lot of (seemingly harmless) faults. That is probably a
userspace problem, but the huge increase in verbosity from iommu fault
dumps makes things kind of unusable.
We probably should actually log *some* message (not conditional on
drm.debug). But ratelimit it.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Sabrina Dubroca [Tue, 26 Aug 2014 14:14:02 +0000 (16:14 +0200)]
block: fix error handling in sg_io
Before commit
2cada584b200 ("block: cleanup error handling in sg_io"),
we had ret = 0 before entering the last big if block of sg_io.
Since
2cada584b200, ret = -EFAULT, which breaks hdparm:
/dev/sda:
setting Advanced Power Management level to 0xc8 (200)
HDIO_DRIVE_CMD failed: Bad address
APM_level = 128
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes:
2cada584b200 ("block: cleanup error handling in sg_io")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Jingoo Han [Tue, 26 Aug 2014 10:26:18 +0000 (12:26 +0200)]
drm: sti: Add missing dependency on RESET_CONTROLLER
Add missing dependency on RESET_CONTROLLER in order to fix
the following build error.
drivers/gpu/drm/sti/sti_hdmi.c: In function 'sti_hdmi_probe'
drivers/gpu/drm/sti/sti_hdmi.c:780:2: error: implicit declaration of function 'devm_reset_control_get'
[-Werror=implicit-function-declaration]
Benjamin Gaignard remark:
I have change "depends on" to "select" but keep the original author name.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Kiran Padwal [Tue, 26 Aug 2014 10:25:24 +0000 (12:25 +0200)]
drm: sti: Make of_device_id array const
Make of_device_id array const, because all OF functions handle it as const.
Signed-off-by: Kiran Padwal <kiran.padwal@smartplayin.com>
Wei Yongjun [Tue, 26 Aug 2014 10:24:24 +0000 (12:24 +0200)]
drm: sti: Fix return value check in sti_drm_platform_probe()
In case of error, the function platform_device_register_resndata()
returns ERR_PTR() and never returns NULL. The NULL test in the return
value check should be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Wei Yongjun [Tue, 26 Aug 2014 10:23:52 +0000 (12:23 +0200)]
drm: sti: hda: fix return value check in sti_hda_probe()
In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Wei Yongjun [Tue, 26 Aug 2014 10:23:07 +0000 (12:23 +0200)]
drm: sti: hdmi: fix return value check in sti_hdmi_probe()
In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Wei Yongjun [Tue, 26 Aug 2014 10:17:36 +0000 (12:17 +0200)]
drm: sti: tvout: fix return value check in sti_tvout_probe()
In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Josef Bacik [Mon, 25 Aug 2014 17:59:41 +0000 (13:59 -0400)]
trace: Fix epoll hang when we race with new entries
Epoll on trace_pipe can sometimes hang in a weird case. If the ring buffer is
empty when we set waiters_pending but an event shows up exactly at that moment
we can miss being woken up by the ring buffers irq work. Since
ring_buffer_empty() is inherently racey we will sometimes think that the buffer
is not empty. So we don't get woken up and we don't think there are any events
even though there were some ready when we added the watch, which makes us hang.
This patch fixes this by making sure that we are actually on the wait list
before we set waiters_pending, and add a memory barrier to make sure
ring_buffer_empty() is going to be correct.
Link: http://lkml.kernel.org/p/1408989581-23727-1-git-send-email-jbacik@fb.com
Cc: stable@vger.kernel.org # 3.10+
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Mon, 25 Aug 2014 22:36:20 +0000 (15:36 -0700)]
Linux 3.17-rc2
Linus Torvalds [Mon, 25 Aug 2014 22:34:28 +0000 (15:34 -0700)]
Merge tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Highlights:
- more fixes for read/write codepath regressions
* sleeping while holding the inode lock
* stricter enforcement of page contiguity when coalescing requests
* fix up error handling in the page coalescing code
- don't busy wait on SIGKILL in the file locking code"
* tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait
nfs: can_coalesce_requests must enforce contiguity
nfs: disallow duplicate pages in pgio page vectors
nfs: don't sleep with inode lock in lock_and_join_requests
nfs: fix error handling in lock_and_join_requests
nfs: use blocking page_group_lock in add_request
nfs: fix nonblocking calls to nfs_page_group_lock
nfs: change nfs_page_group_lock argument
Linus Torvalds [Mon, 25 Aug 2014 22:29:33 +0000 (15:29 -0700)]
Merge tag 'renesas-sh-drivers-for-v3.17' of git://git./linux/kernel/git/horms/renesas
Pull SH driver fix from Simon Horman:
"Confine SH_INTC to platforms that need it"
* tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
sh: intc: Confine SH_INTC to platforms that need it
Linus Torvalds [Mon, 25 Aug 2014 22:28:57 +0000 (15:28 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Pretty much all across the field so with this we should be in
reasonable shape for the upcoming -rc2"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: OCTEON: make get_system_type() thread-safe
MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores
MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init'
MIPS: EVA: Add new EVA header
MIPS: scall64-o32: Fix indirect syscall detection
MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64
MIPS: Loongson: Fix COP2 usage for preemptible kernel
MIPS: NL: Fix nlm_xlp_defconfig build error
MIPS: Remove race window in page fault handling
MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G
MIPS: Alchemy: Fix db1200 PSC clock enablement
MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785
MIPS: Remove duplicated include from numa.c
MIPS: Add common plat_irq_dispatch declaration
MIPS: MSP71xx: remove unused plat_irq_dispatch() argument
MIPS: GIC: Remove useless parens from GICBIS().
MIPS: perf: Mark pmu interupt IRQF_NO_THREAD
Linus Torvalds [Mon, 25 Aug 2014 22:11:53 +0000 (15:11 -0700)]
Merge tag 'trace-fixes-v3.17-rc1' of git://git./linux/kernel/git/rostedt/linux-trace
Pull fix for ftrace function tracer/profiler conflict from Steven Rostedt:
"The rewrite of the ftrace code that makes it possible to allow for
separate trampolines had a design flaw with the interaction between
the function and function_graph tracers.
The main flaw was the simplification of the use of multiple tracers
having the same filter (like function and function_graph, that use the
set_ftrace_filter file to filter their code). The design assumed that
the two tracers could never run simultaneously as only one tracer can
be used at a time. The problem with this assumption was that the
function profiler could be implemented on top of the function graph
tracer, and the function profiler could run at the same time as the
function tracer. This caused the assumption to be broken and when
ftrace detected this failed assumpiton it would spit out a nasty
warning and shut itself down.
Instead of using a single ftrace_ops that switches between the
function and function_graph callbacks, the two tracers can again use
their own ftrace_ops. But instead of having a complex hierarchy of
ftrace_ops, the filter fields are placed in its own structure and the
ftrace_ops can carefully use the same filter. This change took a bit
to be able to allow for this and currently only the global_ops can
share the same filter, but this new design can easily be modified to
allow for any ftrace_ops to share its filter with another ftrace_ops.
The first four patches deal with the change of allowing the ftrace_ops
to share the filter (and this needs to go to 3.16 as well).
The fifth patch fixes a bug that was also caused by the new changes
but only for archs other than x86, and only if those archs implement a
direct call to the function_graph tracer which they do not do yet but
will in the future. It does not need to go to stable, but needs to be
fixed before the other archs update their code to allow direct calls
to the function_graph trampoline"
* tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Use current addr when converting to nop in __ftrace_replace_code()
ftrace: Fix function_profiler and function tracer together
ftrace: Fix up trampoline accounting with looping on hash ops
ftrace: Update all ftrace_ops for a ftrace_hash_ops update
ftrace: Allow ftrace_ops to use the hashes from other ops
Benjamin Tissoires [Fri, 22 Aug 2014 20:16:05 +0000 (16:16 -0400)]
HID: logitech-dj: prevent false errors to be shown
Commit "HID: logitech: perform bounds checking on device_id early
enough" unfortunately leaks some errors to dmesg which are not real
ones:
- if the report is not a DJ one, then there is not point in checking
the device_id
- the receiver (index 0) can also receive some notifications which
can be safely ignored given the current implementation
Move out the test regarding the report_id and also discards
printing errors when the receiver got notified.
Fixes:
ad3e14d7c5268c2e24477c6ef54bbdf88add5d36
Cc: stable@vger.kernel.org
Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Linus Torvalds [Sun, 24 Aug 2014 23:17:41 +0000 (16:17 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A couple of EFI fixes, plus misc fixes all around the map"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm64: Store Runtime Services revision
firmware: Do not use WARN_ON(!spin_is_locked())
x86_32, entry: Clean up sysenter_badsys declaration
x86/doc: Fix the 'tlb_single_page_flush_ceiling' sysconfig path
x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the variable read-mostly
x86/mm: Fix RCU splat from new TLB tracepoints
Linus Torvalds [Sun, 24 Aug 2014 23:16:55 +0000 (16:16 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A kprobes and a perf compat ioctl fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Handle compat ioctl
kprobes: Skip kretprobe hit in NMI context to avoid deadlock
Linus Torvalds [Sun, 24 Aug 2014 22:57:00 +0000 (15:57 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A collection of fixes from this week, it's been pretty quiet and
nothing really stands out as particularly noteworthy here -- mostly
minor fixes across the field:
- ODROID booting was fixed due to PMIC interrupts missing in DT
- a collection of i.MX fixes
- minor Tegra fix for regulators
- Rockchip fix and addition of SoC-specific mailing list to make it
easier to find posted patches"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
bus: arm-ccn: Fix warning message
ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux
ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies
MAINTAINERS: add new Rockchip SoC list
ARM: dts: rockchip: readd missing mmc0 pinctrl settings
ARM: dts: ODROID i2c improvements
ARM: dts: Enable PMIC interrupts on ODROID
ARM: dts: imx6sx: fix the pad setting for uart CTS_B
ARM: dts: i.MX53: fix apparent bug in VPU clks
ARM: imx: correct gpu2d_axi and gpu3d_axi clock setting
ARM: dts: imx6: edmqmx6: change enet reset pin
ARM: dts: vf610-twr: Fix pinctrl_esdhc1 pin definitions.
ARM: imx: remove unnecessary ARCH_HAS_OPP select
ARM: imx: fix TLB missing of IOMUXC base address during suspend
ARM: imx6: fix SMP compilation again
ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes
Linus Torvalds [Sun, 24 Aug 2014 22:54:23 +0000 (15:54 -0700)]
Merge tag 'gpio-v3.17-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull gpio fixes from Linus Walleij:
- a largeish fix for the IRQ handling in the new Zynq driver. The
quite verbose commit message gives the exact details.
- move some defines for gpiod flags outside an ifdef to make stub
functions work again.
- various minor fixes that we can accept for -rc1.
* tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio-lynxpoint: enable input sensing in resume
gpio: move GPIOD flags outside #ifdef
gpio: delete unneeded test before of_node_put
gpio: zynq: Fix IRQ handlers
gpiolib: devres: use correct structure type name in sizeof
MAINTAINERS: Change maintainer for gpio-bcm-kona.c
Linus Torvalds [Sun, 24 Aug 2014 22:48:12 +0000 (15:48 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Intel and radeon fixes.
Post KS/LC git requests from i915 and radeon stacked up. They are all
fixes along with some new pci ids for radeon, and one maintainers file
entry.
- i915: display fixes and irq fixes
- radeon: pci ids, and misc gpuvm, dpm and hdp cache"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits)
MAINTAINERS: Add entry for Renesas DRM drivers
drm/radeon: add additional SI pci ids
drm/radeon: add new bonaire pci ids
drm/radeon: add new KV pci id
Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe"
drm/radeon: fix active_cu mask on SI and CIK after re-init (v3)
drm/radeon: fix active cu count for SI and CIK
drm/radeon: re-enable selective GPUVM flushing
drm/radeon: Sync ME and PFP after CP semaphore waits v4
drm/radeon: fix display handling in radeon_gpu_reset
drm/radeon: fix pm handling in radeon_gpu_reset
drm/radeon: Only flush HDP cache for indirect buffers from userspace
drm/radeon: properly document reloc priority mask
drm/i915: don't try to retrain a DP link on an inactive CRTC
drm/i915: make sure VDD is turned off during system suspend
drm/i915: cancel hotplug and dig_port work during suspend and unload
drm/i915: fix HPD IRQ reenable work cancelation
drm/i915: take display port power domain in DP HPD handler
drm/i915: Don't try to enable cursor from setplane when crtc is disabled
drm/i915: Skip load detect when intel_crtc->new_enable==true
...
Benjamin LaHaise [Sun, 24 Aug 2014 17:14:05 +0000 (13:14 -0400)]
aio: fix reqs_available handling
As reported by Dan Aloni, commit
f8567a3845ac ("aio: fix aio request
leak when events are reaped by userspace") introduces a regression when
user code attempts to perform io_submit() with more events than are
available in the ring buffer. Reverting that commit would reintroduce a
regression when user space event reaping is used.
Fixing this bug is a bit more involved than the previous attempts to fix
this regression. Since we do not have a single point at which we can
count events as being reaped by user space and io_getevents(), we have
to track event completion by looking at the number of events left in the
event ring. So long as there are as many events in the ring buffer as
there have been completion events generate, we cannot call
put_reqs_available(). The code to check for this is now placed in
refill_reqs_available().
A test program from Dan and modified by me for verifying this bug is available
at http://www.kvack.org/~bcrl/
20140824-aio_bug.c .
Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Dan Aloni <dan@kernelim.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: stable@vger.kernel.org # v3.16 and anything that f8567a3845ac was backported to
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pawel Moll [Mon, 18 Aug 2014 17:20:49 +0000 (18:20 +0100)]
bus: arm-ccn: Fix warning message
A message warning a user about wrong vc value was printing
out port instead.
Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>