Al Viro [Wed, 19 Nov 2014 18:02:53 +0000 (13:02 -0500)]
Merge tag 'trace-seq-file-cleanup' of git://git./linux/kernel/git/rostedt/linux-trace into for-next
Pull the beginning of seq_file cleanup from Steven:
"I'm looking to clean up the seq_file code and to eventually merge the
trace_seq code with seq_file as well, since they basically do the same thing.
Part of this process is to remove the return code of seq_printf() and friends
as they are rather inconsistent. It is better to use the new function
seq_has_overflowed() if you want to stop processing when the buffer
is full. Note, if the buffer is full, the seq_file code will throw away
the contents, allocate a bigger buffer, and then call your code again
to fill in the data. The only thing that breaking out of the function
early does is to save a little time which is probably never noticed.
I started with patches from Joe Perches and modified them as well.
There's many more places that need to be updated before we can convert
seq_printf() and friends to return void. But this patch set introduces
the seq_has_overflowed() and does some initial updates."
Al Viro [Fri, 31 Oct 2014 05:22:04 +0000 (01:22 -0400)]
kill f_dentry macro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 19 Nov 2014 18:01:59 +0000 (13:01 -0500)]
Merge branch 'for-lustre' into for-next
Mikulas Patocka [Fri, 5 Sep 2014 16:16:01 +0000 (12:16 -0400)]
dcache: fix kmemcheck warning in switch_names
This patch fixes kmemcheck warning in switch_names. The function
switch_names swaps inline names of two dentries. It swaps full arrays
d_iname, no matter how many bytes are really used by the strings. Reading
data beyond string ends results in kmemcheck warning.
We fix the bug by marking both arrays as fully initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v3.15
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 21:44:57 +0000 (17:44 -0400)]
new helper: audit_file()
... for situations when we don't have any candidate in pathnames - basically,
in descriptor-based syscalls.
[Folded the build fix for !CONFIG_AUDITSYSCALL configs from Chen Gang]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 06:42:53 +0000 (02:42 -0400)]
nfsd_vfs_write(): use file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 06:41:28 +0000 (02:41 -0400)]
ncpfs: use file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 05:22:04 +0000 (01:22 -0400)]
kill f_dentry uses
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 04:43:30 +0000 (00:43 -0400)]
lockd: get rid of ->f_path.dentry->d_sb
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 04:42:35 +0000 (00:42 -0400)]
procfs: get rid of ->f_dentry
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 04:41:12 +0000 (00:41 -0400)]
nfsd: get rid of ->f_dentry
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 31 Oct 2014 04:36:30 +0000 (00:36 -0400)]
rpc_pipefs.c: get rid of f_dentry
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 3 Sep 2013 17:37:45 +0000 (13:37 -0400)]
afs_fsync: don't bother with ->f_path.dentry
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 22 Oct 2014 04:25:12 +0000 (00:25 -0400)]
cifs: get rid of ->f_path.dentry->d_sb uses, add a new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 22 Oct 2014 00:21:18 +0000 (20:21 -0400)]
btrfs: get rid of f_dentry use
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 22 Oct 2014 00:19:11 +0000 (20:19 -0400)]
nfsd/nfsctl.c: new helper
... to get from opened file on nfsctl to relevant struct net *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 22 Oct 2014 00:11:25 +0000 (20:11 -0400)]
assorted conversions to %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 13 Oct 2014 02:24:21 +0000 (22:24 -0400)]
switch d_materialise_unique() users to d_splice_alias()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 13 Oct 2014 02:16:02 +0000 (22:16 -0400)]
merge d_materialise_unique() into d_splice_alias()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 19 Nov 2014 18:00:57 +0000 (13:00 -0500)]
Merge branch 'for-gfs2' into for-next
Al Viro [Sun, 12 Oct 2014 22:46:26 +0000 (18:46 -0400)]
d_add_ci() should just accept a hashed exact match if it finds one
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 19 Nov 2014 02:10:23 +0000 (21:10 -0500)]
gfs2_atomic_open(): simplify the use of finish_no_open()
In ->atomic_open(inode, dentry, file, opened) calling finish_no_open(file, NULL)
is equivalent to dget(dentry); return finish_no_open(file, dentry);
No need to open-code that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 18 Sep 2014 21:42:35 +0000 (17:42 -0400)]
gfs2_create_inode(): don't bother with d_splice_alias()
dentry is always hashed and negative, inode - non-error, non-NULL and
non-directory. In such conditions d_splice_alias() is equivalent to
"d_instantiate(dentry, inode) and return NULL", which simplifies the
downstream code and is consistent with the "have to create a new object"
case.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 18 Sep 2014 21:38:59 +0000 (17:38 -0400)]
gfs2: bugger off early if O_CREAT open finds a directory
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Joe Perches [Mon, 29 Sep 2014 23:08:26 +0000 (16:08 -0700)]
debugfs: Have debugfs_print_regs32() return void
The seq_printf() will soon just return void, and seq_has_overflowed()
should be used instead to see if the seq can no longer accept input.
As the return value of debugfs_print_regs32() has no users and
the seq_file descriptor should be checked with seq_has_overflowed()
instead of return values of functions, it is better to just have
debugfs_print_regs32() also return void.
Link: http://lkml.kernel.org/p/2634b19eb1c04a9d31148c1fe6f1f3819be95349.1412031505.git.joe@perches.com
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joe Perches <joe@perches.com>
[ original change only updated seq_printf() return, added return of
void to debugfs_print_regs32() as well ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Joe Perches [Mon, 29 Sep 2014 23:08:25 +0000 (16:08 -0700)]
fs: Convert show_fdinfo functions to void
seq_printf functions shouldn't really check the return value.
Checking seq_has_overflowed() occasionally is used instead.
Update vfs documentation.
Link: http://lkml.kernel.org/p/e37e6e7b76acbdcc3bb4ab2a57c8f8ca1ae11b9a.1412031505.git.joe@perches.com
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Joe Perches <joe@perches.com>
[ did a few clean ups ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Joe Perches [Mon, 29 Sep 2014 23:08:24 +0000 (16:08 -0700)]
dlm: Use seq_puts() instead of seq_printf() for constant strings
Convert the seq_printf output with constant strings to seq_puts.
Link: http://lkml.kernel.org/p/b416b016f4a6e49115ba736cad6ea2709a8bc1c4.1412031505.git.joe@perches.com
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Joe Perches [Mon, 29 Sep 2014 23:08:23 +0000 (16:08 -0700)]
dlm: Remove seq_printf() return checks and use seq_has_overflowed()
The seq_printf() return is going away soon and users of it should
check seq_has_overflowed() to see if the buffer is full and will
not accept any more data.
Convert functions returning int to void where seq_printf() is used.
Link: http://lkml.kernel.org/p/43590057bcb83846acbbcc1fe641f792b2fb7773.1412031505.git.joe@perches.com
Link: http://lkml.kernel.org/r/20141029220107.939492048@goodmis.org
Acked-by: David Teigland <teigland@redhat.com>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Steven Rostedt (Red Hat) [Mon, 27 Oct 2014 21:43:45 +0000 (17:43 -0400)]
netfilter: Remove checks of seq_printf() return values
The return value of seq_printf() is soon to be removed. Remove the
checks from seq_printf() in favor of seq_has_overflowed().
Link: http://lkml.kernel.org/r/20141104142236.GA10239@salvia
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Joe Perches [Mon, 29 Sep 2014 23:08:22 +0000 (16:08 -0700)]
netfilter: Convert print_tuple functions to return void
Since adding a new function to seq_file (seq_has_overflowed())
there isn't any value for functions called from seq_show to
return anything. Remove the int returns of the various
print_tuple/<foo>_print_tuple functions.
Link: http://lkml.kernel.org/p/f2e8cf8df433a197daa62cbaf124c900c708edc7.1412031505.git.joe@perches.com
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Steven Rostedt (Red Hat) [Mon, 27 Oct 2014 20:02:47 +0000 (16:02 -0400)]
netfilter: Remove return values for print_conntrack callbacks
The seq_printf() and friends are having their return values removed.
The print_conntrack() returns the result of seq_printf(), which is
meaningless when seq_printf() returns void. Might as well remove the
return values of print_conntrack() as well.
Link: http://lkml.kernel.org/r/20141029220107.465008329@goodmis.org
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Al Viro [Sun, 26 Oct 2014 23:31:10 +0000 (19:31 -0400)]
deal with deadlock in d_walk()
... by not hitting rename_retry for reasons other than rename having
happened. In other words, do _not_ restart when finding that
between unlocking the child and locking the parent the former got
into __dentry_kill(). Skip the killed siblings instead...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 19:20:42 +0000 (15:20 -0400)]
lustre: use is_root_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 03:36:49 +0000 (23:36 -0400)]
lustre: get rid of duplicate mountpoint checks
VFS has already done them
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 03:35:36 +0000 (23:35 -0400)]
kill ll_link_generic()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 03:31:54 +0000 (23:31 -0400)]
ll_get_child_fid(): callers already have the child
no need to bother with d_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 03:27:44 +0000 (23:27 -0400)]
kill ll_rename_generic()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:38:51 +0000 (22:38 -0400)]
kill ll_unlink_generic()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:36:22 +0000 (22:36 -0400)]
kill ll_rmdir_generic()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:29:54 +0000 (22:29 -0400)]
ll_new_inode(): don't bother with name - it's always &dentry->d_name
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:26:04 +0000 (22:26 -0400)]
kill ll_symlink_generic()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:22:38 +0000 (22:22 -0400)]
kill ll_mkdir_generic()
fold into ll_mkdir()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 02:19:53 +0000 (22:19 -0400)]
kill ll_mknod_generic()
just fold into ll_mknod()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 00:39:57 +0000 (20:39 -0400)]
lustre: switch ll_release_openhandle() to struct inode *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 20 Oct 2014 22:02:33 +0000 (18:02 -0400)]
lustre: use file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 8 May 2014 01:10:24 +0000 (21:10 -0400)]
lustre: use %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 8 May 2014 00:55:28 +0000 (20:55 -0400)]
lustre: opened file can't have negative dentry
... and ll_md_close() gets inode already equal to file_inode(file),
along with ll_file_release()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 9 May 2014 14:05:18 +0000 (10:05 -0400)]
ll_setxattr(): get rid of struct file on stack
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 8 May 2014 01:24:47 +0000 (21:24 -0400)]
lustre: switch ll_intent_file_open() to struct dentry *
... because fake struct file is wrong.
ll_prep_inode() in there is an atrocity - despite passing a pointer to
inode by address, it actually only uses the value when that value is
non-NULL, as it will be here.
Oh, and ->d_parent of anything is never NULL.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 26 Oct 2014 23:19:16 +0000 (19:19 -0400)]
move d_rcu from overlapping d_child to overlapping d_alias
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 21 Oct 2014 19:20:42 +0000 (15:20 -0400)]
new helper: is_root_inode()
replace open-coded instances
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 30 Oct 2014 16:37:34 +0000 (17:37 +0100)]
vfs: make first argument of dir_context.actor typed
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Fri, 31 Oct 2014 19:02:42 +0000 (20:02 +0100)]
ovl: initialize ->is_cursor
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Jeffery [Mon, 29 Sep 2014 14:21:10 +0000 (10:21 -0400)]
Return short read or 0 at end of a raw device, not EIO
Author: David Jeffery <djeffery@redhat.com>
Changes to the basic direct I/O code have broken the raw driver when reading
to the end of a raw device. Instead of returning a short read for a read that
extends partially beyond the device's end or 0 when at the end of the device,
these reads now return EIO.
The raw driver needs the same end of device handling as was added for normal
block devices. Using blkdev_read_iter, which has the needed size checks,
prevents the EIO conditions at the end of the device.
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 28 Oct 2014 22:40:11 +0000 (18:40 -0400)]
isofs: don't bother with ->d_op for normal case
we only need it for joliet and case-insensitive mounts
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Joe Perches [Mon, 29 Sep 2014 23:08:21 +0000 (16:08 -0700)]
seq_file: Rename seq_overflow() to seq_has_overflowed() and make public
The return values of seq_printf/puts/putc are frequently misused.
Start down a path to remove all the return value uses of these
functions.
Move the seq_overflow() to a global inlined function called
seq_has_overflowed() that can be used by the users of seq_file() calls.
Update the documentation to not show return types for seq_printf
et al. Add a description of seq_has_overflowed().
Link: http://lkml.kernel.org/p/848ac7e3d1c31cddf638a8526fa3c59fa6fdeb8a.1412031505.git.joe@perches.com
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Joe Perches <joe@perches.com>
[ Reworked the original patch from Joe ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Al Viro [Tue, 28 Oct 2014 22:37:40 +0000 (18:37 -0400)]
isofs_cmp(): we'll never see a dentry for . or ..
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 27 Oct 2014 14:42:01 +0000 (15:42 +0100)]
overlayfs: fix lockdep misannotation
In an overlay directory that shadows an empty lower directory, say
/mnt/a/empty102, do:
touch /mnt/a/empty102/x
unlink /mnt/a/empty102/x
rmdir /mnt/a/empty102
It's actually harmless, but needs another level of nesting between
I_MUTEX_CHILD and I_MUTEX_NORMAL.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 27 Oct 2014 12:48:48 +0000 (13:48 +0100)]
ovl: fix check for cursor
ovl_cache_entry.name is now an array not a pointer, so it makes no sense
test for it being NULL.
Detected by coverity.
From: Miklos Szeredi <mszeredi@suse.cz>
Fixes:
68bf8611076a ("overlayfs: make ovl_cache_entry->name an array instead of
+pointer")
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 28 Oct 2014 22:27:28 +0000 (18:27 -0400)]
overlayfs: barriers for opening upper-layer directory
make sure that
a) all stores done by opening struct file don't leak past storing
the reference in od->upperfile
b) the lockless side has read dependency barrier
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Paul E. McKenney [Tue, 28 Oct 2014 04:11:27 +0000 (21:11 -0700)]
rcu: Provide counterpart to rcu_dereference() for non-RCU situations
Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying. This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Daniel Thompson [Mon, 27 Oct 2014 18:51:43 +0000 (18:51 +0000)]
staging: android: logger: Fix log corruption regression
Since commit
cd678fce4280 ("switch logger to ->write_iter()"), any
attempt to write to the log results in the log data being written over
its own metadata, thus rendering the log unreadable.
The problem was first detected when I ran an Android userspace on the
v3.18-rc1 kernel. However the issue can also be observed with a
non-Android userspace by using echo/cat to write to/from /dev/log_main .
This patch resolves the problem by using a temporary to track the status
of not-yet-committed writes to the log buffer.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sun, 26 Oct 2014 23:48:41 +0000 (16:48 -0700)]
Linux 3.18-rc2
Linus Torvalds [Sun, 26 Oct 2014 18:35:51 +0000 (11:35 -0700)]
Merge tag 'armsoc-for-rc2' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another week, another small batch of fixes.
Most of these make zynq, socfpga and sunxi platforms work a bit
better:
- due to new requirements for regulators, DWMMC on socfpga broke past
v3.17
- SMP spinup fix for socfpga
- a few DT fixes for zynq
- another option (FIXED_REGULATOR) for sunxi is needed that used to
be selected by other options but no longer is.
- a couple of small DT fixes for at91
- ...and a couple for i.MX"
* tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: imx28-evk: Let i2c0 run at 100kHz
ARM: i.MX6: Fix "emi" clock name typo
ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIP
ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGE
ARM: dts: socfpga: Add a 3.3V fixed regulator node
ARM: dts: socfpga: Fix SD card detect
ARM: dts: socfpga: rename gpio nodes
ARM: at91/dt: sam9263: fix PLLB frequencies
power: reset: at91-reset: fix power down register
MAINTAINERS: add atmel ssc driver maintainer entry
arm: socfpga: fix fetching cpu1start_addr for SMP
ARM: zynq: DT: trivial: Fix mc node
ARM: zynq: DT: Add cadence watchdog node
ARM: zynq: DT: Add missing reference for memory-controller
ARM: zynq: DT: Add missing reference for ADC
ARM: zynq: DT: Add missing address for L2 pl310
ARM: zynq: DT: Remove 222 MHz OPP
ARM: zynq: DT: Fix GEM register area size
Linus Torvalds [Sun, 26 Oct 2014 18:19:18 +0000 (11:19 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
"overlayfs merge + leak fix for d_splice_alias() failure exits"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
overlayfs: embed middle into overlay_readdir_data
overlayfs: embed root into overlay_readdir_data
overlayfs: make ovl_cache_entry->name an array instead of pointer
overlayfs: don't hold ->i_mutex over opening the real directory
fix inode leaks on d_splice_alias() failure exits
fs: limit filesystem stacking depth
overlay: overlay filesystem documentation
overlayfs: implement show_options
overlayfs: add statfs support
overlay filesystem
shmem: support RENAME_WHITEOUT
ext4: support RENAME_WHITEOUT
vfs: add RENAME_WHITEOUT
vfs: add whiteout support
vfs: export check_sticky()
vfs: introduce clone_private_mount()
vfs: export __inode_permission() to modules
vfs: export do_splice_direct() to modules
vfs: add i_op->dentry_open()
Olof Johansson [Sun, 26 Oct 2014 03:44:05 +0000 (20:44 -0700)]
Merge tag 'imx-fixes-3.18' of git://git./linux/kernel/git/shawnguo/linux into fixes
Merge "ARM: imx: fixes for 3.18" from Shawn Guo:
The i.MX fixes for 3.18:
- Revert one patch which increases I2C bus frequency on imx28-evk
- Fix a typo on imx6q EIM clock name
* tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx28-evk: Let i2c0 run at 100kHz
ARM: i.MX6: Fix "emi" clock name typo
Signed-off-by: Olof Johansson <olof@lixom.net>
Fabio Estevam [Mon, 20 Oct 2014 13:08:01 +0000 (11:08 -0200)]
ARM: dts: imx28-evk: Let i2c0 run at 100kHz
Commit
78b81f4666fb ("ARM: dts: imx28-evk: Run I2C0 at 400kHz") caused issues
when doing the following sequence in loop:
- Boot the kernel
- Perform audio playback
- Reboot the system via 'reboot' command
In many times the audio card cannot be probed, which causes playback to fail.
After restoring to the original i2c0 frequency of 100kHz there is no such
problem anymore.
This reverts commit
78b81f4666fbb22a20b1e63e5baf197ad2e90e88.
Cc: <stable@vger.kernel.org> # 3.16+
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Steve Longerbeam [Tue, 14 Oct 2014 17:41:49 +0000 (20:41 +0300)]
ARM: i.MX6: Fix "emi" clock name typo
Fix a typo error, the "emi" names refer to the eim clocks.
The change fixes typo in EIM and EIM_SLOW pre-output dividers and
selectors clock names. Notably EIM_SLOW clock itself is named correctly.
Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com>
[vladimir_zapolskiy@mentor.com: ported to v3.17]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Al Viro [Fri, 24 Oct 2014 03:03:03 +0000 (23:03 -0400)]
overlayfs: embed middle into overlay_readdir_data
same story...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 24 Oct 2014 03:00:53 +0000 (23:00 -0400)]
overlayfs: embed root into overlay_readdir_data
no sense having it a pointer - all instances have it pointing to
local variable in the same stack frame
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 24 Oct 2014 02:58:56 +0000 (22:58 -0400)]
overlayfs: make ovl_cache_entry->name an array instead of pointer
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 24 Oct 2014 02:56:05 +0000 (22:56 -0400)]
overlayfs: don't hold ->i_mutex over opening the real directory
just use it to serialize the assignment
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Fri, 24 Oct 2014 19:48:47 +0000 (12:48 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"This is the first round of fixes and tying up loose ends for MIPS.
- plenty of fixes for build errors in specific obscure configurations
- remove redundant code on the Lantiq platform
- removal of a useless SEAD I2C driver that was causing a build issue
- fix an earlier TLB exeption handler fix to also work on Octeon.
- fix ISA level dependencies in FPU emulator's instruction decoding.
- don't hardcode kernel command line in Octeon software emulator.
- fix an earlier fix for the Loondson 2 clock setting"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: SEAD3: Fix I2C device registration.
MIPS: SEAD3: Nuke PIC32 I2C driver.
MIPS: ftrace: Fix a microMIPS build problem
MIPS: MSP71xx: Fix build error
MIPS: Malta: Do not build the malta-amon.c file if CMP is not enabled
MIPS: Prevent compiler warning from cop2_{save,restore}
MIPS: Kconfig: Add missing MIPS_CPS dependencies to PM and cpuidle
MIPS: idle: Remove leftover __pastwait symbol and its references
MIPS: Sibyte: Include the swarm subdir to the sb1250 LittleSur builds
MIPS: ptrace.h: Add a missing include
MIPS: ath79: Fix compilation error when CONFIG_PCI is disabled
MIPS: MSP71xx: Remove compilation error when CONFIG_MIPS_MT is present
MIPS: Octeon: Remove special case for simulator command line.
MIPS: tlbex: Properly fix HUGE TLB Refill exception handler
MIPS: loongson2_cpufreq: Fix CPU clock rate setting mismerge
pci: pci-lantiq: remove duplicate check on resource
MIPS: Lasat: Add missing CONFIG_PROC_FS dependency to PICVUE_PROC
MIPS: cp1emu: Fix ISA restrictions for cop1x_op instructions
Linus Torvalds [Fri, 24 Oct 2014 19:48:04 +0000 (12:48 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- enable 48-bit VA space now that KVM has been fixed, together with a
couple of fixes for pgd allocation alignment and initial memblock
current_limit. There is still a dependency on !ARM_SMMU which needs
to be updated as it uses the page table manipulation macros of the
host kernel
- eBPF fixes following changes/conflicts during the merging window
- Compat types affecting compat_elf_prpsinfo
- Compilation error on UP builds
- ASLR fix when /proc/sys/kernel/randomize_va_space == 0
- DT definitions for CLCD support on ARMv8 model platform
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix memblock current_limit with 64K pages and 48-bit VA
arm64: ASLR: Don't randomise text when randomise_va_space == 0
arm64: vexpress: Add CLCD support to the ARMv8 model platform
arm64: Fix compilation error on UP builds
Documentation/arm64/memory.txt: fix typo
net: bpf: arm64: minor fix of type in jited
arm64: bpf: add 'load 64-bit immediate' instruction
arm64: bpf: add 'shift by register' instructions
net: bpf: arm64: address randomize and write protect JIT code
arm64: mm: Correct fixmap pagetable types
arm64: compat: fix compat types affecting struct compat_elf_prpsinfo
arm64: Align less than PAGE_SIZE pgds naturally
arm64: Allow 48-bits VA space without ARM_SMMU
Linus Torvalds [Fri, 24 Oct 2014 19:45:47 +0000 (12:45 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull two sparc fixes from David Miller:
1) Fix boots with gcc-4.9 compiled sparc64 kernels.
2) Add missing __get_user_pages_fast() on sparc64 to fix hangs on
futexes used in transparent hugepage areas.
It's really idiotic to have a weak symbolled fallback that just
returns zero, and causes this kind of bug. There should be no
backup implementation and the link should fail if the architecture
fails to provide __get_user_pages_fast() and supports transparent
hugepages.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Implement __get_user_pages_fast().
sparc64: Fix register corruption in top-most kernel stack frame during boot.
Linus Torvalds [Fri, 24 Oct 2014 19:42:55 +0000 (12:42 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"This is a pretty large update. I think it is roughly as big as what I
usually had for the _whole_ rc period.
There are a few bad bugs where the guest can OOPS or crash the host.
We have also started looking at attack models for nested
virtualization; bugs that usually result in the guest ring 0 crashing
itself become more worrisome if you have nested virtualization,
because the nested guest might bring down the non-nested guest as
well. For current uses of nested virtualization these do not really
have a security impact, but you never know and bugs are bugs
nevertheless.
A lot of these bugs are in 3.17 too, resulting in a large number of
stable@ Ccs. I checked that all the patches apply there with no
conflicts"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vfio: fix unregister kvm_device_ops of vfio
KVM: x86: Wrong assertion on paging_tmpl.h
kvm: fix excessive pages un-pinning in kvm_iommu_map error path.
KVM: x86: PREFETCH and HINT_NOP should have SrcMem flag
KVM: x86: Emulator does not decode clflush well
KVM: emulate: avoid accessing NULL ctxt->memopp
KVM: x86: Decoding guest instructions which cross page boundary may fail
kvm: x86: don't kill guest on unknown exit reason
kvm: vmx: handle invvpid vm exit gracefully
KVM: x86: Handle errors when RIP is set during far jumps
KVM: x86: Emulator fixes for eip canonical checks on near branches
KVM: x86: Fix wrong masking on relative jump/call
KVM: x86: Improve thread safety in pit
KVM: x86: Prevent host from panicking on shared MSR writes.
KVM: x86: Check non-canonical addresses upon WRMSR
Linus Torvalds [Fri, 24 Oct 2014 19:41:50 +0000 (12:41 -0700)]
Merge tag 'stable/for-linus-3.18-b-rc1-tag' of git://git./linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix regression in xen_clocksource_read() which caused all Xen guests
to crash early in boot.
- Several fixes for super rare race conditions in the p2m.
- Assorted other minor fixes.
* tag 'stable/for-linus-3.18-b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pci: Allocate memory for physdev_pci_device_add's optarr
x86/xen: panic on bad Xen-provided memory map
x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read()
x86/xen: avoid race in p2m handling
x86/xen: delay construction of mfn_list_list
x86/xen: avoid writing to freed memory after race in p2m handling
xen/balloon: Don't continue ballooning when BP_ECANCELED is encountered
Linus Torvalds [Fri, 24 Oct 2014 19:35:48 +0000 (12:35 -0700)]
Merge tag 'sound-3.18-rc2' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are a chunk of small fixes since rc1: two PCM core fixes, one is
a long-standing annoyance about lockdep and another is an ARM64 mmap
fix.
The rest are a HD-audio HDMI hotplug notification fix, a fix for
missing NULL termination in Realtek codec quirks and a few new
device/codec-specific quirks as usual"
* tag 'sound-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macro
ALSA: pcm: Fix false lockdep warnings
ALSA: hda - Fix inverted LED gpio setup for Lenovo Ideapad
ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug
ALSA: usb-audio: Add support for Steinberg UR22 USB interface
ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume
ALSA: pcm: use the same dma mmap codepath both for arm and arm64
Linus Torvalds [Fri, 24 Oct 2014 19:33:32 +0000 (12:33 -0700)]
Merge tag 'random_for_linus' of git://git./linux/kernel/git/tytso/random
Pull /dev/random updates from Ted Ts'o:
"This adds a memzero_explicit() call which is guaranteed not to be
optimized away by GCC. This is important when we are wiping
cryptographically sensitive material"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
crypto: memzero_explicit - make sure to clear out sensitive data
random: add and use memzero_explicit() for clearing data
Linus Torvalds [Fri, 24 Oct 2014 18:29:31 +0000 (11:29 -0700)]
Merge tag 'pm+acpi-3.18-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"This is material that didn't make it to my 3.18-rc1 pull request for
various reasons, mostly related to timing and travel (LinuxCon EU /
LPC) plus a couple of fixes for recent bugs.
The only really new thing here is the PM QoS class for memory
bandwidth, but it is simple enough and users of it will be added in
the next cycle. One major change in behavior is that platform devices
enumerated by ACPI will use 32-bit DMA mask by default. Also included
is an ACPICA update to a new upstream release, but that's mostly
cleanups, changes in tools and similar. The rest is fixes and
cleanups mostly.
Specifics:
- Fix for a recent PCI power management change that overlooked the
fact that some IRQ chips might not be able to configure PCIe PME
for system wakeup from Lucas Stach.
- Fix for a bug introduced in 3.17 where acpi_device_wakeup() is
called with a wrong ordering of arguments from Zhang Rui.
- A bunch of intel_pstate driver fixes (all -stable candidates) from
Dirk Brandewie, Gabriele Mazzotta and Pali Rohár.
- Fixes for a rather long-standing problem with the OOM killer and
the freezer that frozen processes killed by the OOM do not actually
release any memory until they are thawed, so OOM-killing them is
rather pointless, with a couple of cleanups on top (Michal Hocko,
Cong Wang, Rafael J Wysocki).
- ACPICA update to upstream release
20140926, inlcuding mostly
cleanups reducing differences between the upstream ACPICA and the
kernel code, tools changes (acpidump, acpiexec) and support for the
_DDN object (Bob Moore, Lv Zheng).
- New PM QoS class for memory bandwidth from Tomeu Vizoso.
- Default 32-bit DMA mask for platform devices enumerated by ACPI
(this change is mostly needed for some drivers development in
progress targeted at 3.19) from Heikki Krogerus.
- ACPI EC driver cleanups, mostly related to debugging, from Lv
Zheng.
- cpufreq-dt driver updates from Thomas Petazzoni.
- powernv cpuidle driver update from Preeti U Murthy"
* tag 'pm+acpi-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (34 commits)
intel_pstate: Correct BYT VID values.
intel_pstate: Fix BYT frequency reporting
intel_pstate: Don't lose sysfs settings during cpu offline
cpufreq: intel_pstate: Reflect current no_turbo state correctly
cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers
cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy
PCI / PM: handle failure to enable wakeup on PCIe PME
ACPI: invoke acpi_device_wakeup() with correct parameters
PM / freezer: Clean up code after recent fixes
PM: convert do_each_thread to for_each_process_thread
OOM, PM: OOM killed task shouldn't escape PM suspend
freezer: remove obsolete comments in __thaw_task()
freezer: Do not freeze tasks killed by OOM killer
ACPI / platform: provide default DMA mask
cpuidle: powernv: Populate cpuidle state details by querying the device-tree
cpufreq: cpufreq-dt: adjust message related to regulators
cpufreq: cpufreq-dt: extend with platform_data
cpufreq: allow driver-specific data
ACPI / EC: Cleanup coding style.
ACPI / EC: Refine event/query debugging messages.
...
Linus Torvalds [Fri, 24 Oct 2014 18:21:43 +0000 (11:21 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
"Sorry that I missed the merge window as there is a bug found in the
last minute, and I have to fix it and wait for the code to be tested
in linux-next tree for a few days. Now the buggy patch has been
dropped entirely from my next branch. Thus I hope those changes can
still be merged in 3.18-rc2 as most of them are platform thermal
driver changes.
Specifics:
- introduce ACPI INT340X thermal drivers.
Newer laptops and tablets may have thermal sensors and other
devices with thermal control capabilities that are exposed for the
OS to use via the ACPI INT340x device objects. Several drivers are
introduced to expose the temperature information and cooling
ability from these objects to user-space via the normal thermal
framework.
From: Lu Aaron, Lan Tianyu, Jacob Pan and Zhang Rui.
- introduce a new thermal governor, which just uses a hysteresis to
switch abruptly on/off a cooling device. This governor can be used
to control certain fan devices that can not be throttled but just
switched on or off. From: Peter Feuerer.
- introduce support for some new thermal interrupt functions on
i.MX6SX, in IMX thermal driver. From: Anson, Huang.
- introduce tracing support on thermal framework. From: Punit
Agrawal.
- small fixes in OF thermal and thermal step_wise governor"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
Thermal: int340x thermal: select ACPI fan driver
Thermal: int3400_thermal: use acpi_thermal_rel parsing APIs
Thermal: int340x_thermal: expose acpi thermal relationship tables
Thermal: introduce int3403 thermal driver
Thermal: introduce INT3402 thermal driver
Thermal: move the KELVIN_TO_MILLICELSIUS macro to thermal.h
ACPI / Fan: support INT3404 thermal device
ACPI / Fan: add ACPI 4.0 style fan support
ACPI / fan: convert to platform driver
ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant
ACPI / fan: remove no need check for device pointer
ACPI / fan: remove unused macro
Thermal: int3400 thermal: register to thermal framework
Thermal: int3400 thermal: add capability to detect supporting UUIDs
Thermal: introduce int3400 thermal driver
ACPI: add ACPI_TYPE_LOCAL_REFERENCE support to acpi_extract_package()
ACPI: make acpi_create_platform_device() an external API
thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
ACPI: introduce ACPI int340x thermal scan handler
thermal: Added Bang-bang thermal governor
...
Catalin Marinas [Fri, 24 Oct 2014 17:16:47 +0000 (18:16 +0100)]
arm64: Fix memblock current_limit with 64K pages and 48-bit VA
With 48-bit VA space, the 64K page configuration uses 3 levels instead
of 2 and PUD_SIZE != PMD_SIZE. Since with 64K pages we only cover
PMD_SIZE with the initial swapper_pg_dir populated in head.S, the
memblock current_limit needs to be set accordingly in map_mem() to avoid
allocating unmapped memory. The memblock current_limit is progressively
increased as more blocks are mapped.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
David S. Miller [Fri, 24 Oct 2014 16:59:02 +0000 (09:59 -0700)]
sparc64: Implement __get_user_pages_fast().
It is not sufficient to only implement get_user_pages_fast(), you
must also implement the atomic version __get_user_pages_fast()
otherwise you end up using the weak symbol fallback implementation
which simply returns zero.
This is dangerous, because it causes the futex code to loop forever
if transparent hugepages are supported (see get_futex_key()).
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 23 Oct 2014 19:58:13 +0000 (12:58 -0700)]
sparc64: Fix register corruption in top-most kernel stack frame during boot.
Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
eventually narrowed this down to only impacting machines using
UltraSPARC-III and derivitive cpus.
The crash happens right when the first user process is spawned:
[ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 54.451346]
[ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted
3.16.0-rc2-00211-gd7933ab #96
[ 54.666431] Call Trace:
[ 54.698453] [
0000000000762f8c] panic+0xb0/0x224
[ 54.759071] [
000000000045cf68] do_exit+0x948/0x960
[ 54.823123] [
000000000042cbc0] fault_in_user_windows+0xe0/0x100
[ 54.902036] [
0000000000404ad0] __handle_user_windows+0x0/0x10
[ 54.978662] Press Stop-A (L1-A) to return to the boot prom
[ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
Further investigation showed that compiling only per_cpu_patch() with
an older compiler fixes the boot.
Detailed analysis showed that the function is not being miscompiled by
gcc-4.9, but it is using a different register allocation ordering.
With the gcc-4.9 compiled function, something during the code patching
causes some of the %i* input registers to get corrupted. Perhaps
we have a TLB miss path into the firmware that is deep enough to
cause a register window spill and subsequent restore when we get
back from the TLB miss trap.
Let's plug this up by doing two things:
1) Stop using the firmware stack for client interface calls into
the firmware. Just use the kernel's stack.
2) As soon as we can, call into a new function "start_early_boot()"
to put a one-register-window buffer between the firmware's
deepest stack frame and the top-most initial kernel one.
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Chandran [Fri, 10 Oct 2014 11:31:24 +0000 (12:31 +0100)]
arm64: ASLR: Don't randomise text when randomise_va_space == 0
When user asks to turn off ASLR by writing "0" to
/proc/sys/kernel/randomize_va_space there should not be
any randomization to mmap base, stack, VDSO, libs, text and heap
Currently arm64 violates this behavior by randomising text.
Fix this by defining a constant ELF_ET_DYN_BASE. The randomisation of
mm->mmap_base is done by setup_new_exec -> arch_pick_mmap_layout ->
mmap_base -> mmap_rnd.
Signed-off-by: Arun Chandran <achandran@mvista.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ralf Baechle [Fri, 24 Oct 2014 10:23:26 +0000 (12:23 +0200)]
MIPS: SEAD3: Fix I2C device registration.
This isn't a module and shouldn't be one.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Wanpeng Li [Thu, 9 Oct 2014 10:30:08 +0000 (18:30 +0800)]
kvm: vfio: fix unregister kvm_device_ops of vfio
After commit
80ce163 (KVM: VFIO: register kvm_device_ops dynamically),
kvm_device_ops of vfio can be registered dynamically. Commit
3c3c29fd
(kvm-vfio: do not use module_init) move the dynamic register invoked by
kvm_init in order to fix broke unloading of the kvm module. However,
kvm_device_ops of vfio is unregistered after rmmod kvm-intel module
which lead to device type collision detection warning after kvm-intel
module reinsmod.
WARNING: CPU: 1 PID: 10358 at /root/cathy/kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:3289 kvm_init+0x234/0x282 [kvm]()
Modules linked in: kvm_intel(O+) kvm(O) nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd sunrpc pci_stub bridge stp llc autofs4 8021q cpufreq_ondemand ipv6 joydev microcode pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e i2c_i801 ixgbe ptp pps_core hwmon mdio tpm_tis tpm ipmi_si ipmi_msghandler acpi_cpufreq isci libsas scsi_transport_sas button dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kvm_intel]
CPU: 1 PID: 10358 Comm: insmod Tainted: G W O 3.17.0-rc1 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.
1311111329 11/11/2013
0000000000000cd9 ffff880ff08cfd18 ffffffff814a61d9 0000000000000cd9
0000000000000000 ffff880ff08cfd58 ffffffff810417b7 ffff880ff08cfd48
ffffffffa045bcac ffffffffa049c420 0000000000000040 00000000000000ff
Call Trace:
[<
ffffffff814a61d9>] dump_stack+0x49/0x60
[<
ffffffff810417b7>] warn_slowpath_common+0x7c/0x96
[<
ffffffffa045bcac>] ? kvm_init+0x234/0x282 [kvm]
[<
ffffffff810417e6>] warn_slowpath_null+0x15/0x17
[<
ffffffffa045bcac>] kvm_init+0x234/0x282 [kvm]
[<
ffffffffa016e995>] vmx_init+0x1bf/0x42a [kvm_intel]
[<
ffffffffa016e7d6>] ? vmx_check_processor_compat+0x64/0x64 [kvm_intel]
[<
ffffffff810002ab>] do_one_initcall+0xe3/0x170
[<
ffffffff811168a9>] ? __vunmap+0xad/0xb8
[<
ffffffff8109c58f>] do_init_module+0x2b/0x174
[<
ffffffff8109d414>] load_module+0x43e/0x569
[<
ffffffff8109c6d8>] ? do_init_module+0x174/0x174
[<
ffffffff8109c75a>] ? copy_module_from_user+0x39/0x82
[<
ffffffff8109b7dd>] ? module_sect_show+0x20/0x20
[<
ffffffff8109d65f>] SyS_init_module+0x54/0x81
[<
ffffffff814a9a12>] system_call_fastpath+0x16/0x1b
---[ end trace
0626f4a3ddea56f3 ]---
The bug can be reproduced by:
rmmod kvm_intel.ko
insmod kvm_intel.ko
without rmmod/insmod kvm.ko
This patch fixes the bug by unregistering kvm_device_ops of vfio when the
kvm-intel module is removed.
Reported-by: Liu Rongrong <rongrongx.liu@intel.com>
Fixes:
3c3c29fd0d7cddc32862c350d0700ce69953e3bd
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Tue, 30 Sep 2014 17:49:18 +0000 (20:49 +0300)]
KVM: x86: Wrong assertion on paging_tmpl.h
Even after the recent fix, the assertion on paging_tmpl.h is triggered.
Apparently, the assertion wants to check that the PAE is always set on
long-mode, but does it in incorrect way. Note that the assertion is not
enabled unless the code is debugged by defining MMU_DEBUG.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Quentin Casasnovas [Fri, 17 Oct 2014 20:55:59 +0000 (22:55 +0200)]
kvm: fix excessive pages un-pinning in kvm_iommu_map error path.
The third parameter of kvm_unpin_pages() when called from
kvm_iommu_map_pages() is wrong, it should be the number of pages to un-pin
and not the page size.
This error was facilitated with an inconsistent API: kvm_pin_pages() takes
a size, but kvn_unpin_pages() takes a number of pages, so fix the problem
by matching the two.
This was introduced by commit
350b8bd ("kvm: iommu: fix the third parameter
of kvm_iommu_put_pages (CVE-2014-3601)"), which fixes the lack of
un-pinning for pages intended to be un-pinned (i.e. memory leak) but
unfortunately potentially aggravated the number of pages we un-pin that
should have stayed pinned. As far as I understand though, the same
practical mitigations apply.
This issue was found during review of Red Hat 6.6 patches to prepare
Ksplice rebootless updates.
Thanks to Vegard for his time on a late Friday evening to help me in
understanding this code.
Fixes:
350b8bd ("kvm: iommu: fix the third parameter of... (CVE-2014-3601)")
Cc: stable@vger.kernel.org
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
Reviewed-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Mon, 13 Oct 2014 10:04:14 +0000 (13:04 +0300)]
KVM: x86: PREFETCH and HINT_NOP should have SrcMem flag
The decode phase of the x86 emulator assumes that every instruction with the
ModRM flag, and which can be used with RIP-relative addressing, has either
SrcMem or DstMem. This is not the case for several instructions - prefetch,
hint-nop and clflush.
Adding SrcMem|NoAccess for prefetch and hint-nop and SrcMem for clflush.
This fixes CVE-2014-8480.
Fixes:
41061cdb98a0bec464278b4db8e894a3121671f5
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Mon, 13 Oct 2014 10:04:13 +0000 (13:04 +0300)]
KVM: x86: Emulator does not decode clflush well
Currently, all group15 instructions are decoded as clflush (e.g., mfence,
xsave). In addition, the clflush instruction requires no prefix (66/f2/f3)
would exist. If prefix exists it may encode a different instruction (e.g.,
clflushopt).
Creating a group for clflush, and different group for each prefix.
This has been the case forever, but the next patch needs the cflush group
in order to fix a bug introduced in 3.17.
Fixes:
41061cdb98a0bec464278b4db8e894a3121671f5
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 23 Oct 2014 12:54:14 +0000 (14:54 +0200)]
KVM: emulate: avoid accessing NULL ctxt->memopp
A failure to decode the instruction can cause a NULL pointer access.
This is fixed simply by moving the "done" label as close as possible
to the return.
This fixes CVE-2014-8481.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Fixes:
41061cdb98a0bec464278b4db8e894a3121671f5
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ralf Baechle [Fri, 24 Oct 2014 11:27:37 +0000 (13:27 +0200)]
MIPS: SEAD3: Nuke PIC32 I2C driver.
A platform driver for which nothing ever registers the corresponding
platform device.
Also it was driving the same hardware as sead3-i2c-drv.c so redundant
anyway and couldn't co-exist with that driver because each of them was
using a private spinlock to protect access to the same hardware
resources.
This also fixes a randconfig problem:
arch/mips/mti-sead3/sead3-pic32-i2c-drv.c: In function 'i2c_platform_probe':
arch/mips/mti-sead3/sead3-pic32-i2c-drv.c:345:2: error: implicit declaration of
function 'i2c_add_numbered_adapter' [-Werror=implicit-function-declaration]
ret = i2c_add_numbered_adapter(&priv->adap);
^
arch/mips/mti-sead3/sead3-pic32-i2c-drv.c: In function
'i2c_platform_remove':
arch/mips/mti-sead3/sead3-pic32-i2c-drv.c:361:2: error: implicit declaration
of function 'i2c_del_adapter' [-Werror=implicit-function-declaration]
i2c_del_adapter(&priv->adap);
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Nadav Amit [Thu, 2 Oct 2014 22:10:04 +0000 (01:10 +0300)]
KVM: x86: Decoding guest instructions which cross page boundary may fail
Once an instruction crosses a page boundary, the size read from the second page
disregards the common case that part of the operand resides on the first page.
As a result, fetch of long insturctions may fail, and thereby cause the
decoding to fail as well.
Cc: stable@vger.kernel.org
Fixes:
5cfc7e0f5e5e1adf998df94f8e36edaf5d30d38e
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Michael S. Tsirkin [Thu, 18 Sep 2014 13:21:16 +0000 (16:21 +0300)]
kvm: x86: don't kill guest on unknown exit reason
KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was
triggered by a priveledged application. Let's not kill the guest: WARN
and inject #UD instead.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Petr Matousek [Tue, 23 Sep 2014 18:22:30 +0000 (20:22 +0200)]
kvm: vmx: handle invvpid vm exit gracefully
On systems with invvpid instruction support (corresponding bit in
IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid
causes vm exit, which is currently not handled and results in
propagation of unknown exit to userspace.
Fix this by installing an invvpid vm exit handler.
This is CVE-2014-3646.
Cc: stable@vger.kernel.org
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 18 Sep 2014 19:39:39 +0000 (22:39 +0300)]
KVM: x86: Handle errors when RIP is set during far jumps
Far jmp/call/ret may fault while loading a new RIP. Currently KVM does not
handle this case, and may result in failed vm-entry once the assignment is
done. The tricky part of doing so is that loading the new CS affects the
VMCS/VMCB state, so if we fail during loading the new RIP, we are left in
unconsistent state. Therefore, this patch saves on 64-bit the old CS
descriptor and restores it if loading RIP failed.
This fixes CVE-2014-3647.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 18 Sep 2014 19:39:38 +0000 (22:39 +0300)]
KVM: x86: Emulator fixes for eip canonical checks on near branches
Before changing rip (during jmp, call, ret, etc.) the target should be asserted
to be canonical one, as real CPUs do. During sysret, both target rsp and rip
should be canonical. If any of these values is noncanonical, a #GP exception
should occur. The exception to this rule are syscall and sysenter instructions
in which the assigned rip is checked during the assignment to the relevant
MSRs.
This patch fixes the emulator to behave as real CPUs do for near branches.
Far branches are handled by the next patch.
This fixes CVE-2014-3647.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 18 Sep 2014 19:39:37 +0000 (22:39 +0300)]
KVM: x86: Fix wrong masking on relative jump/call
Relative jumps and calls do the masking according to the operand size, and not
according to the address size as the KVM emulator does today.
This patch fixes KVM behavior.
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Andy Honig [Wed, 27 Aug 2014 21:42:54 +0000 (14:42 -0700)]
KVM: x86: Improve thread safety in pit
There's a race condition in the PIT emulation code in KVM. In
__kvm_migrate_pit_timer the pit_timer object is accessed without
synchronization. If the race condition occurs at the wrong time this
can crash the host kernel.
This fixes CVE-2014-3611.
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>