Miklos Szeredi [Mon, 21 May 2012 15:30:19 +0000 (17:30 +0200)]
vfs: retry last component if opening stale dentry
NFS optimizes away d_revalidates for last component of open. This means that
open itself can find the dentry stale.
This patch allows the filesystem to return EOPENSTALE and the VFS will retry the
lookup on just the last component if possible.
If the lookup was done using RCU mode, including the last component, then this
is not possible since the parent dentry is lost. In this case fall back to
non-RCU lookup. Currently this is not used since NFS will always leave RCU
mode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
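Illustration for "vfs: retry last component if opening stale dentry" (not part of the patch). The decision described above can be sketched in userspace C. All sk_* names are hypothetical, and the numeric value used for the stale-open code is an assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal error code; the numeric value is an assumption of this sketch. */
#define SK_EOPENSTALE 518

/* What the caller should do after the filesystem's open method returns. */
enum sk_retry {
    SK_NO_RETRY,          /* success or an unrelated error: pass it through */
    SK_RETRY_LAST,        /* redo the lookup of the last component only */
    SK_RETRY_FULL_LOOKUP  /* parent lost in RCU mode: redo the walk without RCU */
};

/*
 * Decide how to react to the result of open.
 * have_parent is false when the whole walk, including the last
 * component, ran in RCU mode and no parent dentry was kept.
 */
static enum sk_retry sk_choose_retry(int open_err, bool have_parent)
{
    assert(open_err <= 0);  /* kernel-style result: 0 or a negative errno */

    if (open_err != -SK_EOPENSTALE)
        return SK_NO_RETRY;
    return have_parent ? SK_RETRY_LAST : SK_RETRY_FULL_LOOKUP;
}
```

The sketch only models the choice between the two retry paths; the real patch also has to bound the number of retries and reuse the already allocated file.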
Miklos Szeredi [Mon, 21 May 2012 15:30:18 +0000 (17:30 +0200)]
vfs: nameidata_to_filp(): don't throw away file on error
If open fails, don't put the file. This allows it to be reused if open needs to
be retried.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:17 +0000 (17:30 +0200)]
vfs: nameidata_to_filp(): inline __dentry_open()
Copy __dentry_open() into nameidata_to_filp().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:16 +0000 (17:30 +0200)]
vfs: do_dentry_open(): don't put filp
Move put_filp() out to __dentry_open(), the only caller now.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:15 +0000 (17:30 +0200)]
vfs: split __dentry_open()
Split __dentry_open() into two functions:
do_dentry_open() - does most of the actual work, doesn't put file on failure
open_check_o_direct() - after a successful open, checks direct_IO method
This will allow i_op->atomic_open to do just the file initialization and leave
the direct_IO checking to the VFS.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:14 +0000 (17:30 +0200)]
vfs: do_last() common post lookup
Now the post lookup code can be shared between O_CREAT and plain opens since
they are essentially the same.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:13 +0000 (17:30 +0200)]
vfs: do_last(): add audit_inode before open
This allows this code to be shared between O_CREAT and plain opens.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:12 +0000 (17:30 +0200)]
vfs: do_last(): only return EISDIR for O_CREAT
This allows this code to be shared between O_CREAT and plain opens.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:11 +0000 (17:30 +0200)]
vfs: do_last(): check LOOKUP_DIRECTORY
Check for ENOTDIR before finishing open. This allows this code to be shared
between O_CREAT and plain opens.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:10 +0000 (17:30 +0200)]
vfs: do_last(): make ENOENT exit RCU safe
This will allow this code to be used in RCU mode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:09 +0000 (17:30 +0200)]
vfs: make follow_link check RCU safe
This will allow this code to be used in RCU mode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:08 +0000 (17:30 +0200)]
vfs: do_last(): use inode variable
Use helper variable instead of path->dentry->d_inode before complete_walk().
This will allow this code to be used in RCU mode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:07 +0000 (17:30 +0200)]
vfs: do_last(): inline walk_component()
Copy walk_component() into do_last().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:06 +0000 (17:30 +0200)]
vfs: do_last(): make exit RCU safe
Allow returning from do_last() with LOOKUP_RCU still set on the "out:" and
"exit:" labels.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Mon, 21 May 2012 15:30:05 +0000 (17:30 +0200)]
vfs: split do_lookup()
Split do_lookup() into two functions:
lookup_fast() - does cached lookup without i_mutex
lookup_slow() - does lookup with i_mutex
Both follow managed dentries.
The new functions are needed by atomic_open.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Josef Bacik [Mon, 26 Mar 2012 13:46:47 +0000 (09:46 -0400)]
Btrfs: move over to use ->update_time
Btrfs had been doing its own file_update_time so we could catch ENOSPC
properly, so just update our btrfs_update_time to work with the new stuff and
then we'll be fancy later. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Josef Bacik [Mon, 26 Mar 2012 13:59:21 +0000 (09:59 -0400)]
fs: introduce inode operation ->update_time
Btrfs has to make sure we have space to allocate new blocks in order to modify
the inode, so updating time can fail. We've gotten around this by having our
own file_update_time but this is kind of a pain, and Christoph has indicated he
would like to make xfs do something different with atime updates. So introduce
->update_time, where we will deal with i_version and a/m/c time updates and
indicate which changes need to be made. The normal version just does what it
has always done, updates the time and marks the inode dirty, and then
filesystems can choose to do something different.
I've gone through all of the users of file_update_time and made them check for
errors with the exception of the fault code since it's complicated and I wasn't
quite sure what to do there, also Jan is going to be pushing the file time
updates into page_mkwrite for those who have it so that should satisfy btrfs and
make it not a big deal to check the file_update_time() return code in the
generic fault path. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
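Illustration for "fs: introduce inode operation ->update_time" (not part of the patch). A minimal userspace sketch of the dispatch described above: an optional per-filesystem hook that may fail, with a default that updates the times and marks the inode dirty. The sk_* types, flag values and the ENOSPC-returning hook are assumptions made for the example, not the kernel's definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Which fields to update; the values are sketch-only. */
#define SK_ATIME   1
#define SK_MTIME   2
#define SK_CTIME   4
#define SK_VERSION 8

struct sk_tinode;

struct sk_tinode_ops {
    /* Optional hook; returns 0 or a negative errno. */
    int (*update_time)(struct sk_tinode *inode, const struct timespec *now, int flags);
};

struct sk_tinode {
    struct timespec atime, mtime, ctime;
    unsigned long long version;
    int dirty;
    const struct sk_tinode_ops *ops;
};

/* Use the filesystem hook if present, else do the default update. */
static int sk_update_time(struct sk_tinode *inode, const struct timespec *now, int flags)
{
    assert(inode != NULL && now != NULL);

    if (inode->ops && inode->ops->update_time)
        return inode->ops->update_time(inode, now, flags);

    if (flags & SK_ATIME)
        inode->atime = *now;
    if (flags & SK_VERSION)
        inode->version++;
    if (flags & SK_CTIME)
        inode->ctime = *now;
    if (flags & SK_MTIME)
        inode->mtime = *now;
    inode->dirty = 1;
    return 0;
}

/* Hypothetical filesystem hook that cannot reserve space for the update. */
static int sk_nospace_update_time(struct sk_tinode *inode, const struct timespec *now, int flags)
{
    (void)inode;
    (void)now;
    (void)flags;
    return -ENOSPC;
}

static const struct sk_tinode_ops sk_nospace_ops = {
    .update_time = sk_nospace_update_time,
};
```

Callers are expected to check the return value, which is the point of the conversion described in the commit message.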
Artem Bityutskiy [Fri, 1 Jun 2012 14:18:08 +0000 (17:18 +0300)]
reiserfs: get rid of reiserfs_sync_super
This patch stops reiserfs using the VFS 'write_super()' method along with the
s_dirt flag, because they are on their way out.
The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblocks using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds, even if there are no dirty superblocks, or there are no client
file-systems which would need this (e.g., btrfs does not use
'->write_super()'). So we want to kill it completely and thus, we need to make
file-systems stop using the '->write_super()' VFS service, and then remove
it together with the kernel thread.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Fri, 1 Jun 2012 14:18:07 +0000 (17:18 +0300)]
reiserfs: mark the superblock as dirty a bit later
The 'journal_mark_dirty()' function currently first marks the superblock as
dirty by setting 's_dirt' to 1, then does various sanity checks and returns,
then actually does all the magic with the journal.
This is not an ideal order, though. It makes more sense to first do all the
checks, then do all the internal stuff, and at the end notify the VFS that the
superblock is now dirty.
This patch moves the 's_dirt = 1' assignment from the very beginning of this
function to the very end.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Fri, 1 Jun 2012 14:18:06 +0000 (17:18 +0300)]
reiserfs: remove useless superblock dirtying
The 'reiserfs_resize()' function marks the superblock as dirty by assigning 1
to 's_dirt' and then calls 'journal_mark_dirty()' which does the same. Thus,
we can remove the assignment from 'reiserfs_resize()'.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Fri, 1 Jun 2012 14:18:05 +0000 (17:18 +0300)]
reiserfs: clean-up function return type
Turn 'reiserfs_flush_old_commits()' into a void function because the callers
do not care about what it returns anyway.
We are going to remove the 'sb->s_dirt' field completely and this patch is a
small step towards this direction.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Fri, 1 Jun 2012 14:18:04 +0000 (17:18 +0300)]
reiserfs: cleanup reiserfs_fill_super a bit
We have the reiserfs superblock pointer in the 'sbi' variable in this
function, no need to use the 'REISERFS_SB(s)' macro which is the same.
This is just a small clean-up.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 19 Apr 2012 22:17:15 +0000 (18:17 -0400)]
sch_atm.c: get rid of pointless extern
sockfd_lookup() is declared in linux/net.h, which is pulled by
linux/skbuff.h (and needed for a lot of other stuff in sch_atm.c
anyway).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 31 May 2012 00:19:20 +0000 (20:19 -0400)]
unexport do_munmap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 31 May 2012 00:17:35 +0000 (20:17 -0400)]
new helper: vm_mmap_pgoff()
take it to mm/util.c, convert vm_mmap() to use of that one and
take it to mm/util.c as well, convert both sys_mmap_pgoff() to
use of vm_mmap_pgoff()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 31 May 2012 00:11:57 +0000 (20:11 -0400)]
kill do_mmap() completely
just pull into vm_mmap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 31 May 2012 00:08:42 +0000 (20:08 -0400)]
switch aio and shm to do_mmap_pgoff(), make do_mmap() static
after all, 0 bytes and 0 pages is the same thing...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 23:58:30 +0000 (19:58 -0400)]
take calculation of final prot in security_mmap_file() into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 21:13:15 +0000 (17:13 -0400)]
move security_mmap_addr() to saner place
it really should be done by get_unmapped_area(); that cuts down on
the amount of callers considerably and it's the right place for
that stuff anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 21:11:23 +0000 (17:11 -0400)]
take security_mmap_file() outside of ->mmap_sem
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 17:30:51 +0000 (13:30 -0400)]
split ->file_mmap() into ->mmap_addr()/->mmap_file()
... i.e. file-dependent and address-dependent checks.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 17:11:37 +0000 (13:11 -0400)]
split cap_mmap_addr() out of cap_file_mmap()
... switch callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 16:09:53 +0000 (12:09 -0400)]
unexport do_mmap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 15:55:49 +0000 (11:55 -0400)]
ia64 perfmon: fix get_unmapped_area() use there
get_unmapped_area() returns -E... on failure, not 0. Moreover, the
wrapper around it is completely pointless.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
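Illustration for "ia64 perfmon: fix get_unmapped_area() use there" (not part of the patch). The bug class is checking for 0 when failure is reported as a negative errno packed into an unsigned long. A userspace sketch, assuming the usual convention that the top 4095 values of the address range encode errors; sk_find_area() is a hypothetical stand-in for the allocator:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest errno encoded this way; assumed to match the kernel convention. */
#define SK_MAX_ERRNO 4095UL

/* True when an unsigned long carries a negative errno rather than an address. */
static bool sk_is_err_value(unsigned long v)
{
    return v >= (unsigned long)-SK_MAX_ERRNO;
}

/* Hypothetical allocator: returns an address, or -ENOMEM cast to unsigned long. */
static unsigned long sk_find_area(bool fail)
{
    return fail ? (unsigned long)-ENOMEM : 0x10000UL;
}

/* Caller that tests the result the right way instead of comparing with 0. */
static int sk_map(bool fail, unsigned long *addr_out)
{
    unsigned long addr;

    assert(addr_out != NULL);

    addr = sk_find_area(fail);
    if (sk_is_err_value(addr))
        return -(int)(0UL - addr);  /* recover the errno via unsigned arithmetic */
    *addr_out = addr;
    return 0;
}
```

A caller that only tests `addr == 0` would treat the failed allocation as a valid, very high address.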
Al Viro [Wed, 30 May 2012 15:32:04 +0000 (11:32 -0400)]
merge do_mremap() into sys_mremap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 06:12:40 +0000 (02:12 -0400)]
ia64, sparc64: convert wrappers around do_mremap() to sys_mremap()
they contain open-coded sys_mremap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 05:56:23 +0000 (01:56 -0400)]
binfmt_flat: use vm_munmap, we are missing ->mmap_sem there
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 05:49:38 +0000 (01:49 -0400)]
binfmt_elf: switch elf_map() to vm_mmap/vm_munmap
No reason to hold ->mmap_sem over the sequence
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 02:03:48 +0000 (22:03 -0400)]
vfs: umount_tree() might be called on subtree that had never made it
__mnt_make_shortterm() in there undoes the effect of __mnt_make_longterm()
we'd done back when we set ->mnt_ns non-NULL; it should not be done to
vfsmounts that had never gone through commit_tree() and friends. Kudos to
lczerner for catching that one...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Will Deacon [Fri, 25 May 2012 10:39:13 +0000 (11:39 +0100)]
pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command
As described in commit 07d106d0a ("vfs: fix up ENOIOCTLCMD error
handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl
command which they don't understand. Doing so will result in -ENOTTY
being returned to userspace, which matches the behaviour of the compat
layer if it fails to translate an ioctl command.
This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of
-EINVAL when passed an unknown ioctl command.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
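Illustration for "pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command" (not part of the patch). A userspace sketch of the translation described above: the handler reports "not my command" with an internal code and the generic layer turns that into -ENOTTY. The sk_* names, the command number and the numeric value of the internal code are assumptions of the sketch:

```c
#include <assert.h>
#include <errno.h>

/* Internal "command not handled" code; value assumed for illustration. */
#define SK_ENOIOCTLCMD 515
/* Hypothetical command number the handler understands. */
#define SK_CMD_KNOWN 0x1u

/* Per-object handler: 0 on success, -SK_ENOIOCTLCMD for unknown commands. */
static long sk_driver_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case SK_CMD_KNOWN:
        return 0;
    default:
        return -SK_ENOIOCTLCMD;
    }
}

/* Generic layer: the internal code must never reach the caller. */
static long sk_vfs_ioctl(unsigned int cmd)
{
    long err = sk_driver_ioctl(cmd);

    assert(err <= 0);  /* handlers in this sketch return 0 or a negative errno */

    if (err == -SK_ENOIOCTLCMD)
        err = -ENOTTY;
    return err;
}
```

Had the handler returned -EINVAL, the generic layer could not tell an unknown command from a known command with bad arguments.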
J. Bruce Fields [Wed, 9 May 2012 21:18:06 +0000 (17:18 -0400)]
vfs: remove unused __d_splice_alias argument
Nobody sets want_disconn any more.
Reported-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
J. Bruce Fields [Wed, 9 May 2012 21:18:05 +0000 (17:18 -0400)]
vfs: stop d_splice_alias creating directory aliases
A directory should never have more than one dentry pointing to it.
But d_splice_alias() will add one if it finds a directory with an
already-existing non-DISCONNECTED dentry.
I can't find an obvious reproducer, but I also can't see what prevents
d_splice_alias() from encountering such a case.
It therefore seems safest to allow d_splice_alias to use any dentry it
finds.
(Prior to the removal of dentry_unhash() from vfs_rmdir(), around v3.0,
this could cause an nfsd deadlock like this:
- Somebody attempts to remove a non-empty directory.
- The dentry_unhash() in vfs_rmdir() unhashes the dentry
pointing to the non-empty directory.
- ->rmdir() then fails with -ENOTEMPTY
- Before the vfs_rmdir() caller reaches dput(), an nfsd process
in rename looks up the directory by filehandle; at the end of
that lookup, this dentry is found by d_alloc_anon(), and a
reference is taken on it, preventing dput() from removing it.
- A regular lookup of the directory calls d_splice_alias(),
finds only an unhashed (not a DISCONNECTED) dentry, and
instead adds a new one, so the directory now has two
dentries.
- The nfsd process in rename, which was previously looking up
the source directory of the rename, now looks up the target
directory (which is the same), and gets the dentry newly
created by the previous lookup.
- The rename, seeing two different dentries, assumes this is a
cross-directory rename and attempts to take the i_mutex on the
directory twice.
That reproducer no longer exists, but I don't think there was anything
fundamentally incorrect about the vfs_rmdir() behavior there, so I think
the real fault was here in d_splice_alias().)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 30 May 2012 01:24:36 +0000 (21:24 -0400)]
i810: switch to vm_mmap()
Weirdness around do_mmap() in there does not rely on ->mmap_sem for
exclusion, so no need to keep it under that. As the result, we can
turn that do_mmap() into vm_mmap().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dan Carpenter [Tue, 29 May 2012 18:02:24 +0000 (11:02 -0700)]
fsnotify: remove unused parameter from send_to_group()
We don't use "mnt" anymore in send_to_group() after
1968f5eed5 ("fanotify: use both marks when possible") was applied.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Naohiro Aota [Tue, 29 May 2012 18:02:24 +0000 (11:02 -0700)]
fsnotify: handle subfiles' perm events
While working on fanotify recently, I found the following strange
behavior.
I wrote a program to set fanotify_mark on "/tmp/block" and FAN_DENY
all events notified.
fanotify_mask = FAN_ALL_EVENTS | FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
$ cd /tmp/block; cat foo
cat: foo: Operation not permitted
Operation on the file is blocked as expected.
But,
fanotify_mask = FAN_ALL_PERM_EVENTS | FAN_EVENT_ON_CHILD:
$ cd /tmp/block; cat foo
aaa
It's not blocked anymore. This is confusing behavior. Also reading
commit "fsnotify: call fsnotify_parent in perm events", it seems like
fsnotify should handle subfiles' perm events as well as the other notify
events.
With this patch, regardless of FAN_ALL_EVENTS set or not:
$ cd /tmp/block; cat foo
cat: foo: Operation not permitted
Operation on the file is now blocked properly.
FS_OPEN_PERM and FS_ACCESS_PERM are not listed in FS_EVENTS_POSS_ON_CHILD.
Due to the fsnotify_inode_watches_children() check, if you specify only
these events as fsnotify_mask, you don't get subfiles' perm events
notified.
This patch adds the events to FS_EVENTS_POSS_ON_CHILD to get them notified
even if only these events are specified to fsnotify_mask.
Signed-off-by: Naohiro Aota <naota@elisp.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
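Illustration for "fsnotify: handle subfiles' perm events" (not part of the patch). The mask check that hides the permission events can be sketched in userspace C. The bit values and the reduced "possible on child" sets below are assumptions of the sketch; the real set contains more events:

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values are assumptions of this sketch, not copied from kernel headers. */
#define SK_ACCESS         0x00000001u
#define SK_OPEN           0x00000020u
#define SK_OPEN_PERM      0x00010000u
#define SK_ACCESS_PERM    0x00020000u
#define SK_EVENT_ON_CHILD 0x08000000u

/* Reduced "events possible on a child" sets, before and after the change. */
#define SK_POSS_ON_CHILD_OLD (SK_ACCESS | SK_OPEN)
#define SK_POSS_ON_CHILD_NEW (SK_POSS_ON_CHILD_OLD | SK_OPEN_PERM | SK_ACCESS_PERM)

/* Does a directory with this mark mask care about events on its children? */
static bool sk_watches_children(unsigned int mask, unsigned int poss_on_child)
{
    /* The set holds event bits only, never the ON_CHILD flag itself. */
    assert((poss_on_child & SK_EVENT_ON_CHILD) == 0);

    if (!(mask & SK_EVENT_ON_CHILD))
        return false;
    return (mask & poss_on_child) != 0;
}
```

With the old set, a mask made only of permission events plus the on-child flag fails the check, which matches the unblocked `cat foo` in the report; adding any ordinary event to the mask makes it pass.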
Dmitry Kasatkin [Tue, 29 May 2012 18:02:21 +0000 (11:02 -0700)]
vfs: increment iversion when a file is truncated
When a file is truncated with truncate()/ftruncate() and then closed,
iversion is not updated. This patch uses ATTR_SIZE flag as an indication
to increment iversion.
Mimi said:
On fput(), i_version is used to detect and flag files that have changed
and need to be re-measured in the IMA measurement policy. When a file
is truncated with truncate()/ftruncate() and then closed, i_version is
not updated. As a result, although the file has changed, it will not be
re-measured and added to the IMA measurement list on subsequent access.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
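Illustration for "vfs: increment iversion when a file is truncated" (not part of the patch). A userspace sketch of the rule described above: a size change requested through the attribute mask also bumps the version counter, so a consumer comparing versions notices the truncation. The sk_* structures and flag values are assumptions of the sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Attribute-mask bits; values assumed for illustration. */
#define SK_ATTR_MODE (1u << 0)
#define SK_ATTR_SIZE (1u << 3)

struct sk_vinode {
    unsigned long long version;  /* bumped whenever the content changes */
    long long size;
    unsigned int mode;
};

struct sk_iattr {
    unsigned int valid;  /* which fields below are meaningful */
    long long size;
    unsigned int mode;
};

/* Apply an attribute change; a size change counts as a content change. */
static void sk_setattr(struct sk_vinode *inode, const struct sk_iattr *attr)
{
    assert(inode != NULL && attr != NULL);

    if (attr->valid & SK_ATTR_MODE)
        inode->mode = attr->mode;
    if (attr->valid & SK_ATTR_SIZE) {
        inode->size = attr->size;
        inode->version++;
    }
}
```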
Shai Fultheim [Tue, 15 May 2012 09:29:52 +0000 (12:29 +0300)]
fs: Move bh_cachep to the __read_mostly section
bh_cachep is only written to once on initialization, so move it to the
__read_mostly section.
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Vlad Zolotarov <vlad@scalemp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cong Wang [Tue, 15 May 2012 06:57:33 +0000 (14:57 +0800)]
fs: move file_remove_suid() to fs/inode.c
file_remove_suid() is a generic function that operates on struct file;
it has almost nothing to do with file mapping, so move it to fs/inode.c.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Mon, 7 May 2012 16:56:53 +0000 (19:56 +0300)]
jffs2: get rid of jffs2_sync_super
Currently JFFS2 file-system maps the VFS "superblock" abstraction to the
write-buffer. Namely, it uses VFS services to synchronize the write-buffer
periodically.
The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblocks using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds no matter what. So we want to kill it completely and thus, we need to
make file-systems stop using the '->write_super' VFS service, and then
remove it together with the kernel thread.
This patch switches the JFFS2 write-buffer management from
'->write_super()'/'->s_dirt' to a delayed work. Instead of setting the 's_dirt'
flag we just schedule a delayed work for synchronizing the write-buffer.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Mon, 7 May 2012 16:56:52 +0000 (19:56 +0300)]
jffs2: remove unnecessary GC pass on sync
We do not need to call 'jffs2_write_super()' on sync. This function
causes a GC pass to make sure the current contents is pushed out with
the data which we already have on the media.
But this is not needed on sync and only slows sync down unnecessarily.
It is enough to just sync the write-buffer.
This call was added by one of the generic VFS rework patch-sets,
see d579ed00aa96a7f7486978540a0d7cecaff742ae.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Mon, 7 May 2012 16:56:51 +0000 (19:56 +0300)]
jffs2: remove unnecessary GC pass on umount
We do not need to call 'jffs2_write_super()' on unmount. This function
causes a GC pass to make sure the current contents is pushed out with
the data which we already have on the media.
But this is not needed on unmount and only slows unmount down unnecessarily.
It is enough to just sync the write-buffer.
This call was added by one of the generic VFS rework patch-sets,
see 8c85e125124a473d6f3e9bb187b0b84207f81d91.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Artem Bityutskiy [Mon, 7 May 2012 16:56:50 +0000 (19:56 +0300)]
jffs2: remove lock_super
We do not need 'lock_super()'/'unlock_super()' in JFFS2 - kill them.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 14:25:23 +0000 (10:25 -0400)]
bury __kernel_nlink_t, make internal nlink_t consistent
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 14:17:45 +0000 (10:17 -0400)]
parisc: get rid of nlink_t, switch to explicitly-sized type
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 14:16:30 +0000 (10:16 -0400)]
powerpc: get rid of nlink_t uses, switch to explicitly-sized type
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 14:13:52 +0000 (10:13 -0400)]
mips: get rid of nlink_t, use explicitly-sized type (__u32 in all cases)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 14:00:52 +0000 (10:00 -0400)]
mode_t whack-a-mole: ->is_visible() returns umode_t...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 19 May 2012 13:54:29 +0000 (09:54 -0400)]
get rid of idiotic misplaced __kernel_mode_t in ncpfs kernel-private data structure
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Andi Kleen [Tue, 8 May 2012 04:02:02 +0000 (13:32 +0930)]
brlocks/lglocks: API cleanups
lglocks and brlocks are currently generated with some complicated macros
in lglock.h. But there's no reason to not just use common utility
functions and put all the data into a common data structure.
In preparation, this patch changes the API to look more like normal
function calls with pointers, not magic macros.
The patch is rather large because I move over all users in one go to keep
it bisectable. This impacts the VFS somewhat in terms of lines changed.
But no actual behaviour change.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Andi Kleen [Tue, 8 May 2012 04:02:24 +0000 (13:32 +0930)]
brlocks/lglocks: turn into functions
lglocks and brlocks are currently generated with some complicated macros
in lglock.h. But there's no reason to not just use common utility
functions and put all the data into a common data structure.
Since there are at least two users it makes sense to share this code in a
library. This is also easier to maintain than a macro forest.
This will also make it possible later to dynamically allocate lglocks and
to use them in modules (both would still need some additional, but
now straightforward, code).
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Rusty Russell [Tue, 8 May 2012 03:59:45 +0000 (13:29 +0930)]
lglock: remove online variants of lock
Optimizing the slow paths adds a lot of complexity. If you need to
grab every lock often, you have other problems.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 3 May 2012 14:14:29 +0000 (10:14 -0400)]
ocfs: simplify symlink handling
seeing that "fast" symlinks still get allocation + copy, we might as
well simply switch them to pagecache-based variant of ->follow_link();
just need an appropriate ->readpage() for them...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 3 May 2012 13:34:20 +0000 (09:34 -0400)]
get rid of pointless allocations and copying in ecryptfs_follow_link()
switch to generic_readlink(), while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 20:41:13 +0000 (16:41 -0400)]
hpfs: assorted endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 20:26:46 +0000 (16:26 -0400)]
hpfs: annotate ea
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 20:20:49 +0000 (16:20 -0400)]
hpfs: annotate struct hpfs_dirent
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 20:11:25 +0000 (16:11 -0400)]
hpfs: annotate struct anode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 20:09:25 +0000 (16:09 -0400)]
hpfs: annotate struct fnode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 19:59:35 +0000 (15:59 -0400)]
hpfs: annotate btree nodes, get rid of bitfields mess
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 19:32:22 +0000 (15:32 -0400)]
hpfs: annotate struct dnode
little-endians...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 17 Apr 2012 19:28:51 +0000 (15:28 -0400)]
hpfs: bitmaps are little-endian
annotate properly...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 6 Apr 2012 18:30:07 +0000 (14:30 -0400)]
hpfs: get rid of bitfields in struct fnode
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 6 Apr 2012 17:21:09 +0000 (13:21 -0400)]
hpfs: get rid of bitfields endianness wanking in extended_attribute
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Randy Dunlap [Wed, 18 Apr 2012 00:03:25 +0000 (17:03 -0700)]
fs: fix inode.c kernel-doc warnings
Fix kernel-doc warnings in fs/inode.c:
Warning(fs/inode.c:1493): No description found for parameter 'path'
Warning(fs/inode.c:1493): Excess function parameter 'mnt' description in 'touch_atime'
Warning(fs/inode.c:1493): Excess function parameter 'dentry' description in 'touch_atime'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 13 Apr 2012 05:24:37 +0000 (01:24 -0400)]
hpfs: endianness bugs
a couple of le32 and le16 used with wrong le..._to_cpu(), plus
idiotic use of le32_to_cpu() on 1-bit bitfield
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 13 Apr 2012 15:03:55 +0000 (11:03 -0400)]
btrfs: trivial endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 12 Apr 2012 23:58:53 +0000 (19:58 -0400)]
ocfs2: kill endianness abuses in blockcheck.c
ocfs2_block_check is for little-endian contents; if we just want
its fields converted to host-endian in a couple of functions, just
put those values into local u32 and u16...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 12 Apr 2012 23:52:19 +0000 (19:52 -0400)]
ocfs2: deal with __user misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 12 Apr 2012 22:47:13 +0000 (18:47 -0400)]
ocfs2: trivial endianness misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 6 Apr 2012 05:40:50 +0000 (01:40 -0400)]
affs: bury unused macros
... unused since 2.4.4.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 3 Apr 2012 00:02:53 +0000 (20:02 -0400)]
kill v9fs_dentry_from_dir_inode()
In *all* callers we have a dentry of child of that directory.
Just use ->d_parent of that one, for fsck sake...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Apr 2012 23:40:47 +0000 (19:40 -0400)]
selinuxfs snprintf() misuses
a) %d does _not_ produce a page worth of output
b) snprintf() doesn't return negatives - it used to in old glibc, but
that's the kernel...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
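Illustration for "selinuxfs snprintf() misuses" (not part of the patch). Both points can be sketched in userspace C: a small buffer is enough for "%d", and truncation is detected by comparing the return value with the buffer size. sk_format_int() and SK_INT_BUF are hypothetical names. The negative-return check is kept only because this sketch uses the C library's snprintf(), which may report an encoding error that way:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Enough for any 32-bit int in decimal: sign, 10 digits, terminating NUL. */
#define SK_INT_BUF 12

/* Format value into buf; false means the output did not fit (or libc failed). */
static bool sk_format_int(char *buf, size_t size, int value, int *len_out)
{
    int len;

    assert(buf != NULL && len_out != NULL && size > 0);

    len = snprintf(buf, size, "%d", value);
    /* The return value is the length that was needed, so truncation
     * shows up as len >= size rather than as a negative result. */
    if (len < 0 || (size_t)len >= size)
        return false;
    *len_out = len;
    return true;
}
```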
Sage Weil [Thu, 5 Apr 2012 19:07:36 +0000 (12:07 -0700)]
ceph: move encode_fh to new API
Use parent_inode as a flag for whether nfsd wants a connectable fh, but
generate one opportunistically so that we can take advantage of the
additional info in there.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Apr 2012 18:34:06 +0000 (14:34 -0400)]
->encode_fh() API change
pass inode + parent's inode or NULL instead of dentry + bool saying
whether we want the parent or not.
NOTE: that needs ceph fix folded in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Apr 2012 18:25:07 +0000 (14:25 -0400)]
ubifs: use generic_fillattr()
don't open-code it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 2 Apr 2012 10:24:04 +0000 (06:24 -0400)]
xfs: switch to proper __bitwise type for KM_... flags
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:47:57 +0000 (18:47 -0400)]
switch utimes() to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:47:27 +0000 (18:47 -0400)]
switch statfs to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:46:53 +0000 (18:46 -0400)]
switch flock to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:44:12 +0000 (18:44 -0400)]
switch signalfd4() to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:42:19 +0000 (18:42 -0400)]
switch fcntl to fget_raw_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:41:25 +0000 (18:41 -0400)]
switch xattr syscalls to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 21 Apr 2012 22:40:32 +0000 (18:40 -0400)]
switch readdir/getdents to fget_light/fput_light
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 20 Apr 2012 03:52:50 +0000 (23:52 -0400)]
switch do_fsync() to fget_light()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Tue, 29 May 2012 19:42:10 +0000 (12:42 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS updates from Steve French.
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6: (29 commits)
cifs: fix oops while traversing open file list (try #4)
cifs: Fix comment as d_alloc_root() is replaced by d_make_root()
CIFS: Introduce SMB2 mounts as vers=2.1
CIFS: Introduce SMB2 Kconfig option
CIFS: Move add/set_credits and get_credits_field to ops structure
CIFS: Move protocol specific demultiplex thread calls to ops struct
CIFS: Move protocol specific part from cifs_readv_receive to ops struct
CIFS: Move header_size/max_header_size to ops structure
CIFS: Move protocol specific part from SendReceive2 to ops struct
cifs: Include backup intent search flags during searches (try #2)
CIFS: Separate protocol specific part from setlk
CIFS: Separate protocol specific part from getlk
CIFS: Separate protocol specific lock type handling
CIFS: Convert lock type to 32 bit variable
CIFS: Move locks to cifsFileInfo structure
cifs: convert send_nt_cancel into a version specific op
cifs: add a smb_version_operations/values structures and a smb_version enum
cifs: remove the vers= and version= synonyms for ver=
cifs: add warning about change in default cache semantics in 3.7
cifs: display cache= option in /proc/mounts
...
Linus Torvalds [Tue, 29 May 2012 18:53:11 +0000 (11:53 -0700)]
Merge tag 'mfd-3.5-1' of git://git./linux/kernel/git/sameo/mfd-2.6
Pull MFD changes from Samuel Ortiz:
"Besides the usual cleanups, this one brings:
* Support for 5 new chipsets: Intel's ICH LPC and SCH Centerton,
ST-E's STA2X11, Samsung's MAX77693 and TI's LM3533.
* Device tree support for the twl6040, tps65910, da9052 and ab8500
drivers.
* Fairly big tps65910, ab8500 and db8500 updates.
* i2c support for mc13xxx.
* Our regular update for the wm8xxx driver from Mark."
Fix up various conflicts with other trees, largely due to ab5500 removal
etc.
* tag 'mfd-3.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (106 commits)
mfd: Fix build break of max77693 by adding REGMAP_I2C option
mfd: Fix twl6040 build failure
mfd: Fix max77693 build failure
mfd: ab8500-core should depend on MFD_DB8500_PRCMU
gpio: tps65910: dt: process gpio specific device node info
mfd: Remove the parsing of dt info for tps65910 gpio
mfd: Save device node parsed platform data for tps65910 sub devices
mfd: Add r_select to lm3533 platform data
gpio: Add Intel Centerton support to gpio-sch
mfd: Emulate active low IRQs as well as active high IRQs for wm831x
mfd: Mark two lm3533 zone registers as volatile
mfd: Fix return type of lm3533 attribute is_visible
mfd: Enable Device Tree support in the ab8500-pwm driver
mfd: Enable Device Tree support in the ab8500-sysctrl driver
mfd: Add support for Device Tree to twl6040
mfd: Register the twl6040 child for the ASoC codec unconditionally
mfd: Allocate twl6040 IRQ numbers dynamically
mfd: twl6040 code cleanup in interrupt initialization part
mfd: Enable ab8500-gpadc driver for Device Tree
mfd: Prevent unassigned pointer from being used in ab8500-gpadc driver
...
Linus Torvalds [Tue, 29 May 2012 17:43:51 +0000 (10:43 -0700)]
Merge tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"New features include:
- Rewrite the O_DIRECT code so that it can share the same coalescing
and pNFS functionality as the page cache code.
- Allow the server to provide hints as to when we should use pNFS,
and when it is more efficient to read and write through the
metadata server.
- NFS cache consistency updates:
* Use the ctime to emulate a change attribute for NFSv2/v3 so that
all NFS versions can share the same cache management code.
* New cache management code will only look at the change attribute
and size attribute when deciding whether our cached data is still
valid.
* Don't request NFSv4 post-op attributes on writes in cases such as
O_DIRECT, where we don't care about data cache consistency, or
when we have a write delegation, and know that our cache is still
consistent.
* Don't request NFSv4 post-op attributes on operations such as
COMMIT, where there are no expected metadata updates.
* Don't request NFSv4 directory post-op attributes in cases where
the operations themselves already return change attribute
updates: i.e. operations such as OPEN, CREATE, REMOVE, LINK and
RENAME.
- Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
if we detect no attempts to lookup filenames.
- Improve the code sharing between NFSv2/v3 and v4 mounts
- NFSv4.1 state management efficiency improvements
- More patches in preparation for NFSv4/v4.1 migration functionality."
Fix trivial conflict in fs/nfs/nfs4proc.c that was due to the dcache
qstr name initialization changes (that made the length/hash a 64-bit
union)
* tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (146 commits)
NFSv4: Add debugging printks to state manager
NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO
NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE
NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error
NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION
NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager
NFSv4.1: Handle errors in nfs4_bind_conn_to_session
NFSv4.1: nfs4_bind_conn_to_session should drain the session
NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid
NFSv4.1: Add DESTROY_CLIENTID
NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session
NFSv4.1: Ensure we use the correct credentials for session create/destroy
NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations
NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease
NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM
NFSv4: Clean up the error handling for nfs4_reclaim_lease
NFSv4.1: Exchange ID must use GFP_NOFS allocation mode
nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN*
nfs4.1: add BIND_CONN_TO_SESSION operation
NFSv4.1 test the mdsthreshold hint parameters
...
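The NFS pull message above mentions using the ctime to emulate a change attribute for NFSv2/v3, and validating cached data from the change and size attributes alone. One way this could look is sketched below. The packing of seconds and nanoseconds into a single 64-bit value is an assumption made for illustration, as are the mock_* names; the message does not say how the emulation is done.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Assumed packing: seconds in the high bits, nanoseconds in the low
 * 30 bits. Nanoseconds are below 10^9, which is below 2^30, so the two
 * fields never overlap and a later ctime gives a larger value.
 */
static uint64_t mock_ctime_to_change_attr(const struct timespec *ctime)
{
    return ((uint64_t)ctime->tv_sec << 30) + (uint64_t)ctime->tv_nsec;
}

/*
 * Cached data counts as valid only if neither the change attribute nor
 * the size differs from what was recorded when the cache was filled.
 */
static int mock_cache_is_valid(uint64_t cached_change, uint64_t cached_size,
                               uint64_t new_change, uint64_t new_size)
{
    return cached_change == new_change && cached_size == new_size;
}
```

Reducing every protocol version to one 64-bit comparison is what lets the same cache management code serve v2, v3 and v4 mounts.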
Alan Cox [Tue, 29 May 2012 12:45:16 +0000 (13:45 +0100)]
tty: fix ldisc lock inversion trace
This is caused by tty_release using tty_lock_pair to lock both sides of
the pty/tty pair, and then tty_ldisc_release dropping and relocking one
side only. We can drop both fine, so drop both to avoid any lock
ordering concerns.
Rework the release path to fix the new locking model.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
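The fix described in the ldisc commit above (drop both sides of the pair rather than dropping and relocking one) can be sketched in userspace with pthread mutexes. The mock_* names are hypothetical, and ordering the pair by address is an assumption about how a pair lock might avoid inversion, not something the commit message states.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for one side of a pty/tty pair. */
struct mock_tty {
    pthread_mutex_t lock;
    int locked;             /* bookkeeping for this sketch only */
};

/* Take both locks in a fixed (address) order so two callers cannot invert. */
static void mock_tty_lock_pair(struct mock_tty *a, struct mock_tty *b)
{
    struct mock_tty *first = (uintptr_t)a < (uintptr_t)b ? a : b;
    struct mock_tty *second = (first == a) ? b : a;

    assert(a != b);
    pthread_mutex_lock(&first->lock);
    first->locked = 1;
    pthread_mutex_lock(&second->lock);
    second->locked = 1;
}

/* Unlock order does not matter for deadlock avoidance. */
static void mock_tty_unlock_pair(struct mock_tty *a, struct mock_tty *b)
{
    a->locked = 0;
    pthread_mutex_unlock(&a->lock);
    b->locked = 0;
    pthread_mutex_unlock(&b->lock);
}

/*
 * Release step that must run without the locks held: drop BOTH sides,
 * do the work, then retake both through the ordered pair lock. Dropping
 * and relocking only one side would retake it while the other is still
 * held, possibly in the wrong order.
 */
static void mock_ldisc_release(struct mock_tty *tty, struct mock_tty *o_tty,
                               void (*work)(void))
{
    mock_tty_unlock_pair(tty, o_tty);
    if (work)
        work();
    mock_tty_lock_pair(tty, o_tty);
}
```

Because every acquisition goes through mock_tty_lock_pair(), there is a single place where ordering is decided, which is the property the release path relies on.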
Alan Cox [Tue, 29 May 2012 12:45:01 +0000 (13:45 +0100)]
pty: Fix lock inversion
The ptmx_open path takes the tty and devpts locks in the wrong order
because tty_init_dev locks and returns a locked tty. As far as I can
tell this is actually safe anyway because the tty being returned is new
so nobody can get a reference to lock it at this point.
However, we don't even need the devpts lock at this point; it's only held
as a byproduct of the way the locks were pushed down.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond Myklebust [Mon, 28 May 2012 19:12:27 +0000 (15:12 -0400)]
NFSv4: Add debugging printks to state manager
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>