Linus Torvalds [Fri, 5 Mar 2010 21:20:53 +0000 (13:20 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...
Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
Linus Torvalds [Fri, 5 Mar 2010 21:12:34 +0000 (13:12 -0800)]
Merge branch 'kvm-updates/2.6.34' of git://git./virt/kvm/kvm
* 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (145 commits)
KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP
KVM: VMX: Update instruction length on intercepted BP
KVM: Fix emulate_sys[call, enter, exit]()'s fault handling
KVM: Fix segment descriptor loading
KVM: Fix load_guest_segment_descriptor() to inject page fault
KVM: x86 emulator: Forbid modifying CS segment register by mov instruction
KVM: Convert kvm->requests_lock to raw_spinlock_t
KVM: Convert i8254/i8259 locks to raw_spinlocks
KVM: x86 emulator: disallow opcode 82 in 64-bit mode
KVM: x86 emulator: code style cleanup
KVM: Plan obsolescence of kernel allocated slots, paravirt mmu
KVM: x86 emulator: Add LOCK prefix validity checking
KVM: x86 emulator: Check CPL level during privilege instruction emulation
KVM: x86 emulator: Fix popf emulation
KVM: x86 emulator: Check IOPL level during io instruction emulation
KVM: x86 emulator: fix memory access during x86 emulation
KVM: x86 emulator: Add Virtual-8086 mode of emulation
KVM: x86 emulator: Add group9 instruction decoding
KVM: x86 emulator: Add group8 instruction decoding
KVM: do not store wqh in irqfd
...
Trivial conflicts in Documentation/feature-removal-schedule.txt
Linus Torvalds [Fri, 5 Mar 2010 19:53:53 +0000 (11:53 -0800)]
Merge branch 'write_inode2' of git://git./linux/kernel/git/viro/vfs-2.6
* 'write_inode2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
pass writeback_control to ->write_inode
make sure data is on disk before calling ->write_inode
Linus Torvalds [Fri, 5 Mar 2010 19:46:31 +0000 (11:46 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Switch !O_CREAT case to use of do_last()
Get rid of symlink body copying
Finish pulling of -ESTALE handling to upper level in do_filp_open()
Turn do_link spaghetty into a normal loop
Unify exits in O_CREAT handling
Kill is_link argument of do_last()
Pull handling of LAST_BIND into do_last(), clean up ok: part in do_filp_open()
Leave mangled flag only for setting nd.intent.open.flag
Get rid of passing mangled flag to do_last()
Don't pass mangled open_flag to finish_open()
pull more into do_last()
bail out with ELOOP earlier in do_link loop
pull the common predecessors into do_last()
postpone __putname() until after do_last()
unroll do_last: loop in do_filp_open()
Shift releasing nd->root from do_last() to its caller
gut do_filp_open() a bit more (do_last separation)
beginning to untangle do_filp_open()
Randy Dunlap [Fri, 5 Mar 2010 17:52:52 +0000 (09:52 -0800)]
x86: fix mtrr missing kernel-doc
Fix missing kernel-doc notation in mtrr/main.c:
Warning(arch/x86/kernel/cpu/mtrr/main.c:152): No description found for parameter 'info'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Mar 2010 18:50:22 +0000 (10:50 -0800)]
Merge branch 'perf-probes-for-linus-2' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Issue at least one memory barrier in stop_machine_text_poke()
perf probe: Correct probe syntax on command line help
perf probe: Add lazy line matching support
perf probe: Show more lines after last line
perf probe: Check function address range strictly in line finder
perf probe: Use libdw callback routines
perf probe: Use elfutils-libdw for analyzing debuginfo
perf probe: Rename probe finder functions
perf probe: Fix bugs in line range finder
perf probe: Update perf probe document
perf probe: Do not show --line option without dwarf support
kprobes: Add documents of jump optimization
kprobes/x86: Support kprobes jump optimization on x86
x86: Add text_poke_smp for SMP cross modifying code
kprobes/x86: Cleanup save/restore registers
kprobes/x86: Boost probes when reentering
kprobes: Jump optimization sysctl interface
kprobes: Introduce kprobes jump optimization
kprobes: Introduce generic insn_slot framework
kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
Linus Torvalds [Fri, 5 Mar 2010 18:47:57 +0000 (10:47 -0800)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
padata: Allocate the cpumask for the padata instance
crypto: authenc - Move saved IV in front of the ablkcipher request
crypto: hash - Fix handling of unaligned buffers
crypto: authenc - Use correct ahash complete functions
crypto: md5 - Set statesize
Linus Torvalds [Fri, 5 Mar 2010 18:47:00 +0000 (10:47 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (36 commits)
ext4: fix up rb_root initializations to use RB_ROOT
ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctl
ext4: Fix the NULL reference in double_down_write_data_sem()
ext4: Fix insertion point of extent in mext_insert_across_blocks()
ext4: consolidate in_range() definitions
ext4: cleanup to use ext4_grp_offs_to_block()
ext4: cleanup to use ext4_group_first_block_no()
ext4: Release page references acquired in ext4_da_block_invalidatepages
ext4: Fix ext4_quota_write cross block boundary behaviour
ext4: Convert BUG_ON checks to use ext4_error() instead
ext4: Use direct_IO_no_locking in ext4 dio read
ext4: use ext4_get_block_write in buffer write
ext4: mechanical rename some of the direct I/O get_block's identifiers
ext4: make "offset" consistent in ext4_check_dir_entry()
ext4: Handle non empty on-disk orphan link
ext4: explicitly remove inode from orphan list after failed direct io
ext4: fix error handling in migrate
ext4: deprecate obsoleted mount options
ext4: Fix fencepost error in chosing choosing group vs file preallocation.
jbd2: clean up an assertion in jbd2_journal_commit_transaction()
...
Linus Torvalds [Fri, 5 Mar 2010 18:46:04 +0000 (10:46 -0800)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: get rid of obsolete definition in header file
Squashfs: get rid of obsolete variable in struct squashfs_sb_info
Squashfs: add decompressor entries for lzma and lzo
Squashfs: add a decompressor framework
Squashfs: factor out remaining zlib dependencies into separate wrapper file
Squashfs: move zlib decompression wrapper code into a separate file
Christoph Hellwig [Fri, 5 Mar 2010 08:21:37 +0000 (09:21 +0100)]
pass writeback_control to ->write_inode
This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by beeing able to
distinguish between the different callers in more detail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christoph Hellwig [Fri, 5 Mar 2010 08:21:21 +0000 (09:21 +0100)]
make sure data is on disk before calling ->write_inode
Similar to the fsync issue fixed a while ago in commit
2daea67e966dc0c42067ebea015ddac6834cef88 we need to write for data to
actually hit the disk before writing out the metadata to guarantee
data integrity for filesystems that modify the inode in the data I/O
completion path. Currently XFS and NFS handle this manually, and AFS
has a write_inode method that does nothing but waiting for data, while
others are possibly missing out on this.
Fortunately this change has a lot less impact than the fsync change
as none of the write_inode methods starts data writeout of any form
by itself.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Phillip Lougher [Thu, 25 Feb 2010 01:31:13 +0000 (01:31 +0000)]
Squashfs: get rid of obsolete definition in header file
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Phillip Lougher [Thu, 25 Feb 2010 00:54:48 +0000 (00:54 +0000)]
Squashfs: get rid of obsolete variable in struct squashfs_sb_info
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Al Viro [Sat, 26 Dec 2009 15:56:19 +0000 (10:56 -0500)]
Switch !O_CREAT case to use of do_last()
... and now we have all intents crap well localized
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 13:37:05 +0000 (08:37 -0500)]
Get rid of symlink body copying
Now that nd->last stays around until ->put_link() is called, we can
just postpone that ->put_link() in do_filp_open() a bit and don't
bother with copying.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 12:21:48 +0000 (07:21 -0500)]
Finish pulling of -ESTALE handling to upper level in do_filp_open()
Don't bother with path_walk() (and its retry loop); link_path_walk()
will do it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 12:16:40 +0000 (07:16 -0500)]
Turn do_link spaghetty into a normal loop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 12:09:49 +0000 (07:09 -0500)]
Unify exits in O_CREAT handling
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 12:04:50 +0000 (07:04 -0500)]
Kill is_link argument of do_last()
We set it to 1 iff we return NULL
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 26 Dec 2009 12:01:01 +0000 (07:01 -0500)]
Pull handling of LAST_BIND into do_last(), clean up ok: part in do_filp_open()
Note that in case of !O_CREAT we know that nd.root has already been given up
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 12:15:41 +0000 (07:15 -0500)]
Leave mangled flag only for setting nd.intent.open.flag
Nothing else uses it anymore
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 11:51:13 +0000 (06:51 -0500)]
Get rid of passing mangled flag to do_last()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 11:49:47 +0000 (06:49 -0500)]
Don't pass mangled open_flag to finish_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 08:39:50 +0000 (03:39 -0500)]
pull more into do_last()
Handling of LAST_DOT/LAST_ROOT/LAST_DOTDOT/terminating slash
can be pulled in as well
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 07:27:30 +0000 (02:27 -0500)]
bail out with ELOOP earlier in do_link loop
If we'd passed through 32 trailing symlinks already, there's
no sense following the 33rd - we'll bail out anyway. Better
bugger off earlier.
It *does* change behaviour, after a fashion - if the 33rd happens
to be a procfs-style symlink, original code *would* allow it.
This one will not. Cry me a river if that hurts you. Please, do.
And post a video of that, while you are at it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 07:12:06 +0000 (02:12 -0500)]
pull the common predecessors into do_last()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 07:08:19 +0000 (02:08 -0500)]
postpone __putname() until after do_last()
Since do_last() doesn't mangle nd->last_name, we can safely postpone
__putname() done in handling of trailing symlinks until after the
call of do_last()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 07:05:43 +0000 (02:05 -0500)]
unroll do_last: loop in do_filp_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 07:02:38 +0000 (02:02 -0500)]
Shift releasing nd->root from do_last() to its caller
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 06:58:28 +0000 (01:58 -0500)]
gut do_filp_open() a bit more (do_last separation)
Brute-force separation of stuff reachable from do_last: with
the exception of do_link:; just take all that crap to a helper
function as-is and have it tell the caller if it has to go
to do_link.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 24 Dec 2009 06:26:48 +0000 (01:26 -0500)]
beginning to untangle do_filp_open()
That's going to be a long and painful series. The first step:
take the stuff reachable from 'ok' label in do_filp_open() into
a new helper (finish_open()).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Venkatesh Pallipadi [Fri, 5 Mar 2010 03:25:21 +0000 (22:25 -0500)]
ext4: fix up rb_root initializations to use RB_ROOT
ext4 uses rb_node = NULL; to zero rb_root at few places. Using
RB_ROOT as the initializer is more portable in case the underlying
implementation of rbtrees changes in the future.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Eric Paris <eparis@redhat.com>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:08 +0000 (09:05 -0500)]
quota: stop using QUOTA_OK / NO_QUOTA
Just use 0 / -EDQUOT directly - that's what it translates to anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:07 +0000 (09:05 -0500)]
dquot: cleanup dquot initialize routine
Get rid of the initialize dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_initialize helper to __dquot_initialize
and vfs_dq_init to dquot_initialize to have a consistent namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:06 +0000 (09:05 -0500)]
dquot: move dquot initialization responsibility into the filesystem
Currently various places in the VFS call vfs_dq_init directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the initialization. For most metadata operations
this is a straight forward move into the methods, but for truncate and
open it's a bit more complicated.
For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.
For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.
Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:05 +0000 (09:05 -0500)]
dquot: cleanup dquot drop routine
Get rid of the drop dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_drop helper to __dquot_drop
and vfs_dq_drop to dquot_drop to have a consistent namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:04 +0000 (09:05 -0500)]
dquot: move dquot drop responsibility into the filesystem
Currently clear_inode calls vfs_dq_drop directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the drop inside the ->clear_inode
superblock operation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:03 +0000 (09:05 -0500)]
dquot: cleanup dquot transfer routine
Get rid of the transfer dquot operation - it is now always called from
the filesystem and if a filesystem really needs it's own (which none
currently does) it can just call into it's own routine directly.
Rename the now static low-level dquot_transfer helper to __dquot_transfer
and vfs_dq_transfer to dquot_transfer to have a consistent namespace,
and make the new dquot_transfer return a normal negative errno value
which all callers expect.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:02 +0000 (09:05 -0500)]
dquot: move dquot transfer responsibility into the filesystem
Currently notify_change calls vfs_dq_transfer directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the transfer. Most filesystems already
do this, only ufs and udf need the code added, and for jfs it needs to
be enabled unconditionally instead of only when ACLs are enabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:01 +0000 (09:05 -0500)]
dquot: cleanup inode allocation / freeing routines
Get rid of the alloc_inode and free_inode dquot operations - they are
always called from the filesystem and if a filesystem really needs
their own (which none currently does) it can just call into it's
own routine directly.
Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
call the lowlevel dquot_alloc_inode / dqout_free_inode routines
directly, which now lose the number argument which is always 1.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Wed, 3 Mar 2010 14:05:00 +0000 (09:05 -0500)]
dquot: cleanup space allocation / freeing routines
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.
Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not. Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 2 Mar 2010 12:51:02 +0000 (15:51 +0300)]
ext3: add writepage sanity checks
- There is theoretical possibility to perform writepage on
RO superblock. Add explicit check for what case.
- Page must being locked before writepage.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Mon, 1 Mar 2010 13:02:37 +0000 (14:02 +0100)]
ext3: Truncate allocated blocks if direct IO write fails to update i_size
We have to truncate blocks allocated to file during direct IO when we
fail to update i_size properly.
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Mon, 22 Feb 2010 20:07:17 +0000 (21:07 +0100)]
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
Sometimes invalidate_bdev() can fail to invalidate a part of block
device cache because of dirty data. If the filesystem has blocksize
smaller than page size, this can happen even for pages containing
quota files and thus kernel would operate on stale data. Fix the
issue by syncing the filesystem before invalidating the cache.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 16 Feb 2010 05:31:50 +0000 (08:31 +0300)]
quota: generalize quota transfer interface
Current quota transfer interface support only uid/gid.
This patch extend interface in order to support various quotas types
The goal is accomplished without changes in most frequently used
vfs_dq_transfer() func.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 16 Feb 2010 05:31:49 +0000 (08:31 +0300)]
quota: sb_quota state flags cleanup
- remove hardcoded USRQUOTA/GRPQUOTA flags
- convert int to bool for appropriate functions
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 16 Feb 2010 19:37:12 +0000 (20:37 +0100)]
jbd: Delay discarding buffers in journal_unmap_buffer
Delay discarding buffers in journal_unmap_buffer until
we know that "add to orphan" operation has definitely been
committed, otherwise the log space of committing transation
may be freed and reused before truncate get committed, updates
may get lost if crash happens.
This patch is a backport of JBD2 fix by dingdinghua <dingdinghua@nrchpc.ac.cn>.
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 16 Feb 2010 16:33:42 +0000 (19:33 +0300)]
ext3: quota_write cross block boundary behaviour
We always assume what dquot update result in changes in one data block
But ext3_quota_write() function may handle cross block boundary writes
In fact if this ever happen it will result in incorrect journal credits
reservation. And later bug_on triggering. As soon this never happen the
boundary cross loop is NOOP. In order to make things straight
let's remove this loop and assert cross boundary condition.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:56 +0000 (03:44 -0500)]
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
We already do these checks in the generic code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:55 +0000 (03:44 -0500)]
quota: split out compat_sys_quotactl support from quota.c
Instead of adding ifdefs just split it into a new file.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:54 +0000 (03:44 -0500)]
quota: split out netlink notification support from quota.c
Instead of adding ifdefs just split it into a new file.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:53 +0000 (03:44 -0500)]
quota: remove invalid optimization from quota_sync_all
Checking the "VFS" quota enabled and dirty bits from generic code means
this code will never get called for other implementations, e.g. XFS and
GFS2. Grabbing the reference on the superblock really isn't much overhead
for a global Q_SYNC call, so just drop this optimization.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:52 +0000 (03:44 -0500)]
quota: move code from sync_quota_sb into vfs_quota_sync
Currenly sync_quota_sb does a lot of sync and truncate action that only
applies to "VFS" style quotas and is actively harmful for the sync
performance in XFS. Move it into vfs_quota_sync and add a wait parameter
to ->quota_sync to tell if we need it or not.
My audit of the GFS2 code says it's also not needed given the way GFS2
implements quotas, but I'd be happy if this can get a detailed review.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:51 +0000 (03:44 -0500)]
quota: clean up Q_XQUOTASYNC
Currently Q_XQUOTASYNC calls into the quota_sync method, but XFS does something
entirely different in it than the rest of the filesystems. xfs_quota which
calls Q_XQUOTASYNC expects an asynchronous data writeout to flush delayed
allocations, while the "VFS" quota support wants to flush changes to the quota
file.
So make Q_XQUOTASYNC call into the writeback code directly and make the
quota_sync method optional as XFS doesn't need in the sense expected by the
rest of the quota code.
GFS2 was using limited XFS-style quota and has a quota_sync method fitting
neither the style used by vfs_quota_sync nor xfs_fs_quota_sync. I left it
in for now as per discussion with Steve it expects to be called from the
sync path this way.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:50 +0000 (03:44 -0500)]
quota: simplify permission checking
Stop having complicated different routines for checking permissions for
XQM vs "VFS" quotas. Instead do the checks for having sb->s_qcop and
a valid type directly in do_quotactl, and munge the *quotactl_valid functions
into a check_quotactl_permission helper that only checks for permissions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:49 +0000 (03:44 -0500)]
quota: special case Q_SYNC without device name
The Q_SYNC command can be called without the path to a device, in which case
it iterates over all superblocks. Special case this variant directly in
sys_quotactl so that the other code always gets a superblock and doesn't
need to deal with this case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:48 +0000 (03:44 -0500)]
quota: clean up checks for supported quota methods
Move the checks for sb->s_qcop->foo next to the actual calls for them, same
for sb_has_quota_active checks where applicable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Hellwig [Tue, 16 Feb 2010 08:44:47 +0000 (03:44 -0500)]
quota: split do_quotactl
Split out a helper for each non-trivial command from do_quotactl.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 9 Feb 2010 17:20:39 +0000 (18:20 +0100)]
quota: Fix warning when a delayed write happens before quota is enabled
If a delayed-allocation write happens before quota is enabled, the
kernel spits out a warning:
WARNING: at fs/quota/dquot.c:988 dquot_claim_space+0x77/0x112()
because the fact that user has some delayed allocation is not recorded
in quota structure.
Make dquot_initialize() update amount of reserved space for user if it sees
inode has some space reserved. Also make sure that reserved quota space does
not go negative and we warn about the filesystem bug just once.
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 9 Feb 2010 16:53:36 +0000 (17:53 +0100)]
quota: manage reserved space when quota is not active [v2]
Since we implemented generic reserved space management interface,
then it is possible to account reserved space even when quota
is not active (similar to i_blocks/i_bytes).
Without this patch following testcase result in massive comlain from
WARN_ON in dquot_claim_space()
TEST_CASE:
mount /dev/sdb /mnt -oquota
dd if=/dev/zero of=/mnt/test bs=1M count=1
quotaon /mnt
# fs_reserved_spave == 1Mb
# quota_reserved_space == 0, because quota was disabled
dd if=/dev/zero of=/mnt/test seek=1 bs=1M count=1
# fs_reserved_spave == 2Mb
# quota_reserved_space == 1Mb
sync # ->dquot_claim_space() -> WARN_ON
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Christoph Egger [Fri, 5 Feb 2010 13:13:33 +0000 (14:13 +0100)]
jbd[2]: remove references to BUFFER_DEBUG
CONFIG_BUFFER_DEBUG seems to have been removed from the documentation
somewhere around 2.4.15 and seemingly hasn't been available even
longer. It is, however, still referenced at one place from the jbd
code (one is a copy of the other header). Time to clean it up
Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 2 Feb 2010 13:05:53 +0000 (16:05 +0300)]
ext3: trivial quota cleanup
The patch is aimed to reorganize and simplify quota code a bit.
Quota code is itself complex enouth, but we can make it more readable
in some places:
- Move quota option parsing to separate functions.
- Simplify old-quota and journaled-quota mix check.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Dmitry Monakhov [Tue, 2 Feb 2010 13:05:51 +0000 (16:05 +0300)]
ext3: mount flags manipulation cleanup
Replace intermediate EXT3_MOUNT_XXX flags manipulation to
corresponding macro.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Wed, 6 Jan 2010 20:58:48 +0000 (21:58 +0100)]
ext3: Use bitops to read/modify EXT3_I(inode)->i_state
At several places we modify EXT3_I(inode)->i_state without holding i_mutex
(ext3_release_file, ext3_bmap, ext3_journalled_writepage, ext3_do_update_inode,
...). These modifications are racy and we can lose updates to i_state. So
convert handling of i_state to use bitops which are atomic.
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Wed, 6 Jan 2010 16:20:35 +0000 (17:20 +0100)]
quota: Cleanup S_NOQUOTA handling
Cleanup handling of S_NOQUOTA inode flag and document it a bit. The flag
does not have to be set under dqptr_sem. Only functions modifying inode's
dquot pointers have to check the flag under dqptr_sem before going forward
with the modification. This way we are sure that we cannot add new dquot
pointers to the inode which is just becoming a quota file.
The good thing about this cleanup is that there are no more places in quota
code which enforce i_mutex vs. dqptr_sem lock ordering (in particular that
dqptr_sem -> i_mutex of quota file). This should silence some (false) lockdep
warnings with ext4 + quota and generally make life of some filesystems easier.
Signed-off-by: Jan Kara <jack@suse.cz>
Linus Torvalds [Thu, 4 Mar 2010 16:26:08 +0000 (08:26 -0800)]
Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
exofs: groups support
exofs: Prepare for groups
exofs: Error recovery if object is missing from storage
exofs: convert io_state to use pages array instead of bio at input
exofs: RAID0 support
exofs: Define on-disk per-inode optional layout attribute
exofs: unindent exofs_sbi_read
exofs: Move layout related members to a layout structure
exofs: Recover in the case of read-passed-end-of-file
exofs: Micro-optimize exofs_i_info
exofs: debug print even less
Linus Torvalds [Thu, 4 Mar 2010 16:24:47 +0000 (08:24 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Make prom entry spinlock NMI safe.
sparc64: Kill off old sys_perfctr system call and state.
sparc: Update defconfigs.
sparc: Provide io{read,write}{16,32}be().
Linus Torvalds [Thu, 4 Mar 2010 16:24:06 +0000 (08:24 -0800)]
Merge git://git./linux/kernel/git/davem/ide-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-next-2.6: (49 commits)
drivers/ide: Fix continuation line formats
ide: fixed section mismatch warning in cmd640.c
ide: ide_timing_compute() fixup
ide: make ide_get_best_pio_mode() static
via82cxxx: use ->pio_mode value to determine pair device speed
tx493xide: use ->pio_mode value to determine pair device speed
siimage: use ->pio_mode value to determine pair device speed
palm_bk3710: use ->pio_mode value to determine pair device speed
it821x: use ->pio_mode value to determine pair device speed
cs5536: use ->pio_mode value to determine pair device speed
cs5535: use ->pio_mode value to determine pair device speed
cmd64x: fix handling of address setup timings
amd74xx: use ->pio_mode value to determine pair device speed
alim15x3: fix handling of UDMA enable bit
alim15x3: fix handling of DMA timings
alim15x3: fix handling of command timings
alim15x3: fix handling of address setup timings
ide-timings: use ->pio_mode value to determine fastest PIO speed
ide: change ->set_dma_mode method parameters
ide: change ->set_pio_mode method parameters
...
Linus Torvalds [Thu, 4 Mar 2010 16:20:14 +0000 (08:20 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (28 commits)
ioat: cleanup ->timer_fn() and ->cleanup_fn() prototypes
ioat3: interrupt coalescing
ioat: close potential BUG_ON race in the descriptor cleanup path
ioat2: kill pending flag
ioat3: use ioat2_quiesce()
ioat3: cleanup, don't enable DCA completion writes
DMAENGINE: COH 901 318 lli sg offset fix
DMAENGINE: COH 901 318 configure channel direction
DMAENGINE: COH 901 318 remove irq counting
DMAENGINE: COH 901 318 descriptor pool refactoring
DMAENGINE: COH 901 318 cleanups
dma: Add MPC512x DMA driver
Debugging options for the DMA engine subsystem
iop-adma: redundant/wrong tests in iop_*_count()?
dmatest: fix handling of an even number of xor_sources
dmatest: correct raid6 PQ test
fsldma: Fix cookie issues
fsldma: Fix cookie issues
dma: cases IPU_PIX_FMT_BGRA32, BGR32 and ABGR32 are the same in ipu_ch_param_set_size()
dma: make Open Firmware device id constant
...
Linus Torvalds [Thu, 4 Mar 2010 16:15:33 +0000 (08:15 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
init: Open /dev/console from rootfs
mqueue: fix typo "failues" -> "failures"
mqueue: only set error codes if they are really necessary
mqueue: simplify do_open() error handling
mqueue: apply mathematics distributivity on mq_bytes calculation
mqueue: remove unneeded info->messages initialization
mqueue: fix mq_open() file descriptor leak on user-space processes
fix race in d_splice_alias()
set S_DEAD on unlink() and non-directory rename() victims
vfs: add NOFOLLOW flag to umount(2)
get rid of ->mnt_parent in tomoyo/realpath
hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Mirror MS_KERNMOUNT in ->mnt_flags
get rid of useless vfsmount_lock use in put_mnt_ns()
Take vfsmount_lock to fs/internal.h
get rid of insanity with namespace roots in tomoyo
take check for new events in namespace (guts of mounts_poll()) to namespace.c
Don't mess with generic_permission() under ->d_lock in hpfs
sanitize const/signedness for udf
nilfs: sanitize const/signedness in dealing with ->d_name.name
...
Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
Linus Torvalds [Thu, 4 Mar 2010 16:04:22 +0000 (08:04 -0800)]
Merge git://git.infradead.org/battery-2.6
* git://git.infradead.org/battery-2.6:
power_supply: bq27x00: fix voltage and current units
power_supply: bq27x00: add status and time properties
power_supply: bq27x00: add BQ27500 support
power_supply: bq27x00: fix temperature conversion
power_supply: bq27x00: remove unused struct fields
power_supply: bq27x00: remove double endian swap
da9030_battery: fix spelling in comment
wm97xx_battery: Clean up some warnings
Linus Torvalds [Thu, 4 Mar 2010 15:52:06 +0000 (07:52 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: (27 commits)
Regulators: wm8400 - cleanup platform driver data handling
Regulators: wm8994 - clean up driver data after removal
Regulators: wm831x-xxx - clean up driver data after removal
Regulators: pcap-regulator - clean up driver data after removal
Regulators: max8660 - annotate probe and remove methods
Regulators: max1586 - annotate probe and remove methods
Regulators: lp3971 - fail if platform data was not supplied
Regulators: tps6507x-regulator - mark probe method as __devinit
Regulators: tps65023-regulator - mark probe method as __devinit
Regulators: twl-regulator - mark probe function as __devinit
Regulators: fixed - annotate probe and remove methods
Regulators: ab3100 - fix probe and remove annotations
Regulators: virtual - use sysfs attribute groups
twl6030: regulator: Configure STATE register instead of REMAP
regulator: Provide optional dummy regulator for consumers
regulator: Assume regulators are enabled if they don't report anything
regulator: Convert fixed voltage regulator to use enable_time()
regulator: Add WM8994 regulator support
regulator: enable max8649 regulator driver
regulator: trivial: fix typos in user-visible Kconfig text
...
Linus Torvalds [Thu, 4 Mar 2010 15:51:36 +0000 (07:51 -0800)]
Merge git://git./linux/kernel/git/brodo/pcmcia-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: disable pcmcia ioctl for !ARM, prepare for removal
pcmcia: CodingStyle fixes
pcmcia: alchemy: fixup wrong comments
pcmcia: remove irq_list parameter from pd6729
yenta_socket: ENE CB712 CardBus bridge needs special treatment with Echo Audio Indigo soundcards
Linus Torvalds [Thu, 4 Mar 2010 15:49:37 +0000 (07:49 -0800)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (151 commits)
vga_switcheroo: disable default y by new rules.
drm/nouveau: fix *staging* driver build with switcheroo off.
drm/radeon: fix typo in Makefile
vga_switcheroo: fix build on platforms with no ACPI
drm/radeon: Fix printf type warning in 64bit system.
drm/radeon/kms: bump the KMS version number for square tiling support.
vga_switcheroo: initial implementation (v15)
drm/radeon/kms: do not disable audio engine twice
Revert "drm/radeon/kms: disable HDMI audio for now on rv710/rv730"
drm/radeon/kms: do not preset audio stuff and start timer when not using audio
drm/radeon: r100/r200 ums: block ability for userspace app to trash 0 page and beyond
drm/ttm: fix function prototype to match implementation
drm/radeon: use ALIGN instead of open coding it
drm/radeon/kms: initialize set_surface_reg reg for rs600 asic
drm/i915: Use a dmi quirk to skip a broken SDVO TV output.
drm/i915: enable/disable LVDS port at DPMS time
drm/i915: check for multiple write domains in pin_and_relocate
drm/i915: clean-up i915_gem_flush_gpu_write_domain
drm/i915: reuse i915_gpu_idle helper
drm/i915: ensure lru ordering of fence_list
...
Fixed trivial conflicts in drivers/gpu/vga/Kconfig
Masami Hiramatsu [Thu, 4 Mar 2010 03:38:50 +0000 (22:38 -0500)]
x86: Issue at least one memory barrier in stop_machine_text_poke()
Fix stop_machine_text_poke() to issue smp_mb() before exiting
waiting loop, and use cpu_relax() for waiting.
Changes in v2:
- Don't use ACCESS_ONCE().
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jason Baron <jbaron@redhat.com>
LKML-Reference: <
20100304033850.3819.74590.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Masami Hiramatsu [Thu, 4 Mar 2010 03:38:43 +0000 (22:38 -0500)]
perf probe: Correct probe syntax on command line help
Move @SRC right after FUNC in syntax according to syntax change
on command line help.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <
20100304033843.3819.10087.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Akira Fujita [Thu, 4 Mar 2010 05:39:24 +0000 (00:39 -0500)]
ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctl
a) Fix sparse warning in ext4_ioctl()
b) Remove unneeded variable in mext_leaf_block()
c) Fix spelling typo in mext_check_arguments()
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Akira Fujita [Thu, 4 Mar 2010 05:34:58 +0000 (00:34 -0500)]
ext4: Fix the NULL reference in double_down_write_data_sem()
If EXT4_IOC_MOVE_EXT ioctl is called with NULL donor_fd, fget() in
ext4_ioctl() gets inappropriate file structure for donor; so we need
to do this check earlier, before calling double_down_write_data_sem().
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Akira Fujita [Thu, 4 Mar 2010 05:31:06 +0000 (00:31 -0500)]
ext4: Fix insertion point of extent in mext_insert_across_blocks()
If the leaf node has 2 extent space or fewer and EXT4_IOC_MOVE_EXT
ioctl is called with the file offset where after the 2nd extent
covers, mext_insert_across_blocks() always tries to insert extent into
the first extent. As a result, the file gets corrupted because of
wrong extent order. The patch fixes this problem.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Steffen Klassert [Thu, 4 Mar 2010 05:30:22 +0000 (13:30 +0800)]
padata: Allocate the cpumask for the padata instance
The cpumask of the padata instance was used without allocated.
This caused boot crashes if CONFIG_CPUMASK_OFFSTACK is enabled.
This patch fixes this by doing proper allocation for this cpumask.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Akinobu Mita [Thu, 4 Mar 2010 04:55:01 +0000 (23:55 -0500)]
ext4: consolidate in_range() definitions
There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c. This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Akinobu Mita [Thu, 4 Mar 2010 04:53:25 +0000 (23:53 -0500)]
ext4: cleanup to use ext4_grp_offs_to_block()
More cleanup to convert open-coded calculations of the first block
number of a free extent to use ext4_grp_offs_to_block() instead.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Akinobu Mita [Thu, 4 Mar 2010 04:53:39 +0000 (23:53 -0500)]
ext4: cleanup to use ext4_group_first_block_no()
This is a cleanup and simplification patch which takes some open-coded
calculations to calculate the first block number of a group and
converts them to use the (already defined) ext4_group_first_block_no()
function.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Dan Williams [Thu, 4 Mar 2010 04:22:21 +0000 (21:22 -0700)]
Merge branch 'coh' into dmaengine
Dan Williams [Thu, 4 Mar 2010 04:21:13 +0000 (21:21 -0700)]
ioat: cleanup ->timer_fn() and ->cleanup_fn() prototypes
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.
Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 4 Mar 2010 04:21:13 +0000 (21:21 -0700)]
ioat3: interrupt coalescing
The hardware automatically disables further interrupts after each event
until rearmed. This allows a delay to be injected between the occurence
of the interrupt and the running of the cleanup routine. The delay is
scaled by the descriptor backlog and then written to the INTRDELAY
register which specifies the number of microseconds to hold off
interrupt delivery after an interrupt event occurs. According to
powertop this reduces the interrupt rate from ~5000 intr/s to ~150
intr/s per without affecting throughput (simple dd to a raid6 array).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 4 Mar 2010 04:21:10 +0000 (21:21 -0700)]
ioat: close potential BUG_ON race in the descriptor cleanup path
Since ioat_cleanup_preamble() and the update of the last completed
descriptor are not synchronized there is a chance that two cleanup threads
can see descriptors to clean. If the first cleans up all pending
descriptors then the second will trigger the BUG_ON.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Torvalds [Tue, 2 Mar 2010 16:36:46 +0000 (08:36 -0800)]
Prioritize synchronous signals over 'normal' signals
This makes sure that we pick the synchronous signals caused by a
processor fault over any pending regular asynchronous signals sent to
use by [t]kill().
This is not strictly required semantics, but it makes it _much_ easier
for programs like Wine that expect to find the fault information in the
signal stack.
Without this, if a non-synchronous signal gets picked first, the delayed
asynchronous signal will have its signal context pointing to the new
signal invocation, rather than the instruction that caused the SIGSEGV
or SIGBUS in the first place.
This is not all that pretty, and we're discussing making the synchronous
signals more explicit rather than have these kinds of implicit
preferences of SIGSEGV and friends. See for example
http://bugzilla.kernel.org/show_bug.cgi?id=15395
for some of the discussion. But in the meantime this is a simple and
fairly straightforward work-around, and the whole
if (x & Y)
x &= Y;
thing can be compiled into (and gcc does do it) just three instructions:
movq %rdx, %rax
andl $Y, %eax
cmovne %rax, %rdx
so it is at least a simple solution to a subtle issue.
Reported-and-tested-by: Pavel Vilim <wylda@volny.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Wed, 3 Mar 2010 22:12:40 +0000 (17:12 -0500)]
Merge branch 'for-fsnotify' into for-linus
Jan Kara [Wed, 3 Mar 2010 21:19:32 +0000 (16:19 -0500)]
ext4: Release page references acquired in ext4_da_block_invalidatepages
We forget to release page references we acquire in
ext4_da_block_invalidatepages. Luckily, this function gets called only if we
are not able to allocate blocks for delay-allocated data so that function
should better never be called.
Also cleanup handling of index variable.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Eric W. Biederman [Wed, 3 Mar 2010 07:53:19 +0000 (23:53 -0800)]
init: Open /dev/console from rootfs
To avoid potential problems with an empty /dev open /dev/console
from rootfs instead of waiting to mount our root filesystem and
mounting it there. This effectively guarantees that there will
be a device node, and it won't be on a filesystem that we will
ever unmount, so there are no issues with leaving /dev/console
open and pinning the filesystem.
This is actually more effective than automatically mounting
devtmpfs on /dev because it removes removes the occasionally
problematic assumption that /dev/console exists from the boot
code.
With this patch I was able to throw busybox on my /boot partition
(which has no /dev directory) and boot into userspace without
problems.
The only possible negative consequence I can think of is that
someone out there deliberately used did not use a character device
that is major 5 minor 2 for /dev/console. Does anyone know of a
situation in which that could make sense?
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:27 +0000 (04:04 -0300)]
mqueue: fix typo "failues" -> "failures"
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:26 +0000 (04:04 -0300)]
mqueue: only set error codes if they are really necessary
... postponing assignments until they're needed. Doesn't change code size.
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:25 +0000 (04:04 -0300)]
mqueue: simplify do_open() error handling
It reduces code size:
text data bss dec hex filename
9925 72 16 10013 271d ipc/mqueue-BEFORE.o
9885 72 16 9973 26f5 ipc/mqueue-AFTER.o
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:24 +0000 (04:04 -0300)]
mqueue: apply mathematics distributivity on mq_bytes calculation
Code size reduction:
text data bss dec hex filename
9941 72 16 10029 272d ipc/mqueue-BEFORE.o
9925 72 16 10013 271d ipc/mqueue-AFTER.o
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:23 +0000 (04:04 -0300)]
mqueue: remove unneeded info->messages initialization
... and abort earlier if we couldn't allocate the message pointers array,
avoiding the u->mq_bytes accounting logic.
It reduces code size:
text data bss dec hex filename
9949 72 16 10037 2735 ipc/mqueue-BEFORE.o
9941 72 16 10029 272d ipc/mqueue-AFTER.o
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
André Goddard Rosa [Tue, 23 Feb 2010 07:04:28 +0000 (04:04 -0300)]
mqueue: fix mq_open() file descriptor leak on user-space processes
We leak fd on lookup_one_len() failure
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 3 Mar 2010 19:13:08 +0000 (14:13 -0500)]
fix race in d_splice_alias()
rehashing the negative placeholder opens a race with d_lookup();
we unhash it almost immediately (by d_move()), but the race
window is there. Since d_move() doesn't rely on target being
hashed, we don't need that d_rehash() at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 3 Mar 2010 19:12:08 +0000 (14:12 -0500)]
set S_DEAD on unlink() and non-directory rename() victims
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Wed, 10 Feb 2010 11:15:53 +0000 (12:15 +0100)]
vfs: add NOFOLLOW flag to umount(2)
Add a new UMOUNT_NOFOLLOW flag to umount(2). This is needed to prevent
symlink attacks in unprivileged unmounts (fuse, samba, ncpfs).
Additionally, return -EINVAL if an unknown flag is used (and specify
an explicitly unused flag: UMOUNT_UNUSED). This makes it possible for
the caller to determine if a flag is supported or not.
CC: Eugene Teo <eugene@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>