Dave Chinner [Thu, 22 Mar 2012 05:15:13 +0000 (05:15 +0000)]
xfs: add lots of attribute trace points
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Jan Kara [Thu, 15 Mar 2012 09:34:02 +0000 (09:34 +0000)]
xfs: Fix oops on IO error during xlog_recover_process_iunlinks()
When an IO error happens during inode deletion run from
xlog_recover_process_iunlinks(), the filesystem gets shut down. Thus any subsequent
attempt to read buffers fails. Code in xlog_recover_process_iunlinks() does not
account for the fact that a read of a buffer which was read a while ago can
really fail, which results in the oops on
agi = XFS_BUF_TO_AGI(agibp);
Fix the problem by cleaning up the buffer handling in
xlog_recover_process_iunlinks() as suggested by Dave Chinner. We release the buffer
lock but keep the buffer reference to the AG buffer. That is enough for the buffer to stay
pinned in memory and we don't have to call xfs_read_agi() all the time.
CC: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Thu, 22 Mar 2012 05:15:12 +0000 (05:15 +0000)]
xfs: fix fstrim offset calculations
xfs_ioc_fstrim() doesn't treat the incoming offset and length
correctly. It treats them as a filesystem block address, rather than
a disk address. This is wrong because the range passed in is a
linear representation, while the filesystem block address notation
is a sparse representation. Hence we cannot convert the range directly
to filesystem block units and then use that for calculating the
range to trim.
While this sounds dangerous, the problem is limited to calculating
what AGs need to be trimmed. The code that calculates the actual
ranges to trim gets the right result (i.e. only ever discards free
space), even though it uses the wrong ranges to limit what is
trimmed. Hence this is not a bug that endangers user data.
Fix this by treating the range as a disk address range and use the
appropriate functions to convert the range into the desired formats
for calculations.
Further, fix the first free extent lookup (the longest) to actually
find the largest free extent. Currently this lookup uses a <=
lookup, which results in finding the extent to the left of the
largest because we can never get an exact match on the largest
extent. This is due to the fact that while we know its size, we
don't know its location and so the exact match fails and we move
one record to the left to get the next largest extent. Instead, use
a >= search so that the lookup returns the largest extent regardless
of the fact we don't get an exact match on it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
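Illustration for the fstrim fix above (not part of the commit): a minimal
sketch of why a linear range cannot be used directly as filesystem block
numbers. The geometry values, struct and helper names are hypothetical, and
the sketch assumes the sparse encoding packs the AG number above agblklog
bits of AG-relative block number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry: 1000 blocks per AG, so AG block numbers need 10 bits. */
struct geom {
    uint32_t agblocks;   /* blocks per allocation group */
    uint32_t agblklog;   /* ceil(log2(agblocks)) */
    uint32_t blocklog;   /* log2(block size in bytes) */
};

/* Linear byte offset -> sparse filesystem block number. */
static uint64_t byte_to_fsb(const struct geom *g, uint64_t byte_off)
{
    assert(g->agblocks > 0);
    uint64_t linear_blk = byte_off >> g->blocklog;
    uint64_t agno = linear_blk / g->agblocks;
    uint64_t agbno = linear_blk % g->agblocks;

    return (agno << g->agblklog) | agbno;
}

/* AG number encoded in a sparse filesystem block number. */
static uint64_t fsb_to_agno(const struct geom *g, uint64_t fsb)
{
    return fsb >> g->agblklog;
}
```

With this geometry, linear block 2040 lives in AG 2, but reading 2040 as a
sparse block number would place it in AG 1, which is the kind of AG
miscalculation the commit describes.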
Dave Chinner [Thu, 22 Mar 2012 05:15:11 +0000 (05:15 +0000)]
xfs: Account log unmount transaction correctly
There have been a few reports of this warning appearing recently:
XFS (dm-4): xlog_space_left: head behind tail
tail_cycle = 129, tail_bytes = 20163072
GH cycle = 129, GH bytes = 20162880
The common cause appears to be lots of freeze and unfreeze cycles,
and the output from the warnings indicates that we are leaking
around 8 bytes of log space per freeze/unfreeze cycle.
When we freeze the filesystem, we write an unmount record and that
uses xlog_write directly - a special type of transaction,
effectively. What it doesn't do, however, is correctly account for
the log space it uses. The unmount record writes an 8 byte structure
with a special magic number into the log, and the space this
consumes is not accounted for in the log ticket tracking the
operation. Hence we leak 8 bytes every unmount record that is
written.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
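Illustration for the unmount record accounting above (not part of the
commit): a small worked calculation using the numbers from the quoted
warning. It assumes the entire gap between tail and grant head comes from
unaccounted 8 byte unmount records; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the unaccounted unmount record, as stated in the commit message. */
#define UNMOUNT_REC_BYTES 8

/* Bytes by which the grant head trails the tail within the same cycle. */
static int64_t head_behind_tail(int64_t tail_bytes, int64_t gh_bytes)
{
    return tail_bytes - gh_bytes;
}

/* Number of unaccounted unmount records that would explain a given gap. */
static int64_t leaked_records(int64_t gap_bytes)
{
    assert(gap_bytes >= 0);
    return gap_bytes / UNMOUNT_REC_BYTES;
}
```

For the warning shown, the gap is 20163072 - 20162880 = 192 bytes, which
under this assumption corresponds to 24 freeze/unfreeze cycles.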
Dave Chinner [Thu, 22 Mar 2012 05:15:10 +0000 (05:15 +0000)]
xfs: don't cache inodes read through bulkstat
When we read inodes via bulkstat, we generally only read them once
and then throw them away - they never get used again. If we retain
them in cache, then it simply causes the working set of inodes and
other cached items to be reclaimed just so the inode cache can grow.
Avoid this problem by marking inodes read by bulkstat not to be
cached and check this flag in .drop_inode to determine whether the
inode should be added to the VFS LRU or not. If the inode lookup
hits an already cached inode, then don't set the flag. If the inode
lookup hits an inode marked with no cache flag, remove the flag and
allow it to be cached once the current reference goes away.
Inodes marked as not cached will get cleaned up by the background
inode reclaim or via memory pressure, so they will still generate
some short term cache pressure. They will, however, be reclaimed
much sooner and in preference to cache hot inodes.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
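Illustration for the bulkstat caching change above (not part of the commit):
a minimal sketch of the flag handling as the commit message words it. The
flag name, struct and functions are hypothetical stand-ins, and the real
lookup code may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IFLAG_DONTCACHE 0x1u   /* hypothetical name for the "do not cache" flag */

struct bulk_inode {
    unsigned int flags;
};

/* Called on every lookup; cache_hit says whether the inode was already cached. */
static void lookup_update_flags(struct bulk_inode *ip, bool cache_hit,
                                bool via_bulkstat)
{
    assert(ip != NULL);
    if (!cache_hit) {
        /* Fresh inode: only bulkstat reads are marked as not worth caching. */
        if (via_bulkstat)
            ip->flags |= IFLAG_DONTCACHE;
        return;
    }
    /* Cache hit: never set the flag, and clear it if it was set earlier. */
    ip->flags &= ~IFLAG_DONTCACHE;
}

/* Sketch of a .drop_inode decision: true means "do not add to the LRU". */
static bool drop_inode_sketch(const struct bulk_inode *ip)
{
    return (ip->flags & IFLAG_DONTCACHE) != 0;
}
```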
Christoph Hellwig [Tue, 28 Feb 2012 11:01:40 +0000 (11:01 +0000)]
xfs: trace xfs_name strings correctly
Strings stored in an xfs_name structure are often not NUL terminated;
print them using the correct printf specifiers that make use of the
string length stored in the xfs_name structure.
Reported-by: Brian Candler <B.Candler@pobox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
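Illustration for the xfs_name tracing fix above (not part of the commit): a
userspace sketch of printing a length-counted string that has no NUL
terminator, using the standard "%.*s" precision form. The struct is a
hypothetical stand-in and is not claimed to match the xfs_name layout.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a length-counted name that is not NUL terminated. */
struct name_sketch {
    const unsigned char *name;
    int len;
};

/* Format the name with "%.*s", which reads at most len bytes of the string. */
static int format_name(char *buf, size_t bufsz, const struct name_sketch *n)
{
    assert(n->len >= 0);
    return snprintf(buf, bufsz, "%.*s", n->len, (const char *)n->name);
}
```

A plain "%s" would keep reading past the end of the name until it happened
to find a zero byte, which is the failure mode the commit avoids.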
Dave Chinner [Thu, 22 Mar 2012 05:15:07 +0000 (05:15 +0000)]
xfs: introduce an allocation workqueue
We currently have significant issues with the amount of stack that
allocation in XFS uses, especially in the writeback path. We can
easily consume 4k of stack between mapping the page, manipulating
the bmap btree and allocating blocks from the free list. Not to
mention btree block readahead and other functionality that issues IO
in the allocation path.
As a result, we can no longer fit allocation in the writeback path
in the stack space provided on x86_64. To alleviate this problem,
introduce an allocation workqueue and move all allocations to a
separate context. This can be easily added as an interposing layer
into xfs_alloc_vextent(), which takes a single argument structure
and does not return until the allocation is complete or has failed.
To do this, add a work structure and a completion to the allocation
args structure. This allows xfs_alloc_vextent to queue the args onto
the workqueue and wait for it to be completed by the worker. This
can be done completely transparently to the caller.
The worker function needs to ensure that it sets and clears the
PF_TRANS flag appropriately as it is being run in an active
transaction context. Work can also be queued in a memory reclaim
context, so a rescuer is needed for the workqueue.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Thu, 22 Mar 2012 05:15:06 +0000 (05:15 +0000)]
xfs: Fix open flag handling in open_by_handle code
Sparse identified some unsafe handling of open flags in the xfs open
by handle ioctl code. Update the code to use the correct access
macros to ensure that we handle the open flags correctly.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Kamal Dasu [Thu, 23 Feb 2012 00:41:39 +0000 (00:41 +0000)]
xfs: fix deadlock in xfs_rtfree_extent
To fix the deadlock caused by repeatedly calling xfs_rtfree_extent:
- removed xfs_ilock() and xfs_trans_ijoin() from xfs_rtfree_extent(),
instead added asserts that the inode is locked and has an inode_item
attached to it.
- in xfs_bunmapi() when dealing with an inode with the rt flag
call xfs_ilock() and xfs_trans_ijoin() so that the
reference count is bumped on the inode and it is attached to the
transaction before calling into xfs_bmap_del_extent, similar to
what we do in xfs_bmap_rtalloc.
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ben Myers <bpm@sgi.com>
Gerard Snitselaar [Fri, 16 Mar 2012 18:36:18 +0000 (18:36 +0000)]
fs: xfs: fix section mismatch in linux-next
xfs_qm_exit() is called in init_xfs_fs().
Signed-off-by: Gerard Snitselaar <dev@snitselaar.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Wed, 7 Mar 2012 04:50:22 +0000 (04:50 +0000)]
xfs: fallback to vmalloc for large buffers in xfs_getbmap
xfs_getbmap uses a large buffer for extents, which is kmalloc'd.
This can fail after the system has been running for some time as it
is a high order allocation. Add a fallback to vmalloc so that it
doesn't require contiguous memory and so won't randomly fail on
files with large extent lists.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
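Illustration for the vmalloc fallback above (not part of the commit): a
userspace sketch of the "try the contiguous allocator first, then fall back"
pattern. malloc() stands in for both kernel allocators, the size limit is an
invented way to model high order allocation failure, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical limit above which the "contiguous" allocator is assumed to fail. */
#define CONTIG_LIMIT 4096u

struct buf_sketch {
    void *ptr;
    bool from_fallback;   /* remembers which allocator the free must pair with */
};

/* Stand-in for kmalloc: refuses large requests to model high order failure. */
static void *alloc_contig(size_t size)
{
    return size <= CONTIG_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for vmalloc: no contiguity requirement, so large sizes can succeed. */
static void *alloc_fallback(size_t size)
{
    return malloc(size);
}

/* Try the contiguous allocator first and only then use the fallback. */
static struct buf_sketch alloc_large(size_t size)
{
    struct buf_sketch b = { alloc_contig(size), false };

    if (!b.ptr) {
        b.ptr = alloc_fallback(size);
        b.from_fallback = true;
    }
    return b;
}

static void free_large(struct buf_sketch *b)
{
    assert(b != NULL);
    /* Both stand-ins use malloc here; kernel code would pick kfree or vfree. */
    free(b->ptr);
    b->ptr = NULL;
}
```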
Dave Chinner [Wed, 7 Mar 2012 04:50:21 +0000 (04:50 +0000)]
xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get
xfsdump uses a large buffer for extended attributes, which has a
kmalloc'd shadow buffer in the kernel. This can fail after the
system has been running for some time as it is a high order
allocation. Add a fallback to vmalloc so that it doesn't require
contiguous memory and so won't randomly fail while xfsdump is
running.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Wed, 7 Mar 2012 04:50:24 +0000 (04:50 +0000)]
xfs: remove remaining scraps of struct xfs_iomap
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Wed, 7 Mar 2012 04:50:25 +0000 (04:50 +0000)]
xfs: fix inode lookup race
When we get concurrent lookups of the same inode that is not in the
per-AG inode cache, there is a race condition that triggers warnings
in unlock_new_inode() indicating that we are initialising an inode
that isn't in the correct state for a new inode.
When we do an inode lookup via a file handle or a bulkstat, we don't
serialise lookups at a higher level through the dentry cache (i.e.
pathless lookup), and so we can get concurrent lookups of the same
inode.
The race condition is between the insertion of the inode into the
cache in the case of a cache miss and a concurrent lookup:
Thread 1                        Thread 2
xfs_iget()
  xfs_iget_cache_miss()
    xfs_iread()
    lock radix tree
      radix_tree_insert()
                                rcu_read_lock
                                radix_tree_lookup
                                lock inode flags
                                XFS_INEW not set
                                igrab()
                                unlock inode flags
                                rcu_read_unlock
                                use uninitialised inode
                                .....
      lock inode flags
      set XFS_INEW
      unlock inode flags
    unlock radix tree
  xfs_setup_inode()
    inode flags = I_NEW
    unlock_new_inode()
      WARNING as inode flags != I_NEW
This can lead to inode corruption, inode list corruption, etc, and
is generally a bad thing to occur.
Fix this by setting XFS_INEW before inserting the inode into the
radix tree. This will ensure any concurrent lookup will find the new
inode with XFS_INEW set and that forces the lookup to wait until the
XFS_INEW flag is removed before allowing the lookup to succeed.
cc: <stable@vger.kernel.org> # for 3.0.x, 3.2.x
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
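Illustration for the inode lookup race fix above (not part of the commit): a
single-threaded sketch of the ordering rule only, namely that the "new" flag
is set before the inode becomes visible to lookups. It does not model the
radix tree, locking or RCU; the array cache, flag name and functions are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define INEW_SKETCH 0x1u      /* hypothetical stand-in for XFS_INEW */
#define CACHE_SLOTS 8

struct cached_inode {
    unsigned long ino;
    unsigned int flags;
};

static struct cached_inode *cache[CACHE_SLOTS];  /* stand-in for the radix tree */

/* Fixed ordering: mark the inode new BEFORE it becomes visible to lookups. */
static void cache_insert_new(struct cached_inode *ip)
{
    ip->flags |= INEW_SKETCH;
    cache[ip->ino % CACHE_SLOTS] = ip;
}

/* Lookup returns NULL for "not found, or still initialising so retry later". */
static struct cached_inode *cache_lookup(unsigned long ino)
{
    struct cached_inode *ip = cache[ino % CACHE_SLOTS];

    if (!ip || ip->ino != ino || (ip->flags & INEW_SKETCH))
        return NULL;
    return ip;
}

/* Initialisation finished: clear the flag so lookups can succeed. */
static void finish_setup(struct cached_inode *ip)
{
    assert(ip->flags & INEW_SKETCH);
    ip->flags &= ~INEW_SKETCH;
}
```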
Dave Chinner [Wed, 7 Mar 2012 04:50:19 +0000 (04:50 +0000)]
xfs: clean up minor sparse warnings
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Tue, 13 Mar 2012 08:52:37 +0000 (08:52 +0000)]
xfs: remove the global xfs_Gqm structure
If we initialize the slab caches for the quota code when XFS is loaded there
is no need for a global and reference counted quota manager structure. Drop
all this overhead and also fix the error handling during quota initialization.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 14 Mar 2012 16:53:34 +0000 (11:53 -0500)]
xfs: remove the per-filesystem list of dquots
Instead of keeping a separate per-filesystem list of dquots we can walk
the radix tree for the two places where we need to iterate all quota
structures.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Tue, 13 Mar 2012 08:52:35 +0000 (08:52 +0000)]
xfs: use per-filesystem radix trees for dquot lookup
Replace the global hash tables for looking up in-memory dquot structures
with per-filesystem radix trees to allow scaling to a large number of
in-memory dquot structures.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Tue, 13 Mar 2012 08:52:34 +0000 (08:52 +0000)]
xfs: per-filesystem dquot LRU lists
Replace the global dquot lru lists with a per-filesystem one.
Note that the shrinker isn't wired up to the per-superblock VFS shrinker
infrastructure as we would have problems summing up and splitting the counts
for inodes and dquots. I don't think this is a major problem as the quota
cache isn't as intertwined with the inode cache as the dentry cache is,
because an inode that is dropped from the cache will generally release
a dquot reference, but most of the time it won't be the last one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Tue, 13 Mar 2012 08:52:33 +0000 (08:52 +0000)]
xfs: use common code for quota statistics
Switch the quota code over to use the generic XFS statistics infrastructure.
While the legacy /proc/fs/xfs/xqm and /proc/fs/xfs/xqmstats interfaces are
preserved for now the statistics that still have a meaning with the current
code are now also available from /proc/fs/xfs/stats.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 29 Feb 2012 09:53:55 +0000 (09:53 +0000)]
xfs: reimplement fdatasync support
Add an in-memory only flag to say we logged timestamps only, and use it to
check if fdatasync can optimize away the log force.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
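Illustration for the fdatasync change above (not part of the commit): a
minimal sketch of how a "timestamps only" bit could let fdatasync skip the
log force. The field bits, the pinned parameter and the function name are
assumptions made for the example, not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-memory field bits; real names and values are not given here. */
#define FIELD_CORE       0x1u   /* some non-timestamp inode change was logged */
#define FIELD_TIMESTAMP  0x2u   /* timestamps were logged */

/* Decide whether a sync call needs to force the log for this inode. */
static bool needs_log_force(unsigned int logged_fields, bool pinned, bool datasync)
{
    assert(!(logged_fields & ~(FIELD_CORE | FIELD_TIMESTAMP)));
    if (!pinned)
        return false;                 /* nothing outstanding in the log */
    if (datasync && !(logged_fields & ~FIELD_TIMESTAMP))
        return false;                 /* fdatasync: timestamp-only update */
    return true;
}
```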
Christoph Hellwig [Wed, 29 Feb 2012 09:53:54 +0000 (09:53 +0000)]
xfs: split in-core and on-disk inode log item fields
Add a new ili_fields member to the inode log item to isolate the in-memory
flags from the ones that actually go to the log. This will allow tracking
timestamp-only updates for fdatasync and O_DSYNC in the next patch and
prepares for divorcing the on-disk log format from the in-memory log item
a little further down the road.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 29 Feb 2012 09:53:53 +0000 (09:53 +0000)]
xfs: make xfs_inode_item_size idempotent
Move all code messing with the inode log item flags into xfs_inode_item_format
to make sure xfs_inode_item_size really only calculates the number of
vectors, but doesn't modify any state of the inode item.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 29 Feb 2012 09:53:52 +0000 (09:53 +0000)]
xfs: log timestamp updates
Timestamps on regular files are the last metadata that XFS does not update
transactionally. Now that we use the delaylog mode exclusively and made
the log code scale extremely well there is no need to bypass that code for
timestamp updates. Logging all updates allows us to drop a lot of code, and
will allow for further performance improvements later on.
Note that this patch drops optimized handling of fdatasync - it will be
added back in a separate commit.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Tue, 13 Mar 2012 08:41:05 +0000 (08:41 +0000)]
xfs: log file size updates at I/O completion time
Do not use unlogged metadata updates and the VFS dirty bit for updating
the file size after writeback. In addition to causing various problems
with updates getting delayed for far too long this also drags in the
unscalable VFS dirty tracking, and is one of the few remaining unlogged
metadata updates.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 29 Feb 2012 09:53:50 +0000 (09:53 +0000)]
xfs: log file size updates as part of unwritten extent conversion
If we convert an unwritten extent past the current i_size, log the size update
as part of the extent manipulation transactions instead of doing an unlogged
metadata update later.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Wed, 29 Feb 2012 09:53:49 +0000 (09:53 +0000)]
xfs: do not require an ioend for new EOF calculation
Replace xfs_ioend_new_eof with a new inline xfs_new_eof helper that
doesn't require an ioend, and is also available outside of xfs_aops.c.
Also make the code a bit more clear by using a normal if statement
instead of a slightly misleading MIN().
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
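Illustration for the EOF helper above (not part of the commit): a sketch of
an EOF calculation written with a plain if statement rather than MIN(). It
assumes the helper clamps the end of the I/O to the in-core size and reports
a new EOF only when that exceeds the on-disk size; the function name and
parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the new on-disk EOF if an I/O ending at io_end extends the file,
 * or 0 if the on-disk size does not need updating.
 */
static int64_t new_eof_sketch(int64_t incore_size, int64_t ondisk_size,
                              int64_t io_end)
{
    assert(incore_size >= 0 && ondisk_size >= 0 && io_end >= 0);
    /* Plain if statement instead of MIN(): never go past the in-core size. */
    if (io_end > incore_size)
        io_end = incore_size;
    return io_end > ondisk_size ? io_end : 0;
}
```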
Christoph Hellwig [Wed, 29 Feb 2012 09:53:48 +0000 (09:53 +0000)]
xfs: use per-filesystem I/O completion workqueues
The new concurrency managed workqueues are cheap enough that we can create
per-filesystem instead of global workqueues. This allows us to remove the
trylock or defer scheme on the ilock, which is not helpful once we have
outstanding log reservations until finishing a size update.
Also allow the default concurrency on these workqueues so that I/O completions
blocking on the ilock for one inode do not block processing for another inode.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:28:18 +0000 (02:28 +0000)]
quota: make Q_XQUOTASYNC a noop
Now that XFS takes quota reservations into account there is no need to flush
anything before reporting quotas - in addition to being fully transactional
all quota information is also 100% coherent with the rest of the filesystem
now.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:28:17 +0000 (02:28 +0000)]
xfs: include reservations in quota reporting
Report all quota usage including the currently pending reservations. This
avoids the need to flush delalloc space before gathering quota information,
and matches quota enforcement, which already takes the reservations into
account.
This fixes xfstests 270.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:28:16 +0000 (02:28 +0000)]
xfs: merge xfs_qm_export_dquot into xfs_qm_scall_getquota
There is no good reason to have these two separate, and for the next change
we would need the full struct xfs_dquot in xfs_qm_export_dquot, so better
just fold the code now instead of changing it spuriously.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Alex Elder [Thu, 16 Feb 2012 22:01:00 +0000 (22:01 +0000)]
xfs: only take the ILOCK in xfs_reclaim_inode()
At the end of xfs_reclaim_inode(), the inode is locked in order to
wait for a possible concurrent lookup to complete before the
inode is freed. This synchronization step was taking both the ILOCK
and the IOLOCK, but the latter was causing lockdep to produce
reports of the possibility of deadlock.
It turns out that there's no need to acquire the IOLOCK at this
point anyway. It may have been required in some earlier version of
the code, but there should be no need to take the IOLOCK in
xfs_iget(), so there's no (longer) any need to get it here for
synchronization. Add an assertion in xfs_iget() as a reminder
of this assumption.
Dave Chinner diagnosed this on IRC, and Christoph Hellwig suggested
no longer including the IOLOCK. I just put together the patch.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:31 +0000 (02:31 +0000)]
xfs: split and cleanup xfs_log_reserve
Split the log regrant case out of xfs_log_reserve into a separate function,
and merge xlog_grant_log_space and xlog_regrant_write_log_space into their
respective callers. Also replace the XFS_LOG_PERM_RESERV flag, which easily
got misused before the previous cleanups, with a simple boolean parameter.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:30 +0000 (02:31 +0000)]
xfs: share code for grant head availability checks
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:29 +0000 (02:31 +0000)]
xfs: share code for grant head wakeups
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:28 +0000 (02:31 +0000)]
xfs: share code for grant head waiting
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:27 +0000 (02:31 +0000)]
xfs: add xlog_grant_head_wake_all
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:26 +0000 (02:31 +0000)]
xfs: add xlog_grant_head_init
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:25 +0000 (02:31 +0000)]
xfs: add the xlog_grant_head structure
Add a new data structure to allow sharing code between the log grant and
regrant code.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:24 +0000 (02:31 +0000)]
xfs: remove log space waitqueues
The tic->t_wait waitqueues can never have more than a single waiter
on them, so we can easily replace them with a task_struct pointer
and wake_up_process.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:23 +0000 (02:31 +0000)]
xfs: cleanup xfs_log_space_wake
Remove the now unused opportunistic parameter, and use the
xlog_writeq_wake and xlog_reserveq_wake helpers now that we don't have
to care about the opportunistic wakeups.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:22 +0000 (02:31 +0000)]
xfs: remove xfs_trans_unlocked_item
There is no reason to wake up log space waiters when unlocking inodes or
dquots, and the commit log has no explanation for this function either.
Given that we now have exact log space wakeups everywhere we can assume
the reason for this function was to paper over log space races in earlier
XFS versions.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:21 +0000 (02:31 +0000)]
xfs: do exact log space wakeups in xlog_ungrant_log_space
The only reason that xfs_log_space_wake had to do opportunistic wakeups
was that the old xfs_log_move_tail calling convention didn't allow for
exact wakeups when not updating the log tail LSN. Since this issue has
been fixed we can do exact wakeups now.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Christoph Hellwig [Mon, 20 Feb 2012 02:31:20 +0000 (02:31 +0000)]
xfs: split tail_lsn assignments from log space wakeups
Currently xfs_log_move_tail has a tail_lsn argument that is horribly
overloaded: it may contain either an actual lsn to assign to the log tail,
0 as a special case to use the last sync LSN, or 1 to indicate that no tail
LSN assignment should be performed, and we should opportunistically wake up
at least one task waiting for log space even if we did not move the LSN.
Remove the tail LSN assignment from xfs_log_move_tail and make the two callers
use xlog_assign_tail_lsn instead of the current variant of partially using
the code in xfs_log_move_tail and partially opencoding it. Note that means
we grow an additional lock roundtrip on the AIL lock for each bulk update
or delete, which is still far less than what we had before introducing the
bulk operations. If this proves to be a problem we can still add a variant
of xlog_assign_tail_lsn that expects the lock to be held already.
Also rename the remainder of xfs_log_move_tail to xfs_log_space_wake as
that name describes its functionality much better.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Mitsuo Hayasaka [Mon, 6 Feb 2012 12:51:05 +0000 (12:51 +0000)]
xfs: cleanup quota check on disk blocks and inodes reservations
This patch is a cleanup of quota check on disk blocks and inodes
reservations, and changes it as follows.
(1) add a total_count variable to store the total number of
current usages and new reservations for disk blocks and inodes,
respectively.
(2) make it more readable to check if the local variables softlimit
and hardlimit are positive. It has been changed as follows.
if (softlimit > 0ULL) -> if (softlimit)
if (hardlimit > 0ULL) -> if (hardlimit)
This is because they are defined as xfs_qcnt_t which is unsigned.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
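Illustration for the quota check cleanup above (not part of the commit): a
minimal sketch of a limit check that uses a total_count and the plain
"if (limit)" test on unsigned values. It assumes a limit of 0 means "no
limit" and that usage exactly at a limit is allowed; the grace-period flag,
type name and function name are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t qcnt_sketch_t;   /* stand-in for the unsigned xfs_qcnt_t */

/* Returns true if current usage plus the new reservation stays within limits. */
static bool quota_allows(qcnt_sketch_t cur, qcnt_sketch_t delta,
                         qcnt_sketch_t softlimit, qcnt_sketch_t hardlimit,
                         bool soft_grace_expired)
{
    qcnt_sketch_t total_count = cur + delta;

    assert(total_count >= cur);   /* the sketch does not handle wraparound */
    /* For unsigned types, "if (x)" is equivalent to "if (x > 0ULL)". */
    if (hardlimit && total_count > hardlimit)
        return false;
    if (softlimit && total_count > softlimit && soft_grace_expired)
        return false;
    return true;
}
```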
Mitsuo Hayasaka [Mon, 6 Feb 2012 12:50:30 +0000 (12:50 +0000)]
xfs: make inode quota check more general
XFS checks quota when reserving disk blocks and inodes. In the block
reservation, it checks if the total number of blocks including current
usage and new reservation exceeds quota. In the inode reservation,
it checks using the total number of inodes including only current usage
without new reservation. However, this inode quota check works well
since the caller of xfs_trans_dquot() always sets the argument of the
number of new inode reservations to 1 or 0 and inodes are reserved one by
one in current XFS.
To make it more general, this patch changes it to the same way as the
block quota check.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit c922bbc819324558e61402a7a76c10c550ca61bc)
Mitsuo Hayasaka [Mon, 6 Feb 2012 12:50:07 +0000 (12:50 +0000)]
xfs: change available ranges of softlimit and hardlimit in quota check
In general, quota allows us to use disk blocks and inodes up to each
limit, that is, they are available as long as usage does not exceed the limit.
Currently XFS only allows usage strictly below the limits, except in the
inode quota check. This patch changes the checks so that usage up to and
including each limit is allowed.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 20f12d8ac01917d96860f352f67eddd912df0afb)
Jesper Juhl [Mon, 13 Feb 2012 20:51:05 +0000 (20:51 +0000)]
XFS: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended
It looks to me like the two ASSERT()s in xfs_trans_add_item() really
want to do a compare (==) rather than assignment (=).
This patch changes it from the latter to the former.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 05293485a0b6b1f803e8a3c0ff188c38f6969985)
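Illustration for the ASSERT() fix above (not part of the commit): a small
sketch of why "=" inside an assertion is a bug. The struct and function
names are hypothetical; the two functions return what the asserted
expression would evaluate to.

```c
#include <assert.h>

struct item_sketch {
    int flags;
};

/* Buggy form: "=" assigns, so it passes for any nonzero value and clobbers state. */
static int check_with_assignment(struct item_sketch *it, int expected)
{
    assert(it);
    return (it->flags = expected) != 0;
}

/* Intended form: "==" compares and leaves the item untouched. */
static int check_with_compare(const struct item_sketch *it, int expected)
{
    assert(it);
    return it->flags == expected;
}
```

The assignment form both hides a mismatch and overwrites the field, and the
overwrite only happens in builds where the assertion is compiled in.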
Christoph Hellwig [Wed, 1 Feb 2012 13:57:20 +0000 (13:57 +0000)]
xfs: use a normal shrinker for the dquot freelist
Stop reusing dquots from the freelist when allocating new ones directly, and
implement a shrinker that actually follows the specifications for the
interface. The shrinker implementation is still highly suboptimal at this
point, but we can gradually work on it.
This also fixes a bug in the previous lock ordering, where we would take
the hash and dqlist locks inside of the freelist lock against the normal
lock ordering. This is only solvable by introducing the dispose list,
and thus not when using direct reclaim of unused dquots for new allocations.
As a side-effect the quota upper bound and used to free ratio values in
/proc/fs/xfs/xqm are set to 0 as these values don't make any sense in the
new world order.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 04da0c8196ac0b12fb6b84f4b7a51ad2fa56d869)
Chandra Seetharaman [Mon, 23 Jan 2012 17:31:43 +0000 (17:31 +0000)]
Define new macro XFS_ALL_QUOTA_ACTIVE and simplify some usage
Define new macro XFS_ALL_QUOTA_ACTIVE and simplify some usage
of quota macros.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Chandra Seetharaman [Mon, 23 Jan 2012 17:31:37 +0000 (17:31 +0000)]
Change xfs_sb_from_disk() interface to take a mount pointer
Change xfs_sb_from_disk() interface to take a mount pointer
instead of a superblock pointer.
This is to print mount point specific error messages in future
fixes.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Chandra Seetharaman [Mon, 23 Jan 2012 17:31:30 +0000 (17:31 +0000)]
Define a new function xfs_inode_dquot()
Define a new function xfs_inode_dquot() that takes an inode pointer
and a disk quota type and returns the quota pointer for the specified
quota type.
This simplifies the xfs_qm_dqget() error path significantly.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
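The shape of such a helper can be sketched as follows. This is an assumption-laden illustration, not the XFS implementation: the sk_ types, field names and quota types are hypothetical and only model "inode pointer plus quota type in, attached quota pointer out".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quota types and structures, for illustration only. */
enum sk_qtype { SK_QTYPE_USER, SK_QTYPE_GROUP };

struct sk_dquot {
	unsigned int id;
};

struct sk_inode {
	struct sk_dquot *udquot;	/* attached user quota, or NULL */
	struct sk_dquot *gdquot;	/* attached group quota, or NULL */
};

/* Return the dquot attached to ip for the given type, or NULL. */
struct sk_dquot *sk_inode_dquot(struct sk_inode *ip, enum sk_qtype type)
{
	assert(ip != NULL);

	switch (type) {
	case SK_QTYPE_USER:
		return ip->udquot;
	case SK_QTYPE_GROUP:
		return ip->gdquot;
	default:
		return NULL;
	}
}
```

Callers that previously open-coded a per-type branch can then ask for the pointer by type, which is what makes a shared error path simpler.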
Chandra Seetharaman [Mon, 23 Jan 2012 17:31:25 +0000 (17:31 +0000)]
Define a new function xfs_this_quota_on()
Create a new function xfs_this_quota_on() that takes an xfs_mount
data structure and a disk quota type and returns true if the specified
type of quota is ON in the xfs_mount data structure.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
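A predicate of this kind might look like the sketch below. The qm_ names, the flag values and the qflags field are assumptions made for the example and are not taken from the XFS headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quota types and per-mount flag bits. */
enum qm_qtype { QM_QTYPE_USER, QM_QTYPE_GROUP };

#define QM_UQUOTA_ON	0x1u
#define QM_GQUOTA_ON	0x2u

struct qm_mount {
	unsigned int qflags;	/* bitmask of QM_*QUOTA_ON */
};

/* Return true if quota of the given type is on for this mount. */
bool qm_this_quota_on(const struct qm_mount *mp, enum qm_qtype type)
{
	assert(mp != NULL);

	switch (type) {
	case QM_QTYPE_USER:
		return (mp->qflags & QM_UQUOTA_ON) != 0;
	case QM_QTYPE_GROUP:
		return (mp->qflags & QM_GQUOTA_ON) != 0;
	default:
		return false;
	}
}
```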
Amit Sahrawat [Mon, 16 Jan 2012 12:24:36 +0000 (12:24 +0000)]
xfs: kill the unused XFS_BB_FSB_OFFSET macro
Remove the macro, as it is no longer needed in the code.
I tried to find where it was last used, but its usage
seems to have been dropped a long time ago.
Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Mitsuo Hayasaka [Fri, 13 Jan 2012 05:58:39 +0000 (05:58 +0000)]
xfs: show uuid when mount fails due to duplicate uuid
When a system tries to mount a filesystem (FS) by UUID, xfs
returns -EINVAL and shows a message if a FS with the same UUID is
already mounted. It is useful to output the duplicate UUID
with that message.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Mitsuo Hayasaka [Fri, 27 Jan 2012 06:37:26 +0000 (06:37 +0000)]
xfs: pass KM_SLEEP flag to kmem_realloc() in xlog_recover_add_to_cont_trans()
kmem_realloc() in xfs takes KM_* memory allocation flags and allocates
memory using kmalloc() after converting them to gfp_mask flags. In
xlog_recover_add_to_cont_trans(), 0u is passed to kmem_realloc() instead
of a KM_* flag. I guess it is preferable to use one, and here the memory
must be allocated but the allocation doesn't have to be done with
GFP_ATOMIC. So, this patch changes it to KM_SLEEP.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Jan Kara [Wed, 11 Jan 2012 18:52:10 +0000 (18:52 +0000)]
xfs: Fix missing xfs_iunlock() on error recovery path in xfs_readlink()
Commit b52a360b forgot to call xfs_iunlock() when it detected a corrupted
symlink and bailed out. Fix it by jumping to 'out' instead of doing return.
CC: stable@kernel.org
CC: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Alex Elder <elder@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
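The pattern behind this fix can be sketched in plain C. The ln_ names, the lock counter and the length limit are hypothetical and only stand in for the real inode lock and symlink check. Every exit taken after the lock is acquired goes through a single 'out' label that drops it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LN_MAX_LEN	1024	/* hypothetical sanity limit */

struct ln_node {
	int lock_depth;		/* 0 when unlocked */
	int len;		/* claimed link length */
};

static void ln_lock(struct ln_node *n)   { n->lock_depth++; }
static void ln_unlock(struct ln_node *n) { n->lock_depth--; }

int ln_readlink(struct ln_node *n)
{
	int error = 0;

	assert(n != NULL);
	ln_lock(n);

	if (n->len <= 0 || n->len > LN_MAX_LEN) {
		/* A bare 'return -EINVAL;' here would leak the lock. */
		error = -EINVAL;
		goto out;
	}

	/* ... normal work under the lock would go here ... */
out:
	ln_unlock(n);
	return error;
}
```

Both the corrupted and the valid case leave lock_depth at 0 on return, which is the property the original early return broke.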
Linus Torvalds [Thu, 19 Jan 2012 23:04:48 +0000 (15:04 -0800)]
Linux 3.3-rc1
Linus Torvalds [Thu, 19 Jan 2012 22:53:06 +0000 (14:53 -0800)]
Merge branches 'sched-urgent-for-linus', 'perf-urgent-for-linus' and 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/accounting, proc: Fix /proc/stat interrupts sum
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT
x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignore
x86/kprobes: Fix typo transferred from Intel manual
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits
x86, tsc: Fix SMI induced variation in quick_pit_calibrate()
x86, opcode: ANDN and Group 17 in x86-opcode-map.txt
x86/kconfig: Move the ZONE_DMA entry under a menu
x86/UV2: Add accounting for BAU strong nacks
x86/UV2: Ack BAU interrupt earlier
x86/UV2: Remove stale no-resources test for UV2 BAU
x86/UV2: Work around BAU bug
x86/UV2: Fix BAU destination timeout initialization
x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode
x86: Get rid of dubious one-bit signed bitfield
Linus Torvalds [Thu, 19 Jan 2012 22:52:03 +0000 (14:52 -0800)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6
gpio bug fixes for v3.3
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
gpio: tps65910: Use correct offset for gpio initialization
gpio/it8761e: Restrict it8761e gpio driver to x86.
gpio-ml-ioh: cleanup __iomem annotation usage
gpio-ml-ioh: cleanup NULL pointer checking
gpio-pch: cleanup __iomem annotation usage
gpio-pch: cleanup NULL pointer checking
Linus Torvalds [Thu, 19 Jan 2012 22:49:16 +0000 (14:49 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
qnx4: don't leak ->BitMap on late failure exits
qnx4: reduce the insane nesting in qnx4_checkroot()
qnx4: di_fname is an array, for crying out loud...
vfs: remove printk from set_nlink()
wake up s_wait_unfrozen when ->freeze_fs fails
H. Peter Anvin [Thu, 19 Jan 2012 20:41:25 +0000 (12:41 -0800)]
x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits
In checkin
303395ac3bf3 x86: Generate system call tables and unistd_*.h from tables
the feature macros in <asm/unistd.h> were unified between 32 and 64
bits. Unfortunately 32 bits requires __ARCH_WANT_SYS_IPC and this was
inadvertently dropped.
Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/CALLzPKbeXN5gdngo8uYYU8mAow=XhrwBFBhKfG811f37BubQOg@mail.gmail.com
H. Peter Anvin [Thu, 19 Jan 2012 20:56:50 +0000 (12:56 -0800)]
Merge remote-tracking branch 'linus/master' into x86/urgent
Linus Torvalds [Thu, 19 Jan 2012 19:46:08 +0000 (11:46 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Permit key_serial() to be called with a const key pointer
keys: fix user_defined key sparse messages
ima: fix cred sparse warning
MPILIB: Add a missing ENOMEM check
Al Viro [Thu, 19 Jan 2012 18:54:36 +0000 (13:54 -0500)]
qnx4: don't leak ->BitMap on late failure exits
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 19 Jan 2012 18:40:57 +0000 (13:40 -0500)]
qnx4: reduce the insane nesting in qnx4_checkroot()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 19 Jan 2012 18:19:42 +0000 (13:19 -0500)]
qnx4: di_fname is an array, for crying out loud...
(struct qnx4_inode_entry *)(bh->b_data + some_offset)->di_fname
is not going to be NULL, TYVM...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Wed, 18 Jan 2012 10:04:29 +0000 (10:04 +0000)]
KEYS: Permit key_serial() to be called with a const key pointer
Permit key_serial() to be called with a const key pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Mimi Zohar [Wed, 18 Jan 2012 10:03:14 +0000 (10:03 +0000)]
keys: fix user_defined key sparse messages
Replace the rcu_assign_pointer() calls with rcu_assign_keypointer().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Mimi Zohar [Wed, 18 Jan 2012 03:11:28 +0000 (22:11 -0500)]
ima: fix cred sparse warning
Fix ima_policy.c sparse "warning: dereference of noderef expression"
message, by accessing cred->uid using current_cred().
Changelog v1:
- Change __cred to just cred (based on David Howells' comment)
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Linus Torvalds [Thu, 19 Jan 2012 03:26:11 +0000 (19:26 -0800)]
uml: fix compile for x86-64
Randy Dunlap reports that we get
arch/x86/um/shared/sysdep/ptrace.h:7:20: error: redefinition of 'regs_return_value'
arch/x86/um/shared/sysdep/ptrace.h:7:20: note: previous definition of 'regs_return_value' was here
when compiling UML for x86-64.
Stephen Rothwell root-caused it and says:
"Caused by commit d7e7528bcd45 ("Audit: push audit success and retcode
into arch ptrace.h") (another patch that was never in linux-next :-().
This file now needs protection against double inclusion."
so let's do as the man says.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Analyzed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 18 Jan 2012 10:03:54 +0000 (10:03 +0000)]
MPILIB: Add a missing ENOMEM check
Add a missing ENOMEM check.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Linus Torvalds [Thu, 19 Jan 2012 00:29:42 +0000 (16:29 -0800)]
Merge branch 'for-next-merge' of git://git./linux/kernel/git/nab/target-pending
* 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
ib_srpt: Initial SRP Target merge for v3.3-rc1
Linus Torvalds [Wed, 18 Jan 2012 23:59:18 +0000 (15:59 -0800)]
Merge branch 'for-next' of git://git./linux/kernel/git/nab/target-pending
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (26 commits)
target: Set additional sense length field in sense data
target: Remove legacy device status check from transport_execute_tasks
target: Remove __transport_execute_tasks() for each processing context
target: Remove extra se_device->execute_task_lock access in fast path
target: Drop se_device TCQ queue_depth usage from I/O path
target: Fix possible NULL pointer with __transport_execute_tasks
target: Remove TFO->check_release_cmd() fabric API caller
tcm_fc: Convert ft_send_work to use target_submit_cmd
target: Add target_submit_cmd() for process context fabric submission
target: Make target_put_sess_cmd use target_release_cmd_kref
target: Set response format in INQUIRY response
target: tcm_mod_builder: small fixups
Documentation/target: Fix tcm_mod_builder.py build breakage
target: remove overaggressive ____cacheline_aligned annotations
tcm_loop: bump max_sectors
target/configs: remove trailing newline from udev_path and alias
iscsi-target: fix chap identifier simple_strtoul usage
target: remove useless casts
target: simplify target_check_cdb_and_preempt
target: Move core_scsi3_check_cdb_abort_and_preempt
...
Linus Torvalds [Wed, 18 Jan 2012 23:51:48 +0000 (15:51 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux
This includes initial support for the recently published ACPI 5.0 spec.
In particular, support for the "hardware-reduced" bit that eliminates
the dependency on legacy hardware.
APEI has patches resulting from testing on real hardware.
Plus other random fixes.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
intel_idle: Split up and provide per CPU initialization func
ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
ACPI processor: Remove unneeded cpuidle_unregister_driver call
intel idle: Make idle driver more robust
intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
intel_idle: remove redundant local_irq_disable() call
ACPI processor: Fix error path, also remove sysdev link
ACPI: processor: fix acpi_get_cpuid for UP processor
intel_idle: fix API misuse
ACPI APEI: Convert atomicio routines
ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
ACPI: Fix possible alignment issues with GAS 'address' references
ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
ACPI: Store SRAT table revision
ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
ACPI, Record ACPI NVS regions
ACPI, APEI, EINJ, Refine the fix of resource conflict
...
Stefan Berger [Wed, 18 Jan 2012 03:07:30 +0000 (22:07 -0500)]
tpm: fix (ACPI S3) suspend regression
This patch fixes an (ACPI S3) suspend regression introduced in commit
68d6e6713fcb ("tpm: Introduce function to poll for result of self test")
and occurring with an Infineon TPM and tpm_tis and tpm_infineon drivers
active.
The suspend problem occurred if the TPM was disabled and/or deactivated
and therefore the TPM_PCRRead checking the result of the (asynchronous)
self test returned an error code which then caused the tpm_tis driver to
become inactive and this then seemed to have negatively influenced the
suspend support by the tpm_infineon driver... Besides that the tpm_tis
driver may stay active even if the TPM is disabled and/or deactivated.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 18 Jan 2012 23:41:27 +0000 (15:41 -0800)]
nvme: fix merge error due to change of 'make_request_fn' fn type
The type of 'make_request_fn' changed in 5a7bbad27a4 ("block: remove
support for bio remapping from ->make_request"), but the merge of the
nvme driver didn't take that into account, and as a result the driver
would compile with a warning:
drivers/block/nvme.c: In function 'nvme_alloc_ns':
drivers/block/nvme.c:1336:2: warning: passing argument 2 of 'blk_queue_make_request' from incompatible pointer type [enabled by default]
include/linux/blkdev.h:830:13: note: expected 'void (*)(struct request_queue *, struct bio *)' but argument is of type 'int (*)(struct request_queue *, struct bio *)'
It's benign, but the warning is annoying.
Reported-by: Stephen Rothwell <sfr@canb.auug.org>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Rothwell [Wed, 18 Jan 2012 23:24:31 +0000 (10:24 +1100)]
xen: using EXPORT_SYMBOL requires including export.h
Fix these warnings:
drivers/xen/biomerge.c:14:1: warning: data definition has no type or storage class [enabled by default]
drivers/xen/biomerge.c:14:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int]
drivers/xen/biomerge.c:14:1: warning: parameter names (without types) in function declaration [enabled by default]
And this build error:
ERROR: "xen_biovec_phys_mergeable" [drivers/block/nvme.ko] undefined!
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 18 Jan 2012 21:46:13 +0000 (13:46 -0800)]
Merge branch 'for-linus/i2c-33' of git://git.fluff.org/bjdooks/linux
* 'for-linus/i2c-33' of git://git.fluff.org/bjdooks/linux:
i2c-eg20t: Change-company-name-OKI-SEMICONDUCTOR to LAPIS Semiconductor
i2c-eg20t: Support new device LAPIS Semiconductor ML7831 IOH
i2c-eg20t: modified the setting of transfer rate.
i2c-eg20t: use i2c_add_numbered_adapter to get a fixed bus number
i2c: OMAP: Add DT support for i2c controller
I2C: OMAP: NACK without STP
I2C: OMAP: correct SYSC register offset for OMAP4
Linus Torvalds [Wed, 18 Jan 2012 20:53:54 +0000 (12:53 -0800)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (57 commits)
[media] as3645a: Fix compilation by including slab.h
[media] s5p-fimc: Remove linux/version.h include from fimc-mdevice.c
[media] s5p-mfc: Remove linux/version.h include from s5p_mfc.c
[media] ds3000: using logical && instead of bitwise &
[media] v4l2-ctrls: make control names consistent
[media] DVB: dib0700, add support for Nova-TD LEDs
[media] DVB: dib0700, add corrected Nova-TD frontend_attach
[media] DVB: dib0700, separate stk7070pd initialization
[media] DVB: dib0700, move Nova-TD Stick to a separate set
[media] : add MODULE_FIRMWARE to dib0700
[media] DVB-CORE: remove superfluous DTV_CMDs
[media] s5p-jpeg: adapt to recent videobuf2 changes
[media] s5p-g2d: fixed a bug in controls setting function
[media] s5p-mfc: Fix volatile controls setup
[media] drivers/media/video/s5p-mfc/s5p_mfc.c: adjust double test
[media] drivers/media/video/s5p-fimc/fimc-capture.c: adjust double test
[media] s5p-fimc: Fix incorrect control ID assignment
[media] dvb_frontend: Don't call get_frontend() if idle
[media] DocBook/dvbproperty.xml: Remove DTV_MODULATION from ISDB-T
[media] DocBook/dvbproperty.xml: Fix ISDB-T delivery system parameters
...
Linus Torvalds [Wed, 18 Jan 2012 20:53:36 +0000 (12:53 -0800)]
Merge branch 'fix/asoc' of git://git./linux/kernel/git/tiwai/sound
* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: Wait for WM8993 FLL to stabilise
ASoC: core - Free platform DAPM context at platform removal.
ASoC: dapm - Fix check for codec context in dapm_power_widgets().
ASoC: sgtl5000: update author email address
ASoC: Fix DMA channel leak in imx-pcm-dma-mx2 driver.
Laxman Dewangan [Wed, 18 Jan 2012 14:37:35 +0000 (20:07 +0530)]
gpio: tps65910: Use correct offset for gpio initialization
Using the correct gpio offset for setting the initial value
of gpio when setting output direction.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Linus Torvalds [Wed, 18 Jan 2012 20:35:17 +0000 (12:35 -0800)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi-misc-2.6
SCSI updates on 20120118
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (49 commits)
[SCSI] libfc: remove redundant timer init for fcp
[SCSI] fcoe: Move fcoe_debug_logging from fcoe.h to fcoe.c
[SCSI] libfc: Declare local functions static
[SCSI] fcoe: fix regression on offload em matching function for initiator/target
[SCSI] qla4xxx: Update driver version to 5.02.00-k12
[SCSI] qla4xxx: Cleanup modinfo display
[SCSI] qla4xxx: Update license
[SCSI] qla4xxx: Clear the RISC interrupt bit during FW init
[SCSI] qla4xxx: Added error logging for firmware abort
[SCSI] qla4xxx: Disable generating pause frames in case of FW hung
[SCSI] qla4xxx: Temperature monitoring for ISP82XX core.
[SCSI] megaraid: fix sparse warnings
[SCSI] sg: convert to kstrtoul_from_user()
[SCSI] don't change sdev starvation list order without request dispatched
[SCSI] isci: fix, prevent port from getting stuck in the 'configuring' state
[SCSI] isci: fix start OOB
[SCSI] isci: fix io failures while wide port links are coming up
[SCSI] isci: allow more time for wide port targets
[SCSI] isci: enable wide port targets
[SCSI] isci: Fix IO fails when pull cable from phy in x4 wideport in MPC mode.
...
Linus Torvalds [Wed, 18 Jan 2012 20:34:09 +0000 (12:34 -0800)]
Merge git://git.infradead.org/users/willy/linux-nvme
* git://git.infradead.org/users/willy/linux-nvme: (105 commits)
NVMe: Set number of queues correctly
NVMe: Version 0.8
NVMe: Set queue flags correctly
NVMe: Simplify nvme_unmap_user_pages
NVMe: Mark the end of the sg list
NVMe: Fix DMA mapping for admin commands
NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT
NVMe: Merge the nvme_bio and nvme_prp data structures
NVMe: Change nvme_completion_fn to take a dev
NVMe: Change get_nvmeq to take a dev instead of a namespace
NVMe: Simplify completion handling
NVMe: Update Identify Controller data structure
NVMe: Implement doorbell stride capability
NVMe: Version 0.7
NVMe: Don't probe namespace 0
Fix calculation of number of pages in a PRP List
NVMe: Create nvme_identify and nvme_get_features functions
NVMe: Fix memory leak in nvme_dev_add()
NVMe: Fix calls to dma_unmap_sg
NVMe: Correct sg list setup in nvme_map_user_pages
...
Linus Torvalds [Wed, 18 Jan 2012 06:26:41 +0000 (22:26 -0800)]
Merge git://git./linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
tg3: Fix single-vector MSI-X code
openvswitch: Fix multipart datapath dumps.
ipv6: fix per device IP snmp counters
inetpeer: initialize ->redirect_genid in inet_getpeer()
net: fix NULL-deref in WARN() in skb_gso_segment()
net: WARN if skb_checksum_help() is called on skb requiring segmentation
caif: Remove bad WARN_ON in caif_dev
caif: Fix typo in Vendor/Product-ID for CAIF modems
bnx2x: Disable AN KR work-around for BCM57810
bnx2x: Remove AutoGrEEEn for BCM84833
bnx2x: Remove 100Mb force speed for BCM84833
bnx2x: Fix PFC setting on BCM57840
bnx2x: Fix Super-Isolate mode for BCM84833
net: fix some sparse errors
net: kill duplicate included header
net: sh-eth: Fix build error by the value which is not defined
net: Use device model to get driver name in skb_gso_segment()
bridge: BH already disabled in br_fdb_cleanup()
net: move sock_update_memcg outside of CONFIG_INET
mwl8k: Fixing Sparse ENDIAN CHECK warning
...
Len Brown [Wed, 18 Jan 2012 06:15:54 +0000 (01:15 -0500)]
Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into release
Tony Luck [Tue, 17 Jan 2012 20:10:16 +0000 (12:10 -0800)]
acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
ACPI 5.0 provides extensions to the EINJ mechanism to specify the
target for the error injection - by APICID for cpu related errors,
by address for memory related errors, and by segment/bus/device/function
for PCIe related errors. Also extensions for vendor specific error
injections.
Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Wed, 18 Jan 2012 05:46:30 +0000 (00:46 -0500)]
Merge branch 'atomicio-remove' into release
Len Brown [Wed, 18 Jan 2012 05:18:10 +0000 (00:18 -0500)]
Merge branch 'apei' into release
Thomas Renninger [Tue, 17 Jan 2012 21:40:08 +0000 (22:40 +0100)]
intel_idle: Split up and provide per CPU initialization func
Function split up, should have no functional change.
Provides entry point for physically hotplugged CPUs
to initialize and activate cpuidle.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
CC: Shaohua Li <shaohua.li@intel.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Thomas Renninger [Tue, 17 Jan 2012 21:40:07 +0000 (22:40 +0100)]
ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
V2: Fix typo: pr->handle -> pr, here: acpi_processor_hotadd_init(pr)
This is a very small part taken from patches which afaik
are coming from Yunhong Jiang (for a Xen not a Linus repo?).
Cleanup only: no functional change.
Advantage (besides cleanup) is that other data of the pr (acpi_processor) struct
in the acpi_processor_hotadd_init() is needed later, for example a newly
introduced flag:
pr->flags.need_hotplug_init
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Matt Carlson [Tue, 17 Jan 2012 15:27:23 +0000 (15:27 +0000)]
tg3: Fix single-vector MSI-X code
Kdump kernels leave MSI-X interrupts (as set up by the crashed kernel)
enabled. However, kdump only enables one CPU in the new environment,
thus causing tg3 to abort MSI-X setup. When the driver attempts to
enable INTA or MSI interrupt modes on a kdump kernel, interrupt
delivery fails.
This patch attempts to work around the problem by forcing the driver
to enable a single MSI-X interrupt. In such a configuration, the
device's multivector interrupt mode must be disabled.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Pfaff [Tue, 17 Jan 2012 13:33:39 +0000 (13:33 +0000)]
openvswitch: Fix multipart datapath dumps.
The logic to split up the list of datapaths into multiple Netlink messages
was simply wrong, causing the list to be terminated after the first part.
Only about the first 50 datapaths would be dumped. This fixes the
problem.
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 17 Jan 2012 12:45:36 +0000 (12:45 +0000)]
ipv6: fix per device IP snmp counters
In commit 4ce3c183fca (snmp: 64bit ipstats_mib for all arches), I forgot
to change the /proc/net/dev_snmp6/xxx output for IP counters.
percpu array is 64bit per counter but the folding still used the 'long'
variant, and output garbage on 32bit arches.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
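One way a width mismatch like this produces wrong numbers can be sketched as follows. This is an illustration under assumptions, not the kernel's folding code: read_wide() and read_narrow() are hypothetical, and uint32_t models a 32-bit 'long'. A reader that steps through 64-bit counters in 32-bit slots lands on the wrong bytes for every index after the first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Correct: each counter occupies one 64-bit slot. */
uint64_t read_wide(const uint64_t *mib, size_t idx)
{
	assert(mib != NULL);
	return mib[idx];
}

/*
 * Mismatched: treats the same memory as an array of 32-bit slots,
 * so index 1 points into the first counter, not at the second one.
 */
uint32_t read_narrow(const uint64_t *mib, size_t idx)
{
	uint32_t v;

	assert(mib != NULL);
	memcpy(&v, (const unsigned char *)mib + idx * sizeof(v), sizeof(v));
	return v;
}
```

For counters {1, 2}, read_wide() returns 2 for index 1, while read_narrow() returns one half of the first counter instead (which half depends on byte order).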
Linus Torvalds [Wed, 18 Jan 2012 02:55:56 +0000 (18:55 -0800)]
Merge tag 'arm-soc-fixes' of git://git./linux/kernel/git/arm/arm-soc
ARM: fixes for ARM platforms
Some fallout from the 3.3 merge window as well as a couple of bug fixes
for older preexisting bugs that seem valid to include at this time:
* sched_clock changes broke picoxcell, fix included
* BSYM bugs causing issues with thumb2-built kernels on SMP
* Missing module.h include on msm.
* A collection of bugfixes for samsung platforms that didn't make it into
the first pull requests.
* tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: make BSYM macro assembly only
ARM: highbank: remove incorrect BSYM usage
ARM: imx: remove incorrect BSYM usage
ARM: exynos: remove incorrect BSYM usage
ARM: ux500: add missing ENDPROC to headsmp.S
ARM: msm: Add missing ENDPROC to headsmp.S
ARM: versatile: Add missing ENDPROC to headsmp.S
ARM: EXYNOS: Invert VCLK polarity for framebuffer on ORIGEN
ARM: S3C64XX: Fix interrupt configuration for PCA935x on Cragganmore
ARM: S3C64XX: Fix the memory mapped GPIOs on Cragganmore
ARM: S3C64XX: Remove hsmmc1 from Cragganmore
ARM: S3C64XX: Remove unconditional power domain disables
ARM: SAMSUNG: Declare struct platform_device in plat/s3c64xx-spi.h
ARM: SAMSUNG: dma-ops.h needs mach/dma.h
ARM: SAMSUNG: Guard against multiple inclusion of plat/dma.h
ARM: picoxcell: fix sched_clock() cleanup fallout
ARM: msm: vreg is a module and so needs module.h
Linus Torvalds [Wed, 18 Jan 2012 02:40:24 +0000 (18:40 -0800)]
Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (53 commits)
ARM: mach-shmobile: specify CHCLR registers on SH7372
dma: shdma: fix runtime PM: clear channel buffers on reset
dma/imx-sdma: save irq flags when use spin_lock in sdma_tx_submit
dmaengine/ste_dma40: clear LNK on channel startup
dmaengine: intel_mid_dma: remove legacy pm interface
ASoC: mxs: correct 'direction' of device_prep_dma_cyclic
dmaengine: intel_mid_dma: error path fix
dmaengine: intel_mid_dma: locking and freeing fixes
mtd: gpmi-nand: move to dma_transfer_direction
mtd: fix compile error for gpmi-nand
mmc: mxs-mmc: fix the dma_transfer_direction migration
dmaengine: add DMA_TRANS_NONE to dma_transfer_direction
dma: mxs-dma: Don't use CLKGATE bits in CTRL0 to disable DMA channels
dma: mxs-dma: make mxs_dma_prep_slave_sg() multi user safe
dma: mxs-dma: Always leave mxs_dma_init() with the clock disabled.
dma: mxs-dma: fix a typo in comment
DMA: PL330: Remove pm_runtime_xxx calls from pl330 probe/remove
video i.MX IPU: Fix display connections
i.MX IPU DMA: Fix wrong burstsize settings
dmaengine/ste_dma40: allow fixed physical channel
...
Fix up conflicts in drivers/dma/{Kconfig,mxs-dma.c,pl330.c}
The conflicts looked pretty trivial, but I'll ask people to verify them.
Linus Torvalds [Wed, 18 Jan 2012 02:11:38 +0000 (18:11 -0800)]
Merge branch 'upstream-linus' of git://github.com/jgarzik/libata-dev
* 'upstream-linus' of git://github.com/jgarzik/libata-dev:
[libata] ata_piix: Add Toshiba Satellite Pro A120 to the quirks list due to broken suspend functionality.
[libata] add DVRTD08A and DVR-215 to NOSETXFER device quirk list
[libata] pata_bf54x: Support sg list in bmdma transfer.
[libata] sata_fsl: fix the controller operating mode
[libata] enable ata port async suspend
Al Viro [Wed, 18 Jan 2012 01:51:22 +0000 (01:51 +0000)]
x86-32: Fix build failure with AUDIT=y, AUDITSYSCALL=n
JONGMAN HEO reports:
With current linus git (commit a25a2b84), I got following build error,
arch/x86/kernel/vm86_32.c: In function 'do_sys_vm86':
arch/x86/kernel/vm86_32.c:340: error: implicit declaration of function '__audit_syscall_exit'
make[3]: *** [arch/x86/kernel/vm86_32.o] Error 1
OK, I can reproduce it (32bit allmodconfig with AUDIT=y, AUDITSYSCALL=n)
It's due to commit d7e7528bcd45: "Audit: push audit success and retcode
into arch ptrace.h".
Reported-by: JONGMAN HEO <jongman.heo@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Benjamin Larsson [Sat, 7 Jan 2012 23:39:10 +0000 (00:39 +0100)]
[libata] ata_piix: Add Toshiba Satellite Pro A120 to the quirks list
due to broken suspend functionality.
Signed-off-by: Benjamin Larsson <benjamin@southpole.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Vladimir LAVALLADE [Sun, 8 Jan 2012 12:50:13 +0000 (13:50 +0100)]
[libata] add DVRTD08A and DVR-215 to NOSETXFER device quirk list
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>