GitHub/LineageOS/android_kernel_motorola_exynos9610.git
14 years agoxfs: log changed inodes instead of writing them synchronously
Christoph Hellwig [Tue, 9 Feb 2010 00:43:49 +0000 (11:43 +1100)]
xfs: log changed inodes instead of writing them synchronously

When an inode has already be flushed delayed write,
xfs_inode_clean() returns true and hence xfs_fs_write_inode() can
return on a synchronous inode write without having written the
inode. Currently these sycnhronous writes only come sync(1),
unmount, a sycnhronous NFS export and cachefiles so should be
relatively rare and out of common performance paths.

Realistically, a synchronous inode write is not necessary here; we
can avoid writing the inode by logging any non-transactional changes
that are pending.  This needs to be done with synchronous
transactions, but it avoids seeking between the log and inode
clusters as we do now. We don't force the log if the inode is
pinned, though, so this differs from the fsync case.  For normal
sys_sync and unmount behaviour this is fine because we do a
synchronous log force in xfs_sync_data which is called from the
->sync_fs code.

It does however break the NFS synchronous export guarantees for now,
but work is under way to fix this at a higher level or for the
higher level to provide an additional flag in the writeback control
to tell us that a log force is needed.

Portions of this patch are based on work from Dave Chinner.

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
14 years agoxfs: remove invalid barrier optimization from xfs_fsync
Christoph Hellwig [Mon, 1 Feb 2010 23:16:26 +0000 (10:16 +1100)]
xfs: remove invalid barrier optimization from xfs_fsync

We always need to flush the disk write cache and can't skip it just because
the no inode attributes have changed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
14 years agoxfs: kill the unused XFS_QMOPT_* flush flags V2
Dave Chinner [Wed, 3 Feb 2010 22:48:58 +0000 (09:48 +1100)]
xfs: kill the unused XFS_QMOPT_* flush flags V2

dquots are never flushed asynchronously. Remove the flag and the
async write support from the flush function. Make the default flush
a delwri flush to make the inode flush code, which leaves the
XFS_QMOPT_SYNC the only flag remaining.  Convert that to use
SYNC_WAIT instead, just like the inode flush code.

V2:
- just pass flush flags straight through

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: Use delay write promotion for dquot flushing
Dave Chinner [Tue, 26 Jan 2010 04:13:41 +0000 (15:13 +1100)]
xfs: Use delay write promotion for dquot flushing

xfs_qm_dqflock_pushbuf_wait() does a very similar trick to item
pushing used to do to flush out delayed write dquot buffers. Change
it to use the new promotion method rather than an async flush.

Also, xfs_qm_dqflock_pushbuf_wait() can return without the flush lock
held, yet the callers make the assumption that after this call the
flush lock is held. Always return with the flush lock held.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: Sort delayed write buffers before dispatch
Dave Chinner [Tue, 26 Jan 2010 04:13:25 +0000 (15:13 +1100)]
xfs: Sort delayed write buffers before dispatch

Currently when the xfsbufd writes delayed write buffers, it pushes
them to disk in the order they come off the delayed write list. If
there are lots of buffers ѕpread widely over the disk, this results
in overwhelming the elevator sort queues in the block layer and we
end up losing the posibility of merging adjacent buffers to minimise
the number of IOs.

Use the new generic list_sort function to sort the delwri dispatch
queue before issue to ensure that the buffers are pushed in the most
friendly order possible to the lower layers.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: Don't issue buffer IO direct from AIL push V2
Dave Chinner [Mon, 1 Feb 2010 23:13:42 +0000 (10:13 +1100)]
xfs: Don't issue buffer IO direct from AIL push V2

All buffers logged into the AIL are marked as delayed write.
When the AIL needs to push the buffer out, it issues an async write of the
buffer. This means that IO patterns are dependent on the order of
buffers in the AIL.

Instead of flushing the buffer, promote the buffer in the delayed
write list so that the next time the xfsbufd is run the buffer will
be flushed by the xfsbufd. Return the state to the xfsaild that the
buffer was promoted so that the xfsaild knows that it needs to cause
the xfsbufd to run to flush the buffers that were promoted.

Using the xfsbufd for issuing the IO allows us to dispatch all
buffer IO from the one queue. This means that we can make much more
enlightened decisions on what order to flush buffers to disk as
we don't have multiple places issuing IO. Optimisations to xfsbufd
will be in a future patch.

Version 2
- kill XFS_ITEM_FLUSHING as it is now unused.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: Use delayed write for inodes rather than async V2
Dave Chinner [Sat, 6 Feb 2010 01:39:36 +0000 (12:39 +1100)]
xfs: Use delayed write for inodes rather than async V2

We currently do background inode flush asynchronously, resulting in
inodes being written in whatever order the background writeback
issues them. Not only that, there are also blocking and non-blocking
asynchronous inode flushes, depending on where the flush comes from.

This patch completely removes asynchronous inode writeback. It
removes all the strange writeback modes and replaces them with
either a synchronous flush or a non-blocking delayed write flush.
That is, inode flushes will only issue IO directly if they are
synchronous, and background flushing may do nothing if the operation
would block (e.g. on a pinned inode or buffer lock).

Delayed write flushes will now result in the inode buffer sitting in
the delwri queue of the buffer cache to be flushed by either an AIL
push or by the xfsbufd timing out the buffer. This will allow
accumulation of dirty inode buffers in memory and allow optimisation
of inode cluster writeback at the xfsbufd level where we have much
greater queue depths than the block layer elevators. We will also
get adjacent inode cluster buffer IO merging for free when a later
patch in the series allows sorting of the delayed write buffers
before dispatch.

This effectively means that any inode that is written back by
background writeback will be seen as flush locked during AIL
pushing, and will result in the buffers being pushed from there.
This writeback path is currently non-optimal, but the next patch
in the series will fix that problem.

A side effect of this delayed write mechanism is that background
inode reclaim will no longer directly flush inodes, nor can it wait
on the flush lock. The result is that inode reclaim must leave the
inode in the reclaimable state until it is clean. Hence attempts to
reclaim a dirty inode in the background will simply skip the inode
until it is clean and this allows other mechanisms (i.e. xfsbufd) to
do more optimal writeback of the dirty buffers. As a result, the
inode reclaim code has been rewritten so that it no longer relies on
the ambiguous return values of xfs_iflush() to determine whether it
is safe to reclaim an inode.

Portions of this patch are derived from patches by Christoph
Hellwig.

Version 2:
- cleanup reclaim code as suggested by Christoph
- log background reclaim inode flush errors
- just pass sync flags to xfs_iflush

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: Make inode reclaim states explicit
Dave Chinner [Sat, 6 Feb 2010 01:37:26 +0000 (12:37 +1100)]
xfs: Make inode reclaim states explicit

A.K.A.: don't rely on xfs_iflush() return value in reclaim

We have gradually been moving checks out of the reclaim code because
they are duplicated in xfs_iflush(). We've had a history of problems
in this area, and many of them stem from the overloading of the
return values from xfs_iflush() and interaction with inode flush
locking to determine if the inode is safe to reclaim.

With the desire to move to delayed write flushing of inodes and
non-blocking inode tree reclaim walks, the overloading of the
return value of xfs_iflush makes it very difficult to determine
the correct thing to do next.

This patch explicitly re-adds the checks to the inode reclaim code,
removing the reliance on the return value of xfs_iflush() to
determine what to do next. It also means that we can clearly
document all the inode states that reclaim must handle and hence
we can easily see that we handled all the necessary cases.

This also removes the need for the xfs_inode_clean() check in
xfs_iflush() as all callers now check this first (safely).

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: more reserved blocks fixups
Eric Sandeen [Fri, 5 Feb 2010 22:59:53 +0000 (22:59 +0000)]
xfs: more reserved blocks fixups

This mangles the reserved blocks counts a little more.

1) add a helper function for the default reserved count
2) add helper functions to save/restore counts on ro/rw
3) save/restore reserved blocks on freeze/thaw
4) disallow changing reserved count while readonly

V2: changed field name to match Dave's changes

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: turn off sign warnings
Dave Chinner [Tue, 26 Jan 2010 04:10:15 +0000 (15:10 +1100)]
xfs: turn off sign warnings

Because they cause warnings in static inline functions conditionally
compiled into XFS from the VFS (e.g. fsnotify).

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: don't hold onto reserved blocks on remount,ro
Dave Chinner [Tue, 26 Jan 2010 04:08:49 +0000 (15:08 +1100)]
xfs: don't hold onto reserved blocks on remount,ro

If we hold onto reserved blocks when doing a remount,ro we end
up writing the blocks used count to disk that includes the reserved
blocks. Reserved blocks are not actually used, so this results in
the values in the superblock being incorrect.

Hence if we run xfs_check or xfs_repair -n while the filesystem is
mounted remount,ro we end up with an inconsistent filesystem being
reported. Also, running xfs_copy on the remount,ro filesystem will
result in an inconsistent image being generated.

To fix this, unreserve the blocks when doing the remount,ro, and
reserved them again on remount,rw. This way a remount,ro filesystem
will appear consistent on disk to all utilities.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: quota limit statvfs available blocks
Christoph Hellwig [Thu, 21 Jan 2010 11:17:20 +0000 (11:17 +0000)]
xfs: quota limit statvfs available blocks

A "df" run on an NFS client of an exported XFS file system reports
the wrong information for "available" blocks.  When a block quota is
enforced, the amount reported as free is limited by the quota, but
the amount reported available is not (and should be).

Reported-by: Guk-Bong, Kwon <gbkwon@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: replace KM_LARGE with explicit vmalloc use
Christoph Hellwig [Wed, 20 Jan 2010 21:55:30 +0000 (21:55 +0000)]
xfs: replace KM_LARGE with explicit vmalloc use

We use the KM_LARGE flag to make kmem_alloc and friends use vmalloc
if necessary.  As we only need this for a few boot/mount time
allocations just switch to explicit vmalloc calls there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: cleanup up xfs_log_force calling conventions
Christoph Hellwig [Tue, 19 Jan 2010 09:56:46 +0000 (09:56 +0000)]
xfs: cleanup up xfs_log_force calling conventions

Remove the XFS_LOG_FORCE argument which was always set, and the
XFS_LOG_URGE define, which was never used.

Split xfs_log_force into a two helpers - xfs_log_force which forces
the whole log, and xfs_log_force_lsn which forces up to the
specified LSN.  The underlying implementations already were entirely
separate, as were the users.

Also re-indent the new _xfs_log_force/_xfs_log_force which
previously had a weird coding style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: kill XLOG_VEC_SET_TYPE
Christoph Hellwig [Tue, 19 Jan 2010 09:56:45 +0000 (09:56 +0000)]
xfs: kill XLOG_VEC_SET_TYPE

This macro only obsfucates the log item type assignments, so kill it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: remove duplicate buffer flags
Christoph Hellwig [Tue, 19 Jan 2010 09:56:44 +0000 (09:56 +0000)]
xfs: remove duplicate buffer flags

Currently we define aliases for the buffer flags in various
namespaces, which only adds confusion.  Remove all but the XBF_
flags to clean this up a bit.

Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
uses, but I'll clean that up later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: implement quota warnings via netlink
Christoph Hellwig [Sun, 17 Jan 2010 22:36:19 +0000 (22:36 +0000)]
xfs: implement quota warnings via netlink

Wire up quota_send_warning to send quota warnings over netlink.
This is used by various desktops to show user quota warnings.

Tested by running the quota_nld daemon while running the xfstest
quota tests and observing the warnings.  I'll see how I can get a
more formal testcase for it written.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: clean up error handling in xfs_trans_dqresv
Christoph Hellwig [Wed, 13 Jan 2010 22:05:49 +0000 (22:05 +0000)]
xfs: clean up error handling in xfs_trans_dqresv

Move the error code selection after the goto label and fold the
xfs_quota_error helper into it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: kill XFS_QMOPT_ASYNC
Christoph Hellwig [Wed, 13 Jan 2010 22:05:48 +0000 (22:05 +0000)]
xfs: kill XFS_QMOPT_ASYNC

The option is unused and one of the few remaining users of
xfs_bawrite, so let's get rid of it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: rearrange xfs_mod_sb() to avoid array subscript warning
Dave Chinner [Wed, 20 Jan 2010 01:04:53 +0000 (12:04 +1100)]
xfs: rearrange xfs_mod_sb() to avoid array subscript warning

gcc warns of an array subscript out of bounds in xfs_mod_sb().
The code is written in such a way that if the array subscript is
out of bounds, then it will assert fail. Rearrange the code to
avoid the bounds check warning.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: suppress spurious uninitialised var warning in xfs_bmapi()
Dave Chinner [Tue, 19 Jan 2010 23:50:06 +0000 (10:50 +1100)]
xfs: suppress spurious uninitialised var warning in xfs_bmapi()

Initialise the xfs_bmalloca_t structure to zero to avoid uninitialised
variable warnings. This is done by zeroing the arg structure rather than
using the uninitialised_var() trick so we know for certain that the
structure is correctly initialised as xfs_bmapi is a very complex
function and it is difficult to prove warnings are spurious.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: make compile warn about char sign mismatches again
Dave Chinner [Tue, 19 Jan 2010 23:49:18 +0000 (10:49 +1100)]
xfs: make compile warn about char sign mismatches again

The -fno-unsigned-char directive has no effect anymore as the
XFs build is clean. However, the kernel build hides pointer sign
differences so turn that back on so that we can clean up all the
mismatches prior to a userspace code resync.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: clean up sign warnings in dir2 code
Dave Chinner [Tue, 19 Jan 2010 23:48:05 +0000 (10:48 +1100)]
xfs: clean up sign warnings in dir2 code

We are now consistently using unsigned char strings for names
so fix up the remaining warnings in the dir2 code to complete
the cleanup.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: convert attr to use unsigned names
Dave Chinner [Tue, 19 Jan 2010 23:47:48 +0000 (10:47 +1100)]
xfs: convert attr to use unsigned names

To be consistent with the directory code, the attr code should use
unsigned names. Convert the names from the vfs at the highest level
to unsigned, and ænsure they are consistenly used as unsigned down
to disk.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: xfs_buf_iomove() doesn't care about signedness
Dave Chinner [Tue, 19 Jan 2010 23:47:39 +0000 (10:47 +1100)]
xfs: xfs_buf_iomove() doesn't care about signedness

xfs_buf_iomove() uses xfs_caddr_t as it's parameter types, but it doesn't
care about the signedness of the variables as it is just copying the
data. Change the prototype to use void * so that we don't get sign
warnings at call sites.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: make xfs_dir_cilookup_result use unsigned char
Dave Chinner [Tue, 19 Jan 2010 23:47:25 +0000 (10:47 +1100)]
xfs: make xfs_dir_cilookup_result use unsigned char

For consistency with the result of the code.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: convert dirnameops to unsigned char names
Dave Chinner [Tue, 19 Jan 2010 23:47:17 +0000 (10:47 +1100)]
xfs: convert dirnameops to unsigned char names

To be consistent across the codebase, convert the dirnameops to pass
the directory names by unsigned char strings.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: convert DM ops to use unsigned char names
Dave Chinner [Tue, 19 Jan 2010 23:47:08 +0000 (10:47 +1100)]
xfs: convert DM ops to use unsigned char names

dmops uses a signed char for it's namespace event. To be consistent
with the rest of the code, convert them to unsigned char for the
namespace string.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: directory names are unsigned
Dave Chinner [Tue, 19 Jan 2010 23:44:58 +0000 (10:44 +1100)]
xfs: directory names are unsigned

Convert the struct xfs_name to use unsigned chars for the name
strings to match both what is stored on disk (__uint8_t) and what
the VFS expects (unsigned char).

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
15 years agoxfs: move more buffer helpers into xfs_buf.c
Christoph Hellwig [Wed, 13 Jan 2010 22:17:56 +0000 (22:17 +0000)]
xfs: move more buffer helpers into xfs_buf.c

Move xfsbdstrat and xfs_bdstrat_cb from xfs_lrw.c and xfs_bioerror
and xfs_bioerror_relse from xfs_rw.c into xfs_buf.c.  This also
means xfs_bioerror and xfs_bioerror_relse can be marked static now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: clean up xfs_bwrite
Christoph Hellwig [Wed, 13 Jan 2010 22:17:58 +0000 (22:17 +0000)]
xfs: clean up xfs_bwrite

Fold XFS_bwrite into it's only caller, xfs_bwrite and move it into
xfs_buf.c instead of leaving it as a fairly large inline function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: clean up log buffer writes
Christoph Hellwig [Wed, 13 Jan 2010 22:17:57 +0000 (22:17 +0000)]
xfs: clean up log buffer writes

Don't bother using XFS_bwrite as it doesn't provide much code for
our use case.  Instead opencode it and fold xlog_bdstrat_cb into the
new xlog_bdstrat helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: embed the pagb_list array in the perag structure
Dave Chinner [Mon, 11 Jan 2010 11:47:49 +0000 (11:47 +0000)]
xfs: embed the pagb_list array in the perag structure

Now that the perag structure is allocated memory rather than held in
an array, we don't need to have the busy extent array external to
the structure. Embed it into the perag structure to avoid needing an
extra allocation when setting up.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: handle ENOMEM correctly during initialisation of perag structures
Dave Chinner [Mon, 11 Jan 2010 11:47:48 +0000 (11:47 +0000)]
xfs: handle ENOMEM correctly during initialisation of perag structures

Add proper error handling in case an error occurs while initializing
new perag structures for a mount point.  The mount structure is
restored to its previous state by deleting and freeing any perag
structures added during the call.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Kill filestreams cache flush
Dave Chinner [Mon, 11 Jan 2010 11:47:47 +0000 (11:47 +0000)]
xfs: Kill filestreams cache flush

The filestreams cache flush is not needed in the sync code as it
does not affect data writeback, and it is now not used by the growfs
code, either, so kill it.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Add trace points for per-ag refcount debugging.
Dave Chinner [Mon, 11 Jan 2010 11:47:46 +0000 (11:47 +0000)]
xfs: Add trace points for per-ag refcount debugging.

Uninline xfs_perag_{get,put} so that tracepoints can be inserted
into them to speed debugging of reference count problems.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Reference count per-ag structures
Dave Chinner [Mon, 11 Jan 2010 11:47:45 +0000 (11:47 +0000)]
xfs: Reference count per-ag structures

Reference count the per-ag structures to ensure that we keep get/put
pairs balanced. Assert that the reference counts are zero at unmount
time to catch leaks. In future, reference counts will enable us to
safely remove perag structures by allowing us to detect when they
are no longer in use.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Replace per-ag array with a radix tree
Dave Chinner [Mon, 11 Jan 2010 11:47:44 +0000 (11:47 +0000)]
xfs: Replace per-ag array with a radix tree

The use of an array for the per-ag structures requires reallocation
of the array when growing the filesystem. This requires locking
access to the array to avoid use after free situations, and the
locking is difficult to get right. To avoid needing to reallocate an
array, change the per-ag structures to an allocated object per ag
and index them using a tree structure.

The AGs are always densely indexed (hence the use of an array), but
the number supported is 2^32 and lookups tend to be random and hence
indexing needs to scale. A simple choice is a radix tree - it works
well with this sort of index.  This change also removes another
large contiguous allocation from the mount/growfs path in XFS.

The growing process now needs to change to only initialise the new
AGs required for the extra space, and as such only needs to
exclusively lock the tree for inserts. The rest of the code only
needs to lock the tree while doing lookups, and hence this will
remove all the deadlocks that currently occur on the m_perag_lock as
it is now an innermost lock. The lock is also changed to a spinlock
from a read/write lock as the hold time is now extremely short.

To complete the picture, the per-ag structures will need to be
reference counted to ensure that we don't free/modify them while
they are still in use.  This will be done in subsequent patch.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: convert remaining direct references to m_perag
Dave Chinner [Mon, 11 Jan 2010 11:47:43 +0000 (11:47 +0000)]
xfs: convert remaining direct references to m_perag

Convert the remaining direct lookups of the per ag structures to use
get/put accesses. Ensure that the loops across AGs and prior users
of the interface balance gets and puts correctly.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Convert filestreams code to use per-ag get/put routines
Dave Chinner [Mon, 11 Jan 2010 11:47:42 +0000 (11:47 +0000)]
xfs: Convert filestreams code to use per-ag get/put routines

Use xfs_perag_get() and xfs_perag_put() in the filestreams code.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Don't directly reference m_perag in allocation code
Dave Chinner [Mon, 11 Jan 2010 11:47:41 +0000 (11:47 +0000)]
xfs: Don't directly reference m_perag in allocation code

Start abstracting the perag references so that the indexing of the
structures is not directly coded into all the places that uses the
perag structures. This will allow us to separate the use of the
perag structure and the way it is indexed and hence avoid the known
deadlocks related to growing a busy filesystem.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: rename xfs_get_perag
Dave Chinner [Mon, 11 Jan 2010 11:47:40 +0000 (11:47 +0000)]
xfs: rename xfs_get_perag

xfs_get_perag is really getting the perag that an inode belongs to
based on it's inode number. Convert the use of this function to just
get the perag from a provided ag number.  Use this new function to
obtain the per-ag structure when traversing the per AG inode trees
for sync and reclaim.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Don't wake xfsbufd when idle
Dave Chinner [Mon, 11 Jan 2010 11:49:59 +0000 (11:49 +0000)]
xfs: Don't wake xfsbufd when idle

The xfsbufd wakes every xfsbufd_centisecs (once per second by
default) for each filesystem even when the filesystem is idle.  If
the xfsbufd has nothing to do, put it into a long term sleep and
only wake it up when there is work pending (i.e. dirty buffers to
flush soon). This will make laptop power misers happy.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Don't wake the aild once per second
Dave Chinner [Mon, 11 Jan 2010 11:49:58 +0000 (11:49 +0000)]
xfs: Don't wake the aild once per second

Now that the AIL push algorithm is traversal safe, we don't need a
watchdog function in the xfsaild to catch pushes that fail to make
progress. Remove the watchdog timeout and make pushes purely driven
by demand. This will remove the once-per-second wakeup that is seen
when the filesystem is idle and make laptop power misers happy.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Use list_heads for log recovery item lists
Dave Chinner [Mon, 11 Jan 2010 11:49:57 +0000 (11:49 +0000)]
xfs: Use list_heads for log recovery item lists

Remove the roll-your-own linked list operations.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: make several more functions static
Eric Sandeen [Thu, 19 Nov 2009 15:52:00 +0000 (15:52 +0000)]
xfs: make several more functions static

Just minor housekeeping, a lot more functions can be trivially made
static; others could if we reordered things a bit...

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: clean up inconsistent variable naming in xfs_swap_extent
Dave Chinner [Thu, 14 Jan 2010 01:33:56 +0000 (01:33 +0000)]
xfs: clean up inconsistent variable naming in xfs_swap_extent

The swap extent ioctl passes in a target inode and a temporary inode
which are clearly named in the ioctl structure. The code then
assigns temp to target and vice versa, making it extremely difficult
to work out which inode is which later in the code.  Make this
consistent throughout the code.

Also make xfs_swap_extent static as there are no external users of
the function.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: add tracing to xfs_swap_extents
Dave Chinner [Thu, 14 Jan 2010 01:33:55 +0000 (01:33 +0000)]
xfs: add tracing to xfs_swap_extents

To be able to diagnose whether the swap extents function is
detecting compatible inode data fork configurations for swapping
extents, add tracing points to the code to allow us to see the
format of the inode forks before and after the swap.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: xfs_swap_extents needs to handle dynamic fork offsets
Dave Chinner [Thu, 14 Jan 2010 01:33:54 +0000 (01:33 +0000)]
xfs: xfs_swap_extents needs to handle dynamic fork offsets

When swapping extents, we can corrupt inodes by swapping data forks
that are in incompatible formats.  This is caused by the two indoes
having different fork offsets due to the presence of an attribute
fork on an attr2 filesystem.  xfs_fsr tries to be smart about
setting the fork offset, but the trick it plays only works on attr1
(old fixed format attribute fork) filesystems.

Changing the way xfs_fsr sets up the attribute fork will prevent
this situation from ever occurring, so in the kernel code we can get
by with a preventative fix - check that the data fork in the
defragmented inode is in a format valid for the inode it is being
swapped into.  This will lead to files that will silently and
potentially repeatedly fail defragmentation, so issue a warning to
the log when this particular failure occurs to let us know that
xfs_fsr needs updating/fixing.

To help identify how to improve xfs_fsr to avoid this issue, add
trace points for the inodes being swapped so that we can determine
why the swap was rejected and to confirm that the code is making the
right decisions and modifications when swapping forks.

A further complication is even when the swap is allowed to proceed
when the fork offset is different between the two inodes then value
for the maximum number of extents the data fork can hold can be
wrong. Make sure these are also set correctly after the swap occurs.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: fix missing error check in xfs_rtfree_range
Dave Chinner [Thu, 14 Jan 2010 08:44:46 +0000 (08:44 +0000)]
xfs: fix missing error check in xfs_rtfree_range

When xfs_rtfind_forw() returns an error, the block is returned
uninitialised.  xfs_rtfree_range() is not checking the error return,
so could be using an uninitialised block number for modifying bitmap
summary info.

The problem was found by gcc when compiling the *userspace* libxfs
code - it is an copy of the kernel code with the exact same bug.
gcc gives an uninitialised variable warning on the userspace code
but not on the kernel code. You gotta love the consistency (Mmmm,
slightly chewy today!).

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: fix stale inode flush avoidance
Dave Chinner [Mon, 11 Jan 2010 11:45:21 +0000 (11:45 +0000)]
xfs: fix stale inode flush avoidance

When reclaiming stale inodes, we need to guarantee that inodes are
unpinned before returning with a "clean" status. If we don't we can
reclaim inodes that are pinned, leading to use after free in the
transaction subsystem as transactions complete.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Remove inode iolock held check during allocation
Dave Chinner [Sun, 10 Jan 2010 23:51:48 +0000 (23:51 +0000)]
xfs: Remove inode iolock held check during allocation

lockdep complains about a the lock not being initialised as we do an
ASSERT based check that the lock is not held before we initialise it
to catch inodes freed with the lock held.

lockdep does this check for us in the lock initialisation code, so
remove the ASSERT to stop the lockdep warning.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: reclaim all inodes by background tree walks
Dave Chinner [Sun, 10 Jan 2010 23:51:47 +0000 (23:51 +0000)]
xfs: reclaim all inodes by background tree walks

We cannot do direct inode reclaim without taking the flush lock to
ensure that we do not reclaim an inode under IO. We check the inode
is clean before doing direct reclaim, but this is not good enough
because the inode flush code marks the inode clean once it has
copied the in-core dirty state to the backing buffer.

It is the flush lock that determines whether the inode is still
under IO, even though it is marked clean, and the inode is still
required at IO completion so we can't reclaim it even though it is
clean in core. Hence the requirement that we need to take the flush
lock even on clean inodes because this guarantees that the inode
writeback IO has completed and it is safe to reclaim the inode.

With delayed write inode flushing, we coul dend up waiting a long
time on the flush lock even for a clean inode. The background
reclaim already handles this efficiently, so avoid all the problems
by killing the direct reclaim path altogether.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: Avoid inodes in reclaim when flushing from inode cache
Dave Chinner [Sun, 10 Jan 2010 23:51:46 +0000 (23:51 +0000)]
xfs: Avoid inodes in reclaim when flushing from inode cache

The reclaim code will handle flushing of dirty inodes before reclaim
occurs, so avoid them when determining whether an inode is a
candidate for flushing to disk when walking the radix trees.  This
is based on a test patch from Christoph Hellwig.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoxfs: reclaim inodes under a write lock
Dave Chinner [Sun, 10 Jan 2010 23:51:45 +0000 (23:51 +0000)]
xfs: reclaim inodes under a write lock

Make the inode tree reclaim walk exclusive to avoid races with
concurrent sync walkers and lookups. This is a version of a patch
posted by Christoph Hellwig that avoids all the code duplication.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
15 years agoLinux 2.6.33-rc4
Linus Torvalds [Wed, 13 Jan 2010 05:15:00 +0000 (21:15 -0800)]
Linux 2.6.33-rc4

15 years agoMerge git://git.infradead.org/battery-2.6
Linus Torvalds [Wed, 13 Jan 2010 05:13:06 +0000 (21:13 -0800)]
Merge git://git.infradead.org/battery-2.6

* git://git.infradead.org/battery-2.6:
  pmu_battery: Fix battery full reporting

15 years ago[SCSI] megaraid_sas: remove sysfs poll_mode_io world writeable permissions
Bryn M. Reeves [Thu, 12 Nov 2009 18:31:54 +0000 (18:31 +0000)]
[SCSI] megaraid_sas: remove sysfs poll_mode_io world writeable permissions

/sys/bus/pci/drivers/megaraid_sas/poll_mode_io defaults to being
world-writable, which seems bad (letting any user affect kernel driver
behavior).

This turns off group and user write permissions, so that on typical
production systems only root can write to it.

Signed-off-by: Bryn M. Reeves <bmr@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'for-linus' of git://gitorious.org/linux-omap-dss2/linux
Linus Torvalds [Wed, 13 Jan 2010 05:04:04 +0000 (21:04 -0800)]
Merge branch 'for-linus' of git://gitorious.org/linux-omap-dss2/linux

* 'for-linus' of git://gitorious.org/linux-omap-dss2/linux:
  OMAP: DSS2: OMAPFB: fix crash when panel driver was not loaded
  OMAP: DSS2: Reject scaling settings when they cannot be supported
  OMAP: DSS2: Make check-delay-loops consistent
  OMAP: DSS2: OMAPFB: fix omapfb_free_fbmem()
  video/omap: add __init/__exit macros to drivers/video/omap/lcd_htcherald.c
  OMAP: DSS2: Fix compile warning
  MAINTAINERS: Combine DSS2 and OMAPFB2 into one entry
  MAINTAINERS: change omapfb maintainer
  OMAP: OMAPFB: add dummy release function for omapdss
  OMAP: OMAPFB: fix clk_get for RFBI
  OMAP: DSS2: RFBI: convert to new kfifo API
  OMAP: DSS2: Fix crash when panel doesn't define enable_te()
  OMAP: DSS2: Collect interrupt statistics
  OMAP: DSS2: DSI: print debug DCS cmd in hex
  OMAP: DSS2: DSI: fix VC channels in send_short and send_null

15 years agolib: Introduce generic list_sort function
Dave Chinner [Tue, 12 Jan 2010 06:39:16 +0000 (17:39 +1100)]
lib: Introduce generic list_sort function

There are two copies of list_sort() in the tree already, one in the DRM
code, another in ubifs.  Now XFS needs this as well.  Create a generic
list_sort() function from the ubifs version and convert existing users
to it so we don't end up with yet another copy in the tree.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoremove my email address from checkpatch.
Dave Jones [Tue, 12 Jan 2010 21:59:52 +0000 (16:59 -0500)]
remove my email address from checkpatch.

Maybe this will stop people emailing me about it.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Wed, 13 Jan 2010 04:56:20 +0000 (20:56 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: retry link resume if necessary
  ata_piix: enable 32bit PIO on SATA piix
  sata_promise: don't classify overruns as HSM errors

15 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Wed, 13 Jan 2010 04:56:01 +0000 (20:56 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: Ensure ARMv6/7 mm files are built using appropriate assembler options
  ARM: Fix wrong dmb
  ARM: 5874/1: serial21285: fix disable_irq-from-interrupt-handler deadlock
  ARM: 5873/1: ARM: Fix the reset logic for ARM RealView boards
  ARM: 5872/1: ARM: include needed linux/cpu.h in asm/cpu.h
  ARM: 5871/1: arch/arm: Fix build failure for lpd7a404_defconfig caused by missing includes
  ARM: 5870/1: arch/arm: Fix build failure for defconfigs without CONFIG_ISA_DMA_API set
  ARM: 5868/1: ARM: fix "BUG: using smp_processor_id() in preemptible code"
  ARM: 5867/1: Update U300 defconfig
  ARM: 5866/1: arm ptrace: use unsigned types for kernel pt_regs
  [ARM] pxa: fix strange characters in zaurus gpio .desc
  ARM: add missing recvmmsg syscall number
  [ARM] pxa: fix compiler warnings of unused variable 'id' in cpu_is_pxa9*()
  [ARM] pxa: update pwm_backlight->notify() to include missed 'struct device *'
  [ARM] pxa: enable L2 if present in XSC3
  [ARM] pxa: do not enable L2 after MMU is enabled

15 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Wed, 13 Jan 2010 04:55:31 +0000 (20:55 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (22 commits)
  MIPS: Ignore vmlinux.*
  MIPS: Move vmlinux.ecoff to arch/mips/boot
  MIPS: cpumask_of_node() should handle -1 as a node
  MIPS: Octeon: Use non-overflowing arithmetic in sched_clock
  MIPS: Malta, PowerTV: Remove unnecessary "Linux started"
  MIPS: BCM63xx: Remove duplicate CONFIG_CMDLINE.
  MIPS: AR7: Remove unused prom_getchar()
  MIPS: PowerTV: Remove extra r4k_clockevent_init() call
  MIPS: Cobalt use strlcat() for the command line arguments
  MIPS: Octeon: Add sched_clock() to csrc-octeon.c
  MIPS: TXx9: Cleanup builtin-cmdline processing
  MIPS: PowerTV: simplify prom_init_cmdline() and merge into prom_init()
  MIPS: PowerTV: Remove unused platform_die()
  MIPS: PowerTV: Remove mips_machine_halt()
  MIPS: PowerTV: Remove unused ptv_memsize
  MIPS: PowerTV: Remove unused prom_getcmdline()
  MIPS: AR7: Remove kgdb_enabled
  MIPS: Alchemy: Correct code taking the size of a pointer
  MIPS: BCM63xx: Fix whitespace damaged board_bcm963xx.c
  MIPS: VR41xx: Use strlcat() for the command line arguments
  ...

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Wed, 13 Jan 2010 04:54:52 +0000 (20:54 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Fix ALC861-VD capture source mixer
  ALSA: ac97: add AC97 STMicroelectronics' codecs
  ALSA: ac97: Add Dell Dimension 2400 to Headphone/Line Jack Sense blacklist
  ASoC: Fix WM8350 DSP mode B configuration
  sbawe: fix memory detection part 2
  sound: oss: off by one bug
  ALSA: usb-audio - Avoid Oops after disconnect
  ALSA: test off by one in setsamplerate()
  ALSA: atiixp: Specify codec for Foxconn RC4107MA-RS2

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Wed, 13 Jan 2010 04:53:29 +0000 (20:53 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  sky2: Fix oops in sky2_xmit_frame() after TX timeout
  Documentation/3c509: document ethtool support
  af_packet: Don't use skb after dev_queue_xmit()
  vxge: use pci_dma_mapping_error to test return value
  netfilter: ebtables: enforce CAP_NET_ADMIN
  e1000e: fix and commonize code for setting the receive address registers
  e1000e: e1000e_enable_tx_pkt_filtering() returns wrong value
  e1000e: perform 10/100 adaptive IFS only on parts that support it
  e1000e: don't accumulate PHY statistics on PHY read failure
  e1000e: call pci_save_state() after pci_restore_state()
  netxen: update version to 4.0.72
  netxen: fix set mac addr
  netxen: fix smatch warning
  netxen: fix tx ring memory leak
  tcp: update the netstamp_needed counter when cloning sockets
  TI DaVinci EMAC: Handle emac module clock correctly.
  dmfe/tulip: Let dmfe handle DM910x except for SPARC on-board chips
  ixgbe: Fix compiler warning about variable being used uninitialized
  netfilter: nf_ct_ftp: fix out of bounds read in update_nl_seq()
  mv643xx_eth: don't include cache padding in rx desc buffer size
  ...

Fix trivial conflict in drivers/scsi/cxgb3i/cxgb3i_offload.c

15 years agom68knommu: fix definitions of __pa() and __va()
Greg Ungerer [Wed, 13 Jan 2010 00:42:05 +0000 (10:42 +1000)]
m68knommu: fix definitions of __pa() and __va()

Fix compilation breakage of all m68knommu targets:

  CC      arch/m68knommu/kernel/asm-offsets.s
In file included from include/linux/sched.h:77,
                 from arch/m68knommu/kernel/asm-offsets.c:12:
include/linux/percpu.h: In function 'per_cpu_ptr_to_phys':
include/linux/percpu.h:161: error: implicit declaration of function 'virt_to_phy

This is broken in linux-2.6.33-rc3.

Change the definitions of __pa() and __va() to not use virt_to_phys()
and phys_to_virt(). Trivial 1:1 conversion required for the non-MMU case.

A side effect if this is that the m68knommu can now use asm/virtconvert.h
for the definition of virt_to_phys() and phys_to_virt().

Also cleaned up the definition of page_to_phys() when moving into
virtconvert.h.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolibata: retry link resume if necessary
Tejun Heo [Mon, 11 Jan 2010 02:14:44 +0000 (11:14 +0900)]
libata: retry link resume if necessary

Interestingly, when SIDPR is used in ata_piix, writes to DET in
SControl sometimes get ignored leading to detection failure.  Update
sata_link_resume() such that it reads back SControl after clearing DET
and retry if it's not clear.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: fengxiangjun <fengxiangjun@neusoft.com>
Reported-by: Jim Faulkner <jfaulkne@ccs.neu.edu>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoata_piix: enable 32bit PIO on SATA piix
Tejun Heo [Mon, 11 Jan 2010 08:03:11 +0000 (17:03 +0900)]
ata_piix: enable 32bit PIO on SATA piix

Commit 871af1210f13966ab911ed2166e4ab2ce775b99d enabled 32bit PIO for
PATA piix but didn't for SATA.  There's no reason not to use 32bit PIO
on SATA piix.  Enable it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agosata_promise: don't classify overruns as HSM errors
Mikael Pettersson [Sat, 9 Jan 2010 22:32:06 +0000 (23:32 +0100)]
sata_promise: don't classify overruns as HSM errors

When sata_promise encounters an overrun or underrun error it
translates that to a libata AC_ERR_HSM, causing a hard reset.
Since over/under-runs were thought to be rare and transient,
this action seemed reasonable.

Unfortunately it turns out that the controller throws overrun
errors when e.g. hal polls a CD or DVD writer containing blank
media, causing long sequences of hard resets and retries before
EH finally gives up.

This patch updates sata_promise to classify over/under-runs as
AC_ERR_OTHER instead. This allows libata EH and upper layers to
retry or fail the operation as they see fit without the disruption
caused by repeated hard resets.

This fixes a problem using a DVD-RAM drive with sata_promise,
reported by Thomas Schorpp. I also tested it on a DVD-RW drive.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Tested-by: thomas schorpp <thomas.schorpp@googlemail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoARM: Ensure ARMv6/7 mm files are built using appropriate assembler options
Russell King [Tue, 12 Jan 2010 19:02:05 +0000 (19:02 +0000)]
ARM: Ensure ARMv6/7 mm files are built using appropriate assembler options

A kernel with both ARMv6 and ARMv7 selected results in build errors.
Fix this by specifying the proper architectures for these assembly
files.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
15 years agoARM: Fix wrong dmb
Russell King [Tue, 12 Jan 2010 18:59:16 +0000 (18:59 +0000)]
ARM: Fix wrong dmb

The __kuser_cmpxchg code uses an ARMv6 dmb instruction, rather than
one based upon the architecture being built for.  Switch to using
the macro provided for this purpose, which also eliminates the
need for an ifdef.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
15 years agoMIPS: Ignore vmlinux.*
Yoichi Yuasa [Fri, 18 Dec 2009 12:14:19 +0000 (21:14 +0900)]
MIPS: Ignore vmlinux.*

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/795/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Move vmlinux.ecoff to arch/mips/boot
Yoichi Yuasa [Fri, 18 Dec 2009 12:13:17 +0000 (21:13 +0900)]
MIPS: Move vmlinux.ecoff to arch/mips/boot

It moves to the same directory as the boot files in other formats.

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips <linux-mips@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/796/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: cpumask_of_node() should handle -1 as a node
Anton Blanchard [Wed, 6 Jan 2010 04:55:13 +0000 (15:55 +1100)]
MIPS: cpumask_of_node() should handle -1 as a node

pcibus_to_node can return -1 if we cannot determine which node a pci bus
is on. If passed -1, cpumask_of_node will negatively index the lookup array
and pull in random data:

# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpus
00000000,00000003,00000000,00000000
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpulist
64-65

Change cpumask_of_node to check for -1 and return cpu_all_mask in this
case:

# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpus
ffffffff,ffffffff,ffffffff,ffffffff
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpulist
0-127

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Patchwork: http://patchwork.linux-mips.org/patch/831/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Octeon: Use non-overflowing arithmetic in sched_clock
David Daney [Fri, 8 Jan 2010 22:47:36 +0000 (14:47 -0800)]
MIPS: Octeon: Use non-overflowing arithmetic in sched_clock

With typical mult and shift values, the calculation for Octeon's sched_clock
overflows when using 64-bit arithmetic.  Use 128-bit calculations instead.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/849/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Malta, PowerTV: Remove unnecessary "Linux started"
Yoichi Yuasa [Sun, 3 Jan 2010 05:47:34 +0000 (14:47 +0900)]
MIPS: Malta, PowerTV: Remove unnecessary "Linux started"

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/813/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: BCM63xx: Remove duplicate CONFIG_CMDLINE.
Yoichi Yuasa [Sun, 3 Jan 2010 05:39:11 +0000 (14:39 +0900)]
MIPS: BCM63xx: Remove duplicate CONFIG_CMDLINE.

Builtin cmdline is copied by arch_mem_init().

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/812/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: AR7: Remove unused prom_getchar()
Yoichi Yuasa [Sun, 3 Jan 2010 05:13:04 +0000 (14:13 +0900)]
MIPS: AR7: Remove unused prom_getchar()

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/811/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: Remove extra r4k_clockevent_init() call
David VomLehn [Tue, 22 Dec 2009 01:43:42 +0000 (17:43 -0800)]
MIPS: PowerTV: Remove extra r4k_clockevent_init() call

A call to r4k_clocksource_init() was added to plat_time_init(), but
when init_mips_clock_source() calls the same function, boot fails in
clockevents_register_device(). This patch removes the extraneous call.

Signed-off-by: David VomLehn <dvomlehn@cisco.com>
Patchwork: http://patchwork.linux-mips.org/patch/803/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Cobalt use strlcat() for the command line arguments
Yoichi Yuasa [Thu, 24 Dec 2009 08:06:34 +0000 (17:06 +0900)]
MIPS: Cobalt use strlcat() for the command line arguments

Tested with CoLo v1.22

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/807/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Octeon: Add sched_clock() to csrc-octeon.c
David Daney [Wed, 23 Dec 2009 21:18:54 +0000 (13:18 -0800)]
MIPS: Octeon: Add sched_clock() to csrc-octeon.c

With the advent of function graph tracing on MIPS, Octeon needs a high
precision sched_clock() implementation.  Without it, most timing
numbers are reported as 0.000.

This new sched_clock just uses the 64-bit cycle counter appropriately
scaled.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Patchwork: http://patchwork.linux-mips.org/patch/805/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: TXx9: Cleanup builtin-cmdline processing
Atsushi Nemoto [Mon, 21 Dec 2009 15:48:57 +0000 (00:48 +0900)]
MIPS: TXx9: Cleanup builtin-cmdline processing

Since commit 898d357b5262f9e26bc2418e01f8676e80d9867e (lmo) /
6acc7d485c24c00e111c61b2e6dff9180faebcae (kernel.org) ("Fix and enhance
built-in kernel command line") arcs_cmdline[] does not contain built-in
command line.  The commit introduce CONFIG_CMDLINE_BOOL and
CONFIG_CMDLINE_OVERRIDE to control built-in command line, and now we can
use them instead of platform-specific built-in command line processing.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Patchwork: http://patchwork.linux-mips.org/patch/802/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: simplify prom_init_cmdline() and merge into prom_init()
Yoichi Yuasa [Fri, 18 Dec 2009 12:38:37 +0000 (21:38 +0900)]
MIPS: PowerTV: simplify prom_init_cmdline() and merge into prom_init()

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/801/
Reviewed-by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: Remove unused platform_die()
Yoichi Yuasa [Fri, 18 Dec 2009 12:36:32 +0000 (21:36 +0900)]
MIPS: PowerTV: Remove unused platform_die()

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/800/
Reviewed-by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: Remove mips_machine_halt()
Yoichi Yuasa [Fri, 18 Dec 2009 12:33:46 +0000 (21:33 +0900)]
MIPS: PowerTV: Remove mips_machine_halt()

mips_machine_halt() is same as mips_machine_restart().  Also delete the
registration of _machine_halt and pm_power_off because mips_machine_halt()
is the restart function.

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/798/
Reviewed-by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: Remove unused ptv_memsize
Yoichi Yuasa [Fri, 18 Dec 2009 12:30:18 +0000 (21:30 +0900)]
MIPS: PowerTV: Remove unused ptv_memsize

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/799/
Reviewed-by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: PowerTV: Remove unused prom_getcmdline()
Yoichi Yuasa [Fri, 18 Dec 2009 12:29:17 +0000 (21:29 +0900)]
MIPS: PowerTV: Remove unused prom_getcmdline()

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/797/
Reviewed-by: David VomLehn <dvomlehn@cisco.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: AR7: Remove kgdb_enabled
Yoichi Yuasa [Fri, 18 Dec 2009 12:20:24 +0000 (21:20 +0900)]
MIPS: AR7: Remove kgdb_enabled

An unused leftover from the old KGDB implementation.

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: http://patchwork.linux-mips.org/patch/794/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Alchemy: Correct code taking the size of a pointer
Julia Lawall [Sun, 13 Dec 2009 11:40:39 +0000 (12:40 +0100)]
MIPS: Alchemy: Correct code taking the size of a pointer

sizeof(dp) is just the size of the pointer.  Change it to the size of the
referenced structure.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression *x;
expression f;
type T;
@@

*f(...,(T)x,...)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Patchwork: http://patchwork.linux-mips.org/patch/789/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: BCM63xx: Fix whitespace damaged board_bcm963xx.c
Florian Fainelli [Sat, 12 Dec 2009 16:57:39 +0000 (17:57 +0100)]
MIPS: BCM63xx: Fix whitespace damaged board_bcm963xx.c

Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: VR41xx: Use strlcat() for the command line arguments
Yoichi Yuasa [Thu, 10 Dec 2009 05:00:39 +0000 (14:00 +0900)]
MIPS: VR41xx: Use strlcat() for the command line arguments

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/784/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Cleanup and Fixup of compressed kernel support
Wu Zhangjin [Thu, 10 Dec 2009 14:55:13 +0000 (22:55 +0800)]
MIPS: Cleanup and Fixup of compressed kernel support

 o Remove the .initrd section.  The initrd section was already handled
   when vmlinux was linked.
 o Discard .MIPS.options, .options, .pdr, .reginfo, .comment and .note
   sections.  If .MIPS.options is not removed, kernels compiled with gcc
   3.4.6 will not boot.
 o Clean up the file format.
 o Remove several other unneeded sections.

Tested with GCC 3.4.6 and 4.4.1 with and without initrd.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/785/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMIPS: Cleanup forgotten label_module_alloc in tlbex.c
David Daney [Fri, 4 Dec 2009 01:43:54 +0000 (17:43 -0800)]
MIPS: Cleanup forgotten label_module_alloc in tlbex.c

commit c8af165342e83a4eb078c9607d29a7c399d30a53 (lmo) rsp.
e0cc87f59490d7d62a8ab2a76498dc8a2b64927a (kernel.org) left
label_module_alloc unused.  Remove it now.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/752/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
15 years agoMerge branch 'fix/asoc' into for-linus
Takashi Iwai [Tue, 12 Jan 2010 16:50:06 +0000 (17:50 +0100)]
Merge branch 'fix/asoc' into for-linus

15 years agoMerge branch 'fix/hda' into for-linus
Takashi Iwai [Tue, 12 Jan 2010 16:50:03 +0000 (17:50 +0100)]
Merge branch 'fix/hda' into for-linus

15 years agoALSA: hda - Fix ALC861-VD capture source mixer
Takashi Iwai [Tue, 12 Jan 2010 13:00:11 +0000 (14:00 +0100)]
ALSA: hda - Fix ALC861-VD capture source mixer

The capture source or input source mixer element wasn't created properly
for ALC861-VD codec due to the wrong NID passed to
alc_auto_create_input_ctls().

References: Novell bnc#568305
http://bugzilla.novell.com/show_bug.cgi?id=568305

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
15 years agosky2: Fix oops in sky2_xmit_frame() after TX timeout
Jarek Poplawski [Mon, 4 Jan 2010 08:48:41 +0000 (08:48 +0000)]
sky2: Fix oops in sky2_xmit_frame() after TX timeout

During TX timeout procedure dev could be awoken too early, e.g. by
sky2_complete_tx() called from sky2_down(). Then sky2_xmit_frame()
can run while buffers are freed causing an oops. This patch fixes it
by adding netif_device_present() test in sky2_tx_complete().

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=14925

With debugging by: Mike McCormack <mikem@ring3k.org>

Reported-by: Berck E. Nash <flyboy@gmail.com>
Tested-by: Berck E. Nash <flyboy@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoOMAP: DSS2: OMAPFB: fix crash when panel driver was not loaded
Tomi Valkeinen [Thu, 7 Jan 2010 15:45:03 +0000 (17:45 +0200)]
OMAP: DSS2: OMAPFB: fix crash when panel driver was not loaded

If the panel's probe had failed, omapfb would still go on, eventually
crashing.

A better fix would be to handle each display properly, and leaving just
the failed display out. But that is a bigger change.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@nokia.com>
15 years agoOMAP: DSS2: Reject scaling settings when they cannot be supported
Ville Syrjälä [Fri, 8 Jan 2010 09:56:41 +0000 (11:56 +0200)]
OMAP: DSS2: Reject scaling settings when they cannot be supported

If the scaling ratio is below 0.5 video output width can't be identical
to the display width. Reject such settings.

Signed-off-by: Ville Syrjälä <ville.syrjala@nokia.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@nokia.com>