GitHub/MotorolaMobilityLLC/kernel-slsi.git
18 years agoNFS: Make stat() return updated mtimes after a write()
Trond Myklebust [Tue, 3 Jan 2006 08:55:34 +0000 (09:55 +0100)]
NFS: Make stat() return updated mtimes after a write()

 The SuS states that a call to write() will cause mtime to be updated on
 the file. In order to satisfy that requirement, we need to flush out
 any cached writes in nfs_getattr().
 Speed things up slightly by not committing the writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Ensure that we return the delegation on the target of a rename too.
Trond Myklebust [Tue, 3 Jan 2006 08:55:33 +0000 (09:55 +0100)]
NFSv4: Ensure that we return the delegation on the target of a rename too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: support large reads and writes on the wire
Chuck Lever [Wed, 30 Nov 2005 23:09:02 +0000 (18:09 -0500)]
NFS: support large reads and writes on the wire

 Most NFS server implementations allow up to 64KB reads and writes on the
 wire.  The Solaris NFS server allows up to a megabyte, for instance.

 Now the Linux NFS client supports transfer sizes up to 1MB, too.  This will
 help reduce protocol and context switch overhead on read/write intensive NFS
 workloads, and support larger atomic read and write operations on servers
 that support them.

 Test-plan:
 Connectathon and iozone on mount point with wsize=rsize>32768 over TCP.
 Tests with NFS over UDP to verify the maximum RPC payload size cap.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: make "inode number mismatch" message more useful
Chuck Lever [Wed, 30 Nov 2005 23:08:55 +0000 (18:08 -0500)]
NFS: make "inode number mismatch" message more useful

 To help NFS users and server developers, make the "inode number mismatch"
 message display more useful information.

 Test-plan:
 None.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: get rid of useless kernel log message
Chuck Lever [Wed, 30 Nov 2005 23:08:57 +0000 (18:08 -0500)]
NFS: get rid of useless kernel log message

 nfs_statfs() generates a log message when GETATTR returns an error.  This
 is usually a useless message.  Make it a dprintk.

 Test plan:
 None

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: simplify inlined bit ops in nfs_page.h
Chuck Lever [Wed, 30 Nov 2005 23:08:59 +0000 (18:08 -0500)]
NFS: simplify inlined bit ops in nfs_page.h

 Minor cleanup:  inlined bit ops in nfs_page.h can be simpler.

 Test plan:
 Write-intensive workload against a server that requires COMMITs.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs()
Chuck Lever [Wed, 30 Nov 2005 23:08:19 +0000 (18:08 -0500)]
NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs()

 Red Hat found a problem in the error recovery logic in __init_nfs.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: use generic_write_checks() to sanity check direct writes
Chuck Lever [Wed, 30 Nov 2005 23:08:17 +0000 (18:08 -0500)]
NFS: use generic_write_checks() to sanity check direct writes

 Replace ad hoc write parameter sanity checking in nfs_file_direct_write()
 with a call to generic_write_checks().  This should make the proper checks
 modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by
 virtue of i_sb->s_maxbytes.

 Test plan:
 Posix compliance testing with both NFSv2 and NFSv3.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Remove requirement for machine creds for the "setclientid" operation
Trond Myklebust [Tue, 3 Jan 2006 08:55:26 +0000 (09:55 +0100)]
NFSv4: Remove requirement for machine creds for the "setclientid" operation

 Use a cred from the nfs4_client->cl_state_owners list.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Remove requirement for machine creds for the "renew" operation
Trond Myklebust [Tue, 3 Jan 2006 08:55:25 +0000 (09:55 +0100)]
NFSv4: Remove requirement for machine creds for the "renew" operation

 In RFC3530, the RENEW operation is allowed to use either

 the same principal, RPC security flavour and (if RPCSEC_GSS), the same
  mechanism and service that was used for SETCLIENTID_CONFIRM

 OR

 Any principal, RPC security flavour and service combination that
 currently has an OPEN file on the server.

 Choose the latter since that doesn't require us to keep credentials for
 the same principal for the entire duration of the mount.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Send RENEW requests to the server only when we're holding state
Trond Myklebust [Tue, 3 Jan 2006 08:55:24 +0000 (09:55 +0100)]
NFSv4: Send RENEW requests to the server only when we're holding state

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: Convert instances of kernel_thread() to kthread()
Trond Myklebust [Tue, 3 Jan 2006 08:55:23 +0000 (09:55 +0100)]
NFS: Convert instances of kernel_thread() to kthread()

 Convert private implementations in NFSv4 state recovery and delegation
 code to use kthreads.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: State recovery cleanup
Trond Myklebust [Tue, 3 Jan 2006 08:55:22 +0000 (09:55 +0100)]
NFSv4: State recovery cleanup

 Use wait_on_bit() when waiting for state recovery to complete.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease
Trond Myklebust [Tue, 3 Jan 2006 08:55:21 +0000 (09:55 +0100)]
NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease

 Cut down on the number of unnecessary RENEW requests on the wire.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoSUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.
Trond Myklebust [Tue, 3 Jan 2006 08:55:19 +0000 (09:55 +0100)]
SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.

 ...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
 interrupt RPC calls too (as per the Solaris implementation).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Make DELEGRETURN an interruptible operation.
Trond Myklebust [Tue, 3 Jan 2006 08:55:18 +0000 (09:55 +0100)]
NFSv4: Make DELEGRETURN an interruptible operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Convert LOCK rpc call into an asynchronous RPC call
Trond Myklebust [Tue, 3 Jan 2006 08:55:17 +0000 (09:55 +0100)]
NFSv4: Convert LOCK rpc call into an asynchronous RPC call

 In order to allow users to interrupt/cancel it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: locking XDR cleanup
Trond Myklebust [Tue, 3 Jan 2006 08:55:16 +0000 (09:55 +0100)]
NFSv4: locking XDR cleanup

 Get rid of some unnecessary intermediate structures

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly
Trond Myklebust [Tue, 3 Jan 2006 08:55:15 +0000 (09:55 +0100)]
NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly

 When recovering from a delegation recall or a network partition, we need
 to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately
Trond Myklebust [Tue, 3 Jan 2006 08:55:13 +0000 (09:55 +0100)]
NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately

 A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always
 specify a access modes that have been the argument of a previous OPEN
 operation.
 IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden
 unless the user called OPEN(O_WRONLY)

 In order to fix that, we really need to track the three possible open
 states separately.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Make open_confirm() asynchronous too
Trond Myklebust [Tue, 3 Jan 2006 08:55:12 +0000 (09:55 +0100)]
NFSv4: Make open_confirm() asynchronous too

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Convert open() into an asynchronous RPC call
Trond Myklebust [Tue, 3 Jan 2006 08:55:11 +0000 (09:55 +0100)]
NFSv4: Convert open() into an asynchronous RPC call

 OPEN is a stateful operation, so we must ensure that it always
 completes. In order to allow users to interrupt the operation,
 we need to make the RPC call asynchronous, and then wait on
 completion (or cancel).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoSUNRPC: rpc_execute should not return task->tk_status;
Trond Myklebust [Tue, 3 Jan 2006 08:55:10 +0000 (09:55 +0100)]
SUNRPC: rpc_execute should not return task->tk_status;

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoSUNRPC: Get rid of some unused exports
Trond Myklebust [Tue, 3 Jan 2006 08:55:09 +0000 (09:55 +0100)]
SUNRPC: Get rid of some unused exports

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Allocate OPEN call RPC arguments using kmalloc()
Trond Myklebust [Tue, 3 Jan 2006 08:55:08 +0000 (09:55 +0100)]
NFSv4: Allocate OPEN call RPC arguments using kmalloc()

 Cleanup in preparation for making OPEN calls interruptible by the user.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: Make locku use the new RPC "wait on completion" interface.
Trond Myklebust [Tue, 3 Jan 2006 08:55:07 +0000 (09:55 +0100)]
NFSv4: Make locku use the new RPC "wait on completion" interface.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFSv4: stateful NFSv4 RPC call interface
Trond Myklebust [Tue, 3 Jan 2006 08:55:06 +0000 (09:55 +0100)]
NFSv4: stateful NFSv4 RPC call interface

 The NFSv4 model requires us to complete all RPC calls that might
 establish state on the server whether or not the user wants to
 interrupt it. We may also need to schedule new work (including
 new RPC calls) in order to cancel the new state.

 The asynchronous RPC model will allow us to ensure that RPC calls
 always complete, but in order to allow for "synchronous" RPC, we
 want to add the ability to wait for completion.
 The waits are, of course, interruptible.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoSUNRPC: Further cleanups
Trond Myklebust [Tue, 3 Jan 2006 08:55:05 +0000 (09:55 +0100)]
SUNRPC: Further cleanups

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoRPC: Clean up RPC task structure
Trond Myklebust [Tue, 3 Jan 2006 08:55:04 +0000 (09:55 +0100)]
RPC: Clean up RPC task structure

 Shrink the RPC task structure. Instead of storing separate pointers
 for task->tk_exit and task->tk_release, put them in a structure.

 Also pass the user data pointer as a parameter instead of passing it via
 task->tk_calldata. This enables us to nest callbacks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoSUNRPC: Yet more RPC cleanups
Trond Myklebust [Tue, 3 Jan 2006 08:55:03 +0000 (09:55 +0100)]
SUNRPC: Yet more RPC cleanups

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoNFS: Work correctly with single-page ->writepage() calls
Trond Myklebust [Tue, 3 Jan 2006 08:55:02 +0000 (09:55 +0100)]
NFS: Work correctly with single-page ->writepage() calls

 Ensure that we always initiate flushing of data before we exit
 a single-page ->writepage() call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoidentify multipage ->writepages() calls
Andrew Morton [Wed, 16 Nov 2005 23:07:01 +0000 (15:07 -0800)]
identify multipage ->writepages() calls

 NFS needs to be able to distinguish between single-page ->writepage() calls and
 multipage ->writepages() calls.

 For the single-page writepage calls NFS can kick off the I/O within the
 context of ->writepage().

 For multipage ->writepages calls, nfs_writepage() will leave the I/O pending
 and nfs_writepages() will kick off the I/O when it all has been queued up
 within NFS.

Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
18 years agoMerge branch 'post-2.6.15' of git://brick.kernel.dk/data/git/linux-2.6-block
Linus Torvalds [Fri, 6 Jan 2006 17:01:25 +0000 (09:01 -0800)]
Merge branch 'post-2.6.15' of git://brick.kernel.dk/data/git/linux-2.6-block

Manual fixup for merge with Jens' "Suspend support for libata", commit
ID 9b847548663ef1039dd49f0eb4463d001e596bc3.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agox86: remove bogus 'pci=usepirqmask' suggestion when no irq is defined
Linus Torvalds [Fri, 6 Jan 2006 16:43:16 +0000 (08:43 -0800)]
x86: remove bogus 'pci=usepirqmask' suggestion when no irq is defined

This was harmless, but for the case of a device that had no irq
pre-defined we would incorrectly suggest that "usepirqmask" might make a
difference.  It never would, and the message was just confusing people.

Reported in the dmesg of Etienne Lorrain.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Suspend support for libata
Jens Axboe [Fri, 6 Jan 2006 08:28:07 +0000 (09:28 +0100)]
[PATCH] Suspend support for libata

This patch adds suspend patch to libata, and ata_piix in particular. For
most low level drivers, they should just need to add the 4 hooks to
work. As I can only test ata_piix, I didn't enable it for more
though.

Suspend support is the single most important feature on a notebook, and
most new notebooks have sata drives. It's quite embarrassing that we
_still_ do not support this. Right now, it's perfectly possible to
suspend the drive in mid-transfer.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow sync-speed to be controlled per-device
NeilBrown [Fri, 6 Jan 2006 08:21:36 +0000 (00:21 -0800)]
[PATCH] md: allow sync-speed to be controlled per-device

Also export current (average) speed and status in sysfs.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: support adding new devices to md arrays via sysfs
NeilBrown [Fri, 6 Jan 2006 08:21:16 +0000 (00:21 -0800)]
[PATCH] md: support adding new devices to md arrays via sysfs

Writing major:minor to md/new_dev will bind that device to the array.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow available size of component devices to be set via sysfs
NeilBrown [Fri, 6 Jan 2006 08:21:06 +0000 (00:21 -0800)]
[PATCH] md: allow available size of component devices to be set via sysfs

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md-export-rdev-data_offset-via-sysfs-fix
Andrew Morton [Fri, 6 Jan 2006 08:20:59 +0000 (00:20 -0800)]
[PATCH] md-export-rdev-data_offset-via-sysfs-fix

drivers/md/md.c: In function `offset_show':
drivers/md/md.c:1670: warning: long long unsigned int format, different type arg (arg 3)

Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: export rdev->data_offset via sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:56 +0000 (00:20 -0800)]
[PATCH] md: export rdev->data_offset via sysfs

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: expose device slot information via sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:55 +0000 (00:20 -0800)]
[PATCH] md: expose device slot information via sysfs

This the role that a device has in an array can be viewed and set.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: keep better track of dev/array size when assembling md arrays
NeilBrown [Fri, 6 Jan 2006 08:20:55 +0000 (00:20 -0800)]
[PATCH] md: keep better track of dev/array size when assembling md arrays

Move the checks - that dev size is never less than array size - into
bind_rdev_to_array to make sure it always happens properly (there is one place
where currently it doesn't).

Also reject any superblock which claims an array size smaller than the device
in question can hold.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow md/raid_disks to be settable
NeilBrown [Fri, 6 Jan 2006 08:20:54 +0000 (00:20 -0800)]
[PATCH] md: allow md/raid_disks to be settable

If array is active, try to reshape, else just set the value.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: count corrected read errors per drive
NeilBrown [Fri, 6 Jan 2006 08:20:52 +0000 (00:20 -0800)]
[PATCH] md: count corrected read errors per drive

Store this total in superblock (As appropriate), and make it available to
userspace via sysfs.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow array level to be set textually via sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:51 +0000 (00:20 -0800)]
[PATCH] md: allow array level to be set textually via sysfs

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: expose md metadata format in sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:50 +0000 (00:20 -0800)]
[PATCH] md: expose md metadata format in sysfs

Allow it to be set to a particular version, or 'none'.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow md array component size to be accessed and set via sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:49 +0000 (00:20 -0800)]
[PATCH] md: allow md array component size to be accessed and set via sysfs

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow chunk_size to be settable through sysfs
NeilBrown [Fri, 6 Jan 2006 08:20:47 +0000 (00:20 -0800)]
[PATCH] md: allow chunk_size to be settable through sysfs

... only before array is started of course.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: fix rdev->pending counts in raid1
NeilBrown [Fri, 6 Jan 2006 08:20:46 +0000 (00:20 -0800)]
[PATCH] md: fix rdev->pending counts in raid1

When we do a user-requested check/repair, we lose count of the outstanding
requests...

Also make sure that when anything is written to md/sync_action, the
RECOVERY_NEEDED flag is set and the thread is woken up so any changes take
effect.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: make sure bitmap updates are visible through filesystem
NeilBrown [Fri, 6 Jan 2006 08:20:45 +0000 (00:20 -0800)]
[PATCH] md: make sure bitmap updates are visible through filesystem

When we update a page_cache page in the kernel, we need to flush_dache_page or
userspace might not see the change.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] drivers/md/md.c: make md_new_event() static
Adrian Bunk [Fri, 6 Jan 2006 08:20:44 +0000 (00:20 -0800)]
[PATCH] drivers/md/md.c: make md_new_event() static

Make the needlessly global function md_new_event() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: make a couple of names in md.c static
NeilBrown [Fri, 6 Jan 2006 08:20:43 +0000 (00:20 -0800)]
[PATCH] md: make a couple of names in md.c static

.. because they aren't used outside md.c

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: fix typo in comment
NeilBrown [Fri, 6 Jan 2006 08:20:42 +0000 (00:20 -0800)]
[PATCH] md: fix typo in comment

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: helper function to match commands written to sysfs files
NeilBrown [Fri, 6 Jan 2006 08:20:41 +0000 (00:20 -0800)]
[PATCH] md: helper function to match commands written to sysfs files

Commands written to sysfs files may, or my not, be \n terminated.  We want to
accept with case.  For this we use cmd_match.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: define and use safe_put_page for md
NeilBrown [Fri, 6 Jan 2006 08:20:40 +0000 (00:20 -0800)]
[PATCH] md: define and use safe_put_page for md

md sometimes call put_page on NULL pointers (treating it like kfree).  This is
not safe, so define and use a 'safe_put_page' which checks for NULL.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: remove inappropriate limits in md/bitmap configuration.
NeilBrown [Fri, 6 Jan 2006 08:20:39 +0000 (00:20 -0800)]
[PATCH] md: remove inappropriate limits in md/bitmap configuration.

The kernel should not be imposing these policy limits: The time between
bitmap updates should certainly be allowed to be more than 15 seconds, and
if someone wants a bitmap chunk size in excess of 4MB, the kernel isn't the
place to stop them.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: fix possible problem in raid1/raid10 error overwriting
NeilBrown [Fri, 6 Jan 2006 08:20:37 +0000 (00:20 -0800)]
[PATCH] md: fix possible problem in raid1/raid10 error overwriting

The code to overwrite/reread for addressing read errors in raid1/raid10
currently assumes that the read will not alter the buffer which could be used
to write to the next device.  This is not a safe assumption to make.

So we split the loops into a overwrite loop and a separate re-read loop, so
that the writing is complete before reading is attempted.

Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: remove personality numbering from md
NeilBrown [Fri, 6 Jan 2006 08:20:36 +0000 (00:20 -0800)]
[PATCH] md: remove personality numbering from md

md supports multiple different RAID level, each being implemented by a
'personality' (which is often in a separate module).

These personalities have fairly artificial 'numbers'.  The numbers
are use to:
 1- provide an index into an array where the various personalities
    are recorded
 2- identify the module (via an alias) which implements are particular
    personality.

Neither of these uses really justify the existence of personality numbers.
The array can be replaced by a linked list which is searched (array lookup
only happens very rarely).  Module identification can be done using an alias
based on level rather than 'personality' number.

The current 'raid5' modules support two level (4 and 5) but only one
personality.  This slight awkwardness (which was handled in the mapping from
level to personality) can be better handled by allowing raid5 to register 2
personalities.

With this change in place, the core md module does not need to have an
exhaustive list of all possible personalities, so other personalities can be
added independently.

This patch also moves the check for chunksize being non-zero into the ->run
routines for the personalities that need it, rather than having it in core-md.
 This has a side effect of allowing 'faulty' and 'linear' not to have a
chunk-size set.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: break out of a loop that doesn't need to run to completion
NeilBrown [Fri, 6 Jan 2006 08:20:35 +0000 (00:20 -0800)]
[PATCH] md: break out of a loop that doesn't need to run to completion

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: convert recently exported symbol to GPL
NeilBrown [Fri, 6 Jan 2006 08:20:34 +0000 (00:20 -0800)]
[PATCH] md: convert recently exported symbol to GPL

...because that seems to be the preferred practice these days.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: convert various kmap calls to kmap_atomic
NeilBrown [Fri, 6 Jan 2006 08:20:34 +0000 (00:20 -0800)]
[PATCH] md: convert various kmap calls to kmap_atomic

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: tidy up raid5/6 hash table code
NeilBrown [Fri, 6 Jan 2006 08:20:33 +0000 (00:20 -0800)]
[PATCH] md: tidy up raid5/6 hash table code

- replace open-coded hash chain with hlist macros

- Fix hash-table size at one page - it is already quite generous, so there
  will never be a need to use multiple pages, so no need for __get_free_pages

No functional change.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: convert md to use kzalloc throughout
NeilBrown [Fri, 6 Jan 2006 08:20:32 +0000 (00:20 -0800)]
[PATCH] md: convert md to use kzalloc throughout

Replace multiple kmalloc/memset pairs with kzalloc calls.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: clean up 'page' related names in md
NeilBrown [Fri, 6 Jan 2006 08:20:31 +0000 (00:20 -0800)]
[PATCH] md: clean up 'page' related names in md

Substitute:

  page_cache_get -> get_page
  page_cache_release -> put_page
  PAGE_CACHE_SHIFT -> PAGE_SHIFT
  PAGE_CACHE_SIZE -> PAGE_SIZE
  PAGE_CACHE_MASK -> PAGE_MASK
  __free_page -> put_page

because we aren't using the page cache, we are just using pages.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: make /proc/mdstat pollable
NeilBrown [Fri, 6 Jan 2006 08:20:30 +0000 (00:20 -0800)]
[PATCH] md: make /proc/mdstat pollable

With this patch it is possible to poll /proc/mdstat to detect arrays appearing
or disappearing, to detect failures, recovery starting, recovery completing,
and devices being added and removed.

It is similar to the poll-ability of /proc/mounts, though different in that:

We always report that the file is readable (because face it, it is, even if
only for EOF).

We report POLLPRI when there is a change so that select() can detect
it as an exceptional event.  Not only are these exceptional events, but
that is the mechanism that the current 'mdadm' uses to watch for events
(It also polls after a timeout).
(We also report POLLERR like /proc/mounts).

Finally, we only reset the per-file event counter when the start of the file
is read, rather than when poll() returns an event.  This is more robust as it
means that an fd will continue to report activity to poll/select until the
program clearly responds to that activity.

md_new_event takes an 'mddev' which isn't currently used, but it will be soon.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: raid10 read-error handling - resync and read-only
NeilBrown [Fri, 6 Jan 2006 08:20:29 +0000 (00:20 -0800)]
[PATCH] md: raid10 read-error handling - resync and read-only

Add in correct read-error handling for resync and read-only situations.

When read-only, we don't over-write, so we need to mark the failed drive in
the r10_bio so we don't re-try it.  During resync, we always read all blocks,
so if there is a read error, we simply over-write it with the good block that
we found (assuming we found one).

Note that the recovery case still isn't handled in an interesting way.  There
is nothing useful to do for the 2-copies case.  If there are 3 or more copies,
then we could try reading from one of the non-missing copies, but this is a
bit complicated and very rarely would be used, so I'm leaving it for now.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: auto-correct correctable read errors in raid10
NeilBrown [Fri, 6 Jan 2006 08:20:28 +0000 (00:20 -0800)]
[PATCH] md: auto-correct correctable read errors in raid10

Largely just a cross-port from raid1.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: make sure read error on last working drive of raid1 actually returns...
NeilBrown [Fri, 6 Jan 2006 08:20:27 +0000 (00:20 -0800)]
[PATCH] md: make sure read error on last working drive of raid1 actually returns failure

We are inadvertently setting the R1BIO_Uptodate bit on read errors when we
decide not to try correcting (because there are no other working devices).
This means that the read error is reported to the client as success.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow raid1 to check consistency
NeilBrown [Fri, 6 Jan 2006 08:20:26 +0000 (00:20 -0800)]
[PATCH] md: allow raid1 to check consistency

Where performing a user-requested 'check' or 'repair', we read all readable
devices, and compare the contents.  We only write to blocks which had read
errors, or blocks with content that differs from the first good device found.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: support check-without-repair of raid10 arrays
NeilBrown [Fri, 6 Jan 2006 08:20:25 +0000 (00:20 -0800)]
[PATCH] md: support check-without-repair of raid10 arrays

Also keep count on the number of errors found.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: fix up some rdev rcu locking in raid5/6
NeilBrown [Fri, 6 Jan 2006 08:20:24 +0000 (00:20 -0800)]
[PATCH] md: fix up some rdev rcu locking in raid5/6

There is this "FIXME" comment with a typo in it!!  that been annoying me for
days, so I just had to remove it.

conf->disks[i].rdev should only be accessed if
  - we know we hold a reference or
  - the mddev->reconfig_sem is down or
  - we have a rcu_readlock

handle_stripe was referencing rdev in three places without any of these.  For
the first two, get an rcu_readlock.  For the last, the same access
(md_sync_acct call) is made a little later after the rdev has been claimed
under and rcu_readlock, if R5_Syncio is set.  So just use that access...
However R5_Syncio isn't really needed as the 'syncing' variable contains the
same information.  So use that instead.

Issues, comment, and fix are identical in raid5 and raid6.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: handle errors when read-only
NeilBrown [Fri, 6 Jan 2006 08:20:23 +0000 (00:20 -0800)]
[PATCH] md: handle errors when read-only

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: better handling for read error in raid1 during resync
NeilBrown [Fri, 6 Jan 2006 08:20:22 +0000 (00:20 -0800)]
[PATCH] md: better handling for read error in raid1 during resync

Handling of read errors during resync is separate from handling of read errors
during normal IO in raid1.  A previous patch added support for read errors
during normal IO.  This one adds support for read errors during resync or
recovery.

The key differences are that we don't need to freeze the array, because the
normal handling of resync means that this part of the array will be idle
except for resync, and the read/overwrite/re-read is needed in a separate
piece of code.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: tidyup some issues with raid1 resync and prepare for catching read errors
NeilBrown [Fri, 6 Jan 2006 08:20:21 +0000 (00:20 -0800)]
[PATCH] md: tidyup some issues with raid1 resync and prepare for catching read errors

We are dereferencing ->rdev without an rcu lock!

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: attempt to auto-correct read errors in raid1
NeilBrown [Fri, 6 Jan 2006 08:20:19 +0000 (00:20 -0800)]
[PATCH] md: attempt to auto-correct read errors in raid1

On a read-error we suspend the array, then synchronously read the block from
other arrays until we find one where we can read it.  Then we try writing the
good data back everywhere and make sure it works.  If any write or subsequent
read fails, only then do we fail the device out of the array.

To be able to suspend the array, we need to also keep track of how many
requests are queued for handling by raid1d.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: improve handing of read errors with raid6
NeilBrown [Fri, 6 Jan 2006 08:20:18 +0000 (00:20 -0800)]
[PATCH] md: improve handing of read errors with raid6

This is a simple port of match functionality across from raid5.  If we get a
read error, we don't kick the drive straight away, but try to over-write with
good data first.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: fix raid6 resync check/repair code
NeilBrown [Fri, 6 Jan 2006 08:20:17 +0000 (00:20 -0800)]
[PATCH] md: fix raid6 resync check/repair code

raid6 currently does not check the P/Q syndromes when doing a resync, it just
calculates the correct value and writes it.  Doing the check can reduce writes
(often to 0) for a resync, and it is needed to properly implement the

  echo check > sync_action

operation.

This patch implements the appropriate checks and tidies up some related code.

It also allows raid6 user-requested resync to bypass the intent bitmap.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: write intent bitmap support for raid10
NeilBrown [Fri, 6 Jan 2006 08:20:16 +0000 (00:20 -0800)]
[PATCH] md: write intent bitmap support for raid10

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: move bitmap_create to after md array has been initialised
NeilBrown [Fri, 6 Jan 2006 08:20:16 +0000 (00:20 -0800)]
[PATCH] md: move bitmap_create to after md array has been initialised

This is important because bitmap_create uses
  mddev->resync_max_sectors
and that doesn't have a valid value until after the array
has been initialised (with pers->run()).
[It doesn't make a difference for current personalities that
 support bitmaps, but will make a difference for raid10]

This has the added advantage of meaning with can move the thread->timeout
manipulation inside the bitmap.c code instead of sprinkling identical code
throughout all personalities.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: allow dirty raid[456] arrays to be started at boot
NeilBrown [Fri, 6 Jan 2006 08:20:15 +0000 (00:20 -0800)]
[PATCH] md: allow dirty raid[456] arrays to be started at boot

See patch to md.txt for more details

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: small cleanups for raid5
NeilBrown [Fri, 6 Jan 2006 08:20:14 +0000 (00:20 -0800)]
[PATCH] md: small cleanups for raid5

Resync code:
  A test that isn't needed,
  a 'compute_block' that makes more sense
    elsewhere (And then doesn't need a test),
  a couple of BUG_ONs to confirm the change makes sense.

Printks:
  A few were missing KERN_*

Also fix a typo in a comment..

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: improve raid10 "IO Barrier" concept
NeilBrown [Fri, 6 Jan 2006 08:20:13 +0000 (00:20 -0800)]
[PATCH] md: improve raid10 "IO Barrier" concept

raid10 needs to put up a barrier to new requests while it does resync or other
background recovery.  The code for this is currently open-coded, slighty
obscure by its use of two waitqueues, and not documented.

This patch gathers all the related code into 4 functions, and includes a
comment which (hopefully) explains what is happening.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] md: improve raid1 "IO Barrier" concept
NeilBrown [Fri, 6 Jan 2006 08:20:12 +0000 (00:20 -0800)]
[PATCH] md: improve raid1 "IO Barrier" concept

raid1 needs to put up a barrier to new requests while it does resync or other
background recovery.  The code for this is currently open-coded, slighty
obscure by its use of two waitqueues, and not documented.

This patch gathers all the related code into 4 functions, and includes a
comment which (hopefully) explains what is happening.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] make dm-mirror not issue invalid resync requests
Darrick J. Wong [Fri, 6 Jan 2006 08:20:11 +0000 (00:20 -0800)]
[PATCH] make dm-mirror not issue invalid resync requests

I've been attempting to set up a (Host)RAID mirror with dm_mirror on
2.6.14.3, and I've been having a strange little problem.  The configuration
in question is a set of 9GB SCSI disks that have 17942584 sectors.  I set
up the dm_mirror table as such:

17942528 mirror core 2 2048 nosync 2 8:48 0 8:64 0

If I'm not mistaken, this sets up a 9GB RAID1 mriror with 1MB stripes
across both SCSI disks.  The sector count of the dm device is less than the
size of the disks, so we shouldn't fall off the end.  However, I always get
the messages like this in dmesg when I set up the dm table:

attempt to access beyond end of device
sdd: rw=0, want=17958656, limit=17942584

Clearly, something is trying to read sectors past the end of the drive.  I
traced it down to the __rh_recovery_prepare function in dm-raid1.c, which
gets called when we're putting the mirror set together.  This function
calls the dirty region log's get_resync_work function to see if there's any
resync that needs to be done, and queues up any areas that are out of sync.
 The log's get_resync_work function is actually a pointer to the
core_get_resync_work function in dm-log.c.

The core_get_resync_work function queries a bitset lc->sync_bits to find
out if there are any regions that are out of date (i.e.  the bit is 0),
which is where the problem occurs.  If every bit in lc->sync_bits is 1
(which is the case when we've just configured a new RAID1 with the nosync
option), the find_next_zero_bit does NOT return the size parameter
(lc->region_count in this case), it returns the size parameter rounded up
to the nearest multiple of 32!  I don't know if this is intentional, but
i386 and x86_64 both exhibit this behavior.

In any case, the statement "if (*region == lc->region_count)" looks like
it's supposed to catch the case where are no regions to resync and
return 0.  Since find_next_zero_bit apparently has a habit of returning
a value that's larger than lc->region_count, the enclosed patch changes
the equality test to a greater-than test so that we don't try to resync
areas outside of the RAID1 region.  Seeing as the HostRAID metadata
lives just past the end of the RAID1 data, mucking around in that area
is not a good idea.

I suppose another way to fix this would be to amend find_next_zero_bit so
that it doesn't return values larger than "size", but I don't know if
there's a reason for the current behavior.

Signed-Off-By: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] dm-crypt: zero key before freeing it
Stefan Rompf [Fri, 6 Jan 2006 08:20:08 +0000 (00:20 -0800)]
[PATCH] dm-crypt: zero key before freeing it

Zap the memory before freeing it so we don't leave crypto information
around in memory.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
Acked-by: Clemens Fruhwirth <clemens@endorphin.org>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] drivers/md/kcopyd.c: #if 0 kcopyd_cancel()
Adrian Bunk [Fri, 6 Jan 2006 08:20:08 +0000 (00:20 -0800)]
[PATCH] drivers/md/kcopyd.c: #if 0 kcopyd_cancel()

This patch #if 0's the not yet implemented global function kcopyd_cancel().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper ioctl: add skip lock_fs flag
Alasdair G Kergon [Fri, 6 Jan 2006 08:20:07 +0000 (00:20 -0800)]
[PATCH] device-mapper ioctl: add skip lock_fs flag

Add ioctl DM_SKIP_LOCKFS_FLAG for userspace to request that lock_fs is
bypassed when suspending a device.

There's no change to the behaviour of existing code that doesn't know about
the new flag.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: make lock_fs optional
Alasdair G Kergon [Fri, 6 Jan 2006 08:20:06 +0000 (00:20 -0800)]
[PATCH] device-mapper: make lock_fs optional

Devices only needs syncing when creating snapshots, so make this optional when
suspending a device.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: rename frozen_bdev
Alasdair G Kergon [Fri, 6 Jan 2006 08:20:05 +0000 (00:20 -0800)]
[PATCH] device-mapper: rename frozen_bdev

Rename frozen_bdev to suspended_bdev and move the bdget outside lockfs.  (This
prepares for making lockfs optional.)

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper raid1: add default mirror
Jonathan E Brassow [Fri, 6 Jan 2006 08:20:05 +0000 (00:20 -0800)]
[PATCH] device-mapper raid1: add default mirror

This patch introduces a new field to the mirror_set (default_mirror) to store
the default mirror.

(A subsequent patch will allow us to change the default mirror in the event of
a failure.)

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: scanf sector format change
Alasdair G Kergon [Fri, 6 Jan 2006 08:20:04 +0000 (00:20 -0800)]
[PATCH] device-mapper: scanf sector format change

Use %llu not %Lu in sscanf/printf format strings.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: remove unused definition
Andrew Stribblehill [Fri, 6 Jan 2006 08:20:03 +0000 (00:20 -0800)]
[PATCH] device-mapper: remove unused definition

This patch removes an unused #define.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper snapshot: metadata reading separation
Alasdair G Kergon [Fri, 6 Jan 2006 08:20:02 +0000 (00:20 -0800)]
[PATCH] device-mapper snapshot: metadata reading separation

More snapshot metadata reading into separate function, to prepare for changing
the place it gets called from.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper ioctl: event on rename
goggin, edward [Fri, 6 Jan 2006 08:20:01 +0000 (00:20 -0800)]
[PATCH] device-mapper ioctl: event on rename

After changing the name of a mapped device, trigger a dm event.  (For
userspace multipath tools.)

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: add dm_get_md
David Teigland [Fri, 6 Jan 2006 08:20:01 +0000 (00:20 -0800)]
[PATCH] device-mapper: add dm_get_md

Add dm_get_dev() to get a mapped device given its dev_t.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] device-mapper: add dm_find_md
David Teigland [Fri, 6 Jan 2006 08:20:00 +0000 (00:20 -0800)]
[PATCH] device-mapper: add dm_find_md

Abstract dm_find_md() from dm_get_mdptr() to allow use elsewhere.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] knfsd: reduce stack consumption
Neil Brown [Fri, 6 Jan 2006 08:19:59 +0000 (00:19 -0800)]
[PATCH] knfsd: reduce stack consumption

A typical nfsd call trace is
 nfsd -> svc_process -> nfsd_dispatch -> nfsd3_proc_write ->
   nfsd_write ->nfsd_vfs_write -> vfs_writev

These add up to over 300 bytes on the stack.
Looking at each of these, I see that nfsd_write (which includes
 nfsd_vfs_write) contributes 0x8c to stack usage itself!!

It turns out this is because it puts a 'struct iattr' on the stack so
it can kill suid if needed.  The following patch saves about 50 bytes
off the stack in this call path.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] knfsd: check error status from vfs_getattr and i_op->fsync
David Shaw [Fri, 6 Jan 2006 08:19:58 +0000 (00:19 -0800)]
[PATCH] knfsd: check error status from vfs_getattr and i_op->fsync

Both vfs_getattr and i_op->fsync return error statuses which nfsd was
largely ignoring.  This as noticed when exporting directories using fuse.

This patch cleans up most of the offences, which involves moving the call
to vfs_getattr out of the xdr encoding routines (where it is too late to
report an error) into the main NFS procedure handling routines.

There is still a called to vfs_gettattr (related to the ACL code) where the
status is ignored, and called to nfsd_sync_dir don't check return status
either.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Keep nfsd from exiting when seeing recv() errors
Olaf Kirch [Fri, 6 Jan 2006 08:19:56 +0000 (00:19 -0800)]
[PATCH] Keep nfsd from exiting when seeing recv() errors

I submitted this one previously - svc_tcp_recvfrom currently returns
any errors to the caller, including ECONNRESET and the like.

This is something svc_recv isn't able to deal with:

len = svsk->sk_recvfrom(rqstp);
[...]
if (len == 0 || len == -EAGAIN) {
[...]
return -EAGAIN;
}

[...]
return len;

The nfsd main loop will exit when it sees an error code other than
EAGAIN.

The following patch fixes this problem

svc_recv is not equipped to deal with error codes other than EAGAIN,
and will propagate anything else (such as ECONNRESET) up to nfsd,
causing it to exit.

Signed-off-by: Olaf Kirch <okir@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] jbd: split checkpoint lists
Jan Kara [Fri, 6 Jan 2006 08:19:55 +0000 (00:19 -0800)]
[PATCH] jbd: split checkpoint lists

Split the checkpoint list of the transaction into two lists.  In the first
list we keep the buffers that need to be submitted for IO.  In the second
list are kept buffers that were already submitted and we just have to wait
for the IO to complete.  This should simplify a handling of checkpoint
lists a bit and can eventually be also a performance gain.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>