GitHub/moto-9609/android_kernel_motorola_exynos9610.git
15 years agoknfsd: remove unreported filehandle stats counters
Greg Banks [Tue, 31 Mar 2009 20:28:21 +0000 (07:28 +1100)]
knfsd: remove unreported filehandle stats counters

The file nfsfh.c contains two static variables nfsd_nr_verified and
nfsd_nr_put.  These are counters which are incremented as a side
effect of the fh_verify(), fh_compose() and fh_put() operations,
i.e. at least twice per NFS call for any non-trivial workload.
Needless to say this makes the cacheline that contains them (and any
other innocent victims) a very hot contention point indeed under high
call-rate workloads on a multiprocessor NFS server.  It also turns out
that these counters are not used anywhere.  They're not reported to
userspace, they're not used in logic, they're not even exported from
the object file (let alone the module).  All they do is waste CPU time.

So this patch removes them.
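
The effect described above is commonly called false sharing.  The
following is a minimal userspace sketch, not the nfsd code: the struct
names, the read_mostly field and the 64-byte line size are assumptions
made for illustration.  It only shows how a frequently written counter
and an unrelated read-mostly value can end up in the same cacheline,
and how alignment separates them.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64  /* assumed line size; real hardware may differ */

/* Hot, frequently written counters packed next to a read-mostly value. */
struct packed_globals {
    unsigned long nr_verified;   /* bumped on every call */
    unsigned long nr_put;        /* bumped on every call */
    unsigned long read_mostly;   /* innocent victim: shares the line */
};

/* Same fields, but the read-mostly value gets a line of its own. */
struct padded_globals {
    unsigned long nr_verified;
    unsigned long nr_put;
    _Alignas(CACHELINE) unsigned long read_mostly;
};

/* Index of the cacheline (relative to the struct) holding an offset. */
static size_t line_of(size_t offset)
{
    return offset / CACHELINE;
}

/* Nonzero if a counter and the read-mostly value share one line. */
static int packed_shares_line(void)
{
    return line_of(offsetof(struct packed_globals, nr_verified)) ==
           line_of(offsetof(struct packed_globals, read_mostly));
}

static int padded_shares_line(void)
{
    return line_of(offsetof(struct padded_globals, nr_verified)) ==
           line_of(offsetof(struct padded_globals, read_mostly));
}
```

Removing an unused counter, as this patch does, avoids the problem
without spending any padding at all.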

Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured
separately (no bonding).  Workload is 640 client threads doing directory
traversals with random small reads, from server RAM.

Before
======

Kernel profile:

  %   cumulative   self              self     total
 time   samples   samples    calls   1/call   1/call  name
  6.05   2716.00  2716.00    30406     0.09     1.02  svc_process
  4.44   4706.00  1990.00     1975     1.01     1.01  spin_unlock_irqrestore
  3.72   6376.00  1670.00     1666     1.00     1.00  svc_export_put
  3.41   7907.00  1531.00     1786     0.86     1.02  nfsd_ofcache_lookup
  3.25   9363.00  1456.00    10965     0.13     1.01  nfsd_dispatch
  3.10  10752.00  1389.00     1376     1.01     1.01  nfsd_cache_lookup
  2.57  11907.00  1155.00     4517     0.26     1.03  svc_tcp_recvfrom
  ...
  2.21  15352.00  1003.00     1081     0.93     1.00  nfsd_choose_ofc  <----
  ^^^^

Here the function nfsd_choose_ofc() reads a global variable
which by accident happened to be located in the same cacheline as
nfsd_nr_verified.

Call rate:

nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 00:15:27     184780.663
Thu Dec 13 00:15:28     184885.881
Thu Dec 13 00:15:29     184449.215
Thu Dec 13 00:15:30     184971.058
Thu Dec 13 00:15:31     185036.052
Thu Dec 13 00:15:32     185250.475
Thu Dec 13 00:15:33     184481.319
Thu Dec 13 00:15:34     185225.737
Thu Dec 13 00:15:35     185408.018
Thu Dec 13 00:15:36     185335.764

After
=====

Kernel profile:

  %   cumulative   self              self     total
 time   samples   samples    calls   1/call   1/call  name
  6.33   2813.00  2813.00    29979     0.09     1.01  svc_process
  4.66   4883.00  2070.00     2065     1.00     1.00  spin_unlock_irqrestore
  4.06   6687.00  1804.00     2182     0.83     1.00  nfsd_ofcache_lookup
  3.20   8110.00  1423.00    10932     0.13     1.00  nfsd_dispatch
  3.03   9456.00  1346.00     1343     1.00     1.00  nfsd_cache_lookup
  2.62  10622.00  1166.00     4645     0.25     1.01  svc_tcp_recvfrom
[...]
  0.10  42586.00    44.00       74     0.59     1.00  nfsd_choose_ofc  <--- HA!!
  ^^^^

Call rate:

nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 01:45:28     194677.118
Thu Dec 13 01:45:29     193932.692
Thu Dec 13 01:45:30     194294.364
Thu Dec 13 01:45:31     194971.276
Thu Dec 13 01:45:32     194111.207
Thu Dec 13 01:45:33     194999.635
Thu Dec 13 01:45:34     195312.594
Thu Dec 13 01:45:35     195707.293
Thu Dec 13 01:45:36     194610.353
Thu Dec 13 01:45:37     195913.662
Thu Dec 13 01:45:38     194808.675

i.e. about a 5.3% improvement in call rate.
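
The figure can be checked against the samples listed above.  This
small sketch averages the ten "before" and eleven "after" samples and
computes the relative change; the means come out at roughly 184,980
and 194,850 calls per second.  The helper names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* pmdumptext samples copied from the "Before" run. */
static const double before[] = {
    184780.663, 184885.881, 184449.215, 184971.058, 185036.052,
    185250.475, 184481.319, 185225.737, 185408.018, 185335.764,
};

/* pmdumptext samples copied from the "After" run. */
static const double after[] = {
    194677.118, 193932.692, 194294.364, 194971.276, 194111.207,
    194999.635, 195312.594, 195707.293, 194610.353, 195913.662,
    194808.675,
};

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative change of the mean call rate, in percent. */
static double improvement_percent(void)
{
    double b = mean(before, sizeof(before) / sizeof(before[0]));
    double a = mean(after, sizeof(after) / sizeof(after[0]));

    return (a - b) / b * 100.0;
}
```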

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Reviewed-by: David Chinner <dgc@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoknfsd: fix reply cache memory corruption
Greg Banks [Tue, 31 Mar 2009 20:28:15 +0000 (07:28 +1100)]
knfsd: fix reply cache memory corruption

Fix a regression in the reply cache introduced when the code was
converted to use proper Linux lists.  When a new entry needs to be
inserted, the case where all the entries are currently being used
by threads is not correctly detected.  This can result in memory
corruption and a crash.  In the current code this is an extremely
unlikely corner case; it would require the machine to have 1024
nfsd threads and all of them to be busy at the same time.  However,
upcoming reply cache changes make this more likely; a crash due to
this problem was actually observed in the field.
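
The failure mode can be sketched in userspace as follows.  This is not
the nfsd code: the list helpers, struct cache_entry and find_reusable()
are simplified stand-ins written for illustration.  The point is that a
scan over a circular list with a sentinel head must notice when it has
walked all the way back to the head, instead of treating the head as
if it were a cache entry.

```c
#include <assert.h>
#include <stddef.h>

enum rc_state { RC_UNUSED, RC_INPROG, RC_DONE };

/* Minimal circular doubly linked list with a sentinel head. */
struct list_node { struct list_node *next, *prev; };

struct cache_entry {
    struct list_node lru;   /* must stay first: see entry_of() */
    enum rc_state state;
};

static void list_init(struct list_node *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Valid only for real entries, because lru is the first member. */
static struct cache_entry *entry_of(struct list_node *n)
{
    return (struct cache_entry *)n;
}

/* First entry not in use by a thread, or NULL if every entry is busy. */
static struct cache_entry *find_reusable(struct list_node *head)
{
    struct list_node *pos;

    for (pos = head->next; pos != head; pos = pos->next)
        if (entry_of(pos)->state != RC_INPROG)
            return entry_of(pos);
    /* Walked back to the sentinel: nothing to reuse.  Converting the
     * head itself into an entry here would be the corruption case. */
    return NULL;
}

/* Build an LRU of n entries with the first `busy` in progress and
 * return the index of the entry chosen for reuse, or -1. */
static int demo_index(struct cache_entry *entries, size_t n, size_t busy)
{
    struct list_node head;
    struct cache_entry *e;
    size_t i;

    list_init(&head);
    for (i = 0; i < n; i++) {
        entries[i].state = i < busy ? RC_INPROG : RC_DONE;
        list_add_tail(&entries[i].lru, &head);
    }
    e = find_reusable(&head);
    return e ? (int)(e - entries) : -1;
}
```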

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoknfsd: reply cache cleanups
Greg Banks [Tue, 31 Mar 2009 20:28:13 +0000 (07:28 +1100)]
knfsd: reply cache cleanups

Make REQHASH() an inline function.  Rename hash_list to cache_hash.
Fix an obsolete comment.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agolockd: fix FILE_LOCKING=n build error
Randy Dunlap [Tue, 12 May 2009 20:28:09 +0000 (13:28 -0700)]
lockd: fix FILE_LOCKING=n build error

lockd/svclock.c is missing a header file <linux/fs.h>.

<linux/fs.h> is missing a definition of locks_release_private()
for the config case of FILE_LOCKING=n, causing a build error:

fs/lockd/svclock.c:330: error: implicit declaration of function 'locks_release_private'

lockd without FILE_LOCKING doesn't make sense, so make LOCKD and LOCKD_V4
depend on FILE_LOCKING, and make NFS depend on FILE_LOCKING.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: nfs4_stat_init cleanup
Wang Chen [Fri, 24 Apr 2009 07:41:57 +0000 (15:41 +0800)]
nfsd: nfs4_stat_init cleanup

Save some loop time.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: use C99 struct initializers
Randy Dunlap [Tue, 28 Apr 2009 23:48:25 +0000 (16:48 -0700)]
nfsd: use C99 struct initializers

Eliminate 56 sparse warnings like this one:

fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax
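
For readers unfamiliar with the warning, here is a small illustration
of the two syntaxes.  The enum, the decoder functions and the table
are hypothetical and are not taken from nfs4xdr.c; the obsolete form
appears only in a comment.

```c
#include <assert.h>

enum op { OP_ACCESS, OP_CLOSE, OP_COMMIT, OP_MAX };

typedef int (*decoder)(int);

static int decode_access(int x) { return x + 1; }
static int decode_close(int x)  { return x + 2; }
static int decode_commit(int x) { return x + 3; }

/*
 * Obsolete GNU form that sparse complains about (no '=' after the
 * index):
 *
 *     [OP_ACCESS]  decode_access,
 *
 * C99 designated initializers put '=' between index and value:
 */
static const decoder decoders[OP_MAX] = {
    [OP_ACCESS] = decode_access,
    [OP_CLOSE]  = decode_close,
    [OP_COMMIT] = decode_commit,
};

/* Dispatch through the table. */
static int run(enum op op, int arg)
{
    return decoders[op](arg);
}
```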

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: make recall callback an asynchronous rpc
J. Bruce Fields [Sat, 2 May 2009 02:36:55 +0000 (22:36 -0400)]
nfsd4: make recall callback an asynchronous rpc

As with the probe, this removes the need for another kthread.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: track recall retries in nfs4_delegation
J. Bruce Fields [Sat, 2 May 2009 00:11:12 +0000 (20:11 -0400)]
nfsd4: track recall retries in nfs4_delegation

Move this out of a local variable into the nfs4_delegation object in
preparation for making this an async rpc call (at which point we'll need
any state like this in a common object that's preserved across function
calls).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove unused dl_trunc
J. Bruce Fields [Fri, 1 May 2009 23:57:46 +0000 (19:57 -0400)]
nfsd4: remove unused dl_trunc

There's no point in keeping this field around--it's always zero.

(Background: the protocol allows you to tell the client that the file is
about to be truncated, as an optimization to save the client from
writing back dirty pages that will just be discarded.  We don't
implement this hint.  If we do some day, adding this field back in will
be the least of the work involved.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: eliminate struct nfs4_cb_recall
J. Bruce Fields [Fri, 1 May 2009 23:50:00 +0000 (19:50 -0400)]
nfsd4: eliminate struct nfs4_cb_recall

The nfs4_cb_recall struct is used only in nfs4_delegation, so its
pointer to the containing delegation is unnecessary--we could just use
container_of().

But there's no real reason to have this a separate struct at all--just
move these fields to nfs4_delegation.
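
For reference, the container_of() option mentioned above works like
this.  The sketch is userspace C with a simplified macro; struct
recall and struct delegation are hypothetical stand-ins, not the
kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of the kernel macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct recall {
    int retries;
};

struct delegation {
    int id;
    struct recall recall;   /* embedded, so no back pointer is needed */
};

/* Recover the enclosing delegation from a pointer to its member. */
static struct delegation *recall_to_delegation(struct recall *r)
{
    return container_of(r, struct delegation, recall);
}
```

Once the fields live directly in the containing struct, as this patch
arranges, even this conversion is unnecessary.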

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: rename callback struct to cb_conn
J. Bruce Fields [Wed, 29 Apr 2009 23:09:19 +0000 (19:09 -0400)]
nfsd4: rename callback struct to cb_conn

I want to use the name for a struct that actually does represent a
single callback.

(Actually, I've never been sure it helps to have a separate struct for the
callback information.  Some day maybe those fields could just be dumped
into struct nfs4_client.  I don't know.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: replace callback thread by asynchronous rpc
J. Bruce Fields [Thu, 5 Mar 2009 20:01:11 +0000 (15:01 -0500)]
nfsd4: replace callback thread by asynchronous rpc

We don't really need a synchronous rpc, and moving to an asynchronous
rpc allows us to do without this extra kthread.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: lookup up callback cred only once
J. Bruce Fields [Tue, 24 Feb 2009 05:42:10 +0000 (21:42 -0800)]
nfsd4: lookup up callback cred only once

Look up the callback cred once and then use it for all subsequent
callbacks.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: create rpc callback client from server thread
J. Bruce Fields [Tue, 24 Feb 2009 03:35:22 +0000 (19:35 -0800)]
nfsd4: create rpc callback client from server thread

The code is a little simpler, and it should be easier to avoid races, if
we just do all rpc client creation/destruction from nfsd or laundromat
threads and do only the rpc calls themselves asynchronously.  The rpc
creation doesn't involve any significant waiting (it doesn't call the
client, for example), so there's no reason not to do this.

Also don't bother destroying the client on failure of the rpc null
probe.  We may want to retry the probe later anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: set cb_client inside setup_callback_client
J. Bruce Fields [Mon, 23 Feb 2009 18:45:27 +0000 (10:45 -0800)]
nfsd4: set cb_client inside setup_callback_client

This is just a minor code simplification.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: set shorter timeout
J. Bruce Fields [Thu, 5 Mar 2009 22:18:10 +0000 (17:18 -0500)]
nfsd4: set shorter timeout

We tried to do something overly complicated with the callback rpc
timeouts here.  And they're wrong--the result is that by the time a
single callback times out, it's already too late to tell the client
(using the cb_path_down return to RENEW) that the callback is down.

Use a much shorter, simpler timeout.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: setclientid_confirm callback-change fixes
J. Bruce Fields [Wed, 29 Apr 2009 17:45:36 +0000 (13:45 -0400)]
nfsd4: setclientid_confirm callback-change fixes

This setclientid_confirm case should allow the client to change
callbacks, but it currently has a dummy implementation that just turns
off callbacks completely.  That dummy implementation isn't completely
correct either, though:

- There's no need to remove any client recovery directory in
  this case.
- New clientid confirm verifiers should be generated (and
  returned) in setclientid; there's no need to generate a new
  one here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: quiet compile warning
J. Bruce Fields [Wed, 29 Apr 2009 15:36:17 +0000 (11:36 -0400)]
nfsd: quiet compile warning

Stephen Rothwell said:
"Today's linux-next build (powerpc ppc64_defconfig) produced this new
warning:

fs/nfsd/nfs4state.c: In function 'EXPIRED_STATEID':
fs/nfsd/nfs4state.c:2757: warning: comparison of distinct pointer types lacks a cast

Caused by commit 78155ed75f470710f2aecb3e75e3d97107ba8374 ("nfsd4:
distinguish expired from stale stateids")."

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
15 years agonfsd: support ext4 i_version
J. Bruce Fields [Thu, 16 Apr 2009 21:33:25 +0000 (17:33 -0400)]
nfsd: support ext4 i_version

ext4 supports a real NFSv4 change attribute, which is bumped whenever
the ctime would be updated, including times when two updates arrive
within a jiffy of each other.  (Note that although ext4 has space for
nanosecond-precision ctime, the real resolution is lower: it actually
uses jiffies as the time-source.)  This ensures clients will invalidate
their caches when they need to.
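
A toy model of why a counter helps where a coarse timestamp does not.
Everything here is hypothetical (struct inode_sim, touch(), tick-based
time) and is not the ext4 or nfsd implementation; it only shows that
two updates inside one clock tick leave the timestamp unchanged while
the counter still moves.

```c
#include <assert.h>

struct inode_sim {
    unsigned long ctime_ticks;      /* timestamp at tick resolution */
    unsigned long long version;     /* bumped on every change */
};

/* Record a change at the given tick. */
static void touch(struct inode_sim *inode, unsigned long now_ticks)
{
    inode->ctime_ticks = now_ticks;
    inode->version++;
}

/* Nonzero if a client comparing only ctime would miss the change. */
static int ctime_misses_update(unsigned long tick)
{
    struct inode_sim inode = { 0, 0 };
    unsigned long seen;

    touch(&inode, tick);
    seen = inode.ctime_ticks;
    touch(&inode, tick);            /* second update, same tick */
    return inode.ctime_ticks == seen;
}

/* Nonzero if a client comparing the version would notice it. */
static int version_sees_update(unsigned long tick)
{
    struct inode_sim inode = { 0, 0 };
    unsigned long long seen;

    touch(&inode, tick);
    seen = inode.version;
    touch(&inode, tick);
    return inode.version != seen;
}
```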

There is some fear that keeping the i_version up-to-date could have
performance drawbacks, so for now it's turned on only by a mount option.
We hope to do something better eventually.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Theodore Tso <tytso@mit.edu>
15 years agonfsd4: delete obsolete xdr comments
J. Bruce Fields [Wed, 8 Apr 2009 00:03:19 +0000 (17:03 -0700)]
nfsd4: delete obsolete xdr comments

We don't need comments to tell us these macros are ugly.  And we're long
past trying to share any of this code with the BSDs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: eliminate ENCODE_HEAD macro
J. Bruce Fields [Tue, 7 Apr 2009 23:55:27 +0000 (16:55 -0700)]
nfsd: eliminate ENCODE_HEAD macro

This macro doesn't serve any useful purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Stricter buffer size checking in fs/nfsd/nfsctl.c
Chuck Lever [Thu, 23 Apr 2009 23:33:25 +0000 (19:33 -0400)]
NFSD: Stricter buffer size checking in fs/nfsd/nfsctl.c

Clean up: For consistency, handle output buffer size checking in the
other nfsctl functions the same way it's done for write_versions().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Stricter buffer size checking in write_versions()
Chuck Lever [Thu, 23 Apr 2009 23:33:18 +0000 (19:33 -0400)]
NFSD: Stricter buffer size checking in write_versions()

While it's not likely today that there are enough NFS versions to
overflow the output buffer in write_versions(), we should be more
careful about detecting the end of the buffer.

The number of NFS versions will only increase as NFSv4 minor versions
are added.

Note that this API doesn't behave the same as portlist.  Here we
attempt to display as many versions as will fit in the buffer, and do
not provide any indication that an overflow would have occurred.  I
don't have any good rationale for that.
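
The "display as many as will fit" behaviour can be sketched like this.
It is a userspace approximation: format_versions() and its "+N "
output format are assumptions for illustration, not the actual
write_versions() code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append "+N " for each version while it fits.  Items that do not fit
 * are dropped silently.  Returns the number of bytes written, not
 * counting the terminating NUL. */
static size_t format_versions(char *buf, size_t size,
                              const int *vers, size_t n)
{
    size_t used = 0, i;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    for (i = 0; i < n; i++) {
        size_t remaining = size - used;
        int len = snprintf(buf + used, remaining, "+%d ", vers[i]);

        /* snprintf() reports the length it wanted to write, so a
         * value >= remaining means the item was truncated. */
        if (len < 0 || (size_t)len >= remaining) {
            buf[used] = '\0';   /* discard the partial item */
            break;
        }
        used += (size_t)len;
    }
    return used;
}
```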

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Stricter buffer size checking in write_recoverydir()
Chuck Lever [Thu, 23 Apr 2009 23:33:10 +0000 (19:33 -0400)]
NFSD: Stricter buffer size checking in write_recoverydir()

While it's not likely a pathname will be longer than
SIMPLE_TRANSACTION_SIZE, we should be more careful about just
plopping it into the output buffer without bounds checking.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: Clean up one_sock_name()
Chuck Lever [Thu, 23 Apr 2009 23:33:03 +0000 (19:33 -0400)]
SUNRPC: Clean up one_sock_name()

Clean up svc_one_sock_name() by setting up automatic variables for
frequently used expressions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: Support PF_INET6 in one_sock_name()
Chuck Lever [Thu, 23 Apr 2009 23:32:55 +0000 (19:32 -0400)]
SUNRPC: Support PF_INET6 in one_sock_name()

Add an arm to the switch statement in svc_one_sock_name() so it can
construct the name of PF_INET6 sockets properly.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aime Le Rouzic <aime.le-rouzic@bull.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: Switch one_sock_name() to use snprintf()
Chuck Lever [Thu, 23 Apr 2009 23:32:48 +0000 (19:32 -0400)]
SUNRPC: Switch one_sock_name() to use snprintf()

Use snprintf() in one_sock_name() to prevent overflowing the output
buffer.  If the name doesn't fit in the buffer, the buffer is filled
in with an empty string, and -ENAMETOOLONG is returned.
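
A userspace sketch of that contract follows.  format_sock_name() and
its "proto port" output format are hypothetical; only the error
handling pattern is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format "proto port\n" into buf.  On success return the length
 * written; if it does not fit, leave an empty string in buf and
 * return -ENAMETOOLONG. */
static int format_sock_name(char *buf, size_t size,
                            const char *proto, unsigned int port)
{
    int len;

    assert(size > 0);
    len = snprintf(buf, size, "%s %u\n", proto, port);
    if (len < 0 || (size_t)len >= size) {
        buf[0] = '\0';
        return -ENAMETOOLONG;
    }
    return len;
}
```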

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: pass buffer size to svc_sock_names()
Chuck Lever [Thu, 23 Apr 2009 23:32:40 +0000 (19:32 -0400)]
SUNRPC: pass buffer size to svc_sock_names()

Adjust the synopsis of svc_sock_names() to pass in the size of the
output buffer.  Add a documenting comment.

This is a cosmetic change for now.  A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: pass buffer size to svc_addsock()
Chuck Lever [Thu, 23 Apr 2009 23:32:33 +0000 (19:32 -0400)]
SUNRPC: pass buffer size to svc_addsock()

Adjust the synopsis of svc_addsock() to pass in the size of the output
buffer.  Add a documenting comment.

This is a cosmetic change for now.  A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Prevent a buffer overflow in svc_xprt_names()
Chuck Lever [Thu, 23 Apr 2009 23:32:25 +0000 (19:32 -0400)]
NFSD: Prevent a buffer overflow in svc_xprt_names()

The svc_xprt_names() function can overflow its buffer if it's so near
the end of the passed in buffer that the "name too long" string still
doesn't fit.  Of course, it could never tell if it was near the end
of the passed in buffer, since its only caller passes in zero as the
buffer length.

Let's make this API a little safer.

Change svc_xprt_names() so it *always* checks for a buffer overflow,
and change its only caller to pass in the correct buffer length.

If svc_xprt_names() does overflow its buffer, it now fails with an
ENAMETOOLONG errno, instead of trying to write a message at the end
of the buffer.  I don't like this much, but I can't figure out a clean
way that's always safe to return some of the names, *and* an
indication that the buffer was not long enough.

The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
"File name too long".

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: move lockd_up() before svc_addsock()
Chuck Lever [Thu, 23 Apr 2009 23:32:18 +0000 (19:32 -0400)]
NFSD: move lockd_up() before svc_addsock()

Clean up.

A couple of years ago, a series of commits, finishing with commit
5680c446, swapped the order of the lockd_up() and svc_addsock() calls
in __write_ports().  At that time lockd_up() needed to know the
transport protocol of the passed-in socket to start a listener on the
same transport protocol.

These days, lockd_up() doesn't take a protocol argument; it always
starts both a UDP and TCP listener.  It's now more straightforward to
try the lockd_up() first, then do a lockd_down() if the svc_addsock()
fails.

Careful review of this code shows that the svc_sock_names() call is
used only to close the just-opened socket in case lockd_up() fails.
So it is no longer needed if lockd_up() is done first.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Finish refactoring __write_ports()
Chuck Lever [Thu, 23 Apr 2009 23:32:10 +0000 (19:32 -0400)]
NFSD: Finish refactoring __write_ports()

Clean up: Refactor transport name listing out of __write_ports() to
make it easier to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Note an additional requirement when passing TCP sockets to portlist
Chuck Lever [Thu, 23 Apr 2009 23:32:03 +0000 (19:32 -0400)]
NFSD: Note an additional requirement when passing TCP sockets to portlist

User space must call listen(3) on SOCK_STREAM sockets passed into
/proc/fs/nfsd/portlist, otherwise that listener is ignored.  Document
this.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Refactor socket creation out of __write_ports()
Chuck Lever [Thu, 23 Apr 2009 23:31:55 +0000 (19:31 -0400)]
NFSD: Refactor socket creation out of __write_ports()

Clean up: Refactor the socket creation logic out of __write_ports() to
make it easier to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Refactor portlist socket closing into a helper
Chuck Lever [Thu, 23 Apr 2009 23:31:48 +0000 (19:31 -0400)]
NFSD: Refactor portlist socket closing into a helper

Clean up: Refactor the socket closing logic out of __write_ports() to
make it easier to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Refactor transport addition out of __write_ports()
Chuck Lever [Thu, 23 Apr 2009 23:31:40 +0000 (19:31 -0400)]
NFSD: Refactor transport addition out of __write_ports()

Clean up: Refactor transport addition out of __write_ports() to make
it easier to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: Refactor transport removal out of __write_ports()
Chuck Lever [Thu, 23 Apr 2009 23:31:32 +0000 (19:31 -0400)]
NFSD: Refactor transport removal out of __write_ports()

Clean up: Refactor transport removal out of __write_ports() to make it
easier to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: Fix error return value of svc_addr_len()
Chuck Lever [Thu, 23 Apr 2009 23:31:25 +0000 (19:31 -0400)]
SUNRPC: Fix error return value of svc_addr_len()

The svc_addr_len() helper function returns -EAFNOSUPPORT if it doesn't
recognize the address family of the passed-in socket address.  However,
the return type of this function is size_t, which means -EAFNOSUPPORT
is turned into a very large positive value in this case.

The check in svc_udp_recvfrom() to see if the return value is less
than zero therefore won't work at all.

Additionally, handle_connect_req() passes this value directly to
memset().  This could cause memset() to clobber a large chunk of memory
if svc_addr_len() has returned an error.  Currently the address family
of these addresses, however, is known to be supported long before
handle_connect_req() is called, so this isn't a real risk.

Change the error return value of svc_addr_len() to zero, which fits in
the range of size_t, and is safer to pass to memset() directly.
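
The pitfall is easy to reproduce in userspace.  In this sketch the
family constants, the lengths and both helper functions are made up
for illustration; only the size_t conversion behaviour is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address families and lengths for the example. */
enum family { FAM_V4, FAM_V6, FAM_OTHER };

/* Buggy shape: a negative errno returned through size_t. */
static size_t addr_len_buggy(enum family f)
{
    switch (f) {
    case FAM_V4: return 16;
    case FAM_V6: return 28;
    default:     break;
    }
    return -EAFNOSUPPORT;   /* wraps to a huge positive value */
}

/* Fixed shape: zero is representable and harmless as a length. */
static size_t addr_len_fixed(enum family f)
{
    switch (f) {
    case FAM_V4: return 16;
    case FAM_V6: return 28;
    default:     break;
    }
    return 0;
}

/* A caller's "len < 0" test on a size_t can never be true, so the
 * error has to be detected some other way, e.g. len == 0. */
static int is_error(size_t len)
{
    return len == 0;
}
```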

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonet/sunrpc/svc_xprt.c: fix sparse warnings
H Hartley Sweeten [Thu, 23 Apr 2009 00:18:19 +0000 (20:18 -0400)]
net/sunrpc/svc_xprt.c: fix sparse warnings

Fix the following sparse warnings in net/sunrpc/svc_xprt.c.

  warning: symbol 'svc_recv' was not declared. Should it be static?
  warning: symbol 'svc_drop' was not declared. Should it be static?
  warning: symbol 'svc_send' was not declared. Should it be static?
  warning: symbol 'svc_close_all' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoupdate Documentation/filesystems/00-INDEX with new nfsd related docs.
Benny Halevy [Tue, 7 Apr 2009 18:45:37 +0000 (21:45 +0300)]
update Documentation/filesystems/00-INDEX with new nfsd related docs.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Cc: James Lentini <jlentini@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: distinguish expired from stale stateids
Bian Naimeng [Wed, 22 Apr 2009 10:25:37 +0000 (18:25 +0800)]
nfsd4: distinguish expired from stale stateids

If we encode the time of client creation into the stateid instead of the
time of server boot, then we can determine whether that stateid is from
a previous instance of the server, or from a client that has expired,
and return an appropriate error to the client.
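
The idea can be sketched as below.  This is a simplified model under
stated assumptions, not the nfsd logic: classify_stateid(), its
parameters and the enum are hypothetical, and "client_known" stands in
for whatever lookup the server performs to find a live client.

```c
#include <assert.h>
#include <stdbool.h>

enum stateid_status { STATEID_OK, STATEID_STALE, STATEID_EXPIRED };

/* created:      client creation time carried in the stateid
 * boot_time:    when this server instance started
 * client_known: whether the server still has state for that client */
static enum stateid_status classify_stateid(long created, long boot_time,
                                            bool client_known)
{
    /* Created before this instance booted: a previous instance
     * handed it out, so it is stale. */
    if (created < boot_time)
        return STATEID_STALE;
    /* Created by this instance, but the client is gone: its lease
     * must have expired. */
    if (!client_known)
        return STATEID_EXPIRED;
    return STATEID_OK;
}
```

With only the boot time in the stateid, the second and third cases
could not be told apart from the stateid alone.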

Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agolockd: call locks_release_private to cleanup per-filesystem state
Felix Blyakher [Tue, 31 Mar 2009 20:12:56 +0000 (15:12 -0500)]
lockd: call locks_release_private to cleanup per-filesystem state

For every lock request lockd creates a new file_lock object
in nlmsvc_setgrantargs() by copying the passed in file_lock with
locks_copy_lock(). A filesystem can attach its own lock_operations
vector to the file_lock. It has to be cleaned up at the end of the
file_lock's life. However, lockd doesn't do it today, yet it
asserts in nlmclnt_release_lockargs() that the per-filesystem
state is clean.
This patch fixes it by exporting locks_release_private() and adding
it to nlmsvc_freegrantargs(), to be symmetrical to creating a
file_lock in nlmsvc_setgrantargs().

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agorpcgss: remove redundant test on unsigned
Roel Kluin [Tue, 21 Apr 2009 14:08:39 +0000 (16:08 +0200)]
rpcgss: remove redundant test on unsigned

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoLinux 2.6.30-rc3
Linus Torvalds [Wed, 22 Apr 2009 03:07:00 +0000 (20:07 -0700)]
Linux 2.6.30-rc3

15 years agodriver synchronization: make scsi_wait_scan more advanced
Arjan van de Ven [Tue, 21 Apr 2009 20:32:54 +0000 (13:32 -0700)]
driver synchronization: make scsi_wait_scan more advanced

There is currently only one way for userspace to say "wait for my storage
device to get ready for the modules I just loaded": to load the
scsi_wait_scan module. Expectations of userspace are that once this
module is loaded, all the (storage) devices for which the drivers
were loaded before the module load are present.

Now, there are some issues with the implementation, and the async
stuff got caught in the middle of this: The existing code only
waits for the scsi async probing to finish, but it does not take
into account at all that probing might not have begun yet.
(Russell ran into this problem on his computer and the fix works for him)

This patch fixes this more thoroughly than the previous "fix", which
had some bad side effects (namely, for kernel code that wanted to wait for
the scsi scan it would also do an async sync, which would deadlock if you did
it from async context already.. there's a report about that on lkml):
The patch makes the module first wait for all device driver probes, and then it
will wait for the scsi parallel scan to finish.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoTrivial: fix a typo in slow-work.h
Jonathan Corbet [Tue, 21 Apr 2009 22:30:32 +0000 (16:30 -0600)]
Trivial: fix a typo in slow-work.h

Fix a comment typo in slow-work.h

...a trivial mistake, but it will mess up kerneldoc if nothing else.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoPERCPU: Collect the DECLARE/DEFINE declarations together
David Howells [Tue, 21 Apr 2009 22:00:29 +0000 (23:00 +0100)]
PERCPU: Collect the DECLARE/DEFINE declarations together

Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so
that they're in one place, and give them descriptive comments, particularly
the SHARED_ALIGNED variant.

It would be nice to collect these in linux/percpu.h, but that's not possible
without sorting out the severe #include recursion between the x86 arch headers
and the general headers (and possibly other arches too).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoFRV: Fix the section attribute on UP DECLARE_PER_CPU()
David Howells [Tue, 21 Apr 2009 22:00:24 +0000 (23:00 +0100)]
FRV: Fix the section attribute on UP DECLARE_PER_CPU()

In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU().  This means that
architectures that reference a small data section relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.

On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section.  The linker throws
up the following errors:

kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o

To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU().  However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Tue, 21 Apr 2009 21:12:58 +0000 (14:12 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: fix btrfs fallocate oops and deadlock
  Btrfs: use the right node in reada_for_balance
  Btrfs: fix oops on page->mapping->host during writepage
  Btrfs: add a priority queue to the async thread helpers
  Btrfs: use WRITE_SYNC for synchronous writes

15 years agoMerge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
Linus Torvalds [Tue, 21 Apr 2009 21:12:43 +0000 (14:12 -0700)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  go7007: Convert to the new i2c device binding model

15 years agobfin_5xx: misplaced parentheses
Roel Kluin [Tue, 21 Apr 2009 19:24:58 +0000 (12:24 -0700)]
bfin_5xx: misplaced parentheses

`!' has a higher precedence than `&', so the parentheses are misplaced.
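
A small standalone illustration of the precedence issue.  The function
names and the status/mask values are hypothetical and are not taken
from the bfin_5xx driver.

```c
#include <assert.h>

/* Buggy: parsed as (!status) & mask, which tests the wrong thing. */
static int bit_clear_buggy(unsigned int status, unsigned int mask)
{
    return !status & mask;
}

/* Intended: true when none of the masked bits are set. */
static int bit_clear_fixed(unsigned int status, unsigned int mask)
{
    return !(status & mask);
}
```

With status 0x02 and mask 0x01 the masked bit is clear, yet the buggy
form yields 0 because !0x02 is 0; the fixed form yields 1.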

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Sonic Zhang <sonic.zhang@analog.com>
Cc: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovmscan,memcg: reintroduce sc->may_swap
KOSAKI Motohiro [Tue, 21 Apr 2009 19:24:57 +0000 (12:24 -0700)]
vmscan,memcg: reintroduce sc->may_swap

Commit a6dc60f8975ad96d162915e07703a4439c80dcf0 ("vmscan: rename
sc.may_swap to may_unmap") removed the may_swap flag, but memcg had used
it as a flag for "we need to use swap?", as the name indicates.

And in the current implementation, memcg cannot reclaim mapped file
caches when mem+swap hits the limit.

Reintroduce the may_swap flag and handle it at get_scan_ratio().  This
patch doesn't influence any scan_control users other than memcg.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: ppc mpc85xx fix mc err detect
Dave Jiang [Tue, 21 Apr 2009 19:24:56 +0000 (12:24 -0700)]
edac: ppc mpc85xx fix mc err detect

Error found by Jeff Haran.

The error detect register is 0s when no errors are detected.  The check
code is incorrect, so reverse check sense.
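
As a rough illustration of the corrected check sense, assuming a 32-bit
register value passed in by the caller (mc_has_error() and mc_check()
are made-up names, not the driver's functions):

```c
#include <stdbool.h>
#include <stdint.h>

/* An all-zero error detect value means "no errors detected". */
static bool mc_has_error(uint32_t err_detect)
{
	return err_detect != 0;
}

/*
 * Sketch of a check routine: return early only when nothing is flagged,
 * otherwise count the error bits that are set.
 */
static int mc_check(uint32_t err_detect)
{
	int n = 0;

	if (!mc_has_error(err_detect))
		return 0;	/* nothing to report */

	while (err_detect) {
		n += err_detect & 1u;
		err_detect >>= 1;
	}
	return n;
}
```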

Reported-by: Jeff Haran <jharan@Brocade.COM>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoscsi: mpt: suppress debugobjects warning
Eric Paris [Tue, 21 Apr 2009 19:24:54 +0000 (12:24 -0700)]
scsi: mpt: suppress debugobjects warning

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13133

ODEBUG: object is on stack, but not annotated
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:253 __debug_object_init+0x1f3/0x276()
Hardware name: VMware Virtual Platform
Modules linked in: mptspi(+) mptscsih mptbase scsi_transport_spi ext3 jbd mbcache
Pid: 540, comm: insmod Not tainted 2.6.28-mm1 #2
Call Trace:
 [<c042c51c>] warn_slowpath+0x74/0x8a
 [<c0469600>] ? start_critical_timing+0x96/0xb7
 [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
 [<c0446fad>] ? trace_hardirqs_off_caller+0x18/0xaf
 [<c044704f>] ? trace_hardirqs_off+0xb/0xd
 [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
 [<c042cb84>] ? release_console_sem+0x1a5/0x1ad
 [<c05013e6>] __debug_object_init+0x1f3/0x276
 [<c0501494>] debug_object_init+0x13/0x17
 [<c0433c56>] init_timer+0x10/0x1a
 [<e08e5b54>] mpt_config+0x1c1/0x2b7 [mptbase]
 [<e08e3b82>] ? kmalloc+0x8/0xa [mptbase]
 [<e08e3b82>] ? kmalloc+0x8/0xa [mptbase]
 [<e08e6fa2>] mpt_do_ioc_recovery+0x950/0x1212 [mptbase]
 [<c04496c2>] ? __lock_acquire+0xa69/0xacc
 [<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c
 [<c060c3af>] ? _spin_unlock_irq+0x22/0x26
 [<c04f2d8b>] ? string+0x2b/0x76
 [<c04f310e>] ? vsnprintf+0x338/0x7b3
 [<c04496c2>] ? __lock_acquire+0xa69/0xacc
 [<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
 [<c04496c2>] ? __lock_acquire+0xa69/0xacc
 [<c044897d>] ? debug_check_no_locks_freed+0xeb/0x105
 [<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c
 [<c04488bc>] ? debug_check_no_locks_freed+0x2a/0x105
 [<c0446b8c>] ? lock_release_holdtime+0x43/0x48
 [<c043f742>] ? up_read+0x16/0x29
 [<c05076f8>] ? pci_get_slot+0x66/0x72
 [<e08e89ca>] mpt_attach+0x881/0x9b1 [mptbase]
 [<e091c8e5>] mptspi_probe+0x11/0x354 [mptspi]

Noticing that every caller of mpt_config has its CONFIGPARMS struct
declared on the stack and thus the &pCfg->timer is always on the stack I
changed init_timer() to init_timer_on_stack() and it seems to have shut
up.....

Cc: "Moore, Eric Dean" <Eric.Moore@lsil.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
Cc: <stable@kernel.org> [2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosgi-xp/sgi-gru: allow modules to load on non-uv systems
Robin Holt [Tue, 21 Apr 2009 19:24:53 +0000 (12:24 -0700)]
sgi-xp/sgi-gru: allow modules to load on non-uv systems

For an upcoming distro release, we need to have the xp kernel module
loadable even when not on UV equipment.  The xpc module will not load.
This will allow one set of modules dependent upon xp to work on either UV
or non-UV equipment.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agouml: kill a kconfig warning
WANG Cong [Tue, 21 Apr 2009 19:24:52 +0000 (12:24 -0700)]
uml: kill a kconfig warning

Got this warning from Kconfig:

   boolean symbol INPUT tested for 'm'? test forced to 'n'

because INPUT is tristate, not bool.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofrv: insert PCI root bus resources for the MB93090 devel motherboard
David Howells [Tue, 21 Apr 2009 19:24:51 +0000 (12:24 -0700)]
frv: insert PCI root bus resources for the MB93090 devel motherboard

Insert PCI root bus resources for the FRV-based MB93090 development kit
motherboard.  This is required because the CPU's window onto the PCI bus
address space is considerably smaller than the CPU's full address space
and non-PCI devices lie outside of the PCI window that we might want to
access.

Without this patch, the PCI root bus uses the platform-level bus
resources, and these are then confined to the PCI window, thus making
platform_device_add() reject devices outside of this window.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agortc-cmos: fix printk output
Krzysztof Halasa [Tue, 21 Apr 2009 19:24:49 +0000 (12:24 -0700)]
rtc-cmos: fix printk output

With no IRQ available/defined, RTC-CMOS driver prints something like:
rtc0: alarms up to one no, y3k, 114 bytes nvram
                              ^^^^
I guess the following is a bit easier to understand:
rtc0: no alarms, y3k, 114 bytes nvram

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agospi: documentation: emphasise spi_master.setup() semantics
David Brownell [Tue, 21 Apr 2009 19:24:49 +0000 (12:24 -0700)]
spi: documentation: emphasise spi_master.setup() semantics

This is a doc-only patch which I hope will reduce the number of
spi_master controller driver patches starting out with a common
implementation bug.

(As in: almost every spi_master driver I see starts out with its
version of this bug.  Sigh.)

It just re-emphasizes that the setup() method may be called for one
device while a transfer is active on another ...  which means that most
driver implementations shouldn't touch any registers.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMAINTAINERS: add a more searchable string for the H8300 architecture.
Robert P. J. Day [Tue, 21 Apr 2009 19:24:47 +0000 (12:24 -0700)]
MAINTAINERS: add a more searchable string for the H8300 architecture.

Add a parenthesized string of "H8300" for more convenient searchability
in the MAINTAINERS file.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMAINTAINERS: add Matt Mackall to embedded maintainers
Matt Mackall [Tue, 21 Apr 2009 19:24:47 +0000 (12:24 -0700)]
MAINTAINERS: add Matt Mackall to embedded maintainers

Impact: make more work for myself

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agospi: pxa2xx: limit reaches -1
Roel Kluin [Tue, 21 Apr 2009 19:24:46 +0000 (12:24 -0700)]
spi: pxa2xx: limit reaches -1

On line 944 the return value of flush() is considered as a boolean,
but limit reaches -1 upon timeout which evaluates to true.

On 540, 594, 720 the same occurs for wait_ssp_rx_stall()
On 536 the same occurs for wait_dma_channel_stop()
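
The effect is easy to reproduce outside the driver. The sketch below
uses a hypothetical fake_dev in place of the hardware; using a
pre-decrement is one way to make the timeout case return 0, and is shown
here as an assumption rather than as the exact change in this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware: busy for busy_polls polls. */
struct fake_dev {
	int busy_polls;
};

static bool dev_busy(struct fake_dev *d)
{
	if (d->busy_polls > 0) {
		d->busy_polls--;
		return true;
	}
	return false;
}

/*
 * Buggy: on timeout the post-decrement leaves limit at -1, which is
 * non-zero, so `if (wait_buggy(...))` treats the timeout as success.
 */
static int wait_buggy(struct fake_dev *d, int limit)
{
	while (dev_busy(d) && limit--)
		;
	return limit;
}

/* Fixed: the pre-decrement stops at exactly 0 on timeout. */
static int wait_fixed(struct fake_dev *d, int limit)
{
	while (dev_busy(d) && --limit)
		;
	return limit;
}
```

When the device goes idle before the limit runs out, both variants
return the same positive remainder; they differ only on timeout.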

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMAINTAINERS: update KMEMTRACE pattern after file rename
Joe Perches [Tue, 21 Apr 2009 19:24:45 +0000 (12:24 -0700)]
MAINTAINERS: update KMEMTRACE pattern after file rename

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMAINTAINERS: remove include/asm-*/suspend* file patterns
Joe Perches [Tue, 21 Apr 2009 19:24:44 +0000 (12:24 -0700)]
MAINTAINERS: remove include/asm-*/suspend* file patterns

There are no more arches with suspend support using these directories.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopxa2xx_spi: restore DRCMR on resume
Daniel Ribeiro [Tue, 21 Apr 2009 19:24:43 +0000 (12:24 -0700)]
pxa2xx_spi: restore DRCMR on resume

If DMA is enabled, any spi_sync call after suspend/resume would block
forever, because DRCMR is lost on suspend.  This patch restores DRCMR to
the same values set by probe.

Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodrivers/input/serio/hp_sdc.c: fix crash when removing hp_sdc module
Helge Deller [Tue, 21 Apr 2009 19:24:42 +0000 (12:24 -0700)]
drivers/input/serio/hp_sdc.c: fix crash when removing hp_sdc module

On parisc machines, which don't have HIL, removing the hp_sdc module
panics the kernel.  Fix this by returning early in hp_sdc_exit() if no HP
SDC controller was found.

Add functionality to probe for the hp_sdc_mlc kernel module (which takes
care of the upper layer HIL functionality on parisc) after two seconds.
This is needed to get all the other HIL drivers (keyboard, mouse, ...)
automatically loaded by udev later as well.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Frans Pop <elendil@planet.nl>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: use rcu_dereference to access mm->owner
KAMEZAWA Hiroyuki [Tue, 21 Apr 2009 19:24:41 +0000 (12:24 -0700)]
memcg: use rcu_dereference to access mm->owner

mm->owner should be accessed with rcu_dereference().

Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohugetlbfs: return negative error code for bad mount option
Akinobu Mita [Tue, 21 Apr 2009 19:24:05 +0000 (12:24 -0700)]
hugetlbfs: return negative error code for bad mount option

This fixes the following BUG:

  # mount -o size=MM -t hugetlbfs none /huge
  hugetlbfs: Bad value 'MM' for mount option 'size=MM'
  ------------[ cut here ]------------
  kernel BUG at fs/super.c:996!

Due to

BUG_ON(!mnt->mnt_sb);

in vfs_kern_mount().

Also, remove unused #include <linux/quotaops.h>
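
The failure mode can be modelled in userspace as follows.  This is a
sketch under stated assumptions: parse_size() and
caller_proceeds_without_sb() are hypothetical names, and the caller is
assumed to treat only negative return values as errors.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical parser for a "size=" value. bad_value_ret is what it
 * returns for a malformed number: 1 models the bug, -EINVAL the fix.
 */
static int parse_size(const char *val, long *out, int bad_value_ret)
{
	char *end;
	long n = strtol(val, &end, 10);

	if (end == val)
		return bad_value_ret;
	*out = n;
	return 0;
}

/*
 * Caller that only recognises negative values as failure. Returns 1 if
 * it would carry on with no superblock set up (the BUG_ON condition).
 */
static int caller_proceeds_without_sb(const char *val, int bad_value_ret)
{
	long size = 0;
	int have_sb = 0;
	int err = parse_size(val, &size, bad_value_ret);

	if (err == 0)
		have_sb = 1;	/* superblock is set up on success only */
	if (err < 0)
		return 0;	/* error path taken, nothing to trip over */
	return !have_sb;
}
```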

Cc: William Irwin <wli@holomorphy.com>
Cc: <stable@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipmi: add oem message handling
dann frazier [Tue, 21 Apr 2009 19:24:05 +0000 (12:24 -0700)]
ipmi: add oem message handling

Enable userspace to receive messages that a BMC transmits using an OEM
medium.  This is used by the HP iLO2.

Based on code originally written by Patrick Schoeller.

Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipmi: fix statistics counting issues
Corey Minyard [Tue, 21 Apr 2009 19:24:04 +0000 (12:24 -0700)]
ipmi: fix statistics counting issues

Bela Lubkin noticed that the statistics for send IPMB and LAN commands
in the IPMI driver could be incremented even if an error occurred.  Move
the increments to the proper place to avoid this.

Also add some statistics for retransmissions that failed, and some little
helper functions to neaten up the code a little.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Bela Lubkin <blubkin@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipmi: test for event buffer before using
Corey Minyard [Tue, 21 Apr 2009 19:24:03 +0000 (12:24 -0700)]
ipmi: test for event buffer before using

The IPMI driver would attempt to use the event buffer even if that
didn't exist on the BMC.  This patch modifies the IPMI driver to check
for the event buffer's existence before trying to use it.

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipmi: fix platform return check
Corey Minyard [Tue, 21 Apr 2009 19:24:02 +0000 (12:24 -0700)]
ipmi: fix platform return check

The wrong return value is being tested when allocating a platform device
in the IPMI SI code.  Check the right value.

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoclocksource: add enable() and disable() callbacks
Magnus Damm [Tue, 21 Apr 2009 19:24:02 +0000 (12:24 -0700)]
clocksource: add enable() and disable() callbacks

Add enable() and disable() callbacks for clocksources.

This allows us to put unused clocksources in power save mode.  The
functions clocksource_enable() and clocksource_disable() wrap the
callbacks and are inserted in the timekeeping code to enable before use
and disable after switching to a new clocksource.
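
A simplified userspace model of the wrappers is sketched below.  The
struct is cut down to what the example needs; the powered field, the
demo_* callbacks and switch_clocksource() are hypothetical and exist
only for illustration.

```c
#include <stddef.h>

/* Cut-down model; the real struct clocksource has many more fields. */
struct clocksource {
	const char *name;
	int (*enable)(struct clocksource *cs);		/* optional */
	void (*disable)(struct clocksource *cs);	/* optional */
	int powered;					/* hypothetical */
};

/* A clocksource without an enable() callback counts as always enabled. */
static int clocksource_enable(struct clocksource *cs)
{
	return cs->enable ? cs->enable(cs) : 0;
}

static void clocksource_disable(struct clocksource *cs)
{
	if (cs->disable)
		cs->disable(cs);
}

/* Hypothetical callbacks that just track a power state. */
static int demo_enable(struct clocksource *cs)
{
	cs->powered = 1;
	return 0;
}

static void demo_disable(struct clocksource *cs)
{
	cs->powered = 0;
}

/* Enable the new clocksource before use, disable the old one afterwards. */
static struct clocksource *switch_clocksource(struct clocksource *old,
					      struct clocksource *new_cs)
{
	if (clocksource_enable(new_cs))
		return old;	/* could not enable, keep the old one */
	clocksource_disable(old);
	return new_cs;
}
```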

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoclocksource: pass clocksource to read() callback
Magnus Damm [Tue, 21 Apr 2009 19:24:00 +0000 (12:24 -0700)]
clocksource: pass clocksource to read() callback

Pass clocksource pointer to the read() callback for clocksources.  This
allows us to share the callback between multiple instances.
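
To show why the extra argument helps, here is a minimal sketch in which
two timer instances share one read() callback.  struct demo_timer,
cs_to_timer() and demo_read() are hypothetical, and the counter field
stands in for a hardware register.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down model of the structure, with only the read() callback. */
struct clocksource {
	uint64_t (*read)(struct clocksource *cs);
};

/* Hypothetical per-instance driver data with an embedded clocksource. */
struct demo_timer {
	uint64_t counter;
	struct clocksource cs;
};

/* Recover the enclosing demo_timer from its embedded clocksource. */
#define cs_to_timer(p) \
	((struct demo_timer *)((char *)(p) - offsetof(struct demo_timer, cs)))

/* One callback serves every instance, selected by the pointer passed in. */
static uint64_t demo_read(struct clocksource *cs)
{
	return cs_to_timer(cs)->counter;
}
```

Without the pointer, each instance would need its own callback or a
global to find its data.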

[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopxafb: lcsr1 is unused without CONFIG_FB_PXA_OVERLAY
Denis V. Lunev [Tue, 21 Apr 2009 19:23:59 +0000 (12:23 -0700)]
pxafb: lcsr1 is unused without CONFIG_FB_PXA_OVERLAY

Fixes the warning:

  drivers/video/pxafb.c: In function 'pxafb_handle_irq':
  drivers/video/pxafb.c:1442: warning: unused variable 'lcsr1'

[akpm@linux-foundation.org: save an ifdef]
Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoasiliantfb: add missing return statement
Vlada Peric [Tue, 21 Apr 2009 19:23:59 +0000 (12:23 -0700)]
asiliantfb: add missing return statement

Commit 032220ba (asiliantfb: fix cmap memory leaks) changed the function
init_asiliant from void to int, resulting in the following compile warning:

  drivers/video/asiliantfb.c: In function `init_asiliant':
  drivers/video/asiliantfb.c:536: warning: control reaches end of non-void function

Fix the warning by returning 0.

Signed-off-by: Vlada Peric <vlada.peric@gmail.com>
Cc: Andres Salomon <dilinger@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogo7007: Convert to the new i2c device binding model
Jean Delvare [Tue, 21 Apr 2009 19:47:22 +0000 (21:47 +0200)]
go7007: Convert to the new i2c device binding model

Move the go7007 driver away from the legacy i2c binding model, which
is going away really soon now.

The I2C addresses of the audio and video chips in s2250-board didn't
look quite right, apparently they were left-aligned values when Linux
wants right-aligned values, so I fixed them too.
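
For reference, converting a left-aligned address (with the R/W bit in
bit 0) to the right-aligned 7-bit form is a one-bit shift.  The helper
name below is made up for this sketch.

```c
#include <stdint.h>

/*
 * Convert a left-aligned 8-bit I2C address, where bit 0 is the R/W bit,
 * to the right-aligned 7-bit form.
 */
static uint8_t i2c_addr_8bit_to_7bit(uint8_t addr8)
{
	return addr8 >> 1;
}
```

For example, a left-aligned 0x34 becomes 0x1a.  The value is purely
illustrative and is not claimed to be one of the s2250-board addresses.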

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoBtrfs: fix btrfs fallocate oops and deadlock
Chris Mason [Tue, 21 Apr 2009 15:53:38 +0000 (11:53 -0400)]
Btrfs: fix btrfs fallocate oops and deadlock

Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock.  Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.

When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer.  This was triggering an assertion and
oops because the lock is supposed to be held.

The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run.  btrfs_del_item takes care of dirtying things, so the solution is
to skip the btrfs_mark_buffer_dirty call in this case.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes
Linus Torvalds [Tue, 21 Apr 2009 15:27:30 +0000 (08:27 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
  GFS2: Fix page_mkwrite() return code
  GFS2: Clear dirty bit at end of inode glock sync

15 years agoMerge branch 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal...
Linus Torvalds [Tue, 21 Apr 2009 15:16:14 +0000 (08:16 -0700)]
Merge branch 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Fix mmap2 for handling differing PAGE_SIZEs.
  sh: sh7723: Don't default enable the RTC clock.
  sh: sh7722: Don't default enable the RTC clock.
  rtc: rtc-sh: clock framework support.

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Tue, 21 Apr 2009 14:56:17 +0000 (07:56 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  reiserfs: fix j_last_flush_trans_id type
  fs: Mark get_filesystem_list() as __init function.
  kill vfs_stat_fd / vfs_lstat_fd
  Separate out common fstatat code into vfs_fstatat
  ecryptfs: use memdup_user()
  ncpfs: use memdup_user()
  xfs: use memdup_user()
  sysfs: use memdup_user()
  btrfs: use memdup_user()
  xattr: use memdup_user()
  autofs4: use memchr() in invalid_string()
  Documentation/filesystems: remove out of date reference to BKL being held
  Fix i_mutex vs. readdir handling in nfsd
  fs/compat_ioctl: fix build when !BLOCK
  Fix autofs_expire()
  No need for crossing to mountpoint in audit_tag_tree()
  Safer nfsd_cross_mnt()
  Touch all affected namespaces on propagation of mount
  Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD

15 years agoFix SYSCALL_ALIAS for older MIPS assembler
Thomas Bogendoerfer [Tue, 21 Apr 2009 11:44:13 +0000 (13:44 +0200)]
Fix SYSCALL_ALIAS for older MIPS assembler

Older MIPS assemblers don't support .set for defining aliases.
Using = works for old and new assemblers.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoNFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs
Trond Myklebust [Mon, 20 Apr 2009 18:58:35 +0000 (14:58 -0400)]
NFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs

Commit ae46141ff08f1965b17c531b571953c39ce8b9e2 (NFSv3: Fix posix ACL code)
introduces a bug in the calculation of the XDR header iovec. In the case
where we are inlining the acls, we need to adjust the length of the iovec
req->rq_svec, in addition to adjusting the total buffer length.

Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'sh/stable-updates' into sh/for-2.6.30
Paul Mundt [Tue, 21 Apr 2009 08:12:16 +0000 (17:12 +0900)]
Merge branch 'sh/stable-updates' into sh/for-2.6.30

15 years agoreiserfs: fix j_last_flush_trans_id type
Al Viro [Tue, 21 Apr 2009 03:29:41 +0000 (23:29 -0400)]
reiserfs: fix j_last_flush_trans_id type

Conversion in commit 600ed41675d8c384519d8f0b3c76afed39ef2f4b had missed
that one, but converted format from %lu to %u.  As a result,
/proc/..../journal got buggered on 64bit boxen.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agofs: Mark get_filesystem_list() as __init function.
Tetsuo Handa [Thu, 9 Apr 2009 11:17:52 +0000 (20:17 +0900)]
fs: Mark get_filesystem_list() as __init function.

"int get_filesystem_list(char * buf)" is called by only
"static void __init get_fs_names(char *page)".
We can mark get_filesystem_list() as "__init".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agokill vfs_stat_fd / vfs_lstat_fd
Christoph Hellwig [Wed, 8 Apr 2009 20:34:03 +0000 (16:34 -0400)]
kill vfs_stat_fd / vfs_lstat_fd

There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with
Oleg's vfs_fstatat.  Use vfs_fstatat for the few cases having the
directory fd, and switch all others to vfs_stat / vfs_lstat.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoSeparate out common fstatat code into vfs_fstatat
Oleg Drokin [Wed, 8 Apr 2009 16:05:42 +0000 (20:05 +0400)]
Separate out common fstatat code into vfs_fstatat

This is a version incorporating Christoph's suggestion.

Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoecryptfs: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:09:29 +0000 (15:09 +0800)]
ecryptfs: use memdup_user()

Remove open-coded memdup_user().
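
The shape of the cleanup can be sketched with a userspace analogue.
Everything here is an assumption made for illustration: demo_memdup()
stands in for the helper, and struct msg and the two setters are
hypothetical.  The kernel's memdup_user() copies from user space and
reports failure through an error pointer rather than NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the helper: duplicate len bytes, NULL on failure. */
static void *demo_memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Hypothetical object that keeps a private copy of a caller's buffer. */
struct msg {
	size_t len;
	char *data;
};

/* Before: the allocate-and-copy sequence is open-coded at the call site. */
static int msg_set_open_coded(struct msg *m, const char *buf, size_t len)
{
	char *p = malloc(len);

	if (!p)
		return -1;
	memcpy(p, buf, len);
	m->data = p;
	m->len = len;
	return 0;
}

/* After: the helper replaces the allocation and the copy. */
static int msg_set(struct msg *m, const char *buf, size_t len)
{
	char *p = demo_memdup(buf, len);

	if (!p)
		return -1;
	m->data = p;
	m->len = len;
	return 0;
}
```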

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoncpfs: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:08:53 +0000 (15:08 +0800)]
ncpfs: use memdup_user()

Remove open-coded memdup_user()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoxfs: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:08:04 +0000 (15:08 +0800)]
xfs: use memdup_user()

Remove open-coded memdup_user()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agosysfs: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:07:30 +0000 (15:07 +0800)]
sysfs: use memdup_user()

Remove open-coded memdup_user().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agobtrfs: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:06:54 +0000 (15:06 +0800)]
btrfs: use memdup_user()

Remove open-coded memdup_user().

Note that this changes some GFP_NOFS allocations to GFP_KERNEL: since
copy_from_user() may cause a page fault, it's pointless to pass GFP_NOFS to
kmalloc().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoxattr: use memdup_user()
Li Zefan [Wed, 8 Apr 2009 07:06:12 +0000 (15:06 +0800)]
xattr: use memdup_user()

Remove open-coded memdup_user()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoautofs4: use memchr() in invalid_string()
Al Viro [Tue, 7 Apr 2009 15:12:46 +0000 (11:12 -0400)]
autofs4: use memchr() in invalid_string()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoDocumentation/filesystems: remove out of date reference to BKL being held
Adrian McMenamin [Tue, 21 Apr 2009 01:38:28 +0000 (18:38 -0700)]
Documentation/filesystems: remove out of date reference to BKL being held

Documentation/filesystems/vfs.txt incorrectly states that the kernel is
locked during the call to statfs (Documentation/filesystems/Locking
correctly says it is not). This patch removes the offending sentence.

remove reference to BKL being held in statfs

Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoFix i_mutex vs. readdir handling in nfsd
David Woodhouse [Mon, 20 Apr 2009 22:18:37 +0000 (23:18 +0100)]
Fix i_mutex vs. readdir handling in nfsd

Commit 14f7dd63 ("Copy XFS readdir hack into nfsd code") introduced a
bug to generic code which had been extant for a long time in the XFS
version -- it started to call through into lookup_one_len() and hence
into the file systems' ->lookup() methods without i_mutex held on the
directory.

This patch fixes it by locking the directory's i_mutex again before
calling the filldir functions. The original deadlocks which commit
14f7dd63 was designed to avoid are still avoided, because they were due
to fs-internal locking, not i_mutex.

While we're at it, fix the return type of nfsd_buffered_readdir() which
should be a __be32 not an int -- it's an NFS errno, not a Linux errno.
And return nfserrno(-ENOMEM) when allocation fails, not just -ENOMEM.
Sparse would have caught that, if it wasn't so busy bitching about
__cold__.

Commit 05f4f678 ("nfsd4: don't do lookup within readdir in recovery
code") introduced a similar problem with calling lookup_one_len()
without i_mutex, which this patch also addresses. To fix that, it was
necessary to fix the called functions so that they expect i_mutex to be
held; that part was done by J. Bruce Fields.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Umm-I-can-live-with-that-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
LKML-Reference: <8036.1237474444@jrobl>
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agofs/compat_ioctl: fix build when !BLOCK
Alexander Beregalov [Mon, 20 Apr 2009 08:23:02 +0000 (12:23 +0400)]
fs/compat_ioctl: fix build when !BLOCK

In file included from fs/compat_ioctl.c:61:
include/linux/loop.h:59: error: field 'lo_bio_list' has incomplete type

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoFix autofs_expire()
Al Viro [Sat, 18 Apr 2009 15:19:26 +0000 (11:19 -0400)]
Fix autofs_expire()

mnt should remain the same for all iterations through the list;
as it is, if we have a busy mount, mnt follows into it and isn't
restored for the next iteration.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoNo need for crossing to mountpoint in audit_tag_tree()
Al Viro [Sat, 18 Apr 2009 07:25:41 +0000 (03:25 -0400)]
No need for crossing to mountpoint in audit_tag_tree()

is_under() will DTRT anyway.  And yes, is_subdir() behaviour
is intentional.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>