Kent Overstreet [Tue, 7 May 2013 23:18:49 +0000 (16:18 -0700)]
aio: use cancellation list lazily
Cancelling kiocbs requires adding them to a per kioctx linked list,
which is one of the few things we need to take the kioctx lock for in
the fast path. But most kiocbs can't be cancelled - so if we just do
this lazily, we can avoid quite a bit of locking overhead.
While we're at it, instead of using a flag bit switch to using ki_cancel
itself to indicate that a kiocb has been cancelled/completed. This lets
us get rid of ki_flags entirely.
[akpm@linux-foundation.org: remove buggy BUG()]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:47 +0000 (16:18 -0700)]
aio: use flush_dcache_page()
This wasn't causing problems before because it's not needed on x86, but
it is needed on other architectures.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:45 +0000 (16:18 -0700)]
aio: make aio_read_evt() more efficient, convert to hrtimers
Previously, aio_read_event() pulled a single completion off the
ringbuffer at a time, locking and unlocking each time. Change it to
pull off as many events as it can at a time, and copy them directly to
userspace.
This also fixes a bug where if copying the event to userspace failed,
we'd lose the event.
Also convert it to wait_event_interruptible_hrtimeout(), which
simplifies it quite a bit.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:43 +0000 (16:18 -0700)]
wait: add wait_event_hrtimeout()
Analagous to wait_event_timeout() and friends, this adds
wait_event_hrtimeout() and wait_event_interruptible_hrtimeout().
Note that unlike the versions that use regular timers, these don't
return the amount of time remaining when they return - instead, they
return 0 or -ETIME if they timed out. because I was uncomfortable with
the semantics of doing it the other way (that I could get it right,
anyways).
If the timer expires, there's no real guarantee that expire_time -
current_time would be <= 0 - due to timer slack certainly, and I'm not
sure I want to know the implications of the different clock bases in
hrtimers.
If the timer does expire and the code calculates that the time remaining
is nonnegative, that could be even worse if the calling code then reuses
that timeout. Probably safer to just return 0 then, but I could imagine
weird bugs or at least unintended behaviour arising from that too.
I came to the conclusion that if other users end up actually needing the
amount of time remaining, the sanest thing to do would be to create a
version that uses absolute timeouts instead of relative.
[akpm@linux-foundation.org: fix description of `timeout' arg]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:41 +0000 (16:18 -0700)]
aio: refcounting cleanup
The usage of ctx->dead was fubar - it makes no sense to explicitly check
it all over the place, especially when we're already using RCU.
Now, ctx->dead only indicates whether we've dropped the initial
refcount. The new teardown sequence is:
set ctx->dead
hlist_del_rcu();
synchronize_rcu();
Now we know no system calls can take a new ref, and it's safe to drop
the initial ref:
put_ioctx();
We also need to ensure there are no more outstanding kiocbs. This was
done incorrectly - it was being done in kill_ctx(), and before dropping
the initial refcount. At this point, other syscalls may still be
submitting kiocbs!
Now, we cancel and wait for outstanding kiocbs in free_ioctx(), after
kioctx->users has dropped to 0 and we know no more iocbs could be
submitted.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:39 +0000 (16:18 -0700)]
aio: make aio_put_req() lockless
Freeing a kiocb needed to touch the kioctx for three things:
* Pull it off the reqs_active list
* Decrementing reqs_active
* Issuing a wakeup, if the kioctx was in the process of being freed.
This patch moves these to aio_complete(), for a couple reasons:
* aio_complete() already has to issue the wakeup, so if we drop the
kioctx refcount before aio_complete does its wakeup we don't have to
do it twice.
* aio_complete currently has to take the kioctx lock, so it makes sense
for it to pull the kiocb off the reqs_active list too.
* A later patch is going to change reqs_active to include unreaped
completions - this will mean allocating a kiocb doesn't have to look
at the ringbuffer. So taking the decrement of reqs_active out of
kiocb_free() is useful prep work for that patch.
This doesn't really affect cancellation, since existing (usb) code that
implements a cancel function still calls aio_complete() - we just have
to make sure that aio_complete does the necessary teardown for cancelled
kiocbs.
It does affect code paths where we free kiocbs that were never
submitted; they need to decrement reqs_active and pull the kiocb off the
reqs_active list. This occurs in two places: kiocb_batch_free(), which
is going away in a later patch, and the error path in io_submit_one.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:37 +0000 (16:18 -0700)]
aio: do fget() after aio_get_req()
aio_get_req() will fail if we have the maximum number of requests
outstanding, which depending on the application may not be uncommon. So
avoid doing an unnecessary fget().
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:35 +0000 (16:18 -0700)]
aio: dprintk() -> pr_debug()
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:33 +0000 (16:18 -0700)]
aio: move private stuff out of aio.h
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:31 +0000 (16:18 -0700)]
aio: add kiocb_cancel()
Minor refactoring, to get rid of some duplicated code
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kent Overstreet [Tue, 7 May 2013 23:18:29 +0000 (16:18 -0700)]
aio: kill return value of aio_complete()
Nothing used the return value, and it probably wasn't possible to use it
safely for the locked versions (aio_complete(), aio_put_req()). Just
kill it.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Acked-by: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zach Brown [Tue, 7 May 2013 23:18:27 +0000 (16:18 -0700)]
char: add aio_{read,write} to /dev/{null,zero}
These are handy for measuring the cost of the aio infrastructure with
operations that do very little and complete immediately.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zach Brown [Tue, 7 May 2013 23:18:25 +0000 (16:18 -0700)]
aio: remove retry-based AIO
This removes the retry-based AIO infrastructure now that nothing in tree
is using it.
We want to remove retry-based AIO because it is fundemantally unsafe.
It retries IO submission from a kernel thread that has only assumed the
mm of the submitting task. All other task_struct references in the IO
submission path will see the kernel thread, not the submitting task.
This design flaw means that nothing of any meaningful complexity can use
retry-based AIO.
This removes all the code and data associated with the retry machinery.
The most significant benefit of this is the removal of the locking
around the unused run list in the submission path.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Zach Brown <zab@redhat.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zach Brown [Tue, 7 May 2013 23:18:23 +0000 (16:18 -0700)]
gadget: remove only user of aio retry
This removes the only in-tree user of aio retry. This will let us
remove the retry code from the aio core.
Removing retry is relatively easy as the USB gadget wasn't using it to
retry IOs at all. It always fully submitted the IO in the context of
the initial io_submit() call. It only used the AIO retry facility to
get the submitter's mm context for copying the result of a read back to
user space. This is easy to implement with use_mm() and a work struct,
much like kvm does with async_pf_execute() for get_user_pages().
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zach Brown [Tue, 7 May 2013 23:18:21 +0000 (16:18 -0700)]
aio: remove dead code from aio.h
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zach Brown [Tue, 7 May 2013 23:18:19 +0000 (16:18 -0700)]
mm: remove old aio use_mm() comment
Bunch of performance improvements and cleanups Zach Brown and I have
been working on. The code should be pretty solid at this point, though
it could of course use more review and testing.
The results in my testing are pretty impressive, particularly when an
ioctx is being shared between multiple threads. In my crappy synthetic
benchmark, with 4 threads submitting and one thread reaping completions,
I saw overhead in the aio code go from ~50% (mostly ioctx lock
contention) to low single digits. Performance with ioctx per thread
improved too, but I'd have to rerun those benchmarks.
The reason I've been focused on performance when the ioctx is shared is
that for a fair number of real world completions, userspace needs the
completions aggregated somehow - in practice people just end up
implementing this aggregation in userspace today, but if it's done right
we can do it much more efficiently in the kernel.
Performance wise, the end result of this patch series is that submitting
a kiocb writes to _no_ shared cachelines - the penalty for sharing an
ioctx is gone there. There's still going to be some cacheline
contention when we deliver the completions to the aio ringbuffer (at
least if you have interrupts being delivered on multiple cores, which
for high end stuff you do) but I have a couple more patches not in this
series that implement coalescing for that (by taking advantage of
interrupt coalescing). With that, there's basically no bottlenecks or
performance issues to speak of in the aio code.
This patch:
use_mm() is used in more places than just aio. There's no need to mention
callers when describing the function.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 7 May 2013 23:18:18 +0000 (16:18 -0700)]
mm/vmalloc.c: add vfree comment
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Tue, 7 May 2013 23:18:17 +0000 (16:18 -0700)]
remove unused random32() and srandom32()
After finishing a naming transition, remove unused backward
compatibility wrapper macros
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 7 May 2013 23:18:16 +0000 (16:18 -0700)]
drivers/infiniband/hw: rename random32() to prandom_u32()
Use preferable function name which implies using a pseudo-random number
generator.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Tue, 7 May 2013 23:18:15 +0000 (16:18 -0700)]
drivers/net: rename random32() to prandom_u32()
Use preferable function name which implies using a pseudo-random number
generator.
[akpm@linux-foundation.org: convert team_mode_random.c]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch>
Acked-by: Bing Zhao <bzhao@marvell.com> [mwifiex]
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: Jean-Paul Roubelat <jpr@f6fbb.org>
Cc: Bing Zhao <bzhao@marvell.com>
Cc: Brett Rudley <brudley@broadcom.com>
Cc: Arend van Spriel <arend@broadcom.com>
Cc: "Franky (Zhenhui) Lin" <frankyl@broadcom.com>
Cc: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 7 May 2013 23:18:13 +0000 (16:18 -0700)]
hugetlbfs: fix mmap failure in unaligned size request
The current kernel returns -EINVAL unless a given mmap length is
"almost" hugepage aligned. This is because in sys_mmap_pgoff() the
given length is passed to vm_mmap_pgoff() as it is without being aligned
with hugepage boundary.
This is a regression introduced in commit
40716e29243d ("hugetlbfs: fix
alignment of huge page requests"), where alignment code is pushed into
hugetlb_file_setup() and the variable len in caller side is not changed.
To fix this, this patch partially reverts that commit, and adds
alignment code in caller side. And it also introduces hstate_sizelog()
in order to get proper hstate to specified hugepage size.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881
[akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: <iceman_dvd@yahoo.com>
Cc: Steven Truelove <steven.truelove@utoronto.ca>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhao Hongjiang [Tue, 7 May 2013 23:18:12 +0000 (16:18 -0700)]
parisc: remove the second argument of kmap_atomic()
kmap_atomic() requires only one argument now.
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lucas Stach [Tue, 7 May 2013 23:18:11 +0000 (16:18 -0700)]
drivers/rtc/rtc-rs5c372.c: add R2221T/L variant to the driver
Register layout is the same, so just add the variant to the appropriate
places.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 7 May 2013 23:18:10 +0000 (16:18 -0700)]
include/linux/mm.h: complete the mm_walk definition
That nameless-function-arguments thing drives me batty. Fix.
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 7 May 2013 23:18:09 +0000 (16:18 -0700)]
mm, memcg: add rss_huge stat to memory.stat
This exports the amount of anonymous transparent hugepages for each
memcg via the new "rss_huge" stat in memory.stat. The units are in
bytes.
This is helpful to determine the hugepage utilization for individual
jobs on the system in comparison to rss and opportunities where
MADV_HUGEPAGE may be helpful.
The amount of anonymous transparent hugepages is also included in "rss"
for backwards compatibility.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiang Liu [Tue, 7 May 2013 23:18:08 +0000 (16:18 -0700)]
mm/SPARC: use common help functions to free reserved pages
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 7 May 2013 15:42:20 +0000 (08:42 -0700)]
Merge branch 'slab/for-linus' of git://git./linux/kernel/git/penberg/linux
Pull slab changes from Pekka Enberg:
"The bulk of the changes are more slab unification from Christoph.
There's also few fixes from Aaron, Glauber, and Joonsoo thrown into
the mix."
* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (24 commits)
mm, slab_common: Fix bootstrap creation of kmalloc caches
slab: Return NULL for oversized allocations
mm: slab: Verify the nodeid passed to ____cache_alloc_node
slub: tid must be retrieved from the percpu area of the current processor
slub: Do not dereference NULL pointer in node_match
slub: add 'likely' macro to inc_slabs_node()
slub: correct to calculate num of acquired objects in get_partial_node()
slub: correctly bootstrap boot caches
mm/sl[au]b: correct allocation type check in kmalloc_slab()
slab: Fixup CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections
slab: Handle ARCH_DMA_MINALIGN correctly
slab: Common definition for kmem_cache_node
slab: Rename list3/l3 to node
slab: Common Kmalloc cache determination
stat: Use size_t for sizes instead of unsigned
slab: Common function to create the kmalloc array
slab: Common definition for the array of kmalloc caches
slab: Common constants for kmalloc boundaries
slab: Rename nodelists to node
slab: Common name for the per node structures
...
Linus Torvalds [Tue, 7 May 2013 14:59:19 +0000 (07:59 -0700)]
Merge branch 'misc' of git://git./linux/kernel/git/mmarek/kbuild
Pull misc kbuild updates from Michal Marek:
"Non-critical kbuild changes:
- make coccicheck improvements, but no new semantic patches this time
- make rpm improvements
- make tar-pkg change to include the architecture in the filename.
This is a deliberate incompatibility, but nobody has complained so
far and it is useful if you build for different architectures. It
also matches what the deb-pkg and rpm-pkg targets produce.
- kbuild documentation fix"
* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
rpm-pkg: Remove pointless set -e statements
rpm-pkg: Always regenerate the specfile
rpm-pkg: Do not write to the parent directory
rpm-pkg: Do not package the whole source directory
buildtar: Add ARCH to the archive name
Coccinelle: Fix patch output when coccicheck is used with M= and C=
Coccinelle: Add support to the SPFLAGS variable
Coccinelle: Cleanup the setting of the FLAGS and OPTIONS variables
Coccinelle: Restore coccicheck verbosity in ONLINE mode (C=1 or C=2)
scripts/package/Makefile: compare objtree with srctree instead of test KBUILD_OUTPUT
doc: change example to existing Makefile fragment
scripts/tags.sh: Add magic for OFFSET and DEFINE
Linus Torvalds [Tue, 7 May 2013 14:58:05 +0000 (07:58 -0700)]
Merge branch 'kconfig' of git://git./linux/kernel/git/mmarek/kbuild
Pull kconfig updates from Michal Marek:
- use pkg-config to detect curses libraries
- clean up the way curses headers are searched
- Some randconfig fixes, of which one had to be reverted
- KCONFIG_SEED for randconfig debugging
- memuconfig memory leak plugged
- menuconfig > breadcrumbs > navigation
- xconfig compilation fix
- Other minor fixes
* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kconfig: fix lists definition for C++
Revert "kconfig: fix randomising choice entries in presence of KCONFIG_ALLCONFIG"
kconfig: implement KCONFIG_PROBABILITY for randconfig
kconfig: allow specifying the seed for randconfig
kconfig: fix randomising choice entries in presence of KCONFIG_ALLCONFIG
kconfig: do not override symbols already set
kconfig: fix randconfig tristate detection
kconfig/lxdialog: rationalise the include paths where to find {.n}curses{,w}.h
menuconfig: Add "breadcrumbs" navigation aid
menuconfig: Fix memory leak introduced by jump keys feature
merge_config.sh: Avoid creating unnessary source softlinks
kconfig: optionally use pkg-config to detect ncurses libs
menuconfig: optionally use pkg-config to detect ncurses libs
Linus Torvalds [Tue, 7 May 2013 14:56:26 +0000 (07:56 -0700)]
Merge branch 'kbuild' of git://git./linux/kernel/git/mmarek/kbuild
Pull kbuild changes from Michal Marek:
"Kbuild commits for v3.10-rc1:
- Fix make mrproper after mod/file2alias rework
- Fix ld-option Makefile function
- Rewrite headers_install to shell to drop Perl dependency.
There are some more patches I have to look at, so I might send another
pull request later. Or just queue them for 3.11."
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Fix cleaning in scripts/mod
headers_install.pl: convert to headers_install.sh
kbuild: fix ld-option function
Li Zefan [Tue, 7 May 2013 13:56:54 +0000 (15:56 +0200)]
menuconfig: fix NULL pointer dereference when searching a symbol
Searching for PPC_EFIKA results in a segmentation fault, and it's
because get_symbol_prop() returns NULL.
In this case CONFIG_PPC_EFIKA is defined in arch/powerpc/platforms/
52xx/Kconfig, so it won't be parsed if ARCH!=PPC, but menuconfig knows
this symbol when it parses sound/soc/fsl/Kconfig:
config SND_MPC52xx_SOC_EFIKA
tristate "SoC AC97 Audio support for bbplan Efika and STAC9766"
depends on PPC_EFIKA
This bug was introduced by commit
bcdedcc1afd6 ("menuconfig: print more
info for symbol without prompts").
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Tested-by: Libo Chen <libo.chen@huawei.com>
Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bruce Allan [Tue, 7 May 2013 05:52:47 +0000 (22:52 -0700)]
e1000e: fix scheduling while atomic bug
A scheduling while atomic bug was introduced recently (by commit
ce43a2168c59: "e1000e: cleanup USLEEP_RANGE checkpatch checks").
Revert the particular instance of usleep_range() which causes the bug.
Reported-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Tue, 7 May 2013 06:19:47 +0000 (09:19 +0300)]
Merge branch 'slab/next' into slab/for-linus
Linus Torvalds [Mon, 6 May 2013 22:51:10 +0000 (15:51 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"Just a small pile of fixes"
1) Fix race conditions in IP fragmentation LRU list handling, from
Konstantin Khlebnikov.
2) vfree() is no longer verboten in interrupts, so deferring is
pointless, from Al Viro.
3) Conversion from mutex to semaphore in netpoll left trylock test
inverted, caught by Dan Carpenter.
4) 3c59x uses wrong base address when releasing regions, from Sergei
Shtylyov.
5) Bounds checking in TIPC from Dan Carpenter.
6) Fastopen cookies should not be expired as aggressively as other TCP
metrics. From Eric Dumazet.
7) Fix retrieval of MAC address in ibmveth, from Ben Herrenschmidt.
8) Don't use "u16" in virtio user headers, from Stephen Hemminger
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tipc: potential divide by zero in tipc_link_recv_fragment()
tipc: add a bounds check in link_recv_changeover_msg()
net/usb: new driver for RTL8152
3c59x: fix freeing nonexistent resource on driver unload
netpoll: inverted down_trylock() test
rps_dev_flow_table_release(): no need to delay vfree()
fib_trie: no need to delay vfree()
net: frag, fix race conditions in LRU list maintenance
tcp: do not expire TCP fastopen cookies
net/eth/ibmveth: Fixup retrieval of MAC address
virtio: don't expose u16 in userspace api
Linus Torvalds [Mon, 6 May 2013 22:41:42 +0000 (15:41 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/cooloney/linux-leds
Pull LED subsystem updates from Bryan Wu:
- move LED trigger drivers into a new directory
- lp55xx common driver updates
- other led drivers updates and bug fixing
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: leds-asic3: switch to using SIMPLE_DEV_PM_OPS
leds: leds-bd2802: add CONFIG_PM_SLEEP to suspend/resume functions
leds: lp55xx: configure the clock detection
leds: lp55xx: use common clock framework when external clock is used
leds: leds-ns2: fix oops at module removal
leds: leds-pwm: Defer led_pwm_set() if PWM can sleep
leds: lp55xx: fix the sysfs read operation
leds: lm355x, lm3642: support camera LED triggers for flash and torch
leds: add camera LED triggers
leds: trigger: use inline functions instead of macros
leds: tca6507: Use of_match_ptr() macro
leds: wm8350: Complain if we fail to reenable DCDC
leds: renesas: set gpio_request_one() flags param correctly
leds: leds-ns2: set devm_gpio_request_one() flags param correctly
leds: leds-lt3593: set devm_gpio_request_one() flags param correctly
leds: leds-bd2802: remove erroneous __exit annotation
leds: atmel-pwm: remove erroneous __exit annotation
leds: move LED trigger drivers into new subdirectory
leds: add new LP5562 LED driver
Linus Torvalds [Mon, 6 May 2013 22:40:55 +0000 (15:40 -0700)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux
Pull GPIO changes from Grant Likely:
"The usual selection of bug fixes and driver updates for GPIO. Nothing
really stands out except the addition of the GRGPIO driver and some
enhacements to ACPI support"
I'm pulling this despite the earlier mess. Let's hope it compiles these
days.
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux: (46 commits)
gpio: grgpio: Add irq support
gpio: grgpio: Add device driver for GRGPIO cores
gpiolib-acpi: introduce acpi_get_gpio_by_index() helper
GPIO: gpio-generic: remove kfree() from bgpio_remove call
gpio / ACPI: Handle ACPI events in accordance with the spec
gpio: lpc32xx: Fix off-by-one valid range checking for bank
gpio: mcp23s08: convert driver to DT
gpio/omap: force restore if context loss is not detectable
gpio/omap: optimise interrupt service routine
gpio/omap: remove extra context restores in *_runtime_resume()
gpio/omap: free irq domain in probe() failure paths
gpio: gpio-generic: Add 16 and 32 bit big endian byte order support
gpio: samsung: Add terminating entry for exynos_pinctrl_ids
gpio: mvebu: add dbg_show function
MAX7301 GPIO: Do not force SPI speed when using OF Platform
gpio: gpio-tps65910.c: fix checkpatch error
gpio: gpio-timberdale.c: fix checkpatch error
gpio: gpio-tc3589x.c: fix checkpatch errors
gpio: gpio-stp-xway.c: fix checkpatch error
gpio: gpio-sch.c: fix checkpatch error
...
Linus Torvalds [Mon, 6 May 2013 22:32:36 +0000 (15:32 -0700)]
Merge tag 'for-3.10-rc1' of git://gitorious.org/linux-pwm/linux-pwm
Pull pwm changes from Thierry Reding:
"Nothing very exciting this time around. A couple of bug fixes and a
lot of cleanup across the board. The DaVinci 8xx family of SoCs now
use the same driver as the AM33xx family.
Many thanks to Axel Lin and Jingoo Han who have done a great job
fixing various bugs and inconsistencies."
* tag 'for-3.10-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (27 commits)
pwm: lpc32xx: Don't change PWM_ENABLE bit in lpc32xx_pwm_config
pwm: lpc32xx: Properly set PWM_ENABLE bit in lpc32xx_pwm_[enable|disable]
pwm: Constify OF match tables
pwm: pwm-tiehrpwm: Update device-tree binding document
pwm: pwm-tiecap: Update device-tree binding document
pwm: puv3: Remove unused enabled filed from struct puv3_pwm_chip
pwm: pxa: Remove PWM_ID_BASE macro
pwm: spear: Remove unused *dev from struct spear_pwm_chip
pwm: mxs: Remove unused *dev from struct mxs_pwm_chip
pwm: twl: Return proper error if twl6030_pwm_enable() fails
pwm: pxa: Remove clk_enabled field from struct pxa_pwm_chip
pwm: imx: Remove enabled field from struct imx_chip
pwm: twl: Add .owner to struct pwm_ops
pwm: twl-led: Add .owner to struct pwm_ops
pwm: atmel-tcb: Add .owner to struct pwm_ops
pwm: ab8500: Add .owner to struct pwm_ops
pwm: spear: Fix checking return value of clk_enable() and clk_prepare()
pwm: tiehrpwm: Staticize non-exported symbols
pwm: tiecap: Staticize non-exported symbols
pwm: ab8500: Fix trivial typo in dev_err message
...
Linus Torvalds [Mon, 6 May 2013 21:59:13 +0000 (14:59 -0700)]
Merge tag 'iommu-updates-v3.10' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"The updates are mostly about the x86 IOMMUs this time.
Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a
PPC platform) and an extension to the IOMMU group interface.
On the x86 side this includes a workaround for VT-d to disable
interrupt remapping on broken chipsets. On the AMD-Vi side the most
important new feature is a kernel command-line interface to override
broken information in IVRS ACPI tables and get interrupt remapping
working this way.
Besides that there are small fixes all over the place."
* tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (24 commits)
iommu/tegra: Fix printk formats for dma_addr_t
iommu: Add a function to find an iommu group by id
iommu/vt-d: Remove warning for HPET scope type
iommu: Move swap_pci_ref function to drivers/iommu/pci.h.
iommu/vt-d: Disable translation if already enabled
iommu/amd: fix error return code in early_amd_iommu_init()
iommu/AMD: Per-thread IOMMU Interrupt Handling
iommu: Include linux/err.h
iommu/amd: Workaround for ERBT1312
iommu/amd: Document ivrs_ioapic and ivrs_hpet parameters
iommu/amd: Don't report firmware bugs with cmd-line ivrs overrides
iommu/amd: Add ioapic and hpet ivrs override
iommu/amd: Add early maps for ioapic and hpet
iommu/amd: Extend IVRS special device data structure
iommu/amd: Move add_special_device() to __init
iommu: Fix compile warnings with forward declarations
iommu/amd: Properly initialize irq-table lock
iommu/amd: Use AMD specific data structure for irq remapping
iommu/amd: Remove map_sg_no_iommu()
iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets
...
Andreas Schwab [Sat, 4 May 2013 14:32:53 +0000 (16:32 +0200)]
Fix cleaning in scripts/mod
Make sure devicetable-offsets.h is cleaned in the scripts/mod directory
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Christoph Lameter [Fri, 3 May 2013 18:04:18 +0000 (18:04 +0000)]
mm, slab_common: Fix bootstrap creation of kmalloc caches
For SLAB the kmalloc caches must be created in ascending sizes in order
for the OFF_SLAB sub-slab cache to work properly.
Create the non power of two caches immediately after the prior power of
two kmalloc cache. Do not create the non power of two caches before all
other caches.
Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Christoph Lamete <cl@linux.com>
Link: http://lkml.kernel.org/r/201305040348.CIF81716.OStQOHFJMFLOVF@I-love.SAKURA.ne.jp
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Dan Carpenter [Mon, 6 May 2013 09:31:17 +0000 (09:31 +0000)]
tipc: potential divide by zero in tipc_link_recv_fragment()
The worry here is that fragm_sz could be zero since it comes from
skb->data.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Mon, 6 May 2013 08:28:41 +0000 (08:28 +0000)]
tipc: add a bounds check in link_recv_changeover_msg()
The bearer_id here comes from skb->data and it can be a number from 0 to
7. The problem is that the ->links[] array has only 2 elements so I
have added a range check.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Thu, 2 May 2013 16:01:25 +0000 (16:01 +0000)]
net/usb: new driver for RTL8152
Add new driver for supporting Realtek RTL8152 Based USB 2.0 Ethernet Adapters
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 6 May 2013 20:11:19 +0000 (13:11 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph changes from Alex Elder:
"This is a big pull.
Most of it is culmination of Alex's work to implement RBD image
layering, which is now complete (yay!).
There is also some work from Yan to fix i_mutex behavior surrounding
writes in cephfs, a sync write fix, a fix for RBD images that get
resized while they are mapped, and a few patches from me that resolve
annoying auth warnings and fix several bugs in the ceph auth code."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (254 commits)
rbd: fix image request leak on parent read
libceph: use slab cache for osd client requests
libceph: allocate ceph message data with a slab allocator
libceph: allocate ceph messages with a slab allocator
rbd: allocate image object names with a slab allocator
rbd: allocate object requests with a slab allocator
rbd: allocate name separate from obj_request
rbd: allocate image requests with a slab allocator
rbd: use binary search for snapshot lookup
rbd: clear EXISTS flag if mapped snapshot disappears
rbd: kill off the snapshot list
rbd: define rbd_snap_size() and rbd_snap_features()
rbd: use snap_id not index to look up snap info
rbd: look up snapshot name in names buffer
rbd: drop obj_request->version
rbd: drop rbd_obj_method_sync() version parameter
rbd: more version parameter removal
rbd: get rid of some version parameters
rbd: stop tracking header object version
rbd: snap names are pointer to constant data
...
Linus Torvalds [Mon, 6 May 2013 20:07:47 +0000 (13:07 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"A set of cifs cleanup fixes.
The only big one of this set optimizes the cifs error logging,
renaming cFYI and cERROR macros to cifs_dbg, and in the process makes
it clearer and reduces module size."
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: small variable name cleanup
CIFS: fix error return code in cifs_atomic_open()
cifs: store the real expected sequence number in the mid
cifs: on send failure, readjust server sequence number downward
cifs: remove ENOSPC handling in smb_sendv
[CIFS] cifs: Rename cERROR and cFYI to cifs_dbg
fs: cifs: use kmemdup instead of kmalloc + memcpy
cifs: replaced kmalloc + memset with kzalloc
cifs: ignore the unc= and prefixpath= mount options
David Jeffery [Mon, 6 May 2013 05:49:30 +0000 (13:49 +0800)]
autofs - remove autofs dentry mount check
When checking if an autofs mount point is busy it isn't sufficient to
only check if it's a mount point.
For example, if the mount of an offset mountpoint in a tree is denied
for this host by its export and the dentry becomes a process working
directory the check incorrectly returns the mount as not in use at
expire.
This can happen since the default when mounting within a tree is
nostrict, which means ingnore mount fails on mounts within the tree and
continue. The nostrict option is meant to allow mounting in this case.
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Claudiu Ghioc [Mon, 6 May 2013 05:47:16 +0000 (13:47 +0800)]
autofs - fix sparse warning for autofs4_d_manage()
Fixed the sparse warning:
fs/autofs4/root.c:411:5: warning: symbol 'autofs4_d_manage' was not declared. Should it be static?"
[ Clearly it should be static as the function is declared static at the
top of root.c. - imk ]
Signed-off-by: Claudiu Ghioc <claudiu.ghioc@gmail.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 6 May 2013 19:34:53 +0000 (12:34 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull more s390 updates from Martin Schwidefsky:
"This is the second batch of s390 patches for the 3.10 merge window.
Heiko improved the memory detection, this fixes kdump for large memory
sizes. Some kvm related memory management work, new ipldev/condev
keywords in cio and bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mem_detect: remove artificial kdump memory types
s390/mm: add pte invalidation notifier for kvm
s390/zcrypt: ap bus rescan problem when toggle crypto adapters on/off
s390/memory hotplug,sclp: get rid of per memory increment usecount
s390/memory hotplug: provide memory_block_size_bytes() function
s390/mem_detect: limit memory detection loop to "mem=" parameter
s390/kdump,bootmem: fix bootmem allocator bitmap size
s390: get rid of odd global real_memory_size
s390/kvm: Change the virtual memory mapping location for Virtio devices
s390/zcore: calculate real memory size using own get_mem_size function
s390/mem_detect: add DAT sanity check
s390/mem_detect: fix lockdep irq tracing
s390/mem_detect: move memory detection code to mm folder
s390/zfcpdump: exploit new cio_ignore keywords
s390/cio: add condev keyword to cio_ignore
s390/cio: add ipldev keyword to cio_ignore
s390/uaccess: add "fallthrough" comments
Sergei Shtylyov [Thu, 2 May 2013 11:10:22 +0000 (11:10 +0000)]
3c59x: fix freeing nonexistent resource on driver unload
When unloading the driver that drives an EISA board, a message similar to the
following one is displayed:
Trying to free nonexistent resource <
0000000000013000-
000000000001301f>
Then an user is unable to reload the driver because the resource it requested in
the previous load hasn't been freed. This happens most probably due to a typo in
vortex_eisa_remove() which calls release_region() with 'dev->base_addr' instead
of 'edev->base_addr'...
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Mon, 6 May 2013 02:15:13 +0000 (02:15 +0000)]
netpoll: inverted down_trylock() test
The return value is reversed from mutex_trylock().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sun, 5 May 2013 16:05:55 +0000 (16:05 +0000)]
rps_dev_flow_table_release(): no need to delay vfree()
The same story as with fib_trie patch - vfree() from RCU callbacks
is legitimate now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sun, 5 May 2013 16:03:46 +0000 (16:03 +0000)]
fib_trie: no need to delay vfree()
Now that vfree() can be called from interrupt contexts, there's no
need to play games with schedule_work() to escape calling vfree()
from RCU callbacks.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Sun, 5 May 2013 04:56:22 +0000 (04:56 +0000)]
net: frag, fix race conditions in LRU list maintenance
This patch fixes race between inet_frag_lru_move() and inet_frag_lru_add()
which was introduced in commit
3ef0eb0db4bf92c6d2510fe5c4dc51852746f206
("net: frag, move LRU list maintenance outside of rwlock")
One cpu already added new fragment queue into hash but not into LRU.
Other cpu found it in hash and tries to move it to the end of LRU.
This leads to NULL pointer dereference inside of list_move_tail().
Another possible race condition is between inet_frag_lru_move() and
inet_frag_lru_del(): move can happens after deletion.
This patch initializes LRU list head before adding fragment into hash and
inet_frag_lru_move() doesn't touches it if it's empty.
I saw this kernel oops two times in a couple of days.
[119482.128853] BUG: unable to handle kernel NULL pointer dereference at (null)
[119482.132693] IP: [<
ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.136456] PGD
2148f6067 PUD
215ab9067 PMD 0
[119482.140221] Oops: 0000 [#1] SMP
[119482.144008] Modules linked in: vfat msdos fat 8021q fuse nfsd auth_rpcgss nfs_acl nfs lockd sunrpc ppp_async ppp_generic bridge slhc stp llc w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek kvm_amd k10temp kvm snd_hda_intel snd_hda_codec edac_core radeon snd_hwdep ath9k snd_pcm ath9k_common snd_page_alloc ath9k_hw snd_timer snd soundcore drm_kms_helper ath ttm r8169 mii
[119482.152692] CPU 3
[119482.152721] Pid: 20, comm: ksoftirqd/3 Not tainted
3.9.0-zurg-00001-g9f95269 #132 To Be Filled By O.E.M. To Be Filled By O.E.M./RS880D
[119482.161478] RIP: 0010:[<
ffffffff812ede89>] [<
ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.166004] RSP: 0018:
ffff880216d5db58 EFLAGS:
00010207
[119482.170568] RAX:
0000000000000000 RBX:
ffff88020882b9c0 RCX:
dead000000200200
[119482.175189] RDX:
0000000000000000 RSI:
0000000000000880 RDI:
ffff88020882ba00
[119482.179860] RBP:
ffff880216d5db58 R08:
ffffffff8155c7f0 R09:
0000000000000014
[119482.184570] R10:
0000000000000000 R11:
0000000000000000 R12:
ffff88020882ba00
[119482.189337] R13:
ffffffff81c8d780 R14:
ffff880204357f00 R15:
00000000000005a0
[119482.194140] FS:
00007f58124dc700(0000) GS:
ffff88021fcc0000(0000) knlGS:
0000000000000000
[119482.198928] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[119482.203711] CR2:
0000000000000000 CR3:
00000002155f0000 CR4:
00000000000007e0
[119482.208533] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[119482.213371] DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
[119482.218221] Process ksoftirqd/3 (pid: 20, threadinfo
ffff880216d5c000, task
ffff880216d3a9a0)
[119482.223113] Stack:
[119482.228004]
ffff880216d5dbd8 ffffffff8155dcda 0000000000000000 ffff000200000001
[119482.233038]
ffff8802153c1f00 ffff880000289440 ffff880200000014 ffff88007bc72000
[119482.238083]
00000000000079d5 ffff88007bc72f44 ffffffff00000002 ffff880204357f00
[119482.243090] Call Trace:
[119482.248009] [<
ffffffff8155dcda>] ip_defrag+0x8fa/0xd10
[119482.252921] [<
ffffffff815a8013>] ipv4_conntrack_defrag+0x83/0xe0
[119482.257803] [<
ffffffff8154485b>] nf_iterate+0x8b/0xa0
[119482.262658] [<
ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.267527] [<
ffffffff815448e4>] nf_hook_slow+0x74/0x130
[119482.272412] [<
ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.277302] [<
ffffffff8155d068>] ip_rcv+0x268/0x320
[119482.282147] [<
ffffffff81519992>] __netif_receive_skb_core+0x612/0x7e0
[119482.286998] [<
ffffffff81519b78>] __netif_receive_skb+0x18/0x60
[119482.291826] [<
ffffffff8151a650>] process_backlog+0xa0/0x160
[119482.296648] [<
ffffffff81519f29>] net_rx_action+0x139/0x220
[119482.301403] [<
ffffffff81053707>] __do_softirq+0xe7/0x220
[119482.306103] [<
ffffffff81053868>] run_ksoftirqd+0x28/0x40
[119482.310809] [<
ffffffff81074f5f>] smpboot_thread_fn+0xff/0x1a0
[119482.315515] [<
ffffffff81074e60>] ? lg_local_lock_cpu+0x40/0x40
[119482.320219] [<
ffffffff8106d870>] kthread+0xc0/0xd0
[119482.324858] [<
ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.329460] [<
ffffffff816c32dc>] ret_from_fork+0x7c/0xb0
[119482.334057] [<
ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.338661] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
[119482.343787] RIP [<
ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.348675] RSP <
ffff880216d5db58>
[119482.353493] CR2:
0000000000000000
Oops happened on this path:
ip_defrag() -> ip_frag_queue() -> inet_frag_lru_move() -> list_move_tail() -> __list_del_entry()
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Lameter [Fri, 3 May 2013 15:43:18 +0000 (15:43 +0000)]
slab: Return NULL for oversized allocations
The inline path seems to have changed the SLAB behavior for very large
kmalloc allocations with commit
e3366016 ("slab: Use common
kmalloc_index/kmalloc_size functions"). This patch restores the old
behavior but also adds diagnostics so that we can figure where in the
code these large allocations occur.
Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Christoph Lameter <cl@linux.com>
Link: http://lkml.kernel.org/r/201305040348.CIF81716.OStQOHFJMFLOVF@I-love.SAKURA.ne.jp
[ penberg@kernel.org: use WARN_ON_ONCE ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Linus Torvalds [Mon, 6 May 2013 00:36:20 +0000 (17:36 -0700)]
Merge tag 'mfd-3.10-1' of git://git./linux/kernel/git/sameo/mfd-next
Pull MFD update from Samuel Ortiz:
"For 3.10 we have a few new MFD drivers for:
- The ChromeOS embedded controller which provides keyboard, battery
and power management services. This controller is accessible
through i2c or SPI.
- Silicon Laboratories 476x controller, providing access to their FM
chipset and their audio codec.
- Realtek's RTS5249, a memory stick, MMC and SD/SDIO PCI based
reader.
- Nokia's Tahvo power button and watchdog device. This device is
very similar to Retu and is thus supported by the same code base.
- STMicroelectronics STMPE1801, a keyboard and GPIO controller
supported by the stmpe driver.
- ST-Ericsson AB8540 and AB8505 power management and voltage
converter controllers through the existing ab8500 code.
Some other drivers got cleaned up or improved. In particular:
- The Linaro/STE guys got the ab8500 driver in sync with their
internal code through a series of optimizations, fixes and
improvements.
- The AS3711 and OMAP USB drivers now have DT support.
- The arizona clock and interrupt handling code got improved.
- The wm5102 register patch and boot mechanism also got improved."
* tag 'mfd-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-next: (104 commits)
mfd: si476x: Don't use 0bNNN
mfd: vexpress: Handle pending config transactions
mfd: ab8500: Export ab8500_gpadc_sw_hw_convert properly
mfd: si476x: Fix i2c warning
mfd: si476x: Add header files and Kbuild plumbing
mfd: si476x: Add chip properties handling code
mfd: si476x: Add the bulk of the core driver
mfd: si476x: Add commands abstraction layer
mfd: rtsx: Support RTS5249
mfd: retu: Add Tahvo support
mfd: ucb1400: Pass ucb1400-gpio data through ac97 bus
mfd: wm8994: Add some OF properties
mfd: wm8994: Add device ID data to WM8994 OF device IDs
input: Export matrix_keypad_parse_of_params()
mfd: tps65090: Add compatible string for charger subnode
mfd: db8500-prcmu: Support platform dependant device selection
mfd: syscon: Fix warnings when printing resource_size_t
of: Add stub of_get_parent for non-OF builds
mfd: omap-usb-tll: Convert to devm_ioremap_resource()
mfd: omap-usb-host: Convert to devm_ioremap_resource()
...
Linus Torvalds [Sun, 5 May 2013 21:47:31 +0000 (14:47 -0700)]
Merge tag 'kvm-3.10-1' of git://git./virt/kvm/kvm
Pull kvm updates from Gleb Natapov:
"Highlights of the updates are:
general:
- new emulated device API
- legacy device assignment is now optional
- irqfd interface is more generic and can be shared between arches
x86:
- VMCS shadow support and other nested VMX improvements
- APIC virtualization and Posted Interrupt hardware support
- Optimize mmio spte zapping
ppc:
- BookE: in-kernel MPIC emulation with irqfd support
- Book3S: in-kernel XICS emulation (incomplete)
- Book3S: HV: migration fixes
- BookE: more debug support preparation
- BookE: e6500 support
ARM:
- reworking of Hyp idmaps
s390:
- ioeventfd for virtio-ccw
And many other bug fixes, cleanups and improvements"
* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
kvm: Add compat_ioctl for device control API
KVM: x86: Account for failing enable_irq_window for NMI window request
KVM: PPC: Book3S: Add API for in-kernel XICS emulation
kvm/ppc/mpic: fix missing unlock in set_base_addr()
kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
kvm/ppc/mpic: remove users
kvm/ppc/mpic: fix mmio region lists when multiple guests used
kvm/ppc/mpic: remove default routes from documentation
kvm: KVM_CAP_IOMMU only available with device assignment
ARM: KVM: iterate over all CPUs for CPU compatibility check
KVM: ARM: Fix spelling in error message
ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
KVM: ARM: Fix API documentation for ONE_REG encoding
ARM: KVM: promote vfp_host pointer to generic host cpu context
ARM: KVM: add architecture specific hook for capabilities
ARM: KVM: perform HYP initilization for hotplugged CPUs
ARM: KVM: switch to a dual-step HYP init code
ARM: KVM: rework HYP page table freeing
ARM: KVM: enforce maximum size for identity mapped code
ARM: KVM: move to a KVM provided HYP idmap
...
David Howells [Sat, 4 May 2013 07:48:27 +0000 (08:48 +0100)]
Give the OID registry file module info to avoid kernel tainting
Give the OID registry file module information so that it doesn't taint the
kernel when compiled as a module and loaded.
Reported-by: Dros Adamson <Weston.Adamson@netapp.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <Trond.Myklebust@netapp.com>
cc: stable@vger.kernel.org
cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Fri, 3 May 2013 19:12:45 +0000 (19:12 +0000)]
tcp: do not expire TCP fastopen cookies
TCP metric cache expires entries after one hour.
This probably make sense for TCP RTT/RTTVAR/CWND, but not
for TCP fastopen cookies.
Its better to try previous cookie. If it appears to be obsolete,
server will send us new cookie anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Herrenschmidt [Fri, 3 May 2013 17:19:01 +0000 (17:19 +0000)]
net/eth/ibmveth: Fixup retrieval of MAC address
Some ancient pHyp versions used to create a 8 bytes local-mac-address
property in the device-tree instead of a 6 bytes one for veth.
The Linux driver code to deal with that is an insane hack which also
happens to break with some choices of MAC addresses in qemu by testing
for a bit in the address rather than just looking at the size of the
property.
Sanitize this by doing the latter instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Fri, 3 May 2013 14:49:41 +0000 (14:49 +0000)]
virtio: don't expose u16 in userspace api
Programs using virtio headers outside of kernel will no longer
build because u16 type does not exist in userspace. All user ABI
must use __u16 typedef instead.
Bug introduce by:
commit
986a4f4d452dec004697f667439d27c3fda9c928
Author: Jason Wang <jasowang@redhat.com>
Date: Fri Dec 7 07:04:56 2012 +0000
virtio_net: multiqueue support
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 5 May 2013 20:23:27 +0000 (13:23 -0700)]
Merge branch 'timers-nohz-for-linus' of git://git./linux/kernel/git/tip/tip
Pull 'full dynticks' support from Ingo Molnar:
"This tree from Frederic Weisbecker adds a new, (exciting! :-) core
kernel feature to the timer and scheduler subsystems: 'full dynticks',
or CONFIG_NO_HZ_FULL=y.
This feature extends the nohz variable-size timer tick feature from
idle to busy CPUs (running at most one task) as well, potentially
reducing the number of timer interrupts significantly.
This feature got motivated by real-time folks and the -rt tree, but
the general utility and motivation of full-dynticks runs wider than
that:
- HPC workloads get faster: CPUs running a single task should be able
to utilize a maximum amount of CPU power. A periodic timer tick at
HZ=1000 can cause a constant overhead of up to 1.0%. This feature
removes that overhead - and speeds up the system by 0.5%-1.0% on
typical distro configs even on modern systems.
- Real-time workload latency reduction: CPUs running critical tasks
should experience as little jitter as possible. The last remaining
source of kernel-related jitter was the periodic timer tick.
- A single task executing on a CPU is a pretty common situation,
especially with an increasing number of cores/CPUs, so this feature
helps desktop and mobile workloads as well.
The cost of the feature is mainly related to increased timer
reprogramming overhead when a CPU switches its tick period, and thus
slightly longer to-idle and from-idle latency.
Configuration-wise a third mode of operation is added to the existing
two NOHZ kconfig modes:
- CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named
as a config option. This is the traditional Linux periodic tick
design: there's a HZ tick going on all the time, regardless of
whether a CPU is idle or not.
- CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the
periodic tick when a CPU enters idle mode.
- CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the
tick when a CPU is idle, also slows the tick down to 1 Hz (one
timer interrupt per second) when only a single task is running on a
CPU.
The .config behavior is compatible: existing !CONFIG_NO_HZ and
CONFIG_NO_HZ=y settings get translated to the new values, without the
user having to configure anything. CONFIG_NO_HZ_FULL is turned off by
default.
This feature is based on a lot of infrastructure work that has been
steadily going upstream in the last 2-3 cycles: related RCU support
and non-periodic cputime support in particular is upstream already.
This tree adds the final pieces and activates the feature. The pull
request is marked RFC because:
- it's marked 64-bit only at the moment - the 32-bit support patch is
small but did not get ready in time.
- it has a number of fresh commits that came in after the merge
window. The overwhelming majority of commits are from before the
merge window, but still some aspects of the tree are fresh and so I
marked it RFC.
- it's a pretty wide-reaching feature with lots of effects - and
while the components have been in testing for some time, the full
combination is still not very widely used. That it's default-off
should reduce its regression abilities and obviously there are no
known regressions with CONFIG_NO_HZ_FULL=y enabled either.
- the feature is not completely idempotent: there is no 100%
equivalent replacement for a periodic scheduler/timer tick. In
particular there's ongoing work to map out and reduce its effects
on scheduler load-balancing and statistics. This should not impact
correctness though, there are no known regressions related to this
feature at this point.
- it's a pretty ambitious feature that with time will likely be
enabled by most Linux distros, and we'd like you to make input on
its design/implementation, if you dislike some aspect we missed.
Without flaming us to crisp! :-)
Future plans:
- there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off
the periodic tick altogether when there's a single busy task on a
CPU. We'd first like 1 Hz to be exposed more widely before we go
for the 0 Hz target though.
- once we reach 0 Hz we can remove the periodic tick assumption from
nr_running>=2 as well, by essentially interrupting busy tasks only
as frequently as the sched_latency constraints require us to do -
once every 4-40 msecs, depending on nr_running.
I am personally leaning towards biting the bullet and doing this in
v3.10, like the -rt tree this effort has been going on for too long -
but the final word is up to you as usual.
More technical details can be found in Documentation/timers/NO_HZ.txt"
* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
sched: Keep at least 1 tick per second for active dynticks tasks
rcu: Fix full dynticks' dependency on wide RCU nocb mode
nohz: Protect smp_processor_id() in tick_nohz_task_switch()
nohz_full: Add documentation.
cputime_nsecs: use math64.h for nsec resolution conversion helpers
nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config
nohz: Reduce overhead under high-freq idling patterns
nohz: Remove full dynticks' superfluous dependency on RCU tree
nohz: Fix unavailable tick_stop tracepoint in dynticks idle
nohz: Add basic tracing
nohz: Select wide RCU nocb for full dynticks
nohz: Disable the tick when irq resume in full dynticks CPU
nohz: Re-evaluate the tick for the new task after a context switch
nohz: Prepare to stop the tick on irq exit
nohz: Implement full dynticks kick
nohz: Re-evaluate the tick from the scheduler IPI
sched: New helper to prevent from stopping the tick in full dynticks
sched: Kick full dynticks CPU that have more than one task enqueued.
perf: New helper to prevent full dynticks CPUs from stopping tick
perf: Kick full dynticks CPU if events rotation is needed
...
Linus Torvalds [Sun, 5 May 2013 18:37:16 +0000 (11:37 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc fixes plus a small hw-enablement patch for Intel IB model 58
uncore events"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL
perf/x86/intel/lbr: Fix LBR filter
perf/x86: Blacklist all MEM_*_RETIRED events for Ivy Bridge
perf: Fix vmalloc ring buffer pages handling
perf/x86/intel: Fix unintended variable name reuse
perf/x86/intel: Add support for IvyBridge model 58 Uncore
perf/x86/intel: Fix typo in perf_event_intel_uncore.c
x86: Eliminate irq_mis_count counted in arch_irq_stat
Linus Torvalds [Sun, 5 May 2013 17:58:06 +0000 (10:58 -0700)]
Merge tag 'modules-next-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull mudule updates from Rusty Russell:
"We get rid of the general module prefix confusion with a binary config
option, fix a remove/insert race which Never Happens, and (my
favorite) handle the case when we have too many modules for a single
commandline. Seriously, the kernel is full, please go away!"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modpost: fix unwanted VMLINUX_SYMBOL_STR expansion
X.509: Support parse long form of length octets in Authority Key Identifier
module: don't unlink the module until we've removed all exposure.
kernel: kallsyms: memory override issue, need check destination buffer length
MODSIGN: do not send garbage to stderr when enabling modules signature
modpost: handle huge numbers of modules.
modpost: add -T option to read module names from file/stdin.
modpost: minor cleanup.
genksyms: pass symbol-prefix instead of arch
module: fix symbol versioning with symbol prefixes
CONFIG_SYMBOL_PREFIX: cleanup.
Linus Torvalds [Sun, 5 May 2013 17:35:26 +0000 (10:35 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull single_open() leak fixes from Al Viro:
"A bunch of fixes for a moderately common class of bugs: file with
single_open() done by its ->open() and seq_release as its ->release().
That leaks; fortunately, it's not _too_ common (either people manage
to RTFM that says "When using single_open(), the programmer should use
single_release() instead of seq_release() in the file_operations
structure to avoid a memory leak", or they just copy a correct
instance), but grepping through the tree has caught quite a pile.
All of that is, AFAICS, -stable fodder, for as far as the patches
apply. I tried to carve it up into reasonably-sized pieces (more or
less "comes from the same tree")"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
rcutrace: single_open() leaks
gadget: single_open() leaks
staging: single_open() leaks
megaraid: single_open() leak
wireless: single_open() leaks
input: single_open() leak
rtc: single_open() leaks
ds1620: single_open() leak
sh: single_open() leaks
parisc: single_open() leaks
mips: single_open() leaks
ia64: single_open() leaks
h8300: single_open() leaks
cris: single_open() leaks
arm: single_open() leaks
Linus Torvalds [Sun, 5 May 2013 17:13:44 +0000 (10:13 -0700)]
Merge branch 'ipc-cleanups'
Merge ipc fixes and cleanups from my IPC branch.
The ipc locking has always been pretty ugly, and the scalability fixes
to some degree made it even less readable. We had two cases of double
unlocks in error paths due to this (one rcu read unlock, one semaphore
unlock), and this fixes the bugs I found while trying to clean things up
a bit so that we are less likely to have more.
* ipc-cleanups:
ipc: simplify rcu_read_lock() in semctl_nolock()
ipc: simplify semtimedop/semctl_main() common error path handling
ipc: move sem_obtain_lock() rcu locking into the only caller
ipc: fix double sem unlock in semctl error path
ipc: move the rcu_read_lock() from sem_lock_and_putref() into callers
ipc: sem_putref() does not need the semaphore lock any more
ipc: move rcu_read_unlock() out of sem_unlock() and into callers
Scott Wood [Wed, 1 May 2013 01:00:45 +0000 (20:00 -0500)]
kvm: Add compat_ioctl for device control API
This API shouldn't have 32/64-bit issues, but VFS assumes it does
unless told otherwise.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Peter Zijlstra [Fri, 3 May 2013 12:11:25 +0000 (14:11 +0200)]
perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL
We should always have proper privileges when requesting kernel
data.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20130503121256.230745028@chello.nl
[ Fix build error reported by fengguang.wu@intel.com, propagate error code back. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/n/tip-v0x9ky3ahzr6nm3c6ilwrili@git.kernel.org
Al Viro [Sun, 5 May 2013 04:16:35 +0000 (00:16 -0400)]
rcutrace: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:16:11 +0000 (00:16 -0400)]
gadget: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:15:43 +0000 (00:15 -0400)]
staging: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:15:15 +0000 (00:15 -0400)]
megaraid: single_open() leak
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:13:20 +0000 (00:13 -0400)]
wireless: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:12:56 +0000 (00:12 -0400)]
input: single_open() leak
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:12:29 +0000 (00:12 -0400)]
rtc: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:11:29 +0000 (00:11 -0400)]
ds1620: single_open() leak
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:11:01 +0000 (00:11 -0400)]
sh: single_open() leaks
Cc: vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:09:44 +0000 (00:09 -0400)]
parisc: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:09:30 +0000 (00:09 -0400)]
mips: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:09:04 +0000 (00:09 -0400)]
ia64: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:08:26 +0000 (00:08 -0400)]
h8300: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:07:52 +0000 (00:07 -0400)]
cris: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 5 May 2013 04:07:22 +0000 (00:07 -0400)]
arm: single_open() leaks
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dan Carpenter [Wed, 10 Apr 2013 11:43:00 +0000 (14:43 +0300)]
cifs: small variable name cleanup
server and ses->server are the same, but it's a little bit ugly that we
lock &ses->server->srv_mutex and unlock &server->srv_mutex. It causes
a false positive in Smatch about inconsistent locking.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Wei Yongjun [Thu, 4 Apr 2013 06:16:21 +0000 (14:16 +0800)]
CIFS: fix error return code in cifs_atomic_open()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Jeff Layton [Wed, 3 Apr 2013 15:55:03 +0000 (11:55 -0400)]
cifs: store the real expected sequence number in the mid
Currently, the signing routines take a pointer to a place to store the
expected sequence number for the mid response. It then stores a value
that's one below what that sequence number should be, and then adds one
to it when verifying the signature on the response.
Increment the sequence number before storing the value in the mid, and
eliminate the "+1" when checking the signature.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Jeff Layton [Wed, 3 Apr 2013 14:27:36 +0000 (10:27 -0400)]
cifs: on send failure, readjust server sequence number downward
If sending a call to the server fails for some reason (for instance, the
sending thread caught a signal), then we must readjust the sequence
number downward again or the next send will have it too high.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Jeff Layton [Fri, 22 Mar 2013 12:36:45 +0000 (08:36 -0400)]
cifs: remove ENOSPC handling in smb_sendv
To my knowledge, no one ever reported seeing this pop.
Acked-by: Suresh Jayaraman <sjayaraman@novell.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Joe Perches [Sun, 5 May 2013 03:12:25 +0000 (22:12 -0500)]
[CIFS] cifs: Rename cERROR and cFYI to cifs_dbg
It's not obvious from reading the macro names that these macros
are for debugging. Convert the names to a single more typical
kernel style cifs_dbg macro.
cERROR(1, ...) -> cifs_dbg(VFS, ...)
cFYI(1, ...) -> cifs_dbg(FYI, ...)
cFYI(DBG2, ...) -> cifs_dbg(NOISY, ...)
Move the terminating format newline from the macro to the call site.
Add CONFIG_CIFS_DEBUG function cifs_vfs_err to emit the
"CIFS VFS: " prefix for VFS messages.
Size is reduced ~ 1% when CONFIG_CIFS_DEBUG is set (default y)
$ size fs/cifs/cifs.ko*
text data bss dec hex filename
265245 2525 132 267902 4167e fs/cifs/cifs.ko.new
268359 2525 132 271016 422a8 fs/cifs/cifs.ko.old
Other miscellaneous changes around these conversions:
o Miscellaneous typo fixes
o Add terminating \n's to almost all formats and remove them
from the macros to be more kernel style like. A few formats
previously had defective \n's
o Remove unnecessary OOM messages as kmalloc() calls dump_stack
o Coalesce formats to make grep easier,
added missing spaces when coalescing formats
o Use %s, __func__ instead of embedded function name
o Removed unnecessary "cifs: " prefixes
o Convert kzalloc with multiply to kcalloc
o Remove unused cifswarn macro
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Linus Torvalds [Sun, 5 May 2013 03:10:04 +0000 (20:10 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Several routines do not use netdev_features_t to hold such bitmasks,
fixes from Patrick McHardy and Bjørn Mork.
2) Update cpsw IRQ software state and the actual HW irq enabling in the
correct order. From Mugunthan V N.
3) When sending tipc packets to multiple bearers, we have to make
copies of the SKB rather than just giving the original SKB directly.
Fix from Gerlando Falauto.
4) Fix race with bridging topology change timer, from Stephen
Hemminger.
5) Fix TCPv6 segmentation handling in GRE and VXLAN, from Pravin B
Shelar.
6) Endian bug in USB pegasus driver, from Dan Carpenter.
7) Fix crashes on MTU reduction in USB asix driver, from Holger
Eitzenberger.
8) Don't allow the kernel to BUG() just because the user puts some crap
in an AF_PACKET mmap() ring descriptor. Fix from Daniel Borkmann.
9) Don't use variable sized arrays on the stack in xen-netback, from
Wei Liu.
10) Fix stats reporting and an unbalanced napi_disable() in be2net
driver. From Somnath Kotur and Ajit Khaparde.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits)
cxgb4: fix error recovery when t4_fw_hello returns a positive value
sky2: Fix crash on receiving VLAN frames
packet: tpacket_v3: do not trigger bug() on wrong header status
asix: fix BUG in receive path when lowering MTU
net: qmi_wwan: Add Telewell TW-LTE 4G
usbnet: pegasus: endian bug in write_mii_word()
vxlan: Fix TCPv6 segmentation.
gre: Fix GREv4 TCPv6 segmentation.
bridge: fix race with topology change timer
tipc: pskb_copy() buffers when sending on more than one bearer
tipc: tipc_bcbearer_send(): simplify bearer selection
tipc: cosmetic: clean up comments and break a long line
drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence
xen-netback: better names for thresholds
xen-netback: avoid allocating variable size array on stack
xen-netback: remove redundent parameter in netbk_count_requests
be2net: Fix to fail probe if MSI-X enable fails for a VF
be2net: avoid napi_disable() when it has not been enabled
be2net: Fix firmware download for Lancer
be2net: Fix to receive Multicast Packets when Promiscuous mode is enabled on certain devices
...
Linus Torvalds [Sun, 5 May 2013 03:08:49 +0000 (20:08 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-next
Pull sparc updates from David Miller:
1) Hibernation support, as well as removal of excess interrupt
twiddling in MMU context allocation on sparc64 from Kirill Tkhai.
2) Kill references to __ARCH_WANT_UNLOCKED_CTXSW.
3) Sparc32 LEON bug fixes from Daniel Hellstrom and Andreas Larsson.
4) Provide cmpxchg64(), from Geert Uytterhoeven.
5) Device refcount and registry bug fixes from Federico Vaga and Wei
Yongjun.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
serial: sunsu: add missing platform_driver_unregister() when module exit
sparc32, leon: Do not overwrite previously set irq flow handlers
sparc/kernel/vio.c: add put_device() after device_find_child()
sparc64: Do not save/restore interrupts in get_new_mmu_context()
sparc: Consistently use 'wr' and 'rd' instructions for ASRs.
sparc64: Kill __ARCH_WANT_UNLOCKED_CTXSW
sparc64: Provide cmpxchg64()
sparc64: Do not change num_physpages during initmem freeing
sparc64: Hibernation support
sparc,leon: updated GRPCI2 config name
sparc,leon: support for GRPCI1 PCI host bridge controller
sparc32,leon: add support for PCI busn resource for GRPCI2
Silviu-Mihai Popescu [Mon, 11 Mar 2013 16:22:32 +0000 (18:22 +0200)]
fs: cifs: use kmemdup instead of kmalloc + memcpy
This replaces calls to kmalloc followed by memcpy with a single call to
kmemdup. This was found via make coccicheck.
Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Dia Vasile [Sun, 10 Mar 2013 12:29:04 +0000 (14:29 +0200)]
cifs: replaced kmalloc + memset with kzalloc
Signed-off-by: Diana Vasile <kill.elohim@hotmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Jeff Layton [Fri, 22 Mar 2013 12:42:57 +0000 (08:42 -0400)]
cifs: ignore the unc= and prefixpath= mount options
...as advertised for 3.10.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
David S. Miller [Sun, 5 May 2013 01:34:13 +0000 (18:34 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Merge sparc bug fixes that didn't make it into v3.9 into
sparc-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Sat, 27 Apr 2013 00:13:16 +0000 (00:13 +0000)]
serial: sunsu: add missing platform_driver_unregister() when module exit
We have registered platform driver when module init, and
need unregister it when module exit.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Larsson [Sun, 21 Apr 2013 21:23:06 +0000 (21:23 +0000)]
sparc32, leon: Do not overwrite previously set irq flow handlers
This is needed because when scan_of_devices finds the GAISLER_GPTIMER
core that corresponds to the SMP "ticker" timer, the previously set
proper irq flow handler gets overwritten with an incorrect one. This
leads to very flaky timer interrupt handling on some hardware. Proper
updates to handlers can still be done using leon_update_virq_handling.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Federico Vaga [Mon, 15 Apr 2013 04:42:52 +0000 (04:42 +0000)]
sparc/kernel/vio.c: add put_device() after device_find_child()
The vio_remove() function uses device_find_child() but it does not drop
the reference of the retrieved child.
Signed-off-by: Federico Vaga <federico.vaga@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 4 May 2013 18:04:29 +0000 (11:04 -0700)]
ipc: simplify rcu_read_lock() in semctl_nolock()
This trivially combines two rcu_read_lock() calls in both sides of a
if-statement into one single one in front of the if-statement.
Split out as an independent cleanup from the previous commit.
Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 4 May 2013 18:04:29 +0000 (11:04 -0700)]
ipc: simplify semtimedop/semctl_main() common error path handling
With various straight RCU lock/unlock movements, one common exit path
pattern had become
rcu_read_unlock();
goto out_wakeup;
and in fact there were no cases where we wanted to exit to out_wakeup
_without_ releasing the RCU read lock.
So replace that pattern with "goto out_rcu_wakeup", and remove the old
out_wakeup.
Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 4 May 2013 17:47:57 +0000 (10:47 -0700)]
ipc: move sem_obtain_lock() rcu locking into the only caller
sem_obtain_lock() was another of those functions that returned with the
RCU lock held for reading in the success case. Move the RCU locking to
the caller (semtimedop()), making it more obvious. We already did RCU
locking elsewhere in that function.
Side note: why does semtimedop() re-do the semphore lookup after the
sleep, rather than just getting a reference to the semaphore it already
looked up originally?
Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>