Paul E. McKenney [Mon, 2 Nov 2009 21:52:29 +0000 (13:52 -0800)]
rcu: Fix note_new_gpnum() uses of ->gpnum
Impose a clear locking design on the note_new_gpnum()
function's use of the ->gpnum counter. This is done by updating
rdp->gpnum only from the corresponding leaf rcu_node structure's
rnp->gpnum field, and even then only under the protection of
that same rcu_node structure's ->lock field. Performance and
scalability are maintained using a form of double-checked
locking, and excessive spinning is avoided by use of the
spin_trylock() function. The use of spin_trylock() is safe
because CPUs that fail to acquire this lock will try again
later. The hierarchical nature of the rcu_node data
structure limits contention (which could be limited further if
need be using the RCU_FANOUT kernel parameter).
Without this patch, obscure but quite possible races could
result in a quiescent state that occurred during one grace
period being accounted to the following grace period, causing
this following grace period to end prematurely. Not good!
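The double-checked trylock pattern described above can be sketched in user-space C as follows. This is an illustration only, not the kernel code: the structure and function names (node_sketch, cpu_sketch, node_trylock(), note_new_gpnum_sketch()) are hypothetical, and a C11 atomic_flag stands in for the rcu_node ->lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for rcu_node and rcu_data. */
struct node_sketch {
	atomic_flag lock;   /* models rnp->lock; clear means unlocked */
	atomic_long gpnum;  /* models rnp->gpnum */
};

struct cpu_sketch {
	long gpnum;                /* models rdp->gpnum */
	struct node_sketch *leaf;  /* this CPU's leaf node */
};

static void node_init(struct node_sketch *np, long gpnum)
{
	atomic_flag_clear(&np->lock);
	atomic_init(&np->gpnum, gpnum);
}

/* Models spin_trylock(): returns true on success and never spins. */
static bool node_trylock(struct node_sketch *np)
{
	return !atomic_flag_test_and_set_explicit(&np->lock,
						  memory_order_acquire);
}

static void node_unlock(struct node_sketch *np)
{
	atomic_flag_clear_explicit(&np->lock, memory_order_release);
}

/* Returns true if cp->gpnum was brought up to date. */
static bool note_new_gpnum_sketch(struct cpu_sketch *cp)
{
	struct node_sketch *np = cp->leaf;
	bool updated = false;
	long cur;

	assert(np != NULL);

	/* First check, without the lock: the common case is "nothing new". */
	if (cp->gpnum == atomic_load(&np->gpnum))
		return false;

	/* Contended: give up rather than spin; the caller retries later. */
	if (!node_trylock(np))
		return false;

	/* Second check, now under the lock, before updating. */
	cur = atomic_load(&np->gpnum);
	if (cp->gpnum != cur) {
		cp->gpnum = cur;
		updated = true;
	}
	node_unlock(np);
	return updated;
}
```

A caller that gets false back simply carries on and calls the function again on a later pass, which is what makes the trylock safe in this sketch.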
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: <stable@kernel.org> # .32.x
LKML-Reference: <12571987492350-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 2 Nov 2009 21:52:28 +0000 (13:52 -0800)]
rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter
Impose a clear locking design on the rcu_process_gp_end()
function's use of the ->completed counter. This is done by
creating a ->completed field in the rcu_node structure, which
can safely be accessed under the protection of that structure's
lock. Performance and scalability are maintained by using a
form of double-checked locking, so that rcu_process_gp_end()
only acquires the leaf rcu_node structure's ->lock if a grace
period has recently ended.
This fix reduces rcutorture failure rate by at least two orders
of magnitude under heavy stress with force_quiescent_state()
being invoked artificially often. Without this fix,
unsynchronized access to the ->completed field can cause
rcu_process_gp_end() to advance callbacks whose grace period has
not yet expired. (Bad idea!)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: <stable@kernel.org> # .32.x
LKML-Reference: <12571987494069-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 2 Nov 2009 21:52:27 +0000 (13:52 -0800)]
rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter
Impose a clear locking design on non-NO_HZ handling of the
->completed counter. This increases the distance between the
RCU and the CPU-hotplug mechanisms.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: <stable@kernel.org> # .32.x
LKML-Reference: <12571987491353-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 10 Nov 2009 03:10:31 +0000 (04:10 +0100)]
Merge branch 'core/urgent' into core/rcu
Merge reason: Pick up RCU fixlet to base further commits on.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lai Jiangshan [Wed, 28 Oct 2009 15:14:48 +0000 (08:14 -0700)]
rcu: Cleanup: balance rcu_irq_enter()/rcu_irq_exit() calls
Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ,
while rcu_irq_enter() is invoked unconditionally. This patch
moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the
calls are balanced.
This patch has no effect on the behavior of the kernel because
both rcu_irq_enter() and rcu_irq_exit() are empty for
!CONFIG_NO_HZ, but the code is easier to understand if the calls
are obviously balanced in all cases.
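A minimal sketch of why the balanced calls are free in the !CONFIG_NO_HZ case. The hook names come from the text above; the nesting counter, handle_irq_sketch() and count_irq() are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* Models the nesting depth that the real hooks care about. */
static int irq_nesting;
static int handled;

#ifdef CONFIG_NO_HZ
static inline void rcu_irq_enter(void) { irq_nesting++; }
static inline void rcu_irq_exit(void)
{
	assert(irq_nesting > 0);
	irq_nesting--;
}
#else
/* Both hooks are empty, so calling them unconditionally costs nothing. */
static inline void rcu_irq_enter(void) { }
static inline void rcu_irq_exit(void) { }
#endif

/* Hypothetical interrupt handler used by the sketch. */
static void count_irq(void)
{
	handled++;
}

/* The enter/exit calls pair up in every configuration. */
static void handle_irq_sketch(void (*handler)(void))
{
	rcu_irq_enter();
	handler();
	rcu_irq_exit();
}
```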
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12567428891605-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Wed, 28 Oct 2009 15:14:49 +0000 (08:14 -0700)]
rcu: Fix long-grace-period race between forcing and initialization
Very long RCU read-side critical sections (50 milliseconds or
so) can cause a race between force_quiescent_state() and
rcu_start_gp() as follows on kernel builds with multi-level
rcu_node hierarchies:
1. CPU 0 calls force_quiescent_state(), sees that there is a
grace period in progress, and acquires ->fqslock.
2. CPU 1 detects the end of the grace period, and so
cpu_quiet_msk_finish() sets rsp->completed to rsp->gpnum.
This operation is carried out under the root rnp->lock,
but CPU 0 has not yet acquired that lock. Note that
rsp->signaled is still RCU_SAVE_DYNTICK from the last
grace period.
3. CPU 1 calls rcu_start_gp(), but no one wants a new grace
period, so it drops the root rnp->lock and returns.
4. CPU 0 acquires the root rnp->lock and picks up rsp->completed
and rsp->signaled, then drops rnp->lock. It then enters the
RCU_SAVE_DYNTICK leg of the switch statement.
5. CPU 2 invokes call_rcu(), and now needs a new grace period.
It calls rcu_start_gp(), which acquires the root rnp->lock, sets
rsp->signaled to RCU_GP_INIT (too bad that CPU 0 is already in
the RCU_SAVE_DYNTICK leg of the switch statement!) and starts
initializing the rcu_node hierarchy. If there are multiple
levels to the hierarchy, it will drop the root rnp->lock and
initialize the lower levels of the hierarchy.
6. CPU 0 notes that rsp->completed has not changed, which permits
both CPU 2 and CPU 0 to try updating rsp->signaled concurrently.
If CPU 0's update prevails, later calls to force_quiescent_state()
can count old quiescent states against the new grace period, which
can in turn result in premature ending of grace periods.
Not good.
This patch adds an RCU_GP_IDLE state for rsp->signaled that is
set initially at boot time and any time a grace period ends.
This prevents CPU 0 from getting into the workings of
force_quiescent_state() in step 4. Additional locking and
checks prevent the concurrent update of rsp->signaled in step 6.
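The effect of the new RCU_GP_IDLE state can be sketched as a small state machine. This is a single-threaded illustration under stated assumptions: rsp_sketch and the gp_*() helpers are hypothetical, the locking is omitted, and the transitions are simplified relative to the real code.

```c
#include <assert.h>
#include <stdbool.h>

enum gp_signal {
	RCU_GP_IDLE,       /* no grace period in progress */
	RCU_GP_INIT,       /* grace period being initialized */
	RCU_SAVE_DYNTICK,  /* dyntick state must be sampled */
	RCU_FORCE_QS       /* quiescent states must be forced */
};

/* Hypothetical stand-in for the fields of rsp used in the scenario. */
struct rsp_sketch {
	enum gp_signal signaled;
	long gpnum;
	long completed;
};

static void gp_start(struct rsp_sketch *rsp)
{
	rsp->gpnum++;
	rsp->signaled = RCU_GP_INIT;
}

static void gp_init_done(struct rsp_sketch *rsp)
{
	assert(rsp->signaled == RCU_GP_INIT);
	rsp->signaled = RCU_SAVE_DYNTICK;
}

static void gp_end(struct rsp_sketch *rsp)
{
	rsp->completed = rsp->gpnum;
	rsp->signaled = RCU_GP_IDLE;  /* the new step: never leave a stale state */
}

/* Returns true if there was anything to force. */
static bool force_qs_sketch(struct rsp_sketch *rsp)
{
	switch (rsp->signaled) {
	case RCU_GP_IDLE:
	case RCU_GP_INIT:
		return false;  /* idle or still initializing: stay out */
	case RCU_SAVE_DYNTICK:
		rsp->signaled = RCU_FORCE_QS;
		return true;
	case RCU_FORCE_QS:
		return true;
	}
	return false;
}
```

In this model, a forcing attempt that arrives after the grace period has ended sees RCU_GP_IDLE rather than a leftover RCU_SAVE_DYNTICK, so it returns without touching anything.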
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1256742889199-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Mon, 2 Nov 2009 12:01:56 +0000 (13:01 +0100)]
uids: Prevent tear down race
Ingo triggered the following warning:
WARNING: at lib/debugobjects.c:255 debug_print_object+0x42/0x50()
Hardware name: System Product Name
ODEBUG: init active object type: timer_list
Modules linked in:
Pid: 2619, comm: dmesg Tainted: G W 2.6.32-rc5-tip+ #5298
Call Trace:
[<81035443>] warn_slowpath_common+0x6a/0x81
[<8120e483>] ? debug_print_object+0x42/0x50
[<81035498>] warn_slowpath_fmt+0x29/0x2c
[<8120e483>] debug_print_object+0x42/0x50
[<8120ec2a>] __debug_object_init+0x279/0x2d7
[<8120ecb3>] debug_object_init+0x13/0x18
[<810409d2>] init_timer_key+0x17/0x6f
[<81041526>] free_uid+0x50/0x6c
[<8104ed2d>] put_cred_rcu+0x61/0x72
[<81067fac>] rcu_do_batch+0x70/0x121
debugobjects warns about an enqueued timer being initialized. If
CONFIG_USER_SCHED=y the user management code uses delayed work to
remove the user from the hash table and tear down the sysfs objects.
free_uid is called from RCU and initializes/schedules delayed work if
the usage count of the user_struct is 0. The init/schedule happens
outside of the uidhash_lock protected region which allows a concurrent
caller of find_user() to reference the about to be destroyed
user_struct w/o preventing the work from being scheduled. If the next
free_uid call happens before the work timer expired then the active
timer is initialized and the work scheduled again.
The race was introduced in commit 5cb350ba (sched: group scheduling,
sysfs tunables) and made more prominent by commit 3959214f (sched:
delayed cleanup of user_struct).
Move the init/schedule_delayed_work inside of the uidhash_lock
protected region to prevent the race.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: stable@kernel.org
Thomas Gleixner [Wed, 28 Oct 2009 19:26:48 +0000 (20:26 +0100)]
futex: Fix spurious wakeup for requeue_pi really
The requeue_pi path doesn't use unqueue_me() (and the racy lock_ptr ==
NULL test) nor does it use the wake_list of futex_wake(), which were
the reason for commit 41890f2 (futex: Handle spurious wake up).
See the debugging discussion on LKML, Message-ID:
<4AD4080C.20703@us.ibm.com>
The changes in this fix to the wait_requeue_pi path were considered to
be a likely unnecessary, but harmless safety net. But it turns out that
because, for unknown $@#!*( reasons, EWOULDBLOCK is defined as EAGAIN,
we built an endless loop in the code path which correctly returns
EWOULDBLOCK.
Spurious wakeups in the wait_requeue_pi code path are unlikely, so we
take the easy solution and return EWOULDBLOCK^WEAGAIN to user space and
let it deal with the spurious wakeup.
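How the aliasing bites can be shown with a small user-space sketch. The functions wait_once() and wait_with_retry() and the MAX_TRIES bound are hypothetical; the bound exists only so that the sketch terminates where the unbounded loop would not.

```c
#include <assert.h>
#include <errno.h>

#define MAX_TRIES 3  /* hypothetical bound so that the sketch terminates */

/* Hypothetical wait primitive: always reports a value mismatch. */
static int wait_once(void)
{
	return -EWOULDBLOCK;
}

/* Returns the number of attempts made. */
static int wait_with_retry(void)
{
	int tries = 0;
	int ret;

	do {
		ret = wait_once();
		tries++;
		/*
		 * Intended to retry on -EAGAIN only. Where EWOULDBLOCK
		 * has the same value as EAGAIN, as on Linux, this test
		 * also matches -EWOULDBLOCK and the loop keeps going.
		 */
	} while (ret == -EAGAIN && tries < MAX_TRIES);

	assert(tries <= MAX_TRIES);
	return tries;
}
```

On a platform where the two constants are equal the loop runs until the bound; where they differ it stops after one attempt.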
Cc: Darren Hart <dvhltc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: John Stultz <johnstul@linux.vnet.ibm.com>
Cc: Dinakar Guniguntala <dino@in.ibm.com>
LKML-Reference: <4AE23C74.1090502@us.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Paul E. McKenney [Mon, 26 Oct 2009 20:57:44 +0000 (13:57 -0700)]
rcu: Fix TINY_RCU #elif condition
Some compilers are happy with "#elif CONFIG_RCU_TINY", while
others strongly prefer "#elif defined(CONFIG_RCU_TINY)". Change
to the latter to make more compilers happy.
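A sketch of the preferred form. In standard C an undefined identifier in "#elif CONFIG_RCU_TINY" evaluates to 0, which warning-heavy builds (for example GCC with -Wundef) may complain about, whereas defined() asks the question directly. The CONFIG_RCU_TREE symbol, the RCU_FLAVOR macro and rcu_flavor() are hypothetical, and the #define below stands in for a Kconfig-generated one.

```c
#include <assert.h>
#include <string.h>

/* Assumption: stands in for a define normally generated by Kconfig. */
#define CONFIG_RCU_TINY 1

#if defined(CONFIG_RCU_TREE)
#define RCU_FLAVOR "tree"
#elif defined(CONFIG_RCU_TINY)   /* rather than "#elif CONFIG_RCU_TINY" */
#define RCU_FLAVOR "tiny"
#else
#define RCU_FLAVOR "none"
#endif

/* Reports which branch of the conditional was compiled in. */
static const char *rcu_flavor(void)
{
	return RCU_FLAVOR;
}
```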
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12565906642768-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 26 Oct 2009 17:24:31 +0000 (10:24 -0700)]
rcu: Simplify creating of lockdep class for root rcu_node
Use lockdep_set_class() to simplify the code and to avoid any
additional overhead in the !LOCKDEP case. Also move the
definition of rcu_root_class into kernel/rcutree.c, as suggested
by Lai Jiangshan.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1256577871443-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 26 Oct 2009 06:55:55 +0000 (07:55 +0100)]
rcu: Do tiny cleanups in rcutiny
No change in functionality - just straighten out a few small
stylistic details.
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
LKML-Reference: <12565226351355-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 26 Oct 2009 02:03:54 +0000 (19:03 -0700)]
rcu: Improve rcutorture diagnostics when bad torture_type specified
Make rcutorture list the available torture_type values when it
doesn't like the one specified.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
LKML-Reference: <12565226351868-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 26 Oct 2009 02:03:53 +0000 (19:03 -0700)]
rcu: Add synchronize_srcu_expedited() to the documentation
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
LKML-Reference: <12565226354176-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 26 Oct 2009 02:03:52 +0000 (19:03 -0700)]
rcu: Add synchronize_srcu_expedited() to the rcutorture test suite
Adds the "srcu_expedited" torture type, and also renames
sched_ops_sync to sched_sync_ops for consistency while we are in
this file.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
LKML-Reference: <12565226353636-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 26 Oct 2009 02:03:51 +0000 (19:03 -0700)]
rcu: Add synchronize_srcu_expedited()
This patch creates a synchronize_srcu_expedited() that uses
synchronize_sched_expedited() where synchronize_srcu()
uses synchronize_sched(). The synchronize_srcu() and
synchronize_srcu_expedited() functions become one-liners that
pass synchronize_sched() or synchronize_sched_expedited(),
respectively, to a new __synchronize_srcu() function.
While in the file, move the EXPORT_SYMBOL_GPL()s to immediately
follow the corresponding functions.
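The shape of the refactoring can be sketched in user-space C. Everything here is a stand-in: the *_stub() functions merely count calls instead of waiting for a grace period, synchronize_srcu_common() plays the role of __synchronize_srcu(), and its body is heavily simplified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct srcu_struct. */
struct srcu_sketch {
	int completed;
};

static int sched_calls;
static int sched_expedited_calls;

/* Stubs: the real primitives wait for a grace period. */
static void synchronize_sched_stub(void)
{
	sched_calls++;
}

static void synchronize_sched_expedited_stub(void)
{
	sched_expedited_calls++;
}

/* Common worker: the grace-period primitive is passed in by the caller. */
static void synchronize_srcu_common(struct srcu_sketch *sp,
				    void (*sync_func)(void))
{
	assert(sp != NULL && sync_func != NULL);
	sync_func();      /* wait for pre-existing readers (stubbed) */
	sp->completed++;  /* advance the counter (simplified) */
	sync_func();      /* wait again after the update (stubbed) */
}

/* The two public entry points become one-liners. */
static void synchronize_srcu_sketch(struct srcu_sketch *sp)
{
	synchronize_srcu_common(sp, synchronize_sched_stub);
}

static void synchronize_srcu_expedited_sketch(struct srcu_sketch *sp)
{
	synchronize_srcu_common(sp, synchronize_sched_expedited_stub);
}
```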
Requested-by: Avi Kivity <avi@redhat.com>
Tested-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: avi@redhat.com
LKML-Reference: <12565226354038-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Mon, 26 Oct 2009 02:03:50 +0000 (19:03 -0700)]
rcu: "Tiny RCU", The Bloatwatch Edition
This patch provides a version of RCU designed for !SMP builds that need
a small-footprint RCU implementation. In particular, the
implementation of synchronize_rcu() is extremely lightweight and
high performance. It passes rcutorture testing in each of the
four relevant configurations (combinations of NO_HZ and PREEMPT)
on x86. This saves about 1K bytes compared to old Classic RCU
(which is no longer in mainline), and more than three kilobytes
compared to Hierarchical RCU (updated to 2.6.30):
CONFIG_TREE_RCU:
   text    data     bss     dec filename
    183       4       0     187 kernel/rcupdate.o
   2783     520      36    3339 kernel/rcutree.o
                           3526 Total (vs 4565 for v7)
CONFIG_TREE_PREEMPT_RCU:
   text    data     bss     dec filename
    263       4       0     267 kernel/rcupdate.o
   4594     776      52    5422 kernel/rcutree.o
                           5689 Total (vs 6155 for v7)
CONFIG_TINY_RCU:
   text    data     bss     dec filename
     96       4       0     100 kernel/rcupdate.o
    734      24       0     758 kernel/rcutiny.o
                            858 Total (vs 848 for v7)
The above is for x86. Your mileage may vary on other platforms.
Further compression is possible, but is being procrastinated.
Changes from v7 (http://lkml.org/lkml/2009/10/9/388)
o Apply Lai Jiangshan's review comments (aside from
might_sleep() in synchronize_sched(), which is covered by SMP builds).
o Fix up expedited primitives.
Changes from v6 (http://lkml.org/lkml/2009/9/23/293).
o Forward ported to put it into the 2.6.33 stream.
o Added lockdep support.
o Make lightweight rcu_barrier.
Changes from v5 (http://lkml.org/lkml/2009/6/23/12).
o Ported to latest pre-2.6.32 merge window kernel.
- Renamed rcu_qsctr_inc() to rcu_sched_qs().
- Renamed rcu_bh_qsctr_inc() to rcu_bh_qs().
- Provided trivial rcu_cpu_notify().
- Provided trivial exit_rcu().
- Provided trivial rcu_needs_cpu().
- Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h.
o Removed the dependence on EMBEDDED, with a view to making
TINY_RCU default for !SMP at some time in the future.
o Added (trivial) support for expedited grace periods.
Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include:
o Squeeze the size down a bit further by removing the
->completed field from struct rcu_ctrlblk.
o This permits synchronize_rcu() to become the empty function.
Previous concerns about rcutorture were unfounded, as
rcutorture correctly handles a constant value from
rcu_batches_completed() and rcu_batches_completed_bh().
Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include:
o Changed rcu_batches_completed(), rcu_batches_completed_bh()
rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and
rcu_nmi_exit(), to be static inlines, as suggested by David
Howells. Doing this saves about 100 bytes from rcutiny.o.
(The numbers between v3 and this v4 of the patch are not directly
comparable, since they are against different versions of Linux.)
Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include:
o Fix whitespace issues.
o Change short-circuit "||" operator to instead be "+" in order
to fix performance bug noted by "kraai" on LWN.
(http://lwn.net/Articles/324348/)
Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include:
o This version depends on EMBEDDED as well as !SMP, as suggested
by Ingo.
o Updated rcu_needs_cpu() to unconditionally return zero,
permitting the CPU to enter dynticks-idle mode at any time.
This works because callbacks can be invoked upon entry to
dynticks-idle mode.
o Paul is now OK with this being included, based on a poll at
the Kernel Miniconf at linux.conf.au, where about ten people said
that they cared about saving 900 bytes on single-CPU systems.
o Applies to both mainline and tip/core/rcu.
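The v2 change from "||" to "+" noted above comes down to operand evaluation: "||" skips its right-hand operand once the left one is nonzero, while "+" always evaluates both. The sketch below shows only that difference; ctrlblk_sketch, qsctr_help_sketch() and the need_softirq_*() wrappers are hypothetical, and the assumption that the helper has a side effect worth keeping is this sketch's, not a statement about the original bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control block: one per RCU flavor. */
struct ctrlblk_sketch {
	int ready;     /* callbacks are ready to invoke */
	int advanced;  /* set once the helper has run on this block */
};

/* Has a side effect and reports whether callbacks are ready. */
static int qsctr_help_sketch(struct ctrlblk_sketch *cb)
{
	assert(cb != NULL);
	cb->advanced = 1;
	return cb->ready;
}

/* Short-circuit form: b is skipped whenever a reports ready. */
static int need_softirq_or(struct ctrlblk_sketch *a, struct ctrlblk_sketch *b)
{
	return qsctr_help_sketch(a) || qsctr_help_sketch(b);
}

/* "+" form: both helpers always run. */
static int need_softirq_plus(struct ctrlblk_sketch *a, struct ctrlblk_sketch *b)
{
	return (qsctr_help_sketch(a) + qsctr_help_sketch(b)) != 0;
}
```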
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
LKML-Reference: <12565226351355-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Darren Hart [Thu, 15 Oct 2009 22:30:48 +0000 (15:30 -0700)]
futex: Move drop_futex_key_refs out of spinlock'ed region
When requeuing tasks from one futex to another, the reference held
by the requeued task to the original futex location needs to be
dropped eventually.
Dropping the reference may ultimately lead to a call to
"iput_final" and subsequently call into filesystem-specific code,
which may be non-atomic.
It is therefore safer to defer this drop operation until after the
futex_hash_bucket spinlock has been dropped.
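The deferral can be sketched as "count under the lock, drop after the unlock". This is a single-threaded model: key_sketch, the bucket_*() helpers and requeue_sketch() are hypothetical, and a boolean flag stands in for the futex_hash_bucket spinlock so that the sketch can assert that no drop happens while it is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a futex key with a reference count. */
struct key_sketch {
	int refs;
};

/* Models "the hash-bucket spinlock is currently held". */
static bool bucket_locked;

static void bucket_lock(void)
{
	assert(!bucket_locked);
	bucket_locked = true;
}

static void bucket_unlock(void)
{
	assert(bucket_locked);
	bucket_locked = false;
}

/* May end up in non-atomic code, so it must not run under the lock. */
static void drop_key_ref(struct key_sketch *key)
{
	assert(!bucket_locked);
	assert(key->refs > 0);
	key->refs--;
}

/* Returns the number of references dropped. */
static int requeue_sketch(struct key_sketch *old_key, int nr_waiters)
{
	int drop_count = 0;
	int i;

	bucket_lock();
	for (i = 0; i < nr_waiters; i++)
		drop_count++;  /* waiter moved: remember that a drop is owed */
	bucket_unlock();

	/* Only now, with the lock released, pay the owed drops. */
	for (i = 0; i < drop_count; i++)
		drop_key_ref(old_key);
	return drop_count;
}
```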
Originally-From: Helge Bahmann <hcb@chaoticmind.net>
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: <stable@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dinakar Guniguntala <dino@in.ibm.com>
Cc: John Stultz <johnstul@linux.vnet.ibm.com>
Cc: Sven-Thorsten Dietrich <sdietrich@novell.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <4AD7A298.5040802@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Thu, 15 Oct 2009 16:26:14 +0000 (09:26 -0700)]
rcu: Fix TREE_PREEMPT_RCU CPU_HOTPLUG bad-luck hang
If the following sequence of events occurs, then
TREE_PREEMPT_RCU will hang waiting for a grace period to
complete, eventually OOMing the system:
o A TREE_PREEMPT_RCU build of the kernel is booted on a system
with more than 64 physical CPUs present (32 on a 32-bit system).
Alternatively, a TREE_PREEMPT_RCU build of the kernel is booted
with RCU_FANOUT set to a sufficiently small value that the
physical CPUs populate two or more leaf rcu_node structures.
o A task is preempted in an RCU read-side critical section
while running on a CPU corresponding to a given leaf rcu_node
structure.
o All CPUs corresponding to this same leaf rcu_node structure
record quiescent states for the current grace period.
o All of these same CPUs go offline (hence the need for enough
physical CPUs to populate more than one leaf rcu_node structure).
This causes the preempted task to be moved to the root rcu_node
structure.
At this point, there is nothing left to cause the quiescent
state to be propagated up the rcu_node tree, so the current
grace period never completes.
The simplest fix, especially after considering the deadlock
possibilities, is to detect this situation when the last CPU is
offlined, and to set that CPU's ->qsmask bit in its leaf
rcu_node structure. This will cause the next invocation of
force_quiescent_state() to end the grace period.
Without this fix, this hang can be triggered in an hour or so on
some machines with rcutorture and random CPU onlining/offlining.
With this fix, these same machines pass a full 10 hours of this
sort of abuse.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <20091015162614.GA19131@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Wed, 14 Oct 2009 17:15:59 +0000 (10:15 -0700)]
rcu: Update trace.txt documentation for blocked-tasks lists
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: npiggin@suse.de
Cc: jens.axboe@oracle.com
LKML-Reference: <12555405592804-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Wed, 14 Oct 2009 17:15:54 +0000 (10:15 -0700)]
rcu: Update trace.txt documentation to reflect recent changes
o Remove the CONFIG_PREEMPT_RCU documentation since this
config option has now been removed.
o Change the now-incorrect references to "rcu" labels to
instead be "rcu_sched".
o Add notes stating that CONFIG_TREE_PREEMPT_RCU kernels will
have additional "rcu_preempt" output.
o Note the new "oqlen" field in the rcuhier output (for
RCU callbacks orphaned by an offlined CPU).
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: npiggin@suse.de
Cc: jens.axboe@oracle.com
LKML-Reference: <1255540559799-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Wed, 14 Oct 2009 23:36:38 +0000 (16:36 -0700)]
rcu: Add rnp->blocked_tasks to tracing
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: npiggin@suse.de
Cc: jens.axboe@oracle.com
Cc: Josh Triplett <josh@joshtriplett.org>
LKML-Reference: <20091014233638.GE6763@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/rcutree_trace.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
Paul E. McKenney [Wed, 14 Oct 2009 17:15:56 +0000 (10:15 -0700)]
rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU
For the short term, map synchronize_rcu_expedited() to
synchronize_rcu() for TREE_PREEMPT_RCU and to
synchronize_sched_expedited() for TREE_RCU.
Longer term, there needs to be a real expedited grace period for
TREE_PREEMPT_RCU, but candidate patches to date are considerably
more complex and intrusive.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: npiggin@suse.de
Cc: jens.axboe@oracle.com
LKML-Reference: <12555405592331-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul E. McKenney [Wed, 14 Oct 2009 17:15:55 +0000 (10:15 -0700)]
rcu: Prevent RCU IPI storms in presence of high call_rcu() load
As the number of callbacks on a given CPU rises, invoke
force_quiescent_state() only every blimit number of callbacks
(defaults to 10,000), and even then only if no other CPU has
invoked force_quiescent_state() in the meantime.
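The throttling rule can be sketched as follows. The structures, field names and the threshold parameter are hypothetical stand-ins chosen for the illustration; force_qs_stub() only counts invocations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global state: how often anyone has forced quiescent states. */
struct global_sketch {
	unsigned long n_force_qs;
};

/* Hypothetical per-CPU state. */
struct cpu_cb_sketch {
	long qlen;                      /* callbacks currently queued */
	long qlen_last_check;           /* qlen at the last threshold check */
	unsigned long n_force_qs_snap;  /* global count seen at that check */
};

static void force_qs_stub(struct global_sketch *gs)
{
	gs->n_force_qs++;
}

/* Models the enqueue path of call_rcu(). */
static void enqueue_callback_sketch(struct global_sketch *gs,
				    struct cpu_cb_sketch *cs, long threshold)
{
	assert(gs != NULL && cs != NULL && threshold > 0);

	cs->qlen++;
	if (cs->qlen <= cs->qlen_last_check + threshold)
		return;  /* not enough new callbacks since the last check */

	/* Force only if no other CPU has done so since our snapshot. */
	if (gs->n_force_qs == cs->n_force_qs_snap)
		force_qs_stub(gs);

	cs->n_force_qs_snap = gs->n_force_qs;
	cs->qlen_last_check = cs->qlen;
}
```

With a threshold of 10,000, a CPU queueing callbacks in a tight loop would in this model trigger at most one forcing attempt per 10,000 callbacks, and none at all while other CPUs are already doing the forcing.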
This should fix the performance regression reported by Nick.
Reported-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: jens.axboe@oracle.com
LKML-Reference: <12555405592133-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Darren Hart [Wed, 14 Oct 2009 17:12:39 +0000 (10:12 -0700)]
futex: Check for NULL keys in match_futex
If userspace tries to perform a requeue_pi on a non-requeue_pi waiter,
it will find the futex_q->requeue_pi_key to be NULL and OOPS.
Check for NULL in match_futex() instead of doing explicit NULL pointer
checks on all call sites. While match_futex(NULL, NULL) returning
false is a little odd, it's still correct as we expect valid key
references.
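A sketch of the NULL-tolerant comparison. The key layout below (futex_key_sketch with word, ptr and offset) is a simplified, hypothetical stand-in for the real futex key.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a futex key. */
struct futex_key_sketch {
	unsigned long word;
	void *ptr;
	int offset;
};

/* Returns nonzero only if both keys are present and equal. */
static int match_futex_sketch(const struct futex_key_sketch *key1,
			      const struct futex_key_sketch *key2)
{
	return key1 && key2 &&
	       key1->word == key2->word &&
	       key1->ptr == key2->ptr &&
	       key1->offset == key2->offset;
}
```

Because the check lives in the comparison itself, callers can pass a possibly-NULL requeue_pi_key straight through; two NULL keys compare as "no match", as noted above.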
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dinakar Guniguntala <dino@in.ibm.com>
CC: John Stultz <johnstul@us.ibm.com>
Cc: stable@kernel.org
LKML-Reference: <4AD60687.10306@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Tue, 13 Oct 2009 18:40:43 +0000 (20:40 +0200)]
futex: Handle spurious wake up
The futex code does not handle spurious wake up in futex_wait and
futex_wait_requeue_pi.
The code assumes that any wake up which was not caused by futex_wake /
requeue or by a timeout was caused by a signal wake up and returns one
of the syscall restart error codes.
In case of a spurious wake up the signal delivery code which deals
with the restart error codes is not invoked and we return that error
code to user space. That causes applications which actually check the
return codes to fail. Blaise reported that on preempt-rt a python test
program ran into an exception trap. -rt exposed that due to a built in
spurious wake up accelerator :)
Solve this by checking signal_pending(current) in the wake up path and
handle the spurious wake up case w/o returning to user space.
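The decision made in the wake up path can be sketched with a scripted sequence of wakeups. All names here are hypothetical, including ERESTARTSYS_SKETCH, which stands in for the kernel-internal restart code that is not visible to user space; the real code blocks rather than walking an array.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel-internal syscall restart code. */
#define ERESTARTSYS_SKETCH 512

enum wake_reason { WAKE_FUTEX, WAKE_TIMEOUT, WAKE_OTHER };

/* One scripted wakeup and whether a signal is pending at that moment. */
struct wake_event {
	enum wake_reason reason;
	bool signal_pending;
};

/* Walks the scripted wakeups; *spurious counts the ones that were ignored. */
static int futex_wait_sketch(const struct wake_event *ev, int n, int *spurious)
{
	int i;

	assert(ev != NULL && spurious != NULL);
	*spurious = 0;

	for (i = 0; i < n; i++) {
		if (ev[i].reason == WAKE_FUTEX)
			return 0;
		if (ev[i].reason == WAKE_TIMEOUT)
			return -ETIMEDOUT;
		/* Neither wake nor timeout: only a real signal may restart. */
		if (ev[i].signal_pending)
			return -ERESTARTSYS_SKETCH;
		(*spurious)++;  /* spurious: go back to waiting */
	}
	return -ETIMEDOUT;  /* script exhausted; treated as a timeout here */
}
```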
Reported-by: Blaise Gassend <blaise@willowgarage.com>
Debugged-by: Darren Hart <dvhltc@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
LKML-Reference: <new-submission>
Ingo Molnar [Mon, 12 Oct 2009 21:26:36 +0000 (23:26 +0200)]
Merge branch 'urgent' of git://git./linux/kernel/git/rric/oprofile into core/urgent
Robert Richter [Fri, 9 Oct 2009 01:17:44 +0000 (03:17 +0200)]
oprofile: warn on freeing event buffer too early
A race shouldn't happen since all workqueues or handlers are canceled
or flushed before the event buffer is freed. A warning is triggered
now if the buffer is freed too early.
Also, this patch adds some comments about event buffer protection,
reworks some code and adds code to clear buffer_pos during alloc and
free of the event buffer.
Cc: David Rientjes <rientjes@google.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
David Rientjes [Wed, 9 Sep 2009 13:02:33 +0000 (15:02 +0200)]
oprofile: fix race condition in event_buffer free
Looking at the 2.6.31-rc9 code, it appears there is a race condition
in the event_buffer cleanup code path (shutdown). This could lead to
kernel panic as some CPUs may be operating on the event buffer AFTER
it has been freed. The attached patch solves the problem and makes
sure CPUs check if the buffer is not NULL before they access it as
some may have been spinning on the mutex while the buffer was being
freed.
The race may happen if the buffer is freed during pending reads. But
it is not clear why there are races in add_event_entry() since all
workqueues or handlers are canceled or flushed before the event buffer
is freed.
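The NULL check can be sketched in user-space C. This model is single-threaded, so the mutex is only indicated in comments; the *_sketch() function names and return values are hypothetical, while event_buffer and buffer_pos follow the names used in these commit messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static unsigned long *event_buffer;
static size_t buffer_size;
static size_t buffer_pos;

/* Returns 0 on success, -1 on allocation failure. */
static int alloc_event_buffer_sketch(size_t entries)
{
	/* The real code would hold the buffer mutex here. */
	assert(event_buffer == NULL && entries > 0);
	event_buffer = malloc(entries * sizeof(*event_buffer));
	if (!event_buffer)
		return -1;
	buffer_size = entries;
	buffer_pos = 0;
	return 0;
}

static void free_event_buffer_sketch(void)
{
	/* The real code would hold the buffer mutex here. */
	free(event_buffer);
	event_buffer = NULL;  /* late writers must see "no buffer" */
	buffer_size = 0;
	buffer_pos = 0;
}

/* Returns 0 if the entry was stored, -1 if it was dropped. */
static int add_event_entry_sketch(unsigned long value)
{
	/* Caller is assumed to hold the buffer mutex. */
	if (!event_buffer)
		return -1;  /* buffer already freed (or never allocated) */
	if (buffer_pos == buffer_size)
		return -1;  /* buffer full */
	event_buffer[buffer_pos++] = value;
	return 0;
}
```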
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Peter Zijlstra [Fri, 9 Oct 2009 08:12:41 +0000 (10:12 +0200)]
lockdep: Use cpu_clock() for lockstat
Some tracepoint magic (TRACE_EVENT(lock_acquired)) relies on
the fact that lock hold times are positive and uses div64 on
that. That triggered a build warning on MIPS, and probably
causes bad output in certain circumstances as well.
Make it truly positive.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1254818502.21044.112.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Thu, 8 Oct 2009 19:22:45 +0000 (12:22 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
pata_atp867x: add Power Management support
pata_atp867x: PIO support fixes
pata_atp867x: clarifications in timings calculations and cable detection
pata_atp867x: fix it to not claim MWDMA support
libata: fix incorrect link online check during probe
ahci: filter FPDMA non-zero offset enable for Aspire 3810T
libata: make gtf_filter per-dev
libata: implement more acpi filtering options
libata: cosmetic updates
ahci: display all AHCI 1.3 HBA capability flags (v2)
pata_ali: trivial fix of a very frequent spelling mistake
ahci: disable 64bit DMA by default on SB600s
Linus Torvalds [Thu, 8 Oct 2009 19:16:35 +0000 (12:16 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: fix requeue_pi key imbalance
futex: Fix typo in FUTEX_WAIT/WAKE_BITSET_PRIVATE definitions
rcu: Place root rcu_node structure in separate lockdep class
rcu: Make hot-unplugged CPU relinquish its own RCU callbacks
rcu: Move rcu_barrier() to rcutree
futex: Move exit_pi_state() call to release_mm()
futex: Nullify robust lists after cleanup
futex: Fix locking imbalance
panic: Fix panic message visibility by calling bust_spinlocks(0) before dying
rcu: Replace the rcu_barrier enum with pointer to call_rcu*() function
rcu: Clean up code based on review feedback from Josh Triplett, part 4
rcu: Clean up code based on review feedback from Josh Triplett, part 3
rcu: Fix rcu_lock_map build failure on CONFIG_PROVE_LOCKING=y
rcu: Clean up code to address Ingo's checkpatch feedback
rcu: Clean up code based on review feedback from Josh Triplett, part 2
rcu: Clean up code based on review feedback from Josh Triplett
Linus Torvalds [Thu, 8 Oct 2009 19:07:24 +0000 (12:07 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Set correct normal_prio and prio values in sched_fork()
Linus Torvalds [Thu, 8 Oct 2009 19:06:36 +0000 (12:06 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, pci: Correct spelling in a comment
x86: Simplify bound checks in the MTRR code
x86: EDAC: carve out AMD MCE decoding logic
initcalls: Add early_initcall() for modules
x86: EDAC: MCE: Fix MCE decoding callback logic
Linus Torvalds [Thu, 8 Oct 2009 19:06:09 +0000 (12:06 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: user local buffer variable for trace branch tracer
tracing: fix warning on kernel/trace/trace_branch.c andtrace_hw_branches.c
ftrace: check for failure for all conversions
tracing: correct module boundaries for ftrace_release
tracing: fix transposed numbers of lock_depth and preempt_count
trace: Fix missing assignment in trace_ctxwake_*
tracing: Use free_percpu instead of kfree
tracing: Check total refcount before releasing bufs in profile_enable failure
Linus Torvalds [Thu, 8 Oct 2009 19:05:50 +0000 (12:05 -0700)]
Merge branch 'sparc-perf-events-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sparc-perf-events-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA
perf_event: Provide vmalloc() based mmap() backing
Linus Torvalds [Thu, 8 Oct 2009 19:05:00 +0000 (12:05 -0700)]
Merge branch 'perf-fixes-for-linus-2' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_events: Make ABI definitions available to userspace
perf tools: elf_sym__is_function() should accept "zero" sized functions
tracing/syscalls: Use long for syscall ret format and field definitions
perf trace: Update eval_flag() flags array to match interrupt.h
perf trace: Remove unused code in builtin-trace.c
perf: Propagate term signal to child
Linus Torvalds [Thu, 8 Oct 2009 19:04:04 +0000 (12:04 -0700)]
Merge branch 'timers-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, timers: Check for pending timers after (device) interrupts
NOHZ: update idle state also when NOHZ is inactive
Linus Torvalds [Thu, 8 Oct 2009 19:03:21 +0000 (12:03 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: ice1724: increase SPDIF and independent stereo buffer sizes
ALSA: opl3: circular locking in the snd_opl3_note_on() and snd_opl3_note_off()
ALSA: ICE1712/24 - Change the Multi Track Peak control (level meters) from MIXER to PCM type
ALSA: hda - Fix yet another auto-mic bug in ALC268
ASoC: WM8350 capture PGA mutes are inverted
ASoC: Remove absent SYNC and TDM DAI format options from i.MX SSI
sound: via82xx: move DXS volume controls to PCM interface
ALSA: hda - Don't pick up invalid HP pins in alc_subsystem_id()
ALSA: hda - Add a workaround for ASUS A7K
ALSA: hda - Fix invalid initializations for ALC861 auto mode
ASoC: wm8940: Fix check on error code form snd_soc_codec_set_cache_io
ASoC: Fix SND_SOC_DAPM_LINE handling
Linus Torvalds [Thu, 8 Oct 2009 19:02:06 +0000 (12:02 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (24 commits)
drm/radeon/kms: fix vline register for second head.
drm/r600: avoid assigning vb twice in blit code
drm/radeon: use list_for_each_entry instead of list_for_each
drm/radeon/kms: Fix AGP support for R600/RV770 family (v2)
drm/radeon/kms: Fallback to non AGP when acceleration fails to initialize (v2)
drm/radeon/kms: Fix RS600/RV515/R520/RS690 IRQ
drm/radeon: Fix setting of bits
drm/ttm: fix refcounting in ttm global code.
drm/fb: add more correct 8/16/24/32 bpp fb support.
drm/fb: add setcmap and fix 8-bit support.
drm/radeon/kms: respect single crtc cards, only create one crtc. (v2)
drm: Delete the DRM_DEBUG_KMS in drm_mode_cursor_ioctl
drm/radeon/kms: add support for "Surround View"
drm/radeon/kms: Fix irq handling on AVIVO hw
drm/radeon/kms: R600/RV770 remove dead code and print message for wrong BIOS
drm/radeon/kms: Fix R600/RV770 disable acceleration path
drm/radeon/kms: Fix R600/RV770 startup path & reset
drm/radeon/kms: Fix R600 write back buffer
drm/radeon/kms: Remove old init path as no hw use it anymore
drm/radeon/kms: Convert RS600 to new init path
...
Linus Torvalds [Thu, 8 Oct 2009 19:01:01 +0000 (12:01 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
omapfb: Blizzard: constify register address tables
omapfb: Blizzard: fix pointer to be const
omapfb: Condition mutex acquisition
omap: iovmm: Add missing mutex_unlock
omap: iovmm: Fix incorrect spelling
omap: SRAM: flush the right address after memcpy in omap_sram_push
omap: Lock DPLL5 at boot
omap: Fix incorrect 730 vs 850 detection
OMAP3: PM: introduce a new powerdomain walk helper
OMAP3: PM: Enable GPIO module-level wakeups
OMAP3: PM: USBHOST: clear wakeup events on both hosts
OMAP3: PM: PRCM interrupt: only handle selected PRCM interrupts
OMAP3: PM: PRCM interrupt: check MPUGRPSEL register
OMAP3: PM: Prevent hang in prcm_interrupt_handler
Linus Torvalds [Thu, 8 Oct 2009 19:00:39 +0000 (12:00 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
amd64_edac: beef up DRAM error injection
amd64_edac: fix DRAM base and limit extraction
amd64_edac: fix chip select handling
amd64_edac: simple fix to allow reporting of CECC errors
amd64_edac: fix K8 intlv_sel check
amd64_edac: fix interleave enable tests
amd64_edac: fix DRAM base and limit address extraction
amd64_edac: fix driver instance lookup table allocation
Linus Torvalds [Thu, 8 Oct 2009 18:59:30 +0000 (11:59 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
ethoc: limit the number of buffers to 128
ethoc: use system memory as buffer
ethoc: align received packet to make IP header at word boundary
ethoc: fix buffer address mapping
ethoc: fix typo to compute number of tx descriptors
au1000_eth: Duplicate test of RX_OVERLEN bit in update_rx_stats()
netxen: Fix Unlikely(x) > y
pasemi_mac: ethtool get settings fix
add maintainer for network drop monitor kernel service
tg3: Fix phylib locking strategy
rndis_host: support ETHTOOL_GPERMADDR
ipv4: arp_notify address list bug
gigaset: add kerneldoc comments
gigaset: correct debugging output selection
gigaset: improve error recovery
gigaset: fix device ERROR response handling
gigaset: announce if built with debugging
gigaset: handle isoc frame errors more gracefully
gigaset: linearize skb
gigaset: fix reject/hangup handling
...
Linus Torvalds [Thu, 8 Oct 2009 18:59:06 +0000 (11:59 -0700)]
Merge git://git./linux/kernel/git/davem/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
Revert "Revert "ide: try to use PIO Mode 0 during probe if possible""
sis5513: fix PIO setup for ATAPI devices
Arjan van de Ven [Thu, 8 Oct 2009 13:40:41 +0000 (06:40 -0700)]
x86, timers: Check for pending timers after (device) interrupts
Now that range timers and deferred timers are common, I found a
problem with these using the "perf timechart" tool. Frans Pop also
reported high scheduler latencies via LatencyTop, when using
iwlagn.
It turns out that on x86, these two 'opportunistic' timers only get
checked when another "real" timer happens. These opportunistic
timers aim to save power by hitchhiking on other wakeups, so as to
avoid causing CPU wakeups themselves as much as possible.
The change in this patch runs this check not only at timer
interrupts, but at all (device) interrupts. The effect is that:
1) the deferred timers/range timers get delayed less
2) the range timers cause fewer wakeups by themselves because
the percentage of hitchhiking on existing wakeup events goes up.
I've verified the patch using "perf timechart"; the bug it
originally exposed is gone with this patch. Frans also reported
success - the latencies are now down in the expected ~10 msec
range.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Miller [Mon, 21 Sep 2009 19:22:34 +0000 (12:22 -0700)]
mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA
When a vmalloc'd area is mmap'd into userspace, some kind of
co-ordination is necessary for this to work on platforms with cpu
D-caches which can have aliases.
Otherwise kernel side writes won't be seen properly in userspace
and vice versa.
If the kernel side mapping and the user side one have the same
alignment, modulo SHMLBA, this can work as long as VM_SHARED is
set on the VMA, which is true for all current users. VM_SHARED
will force SHMLBA alignment of the user side mmap on platforms where
D-cache aliasing matters.
The bulk of this patch is just making it so that a specific
alignment can be passed down into __get_vm_area_node(). All
existing callers pass in '1' which preserves existing behavior.
vmalloc_user() gives SHMLBA for the alignment.
As a side effect this should get the video media drivers and other
vmalloc_user() users into more working shape on such systems.
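As a rough illustration of the alignment argument (a user-space sketch, not
kernel code): two mappings of the same page avoid D-cache aliasing when their
virtual addresses agree modulo SHMLBA. The SHMLBA value, the addresses and the
helper names below are all made-up example values, not taken from the patch.

```python
# Illustrative model only. SHMLBA is architecture-specific; 0x4000 is an
# assumed example value, and both helpers are hypothetical names.
SHMLBA = 0x4000


def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def same_cache_colour(kaddr, uaddr, shmlba=SHMLBA):
    """True when both mappings have the same offset modulo shmlba."""
    return kaddr % shmlba == uaddr % shmlba


# A merely page-aligned kernel address can land on a different cache colour
# than an SHMLBA-aligned user mapping ...
unaligned_kaddr = 0xC1003000
user_addr = 0x40008000
# ... while an SHMLBA-aligned kernel address cannot.
aligned_kaddr = align_up(unaligned_kaddr, SHMLBA)
```

With both sides aligned to SHMLBA, the offsets modulo SHMLBA are both zero, so
the two mappings index the same cache lines.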
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200909211922.n8LJMYjw029425@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Thu, 8 Oct 2009 14:40:19 +0000 (07:40 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/kyle/parisc-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
agp: parisc-agp.c - use correct page_mask function
parisc: Fix linker script breakage.
parisc: convert to asm-generic/hardirq.h
parisc: Make THREAD_SIZE available to assembly files and linker scripts.
parisc: correct use of SHF_ALLOC
parisc: rename parisc's vmalloc_start to parisc_vmalloc_start
parisc: add me to Maintainers
parisc: includecheck fix: signal.c
parisc: HAVE_ARCH_TRACEHOOK
parisc: add skeleton syscall.h
parisc: stop using task->ptrace for {single,block}step flags
parisc: split syscall_trace into two halves
parisc: add missing TI_TASK macro in syscall.S
parisc: tracehook_signal_handler
parisc: tracehook_report_syscall
Samu Onkalo [Wed, 7 Oct 2009 23:32:35 +0000 (16:32 -0700)]
lis3lv02d_spi: module unload didn't remove sysfs entry
In module unload, lis3lv02d core driver sysfs clean up was not called.
Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
Acked-by: Daniel Mack <daniel@caiaq.de>
Cc: Éric Piel <eric.piel@tremplin-utc.net>
Cc: "Trisal, Kalhan" <kalhan.trisal@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Vrabel [Wed, 7 Oct 2009 23:32:33 +0000 (16:32 -0700)]
mmc: sdio: don't require CISTPL_VERS_1 to contain 4 strings
The PC Card 8.0 specification (vol. 4, section 3.2.10) says the
TPLLV1_INFO field of the CISTPL_VERS_1 tuple must contain 4 strings. Some
cards don't have all 4 so just parse as many as we can.
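The "parse as many as we can" approach can be sketched roughly as follows. This
is a user-space model, not the driver code; the function name is hypothetical,
and the tuple layout (two version bytes, then NUL-terminated strings, with 0xFF
ending the list) is an assumption stated here rather than something the commit
message spells out.

```python
def parse_vers_1_strings(body, max_strings=4):
    """Collect up to max_strings NUL-terminated strings from a tuple body.

    Assumed layout: major and minor version bytes, then the product
    information strings, with 0xFF marking the end of the list.
    """
    strings = []
    pos = 2  # skip the two version bytes
    while pos < len(body) and len(strings) < max_strings:
        if body[pos] == 0xFF:  # end-of-list marker
            break
        end = body.find(b"\x00", pos)
        if end == -1:  # unterminated trailing data: stop parsing
            break
        strings.append(body[pos:end].decode("ascii", errors="replace"))
        pos = end + 1
    return strings
```

A card that supplies only two strings then yields a two-element list instead of
being rejected outright.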
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: David Vrabel <david.vrabel@csr.com>
Tested-by: Jonathan Cameron <jic23@cam.ac.uk>
Tested-by: Bing Zhao <bzhao@marvell.com>
Cc: Roel Kluin <roel.kluin@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:32 +0000 (16:32 -0700)]
page-types: add hwpoison/unpoison feature
For hwpoison stress testing. The debugfs mount point is assumed to be
/debug/.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:31 +0000 (16:32 -0700)]
page-types: introduce kpageflags_flags()
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:30 +0000 (16:32 -0700)]
page-types: make voffset local variables
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:30 +0000 (16:32 -0700)]
page-types: make standalone pagemap/kpageflags read routines
Refactor the code to be more modular and easier to reuse.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:29 +0000 (16:32 -0700)]
page-types: introduce checked_open()
This helps merge duplicate code (now and in the future) and makes the
main logic stand out.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:28 +0000 (16:32 -0700)]
page-types: add GPL note
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:28 +0000 (16:32 -0700)]
pagemap: document KPF_KSM and show it in page-types
It indicates to the system admin that processes mapping such pages may be
eating less physical memory than the numbers reported by legacy tools.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Wed, 7 Oct 2009 23:32:27 +0000 (16:32 -0700)]
pagemap: export KPF_HWPOISON
This flag indicates a hardware detected memory corruption on the page.
Any future access of the page data may bring down the machine.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menage [Wed, 7 Oct 2009 23:32:26 +0000 (16:32 -0700)]
cgroups: update documentation of cgroups tasks and procs files
Update documentation of cgroups tasks and procs files
Document the cgroup.procs file.
Clarify the semantics of the cgroup.procs and tasks files. Although the
current cgroup.procs interface returns a sorted and uniqified list of
pids, potential future performance enhancements could result in those
properties being removed - explicitly document this aspect of the API.
There are no existing users of cgroup.procs, so compatibility isn't an
issue. There are users of the "tasks" file, but none that would appear to
break in the event of the sorted property being broken. The standard
"libcpuset" explicitly sorts the results of reading from the tasks file,
and "libcg" and other users don't appear to care about ordering.
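A consumer that wants sorted, unique PIDs regardless of what the kernel returns
can normalize the list itself, in the spirit of what libcpuset is described as
doing. A minimal sketch, assuming the file contents are one decimal PID per
line; the function name is hypothetical.

```python
def normalize_pids(text):
    """Return the sorted, de-duplicated PIDs found in a procs/tasks file body."""
    return sorted({int(token) for token in text.split()})
```

Code written this way keeps working whether or not the kernel continues to sort
and uniquify the list.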
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput [Wed, 7 Oct 2009 23:32:25 +0000 (16:32 -0700)]
video: includecheck fix: da8xx-fb.c
fix the following 'make includecheck' warning:
drivers/video/da8xx-fb.c: linux/device.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput [Wed, 7 Oct 2009 23:32:24 +0000 (16:32 -0700)]
video: includecheck fix: msm, mddi.c
fix the following 'make includecheck' warning:
drivers/video/msm/mddi.c: linux/delay.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput [Wed, 7 Oct 2009 23:32:24 +0000 (16:32 -0700)]
fs: includecheck fix: proc, kcore.c
fix the following 'make includecheck' warning:
fs/proc/kcore.c: linux/mm.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput [Wed, 7 Oct 2009 23:32:23 +0000 (16:32 -0700)]
mm: includecheck fix: vmalloc.c
fix the following 'make includecheck' warning:
mm/vmalloc.c: linux/highmem.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 7 Oct 2009 23:32:22 +0000 (16:32 -0700)]
ksm: more on default values
Adjust the max_kernel_pages default to a quarter of totalram_pages,
instead of nr_free_buffer_pages() / 4: the KSM pages themselves come from
highmem, and even on a 16GB PAE machine, 4GB of KSM pages would only be
pinning 32MB of lowmem with their rmap_items, so no need for the more
obscure calculation (nor for its own special init function).
There is no way for the user to switch KSM on if CONFIG_SYSFS is not
enabled, so in that case default run to KSM_RUN_MERGE.
Update KSM Documentation and Kconfig to reflect the new defaults.
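The figures quoted above can be checked with a little arithmetic. This is a
sketch under stated assumptions: 4 KiB pages, one rmap_item per KSM page, and
32 bytes per rmap_item, which is the size implied by the 4GB to 32MB ratio
rather than a number given in this message.

```python
PAGE_SIZE = 4096        # assumed 4 KiB pages
RMAP_ITEM_SIZE = 32     # assumed: implied by 4GB of pages pinning 32MB
GIB = 1 << 30
MIB = 1 << 20

total_ram = 16 * GIB
totalram_pages = total_ram // PAGE_SIZE

# New default: a quarter of totalram_pages.
max_kernel_pages = totalram_pages // 4
ksm_bytes = max_kernel_pages * PAGE_SIZE

# Lowmem pinned by the rmap_items, assuming one item per KSM page.
lowmem_pinned = max_kernel_pages * RMAP_ITEM_SIZE
```

Under these assumptions, 16GB of RAM gives 1048576 KSM pages (4GB), whose
rmap_items pin 32MB of lowmem.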
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Thu, 8 Oct 2009 11:00:02 +0000 (13:00 +0200)]
Merge branch 'fix/misc' into for-linus
Takashi Iwai [Thu, 8 Oct 2009 10:59:58 +0000 (12:59 +0200)]
Merge branch 'fix/hda' into for-linus
Robert Hancock [Thu, 8 Oct 2009 02:19:21 +0000 (20:19 -0600)]
ALSA: ice1724: increase SPDIF and independent stereo buffer sizes
Increase the default and maximum PCM buffer preallocation size for ice1724's
SPDIF and independent stereo pair outputs to 256K, which is the hardware's
maximum supported size. This allows a reduction in interrupt rate and
potentially power usage when an application is not latency-critical.
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Krzysztof Helt [Wed, 7 Oct 2009 20:51:34 +0000 (22:51 +0200)]
ALSA: opl3: circular locking in the snd_opl3_note_on() and snd_opl3_note_off()
Fix the following circular locking dependency in the opl3 driver.
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-rc3 #87
-------------------------------------------------------
swapper/0 is trying to acquire lock:
(&opl3->voice_lock){..-...}, at: [<cca748fe>] snd_opl3_note_off+0x1e/0xe0 [snd_opl3_synth]
but task is already holding lock:
(&opl3->sys_timer_lock){..-...}, at: [<cca75169>] snd_opl3_timer_func+0x19/0xc0 [snd_opl3_synth]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&opl3->sys_timer_lock){..-...}:
[<c02461d5>] validate_chain+0xa25/0x1040
[<c0246aca>] __lock_acquire+0x2da/0xab0
[<c024731a>] lock_acquire+0x7a/0xa0
[<c044c300>] _spin_lock_irqsave+0x40/0x60
[<cca75046>] snd_opl3_note_on+0x686/0x790 [snd_opl3_synth]
[<cca68912>] snd_midi_process_event+0x322/0x590 [snd_seq_midi_emul]
[<cca74245>] snd_opl3_synth_event_input+0x15/0x20 [snd_opl3_synth]
[<cca4dcc0>] snd_seq_deliver_single_event+0x100/0x200 [snd_seq]
[<cca4de07>] snd_seq_deliver_event+0x47/0x1f0 [snd_seq]
[<cca4e50b>] snd_seq_dispatch_event+0x3b/0x140 [snd_seq]
[<cca5008c>] snd_seq_check_queue+0x10c/0x120 [snd_seq]
[<cca5037b>] snd_seq_enqueue_event+0x6b/0xe0 [snd_seq]
[<cca4e0fd>] snd_seq_client_enqueue_event+0xdd/0x100 [snd_seq]
[<cca4eb7a>] snd_seq_write+0xea/0x190 [snd_seq]
[<c02827b6>] vfs_write+0x96/0x160
[<c0282c9d>] sys_write+0x3d/0x70
[<c0202c45>] syscall_call+0x7/0xb
-> #0 (&opl3->voice_lock){..-...}:
[<c02467e6>] validate_chain+0x1036/0x1040
[<c0246aca>] __lock_acquire+0x2da/0xab0
[<c024731a>] lock_acquire+0x7a/0xa0
[<c044c300>] _spin_lock_irqsave+0x40/0x60
[<cca748fe>] snd_opl3_note_off+0x1e/0xe0 [snd_opl3_synth]
[<cca751f0>] snd_opl3_timer_func+0xa0/0xc0 [snd_opl3_synth]
[<c022ac46>] run_timer_softirq+0x166/0x1e0
[<c02269e8>] __do_softirq+0x78/0x110
[<c0226ac6>] do_softirq+0x46/0x50
[<c0226e26>] irq_exit+0x36/0x40
[<c0204bd2>] do_IRQ+0x42/0xb0
[<c020328e>] common_interrupt+0x2e/0x40
[<c021092f>] apm_cpu_idle+0x10f/0x290
[<c0201b11>] cpu_idle+0x21/0x40
[<c04443cd>] rest_init+0x4d/0x60
[<c055c835>] start_kernel+0x235/0x280
[<c055c066>] i386_start_kernel+0x66/0x70
other info that might help us debug this:
2 locks held by swapper/0:
#0: (&opl3->tlist){+.-...}, at: [<c022abd0>] run_timer_softirq+0xf0/0x1e0
#1: (&opl3->sys_timer_lock){..-...}, at: [<cca75169>] snd_opl3_timer_func+0x19/0xc0 [snd_opl3_synth]
stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.32-rc3 #87
Call Trace:
[<c0245188>] print_circular_bug+0xc8/0xd0
[<c02467e6>] validate_chain+0x1036/0x1040
[<c0247f14>] ? check_usage_forwards+0x54/0xd0
[<c0246aca>] __lock_acquire+0x2da/0xab0
[<c024731a>] lock_acquire+0x7a/0xa0
[<cca748fe>] ? snd_opl3_note_off+0x1e/0xe0 [snd_opl3_synth]
[<c044c300>] _spin_lock_irqsave+0x40/0x60
[<cca748fe>] ? snd_opl3_note_off+0x1e/0xe0 [snd_opl3_synth]
[<cca748fe>] snd_opl3_note_off+0x1e/0xe0 [snd_opl3_synth]
[<c044c307>] ? _spin_lock_irqsave+0x47/0x60
[<cca751f0>] snd_opl3_timer_func+0xa0/0xc0 [snd_opl3_synth]
[<c022ac46>] run_timer_softirq+0x166/0x1e0
[<c022abd0>] ? run_timer_softirq+0xf0/0x1e0
[<cca75150>] ? snd_opl3_timer_func+0x0/0xc0 [snd_opl3_synth]
[<c02269e8>] __do_softirq+0x78/0x110
[<c044c0fd>] ? _spin_unlock+0x1d/0x20
[<c025915f>] ? handle_level_irq+0xaf/0xe0
[<c0226ac6>] do_softirq+0x46/0x50
[<c0226e26>] irq_exit+0x36/0x40
[<c0204bd2>] do_IRQ+0x42/0xb0
[<c024463c>] ? trace_hardirqs_on_caller+0x12c/0x180
[<c020328e>] common_interrupt+0x2e/0x40
[<c0208d88>] ? default_idle+0x38/0x50
[<c021092f>] apm_cpu_idle+0x10f/0x290
[<c0201b11>] cpu_idle+0x21/0x40
[<c04443cd>] rest_init+0x4d/0x60
[<c055c835>] start_kernel+0x235/0x280
[<c055c210>] ? unknown_bootoption+0x0/0x210
[<c055c066>] i386_start_kernel+0x66/0x70
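The kind of inversion reported in this trace can be modelled with a tiny
lock-order graph. This is a loose user-space sketch in the spirit of lockdep,
not how lockdep is implemented; the class and method names are hypothetical,
and the order taken on the note_on path (voice_lock held, then sys_timer_lock)
is read off dependency chain #1 rather than from the driver source.

```python
from collections import defaultdict


class LockOrderGraph:
    """Edge A -> B means lock B was taken while lock A was held."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        """Depth-first search: can dst be reached from src along recorded edges?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while held'; return False if that closes a cycle."""
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


graph = LockOrderGraph()
# Syscall path (chain #1): sys_timer_lock taken with voice_lock held.
first = graph.acquire("voice_lock", "sys_timer_lock")
# Timer path (chain #0): voice_lock taken with sys_timer_lock held.
second = graph.acquire("sys_timer_lock", "voice_lock")
```

The second acquisition is refused because voice_lock already leads to
sys_timer_lock, which is the circular dependency the report describes.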
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pavel Hofman [Tue, 6 Oct 2009 14:04:11 +0000 (16:04 +0200)]
ALSA: ICE1712/24 - Change the Multi Track Peak control (level meters) from MIXER to PCM type
* PLEASE NOTE - this change requires the corresponding update of
envy24control for ice1712 - kind of an ABI change.
* The "Multi Track Peak" control is a read-only level meter indicator.
* The control is VERY confusing to most users since it is currently displayed
in regular mixers. E.g. alsamixer ignores its read-only status
and allows changing the levels with keys, which makes no sense.
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Dave Airlie [Thu, 8 Oct 2009 04:03:05 +0000 (14:03 +1000)]
Merge branch 'drm-next' of ../drm-next into drm-linus
conflict in radeon since new init path merged with vga arb code.
Conflicts:
drivers/gpu/drm/radeon/radeon.h
drivers/gpu/drm/radeon/radeon_asic.h
drivers/gpu/drm/radeon/radeon_device.c
Steven Rostedt [Thu, 8 Oct 2009 01:53:41 +0000 (21:53 -0400)]
tracing: user local buffer variable for trace branch tracer
Just using the tr->buffer for the API to trace_buffer_lock_reserve
is not good enough. This is because the tr->buffer may change, and we
do not want to commit with a different buffer than the one we reserved from.
This patch uses a local variable to hold the buffer that was used to
reserve and commit with.
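The pattern can be sketched in a few lines. This is a simplified user-space
model, not the tracer code; Buffer, Tracer and trace_event are hypothetical
names, and the swap_to parameter exists only to simulate tr->buffer changing
between the reserve and the commit.

```python
class Buffer:
    def __init__(self, name):
        self.name = name
        self.reserved = 0
        self.committed = 0

    def reserve(self):
        self.reserved += 1

    def commit(self):
        self.committed += 1


class Tracer:
    def __init__(self, buffer):
        self.buffer = buffer  # may be replaced by another context at any time


def trace_event(tr, swap_to=None):
    buf = tr.buffer          # read the pointer once into a local
    buf.reserve()
    if swap_to is not None:  # simulate tr.buffer changing mid-operation
        tr.buffer = swap_to
    buf.commit()             # commit to the buffer we reserved from
    return buf
```

Had the commit step re-read tr.buffer instead of using the local, the reserve
and the commit would have landed on two different buffers.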
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Zhenwen Xu [Thu, 8 Oct 2009 01:21:46 +0000 (09:21 +0800)]
tracing: fix warning on kernel/trace/trace_branch.c andtrace_hw_branches.c
Fix warnings caused by the API change of trace_buffer_lock_reserve().
Changed files: kernel/trace/trace_hw_branch.c
kernel/trace/trace_branch.c
Signed-off-by: Zhenwen Xu <helight.xu@gmail.com>
LKML-Reference: <20091008012146.GA4170@helight>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Dave Airlie [Thu, 8 Oct 2009 01:32:49 +0000 (11:32 +1000)]
drm/radeon/kms: fix vline register for second head.
Both r100/r600 had this wrong, use the macro to extract the register
to relocate.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Robert Noland [Mon, 5 Oct 2009 15:56:44 +0000 (11:56 -0400)]
drm/r600: avoid assigning vb twice in blit code
There is no need to assign vb before you know that space is available.
[agd5f: adapted for kernel tree.]
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 7 Oct 2009 23:28:19 +0000 (09:28 +1000)]
drm/radeon: use list_for_each_entry instead of list_for_each
This is just a cleanup of the list macro usage.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Tue, 6 Oct 2009 17:04:30 +0000 (19:04 +0200)]
drm/radeon/kms: Fix AGP support for R600/RV770 family (v2)
For AGP to work, unmapped access must cover VRAM & AGP as
AGP is treated like VRAM by the GPU (i.e. physical address).
This patch properly sets up the virtual memory system aperture
to cover AGP if AGP is enabled. It seems that there is memory
corruption after resume when using AGP (RV770 seems unaffected
though). Version 2 just fixes a merge issue with the updated AGP
fallback patch.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Tue, 6 Oct 2009 17:04:29 +0000 (19:04 +0200)]
drm/radeon/kms: Fallback to non AGP when acceleration fails to initialize (v2)
When GPU acceleration is not working with AGP, try to fall back to non
AGP GART (either PCI or PCIE GART). This should make KMS failure on
AGP less painful. We still need to find out what is wrong when AGP
fails, but at least users have a much better chance of getting a working
configuration with acceleration. This patch also cleans up the R600/RV770
fallback path so that it uses the same code as the other ASICs. Version 2
factors out the AGP disabling logic to avoid code duplication and bugs.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Wed, 7 Oct 2009 09:08:22 +0000 (11:08 +0200)]
drm/radeon/kms: Fix RS600/RV515/R520/RS690 IRQ
A badly generated header file led to the wrong registers being used
to check IRQ status and acknowledge IRQs. Fix the header
and use the proper registers.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Steven Rostedt [Wed, 7 Oct 2009 20:57:56 +0000 (16:57 -0400)]
ftrace: check for failure for all conversions
Due to legacy code from back when the dynamic tracer used a daemon,
only core kernel code was checking for failures. This is no longer
the case. We must check for failures any time we perform text modifications.
Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
jolsa@redhat.com [Wed, 7 Oct 2009 17:00:35 +0000 (19:00 +0200)]
tracing: correct module boundaries for ftrace_release
When the module is about to unload we release its call records.
The ftrace_release function was given wrong values representing
the module core boundaries, thus not releasing its call records.
Also make the ftrace_release function module specific.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1254934835-363-3-git-send-email-jolsa@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Darren Hart [Wed, 7 Oct 2009 18:46:54 +0000 (11:46 -0700)]
futex: fix requeue_pi key imbalance
If futex_wait_requeue_pi() wakes prior to requeue, we drop the
reference to the source futex_key twice, once in
handle_early_requeue_pi_wakeup() and once on our way out.
Remove the drop from the handle_early_requeue_pi_wakeup() and keep
the get/drops together in futex_wait_requeue_pi().
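The imbalance can be modelled with a simple counter. This is a toy sketch, not
the futex code; KeyRef and the two functions are hypothetical stand-ins for the
key reference and for handle_early_requeue_pi_wakeup() and
futex_wait_requeue_pi(), and the helper_drops_key flag switches between the
old and the new behaviour.

```python
class KeyRef:
    """Counts references to a key; a negative count is the imbalance."""

    def __init__(self):
        self.count = 0

    def get(self):
        self.count += 1

    def drop(self):
        self.count -= 1


def early_wakeup_handler(key, drops_key):
    if drops_key:  # old behaviour: the helper dropped the reference too
        key.drop()


def wait_requeue_pi(key, early_wakeup, helper_drops_key):
    key.get()
    if early_wakeup:
        early_wakeup_handler(key, helper_drops_key)
    key.drop()  # the single drop paired with the get above
    return key.count
```

Keeping the get and the drop in the same function makes the pairing visible at
a glance, which is the design choice the message argues for.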
Reported-by: Helge Bahmann <hcb@chaoticmind.net>
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Helge Bahmann <hcb@chaoticmind.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dinakar Guniguntala <dino@in.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: stable-2.6.31 <stable@kernel.org>
LKML-Reference: <4ACCE21E.5030805@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Steven Rostedt [Sun, 27 Sep 2009 11:02:07 +0000 (07:02 -0400)]
tracing: fix transposed numbers of lock_depth and preempt_count
The lock_depth and preempt_count numbers in the latency format are
transposed.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Borislav Petkov [Thu, 24 Sep 2009 09:05:30 +0000 (11:05 +0200)]
amd64_edac: beef up DRAM error injection
When injecting DRAM ECC errors (F3xBC_x8), EccVector[15:0] is a bitmask
of which bits should be error injected when written to and holds the
payload of 16-bit DRAM word when read, respectively.
Add /sysfs members to show the DRAM ECC section/word/vector.
Fail wrong injection values entered over /sysfs instead of truncating
them.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Tue, 22 Sep 2009 14:48:37 +0000 (16:48 +0200)]
amd64_edac: fix DRAM base and limit extraction
On Fam10h and above, F1x[1, 0][7C:40] are DRAM Base/Limit registers
which specify the destination node of a DRAM address. Those address
boundaries are being extracted into ->dram_base[] and ->dram_limit[].
Correct the extraction masks to match the respective address bits.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Mon, 21 Sep 2009 12:35:51 +0000 (14:35 +0200)]
amd64_edac: fix chip select handling
Different processor families support a different number of chip selects.
Handle this in a family-dependent way with the proper values assigned at
init time (see amd64_set_dct_base_and_mask).
Remove the _DCSM_COUNT defines since they're used in only one place and
originate from public documentation.
CC: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Keith Mannthey [Fri, 18 Sep 2009 12:35:23 +0000 (14:35 +0200)]
amd64_edac: simple fix to allow reporting of CECC errors
This allows the errors to be further decoded and mapped to csrows.
Tested with ECC debug DIMMs and a Rev F CPU-based system.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Fri, 18 Sep 2009 10:39:19 +0000 (12:39 +0200)]
amd64_edac: fix K8 intlv_sel check
The check when DRAM interleaving is enabled should be done against the
pvt->dram_IntlvSel field and not against the ->dram_limit.
Simplify the first loop and fix up the printk formatting while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Fri, 18 Sep 2009 10:27:27 +0000 (12:27 +0200)]
amd64_edac: fix interleave enable tests
pvt->dram_IntlvEn stores the 3 "Interleave Enable" bits already
right-shifted by 8, so the check in find_mc_by_sys_addr(), which shifts
the comparison values left by 8 bits, is wrong.
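
A minimal sketch of the pitfall, assuming the three bits sit at bits 10:8
of the raw register and that 0x1, 0x3 and 0x7 are the valid encodings.
All helper names are hypothetical and this is not the driver's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 3 "Interleave Enable" bits, stored right-shifted by 8. */
static uint32_t extract_intlv_en(uint32_t dram_base_reg)
{
	return (dram_base_reg >> 8) & 0x7;
}

/* Correct test: compare the stored value against unshifted constants. */
static bool intlv_en_is_valid(uint32_t intlv_en)
{
	return intlv_en == 0x1 || intlv_en == 0x3 || intlv_en == 0x7;
}

/*
 * Buggy test: the constants are shifted left by 8 again, so a value
 * that was already right-shifted can never match.
 */
static bool intlv_en_is_valid_buggy(uint32_t intlv_en)
{
	return intlv_en == (0x1 << 8) || intlv_en == (0x3 << 8) ||
	       intlv_en == (0x7 << 8);
}
```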
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Fri, 18 Sep 2009 10:12:46 +0000 (12:12 +0200)]
amd64_edac: fix DRAM base and limit address extraction
The K8 DRAM base and limit addresses from F1x40 + 8*i and F1x44 + 8*i,
where i is in (0..7), both hold address bits 39-24, and therefore the
shifting should be done by 24 and not by 8.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Borislav Petkov [Mon, 21 Sep 2009 11:23:34 +0000 (13:23 +0200)]
amd64_edac: fix driver instance lookup table allocation
Allocate memory statically for 8-node machines max for simplicity
instead of relying on MAX_NUMNODES which is 0 on !CONFIG_NUMA builds.
Spotted by Jan Beulich.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Takashi Iwai [Wed, 7 Oct 2009 13:12:27 +0000 (15:12 +0200)]
ALSA: hda - Fix yet another auto-mic bug in ALC268
Since patch_alc268() doesn't call set_capture_mixer() (due to its h/w
design different from other siblings), it needs to call fixup_automic_adc()
explicitly to set up the auto-mic routing. Otherwise the indices for
int/ext mics aren't set properly.
Reference: Novell bnc#544899
http://bugzilla.novell.com/show_bug.cgi?id=544899
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Bartlomiej Zolnierkiewicz [Tue, 6 Oct 2009 12:27:45 +0000 (12:27 +0000)]
Revert "Revert "ide: try to use PIO Mode 0 during probe if possible""
This reverts commit 24df31acaff8465d797f0006437b45ad0f2a5cb1.
The root cause of the reported system hangs was a (now fixed) sis5513
bug and not the "ide: try to use PIO Mode 0 during probe if possible"
change (commit 6029336426a2b43e4bc6f4a84be8789a047d139e), so the revert
was incorrect (it simply replaced one regression with another).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bartlomiej Zolnierkiewicz [Tue, 6 Oct 2009 14:46:05 +0000 (14:46 +0000)]
sis5513: fix PIO setup for ATAPI devices
Clear prefetch setting before potentially (re-)enabling it in
config_drive_art_rwp() so the transition of the device type on
the port from ATA to ATAPI (i.e. during warm-plug operation)
is handled correctly.
This is a really old bug (it probably goes back to very early
days of the driver) but it was only affecting warm-plug operation
until the recent "ide: try to use PIO Mode 0 during probe if
possible" change (commit 6029336426a2b43e4bc6f4a84be8789a047d139e).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Tested-by: David Fries <david@fries.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eero Nurkkala [Wed, 7 Oct 2009 08:54:26 +0000 (11:54 +0300)]
NOHZ: update idle state also when NOHZ is inactive
Commit f2e21c9610991e95621a81407cdbab881226419b had unfortunate side
effects with cpufreq governors on some systems.
If the system did not switch into NOHZ mode, ts->inidle is not set when
tick_nohz_stop_sched_tick() is called from the idle routine. Therefore
all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick()
fail to call tick_nohz_start_idle(). This results in bogus idle
accounting information which is passed to cpufreq governors.
Set the inidle flag regardless of the NOHZ active state to keep
the idle time accounting correct in any case.
[ tglx: Added comment and tweaked the changelog ]
Reported-by: Steven Noonan <steven@uplinklabs.net>
Signed-off-by: Eero Nurkkala <ext-eero.nurkkala@nokia.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: stable@kernel.org
LKML-Reference: <1254907901.30157.93.camel@eenurkka-desktop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Chou [Tue, 6 Oct 2009 03:25:25 +0000 (03:25 +0000)]
ethoc: limit the number of buffers to 128
Only 128 buffer descriptors are supported in the core. Limit the
number in case we have more memory.
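
A minimal sketch of the clamp, assuming the descriptor count is derived
from the size of the buffer memory. BUF_SIZE and num_buffers() are
hypothetical names chosen for illustration:

```c
#include <stddef.h>

#define MAX_BD   128u	/* descriptor limit stated above */
#define BUF_SIZE 1536u	/* hypothetical per-packet buffer size */

/* Number of buffers that fit in mem_size bytes, capped at MAX_BD. */
static unsigned int num_buffers(size_t mem_size)
{
	size_t n = mem_size / BUF_SIZE;

	return n > MAX_BD ? MAX_BD : (unsigned int)n;
}
```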
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chou [Sun, 4 Oct 2009 23:33:20 +0000 (23:33 +0000)]
ethoc: use system memory as buffer
This patch enables ethoc to allocate system memory as a buffer
when there is no dedicated buffer memory.
Some hardware designs may not have dedicated buffer memory such as
on-chip or off-chip SRAM. In this case, only one memory resource is
supplied in the platform data instead of two. A DMA buffer can then
be allocated from system memory and used for the transfer.
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chou [Sun, 4 Oct 2009 23:33:19 +0000 (23:33 +0000)]
ethoc: align received packet to make IP header at word boundary
The packet buffer is allocated at a 4-byte boundary, but the IP header
length and version bits are located at byte 14. These bit fields are
accessed as a 32-bit word, which causes an exception on processors that
do not support unaligned access.
The patch adds a 2-byte offset to make the bit fields word aligned.
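
The arithmetic can be checked with a short sketch. The names below are
hypothetical; the only assumptions are the 14-byte Ethernet header and a
buffer that starts on a 4-byte boundary:

```c
#include <stddef.h>

#define ETH_HEADER_LEN 14u	/* dst MAC (6) + src MAC (6) + EtherType (2) */
#define RX_PAD          2u	/* the 2-byte offset described above */

/* Offset of the IP header from the 4-byte-aligned buffer start. */
static size_t ip_header_offset(size_t pad)
{
	return pad + ETH_HEADER_LEN;
}

/* Nonzero when the offset falls on a 32-bit word boundary. */
static int is_word_aligned(size_t offset)
{
	return offset % 4 == 0;
}
```

Without padding the IP header starts at offset 14, which is not a
multiple of 4; with the 2-byte pad it starts at offset 16.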
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chou [Sun, 4 Oct 2009 23:33:18 +0000 (23:33 +0000)]
ethoc: fix buffer address mapping
The pointer address in the buffer descriptors is a physical address,
while the pointer that the processor uses to access a packet is a
virtual address. Though the higher bits of the pointer address used by
the MAC may be truncated to zero in special cases, this is not always
true in larger designs.
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chou [Sun, 4 Oct 2009 23:33:17 +0000 (23:33 +0000)]
ethoc: fix typo to compute number of tx descriptors
It should be max() instead of min(). Use 1/4 of available
descriptors for tx, and there should be at least 2 tx
descriptors.
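
A minimal sketch of why min() was wrong here, with hypothetical function
names: min() caps the result at 2, while max() makes 2 a lower bound.

```c
/* Intended rule: a quarter of the descriptors, but at least 2. */
static unsigned int num_tx_descriptors(unsigned int num_bd)
{
	unsigned int quarter = num_bd / 4;

	return quarter > 2 ? quarter : 2;	/* max(2, num_bd / 4) */
}

/* The typo: min() never yields more than 2 tx descriptors. */
static unsigned int num_tx_descriptors_typo(unsigned int num_bd)
{
	unsigned int quarter = num_bd / 4;

	return quarter < 2 ? quarter : 2;	/* min(2, num_bd / 4) */
}
```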
Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
roel kluin [Tue, 6 Oct 2009 09:54:18 +0000 (09:54 +0000)]
au1000_eth: Duplicate test of RX_OVERLEN bit in update_rx_stats()
In update_rx_stats() the RX_OVERLEN bit is tested twice; replace the
duplicate with RX_RUNT.
In au1000_rx() the RX_MISSED_FRAME bit was already tested a few lines
earlier.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Tue, 6 Oct 2009 19:34:39 +0000 (19:34 +0000)]
netxen: Fix Unlikely(x) > y
The closing parenthesis was not in the right location.
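
A small sketch of the effect, assuming a GCC or Clang compiler for
__builtin_expect. The macro mirrors the usual shape of the kernel's branch
hint, and the two functions are hypothetical:

```c
/* Branch hint that normalizes its argument to 0 or 1. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Misplaced parenthesis: compares the normalized 0/1 value with y. */
static int exceeds_misplaced(int x, int y)
{
	return unlikely(x) > y;
}

/* Intended form: the whole comparison is inside the hint. */
static int exceeds_fixed(int x, int y)
{
	return unlikely(x > y);
}
```

With x = 5 and y = 3 the misplaced form evaluates 1 > 3 and reports
false, while the fixed form evaluates 5 > 3 and reports true.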
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Valentine Barshak [Mon, 5 Oct 2009 03:27:56 +0000 (03:27 +0000)]
pasemi_mac: ethtool get settings fix
Not all pasemi mac interfaces can have a phy attached.
For example, XAUI has no phy and phydev is NULL for it.
In this case ethtool get settings causes a kernel crash.
Fix it by returning -EOPNOTSUPP if there's no PHY attached.
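
A minimal userspace sketch of the guard. The structures and get_settings()
are hypothetical stand-ins, not the pasemi_mac or ethtool definitions:

```c
#include <errno.h>
#include <stddef.h>

struct phy_dev {			/* hypothetical PHY state */
	int speed;
};

struct mac_priv {			/* hypothetical per-interface data */
	struct phy_dev *phydev;		/* NULL when no PHY is attached */
};

/* Report the link speed, or fail cleanly when there is no PHY. */
static int get_settings(const struct mac_priv *mac, int *speed)
{
	if (!mac->phydev)
		return -EOPNOTSUPP;	/* avoid the NULL dereference */

	*speed = mac->phydev->speed;
	return 0;
}
```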
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: David S. Miller <davem@davemloft.net>