Davidlohr Bueso [Fri, 2 May 2014 18:24:15 +0000 (11:24 -0700)]
locking/rwsem: Support optimistic spinning
We have reached the point where our mutexes are quite fine tuned
for a number of situations. This includes the use of heuristics
and optimistic spinning, based on MCS locking techniques.
Exclusive ownership of read-write semaphores are, conceptually,
just about the same as mutexes, making them close cousins. To
this end we need to make them both perform similarly, and
right now, rwsems are simply not up to it. This was discovered
by both reverting commit
4fc3f1d6 (mm/rmap, migration: Make
rmap_walk_anon() and try_to_unmap_anon() more scalable) and
similarly, converting some other mutexes (ie: i_mmap_mutex) to
rwsems. This creates a situation where users have to choose
between a rwsem and mutex taking into account this important
performance difference. Specifically, biggest difference between
both locks is when we fail to acquire a mutex in the fastpath,
optimistic spinning comes in to play and we can avoid a large
amount of unnecessary sleeping and overhead of moving tasks in
and out of wait queue. Rwsems do not have such logic.
This patch, based on the work from Tim Chen and I, adds support
for write-side optimistic spinning when the lock is contended.
It also includes support for the recently added cancelable MCS
locking for adaptive spinning. Note that is is only applicable
to the xadd method, and the spinlock rwsem variant remains intact.
Allowing optimistic spinning before putting the writer on the wait
queue reduces wait queue contention and provided greater chance
for the rwsem to get acquired. With these changes, rwsem is on par
with mutex. The performance benefits can be seen on a number of
workloads. For instance, on a 8 socket, 80 core 64bit Westmere box,
aim7 shows the following improvements in throughput:
+--------------+---------------------+-----------------+
| Workload | throughput-increase | number of users |
+--------------+---------------------+-----------------+
| alltests | 20% | >1000 |
| custom | 27%, 60% | 10-100, >1000 |
| high_systime | 36%, 30% | >100, >1000 |
| shared | 58%, 29% | 10-100, >1000 |
+--------------+---------------------+-----------------+
There was also improvement on smaller systems, such as a quad-core
x86-64 laptop running a 30Gb PostgreSQL (pgbench) workload for up
to +60% in throughput for over 50 clients. Additionally, benefits
were also noticed in exim (mail server) workloads. Furthermore, no
performance regression have been seen at all.
Based-on-work-from: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
[peterz: rej fixup due to comment patches, sched/rt.h header]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Scott J Norton" <scott.norton@hp.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Link: http://lkml.kernel.org/r/1399055055.6275.15.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tim Chen [Fri, 2 May 2014 19:53:57 +0000 (12:53 -0700)]
rwsem: Add comments to explain the meaning of the rwsem's count field
It took me quite a while to understand how rwsem's count field
mainifested itself in different scenarios.
Add comments to provide a quick reference to the the rwsem's count
field for each scenario where readers and writers are contending
for the lock.
Hopefully it will be useful for future maintenance of the code and
for people to get up to speed on how the logic in the code works.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Paul E.McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1399060437.2970.146.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sasha Levin [Wed, 8 Jan 2014 19:21:46 +0000 (14:21 -0500)]
lockdep: Increase static allocations
Fuzzing a recent kernel with a large configuration hits the static
allocation limits and disables lockdep.
This patch doubles the limits.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1389208906-24338-1-git-send-email-sasha.levin@oracle.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 17 Mar 2014 17:06:10 +0000 (18:06 +0100)]
arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,doc: Convert smp_mb__*()
Update the documentation to reflect the change of barrier primitives.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/n/tip-xslfehiga1twbk5uk94rij1e@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,xtensa: Convert smp_mb__*()
Xtensa SMP has compare-and-swap which is fully serializing, therefore
its exising smp_mb__{before,after}_clear_bit() appear unduly heavy.
Implement the new barriers as barrier().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-e9rqjxr1m1ejsob9p433kmji@git.kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,x86: Convert smp_mb__*()
x86 is strongly ordered and all its atomic ops imply a full barrier.
Implement the two new primitives as the old ones were.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-knswsr5mldkr0w1lrdxvc81w@git.kernel.org
Cc: Dave Jones <davej@redhat.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,tile: Convert smp_mb__*()
Implement the new smp_mb__* ops as per the old ones.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Link: http://lkml.kernel.org/n/tip-euuabnf5a3u23fy4fq8m3jcg@git.kernel.org
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,sparc: Convert smp_mb__*()
sparc32: fully relies on asm-generic/barrier.h and thus can use its
implementation.
sparc64: is strongly ordered and its atomic ops imply a full barrier,
implement the new primitives using barrier().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/n/tip-2cla9ubpd8chrntnm7e4zdt4@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,sh: Convert smp_mb__*()
SH can use the asm-generic/barrier.h implementation since that uses
smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-2z962by2ppzcd984ybw2mwdw@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,score: Convert smp_mb__*()
score fully relies on asm-generic/barrier.h, so it can use its default
implementation.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Lennox Wu <lennox.wu@gmail.com>
Link: http://lkml.kernel.org/n/tip-4mv9svf28lnotjpfuza8urh8@git.kernel.org
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,s390: Convert smp_mb__*()
As per the existing implementation; implement the new one using
smp_mb().
AFAICT the s390 compare-and-swap does imply a barrier, however there
are some immediate ops that seem to be singly-copy atomic and do not
imply a barrier. One such is the "ni" op (which would be
and-immediate) which is used for the constant clear_bit
implementation. Therefore s390 needs full barriers for the
{before,after} atomic ops.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-kme5dz5hcobpnufnnkh1ech2@git.kernel.org
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux390@de.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:35 +0000 (19:00 +0100)]
arch,powerpc: Convert smp_mb__*()
Powerpc allows reordering over its ll/sc implementation. Implement the
two new barriers as appropriate.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-gg2ffgq32sjgy9b8lj6m3hsc@git.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,parisc: Convert smp_mb__*()
parisc fully relies on asm-generic/barrier.h, therefore its smp_mb()
is barrier and the default implementation suffices.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-mxs4aubiyesi79v8xx53093q@git.kernel.org
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,openrisc: Convert smp_mb__*()
Openrisc fully relies on asm-generic/barrier.h and therefore its
smp_mb() is barrier().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-sxgxgqag9tond4kji07d22oh@git.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux@lists.openrisc.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,mn10300: Convert smp_mb__*()
mn10300 fully relies on asm-generic/barrier.h and therefore its
smp_mb() is barrier(). We can use the default implementation.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-wotyeoj99h1dpojjeest2jbk@git.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-am33-list@redhat.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,mips: Convert smp_mb__*()
MIPS is interesting and has hardware variants that reorder over ll/sc
as well as those that do not.
Implement the 2 new barrier functions as per the old barriers.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-9ph49jbae3hol9v721sbc2g6@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maciej W. Rozycki" <macro@codesourcery.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,metag: Convert smp_mb__*()
Implement the new barriers; as per the old versions the metag atomic
imply a full barrier.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-dqnyo215kq38wi4xcxnbpjw3@git.kernel.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,m68k: Convert smp_mb__*()
m68k uses asm-generic/barrier.h and its smp_mb() is barrier(),
therefore we can use the generic versions that use smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-s5dvosrb7qhvpmtaffwfn0zg@git.kernel.org
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,m32r: Convert smp_mb__*()
M32r uses asm-generic/barrier.h and its smp_mb() is barrier();
therefore we can use the generic versions which default to smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-wh6xljltyvmpy9t0bc80k1fy@git.kernel.org
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-m32r-ja@ml.linux-m32r.org
Cc: linux-m32r@ml.linux-m32r.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,ia64: Convert smp_mb__*()
ia64 atomic ops are full barriers; implement the new
smp_mb__{before,after}_atomic().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-hyp7yj68cmqz1nqbfpr541ca@git.kernel.org
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:36 +0000 (19:00 +0100)]
arch,hexagon: Convert smp_mb__*()
Hexagon uses asm-gemeric/barrier.h and its smp_mb() is barrier().
Therefore we can use the default implementation that uses smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-87irqrrbgizeojjfdqhypud3@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-hexagon@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,frv: Convert smp_mb__*()
Because:
arch/frv/include/asm/smp.h:#error SMP not supported
smp_mb() is barrier() and we can use the default implementation that
uses smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-n296g51yzdu5ru1vp7mccxmf@git.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,cris: Convert smp_mb__*()
Cris fully relies on asm-generic/barrier.h, therefore its smp_mb() is
barrier(), thus we can use the default implementation that uses
smp_mb().
(Include asm/system.h and asm/barrier.h to avoid header dependency hell.)
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-wvewbe8os3s1e4pt1cdotuee@git.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,c6x: Convert smp_mb__*()
c6x doesn't have a barrier.h and completely relies on
asm-generic/barrier.h. Therefore its smp_mb() is barrier() and we can
use the default versions that are smp_mb().
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-kl53k3pyj0rbd80jq8ralpf3@git.kernel.org
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,blackfin: Convert smp_mb__*()
Blackfin's atomic primitives do not imply a full barrier as whitnessed
from its SMP smp_mb__{before,after}_clear_bit() implementations.
However since !SMP smp_mb() reduces to barrier() remove everything and
rely on the asm-generic/barrier.h implentation.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-1widdkdsb3c1titq8jez6g3g@git.kernel.org
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Steven Miao <realmz6@gmail.com>
Cc: adi-buildroot-devel@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,avr32: Convert smp_mb__*()
AVR32's mb() implementation is a compiler barrier(), therefore it all
doesn't matter, fully rely on whatever asm-generic/barrier.h
generates.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8gow97a7mapmnec0pvf729pj@git.kernel.org
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 13 Mar 2014 18:00:37 +0000 (19:00 +0100)]
arch,arm64: Convert smp_mb__*()
AARGH64 uses ll/sc primitives that do not imply any barriers for the
normal atomics, therefore smp_mb__{before,after} should be a full
barrier.
Since AARGH64 doesn't use asm-generic/barrier.h, add the required
definitions to its asm/barrier.h.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8p5iclqgy78al33kck3ht7nr@git.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 12 Mar 2014 16:11:00 +0000 (17:11 +0100)]
arch,arm: Convert smp_mb__*()
ARM uses ll/sc primitives that do not imply barriers for all regular
atomic ops, therefore smp_mb__{before,after} need be a full barrier.
Since ARM doesn't use asm-generic/barrier.h include the required
definitions in its asm/barrier.h
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-yijo7sglsl7uusbp13upcuvo@git.kernel.org
Cc: Albin Tonnerre <albin.tonnerre@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 12 Mar 2014 16:11:00 +0000 (17:11 +0100)]
arch,arc: Convert smp_mb__*()
The arc mb() implementation is a compiler barrier(), therefore it all
doesn't matter one way or the other. Simply remove the existing
definitions and use whatever is generated by the defaults.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-ua48a59wri3ybz1rz8i7uvbr@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 12 Mar 2014 16:11:00 +0000 (17:11 +0100)]
arch,alpha: Convert smp_mb__*() to the asm-generic primitives
The Alpha ll/sc primitives do not imply any sort of barrier; therefore
the smp_mb__{before,after} should be a full barrier. This is the
default from asm-generic/barrier.h and therefore just remove the
current definitions.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-iacwfd15lq3ta2v7jut747r7@git.kernel.org
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 6 Feb 2014 17:16:07 +0000 (18:16 +0100)]
arch: Prepare for smp_mb__{before,after}_atomic()
Since the smp_mb__{before,after}*() ops are fundamentally dependent on
how an arch can implement atomics it doesn't make sense to have 3
variants of them. They must all be the same.
Furthermore, the 3 variants suggest they're only valid for those 3
atomic ops, while we have many more where they could be applied.
So move away from
smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}() and reduce the
interface to just the two: smp_mb__{before,after}_atomic().
This patch prepares the way by introducing default implementations in
asm-generic/barrier.h that default to a full barrier and providing
__deprecated inlines for the previous 6 barriers if they're not
provided by the arch.
This should allow for a mostly painless transition (lots of deprecated
warns in the interim).
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-wr59327qdyi9mbzn6x937s4e@git.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Chen, Gong" <gong.chen@linux.intel.com>
Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 6 Feb 2014 13:26:10 +0000 (14:26 +0100)]
arc,hexagon: Delete asm/barrier.h
Both already use asm-generic/barrier.h as per their
include/asm/Kbuild. Remove the stale files.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-c7vlkshl3tblim0o8z2p70kt@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-hexagon@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 4 Feb 2014 19:36:01 +0000 (20:36 +0100)]
ia64: Fix up smp_mb__{before,after}_clear_bit()
IA64 doesn't actually have acquire/release barriers, its a lie!
Add a comment explaining this and fix up the bitop barriers.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-akevfh136um9dqvb1ohm55ca@git.kernel.org
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Thu, 17 Apr 2014 20:21:35 +0000 (13:21 -0700)]
Merge branch 'parisc-3.15' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"There are two major changes in this patchset:
The major fix is that the epoll_pwait() syscall for 32bit userspace
was not using the compat wrapper on a 64bit kernel.
Secondly we changed the value of SHMLBA from 4MB to PAGE_SIZE to
reflect that we can actually mmap to any multiple of PAGE_SIZE. The
only thing which needs care is that shared mmaps need to be mapped at
the same offset inside the 4MB cache window"
* 'parisc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix epoll_pwait syscall on compat kernel
parisc: change value of SHMLBA from 0x00400000 to PAGE_SIZE
parisc: Replace __get_cpu_var uses for address calculation
Linus Torvalds [Thu, 17 Apr 2014 19:31:07 +0000 (12:31 -0700)]
Merge branch 'ipmi' (emailed ipmi fixes)
Merge ipmi fixes from Corey Minyard:
"Things collected since last kernel release.
Some of these are pretty important. The first three are bug fixes.
The next two are to hopefully make everyone happy about allowing
ACPI to be on all the time and not have IPMI have an effect on the
system when not in use. The last is a little cleanup"
* emailed patches from Corey Minyard <cminyard@mvista.com>:
ipmi: boolify some things
ipmi: Turn off all activity on an idle ipmi interface
ipmi: Turn off default probing of interfaces
ipmi: Reset the KCS timeout when starting error recovery
ipmi: Fix a race restarting the timer
Char: ipmi_bt_sm, fix infinite loop
Corey Minyard [Mon, 14 Apr 2014 14:46:56 +0000 (09:46 -0500)]
ipmi: boolify some things
Convert some ints to bools.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Mon, 14 Apr 2014 14:46:54 +0000 (09:46 -0500)]
ipmi: Turn off all activity on an idle ipmi interface
The IPMI driver would wake up periodically looking for events and
watchdog pretimeouts. If there is nothing waiting for these events,
it's really kind of pointless to be checking for them. So modify the
driver so the message handler can pass down if it needs the lower layer
to be waiting for these. Modify the system interface lower layer to
turn off all timer and thread activity if the upper layer doesn't need
anything and it is not currently handling messages. And modify the
message handler to not restart the timer if its timer is not needed.
The timers and kthread will still be enabled if:
- the SI interface is handling a message.
- a user has enabled watching for events.
- the IPMI watchdog timer is in use (since it uses pretimeouts).
- the message handler is waiting on a remote response.
- a user has registered to receive commands.
This mostly affects interfaces without interrupts. Interfaces with
interrupts already don't use CPU in the system interface when the
interface is idle.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Mon, 14 Apr 2014 14:46:53 +0000 (09:46 -0500)]
ipmi: Turn off default probing of interfaces
The default probing can cause problems with some system, slow booting,
extra CPU usages, etc. Turn it off by default and give a config option
to enable it.
From: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Mon, 14 Apr 2014 14:46:52 +0000 (09:46 -0500)]
ipmi: Reset the KCS timeout when starting error recovery
The OBF timer in KCS was not reset in one situation when error recovery
was started, resulting in an immediate timeout.
Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bodo Stroesser [Mon, 14 Apr 2014 14:46:51 +0000 (09:46 -0500)]
ipmi: Fix a race restarting the timer
With recent changes it is possible for the timer handler to detect an
idle interface and not start the timer, but the thread to start an
operation at the same time. The thread will not start the timer in that
instance, resulting in the timer not running.
Instead, move all timer operations under the lock and start the timer in
the thread if it detect non-idle and the timer is not already running.
Moving under locks allows the last timeout to be set in both the thread
and the timer. 'Timer is not running' means that the timer is not
pending and smi_timeout() is not running. So we need a flag to detect
this correctly.
Also fix a few other timeout bugs: setting the last timeout when the
interrupt has to be disabled and the timer started, and setting the last
timeout in check_start_timer_thread possibly racing with the timer
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Mon, 14 Apr 2014 14:46:50 +0000 (09:46 -0500)]
Char: ipmi_bt_sm, fix infinite loop
In read_all_bytes, we do
unsigned char i;
...
bt->read_data[0] = BMC2HOST;
bt->read_count = bt->read_data[0];
...
for (i = 1; i <= bt->read_count; i++)
bt->read_data[i] = BMC2HOST;
If bt->read_data[0] == bt->read_count == 255, we loop infinitely in the
'for' loop. Make 'i' an 'int' instead of 'char' to get rid of the
overflow and finish the loop after 255 iterations every time.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-and-debugged-by: Rui Hui Dian <rhdian@novell.com>
Cc: Tomas Cech <tcech@suse.cz>
Cc: Corey Minyard <minyard@acm.org>
Cc: <openipmi-developer@lists.sourceforge.net>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 17 Apr 2014 17:54:07 +0000 (10:54 -0700)]
Merge tag 'stable/for-linus-3.15-rc1-tag' of git://git./linux/kernel/git/xen/tip
Pull Xen fixes from David Vrabel:
"Xen regression and bug fixes for 3.15-rc1:
- fix completely broken 32-bit PV guests caused by x86 refactoring
32-bit thread_info.
- only enable ticketlock slow path on Xen (not bare metal)
- fix two bugs with PV guests not shutting down when requested
- fix a minor memory leak in xen-pciback error path"
* tag 'stable/for-linus-3.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/manage: Poweroff forcefully if user-space is not yet up.
xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.
xen/spinlock: Don't enable them unconditionally.
xen-pciback: silence an unwanted debug printk
xen: fix memory leak in __xen_pcibk_add_pci_dev()
x86/xen: Fix 32-bit PV guests's usage of kernel_stack
Linus Torvalds [Thu, 17 Apr 2014 17:51:01 +0000 (10:51 -0700)]
Merge tag '3.15-fixes' of git://neil.brown.name/md
Pull md bugfix from Neil Brown:
"One BUG fix for md for recent commit"
* tag '3.15-fixes' of git://neil.brown.name/md:
raid5: fix a race of stripe count check
Linus Torvalds [Thu, 17 Apr 2014 17:48:08 +0000 (10:48 -0700)]
Merge tag 'fbdev-reorder-3.15' of git://git./linux/kernel/git/tomba/linux
Pull fbdev renaming patches from Tomi Valkeinen:
"Reorder drivers/video/ directory so that all fbdev drivers are now
located in drivers/video/fbdev/ and the fbdev framework core files are
located in drivers/video/fbdev/core/
The drivers/video/Kconfig is modified so that the DRM and the fbdev
menu options are in separate submenus, instead of both being mixed in
the same 'Graphics support' menu level"
* tag 'fbdev-reorder-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
video: Kconfig: move drm and fb into separate menus
fbdev: move fbdev core files to separate directory
video: move fbdev to drivers/video/fbdev
Shaohua Li [Tue, 15 Apr 2014 01:12:54 +0000 (09:12 +0800)]
raid5: fix a race of stripe count check
I hit another BUG_ON with
e240c1839d11152b0355442. In __get_priority_stripe(),
stripe count equals to 0 initially. Between atomic_inc and BUG_ON,
get_active_stripe() finds the stripe. So the stripe count isn't 1 any more.
V2: keeps the BUG_ON suggested by Neil.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Tomi Valkeinen [Thu, 13 Feb 2014 14:32:13 +0000 (16:32 +0200)]
video: Kconfig: move drm and fb into separate menus
At the moment the "Device Drivers / Graphics support" kernel config page
looks rather messy, with DRM and fbdev driver selections on the same
page, some on the top level Graphics support page, some under their
respective subsystems.
If I'm not mistaken, this is caused by the drivers depending on other
things than DRM or FB, which causes Kconfig to arrange the options in
not-so-neat manner.
Both DRM and FB have a main menuconfig option for the whole DRM or FB
subsystem. Optimally, this would be enough to arrange all DRM and FB
options under the respective subsystem, but for whatever reason this
doesn't work reliably.
This patch adds an explicit submenu for DRM and FB, making it much
clearer which options are related to FB, and which to DRM.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tomi Valkeinen [Thu, 13 Feb 2014 14:24:55 +0000 (16:24 +0200)]
fbdev: move fbdev core files to separate directory
Instead of having fbdev framework core files at the root fbdev
directory, mixed with random fbdev device drivers, move the fbdev core
files to a separate core directory. This makes it much clearer which of
the files are actually part of the fbdev framework, and which are part
of device drivers.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tomi Valkeinen [Thu, 13 Feb 2014 13:31:38 +0000 (15:31 +0200)]
video: move fbdev to drivers/video/fbdev
The drivers/video directory is a mess. It contains generic video related
files, directories for backlight, console, linux logo, lots of fbdev
device drivers, fbdev framework files.
Make some order into the chaos by creating drivers/video/fbdev
directory, and move all fbdev related files there.
No functionality is changed, although I guess it is possible that some
subtle Makefile build order related issue could be created by this
patch.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linus Torvalds [Wed, 16 Apr 2014 23:40:18 +0000 (16:40 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Various fixes:
- reboot regression fix
- build message spam fix
- GPU quirk fix
- 'make kvmconfig' fix
plus the wire-up of the renameat2() system call on i386"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Remove the PCI reboot method from the default chain
x86/build: Supress "Nothing to be done for ..." messages
x86/gpu: Fix sign extension issue in Intel graphics stolen memory quirks
x86/platform: Fix "make O=dir kvmconfig"
i386: Wire up the renameat2() syscall
Linus Torvalds [Wed, 16 Apr 2014 23:38:57 +0000 (16:38 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Tooling fixes, plus a simple hardware-enablement patch for the Intel
RAPL PMU (energy use measurement) on Haswell CPUs, which I hope is
still fine at this stage"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Instead of redirecting flex output, use -o
perf tools: Fix double free in perf test 21 (code-reading.c)
perf stat: Initialize statistics correctly
perf bench: Set more defaults in the 'numa' suite
perf bench: Fix segfault at the end of an 'all' execution
perf bench: Update manpage to mention numa and futex
perf probe: Use dwarf_getcfi_elf() instead of dwarf_getcfi()
perf probe: Fix to handle errors in line_range searching
perf probe: Fix --line option behavior
perf tools: Pick up libdw without explicit LIBDW_DIR
MAINTAINERS: Change e-mail to kernel.org one
perf callchains: Disable unwind libraries when libelf isn't found
tools lib traceevent: Do not call warning() directly
tools lib traceevent: Print event name when show warning if possible
perf top: Fix documentation of invalid -s option
perf/x86: Enable DRAM RAPL support on Intel Haswell
Linus Torvalds [Wed, 16 Apr 2014 23:36:00 +0000 (16:36 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"ARM VIC (Vectored Irq Controller) irqchip driver fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: vic: Properly chain the cascaded IRQs
Linus Torvalds [Wed, 16 Apr 2014 23:35:18 +0000 (16:35 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"liblockdep fixes and mutex debugging fixes"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/mutex: Fix debug_mutexes
tools/liblockdep: Add proper versioning to the shared obj
tools/liblockdep: Ignore asmlinkage and visible
Linus Torvalds [Wed, 16 Apr 2014 23:03:24 +0000 (16:03 -0700)]
Merge tag 'fbdev-fixes-3.15' of git://git./linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- fix build errors for bf54x-lq043fb and imxfb
- fbcon fix for da8xx-fb
- omapdss fixes for hdmi audio, irq handling and fclk calculation
* tag 'fbdev-fixes-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
video: bf54x-lq043fb: fix build error
OMAPDSS: Change struct reg_field to dispc_reg_field
OMAPDSS: Take pixelclock unit change into account in hdmi_compute_acr()
OMAPDSS: fix shared irq handlers
video: imxfb: Select LCD_CLASS_DEVICE unconditionally
OMAPDSS: fix rounding when calculating fclk rate
video: da8xx-fb: Fix casting of info->pseudo_palette
Linus Torvalds [Wed, 16 Apr 2014 22:59:16 +0000 (15:59 -0700)]
Merge tag 'pinctrl-v3.15-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pincontrol fixes from Linus Walleij:
"A first set of pin control fixes for the v3.15 series:
- Fix a couple of barnsjukdomar on the Rockchip driver.
- Remove an idiotic debug print I happened to leave behind in the
Nomadik driver.
- Fixup the Qualcomm MSM interrupt handling code for the TLMM v2.
- Three patches renaming the Broadcom Capri driver to BCM28155. This
has been falling between the chairs for some time due to some
cross-tree synchronization misunderstandings, now I'm fed up with
this and just rename it in this -rc1 phase"
* tag 'pinctrl-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: fix typo in bindings documentation
Update bcm_defconfig with new pinctrl CONFIG
pinctrl: Rename Broadcom Capri pinctrl driver
pinctrl: msm: Correct interrupt code for TLMM v2
pinctrl: nomadik: delete stray debug print
pinctrl: rockchip: handle first half of rk3188-bank0 correctly
pinctrl: rockchip: add return value to rockchip_set_mux
pinctrl: rockchip: fix offset of mux registers for rk3188
Linus Torvalds [Wed, 16 Apr 2014 18:28:25 +0000 (11:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
"An update to the oops output with additional information about the
crash. The renameat2 system call is enabled. Two patches in regard
to the PTR_ERR_OR_ZERO cleanup. And a bunch of bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/sclp_cmd: replace PTR_RET with PTR_ERR_OR_ZERO
s390/sclp: replace PTR_RET with PTR_ERR_OR_ZERO
s390/sclp_vt220: Fix kernel panic due to early terminal input
s390/compat: fix typo
s390/uaccess: fix possible register corruption in strnlen_user_srst()
s390: add 31 bit warning message
s390: wire up sys_renameat2
s390: show_registers() should not map user space addresses to kernel symbols
s390/mm: print control registers and page table walk on crash
s390/smp: fix smp_stop_cpu() for !CONFIG_SMP
s390: fix control register update
Linus Torvalds [Wed, 16 Apr 2014 18:22:45 +0000 (11:22 -0700)]
Merge tag 'please-pull-ia64-erratum' of git://git./linux/kernel/git/aegl/linux
Pull itanium erratum fix from Tony Luck:
"Small workaround for a rare, but annoying, erratum #237"
* tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)
Tony Luck [Fri, 28 Mar 2014 21:42:07 +0000 (14:42 -0700)]
[IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)
April 2014 Itanium processor specification update:
http://www.intel.com/content/www/us/en/processors/itanium/itanium-specification-update.html
describes this erratum:
=========================================================================
237. Under a complex set of conditions, store to load forwarding for a
sub 8-byte load may complete incorrectly
Problem: A load instruction may complete incorrectly when a code sequence
using 4-byte or smaller load and store operations to the same address
is executed in combination with specific timing of all the following
concurrent conditions: store to load forwarding, alignment checking
enabled, a mis-predicted branch, and complex cache utilization activity.
Implication: The affected sub 8-byte instruction may complete
incorrectly resulting in unpredictable system behavior. There is an
extremely low probability of exposure due to the significant number of
complex microarchitectural concurrent conditions required to encounter
the erratum.
Workaround: Set PSR.ac = 0 to completely avoid the erratum. Disabling
Hyper-Threading will significantly reduce exposure to the conditions
that contribute to encountering the erratum.
Status: See the Summary Table of Changes for the affected steppings.
=========================================================================
[Table of changes essentially lists all models from McKinley to Tukwila]
The PSR.ac bit controls whether the processor will always generate
an unaligned reference trap (0x5a00) for a misaligned data access
(when PSR.ac=1) or if it will let the access succeed when running
on a cpu that implements logic to handle some unaligned accesses.
Way back in 2008 in commit
b704882e70d87d7f56db5ff17e2253f3fa90e4f3
[IA64] Rationalize kernel mode alignment checking
we made the decision to always enable strict checking. We were
already doing so in trap/interrupt context because the common
preamble code set this bit - but the rest of supervisor code
(and by inheritance user code) ran with PSR.ac=0.
We now reverse that decision and set PSR.ac=0 everywhere in the
kernel (also inherited by user processes). This will avoid the
erratum using the method described in the Itanium specification
update. Net effect for users is that the processor will handle
unaligned access when it can (typically with a tiny performance
bubble in the pipeline ... but much less invasive than taking a
trap and having the OS perform the access).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Ingo Molnar [Fri, 4 Apr 2014 06:41:26 +0000 (08:41 +0200)]
x86: Remove the PCI reboot method from the default chain
Steve reported a reboot hang and bisected it back to this commit:
a4f1987e4c54 x86, reboot: Add EFI and CF9 reboot methods into the default list
He heroically tested all reboot methods and found the following:
reboot=t # triple fault ok
reboot=k # keyboard ctrl FAIL
reboot=b # BIOS ok
reboot=a # ACPI FAIL
reboot=e # EFI FAIL [system has no EFI]
reboot=p # PCI 0xcf9 FAIL
And I think it's pretty obvious that we should only try PCI 0xcf9 as a
last resort - if at all.
The other observation is that (on this box) we should never try
the PCI reboot method, but close with either the 'triple fault'
or the 'BIOS' (terminal!) reboot methods.
Thirdly, CF9_COND is a total misnomer - it should be something like
CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...
So this patch fixes the worst problems:
- it orders the actual reboot logic to follow the reboot ordering
pattern - it was in a pretty random order before for no good
reason.
- it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and
BOOT_CF9_SAFE flags to make the code more obvious.
- it tries the BIOS reboot method before the PCI reboot method.
(Since 'BIOS' is a terminal reboot method resulting in a hang
if it does not work, this is essentially equivalent to removing
the PCI reboot method from the default reboot chain.)
- just for the miraculous possibility of terminal (resulting
in hang) reboot methods of triple fault or BIOS returning
without having done their job, there's an ordering between
them as well.
Reported-and-bisected-and-tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Aubrey <aubrey.li@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 16 Apr 2014 03:30:30 +0000 (20:30 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix BPF filter validation of netlink attribute accesses, from
Mathias Kruase.
2) Netfilter conntrack generation seqcount not initialized properly,
from Andrey Vagin.
3) Fix comparison mask computation on big-endian in nft_cmp_fast(),
from Patrick McHardy.
4) Properly limit MTU over ipv6, from Eric Dumazet.
5) Fix seccomp system call argument population on 32-bit, from Daniel
Borkmann.
6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead
skb->mac_len needs to be used. From Vlad Yasevich.
7) We have several cases of using socket based communications to
implement a tunnel. For example, some tunnels are encapsulations
over UDP so we use an internal kernel UDP socket to do the
transmits.
These tunnels should behave just like other software devices and
pass the packets on down to the next layer.
Most importantly we want the top-level socket (eg TCP) that created
the traffic to be charged for the SKB memory.
However, once you get into the IP output path, we have code that
assumed that whatever was attached to skb->sk is an IP socket.
To keep the top-level socket being charged for the SKB memory,
whilst satisfying the needs of the IP output path, we now pass in an
explicit 'sk' argument.
From Eric Dumazet.
8) ping_init_sock() leaks group info, from Xiaoming Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
cxgb4: use the correct max size for firmware flash
qlcnic: Fix MSI-X initialization code
ip6_gre: don't allow to remove the fb_tunnel_dev
ipv4: add a sock pointer to dst->output() path.
ipv4: add a sock pointer to ip_queue_xmit()
driver/net: cosa driver uses udelay incorrectly
at86rf230: fix __at86rf230_read_subreg function
at86rf230: remove check if AVDD settled
net: cadence: Add architecture dependencies
net: Start with correct mac_len in skb_network_protocol
Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
qlcnic: Fix PVID configuration on eSwitch port.
qlcnic: Fix max ring count calculation
qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
qlcnic: Fix panic due to uninitialzed delayed_work struct in use.
...
Steve Wise [Tue, 15 Apr 2014 19:22:34 +0000 (14:22 -0500)]
cxgb4: use the correct max size for firmware flash
The wrong max fw size was being used and causing false
"too big" errors running ethtool -f.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Gordeev [Tue, 15 Apr 2014 09:37:14 +0000 (11:37 +0200)]
qlcnic: Fix MSI-X initialization code
Function qlcnic_setup_tss_rss_intr() might enter endless
loop in case pci_enable_msix() contiguously returns a
positive number of MSI-Xs that could have been allocated.
Besides, the function contains 'err = -EIO;' assignment
that never could be reached. This update fixes the
aforementioned issues.
Cc: Shahed Shaikh <shahed.shaikh@qlogic.com>
Cc: Dept-HSGLinuxNICDev@qlogic.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 14 Apr 2014 15:11:38 +0000 (17:11 +0200)]
ip6_gre: don't allow to remove the fb_tunnel_dev
It's possible to remove the FB tunnel with the command 'ip link del ip6gre0' but
this is unsafe, the module always supposes that this device exists. For example,
ip6gre_tunnel_lookup() may use it unconditionally.
Let's add a rtnl handler for dellink, which will never remove the FB tunnel (we
let ip6gre_destroy_tunnels() do the job).
Introduced by commit
c12b395a4664 ("gre: Support GRE over IPv6").
CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 15 Apr 2014 17:47:15 +0000 (13:47 -0400)]
ipv4: add a sock pointer to dst->output() path.
In the dst->output() path for ipv4, the code assumes the skb it has to
transmit is attached to an inet socket, specifically via
ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
provider of the packet is an AF_PACKET socket.
The dst->output() method gets an additional 'struct sock *sk'
parameter. This needs a cascade of changes so that this parameter can
be propagated from vxlan to final consumer.
Fixes:
8f646c922d55 ("vxlan: keep original skb ownership")
Reported-by: lucien xin <lucien.xin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Tue, 15 Apr 2014 17:25:07 +0000 (19:25 +0200)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/jolsa/perf into perf/urgent
Pull perf/urgent fixes from Jiri Olsa:
* Instead of redirecting flex output, use -o (Cody P Schafer)
* Fix double free in perf test 21 (Adrian Hunter)
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Eric Dumazet [Tue, 15 Apr 2014 16:58:34 +0000 (12:58 -0400)]
ipv4: add a sock pointer to ip_queue_xmit()
ip_queue_xmit() assumes the skb it has to transmit is attached to an
inet socket. Commit
31c70d5956fc ("l2tp: keep original skb ownership")
changed l2tp to not change skb ownership and thus broke this assumption.
One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
so that we do not assume skb->sk points to the socket used by l2tp
tunnel.
Fixes:
31c70d5956fc ("l2tp: keep original skb ownership")
Reported-by: Zhan Jianyu <nasa4836@gmail.com>
Tested-by: Zhan Jianyu <nasa4836@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konrad Rzeszutek Wilk [Fri, 4 Apr 2014 18:53:41 +0000 (14:53 -0400)]
xen/manage: Poweroff forcefully if user-space is not yet up.
The user can launch the guest in this sequence:
xl create -p /vm.cfg [launch, but pause it]
xl shutdown latest [sets control/shutdown=poweroff]
xl unpause latest
xl console latest [and see that the guest has completely
ignored the shutdown request]
In reality the guest hasn't ignored it. It registers a watch
and gets a notification that there is value. It then calls
the shutdown_handler which ends up calling orderly_shutdown.
Unfortunately that is so early in the bootup that there
are no user-space. Which means that the orderly_shutdown fails.
But since the force flag was set to false it continues on without
reporting.
What we really want to is to use the force when we are in the
SYSTEM_BOOTING state and not use the 'force' when SYSTEM_RUNNING.
However, if we are in the running state - and the shutdown command
has been given before the user-space has been setup, there is nothing
we can do. Worst yet, we stop ignoring the 'xl shutdown' requests!
As such, the other part of this patch is to only stop ignoring
the 'xl shutdown' when we are truly in the power off sequence.
That means the user can do multiple 'xl shutdown' and we will try
to act on them instead of ignoring them.
Fixes-Bug: http://bugs.xenproject.org/xen/bug/6
Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Konrad Rzeszutek Wilk [Fri, 4 Apr 2014 18:53:40 +0000 (14:53 -0400)]
xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.
The 'read_reply' works with 'process_msg' to read of a reply in XenBus.
'process_msg' is running from within the 'xenbus' thread. Whenever
a message shows up in XenBus it is put on a xs_state.reply_list list
and 'read_reply' picks it up.
The problem is if the backend domain or the xenstored process is killed.
In which case 'xenbus' is still awaiting - and 'read_reply' if called -
stuck forever waiting for the reply_list to have some contents.
This is normally not a problem - as the backend domain can come back
or the xenstored process can be restarted. However if the domain
is in process of being powered off/restarted/halted - there is no
point of waiting on it coming back - as we are effectively being
terminated and should not impede the progress.
This patch solves this problem by checking whether the guest is the
right domain. If it is an initial domain and hurtling towards death -
there is no point of continuing the wait. All other type of guests
continue with their behavior (as Xenstore is expected to still be
running in another domain).
Fixes-Bug: http://bugs.xenproject.org/xen/bug/8
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Konrad Rzeszutek Wilk [Fri, 4 Apr 2014 18:48:04 +0000 (14:48 -0400)]
xen/spinlock: Don't enable them unconditionally.
The git commit
a945928ea2709bc0e8e8165d33aed855a0110279
('xen: Do not enable spinlocks before jump_label_init() has executed')
was added to deal with the jump machinery. Earlier the code
that turned on the jump label was only called by Xen specific
functions. But now that it had been moved to the initcall machinery
it gets called on Xen, KVM, and baremetal - ouch!. And the detection
machinery to only call it on Xen wasn't remembered in the heat
of merge window excitement.
This means that the slowpath is enabled on baremetal while it should
not be.
Reported-by: Waiman Long <waiman.long@hp.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
CC: stable@vger.kernel.org
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Dan Carpenter [Fri, 28 Mar 2014 08:24:59 +0000 (11:24 +0300)]
xen-pciback: silence an unwanted debug printk
There is a missing curly brace here so we might print some extra debug
information.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Daeseok Youn [Tue, 1 Apr 2014 10:15:59 +0000 (19:15 +0900)]
xen: fix memory leak in __xen_pcibk_add_pci_dev()
It need to free dev_entry when it failed to assign to a new
slot on the virtual PCI bus.
smatch says:
drivers/xen/xen-pciback/vpci.c:142 __xen_pcibk_add_pci_dev() warn:
possible memory leak of 'dev_entry'
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Linus Walleij [Tue, 15 Apr 2014 08:28:04 +0000 (10:28 +0200)]
irqchip: vic: Properly chain the cascaded IRQs
We are flagging the parent IRQ as chained, then we must also
make sure to call the chained_irq_[enter|exit] functions for
things to work smoothly.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/1397550484-7119-1-git-send-email-linus.walleij@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Boris Ostrovsky [Thu, 10 Apr 2014 16:17:09 +0000 (12:17 -0400)]
x86/xen: Fix 32-bit PV guests's usage of kernel_stack
Commit
198d208df4371734ac4728f69cb585c284d20a15 ("x86: Keep
thread_info on thread stack in x86_32") made 32-bit kernels use
kernel_stack to point to thread_info. That change missed a couple of
updates needed by Xen's 32-bit PV guests:
1. kernel_stack needs to be initialized for secondary CPUs
2. GET_THREAD_INFO() now uses %fs register which may not be the
kernel's version when executing xen_iret().
With respect to the second issue, we don't need GET_THREAD_INFO()
anymore: we used it as an intermediate step to get to per_cpu xen_vcpu
and avoid referencing %fs. Now that we are going to use %fs anyway we
may as well go directly to xen_vcpu.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cody P Schafer [Mon, 14 Apr 2014 10:47:01 +0000 (12:47 +0200)]
perf tools: Instead of redirecting flex output, use -o
This gives us a real filename instead of having '<stdout>' show up all
over the place when debugging.
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1396652539-2416-1-git-send-email-cody@linux.vnet.ibm.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Adrian Hunter [Thu, 10 Apr 2014 09:02:54 +0000 (12:02 +0300)]
perf tools: Fix double free in perf test 21 (code-reading.c)
perf_evlist__delete() deletes attached cpu and thread maps
but the test is still using them, so remove them from the
evlist before deleting it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/53465E3E.8070201@intel.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Steven Miao [Tue, 15 Apr 2014 07:17:19 +0000 (15:17 +0800)]
video: bf54x-lq043fb: fix build error
Fix build error by including linux/gpio.h. Also drop asm/gpio.h which is
not needed.
Signed-off-by: Steven Miao <realmz6@gmail.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Li, Zhen-Hua [Tue, 15 Apr 2014 01:53:11 +0000 (09:53 +0800)]
driver/net: cosa driver uses udelay incorrectly
In cosa driver, udelay with more than 20000 may cause __bad_udelay.
Use msleep for instead.
Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Mon, 14 Apr 2014 16:48:02 +0000 (18:48 +0200)]
at86rf230: fix __at86rf230_read_subreg function
The __at86rf230_read_subreg function don't mask and shift register
contents which it should do. This patch adds the necessary masks and
shift operations in this function.
Since we have csma support this can make some trouble on state changes.
Since CSMA support turned on some bits in the TRX_STATUS register that
used to be zero, not masking broke checking of the TRX_STATUS field
after commanding a state change.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Mon, 14 Apr 2014 16:48:01 +0000 (18:48 +0200)]
at86rf230: remove check if AVDD settled
The AVDD regulator is only enabled when the RF section is active TX_ON
(PLL_ON) state. Since commit
7dcbd22a97eb0689e6c583ad630ae0e7341e34c1
("ieee802154: ensure that first RF212 state comes from TRX_OFF").
We are in TRX_OFF state at the time at86rf230_hw_init is run.
Note that this test would only fail in case of a severe hardware
malfunction (faulty/shorted power supply, etc.) so it wasn't all that
useful in the first place.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Mon, 14 Apr 2014 13:38:49 +0000 (15:38 +0200)]
net: cadence: Add architecture dependencies
The Cadence ethernet chipsets are only used on specific ARM
architectures. Add Kconfig dependencies so that drivers for these
chipsets are only buildable on the relevant architectures.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 14 Apr 2014 23:21:28 +0000 (16:21 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Marcelo Tosatti:
- Fix for guest triggerable BUG_ON (CVE-2014-0155)
- CR4.SMAP support
- Spurious WARN_ON() fix
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: remove WARN_ON from get_kernel_ns()
KVM: Rename variable smep to cr4_smep
KVM: expose SMAP feature to guest
KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
KVM: Add SMAP support when setting CR4
KVM: Remove SMAP bit from CR4_RESERVED_BITS
KVM: ioapic: try to recover if pending_eoi goes out of range
KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)
Linus Torvalds [Mon, 14 Apr 2014 23:04:14 +0000 (16:04 -0700)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull bmc2835 crypto fix from Herbert Xu:
"This fixes a potential boot crash on bcm2835 due to the recent change
that now causes hardware RNGs to be accessed on registration"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm2835 - fix oops when rng h/w is accessed during registration
Mikulas Patocka [Mon, 14 Apr 2014 20:58:55 +0000 (16:58 -0400)]
user namespace: fix incorrect memory barriers
smp_read_barrier_depends() can be used if there is data dependency between
the readers - i.e. if the read operation after the barrier uses address
that was obtained from the read operation before the barrier.
In this file, there is only control dependency, no data dependecy, so the
use of smp_read_barrier_depends() is incorrect. The code could fail in the
following way:
* the cpu predicts that idx < entries is true and starts executing the
body of the for loop
* the cpu fetches map->extent[0].first and map->extent[0].count
* the cpu fetches map->nr_extents
* the cpu verifies that idx < extents is true, so it commits the
instructions in the body of the for loop
The problem is that in this scenario, the cpu read map->extent[0].first
and map->nr_extents in the wrong order. We need a full read memory barrier
to prevent it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Mon, 14 Apr 2014 23:00:10 +0000 (19:00 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains three Netfilter fixes for your net tree,
they are:
* Fix missing generation sequence initialization which results in a splat
if lockdep is enabled, it was introduced in the recent works to improve
nf_conntrack scalability, from Andrey Vagin.
* Don't flush the GRE keymap list in nf_conntrack when the pptp helper is
disabled otherwise this crashes due to a double release, from Andrey
Vagin.
* Fix nf_tables cmp fast in big endian, from Patrick McHardy.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 14 Apr 2014 21:37:26 +0000 (17:37 -0400)]
net: Start with correct mac_len in skb_network_protocol
Sometimes, when the packet arrives at skb_mac_gso_segment()
its skb->mac_len already accounts for some of the mac lenght
headers in the packet. This seems to happen when forwarding
through and OpenSSL tunnel.
When we start looking for any vlan headers in skb_network_protocol()
we seem to ignore any of the already known mac headers and start
with an ETH_HLEN. This results in an incorrect offset, dropped
TSO frames and general slowness of the connection.
We can start counting from the known skb->mac_len
and return at least that much if all mac level headers
are known and accounted for.
Fixes:
53d6471cef17262d3ad1c7ce8982a234244f68ec (net: Account for all vlan headers in skb_mac_gso_segment)
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Borkman <dborkman@redhat.com>
Tested-by: Martin Filip <nexus+kernel@smoula.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Tosatti [Thu, 10 Apr 2014 21:19:12 +0000 (18:19 -0300)]
KVM: x86: remove WARN_ON from get_kernel_ns()
Function and callers can be preempted.
https://bugzilla.kernel.org/show_bug.cgi?id=73721
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Feng Wu [Tue, 1 Apr 2014 09:56:48 +0000 (17:56 +0800)]
KVM: Rename variable smep to cr4_smep
Rename variable smep to cr4_smep, which can better reflect the
meaning of the variable.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Feng Wu [Tue, 1 Apr 2014 09:46:36 +0000 (17:46 +0800)]
KVM: expose SMAP feature to guest
This patch exposes SMAP feature to guest
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Feng Wu [Tue, 1 Apr 2014 09:46:35 +0000 (17:46 +0800)]
KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
SMAP is disabled if CPU is in non-paging mode in hardware.
However KVM always uses paging mode to emulate guest non-paging
mode with TDP. To emulate this behavior, SMAP needs to be
manually disabled when guest switches to non-paging mode.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Feng Wu [Tue, 1 Apr 2014 09:46:34 +0000 (17:46 +0800)]
KVM: Add SMAP support when setting CR4
This patch adds SMAP handling logic when setting CR4 for guests
Thanks a lot to Paolo Bonzini for his suggestion to use the branchless
way to detect SMAP violation.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Feng Wu [Tue, 1 Apr 2014 09:46:33 +0000 (17:46 +0800)]
KVM: Remove SMAP bit from CR4_RESERVED_BITS
This patch removes SMAP bit from CR4_RESERVED_BITS.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Daniel Borkmann [Mon, 14 Apr 2014 19:45:17 +0000 (21:45 +0200)]
Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
This reverts commit
ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management
to reflect real state of the receiver's buffer") as it introduced a
serious performance regression on SCTP over IPv4 and IPv6, though a not
as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs.
Current state:
[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 17:56:21 GMT
Connecting to host 192.168.241.3, port 5201
Cookie: Lab200slot2.
1397238981.812898.548918
[ 4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.09 sec 20.8 MBytes 161 Mbits/sec
[ 4] 1.09-2.13 sec 10.8 MBytes 86.8 Mbits/sec
[ 4] 2.13-3.15 sec 3.57 MBytes 29.5 Mbits/sec
[ 4] 3.15-4.16 sec 4.33 MBytes 35.7 Mbits/sec
[ 4] 4.16-6.21 sec 10.4 MBytes 42.7 Mbits/sec
[ 4] 6.21-6.21 sec 0.00 Bytes 0.00 bits/sec
[ 4] 6.21-7.35 sec 34.6 MBytes 253 Mbits/sec
[ 4] 7.35-11.45 sec 22.0 MBytes 45.0 Mbits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-12.51 sec 16.0 MBytes 126 Mbits/sec
[ 4] 12.51-13.59 sec 20.3 MBytes 158 Mbits/sec
[ 4] 13.59-14.65 sec 13.4 MBytes 107 Mbits/sec
[ 4] 14.65-16.79 sec 33.3 MBytes 130 Mbits/sec
[ 4] 16.79-16.79 sec 0.00 Bytes 0.00 bits/sec
[ 4] 16.79-17.82 sec 5.94 MBytes 48.7 Mbits/sec
(etc)
[root@Lab200slot2 ~]# iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 19:08:41 GMT
Connecting to host 2001:db8:0:f101::1, port 5201
Cookie: Lab200slot2.
1397243321.714295.2b3f7c
[ 4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201
Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec
[ 4] 1.00-2.00 sec 201 MBytes 1.69 Gbits/sec
[ 4] 2.00-3.00 sec 188 MBytes 1.58 Gbits/sec
[ 4] 3.00-4.00 sec 174 MBytes 1.46 Gbits/sec
[ 4] 4.00-5.00 sec 165 MBytes 1.39 Gbits/sec
[ 4] 5.00-6.00 sec 199 MBytes 1.67 Gbits/sec
[ 4] 6.00-7.00 sec 163 MBytes 1.36 Gbits/sec
[ 4] 7.00-8.00 sec 174 MBytes 1.46 Gbits/sec
[ 4] 8.00-9.00 sec 193 MBytes 1.62 Gbits/sec
[ 4] 9.00-10.00 sec 196 MBytes 1.65 Gbits/sec
[ 4] 10.00-11.00 sec 157 MBytes 1.31 Gbits/sec
[ 4] 11.00-12.00 sec 175 MBytes 1.47 Gbits/sec
[ 4] 12.00-13.00 sec 192 MBytes 1.61 Gbits/sec
[ 4] 13.00-14.00 sec 199 MBytes 1.67 Gbits/sec
(etc)
After patch:
[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64
Time: Mon, 14 Apr 2014 16:40:48 GMT
Connecting to host 192.168.240.3, port 5201
Cookie: Lab200slot2.
1397493648.413274.65e131
[ 4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 240 MBytes 2.02 Gbits/sec
[ 4] 1.00-2.00 sec 239 MBytes 2.01 Gbits/sec
[ 4] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec
[ 4] 3.00-4.00 sec 239 MBytes 2.00 Gbits/sec
[ 4] 4.00-5.00 sec 245 MBytes 2.05 Gbits/sec
[ 4] 5.00-6.00 sec 240 MBytes 2.01 Gbits/sec
[ 4] 6.00-7.00 sec 240 MBytes 2.02 Gbits/sec
[ 4] 7.00-8.00 sec 239 MBytes 2.01 Gbits/sec
With the reverted patch applied, the SCTP/IPv4 performance is back
to normal on latest upstream for IPv4 and IPv6 and has same throughput
as 3.4.2 test kernel, steady and interval reports are smooth again.
Fixes:
ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer")
Reported-by: Peter Butler <pbutler@sonusnet.com>
Reported-by: Dongsheng Song <dongsheng.song@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Peter Butler <pbutler@sonusnet.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Wise [Mon, 14 Apr 2014 19:22:43 +0000 (14:22 -0500)]
cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
Hardware needs the local device mac address to support hw loopback for
rdma loopback connections.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 14 Apr 2014 19:20:12 +0000 (21:20 +0200)]
net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has
been wrongly decoded by commit
a8fc927780 ("sk-filter: Add ability to
get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS
although it should have been decoded as BPF_LD|BPF_W|BPF_ABS.
In practice, this should not have much side-effect though, as such
conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse
operation PR_GET_SECCOMP will only return the current seccomp mode, but
not the filter itself. Since the transition to the new BPF infrastructure,
it's also not used anymore, so we can simply remove this as it's
unreachable.
Fixes:
a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 14 Apr 2014 19:02:59 +0000 (21:02 +0200)]
seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
Linus reports that on 32-bit x86 Chromium throws the following seccomp
resp. audit log messages:
audit: type=1326 audit(
1397359304.356:28108): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0
syscall=172 compat=0 ip=0xb2dd9852 code=0x30000
audit: type=1326 audit(
1397359304.356:28109): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5
compat=0 ip=0xb2dd9852 code=0x50000
These audit messages are being triggered via audit_seccomp() through
__secure_computing() in seccomp mode (BPF) filter with seccomp return
codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO)
during filter runtime. Moreover, Linus reports that x86_64 Chromium
seems fine.
The underlying issue that explains this is that the implementation of
populate_seccomp_data() is wrong. Our seccomp data structure sd that
is being shared with user ABI is:
struct seccomp_data {
int nr;
__u32 arch;
__u64 instruction_pointer;
__u64 args[6];
};
Therefore, a simple cast to 'unsigned long *' for storing the value of
the syscall argument via syscall_get_arguments() is just wrong as on
32-bit x86 (or any other 32bit arch), it would result in storing a0-a5
at wrong offsets in args[] member, and thus i) could leak stack memory
to user space and ii) tampers with the logic of seccomp BPF programs
that read out and check for syscall arguments:
syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);
Tested on 32-bit x86 with Google Chrome, unfortunately only via remote
test machine through slow ssh X forwarding, but it fixes the issue on
my side. So fix it up by storing args in type correct variables, gcc
is clever and optimizes the copy away in other cases, e.g. x86_64.
Fixes:
bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 14 Apr 2014 17:43:58 +0000 (13:43 -0400)]
Merge branch 'qlcnic'
Shahed Shaikh says:
====================
qlcnic: Bug fixes
This patch series contains following bug fixes -
* Send INIT_NIC_FUNC mailbox command as first mailbox
* Fix a panic because of uninitialized delayed_work.
* Fix inconsistent calculation of max rings count.
* Fix PVID configuration issue. Driver needs to clear older
PVID before adding new one.
* Fix QLogic application/driver interface by packing vNIC information
array.
* Fix a crash when user tries to disable SR-IOV while VFs are
still assigned to VMs.
Please apply to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 14 Apr 2014 14:02:23 +0000 (10:02 -0400)]
qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
o While disabling SR-IOV when VFs are assigned to VMs causes host crash
so return -EPERM when user request to disable SR-IOV using pci sysfs in
case of VFs are assigned to VMs.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Mon, 14 Apr 2014 14:02:22 +0000 (10:02 -0400)]
qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
o Application expect vNIC number as the array index but driver interface
return configuration in array index form.
o Pack the vNIC information array in the buffer such that application can
access it using vNIC number as the array index.
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Mon, 14 Apr 2014 14:02:21 +0000 (10:02 -0400)]
qlcnic: Fix PVID configuration on eSwitch port.
Clear older PVID before adding a newer PVID to the eSwicth port
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Mon, 14 Apr 2014 14:02:20 +0000 (10:02 -0400)]
qlcnic: Fix max ring count calculation
Do not read max rings count from qlcnic_get_nic_info(). Use driver defined
values for 82xx adapters. In case of 83xx adapters, use minimum of firmware
provided and driver defined values.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>