Nicolas Pitre [Mon, 26 May 2014 22:19:34 +0000 (18:19 -0400)]
sched/fair: Remove "power" from 'struct numa_stats'
It is better not to think about compute capacity as being equivalent
to "CPU power". The upcoming "power aware" scheduler work may create
confusion with the notion of energy consumption if "power" is used too
liberally.
To make things explicit and not create more confusion with the existing
"capacity" member, let's rename things as follows:
power -> compute_capacity
capacity -> task_capacity
Note: none of those fields are actually used outside update_numa_stats().
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linaro-kernel@lists.linaro.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/n/tip-2e2ndymj5gyshyjq8am79f20@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dan Carpenter [Fri, 23 May 2014 10:20:42 +0000 (13:20 +0300)]
sched: Fix signedness bug in yield_to()
yield_to() is supposed to return -ESRCH if there is no task to
yield to, but because the type is bool that is the same as returning
true.
The only place I see which cares is kvm_vcpu_on_spin().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Raghavendra <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/20140523102042.GA7267@mwanda
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Manuel Schölling [Thu, 22 May 2014 17:45:23 +0000 (19:45 +0200)]
sched/fair: Use time_after() in record_wakee()
To be future-proof and for better readability the time comparisons are modified
to use time_after() instead of plain, error-prone math.
Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400780723-24626-1-git-send-email-manuel.schoelling@gmx.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tim Chen [Tue, 20 May 2014 21:39:27 +0000 (14:39 -0700)]
sched/balancing: Reduce the rate of needless idle load balancing
The current no_hz idle load balancer do load balancing for *all* idle cpus,
even though the time due to load balance for a particular
idle cpu could be still a while in the future. This introduces a much
higher load balancing rate than what is necessary. The patch
changes the behavior by only doing idle load balancing on
behalf of an idle cpu only when it is due for load balancing.
On SGI's systems with over 3000 cores, the cpu responsible for idle balancing
got overwhelmed with idle balancing, and introduces a lot of OS noise
to workloads. This patch fixes the issue.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: MichelLespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400621967.2970.280.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ben Segall [Mon, 19 May 2014 22:49:45 +0000 (15:49 -0700)]
sched/fair: Fix unlocked reads of some cfs_b->quota/period
sched_cfs_period_timer() reads cfs_b->period without locks before calling
do_sched_cfs_period_timer(), and similarly unthrottle_offline_cfs_rqs()
would read cfs_b->period without the right lock. Thus a simultaneous
change of bandwidth could cause corruption on any platform where ktime_t
or u64 writes/reads are not atomic.
Extend cfs_b->lock from do_sched_cfs_period_timer() to include the read of
cfs_b->period to solve that issue; unthrottle_offline_cfs_rqs() can just
use 1 rather than the exact quota, much like distribute_cfs_runtime()
does.
There is also an unlocked read of cfs_b->runtime_expires, but a race
there would only delay runtime expiry by a tick. Still, the comparison
should just be != anyway, which clarifies even that problem.
Signed-off-by: Ben Segall <bsegall@google.com>
Tested-by: Roman Gushchin <klamm@yandex-team.ru>
[peterz: Fix compile warn]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140519224945.20303.93530.stgit@sword-of-the-dawn.mtv.corp.google.com
Cc: pjt@google.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Fri, 16 May 2014 04:13:32 +0000 (00:13 -0400)]
sched/numa: Decay ->wakee_flips instead of zeroing
Affine wakeups have the potential to interfere with NUMA placement.
If a task wakes up too many other tasks, affine wakeups will get
disabled.
However, regardless of how many other tasks it wakes up, it gets
re-enabled once a second, potentially interfering with NUMA
placement of other tasks.
By decaying wakee_wakes in half instead of zeroing it, we can avoid
that problem for some workloads.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: chegu_vinod@hp.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20140516001332.67f91af2@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Thu, 15 May 2014 17:03:06 +0000 (13:03 -0400)]
sched/numa: Update migrate_improves/degrades_locality()
Update the migrate_improves/degrades_locality() functions with
knowledge of pseudo-interleaving.
Do not consider moving tasks around within the set of group's active
nodes as improving or degrading locality. Instead, leave the load
balancer free to balance the load between a numa_group's active nodes.
Also, switch from the group/task_weight functions to the group/task_fault
functions. The "weight" functions involve a division, but both calls use
the same divisor, so there's no point in doing that from these functions.
On a 4 node (x10 core) system, performance of SPECjbb2005 seems
unaffected, though the number of migrations with 2 8-warehouse wide
instances seems to have almost halved, due to the scheduler running
each instance on a single node.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mgorman@suse.de
Cc: chegu_vinod@hp.com
Link: http://lkml.kernel.org/r/20140515130306.61aae7db@cuia.bos.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Wed, 14 May 2014 17:22:21 +0000 (13:22 -0400)]
sched/numa: Allow task switch if load imbalance improves
Currently the NUMA balancing code only allows moving tasks between NUMA
nodes when the load on both nodes is in balance. This breaks down when
the load was imbalanced to begin with.
Allow tasks to be moved between NUMA nodes if the imbalance is small,
or if the new imbalance is be smaller than the original one.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mgorman@suse.de
Cc: chegu_vinod@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/20140514132221.274b3463@annuminas.surriel.com
xiaofeng.yan [Fri, 9 May 2014 03:21:27 +0000 (03:21 +0000)]
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code
Signed-off-by: xiaofeng.yan <xiaofeng.yan@huawei.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1399605687-18094-1-git-send-email-xiaofeng.yan@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dongsheng Yang [Thu, 8 May 2014 09:33:49 +0000 (18:33 +0900)]
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/a568a1e3cc8e78648f41b5035fa5e381d36274da.1399532322.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Corey Minyard [Thu, 8 May 2014 18:47:39 +0000 (13:47 -0500)]
sched: Initialize rq->age_stamp on processor start
If the sched_clock time starts at a large value, the kernel will spin
in sched_avg_update for a long time while rq->age_stamp catches up
with rq->clock.
The comment in kernel/sched/clock.c says that there is no strict promise
that it starts at zero. So initialize rq->age_stamp when a cpu starts up
to avoid this.
I was seeing long delays on a simulator that didn't start the clock at
zero. This might also be an issue on reboots on processors that don't
re-initialize the timer to zero on reset, and when using kexec.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1399574859-11714-1-git-send-email-minyard@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kirill Tkhai [Thu, 8 May 2014 23:00:14 +0000 (03:00 +0400)]
sched, nohz: Change rq->nr_running to always use wrappers
Sometimes ->nr_running may cross 2 but interrupt is not being
sent to rq's cpu. In this case we don't reenable the timer.
Looks like this may be the reason for rare unexpected effects,
if nohz is enabled.
Patch replaces all places of direct changing of nr_running
and makes add_nr_running() caring about crossing border.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140508225830.2469.97461.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jason Low [Fri, 9 May 2014 00:49:22 +0000 (17:49 -0700)]
sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance()
Currently, in idle_balance(), we update rq->next_balance when we pull_tasks.
However, it is also important to update this in the !pulled_tasks case too.
When the CPU is "busy" (the CPU isn't idle), rq->next_balance gets computed
using sd->busy_factor (so we increase the balance interval when the CPU is
busy). However, when the CPU goes idle, rq->next_balance could still be set
to a large value that was computed with the sd->busy_factor.
Thus, we need to also update rq->next_balance in idle_balance() in the cases
where !pulled_tasks too, so that rq->next_balance gets updated without taking
the busy_factor into account when the CPU is about to go idle.
This patch makes rq->next_balance get updated independently of whether or
not we pulled_task. Also, we add logic to ensure that we always traverse
at least 1 of the sched domains to get a proper next_balance value for
updating rq->next_balance.
Additionally, since load_balance() modifies the sd->balance_interval, we
need to re-obtain the sched domain's interval after the call to
load_balance() in rebalance_domains() before we update rq->next_balance.
This patch adds and uses 2 new helper functions, update_next_balance() and
get_sd_balance_interval() to update next_balance and obtain the sched
domain's balance_interval.
Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: daniel.lezcano@linaro.org
Cc: alex.shi@linaro.org
Cc: efault@gmx.de
Cc: vincent.guittot@linaro.org
Cc: morten.rasmussen@arm.com
Cc: aswin@hp.com
Link: http://lkml.kernel.org/r/1399596562.2200.7.camel@j-VirtualBox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dongsheng Yang [Thu, 8 May 2014 09:35:15 +0000 (18:35 +0900)]
sched: Use clamp() and clamp_val() to make sys_nice() more readable
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1399541715-19568-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dietmar Eggemann [Wed, 30 Apr 2014 13:39:38 +0000 (14:39 +0100)]
sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()
There is no need to zero struct sched_group member cpumask and struct
sched_group_power member power since both structures are already allocated
as zeroed memory in __sdt_alloc().
This patch has been tested with
BUG_ON(!cpumask_empty(sched_group_cpus(sg))); and BUG_ON(sg->sgp->power);
in build_sched_groups() on ARM TC2 and INTEL i5 M520 platform including
CPU hotplug scenarios.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1398865178-12577-1-git-send-email-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vincent Guittot [Tue, 13 May 2014 09:11:01 +0000 (11:11 +0200)]
sched/numa: Fix initialization of sched_domain_topology for NUMA
Jet Chen has reported a kernel panics when booting qemu-system-x86_64 with
kvm64 cpu. A panic occured while building the sched_domain.
In sched_init_numa, we create a new topology table in which both default
levels and numa levels are copied. The last row of the table must have a null
pointer in the mask field.
The current implementation doesn't add this last row in the computation of the
table size. So we add 1 row in the allocation size that will be used as the
last row of the table. The kzalloc will ensure that the mask field is NULL.
Reported-by: Jet Chen <jet.chen@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: fengguang.wu@intel.com
Link: http://lkml.kernel.org/r/1399972261-25693-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Wed, 14 May 2014 15:40:37 +0000 (11:40 -0400)]
sched: Call select_idle_sibling() when not affine_sd
On smaller systems, the top level sched domain will be an affine
domain, and select_idle_sibling is invoked for every SD_WAKE_AFFINE
wakeup. This seems to be working well.
On larger systems, with the node distance between far away NUMA nodes
being > RECLAIM_DISTANCE, select_idle_sibling is only called if the
waker and the wakee are on nodes less than RECLAIM_DISTANCE apart.
This patch leaves in place the policy of not pulling the task across
nodes on such systems, while fixing the issue that select_idle_sibling
is not called at all in certain circumstances.
The code will look for an idle CPU in the same CPU package as the
CPU where the task ran previously.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: morten.rasmussen@arm.com
Cc: george.mccollister@gmail.com
Cc: ktkhai@parallels.com
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Link: http://lkml.kernel.org/r/20140514114037.2d93266f@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Michael Kerrisk [Fri, 9 May 2014 14:54:33 +0000 (16:54 +0200)]
sched: Simplify return logic in sched_read_attr()
Gotos are chained pointlessly here, and the 'out' label
can be dispensed with.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/536CEC29.9090503@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Michael Kerrisk [Fri, 9 May 2014 14:54:28 +0000 (16:54 +0200)]
sched: Simplify return logic in sched_copy_attr()
The logic in this function is a little contorted, clean it up:
* Rather than having chained gotos for the -EFBIG case, just
return -EFBIG directly.
* Now, the label 'out' is no longer needed, and 'ret' must be zero
zero by the time we fall through to this point, so just return 0.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/536CEC24.9080201@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ben Segall [Thu, 15 May 2014 22:59:20 +0000 (15:59 -0700)]
sched: Fix exec_start/task_hot on migrated tasks
task_hot checks exec_start on any runnable task, but if it has been
migrated since the it last ran, then exec_start is a clock_task from
another cpu. If the old cpu's clock_task was sufficiently far ahead of
this cpu's then the task will not be considered for another migration
until it has run. Instead reset exec_start whenever a task is migrated,
since it is presumably no longer hot anyway.
Signed-off-by: Ben Segall <bsegall@google.com>
[ Made it compile. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140515225920.7179.13924.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Thu, 22 May 2014 08:55:03 +0000 (10:55 +0200)]
Merge branch 'sched/urgent' into sched/core to avoid conflicts with upcoming changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Thu, 22 May 2014 08:37:06 +0000 (10:37 +0200)]
Merge branch 'pm-cpuidle' of git://git./linux/kernel/git/rafael/linux-pm into sched/core
Pull scheduling related CPU idle updates from Rafael J. Wysocki.
Conflicts:
kernel/sched/idle.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Thu, 22 May 2014 08:28:56 +0000 (10:28 +0200)]
Merge tag 'v3.15-rc6' into sched/core, to pick up the latest fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 9 May 2014 17:04:00 +0000 (19:04 +0200)]
arm64: Remove TIF_POLLING_NRFLAG
The only idle method for arm64 is WFI and it therefore
unconditionally requires the reschedule interrupt when idle.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/20140509170649.GG13658@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
James Hogan [Fri, 9 May 2014 14:36:21 +0000 (15:36 +0100)]
metag: Remove TIF_POLLING_NRFLAG
The Meta idle function jumps into the interrupt handler which
efficiently blocks waiting for the next interrupt when it reads the
interrupt status register (TXSTATI). No other (polling) idle functions
can be used, therefore TIF_POLLING_NRFLAG is unnecessary, so lets remove
it.
Peter Zijlstra said:
> Most archs have (x86) hlt or (arm) wfi like idle instructions, and if
> that is your only possible idle function, you'll require the interrupt
> to wake up and there's really no point to having the POLLING bit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/536CEB7E.9080007@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lai Jiangshan [Fri, 16 May 2014 03:50:42 +0000 (11:50 +0800)]
sched: Fix hotplug vs. set_cpus_allowed_ptr()
Lai found that:
WARNING: CPU: 1 PID: 13 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x2d/0x4b()
...
migration_cpu_stop+0x1d/0x22
was caused by set_cpus_allowed_ptr() assuming that cpu_active_mask is
always a sub-set of cpu_online_mask.
This isn't true since
5fbd036b552f ("sched: Cleanup cpu_active madness").
So set active and online at the same time to avoid this particular
problem.
Fixes:
5fbd036b552f ("sched: Cleanup cpu_active madness")
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael wang <wangyun@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/53758B12.8060609@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 14 May 2014 14:04:26 +0000 (16:04 +0200)]
sched/cpupri: Replace NR_CPUS arrays
Tejun reported that his resume was failing due to order-3 allocations
from sched_domain building.
Replace the NR_CPUS arrays in there with a dynamically allocated
array.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-7cysnkw1gik45r864t1nkudh@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 14 May 2014 14:13:56 +0000 (16:13 +0200)]
sched/deadline: Replace NR_CPUS arrays
Tejun reported that his resume was failing due to order-3 allocations
from sched_domain building.
Replace the NR_CPUS arrays in there with a dynamically allocated
array.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-kat4gl1m5a6dwy6nzuqox45e@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Juri Lelli [Tue, 13 May 2014 12:11:31 +0000 (14:11 +0200)]
sched/deadline: Restrict user params max value to 2^63 ns
Michael Kerrisk noticed that creating SCHED_DEADLINE reservations
with certain parameters (e.g, a runtime of something near 2^64 ns)
can cause a system freeze for some amount of time.
The problem is that in the interface we have
u64 sched_runtime;
while internally we need to have a signed runtime (to cope with
budget overruns)
s64 runtime;
At the time we setup a new dl_entity we copy the first value in
the second. The cast turns out with negative values when
sched_runtime is too big, and this causes the scheduler to go crazy
right from the start.
Moreover, considering how we deal with deadlines wraparound
(s64)(a - b) < 0
we also have to restrict acceptable values for sched_{deadline,period}.
This patch fixes the thing checking that user parameters are always
below 2^63 ns (still large enough for everyone).
It also rewrites other conditions that we check, since in
__checkparam_dl we don't have to deal with deadline wraparounds
and what we have now erroneously fails when the difference between
values is too big.
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Dario Faggioli<raistlin@linux.it>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140513141131.20d944f81633ee937f256385@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 12 May 2014 20:50:34 +0000 (22:50 +0200)]
sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE
The way we read POSIX one should only call sched_getparam() when
sched_getscheduler() returns either SCHED_FIFO or SCHED_RR.
Given that we currently return sched_param::sched_priority=0 for all
others, extend the same behaviour to SCHED_DEADLINE.
Requested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: linux-man <linux-man@vger.kernel.org>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20140512205034.GH13467@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 9 May 2014 08:49:03 +0000 (10:49 +0200)]
sched: Disallow sched_attr::sched_policy < 0
The scheduler uses policy=-1 to preserve the current policy state to
implement sys_sched_setparam(), this got exposed to userspace by
accident through sys_sched_setattr(), cure this.
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140509085311.GJ30445@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Michael Kerrisk [Fri, 9 May 2014 14:54:15 +0000 (16:54 +0200)]
sched: Make sched_setattr() correctly return -EFBIG
The documented[1] behavior of sched_attr() in the proposed man page text is:
sched_attr::size must be set to the size of the structure, as in
sizeof(struct sched_attr), if the provided structure is smaller
than the kernel structure, any additional fields are assumed
'0'. If the provided structure is larger than the kernel structure,
the kernel verifies all additional fields are '0' if not the
syscall will fail with -E2BIG.
As currently implemented, sched_copy_attr() returns -EFBIG for
for this case, but the logic in sys_sched_setattr() converts that
error to -EFAULT. This patch fixes the behavior.
[1] http://thread.gmane.org/gmane.linux.kernel/
1615615/focus=
1697760
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/536CEC17.9070903@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 21 May 2014 21:42:02 +0000 (06:42 +0900)]
Linux 3.15-rc6
Linus Torvalds [Wed, 21 May 2014 20:55:12 +0000 (05:55 +0900)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull two powerpc fixes from Ben Herrenschmidt:
"Here are a couple of fixes for 3.15. One from Anton fixes a nasty
regression I introduced when trying to fix a loss of irq_work whose
consequences is that we can completely lose timer interrupts on a
CPU... not pretty.
The other one is a change to our PCIe reset hook to use a firmware
call instead of direct config space accesses to trigger a fundamental
reset on the root port. This is necessary so that the FW gets a
chance to disable the link down error monitoring, which would
otherwise trip and cause subsequent fatal EEH error"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: irq work racing with timer interrupt can result in timer interrupt hang
powerpc/powernv: Reset root port in firmware
Linus Torvalds [Wed, 21 May 2014 20:40:13 +0000 (05:40 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull two btrfs fixes from Chris Mason:
"This has two fixes that we've been testing for 3.16, but since both
are safe and fix real bugs, it makes sense to send for 3.15 instead"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: send, fix incorrect ref access when using extrefs
Btrfs: fix EIO on reading file after ioctl clone works on it
Linus Torvalds [Wed, 21 May 2014 20:38:51 +0000 (05:38 +0900)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull two ceph fixes from Sage Weil:
"The first patch fixes a problem when we have a page count of 0 for
sendpage which is triggered by zfs. The second fixes a bug in CRUSH
that was resolved in the userland code a while back but fell through
the cracks on the kernel side"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
crush: decode and initialize chooseleaf_vary_r
libceph: fix corruption when using page_count 0 page in rbd
Linus Torvalds [Wed, 21 May 2014 20:36:07 +0000 (05:36 +0900)]
Merge tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"Code inspection of the XFS error number sign translations found a
bunch of issues, including returning incorrectly signed errors for
some data integrity operations.
These leak to userspace and result in applications not getting the
errors correctly reported. Hence they need fixing sooner rather than
later.
A couple of the bugs are in data integrity operations, a couple more
are in the new COLLAPSE_RANGE code. One of these came in through a
recent ext4 merge and so I had to update the base tree to 3.15-rc5
before fixing the issues"
* tag 'xfs-for-linus-3.15-rc6' of git://oss.sgi.com/xfs/xfs:
xfs: list_lru_init returns a negative error
xfs: negate xfs_icsb_init_counters error value
xfs: negate mount workqueue init error value
xfs: fix wrong err sign on xfs_set_acl()
xfs: fix wrong errno from xfs_initxattrs
xfs: correct error sign on COLLAPSE_RANGE errors
xfs: xfs_commit_metadata returns wrong errno
xfs: fix incorrect error sign in xfs_file_aio_read
xfs: xfs_dir_fsync() returns positive errno
Linus Torvalds [Wed, 21 May 2014 20:34:57 +0000 (05:34 +0900)]
Merge branch 'renameat2' of git://git./linux/kernel/git/mszeredi/vfs
Pull renameat2 arch support from Miklos Szeredi:
"I've collected architecture patches for the renameat2 syscall that
maintainers acked and/or asked me to queue.
This adds architecture support for the renameat2 syscall to m68k,
parisc, ia64 and through asm-generic to arc, arm64, c6x, hexagon,
metag, openrisc, score, tile, unicore32"
* 'renameat2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
scripts/checksyscalls.sh: Make renameat optional
asm-generic: Add renameat2 syscall
ia64: add renameat2 syscall
parisc: add renameat2 syscall
m68k: add renameat2 syscall
Linus Torvalds [Wed, 21 May 2014 19:29:39 +0000 (04:29 +0900)]
Merge tag 'iommu-fixes-v3.15-rc5' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Three fixes for the AMD IOMMU driver:
- fix a locking issue around get_user_pages()
- fix two issues with device aliasing and exclusion range handling"
* tag 'iommu-fixes-v3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: fix enabling exclusion range for an exact device
iommu/amd: Take mmap_sem when calling get_user_pages
iommu/amd: Fix interrupt remapping for aliased devices
Linus Torvalds [Wed, 21 May 2014 19:28:21 +0000 (04:28 +0900)]
Merge tag 'stable/for-linus-3.15-rc5-tag' of git://git./linux/kernel/git/konrad/ibft
Pull iscsi_ibft fix from Konrad Rzeszutek Wilk:
"Fix iBFT regression on Broadcom NICs introduced in 3.2"
* tag 'stable/for-linus-3.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft:
iscsi_ibft: Fix finding Broadcom specific ibft sign
Linus Torvalds [Wed, 21 May 2014 19:26:23 +0000 (04:26 +0900)]
Merge tag 'renesas-sh-drivers-for-v3.15' of git://git./linux/kernel/git/horms/renesas
Pull SH driver fix from Simon Horman:
"Compile drivers/sh/pm_runtime.c if ARCH_SHMOBILE_MULTI
This resolves a regression introduced in v3.14 by commit
bf98c1eac1d4
("ARM: Rename ARCH_SHMOBILE to ARCH_SHMOBILE_LEGACY")"
* tag 'renesas-sh-drivers-for-v3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
drivers: sh: compile drivers/sh/pm_runtime.c if ARCH_SHMOBILE_MULTI
Linus Torvalds [Wed, 21 May 2014 10:01:08 +0000 (19:01 +0900)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Most of the changes are drivers fixes (rtl28xuu, fc2580, ov7670,
davinci, gspca, s5p-fimc and s5c73m3).
There is also a compat32 fix and one infoleak fixup at the media
controller"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] V4L2: fix VIDIOC_CREATE_BUFS in 64- / 32-bit compatibility mode
[media] V4L2: ov7670: fix a wrong index, potentially Oopsing the kernel from user-space
[media] media-device: fix infoleak in ioctl media_enum_entities()
[media] fc2580: fix tuning failure on 32-bit arch
[media] Prefer gspca_sonixb over sn9c102 for all devices
[media] media: davinci: vpfe: make sure all the buffers unmapped and released
[media] staging: media: davinci: vpfe: make sure all the buffers are released
[media] media: davinci: vpbe_display: fix releasing of active buffers
[media] media: davinci: vpif_display: fix releasing of active buffers
[media] media: davinci: vpif_capture: fix releasing of active buffers
[media] s5p-fimc: Fix YUV422P depth
[media] s5c73m3: Add missing rename of v4l2_of_get_next_endpoint() function
[media] rtl28xxu: silence error log about disabled rtl2832_sdr module
[media] rtl28xxu: do not hard depend on staging SDR module
Linus Torvalds [Wed, 21 May 2014 10:00:09 +0000 (19:00 +0900)]
Merge tag 'staging-3.15-rc6' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are five staging driver fixes for 3.15-rc6 that resolve some
reported issues. They are for the imx and rtl8723au drivers"
* tag 'staging-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723au: Do not reset wdev->iftype in netdev_close()
staging: rtl8723au: Use correct pipe type for USB interrupts
imx-drm: imx-tve: correct DDC property name to 'ddc-i2c-bus'
imx-drm: imx-drm-core: skip components whose parent device is disabled
imx-drm: imx-drm-core: fix imx_drm_encoder_get_mux_id
Linus Torvalds [Wed, 21 May 2014 09:59:25 +0000 (18:59 +0900)]
Merge tag 'driver-core-3.15-rc6' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve
some reported issues and a regression from 3.13"
* tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: make sure read buffer is zeroed
kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs
Linus Torvalds [Wed, 21 May 2014 09:57:25 +0000 (18:57 +0900)]
Merge tag 'pci-v3.15-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"These are fixes for an SHPCHP hotplug regression, a "wait for pending
transaction" problem (used in device reset paths), and an email
address update.
PCI device hotplug:
- Fix SHPCHP bus speed mismatch issue (Marcel Apfelbaum)
Miscellaneous:
- Fix pci_wait_for_pending_transaction() (Gavin Shan)
- Update email address (Ben Hutchings)"
* tag 'pci-v3.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Wrong register used to check pending traffic
PCI: shpchp: Check bridge's secondary (not primary) bus speed
PCI: Update my email address
Linus Torvalds [Wed, 21 May 2014 09:56:35 +0000 (18:56 +0900)]
Merge tag 'random_for_linus' of git://git./linux/kernel/git/tytso/random
Pull /dev/random fix from Ted Ts'o:
"This fixes a BUG_ON-causing regression that was introduced during the
last merge window"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: fix BUG_ON caused by accounting simplification
Linus Torvalds [Wed, 21 May 2014 09:55:17 +0000 (18:55 +0900)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux
Pull clock framework fixes from Mike Turquette:
"Clock framework and driver fixes, all of which fix user-visible
regressions.
As usual most fixes are for platform-specific clock drivers, but there
are also two fixes to the clk core after recent changes to the way
that clock unregistration is handled"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk: tegra: Fix wrong value written to PLLE_AUX
clk: shmobile: clk-mstp: change to using clock-indices
clk: Fix slab corruption in clk_unregister()
clk: Fix double free due to devm_clk_register()
clk: socfpga: fix clock driver for 3.15
clk: divider: Fix best div calculation for power-of-two and table dividers
clk: bcm281xx: don't use unnamed structs or unions
Linus Torvalds [Wed, 21 May 2014 09:53:55 +0000 (18:53 +0900)]
Merge tag 'spi-v3.15-rc5' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few core fixes around outlying cases here, nothing that should
affect most users but useful fixes. The diffstat is rather larger
than one might hope due some simple code motion in the fix for
!CONFIG_DMA, the actual meaningful change is much smaller.
- Fix handling of unsupported dual and quad mode support on slave
registration so that drivers that can degrade gracefully do so,
preventing regressions for drivers this is added.
- Fix build in !CONFIG_DMA cases following addition of generic DMA
mapping support.
- Fix error handling for queue creation which due to wider kernel
changes can be triggered more easily.
- A couple of driver specific fixes"
* tag 'spi-v3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi/pxa2xx: Prevent DMA from transferring too many bytes
spi: core: Don't destroy master queue if we fail to create it
spi: qup: Fix return value checking for pm_runtime_get_sync()
spi: core: Protect DMA code by #ifdef CONFIG_HAS_DMA
spi: core: Ignore unsupported Dual/Quad Transfer Mode bits
Linus Torvalds [Wed, 21 May 2014 09:53:13 +0000 (18:53 +0900)]
Merge tag 'gpio-v3.15-3' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
- fix a null pointer bug in the ICH6 chipset driver
- fix device tree registration for the mcp23s08 driver
* tag 'gpio-v3.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mcp23s08: Bug fix of SPI device tree registration.
gpio: ich: set regs and reglen for i3100 and ich6 chipset
Linus Torvalds [Wed, 21 May 2014 09:36:40 +0000 (18:36 +0900)]
Merge branch 'for-3.15-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull more cgroup fixes from Tejun Heo:
"Three more patches to fix cgroup_freezer breakage due to the recent
cgroup internal locking changes - an operation cgroup_freezer was
using now requires sleepable context and cgroup_freezer was invoking
that while holding a spin lock. cgroup_freezer was using an overly
elaborate hierarchical locking scheme.
While it's possible to convert the hierarchical spinlocks directly to
mutexes, this patch simplifies the overall locking so that it uses a
global mutex. This has the added benefit of avoiding iterating
potentially huge number of tasks under a spinlock. While the patch is
on the larger side in the devel cycle, the changes made are mostly
straight-forward and the locking logic is a lot simpler afterwards"
* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix rcu_read_lock() leak in update_if_frozen()
cgroup_freezer: replace freezer->lock with freezer_mutex
cgroup: introduce task_css_is_root()
Linus Torvalds [Wed, 21 May 2014 09:35:42 +0000 (18:35 +0900)]
Merge branch 'for-3.15-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Mostly device-specific fixes. The only thing which isn't is the fix
for zpodd oops-on-detach bug"
* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: imx: PLL clock needs 100us to settle down
ata: pata_at91 only works on sam9
libata: clean up ZPODD when a port is detached
ahci: imx: software workaround for phy reset issue in resume
ahci: imx: add namespace for register enums
ahci: disable DEVSLP for Intel Valleyview
Linus Torvalds [Wed, 21 May 2014 09:34:35 +0000 (18:34 +0900)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a NULL pointer dereference on allocation failure in caam,
as well as a regression in the ctr mode on s390 that was added with
the recent concurrency fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: s390 - fix aes,des ctr mode concurrency finding.
crypto: caam - add allocation failure handling in SPRINTFCAT macro
Linus Torvalds [Wed, 21 May 2014 09:03:14 +0000 (18:03 +0900)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
"This series include:
- Close race between iser-target network portal shutdown + accepting
new connection logins (sagi)
- Fix free-after-use regression in tcm_fc post conversion to
percpu-ida pre-allocation (nab)
- Explicitly disable Immediate + Unsolicited Data for iser-target
connections when T10-PI is enabled (sagi + nab)
- Allow pi_prot_type + emulate_write_cache attributes to be set to
zero regardless of backend support (andy)
- memory leak fix (mikulas)"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: fix memory leak on XCOPY
target: Don't allow setting WC emulation if device doesn't support
iscsi-target: Disable Immediate + Unsolicited Data with ISER Protection
tcm_fc: Fix free-after-use regression in ft_free_cmd
iscsi-target: Change BUG_ON to REJECT in iscsit_process_nop_out
Target/iscsi,iser: Avoid accepting transport connections during stop stage
Target/iser: Fix iscsit_accept_np and rdma_cm racy flow
Target/iser: Fix wrong connection requests list addition
target: Allow non-supporting backends to set pi_prot_type to 0
Linus Torvalds [Wed, 21 May 2014 09:02:12 +0000 (18:02 +0900)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Some I2C bugfixes for 3.15. Typical stuff, I'd say"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rcar: bail out on zero length transfers
i2c: qup: Fix pm_runtime_get_sync usage
i2c: s3c2410: resume race fix
i2c: nomadik: Don't use IS_ERR for devm_ioremap
i2c: designware: Mask all interrupts during i2c controller enable
Linus Torvalds [Wed, 21 May 2014 08:58:34 +0000 (17:58 +0900)]
Merge tag 'pm+acpi-3.15-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"Still fixing regressions (partly by reverting commits that broke
things for people), fixing other stable-candidate bugs and adding some
blacklist entries for ACPI video and _OSI.
Two ACPICA regression fixes (one recent and one for a 3.14 commit), a
fix for an ACPI-related regression in TPM (introduced in 3.14), a
revert of the ACPI AC driver conversion in 3.13 that went wrong for an
unknown reason, two reverts of commits that attempted to remove an old
user space interface in /proc and broke some utilities, in 3.13 too, a
fix for a CPU hotplug bug in the ACPI processor driver (stable
material), two (stable candidate) fixes for intel_pstate and a few new
blacklist entries, mostly for systems that shipped with Windows 8.
Specifics:
- ACPICA fix for a stale pointer access introduced by a recent commit
in the XSDT validation code from Lv Zheng.
- ACPICA fix for the default value of the command line switch to
favor 32-bit FADT addresses (in case there's a conflict between a
64-bit and a 32-bit address). The previous default was that the
32-bit version would take precedence and we tried to change it to
the other way around and it didn't work. From Lv Zheng.
- A TPM commit related to ACPI _DSM in 3.14 caused the driver to
refuse to load if a specific _DSM was missing and that broke resume
from system suspend on Chromebooks that require the TPM hardware to
be restored to a working state during resume by the OS. Restore
the old behavior to load the driver if the _DSM in question is not
present, but prevent it from using the feature the _DSM is for.
- ACPI AC driver conversion in 3.13 broke thermal management on at
least one machine and has to be reverted. From Guenter Roeck.
- Two reverts of 3.13 commits that attempted to remove the old ACPI
battery interface in /proc, but turned out to break some utilities
still using that interface. From Lan Tianyu.
- ACPI processor driver fix to prevent acpi_processor_add() from
modifying the CPU device's .offline field which leads to breakage
if the initial online of the CPU fails. From Igor Mammedov.
- Two intel_pstate fixes, one to take a BayTrail documentation update
into account and one to avoid forcing the maximum P-state on init
which causes CPU PM trouble on systems with P-states coordination
when one of the CPU cores is initialized after an offline/online
cycle triggered by user space. Both stable candidates, from Dirk
Brandewie.
- Fix for the ACPI video DMI blacklist entry for Dell Inspiron 7520
from Aaron Lu.
- Two new ACPI video blacklist entries for machines shipping with
Win8 that need to use native backlight so that it can be controlled
in a usual way (which doesn't work otherwise due bugs in the ACPI
tables) from Hans de Goede.
- Two ACPI _OSI quirks for systems that need them to work correctly
with Linux from Edward Lin and Hans de Goede"
* tag 'pm+acpi-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Revert native brightness quirk for ThinkPad T530
intel_pstate: remove setting P state to MAX on init
ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses.
ACPI / video: correct DMI tag for Dell Inspiron 7520
intel_pstate: Set turbo VID for BayTrail
ACPI / TPM: Fix resume regression on Chromebooks
ACPI / proc: Do not say when /proc interfaces will be deleted in Kconfig
ACPI / processor: do not mark present at boot but not onlined CPU as onlined
ACPI: Revert "ACPI / AC: convert ACPI ac driver to platform bus"
ACPI / blacklist: Add dmi_enable_osi_linux quirk for Asus EEE PC 1015PX
ACPI: blacklist win8 OSI for Dell Inspiron 7737
ACPI / video: Add use_native_backlight quirks for more systems
ACPI: Revert "ACPI / Battery: Remove battery's proc directory"
ACPI: Revert "ACPI: Remove CONFIG_ACPI_PROCFS_POWER and cm_sbsc.c"
ACPICA: Tables: Fix invalid pointer accesses in acpi_tb_parse_root_table().
Linus Torvalds [Wed, 21 May 2014 08:57:31 +0000 (17:57 +0900)]
Merge tag 'dm-3.15-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"A dm-crypt fix for a cpu hotplug crash that switches from using
per-cpu data to a mempool allocation (which offers allocation with cpu
locality, and there is no inter-cpu communication on slab allocation).
A couple dm-thinp stable fixes to address "out-of-data-space" issues.
A dm-multipath fix for a LOCKDEP warning introduced in 3.15-rc1"
* tag 'dm-3.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm mpath: fix lock order inconsistency in multipath_ioctl
dm thin: add timeout to stop out-of-data-space mode holding IO forever
dm thin: allow metadata commit if pool is in PM_OUT_OF_DATA_SPACE mode
dm crypt: fix cpu hotplug crash by removing per-cpu structure
Linus Torvalds [Wed, 21 May 2014 08:54:55 +0000 (17:54 +0900)]
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree fixes from Grant Likely:
"Drivercore bugfixes for v3.15
This branch contains bug fixes important to get into v3.15. There is
a fix for modifying properties seen during early boot, a fix for an
incorrect prototype when CONFIG_OF=n, and a couple of corrections to
device tree memory nodes on a few platforms"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
mips: dts: Fix missing device_type="memory" property in memory nodes
arm: dts: Fix missing device_type="memory" for ste-ccu8540
of: fix CONFIG_OF=n prototype of of_node_full_name()
of: make of_update_property() usable earlier in the boot process
Filipe Manana [Tue, 13 May 2014 21:01:02 +0000 (22:01 +0100)]
Btrfs: send, fix incorrect ref access when using extrefs
When running send, if an inode only has extended reference items
associated to it and no regular references, send.c:get_first_ref()
was incorrectly assuming the reference it found was of type
BTRFS_INODE_REF_KEY due to use of the wrong key variable.
This caused weird behaviour when using the found item has a regular
reference, such as weird path string, and occasionally (when lucky)
a crash:
[ 190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy
[ 190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1
[ 190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 190.602868] task:
ffff8800d447c920 ti:
ffff8801fa79e000 task.ti:
ffff8801fa79e000
[ 190.603030] RIP: 0010:[<
ffffffff813266b4>] [<
ffffffff813266b4>] memcpy+0x54/0x110
[ 190.603262] RSP: 0018:
ffff8801fa79f880 EFLAGS:
00010202
[ 190.603395] RAX:
ffff8800d4326e3f RBX:
000000000000036a RCX:
ffff880000000000
[ 190.603553] RDX:
000000000000032a RSI:
ffe708844042936a RDI:
ffff8800d43271a9
[ 190.603710] RBP:
ffff8801fa79f8c8 R08:
00000000003a4ef0 R09:
0000000000000000
[ 190.603867] R10:
793a4ef09f000000 R11:
9f0000000053726f R12:
ffff8800d43271a9
[ 190.604020] R13:
0000160000000000 R14:
ffff8802110134f0 R15:
000000000000036a
[ 190.604020] FS:
00007fb423d09b80(0000) GS:
ffff880216200000(0000) knlGS:
0000000000000000
[ 190.604020] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[ 190.604020] CR2:
00007fb4229d4b78 CR3:
00000001f5d76000 CR4:
00000000000006e0
[ 190.604020] Stack:
[ 190.604020]
ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8
[ 190.604020]
00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8
[ 190.604020]
00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928
[ 190.604020] Call Trace:
[ 190.604020] [<
ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs]
[ 190.604020] [<
ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs]
[ 190.604020] [<
ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs]
[ 190.604020] [<
ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs]
[ 190.604020] [<
ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0
[ 190.604020] [<
ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs]
[ 190.604020] [<
ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs]
(...)
Steps to reproduce (either crash or some weirdness like an odd path string):
mkfs.btrfs -f -O extref /dev/sdd
mount /dev/sdd /mnt
mkdir /mnt/testdir
touch /mnt/testdir/foobar
for i in `seq 1 2550`; do
ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i`
done
ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name
rm -f /mnt/testdir/foobar
for i in `seq 1 2550`; do
rm -f /mnt/testdir/foobar_link_`printf "%04d" $i`
done
btrfs subvolume snapshot -r /mnt /mnt/mysnap
btrfs send /mnt/mysnap -f /tmp/mysnap.send
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Liu Bo [Fri, 9 May 2014 02:01:02 +0000 (10:01 +0800)]
Btrfs: fix EIO on reading file after ioctl clone works on it
For inline data extent, we need to make its length aligned, otherwise,
we can get a phantom extent map which confuses readpages() to return -EIO.
This can be detected by xfstests/btrfs/035.
Reported-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
James Hogan [Wed, 23 Apr 2014 10:08:06 +0000 (11:08 +0100)]
scripts/checksyscalls.sh: Make renameat optional
The new renameat2 syscall provides all the functionality of renameat
with an additional flags argument, so make renameat optional so that
future architectures can omit it without getting a warning.
This patch doesn't affect existing architectures.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: linux-arch@vger.kernel.org
James Hogan [Wed, 23 Apr 2014 10:08:05 +0000 (11:08 +0100)]
asm-generic: Add renameat2 syscall
Add the renameat2 syscall to the generic syscall list, which is used by the
following architectures: arc, arm64, c6x, hexagon, metag, openrisc, score,
tile, unicore32.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: linux-arch@vger.kernel.org
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-hexagon@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Miklos Szeredi [Tue, 20 May 2014 08:59:38 +0000 (10:59 +0200)]
ia64: add renameat2 syscall
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Tony Luck <tony.luck@intel.com>
Miklos Szeredi [Tue, 20 May 2014 08:59:37 +0000 (10:59 +0200)]
parisc: add renameat2 syscall
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Helge Deller <deller@gmx.de>
Miklos Szeredi [Tue, 20 May 2014 08:59:37 +0000 (10:59 +0200)]
m68k: add renameat2 syscall
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Linus Torvalds [Tue, 20 May 2014 07:50:38 +0000 (16:50 +0900)]
Merge tag 'sound-3.15-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Unfortunately this update became bigger than previous pull requests,
which is almost a pattern in rc5-6. But, the only obvious big changes
are for the new Intel DSP ASoC drivers, so the impact must be fairly
limited.
Other than that, usual small fixes in various fields: HD-audio, ASoC
core and ASoC fsl and codec drivers"
* tag 'sound-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (26 commits)
ALSA: sb_mixer: missing return statement
ASoC: wm8962: Update register CLASS_D_CONTROL_1 to be non-volatile
ASoC: Intel: Fix Baytrail SST DSP firmware loading
ALSA: hda - mask buggy stream DMA0 for Broadwell display controller
ALSA: hda - Add new GPU codec ID to snd-hda
ASoC: fsl_esai: Set PCRC and PRRC registers at the end of hw_params()
ASoC: fsl_esai: Only bypass sck_div for EXTAL source
ASoC: fsl_esai: Fix incorrect condition within ratio range check for FP
ASoC: dapm: Fix SUSPEND -> OFF bias sequence
ASoC: dapm: Skip CODEC<->CODEC links in connect_dai_link_widgets()
ASoC: pcm: Fix incorrect condition check for case SNDRV_PCM_TRIGGER_SUSPEND
ALSA: hda - add headset mic detect quirks for three Dell laptops
ASoC: Update Cirrus Logic CODEC maintainers.
ASoC: Intel: Fix block offset calculations.
ASoC: Intel: Fix check for pdata usage before dereference.
ASoC: Intel: Fix stream position pointer.
ASoC: Intel: Fix allow hw_params to be called more than once.
ASoC: Intel: Fix Audio DSP usage when IOMMU is enabled.
ASoC: Intel: Fix Haswell/Broadwell DSP page table creation.
ASoC: Intel: Fix allocated block list usage when adding blocks.
...
Linus Torvalds [Tue, 20 May 2014 07:47:33 +0000 (16:47 +0900)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"MIPS fixes for various loose ends:
- Fix workarounds for R4000 erratum.
- Patch up DEC, Siemens-Nixdorf and Loongson hardware support.
- Wire up renameat2 syscall.
- Delete unused file - it was causing false warnings from maintenance
scripts.
- Revert a patch because it's functionality is now implemented twice
which causes superfluous /proc/cpuinfo output.
- Fix a microMIPS regression"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: mm: Fix broken microMIPS kernel regression.
MIPS: Add new AUDIT_ARCH token for the N32 ABI on MIPS64
MIPS: Wire up renameat2 syscall.
MIPS: inst.h: Rename BITFIELD_FIELD to __BITFIELD_FIELD.
MIPS: Remove file missed when removing rm9k support a while ago.
MIPS/loongson2_cpufreq: Fix CPU clock rate setting
MIPS: Loongson: No need to select GENERIC_HARDIRQS_NO__DO_IRQ
MIPS: csum_partial.S CPU_DADDI_WORKAROUNDS bug fix
MIPS: __strncpy_from_user_asm CPU_DADDI_WORKAROUNDS bug fix
MIPS: __delay CPU_DADDI_WORKAROUNDS bug fix
MIPS: DEC/SNI: O32 wrapper stack switching fixes
MIPS: DEC: Bus error handler <asm/cpu-type.h> fixes
MAINTAINERS: TURBOchannel: Update entry
Revert "MIPS: MT: proc: Add support for printing VPE and TC ids"
Linus Torvalds [Tue, 20 May 2014 05:35:28 +0000 (14:35 +0900)]
Merge branch 'parisc-3.15-4' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"There are two patches in here:
The first patch greatly improves latency and corrects the memory
ordering in our light-weight atomic locking syscall.
The second patch ratelimits printing of userspace segfaults in the
same way as it's done on other platforms. This fixes a possible DOS
on parisc since it prevents the syslog to grow too fast. For example,
when the debian acl2 package was built on our debian buildd servers,
this package produced lots of gigabytes in syslog in very short time
and thus filled our harddisks, which then turned the server nearly
completely unaccessible and unresponsive"
* 'parisc-3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Improve LWS-CAS performance
parisc: ratelimit userspace segfault printing
Linus Torvalds [Tue, 20 May 2014 05:33:48 +0000 (14:33 +0900)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull two arm64 fixes from Catalin Marinas:
- arm64 migrate_irqs() fix following commit
ffde1de64012 (irqchip: Gic:
Support forced affinity setting)
- fix arm64 pud_huge() to return 0 when only 2 levels page tables are
used (__PAGETABLE_PMD_FOLDED defined and pmd_huge already covers
block entries at the first level), otherwise KVM gets confused
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: fix pud_huge() for 2-level pagetables
arm64: use cpu_online_mask when using forced irq_set_affinity
Linus Torvalds [Tue, 20 May 2014 05:30:34 +0000 (14:30 +0900)]
Merge tag 'metag-for-v3.15-2' of git://git./linux/kernel/git/jhogan/metag
Pull Metag architecture and related fixes from James Hogan:
"Mostly fixes for metag and parisc relating to upgrowing stacks.
- Fix missing compiler barriers in metag memory barriers.
- Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased
beyond safe value.
- Make maximum stack size configurable. This reduces the default
user stack size back to 80MB (especially on parisc after their
removal of _STK_LIM_MAX override). This only affects metag and
parisc.
- Remove metag _STK_LIM_MAX override to match other arches and follow
parisc, now that it is safe to do so (due to the BUG_ON fix
mentioned above).
- Finally now that both metag and parisc _STK_LIM_MAX overrides have
been removed, it makes sense to remove _STK_LIM_MAX altogether"
* tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
asm-generic: remove _STK_LIM_MAX
metag: Remove _STK_LIM_MAX override
parisc,metag: Do not hardcode maximum userspace stack size
metag: Reduce maximum stack size to 256MB
metag: fix memory barriers
Linus Torvalds [Tue, 20 May 2014 05:28:33 +0000 (14:28 +0900)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm/intel fixes from Dave Airlie:
"Just some intel fixes.
I have some radeon ones but I need to get some patches dropped from
the pull req"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Increase WM memory latency values on SNB
drm/i915: restore backlight precision when converting from ACPI
drm/i915: Use the first mode if there is no preferred mode in the EDID
drm/i915/dp: force eDP lane count to max available lanes on BDW
drm/i915/vlv: reset VLV media force wake request register
drm/i915/SDVO: For sysfs link put directory and target in correct order
drm/i915: use lane count and link rate from VBT as minimums for eDP
drm/i915: clean up VBT eDP link param decoding
drm/i915: consider the source max DP lane count too
Linus Torvalds [Tue, 20 May 2014 05:21:11 +0000 (14:21 +0900)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86-64, modify_ldt: Make support for 16-bit segments a runtime option
x86, mm, hugetlb: Add missing TLB page invalidation for hugetlb_cow()
x86, rdrand: When nordrand is specified, disable RDSEED as well
Linus Torvalds [Tue, 20 May 2014 05:19:10 +0000 (14:19 +0900)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single bug fix for a long standing issue:
- Updating the expiry value of a relative timer _after_ letting the
idle logic select a target cpu for the timer based on its stale
expiry value is outright stupid. Thanks to Viresh for spotting the
brainfart"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Set expiry time before switch_hrtimer_base()
Linus Torvalds [Tue, 20 May 2014 05:18:04 +0000 (14:18 +0900)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Two small updates from the irq departement:
- Provide missing inline stub for a SMP only function
- Add sub-maintainer for the drivers/irqchip/ part of the irq
subsystem. YAY!"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add co-maintainer for drivers/irqchip
genirq: Provide irq_force_affinity fallback for non-SMP
Tejun Heo [Mon, 19 May 2014 19:52:10 +0000 (15:52 -0400)]
sysfs: make sure read buffer is zeroed
13c589d5b0ac ("sysfs: use seq_file when reading regular files")
switched sysfs from custom read implementation to seq_file to enable
later transition to kernfs. After the change, the buffer passed to
->show() is acquired through seq_get_buf(); unfortunately, this
introduces a subtle behavior change. Before the commit, the buffer
passed to ->show() was always zero as it was allocated using
get_zeroed_page(). Because seq_file doesn't clear buffers on
allocation and neither does seq_get_buf(), after the commit, depending
on the behavior of ->show(), we may end up exposing uninitialized data
to userland thus possibly altering userland visible behavior and
leaking information.
Fix it by explicitly clearing the buffer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ron <ron@debian.org>
Fixes:
13c589d5b0ac ("sysfs: use seq_file when reading regular files")
Cc: stable <stable@vger.kernel.org> # 3.13+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Airlie [Mon, 19 May 2014 23:56:26 +0000 (09:56 +1000)]
Merge tag 'drm-intel-fixes-2014-05-16' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Intel fixes for regressions, black screens and hangs, for 3.15.
* tag 'drm-intel-fixes-2014-05-16' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Increase WM memory latency values on SNB
drm/i915: restore backlight precision when converting from ACPI
drm/i915: Use the first mode if there is no preferred mode in the EDID
drm/i915/dp: force eDP lane count to max available lanes on BDW
drm/i915/vlv: reset VLV media force wake request register
drm/i915/SDVO: For sysfs link put directory and target in correct order
drm/i915: use lane count and link rate from VBT as minimums for eDP
drm/i915: clean up VBT eDP link param decoding
drm/i915: consider the source max DP lane count too
Shawn Guo [Sat, 17 May 2014 12:46:01 +0000 (20:46 +0800)]
ahci: imx: PLL clock needs 100us to settle down
The commit
e783c51 (ahci: imx: software workaround for phy reset issue
in resume) calls imx_sata_phy_reset() to reset phy immediately after
SATA MPLL is enabled. It seems working fine mostly, but fails in some
case as below.
...
ahci-imx
2200000.sata: failed to reset phy: -110
ahci-imx: probe of
2200000.sata failed with error -110
After talking to the designer, we learnt that when enabling i.MX6Q SATA
MPLL, we need to wait 100us for it to settle down for safety. Add this
required delay to fix above failure.
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Gavin Shan [Mon, 19 May 2014 03:06:46 +0000 (13:06 +1000)]
PCI: Wrong register used to check pending traffic
The incorrect register offset is passed to pci_wait_for_pending(), which is
caused by commit
157e876ffe ("PCI: Add pci_wait_for_pending() (refactor
pci_wait_for_pending_transaction())").
Fixes:
157e876ffe ("PCI: Add pci_wait_for_pending() (refactor pci_wait_for_pending_transaction())
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Williamson <alex.williamson@gmail.com>
CC: stable@vger.kernel.org # v3.14+
Mikulas Patocka [Sat, 17 May 2014 10:49:22 +0000 (06:49 -0400)]
target: fix memory leak on XCOPY
On each processed XCOPY command, two "kmalloc-512" memory objects are
leaked. These represent two allocations of struct xcopy_pt_cmd in
target_core_xcopy.c.
The reason for the memory leak is that the cmd_kref field is not
initialized (thus, it is zero because the allocations were done with
kzalloc). When we decrement zero kref in target_put_sess_cmd, the result
is not zero, thus target_release_cmd_kref is not called.
This patch fixes the bug by moving kref initialization from
target_get_sess_cmd to transport_init_se_cmd (this function is called from
target_core_xcopy.c, so it will correctly initialize cmd_kref). It can be
easily verified that all code that calls target_get_sess_cmd also calls
transport_init_se_cmd earlier, thus moving kref_init shouldn't introduce
any new problems.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Theodore Ts'o [Sat, 17 May 2014 01:40:41 +0000 (21:40 -0400)]
random: fix BUG_ON caused by accounting simplification
Commit
ee1de406ba6eb1 ("random: simplify accounting logic") simplified
things too much, in that it allows the following to trigger an
overflow that results in a BUG_ON crash:
dd if=/dev/urandom of=/dev/zero bs=
67108707 count=1
Thanks to Peter Zihlstra for discovering the crash, and Hannes
Frederic for analyizing the root cause.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Greg Price <price@mit.edu>
Tuomas Tynkkynen [Fri, 16 May 2014 13:50:20 +0000 (16:50 +0300)]
clk: tegra: Fix wrong value written to PLLE_AUX
The value written to PLLE_AUX was incorrect due to a wrong variable
being used. Without this fix SATA does not work.
Cc: stable@vger.kernel.org
Signed-off-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com>
Tested-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: improved changelog]
Jes Sorensen [Fri, 16 May 2014 20:59:18 +0000 (22:59 +0200)]
staging: rtl8723au: Do not reset wdev->iftype in netdev_close()
wdev->ifdev should be set by .change_virtual_intf(). This solves the
problem of WARN() messages on module unload.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafael J. Wysocki [Fri, 16 May 2014 21:43:56 +0000 (23:43 +0200)]
Merge branch 'acpi-video'
* acpi-video:
ACPI / video: Revert native brightness quirk for ThinkPad T530
Hans de Goede [Fri, 16 May 2014 19:10:41 +0000 (21:10 +0200)]
ACPI / video: Revert native brightness quirk for ThinkPad T530
Seems it helps some users, but causes issues for other users:
https://bugzilla.redhat.com/show_bug.cgi?id=
1089545
So lets drop it for now until we've figured out a better fix.
Fixes:
43d949024425 (ACPI / video: Add use_native_backlight quirks for more systems)
References: https://bugzilla.redhat.com/show_bug.cgi?id=
1089545
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jes Sorensen [Fri, 16 May 2014 08:05:04 +0000 (10:05 +0200)]
staging: rtl8723au: Use correct pipe type for USB interrupts
Use a correct pipe type when filling un interrupt urbs. This should
finally take care of the WARN() messages on the console when USB urbs
are submitted.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Dryomov [Fri, 9 May 2014 14:27:34 +0000 (18:27 +0400)]
crush: decode and initialize chooseleaf_vary_r
Commit
e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part. This
lead to misdirected requests caused by incorrect raw crush mapping
sets.
Fixes: http://tracker.ceph.com/issues/8226
Reported-and-Tested-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Chunwei Chen [Wed, 23 Apr 2014 04:35:09 +0000 (12:35 +0800)]
libceph: fix corruption when using page_count 0 page in rbd
It has been reported that using ZFSonLinux on rbd will result in memory
corruption. The bug report can be found here:
https://github.com/zfsonlinux/spl/issues/241
http://tracker.ceph.com/issues/7790
The reason is that ZFS will send pages with page_count 0 into rbd, which in
turns send them to tcp_sendpage. However, tcp_sendpage cannot deal with
page_count 0, as it will do get_page and put_page, and erroneously free the
page.
This type of issue has been noted before, and handled in iscsi, drbd,
etc. So, rbd should also handle this. This fix address this issue by fall back
to slower sendmsg when page_count 0 detected.
Cc: Sage Weil <sage@inktank.com>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Mark Salter [Thu, 15 May 2014 14:19:22 +0000 (15:19 +0100)]
arm64: fix pud_huge() for 2-level pagetables
The following happens when trying to run a kvm guest on a kernel
configured for 64k pages. This doesn't happen with 4k pages:
BUG: failure at include/linux/mm.h:297/put_page_testzero()!
Kernel panic - not syncing: BUG!
CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1
Call trace:
[<
fffffe0000096034>] dump_backtrace+0x0/0x16c
[<
fffffe00000961b4>] show_stack+0x14/0x1c
[<
fffffe000066e648>] dump_stack+0x84/0xb0
[<
fffffe0000668678>] panic+0xf4/0x220
[<
fffffe000018ec78>] free_reserved_area+0x0/0x110
[<
fffffe000018edd8>] free_pages+0x50/0x88
[<
fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40
[<
fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44
[<
fffffe00000a1854>] kvm_put_kvm+0xf0/0x184
[<
fffffe00000a1938>] kvm_vm_release+0x10/0x1c
[<
fffffe00001edc1c>] __fput+0xb0/0x288
[<
fffffe00001ede4c>] ____fput+0xc/0x14
[<
fffffe00000d5a2c>] task_work_run+0xa8/0x11c
[<
fffffe0000095c14>] do_notify_resume+0x54/0x58
In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page()
on the stage2 pgd which leads to the BUG in put_page_testzero(). This
happens because a pud_huge() test in unmap_range() returns true when it
should always be false with 2-level pages tables used by 64k pages.
This patch removes support for huge puds if 2-level pagetables are
being used.
Signed-off-by: Mark Salter <msalter@redhat.com>
[catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> # v3.11+
Leif Lindholm [Thu, 17 Apr 2014 17:42:00 +0000 (18:42 +0100)]
mips: dts: Fix missing device_type="memory" property in memory nodes
A few platforms lack a 'device_type = "memory"' for their memory
nodes, relying on an old ppc quirk in order to discover its memory.
Add the missing data so that all parsing code can find memory nodes
correctly.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Acked-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Leif Lindholm [Thu, 17 Apr 2014 17:41:59 +0000 (18:41 +0100)]
arm: dts: Fix missing device_type="memory" for ste-ccu8540
The current .dts for ste-ccu8540 lacks a 'device_type = "memory"' for
its memory node, relying on an old ppc quirk in order to discover its
memory. Fix the data so that all parsing code can handle it correctly.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: devicetree@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Andy Grover [Wed, 14 May 2014 22:48:06 +0000 (15:48 -0700)]
target: Don't allow setting WC emulation if device doesn't support
Just like for pSCSI, if the transport sets get_write_cache, then it is
not valid to enable write cache emulation for it. Return an error.
see https://bugzilla.redhat.com/show_bug.cgi?id=
1082675
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Wed, 14 May 2014 20:54:26 +0000 (20:54 +0000)]
iscsi-target: Disable Immediate + Unsolicited Data with ISER Protection
This patch explicitly disables Immediate + Unsolicited Data for ISER
connections during login in iscsi_login_zero_tsih_s2() when protection
has been enabled for the session by the underlying hardware.
This is currently required because protection / signature memory regions
(MRs) expect T10 PI to occur on RDMA READs + RDMA WRITEs transfers, and
not on a immediate data payload associated with ISCSI_OP_SCSI_CMD, or
unsolicited data-out associated with a ISCSI_OP_SCSI_DATA_OUT.
v2 changes:
- Add TARGET_PROT_DOUT_INSERT check (Sagi)
- Add pr_debug noisemaker (Sagi)
- Add goto to avoid early return from MRDSL check (nab)
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Mon, 12 May 2014 19:18:32 +0000 (12:18 -0700)]
tcm_fc: Fix free-after-use regression in ft_free_cmd
This patch fixes a free-after-use regression in ft_free_cmd(), where
ft_sess_put() is called with cmd->sess after percpu_ida_free() has
already released the tag.
Fix this bug by saving the ft_sess pointer ahead of percpu_ida_free(),
and pass it directly to ft_sess_put().
The regression was originally introduced in v3.13-rc1 commit:
commit
5f544cfac956971099e906f94568bc3fd1a7108a
Author: Nicholas Bellinger <nab@daterainc.com>
Date: Mon Sep 23 12:12:42 2013 -0700
tcm_fc: Convert to per-cpu command map pre-allocation of ft_cmd
Reported-by: Jun Wu <jwu@stormojo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: <stable@vger.kernel.org> #3.13+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 1 May 2014 20:44:56 +0000 (13:44 -0700)]
iscsi-target: Change BUG_ON to REJECT in iscsit_process_nop_out
This patch changes an incorrect use of BUG_ON to instead generate a
REJECT + PROTOCOL_ERROR in iscsit_process_nop_out() code. This case
can occur with traditional TCP where a flood of zeros in the data
stream can reach this block for what is presumed to be a NOP-OUT with
a solicited reply, but without a valid iscsi_cmd pointer.
This incorrect BUG_ON was introduced during the v3.11-rc timeframe
with the following commit:
commit
778de368964c5b7e8100cde9f549992d521e9c89
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Fri Jun 14 16:07:47 2013 -0700
iscsi/isert-target: Refactor ISCSI_OP_NOOP RX handling
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Sagi Grimberg [Tue, 29 Apr 2014 10:13:47 +0000 (13:13 +0300)]
Target/iscsi,iser: Avoid accepting transport connections during stop stage
When the target is in stop stage, iSER transport initiates RDMA disconnects.
The iSER initiator may wish to establish a new connection over the
still existing network portal. In this case iSER transport should not
accept and resume new RDMA connections. In order to learn that, iscsi_np
is added with enabled flag so the iSER transport can check when deciding
weather to accept and resume a new connection request.
The iscsi_np is enabled after successful transport setup, and disabled
before iscsi_np login threads are cleaned up.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Sagi Grimberg [Tue, 29 Apr 2014 10:13:45 +0000 (13:13 +0300)]
Target/iser: Fix iscsit_accept_np and rdma_cm racy flow
RDMA CM and iSCSI target flows are asynchronous and completely
uncorrelated. Relying on the fact that iscsi_accept_np will be called
after CM connection request event and will wait for it is a mistake.
When attempting to login to a few targets this flow is racy and
unpredictable, but for parallel login to dozens of targets will
race and hang every time.
The correct synchronizing mechanism in this case is pending on
a semaphore rather than a wait_for_event. We keep the pending
interruptible for iscsi_np cleanup stage.
(Squash patch to remove dead code into parent - nab)
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Sagi Grimberg [Tue, 29 Apr 2014 10:13:44 +0000 (13:13 +0300)]
Target/iser: Fix wrong connection requests list addition
Should be adding list_add_tail($new, $head) and not
the other way around.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Tue, 15 Apr 2014 21:13:12 +0000 (14:13 -0700)]
target: Allow non-supporting backends to set pi_prot_type to 0
Userspace tools assume if a value is read from configfs, it is valid
and will not cause an error if the same value is written back. The only
valid value for pi_prot_type for backends not supporting DIF is 0, so allow
this particular value to be set without returning an error.
Reported-by: Krzysztof Chojnowski <frirajder@gmail.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
John David Anglin [Sun, 11 May 2014 22:40:50 +0000 (18:40 -0400)]
parisc: Improve LWS-CAS performance
The attached change significantly improves the performance of the LWS-CAS code
in syscall.S.
This allows a number of packages to build (e.g., zeromq3, gtest and libxs)
that previously failed because slow LWS-CAS performance under contention. In
particular, interrupts taken while the lock was taken degraded performance
significantly.
The change does the following:
1) Disables interrupts around the CAS operation, and
2) Changes the loads and stores to use the ordered completer, "o", on
PA 2.0. "o" and "ma" with a zero offset are equivalent. The latter is
accepted on both PA 1.X and 2.0.
The use of ordered loads and stores probably makes no difference on all
existing hardware, but it seemed pedantically correct. In particular, the CAS
operation must complete before LDCW lock is released. As written before, a
processor could reorder the operations.
I don't believe the period interrupts are disabled is long enough to
significantly increase interrupt latency. For example, the TLB insert code is
longer. Worst case is a memory fault in the CAS operation.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Mon, 5 May 2014 16:07:12 +0000 (18:07 +0200)]
parisc: ratelimit userspace segfault printing
Ratelimit printing of userspace segfaults and make it runtime
configurable via the /proc/sys/debug/exception-trace variable. This
should resolve syslog from growing way too fast and thus prevents
possible system service attacks.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 3.13+
Marcel Apfelbaum [Thu, 15 May 2014 18:42:49 +0000 (12:42 -0600)]
PCI: shpchp: Check bridge's secondary (not primary) bus speed
When a new device is added below a hotplug bridge, the bridge's secondary
bus speed and the device's bus speed must match. The shpchp driver
previously checked the bridge's *primary* bus speed, not the secondary bus
speed.
This caused hot-add errors like:
shpchp 0000:00:03.0: Speed of bus ff and adapter 0 mismatch
Check the secondary bus speed instead.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=75251
Fixes:
3749c51ac6c1 ("PCI: Make current and maximum bus speeds part of the PCI core")
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: stable@vger.kernel.org # v2.6.34+