Marcelo Tosatti [Wed, 14 May 2014 15:43:24 +0000 (12:43 -0300)]
KVM: x86: disable master clock if TSC is reset during suspend
Updating system_time from the kernel clock once master clock
has been enabled can result in time backwards event, in case
kernel clock frequency is lower than TSC frequency.
Disable master clock in case it is necessary to update it
from the resume path.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 13 May 2014 16:15:16 +0000 (18:15 +0200)]
Merge tag 'signed-for-3.15' of git://github.com/agraf/linux-2.6 into kvm-master
Patch queue for 3.15 - 2014-05-12
This request includes a few bug fixes that really shouldn't wait for the next
release.
It fixes KVM on 32bit PowerPC when built as module. It also fixes the PV KVM
acceleration when NX gets honored by the host. Furthermore we fix transactional
memory support and numa support on HV KVM.
Paolo Bonzini [Wed, 7 May 2014 09:20:54 +0000 (11:20 +0200)]
KVM: vmx: disable APIC virtualization in nested guests
While running a nested guest, we should disable APIC virtualization
controls (virtualized APIC register accesses, virtual interrupt
delivery and posted interrupts), because we do not expose them to
the nested guest.
Reported-by: Hu Yaohui <loki2441@gmail.com>
Suggested-by: Abel Gordon <abel@stratoscale.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Mon, 5 May 2014 01:14:42 +0000 (18:14 -0700)]
Linux 3.15-rc4
Linus Torvalds [Sun, 4 May 2014 21:36:52 +0000 (14:36 -0700)]
Merge tag 'locks-v3.15-3' of git://git.samba.org/jlayton/linux
Pull file locking change from Jeff Layton:
"Only an email address change to the MAINTAINERS file"
* tag 'locks-v3.15-3' of git://git.samba.org/jlayton/linux:
MAINTAINERS: email address change for Jeff Layton
Linus Torvalds [Sun, 4 May 2014 21:34:50 +0000 (14:34 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"These are mostly arm64 fixes with an additional arm(64) platform fix
for the initialisation of vexpress clocks (the latter only affecting
arm64; the arch/arm64 code is SoC agnostic and does not rely on early
SoC-specific calls)
- vexpress platform clocks initialisation moved earlier following the
arm64 move of of_clk_init() call in a previous commit
- Default DMA ops changed to non-coherent to preserve compatibility
with 32-bit ARM DT files. The "dma-coherent" property can be used
to explicitly mark a device coherent. The Applied Micro DT file
has been updated to avoid DMA cache maintenance for the X-Gene SATA
controller (the only arm64 related driver with such assumption in
-rc mainline)
- Fixmap correction for earlyprintk
- kern_addr_valid() fix for huge pages"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
vexpress: Initialise the sysregs before setting up the clocks
arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherent
arm64: Use bus notifiers to set per-device coherent DMA ops
arm64: Make default dma_ops to be noncoherent
arm64: fixmap: fix missing sub-page offset for earlyprintk
arm64: Fix for the arm64 kern_addr_valid() function
Linus Torvalds [Sun, 4 May 2014 21:31:51 +0000 (14:31 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is two patches both fixing bugs in drivers (virtio-scsi and
mpt2sas) causing an oops in certain circumstances"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] virtio-scsi: Skip setting affinity on uninitialized vq
[SCSI] mpt2sas: Don't disable device twice at suspend.
Catalin Marinas [Mon, 28 Apr 2014 16:08:37 +0000 (17:08 +0100)]
vexpress: Initialise the sysregs before setting up the clocks
Following arm64 commit
bc3ee18a7a57 (arm64: init: Move of_clk_init to
time_init()), vexpress_osc_of_setup() is called via of_clk_init() long
before initcalls are issued. Initialising the vexpress oscillators
requires the vespress sysregs to be already initialised, so this patch
adds an explicit call to vexpress_sysreg_of_early_init() in vexpress
oscillator setup function.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Mike Turquette <mturquette@linaro.org>
Catalin Marinas [Fri, 25 Apr 2014 15:39:49 +0000 (16:39 +0100)]
arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherent
Since the default DMA ops for arm64 are non-coherent, mark the X-Gene
controller explicitly as dma-coherent to avoid additional cache
maintenance.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Loc Ho <lho@apm.com>
Catalin Marinas [Fri, 25 Apr 2014 14:31:45 +0000 (15:31 +0100)]
arm64: Use bus notifiers to set per-device coherent DMA ops
Recently, the default DMA ops have been changed to non-coherent for
alignment with 32-bit ARM platforms (and DT files). This patch adds bus
notifiers to be able to set the coherent DMA ops (with no cache
maintenance) for devices explicitly marked as coherent via the
"dma-coherent" DT property.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ritesh Harjani [Wed, 23 Apr 2014 05:29:46 +0000 (06:29 +0100)]
arm64: Make default dma_ops to be noncoherent
Currently arm64 dma_ops is by default made coherent which makes it
opposite in default policy from arm.
Make default dma_ops to be noncoherent (same as arm), as currently there
aren't any dma-capable drivers which assumes coherent ops
Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Marc Zyngier [Mon, 28 Apr 2014 18:50:06 +0000 (19:50 +0100)]
arm64: fixmap: fix missing sub-page offset for earlyprintk
Commit
d57c33c5daa4 (add generic fixmap.h) added (among other
similar things) set_fixmap_io to deal with early ioremap of devices.
More recently, commit
bf4b558eba92 (arm64: add early_ioremap support)
converted the arm64 earlyprintk to use set_fixmap_io. A side effect of
this conversion is that my virtual machines have stopped booting when
I pass "earlyprintk=uart8250-8bit,0x3f8" to the guest kernel.
Turns out that the new earlyprintk code doesn't care at all about
sub-page offsets, and just assumes that the earlyprintk device will
be page-aligned. Obviously, that doesn't play well with the above example.
Further investigation shows that set_fixmap_io uses __set_fixmap instead
of __set_fixmap_offset. A fix is to introduce a set_fixmap_offset_io that
uses the latter, and to remove the superflous call to fix_to_virt
(which only returns the value that set_fixmap_io has already given us).
With this applied, my VMs are back in business. Tested on a Cortex-A57
platform with kvmtool as platform emulation.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Dave Anderson [Tue, 15 Apr 2014 17:53:24 +0000 (18:53 +0100)]
arm64: Fix for the arm64 kern_addr_valid() function
Fix for the arm64 kern_addr_valid() function to recognize
virtual addresses in the kernel logical memory map. The
function fails as written because it does not check whether
the addresses in that region are mapped at the pmd level to
2MB or 512MB pages, continues the page table walk to the
pte level, and issues a garbage value to pfn_valid().
Tested on 4K-page and 64K-page kernels.
Signed-off-by: Dave Anderson <anderson@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Sat, 3 May 2014 15:32:48 +0000 (08:32 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"This udpate delivers:
- A fix for dynamic interrupt allocation on x86 which is required to
exclude the GSI interrupts from the dynamic allocatable range.
This was detected with the newfangled tablet SoCs which have GPIOs
and therefor allocate a range of interrupts. The MSI allocations
already excluded the GSI range, so we never noticed before.
- The last missing set_irq_affinity() repair, which was delayed due
to testing issues
- A few bug fixes for the armada SoC interrupt controller
- A memory allocation fix for the TI crossbar interrupt controller
- A trivial kernel-doc warning fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: irq-crossbar: Not allocating enough memory
irqchip: armanda: Sanitize set_irq_affinity()
genirq: x86: Ensure that dynamic irq allocation does not conflict
linux/interrupt.h: fix new kernel-doc warnings
irqchip: armada-370-xp: Fix releasing of MSIs
irqchip: armada-370-xp: implement the ->check_device() msi_chip operation
irqchip: armada-370-xp: fix invalid cast of signed value into unsigned variable
Linus Torvalds [Sat, 3 May 2014 15:31:45 +0000 (08:31 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"This update brings along:
- Two fixes for long standing bugs in the hrtimer code, one which
prevents remote enqueuing and the other preventing arbitrary delays
after a interrupt hang was detected
- A fix in the timer wheel which prevents math overflow
- A fix for a long standing issue with the architected ARM timer
related to the C3STOP mechanism.
- A trivial compile fix for nspire SoC clocksource"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timer: Prevent overflow in apply_slack
hrtimer: Prevent remote enqueue of leftmost timers
hrtimer: Prevent all reprogramming if hang detected
clocksource: nspire: Fix compiler warning
clocksource: arch_arm_timer: Fix age-old arch timer C3STOP detection issue
Linus Torvalds [Sat, 3 May 2014 15:30:44 +0000 (08:30 -0700)]
Merge tag 'trace-fixes-v3.15-rc3' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This is a small fix where the trigger code used the wrong
rcu_dereference(). It required rcu_dereference_sched() instead of the
normal rcu_dereference(). It produces a nasty RCU lockdep splat due
to the incorrect rcu notation"
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* tag 'trace-fixes-v3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Use rcu_dereference_sched() for trace event triggers
Steven Rostedt (Red Hat) [Fri, 2 May 2014 17:30:04 +0000 (13:30 -0400)]
tracing: Use rcu_dereference_sched() for trace event triggers
As trace event triggers are now part of the mainline kernel, I added
my trace event trigger tests to my test suite I run on all my kernels.
Now these tests get run under different config options, and one of
those options is CONFIG_PROVE_RCU, which checks under lockdep that
the rcu locking primitives are being used correctly. This triggered
the following splat:
===============================
[ INFO: suspicious RCU usage. ]
3.15.0-rc2-test+ #11 Not tainted
-------------------------------
kernel/trace/trace_events_trigger.c:80 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
4 locks held by swapper/1/0:
#0: ((&(&j_cdbs->work)->timer)){..-...}, at: [<
ffffffff8104d2cc>] call_timer_fn+0x5/0x1be
#1: (&(&pool->lock)->rlock){-.-...}, at: [<
ffffffff81059856>] __queue_work+0x140/0x283
#2: (&p->pi_lock){-.-.-.}, at: [<
ffffffff8106e961>] try_to_wake_up+0x2e/0x1e8
#3: (&rq->lock){-.-.-.}, at: [<
ffffffff8106ead3>] try_to_wake_up+0x1a0/0x1e8
stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc2-test+ #11
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
0000000000000001 ffff88007e083b98 ffffffff819f53a5 0000000000000006
ffff88007b0942c0 ffff88007e083bc8 ffffffff81081307 ffff88007ad96d20
0000000000000000 ffff88007af2d840 ffff88007b2e701c ffff88007e083c18
Call Trace:
<IRQ> [<
ffffffff819f53a5>] dump_stack+0x4f/0x7c
[<
ffffffff81081307>] lockdep_rcu_suspicious+0x107/0x110
[<
ffffffff810ee51c>] event_triggers_call+0x99/0x108
[<
ffffffff810e8174>] ftrace_event_buffer_commit+0x42/0xa4
[<
ffffffff8106aadc>] ftrace_raw_event_sched_wakeup_template+0x71/0x7c
[<
ffffffff8106bcbf>] ttwu_do_wakeup+0x7f/0xff
[<
ffffffff8106bd9b>] ttwu_do_activate.constprop.126+0x5c/0x61
[<
ffffffff8106eadf>] try_to_wake_up+0x1ac/0x1e8
[<
ffffffff8106eb77>] wake_up_process+0x36/0x3b
[<
ffffffff810575cc>] wake_up_worker+0x24/0x26
[<
ffffffff810578bc>] insert_work+0x5c/0x65
[<
ffffffff81059982>] __queue_work+0x26c/0x283
[<
ffffffff81059999>] ? __queue_work+0x283/0x283
[<
ffffffff810599b7>] delayed_work_timer_fn+0x1e/0x20
[<
ffffffff8104d3a6>] call_timer_fn+0xdf/0x1be^M
[<
ffffffff8104d2cc>] ? call_timer_fn+0x5/0x1be
[<
ffffffff81059999>] ? __queue_work+0x283/0x283
[<
ffffffff8104d823>] run_timer_softirq+0x1a4/0x22f^M
[<
ffffffff8104696d>] __do_softirq+0x17b/0x31b^M
[<
ffffffff81046d03>] irq_exit+0x42/0x97
[<
ffffffff81a08db6>] smp_apic_timer_interrupt+0x37/0x44
[<
ffffffff81a07a2f>] apic_timer_interrupt+0x6f/0x80
<EOI> [<
ffffffff8100a5d8>] ? default_idle+0x21/0x32
[<
ffffffff8100a5d6>] ? default_idle+0x1f/0x32
[<
ffffffff8100ac10>] arch_cpu_idle+0xf/0x11
[<
ffffffff8107b3a4>] cpu_startup_entry+0x1a3/0x213
[<
ffffffff8102a23c>] start_secondary+0x212/0x219
The cause is that the triggers are protected by rcu_read_lock_sched() but
the data is dereferenced with rcu_dereference() which expects it to
be protected with rcu_read_lock(). The proper reference should be
rcu_dereference_sched().
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Sat, 3 May 2014 01:16:31 +0000 (18:16 -0700)]
Merge tag 'pm+acpi-3.15-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"A bunch of regression fixes this time. They fix two regressions in
the PNP subsystem, one in the ACPI processor driver and one in the
ACPI EC driver, four cpufreq driver regressions and an unrelated bug
in one of the drivers. The regressions are recent or introduced in
3.14.
Specifics:
- There are two bugs in the ACPI PNP core that cause errors to be
returned if optional ACPI methods are not present. After an ACPI
core change made in 3.14 one of those errors leads to serial port
suspend failures on some systems. Fix from Rafael J Wysocki.
- A recently added PNP quirk related to Intel chipsets intorduced a
build error in unusual configurations (PNP without PCI). Fix from
Bjorn Helgaas.
- An ACPI EC workaround related to system suspend on Samsung machines
added in 3.14 introduced a race causing some valid EC events to be
discarded. Fix from Kieran Clancy.
- The acpi-cpufreq driver fails to load on some systems after a 3.14
commit related to APIC ID parsing that overlooked one corner case.
Fix from Lan Tianyu.
- Fix for a recently introduced build problem in the ppc-corenet
cpufreq driver from Tim Gardner.
- A recent cpufreq core change to ensure serialization of frequency
transitions for drivers with a ->target_index() callback overlooked
the fact that some of those drivers had been doing operations
introduced by it into the core already by themselves. That
resulted in a mess in which the core and the drivers try to do the
same thing and block each other which leads to deadlocks. Fixes
for the powernow-k7, powernow-k6, and longhaul cpufreq drivers from
Srivatsa S Bhat.
- Fix for a computational error in the powernow-k6 cpufreq driver
from Srivatsa S Bhat"
* tag 'pm+acpi-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / processor: Fix failure of loading acpi-cpufreq driver
PNP / ACPI: Do not return errors if _DIS or _SRS are not present
PNP: Fix compile error in quirks.c
ACPI / EC: Process rather than discard events in acpi_ec_clear
cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error
cpufreq: powernow-k7: Fix double invocation of cpufreq_freq_transition_begin/end
cpufreq: powernow-k6: Fix double invocation of cpufreq_freq_transition_begin/end
cpufreq: powernow-k6: Fix incorrect comparison with max_multipler
cpufreq: longhaul: Fix double invocation of cpufreq_freq_transition_begin/end
Linus Torvalds [Sat, 3 May 2014 01:12:54 +0000 (18:12 -0700)]
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull driver core deferred probe fix from Grant Likely:
"Drivercore race condition fix (exposed by devicetree)
This branch fixes a bug where a device can get stuck in the deferred
list even though all its dependencies are met. The bug has existed
for a long time, but new platform conversions to device tree have
exposed it. This patch is needed to get those platforms working.
This was the pending bug fix I mentioned in my previous pull request.
Normally this would go through Greg's tree seeing that it is a
drivercore change, but devicetree exposes the problem. I've discussed
with Greg and he okayed me asking you to pull directly"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
drivercore: deferral race condition fix
Rafael J. Wysocki [Fri, 2 May 2014 22:20:31 +0000 (00:20 +0200)]
Merge branches 'acpi-ec' and 'acpi-processor'
* acpi-ec:
ACPI / EC: Process rather than discard events in acpi_ec_clear
* acpi-processor:
ACPI / processor: Fix failure of loading acpi-cpufreq driver
Rafael J. Wysocki [Fri, 2 May 2014 22:20:18 +0000 (00:20 +0200)]
Merge branch 'pnp'
* pnp:
PNP / ACPI: Do not return errors if _DIS or _SRS are not present
PNP: Fix compile error in quirks.c
Rafael J. Wysocki [Fri, 2 May 2014 22:19:56 +0000 (00:19 +0200)]
Merge branch 'pm-cpufreq'
* pm-cpufreq:
cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error
cpufreq: powernow-k7: Fix double invocation of cpufreq_freq_transition_begin/end
cpufreq: powernow-k6: Fix double invocation of cpufreq_freq_transition_begin/end
cpufreq: powernow-k6: Fix incorrect comparison with max_multipler
cpufreq: longhaul: Fix double invocation of cpufreq_freq_transition_begin/end
Linus Torvalds [Fri, 2 May 2014 21:14:02 +0000 (14:14 -0700)]
Merge tag 'dm-3.15-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"A few dm-thinp fixes for changes merged in 3.15-rc1.
A dm-verity fix for an immutable biovec regression that affects 3.14+.
A dm-cache fix to properly quiesce when using writethrough mode (3.14+)"
* tag 'dm-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: fix writethrough mode quiescing in cache_map
dm thin: use INIT_WORK_ONSTACK in noflush_work to avoid ODEBUG warning
dm verity: fix biovecs hash calculation regression
dm thin: fix rcu_read_lock being held in code that can sleep
dm thin: irqsave must always be used with the pool->lock spinlock
Linus Torvalds [Fri, 2 May 2014 21:04:52 +0000 (14:04 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"Two very small changes: one fix for the vSMP Foundation platform, and
one to help LLVM not choke on options it doesn't understand (although
it probably should)"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vsmp: Fix irq routing
x86: LLVMLinux: Wrap -mno-80387 with cc-option
Linus Torvalds [Fri, 2 May 2014 16:26:09 +0000 (09:26 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- Fix for a Haswell regression in nested virtualization, introduced
during the merge window.
- A fix from Oleg to async page faults.
- A bunch of small ARM changes.
- A trivial patch to use the new MSI-X API introduced during the merge
window.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address.
KVM: arm/arm64: vgic: fix GICD_ICFGR register accesses
KVM: async_pf: mm->mm_users can not pin apf->mm
KVM: ARM: vgic: Fix sgi dispatch problem
MAINTAINERS: co-maintainance of KVM/{arm,arm64}
arm: KVM: fix possible misalignment of PGDs and bounce page
KVM: x86: Check for host supported fields in shadow vmcs
kvm: Use pci_enable_msix_exact() instead of pci_enable_msix()
ARM: KVM: disable KVM in Kconfig on big-endian systems
Linus Torvalds [Fri, 2 May 2014 16:25:32 +0000 (09:25 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Two bug fixes, one to fix a potential information leak in the BPF jit
and common-io-layer fix for old firmware levels"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH
s390/chsc: fix SEI usage on old FW levels
Linus Torvalds [Fri, 2 May 2014 00:52:42 +0000 (17:52 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
"cxgb4 hardware driver fixes"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter
RDMA/cxgb4: Only allow kernel db ringing for T4 devs
RDMA/cxgb4: Force T5 connections to use TAHOE congestion control
RDMA/cxgb4: Fix endpoint mutex deadlocks
Linus Torvalds [Thu, 1 May 2014 22:54:44 +0000 (15:54 -0700)]
Merge branch 'parisc-3.15-3' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Drop the architecture-specifc value for_STK_LIM_MAX to fix stack
related problems with GNU make"
* 'parisc-3.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Use generic uapi/asm/resource.h file
parisc: remove _STK_LIM_MAX override
Mike Snitzer [Thu, 1 May 2014 20:14:24 +0000 (16:14 -0400)]
dm cache: fix writethrough mode quiescing in cache_map
Commit
2ee57d58735 ("dm cache: add passthrough mode") inadvertently
removed the deferred set reference that was taken in cache_map()'s
writethrough mode support. Restore taking this reference.
This issue was found with code inspection.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org # 3.13+
Linus Torvalds [Thu, 1 May 2014 18:28:03 +0000 (11:28 -0700)]
Merge tag 'pinctrl-v3.15-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here is a small set of pin control fixes for the v3.15 series. All
are individual driver fixes and quite self-contained. One of them
tagged for stable.
- Signedness bug in the TB10x
- GPIO inversion fix for the AS3722
- Clear pending pin interrups enabled in the bootloader in the
pinctrl-single driver
- Minor pin definition fixes for the PFC/Renesas driver"
* tag 'pinctrl-v3.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
sh-pfc: r8a7791: Fix definition of MOD_SEL3
sh-pfc: r8a7790: Fix definition of IPSR5
pinctrl: single: Clear pin interrupts enabled by bootloader
pinctrl: as3722: fix handling of GPIO invert bit
pinctrl/TB10x: Fix signedness bug
Linus Torvalds [Thu, 1 May 2014 17:35:01 +0000 (10:35 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull module fixes from Rusty Russell:
"Fixed one missing place for the new taint flag, and remove a warning
giving only false positives (now we finally figured out why)"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: remove warning about waiting module removal.
Fix: tracing: use 'E' instead of 'X' for unsigned module taint flag
Helge Deller [Tue, 29 Apr 2014 14:13:22 +0000 (16:13 +0200)]
parisc: Use generic uapi/asm/resource.h file
Signed-off-by: Helge Deller <deller@gmx.de>
John David Anglin [Sun, 27 Apr 2014 20:20:47 +0000 (16:20 -0400)]
parisc: remove _STK_LIM_MAX override
There are only a couple of architectures that override _STK_LIM_MAX to
a non-infinity value. This changes the stack allocation semantics in
subtle ways. For example, GNU make changes its stack allocation to the
hard maximum defined by _STK_LIM_MAX. As a results, threads executed
by processes running under make are allocated a stack size of
_STK_LIM_MAX rather than a sensible default value. This causes various
thread stress tests to fail when they can't muster more than about 50
threads.
The attached change implements the default behavior used by the
majority of architectures.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Carlos O'Donell <carlos@systemhalted.org>
Cc: stable@vger.kernel.org # 3.14
Signed-off-by: Helge Deller <deller@gmx.de>
Vineet Gupta [Fri, 18 Apr 2014 08:08:35 +0000 (13:38 +0530)]
Hexagon: Delete stale barrier.h
Commit
93ea02bb8435 ("arch: Clean up asm/barrier.h implementations")
wired generic barrier.h for hexagon, but failed to delete the existing
file.
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Compile-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 1 May 2014 16:50:58 +0000 (09:50 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, plus an Intel RAPL PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tests x86: Fix stack map lookup in dwarf unwind test
perf x86: Fix perf to use non-executable stack, again
perf tools: Remove extra '/' character in events file path
perf machine: Search for modules in %s/lib/modules/%s
perf tests: Add static build make test
perf tools: Fix bfd dependency libraries detection
perf tools: Use LDFLAGS instead of ALL_LDFLAGS
perf/x86: Fix RAPL rdmsrl_safe() usage
tools lib traceevent: Fix memory leak in pretty_print()
tools lib traceevent: Fix backward compatibility macros for pevent filter enums
perf tools: Disable libdw unwind for all but x86 arch
perf tests x86: Fix memory leak in sample_ustack()
Linus Torvalds [Thu, 1 May 2014 15:59:49 +0000 (08:59 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix Tjmax detection in coretemp driver"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
Revert "hwmon: (coretemp) Refine TjMax detection"
H. Peter Anvin [Wed, 30 Apr 2014 21:22:19 +0000 (14:22 -0700)]
word-at-a-time: simplify big-endian zero_bytemask macro
This is simpler and cleaner. Depending on architecture, a smart
compiler may or may not generate the same code.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 1 May 2014 15:54:03 +0000 (08:54 -0700)]
Merge git://git.kvack.org/~bcrl/aio-fixes
Pull aio fixes from Ben LaHaise:
"The first change from Anatol fixes a regression where io_destroy() no
longer waits for outstanding aios to complete. The second corrects a
memory leak in an error path for vectored aio operations.
Both of these bug fixes should be queued up for stable as well"
* git://git.kvack.org/~bcrl/aio-fixes:
aio: fix potential leak in aio_run_iocb().
aio: block io_destroy() until all context requests are completed
Leon Yu [Thu, 1 May 2014 03:31:28 +0000 (03:31 +0000)]
aio: fix potential leak in aio_run_iocb().
iovec should be reclaimed whenever caller of rw_copy_check_uvector() returns,
but it doesn't hold when failure happens right after aio_setup_vectored_rw().
Fix that in a such way to avoid hairy goto.
Signed-off-by: Leon Yu <chianglungyu@gmail.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Guenter Roeck [Wed, 30 Apr 2014 21:08:14 +0000 (14:08 -0700)]
Revert "hwmon: (coretemp) Refine TjMax detection"
This reverts commit
9fb6c9c73b11bef65ba80a362547fd116c1e1c9d.
Tjmax on some Intel CPUs is below 85 degrees C. One known example is
L5630 with Tjmax of 71 degrees C. There are other Xeon processors with
Tjmax of 70 or 80 degrees C. Also, the Intel IA32 System Programming
document states that the temperature target is in bits 23:16 of MSR 0x1a2
(MSR_TEMPERATURE_TARGET), which is 8 bits, not 7.
So even if turbostat uses similar checks to validate Tjmax, there is no
evidence that the checks are actually required. On the contrary, the
checks are known to cause problems and therefore need to be removed.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=75071.
Fixes:
9fb6c9c hwmon: (coretemp) Refine TjMax detection
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Ingo Molnar [Thu, 1 May 2014 06:22:08 +0000 (08:22 +0200)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/jolsa/perf into perf/urgent
Pull perf/urgent fixes from Jiri Olsa:
* Fix perf to use non-executable stack, again (Mathias Krause)
* Remove extra '/' character in events file path (Xia Kaixu)
* Search for modules in %s/lib/modules/%s (Richard Yao)
* Build related fixies plus static build test (Jiri Olsa)
* Fix stack map lookup in dwarf unwind test (Jiri Olsa)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lan Tianyu [Wed, 30 Apr 2014 07:46:33 +0000 (15:46 +0800)]
ACPI / processor: Fix failure of loading acpi-cpufreq driver
According commit
d640113fe (ACPI: processor: fix acpi_get_cpuid for UP
processor), BIOS may not provide _MAT or MADT tables and acpi_get_apicid()
always returns -1. For these cases, original code will pass apic_id with
vaule of -1 to acpi_map_cpuid() and it will check the acpi_id. If acpi_id
is equal to zero, ignores apic_id and return zero for CPU0.
Commit
b981513 (ACPI / scan: bail out early if failed to parse APIC
ID for CPU) changed the behavior. Return ENODEV when find apic_id is
less than zero after calling acpi_get_apicid(). This causes acpi-cpufreq
driver fails to be loaded on some machines. This patch is to fix it.
Fixes:
b981513f806d (ACPI / scan: bail out early if failed to parse APIC ID for CPU)
References: https://bugzilla.kernel.org/show_bug.cgi?id=73781
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Reported-and-tested-by: KATO Hiroshi <katoh@mikage.ne.jp>
Reported-and-tested-by: Stuart Foster <smf.linux@ntlworld.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Wed, 30 Apr 2014 20:36:33 +0000 (22:36 +0200)]
PNP / ACPI: Do not return errors if _DIS or _SRS are not present
The ACPI PNP subsystem returns errors from pnpacpi_set_resources()
and pnpacpi_disable_resources() if the _SRS or _DIS methods are not
present, respectively, but it should not do that, because those
methods are optional. For this reason, modify pnpacpi_set_resources()
and pnpacpi_disable_resources(), respectively, to ignore missing _SRS
or _DIS.
This problem has been uncovered by commit
202317a573b2 (ACPI / scan:
Add acpi_device objects for all device nodes in the namespace) and
manifested itself by causing serial port suspend to fail on some
systems.
Fixes:
202317a573b2 (ACPI / scan: Add acpi_device objects for all device nodes in the namespace)
References: https://bugzilla.kernel.org/show_bug.cgi?id=74371
Reported-by: wxg4net <wxg4net@gmail.com>
Reported-and-tested-by: <nonproffessional@gmail.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Paolo Bonzini [Wed, 30 Apr 2014 19:25:09 +0000 (21:25 +0200)]
Merge tag 'kvm-arm-for-3.15-rc4' of git://git./linux/kernel/git/kvmarm/kvmarm into kvm-master
First round of KVM/ARM Fixes for 3.15
Includes vgic fixes, a possible kernel corruption bug due to
misalignment of pages and disabling of KVM in KConfig on big-endian
systems, because the last one breaks the build.
Jeff Layton [Wed, 30 Apr 2014 17:28:17 +0000 (13:28 -0400)]
MAINTAINERS: email address change for Jeff Layton
jlayton@redhat.com -> jlayton@poochiereds.net
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Vineet Gupta [Wed, 30 Apr 2014 09:56:45 +0000 (15:26 +0530)]
ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe
There was a very small race window where resume to kernel mode from a
Exception Path (or pure kernel mode which is true for most of ARC
exceptions anyways), was not disabling interrupts in restore_regs,
clobbering the exception regs
Anton found the culprit call flow (after many sleepless nights)
| 1. we got a Trap from user land
| 2. started to service it.
| 3. While doing some stuff on user-land memory (I think it is padzero()),
| we got a DataTlbMiss
| 4. On return from it we are taking "resume_kernel_mode" path
| 5. NEED_RESHED is not set, so we go to "return from exception" path in
| restore regs.
| 6. there seems to be IRQ happening
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: <stable@vger.kernel.org> #3.10, 3.12, 3.13, 3.14
Cc: Anton Kolesov <Anton.Kolesov@synopsys.com>
Cc: Francois Bedard <Francois.Bedard@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 30 Apr 2014 15:15:59 +0000 (08:15 -0700)]
Merge tag 'sound-3.15-rc4' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few collections of small eggs that have been gathered during the
Easter holidays. Mostly small ASoC fixes, with a HD-audio quirk and a
workaround for Nvidia controller"
* tag 'sound-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Suppress CORBRP clear on Nvidia controller chips
ALSA: hda - add headset mic detect quirk for a Dell laptop
ASoC: jz4740: Remove Makefile entry for removed file
ASoC: Intel: Fix audio crash due to negative address offset
ASoC: dapm: Fix widget double free with auto-disable DAPM kcontrol
ASoC: Intel: Fix incorrect sizeof() in sst_hsw_stream_get_volume()
ASoC: Intel: some incorrect sizeof() usages
ASoC: cs42l73: Convert to use devm_gpio_request_one
ASoC: cs42l52: Convert to use devm_gpio_request_one
ASoC: tlv320aic31xx: document that the regulators are mandatory
ASoC: fsl_spdif: Fix wrong OFFSET of STC_SYSCLK_DIV
ASoC: alc5623: Fix regmap endianness
ASoC: tlv320aic3x: fix shared reset pin for DT
ASoC: rsnd: fix clock prepare/unprepare
Jiri Olsa [Wed, 30 Apr 2014 14:39:44 +0000 (16:39 +0200)]
perf tests x86: Fix stack map lookup in dwarf unwind test
Previous commit 'perf x86: Fix perf to use non-executable stack, again'
moved stack map into MAP__VARIABLE map type again. Fixing the dwarf
unwind test stack map lookup appropriately.
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Link: http://lkml.kernel.org/n/tip-ttzyhbe4zls24z7ednkmhvxl@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Mathias Krause [Sun, 27 Apr 2014 16:51:05 +0000 (18:51 +0200)]
perf x86: Fix perf to use non-executable stack, again
arch/x86/tests/regs_load.S is missing the linker note about the stack
requirements, therefore making the linker fall back to an executable
stack. As this object gets linked against the final perf binary, it'll
needlessly end up with an executable stack. Fix this by adding the
appropriate linker note.
Also add a global linker flag to prevent future regressions, as
suggested by Jiri. This way perf won't get an executable stack even if
we fail to add the .GNU-stack linker note to future assembler files.
Though, doing so might create regressions the other way around, when
(statically) linking against libraries needing an executable stack.
But, apparently, regressing in that direction is wanted as it is an
indicator of poor code quality -- or just missing linker notes.
Fixes:
3c8b06f981 ("perf tests x86: Introduce perf_regs_load function")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1398617466-22749-1-git-send-email-minipli@googlemail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Xia Kaixu [Sat, 26 Apr 2014 07:55:12 +0000 (15:55 +0800)]
perf tools: Remove extra '/' character in events file path
The array debugfs_known_mountpoints[] will cause extra '/'
character output.
Remove it.
pre:
$ perf probe -l
/sys/kernel/debug//tracing/uprobe_events file does not exist -
please rebuild kernel with CONFIG_UPROBE_EVENTS.
post:
$ perf probe -l
/sys/kernel/debug/tracing/uprobe_events file does not exist -
please rebuild kernel with CONFIG_UPROBE_EVENTS.
Signed-off-by: Xia Kaixu <xiakaixu@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/535B6660.2060001@huawei.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Richard Yao [Sat, 26 Apr 2014 17:17:55 +0000 (13:17 -0400)]
perf machine: Search for modules in %s/lib/modules/%s
Modules installed outside of the kernel's build system should go into
"%s/lib/modules/%s/extra", but at present, perf will only look at them
when they are in "%s/lib/modules/%s/kernel". Lets encourage good
citizenship by relaxing this requirement to "%s/lib/modules/%s". This
way open source modules that are out-of-tree have no incentive to start
populating a directory reserved for in-kernel modules and I can stop
hex-editing my system's perf binary when profiling OSS out-of-tree
modules.
Feedback from Namhyung Kim correctly revealed that the hex-edits that I
had been doing meant that perf was also traversing the build and source
symlinks in %s/lib/modules/%s. That is undesireable, so we explicitly
exclude them from traversal with a minor tweak to the traversal routine.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Acked-by: Namhyung kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398532675-13684-1-git-send-email-ryao@gentoo.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Jiri Olsa [Tue, 29 Apr 2014 07:53:40 +0000 (09:53 +0200)]
perf tests: Add static build make test
Adding test for building static perf build into the automated
suite. Also available via following commands:
$ make -f tests/make make_static
- make_static: cd . && make -f Makefile DESTDIR=/tmp/tmp.7u5MlB4njo LDFLAGS=-static
$ make -f tests/make make_static_O
- make_static_O: cd . && make -f Makefile O=/tmp/tmp.Ay6r3wEmtX DESTDIR=/tmp/tmp.vK0KQwO0Vi LDFLAGS=-static
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1398760413-7574-1-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Jiri Olsa [Wed, 23 Apr 2014 14:53:25 +0000 (16:53 +0200)]
perf tools: Fix bfd dependency libraries detection
There's false assumption in the library detection code
assuming -liberty and -lz are always present once bfd
is detected. The fails on Ubuntu (14.04) as reported
by Ingo.
Forcing the bdf dependency libraries detection any
time bfd library is detected.
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1398676935-6615-1-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Jiri Olsa [Sun, 27 Apr 2014 09:12:21 +0000 (11:12 +0200)]
perf tools: Use LDFLAGS instead of ALL_LDFLAGS
We no longer use ALL_LDFLAGS, Replacing with LDFLAGS.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1398675770-3109-1-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Jiri Bohac [Fri, 18 Apr 2014 15:23:11 +0000 (17:23 +0200)]
timer: Prevent overflow in apply_slack
On architectures with sizeof(int) < sizeof (long), the
computation of mask inside apply_slack() can be undefined if the
computed bit is > 32.
E.g. with: expires = 0xffffe6f5 and slack = 25, we get:
expires_limit = 0x20000000e
bit = 33
mask = (1 << 33) - 1 /* undefined */
On x86, mask becomes 1 and and the slack is not applied properly.
On s390, mask is -1, expires is set to 0 and the timer fires immediately.
Use 1UL << bit to solve that issue.
Suggested-by: Deborah Townsend <dstownse@us.ibm.com>
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20140418152310.GA13654@midget.suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Leon Ma [Wed, 30 Apr 2014 08:43:10 +0000 (16:43 +0800)]
hrtimer: Prevent remote enqueue of leftmost timers
If a cpu is idle and starts an hrtimer which is not pinned on that
same cpu, the nohz code might target the timer to a different cpu.
In the case that we switch the cpu base of the timer we already have a
sanity check in place, which determines whether the timer is earlier
than the current leftmost timer on the target cpu. In that case we
enqueue the timer on the current cpu because we cannot reprogram the
clock event device on the target.
If the timers base is already the target CPU we do not have this
sanity check in place so we enqueue the timer as the leftmost timer in
the target cpus rb tree, but we cannot reprogram the clock event
device on the target cpu. So the timer expires late and subsequently
prevents the reprogramming of the target cpu clock event device until
the previously programmed event fires or a timer with an earlier
expiry time gets enqueued on the target cpu itself.
Add the same target check as we have for the switch base case and
start the timer on the current cpu if it would become the leftmost
timer on the target.
[ tglx: Rewrote subject and changelog ]
Signed-off-by: Leon Ma <xindong.ma@intel.com>
Link: http://lkml.kernel.org/r/1398847391-5994-1-git-send-email-xindong.ma@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Stuart Hayes [Tue, 29 Apr 2014 22:55:02 +0000 (17:55 -0500)]
hrtimer: Prevent all reprogramming if hang detected
If the last hrtimer interrupt detected a hang it sets hang_detected=1
and programs the clock event device with a delay to let the system
make progress.
If hang_detected == 1, we prevent reprogramming of the clock event
device in hrtimer_reprogram() but not in hrtimer_force_reprogram().
This can lead to the following situation:
hrtimer_interrupt()
hang_detected = 1;
program ce device to Xms from now (hang delay)
We have two timers pending:
T1 expires 50ms from now
T2 expires 5s from now
Now T1 gets canceled, which causes hrtimer_force_reprogram() to be
invoked, which in turn programs the clock event device to T2 (5
seconds from now).
Any hrtimer_start after that will not reprogram the hardware due to
hang_detected still being set. So we effectivly block all timers until
the T2 event fires and cleans up the hang situation.
Add a check for hang_detected to hrtimer_force_reprogram() which
prevents the reprogramming of the hang delay in the hardware
timer. The subsequent hrtimer_interrupt will resolve all outstanding
issues.
[ tglx: Rewrote subject and changelog and fixed up the comment in
hrtimer_force_reprogram() ]
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Link: http://lkml.kernel.org/r/53602DC6.2060101@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 30 Apr 2014 00:51:26 +0000 (17:51 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Smattering of fixes, i915, exynos, tegra, msm, vmwgfx.
A bit of framebuffer reference counting fallout fixes, i915 GM45
regression fix, DVI regression fix, vmware info leak between processes
fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/exynos: use %pad for dma_addr_t
drm/exynos: dsi: use IS_ERR() to check devm_ioremap_resource() results
MAINTAINERS: update maintainer entry for Exynos DP driver
drm/exynos: balance framebuffer refcount
drm/i915: Move all ring resets before setting the HWS page
drm/i915: Don't WARN nor handle unexpected hpd interrupts on gmch platforms
drm/msm/mdp4: cure for the cursor blues (v2)
drm/msm: default to XR24 rather than AR24
drm/msm: fix memory leak
drm/tegra: restrict plane loops to legacy planes
drm/i915: Allow full PPGTT with param override
drm/i915: Discard BIOS framebuffers too small to accommodate chosen mode
drm/vmwgfx: Make sure user-space can't DMA across buffer object boundaries v2
drm/i915: get power domain in case the BIOS enabled eDP VDD
drm/i915: Don't check gmch state on inherited configs
drm/i915: Allow user modes to exceed DVI 165MHz limit
Jingoo Han [Tue, 22 Apr 2014 05:45:32 +0000 (14:45 +0900)]
drm/exynos: use %pad for dma_addr_t
Use %pad for dma_addr_t, because a dma_addr_t type can vary
based on build options. So, it prevents possible build warnings
in printks.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jingoo Han [Thu, 17 Apr 2014 10:08:40 +0000 (19:08 +0900)]
drm/exynos: dsi: use IS_ERR() to check devm_ioremap_resource() results
devm_ioremap_resource() returns an error pointer, not NULL. Thus,
the result should be checked with IS_ERR().
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jingoo Han [Thu, 10 Apr 2014 11:24:03 +0000 (20:24 +0900)]
MAINTAINERS: update maintainer entry for Exynos DP driver
Recently, Exynos DP driver was moved from drivers/video/exynos/
directory to drivers/gpu/drm/exynos/ directory. So, I update
and add maintainer entry for Exynos DP driver.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Andrzej Hajda [Tue, 15 Apr 2014 13:33:01 +0000 (15:33 +0200)]
drm/exynos: balance framebuffer refcount
exynos_drm_crtc_mode_set assigns primary framebuffer to plane without
taking reference. Then during framebuffer removal it is dereferenced twice,
causing oops. The patch fixes it.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 29 Apr 2014 23:43:43 +0000 (09:43 +1000)]
Merge branch 'vmwgfx-fixes-3.15' of git://people.freedesktop.org/~thomash/linux into drm-next
single security fix, cc'd stable.
* 'vmwgfx-fixes-3.15' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Make sure user-space can't DMA across buffer object boundaries v2
Bjorn Helgaas [Mon, 28 Apr 2014 22:20:34 +0000 (00:20 +0200)]
PNP: Fix compile error in quirks.c
Fix the compile error:
drivers/pnp/quirks.c:393:2: error: implicit declaration of function 'pcibios_bus_to_resource'
that occurs when building with CONFIG_PCI unset. The quirk is only
relevent to Intel devices, so we could use "#if defined(CONFIG_X86) &&
defined(CONFIG_PCI)" instead, but testing CONFIG_X86 is not strictly
necessary.
Fixes:
cb171f7abb9a (PNP: Work around BIOS defects in Intel MCH area reporting)
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Kieran Clancy [Tue, 29 Apr 2014 14:51:20 +0000 (00:21 +0930)]
ACPI / EC: Process rather than discard events in acpi_ec_clear
Address a regression caused by commit
ad332c8a4533:
(ACPI / EC: Clear stale EC events on Samsung systems)
After the earlier patch, there was found to be a race condition on some
earlier Samsung systems (N150/N210/N220). The function acpi_ec_clear was
sometimes discarding a new EC event before its GPE was triggered by the
system. In the case of these systems, this meant that the "lid open"
event was not registered on resume if that was the cause of the wake,
leading to problems when attempting to close the lid to suspend again.
After testing on a number of Samsung systems, both those affected by the
previous EC bug and those affected by the race condition, it seemed that
the best course of action was to process rather than discard the events.
On Samsung systems which accumulate stale EC events, there does not seem
to be any adverse side-effects of running the associated _Q methods.
This patch adds an argument to the static function acpi_ec_sync_query so
that it may be used within the acpi_ec_clear loop in place of
acpi_ec_query_unlocked which was used previously.
With thanks to Stefan Biereigel for reporting the issue, and for all the
people who helped test the new patch on affected systems.
Fixes:
ad332c8a4533 (ACPI / EC: Clear stale EC events on Samsung systems)
References: https://lkml.kernel.org/r/
532FE3B2.
9060808@biereigel-wb.de
References: https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173
Reported-by: Stefan Biereigel <stefan@biereigel.de>
Signed-off-by: Kieran Clancy <clancy.kieran@gmail.com>
Tested-by: Stefan Biereigel <stefan@biereigel.de>
Tested-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Nicolas Porcel <nicolasporcel06@gmail.com>
Tested-by: Maurizio D'Addona <mauritiusdadd@gmail.com>
Tested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Tested-by: Giannis Koutsou <giannis.koutsou@gmail.com>
Tested-by: Kieran Clancy <clancy.kieran@gmail.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thomas Gleixner [Tue, 29 Apr 2014 17:26:58 +0000 (19:26 +0200)]
Merge branch 'clockevents/3.15-fixes' of git://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
clockevent fixes for 3.15 from Daniel Lezcano:
* Lorenzo Pieralizi fixed an issue with the arch_arm_timer where the
C3STOP flag for all the arch can cause some trouble by setting the
flag only if the power domain is not always on
* Alexander Shiyan fixed a compilation by changing the init function
to the right prototype
Thomas Gleixner [Tue, 29 Apr 2014 17:23:22 +0000 (19:23 +0200)]
Merge tag 'mvebu-irqchip-fixes-3.15' of git://git.infradead.org/linux-mvebu into irq/urgent
Bugfixes for armada-370-xp SoC from Jason Cooper:
* Fix invalid cast (signed to unsigned)
* Add missing ->check_device() msi_chip op
* Fix releasing of MSIs
Takashi Iwai [Tue, 29 Apr 2014 16:38:21 +0000 (18:38 +0200)]
ALSA: hda - Suppress CORBRP clear on Nvidia controller chips
The recent commit (
ca460f86521) changed the CORB RP reset procedure to
follow the specification with a couple of sanity checks.
Unfortunately, Nvidia controller chips seem not following this way,
and spew the warning messages like:
snd_hda_intel 0000:00:10.1: CORB reset timeout#1, CORBRP = 0
This patch adds the workaround for such chips. It just skips the new
reset procedure for the known broken chips.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Mike Snitzer [Tue, 29 Apr 2014 15:22:04 +0000 (11:22 -0400)]
dm thin: use INIT_WORK_ONSTACK in noflush_work to avoid ODEBUG warning
Use INIT_WORK_ONSTACK to silence "ODEBUG: object is on stack, but not
annotated".
Reported-by: Zdeněk Kabeláč <zkabelac@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Grant Likely [Tue, 29 Apr 2014 11:05:22 +0000 (12:05 +0100)]
drivercore: deferral race condition fix
When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
when all modules loaded but some driver still stuck in the deferred list
and there is a need for external event to kick the deferred queue to probe
these drivers.
The issue has been observed on embedded systems with CONFIG_PREEMPT enabled,
audio support built as modules and using nfsroot for root filesystem.
The following log fragment shows such sequence when all audio modules
were loaded but the sound card is not present since the machine driver has
failed to probe due to missing dependency during it's probe.
The board is am335x-evmsk (McASP<->tlv320aic3106 codec) with davinci-evm
machine driver:
...
[ 12.615118] davinci-mcasp
4803c000.mcasp: davinci_mcasp_probe: ENTER
[ 12.719969] davinci_evm sound.3: davinci_evm_probe: ENTER
[ 12.725753] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card
[ 12.753846] davinci-mcasp
4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component
[ 12.922051] davinci-mcasp
4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component DONE
[ 12.950839] davinci_evm sound.3: ASoC: platform (null) not registered
[ 12.957898] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card DONE (-517)
[ 13.099026] davinci-mcasp
4803c000.mcasp: Kicking the deferred list
[ 13.177838] davinci-mcasp
4803c000.mcasp: really_probe: probe_count = 2
[ 13.194130] davinci_evm sound.3: snd_soc_register_card failed (-517)
[ 13.346755] davinci_mcasp_driver_init: LEAVE
[ 13.377446] platform sound.3: Driver davinci_evm requests probe deferral
[ 13.592527] platform sound.3: really_probe: probe_count = 0
In the log the machine driver enters it's probe at 12.719969 (this point it
has been removed from the deferred lists). McASP driver already executing
it's probing (since 12.615118).
The machine driver tries to construct the sound card (12.950839) but did
not found one of the components so it fails. After this McASP driver
registers all the ASoC components (the machine driver still in it's probe
function after it failed to construct the card) and the deferred work is
prepared at 13.099026 (note that this time the machine driver is not in the
lists so it is not going to be handled when the work is executing).
Lastly the machine driver exit from it's probe and the core places it to
the deferred list but there will be no other driver going to load and the
deferred queue is not going to be kicked again - till we have external event
like connecting USB stick, etc.
The proposed solution is to try the deferred queue once more when the last
driver is asking for deferring and we had drivers loaded while this last
driver was probing.
This way we can avoid drivers stuck in the deferred queue.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Stable <stable@vger.kernel.org> # v3.4+
Alexander Shiyan [Sat, 26 Apr 2014 05:32:16 +0000 (09:32 +0400)]
clocksource: nspire: Fix compiler warning
CC drivers/clocksource/zevio-timer.o
drivers/clocksource/zevio-timer.c:215:1: warning: comparison of distinct pointer types lacks a cast [enabled by default]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Lorenzo Pieralisi [Tue, 8 Apr 2014 09:04:32 +0000 (10:04 +0100)]
clocksource: arch_arm_timer: Fix age-old arch timer C3STOP detection issue
ARM arch timers are tightly coupled with the CPU logic and lose context
on platform implementing HW power management when cores are powered
down at run-time. Marking the arch timers as C3STOP regardless of power
management capabilities causes issues on platforms with no power management,
since in that case the arch timers cannot possibly enter states where the
timer loses context at runtime and therefore can always be used as a high
resolution clockevent device.
In order to fix the C3STOP issue in a way compliant with how real HW
works, this patch adds a boolean property to the arch timer bindings
to define if the arch timer is managed by an always-on power domain.
This power domain is present on all ARM platforms to date, and manages
HW that must not be turned off, whatever the state of other HW
components (eg power controller). On platforms with no power management
capabilities, it is the only power domain present, which encompasses
and manages power supply for all HW components in the system.
If the timer is powered by the always-on power domain, the always-on
property must be present in the bindings which means that the timer cannot
be shutdown at runtime, so it is not a C3STOP clockevent device.
If the timer binding does not contain the always-on property, the timer is
assumed to be power-gateable, hence it must be defined as a C3STOP
clockevent device.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Magnus Damm <damm@opensource.se>
Cc: Marc Carino <marc.ceeeee@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Alexander Graf [Tue, 29 Apr 2014 10:17:26 +0000 (12:17 +0200)]
KVM guest: Make pv trampoline code executable
Our PV guest patching code assembles chunks of instructions on the fly when it
encounters more complicated instructions to hijack. These instructions need
to live in a section that we don't mark as non-executable, as otherwise we
fault when jumping there.
Right now we put it into the .bss section where it automatically gets marked
as non-executable. Add a check to the NX setting function to ensure that we
leave these particular pages executable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Haibin Wang [Tue, 29 Apr 2014 06:49:17 +0000 (14:49 +0800)]
KVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address.
Currently below check in vgic_ioaddr_overlap will always succeed,
because the vgic dist base and vgic cpu base are still kept UNDEF
after initialization. The code as follows will be return forever.
if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
return 0;
So, before invoking the vgic_ioaddr_overlap, it needs to set the
corresponding base address firstly.
Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Hariprasad S [Fri, 25 Apr 2014 19:21:18 +0000 (00:51 +0530)]
RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Steve Wise [Thu, 24 Apr 2014 19:32:04 +0000 (14:32 -0500)]
RDMA/cxgb4: Only allow kernel db ringing for T4 devs
The whole db drop avoidance stuff is for T4 only. So we cannot allow
that to be enabled for T5 devices.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Steve Wise [Thu, 24 Apr 2014 19:31:59 +0000 (14:31 -0500)]
RDMA/cxgb4: Force T5 connections to use TAHOE congestion control
This is required to work around a T5 HW issue.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Steve Wise [Thu, 24 Apr 2014 19:31:53 +0000 (14:31 -0500)]
RDMA/cxgb4: Fix endpoint mutex deadlocks
In cases where the cm calls c4iw_modify_rc_qp() with the endpoint
mutex held, they must be called with internal == 1. rx_data() and
process_mpa_reply() are not doing this. This causes a deadlock
because c4iw_modify_rc_qp() might call c4iw_ep_disconnect() in some
!internal cases, and c4iw_ep_disconnect() acquires the endpoint mutex.
The design was intended to only do the disconnect for !internal calls.
Change rx_data(), FPDU_MODE case, to call c4iw_modify_rc_qp() with
internal == 1, and then disconnect only after releasing the mutex.
Change process_mpa_reply() to call c4iw_modify_rc_qp(TERMINATE) with
internal == 1 and set a new attr flag telling it to send a TERMINATE
message. Previously this was implied by !internal.
Change process_mpa_reply() to return whether the caller should
disconnect after releasing the endpoint mutex. Now rx_data() will do
the disconnect in the cases where process_mpa_reply() wants to
disconnect after the TERMINATE is sent.
Change c4iw_modify_rc_qp() RTS->TERM to only disconnect if !internal,
and to send a TERMINATE message if attrs->send_term is 1.
Change abort_connection() to not aquire the ep mutex for setting the
state, and make all calls to abort_connection() do so with the mutex
held.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Linus Torvalds [Mon, 28 Apr 2014 23:57:51 +0000 (16:57 -0700)]
Merge tag 'trace-fixes-v3.15-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull ftrace bugfix from Steven Rostedt:
"Takao Indoh reported that he was able to cause a ftrace bug while
loading a module and enabling function tracing at the same time.
He uncovered a race where the module when loaded will convert the
calls to mcount into nops, and expects the module's text to be RW.
But when function tracing is enabled, it will convert all kernel text
(core and module) from RO to RW to convert the nops to calls to ftrace
to record the function. After the convertion, it will convert all the
text back from RW to RO.
The issue is, it will also convert the module's text that is loading.
If it converts it to RO before ftrace does its conversion, it will
cause ftrace to fail and require a reboot to fix it again.
This patch moves the ftrace module update that converts calls to
mcount into nops to be done when the module state is still
MODULE_STATE_UNFORMED. This will ignore the module when the text is
being converted from RW back to RO"
* tag 'trace-fixes-v3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/module: Hardcode ftrace_module_init() call into load_module()
Tim Gardner [Mon, 28 Apr 2014 16:18:18 +0000 (10:18 -0600)]
cpufreq: ppc-corenet-cpufreq: Fix __udivdi3 modpost error
bfa709bc823fc32ee8dd5220d1711b46078235d8 (cpufreq: powerpc: add cpufreq
transition latency for FSL e500mc SoCs) introduced a modpost error:
ERROR: "__udivdi3" [drivers/cpufreq/ppc-corenet-cpufreq.ko] undefined!
make[1]: *** [__modpost] Error 1
Fix this by avoiding 64 bit integer division.
gcc version 4.8.2
Fixes:
bfa709bc823f (cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srivatsa S. Bhat [Mon, 28 Apr 2014 18:54:58 +0000 (00:24 +0530)]
cpufreq: powernow-k7: Fix double invocation of cpufreq_freq_transition_begin/end
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The powernow-k7 cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the powernow-k7 driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
powernow-k7 driver, since they rightly belong to the cpufreq core.
Fixes:
12478cf0c55e (cpufreq: Make sure frequency transitions are serialized)
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srivatsa S. Bhat [Mon, 28 Apr 2014 18:54:42 +0000 (00:24 +0530)]
cpufreq: powernow-k6: Fix double invocation of cpufreq_freq_transition_begin/end
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The powernow-k6 cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the powernow-k6 driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
powernow-k6 driver, since they rightly belong to the cpufreq core.
(Note that during ->exit(), the powernow-k6 driver sets the frequency
without any help from the cpufreq core. So add explicit calls to the
_begin() and _end() APIs around that frequency transition alone, to take
care of that special case. Also, add a missing 'break' statement there.)
Fixes:
12478cf0c55e (cpufreq: Make sure frequency transitions are serialized)
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srivatsa S. Bhat [Mon, 28 Apr 2014 18:54:27 +0000 (00:24 +0530)]
cpufreq: powernow-k6: Fix incorrect comparison with max_multipler
The value of 'max_multiplier' is meant to be used for comparison with
clock_ratio[index].driver_data, not the index itself! Fix the code in
powernow_k6_cpu_exit() that has this bug.
Also, while at it, make the for-loop condition look for CPUFREQ_TABLE_END,
instead of hard-coding the loop count to 8.
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srivatsa S. Bhat [Mon, 28 Apr 2014 18:54:09 +0000 (00:24 +0530)]
cpufreq: longhaul: Fix double invocation of cpufreq_freq_transition_begin/end
During frequency transitions, the cpufreq core takes the responsibility of
invoking cpufreq_freq_transition_begin() and cpufreq_freq_transition_end()
for those cpufreq drivers that define the ->target_index callback but don't
set the ASYNC_NOTIFICATION flag.
The longhaul cpufreq driver falls under this category, but this driver was
invoking the _begin() and _end() APIs itself around frequency transitions,
which led to double invocation of the _begin() API. The _begin API makes
contending callers wait until the previous invocation is complete. Hence,
the longhaul driver ended up waiting on itself, leading to system hangs
during boot.
Fix this by removing the calls to the _begin() and _end() APIs from the
longhaul driver, since they rightly belong to the cpufreq core.
(Note that during module_exit(), the longhaul driver sets the frequency
without any help from the cpufreq core. So add explicit calls to the
_begin() and _end() APIs around that frequency transition alone, to take
care of that special case.)
Fixes:
12478cf0c55e (cpufreq: Make sure frequency transitions are serialized)
Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fam Zheng [Mon, 14 Apr 2014 02:16:09 +0000 (10:16 +0800)]
[SCSI] virtio-scsi: Skip setting affinity on uninitialized vq
virtscsi_init calls virtscsi_remove_vqs on err, even before initializing
the vqs. The latter calls virtscsi_set_affinity, so let's check the
pointer there before setting affinity on it.
This fixes a panic when setting device's num_queues=2 on RHEL 6.5:
qemu-system-x86_64 ... \
-device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \
-drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \
-device scsi-hd,drive=drive-scsi-disk,...
[ 0.354734] scsi0 : Virtio SCSI HBA
[ 0.379504] BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
[ 0.380141] IP: [<
ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[ 0.380141] PGD 0
[ 0.380141] Oops: 0000 [#1] SMP
[ 0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ #5
[ 0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 0.380141] task:
ffff88003c9f0000 ti:
ffff88003c9f8000 task.ti:
ffff88003c9f8000
[ 0.380141] RIP: 0010:[<
ffffffff814741ef>] [<
ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[ 0.380141] RSP: 0000:
ffff88003c9f9c08 EFLAGS:
00010256
[ 0.380141] RAX:
0000000000000000 RBX:
ffff88003c3a9d40 RCX:
0000000000001070
[ 0.380141] RDX:
0000000000000002 RSI:
0000000000000000 RDI:
0000000000000000
[ 0.380141] RBP:
ffff88003c9f9c28 R08:
00000000000136c0 R09:
ffff88003c801c00
[ 0.380141] R10:
ffffffff81475229 R11:
0000000000000008 R12:
0000000000000000
[ 0.380141] R13:
ffffffff81cc7ca8 R14:
ffff88003cac3d40 R15:
ffff88003cac37a0
[ 0.380141] FS:
0000000000000000(0000) GS:
ffff88003e400000(0000) knlGS:
0000000000000000
[ 0.380141] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[ 0.380141] CR2:
0000000000000020 CR3:
0000000001c0e000 CR4:
00000000000006f0
[ 0.380141] Stack:
[ 0.380141]
ffff88003c3a9d40 0000000000000000 ffff88003cac3d80 ffff88003cac3d40
[ 0.380141]
ffff88003c9f9c48 ffffffff814742e8 ffff88003c26d000 ffff88003c26d000
[ 0.380141]
ffff88003c9f9c68 ffffffff81474321 ffff88003c26d000 ffff88003c3a9d40
[ 0.380141] Call Trace:
[ 0.380141] [<
ffffffff814742e8>] virtscsi_set_affinity+0x28/0x40
[ 0.380141] [<
ffffffff81474321>] virtscsi_remove_vqs+0x21/0x50
[ 0.380141] [<
ffffffff81475231>] virtscsi_init+0x91/0x240
[ 0.380141] [<
ffffffff81365290>] ? vp_get+0x50/0x70
[ 0.380141] [<
ffffffff81475544>] virtscsi_probe+0xf4/0x280
[ 0.380141] [<
ffffffff81363ea5>] virtio_dev_probe+0xe5/0x140
[ 0.380141] [<
ffffffff8144c669>] driver_probe_device+0x89/0x230
[ 0.380141] [<
ffffffff8144c8ab>] __driver_attach+0x9b/0xa0
[ 0.380141] [<
ffffffff8144c810>] ? driver_probe_device+0x230/0x230
[ 0.380141] [<
ffffffff8144c810>] ? driver_probe_device+0x230/0x230
[ 0.380141] [<
ffffffff8144ac1c>] bus_for_each_dev+0x8c/0xb0
[ 0.380141] [<
ffffffff8144c499>] driver_attach+0x19/0x20
[ 0.380141] [<
ffffffff8144bf28>] bus_add_driver+0x198/0x220
[ 0.380141] [<
ffffffff8144ce9f>] driver_register+0x5f/0xf0
[ 0.380141] [<
ffffffff81d27c91>] ? spi_transport_init+0x79/0x79
[ 0.380141] [<
ffffffff8136403b>] register_virtio_driver+0x1b/0x30
[ 0.380141] [<
ffffffff81d27d19>] init+0x88/0xd6
[ 0.380141] [<
ffffffff81d27c18>] ? scsi_init_procfs+0x5b/0x5b
[ 0.380141] [<
ffffffff81ce88a7>] do_one_initcall+0x7f/0x10a
[ 0.380141] [<
ffffffff81ce8aa7>] kernel_init_freeable+0x14a/0x1de
[ 0.380141] [<
ffffffff81ce8b3b>] ? kernel_init_freeable+0x1de/0x1de
[ 0.380141] [<
ffffffff817dec20>] ? rest_init+0x80/0x80
[ 0.380141] [<
ffffffff817dec29>] kernel_init+0x9/0xf0
[ 0.380141] [<
ffffffff817e68fc>] ret_from_fork+0x7c/0xb0
[ 0.380141] [<
ffffffff817dec20>] ? rest_init+0x80/0x80
[ 0.380141] RIP [<
ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[ 0.380141] RSP <
ffff88003c9f9c08>
[ 0.380141] CR2:
0000000000000020
[ 0.380141] ---[ end trace
8074b70c3d5e1d73 ]---
[ 0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 0.475018]
[ 0.475068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[ 0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[jejb: checkpatch fixes]
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Linus Torvalds [Mon, 28 Apr 2014 22:19:06 +0000 (15:19 -0700)]
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree bug fixes from Grant Likely:
"These are some important bug fixes that need to get into v3.15.
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common
usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that
is not available at device creation time. This is a problem
causing mainline to fail on a number of ARM platforms"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
of/irq: do irq resolution in platform_get_irq
of: selftest: add deferred probe interrupt test
dt: Fix binding typos in clock-names and interrupt-names
Linus Torvalds [Mon, 28 Apr 2014 21:58:15 +0000 (14:58 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here is a bunch of post-merge window fixes that have been accumulating
in patchwork while I was on vacation or buried under other stuff last
week.
We have the now usual batch of LE fixes from Anton (sadly some new
stuff that went into this merge window had endian issues, we'll try to
make sure we do better next time)
Some fixes and cleanups to the new 24x7 performance monitoring stuff
(mostly typos and cleaning up printk's)
A series of fixes for an issue with our runlatch bit, which wasn't set
properly for offlined threads/cores and under KVM, causing potentially
some counters to misbehave along with possible power management
issues.
A fix for kexec nasty race where the new kernel wouldn't "see" the
secondary processors having reached back into firmware in time.
And finally a few other misc (and pretty simple) bug fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (33 commits)
powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
ppc/kvm: Clear the runlatch bit of a vcpu before napping
ppc/kvm: Set the runlatch bit of a CPU just before starting guest
ppc/powernv: Set the runlatch bits correctly for offline cpus
powerpc/pseries: Protect remove_memory() with device hotplug lock
powerpc: Fix error return in rtas_flash module init
powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
powerpc: Bump COMMAND_LINE_SIZE to 2048
powerpc: Rename duplicate COMMAND_LINE_SIZE define
powerpc/perf/hv-24x7: Catalog version number is be64, not be32
powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
powerpc/perf/hv-gpci: Make device attr static
powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
powerpc/powernv: Fix little endian issues in OPAL dump code
powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
powerpc/powernv: Fix little endian issues in OPAL error log code
powerpc/powernv: Fix little endian issues with opal_do_notifier calls
...
Linus Torvalds [Mon, 28 Apr 2014 21:24:09 +0000 (14:24 -0700)]
mm: don't pointlessly use BUG_ON() for sanity check
BUG_ON() is a big hammer, and should be used _only_ if there is some
major corruption that you cannot possibly recover from, making it
imperative that the current process (and possibly the whole machine) be
terminated with extreme prejudice.
The trivial sanity check in the vmacache code is *not* such a fatal
error. Recovering from it is absolutely trivial, and using BUG_ON()
just makes it harder to debug for no actual advantage.
To make matters worse, the placement of the BUG_ON() (only if the range
check matched) actually makes it harder to hit the sanity check to begin
with, so _if_ there is a bug (and we just got a report from Srivatsa
Bhat that this can indeed trigger), it is harder to debug not just
because the machine is possibly dead, but because we don't have better
coverage.
BUG_ON() must *die*. Maybe we should add a checkpatch warning for it,
because it is simply just about the worst thing you can ever do if you
hit some "this cannot happen" situation.
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 3 Apr 2014 07:21:34 +0000 (10:21 +0300)]
irqchip: irq-crossbar: Not allocating enough memory
We are allocating the size of a pointer and not the size of the data.
This will lead to memory corruption.
There isn't actually a "cb_device" struct, btw. The code is only able
to compile because GCC knows that all pointers are the same size.
Fixes:
96ca848ef7ea ('DRIVERS: IRQCHIP: CROSSBAR: Add support for Crossbar IP')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sricharan R <r.sricharan@ti.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/20140403072134.GA14286@mwanda
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Tue, 4 Mar 2014 20:43:41 +0000 (20:43 +0000)]
irqchip: armanda: Sanitize set_irq_affinity()
The set_irq_affinity() function has two issues:
1) It has no protection against selecting an offline cpu from the
given mask.
2) It pointlessly restricts the affinity masks to have a single cpu
set. This collides with the irq migration code of arm.
irq affinity is set to core 3
core 3 goes offline
migration code sets mask to cpu_online_mask and calls the
irq_set_affinity() callback of the irq_chip which fails due to bit
0,1,2 set.
So instead of doing silly for_each_cpu() loops just pick any bit of
the mask which intersects with the online mask.
Get rid of fiddling with the default_irq_affinity as well.
[ Gregory: Fixed the access to the routing register ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/20140304203101.088889302@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tyler Stachecki [Fri, 25 Apr 2014 20:41:04 +0000 (16:41 -0400)]
[SCSI] mpt2sas: Don't disable device twice at suspend.
On suspend, _scsih_suspend calls mpt2sas_base_free_resources, which
in turn calls pci_disable_device if the device is enabled prior to
suspending. However, _scsih_suspend also calls pci_disable_device
itself.
Thus, in the event that the device is enabled prior to suspending,
pci_disable_device will be called twice. This patch removes the
duplicate call to pci_disable_device in _scsi_suspend as it is both
unnecessary and results in a kernel oops.
Signed-off-by: Tyler Stachecki <tstache1@binghamton.edu>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Steven Rostedt (Red Hat) [Thu, 24 Apr 2014 14:40:12 +0000 (10:40 -0400)]
ftrace/module: Hardcode ftrace_module_init() call into load_module()
A race exists between module loading and enabling of function tracer.
CPU 1 CPU 2
----- -----
load_module()
module->state = MODULE_STATE_COMING
register_ftrace_function()
mutex_lock(&ftrace_lock);
ftrace_startup()
update_ftrace_function();
ftrace_arch_code_modify_prepare()
set_all_module_text_rw();
<enables-ftrace>
ftrace_arch_code_modify_post_process()
set_all_module_text_ro();
[ here all module text is set to RO,
including the module that is
loading!! ]
blocking_notifier_call_chain(MODULE_STATE_COMING);
ftrace_init_module()
[ tries to modify code, but it's RO, and fails!
ftrace_bug() is called]
When this race happens, ftrace_bug() will produces a nasty warning and
all of the function tracing features will be disabled until reboot.
The simple solution is to treate module load the same way the core
kernel is treated at boot. To hardcode the ftrace function modification
of converting calls to mcount into nops. This is done in init/main.c
there's no reason it could not be done in load_module(). This gives
a better control of the changes and doesn't tie the state of the
module to its notifiers as much. Ftrace is special, it needs to be
treated as such.
The reason this would work, is that the ftrace_module_init() would be
called while the module is in MODULE_STATE_UNFORMED, which is ignored
by the set_all_module_text_ro() call.
Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com
Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Andre Przywara [Thu, 10 Apr 2014 22:07:18 +0000 (00:07 +0200)]
KVM: arm/arm64: vgic: fix GICD_ICFGR register accesses
Since KVM internally represents the ICFGR registers by stuffing two
of them into one word, the offset for accessing the internal
representation and the one for the MMIO based access are different.
So keep the original offset around, but adjust the internal array
offset by one bit.
Reported-by: Haibin Wang <wanghaibin.wang@huawei.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Oleg Nesterov [Mon, 21 Apr 2014 13:26:01 +0000 (15:26 +0200)]
KVM: async_pf: mm->mm_users can not pin apf->mm
get_user_pages(mm) is simply wrong if mm->mm_users == 0 and exit_mmap/etc
was already called (or is in progress), mm->mm_count can only pin mm->pgd
and mm_struct itself.
Change kvm_setup_async_pf/async_pf_execute to inc/dec mm->mm_users.
kvm_create_vm/kvm_destroy_vm play with ->mm_count too but this case looks
fine at first glance, it seems that this ->mm is only used to verify that
current->mm == kvm->mm.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Alexander Graf [Sun, 6 Apr 2014 21:31:48 +0000 (23:31 +0200)]
KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
The book3s_32 target can get built as module which means we don't see the
config define for it in code. Instead, check on the bool define
CONFIG_KVM_BOOK3S_32_HANDLER whenever we want to know whether we're building
for a book3s_32 host.
This fixes running book3s_32 kvm as a module for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Paul Mackerras [Sun, 13 Apr 2014 22:56:26 +0000 (08:56 +1000)]
KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
Testing by Michael Neuling revealed that commit
e4e38121507a ("KVM:
PPC: Book3S HV: Add transactional memory support") is missing the code
that saves away the checkpointed state of the guest when switching to
the host. This adds that code, which was in earlier versions of the
patch but went missing somehow.
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
pingfank@linux.vnet.ibm.com [Tue, 15 Apr 2014 08:33:40 +0000 (16:33 +0800)]
KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
Numa fault is a method which help to achieve auto numa balancing.
When such a page fault takes place, the page fault handler will check
whether the page is placed correctly. If not, migration should be
involved to cut down the distance between the cpu and pages.
A pte with _PAGE_NUMA help to implement numa fault. It means not to
allow the MMU to access the page directly. So a page fault is triggered
and numa fault handler gets the opportunity to run checker.
As for the access of MMU, we need special handling for the powernv's guest.
When we mark a pte with _PAGE_NUMA, we already call mmu_notifier to
invalidate it in guest's htab, but when we tried to re-insert them,
we firstly try to map it in real-mode. Only after this fails, we fallback
to virt mode, and most of important, we run numa fault handler in virt
mode. This patch guards the way of real-mode to ensure that if a pte is
marked with _PAGE_NUMA, it will NOT be mapped in real mode, instead, it will
be mapped in virt mode and have the opportunity to be checked with placement.
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Haibin Wang [Thu, 10 Apr 2014 12:14:32 +0000 (13:14 +0100)]
KVM: ARM: vgic: Fix sgi dispatch problem
When dispatch SGI(mode == 0), that is the vcpu of VM should send
sgi to the cpu which the target_cpus list.
So, there must add the "break" to branch of case 0.
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Sat, 26 Apr 2014 10:43:01 +0000 (03:43 -0700)]
MAINTAINERS: co-maintainance of KVM/{arm,arm64}
The KVM/{arm,arm64} ports are sharing a lot of code, and are
effectively co-maintained (and have been for quite a while).
Make the situation official and list the two maintainers
for both ports.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark Salter [Fri, 28 Mar 2014 14:25:19 +0000 (14:25 +0000)]
arm: KVM: fix possible misalignment of PGDs and bounce page
The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
a bounce page (if hypervisor init code crosses page boundary) and
hypervisor PGDs. The problem is that kalloc() does not guarantee
the proper alignment. In the case of the bounce page, the page sized
buffer allocated may also cross a page boundary negating the purpose
and leading to a hang during kvm initialization. Likewise the PGDs
allocated may not meet the minimum alignment requirements of the
underlying MMU. This patch uses __get_free_page() to guarantee the
worst case alignment needs of the bounce page and PGDs on both arm
and arm64.
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>