Matt Fleming [Fri, 11 Mar 2016 11:19:23 +0000 (11:19 +0000)]
x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.
Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.
For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:
7d68dc3f1003 ("x86, efi: Do not reserve boot services regions within reserved areas")
That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.
Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.
For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().
The event handler for this driver looks like this,
sub rsp,0x28
lea rdx,[rip+0x2445] # 0xaa948720
mov ecx,0x4
call func_aa9447c0 ; call to ConvertPointer(4, & 0xaa948720)
mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
xor eax,eax
mov BYTE PTR [r11+0x1],0x1
add rsp,0x28
ret
Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.
The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.
Reported-by: Alexis Murzeau <amurzeau@gmail.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Borislav Petkov [Fri, 11 Mar 2016 11:32:06 +0000 (12:32 +0100)]
x86/fpu: Fix eager-FPU handling on legacy FPU machines
i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.
So after we made eagerfpu the default for all CPU types:
58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
these old FPU designs broke. First, Andy Shevchenko reported a splat:
WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
which was us trying to execute FXRSTOR on those machines even though
they don't support it.
After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.
Take care of all that.
Reported-and-tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Borislav Petkov [Wed, 9 Mar 2016 20:56:22 +0000 (21:56 +0100)]
x86/delay: Avoid preemptible context checks in delay_mwaitx()
We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:
BUG: using smp_processor_id() in preemptible [
00000000] code: udevd/312
caller is delay_mwaitx+0x40/0xa0
But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Yu-cheng Yu [Thu, 10 Mar 2016 00:28:54 +0000 (16:28 -0800)]
x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
Leonid Shatz noticed that the SDM interpretation of the following
recent commit:
394db20ca240741 ("x86/fpu: Disable AVX when eagerfpu is off")
... is incorrect and that the original behavior of the FPU code was correct.
Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:
Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
by the new task.'
Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
Specification:
'AVX instructions refer to exceptions by classes that include #NM
"Device Not Available" exception for lazy context switch.'
So revert the commit.
Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andy Lutomirski [Thu, 21 Jan 2016 23:24:31 +0000 (15:24 -0800)]
x86/fpu: Fix 'no387' regression
After fixing FPU option parsing, we now parse the 'no387' boot option
too early: no387 clears X86_FEATURE_FPU before it's even probed, so
the boot CPU promptly re-enables it.
I suspect it gets even more confused on SMP.
Fix the probing code to leave X86_FEATURE_FPU off if it's been
disabled by setup_clear_cpu_cap().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Fixes:
4f81cbafcce2 ("x86/fpu: Fix early FPU command-line parsing")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Mon, 7 Mar 2016 23:41:10 +0000 (15:41 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix ordering of WEXT netlink messages so we don't see a newlink
after a dellink, from Johannes Berg.
2) Out of bounds access in minstrel_ht_set_best_prob_rage, from
Konstantin Khlebnikov.
3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb.
4) Wrong units used to set initial TCP rto from cached metrics, also
from Konstantin Khlebnikov.
5) Fix stale IP options data in the SKB control block from leaking
through layers of encapsulation, from Bernie Harris.
6) Zero padding len miscalculated in bnxt_en, from Michael Chan.
7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix
from Hannes Frederic Sowa.
8) Fix suspend/resume with JME networking devices, from Diego Violat
and Guo-Fu Tseng.
9) Checksums not validated properly in bridge multicast support due to
the placement of the SKB header pointers at the time of the check,
fix from Álvaro Fernández Rojas.
10) Fix hang/tiemout with r8169 if a stats fetch is done while the
device is runtime suspended. From Chun-Hao Lin.
11) The forwarding database netlink dump facilities don't track the
state of the dump properly, resulting in skipped/missed entries.
From Minoura Makoto.
12) Fix regression from a recent 3c59x bug fix, from Neil Horman.
13) Fix list corruption in bna driver, from Ivan Vecera.
14) Big endian machines crash on vlan add in bnx2x, fix from Michal
Schmidt.
15) Ethtool RSS configuration not propagated properly in mlx5 driver,
from Tariq Toukan.
16) Fix regression in PHY probing in stmmac driver, from Gabriel
Fernandez.
17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin
Poirier.
18) A past change to skip empty routing headers in ipv6 extention header
parsing accidently caused fragment headers to not be matched any
longer. Fix from Florian Westphal.
19) eTSEC-106 erratum needs to be applied to more gianfar chips, from
Atsushi Nemoto.
20) Fix netdev reference after free via workqueues in usb networking
drivers, from Oliver Neukum and Bjørn Mork.
21) mdio->irq is now an array rather than a pointer to dynamic memory,
but several drivers were still trying to free it :-/ Fixes from
Colin Ian King.
22) act_ipt iptables action forgets to set the family field, thus LOG
netfilter targets don't work with it. Fix from Phil Sutter.
23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon.
24) pskb_may_pull() cannot be called with interrupts disabled, fix code
that tries to do this in vmxnet3 driver, from Neil Horman.
25) be2net driver leaks iomap'd memory on removal, fix from Douglas
Miller.
26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths,
from Guillaume Nault.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits)
ppp: release rtnl mutex when interface creation fails
cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
tcp: fix tcpi_segs_in after connection establishment
net: hns: fix the bug about loopback
jme: Fix device PM wakeup API usage
jme: Do not enable NIC WoL functions on S0
udp6: fix UDP/IPv6 encap resubmit path
be2net: Don't leak iomapped memory on removal.
vmxnet3: avoid calling pskb_may_pull with interrupts disabled
net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
ibmveth: check return of skb_linearize in ibmveth_start_xmit
cdc_ncm: toggle altsetting to force reset before setup
usbnet: cleanup after bind() in probe()
mlxsw: pci: Correctly determine if descriptor queue is full
mlxsw: spectrum: Always decrement bridge's ref count
tipc: fix nullptr crash during subscription cancel
net: eth: altera: do not free array priv->mdio->irq
net/ethoc: do not free array priv->mdio->irq
net: sched: fix act_ipt for LOG target
asix: do not free array priv->mdio->irq
...
Linus Torvalds [Mon, 7 Mar 2016 23:23:25 +0000 (15:23 -0800)]
Merge branch 'overlayfs-linus' of git://git./linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Overlayfs bug fixes. All marked as -stable material"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: copy new uid/gid into overlayfs runtime inode
ovl: ignore lower entries when checking purity of non-directory entries
ovl: fix getcwd() failure after unsuccessful rmdir
ovl: fix working on distributed fs as lower layer
Linus Torvalds [Mon, 7 Mar 2016 21:15:09 +0000 (13:15 -0800)]
Revert "drm/radeon: call hpd_irq_event on resume"
This reverts commit
dbb17a21c131eca94eb31136eee9a7fe5aff00d9.
It turns out that commit can cause problems for systems with multiple
GPUs, and causes X to hang on at least a HP Pavilion dv7 with hybrid
graphics.
This got noticed originally in 4.4.4, where this patch had already
gotten back-ported, but 4.5-rc7 was verified to have the same problem.
Alexander Deucher says:
"It looks like you have a muxed system so I suspect what's happening is
that one of the display is being reported as connected for both the
IGP and the dGPU and then the desktop environment gets confused or
there some sort problem in the detect functions since the mux is not
switched to the dGPU. I don't see an easy fix unless Dave has any
ideas. I'd say just revert for now"
Reported-by: Jörg-Volker Peetz <jvpeetz@web.de>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: stable@kernel.org # wherever dbb17a21c131 got back-ported
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guillaume Nault [Mon, 7 Mar 2016 18:36:44 +0000 (19:36 +0100)]
ppp: release rtnl mutex when interface creation fails
Add missing rtnl_unlock() in the error path of ppp_create_interface().
Fixes:
58a89ecaca53 ("ppp: fix lockdep splat in ppp_dev_uninit()")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Mon, 7 Mar 2016 20:15:36 +0000 (21:15 +0100)]
cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
usbnet_link_change will call schedule_work and should be
avoided if bind is failing. Otherwise we will end up with
scheduled work referring to a netdev which has gone away.
Instead of making the call conditional, we can just defer
it to usbnet_probe, using the driver_info flag made for
this purpose.
Fixes:
8a34b0ae8778 ("usbnet: cdc_ncm: apply usbnet_link_change")
Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 6 Mar 2016 17:29:21 +0000 (09:29 -0800)]
tcp: fix tcpi_segs_in after connection establishment
If final packet (ACK) of 3WHS is lost, it appears we do not properly
account the following incoming segment into tcpi_segs_in
While we are at it, starts segs_in with one, to count the SYN packet.
We do not yet count number of SYN we received for a request sock, we
might add this someday.
packetdrill script showing proper behavior after fix :
// Tests tcpi_segs_in when 3rd packet (ACK) of 3WHS is lost
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
+.020 < P. 1:1001(1000) ack 1 win 32792
+0 accept(3, ..., ...) = 4
+.000 %{ assert tcpi_segs_in == 2, 'tcpi_segs_in=%d' % tcpi_segs_in }%
Fixes:
2efd055c53c06 ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
yankejian [Sat, 5 Mar 2016 06:10:42 +0000 (14:10 +0800)]
net: hns: fix the bug about loopback
It will always be passed if the soc is tested the loopback cases. This
patch will fix this bug.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sat, 5 Mar 2016 00:11:56 +0000 (08:11 +0800)]
jme: Fix device PM wakeup API usage
According to Documentation/power/devices.txt
The driver should not use device_set_wakeup_enable() which is the policy
for user to decide.
Using device_init_wakeup() to initialize dev->power.should_wakeup and
dev->power.can_wakeup on driver initialization.
And use device_may_wakeup() on suspend to decide if WoL function should
be enabled on NIC.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sat, 5 Mar 2016 00:11:55 +0000 (08:11 +0800)]
jme: Do not enable NIC WoL functions on S0
Otherwise it might be back on resume right after going to suspend in
some hardware.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bill Sommerfeld [Fri, 4 Mar 2016 22:47:21 +0000 (14:47 -0800)]
udp6: fix UDP/IPv6 encap resubmit path
IPv4 interprets a negative return value from a protocol handler as a
request to redispatch to a new protocol. In contrast, IPv6 interprets a
negative value as an error, and interprets a positive value as a request
for redispatch.
UDP for IPv6 was unaware of this difference. Change __udp6_lib_rcv() to
return a positive value for redispatch. Note that the socket's
encap_rcv hook still needs to return a negative value to request
dispatch, and in the case of IPv6 packets, adjust IP6CB(skb)->nhoff to
identify the byte containing the next protocol.
Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Miller [Fri, 4 Mar 2016 21:36:56 +0000 (15:36 -0600)]
be2net: Don't leak iomapped memory on removal.
The adapter->pcicfg resource is either mapped via pci_iomap() or
derived from adapter->db. During be_remove() this resource was ignored
and so could remain mapped after remove.
Add a flag to track whether adapter->pcicfg was mapped or not, then
use that flag in be_unmap_pci_bars() to unmap if required.
Fixes:
25848c901 ("use PCI MMIO read instead of config read for errors")
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Fri, 4 Mar 2016 18:40:48 +0000 (13:40 -0500)]
vmxnet3: avoid calling pskb_may_pull with interrupts disabled
vmxnet3 has a function vmxnet3_parse_and_copy_hdr which, among other operations,
uses pskb_may_pull to linearize the header portion of an skb. That operation
eventually uses local_bh_disable/enable to ensure that it doesn't race with the
drivers bottom half handler. Unfortunately, vmxnet3 preforms this
parse_and_copy operation with a spinlock held and interrupts disabled. This
causes us to run afoul of the WARN_ON_ONCE(irqs_disabled()) warning in
local_bh_enable, resulting in this:
WARNING: at kernel/softirq.c:159 local_bh_enable+0x59/0x90() (Not tainted)
Hardware name: VMware Virtual Platform
Modules linked in: ipv6 ppdev parport_pc parport microcode e1000 vmware_balloon
vmxnet3 i2c_piix4 sg ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi
mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix vmwgfx ttm
drm_kms_helper drm i2c_core dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: mperf]
Pid: 6229, comm: sshd Not tainted 2.6.32-616.el6.i686 #1
Call Trace:
[<
c04624d9>] ? warn_slowpath_common+0x89/0xe0
[<
c0469e99>] ? local_bh_enable+0x59/0x90
[<
c046254b>] ? warn_slowpath_null+0x1b/0x20
[<
c0469e99>] ? local_bh_enable+0x59/0x90
[<
c07bb936>] ? skb_copy_bits+0x126/0x210
[<
f8d1d9fe>] ? ext4_ext_find_extent+0x24e/0x2d0 [ext4]
[<
c07bc49e>] ? __pskb_pull_tail+0x6e/0x2b0
[<
f95a6164>] ? vmxnet3_xmit_frame+0xba4/0xef0 [vmxnet3]
[<
c05d15a6>] ? selinux_ip_postroute+0x56/0x320
[<
c0615988>] ? cfq_add_rq_rb+0x98/0x110
[<
c0852df8>] ? packet_rcv+0x48/0x350
[<
c07c5839>] ? dev_queue_xmit_nit+0xc9/0x140
...
Fix it by splitting vmxnet3_parse_and_copy_hdr into two functions:
vmxnet3_parse_hdr, which sets up the internal/on stack ctx datastructure, and
pulls the skb (both of which can be done without holding the spinlock with irqs
disabled
and
vmxnet3_copy_header, which just copies the skb to the tx ring under the lock
safely.
tested and shown to correct the described problem. Applies cleanly to the head
of the net tree
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shrikrishna Khare <skhare@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Mar 2016 19:58:11 +0000 (14:58 -0500)]
Merge tag 'wireless-drivers-for-davem-2016-03-04' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.5
iwlwifi
* free firmware paging memory when the module is unloaded or device removed
* fix pending frames counter to fix an issue when removing stations
ssb
* fix a build problem related to ssb_fill_sprom_with_fallback()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Kozlowski [Fri, 4 Mar 2016 01:04:52 +0000 (10:04 +0900)]
net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
The MFD_SYSCON depends on HAS_IOMEM so when selecting it avoid unmet
direct dependencies.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Thu, 3 Mar 2016 21:22:36 +0000 (15:22 -0600)]
ibmveth: check return of skb_linearize in ibmveth_start_xmit
If skb_linearize fails, the driver should drop the packet
instead of trying to copy it into the bounce buffer.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 3 Mar 2016 21:20:53 +0000 (22:20 +0100)]
cdc_ncm: toggle altsetting to force reset before setup
Some devices will silently fail setup unless they are reset first.
This is necessary even if the data interface is already in
altsetting 0, which it will be when the device is probed for the
first time. Briefly toggling the altsetting forces a function
reset regardless of the initial state.
This fixes a setup problem observed on a number of Huawei devices,
appearing to operate in NTB-32 mode even if we explicitly set them
to NTB-16 mode.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Mon, 7 Mar 2016 10:31:10 +0000 (11:31 +0100)]
usbnet: cleanup after bind() in probe()
In case bind() works, but a later error forces bailing
in probe() in error cases work and a timer may be scheduled.
They must be killed. This fixes an error case related to
the double free reported in
http://www.spinics.net/lists/netdev/msg367669.html
and needs to go on top of Linus' fix to cdc-ncm.
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Mar 2016 16:39:16 +0000 (11:39 -0500)]
Merge branch 'mlxsw-fixes'
Jiri Pirko says:
====================
mlxsw: couple of fixes
Couple of fixes from Ido.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 7 Mar 2016 14:15:30 +0000 (15:15 +0100)]
mlxsw: pci: Correctly determine if descriptor queue is full
The descriptor queues for sending (SDQs) and receiving (RDQs) packets
are managed by two counters - producer and consumer - which are both
16-bit in size. A queue is considered full when the difference between
the two equals the queue's maximum number of descriptors.
However, if the producer counter overflows, then it's possible for the
full queue check to fail, as it doesn't take the overflow into account.
In such a case, descriptors already passed to the device - but for which
a completion has yet to be posted - will be overwritten, thereby causing
undefined behavior. The above can be achieved under heavy load (~30
netperf instances).
Fix that by casting the subtraction result to u16, preventing it from
being treated as a signed integer.
Fixes:
eda6500a987a ("mlxsw: Add PCI bus implementation")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 7 Mar 2016 14:15:29 +0000 (15:15 +0100)]
mlxsw: spectrum: Always decrement bridge's ref count
Since we only support one VLAN filtering bridge we need to associate a
reference count with it, so that when the last port netdev leaves it, we
would know that a different bridge can be offloaded to hardware.
When a LAG device is memeber in a bridge and port netdevs are leaving
the LAG, we should always decrement the bridge's reference count, as it's
incremented for any port in the LAG.
Fixes:
4dc236c31733 ("mlxsw: spectrum: Handle port leaving LAG while bridged")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parthasarathy Bhuvaragan [Thu, 3 Mar 2016 16:54:54 +0000 (17:54 +0100)]
tipc: fix nullptr crash during subscription cancel
commit
4d5cfcba2f6e ('tipc: fix connection abort during subscription
cancel'), removes the check for a valid subscription before calling
tipc_nametbl_subscribe().
This will lead to a nullptr exception when we process a
subscription cancel request. For a cancel request, a null
subscription is passed to tipc_nametbl_subscribe() resulting
in exception.
In this commit, we call tipc_nametbl_subscribe() only for
a valid subscription.
Fixes:
4d5cfcba2f6e ('tipc: fix connection abort during subscription cancel')
Reported-by: Anders Widell <anders.widell@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 3 Mar 2016 13:47:18 +0000 (13:47 +0000)]
net: eth: altera: do not free array priv->mdio->irq
priv->mdio->irq used to be allocated and required freeing, but it
is now a fixed sized array and should no longer be free'd.
Issue detected using static analysis with CoverityScan
Fixes:
e7f4dc3536a400 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 3 Mar 2016 13:43:34 +0000 (13:43 +0000)]
net/ethoc: do not free array priv->mdio->irq
priv->mdio->irq used to be allocated and required freeing, but it
is now a fixed sized array and should no longer be free'd.
Issue detected using static analysis with CoverityScan
Fixes:
e7f4dc3536a400 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Thu, 3 Mar 2016 13:34:14 +0000 (14:34 +0100)]
net: sched: fix act_ipt for LOG target
Before calling the destroy() or target() callbacks, the family parameter
field has to be initialized. Otherwise at least the LOG target will
refuse to work and upon removal oops the kernel.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 3 Mar 2016 13:27:56 +0000 (13:27 +0000)]
asix: do not free array priv->mdio->irq
Used to be allocated and required freeing, but now
priv->mdio->irq is now a fixed sized array and should no longer be
free'd.
Issue detected using static analysis with CoverityScan
Fixes:
e7f4dc3536a400 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atsushi Nemoto [Thu, 3 Mar 2016 00:07:51 +0000 (09:07 +0900)]
gianfar: Enable eTSEC-106 erratum w/a for MPC8548E Rev2
Enable workaround for MPC8548E erratum eTSEC 106,
"Excess delays when transmitting TOE=1 large frames".
(see commit
53fad77375ce "gianfar: Enable eTSEC-20 erratum w/a
for P2020 Rev1")
This erratum was fixed in Rev 3.1.x.
Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 6 Mar 2016 22:48:03 +0000 (14:48 -0800)]
Linux 4.5-rc7
Linus Torvalds [Sun, 6 Mar 2016 22:14:54 +0000 (14:14 -0800)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fix from Olof Johansson:
"Tiny fixes branch this week, in fact only one patch.
Turns out the USB support for a Renesas board was developed on a
pre-release board that ended up being changed before shipping. To
avoid breakage on those boards, and avoid confusion, it's a reasonable
idea to patch now instead of later. There are no known users of the
pre-release variant any more"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: porter: remove enable prop from HS-USB device node
Linus Torvalds [Sun, 6 Mar 2016 21:51:27 +0000 (13:51 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Just two ARM fixes this time: one to fix the hyp-stub for older ARM
CPUs, and another to fix the set_memory_xx() permission functions to
deal with zero sizes correctly"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8544/1: set_memory_xx fixes
ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
Linus Torvalds [Sun, 6 Mar 2016 19:31:13 +0000 (11:31 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull ceph fix from Sage Weil:
"This is a final commit we missed to align the protocol compatibility
with the feature bits.
It decodes a few extra fields in two different messages and reports
EIO when they are used (not yet supported)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
Linus Torvalds [Sun, 6 Mar 2016 19:24:05 +0000 (11:24 -0800)]
Merge tag 'upstream-4.5-rc7' of git://git.infradead.org/linux-ubifs
Pull UBI fix from Richard Weinberger:
"This contains a single bug fix for UBI"
* tag 'upstream-4.5-rc7' of git://git.infradead.org/linux-ubifs:
ubi: Fix out of bounds write in volume update code
Linus Torvalds [Sun, 6 Mar 2016 19:19:28 +0000 (11:19 -0800)]
Merge branch 'for-linus-4.5-rc7' of git://git./linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"This contains three bug/build fixes"
* 'for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: use %lx format specifiers for unsigned longs
um: Export pm_power_off
Revert "um: Fix get_signal() usage"
Linus Torvalds [Sun, 6 Mar 2016 19:14:16 +0000 (11:14 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Another round of fixes for 4.5:
- Fix the use of an undocumented syntactial variant of the .type
pseudo op which is not supported by the LLVM assembler.
- Fix invalid initialization on S-cache-less systems.
- Fix possible information leak from the kernel stack for SIGFPE.
- Fix handling of copy_{from,to}_user() return value in KVM
- Fix the last instance of irq_to_gpio() which now was causing build
errors"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
MIPS: kvm: Fix ioctl error handling.
MIPS: scache: Fix scache init with invalid line size.
MIPS: Avoid variant of .type unsupported by LLVM Assembler
MIPS: jz4740: Fix surviving instance of irq_to_gpio()
Linus Torvalds [Sun, 6 Mar 2016 19:08:06 +0000 (11:08 -0800)]
Merge tag 'powerpc-4.5-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- cxl: Fix PSL timebase synchronization detection from Frederic Barrat
- Fix oops when destroying hw_breakpoint event from Ravi Bangoria
- Avoid lbarx on e5500 from Scott Wood
* tag 'powerpc-4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/fsl-book3e: Avoid lbarx on e5500
powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event
cxl: Fix PSL timebase synchronization detection
Linus Torvalds [Sun, 6 Mar 2016 19:03:34 +0000 (11:03 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"One I2C bugfix ensuring correct memory allocation in a driver"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: brcmstb: allocate correct amount of memory for regmap
Linus Torvalds [Sun, 6 Mar 2016 18:50:00 +0000 (10:50 -0800)]
Merge tag 'usb-4.5-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB driver ids for 4.5-rc7, and the removal of a driver
we merged in 4.5-rc1 but it turns out it's not needed as the hardware
is the same as a driver we already have in the tree.
This was only figured out after doing a lot of cleanup on it, gotta
love vendor-provided drivers... The new device ids for the devices
for this driver will be added later on when testing is completed, but
for now, we will remove the driver to keep people from accidentally
cleaning it up.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: qcserial: add Sierra Wireless EM74xx device ID
Revert "USB: serial: add Moxa UPORT 11x0 driver"
USB: serial: option: add support for Quectel UC20
USB: serial: option: add support for Telit LE922 PID 0x1045
USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder
USB: qcserial: add Dell Wireless 5809e Gobi 4G HSPA+ (rev3)
usb: chipidea: otg: change workqueue ci_otg as freezable
Colin Ian King [Sat, 23 Jan 2016 19:17:59 +0000 (19:17 +0000)]
um: use %lx format specifiers for unsigned longs
static analysis from cppcheck detected %x being used for
unsigned longs:
[arch/x86/um/os-Linux/task_size.c:112]: (warning) %x in format
string (no. 1) requires 'unsigned int' but the argument type
is 'unsigned long'.
Use %lx instead of %x
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Mon, 25 Jan 2016 22:24:21 +0000 (23:24 +0100)]
um: Export pm_power_off
...modules are using this symbol.
Export it like all other archs to.
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Mon, 25 Jan 2016 22:33:30 +0000 (23:33 +0100)]
Revert "um: Fix get_signal() usage"
Commit
db2f24dc240856fb1d78005307f1523b7b3c121b
was plain wrong. I did not realize the we are
allowed to loop here.
In fact we have to loop and must not return to userspace
before all SIGSEGVs have been delivered.
Other archs do this directly in their entry code, UML
does it here.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Sun, 21 Feb 2016 09:53:03 +0000 (10:53 +0100)]
ubi: Fix out of bounds write in volume update code
ubi_start_leb_change() allocates too few bytes.
ubi_more_leb_change_data() will write up to req->upd_bytes +
ubi->min_io_size bytes.
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Linus Torvalds [Sat, 5 Mar 2016 20:35:48 +0000 (12:35 -0800)]
Merge tag 'sound-4.5-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"It's our tradition to get a high volume of fixes late at rc7: this
time, X32 ABI breakage was found and this resulted in a high number
LOCs. The necessary changes to ALSA core codes were fairly
straightforward, and more importantly, they are specific to X32, thus
should be safe to apply.
Other than that, rather a collection of small fixes:
- Removal of the code that blocks too long at closing the OSS
sequencer client (which was spotted by syzkaller, unsurprisingly)
- Fixes races at HD-audio HDMI i915 audio binding
- a few HDSP/HDPM zero-division fixes
- Quirks for HD-audio and USB-audio as usual"
* tag 'sound-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - hdmi defer to register acomp eld notifier
ALSA: hda - hdmi add wmb barrier for audio component
ALSA: hda - Fix mic issues on Acer Aspire E1-472
ALSA: seq: oss: Don't drain at closing a client
ALSA: usb-audio: Add a quirk for Plantronics DA45
ALSA: hdsp: Fix wrong boolean ctl value accesses
ALSA: hdspm: Fix zero-division
ALSA: hdspm: Fix wrong boolean ctl value accesses
ALSA: timer: Fix ioctls for X32 ABI
ALSA: timer: Fix broken compat timer user status ioctl
ALSA: rawmidi: Fix ioctls X32 ABI
ALSA: rawmidi: Use comapt_put_timespec()
ALSA: pcm: Fix ioctls for X32 ABI
ALSA: ctl: Fix ioctls for X32 ABI
Linus Torvalds [Sat, 5 Mar 2016 20:34:29 +0000 (12:34 -0800)]
Merge tag 'dmaengine-fix-4.5-rc7' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
"One minor fix on pxa driver to fix the cyclic dma tranfers"
* tag 'dmaengine-fix-4.5-rc7' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pxa_dma: fix cyclic transfers
Linus Torvalds [Sat, 5 Mar 2016 20:32:34 +0000 (12:32 -0800)]
Merge tag 'media/v4.5-4' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- some last time changes before we stablize the new entity function
integer numbers at uAPI
- probe: fix erroneous return value on i2c/adp1653 driver
- fix tx 5v detect regression on adv7604 driver
- fix missing unlock on error in vpfe_prepare_pipeline() on
davinci_vpfe driver
* tag 'media/v4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media: Sanitise the reserved fields of the G_TOPOLOGY IOCTL arguments
[media] media.h: postpone connectors entities
[media] media.h: use hex values for range offsets, move connectors base up.
[media] adv7604: fix tx 5v detect regression
[media] media.h: get rid of MEDIA_ENT_F_CONN_TEST
[media] [for,v4.5] media.h: increase the spacing between function ranges
[media] media: i2c/adp1653: probe: fix erroneous return value
[media] media: davinci_vpfe: fix missing unlock on error in vpfe_prepare_pipeline()
Linus Torvalds [Sat, 5 Mar 2016 02:47:18 +0000 (18:47 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvcimm fix from Dan Williams:
"One straggling fix for NVDIMM support.
The KVM/QEMU enabling for NVDIMMs has recently reached the point where
it is able to accept some ACPI _DSM requests from a guest VM. However
they immediately found that the 4.5-rc kernel is unusable because the
kernel's 'nfit' driver fails to load upon seeing a valid "not
supported" response from the virtual BIOS for an address range scrub
command.
It is not mandatory that a platform implement address range scrubbing,
so this fix from Vishal properly treats the 'not supported' response
as 'skip scrubbing and continue loading the driver'"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: Continue init even if ARS commands are unimplemented
Linus Torvalds [Sat, 5 Mar 2016 02:41:40 +0000 (18:41 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fairly simple fixes.
One is a regression with ipr firmware loading caused by one of the
trivial patches in the last merge window which failed to strip the \n
from the file name string, so now the firmware loader no longer works
leading to a lot of unhappy ipr users; fix by stripping the \n.
The second is a memory leak within SCSI: the BLK_PREP_INVALID state
was introduced a recent fix but we forgot to account for it correctly
when freeing state, resulting in memory leakage. Add the correct
state freeing in scsi_prep_return()"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
ipr: Fix regression when loading firmware
SCSI: Free resources when we return BLKPREP_INVALID
Linus Torvalds [Sat, 5 Mar 2016 02:31:36 +0000 (18:31 -0800)]
Merge branch 'for-4.5-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Assorted fixes for libata drivers.
- Turns out HDIO_GET_32BIT ioctl was subtly broken all along.
- Recent update to ahci external port handling was incorrectly
marking hotpluggable ports as external making userland handle
devices connected to those ports incorrectly.
- ahci_xgene needs its own irq handler to work around a hardware
erratum. libahci updated to allow irq handler override.
- Misc driver specific updates"
* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: ahci: don't mark HotPlugCapable Ports as external/removable
ahci: Workaround for ThunderX Errata#22536
libata: Align ata_device's id on a cacheline
Adding Intel Lewisburg device IDs for SATA
pata-rb532-cf: get rid of the irq_to_gpio() call
libata: fix HDIO_GET_32BIT ioctl
ahci_xgene: Implement the workaround to fix the missing of the edge interrupt for the HOST_IRQ_STAT.
ata: Remove the AHCI_HFLAG_EDGE_IRQ support from libahci.
libahci: Implement the capability to override the generic ahci interrupt handler.
Linus Torvalds [Sat, 5 Mar 2016 02:17:17 +0000 (18:17 -0800)]
Merge branch 'for-linus2' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Round 2 of this. I cut back to the bare necessities, the patch is
still larger than it usually would be at this time, due to the number
of NVMe fixes in there. This pull request contains:
- The 4 core fixes from Ming, that fix both problems with exceeding
the virtual boundary limit in case of merging, and the gap checking
for cloned bio's.
- NVMe fixes from Keith and Christoph:
- Regression on larger user commands, causing problems with
reading log pages (for instance). This touches both NVMe,
and the block core since that is now generally utilized also
for these types of commands.
- Hot removal fixes.
- User exploitable issue with passthrough IO commands, if !length
is given, causing us to fault on writing to the zero
page.
- Fix for a hang under error conditions
- And finally, the current series regression for umount with cgroup
writeback, where the final flush would happen async and hence open
up window after umount where the device wasn't consistent. fsck
right after umount would show this. From Tejun"
* 'for-linus2' of git://git.kernel.dk/linux-block:
block: support large requests in blk_rq_map_user_iov
block: fix blk_rq_get_max_sectors for driver private requests
nvme: fix max_segments integer truncation
nvme: set queue limits for the admin queue
writeback: flush inode cgroup wb switches instead of pinning super_block
NVMe: Fix 0-length integrity payload
NVMe: Don't allow unsupported flags
NVMe: Move error handling to failed reset handler
NVMe: Simplify device reset failure
NVMe: Fix namespace removal deadlock
NVMe: Use IDA for namespace disk naming
NVMe: Don't unmap controller registers on reset
block: merge: get the 1st and last bvec via helpers
block: get the 1st and last bvec via helpers
block: check virt boundary in bio_will_gap()
block: bio: introduce helpers to get the 1st and last bvec
Linus Torvalds [Sat, 5 Mar 2016 02:06:49 +0000 (18:06 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Additional 4.5-rc6 fixes.
I have four patches today. I had previously thought I had submitted
two of them last week, but they were accidentally skipped :-(.
- One fix to an error path in the core
- One fix for RoCE in the core
- Two related fixes for the core/mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/core: Use GRH when the path hop-limit > 0
IB/{core, mlx5}: Fix input len in vendor part of create_qp/srq
IB/mlx5: Avoid using user-index for SRQs
IB/core: Fix missed clean call in registration path
Linus Torvalds [Sat, 5 Mar 2016 01:56:48 +0000 (17:56 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This contains one i915 patch twice, as I merged it locally for
testing, and then pulled some stuff in on top, and then Jani sent to
me, I didn't think it was worth redoing all the merges of what I had
tested.
Summary:
- amdgpu/radeon fixes for some more power management and VM races.
- Two i915 fixes, one for the a recent regression, one another power
management fix for skylake.
- Two tegra dma mask fixes for a regression.
- One ast fix for a typo I made transcribing the userspace driver,
that I'd like to get into stable so I don't forget about it"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
gpu: host1x: Set DMA ops on device creation
gpu: host1x: Set DMA mask
drm/amdgpu: return from atombios_dp_get_dpcd only when error
drm/amdgpu/cz: remove commented out call to enable vce pg
drm/amdgpu/powerplay/cz: enable/disable vce dpm independent of vce pg
drm/amdgpu/cz: enable/disable vce dpm even if vce pg is disabled
drm/amdgpu/gfx8: specify which engine to wait before vm flush
drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
drm/amd/powerplay: send event to notify powerplay all modules are initialized.
drm/amd/powerplay: export AMD_PP_EVENT_COMPLETE_INIT task to amdgpu.
drm/radeon/pm: update current crtc info after setting the powerstate
drm/amdgpu/pm: update current crtc info after setting the powerstate
drm/i915: Balance assert_rpm_wakelock_held() for !IS_ENABLED(CONFIG_PM)
drm/i915/skl: Fix power domain suspend sequence
drm/ast: Fix incorrect register check for DRAM width
drm/i915: Balance assert_rpm_wakelock_held() for !IS_ENABLED(CONFIG_PM)
Linus Torvalds [Sat, 5 Mar 2016 01:51:16 +0000 (17:51 -0800)]
Merge tag 'pm+acpi-4.5-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Two build fixes for cpufreq drivers (including one for breakage
introduced recently) and a fix for a graph tracer crash when used over
suspend-to-RAM on x86.
Specifics:
- Prevent the graph tracer from crashing when used over suspend-to-
RAM on x86 by pausing it before invoking do_suspend_lowlevel() and
un-pausing it when that function has returned (Todd Brandt).
- Fix build issues in the qoriq and mediatek cpufreq drivers related
to broken dependencies on THERMAL (Arnd Bergmann)"
* tag 'pm+acpi-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / sleep / x86: Fix crash on graph trace through x86 suspend
cpufreq: mediatek: allow building as a module
cpufreq: qoriq: allow building as module with THERMAL=m
Linus Torvalds [Sat, 5 Mar 2016 01:43:40 +0000 (17:43 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"Arm64 fix for -rc7. Without it, our struct page array can overflow
the vmemmap region on systems with a large PHYS_OFFSET.
Nothing else on the radar at the moment, so hopefully that's it for
4.5 from us.
Summary: Ensure struct page array fits within vmemmap area"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vmemmap: use virtual projection of linear region
Linus Torvalds [Sat, 5 Mar 2016 01:36:46 +0000 (17:36 -0800)]
Merge tag 'for-linus-
20160304' of git://git.infradead.org/linux-mtd
Pull jffs2 fixes from David Woodhouse:
"This contains two important JFFS2 fixes marked for stable:
- a lock ordering problem between the page lock and the internal
f->sem mutex, which was causing occasional deadlocks in garbage
collection
- a scan failure causing moved directories to sometimes end up
appearing to have hard links.
There are also a couple of trivial MAINTAINERS file updates"
* tag 'for-linus-
20160304' of git://git.infradead.org/linux-mtd:
MAINTAINERS: add maintainer entry for FREESCALE GPMI NAND driver
Fix directory hardlinks from deleted directories
jffs2: Fix page lock / f->sem deadlock
Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
MAINTAINERS: update Han's email
Linus Torvalds [Sat, 5 Mar 2016 01:31:32 +0000 (17:31 -0800)]
Merge branch 'for-linus-4.5' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
"Filipe nailed down a problem where tree log replay would do some work
that orphan code wasn't expecting to be done yet, leading to BUG_ON"
* 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix loading of orphan roots leading to BUG_ON
Linus Torvalds [Sat, 5 Mar 2016 00:57:04 +0000 (16:57 -0800)]
Merge tag 'trace-fixes-v4.5-rc6' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"A feature was added in 4.3 that allowed users to filter trace points
on a tasks "comm" field. But this prevented filtering on a comm field
that is within a trace event (like sched_migrate_task).
When trying to filter on when a program migrated, this change
prevented the filtering of the sched_migrate_task.
To fix this, the event fields are examined first, and then the extra
fields like "comm" and "cpu" are examined. Also, instead of testing
to assign the comm filter function based on the field's name, the
generic comm field is given a new filter type (FILTER_COMM). When
this field is used to filter the type is checked. The same is done
for the cpu filter field.
Two new special filter types are added: "COMM" and "CPU". This allows
users to still filter the tasks comm for events that have "comm" as
one of their fields, in cases that users would like to filter
sched_migrate_task on the comm of the task that called the event, and
not the comm of the task that is being migrated"
* tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Do not have 'comm' filter override event 'comm' field
Vishal Verma [Thu, 3 Mar 2016 22:39:41 +0000 (15:39 -0700)]
nfit: Continue init even if ARS commands are unimplemented
If firmware doesn't implement any of the ARS commands, take that to
mean that ARS is unsupported, and continue to initialize regions without
bad block lists. We cannot make the assumption that ARS commands will be
unconditionally supported on all NVDIMMs.
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Mika Penttilä [Mon, 22 Feb 2016 16:56:52 +0000 (17:56 +0100)]
ARM: 8544/1: set_memory_xx fixes
Allow zero size updates. This makes set_memory_xx() consistent with x86, s390 and arm64 and makes apply_to_page_range() not to BUG() when loading modules.
Signed-off-by: Mika Penttilä mika.penttila@nextfour.com
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Airlie [Fri, 4 Mar 2016 21:53:25 +0000 (07:53 +1000)]
Merge tag 'drm/tegra/for-4.5-rc7' of git://anongit.freedesktop.org/tegra/linux into drm-fixes
drm/tegra: Fixes for v4.5-rc7
Two small fixes that restore PRIME support.
* tag 'drm/tegra/for-4.5-rc7' of git://anongit.freedesktop.org/tegra/linux:
gpu: host1x: Set DMA ops on device creation
gpu: host1x: Set DMA mask
Maciej W. Rozycki [Fri, 4 Mar 2016 01:42:49 +0000 (01:42 +0000)]
MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
Avoid sending a partially initialised `siginfo_t' structure along SIGFPE
signals issued from `do_ov' and `do_trap_or_bp', leading to information
leaking from the kernel stack.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Rafael J. Wysocki [Fri, 4 Mar 2016 21:41:05 +0000 (22:41 +0100)]
Merge branches 'pm-cpufreq-fixes' and 'pm-sleep-fixes'
* pm-cpufreq-fixes:
cpufreq: mediatek: allow building as a module
cpufreq: qoriq: allow building as module with THERMAL=m
* pm-sleep-fixes:
PM / sleep / x86: Fix crash on graph trace through x86 suspend
Yan, Zheng [Sun, 14 Feb 2016 10:06:41 +0000 (18:06 +0800)]
ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
Add support for the format change of MClientReply/MclientCaps.
Also add code that denies access to inodes with pool_ns layouts.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
David S. Miller [Fri, 4 Mar 2016 19:32:47 +0000 (14:32 -0500)]
Merge tag 'linux-can-fixes-for-4.5-
20160304' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-03-04
this is a pull request for net/master.
There is one patch from Ed Spiridonov, which increases the performance of the
mcp251x SPI CAN driver, by avoiding to write to error flag register if it's
unnecessary.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Courbot [Fri, 26 Feb 2016 09:06:53 +0000 (18:06 +0900)]
gpu: host1x: Set DMA ops on device creation
Currently host1x-instanciated devices have their dma_ops left to NULL,
which makes any DMA operation (like buffer import) on ARM64 fallback
to the dummy_dma_ops and fail with an error.
This patch calls of_dma_configure() with the host1x node when creating
such a device, so the proper DMA operations are set.
Suggested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Alexandre Courbot [Fri, 26 Feb 2016 09:06:52 +0000 (18:06 +0900)]
gpu: host1x: Set DMA mask
The default DMA mask covers a 32 bits address range, but host1x devices
can address a larger range on TK1 and TX1. Set the DMA mask to the range
addressable when we use the IOMMU to prevent the use of bounce buffers.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Steven Rostedt (Red Hat) [Thu, 3 Mar 2016 22:18:20 +0000 (17:18 -0500)]
tracing: Do not have 'comm' filter override event 'comm' field
Commit
9f61668073a8d "tracing: Allow triggers to filter for CPU ids and
process names" added a 'comm' filter that will filter events based on the
current tasks struct 'comm'. But this now hides the ability to filter events
that have a 'comm' field too. For example, sched_migrate_task trace event.
That has a 'comm' field of the task to be migrated.
echo 'comm == "bash"' > events/sched_migrate_task/filter
will now filter all sched_migrate_task events for tasks named "bash" that
migrates other tasks (in interrupt context), instead of seeing when "bash"
itself gets migrated.
This fix requires a couple of changes.
1) Change the look up order for filter predicates to look at the events
fields before looking at the generic filters.
2) Instead of basing the filter function off of the "comm" name, have the
generic "comm" filter have its own filter_type (FILTER_COMM). Test
against the type instead of the name to assign the filter function.
3) Add a new "COMM" filter that works just like "comm" but will filter based
on the current task, even if the trace event contains a "comm" field.
Do the same for "cpu" field, adding a FILTER_CPU and a filter "CPU".
Cc: stable@vger.kernel.org # v4.3+
Fixes:
9f61668073a8d "tracing: Allow triggers to filter for CPU ids and process names"
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Ed Spiridonov [Fri, 4 Mar 2016 06:07:27 +0000 (09:07 +0300)]
can: mcp251x: avoid write to error flag register if it's unnecessary
Only two bits (RX0OVR and RX1OVR) are writable in EFLG, write is useless
if these bits aren't set.
Signed-off-by: Ed Spiridonov <edo.rus@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Libin Yang [Fri, 4 Mar 2016 06:33:43 +0000 (14:33 +0800)]
ALSA: hda - hdmi defer to register acomp eld notifier
Defer to register acomp eld notifier until hdmi audio driver
is fully ready.
After registering eld notifier, gfx driver can use this
callback function to notify audio driver the monitor
connection event. However this action may happen when
audio driver is adding the pins or doing other initialization.
This is not always safe, however. For example, using
per_pin->lock before the lock is initialized.
Let's register the eld notifier after the initialization is done.
Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Libin Yang [Fri, 4 Mar 2016 06:33:06 +0000 (14:33 +0800)]
ALSA: hda - hdmi add wmb barrier for audio component
To make sure audio_ptr is set before intel_audio_codec_enable()
or intel_audio_codec_disable() calling pin_eld_notify(),
this patch adds wmb barrier to prevent optimizing.
Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Scott Wood [Thu, 3 Mar 2016 04:51:04 +0000 (22:51 -0600)]
powerpc/fsl-book3e: Avoid lbarx on e5500
lbarx/stbcx. are implemented on e6500, but not on e5500.
Likewise, SMT is on e6500, but not on e5500.
So, avoid executing an unimplemented instruction by only locking
when needed (i.e. in the presence of SMT).
Signed-off-by: Scott Wood <oss@buserror.net>
Dave Airlie [Fri, 4 Mar 2016 03:51:53 +0000 (13:51 +1000)]
Merge tag 'drm-intel-fixes-2016-03-03' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Small conflict as I had the balance in my tree already for testing.
* tag 'drm-intel-fixes-2016-03-03' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Balance assert_rpm_wakelock_held() for !IS_ENABLED(CONFIG_PM)
drm/i915/skl: Fix power domain suspend sequence
Filipe Manana [Wed, 2 Mar 2016 15:49:38 +0000 (15:49 +0000)]
Btrfs: fix loading of orphan roots leading to BUG_ON
When looking for orphan roots during mount we can end up hitting a
BUG_ON() (at root-item.c:btrfs_find_orphan_roots()) if a log tree is
replayed and qgroups are enabled. This is because after a log tree is
replayed, a transaction commit is made, which triggers qgroup extent
accounting which in turn does backref walking which ends up reading and
inserting all roots in the radix tree fs_info->fs_root_radix, including
orphan roots (deleted snapshots). So after the log tree is replayed, when
finding orphan roots we hit the BUG_ON with the following trace:
[118209.182438] ------------[ cut here ]------------
[118209.183279] kernel BUG at fs/btrfs/root-tree.c:314!
[118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse
processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata
virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
[118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G W 4.5.0-rc5-btrfs-next-24+ #1
[118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[118209.186318] task:
ffff8801ec131040 ti:
ffff8800af34c000 task.ti:
ffff8800af34c000
[118209.186318] RIP: 0010:[<
ffffffffa04237d7>] [<
ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318] RSP: 0018:
ffff8800af34faa8 EFLAGS:
00010246
[118209.186318] RAX:
00000000ffffffef RBX:
00000000ffffffef RCX:
0000000000000001
[118209.186318] RDX:
0000000080000000 RSI:
0000000000000001 RDI:
00000000ffffffff
[118209.186318] RBP:
ffff8800af34fb08 R08:
0000000000000001 R09:
0000000000000000
[118209.186318] R10:
ffff8800af34f9f0 R11:
6db6db6db6db6db7 R12:
ffff880171b97000
[118209.186318] R13:
ffff8801ca9d65e0 R14:
ffff8800afa2e000 R15:
0000160000000000
[118209.186318] FS:
00007f5bcb914840(0000) GS:
ffff88023edc0000(0000) knlGS:
0000000000000000
[118209.186318] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[118209.186318] CR2:
00007f5bcaceb5d9 CR3:
00000000b49b5000 CR4:
00000000000006e0
[118209.186318] Stack:
[118209.186318]
fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000
[118209.186318]
fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000
[118209.186318]
0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000
[118209.186318] Call Trace:
[118209.186318] [<
ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs]
[118209.186318] [<
ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs]
[118209.186318] [<
ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318] [<
ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318] [<
ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318] [<
ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs]
[118209.186318] [<
ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
[118209.186318] [<
ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3
[118209.186318] [<
ffffffff8117b87e>] mount_fs+0x67/0x131
[118209.186318] [<
ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
[118209.186318] [<
ffffffff81195637>] do_mount+0x8a6/0x9e8
[118209.186318] [<
ffffffff8119598d>] SyS_mount+0x77/0x9f
[118209.186318] [<
ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b
[118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b
4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00
[118209.186318] RIP [<
ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
[118209.186318] RSP <
ffff8800af34faa8>
[118209.230735] ---[ end trace
83938f987d85d477 ]---
So fix this by not treating the error -EEXIST, returned when attempting
to insert a root already inserted by the backref walking code, as an error.
The following test case for xfstests reproduces the bug:
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"
tmp=/tmp/$$
status=1 # failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15
_cleanup()
{
_cleanup_flakey
cd /
rm -f $tmp.*
}
# get standard environment, filters and checks
. ./common/rc
. ./common/filter
. ./common/dmflakey
# real QA test starts here
_supported_fs btrfs
_supported_os Linux
_require_scratch
_require_dm_target flakey
_require_metadata_journaling $SCRATCH_DEV
rm -f $seqres.full
_scratch_mkfs >>$seqres.full 2>&1
_init_flakey
_mount_flakey
_run_btrfs_util_prog quota enable $SCRATCH_MNT
# Create 2 directories with one file in one of them.
# We use these just to trigger a transaction commit later, moving the file from
# directory a to directory b and doing an fsync against directory a.
mkdir $SCRATCH_MNT/a
mkdir $SCRATCH_MNT/b
touch $SCRATCH_MNT/a/f
sync
# Create our test file with 2 4K extents.
$XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
# Create a snapshot and delete it. This doesn't really delete the snapshot
# immediately, just makes it inaccessible and invisible to user space, the
# snapshot is deleted later by a dedicated kernel thread (cleaner kthread)
# which is woke up at the next transaction commit.
# A root orphan item is inserted into the tree of tree roots, so that if a
# power failure happens before the dedicated kernel thread does the snapshot
# deletion, the next time the filesystem is mounted it resumes the snapshot
# deletion.
_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
_run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap
# Now overwrite half of the extents we wrote before. Because we made a snapshpot
# before, which isn't really deleted yet (since no transaction commit happened
# after we did the snapshot delete request), the non overwritten extents get
# referenced twice, once by the default subvolume and once by the snapshot.
$XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
# Now move file f from directory a to directory b and fsync directory a.
# The fsync on the directory a triggers a transaction commit (because a file
# was moved from it to another directory) and the file fsync leaves a log tree
# with file extent items to replay.
mv $SCRATCH_MNT/a/f $SCRATCH_MNT/a/b
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
echo "File digest before power failure:"
md5sum $SCRATCH_MNT/foobar | _filter_scratch
# Now simulate a power failure and mount the filesystem to replay the log tree.
# After the log tree was replayed, we used to hit a BUG_ON() when processing
# the root orphan item for the deleted snapshot. This is because when processing
# an orphan root the code expected to be the first code inserting the root into
# the fs_info->fs_root_radix radix tree, while in reallity it was the second
# caller attempting to do it - the first caller was the transaction commit that
# took place after replaying the log tree, when updating the qgroup counters.
_flakey_drop_and_remount
echo "File digest before after failure:"
# Must match what he got before the power failure.
md5sum $SCRATCH_MNT/foobar | _filter_scratch
_unmount_flakey
status=0
exit
Fixes:
2d9e97761087 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref")
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Venkat Duvvuru [Wed, 2 Mar 2016 11:00:28 +0000 (06:00 -0500)]
be2net: don't enable multicast flag in be_enable_if_filters() routine
When the interface is opened (in be_open()) the routine
be_enable_if_filters() must be called to switch on the basic filtering
capabilities of an interface that are not changed at run-time.
These include the flags UNTAGGED, BROADCAST and PASS_L3L4_ERRORS.
Other flags such as MULTICAST and PROMISC must be enabled later by
be_set_rx_mode() based on the state in the netdev/adapter struct.
be_enable_if_filters() routine is wrongly trying to enable MULTICAST flag
without checking the current adapter state. This can cause the RX_FILTER
cmds to the FW to fail. This patch fixes this problem by only enabling
the basic filtering flags in be_enable_if_filters().
The VF must be able to issue RX_FILTER cmd with any filter flag, as long
as the PF allowed those flags (if_cap_flags) in the iface it provisioned
for the VF. This rule is applicable even when the VF doesn't have the
FILTMGMT privilege. There is a bug in BE3 FW that wrongly fails RX_FILTER
multicast programming cmds on VFs that don't have FILTMGMT privilege.
This patch also helps in insulating the VF driver from be_open failures due
to the FW bug. A fix for the BE3 FW issue will be available in
versions >= 11.0.283.0 and 10.6.334.0
Reported-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 2 Mar 2016 10:11:10 +0000 (13:11 +0300)]
net: moxa: fix an error code
We accidentally return IS_ERR(priv->base) which is 1 instead of
PTR_ERR(priv->base) which is the error code.
Fixes:
6c821bd9edc9 ('net: Add MOXA ART SoCs ethernet driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nimrod Andy [Wed, 2 Mar 2016 09:24:53 +0000 (17:24 +0800)]
MAINTAINERS: add maintainer entry for FREESCALE FEC ethernet driver
Add a maintainer entry for FREESCALE FEC ethernet driver and add myself
as a maintainer.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 2 Mar 2016 01:32:08 +0000 (02:32 +0100)]
vxlan: fix missing options_len update on RX with collect metadata
When signalling to metadata consumers that the metadata_dst entry
carries additional GBP extension data for vxlan (TUNNEL_VXLAN_OPT),
the dst's vxlan_metadata information is populated, but options_len
is left to zero. F.e. in ovs, ovs_flow_key_extract() checks for
options_len before extracting the data through ip_tunnel_info_opts_get().
Geneve uses ip_tunnel_info_opts_set() helper in receive path, which
sets options_len internally, vxlan however uses ip_tunnel_info_opts(),
so when filling vxlan_metadata, we do need to update options_len.
Fixes:
4c22279848c5 ("ip-tunnel: Use API to access tunnel metadata options.")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Wed, 2 Mar 2016 17:07:14 +0000 (18:07 +0100)]
block: support large requests in blk_rq_map_user_iov
This patch adds support for larger requests in blk_rq_map_user_iov by
allowing it to build multiple bios for a request. This functionality
used to exist for the non-vectored blk_rq_map_user in the past, and
this patch reuses the existing functionality for it on the unmap side,
which stuck around. Thanks to the iov_iter API supporting multiple
bios is fairly trivial, as we can just iterate the iov until we've
consumed the whole iov_iter.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Christoph Hellwig [Thu, 3 Mar 2016 21:43:45 +0000 (14:43 -0700)]
block: fix blk_rq_get_max_sectors for driver private requests
Driver private request types should not get the artifical cap for the
FS requests. This is important to use the full device capabilities
for internal command or NVMe pass through commands.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Updated by me to use an explicit check for the one command type that
does support extended checking, instead of relying on the ordering
of the enum command values - as suggested by Keith.
Signed-off-by: Jens Axboe <axboe@fb.com>
Christoph Hellwig [Wed, 2 Mar 2016 17:07:12 +0000 (18:07 +0100)]
nvme: fix max_segments integer truncation
The block layer uses an unsigned short for max_segments. The way we
calculate the value for NVMe tends to generate very large 32-bit values,
which after integer truncation may lead to a zero value instead of
the desired outcome.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Christoph Hellwig [Wed, 2 Mar 2016 17:07:11 +0000 (18:07 +0100)]
nvme: set queue limits for the admin queue
Factor out a helper to set all the device specific queue limits and apply
them to the admin queue in addition to the I/O queues. Without this the
command size on the admin queue is arbitrarily low, and the missing
other limitations are just minefields waiting for victims.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Tejun Heo [Mon, 29 Feb 2016 23:28:53 +0000 (18:28 -0500)]
writeback: flush inode cgroup wb switches instead of pinning super_block
If cgroup writeback is in use, inodes can be scheduled for
asynchronous wb switching. Before
5ff8eaac1636 ("writeback: keep
superblock pinned during cgroup writeback association switches"), this
could race with umount leading to super_block being destroyed while
inodes are pinned for wb switching.
5ff8eaac1636 fixed it by bumping
s_active while wb switches are in flight; however, this allowed
in-flight wb switches to make umounts asynchronous when the userland
expected synchronosity - e.g. fsck immediately following umount may
fail because the device is still busy.
This patch removes the problematic super_block pinning and instead
makes generic_shutdown_super() flush in-flight wb switches. wb
switches are now executed on a dedicated isw_wq so that they can be
flushed and isw_nr_in_flight keeps track of the number of in-flight wb
switches so that flushing can be avoided in most cases.
v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE
check in inode_switch_wbs() as Jan an Al suggested.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
Fixes:
5ff8eaac1636 ("writeback: keep superblock pinned during cgroup writeback association switches")
Cc: stable@vger.kernel.org #v4.5
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:58 +0000 (09:15 -0700)]
NVMe: Fix 0-length integrity payload
A user could send a passthrough IO command with a metadata pointer to a
namespace without metadata. With metadata length of 0, kmalloc returns
ZERO_SIZE_PTR. Since that is not NULL, the driver would have set this as
the bio's integrity payload, which causes an access fault on completion.
This patch ignores the users metadata buffer if the namespace format
does not support separate metadata.
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:57 +0000 (09:15 -0700)]
NVMe: Don't allow unsupported flags
The command flags can change the meaning of other fields in the command
that the driver is not prepared to handle. Specifically, the user could
passthrough an SGL flag, causing the controller to misinterpret the PRP
list the driver created, potentially corrupting memory or data.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:56 +0000 (09:15 -0700)]
NVMe: Move error handling to failed reset handler
This moves failed queue handling out of the namespace removal path and
into the reset failure path, fixing a hanging condition if the controller
fails or link down during del_gendisk. Previously the driver had to see
the controller as degraded prior to calling del_gendisk to setup the
queues to fail. But, if the controller happened to fail after this,
there was no task to end outstanding requests.
On failure, all namespace states are set to dead. This has capacity
revalidate to 0, and ends all new requests with error status.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:55 +0000 (09:15 -0700)]
NVMe: Simplify device reset failure
A reset failure schedules the device to unbind from the driver through
the pci driver's remove. This cleans up all intialization, so there is
no need to duplicate the potentially racy cleanup.
To help understand why a reset failed, the status is logged with the
existing warning message.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:54 +0000 (09:15 -0700)]
NVMe: Fix namespace removal deadlock
This patch makes nvme namespace removal lockless. It is up to the caller
to ensure no active namespace scanning is occuring. To ensure no scan
work occurs, the nvme pci driver adds a removing state to the controller
device to avoid queueing scan work during removal. The work is flushed
after setting the state, so no new scan work can be queued.
The lockless removal allows the driver to cleanup a namespace
request_queue if the controller fails during removal. Previously this
could deadlock trying to acquire the namespace mutex in order to handle
such events.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:53 +0000 (09:15 -0700)]
NVMe: Use IDA for namespace disk naming
A namespace may be detached from a controller, but a user may be holding
a reference to it. Attaching a new namespace with the same NSID will create
duplicate names when using the NSID to name the disk.
This patch uses an IDA that is released only when the last reference is
released instead of using the namespace ID.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Keith Busch [Wed, 24 Feb 2016 16:15:52 +0000 (09:15 -0700)]
NVMe: Don't unmap controller registers on reset
Unmapping the registers on reset or shutdown is not necessary. Keeping
the mapping simplifies reset handling.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Fri, 26 Feb 2016 15:40:53 +0000 (23:40 +0800)]
block: merge: get the 1st and last bvec via helpers
This patch applies the two introduced helpers to
figure out the 1st and last bvec.
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Fri, 26 Feb 2016 15:40:52 +0000 (23:40 +0800)]
block: get the 1st and last bvec via helpers
This patch applies the two introduced helpers to
figure out the 1st and last bvec, and fixes the
original way after bio splitting.
Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Fri, 26 Feb 2016 15:40:51 +0000 (23:40 +0800)]
block: check virt boundary in bio_will_gap()
In the following patch, the way for figuring out
the last bvec will be changed with a bit cost introduced,
so return immediately if the queue doesn't have virt
boundary limit. Actually most of devices have not
this limit.
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ming Lei [Fri, 26 Feb 2016 15:40:50 +0000 (23:40 +0800)]
block: bio: introduce helpers to get the 1st and last bvec
The bio passed to bio_will_gap() may be fast cloned from upper
layer(dm, md, bcache, fs, ...), or from bio splitting in block
core.
Unfortunately bio_will_gap() just figures out the last bvec via
'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously
wrong.
This patch introduces two helpers for getting the first and last
bvec of one bio for fixing the issue.
Cc: stable@vger.kernel.org
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Wolfram Sang [Tue, 1 Mar 2016 16:37:59 +0000 (17:37 +0100)]
net: ethernet: renesas: sh_eth: don't open code of_device_get_match_data()
This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfram Sang [Tue, 1 Mar 2016 16:37:58 +0000 (17:37 +0100)]
net: ethernet: renesas: ravb_main: don't open code of_device_get_match_data()
This change will also make Coverity happy by avoiding a theoretical NULL
pointer dereference; yet another reason is to use the above helper function
to tighten the code and make it more readable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 1 Mar 2016 16:20:09 +0000 (10:20 -0600)]
ibmvnic: Fix ibmvnic_capability struct
The ibmvnic_capability struct was defined incorrectly. The last two
elements of the struct are in the wrong order. In addition, the number
element should be 64-bit. Byteswapping functions are updated
as well.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 1 Mar 2016 15:15:16 +0000 (16:15 +0100)]
ipv6: re-enable fragment header matching in ipv6_find_hdr
When ipv6_find_hdr is used to find a fragment header
(caller specifies target NEXTHDR_FRAGMENT) we erronously return
-ENOENT for all fragments with nonzero offset.
Before commit
9195bb8e381d, when target was specified, we did not
enter the exthdr walk loop as nexthdr == target so this used to work.
Now we do (so we can skip empty route headers). When we then stumble upon
a frag with nonzero frag_off we must return -ENOENT ("header not found")
only if the caller did not specifically request NEXTHDR_FRAGMENT.
This allows nfables exthdr expression to match ipv6 fragments, e.g. via
nft add rule ip6 filter input frag frag-off gt 0
Fixes:
9195bb8e381d ("ipv6: improve ipv6_find_hdr() to skip empty routing headers")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Tue, 1 Mar 2016 13:31:02 +0000 (14:31 +0100)]
qmi_wwan: add Sierra Wireless EM74xx device ID
The MC74xx and EM74xx modules use different IDs by default, according
to the Lenovo EM7455 driver for Windows.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>