Christoph Paasch [Wed, 27 Sep 2017 00:38:50 +0000 (17:38 -0700)]
net: Set sk_prot_creator when cloning sockets to the right proto
sk->sk_prot and sk->sk_prot_creator can differ when the app uses
IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
Which is why sk_prot_creator is there to make sure that sk_prot_free()
does the kmem_cache_free() on the right kmem_cache slab.
Now, if such a socket gets transformed back to a listening socket (using
connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
sk_clone_lock() when a new connection comes in. But sk_prot_creator will
still point to the IPv6 kmem_cache (as everything got copied in
sk_clone_lock()). When freeing, we will thus put this
memory back into the IPv6 kmem_cache although it was allocated in the
IPv4 cache. I have seen memory corruption happening because of this.
With slub-debugging and MEMCG_KMEM enabled this gives the warning
"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"
A C-program to trigger this:
void main(void)
{
int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
int new_fd, newest_fd, client_fd;
struct sockaddr_in6 bind_addr;
struct sockaddr_in bind_addr4, client_addr1, client_addr2;
struct sockaddr unsp;
int val;
memset(&bind_addr, 0, sizeof(bind_addr));
bind_addr.sin6_family = AF_INET6;
bind_addr.sin6_port = ntohs(42424);
memset(&client_addr1, 0, sizeof(client_addr1));
client_addr1.sin_family = AF_INET;
client_addr1.sin_port = ntohs(42424);
client_addr1.sin_addr.s_addr = inet_addr("127.0.0.1");
memset(&client_addr2, 0, sizeof(client_addr2));
client_addr2.sin_family = AF_INET;
client_addr2.sin_port = ntohs(42421);
client_addr2.sin_addr.s_addr = inet_addr("127.0.0.1");
memset(&unsp, 0, sizeof(unsp));
unsp.sa_family = AF_UNSPEC;
bind(fd, (struct sockaddr *)&bind_addr, sizeof(bind_addr));
listen(fd, 5);
client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
connect(client_fd, (struct sockaddr *)&client_addr1, sizeof(client_addr1));
new_fd = accept(fd, NULL, NULL);
close(fd);
val = AF_INET;
setsockopt(new_fd, SOL_IPV6, IPV6_ADDRFORM, &val, sizeof(val));
connect(new_fd, &unsp, sizeof(unsp));
memset(&bind_addr4, 0, sizeof(bind_addr4));
bind_addr4.sin_family = AF_INET;
bind_addr4.sin_port = ntohs(42421);
bind(new_fd, (struct sockaddr *)&bind_addr4, sizeof(bind_addr4));
listen(new_fd, 5);
client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
connect(client_fd, (struct sockaddr *)&client_addr2, sizeof(client_addr2));
newest_fd = accept(new_fd, NULL, NULL);
close(new_fd);
close(client_fd);
close(new_fd);
}
As far as I can see, this bug has been there since the beginning of the
git-days.
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 26 Sep 2017 18:57:21 +0000 (14:57 -0400)]
net: dsa: mv88e6xxx: lock mutex when freeing IRQs
mv88e6xxx_g2_irq_free locks the registers mutex, but not
mv88e6xxx_g1_irq_free, which results in a stack trace from
assert_reg_lock when unloading the mv88e6xxx module. Fix this.
Fixes:
3460a5770ce9 ("net: dsa: mv88e6xxx: Mask g1 interrupts and free interrupt")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Tue, 26 Sep 2017 16:20:17 +0000 (12:20 -0400)]
packet: only test po->has_vnet_hdr once in packet_snd
Packet socket option po->has_vnet_hdr can be updated concurrently with
other operations if no ring is attached.
Do not test the option twice in packet_snd, as the value may change in
between calls. A race on setsockopt disable may cause a packet > mtu
to be sent without having GSO options set.
Fixes:
bfd5f4a3d605 ("packet: Add GSO/csum offload support.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Tue, 26 Sep 2017 16:19:37 +0000 (12:19 -0400)]
packet: in packet_do_bind, test fanout with bind_lock held
Once a socket has po->fanout set, it remains a member of the group
until it is destroyed. The prot_hook must be constant and identical
across sockets in the group.
If fanout_add races with packet_do_bind between the test of po->fanout
and taking the lock, the bind call may make type or dev inconsistent
with that of the fanout group.
Hold po->bind_lock when testing po->fanout to avoid this race.
I had to introduce artificial delay (local_bh_enable) to actually
observe the race.
Fixes:
dc99f600698d ("packet: Add fanout support.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ed Blake [Tue, 26 Sep 2017 10:44:53 +0000 (11:44 +0100)]
net: stmmac: dwmac4: Re-enable MAC Rx before powering down
Re-enable the MAC receiver by setting CONFIG_RE before powering down,
as instructed in section 6.3.5.1 of [1]. Without this the MAC fails
to receive WoL packets and never wakes up.
[1] DWC Ethernet QoS Databook 4.10a October 2014
Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ed Blake [Tue, 26 Sep 2017 10:43:46 +0000 (11:43 +0100)]
net: stmmac: dwc-qos: Add suspend / resume support
Add hook to stmmac_pltfr_pm_ops for suspend / resume handling.
Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 25 Sep 2017 22:55:53 +0000 (15:55 -0700)]
net: dsa: Fix network device registration order
We cannot be registering the network device first, then setting its
carrier off and finally connecting it to a PHY, doing that leaves a
window during which the carrier is at best inconsistent, and at worse
the device is not usable without a down/up sequence since the network
device is visible to user space with possibly no PHY device attached.
Re-order steps so that they make logical sense. This fixes some devices
where the port was not usable after e.g: an unbind then bind of the
driver.
Fixes:
0071f56e46da ("dsa: Register netdev before phy")
Fixes:
91da11f870f0 ("net: Distributed Switch Architecture protocol support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 25 Sep 2017 21:32:20 +0000 (23:32 +0200)]
net: dsa: mv88e6xxx: Allow dsa and cpu ports in multiple vlans
Ports with the same VLAN must all be in the same bridge. However the
CPU and DSA ports need to be in multiple VLANs spread over multiple
bridges. So exclude them when performing this test.
Fixes:
b2f81d304cee ("net: dsa: add CPU and DSA ports as VLAN members")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 25 Sep 2017 15:40:02 +0000 (08:40 -0700)]
inetpeer: fix RCU lookup() again
My prior fix was not complete, as we were dereferencing a pointer
three times per node, not twice as I initially thought.
Fixes:
4cc5b44b29a9 ("inetpeer: fix RCU lookup()")
Fixes:
b145425f269a ("inetpeer: remove AVL implementation in favor of RB tree")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Sep 2017 16:33:51 +0000 (09:33 -0700)]
Merge branch 'mvpp2-various-fixes'
Antoine Tenart says:
====================
net: mvpp2: various fixes
This series contains 3 fixes for the Marvell PPv2 driver.
Since v1:
- Removed one patch about dma masks as it would need a better fix.
- Added one fix about the MAC Tx clock source selection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 25 Sep 2017 12:59:48 +0000 (14:59 +0200)]
net: mvpp2: do not select the internal source clock
This patch stops the internal MAC Tx clock from being enabled as the
internal clock isn't used. The definition used for the bit controlling
this behaviour is renamed as well as it was wrongly named (bit 4 of
GMAC_CTRL_2_REG).
Fixes:
3919357fb0bb ("net: mvpp2: initialize the GMAC when using a port")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan Markman [Mon, 25 Sep 2017 12:59:47 +0000 (14:59 +0200)]
net: mvpp2: fix port list indexing
The private port_list array has a list of pointers to mvpp2_port
instances. This list is allocated given the number of ports enabled in
the device tree, but the pointers are set using the port-id property. If
on a single port is enabled, the port_list array will be of size 1, but
when registering the port, if its id is not 0 the driver will crash.
Other crashes were encountered in various situations.
This fixes the issue by using an index not equal to the value of the
port-id property.
Fixes:
3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Chulski [Mon, 25 Sep 2017 12:59:46 +0000 (14:59 +0200)]
net: mvpp2: fix parsing fragmentation detection
Parsing fragmentation detection failed due to wrong configured
parser TCAM entry's. Some traffic was marked as fragmented in RX
descriptor, even it wasn't IP fragmented. The hardware also failed to
calculate checksums which lead to use software checksum and caused
performance degradation.
Fixes:
3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Potapenko [Thu, 28 Sep 2017 09:32:37 +0000 (11:32 +0200)]
tun: bail out from tun_get_user() if the skb is empty
KMSAN (https://github.com/google/kmsan) reported accessing uninitialized
skb->data[0] in the case the skb is empty (i.e. skb->len is 0):
================================================
BUG: KMSAN: use of uninitialized memory in tun_get_user+0x19ba/0x3770
CPU: 0 PID: 3051 Comm: probe Not tainted 4.13.0+ #3140
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
...
__msan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:477
tun_get_user+0x19ba/0x3770 drivers/net/tun.c:1301
tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365
call_write_iter ./include/linux/fs.h:1743
new_sync_write fs/read_write.c:457
__vfs_write+0x6c3/0x7f0 fs/read_write.c:470
vfs_write+0x3e4/0x770 fs/read_write.c:518
SYSC_write+0x12f/0x2b0 fs/read_write.c:565
SyS_write+0x55/0x80 fs/read_write.c:557
do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284
entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:245
...
origin:
...
kmsan_poison_shadow+0x6e/0xc0 mm/kmsan/kmsan.c:211
slab_alloc_node mm/slub.c:2732
__kmalloc_node_track_caller+0x351/0x370 mm/slub.c:4351
__kmalloc_reserve net/core/skbuff.c:138
__alloc_skb+0x26a/0x810 net/core/skbuff.c:231
alloc_skb ./include/linux/skbuff.h:903
alloc_skb_with_frags+0x1d7/0xc80 net/core/skbuff.c:4756
sock_alloc_send_pskb+0xabf/0xfe0 net/core/sock.c:2037
tun_alloc_skb drivers/net/tun.c:1144
tun_get_user+0x9a8/0x3770 drivers/net/tun.c:1274
tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365
call_write_iter ./include/linux/fs.h:1743
new_sync_write fs/read_write.c:457
__vfs_write+0x6c3/0x7f0 fs/read_write.c:470
vfs_write+0x3e4/0x770 fs/read_write.c:518
SYSC_write+0x12f/0x2b0 fs/read_write.c:565
SyS_write+0x55/0x80 fs/read_write.c:557
do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284
return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:245
================================================
Make sure tun_get_user() doesn't touch skb->data[0] unless there is
actual data.
C reproducer below:
==========================
// autogenerated by syzkaller (http://github.com/google/syzkaller)
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/if_tun.h>
#include <netinet/ip.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
int main()
{
int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_IP);
int tun_fd = open("/dev/net/tun", O_RDWR);
struct ifreq req;
memset(&req, 0, sizeof(struct ifreq));
strcpy((char*)&req.ifr_name, "gre0");
req.ifr_flags = IFF_UP | IFF_MULTICAST;
ioctl(tun_fd, TUNSETIFF, &req);
ioctl(sock, SIOCSIFFLAGS, "gre0");
write(tun_fd, "hi", 0);
return 0;
}
==========================
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Mon, 25 Sep 2017 10:19:26 +0000 (13:19 +0300)]
sctp: Fix a big endian bug in sctp_diag_dump()
The sctp_for_each_transport() function takes an pointer to int. The
cb->args[] array holds longs so it's only using the high 32 bits. It
works on little endian system but will break on big endian 64 bit
machines.
Fixes:
d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 Sep 2017 03:21:46 +0000 (20:21 -0700)]
Merge tag 'wireless-drivers-for-davem-2017-09-25' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.14
Quite a lot of fixes this time. Most notable is the brcmfmac fix for a
CVE issue.
iwlwifi
* a couple of bugzilla bugs related to multicast handling
* two fixes for WoWLAN bugs that were causing queue hangs and
re-initialization problems
* two fixes for potential uninitialized variable use reported by Dan
Carpenter in relation to a recently introduced patch
* a fix for buffer reordering in the newly supported 9000 device
family
* fix a race when starting aggregation
* small fix for a recent patch to wake mac80211 queues
* send non-bufferable management frames in the generic queue so they
are not sent on queues that are under power-save
ath10k
* fix a PCI PM related gcc warning
brcmfmac
* CVE-2017-0786: add length check scan results from firmware
* respect passive scan requests from user space
qtnfmac
* fix race in tx path when using multiple interfaces
* cancel ongoing scan when removing the wireless interface
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Sep 2017 20:44:32 +0000 (13:44 -0700)]
Merge branch 'aquantia-fixes'
Igor Russkikh says:
====================
aquantia: Atlantic driver bugfixes und improvements
This series contains bugfixes for aQuantia Atlantic driver.
Changes in v2:
Review comments applied:
- min_mtu set removed
- extra mtu range check is removed
- err codes handling improved
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Belous [Mon, 25 Sep 2017 07:48:50 +0000 (10:48 +0300)]
atlantic: fix iommu errors
Call skb_frag_dma_map multiple times if tx length is greater than
device max and avoid processing tx ring until entire packet has been
sent.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Mon, 25 Sep 2017 07:48:49 +0000 (10:48 +0300)]
aquantia: Fix transient invalid link down/up indications
Due to a bug in aquantia atlantic card firmware, it sometimes reports
invalid link speed bits. That caused driver to report link down events,
although link itself is totally fine.
This patch ignores such out of blue readings.
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Mon, 25 Sep 2017 07:48:48 +0000 (10:48 +0300)]
aquantia: Fix Tx queue hangups
Driver did a poor job in managing its Tx queues: Sometimes it could stop
tx queues due to link down condition in aq_nic_xmit - but never waked up
them. That led to Tx path total suspend.
This patch fixes this and improves generic queue management:
- introduces queue restart counter
- uses generic netif_ interface to disable and enable tx path
- refactors link up/down condition and introduces dmesg log event when
link changes.
- introduces new constant for minimum descriptors count required for queue
wakeup
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Mon, 25 Sep 2017 07:48:47 +0000 (10:48 +0300)]
aquantia: Setup max_mtu in ndev to enable jumbo frames
Although hardware is capable for almost 16K MTU, without max_mtu field
correctly set it only allows standard MTU to be used.
This patch enables max MTU, calculating it from hardware maximum frame size
of 16352 octets (including FCS).
Fixes:
5513e16421cb ("net: ethernet: aquantia: Fixes for aq_ndev_change_mtu")
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Tue, 26 Sep 2017 14:16:43 +0000 (16:16 +0200)]
l2tp: fix race condition in l2tp_tunnel_delete
If we try to delete the same tunnel twice, the first delete operation
does a lookup (l2tp_tunnel_get), finds the tunnel, calls
l2tp_tunnel_delete, which queues it for deletion by
l2tp_tunnel_del_work.
The second delete operation also finds the tunnel and calls
l2tp_tunnel_delete. If the workqueue has already fired and started
running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
same tunnel a second time, and try to free the socket again.
Add a dead flag to prevent firing the workqueue twice. Then we can
remove the check of queue_work's result that was meant to prevent that
race but doesn't.
Reproducer:
ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000
ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000
ip link set l2tp1 up
ip l2tp del tunnel tunnel_id 3000
ip l2tp del tunnel tunnel_id 3000
Fixes:
f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kodanev [Tue, 26 Sep 2017 12:14:29 +0000 (15:14 +0300)]
vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit
When running LTP IPsec tests, KASan might report:
BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
Read of size 4 at addr
ffff880dc6ad1980 by task swapper/0/0
...
Call Trace:
<IRQ>
dump_stack+0x63/0x89
print_address_description+0x7c/0x290
kasan_report+0x28d/0x370
? vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
__asan_report_load4_noabort+0x19/0x20
vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
? vti_init_net+0x190/0x190 [ip_vti]
? save_stack_trace+0x1b/0x20
? save_stack+0x46/0xd0
dev_hard_start_xmit+0x147/0x510
? icmp_echo.part.24+0x1f0/0x210
__dev_queue_xmit+0x1394/0x1c60
...
Freed by task 0:
save_stack_trace+0x1b/0x20
save_stack+0x46/0xd0
kasan_slab_free+0x70/0xc0
kmem_cache_free+0x81/0x1e0
kfree_skbmem+0xb1/0xe0
kfree_skb+0x75/0x170
kfree_skb_list+0x3e/0x60
__dev_queue_xmit+0x1298/0x1c60
dev_queue_xmit+0x10/0x20
neigh_resolve_output+0x3a8/0x740
ip_finish_output2+0x5c0/0xe70
ip_finish_output+0x4ba/0x680
ip_output+0x1c1/0x3a0
xfrm_output_resume+0xc65/0x13d0
xfrm_output+0x1e4/0x380
xfrm4_output_finish+0x5c/0x70
Can be fixed if we get skb->len before dst_output().
Fixes:
b9959fd3b0fa ("vti: switch to new ip tunnel code")
Fixes:
22e1b23dafa8 ("vti6: Support inter address family tunneling.")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 22 Sep 2017 21:29:19 +0000 (23:29 +0200)]
netlink: fix nla_put_{u8,u16,u32} for KASAN
When CONFIG_KASAN is enabled, the "--param asan-stack=1" causes rather large
stack frames in some functions. This goes unnoticed normally because
CONFIG_FRAME_WARN is disabled with CONFIG_KASAN by default as of commit
3f181b4d8652 ("lib/Kconfig.debug: disable -Wframe-larger-than warnings with
KASAN=y").
The kernelci.org build bot however has the warning enabled and that led
me to investigate it a little further, as every build produces these warnings:
net/wireless/nl80211.c:4389:1: warning: the frame size of 2240 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/wireless/nl80211.c:1895:1: warning: the frame size of 3776 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/wireless/nl80211.c:1410:1: warning: the frame size of 2208 bytes is larger than 2048 bytes [-Wframe-larger-than=]
net/bridge/br_netlink.c:1282:1: warning: the frame size of 2544 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Most of this problem is now solved in gcc-8, which can consolidate
the stack slots for the inline function arguments. On older compilers
we can add a workaround by declaring a local variable in each function
to pass the inline function argument.
Cc: stable@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 22 Sep 2017 21:29:18 +0000 (23:29 +0200)]
rocker: fix rocker_tlv_put_* functions for KASAN
Inlining these functions creates lots of stack variables that each take
64 bytes when KASAN is enabled, leading to this warning about potential
stack overflow:
drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_cmd_flow_tbl_add':
drivers/net/ethernet/rocker/rocker_ofdpa.c:621:1: error: the frame size of 2752 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
gcc-8 can now consolidate the stack slots itself, but on older versions
we get the same behavior by using a temporary variable that holds a
copy of the inline function argument.
Cc: stable@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Fri, 22 Sep 2017 20:32:44 +0000 (15:32 -0500)]
net: qcom/emac: specify the correct size when mapping a DMA buffer
When mapping the RX DMA buffers, the driver was accidentally specifying
zero for the buffer length. Under normal circumstances, SWIOTLB does not
need to allocate a bounce buffer, so the address is just mapped without
checking the size field. This is why the error was not detected earlier.
Fixes:
b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Cc: stable@vger.kernel.org
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Sep 2017 21:44:41 +0000 (14:44 -0700)]
Merge branch 'l2tp-fix-some-races-in-session-deletion'
Guillaume Nault says:
====================
l2tp: fix some races in session deletion
L2TP provides several interfaces for deleting sessions. Using two of
them concurrently can lead to use-after-free bugs.
Patch #2 uses a flag to prevent double removal of L2TP sessions.
Patch #1 fixes a bug found in the way. Fixing this bug is also
necessary for patch #2 to handle all cases.
This issue is similar to the tunnel deletion bug being worked on by
Sabrina: https://patchwork.ozlabs.org/patch/814173/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Fri, 22 Sep 2017 13:39:24 +0000 (15:39 +0200)]
l2tp: fix race between l2tp_session_delete() and l2tp_tunnel_closeall()
There are several ways to remove L2TP sessions:
* deleting a session explicitly using the netlink interface (with
L2TP_CMD_SESSION_DELETE),
* deleting the session's parent tunnel (either by closing the
tunnel's file descriptor or using the netlink interface),
* closing the PPPOL2TP file descriptor of a PPP pseudo-wire.
In some cases, when these methods are used concurrently on the same
session, the session can be removed twice, leading to use-after-free
bugs.
This patch adds a 'dead' flag, used by l2tp_session_delete() and
l2tp_tunnel_closeall() to prevent them from stepping on each other's
toes.
The session deletion path used when closing a PPPOL2TP file descriptor
doesn't need to be adapted. It already has to ensure that a session
remains valid for the lifetime of its PPPOL2TP file descriptor.
So it takes an extra reference on the session in the ->session_close()
callback (pppol2tp_session_close()), which is eventually dropped
in the ->sk_destruct() callback of the PPPOL2TP socket
(pppol2tp_session_destruct()).
Still, __l2tp_session_unhash() and l2tp_session_queue_purge() can be
called twice and even concurrently for a given session, but thanks to
proper locking and re-initialisation of list fields, this is not an
issue.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Fri, 22 Sep 2017 13:39:23 +0000 (15:39 +0200)]
l2tp: ensure sessions are freed after their PPPOL2TP socket
If l2tp_tunnel_delete() or l2tp_tunnel_closeall() deletes a session
right after pppol2tp_release() orphaned its socket, then the 'sock'
variable of the pppol2tp_session_close() callback is NULL. Yet the
session is still used by pppol2tp_release().
Therefore we need to take an extra reference in any case, to prevent
l2tp_tunnel_delete() or l2tp_tunnel_closeall() from freeing the session.
Since the pppol2tp_session_close() callback is only set if the session
is associated to a PPPOL2TP socket and that both l2tp_tunnel_delete()
and l2tp_tunnel_closeall() hold the PPPOL2TP socket before calling
pppol2tp_session_close(), we're sure that pppol2tp_session_close() and
pppol2tp_session_destruct() are paired and called in the right order.
So the reference taken by the former will be released by the later.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo [Mon, 25 Sep 2017 07:06:12 +0000 (10:06 +0300)]
Merge ath-current from ath.git
ath.git fixes for 4.14. Major changes:
ath10k
* fix a PCI PM related gcc warning
Subash Abhinov Kasiviswanathan [Fri, 22 Sep 2017 00:00:36 +0000 (18:00 -0600)]
net: qualcomm: rmnet: Fix rcu splat in rmnet_is_real_dev_registered
Xiaolong reported a suspicious rcu_dereference_check in the device
unregister notifier callback. Since we do not dereference the
rx_handler_data, it's ok to just check for the value of the pointer.
Note that this section is already protected by rtnl_lock.
[ 101.364846] WARNING: suspicious RCU usage
[ 101.365654]
4.13.0-rc6-01701-gceed73a #1 Not tainted
[ 101.370873] -----------------------------
[ 101.372472] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:57 suspicious rcu_dereference_check() usage!
[ 101.374427]
[ 101.374427] other info that might help us debug this:
[ 101.374427]
[ 101.387491]
[ 101.387491] rcu_scheduler_active = 2, debug_locks = 1
[ 101.389368] 1 lock held by trinity-main/2809:
[ 101.390736] #0: (rtnl_mutex){+.+.+.}, at: [<
8146085b>] rtnl_lock+0xf/0x11
[ 101.395482]
[ 101.395482] stack backtrace:
[ 101.396948] CPU: 0 PID: 2809 Comm: trinity-main Not tainted
4.13.0-rc6-01701-gceed73a #1
[ 101.398857] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 101.401079] Call Trace:
[ 101.401656] dump_stack+0xa1/0xeb
[ 101.402871] lockdep_rcu_suspicious+0xc7/0xd0
[ 101.403665] rmnet_is_real_dev_registered+0x40/0x4e
[ 101.405199] rmnet_config_notify_cb+0x2c/0x142
[ 101.406344] ? wireless_nlevent_flush+0x47/0x71
[ 101.407385] notifier_call_chain+0x2d/0x47
[ 101.408645] raw_notifier_call_chain+0xc/0xe
[ 101.409882] call_netdevice_notifiers_info+0x41/0x49
[ 101.411402] call_netdevice_notifiers+0xc/0xe
[ 101.412713] rollback_registered_many+0x268/0x36e
[ 101.413702] rollback_registered+0x39/0x56
[ 101.414965] unregister_netdevice_queue+0x79/0x88
[ 101.415908] unregister_netdev+0x16/0x1d
Fixes:
ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Thu, 21 Sep 2017 23:01:11 +0000 (01:01 +0200)]
cnic: Fix an error handling path in 'cnic_alloc_bnx2x_resc()'
All the error handling paths 'goto error', except this one.
We should also go to error in this case, or some resources will be
leaking.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 23 Sep 2017 16:14:06 +0000 (06:14 -1000)]
Merge branch 'parisc-4.14-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
- Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert
multiple byte-accesses into one word-access.
- Add missing HWPOISON page fault handler code. I completely missed
that when I added HWPOISON support during this merge window and it
only showed up now with the madvise07 LTP test case.
- Fix backtrace unwinding to stop when stack start has been reached.
- Issue warning if initrd has been loaded into memory regions with
broken RAM modules.
- Fix HPMC handler (parisc hardware fault handler) to comply with
architecture specification.
- Avoid compiler warnings about too large frame sizes.
- Minor init-section fixes.
* 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Unbreak bootloader due to gcc-7 optimizations
parisc: Reintroduce option to gzip-compress the kernel
parisc: Add HWPOISON page fault handler code
parisc: Move init_per_cpu() into init section
parisc: Check if initrd was loaded into broken RAM
parisc: Add PDCE_CHECK instruction to HPMC handler
parisc: Add wrapper for pdc_instr() firmware function
parisc: Move start_parisc() into init section
parisc: Stop unwinding at start of stack
parisc: Fix too large frame size warnings
Linus Torvalds [Sat, 23 Sep 2017 15:47:04 +0000 (05:47 -1000)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
- Smattering of miscellanous fixes
- A five patch series for i40iw that had a patch (5/5) that was larger
than I would like, but I took it because it's needed for large scale
users
- An 8 patch series for bnxt_re that landed right as I was leaving on
PTO and so had to wait until now...they are all appropriate fixes for
-rc IMO
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (22 commits)
bnxt_re: Don't issue cmd to delete GID for QP1 GID entry before the QP is destroyed
bnxt_re: Fix memory leak in FRMR path
bnxt_re: Remove RTNL lock dependency in bnxt_re_query_port
bnxt_re: Fix race between the netdev register and unregister events
bnxt_re: Free up devices in module_exit path
bnxt_re: Fix compare and swap atomic operands
bnxt_re: Stop issuing further cmds to FW once a cmd times out
bnxt_re: Fix update of qplib_qp.mtu when modified
i40iw: Add support for port reuse on active side connections
i40iw: Add missing VLAN priority
i40iw: Call i40iw_cm_disconn on modify QP to disconnect
i40iw: Prevent multiple netdev event notifier registrations
i40iw: Fail open if there are no available MSI-X vectors
RDMA/vmw_pvrdma: Fix reporting correct opcodes for completion
IB/bnxt_re: Fix frame stack compilation warning
IB/mlx5: fix debugfs cleanup
IB/ocrdma: fix incorrect fall-through on switch statement
IB/ipoib: Suppress the retry related completion errors
iw_cxgb4: remove the stid on listen create failure
iw_cxgb4: drop listen destroy replies if no ep found
...
Linus Torvalds [Sat, 23 Sep 2017 15:41:27 +0000 (05:41 -1000)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix NAPI poll list corruption in enic driver, from Christian
Lamparter.
2) Fix route use after free, from Eric Dumazet.
3) Fix regression in reuseaddr handling, from Josef Bacik.
4) Assert the size of control messages in compat handling since we copy
it in from userspace twice. From Meng Xu.
5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
from Ursula Braun.
6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.
7) Don't use ARRAY_SIZE on spinlock array which might have zero
entries, from Geert Uytterhoeven.
8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
Kasiviswanathan.
9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
Long.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
inet: fix improper empty comparison
net: use inet6_rcv_saddr to compare sockets
net: set tb->fast_sk_family
net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
MAINTAINERS: update git tree locations for ieee802154 subsystem
net: prevent dst uses after free
net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
net/smc: no close wait in case of process shut down
net/smc: introduce a delay
net/smc: terminate link group if out-of-sync is received
net/smc: longer delay for client link group removal
net/smc: adapt send request completion notification
net/smc: adjust net_device refcount
net/smc: take RCU read lock for routing cache lookup
net/smc: add receive timeout check
net/smc: add missing dev_put
net: stmmac: Cocci spatch "of_table"
lan78xx: Use default values loaded from EEPROM/OTP after reset
lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
lan78xx: Fix for eeprom read/write when device auto suspend
...
Linus Torvalds [Sat, 23 Sep 2017 15:33:29 +0000 (05:33 -1000)]
Merge tag 'apparmor-pr-2017-09-22' of git://git./linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"This is the apparmor pull request, similar to SELinux and seccomp.
It's the same series that I was sent to James' security tree + one
regression fix that was found after the series was sent to James and
would have been sent for v4.14-rc2.
Features:
- in preparation for secid mapping add support for absolute root view
based labels
- add base infastructure for socket mediation
- add mount mediation
- add signal mediation
minor cleanups and changes:
- be defensive, ensure unconfined profiles have dfas initialized
- add more debug asserts to apparmorfs
- enable policy unpacking to audit different reasons for failure
- cleanup conditional check for label in label_print
- Redundant condition: prev_ns. in [label.c:1498]
Bug Fixes:
- fix regression in apparmorfs DAC access permissions
- fix build failure on sparc caused by undeclared signals
- fix sparse report of incorrect type assignment when freeing label proxies
- fix race condition in null profile creation
- Fix an error code in aafs_create()
- Fix logical error in verify_header()
- Fix shadowed local variable in unpack_trans_table()"
* tag 'apparmor-pr-2017-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix apparmorfs DAC access permissions
apparmor: fix build failure on sparc caused by undeclared signals
apparmor: fix incorrect type assignment when freeing proxies
apparmor: ensure unconfined profiles have dfas initialized
apparmor: fix race condition in null profile creation
apparmor: move new_null_profile to after profile lookup fns()
apparmor: add base infastructure for socket mediation
apparmor: add more debug asserts to apparmorfs
apparmor: make policy_unpack able to audit different info messages
apparmor: add support for absolute root view based labels
apparmor: cleanup conditional check for label in label_print
apparmor: add mount mediation
apparmor: add the ability to mediate signals
apparmor: Redundant condition: prev_ns. in [label.c:1498]
apparmor: Fix an error code in aafs_create()
apparmor: Fix logical error in verify_header()
apparmor: Fix shadowed local variable in unpack_trans_table()
Linus Torvalds [Sat, 23 Sep 2017 03:40:11 +0000 (17:40 -1000)]
Merge tag 'acpi-4.14-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the initialization of resources in the ACPI WDAT watchdog
driver, a recent regression in the ACPI device properties handling, a
recent change in behavior causing the ACPI_HANDLE() macro to only work
for GPL code and create a MAINTAINERS entry for ACPI PMIC drivers in
order to specify the official reviewers for that code.
Specifics:
- Fix the initialization of resources in the ACPI WDAT watchdog
driver that uses unititialized memory which causes compiler
warnings to be triggered (Arnd Bergmann).
- Fix a recent regression in the ACPI device properties handling that
causes some device properties data to be skipped during enumeration
(Sakari Ailus).
- Fix a recent change in behavior that caused the ACPI_HANDLE() macro
to stop working for non-GPL code which is a problem for the NVidia
binary graphics driver, for example (John Hubbard).
- Add a MAINTAINERS entry for the ACPI PMIC drivers to specify the
official reviewers for that code (Rafael Wysocki)"
* tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly
ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again
ACPI / watchdog: properly initialize resources
ACPI / PMIC: Add code reviewers to MAINTAINERS
David S. Miller [Sat, 23 Sep 2017 03:33:18 +0000 (20:33 -0700)]
Merge branch 'net-fix-reuseaddr-regression'
Josef Bacik says:
====================
net: fix reuseaddr regression
I introduced a regression when reworking the fastreuse port stuff that allows
bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
The root cause is I reversed an if statement which caused us to set the tb as if
there were no owners on the socket if there were, which obviously is not
correct.
Dave could you please queue these changes up for -stable, I've run them through
the net tests and added another test to check for this problem specifically.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Josef Bacik [Sat, 23 Sep 2017 00:20:08 +0000 (20:20 -0400)]
inet: fix improper empty comparison
When doing my reuseport rework I screwed up and changed a
if (hlist_empty(&tb->owners))
to
if (!hlist_empty(&tb->owners))
This is obviously bad as all of the reuseport/reuse logic was reversed,
which caused weird problems like allowing an ipv4 bind conflict if we
opened an ipv4 only socket on a port followed by an ipv6 only socket on
the same port.
Fixes:
b9470c27607b ("inet: kill smallest_size and smallest_port")
Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Josef Bacik [Sat, 23 Sep 2017 00:20:07 +0000 (20:20 -0400)]
net: use inet6_rcv_saddr to compare sockets
In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the
ipv6 compare with the fast socket information to make sure we're doing
the proper comparisons.
Fixes:
637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Josef Bacik [Sat, 23 Sep 2017 00:20:06 +0000 (20:20 -0400)]
net: set tb->fast_sk_family
We need to set the tb->fast_sk_family properly so we can use the proper
comparison function for all subsequent reuseport bind requests.
Fixes:
637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 22 Sep 2017 23:42:37 +0000 (19:42 -0400)]
net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit
1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().
Add the missing skb_orphan_frags_rx.
Changes
v1->v2: handle skb_orphan_frags_rx failure
Fixes:
1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 23 Sep 2017 03:28:59 +0000 (17:28 -1000)]
Merge tag 'pm-4.14-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a cpufreq regression introduced by recent changes related to
the generic DT driver, an initialization time memory leak in cpuidle
on ARM, a PM core bug that may cause system suspend/resume to fail on
some systems, a request type validation issue in the PM QoS framework
and two documentation-related issues.
Specifics:
- Fix a regression in cpufreq on systems using DT as the source of
CPU configuration information where two different code paths
attempt to create the cpufreq-dt device object (there can be only
one) and fix up the "compatible" matching for some TI platforms on
top of that (Viresh Kumar, Dave Gerlach).
- Fix an initialization time memory leak in cpuidle on ARM which
occurs if the cpuidle driver initialization fails (Stefan Wahren).
- Fix a PM core function that checks whether or not there are any
system suspend/resume callbacks for a device, but forgets to check
legacy callbacks which then may be skipped incorrectly and the
system may crash and/or the device may become unusable after a
suspend-resume cycle (Rafael Wysocki).
- Fix request type validation for latency tolerance PM QoS requests
which may lead to unexpected behavior (Jan Schönherr).
- Fix a broken link to PM documentation from a header file and a typo
in a PM document (Geert Uytterhoeven, Rafael Wysocki)"
* tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: ti-cpufreq: Support additional am43xx platforms
ARM: cpuidle: Avoid memleak if init fail
cpufreq: dt-platdev: Add some missing platforms to the blacklist
PM: core: Fix device_pm_check_callbacks()
PM: docs: Drop an excess character from devices.rst
PM / QoS: Use the correct variable to check the QoS request type
driver core: Fix link to device power management documentation
Linus Torvalds [Sat, 23 Sep 2017 03:23:41 +0000 (17:23 -1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- fixes for two long standing issues (lock up and a crash) in force
feedback handling in uinput driver
- tweak to firmware update timing in Elan I2C touchpad driver.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - extend Flash-Write delay
Input: uinput - avoid crash when sending FF request to device going away
Input: uinput - avoid FF flush when destroying device
Linus Torvalds [Sat, 23 Sep 2017 02:16:41 +0000 (16:16 -1000)]
Merge tag 'seccomp-v4.14-rc2' of git://git./linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
"Major additions:
- sysctl and seccomp operation to discover available actions
(tyhicks)
- new per-filter configurable logging infrastructure and sysctl
(tyhicks)
- SECCOMP_RET_LOG to log allowed syscalls (tyhicks)
- SECCOMP_RET_KILL_PROCESS as the new strictest possible action
- self-tests for new behaviors"
[ This is the seccomp part of the security pull request during the merge
window that was nixed due to unrelated problems - Linus ]
* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
samples: Unrename SECCOMP_RET_KILL
selftests/seccomp: Test thread vs process killing
seccomp: Implement SECCOMP_RET_KILL_PROCESS action
seccomp: Introduce SECCOMP_RET_KILL_PROCESS
seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
seccomp: Action to log before allowing
seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
seccomp: Selftest for detection of filter flag support
seccomp: Sysctl to configure actions that are allowed to be logged
seccomp: Operation for checking if an action is available
seccomp: Sysctl to display available actions
seccomp: Provide matching filter for introspection
selftests/seccomp: Refactor RET_ERRNO tests
selftests/seccomp: Add simple seccomp overhead benchmark
selftests/seccomp: Add tests for basic ptrace actions
Linus Torvalds [Sat, 23 Sep 2017 02:11:48 +0000 (16:11 -1000)]
Merge tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Various SMB3 fixes for stable and security improvements from the
recently completed SMB3/Samba test events
* tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6:
SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
SMB3: handle new statx fields
SMB: Validate negotiate (to protect against downgrade) even if signing off
cifs: release auth_key.response for reconnect.
cifs: release cifs root_cred after exit_cifs
CIFS: make arrays static const, reduces object code size
[SMB3] Update session and share information displayed for debugging SMB2/SMB3
cifs: show 'soft' in the mount options for hard mounts
SMB3: Warn user if trying to sign connection that authenticated as guest
SMB3: Fix endian warning
Fix SMB3.1.1 guest authentication to Samba
Linus Torvalds [Sat, 23 Sep 2017 02:09:31 +0000 (16:09 -1000)]
Merge tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Two small but important fixes: RADOS semantic change in upcoming v12.2.1
release and a rare NULL dereference in create_session_open_msg()"
* tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client:
ceph: avoid panic in create_session_open_msg() if utsname() returns NULL
libceph: don't allow bidirectional swap of pg-upmap-items
Stefan Schmidt [Fri, 22 Sep 2017 12:28:46 +0000 (14:28 +0200)]
MAINTAINERS: update git tree locations for ieee802154 subsystem
Patches for ieee802154 will go through my new trees towards netdev from
now on. The 6LoWPAN subsystem will stay as is (shared between ieee802154
and bluetooth) and go through the bluetooth tree as usual.
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve French [Fri, 22 Sep 2017 06:40:27 +0000 (01:40 -0500)]
SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Linus Torvalds [Fri, 22 Sep 2017 23:09:11 +0000 (13:09 -1000)]
Merge tag 'pci-v4.14-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix endpoint "end of test" interrupt issue (introduced in v4.14-rc1)
(John Keeping)
- fix MIPS use-after-free map_irq() issue (introduced in v4.14-rc1)
(Lorenzo Pieralisi)
* tag 'pci-v4.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: endpoint: Use correct "end of test" interrupt
MIPS: PCI: Move map_irq() hooks out of initdata
Linus Torvalds [Fri, 22 Sep 2017 23:06:05 +0000 (13:06 -1000)]
Merge tag 'iommu-fixes-v4.14-rc1' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- two Kconfig fixes to fix dependencies that cause compile failures
when they are not fulfilled.
- a section mismatch fix for Intel VT-d
- a fix for PCI topology detection in ARM device-tree code
* tag 'iommu-fixes-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/of: Remove PCI host bridge node check
iommu/qcom: Depend on HAS_DMA to fix compile error
iommu/vt-d: Fix harmless section mismatch warning
iommu: Add missing dependencies
Linus Torvalds [Fri, 22 Sep 2017 23:02:54 +0000 (13:02 -1000)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile fixes from Chris Metcalf:
"These are a code cleanup and config cleanup, respectively"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: array underflow in setup_maxnodemem()
tile: defconfig: Cleanup from old Kconfig options
Linus Torvalds [Fri, 22 Sep 2017 23:01:16 +0000 (13:01 -1000)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- #ifdef CONFIG_EFI around __efi_fpsimd_begin/end
- Assembly code alignment reduced to 4 bytes from 16
- Ensure the kernel is compiled for LP64 (there are some arm64
compilers around defaulting to ILP32)
- Fix arm_pmu_acpi memory leak on the error path
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
drivers/perf: arm_pmu_acpi: Release memory obtained by kasprintf
arm64: ensure the kernel is compiled for LP64
arm64: relax assembly code alignment from 16 byte to 4 byte
arm64: efi: Don't include EFI fpsimd save/restore code in non-EFI kernels
Steve French [Fri, 22 Sep 2017 02:32:29 +0000 (21:32 -0500)]
SMB3: handle new statx fields
We weren't returning the creation time or the two easily supported
attributes (ENCRYPTED or COMPRESSED) for the getattr call to
allow statx to return these fields.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>\
Acked-by: Jeff Layton <jlayton@poochiereds.net>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Tobias Klauser [Fri, 22 Sep 2017 07:42:42 +0000 (09:42 +0200)]
arch: remove unused *_segments() macros/functions
Some architectures define the no-op macros/functions copy_segments,
release_segments and forget_segments. These are used nowhere in the
tree, so removed them.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 22 Sep 2017 21:38:45 +0000 (23:38 +0200)]
Merge branches 'acpi-pmic', 'acpi-bus', 'acpi-wdat' and 'acpi-properties'
* acpi-pmic:
ACPI / PMIC: Add code reviewers to MAINTAINERS
* acpi-bus:
ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again
* acpi-wdat:
ACPI / watchdog: properly initialize resources
* acpi-properties:
ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly
Rafael J. Wysocki [Fri, 22 Sep 2017 20:45:54 +0000 (22:45 +0200)]
Merge branches 'pm-cpufreq' and 'pm-cpuidle'
* pm-cpufreq:
cpufreq: ti-cpufreq: Support additional am43xx platforms
cpufreq: dt-platdev: Add some missing platforms to the blacklist
* pm-cpuidle:
ARM: cpuidle: Avoid memleak if init fail
Rafael J. Wysocki [Fri, 22 Sep 2017 20:45:28 +0000 (22:45 +0200)]
Merge branches 'pm-core', 'pm-qos' and 'pm-docs'
* pm-core:
PM: core: Fix device_pm_check_callbacks()
* pm-qos:
PM / QoS: Use the correct variable to check the QoS request type
* pm-docs:
PM: docs: Drop an excess character from devices.rst
driver core: Fix link to device power management documentation
Helge Deller [Fri, 22 Sep 2017 19:57:11 +0000 (21:57 +0200)]
parisc: Unbreak bootloader due to gcc-7 optimizations
gcc-7 optimizes the byte-wise accesses of get_unaligned_le32() into
word-wise accesses if the 32-bit integer output_len is declared as
external. This panics then the bootloader since we don't have the
unaligned access fault trap handler installed during boot time.
Avoid this optimization by declaring output_len as byte-aligned and thus
unbreak the bootloader code.
Additionally, compile the boot code optimized for size.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Fri, 22 Sep 2017 20:24:02 +0000 (22:24 +0200)]
parisc: Reintroduce option to gzip-compress the kernel
By adding the feature to build the kernel as self-extracting
executeable, the possibility to simply compress the kernel with gzip was
lost.
This patch now reintroduces this possibilty again and leaves it up to
the user to decide how the kernel should be built.
The palo bootloader is able to natively load both formats.
Signed-off-by: Helge Deller <deller@gmx.de>
John Johansen [Thu, 31 Aug 2017 16:54:43 +0000 (09:54 -0700)]
apparmor: fix apparmorfs DAC access permissions
The DAC access permissions for several apparmorfs files are wrong.
.access - needs to be writable by all tasks to perform queries
the others in the set only provide a read fn so should be read only.
With policy namespace virtualization all apparmor needs to control
the permission and visibility checks directly which means DAC
access has to be allowed for all user, group, and other.
BugLink: http://bugs.launchpad.net/bugs/1713103
Fixes:
c97204baf840b ("apparmor: rename apparmor file fns and data to indicate use")
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 23 Aug 2017 19:10:39 +0000 (12:10 -0700)]
apparmor: fix build failure on sparc caused by undeclared signals
In file included from security/apparmor/ipc.c:23:0:
security/apparmor/include/sig_names.h:26:3: error: 'SIGSTKFLT' undeclared here (not in a function)
[SIGSTKFLT] = 16, /* -, 16, - */
^
security/apparmor/include/sig_names.h:26:3: error: array index in initializer not of integer type
security/apparmor/include/sig_names.h:26:3: note: (near initialization for 'sig_map')
security/apparmor/include/sig_names.h:51:3: error: 'SIGUNUSED' undeclared here (not in a function)
[SIGUNUSED] = 34, /* -, 31, - */
^
security/apparmor/include/sig_names.h:51:3: error: array index in initializer not of integer type
security/apparmor/include/sig_names.h:51:3: note: (near initialization for 'sig_map')
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes:
c6bf1adaecaa ("apparmor: add the ability to mediate signals")
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 16 Aug 2017 16:33:48 +0000 (09:33 -0700)]
apparmor: fix incorrect type assignment when freeing proxies
sparse reports
poisoning the proxy->label before freeing the struct is resulting in
a sparse build warning.
../security/apparmor/label.c:52:30: warning: incorrect type in assignment (different address spaces)
../security/apparmor/label.c:52:30: expected struct aa_label [noderef] <asn:4>*label
../security/apparmor/label.c:52:30: got struct aa_label *<noident>
fix with RCU_INIT_POINTER as this is one of those cases where
rcu_assign_pointer() is not needed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 16 Aug 2017 12:48:06 +0000 (05:48 -0700)]
apparmor: ensure unconfined profiles have dfas initialized
Generally unconfined has early bailout tests and does not need the
dfas initialized, however if an early bailout test is ever missed
it will result in an oops.
Be defensive and initialize the unconfined profile to have null dfas
(no permission) so if an early bailout test is missed we fail
closed (no perms granted) instead of oopsing.
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 16 Aug 2017 12:40:49 +0000 (05:40 -0700)]
apparmor: fix race condition in null profile creation
There is a race when null- profile is being created between the
initial lookup/creation of the profile and lock/addition of the
profile. This could result in multiple version of a profile being
added to the list which need to be removed/replaced.
Since these are learning profile their is no affect on mediation.
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 16 Aug 2017 15:59:57 +0000 (08:59 -0700)]
apparmor: move new_null_profile to after profile lookup fns()
new_null_profile will need to use some of the profile lookup fns()
so move instead of doing forward fn declarations.
Signed-off-by: John Johansen <john.johansen@canonical.com>
John Johansen [Wed, 19 Jul 2017 06:18:33 +0000 (23:18 -0700)]
apparmor: add base infastructure for socket mediation
Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.
the user space rule hav the basic form of
NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
[ TYPE | PROTOCOL ]
DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
'vsock' | 'mpls' | 'ib' | 'kcm' ) ','
TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' |
'packet' )
PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )
eg.
network,
network inet,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Wed, 19 Jul 2017 06:41:13 +0000 (23:41 -0700)]
apparmor: add more debug asserts to apparmorfs
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Wed, 19 Jul 2017 06:37:18 +0000 (23:37 -0700)]
apparmor: make policy_unpack able to audit different info messages
Switch unpack auditing to using the generic name field in the audit
struct and make it so we can start adding new info messages about
why an unpack failed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Sun, 6 Aug 2017 12:39:08 +0000 (05:39 -0700)]
apparmor: add support for absolute root view based labels
With apparmor policy virtualization based on policy namespace View's
we don't generally want/need absolute root based views, however there
are cases like debugging and some secid based conversions where
using a root based view is important.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Sun, 6 Aug 2017 12:36:40 +0000 (05:36 -0700)]
apparmor: cleanup conditional check for label in label_print
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Wed, 19 Jul 2017 06:04:47 +0000 (23:04 -0700)]
apparmor: add mount mediation
Add basic mount mediation. That allows controlling based on basic
mount parameters. It does not include special mount parameters for
apparmor, super block labeling, or any triggers for apparmor namespace
parameter modifications on pivot root.
default userspace policy rules have the form of
MOUNT RULE = ( MOUNT | REMOUNT | UMOUNT )
MOUNT = [ QUALIFIERS ] 'mount' [ MOUNT CONDITIONS ] [ SOURCE FILEGLOB ]
[ '->' MOUNTPOINT FILEGLOB ]
REMOUNT = [ QUALIFIERS ] 'remount' [ MOUNT CONDITIONS ]
MOUNTPOINT FILEGLOB
UMOUNT = [ QUALIFIERS ] 'umount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB
MOUNT CONDITIONS = [ ( 'fstype' | 'vfstype' ) ( '=' | 'in' )
MOUNT FSTYPE EXPRESSION ]
[ 'options' ( '=' | 'in' ) MOUNT FLAGS EXPRESSION ]
MOUNT FSTYPE EXPRESSION = ( MOUNT FSTYPE LIST | MOUNT EXPRESSION )
MOUNT FSTYPE LIST = Comma separated list of valid filesystem and
virtual filesystem types (eg ext4, debugfs, etc)
MOUNT FLAGS EXPRESSION = ( MOUNT FLAGS LIST | MOUNT EXPRESSION )
MOUNT FLAGS LIST = Comma separated list of MOUNT FLAGS.
MOUNT FLAGS = ( 'ro' | 'rw' | 'nosuid' | 'suid' | 'nodev' | 'dev' |
'noexec' | 'exec' | 'sync' | 'async' | 'remount' |
'mand' | 'nomand' | 'dirsync' | 'noatime' | 'atime' |
'nodiratime' | 'diratime' | 'bind' | 'rbind' | 'move' |
'verbose' | 'silent' | 'loud' | 'acl' | 'noacl' |
'unbindable' | 'runbindable' | 'private' | 'rprivate' |
'slave' | 'rslave' | 'shared' | 'rshared' |
'relatime' | 'norelatime' | 'iversion' | 'noiversion' |
'strictatime' | 'nouser' | 'user' )
MOUNT EXPRESSION = ( ALPHANUMERIC | AARE ) ...
PIVOT ROOT RULE = [ QUALIFIERS ] pivot_root [ oldroot=OLD PUT FILEGLOB ]
[ NEW ROOT FILEGLOB ]
SOURCE FILEGLOB = FILEGLOB
MOUNTPOINT FILEGLOB = FILEGLOB
eg.
mount,
mount /dev/foo,
mount options=ro /dev/foo -> /mnt/,
mount options in (ro,atime) /dev/foo -> /mnt/,
mount options=ro options=atime,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Wed, 19 Jul 2017 05:56:22 +0000 (22:56 -0700)]
apparmor: add the ability to mediate signals
Add signal mediation where the signal can be mediated based on the
signal, direction, or the label or the peer/target. The signal perms
are verified on a cross check to ensure policy consistency in the case
of incremental policy load/replacement.
The optimization of skipping the cross check when policy is guaranteed
to be consistent (single compile unit) remains to be done.
policy rules have the form of
SIGNAL_RULE = [ QUALIFIERS ] 'signal' [ SIGNAL ACCESS PERMISSIONS ]
[ SIGNAL SET ] [ SIGNAL PEER ]
SIGNAL ACCESS PERMISSIONS = SIGNAL ACCESS | SIGNAL ACCESS LIST
SIGNAL ACCESS LIST = '(' Comma or space separated list of SIGNAL
ACCESS ')'
SIGNAL ACCESS = ( 'r' | 'w' | 'rw' | 'read' | 'write' | 'send' |
'receive' )
SIGNAL SET = 'set' '=' '(' SIGNAL LIST ')'
SIGNAL LIST = Comma or space separated list of SIGNALS
SIGNALS = ( 'hup' | 'int' | 'quit' | 'ill' | 'trap' | 'abrt' |
'bus' | 'fpe' | 'kill' | 'usr1' | 'segv' | 'usr2' |
'pipe' | 'alrm' | 'term' | 'stkflt' | 'chld' | 'cont' |
'stop' | 'stp' | 'ttin' | 'ttou' | 'urg' | 'xcpu' |
'xfsz' | 'vtalrm' | 'prof' | 'winch' | 'io' | 'pwr' |
'sys' | 'emt' | 'exists' | 'rtmin+0' ... 'rtmin+32'
)
SIGNAL PEER = 'peer' '=' AARE
eg.
signal, # allow all signals
signal send set=(hup, kill) peer=foo,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
John Johansen [Tue, 1 Aug 2017 06:44:37 +0000 (23:44 -0700)]
apparmor: Redundant condition: prev_ns. in [label.c:1498]
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Dan Carpenter [Thu, 13 Jul 2017 07:39:20 +0000 (10:39 +0300)]
apparmor: Fix an error code in aafs_create()
We accidentally forgot to set the error code on this path. It means we
return NULL instead of an error pointer. I looked through a bunch of
callers and I don't think it really causes a big issue, but the
documentation says we're supposed to return error pointers here.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Christos Gkekas [Sat, 8 Jul 2017 19:50:21 +0000 (20:50 +0100)]
apparmor: Fix logical error in verify_header()
verify_header() is currently checking whether interface version is less
than 5 *and* greater than 7, which always evaluates to false. Instead it
should check whether it is less than 5 *or* greater than 7.
Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Geert Uytterhoeven [Thu, 6 Jul 2017 08:56:21 +0000 (10:56 +0200)]
apparmor: Fix shadowed local variable in unpack_trans_table()
with W=2:
security/apparmor/policy_unpack.c: In function ‘unpack_trans_table’:
security/apparmor/policy_unpack.c:469: warning: declaration of ‘pos’ shadows a previous local
security/apparmor/policy_unpack.c:451: warning: shadowed declaration is here
Rename the old "pos" to "saved_pos" to fix this.
Fixes:
5379a3312024a8be ("apparmor: support v7 transition format compatible with label_parse")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Somnath Kotur [Thu, 31 Aug 2017 03:57:35 +0000 (09:27 +0530)]
bnxt_re: Don't issue cmd to delete GID for QP1 GID entry before the QP is destroyed
FW needs the 0th GID Entry in the Table to be preserved before
it's corresponding QP1 is deleted, else it will fail the cmd.
Check for the same and return to prevent error msg being logged for
cmd failure.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Selvin Xavier [Thu, 31 Aug 2017 03:57:34 +0000 (09:27 +0530)]
bnxt_re: Fix memory leak in FRMR path
This patch fixes a memory leak issue when alloc_mr is used.
mr->pages and mr->npages are used only in alloc_mr path. mr->pages
is allocated when alloc_mr is called or in the case of FRMR, while
creating the MR. mr->npages is updated only when the MR created
is used i.e. after invoking map_mr_sg verb, before data transfer.
In the dereg_mr path, if mr->npages is 0, driver ends up not freeing
the memory created.
Removing the npages check from the dereg_mr path for kernel consumers.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Somnath Kotur [Thu, 31 Aug 2017 03:57:33 +0000 (09:27 +0530)]
bnxt_re: Remove RTNL lock dependency in bnxt_re_query_port
When there is a NETDEV_UNREGISTER event, bnxt_re driver calls
ib_unregister_device() (RTNL lock held).
ib_unregister_device attempts to flush a worker queue scheduled by
ib_core and that queue might have a pending ib_query_port().
ib_query_port in turn calls bnxt_re_query_port(), which while querying the
link speed using ib_get_eth_speed(), tries to acquire the rtnl_lock() which
was already held by NETDEV_UNREGISTER.
Fixing the issue by removing the link speed query from bnxt_re_query_port()
Now the speed is queried post a successful ib_register_device or whenever
there is a NETDEV_CHANGE event.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Somnath Kotur [Thu, 31 Aug 2017 03:57:32 +0000 (09:27 +0530)]
bnxt_re: Fix race between the netdev register and unregister events
Upon receipt of the NETDEV_REGISTER event from the netdev notifier chain,
the IB stack registration is spawned off to a workqueue since that also
requires an rtnl lock.
There could be 2 kinds of races between the NETDEV_REGISTER and the
NETDEV_UNREGISTER event handling.
a)The NETDEV_UNREGISTER event is received in rapid succession after
the NETDEV_REGISTER event even before the work queue got a chance to run.
b)The NETDEV_UNREGISTER event is received while the workqueue that handles
registration with the IB stack is still in progress.
Handle both the races with a bit flag that is set just before the work item
is queued and cleared in the workqueue after the event is handled just
before the workqueue item is freed.
While adding the new flag, it was noted that the flags are all used in
*_bit() operations which expect a bit number and not a literal constant
with a bit set. So change the numbers to be bit numbers.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Somnath Kotur [Thu, 31 Aug 2017 03:57:31 +0000 (09:27 +0530)]
bnxt_re: Free up devices in module_exit path
Clean up all devices added to the bnxt_re_dev_list in the
module_exit entry point.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Thu, 31 Aug 2017 03:57:30 +0000 (09:27 +0530)]
bnxt_re: Fix compare and swap atomic operands
Driver must assign the user supplied compare/swap values in
the wqe to successfully complete the atomic compare and
swap operation.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Somnath Kotur [Thu, 31 Aug 2017 03:57:29 +0000 (09:27 +0530)]
bnxt_re: Stop issuing further cmds to FW once a cmd times out
Once a cmd to FW times out(after 20s) it is reasonable to
assume the FW or atleast the control path is dead.
No point issuing further cmds to the FW as each subsequent cmd
with another 20s timeout will cascade resulting in unnecessary
traces and/or NMI Lockups.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Thu, 31 Aug 2017 03:57:28 +0000 (09:27 +0530)]
bnxt_re: Fix update of qplib_qp.mtu when modified
The MTU value in the qplib_qp.mtu should be
consistent with whatever mtu was set during
INIT to RTR.The Next PSN and number of packets
are calculated based on this member in the qplib_qp structure.
Signed-off-by: Narender Reddy <narender.reddy@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Helge Deller [Thu, 21 Sep 2017 19:52:08 +0000 (21:52 +0200)]
parisc: Add HWPOISON page fault handler code
Commit
24587380f61d ("parisc: Add MADV_HWPOISON and MADV_SOFT_OFFLINE") added
the necessary constants to handle hardware-poisoning. Those were needed to
support the page deallocation feature from firmware.
But I completely missed to add the relevant fault handler code. This now
showed up when I ran the madvise07 testcase from the Linux Test Project,
which failed with a kernel BUG at arch/parisc/mm/fault.c:320.
With this patch the parisc kernel now behaves like other platforms and
gives the same kernel syslog warnings when poisoning pages.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Thu, 21 Sep 2017 19:22:27 +0000 (21:22 +0200)]
parisc: Move init_per_cpu() into init section
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Mon, 18 Sep 2017 15:55:24 +0000 (17:55 +0200)]
parisc: Check if initrd was loaded into broken RAM
While scanning the PDT for reported broken memory modules, warn if the
initrd was coincidentally loaded into bad memory.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Sun, 17 Sep 2017 19:28:11 +0000 (21:28 +0200)]
parisc: Add PDCE_CHECK instruction to HPMC handler
According to the programming note at page 1-31 of the PA 1.1 Firmware
Architecture document, one should use the PDC_INSTR firmware function to
get the instruction that invokes a PDCE_CHECK in the HPMC handler. This
patch follows this note and sets the instruction which has been a nop up
until now.
Testing on a C3000 and C8000 showed that this firmware call isn't
implemented on those machines, so maybe it's only needed on older ones.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Sun, 17 Sep 2017 19:15:09 +0000 (21:15 +0200)]
parisc: Add wrapper for pdc_instr() firmware function
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Sun, 17 Sep 2017 19:17:10 +0000 (21:17 +0200)]
parisc: Move start_parisc() into init section
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Sun, 17 Sep 2017 19:05:02 +0000 (21:05 +0200)]
parisc: Stop unwinding at start of stack
Check stack pointer if we are reaching the stack end and stop unwinding
if we do. This fixes early backtraces and avoids showing unrealistic
call stacks.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Mon, 11 Sep 2017 19:41:43 +0000 (21:41 +0200)]
parisc: Fix too large frame size warnings
The parisc architecture has larger stack frames than most other
architectures on 32-bit kernels.
Increase the maximum allowed stack frame to 1280 bytes for parisc to
avoid warnings in the do_sys_poll() and pat_memconfig() functions.
Signed-off-by: Helge Deller <deller@gmx.de>
Shiraz Saleem [Tue, 19 Sep 2017 14:19:13 +0000 (09:19 -0500)]
i40iw: Add support for port reuse on active side connections
During OpenMPI scale up testing, we observe rdma_connect
failures if ports are reused on multiple connections.
This is because the Control Queue-Pair (CQP) command to add
the reused port to Accelerated Port Bit VectorTable (APBVT)
fails as there already exists an entry.
Check for duplicate port before invoking the CQP command
to add APBVT entry and delete the entry only if the port
is not in use.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mustafa Ismail [Tue, 19 Sep 2017 14:19:12 +0000 (09:19 -0500)]
i40iw: Add missing VLAN priority
Set the VLAN priority which is in the upper 3 bits of the VLAN
tag field in the QP context.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Shiraz Saleem [Tue, 19 Sep 2017 14:19:11 +0000 (09:19 -0500)]
i40iw: Call i40iw_cm_disconn on modify QP to disconnect
If QP modify to closing/terminate/error fails, connection is
not torn down as there is no corresponding asynchronous
event that will initiate the teardown.
Add explicit call to i40iw_cm_disconn if not waiting in
modify QP, otherwise schedule it in CM timer.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Shiraz Saleem [Tue, 19 Sep 2017 14:19:10 +0000 (09:19 -0500)]
i40iw: Prevent multiple netdev event notifier registrations
Netdev event notifier registration/de-registration is not
synchronized with a lock and there is a possibility of a
duplicate registration of notifier before the unregister
completes.
Register netdev event notifiers during module init and
de-register them at module exit.
This avoids the need to tie the registration to first netdev
client interface open and de-registration to last client
interface close and the synchronization to achieve it.
This also fixes a crash due to duplicate registration.
BUG: unable to handle kernel paging request at
ffffffffa0d60388
IP: [<
ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
PGD
190d067 PUD
190e063 PMD
76c840067 PTE 0
Oops: 0000 [#1] SMP
Modules linked in: i40e(OF-) fuse btrfs zlib_deflate raid6_pq xor vfat msdos
[..]
e1000e vxlan ip_tunnel ptp pps_core i2c_core video [last unloaded: i40iw]
CPU: 1 PID: 27101 Comm: modprobe Tainted: GF W O-------------- 3.10.0-229.el7.x86_64 #1
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F7 01/17/2014
task:
ffff88076e8a96c0 ti:
ffff8806959c8000 task.ti:
ffff8806959c8000
RIP: 0010:[<
ffffffff8160f75d>] [<
ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
RSP: 0018:
ffff8806959cbb38 EFLAGS:
00010282
RAX:
ffffffffa0d60380 RBX:
00000000fffffffd RCX:
0000000000000000
0708] RDX:
0000000000000000 RSI:
ffff88081227a000 RDI:
0000000000000002
RBP:
ffff8806959cbb60 R08:
0000000000000246 R09:
000000000000700c
R10:
ffff88080e16ea40 R11:
00000000000ae8df R12:
ffffffffa0d60380
R13:
0000000000000002 R14:
ffff88076e738800 R15:
0000000000000000
FS:
00007f604ef4a740(0000) GS:
ffff88083e240000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
ffffffffa0d60388 CR3:
0000000753cd2000 CR4:
00000000001407e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Stack:
ffffffff819e73a0 0000000000000000 0000000000000002 ffff88076e738800
00000000ffffffff ffff8806959cbba0 ffffffff8109d61d 0000000000000000
0000000000000000 ffff88076e738800 0000000000000000 ffff88076e738800
Call Trace:
[<
ffffffff8109d61d>] __blocking_notifier_call_chain+0x4d/0x70
[<
ffffffff8109d656>] blocking_notifier_call_chain+0x16/0x20
[<
ffffffff8156b9e4>] __inet_del_ifa+0x154/0x2b0
[<
ffffffff8156d102>] inetdev_event+0x182/0x530
[<
ffffffff8160f76c>] notifier_call_chain+0x4c/0x70
[<
ffffffff8109d446>] raw_notifier_call_chain+0x16/0x20
[<
ffffffff814f71fd>] call_netdevice_notifiers+0x2d/0x60
[<
ffffffff814f8845>] rollback_registered_many+0x105/0x220
[<
ffffffff814f89a0>] rollback_registered+0x40/0x70
[<
ffffffff814f9c88>] unregister_netdevice_queue+0x48/0x80
[<
ffffffff814f9cdc>] unregister_netdev+0x1c/0x30
[<
ffffffffa0067139>] i40e_vsi_release+0x2a9/0x2b0 [i40e]
[<
ffffffffa00674e8>] i40e_remove+0x128/0x2b0 [i40e]
[<
ffffffff813092db>] pci_device_remove+0x3b/0xb0
[<
ffffffff813d26ef>] __device_release_driver+0x7f/0xf0
[<
ffffffff813d3068>] driver_detach+0xb8/0xc0
[<
ffffffff813d22db>] bus_remove_driver+0x9b/0x120
[<
ffffffff813d36dc>] driver_unregister+0x2c/0x50
[<
ffffffff81307d4c>] pci_unregister_driver+0x2c/0x90
[<
ffffffffa008f9d0>] i40e_exit_module+0x10/0x23 [i40e]
[<
ffffffff810dad0b>] SyS_delete_module+0x16b/0x2d0
[<
ffffffff81013b0c>] ? do_notify_resume+0x9c/0xb0
[<
ffffffff81613da9>] system_call_fastpath+0x16/0x1b
Code: e5 41 57 4d 89 c7 41 56 49 89 d6 41 55 49 89 f5 41 54 53 89 cb
75 14 eb 3d 0f 1f 44 00 00 83 eb 01 74 25 4d 85 e4 74 20 4c 89 e0 <4c>
8b 60 08 4c 89 f2 4c 89 ee 48 89 c7 ff 10 4d 85 ff 74 04 41
RIP [<
ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Shiraz Saleem [Tue, 19 Sep 2017 14:19:09 +0000 (09:19 -0500)]
i40iw: Fail open if there are no available MSI-X vectors
Check number of available MSI-X vectors for i40iw.
If there are no available vectors, fail the open.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Adit Ranadive [Thu, 21 Sep 2017 22:56:21 +0000 (15:56 -0700)]
RDMA/vmw_pvrdma: Fix reporting correct opcodes for completion
Since the IB_WC_BIND_MW opcode has been dropped, set the correct
IB WC opcode explicitly.
Fixes:
29c8d9eba550 ("IB: Add vmw_pvrdma driver")
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 19 Sep 2017 10:22:13 +0000 (13:22 +0300)]
IB/bnxt_re: Fix frame stack compilation warning
Reduce stack size by dynamically allocating memory instead
of declaring large struct on the stack:
drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function ‘bnxt_re_query_qp’:
drivers/infiniband/hw/bnxt_re/ib_verbs.c:1600:1: warning: the frame size of 1216 bytes is larger than 1024 bytes [-Wframe-larger-than=]
}
^
Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Fixes:
1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>