Florian Westphal [Tue, 13 Dec 2016 12:59:33 +0000 (13:59 +0100)]
netfilter: nf_tables: fix oob access
BUG: KASAN: slab-out-of-bounds in nf_tables_rule_destroy+0xf1/0x130 at addr
ffff88006a4c35c8
Read of size 8 by task nft/1607
When we've destroyed last valid expr, nft_expr_next() returns an invalid expr.
We must not dereference it unless it passes != nft_expr_last() check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Sun, 11 Dec 2016 19:46:51 +0000 (20:46 +0100)]
netfilter: nft_queue: use raw_smp_processor_id()
Using smp_processor_id() causes splats with PREEMPT_RCU:
[19379.552780] BUG: using smp_processor_id() in preemptible [
00000000] code: ping/32389
[19379.552793] caller is debug_smp_processor_id+0x17/0x19
[...]
[19379.552823] Call Trace:
[19379.552832] [<
ffffffff81274e9e>] dump_stack+0x67/0x90
[19379.552837] [<
ffffffff8129a4d4>] check_preemption_disabled+0xe5/0xf5
[19379.552842] [<
ffffffff8129a4fb>] debug_smp_processor_id+0x17/0x19
[19379.552849] [<
ffffffffa07c42dd>] nft_queue_eval+0x35/0x20c [nft_queue]
No need to disable preemption since we only fetch the numeric value, so
let's use raw_smp_processor_id() instead.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Sun, 11 Dec 2016 19:09:23 +0000 (20:09 +0100)]
netfilter: nft_quota: reset quota after dump
Dumping of netlink attributes may fail due to insufficient room in the
skbuff, so let's reset consumed quota if we succeed to put netlink
attributes into the skbuff.
Fixes:
43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jason Wang [Tue, 13 Dec 2016 06:23:05 +0000 (14:23 +0800)]
virtio-net: correctly enable multiqueue
Commit
4490001029012539937ff02778fe6180613fa949 ("virtio-net: enable
multiqueue by default") blindly set the affinity instead of queues
during probe which can cause a mismatch of #queues between guest and
host. This patch fixes it by setting queues.
Reported-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Theodore Ts'o <tytso@mit.edu>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes:
49000102901 ("virtio-net: enable multiqueue by default")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 12 Dec 2016 17:06:38 +0000 (09:06 -0800)]
Merge tag 'cris-for-4.10' of git://git./linux/kernel/git/jesper/cris
Pull CRIS updates from Jesper Nilsson:
"Three patches for minor issues"
* tag 'cris-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
cris: No need to append -O2 and $(LINUXINCLUDE)
tty: serial: make crisv10 explicitly non-modular
cris: Only build flash rescue image if CONFIG_ETRAX_AXISFLASHMAP is selected
Linus Torvalds [Mon, 12 Dec 2016 16:51:37 +0000 (08:51 -0800)]
Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux
Pull Openrisc updates from Stafford Horne:
- changes to MAINTAINER for openrisc
- probably biggest actual change is the move to memblock from bootmem
- ... plus several bug and build fixes
* tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
openrisc: prevent VGA console, fix builds
openrisc: include l.swa in check for write data pagefault
openrisc: Updates after openrisc.net has been lost
openrisc: Consolidate setup to use memblock instead of bootmem
openrisc: remove the redundant of_platform_populate
openrisc: add NR_CPUS Kconfig default value
openrisc: Support both old (or32) and new (or1k) toolchain
openrisc: Add thread-local storage (TLS) support
openrisc: restore all regs on rt_sigreturn
openrisc: fix PTRS_PER_PGD define
Linus Torvalds [Mon, 12 Dec 2016 16:48:28 +0000 (08:48 -0800)]
Merge tag 'm68k-for-v4.10-tag1' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
"Use seq_puts() for fixed strings"
* tag 'm68k-for-v4.10-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/atari: Use seq_puts() in atari_get_hardware_list()
m68k/amiga: Use seq_puts() in amiga_get_hardware_list()
Linus Torvalds [Mon, 12 Dec 2016 16:46:34 +0000 (08:46 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 updates from Hans-Christian Noren Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: wire up pkey syscalls
AVR32-pio: Replace two seq_printf() calls by seq_puts() in pio_bank_show()
AVR32-pio: Use seq_putc() in pio_bank_show()
AVR32-clock: Combine nine seq_printf() calls into one call in clk_show()
AVR32-clock: Use seq_putc() in two functions
Linus Torvalds [Mon, 12 Dec 2016 16:44:23 +0000 (08:44 -0800)]
Merge branch 'for-next' of git://git./linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
"There are two sets of changes in this pull.
The largest is the addition of the ColdFire platform side i2c support
(the IO addressing, setup and clock definitions). The i2c hardware
module itself is driven by the kernels existing iMX i2c driver.
The other change is the addition of support for the Amcore board"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: AMCORE board, add iMX i2c support
m68k: add Sysam AMCORE open board support
m68knommu: platform support for i2c devices on ColdFire SoC
Linus Torvalds [Mon, 12 Dec 2016 16:18:41 +0000 (08:18 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:
"Just a bunch of small cleanups and fixes here, and support for user
probes from Allen Pais"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: fix a building error reported by kbuild
sparc64: fix typo in pgd_clear()
sparc64: restore irq in error paths in iommu
sparc: leon: Fix a retry loop in leon_init_timers()
sparc64: make string buffers large enough
sparc64: move dereference after check for NULL
sparc: kernel: use builtin_platform_driver
sparc64:Support User Probes for sparc
Linus Torvalds [Mon, 12 Dec 2016 15:54:15 +0000 (07:54 -0800)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Platform regulatory domain support for ath10k, from Bartosz
Markowski.
2) Centralize min/max MTU checking, thus removing tons of duplicated
code all of the the various drivers. From Jarod Wilson.
3) Support ingress actions in act_mirred, from Shmulik Ladkani.
4) Improve device adjacency tracking, from David Ahern.
5) Add support for LED triggers on PHY link state changes, from Zach
Brown.
6) Improve UDP socket memory accounting, from Paolo Abeni.
7) Set SK_MEM_QUANTUM to a fixed size of 4096, instead of PAGE_SIZE.
From Eric Dumazet.
8) Collapse TCP SKBs at retransmit time even if the right side SKB has
frags. Also from Eric Dumazet.
9) Add IP_RECVFRAGSIZE and IPV6_RECVFRAGSIZE cmsgs, from Willem de
Bruijn.
10) Support routing by UID, from Lorenzo Colitti.
11) Handle L3 domain binding (ie. VRF) for RAW sockets, from David
Ahern.
12) tcp_get_info() can run lockless, from Eric Dumazet.
13) 4-tuple UDP hashing in SFC driver, from Edward Cree.
14) Avoid reorders in GRO code, from Eric Dumazet.
15) IPV6 Segment Routing support, from David Lebrun.
16) Support MPLS push and pop for L3 packets in openvswitch, from Jiri
Benc.
17) Add LRU datastructure support for BPF, Martin KaFai Lau.
18) VF support in liquidio driver, from Raghu Vatsavayi.
19) Multiqueue support in alx driver, from Tobias Regnery.
20) Networking cgroup BPF support, from Daniel Mack.
21) TCP chronograph measurements, from Francis Yan.
22) XDP support for qed driver, from Yuval Mintz.
23) BPF based lwtunnels, from Thomas Graf.
24) Consistent FIB dumping to offloading drivers, from Ido Schimmel.
25) Many optimizations for UDP under high load, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
netfilter: nft_counter: rework atomic dump and reset
e1000: use disable_hardirq() for e1000_netpoll()
i40e: don't truncate match_method assignment
net: ethernet: ti: netcp: add support of cpts
net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
net: l2tp: export debug flags to UAPI
net: ethernet: stmmac: remove private tx queue lock
net: ethernet: sxgbe: remove private tx queue lock
net: bridge: shorten ageing time on topology change
net: bridge: add helper to set topology change
net: bridge: add helper to offload ageing time
net: nicvf: use new api ethtool_{get|set}_link_ksettings
net: ethernet: ti: cpsw: sync rates for channels in dual emac mode
net: ethernet: ti: cpsw: re-split res only when speed is changed
net: ethernet: ti: cpsw: combine budget and weight split and check
net: ethernet: ti: cpsw: don't start queue twice
net: ethernet: ti: cpsw: use same macros to get active slave
net: mvneta: select GENERIC_ALLOCATOR
...
Randy Dunlap [Sun, 20 Nov 2016 23:40:32 +0000 (08:40 +0900)]
openrisc: prevent VGA console, fix builds
OpenRISC does not support VGA console, so prevent that kconfig symbol
from being enabled for OpenRISC, thus fixing these build errors:
drivers/built-in.o: In function `vgacon_save_screen':
vgacon.c:(.text+0x20e0): undefined reference to `screen_info'
vgacon.c:(.text+0x20e8): undefined reference to `screen_info'
drivers/built-in.o: In function `vgacon_init':
vgacon.c:(.text+0x284c): undefined reference to `screen_info'
vgacon.c:(.text+0x2850): undefined reference to `screen_info'
drivers/built-in.o: In function `vgacon_startup':
vgacon.c:(.text+0x28d8): undefined reference to `screen_info'
drivers/built-in.o:vgacon.c:(.text+0x28f0): more undefined references to `screen_info' follow
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Stefan Kristiansson [Thu, 3 Jul 2014 17:46:40 +0000 (20:46 +0300)]
openrisc: include l.swa in check for write data pagefault
During page fault handling we check the last instruction to understand
if the fault was for a read or for a write. By default we fall back to
read. New instructions were added to the openrisc 1.1 spec for an
atomic load/store pair (l.lwa/l.swa).
This patch adds the opcode for l.swa (0x33) allowing it to be treated as
a write operation.
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
[shorne@gmail.com: expanded a bit on the comment]
Signed-off-by: Stafford Horne <shorne@gmail.com>
Stafford Horne [Mon, 21 Mar 2016 08:11:10 +0000 (17:11 +0900)]
openrisc: Updates after openrisc.net has been lost
The openrisc.net domain expired and was taken over by squatters.
These updates point documentation to the new domain, mailing lists
and git repos.
Also, Jonas is not the main maintainer anylonger, he reviews changes
but does not maintain a repo or sent pull requests. Updating this to
add Stafford and Stefan who are the active maintainers.
Acked-by: Olof Kindgren <olof.kindgren@gmail.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Stafford Horne [Sun, 3 Apr 2016 10:14:49 +0000 (19:14 +0900)]
openrisc: Consolidate setup to use memblock instead of bootmem
Clearing out one todo item. Use the memblock boot time memory
which is the current standard.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jonas <jonas@southpole.se>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Rob Herring [Tue, 30 Aug 2016 15:10:59 +0000 (00:10 +0900)]
openrisc: remove the redundant of_platform_populate
The of_platform_populate call in the openrisc arch code is now redundant
as the DT core provides a default call. Openrisc has a NULL match table
which means only top level nodes with compatible strings will have
devices creates. The default version will also descend nodes in the
match table such as "simple-bus" which should be fine as openrisc
doesn't have any of these (though it is preferred that memory-mapped
peripherals be grouped under a bus node(s)).
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Stafford Horne [Sat, 24 Sep 2016 13:20:42 +0000 (22:20 +0900)]
openrisc: add NR_CPUS Kconfig default value
The build system now expects that NR_CPUS is defined.
Follow
4cbbbb4 ("microblaze: Fix missing NR_CPUS in menuconfig")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Guenter Roeck [Sat, 19 Jul 2014 21:39:08 +0000 (00:39 +0300)]
openrisc: Support both old (or32) and new (or1k) toolchain
The output file format for or1k has changed from "elf32-or32"
to "elf32-or1k". Select the correct output format automatically
to be able to compile the kernel with both toolchain variants.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Christian Svensson [Sat, 25 Jan 2014 15:48:54 +0000 (15:48 +0000)]
openrisc: Add thread-local storage (TLS) support
Historically OpenRISC GCC has reserved r10 which we now use to hold
the thread pointer for thread-local storage (TLS).
Signed-off-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Jonas Bonn [Mon, 23 Sep 2013 10:04:20 +0000 (12:04 +0200)]
openrisc: restore all regs on rt_sigreturn
Fix signal handling for when signals are handled as the result of timers
or exceptions, previous code assumed syscalls. This was noticeable with X
crashing where it uses SIGALRM.
This patch restores all regs before returning to userspace via
_resume_userspace instead of via syscall return path.
The rt_sigreturn syscall is more like a context switch than a function
call; it entails a return from one context (the signal handler) to another
(the process in question). For a context switch like this there are
effectively no call-saved regs that remain constant across the transition.
Reported-by: Sebastian Macke <sebastian@macke.de>
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[shorne@gmail.com: Updated comment better reflect change and issue]
Signed-off-by: Stafford Horne <shorne@gmail.com>
Stefan Kristiansson [Fri, 10 Jan 2014 22:17:38 +0000 (00:17 +0200)]
openrisc: fix PTRS_PER_PGD define
On OpenRISC, with its 8k pages, PAGE_SHIFT is defined to be 13.
That makes the expression (1UL << (PAGE_SHIFT-2)) evaluate
to 2048.
The correct value for PTRS_PER_PGD should be 256.
Correcting the PTRS_PER_PGD define unveiled a bug in map_ram(),
where PTRS_PER_PGD was used when the intent was to iterate
over a set of page table entries.
This patch corrects that issue as well.
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Acked-by: Jonas Bonn <jonas@southpole.se>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Hans-Christian Noren Egtvedt [Mon, 12 Dec 2016 08:23:09 +0000 (09:23 +0100)]
avr32: wire up pkey syscalls
This patch wires up the new pkey_mprotect, pkey_alloc and pkey_free syscalls on
AVR32.
Markus Elfring [Sun, 16 Oct 2016 20:18:31 +0000 (22:18 +0200)]
AVR32-pio: Replace two seq_printf() calls by seq_puts() in pio_bank_show()
Strings which did not contain data format specifications should be put
into a sequence. Thus use the corresponding function "seq_puts".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Markus Elfring [Sun, 16 Oct 2016 20:13:19 +0000 (22:13 +0200)]
AVR32-pio: Use seq_putc() in pio_bank_show()
A single character (line break) should be put into a sequence.
Thus use the corresponding function "seq_putc".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Markus Elfring [Sun, 16 Oct 2016 20:04:10 +0000 (22:04 +0200)]
AVR32-clock: Combine nine seq_printf() calls into one call in clk_show()
Some data were printed into a sequence by nine separate function calls.
Print the same data by a single function call instead.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Markus Elfring [Sun, 16 Oct 2016 19:51:09 +0000 (21:51 +0200)]
AVR32-clock: Use seq_putc() in two functions
A single character (line break) should be put into two sequences.
Thus use the corresponding function "seq_putc".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Gonglei \(Arei\) [Thu, 8 Dec 2016 04:37:08 +0000 (12:37 +0800)]
sparc: fix a building error reported by kbuild
>> arch/sparc/include/asm/topology_64.h:44:44:
error: implicit declaration of function 'cpu_data'
[-Werror=implicit-function-declaration]
#define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
^
Let's include cpudata.h in topology_64.h.
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill A. Shutemov [Fri, 9 Dec 2016 11:24:00 +0000 (14:24 +0300)]
sparc64: fix typo in pgd_clear()
It really has to be pgdp, not pgd.
It just happend to work since all callers have 'pgd' as an argument.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 25 Nov 2016 21:15:37 +0000 (00:15 +0300)]
sparc64: restore irq in error paths in iommu
There are some error paths where we should restore IRQs but we don't.
Fixes:
bb620c3d3925 ("sparc: Make sparc64 use scalable lib/iommu-common.c functions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 25 Nov 2016 11:25:54 +0000 (14:25 +0300)]
sparc: leon: Fix a retry loop in leon_init_timers()
The original code causes a static checker warning because it has a
continue inside a do { } while (0); loop. In that context, a continue
and a break are equivalent. The intent was to go back to the start of
the loop so the continue was a bug.
I've added a retry label at the start and changed the continue to a goto
retry. Then I removed the do { } while (0) loop and pulled the code in
one indent level.
Fixes:
2791c1a43900 ("SPARC/LEON: added support for selecting Timer Core and Timer within core")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 25 Nov 2016 11:03:55 +0000 (14:03 +0300)]
sparc64: make string buffers large enough
My static checker complains that if "lvl" is ULONG_MAX (this is 64 bit)
then some of the strings will overflow. I don't know if that's possible
but it seems simple enough to make the buffers slightly larger.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 25 Nov 2016 11:01:32 +0000 (14:01 +0300)]
sparc64: move dereference after check for NULL
We shouldn't dereference "iommu" until after we have checked that it is
non-NULL.
Fixes:
f08978b0fdbf ("sparc64: Enable sun4v dma ops to use IOMMU v2 APIs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Wed, 23 Nov 2016 15:06:05 +0000 (23:06 +0800)]
sparc: kernel: use builtin_platform_driver
Use builtin_platform_driver() helper to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allen Pais [Thu, 13 Oct 2016 04:36:13 +0000 (10:06 +0530)]
sparc64:Support User Probes for sparc
Signed-off-by: Eric Saint Etienne <eric.saint.etienne@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 11 Dec 2016 19:17:54 +0000 (11:17 -0800)]
Linux 4.9
Linus Torvalds [Sun, 11 Dec 2016 18:17:39 +0000 (10:17 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Two more MIPS fixes for 4.9:
- RTC: Return -ENODEV so an external RTC will be tried
- Fix mask of GPE frequency
These two have been tested on Imagination's automated test system and
also both received positive reviews on the linux-mips mailing list"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Lantiq: Fix mask of GPE frequency
MIPS: Return -ENODEV from weak implementation of rtc_mips_set_time
Pablo Neira [Sun, 11 Dec 2016 10:43:59 +0000 (11:43 +0100)]
netfilter: nft_counter: rework atomic dump and reset
Dump and reset doesn't work unless cmpxchg64() is used both from packet
and control plane paths. This approach is going to be slow though.
Instead, use a percpu seqcount to fetch counters consistently, then
subtract bytes and packets in case a reset was requested.
The cpu that running over the reset code is guaranteed to own this stats
exclusively, we have to turn counters into signed 64bit though so stats
update on reset don't get wrong on underflow.
This patch is based on original sketch from Eric Dumazet.
Fixes:
43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Wed, 7 Dec 2016 21:32:00 +0000 (22:32 +0100)]
MIPS: Lantiq: Fix mask of GPE frequency
The hardware documentation says bit 11:10 are used for the GPE
frequency selection. Fix the mask in the define to match these bits.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Langer <thomas.langer@intel.com>
Cc: linux-mips@linux-mips.org
Cc: john@phrozen.org
Patchwork: https://patchwork.linux-mips.org/patch/14648/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Luuk Paulussen [Wed, 7 Dec 2016 22:43:34 +0000 (11:43 +1300)]
MIPS: Return -ENODEV from weak implementation of rtc_mips_set_time
The sync_cmos_clock function in kernel/time/ntp.c first tries to update
the internal clock of the cpu by calling the "update_persistent_clock64"
architecture specific function. If this returns -ENODEV, it then tries
to update an external RTC using "rtc_set_ntp_time".
On the mips architecture, the weak implementation of the underlying
function would return 0 if it wasn't overridden. This meant that the
sync_cmos_clock function would never try to update an external RTC
(if both CONFIG_GENERIC_CMOS_UPDATE and CONFIG_RTC_SYSTOHC are
configured)
Returning -ENODEV instead, means that an external RTC will be tried.
Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Reviewed-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
Reviewed-by: Scott Parlane <scott.parlane@alliedtelesis.co.nz>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14649/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
WANG Cong [Sat, 10 Dec 2016 22:22:42 +0000 (14:22 -0800)]
e1000: use disable_hardirq() for e1000_netpoll()
In commit
02cea3958664 ("genirq: Provide disable_hardirq()")
Peter introduced disable_hardirq() for netpoll, but it is forgotten
to use it for e1000.
This patch changes disable_irq() to disable_hardirq() for e1000.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keller, Jacob E [Fri, 9 Dec 2016 21:39:21 +0000 (13:39 -0800)]
i40e: don't truncate match_method assignment
The .match_method field is a u8, so we shouldn't be casting to a u16,
and because it is only one byte, we do not need to byte swap anything.
Just assign the value directly. This avoids issues on Big Endian
architectures which would have byte swapped and then incorrectly
truncated the value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bimmy Pujari <bimmy.pujari@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WingMan Kwok [Thu, 8 Dec 2016 22:21:56 +0000 (16:21 -0600)]
net: ethernet: ti: netcp: add support of cpts
This patch adds support of the cpts device found in the
gbe and 10gbe ethernet switches on the keystone 2 SoCs
(66AK2E/L/Hx, 66AK2Gx).
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Wed, 7 Dec 2016 19:20:51 +0000 (13:20 -0600)]
net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
Instead of having individual PHY drivers set the SUPPORTED_Pause and
SUPPORTED_Asym_Pause flags, phylib itself should set those flags,
unless there is a hardware erratum or other special case. During
autonegotiation, the PHYs will determine whether to enable pause
frame support.
Pause frames are a feature that is supported by the MAC. It is the MAC
that generates the frames and that processes them. The PHY can only be
configured to allow them to pass through.
This commit also effectively reverts the recently applied
c7a61319
("net: phy: dp83848: Support ethernet pause frames").
So the new process is:
1) Unless the PHY driver overrides it, phylib sets the SUPPORTED_Pause
and SUPPORTED_AsymPause bits in phydev->supported. This indicates that
the PHY supports pause frames.
2) The MAC driver checks phydev->supported before it calls phy_start().
If (SUPPORTED_Pause | SUPPORTED_AsymPause) is set, then the MAC driver
sets those bits in phydev->advertising, if it wants to enable pause
frame support.
3) When the link state changes, the MAC driver checks phydev->pause and
phydev->asym_pause, If the bits are set, then it enables the corresponding
features in the MAC. The algorithm is:
if (phydev->pause)
The MAC should be programmed to receive and honor
pause frames it receives, i.e. enable receive flow control.
if (phydev->pause != phydev->asym_pause)
The MAC should be programmed to transmit pause
frames when needed, i.e. enable transmit flow control.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Asbjørn Sloth Tønnesen [Sun, 11 Dec 2016 00:18:59 +0000 (00:18 +0000)]
net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
Asbjørn Sloth Tønnesen [Sun, 11 Dec 2016 00:18:58 +0000 (00:18 +0000)]
net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
PPPOL2TP_MSG_* and L2TP_MSG_* are duplicates, and are being used
interchangeably in the kernel, so let's standardize on L2TP_MSG_*
internally, and keep PPPOL2TP_MSG_* defined in UAPI for compatibility.
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
Asbjørn Sloth Tønnesen [Sun, 11 Dec 2016 00:18:57 +0000 (00:18 +0000)]
net: l2tp: export debug flags to UAPI
Move the L2TP_MSG_* definitions to UAPI, as it is part of
the netlink API.
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 11 Dec 2016 04:27:02 +0000 (23:27 -0500)]
Merge branch 'sxgbe-stmmac-remove-private-tx-lock'
Lino Sanfilippo says:
====================
Remove private tx queue locks
this patch series removes unnecessary private locks in the sxgbe and the
stmmac driver.
v2:
- adjust commit message
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lino Sanfilippo [Thu, 8 Dec 2016 23:55:43 +0000 (00:55 +0100)]
net: ethernet: stmmac: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.
On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.
Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lino Sanfilippo [Thu, 8 Dec 2016 23:55:42 +0000 (00:55 +0100)]
net: ethernet: sxgbe: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.
On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.
Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 11 Dec 2016 02:28:39 +0000 (21:28 -0500)]
Merge branch 'bridge-fast-ageing-on-topology-change'
Vivien Didelot says:
====================
net: bridge: fast ageing on topology change
802.1D [1] specifies that the bridges in a network must use a short
value to age out dynamic entries in the Filtering Database for a period,
once a topology change has been communicated by the root bridge.
This patchset fixes this for the in-kernel STP implementation.
Once the topology change flag is set in a net_bridge instance, the
ageing time value is shorten to twice the forward delay used by the
topology.
When the topology change flag is cleared, the ageing time configured for
the bridge is restored.
To accomplish that, a new bridge_ageing_time member is added to the
net_bridge structure, to store the user configured bridge ageing time.
Two helpers are added to offload the ageing time and set the topology
change flag in the net_bridge instance. Then the required logic is added
in the topology change helper if in-kernel STP is used.
This has been tested on the following topology:
+--------------+
| root bridge |
| 1 2 3 4 |
+--+--+--+--+--+
| | | | +--------+
| | | +------| laptop |
| | | +--------+
+--+--+--+-----+
| 1 2 3 |
| slave bridge |
+--------------+
When unplugging/replugging the laptop, the slave bridge (under test)
gets the topology change flag sent by the root bridge, and fast ageing
is triggered on the bridges. Once the topology change timer of the root
bridge expires, the topology change flag is cleared and the configured
ageing time is restored on the bridges.
A similar test has been done between two bridges under test.
When changing the forward delay of the root bridge with:
# echo 3000 > /sys/class/net/br0/bridge/forward_delay
the ageing time correctly changes on both bridges from 300s to 60s while
the TOPOLOGY_CHANGE flag is present.
[1] "8.3.5 Notifying topology changes",
http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf
No change since RFC: https://lkml.org/lkml/2016/10/19/828
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 10 Dec 2016 18:44:29 +0000 (13:44 -0500)]
net: bridge: shorten ageing time on topology change
802.1D [1] specifies that the bridges must use a short value to age out
dynamic entries in the Filtering Database for a period, once a topology
change has been communicated by the root bridge.
Add a bridge_ageing_time member in the net_bridge structure to store the
bridge ageing time value configured by the user (ioctl/netlink/sysfs).
If we are using in-kernel STP, shorten the ageing time value to twice
the forward delay used by the topology when the topology change flag is
set. When the flag is cleared, restore the configured ageing time.
[1] "8.3.5 Notifying topology changes ",
http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 10 Dec 2016 18:44:28 +0000 (13:44 -0500)]
net: bridge: add helper to set topology change
Add a __br_set_topology_change helper to set the topology change value.
This can be later extended to add actions when the topology change flag
is set or cleared.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 10 Dec 2016 18:44:27 +0000 (13:44 -0500)]
net: bridge: add helper to offload ageing time
The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME switchdev attr is actually set
when initializing a bridge port, and when configuring the bridge ageing
time from ioctl/netlink/sysfs.
Add a __set_ageing_time helper to offload the ageing time to physical
switches, and add the SWITCHDEV_F_DEFER flag since it can be called
under bridge lock.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe Reynes [Sat, 10 Dec 2016 14:00:48 +0000 (15:00 +0100)]
net: nicvf: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Sat, 10 Dec 2016 12:23:50 +0000 (14:23 +0200)]
net: ethernet: ti: cpsw: sync rates for channels in dual emac mode
The channels are common for both ndevs in dual emac mode. Hence, keep
in sync their rates.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Sat, 10 Dec 2016 12:23:49 +0000 (14:23 +0200)]
net: ethernet: ti: cpsw: re-split res only when speed is changed
Don't re-split res in the following cases:
- speed of phys is not changed
- speed of phys is changed and no rate limited channels
- speed of phys is changed and all channels are rate limited
- phy is unlinked while dev is open
- phy is linked back but speed is not changed
The maximum speed is sum of "linked" phys, thus res are split taken
in account two interfaces, both for dual emac mode and for
switch mode.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Sat, 10 Dec 2016 12:23:48 +0000 (14:23 +0200)]
net: ethernet: ti: cpsw: combine budget and weight split and check
Re-split weight along with budget. It simplify code a little
and update state after every rate change. Also it's necessarily
to move arguments checks to this combined function. Replace
maximum rate check for an interface on maximum possible rate.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Sat, 10 Dec 2016 12:23:47 +0000 (14:23 +0200)]
net: ethernet: ti: cpsw: don't start queue twice
No need to start queues after cpsw is started as it will be done
while cpsw_adjust_link(), after phy connection.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Sat, 10 Dec 2016 12:23:46 +0000 (14:23 +0200)]
net: ethernet: ti: cpsw: use same macros to get active slave
Use the same, more convenient macros, to get active slave.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sat, 10 Dec 2016 10:38:32 +0000 (11:38 +0100)]
net: mvneta: select GENERIC_ALLOCATOR
We previously relied on GENERIC_ALLOCATOR to be selected by CONFIG_ARM,
but now we can compile-test the driver on other architectures that
don't select it:
drivers/net/built-in.o: In function `mvneta_bm_remove':
mvneta_bm.c:(.text+0x4ee35): undefined reference to `gen_pool_free'
This adds an explicit select for the part of the driver that has
the dependency.
Fixes:
a0627f776a45 ("net: marvell: Allow drivers to be built with COMPILE_TEST")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Kushwaha [Sat, 10 Dec 2016 05:44:47 +0000 (11:14 +0530)]
net: socket: removed an unnecessary newline
This patch removes a newline which was added
in socket.c file in net-next
Signed-off-by: Amit Kushwaha <kushwaha.a@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sat, 10 Dec 2016 05:10:59 +0000 (21:10 -0800)]
netlink: use blocking notifier
netlink_chain is called in ->release(), which is apparently
a process context, so we don't have to use an atomic notifier
here.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 10 Dec 2016 21:21:55 +0000 (16:21 -0500)]
Merge git://git./linux/kernel/git/davem/net
Linus Torvalds [Sat, 10 Dec 2016 17:47:13 +0000 (09:47 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- Fix pointer size when caam is used with AArch64 boot loader on
AArch32 kernel.
- Fix ahash state corruption in marvell driver.
- Fix buggy algif_aed tag handling.
- Prevent mcryptd from being used with incompatible algorithms which
can cause crashes"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_aead - fix uninitialized variable warning
crypto: mcryptd - Check mcryptd algorithm compatibility
crypto: algif_aead - fix AEAD tag memory handling
crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
crypto: marvell - Don't corrupt state of an STD req for re-stepped ahash
crypto: marvell - Don't copy hash operation twice into the SRAM
Linus Torvalds [Sat, 10 Dec 2016 17:23:19 +0000 (09:23 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Limit the number of can filters to avoid > MAX_ORDER allocations.
Fix from Marc Kleine-Budde.
2) Limit GSO max size in netvsc driver to avoid problems with NVGRE
configurations. From Stephen Hemminger.
3) Return proper error when memory allocation fails in
ser_gigaset_init(), from Dan Carpenter.
4) Missing linkage undo in error paths of ipvlan_link_new(), from Gao
Feng.
5) Missing necessayr SET_NETDEV_DEV in lantiq and cpmac drivers, from
Florian Fainelli.
6) Handle probe deferral properly in smsc911x driver.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: mlx5: Fix Kconfig help text
net: smsc911x: back out silently on probe deferrals
ibmveth: set correct gso_size and gso_type
net: ethernet: cpmac: Call SET_NETDEV_DEV()
net: ethernet: lantiq_etop: Call SET_NETDEV_DEV()
vhost-vsock: fix orphan connection reset
cxgb4/cxgb4vf: Assign netdev->dev_port with port ID
driver: ipvlan: Unlink the upper dev when ipvlan_link_new failed
ser_gigaset: return -ENOMEM on error instead of success
NET: usb: cdc_mbim: add quirk for supporting Telit LE922A
can: peak: fix bad memory access and free sequence
phy: Don't increment MDIO bus refcount unless it's a different owner
netvsc: reduce maximum GSO size
drivers: net: cpsw-phy-sel: Clear RGMII_IDMODE on "rgmii" links
can: raw: raw_setsockopt: limit number of can_filter that can be set
Christopher Covington [Fri, 9 Dec 2016 21:53:05 +0000 (16:53 -0500)]
net: mlx5: Fix Kconfig help text
Since the following commit, Infiniband and Ethernet have not been
mutually exclusive.
Fixes:
4aa17b28 mlx5: Enable mutual support for IB and Ethernet
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 9 Dec 2016 16:02:05 +0000 (08:02 -0800)]
net: skb_condense() can also deal with empty skbs
It seems attackers can also send UDP packets with no payload at all.
skb_condense() can still be a win in this case.
It will be possible to replace the custom code in tcp_add_backlog()
to get full benefit from skb_condense()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 9 Dec 2016 13:18:00 +0000 (14:18 +0100)]
net: smsc911x: back out silently on probe deferrals
When trying to get a regulator we may get deferred and we see
this noise:
smsc911x
1b800000.ethernet-ebi2 (unnamed net_device) (uninitialized):
couldn't get regulators -517
Then the driver continues anyway. Which means that the regulator
may not be properly retrieved and reference counted, and may be
switched off in case noone else is using it.
Fix this by returning silently on deferred probe and let the
system work it out.
Cc: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 10 Dec 2016 03:59:05 +0000 (22:59 -0500)]
Merge tag 'mac80211-next-for-davem-2016-12-09' of git://git./linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Three fixes:
* fix a logic bug introduced by a previous cleanup
* fix nl80211 attribute confusing (trying to use
a single attribute for two purposes)
* fix a long-standing BSS leak that happens when an
association attempt is abandoned
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Thu, 8 Dec 2016 22:40:03 +0000 (16:40 -0600)]
ibmveth: set correct gso_size and gso_type
This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:
"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.
We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even although the sockets receive buffer
was completely empty."
We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.
Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 10 Dec 2016 03:12:30 +0000 (22:12 -0500)]
Merge branch 'udp-receive-path-optimizations'
Eric Dumazet says:
====================
udp: receive path optimizations
This patch series provides about 100 % performance increase under flood.
v2: added Paolo feedback on udp_rmem_release() for tiny sk_rcvbuf
added the last patch touching sk_rmem_alloc later
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 8 Dec 2016 19:41:57 +0000 (11:41 -0800)]
udp: udp_rmem_release() should touch sk_rmem_alloc later
In flood situations, keeping sk_rmem_alloc at a high value
prevents producers from touching the socket.
It makes sense to lower sk_rmem_alloc only at the end
of udp_rmem_release() after the thread draining receive
queue in udp_recvmsg() finished the writes to sk_forward_alloc.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 8 Dec 2016 19:41:56 +0000 (11:41 -0800)]
udp: add batching to udp_rmem_release()
If udp_recvmsg() constantly releases sk_rmem_alloc
for every read packet, it gives opportunity for
producers to immediately grab spinlocks and desperatly
try adding another packet, causing false sharing.
We can add a simple heuristic to give the signal
by batches of ~25 % of the queue capacity.
This patch considerably increases performance under
flood by about 50 %, since the thread draining the queue
is no longer slowed by false sharing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 8 Dec 2016 19:41:55 +0000 (11:41 -0800)]
udp: copy skb->truesize in the first cache line
In UDP RX handler, we currently clear skb->dev before skb
is added to receive queue, because device pointer is no longer
available once we exit from RCU section.
Since this first cache line is always hot, lets reuse this space
to store skb->truesize and thus avoid a cache line miss at
udp_recvmsg()/udp_skb_destructor time while receive queue
spinlock is held.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 8 Dec 2016 19:41:54 +0000 (11:41 -0800)]
udp: add busylocks in RX path
Idea of busylocks is to let producers grab an extra spinlock
to relieve pressure on the receive_queue spinlock shared by consumer.
This behavior is requested only once socket receive queue is above
half occupancy.
Under flood, this means that only one producer can be in line
trying to acquire the receive_queue spinlock.
These busylock can be allocated on a per cpu manner, instead of a
per socket one (that would consume a cache line per socket)
This patch considerably improves UDP behavior under stress,
depending on number of NIC RX queues and/or RPS spread.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 10 Dec 2016 03:11:15 +0000 (22:11 -0500)]
Merge branch 'qcom-emac'
Timur Tabi says:
====================
net: qcom/emac: simplify support for different SOCs
On SOCs that have the Qualcomm EMAC network controller, the internal
PHY block is always different. Sometimes the differences are small,
sometimes it might be a completely different IP. Either way, using version
numbers to differentiate them and putting all of the init code in one
file does not scale.
This patchset does two things: The first breaks up the current code into
different files, and the second patch adds support for a third SOC, the
Qualcomm Technologies QDF2400 ARM Server SOC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Thu, 8 Dec 2016 19:24:21 +0000 (13:24 -0600)]
net: qcom/emac: add support for the Qualcomm Technologies QDF2400
The QDF2432 and the QDF2400 have slightly different internal PHYs,
so there are some programming differences. Some of the registers in
the QDF2400 have moved, and some registers require different values
during initialization.
Because of the differences, and because HIDs are a scare resource,
the ACPI tables specify the hardware version in an _HRV property.
Version 1 is the QDF2432, and version 2 is the QDF2400. Any future
SOC that has the same internal PHY but different programming
requirements will be assigned the next available version number.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Thu, 8 Dec 2016 19:24:20 +0000 (13:24 -0600)]
net: qcom/emac: move phy init code to separate files
The internal PHY of the EMAC differs on each SOC, and the list will
only continue to grow. By separating the code into individual files,
we can add support for more SOCs more cleanly.
Note: The internal PHY is also sometimes called the SGMII device.
We also stop referring to the various PHY variations by version number,
so no more "v2", "v3", etc. Instead, the devices are named after the
SOC they are, which is in sync with the device tree property names.
Future patches will probably rearrange more code among the files.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 9 Dec 2016 19:27:22 +0000 (11:27 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"Several fixes to the DSM (ACPI device specific method) marshaling
implementation.
I consider these urgent enough to send for 4.9 consideration since
they fix the kernel's handling of ARS (Address Range Scrub) commands.
Especially for platforms without machine-check-recovery capabilities,
successful execution of ARS commands enables the platform to
potentially break out of an infinite reboot problem if a media error
is present in the boot path. There is also a one line fix for a
device-dax read-only mapping regression.
Commits
9a901f5495e2 ("acpi, nfit: fix extended status translations
for ACPI DSMs") and
325896ffdf90 ("device-dax: fix private mapping
restriction, permit read-only") are true regression fixes for changes
introduced this cycle.
Commit
efda1b5d87cb ("acpi, nfit, libnvdimm: fix / harden ars_status
output length handling") fixes the kernel's handling of zero-length
results, this never would have worked in the past, but we only just
recently discovered a BIOS implementation that emits this arguably
spec non-compliant result.
The remaining two commits are additional fall out from thinking
through the implications of a zero / truncated length result of the
ARS Status command.
In order to mitigate the risk that these changes introduce yet more
regressions they are backstopped by a new unit test in commit
a7de92dac9f0 ("tools/testing/nvdimm: unit test acpi_nfit_ctl()") that
mocks up inputs to acpi_nfit_ctl()"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: fix private mapping restriction, permit read-only
tools/testing/nvdimm: unit test acpi_nfit_ctl()
acpi, nfit: fix bus vs dimm confusion in xlat_status
acpi, nfit: validate ars_status output buffer size
acpi, nfit, libnvdimm: fix / harden ars_status output length handling
acpi, nfit: fix extended status translations for ACPI DSMs
Linus Torvalds [Fri, 9 Dec 2016 19:07:45 +0000 (11:07 -0800)]
Merge branch 'for-4.9-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"This is quite late but SCT Write Same support added during this cycle
is broken subtly but seriously and it'd be best to disable it before
v4.9 gets released.
This contains two commits - one low impact sata_mv fix and the
mentioned disabling of SCT Write Same"
* 'for-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata-scsi: disable SCT Write Same for the moment
ata: sata_mv: check for errors when parsing nr-ports from dt
Linus Torvalds [Fri, 9 Dec 2016 19:02:40 +0000 (11:02 -0800)]
Merge tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A fix for an issue with ->d_revalidate() in ceph, causing frequent
kernel crashes.
Marked for stable - it goes back to 4.6, but started popping up only
in 4.8"
* tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client:
ceph: don't set req->r_locked_dir in ceph_d_revalidate
Linus Torvalds [Fri, 9 Dec 2016 19:00:39 +0000 (11:00 -0800)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Final batch of SoC fixes
A few fixes that have trickled in over the last week, all fixing minor
errors in devicetrees -- UART pin assignment on Allwinner H3,
correcting number of SATA ports on a Marvell-based Linkstation
platform and a display clock fix for Freescale/NXP i.MX7D that fixes a
freeze when starting up X"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: orion5x: fix number of sata port for linkstation ls-gl
ARM: dts: imx7d: fix LCDIF clock assignment
dts: sun8i-h3: correct UART3 pin definitions
Linus Torvalds [Fri, 9 Dec 2016 18:54:54 +0000 (10:54 -0800)]
Merge tag 'm68k-for-v4.9-tag2' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
- build fix for drivers calling ndelay() in a conditional block without
curly braces
- defconfig updates
* tag 'm68k-for-v4.9-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Fix ndelay() macro
m68k/defconfig: Update defconfigs for v4.9-rc1
Linus Torvalds [Fri, 9 Dec 2016 18:50:49 +0000 (10:50 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fix from Dave Airlie:
"Just a single fix for amdgpu to just suspend the gpu on 'shutdown'
instead of shutting it down fully, as for some reason the hw was
getting upset in some situations"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: just suspend the hw on pci shutdown
Linus Torvalds [Fri, 9 Dec 2016 18:41:42 +0000 (10:41 -0800)]
Revert "radix tree test suite: fix compilation"
This reverts commit
53855d10f4567a0577360b6448d52a863929775b.
It shouldn't have come in yet - it depends on the changes in linux-next
that will come in during the next merge window. As Matthew Wilcox says,
the test suite is broken with the current state without the revert.
Requested-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Berg [Thu, 8 Dec 2016 16:22:09 +0000 (17:22 +0100)]
cfg80211/mac80211: fix BSS leaks when abandoning assoc attempts
When mac80211 abandons an association attempt, it may free
all the data structures, but inform cfg80211 and userspace
about it only by sending the deauth frame it received, in
which case cfg80211 has no link to the BSS struct that was
used and will not cfg80211_unhold_bss() it.
Fix this by providing a way to inform cfg80211 of this with
the BSS entry passed, so that it can clean up properly, and
use this ability in the appropriate places in mac80211.
This isn't ideal: some code is more or less duplicated and
tracing is missing. However, it's a fairly small change and
it's thus easier to backport - cleanups can come later.
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Vamsi Krishna [Fri, 2 Dec 2016 21:59:08 +0000 (23:59 +0200)]
nl80211: Use different attrs for BSSID and random MAC addr in scan req
NL80211_ATTR_MAC was used to set both the specific BSSID to be scanned
and the random MAC address to be used when privacy is enabled. When both
the features are enabled, both the BSSID and the local MAC address were
getting same value causing Probe Request frames to go with unintended
DA. Hence, this has been fixed by using a different NL80211_ATTR_BSSID
attribute to set the specific BSSID (which was the more recent addition
in cfg80211) for a scan.
Backwards compatibility with old userspace software is maintained to
some extent by allowing NL80211_ATTR_MAC to be used to set the specific
BSSID when scanning without enabling random MAC address use.
Scanning with random source MAC address was introduced by commit
ad2b26abc157 ("cfg80211: allow drivers to support random MAC addresses
for scan") and the issue was introduced with the addition of the second
user for the same attribute in commit
818965d39177 ("cfg80211: Allow a
scan request for a specific BSSID").
Fixes:
818965d39177 ("cfg80211: Allow a scan request for a specific BSSID")
Signed-off-by: Vamsi Krishna <vamsin@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 21 Nov 2016 12:55:48 +0000 (13:55 +0100)]
nl80211: fix logic inversion in start_nan()
Arend inadvertently inverted the logic while converting to
wdev_running(), fix that.
Fixes:
73c7da3dae1e ("cfg80211: add generic helper to check interface is running")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Markus Elfring [Sun, 23 Oct 2016 21:00:22 +0000 (23:00 +0200)]
m68k/atari: Use seq_puts() in atari_get_hardware_list()
A string which did not contain a data format specification should be put
into a sequence. Thus use the corresponding function "seq_puts".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Markus Elfring [Sun, 23 Oct 2016 20:55:03 +0000 (22:55 +0200)]
m68k/amiga: Use seq_puts() in amiga_get_hardware_list()
A string which did not contain a data format specification should be put
into a sequence. Thus use the corresponding function "seq_puts".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Arnd Bergmann [Thu, 8 Dec 2016 21:57:05 +0000 (22:57 +0100)]
net: xgene: avoid bogus maybe-uninitialized warning
In some configurations, gcc cannot trace the state of variables
across a spin_unlock() barrier, leading to a warning about
correct code:
xgene_enet_main.c: In function 'xgene_enet_start_xmit':
../../../phy/mdio-xgene.h:112:14: error: 'mss_index' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Here we can trivially move the assignment before that spin_unlock,
which reliably avoids the warning.
Fixes:
e3978673f514 ("drivers: net: xgene: Fix MSS programming")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 8 Dec 2016 21:57:04 +0000 (22:57 +0100)]
net: xgene: move xgene_cle_ptree_ewdn data off stack
The array for initializing the cle is set up on the stack with
almost entirely constant data and then passed to a function that
converts it into HW specific bit patterns. With the latest
addition, the size of this array has grown to the point that
we get a warning about potential stack overflow in allmodconfig
builds:
xgene_enet_cle.c: In function ‘xgene_enet_cle_init’:
xgene_enet_cle.c:836:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Looking a bit deeper at the usage, I noticed that the only modification
of the data is in dead code, as we don't even use the cle module
for phy_mode other than PHY_INTERFACE_MODE_XGMII. This means we
can simply mark the structure constant and access it directly rather
than passing the pointer down through another structure, making
the code more efficient at the same time as avoiding the
warning.
Fixes:
a809701fed15 ("drivers: net: xgene: fix: RSS for non-TCP/UDP")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 8 Dec 2016 21:57:03 +0000 (22:57 +0100)]
net/mlx5e: use %pad format string for dma_addr_t
On 32-bit ARM with 64-bit dma_addr_t I get this warning about an
incorrect format string:
In file included from /git/arm-soc/drivers/net/ethernet/mellanox/mlx5/core/alloc.c:42:0:
drivers/net/ethernet/mellanox/mlx5/core/alloc.c: In function ‘mlx5_frag_buf_alloc_node’:
drivers/net/ethernet/mellanox/mlx5/core/alloc.c:134:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
We have the special %pad format for printing dma_addr_t, so use that
to print the correct address and avoid the warning.
Fixes:
1c1b522808a1 ("net/mlx5e: Implement Fragmented Work Queue (WQ)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Dec 2016 02:26:59 +0000 (21:26 -0500)]
Merge branch 'ethernet-missing-netdev-parent'
Florian Fainelli says:
====================
net: ethernet: Make sure we set dev->dev.parent
This patch series builds atop:
ec988ad78ed6d184a7f4ca6b8e962b0e8f1de461 ("phy: Don't increment MDIO
bus refcount unless it's a different owner")
FMAN is the one that potentially needs patching as well (call
SET_NETDEV_DEV), but there appears to be no way that init_phy is
called right now, or there is not such an in-tree user. Madalin, can
you comment on that?
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 8 Dec 2016 19:41:25 +0000 (11:41 -0800)]
net: ethernet: cpmac: Call SET_NETDEV_DEV()
The TI CPMAC driver calls into PHYLIB which now checks for
net_device->dev.parent, so make sure we do set it before calling into
any MDIO/PHYLIB related function.
Fixes:
ec988ad78ed6 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 8 Dec 2016 19:41:24 +0000 (11:41 -0800)]
net: ethernet: lantiq_etop: Call SET_NETDEV_DEV()
The Lantiq Etop driver calls into PHYLIB which now checks for
net_device->dev.parent, so make sure we do set it before calling into
any MDIO/PHYLIB related function.
Fixes:
ec988ad78ed6 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Tao [Thu, 8 Dec 2016 17:10:46 +0000 (01:10 +0800)]
vhost-vsock: fix orphan connection reset
local_addr.svm_cid is host cid. We should check guest cid instead,
which is remote_addr.svm_cid. Otherwise we end up resetting all
connections to all guests.
Cc: stable@vger.kernel.org [4.8+]
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 8 Dec 2016 23:40:15 +0000 (15:40 -0800)]
Merge branch 'parisc-4.9-5' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Three important fixes for the parisc architecture.
Dave provided two patches: One which purges the TLB before setting a
PTE entry and a second one which drops unnecessary TLB flushes. Both
patches have been tested for one week on the debian buildd servers and
prevent random segmentation faults.
The patch from me fixes a crash at boot inside the TLB measuring code
on SMP machines with PA8000-PA8700 CPUs (specifically A500-44 and
J5000 servers)"
* 'parisc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix TLB related boot crash on SMP machines
parisc: Remove unnecessary TLB purges from flush_dcache_page_asm and flush_icache_page_asm
parisc: Purge TLB before setting PTE
David S. Miller [Thu, 8 Dec 2016 23:22:40 +0000 (18:22 -0500)]
Merge tag 'linux-can-fixes-for-4.9-
20161208' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-12-08
this is a pull request for one patch.
Jiho Chu found and fixed a use-after-free error in the cleanup path in
the peak pcan USB CAN driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Kushwaha [Thu, 8 Dec 2016 12:51:53 +0000 (18:21 +0530)]
net: socket: preferred __aligned(size) for control buffer
This patch cleanup checkpatch.pl warning
WARNING: __aligned(size) is preferred over __attribute__((aligned(size)))
Signed-off-by: Amit Kushwaha <kushwaha.a@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>