Florian Westphal [Fri, 4 Nov 2016 15:54:58 +0000 (16:54 +0100)]
netfilter: conntrack: refine gc worker heuristics
Nicolas Dichtel says:
After commit
b87a2f9199ea ("netfilter: conntrack: add gc worker to
remove timed-out entries"), netlink conntrack deletion events may be
sent with a huge delay.
Nicolas further points at this line:
goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);
and indeed, this isn't optimal at all. Rationale here was to ensure that
we don't block other work items for too long, even if
nf_conntrack_htable_size is huge. But in order to have some guarantee
about maximum time period where a scan of the full conntrack table
completes we should always use a fixed slice size, so that once every
N scans the full table has been examined at least once.
We also need to balance this vs. the case where the system is either idle
(i.e., conntrack table (almost) empty) or very busy (i.e. eviction happens
from packet path).
So, after some discussion with Nicolas:
1. want hard guarantee that we scan entire table at least once every X s
-> need to scan fraction of table (get rid of upper bound)
2. don't want to eat cycles on idle or very busy system
-> increase interval if we did not evict any entries
3. don't want to block other worker items for too long
-> make fraction really small, and prefer small scan interval instead
4. Want reasonable short time where we detect timed-out entry when
system went idle after a burst of traffic, while not doing scans
all the time.
-> Store next gc scan in worker, increasing delays when no eviction
happened and shrinking delay when we see timed out entries.
The old gc interval is turned into a max number, scans can now happen
every jiffy if stale entries are present.
Longest possible time period until an entry is evicted is now 2 minutes
in worst case (entry expires right after it was deemed 'not expired').
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 3 Nov 2016 13:44:42 +0000 (14:44 +0100)]
netfilter: conntrack: fix CT target for UNSPEC helpers
Thomas reports its not possible to attach the H.245 helper:
iptables -t raw -A PREROUTING -p udp -j CT --helper H.245
iptables: No chain/target/match by that name.
xt_CT: No such helper "H.245"
This is because H.245 registers as NFPROTO_UNSPEC, but the CT target
passes NFPROTO_IPV4/IPV6 to nf_conntrack_helper_try_module_get.
We should treat UNSPEC as wildcard and ignore the l3num instead.
Reported-by: Thomas Woerner <twoerner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Sat, 29 Oct 2016 01:01:50 +0000 (03:01 +0200)]
netfilter: connmark: ignore skbs with magic untracked conntrack objects
The (percpu) untracked conntrack entries can end up with nonzero connmarks.
The 'untracked' conntrack objects are merely a way to distinguish INVALID
(i.e. protocol connection tracker says payload doesn't meet some
requirements or packet was never seen by the connection tracking code)
from packets that are intentionally not tracked (some icmpv6 types such as
neigh solicitation, or by using 'iptables -j CT --notrack' option).
Untracked conntrack objects are implementation detail, we might as well use
invalid magic address instead to tell INVALID and UNTRACKED apart.
Check skb->nfct for untracked dummy and behave as if skb->nfct is NULL.
Reported-by: XU Tianwen <evan.xu.tianwen@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
WANG Cong [Fri, 4 Nov 2016 00:14:03 +0000 (17:14 -0700)]
ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr
family.maxattr is the max index for policy[], the size of
ops[] is determined with ARRAY_SIZE().
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 29 Oct 2016 14:09:51 +0000 (22:09 +0800)]
netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it
The NFTA_DUP_SREG_DEV attribute is not a must option, so we should use it
in routing lookup only when the user specify it.
Fixes:
d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 29 Oct 2016 14:03:05 +0000 (22:03 +0800)]
netfilter: nf_tables: destroy the set if fail to add transaction
When the memory is exhausted, then we will fail to add the NFT_MSG_NEWSET
transaction. In such case, we should destroy the set before we free it.
Fixes:
958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Arnd Bergmann [Mon, 24 Oct 2016 15:34:32 +0000 (17:34 +0200)]
netfilter: ip_vs_sync: fix bogus maybe-uninitialized warning
Building the ip_vs_sync code with CONFIG_OPTIMIZE_INLINING on x86
confuses the compiler to the point where it produces a rather
dubious warning message:
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.init_seq’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
struct ip_vs_sync_conn_options opt;
^~~
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.delta’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.previous_delta’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void *)&opt+12).init_seq’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void *)&opt+12).delta’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void *)&opt+12).previous_delta’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
The problem appears to be a combination of a number of factors, including
the __builtin_bswap32 compiler builtin being slightly odd, having a large
amount of code inlined into a single function, and the way that some
functions only get partially inlined here.
I've spent way too much time trying to work out a way to improve the
code, but the best I've come up with is to add an explicit memset
right before the ip_vs_seq structure is first initialized here. When
the compiler works correctly, this has absolutely no effect, but in the
case that produces the warning, the warning disappears.
In the process of analysing this warning, I also noticed that
we use memcpy to copy the larger ip_vs_sync_conn_options structure
over two members of the ip_vs_conn structure. This works because
the layout is identical, but seems error-prone, so I'm changing
this in the process to directly copy the two members. This change
seemed to have no effect on the object code or the warning, but
it deals with the same data, so I kept the two changes together.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 26 Oct 2016 21:46:17 +0000 (23:46 +0200)]
netfilter: conntrack: avoid excess memory allocation
This is now a fixed-size extension, so we don't need to pass a variable
alloc size. This (harmless) error results in allocating 32 instead of
the needed 16 bytes for this extension as the size gets passed twice.
Fixes:
23014011ba420 ("netfilter: conntrack: support a fixed size of 128 distinct labels")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
John W. Linville [Tue, 25 Oct 2016 19:56:39 +0000 (15:56 -0400)]
netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_check
Commit
36b701fae12ac ("netfilter: nf_tables: validate maximum value of
u32 netlink attributes") introduced nft_parse_u32_check with a return
value of "unsigned int", yet on error it returns "-ERANGE".
This patch corrects the mismatch by changing the return value to "int",
which happens to match the actual users of nft_parse_u32_check already.
Found by Coverity, CID
1373930.
Note that commit
21a9e0f1568ea ("netfilter: nft_exthdr: fix error
handling in nft_exthdr_init()) attempted to address the issue, but
did not address the return type of nft_parse_u32_check.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Laura Garcia Liebana <nevola@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes:
36b701fae12ac ("netfilter: nf_tables: validate maximum value...")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ulrich Weber [Mon, 24 Oct 2016 16:07:23 +0000 (18:07 +0200)]
netfilter: nf_conntrack_sip: extend request line validation
on SIP requests, so a fragmented TCP SIP packet from an allow header starting with
INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
Content-Length: 0
will not bet interpreted as an INVITE request. Also Request-URI must start with an alphabetic character.
Confirm with RFC 3261
Request-Line = Method SP Request-URI SP SIP-Version CRLF
Fixes:
30f33e6dee80 ("[NETFILTER]: nf_conntrack_sip: support method specific request/response handling")
Signed-off-by: Ulrich Weber <ulrich.weber@riverbed.com>
Acked-by: Marco Angaroni <marcoangaroni@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 22 Oct 2016 10:51:26 +0000 (18:51 +0800)]
netfilter: nf_tables: fix race when create new element in dynset
Packets may race when create the new element in nft_hash_update:
CPU0 CPU1
lookup_fast - fail lookup_fast - fail
new - ok new - ok
insert - ok insert - fail(EEXIST)
So when race happened, we reuse the existing element. Otherwise,
these *racing* packets will not be handled properly.
Fixes:
22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 22 Oct 2016 10:51:25 +0000 (18:51 +0800)]
netfilter: nf_tables: fix *leak* when expr clone fail
When nft_expr_clone failed, a series of problems will happen:
1. module refcnt will leak, we call __module_get at the beginning but
we forget to put it back if ops->clone returns fail
2. memory will be leaked, if clone fail, we just return NULL and forget
to free the alloced element
3. set->nelems will become incorrect when set->size is specified. If
clone fail, we should decrease the set->nelems
Now this patch fixes these problems. And fortunately, clone fail will
only happen on counter expression when memory is exhausted.
Fixes:
086f332167d6 ("netfilter: nf_tables: add clone interface to expression operations")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 22 Oct 2016 10:51:24 +0000 (18:51 +0800)]
netfilter: nft_dynset: fix panic if NFT_SET_HASH is not enabled
When CONFIG_NFT_SET_HASH is not enabled and I input the following rule:
"nft add rule filter output flow table test {ip daddr counter }", kernel
panic happened on my system:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
[...]
Call Trace:
[<
ffffffffa0590466>] ? nft_dynset_eval+0x56/0x100 [nf_tables]
[<
ffffffffa05851bb>] nft_do_chain+0xfb/0x4e0 [nf_tables]
[<
ffffffffa0432f01>] ? nf_conntrack_tuple_taken+0x61/0x210 [nf_conntrack]
[<
ffffffffa0459ea6>] ? get_unique_tuple+0x136/0x560 [nf_nat]
[<
ffffffffa043bca1>] ? __nf_ct_ext_add_length+0x111/0x130 [nf_conntrack]
[<
ffffffffa045a357>] ? nf_nat_setup_info+0x87/0x3b0 [nf_nat]
[<
ffffffff81761e27>] ? ipt_do_table+0x327/0x610
[<
ffffffffa045a6d7>] ? __nf_nat_alloc_null_binding+0x57/0x80 [nf_nat]
[<
ffffffffa059f21f>] nft_ipv4_output+0xaf/0xd0 [nf_tables_ipv4]
[<
ffffffff81702515>] nf_iterate+0x55/0x60
[<
ffffffff81702593>] nf_hook_slow+0x73/0xd0
Because in rbtree type set, ops->update is not implemented. So just keep
it simple, in such case, report -EOPNOTSUPP to the user space.
Fixes:
22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Mintz, Yuval [Mon, 24 Oct 2016 05:48:09 +0000 (08:48 +0300)]
MAINTAINERS: Update qlogic networking drivers
Following Cavium's acquisition of qlogic we need to update all the qlogic
drivers maintainer's entries to point to our new e-mail addresses,
as well as update some of the driver's maintainers as those are no longer
working for Cavium.
I would like to thank Sony Chacko and Rajesh Borundia for their support
and development of our various networking drivers.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Mon, 24 Oct 2016 04:32:47 +0000 (21:32 -0700)]
netvsc: fix incorrect receive checksum offloading
The Hyper-V netvsc driver was looking at the incorrect status bits
in the checksum info. It was setting the receive checksum unnecessary
flag based on the IP header checksum being correct. The checksum
flag is skb is about TCP and UDP checksum status. Because of this
bug, any packet received with bad TCP checksum would be passed
up the stack and to the application causing data corruption.
The problem is reproducible via netcat and netem.
This had a side effect of not doing receive checksum offload
on IPv6. The driver was also also always doing checksum offload
independent of the checksum setting done via ethtool.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Oct 2016 01:03:06 +0000 (18:03 -0700)]
udp: fix IP_CHECKSUM handling
First bug was added in commit
ad6f939ab193 ("ip: Add offset parameter to
ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on
AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));
Then commit
e6afc8ace6dd ("udp: remove headers from UDP packets before
queueing") forgot to adjust the offsets now UDP headers are pulled
before skb are put in receive queue.
Fixes:
ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
Fixes:
e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sam Kumar <samanthakumar@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 23 Oct 2016 17:01:09 +0000 (01:01 +0800)]
sctp: fix the panic caused by route update
Commit
7303a1475008 ("sctp: identify chunks that need to be fragmented
at IP level") made the chunk be fragmented at IP level in the next round
if it's size exceed PMTU.
But there still is another case, PMTU can be updated if transport's dst
expires and transport's pmtu_pending is set in sctp_packet_transmit. If
the new PMTU is less than the chunk, the same issue with that commit can
be triggered.
So we should drop this packet and let it retransmit in another round
where it would be fragmented at IP level.
This patch is to fix it by checking the chunk size after PMTU may be
updated and dropping this packet if it's size exceed PMTU.
Fixes:
90017accff61 ("sctp: Add GSO support")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@txudriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Sun, 23 Oct 2016 16:28:29 +0000 (09:28 -0700)]
doc: update docbook annotations for socket and skb
The skbuff and sock structure both had missing parameter annotation
values.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Sat, 22 Oct 2016 14:31:06 +0000 (14:31 +0000)]
rocker: fix error return code in rocker_world_check_init()
Fix to return error code -EINVAL from the error handling
case instead of 0, as done elsewhere in this function.
Fixes:
e420114eef4a ("rocker: introduce worlds infrastructure")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 23 Oct 2016 21:51:38 +0000 (17:51 -0400)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:
====================
pull request: bluetooth 2016-10-21
Here are some more Bluetooth fixes for the 4.9 kernel:
- Fix to btwilink driver probe function return value
- Power management fix to hci_bcm
- Fix to encoding name in scan response data
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Fri, 21 Oct 2016 12:13:24 +0000 (14:13 +0200)]
net: sctp, forbid negative length
Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
if (len < sizeof(int))
return -EINVAL;
On the first look, the check seems to be correct. But since len is int
and sizeof returns size_t, int gets promoted to unsigned size_t too. So
the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
false.
Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.
Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.
If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
Call Trace:
[<
ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
[<
ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
[<
ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
[<
ffffffffb2819a30>] ? __kmalloc+0x330/0x540
[<
ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
[<
ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
[<
ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
[<
ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Fri, 21 Oct 2016 11:34:29 +0000 (09:34 -0200)]
net: fec: Call swap_buffer() prior to IP header alignment
Commit
3ac72b7b63d5 ("net: fec: align IP header in hardware") breaks
networking on mx28.
There is an erratum on mx28 (ENGR121613 - ENET big endian mode
not compatible with ARM little endian) that requires an additional
byte-swap operation to workaround this problem.
So call swap_buffer() prior to performing the IP header alignment
to restore network functionality on mx28.
Fixes:
3ac72b7b63d5 ("net: fec: align IP header in hardware")
Reported-and-tested-by: Henri Roosen <henri.roosen@ginzinger.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason A. Donenfeld [Fri, 21 Oct 2016 09:28:25 +0000 (18:28 +0900)]
ipv6: do not increment mac header when it's unset
Otherwise we'll overflow the integer. This occurs when layer 3 tunneled
packets are handed off to the IPv6 layer.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 06:09:17 +0000 (02:09 -0400)]
bnx2x: Use the correct divisor value for PHC clock readings.
Time Sync (PTP) implementation uses the divisor/shift value for converting
the clock ticks to nanoseconds. Driver currently defines shift value as 1,
this results in the nanoseconds value to be calculated as half the actual
value. Hence the user application fails to synchronize the device clock
value with the PTP master device clock. Need to use the 'shift' value of 0.
Signed-off-by: Sony.Chacko <Sony.Chacko@cavium.com>
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 22 Oct 2016 21:08:07 +0000 (17:08 -0400)]
Merge branch 'qed-fixes'
Sudarsana Reddy Kalluru says:
====================
qed*: fix series.
The patch series contains several minor bug fixes for qed/qede modules.
Please consider applying this to 'net' branch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Fri, 21 Oct 2016 08:43:45 +0000 (04:43 -0400)]
qede: Fix incorrrect usage of APIs for un-mapping DMA memory
Driver uses incorrect APIs to unmap DMA memory which were
mapped using dma_map_single(). This patch fixes it to use
appropriate APIs for un-mapping DMA memory.
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:44 +0000 (04:43 -0400)]
qed: Zero-out the buffer paased to dcbx_query() API
qed_dcbx_query_params() implementation populate the values to input
buffer based on the dcbx mode and, the current negotiated state/params,
the caller of this API need to memset the buffer to zero.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:43 +0000 (04:43 -0400)]
qede: Reconfigure rss indirection direction table when rss count is updated
Rx indirection table entries are in the range [0, (rss_count - 1)]. If
user reduces the rss count, the table entries may not be in the ccorrect
range. Need to reconfigure the table with new rss_count as a basis.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:42 +0000 (04:43 -0400)]
qed*: Reduce the memory footprint for Rx path
With the current default values for Rx path i.e., 8 queues of 8Kb entries
each with 4Kb size, interface will consume 256Mb for Rx. The default values
causing the driver probe to fail when the system memory is low. Based on
the perforamnce results, rx-ring count value of 1Kb gives the comparable
performance with Rx coalesce timeout of 12 seconds. Updating the default
values.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:41 +0000 (04:43 -0400)]
qede: Loopback implementation should ignore the normal traffic
During the execution of loopback test, driver may receive the packets which
are not originated by this test, loopback implementation need to skip those
packets.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:40 +0000 (04:43 -0400)]
qede: Do not allow RSS config for 100G devices
RSS configuration is not supported for 100G adapters.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 21 Oct 2016 08:43:39 +0000 (04:43 -0400)]
qede: get_channels() need to populate max tx/rx coalesce values
Recent changes in kernel ethtool implementation requires the driver
callback for get_channels() has to populate the values for max tx/rx
coalesce fields.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Thu, 20 Oct 2016 21:19:46 +0000 (14:19 -0700)]
ipv4: use the right lock for ping_group_range
This reverts commit
a681574c99be23e4d20b769bf0e543239c364af5
("ipv4: disable BH in set_ping_group_range()") because we never
read ping_group_range in BH context (unlike local_port_range).
Then, since we already have a lock for ping_group_range, those
using ip_local_ports.lock for ping_group_range are clearly typos.
We might consider to share a same lock for both ping_group_range
and local_port_range w.r.t. space saving, but that should be for
net-next.
Fixes:
a681574c99be ("ipv4: disable BH in set_ping_group_range()")
Fixes:
ba6b918ab234 ("ping: move ping_group_range out of CONFIG_SYSCTL")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric Salo <salo@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 22 Oct 2016 20:17:54 +0000 (16:17 -0400)]
Merge branch 'dsa-bcm_sf2-do-not-rely-on-kexec_in_progress'
Florian Fainelli says:
====================
net: dsa: bcm_sf2: Do not rely on kexec_in_progress
These are the two patches following the discussing we had on kexec_in_progress.
Feel free to apply or discard them, thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 21 Oct 2016 21:21:56 +0000 (14:21 -0700)]
net: dsa: bcm_sf2: Do not rely on kexec_in_progress
After discussing with Eric, it turns out that, while using
kexec_in_progress is a nice optimization, which prevents us from always
powering on the integrated PHY, let's just turn it on in the shutdown
path.
This removes a dependency on kexec_in_progress which, according to Eric
should not be used by modules
Fixes:
2399d6143f85 ("net: dsa: bcm_sf2: Prevent GPHY shutdown for kexec'd kernels")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 21 Oct 2016 21:21:55 +0000 (14:21 -0700)]
Revert "kexec: Export kexec_in_progress to modules"
This reverts commit
97dcaa0fcfd24daa9a36c212c1ad1d5a97759212. Based on
the review discussion with Eric, we will come up with a different fix
for the bcm_sf2 driver which does not make it rely on the
kexec_in_progress value.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Sat, 22 Oct 2016 01:49:14 +0000 (21:49 -0400)]
netns: revert "netns: avoid disabling irq for netns id"
This reverts commit
bc51dddf98c9 ("netns: avoid disabling irq for
netns id") as it was found to cause problems with systems running
SELinux/audit, see the mailing list thread below:
* http://marc.info/?t=
147694653900002&r=1&w=2
Eventually we should be able to reintroduce this code once we have
rewritten the audit multicast code to queue messages much the same
way we do for unicast messages. A tracking issue for this can be
found below:
* https://github.com/linux-audit/audit-kernel/issues/23
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Reported-by: Elad Raz <e@eladraz.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Thu, 20 Oct 2016 06:35:12 +0000 (23:35 -0700)]
ipv6: fix a potential deadlock in do_ipv6_setsockopt()
Baozeng reported this deadlock case:
CPU0 CPU1
---- ----
lock([ 165.136033] sk_lock-AF_INET6);
lock([ 165.136033] rtnl_mutex);
lock([ 165.136033] sk_lock-AF_INET6);
lock([ 165.136033] rtnl_mutex);
Similar to commit
87e9f0315952
("ipv4: fix a potential deadlock in mcast getsockopt() path")
this is due to we still have a case, ipv6_sock_mc_close(),
where we acquire sk_lock before rtnl_lock. Close this deadlock
with the similar solution, that is always acquire rtnl lock first.
Fixes:
baf606d9c9b1 ("ipv4,ipv6: grab rtnl before locking the socket")
Reported-by: Baozeng Ding <sploving1@gmail.com>
Tested-by: Baozeng Ding <sploving1@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Oct 2016 14:25:22 +0000 (10:25 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix compilation warning in xt_hashlimit on m68k 32-bits, from
Geert Uytterhoeven.
2) Fix wrong timeout in set elements added from packet path via
nft_dynset, from Anders K. Pedersen.
3) Remove obsolete nf_conntrack_events_retry_timeout sysctl
documentation, from Nicolas Dichtel.
4) Ensure proper initialization of log flags via xt_LOG, from
Liping Zhang.
5) Missing alias to autoload ipcomp, also from Liping Zhang.
6) Missing NFTA_HASH_OFFSET attribute validation, again from Liping.
7) Wrong integer type in the new nft_parse_u32_check() function,
from Dan Carpenter.
8) Another wrong integer type declaration in nft_exthdr_init, also
from Dan Carpenter.
9) Fix insufficient mode validation in nft_range.
10) Fix compilation warning in nft_range due to possible uninitialized
value, from Arnd Bergmann.
11) Zero nf_hook_ops allocated via xt_hook_alloc() in x_tables to
calm down kmemcheck, from Florian Westphal.
12) Schedule gc_worker() to run again if GC_MAX_EVICTS quota is reached,
from Nicolas Dichtel.
13) Fix nf_queue() after conversion to single-linked hook list, related
to incorrect bypass flag handling and incorrect hook point of
reinjection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 21 Oct 2016 01:15:16 +0000 (18:15 -0700)]
kexec: Export kexec_in_progress to modules
The bcm_sf2 driver uses kexec_in_progress to know whether it can power
down an integrated PHY during shutdown, and can be built as a module.
Other modules may be using this in the future, so export it.
Fixes:
2399d6143f85 ("net: dsa: bcm_sf2: Prevent GPHY shutdown for kexec'd kernels")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 20 Oct 2016 17:26:48 +0000 (10:26 -0700)]
ipv4: disable BH in set_ping_group_range()
In commit
4ee3bd4a8c746 ("ipv4: disable BH when changing ip local port
range") Cong added BH protection in set_local_port_range() but missed
that same fix was needed in set_ping_group_range()
Fixes:
b8f1a55639e6 ("udp: Add function to make source port for UDP tunnels")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Eric Salo <salo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 20 Oct 2016 16:39:40 +0000 (09:39 -0700)]
udp: must lock the socket in udp_disconnect()
Baozeng Ding reported KASAN traces showing uses after free in
udp_lib_get_port() and other related UDP functions.
A CONFIG_DEBUG_PAGEALLOC=y kernel would eventually crash.
I could write a reproducer with two threads doing :
static int sock_fd;
static void *thr1(void *arg)
{
for (;;) {
connect(sock_fd, (const struct sockaddr *)arg,
sizeof(struct sockaddr_in));
}
}
static void *thr2(void *arg)
{
struct sockaddr_in unspec;
for (;;) {
memset(&unspec, 0, sizeof(unspec));
connect(sock_fd, (const struct sockaddr *)&unspec,
sizeof(unspec));
}
}
Problem is that udp_disconnect() could run without holding socket lock,
and this was causing list corruptions.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 20 Oct 2016 16:32:19 +0000 (09:32 -0700)]
net: dsa: bcm_sf2: Prevent GPHY shutdown for kexec'd kernels
For a kernel that is being kexec'd we re-enable the integrated GPHY in
order for the subsequent MDIO bus scan to succeed and properly bind to
the bcm7xxx PHY driver. If we did not do that, the GPHY would be shut
down by the time the MDIO driver is probing the bus, and it would fail
to read the correct PHY OUI and therefore bind to an appropriate PHY
driver. Later on, this would cause DSA not to be able to successfully
attach to the PHY, and the interface would not be created at all.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 20 Oct 2016 15:13:53 +0000 (17:13 +0200)]
bpf, test: fix ld_abs + vlan push/pop stress test
After commit
636c2628086e ("net: skbuff: Remove errornous length
validation in skb_vlan_pop()") mentioned test case stopped working,
throwing a -12 (ENOMEM) return code. The issue however is not due to
636c2628086e, but rather due to a buggy test case that got uncovered
from the change in behaviour in
636c2628086e.
The data_size of that test case for the skb was set to 1. In the
bpf_fill_ld_abs_vlan_push_pop() handler bpf insns are generated that
loop with: reading skb data, pushing 68 tags, reading skb data,
popping 68 tags, reading skb data, etc, in order to force a skb
expansion and thus trigger that JITs recache skb->data. Problem is
that initial data_size is too small.
While before
636c2628086e, the test silently bailed out due to the
skb->len < VLAN_ETH_HLEN check with returning 0, and now throwing an
error from failing skb_ensure_writable(). Set at least minimum of
ETH_HLEN as an initial length so that on first push of data, equivalent
pop will succeed.
Fixes:
4d9c5c53ac99 ("test_bpf: add bpf_skb_vlan_push/pop() tests")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Thu, 20 Oct 2016 13:58:02 +0000 (15:58 +0200)]
net: add recursion limit to GRO
Currently, GRO can do unlimited recursion through the gro_receive
handlers. This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem. Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.
This patch adds a recursion counter to the GRO layer to prevent stack
overflow. When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally. This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.
Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
Fixes: CVE-2016-7039
Fixes:
9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes:
66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Bohac [Thu, 20 Oct 2016 10:29:26 +0000 (12:29 +0200)]
ipv6: properly prevent temp_prefered_lft sysctl race
The check for an underflow of tmp_prefered_lft is always false
because tmp_prefered_lft is unsigned. The intention of the check
was to guard against racing with an update of the
temp_prefered_lft sysctl, potentially resulting in an underflow.
As suggested by David Miller, the best way to prevent the race is
by reading the sysctl variable using READ_ONCE.
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Fixes:
76506a986dc3 ("IPv6: fix DESYNC_FACTOR")
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Mon, 17 Oct 2016 17:05:32 +0000 (18:05 +0100)]
netfilter: fix nf_queue handling
nf_queue handling is broken since
e3b37f11e6e4 ("netfilter: replace
list_head with single linked list") for two reasons:
1) If the bypass flag is set on, there are no userspace listeners and
we still have more hook entries to iterate over, then jump to the
next hook. Otherwise accept the packet. On nf_reinject() path, the
okfn() needs to be invoked.
2) We should not re-enter the same hook on packet reinjection. If the
packet is accepted, we have to skip the current hook from where the
packet was enqueued, otherwise the packets gets enqueued over and
over again.
This restores the previous list_for_each_entry_continue() behaviour
happening from nf_iterate() that was dealing with these two cases.
This patch introduces a new nf_queue() wrapper function so this fix
becomes simpler.
Fixes:
e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Nicolas Dichtel [Tue, 18 Oct 2016 12:37:32 +0000 (14:37 +0200)]
netfilter: conntrack: restart gc immediately if GC_MAX_EVICTS is reached
When the maximum evictions number is reached, do not wait 5 seconds before
the next run.
CC: Florian Westphal <fw@strlen.de>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Giuseppe CAVALLARO [Thu, 20 Oct 2016 08:01:28 +0000 (10:01 +0200)]
stmmac: display the descriptors if DES0 = 0
It makes sense to display the descriptors even if
DES0 is zero. This helps for example in case of it
is needed to dump rx write-back descriptors to get
timestamp status.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Oct 2016 15:23:08 +0000 (11:23 -0400)]
Merge branch 'ncsi-fixes'
Gavin Shan says:
====================
net/ncsi: More bug fixes
This series fixes 2 issues that were found during NCSI's availability
testing on BCM5718 and improves HNCDSC AEN handler:
* PATCH[1] refactors the code so that minimal code change is put
to PATCH[2].
* PATCH[2] fixes the NCSI channel's stale link state before doing
failover.
* PATCH[3] chooses the hot channel, which was ever chosen as active
channel, when the available channels are all in link-down state.
* PATCH[4] improves Host Network Controller Driver Status Change
(HNCDSC) AEN handler
Changelog
=========
v2:
* Merged PATCH[v1 1/2] to PATCH[v2 1].
* Avoid if/else statements in ncsi_suspend_channel() as Joel suggested.
* Added comments to explain why we need retrieve last link states in
ncsi_suspend_channel().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 20 Oct 2016 00:45:52 +0000 (11:45 +1100)]
net/ncsi: Improve HNCDSC AEN handler
This improves AEN handler for Host Network Controller Driver Status
Change (HNCDSC):
* The channel's lock should be hold when accessing its state.
* Do failover when host driver isn't ready.
* Configure channel when host driver becomes ready.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 20 Oct 2016 00:45:51 +0000 (11:45 +1100)]
net/ncsi: Choose hot channel as active one if necessary
The issue was found on BCM5718 which has two NCSI channels in one
package: C0 and C1. C0 is in link-up state while C1 is in link-down
state. C0 is chosen as active channel until unplugging and plugging
C0's cable: On unplugging C0's cable, LSC (Link State Change) AEN
packet received on C0 to report link-down event. After that, C1 is
chosen as active channel. LSC AEN for link-up event is lost on C0
when plugging C0's cable back. We lose the network even C0 is usable.
This resolves the issue by recording the (hot) channel that was ever
chosen as active one. The hot channel is chosen to be active one
if none of available channels in link-up state. With this, C0 is still
the active one after unplugging C0's cable. LSC AEN packet received
on C0 when plugging its cable back.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 20 Oct 2016 00:45:50 +0000 (11:45 +1100)]
net/ncsi: Fix stale link state of inactive channels on failover
The issue was found on BCM5718 which has two NCSI channels in one
package: C0 and C1. Both of them are connected to different LANs,
means they are in link-up state and C0 is chosen as the active one
until resetting BCM5718 happens as below.
Resetting BCM5718 results in LSC (Link State Change) AEN packet
received on C0, meaning LSC AEN is missed on C1. When LSC AEN packet
received on C0 to report link-down, it fails over to C1 because C1
is in link-up state as software can see. However, C1 is in link-down
state in hardware. It means the link state is out of synchronization
between hardware and software, resulting in inappropriate channel (C1)
selected as active one.
This resolves the issue by sending separate GLS (Get Link Status)
commands to all channels in the package before trying to do failover.
The last link states of all channels in the package are retrieved.
With it, C0 (not C1) is selected as active one as expected.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 20 Oct 2016 00:45:49 +0000 (11:45 +1100)]
net/ncsi: Avoid if statements in ncsi_suspend_channel()
There are several if/else statements in the state machine implemented
by switch/case in ncsi_suspend_channel() to avoid duplicated code. It
makes the code a bit hard to be understood.
This drops if/else statements in ncsi_suspend_channel() to improve the
code readability as Joel Stanley suggested. Also, it becomes easy to
add more states in the state machine without affecting current code.
No logical changes introduced by this.
Suggested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Blakey [Wed, 19 Oct 2016 14:42:39 +0000 (17:42 +0300)]
net/sched: act_mirred: Use passed lastuse argument
stats_update callback is called by NIC drivers doing hardware
offloading of the mirred action. Lastuse is passed as argument
to specify when the stats was actually last updated and is not
always the current time.
Fixes:
9798e6fe4f9b ('net: act_mirred: allow statistic updates from offloaded actions')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 20 Oct 2016 14:05:45 +0000 (16:05 +0200)]
mlxsw: pci: Fix reset wait for SwitchX2
SwitchX2 firmware does not implement reset done yet. Moreover, when
busy-polled for ready magic, that slows down firmware and reset takes
longer than the defined timeout, causing initialization to fail.
So restore the previous behaviour and just sleep in this case.
Fixes:
233fa44bd67a ("mlxsw: pci: Implement reset done check")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Elad Raz [Thu, 20 Oct 2016 14:05:44 +0000 (16:05 +0200)]
mlxsw: switchx2: Fix ethernet port initialization
When creating an ethernet port fails, we must move the port to disable,
otherwise putting the port in switch partition 0 (ETH) or 1 (IB) will
always fails.
Fixes:
31557f0f9755 ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 20 Oct 2016 14:05:43 +0000 (16:05 +0200)]
mlxsw: spectrum_router: Make mlxsw_sp_router_fib4_del return void and remove warn
The function return value is not checked anywhere. Also, the warning
causes huge slowdown when removing large number of FIB entries which
were not offloaded, because of ordering issue. Ido's preparing
a patchset to fix the ordering issue, but that is definitelly not
net tree material.
Fixes:
b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 20 Oct 2016 14:05:42 +0000 (16:05 +0200)]
mlxsw: spectrum_router: Use correct tree index for binding
By a mistake, there is tree index 0 passed to RALTB. Should be
MLXSW_SP_LPM_TREE_MIN.
Fixes:
b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
Reported-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Siverskog [Thu, 20 Oct 2016 07:05:09 +0000 (09:05 +0200)]
Bluetooth: btwilink: Fix probe return value
Probe functions should return 0 on success. This driver's probe
returns the value returned by hci_register_dev(), which is the hci
index. This works for systems with only one hci device (id = 0) but
for systems where the btwilink device ends up with an id larger than
0, things will start to fall apart.
Make the probe function return 0 on success.
Signed-off-by: Jacob Siverskog <jacob@teenage.engineering>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Giuseppe CAVALLARO [Wed, 19 Oct 2016 07:06:41 +0000 (09:06 +0200)]
stmmac: fix and review the ptp registration.
The commit commit
7086605a6ab5 ("stmmac: fix error check when init ptp")
breaks the procedure added by the
commit
efee95f42b5d ("ptp_clock: future-proofing drivers against PTP
subsystem becoming optional")
So this patch tries to re-import the logic added by the latest
commit above: it makes sense to have the stmmac_ptp_register
as void function and, inside the main, the stmmac_init_ptp can fails
in case of the capability cannot be supported by the HW.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre TORGUE <alexandre.torgue@st.com>
Cc: Rayagond Kokatanur <rayagond@vayavyalabs.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nicolas Pitre <nico@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Narajowski [Wed, 19 Oct 2016 08:20:27 +0000 (10:20 +0200)]
Bluetooth: Fix append max 11 bytes of name to scan rsp data
Append maximum of 10 + 1 bytes of name to scan response data.
Complete name is appended only if exists and is <= 10 characters.
Else append short name if exists or shorten complete name if not.
This makes sure name is consistent across multiple advertising
instances.
Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Florian Westphal [Mon, 17 Oct 2016 19:50:23 +0000 (21:50 +0200)]
netfilter: x_tables: suppress kmemcheck warning
Markus Trippelsdorf reports:
WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (
ffff88001e605480)
4055601e0088ffff000000000000000090686d81ffffffff0000000000000000
u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u
^
|RIP: 0010:[<
ffffffff8166e561>] [<
ffffffff8166e561>] nf_register_net_hook+0x51/0x160
[..]
[<
ffffffff8166e561>] nf_register_net_hook+0x51/0x160
[<
ffffffff8166eaaf>] nf_register_net_hooks+0x3f/0xa0
[<
ffffffff816d6715>] ipt_register_table+0xe5/0x110
[..]
This warning is harmless; we copy 'uninitialized' data from the hook ops
but it will not be used.
Long term the structures keeping run-time data should be disentangled
from those only containing config-time data (such as where in the list
to insert a hook), but thats -next material.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Dumazet [Tue, 18 Oct 2016 20:24:07 +0000 (13:24 -0700)]
tcp: do not export sysctl_tcp_low_latency
Since commit
b2fb4f54ecd4 ("tcp: uninline tcp_prequeue()") we no longer
access sysctl_tcp_low_latency from a module.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 18 Oct 2016 16:59:34 +0000 (18:59 +0200)]
rtnetlink: Add rtnexthop offload flag to compare mask
The offload flag is a status flag and should not be used by
FIB semantics for comparison.
Fixes:
37ed9493699c ("rtnetlink: add RTNH_F_EXTERNAL flag for fib offload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Tue, 18 Oct 2016 16:50:23 +0000 (18:50 +0200)]
switchdev: Execute bridge ndos only for bridge ports
We recently got the following warning after setting up a vlan device on
top of an offloaded bridge and executing 'bridge link':
WARNING: CPU: 0 PID: 18566 at drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:81 mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
[...]
CPU: 0 PID: 18566 Comm: bridge Not tainted 4.8.0-rc7 #1
Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
0000000000000286 00000000e64ab94f ffff880406e6f8f0 ffffffff8135eaa3
0000000000000000 0000000000000000 ffff880406e6f930 ffffffff8108c43b
0000005106e6f988 ffff8803df398840 ffff880403c60108 ffff880406e6f990
Call Trace:
[<
ffffffff8135eaa3>] dump_stack+0x63/0x90
[<
ffffffff8108c43b>] __warn+0xcb/0xf0
[<
ffffffff8108c56d>] warn_slowpath_null+0x1d/0x20
[<
ffffffffa01420d5>] mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
[<
ffffffffa0142195>] mlxsw_sp_port_attr_get+0xa5/0xb0 [mlxsw_spectrum]
[<
ffffffff816f151f>] switchdev_port_attr_get+0x4f/0x140
[<
ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
[<
ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
[<
ffffffff816f1d6b>] switchdev_port_bridge_getlink+0x5b/0xc0
[<
ffffffff816f2680>] ? switchdev_port_fdb_dump+0x90/0x90
[<
ffffffff815f5427>] rtnl_bridge_getlink+0xe7/0x190
[<
ffffffff8161a1b2>] netlink_dump+0x122/0x290
[<
ffffffff8161b0df>] __netlink_dump_start+0x15f/0x190
[<
ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
[<
ffffffff815fab46>] rtnetlink_rcv_msg+0x1a6/0x220
[<
ffffffff81208118>] ? __kmalloc_node_track_caller+0x208/0x2c0
[<
ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
[<
ffffffff815fa9a0>] ? rtnl_newlink+0x890/0x890
[<
ffffffff8161cf54>] netlink_rcv_skb+0xa4/0xc0
[<
ffffffff815f56f8>] rtnetlink_rcv+0x28/0x30
[<
ffffffff8161c92c>] netlink_unicast+0x18c/0x240
[<
ffffffff8161ccdb>] netlink_sendmsg+0x2fb/0x3a0
[<
ffffffff815c5a48>] sock_sendmsg+0x38/0x50
[<
ffffffff815c6031>] SYSC_sendto+0x101/0x190
[<
ffffffff815c7111>] ? __sys_recvmsg+0x51/0x90
[<
ffffffff815c6b6e>] SyS_sendto+0xe/0x10
[<
ffffffff817017f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
The problem is that the 8021q module propagates the call to
ndo_bridge_getlink() via switchdev ops, but the switch driver doesn't
recognize the netdev, as it's not offloaded.
While we can ignore calls being made to non-bridge ports inside the
driver, a better fix would be to push this check up to the switchdev
layer.
Note that these ndos can be called for non-bridged netdev, but this only
happens in certain PF drivers which don't call the corresponding
switchdev functions anyway.
Fixes:
99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Tested-by: Tamir Winetroub <tamirw@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 19 Oct 2016 13:57:08 +0000 (16:57 +0300)]
net: core: Correctly iterate over lower adjacency list
Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.8.0-rc7 #1
Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
task:
ffff88017edfea40 task.stack:
ffff88017ee10000
RIP: 0010:[<
ffffffff815dcc73>] [<
ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
Call Trace:
<IRQ>
[<
ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
[<
ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
[<
ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
[<
ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
[<
ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
[<
ffffffff815f2eb6>] neigh_update+0x306/0x740
[<
ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
[<
ffffffff8165ea3f>] arp_process+0x66f/0x700
[<
ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
[<
ffffffff8165ec29>] arp_rcv+0x139/0x1d0
[<
ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
[<
ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
[<
ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
[<
ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
[<
ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
[<
ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<
ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<
ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<
ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
[<
ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
[<
ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
[<
ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
[<
ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
[<
ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
[<
ffffffff81374815>] ? find_next_bit+0x15/0x20
[<
ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
[<
ffffffff810c9968>] ? load_balance+0x178/0x9b0
[<
ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
[<
ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
[<
ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
[<
ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
[<
ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
[<
ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
[<
ffffffff81091986>] tasklet_action+0xf6/0x110
[<
ffffffff81704556>] __do_softirq+0xf6/0x280
[<
ffffffff8109213f>] irq_exit+0xdf/0xf0
[<
ffffffff817042b4>] do_IRQ+0x54/0xd0
[<
ffffffff8170214c>] common_interrupt+0x8c/0x8c
The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.
Fix this by advancing the iterator and avoid the infinite loop.
Fixes:
7ce856aaaf13 ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Garver [Mon, 17 Oct 2016 20:30:12 +0000 (16:30 -0400)]
flow_dissector: Check skb for VLAN only if skb specified.
Fixes a panic when calling eth_get_headlen(). Noticed on i40e driver.
Fixes:
d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
Signed-off-by: Eric Garver <e@erig.me>
Reviewed-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Mon, 17 Oct 2016 15:17:51 +0000 (15:17 +0000)]
qed: Use list_move_tail instead of list_del/list_add_tail
Using list_move_tail() instead of list_del() + list_add_tail().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 17 Oct 2016 22:16:15 +0000 (00:16 +0200)]
rocker: fix maybe-uninitialized warning
In some rare configurations, we get a warning about the 'index' variable
being used without an initialization:
drivers/net/ethernet/rocker/rocker_ofdpa.c: In function ‘ofdpa_port_fib_ipv4.isra.16.constprop’:
drivers/net/ethernet/rocker/rocker_ofdpa.c:2425:92: warning: ‘index’ may be used uninitialized in this function [-Wmaybe-uninitialized]
This is a false positive, the logic is just a bit too complex for gcc
to follow here. Moving the intialization of 'index' a little further
down makes it clear to gcc that the function always returns an error
if it is not initialized.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 17 Oct 2016 22:16:09 +0000 (00:16 +0200)]
net/hyperv: avoid uninitialized variable
The hdr_offset variable is only if we deal with a TCP or UDP packet,
but as the check surrounding its usage tests for skb_is_gso()
instead, the compiler has no idea if the variable is initialized
or not at that point:
drivers/net/hyperv/netvsc_drv.c: In function ‘netvsc_start_xmit’:
drivers/net/hyperv/netvsc_drv.c:494:42: error: ‘hdr_offset’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
This adds an additional check for the transport type, which
tells the compiler that this path cannot happen. Since the
get_net_transport_info() function should always be inlined
here, I don't expect this to result in additional runtime
checks.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 17 Oct 2016 22:16:08 +0000 (00:16 +0200)]
net: bcm63xx: avoid referencing uninitialized variable
gcc found a reference to an uninitialized variable in the error handling
of bcm_enet_open, introduced by a recent cleanup:
drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open'
drivers/net/ethernet/broadcom/bcm63xx_enet.c:1129:2: warning: 'phydev' may be used uninitialized in this function [-Wmaybe-uninitialized]
This makes the use of that variable conditional, so we only reference it
here after it has been used before. Unlike my normal patches, I have not
build-tested this one, as I don't currently have mips test in my
randconfig setup.
Fixes:
625eb8667d6f ("net: ethernet: broadcom: bcm63xx: use phydev from struct net_device")
Cc: Philippe Reynes <tremyfr@gmail.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 17 Oct 2016 21:22:48 +0000 (14:22 -0700)]
soreuseport: do not export reuseport_add_sock()
reuseport_add_sock() is not used from a module,
no need to export it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Mon, 17 Oct 2016 20:28:10 +0000 (15:28 -0500)]
ibmvnic: Update MTU after device initialization
It is possible for the MTU to be changed during the initialization
process with the VNIC Server. Ensure that the net device is updated
to reflect the new MTU.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Mon, 17 Oct 2016 20:28:09 +0000 (15:28 -0500)]
ibmvnic: Fix GFP_KERNEL allocation in interrupt context
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Mon, 17 Oct 2016 20:56:29 +0000 (15:56 -0500)]
ibmvnic: Driver Version 1.0.1
Increment driver version to reflect features that have
been added since release.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Tue, 18 Oct 2016 16:09:48 +0000 (18:09 +0200)]
bridge: multicast: restore perm router ports on multicast enable
Satish reported a problem with the perm multicast router ports not getting
reenabled after some series of events, in particular if it happens that the
multicast snooping has been disabled and the port goes to disabled state
then it will be deleted from the router port list, but if it moves into
non-disabled state it will not be re-added because the mcast snooping is
still disabled, and enabling snooping later does nothing.
Here are the steps to reproduce, setup br0 with snooping enabled and eth1
added as a perm router (multicast_router = 2):
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
<empty>
At this point we have mcast enabled and eth1 as a perm router (value = 2)
but it is not in the router list which is incorrect.
After this change:
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
router ports on br0: eth1
Note: we can directly do br_multicast_enable_port for all because the
querier timer already has checks for the port state and will simply
expire if it's in blocking/disabled. See the comment added by
commit
9aa66382163e7 ("bridge: multicast: add a comment to
br_port_state_selection about blocking state")
Fixes:
561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle")
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 17 Oct 2016 22:05:30 +0000 (00:05 +0200)]
netfilter: nf_tables: avoid uninitialized variable warning
The newly added nft_range_eval() function handles the two possible
nft range operations, but as the compiler warning points out,
any unexpected value would lead to the 'mismatch' variable being
used without being initialized:
net/netfilter/nft_range.c: In function 'nft_range_eval':
net/netfilter/nft_range.c:45:5: error: 'mismatch' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This removes the variable in question and instead moves the
condition into the switch itself, which is potentially more
efficient than adding a bogus 'default' clause as in my
first approach, and is nicer than using the 'uninitialized_var'
macro.
Fixes:
0f3cd9b36977 ("netfilter: nf_tables: add range expression")
Link: http://patchwork.ozlabs.org/patch/677114/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tobias Klauser [Tue, 18 Oct 2016 09:22:54 +0000 (11:22 +0200)]
tcp: Remove unused but set variable
Remove the unused but set variable icsk in listening_get_next to fix the
following GCC warning when building with 'W=1':
net/ipv4/tcp_ipv4.c: In function ‘listening_get_next’:
net/ipv4/tcp_ipv4.c:1890:31: warning: variable ‘icsk’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Tue, 18 Oct 2016 08:51:25 +0000 (14:21 +0530)]
cxgb4: Fix number of queue sets corssing the limit
Do not let number of offload queue sets to go more than
MAX_OFLD_QSETS, which would otherwise crash the driver
on machines with cores more than MAX_OFLD_QSETS.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 18 Oct 2016 07:40:20 +0000 (09:40 +0200)]
ipv4: Remove unused but set variable
Remove the unused but set variable dev in ip_do_fragment to fix the
following GCC warning when building with 'W=1':
net/ipv4/ip_output.c: In function ‘ip_do_fragment’:
net/ipv4/ip_output.c:541:21: warning: variable ‘dev’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Tue, 18 Oct 2016 07:20:55 +0000 (09:20 +0200)]
dwc_eth_qos: enable flow control by default
Allow autoneg to enable flow control by default.
The behavior when autoneg is off has not changed.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Tue, 18 Oct 2016 07:20:33 +0000 (09:20 +0200)]
dwc_eth_qos: do not clear pause flags from phy_device->supported
phy_device->supported is originally set by the PHY driver.
The ethernet driver should filter phy_device->supported to only contain
flags supported by the IP.
The IP supports setting rx and tx flow control independently,
therefore SUPPORTED_Pause and SUPPORTED_Asym_Pause should not be cleared.
If the flags are cleared, pause frames cannot be enabled (even if they
are supported by the PHY).
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Tue, 18 Oct 2016 07:07:29 +0000 (09:07 +0200)]
net/hsr: Remove unused but set variable
Remove the unused but set variable master_dev in check_local_dest to fix
the following GCC warning when building with 'W=1':
net/hsr/hsr_forward.c: In function ‘check_local_dest’:
net/hsr/hsr_forward.c:303:21: warning: variable ‘master_dev’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Oct 2016 14:26:15 +0000 (10:26 -0400)]
Merge tag 'mac80211-for-davem-2016-10-18' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
This is relatively small, mostly to get the SG/crypto
from stack removal fix that crashes things when VMAP
stack is used in conjunction with software crypto.
Aside from that, we have:
* a fix for AP_VLAN usage with the nl80211 frame command
* two fixes (and two preparation patches) for A-MSDU, one
to discard group-addressed (multicast) and unexpected
4-address A-MSDUs, the other to validate A-MSDU inner
MAC addresses properly to prevent controlled port bypass
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Tue, 18 Oct 2016 06:16:03 +0000 (08:16 +0200)]
bnx2: fix locking when netconsole is used
Functions bnx2_reg_rd_ind(), bnx2_reg_wr_ind() and bnx2_ctx_wr()
can be called with IRQs disabled when netconsole is enabled. So they
should use spin_{,un}lock_irq{save,restore} instead of _bh variants.
Example call flow:
bnx2_poll()
->bnx2_poll_link()
->bnx2_phy_int()
->bnx2_set_remote_link()
->bnx2_shmem_rd()
->bnx2_reg_rd_ind()
-> spin_lock_bh(&bp->indirect_lock);
spin_unlock_bh(&bp->indirect_lock);
...
-> __local_bh_enable_ip
static inline void __local_bh_enable_ip(unsigned long ip)
WARN_ON_ONCE(in_irq() || irqs_disabled()); <<<<<< WARN
Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: Dept-HSGLinuxNICDev@qlogic.com
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 17 Oct 2016 17:03:04 +0000 (13:03 -0400)]
Merge branch 'net-driver-autoload'
Javier Martinez Canillas says:
====================
net: Fix module autoload for several platform drivers
I noticed that module autoload won't be working in a bunch of platform
drivers in the net subsystem and this patch series contains the fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:46 +0000 (11:05 -0300)]
net: dsa: bcm_sf2: Fix module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/dsa/bcm_sf2.ko | grep alias
alias: platform:brcm-sf2
After this patch:
$ modinfo drivers/net/dsa/bcm_sf2.ko | grep alias
alias: platform:brcm-sf2
alias: of:N*T*Cbrcm,bcm7445-switch-v4.0C*
alias: of:N*T*Cbrcm,bcm7445-switch-v4.0
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:45 +0000 (11:05 -0300)]
net: dsa: b53: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/dsa/b53/b53_mmap.ko | grep alias
$
After this patch:
$ modinfo drivers/net/dsa/b53/b53_mmap.ko | grep alias
alias: of:N*T*Cbrcm,bcm63xx-switchC*
alias: of:N*T*Cbrcm,bcm63xx-switch
alias: of:N*T*Cbrcm,bcm6368-switchC*
alias: of:N*T*Cbrcm,bcm6368-switch
alias: of:N*T*Cbrcm,bcm6328-switchC*
alias: of:N*T*Cbrcm,bcm6328-switch
alias: of:N*T*Cbrcm,bcm3384-switchC*
alias: of:N*T*Cbrcm,bcm3384-switch
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:44 +0000 (11:05 -0300)]
net: hisilicon: Fix hns_mdio module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/ethernet/hisilicon//hns_mdio.ko | grep alias
alias: platform:Hi-HNS_MDIO
alias: acpi*:HISI0141:*
After this patch:
$ modinfo drivers/net/ethernet/hisilicon//hns_mdio.ko | grep alias
alias: platform:Hi-HNS_MDIO
alias: of:N*T*Chisilicon,hns-mdioC*
alias: of:N*T*Chisilicon,hns-mdio
alias: of:N*T*Chisilicon,mdioC*
alias: of:N*T*Chisilicon,mdio
alias: acpi*:HISI0141:*
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:43 +0000 (11:05 -0300)]
net: qcom/emac: Fix module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/ethernet/qualcomm/emac/qcom-emac.ko | grep alias
alias: platform:qcom-emac
After this patch:
$ modinfo drivers/net/ethernet/qualcomm/emac/qcom-emac.ko | grep alias
alias: platform:qcom-emac
alias: of:N*T*Cqcom,fsm9900-emacC*
alias: of:N*T*Cqcom,fsm9900-emac
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:42 +0000 (11:05 -0300)]
net: hns: Fix hns_dsaf module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/ethernet/hisilicon/hns/hns_dsaf.ko | grep alias
alias: acpi*:HISI00B2:*
alias: acpi*:HISI00B1:*
After this patch:
$ modinfo drivers/net/ethernet/hisilicon/hns/hns_dsaf.ko | grep alias
alias: acpi*:HISI00B2:*
alias: acpi*:HISI00B1:*
alias: of:N*T*Chisilicon,hns-dsaf-v2C*
alias: of:N*T*Chisilicon,hns-dsaf-v2
alias: of:N*T*Chisilicon,hns-dsaf-v1C*
alias: of:N*T*Chisilicon,hns-dsaf-v1
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:41 +0000 (11:05 -0300)]
net: ethernet: nb8800: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ $ modinfo drivers/net/ethernet/aurora/nb8800.ko | grep alias
$
After this patch:
$ modinfo drivers/net/ethernet/aurora/nb8800.ko | grep alias
alias: of:N*T*Csigma,smp8734-ethernetC*
alias: of:N*T*Csigma,smp8734-ethernet
alias: of:N*T*Csigma,smp8642-ethernetC*
alias: of:N*T*Csigma,smp8642-ethernet
alias: of:N*T*Caurora,nb8800C*
alias: of:N*T*Caurora,nb8800
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Mon, 17 Oct 2016 14:05:40 +0000 (11:05 -0300)]
net: nps_enet: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/net/ethernet/ezchip/nps_enet.ko | grep alias
$
After this patch:
$ modinfo drivers/net/ethernet/ezchip/nps_enet.ko | grep alias
alias: of:N*T*Cezchip,nps-mgt-enetC*
alias: of:N*T*Cezchip,nps-mgt-enet
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Thu, 13 Oct 2016 06:42:17 +0000 (08:42 +0200)]
netfilter: nft_range: validate operation netlink attribute
Use nft_parse_u32_check() to make sure we don't get a value over the
unsigned 8-bit integer. Moreover, make sure this value doesn't go over
the two supported range comparison modes.
Fixes:
9286c2eb1fda ("netfilter: nft_range: validate operation netlink attribute")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Dan Carpenter [Wed, 12 Oct 2016 06:09:12 +0000 (09:09 +0300)]
netfilter: nft_exthdr: fix error handling in nft_exthdr_init()
"err" needs to be signed for the error handling to work.
Fixes:
36b701fae12a ('netfilter: nf_tables: validate maximum value of u32 netlink attributes')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Dan Carpenter [Wed, 12 Oct 2016 09:14:29 +0000 (12:14 +0300)]
netfilter: nf_tables: underflow in nft_parse_u32_check()
We don't want to allow negatives here.
Fixes:
36b701fae12a ('netfilter: nf_tables: validate maximum value of u32 netlink attributes')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Wed, 12 Oct 2016 13:10:45 +0000 (21:10 +0800)]
netfilter: nft_hash: add missing NFTA_HASH_OFFSET's nla_policy
Missing the nla_policy description will also miss the validation check
in kernel.
Fixes:
70ca767ea1b2 ("netfilter: nft_hash: Add hash offset value")
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Wed, 12 Oct 2016 13:09:22 +0000 (21:09 +0800)]
netfilter: xt_ipcomp: add "ip[6]t_ipcomp" module alias name
Otherwise, user cannot add related rules if xt_ipcomp.ko is not loaded:
# iptables -A OUTPUT -p 108 -m ipcomp --ipcompspi 1
iptables: No chain/target/match by that name.
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Tue, 11 Oct 2016 13:03:45 +0000 (21:03 +0800)]
netfilter: xt_NFLOG: fix unexpected truncated packet
Justin and Chris spotted that iptables NFLOG target was broken when they
upgraded the kernel to 4.8: "ulogd-2.0.5- IPs are no longer logged" or
"results in segfaults in ulogd-2.0.5".
Because "struct nf_loginfo li;" is a local variable, and flags will be
filled with garbage value, not inited to zero. So if it contains 0x1,
packets will not be logged to the userspace anymore.
Fixes:
7643507fe8b5 ("netfilter: xt_NFLOG: nflog-range does not truncate packets")
Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Reported-by: Chris Caputo <ccaputo@alt.net>
Tested-by: Chris Caputo <ccaputo@alt.net>
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>