Eric Garver [Fri, 6 Jan 2017 01:22:36 +0000 (20:22 -0500)]
udp: inuse checks can quit early for reuseport
UDP lib inuse checks will walk the entire hash bucket to check if the
portaddr is in use. In the case of reuseport we can stop searching when
we find a matching reuseport.
On a 16-core VM a test program that spawns 16 threads that each bind to
1024 sockets (one per 10ms) takes 1m45s. With this change it takes 11s.
Also add a cond_resched() when the port is not specified.
Signed-off-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 6 Jan 2017 11:22:10 +0000 (16:52 +0530)]
cxgb4: Add port description for new cards.
Add port description for 25G and 100G cards, and also
change few port descriptions in compliance with the new
naming convention.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 6 Jan 2017 11:21:46 +0000 (16:51 +0530)]
cxgb4/cxgb4vf: Display 25G and 100G link speed
Add support to report 25G and 100G links, which was missed
as part of commit "
eb97ad99f9ed".
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 6 Jan 2017 21:05:37 +0000 (16:05 -0500)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2017-01-06
This series contains updates/fixes to igb and e1000e.
Joe fixes indentation and improper line wrapping in igb.
David Singleton fixes an issue in e1000e where in systemd, where things
are done in parallel and can create a condition where e1000_shutdown is
called after e1000_close, hitting BUG_ON assert in free_msi_irqs.
Cao Jin fixes a code comment on the wakeup status register. Also fixes
a possible NULL pointer dereference by using igb_adapter->io_addr
instead of e1000_hw->hw_addr in igb_configure_tx_ring().
Chris Arges works around a firmware issue, which can cause probe of i210
NIC to fail, so zero the page select register during igb_get_phy_id() to
workaround the issue. Aaron Sierra adds also a check for this issue
during the initialization of PHY parameters to ensure that this same
issue happens after probe.
Todd fixes a possible race condition in close/suspend by extending
the rtnl_lock() to protect the call to netif_device_detach() and
igb_clear_interrupt_scheme(). Also adds i211 to a known i210/i211
workaround.
Hannu Lounento fixes inverted logic on a debug statement.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 6 Jan 2017 03:33:59 +0000 (19:33 -0800)]
net: ipv4: make fib_select_default static
fib_select_default has a single caller within the same file.
Make it static.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 6 Jan 2017 03:32:46 +0000 (19:32 -0800)]
net: ipv4: Simplify rt_fill_info
rt_fill_info has only 1 caller and both of the last 2 args -- nowait
and flags -- are hardcoded to 0. Given that remove them as input arguments
and simplify rt_fill_info accordingly.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Fri, 6 Jan 2017 03:17:20 +0000 (08:47 +0530)]
cxgb4: Synchronize access to mailbox
The issue comes when there are multiple threads attempting to use
the mailbox facility at the same time.
When DCB operations and interface up/down is run in a loop for every
0.1 sec, we observed mailbox collisions. And out of the two commands
one would fail with the present code, since we don't queue the second
command.
To overcome the above issue, added a queue to access the mailbox.
Whenever a mailbox command is issued add it to the queue. If its at
the head issue the mailbox command, else wait for the existing command
to complete. Usually command takes less than a milli-second to
complete.
Also timeout from the loop, if the command under execution takes
long time to run.
In reality, the number of mailbox access collisions is going to be
very rare since no one runs such abusive script.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 5 Jan 2017 19:08:58 +0000 (11:08 -0800)]
net: dsa: b53: Utilize common helpers for u64/MAC
Utilize the two functions recently introduced: u64_to_ether() and
ether_to_u64() instead of our own versions.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Thu, 5 Jan 2017 17:28:41 +0000 (12:28 -0500)]
net: dsa: remove version string
The dsa_driver_version string is irrelevant and has not been bumped
since its introduction about 9 years ago. Kill it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weilin Chang [Thu, 5 Jan 2017 00:18:50 +0000 (16:18 -0800)]
liquidio: fix wrong information about channels reported to ethtool
Information reported to ethtool about channels is sometimes wrong for PF,
and always wrong for VF. Fix them by getting the information from the
right fields from the right structs.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 4 Jan 2017 23:01:01 +0000 (15:01 -0800)]
ipv6: do not send RTM_DELADDR for tentative addresses
RTM_NEWADDR notification is sent when IFA_F_TENTATIVE is cleared from
the address. So if the address is added and deleted before DAD probes
completes, the RTM_DELADDR will be sent for which there was no
RTM_NEWADDR causing asymmetry in notification. However if the same
logic is used while sending RTM_DELADDR notification, this asymmetry
can be avoided.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prasad Kanneganti [Wed, 4 Jan 2017 19:31:55 +0000 (11:31 -0800)]
liquidio VF: fix incorrect struct being used
The VF driver is using the wrong struct when sending commands to the NIC
firmware, sometimes causing adverse effects in the firmware. The right
struct is the one that the PF is using, so make the VF use that as well.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannu Lounento [Mon, 2 Jan 2017 17:26:06 +0000 (18:26 +0100)]
igb: Fix hw_dbg logging in igb_update_flash_i210
Fix an if statement with hw_dbg lines where the logic was inverted with
regards to the corresponding return value used in the if statement.
Signed-off-by: Hannu Lounento <hannu.lounento@ge.com>
Signed-off-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Todd Fujinaka [Mon, 28 Nov 2016 17:09:57 +0000 (09:09 -0800)]
igb: add i211 to i210 PHY workaround
i210 and i211 share the same PHY but have different PCI IDs. Don't
forget i211 for any i210 workarounds.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Todd Fujinaka [Tue, 15 Nov 2016 16:54:26 +0000 (08:54 -0800)]
igb: close/suspend race in netif_device_detach
Similar to ixgbe, when an interface is part of a namespace it is
possible that igb_close() may be called while __igb_shutdown() is
running which ends up in a double free WARN and/or a BUG in
free_msi_irqs().
Extend the rtnl_lock() to protect the call to netif_device_detach() and
igb_clear_interrupt_scheme() in __igb_shutdown() and check for
netif_device_present() to avoid calling igb_clear_interrupt_scheme() a
second time in igb_close().
Also extend the rtnl lock in igb_resume() to netif_device_attach().
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Guilherme G Piccoli [Thu, 10 Nov 2016 18:46:43 +0000 (16:46 -0200)]
igb: re-assign hw address pointer on reset after PCI error
Whenever the igb driver detects the result of a read operation returns
a value composed only by F's (like 0xFFFFFFFF), it will detach the
net_device, clear the hw_addr pointer and warn to the user that adapter's
link is lost - those steps happen on igb_rd32().
In case a PCI error happens on Power architecture, there's a recovery
mechanism called EEH, that will reset the PCI slot and call driver's
handlers to reset the adapter and network functionality as well.
We observed that once hw_addr is NULL after the error is detected on
igb_rd32(), it's never assigned back, so in the process of resetting
the network functionality we got a NULL pointer dereference in both
igb_configure_tx_ring() and igb_configure_rx_ring(). In order to avoid
such bug, this patch re-assigns the hw_addr value in the slot_reset
handler.
Reported-by: Anthony H Thai <ahthai@us.ibm.com>
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Aaron Sierra [Tue, 29 Nov 2016 16:03:56 +0000 (10:03 -0600)]
igb: reset the PHY before reading the PHY ID
Several people have reported firmware leaving the I210/I211 PHY's page
select register set to something other than the default of zero. This
causes the first accesses, PHY_IDx register reads, to access something
else, resulting in device probe failure:
igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
igb: Copyright (c) 2007-2014 Intel Corporation.
igb: probe of 0000:01:00.0 failed with error -2
This problem began for them after a previous patch I submitted was
applied:
commit
2a3cdead8b408351fa1e3079b220fa331480ffbc
Author: Aaron Sierra <asierra@xes-inc.com>
Date: Tue Nov 3 12:37:09 2015 -0600
igb: Remove GS40G specific defines/functions
I personally experienced this problem after attempting to PXE boot from
I210 devices using this firmware:
Intel(R) Boot Agent GE v1.5.78
Copyright (C) 1997-2014, Intel Corporation
Resetting the PHY before reading from it, ensures the page select
register is in its default state and doesn't make assumptions about
the PHY's register set before the PHY has been probed.
Cc: Matwey V. Kornilov <matwey@sai.msu.ru>
Cc: Chris Arges <carges@vectranetworks.com>
Cc: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Tested-by: Matwey V. Kornilov <matwey@sai.msu.ru>
Tested-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cao jin [Tue, 8 Nov 2016 07:06:20 +0000 (15:06 +0800)]
igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
When running as guest, under certain condition, it will oops as following.
writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
is NULL. While other register access won't oops kernel because they use
wr32/rd32 which have a defense against NULL pointer.
[ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
error received: id=0101
[ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
severity=Uncorrected (Fatal), type=Unaccessible,
id=0101(Unregistered Agent ID)
[ 141.299442] igb 0000:01:00.1: broadcast error_detected message
[ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
detached
[ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
detached
[ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
[ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
[ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
[ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
[ 145.312078] igb 0000:01:00.1: broadcast resume message
[ 145.322211] BUG: unable to handle kernel paging request at
0000000000003818
[ 145.361275] IP: [<
ffffffffa02fd38d>]
igb_configure_tx_ring+0x14d/0x280 [igb]
[ 145.400048] PGD 0
[ 145.438007] Oops: 0002 [#1] SMP
A similar issue & solution could be found at:
http://patchwork.ozlabs.org/patch/689592/
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Chris J Arges [Wed, 2 Nov 2016 14:13:42 +0000 (09:13 -0500)]
igb: Workaround for igb i210 firmware issue
Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing
the probe of an igb i210 NIC to fail. This patch adds an addition zeroing
of this register during igb_get_phy_id to workaround this issue.
Thanks for Jochen Henneberg for the idea and original patch.
Signed-off-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cao jin [Wed, 2 Nov 2016 07:20:15 +0000 (15:20 +0800)]
igb: correct register comments
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
khalidm [Mon, 17 Oct 2016 16:51:08 +0000 (09:51 -0700)]
e1000e: driver trying to free already-free irq
During systemd reboot sequence network driver interface is shutdown
by e1000_close. The PCI driver interface is shut by e1000_shutdown.
The e1000_shutdown checks for netif_running status, if still up it
brings down driver. But it disables msi outside of this if statement,
regardless of netif status. All this is OK when e1000_close happens
after shutdown. However, by default, everything in systemd is done
in parallel. This creates a conditions where e1000_shutdown is called
after e1000_close, therefore hitting BUG_ON assert in free_msi_irqs.
CC: xe-kernel@external.cisco.com
Signed-off-by: khalidm <khalidm@cisco.com>
Signed-off-by: David Singleton <davsingl@cisco.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Joe Perches [Tue, 27 Sep 2016 03:46:25 +0000 (20:46 -0700)]
igb: Realign bad indentation
Statements should start on tabstops.
Use a single statement and test instead of multiple tests.
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sowmini Varadhan [Thu, 5 Jan 2017 19:06:22 +0000 (11:06 -0800)]
tools: psock_tpacket: block Rx until socket filter has been added and socket has been bound to loopback.
Packets from any/all interfaces may be queued up on the PF_PACKET socket
before it is bound to the loopback interface by psock_tpacket, and
when these are passed up by the kernel, they could interfere
with the Rx tests.
Avoid interference from spurious packet by blocking Rx until the
socket filter has been set up, and the packet has been bound to the
desired (lo) interface. The effective sequence is
socket(PF_PACKET, SOCK_RAW, 0);
set up ring
Invoke SO_ATTACH_FILTER
bind to sll_protocol set to ETH_P_ALL, sll_ifindex for lo
After this sequence, the only packets that will be passed up are
those received on loopback that pass the attached filter.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Soheil Hassas Yeganeh [Wed, 4 Jan 2017 16:19:34 +0000 (11:19 -0500)]
tcp: provide timestamps for partial writes
For TCP sockets, TX timestamps are only captured when the user data
is successfully and fully written to the socket. In many cases,
however, TCP writes can be partial for which no timestamp is
collected.
Collect timestamps whenever any user data is (fully or partially)
copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp
instead of the local skb pointer since it can be set to NULL on
the error path.
Note that tcp_write_queue_tail can be NULL, even if bytes have been
copied to the socket. This is because acknowledgements are being
processed in tcp_sendmsg(), and by the time tcp_tx_timestamp is
called tcp_write_queue_tail can be NULL. For such cases, this patch
does not collect any timestamps (i.e., it is best-effort).
This patch is written with suggestions from Willem de Bruijn and
Eric Dumazet.
Change-log V1 -> V2:
- Use sockc.tsflags instead of sk->sk_tsflags.
- Use the same code path for normal writes and errors.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 5 Jan 2017 17:43:28 +0000 (12:43 -0500)]
Merge tag 'rxrpc-rewrite-
20170105' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Update tracing and proc interfaces
This set of patches fixes and extends tracing:
(1) Fix the handling of enum-to-string translations so that external
tracing tools can make use of it by using TRACE_DEFINE_ENUM.
(2) Extend a couple of tracepoints to export some extra available
information and add three new tracepoints to allow monitoring of
received DATA packets, call disconnection and improper/implicit call
termination.
and adds a bit more procfs-exported information:
(3) Show a call's hard-ACK cursors in /proc/net/rxrpc_calls.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Volodymyr Bendiuga [Thu, 5 Jan 2017 10:10:13 +0000 (11:10 +0100)]
net:dsa: check for EPROBE_DEFER from dsa_dst_parse()
Since there can be multiple dsa switches stacked together but
not all of devicetree nodes available at the time of calling
dsa_dst_parse(), EPROBE_DEFER can be returned by it. When this
happens, only the last dsa switch has to be deleted by
dsa_dst_del_ds(), but not the whole list, because next time linux
cames back to this function it will try to add only the last dsa
switch which returned EPROBE_DEFER.
Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Volodymyr Bendiuga [Thu, 5 Jan 2017 09:44:18 +0000 (10:44 +0100)]
net:mv88e6xxx: use g2 interrupt for 6097 chip
This chip needs MV88E6XXX_FLAG_G2_INT
Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Thu, 5 Jan 2017 09:41:36 +0000 (10:41 +0100)]
net: xilinx: emaclite: Remove xemaclite_remove_ndev()
xemaclite_remove_ndev() is a simple wrapper around free_netdev()
checking for NULL before the call. All possible paths calling
it are guaranteed to pass a non-NULL argument, so rather call
free_netdev() directly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Thu, 5 Jan 2017 08:16:27 +0000 (09:16 +0100)]
net: ethoc: Remove unused members from struct ethoc
The io_region_size and dma_alloc members of struct ethoc are only
written but never read, so they might as well be removed.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 5 Jan 2017 16:03:07 +0000 (11:03 -0500)]
Merge git://git./linux/kernel/git/davem/net
David Howells [Thu, 5 Jan 2017 10:38:33 +0000 (10:38 +0000)]
rxrpc: Show a call's hard-ACK cursors in /proc/net/rxrpc_calls
Show a call's hard-ACK cursors in /proc/net/rxrpc_calls so that a call's
progress can be more easily monitored.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 5 Jan 2017 10:38:34 +0000 (10:38 +0000)]
rxrpc: Add some more tracing
Add the following extra tracing information:
(1) Modify the rxrpc_transmit tracepoint to record the Tx window size as
this is varied by the slow-start algorithm.
(2) Modify the rxrpc_rx_ack tracepoint to record more information from
received ACK packets.
(3) Add an rxrpc_rx_data tracepoint to record the information in DATA
packets.
(4) Add an rxrpc_disconnect_call tracepoint to record call disconnection,
including the reason the call was disconnected.
(5) Add an rxrpc_improper_term tracepoint to record implicit termination
of a call by a client either by starting a new call on a particular
connection channel without first transmitting the final ACK for the
previous call.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 5 Jan 2017 10:38:33 +0000 (10:38 +0000)]
rxrpc: Fix handling of enums-to-string translation in tracing
Fix the way enum values are translated into strings in AF_RXRPC
tracepoints. The problem with just doing a lookup in a normal flat array
of strings or chars is that external tracing infrastructure can't find it.
Rather, TRACE_DEFINE_ENUM must be used.
Also sort the enums and string tables to make it easier to keep them in
order so that a future patch to __print_symbolic() can be optimised to try
a direct lookup into the table first before iterating over it.
A couple of _proto() macro calls are removed because they refered to tables
that got moved to the tracing infrastructure. The relevant data can be
found by way of tracing.
Signed-off-by: David Howells <dhowells@redhat.com>
Daniel Borkmann [Thu, 5 Jan 2017 01:34:28 +0000 (02:34 +0100)]
packet: fix panic in __packet_set_timestamp on tpacket_v3 in tx mode
When TX timestamping is in use with TPACKET_V3's TX ring, then we'll
hit the BUG() in __packet_set_timestamp() when ring buffer slot is
returned to user space via tpacket_destruct_skb(). This is due to v3
being assumed as unreachable here, but since
7f953ab2ba46 ("af_packet:
TX_RING support for TPACKET_V3") it's not anymore. Fix it by filling
the timestamp back into the ring slot.
Fixes:
7f953ab2ba46 ("af_packet: TX_RING support for TPACKET_V3")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 5 Jan 2017 02:33:35 +0000 (18:33 -0800)]
Merge tag 'xfs-for-linus-4.10-rc3' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
- fixes for crashes and double-cleanup errors
- XFS maintainership handover
- fix to prevent absurdly large block reservations
- fix broken sysfs getter/setters
* tag 'xfs-for-linus-4.10-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix max_retries _show and _store functions
xfs: update MAINTAINERS
xfs: fix crash and data corruption due to removal of busy COW extents
xfs: use the actual AG length when reserving blocks
xfs: fix double-cleanup when CUI recovery fails
Linus Torvalds [Wed, 4 Jan 2017 22:14:53 +0000 (14:14 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) stmmac_drv_probe() can race with stmmac_open() because we register
the netdevice too early. Fix from Florian Fainelli.
2) UFO handling in __ip6_append_data() and ip6_finish_output() use
different tests for deciding whether a frame will be fragmented or
not, put them in sync. Fix from Zheng Li.
3) The rtnetlink getstats handlers need to validate that the netlink
request is large enough, fix from Mathias Krause.
4) Use after free in mlx4 driver, from Jack Morgenstein.
5) Fix setting of garbage UID value in sockets during setattr() calls,
from Eric Biggers.
6) Packet drop_monitor doesn't format the netlink messages properly
such that nlmsg_next fails to work, fix from Reiter Wolfgang.
7) Fix handling of wildcard addresses in l2tp lookups, from Guillaume
Nault.
8) __skb_flow_dissect() can crash on pptp packets, from Ian Kumlien.
9) IGMP code doesn't reset group query timers properly, from Michal
Tesar.
10) Fix overzealous MAIN/LOCAL route table combining in ipv4, from
Alexander Duyck.
11) vxlan offload check needs to be more strict in be2net driver, from
Sabrina Dubroca.
12) Moving l3mdev to packet hooks lost RX stat counters unintentionally,
fix from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
sh_eth: enable RX descriptor word 0 shift on SH7734
sfc: don't report RX hash keys to ethtool when RSS wasn't enabled
dpaa_eth: Initialize CGR structure before init
dpaa_eth: cleanup after init_phy() failure
net: systemport: Pad packet before inserting TSB
net: systemport: Utilize skb_put_padto()
LiquidIO VF: s/select/imply/ for PTP_1588_CLOCK
libcxgb: fix error check for ip6_route_output()
net: usb: asix_devices: add .reset_resume for USB PM
net: vrf: Add missing Rx counters
drop_monitor: consider inserted data in genlmsg_end
benet: stricter vxlan offloading check in be_features_check
ipv4: Do not allow MAIN to be alias for new LOCAL w/ custom rules
net: macb: Updated resource allocation function calls to new version of API.
net: stmmac: dwmac-oxnas: use generic pm implementation
net: stmmac: dwmac-oxnas: fix fixed-link-phydev leaks
net: stmmac: dwmac-oxnas: fix of-node leak
Documentation/networking: fix typo in mpls-sysctl
igmp: Make igmp group member RFC 3376 compliant
flow_dissector: Update pptp handling to avoid null pointer deref.
...
Andrew Lunn [Wed, 4 Jan 2017 18:56:24 +0000 (19:56 +0100)]
dsa: mv88e6xxx: Optimise atu_get
Lookup in the ATU can be performed starting from a given MAC
address. This is faster than starting with the first possible MAC
address and iterating all entries.
Entries are returned in numeric order. So if the MAC address returned
is bigger than what we are searching for, we know it is not in the
ATU.
Using the benchmark provided by Volodymyr Bendiuga
<volodymyr.bendiuga@gmail.com>,
https://www.spinics.net/lists/netdev/msg411550.html
on an Marvell Armada 370 RD, the test to add a number of static fdb
entries went from 1.616531 seconds to 0.312052 seconds.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Wed, 4 Jan 2017 20:10:23 +0000 (23:10 +0300)]
sh_eth: enable RX descriptor word 0 shift on SH7734
The RX descriptor word 0 on SH7734 has the RFS[9:0] field in bits 16-25
(bits 0-15 usually used for that are occupied by the packet checksum).
Thus we need to set the 'shift_rd0' field in the SH7734 SoC data...
Fixes:
f0e81fecd4f8 ("net: sh_eth: Add support SH7734")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Wed, 4 Jan 2017 15:10:56 +0000 (15:10 +0000)]
sfc: don't report RX hash keys to ethtool when RSS wasn't enabled
If we failed to set up RSS on EF10 (e.g. because firmware declared
RX_RSS_LIMITED), ethtool --show-nfc $dev rx-flow-hash ... should report
no fields, rather than confusingly reporting what fields we _would_ be
hashing on if RSS was working.
Fixes:
dcb4123cbec0 ("sfc: disable RSS when unsupported")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun V [Wed, 4 Jan 2017 13:34:20 +0000 (19:04 +0530)]
cxgb4: Support compressed error vector for T6
t6fw-1.15.15.0 enabled compressed error vector in cpl_rx_pkt for T6.
Updating driver to take care of these changes.
Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Arjun V <arjun@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Jan 2017 18:47:55 +0000 (13:47 -0500)]
Merge branch 'sh_eth-intrs-cleanup'
Sergei Shtylyov says:
====================
sh_eth: E-MAC interrupt handler cleanups
Here's a set of 3 patches against DaveM's 'net-next.git' repo. I'm cleaning
up the E-MAC interrupt handling with the main goal of factoring out the E-MAC
interrupt handler into a separate function.
[1/3] sh_eth: handle only enabled E-MAC interrupts
[2/3] sh_eth: no need for *else* after *goto*
[3/3] sh_eth: factor out sh_eth_emac_interrupt()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Wed, 4 Jan 2017 12:11:21 +0000 (15:11 +0300)]
sh_eth: factor out sh_eth_emac_interrupt()
The E-MAC interrupt (EESR.ECI) is not always caused by an error condition,
so it really shouldn't be handled by sh_eth_error(). Factor out the E-MAC
interrupt handler, sh_eth_emac_interrupt(), removing the ECI bit from the
EESR's values throughout the driver...
Update Cogent Embedded's copyright and clean up the whitespace in Renesas'
copyright, while at it...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Wed, 4 Jan 2017 12:10:50 +0000 (15:10 +0300)]
sh_eth: no need for *else* after *goto*
Well, checkpatch.pl complains about *else* after *return* and *break* but
not after *goto*... and it probably should have complained about the code
in sh_eth_error(). Win couple LoCs by removing that *else*. :-)
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Wed, 4 Jan 2017 12:10:21 +0000 (15:10 +0300)]
sh_eth: handle only enabled E-MAC interrupts
The driver should only handle the enabled E-MAC interrupts, like it does
for the E-DMAC interrupts since commit
3893b27345ac ("sh_eth: workaround
for spurious ECI interrupt"), so mask ECSR with ECSIPR when reading it
in sh_eth_error().
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Jan 2017 18:45:09 +0000 (13:45 -0500)]
Merge branch 'dpaa_eth-fixes'
Madalin Bucur says:
====================
dpaa_eth: a couple of fixes
Add cleanup on PHY initialization failure path, avoid using
uninitialized memory at CGR init.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roy Pledge [Wed, 4 Jan 2017 11:21:30 +0000 (13:21 +0200)]
dpaa_eth: Initialize CGR structure before init
The QBMan CGR options needs to be zeroed before calling the init
function
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Wed, 4 Jan 2017 11:21:29 +0000 (13:21 +0200)]
dpaa_eth: cleanup after init_phy() failure
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Jan 2017 18:33:30 +0000 (13:33 -0500)]
Merge branch 'systemport-padding-and-TSB-insertion'
Florian Fainelli says:
====================
net: systemport: Fix padding vs. TSB insertion
This patch series fixes how we pad the packets submitted to the SYSTEMPORT
adapter, and how the transmit status block (prepended 8 bytes) fits in the
picture. The first patch is not technically a bug fix, but is required for the
second path to be applied and to greatly simplify the skb length calculation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 4 Jan 2017 00:34:49 +0000 (16:34 -0800)]
net: systemport: Pad packet before inserting TSB
Inserting the TSB means adding an extra 8 bytes in front the of packet
that is going to be used as metadata information by the TDMA engine, but
stripped off, so it does not really help with the packet padding.
For some odd packet sizes that fall below the 60 bytes payload (e.g: ARP)
we can end-up padding them after the TSB insertion, thus making them 64
bytes, but with the TDMA stripping off the first 8 bytes, they could
still be smaller than 64 bytes which is required to ingress the switch.
Fix this by swapping the padding and TSB insertion, guaranteeing that
the packets have the right sizes.
Fixes:
80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 4 Jan 2017 00:34:48 +0000 (16:34 -0800)]
net: systemport: Utilize skb_put_padto()
Since we need to pad our packets, utilize skb_put_padto() which
increases skb->len by how much we need to pad, allowing us to eliminate
the test on skb->len right below.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Tue, 3 Jan 2017 20:47:16 +0000 (12:47 -0800)]
ipvlan: assign unique dev-id for each slave device.
IPvlan setup uses one mac-address (of master). The IPv6 link-local
addresses are derived using the mac-address on the link. Lack of
dev-ids makes these link-local addresses same for all slaves including
that of master device. dev-ids are necessary to add differentiation
when L2 address is shared.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 3 Jan 2017 19:31:49 +0000 (14:31 -0500)]
net: dsa: remove out label in dsa_switch_setup_one
The "out" label in dsa_switch_setup_one() is useless, thus remove it.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prasad Kanneganti [Tue, 3 Jan 2017 19:27:33 +0000 (11:27 -0800)]
liquidio: remove PTP support in 23XX adapters
liquidio driver incorrectly indicates that PTP is supported in 23XX
adapters; this patch fixes that. PTP is supported in 66XX and 68XX
adapters, and the driver correctly indicates that.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Pitre [Tue, 3 Jan 2017 18:57:00 +0000 (13:57 -0500)]
LiquidIO VF: s/select/imply/ for PTP_1588_CLOCK
Fix a minor fallout from the merge of the timers and the networking
trees. The following error may result if the PTP_1588_CLOCK
prerequisites are not available:
drivers/built-in.o: In function `ptp_clock_unregister':
(.text+0x40e0a5): undefined reference to `pps_unregister_source'
drivers/built-in.o: In function `ptp_clock_unregister':
(.text+0x40e0cc): undefined reference to `posix_clock_unregister'
drivers/built-in.o: In function `ptp_clock_event':
(.text+0x40e249): undefined reference to `pps_event'
drivers/built-in.o: In function `ptp_clock_register':
(.text+0x40e5e1): undefined reference to `pps_register_source'
drivers/built-in.o: In function `ptp_clock_register':
(.text+0x40e62c): undefined reference to `posix_clock_register'
drivers/built-in.o: In function `ptp_clock_register':
(.text+0x40e68d): undefined reference to `pps_unregister_source'
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Varun Prakash [Tue, 3 Jan 2017 15:55:48 +0000 (21:25 +0530)]
libcxgb: fix error check for ip6_route_output()
ip6_route_output() never returns NULL so
check dst->error instead of !dst.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Jan 2017 18:24:19 +0000 (13:24 -0500)]
net: Assert at build time the assumptions we make about the CMSG header.
It must always be the case that CMSG_ALIGN(sizeof(hdr)) == sizeof(hdr).
Otherwise there are missing adjustments in the various calculations
that parse and build these things.
Signed-off-by: David S. Miller <davem@davemloft.net>
yuan linyu [Tue, 3 Jan 2017 12:42:17 +0000 (20:42 +0800)]
scm: remove use CMSG{_COMPAT}_ALIGN(sizeof(struct {compat_}cmsghdr))
sizeof(struct cmsghdr) and sizeof(struct compat_cmsghdr) already aligned.
remove use CMSG_ALIGN(sizeof(struct cmsghdr)) and
CMSG_COMPAT_ALIGN(sizeof(struct compat_cmsghdr)) keep code consistent.
Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Chen [Tue, 3 Jan 2017 09:22:20 +0000 (17:22 +0800)]
net: usb: asix_devices: add .reset_resume for USB PM
The USB core may call reset_resume when it fails to resume asix device.
And USB core can recovery this abnormal resume at low level driver,
the same .resume at asix driver can work too. Add .reset_resume can
avoid disconnecting after backing from system resume, and NFS can
still be mounted after this commit.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 4 Jan 2017 17:03:37 +0000 (09:03 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"A set of fixes for the current series, one fixing a regression with
block size < page cache size in the alias series from Jan. Outside of
that, two small cleanups for wbt from Bart, a nvme pull request from
Christoph, and a few small fixes of documentation updates"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix up io_poll documentation
block: Avoid that sparse complains about context imbalance in __wbt_wait()
block: Make wbt_wait() definition consistent with declaration
clean_bdev_aliases: Prevent cleaning blocks that are not in block range
genhd: remove dead and duplicated scsi code
block: add back plugging in __blkdev_direct_IO
nvmet/fcloop: remove some logically dead code performing redundant ret checks
nvmet: fix KATO offset in Set Features
nvme/fc: simplify error handling of nvme_fc_create_hw_io_queues
nvme/fc: correct some printk information
nvme/scsi: Remove START STOP emulation
nvme/pci: Delete misleading queue-wrap comment
nvme/pci: Fix whitespace problem
nvme: simplify stripe quirk
nvme: update maintainers information
Linus Torvalds [Wed, 4 Jan 2017 17:00:57 +0000 (09:00 -0800)]
Merge tag 'fbdev-v4.10-rc2' of git://github.com/bzolnier/linux
Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
- bring fbdev subsystem back into Maintained mode
- add missing devm_ioremap() error checking to cobalt_lcdfb driver
* tag 'fbdev-v4.10-rc2' of git://github.com/bzolnier/linux:
video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
MAINTAINERS: add myself as maintainer of fbdev
Linus Torvalds [Wed, 4 Jan 2017 16:56:05 +0000 (08:56 -0800)]
Merge tag 'gcc-plugins-v4.10-rc3' of git://git./linux/kernel/git/kees/linux
Pull gcc-plugins fixes from Kees Cook:
"Small fixes for gcc-plugins when using certain gcc versions:
- update gcc-common.h for gcc 7 (Emese Revfy)
- fix latent_entropy type for early gcc on ARM (PaX Team)"
* tag 'gcc-plugins-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: update gcc-common.h for gcc-7
latent_entropy: fix ARM build error on earlier gcc
Arvind Yadav [Tue, 13 Dec 2016 08:20:52 +0000 (13:50 +0530)]
video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
Here, If devm_ioremap will fail. It will return NULL.
Kernel can run into a NULL-pointer dereference.
This error check will avoid NULL pointer dereference.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Yoichi Yuasa <yuasa@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Bartlomiej Zolnierkiewicz [Wed, 4 Jan 2017 11:58:44 +0000 (12:58 +0100)]
MAINTAINERS: add myself as maintainer of fbdev
I would like to help with fbdev maintenance. I can dedicate some time
for reviewing and handling patches but won't have time for much more.
The subsystem will remain in maintenance mode (no new drivers will be
added to it).
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Carlos Maiolino [Wed, 4 Jan 2017 04:34:17 +0000 (20:34 -0800)]
xfs: fix max_retries _show and _store functions
max_retries _show and _store functions should test against cfg->max_retries,
not cfg->retry_timeout
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Darrick J. Wong [Wed, 4 Jan 2017 02:39:34 +0000 (18:39 -0800)]
xfs: update MAINTAINERS
I am taking over as XFS maintainer from Dave Chinner[1], so update
contact information and git tree pointers.
[1] http://lkml.iu.edu/hypermail/linux/kernel/1612.1/04390.html
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Wed, 4 Jan 2017 02:39:33 +0000 (18:39 -0800)]
xfs: fix crash and data corruption due to removal of busy COW extents
There is a race window between write_cache_pages calling
clear_page_dirty_for_io and XFS calling set_page_writeback, in which
the mapping for an inode is tagged neither as dirty, nor as writeback.
If the COW shrinker hits in exactly that window we'll remove the delayed
COW extents and writepages trying to write it back, which in release
kernels will manifest as corruption of the bmap btree, and in debug
kernels will trip the ASSERT about now calling xfs_bmapi_write with the
COWFORK flag for holes. A complex customer load manages to hit this
window fairly reliably, probably by always having COW writeback in flight
while the cow shrinker runs.
This patch adds another check for having the I_DIRTY_PAGES flag set,
which is still set during this race window. While this fixes the problem
I'm still not overly happy about the way the COW shrinker works as it
still seems a bit fragile.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Darrick J. Wong [Wed, 4 Jan 2017 02:39:33 +0000 (18:39 -0800)]
xfs: use the actual AG length when reserving blocks
We need to use the actual AG length when making per-AG reservations,
since we could otherwise end up reserving more blocks out of the last
AG than there are actual blocks.
Complained-about-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 4 Jan 2017 02:39:32 +0000 (18:39 -0800)]
xfs: fix double-cleanup when CUI recovery fails
Dan Carpenter reported a double-free of rcur if _defer_finish fails
while we're recovering CUI items. Fix the error recovery to prevent
this.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Yotam Gigi [Tue, 3 Jan 2017 17:20:24 +0000 (19:20 +0200)]
net/sched: cls_matchall: Fix error path
Fix several error paths in matchall:
- Release reference to actions in case the hardware fails offloading
(relevant to skip_sw only)
- Fix error path in case tcf_exts initialization/validation fail
Fixes:
bf3994d2ed31 ("net/sched: introduce Match-all classifier")
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Tue, 3 Jan 2017 17:37:55 +0000 (09:37 -0800)]
net: vrf: Add missing Rx counters
The move from rx-handler to L3 receive handler inadvertantly dropped the
rx counters. Restore them.
Fixes:
74b20582ac38 ("net: l3mdev: Add hook in ip and ipv6")
Reported-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Moyer [Tue, 3 Jan 2017 22:51:33 +0000 (17:51 -0500)]
block: fix up io_poll documentation
/sys/block/<dev>/queue/io_poll is a boolean. Fix the docs.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
David S. Miller [Tue, 3 Jan 2017 21:23:01 +0000 (16:23 -0500)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2017-01-03
This series contains updates to ixgbe and ixgbevf only.
Emil fixes ixgbe to use the NVM settings for FEC, so do not override the
settings. Fixed the indirection table for x550, where newer devices can
support up to 64 RSS queues. Extends the rtnl_lock() to protect the call
to netif_device_detach() and ixgbe_clear_interrupt_scheme() to avoid
against a double free WARN and/or a BUG in free_msi_irqs(). Fixed AER
error handling by making sure that the driver frees the IRQs in
ixgbe_io_error_detected() when responding to a PCIe AER error, and to
restore them when the interface recovers.
Tony updates the driver to report the driver version to the firmware using
the host interface command for x550 devices. Fixed the PHY reset check
for x550em_ext_t PHY type. Fixed bounds checking for x540 devices to
ensure the index is valid for the LED function. Fixed the BaseT adapters
which support 100Mb capability and were not reporting the capability.
Ken Cox adds a missing check for the trusted bit before trying to set the
MACVLAN MAC address.
Yusuke Suzuki fixes an issue with 82599 and x540 devices where receive
timestamps were not working becase the bitwise operation for RX_HWSTAMP
falg was incorrect.
Don ensures that x553 KR/KX devices correctly advertise link speeds.
Adds the mailbox message to allow for VF promiscuous mode support.
Mark fixes two issues with EEPROM access, where the semaphore was not
being held until the entire response was read and the acquiring/releasing
of the semaphore is slow. Cleaned up firmware version method and
functions which are no longer used. Added new interfaces for firmware
commands to access some new PHYs.
v2: fixed tab indentation in patch 12 and mis-spelled words in patch 15
based on feedback from Sergei Shtylyov and Rami Rosen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Fri, 16 Dec 2016 02:18:32 +0000 (21:18 -0500)]
ixgbe: Add PF support for VF promiscuous mode
This patch extends the xcast mailbox message to include support for
unicast promiscuous mode. To allow a VF to enter this mode the PF
must be in promiscuous mode.
A later patch will add the support needed in the VF driver (ixgbevf)
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Fri, 16 Dec 2016 02:18:31 +0000 (21:18 -0500)]
ixgbevf: Add support for VF promiscuous mode
This patch extends the mailbox message to allow for VF promiscuous
mode support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Wed, 14 Dec 2016 19:02:16 +0000 (11:02 -0800)]
ixgbe: Implement support for firmware-controlled PHYs
Implement support for devices that have firmware-controlled PHYs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Wed, 14 Dec 2016 19:02:11 +0000 (11:02 -0800)]
ixgbe: Implement firmware interface to access some PHYs
Implement new interface for firmware commands to access some PHYs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Wed, 14 Dec 2016 19:02:06 +0000 (11:02 -0800)]
ixgbe: Remove unused firmware version functions and method
The firmware version method and functions are not used anywhere, so
remove them all.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Wed, 14 Dec 2016 19:02:00 +0000 (11:02 -0800)]
ixgbe: Fix issues with EEPROM access
There are two problems with EEPROM access. One is that it needs to
hold the semaphore until the entire response is read or else the
response can be corrupted by other firmware accesses. The second
problem is that acquiring and releasing the semaphore is slow, so
it should be taken and released once when multiple EEPROM accesses
will be done.
Both of these issues can be solved by adding a new function,
ixgbe_hic_unlocked, to issue firmware commands that will assume
that the caller has acquired the needed semaphore.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Wed, 14 Dec 2016 01:34:51 +0000 (20:34 -0500)]
ixgbe: Configure advertised speeds correctly for KR/KX backplane
This patch ensures that the advertised link speeds are configured
for X553 KR/KX backplane. Without this patch the link remains at
1G when resuming from low power after being downshifted by LPLU.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Wed, 23 Nov 2016 19:24:08 +0000 (11:24 -0800)]
ixgbevf: restore hw_addr on resume or error
Restore adapter->hw.hw_addr after handling an error, or a resume
operation to make sure we can access the registers.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Yusuke Suzuki [Mon, 21 Nov 2016 06:48:45 +0000 (06:48 +0000)]
ixgbe: Fix incorrect bitwise operations of PTP Rx timestamp flags
Rx timestamp does not work on 82599 and X540 because bitwise operation
of RX_HWTSTAMP flags is incorrect and ixgbe_ptp_rx_hwtstamp() is never
called. This patch fixes it to enable Rx timestamp on 82599 and X540.
Without this fix:
ptp4l[278.730]: selected /dev/ptp8 as PTP clock
ptp4l[278.733]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.733]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.834]: port 1: received SYNC without timestamp
ptp4l[278.835]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[279.834]: port 1: received SYNC without timestamp
ptp4l[280.834]: port 1: received SYNC without timestamp
ptp4l[281.834]: port 1: received SYNC without timestamp
ptp4l[282.834]: port 1: received SYNC without timestamp
ptp4l[282.835]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[282.835]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[283.834]: port 1: received SYNC without timestamp
With this fix:
ptp4l[239.154]: selected /dev/ptp8 as PTP clock
ptp4l[239.157]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[239.157]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[240.989]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[244.989]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[244.989]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[246.977]: master offset -
899583339542096 s0 freq +0 path delay 16222
ptp4l[247.977]: master offset -
899583339617265 s1 freq -75169 path delay 16177
ptp4l[248.977]: master offset -130 s2 freq -75299 path delay 16177
ptp4l[248.977]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[249.977]: master offset -9 s2 freq -75217 path delay 16177
ptp4l[250.977]: master offset 88 s2 freq -75123 path delay 16132
Fixes:
a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Yusuke Suzuki <yus-suzuki@uf.jp.nec.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Wed, 16 Nov 2016 19:25:34 +0000 (11:25 -0800)]
ixgbevf: fix AER error handling
Make sure that we free the IRQs in ixgbevf_io_error_detected() when
responding to an PCIe AER error and also restore them when the
interface recovers from it.
Previously it was possible to trigger BUG_ON() check in free_msix_irqs()
in the case where we call ixgbevf_remove() after a failed recovery from
AER error because the interrupts were not freed.
Also moved the down and free functions into ixgbevf_close_suspend()
same as with ixgbe.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Wed, 16 Nov 2016 17:48:02 +0000 (09:48 -0800)]
ixgbe: fix AER error handling
Make sure that we free the IRQs in ixgbe_io_error_detected() when
responding to an PCIe AER error and also restore them when the
interface recovers from it.
Previously it was possible to trigger BUG_ON() check in free_msix_irqs()
in the case where we call ixgbe_remove() after a failed recovery from
AER error because the interrupts were not freed.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ken Cox [Tue, 15 Nov 2016 19:00:37 +0000 (13:00 -0600)]
ixgbe: test for trust in macvlan adjustments for VF
There are two methods for setting mac addresses in a Macvlan, that
differentiate themselves in the function macvlan_set_mac_Address.
If the macvlan mode is passthru, then we use the dev_set_mac_address
method, otherwise we use the dev_uc api via macvlan_sync_addresses.
The latter method (which would stem from using any non-passthru mode,
like bridge, or vepa), calls down into the driver in a path that terminates
in ixgbevf_set_uc_addr_vf, which sends a IXGBE_VF_SET_MACVLAN message,
which causes the pf to spawn the noted error message. This occurs because
it appears that the guest is trying to delete the mac address of the macvlan
before adding another.
The other path in macvlan_set_mac_address uses dev_set_mac_address, which
calls into ixgbevf_set_mac which uses the IXGBE_VF_SET_MAC_ADDR to the
pf to set the macvlan mac address.
The discrepancy here is in the handlers. The handler function for
IXGBE_VF_SET_MAC_ADDR (ixgbe_set_vf_mac_addr) has a check for
the vfinfo[].trusted bit to allow the operation if the vf is trusted.
In comparison, the IXGBE_VF_SET_MACVLAN message handler
(ixgbe_set_vf_macvlan_msg) has no such check of the trusted bit.
Signed-off-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Fri, 11 Nov 2016 18:12:51 +0000 (10:12 -0800)]
ixgbevf: handle race between close and suspend on shutdown
When an interface is part of a namespace it is possible that
ixgbevf_close() may be called while ixgbevf_suspend() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs()
To handle this situation we extend the rtnl_lock() to protect the
call to netif_device_detach() and check for !netif_device_present()
to avoid entering close while in suspend.
Also added rtnl locks to ixgbevf_queue_reset_subtask().
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Fri, 11 Nov 2016 18:07:47 +0000 (10:07 -0800)]
ixgbe: handle close/suspend race with netif_device_detach/present
When an interface is part of a namespace it is possible that
ixgbe_close() may be called while __ixgbe_shutdown() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs().
To handle this situation we extend the rtnl_lock() to protect the
call to netif_device_detach() and ixgbe_clear_interrupt_scheme()
in __ixgbe_shutdown() and check for netif_device_present()
to avoid clearing the interrupts second time in ixgbe_close();
Also extend the rtnl lock in ixgbe_resume() to netif_device_attach().
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Fri, 11 Nov 2016 00:00:33 +0000 (16:00 -0800)]
ixgbe: Fix reporting of 100Mb capability
BaseT adapters that are capable of supporting 100Mb are not reporting this
capability. This patch corrects the reporting so that 100Mb is shown as
supported on those adapters.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Thu, 10 Nov 2016 17:57:29 +0000 (09:57 -0800)]
ixgbe: Reduce I2C retry count on X550 devices
A retry count of 10 is likely to run into problems on X550 devices that
have to detect and reset unresponsive CS4227 devices. So, reduce the I2C
retry count to 3 for X550 and above. This should avoid any possible
regressions in existing devices.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Wed, 9 Nov 2016 18:48:48 +0000 (10:48 -0800)]
ixgbe: Add bounds check for x540 LED functions
This is an extension of commit
003287e0f087 ("ixgbevf: Correct parameter
sent to LED function"); add bounds checking to x540 functions to ensure the
index is valid.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Fri, 4 Nov 2016 21:03:03 +0000 (14:03 -0700)]
ixgbe: add mask for 64 RSS queues
The indirection table was reported incorrectly for X550 and newer
where we can support up to 64 RSS queues.
Reported-by Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Mon, 31 Oct 2016 19:11:58 +0000 (12:11 -0700)]
ixgbe: Fix check for ixgbe_phy_x550em_ext_t reset
The generic PHY reset check we had previously is not sufficient for the
ixgbe_phy_x550em_ext_t PHY type. Check 1.CC02.0 instead - same as
ixgbe_init_ext_t_x550().
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Wed, 26 Oct 2016 23:25:18 +0000 (16:25 -0700)]
ixgbe: Report driver version to firmware for x550 devices
Some x550 devices require the driver version reported to its firmware; this
patch sends the driver version string to the firmware through the host
interface command for x550 devices.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Wed, 28 Sep 2016 23:01:48 +0000 (16:01 -0700)]
ixgbe: do not disable FEC from the driver
FEC is configured by the NVM and the driver should not be
overriding it.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Kees Cook [Fri, 16 Dec 2016 19:36:06 +0000 (11:36 -0800)]
gcc-plugins: update gcc-common.h for gcc-7
This updates gcc-common.h from Emese Revfy for gcc 7. This fixes issues seen
by Kugan and Arnd. Build tested with gcc 5.4 and 7 snapshot.
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Fri, 16 Dec 2016 20:59:31 +0000 (12:59 -0800)]
latent_entropy: fix ARM build error on earlier gcc
This fixes build errors seen on gcc-4.9.3 or gcc-5.3.1 for an ARM:
arm-soc/init/initramfs.c: In function 'error':
arm-soc/init/initramfs.c:50:1: error: unrecognizable insn:
}
^
(insn 26 25 27 5 (set (reg:SI 111 [ local_entropy.243 ])
(rotatert:SI (reg:SI 116 [ local_entropy.243 ])
(const_int -30 [0xffffffffffffffe2]))) -1
(nil))
Patch from PaX Team <pageexec@freemail.hu>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Linus Torvalds [Tue, 3 Jan 2017 18:50:05 +0000 (10:50 -0800)]
Merge branch 'parisc-4.10-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
- limit usage of processor-internal cr16 clocksource to UP systems only
- segfault info lines in syslog were too long, split those up
- drop own TIF_RESTORE_SIGMASK flag and switch to generic code
* 'parisc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add line-break when printing segfault info
parisc: Drop TIF_RESTORE_SIGMASK and switch to generic code
parisc: Mark cr16 clocksource unstable on SMP systems
David S. Miller [Tue, 3 Jan 2017 16:13:06 +0000 (11:13 -0500)]
Merge branch 'tipc-link-starvation'
Jon Maloy says:
====================
tipc: improve interaction socket-link
We fix a very real starvation problem that may occur when a link
encounters send buffer congestion. At the same time we make the
interaction between the socket and link layer simpler and more
consistent.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:11 +0000 (10:55 -0500)]
tipc: reduce risk of user starvation during link congestion
The socket code currently handles link congestion by either blocking
and trying to send again when the congestion has abated, or just
returning to the user with -EAGAIN and let him re-try later.
This mechanism is prone to starvation, because the wakeup algorithm is
non-atomic. During the time the link issues a wakeup signal, until the
socket wakes up and re-attempts sending, other senders may have come
in between and occupied the free buffer space in the link. This in turn
may lead to a socket having to make many send attempts before it is
successful. In extremely loaded systems we have observed latency times
of several seconds before a low-priority socket is able to send out a
message.
In this commit, we simplify this mechanism and reduce the risk of the
described scenario happening. When a message is attempted sent via a
congested link, we now let it be added to the link's backlog queue
anyway, thus permitting an oversubscription of one message per source
socket. We still create a wakeup item and return an error code, hence
instructing the sender to block or stop sending. Only when enough space
has been freed up in the link's backlog queue do we issue a wakeup event
that allows the sender to continue with the next message, if any.
The fact that a socket now can consider a message sent even when the
link returns a congestion code means that the sending socket code can
be simplified. Also, since this is a good opportunity to get rid of the
obsolete 'mtu change' condition in the three socket send functions, we
now choose to refactor those functions completely.
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:10 +0000 (10:55 -0500)]
tipc: modify struct tipc_plist to be more versatile
During multicast reception we currently use a simple linked list with
push/pop semantics to store port numbers.
We now see a need for a more generic list for storing values of type
u32. We therefore make some modifications to this list, while replacing
the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new
functions which will come to use in the next commits.
Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Tue, 3 Jan 2017 15:55:09 +0000 (10:55 -0500)]
tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions
The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very
similar. The latter function is also called from two locations, and
there will be more in the coming commits, which will all need to test on
different conditions.
Instead of making yet another duplicates of the function, we now
introduce a new macro tipc_wait_for_cond() where the wakeup condition
can be stated as an argument to the call. This macro replaces all
current and future uses of the two functions, which can now be
eliminated.
Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>