GitHub/LineageOS/android_kernel_motorola_exynos9610.git
8 years ago  rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor
David Howells [Tue, 23 Aug 2016 14:27:25 +0000 (15:27 +0100)]
rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor

Perform terminal call ACK/ABORT retransmission in the connection processor
rather than in the call processor.  With this change, once last_call is
set, no more incoming packets will be routed to the corresponding call or
any earlier calls on that channel (call IDs must only increase on a channel
on a connection).

Further, if a packet's callNumber is before the last_call ID or a packet is
aimed at a successfully completed service call, then that packet is discarded
and ignored.
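
The routing rule above can be sketched roughly as follows.  This is a
simplified userspace illustration, not the kernel code: struct channel_state,
enum pkt_action and classify_packet() are hypothetical names, the plain "<"
comparison ignores call ID wraparound, and the check for a successfully
completed service call is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-channel state (not the kernel's struct). */
struct channel_state {
	uint32_t last_call;     /* ID of the last terminated call on this channel */
	bool     last_call_set; /* true once last_call has been recorded */
};

enum pkt_action {
	PKT_TO_CALL,    /* hand the packet to the call */
	PKT_TO_CONN,    /* connection processor retransmits the final ACK/ABORT */
	PKT_DISCARD,    /* drop and ignore */
};

static enum pkt_action classify_packet(const struct channel_state *chan,
				       uint32_t call_number)
{
	if (!chan->last_call_set)
		return PKT_TO_CALL;
	if (call_number < chan->last_call)
		return PKT_DISCARD;     /* earlier call on this channel */
	if (call_number == chan->last_call)
		return PKT_TO_CONN;     /* terminal call: no longer routed to the call */
	return PKT_TO_CALL;             /* a later call; IDs only increase */
}
```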

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Calculate serial skew on packet reception
David Howells [Tue, 23 Aug 2016 14:27:25 +0000 (15:27 +0100)]
rxrpc: Calculate serial skew on packet reception

Calculate the serial number skew in the data_ready handler when a packet
has been received and a connection looked up.  The skew is cached in the
sk_buff's priority field.

The connection highest received serial number is updated at this time also.
This can be done without locks or atomic instructions because, at this
point, the code is serialised by the socket.

This generates more accurate skew data because if the packet is offloaded
to a work queue before this is determined, more packets may come in,
bumping the highest serial number and thereby increasing the apparent skew.

This also removes some unnecessary atomic ops.
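
As a rough illustration of the calculation, assuming the caller is already
serialised so that no locks or atomics are needed: struct conn_sketch and
packet_skew() are hypothetical names, and the real code stores the result in
the sk_buff rather than returning it.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-connection state. */
struct conn_sketch {
	uint32_t hi_serial;     /* highest serial number received so far */
};

/* Return how far behind the highest seen serial this packet is, and bump the
 * high-water mark if the packet is the newest so far.  Serial number
 * wraparound is not handled in this sketch. */
static uint32_t packet_skew(struct conn_sketch *conn, uint32_t serial)
{
	if (serial > conn->hi_serial) {
		conn->hi_serial = serial;   /* newest packet: no skew */
		return 0;
	}
	return conn->hi_serial - serial;
}
```

Doing this at reception time means later packets cannot inflate the value
before it is read.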

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Set connection expiry on idle, not put
David Howells [Tue, 23 Aug 2016 14:27:24 +0000 (15:27 +0100)]
rxrpc: Set connection expiry on idle, not put

Set the connection expiry time when a connection becomes idle rather than
doing this in rxrpc_put_connection().  This makes the put path more
efficient (it is likely to be called occasionally whilst a connection has
outstanding calls because active workqueue items need to be given a ref).

The time is also preset in the connection allocator in case the connection
never gets used.

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Use a tracepoint for skb accounting debugging
David Howells [Tue, 23 Aug 2016 14:27:24 +0000 (15:27 +0100)]
rxrpc: Use a tracepoint for skb accounting debugging

Use a tracepoint to log various skb accounting points to help in debugging
refcounting errors.

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Drop channel number field from rxrpc_call struct
David Howells [Tue, 23 Aug 2016 14:27:24 +0000 (15:27 +0100)]
rxrpc: Drop channel number field from rxrpc_call struct

Drop the channel number (channel) field from the rxrpc_call struct to
reduce the size of the call struct.  The field is redundant: if the call is
attached to a connection, the channel can be obtained from there by AND'ing
with RXRPC_CHANNELMASK.
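
A minimal sketch of that derivation, assuming the channel number lives in the
low bits of the connection ID and that there are four channels per connection
(both the value of RXRPC_MAXCALLS used here and the helper name
cid_to_channel() are assumptions made for illustration):

```c
#include <stdint.h>

/* Assumed for illustration: four call channels per connection. */
#define RXRPC_MAXCALLS    4
#define RXRPC_CHANNELMASK (RXRPC_MAXCALLS - 1)

/* Hypothetical helper: recover the channel number from a connection ID
 * instead of caching it in the call struct. */
static unsigned int cid_to_channel(uint32_t cid)
{
	return cid & RXRPC_CHANNELMASK;
}
```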

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: When clearing a socket, clear the call sets in the right order
David Howells [Tue, 23 Aug 2016 14:27:24 +0000 (15:27 +0100)]
rxrpc: When clearing a socket, clear the call sets in the right order

When clearing a socket, we should clear the securing-in-progress list
first, then the accept queue and last the main call tree because that's the
order in which a call progresses.  Not that a call should move from the
accept queue to the main tree whilst we're shutting down a socket, but a
call could possibly move from secureq to acceptq whilst we're clearing up.

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Tidy up the rxrpc_call struct a bit
David Howells [Tue, 23 Aug 2016 14:27:24 +0000 (15:27 +0100)]
rxrpc: Tidy up the rxrpc_call struct a bit

Do a little tidying of the rxrpc_call struct:

 (1) in_clientflag is no longer compared against the value that's in the
     packet, so keeping it in this form isn't necessary.  Use a flag in
     flags instead and provide a pair of wrapper functions.

 (2) We don't read the epoch value, so that can go.

 (3) Move what remains of the data that were used for hashing up in the
     struct to be with the channel number.

 (4) Get rid of the local pointer.  We can get at this via the socket
     struct and we only use this in the procfs viewer.
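
Point (1) might look something like the following.  The flag name, the struct
and the two wrappers are hypothetical stand-ins chosen for this sketch, not
the names used in the patch.

```c
#include <stdbool.h>

/* Hypothetical flag bit replacing a dedicated in_clientflag member. */
#define CALL_FLAG_IS_SERVICE (1UL << 0)

struct call_sketch {
	unsigned long flags;
};

/* Pair of wrappers so callers never test the bit directly. */
static bool call_is_service(const struct call_sketch *call)
{
	return (call->flags & CALL_FLAG_IS_SERVICE) != 0;
}

static bool call_is_client(const struct call_sketch *call)
{
	return !call_is_service(call);
}
```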

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  rxrpc: Remove RXRPC_CALL_PROC_BUSY
David Howells [Tue, 23 Aug 2016 14:27:23 +0000 (15:27 +0100)]
rxrpc: Remove RXRPC_CALL_PROC_BUSY

Remove RXRPC_CALL_PROC_BUSY as work queue items are now 100% non-reentrant.

Signed-off-by: David Howells <dhowells@redhat.com>
8 years ago  net: strparser: fix strparser sk_user_data check
Dave Watson [Mon, 22 Aug 2016 19:27:04 +0000 (12:27 -0700)]
net: strparser: fix strparser sk_user_data check

There is an sk_user_data mismatch between what kcm expects (psock) and what strparser expects (strparser).

Queued rx_work, for example calling strp_check_rcv after socket buffer changes, will never complete.

sk_user_data is unused in strparser, so just remove the check.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  qed: Fix address macros
Yuval Mintz [Tue, 23 Aug 2016 04:19:50 +0000 (07:19 +0300)]
qed: Fix address macros

Last FW submission reverted various macros into an older form,
where they generate compilation warnings on some architectures.

Bring back the newer macros instead.

Fixes: 05fafbfb3d77 ("qed: utilize FW 8.10.10.0")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge branch 'dsa-fix-MV88E6131-tagging'
David S. Miller [Tue, 23 Aug 2016 04:08:09 +0000 (21:08 -0700)]
Merge branch 'dsa-fix-MV88E6131-tagging'

Andrew Lunn says:

====================
Fix MV88E6131 tagging

Marvell has two different tagging protocols for frames passed to a
switch. There is the older DSA and the newer EDSA. Somewhere along the
way, we broke support for switches which only support DSA, by trying
to configure them to use EDSA. These patches add back support for
switches which only support DSA, by allowing the drivers to
dynamically indicate the tagging protocol they support to the DSA
core. This needs to be dynamic since the mv88e6xxx has to support two
protocols.

Thanks go to Jamie Lentin for reporting the problem, helping debug it,
providing some of the fix, and testing.
====================

Tested-By: Jamie Lentin <jm@lentin.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: mv88e6xxx: Enable PORT_CONTROL_FORWARD_UNKNOWN for DSA-tagged CPU ports
Jamie Lentin [Mon, 22 Aug 2016 14:01:04 +0000 (16:01 +0200)]
net: mv88e6xxx: Enable PORT_CONTROL_FORWARD_UNKNOWN for DSA-tagged CPU ports

Without it, a mv88e6131 switch will not forward incoming unicast
packets to the CPU port.

Signed-off-by: Jamie Lentin <jm@lentin.co.uk>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  dsa: mv88e6xxx: Delete ppu timer when removing module
Andrew Lunn [Mon, 22 Aug 2016 14:01:03 +0000 (16:01 +0200)]
dsa: mv88e6xxx: Delete ppu timer when removing module

The PPU method of accessing PHYs makes use of a timer. Make sure this
timer is deleted before unloading the driver.

Reported-by: Jamie Lentin <jm@lentin.co.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: dsa: mv88e6xxx: Fix support for DSA tagging for older switches.
Andrew Lunn [Mon, 22 Aug 2016 14:01:02 +0000 (16:01 +0200)]
net: dsa: mv88e6xxx: Fix support for DSA tagging for older switches.

Older chips only support DSA tagging on the CPU port. New devices
support both DSA and EDSA. The driver needs to tell the core the tag
protocol to use, and configure the switch for what is available.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: dsa: Allow the DSA driver to indicate the tag protocol
Andrew Lunn [Mon, 22 Aug 2016 14:01:01 +0000 (16:01 +0200)]
net: dsa: Allow the DSA driver to indicate the tag protocol

DSA drivers may drive different families of switches which need
different tag protocols. Rather than hard-coding the tag protocol in the
driver structure, have a callback for the DSA core to call.
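
The shape of the change can be illustrated with a small sketch.  All names
here (struct switch_sketch, struct switch_ops_sketch, mv_get_tag_protocol())
are hypothetical and only model the idea of replacing a fixed member with a
callback.

```c
#include <stdbool.h>

enum tag_proto { TAG_PROTO_DSA, TAG_PROTO_EDSA };

/* Hypothetical per-chip description. */
struct switch_sketch {
	bool supports_edsa;
};

struct switch_ops_sketch {
	/* Previously this would have been a fixed "enum tag_proto" member;
	 * now the core asks the driver, which can decide per chip. */
	enum tag_proto (*get_tag_protocol)(const struct switch_sketch *sw);
};

static enum tag_proto mv_get_tag_protocol(const struct switch_sketch *sw)
{
	return sw->supports_edsa ? TAG_PROTO_EDSA : TAG_PROTO_DSA;
}

static const struct switch_ops_sketch mv_ops = {
	.get_tag_protocol = mv_get_tag_protocol,
};
```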

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: ipconfig: Fix NULL pointer dereference on RARP/BOOTP/DHCP timeout
Geert Uytterhoeven [Mon, 22 Aug 2016 13:01:03 +0000 (15:01 +0200)]
net: ipconfig: Fix NULL pointer dereference on RARP/BOOTP/DHCP timeout

If no RARP, BOOTP, or DHCP response is received, ic_dev is never set,
causing a NULL pointer dereference in ic_close_devs():

    Sending DHCP requests ...... timed out!
    Unable to handle kernel NULL pointer dereference at virtual address 00000004

To fix this, add a check to avoid dereferencing ic_dev if it is still
NULL.
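
The kind of guard meant here, sketched in userspace with hypothetical types
(struct dev_sketch, struct ic_device_sketch and keep_device_open() are
illustrative only, not the ipconfig code):

```c
#include <stddef.h>

struct dev_sketch { int id; };
struct ic_device_sketch { struct dev_sketch *dev; };

/* Returns 1 if dev is the chosen boot device and must stay open.
 * chosen stays NULL when no RARP/BOOTP/DHCP reply arrived, so it has to be
 * checked before it is dereferenced. */
static int keep_device_open(const struct ic_device_sketch *chosen,
			    const struct dev_sketch *dev)
{
	return chosen != NULL && chosen->dev == dev;
}
```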

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: 2647cffb2bc6fbed ("net: ipconfig: Support using "delayed" DHCP replies")
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge tag 'batadv-next-for-davem-20160822' of git://git.open-mesh.org/linux-merge
David S. Miller [Tue, 23 Aug 2016 03:38:25 +0000 (20:38 -0700)]
Merge tag 'batadv-next-for-davem-20160822' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature patchset includes the following changes:

 - place kref_get near usage of referenced objects, separate patches
   for various used objects to improve readability and maintainability
   by Sven Eckelmann (18 patches)

 - Keep batadv net device when all hard interfaces disappear, to
   improve situations where tools currently use workarounds, by
   Sven Eckelmann

 - Add an option to disable debugfs support to minimize footprint when
   userspace uses netlink only, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge branch 'cxgb4-tx-rate-limiting'
David S. Miller [Tue, 23 Aug 2016 01:29:14 +0000 (18:29 -0700)]
Merge branch 'cxgb4-tx-rate-limiting'

Rahul Lakkireddy says:

====================
TX max rate limiting for Chelsio T4/T5 adapters

This series of patches implements tx max rate limiting per queue on
Chelsio T4/T5 hardware.  This is achieved by first creating a tx
scheduling class with the specified max rate.  The queue is then
bound to the newly created class.  If a scheduling class with similar
max rate already exists, then the queue is bound to the matching class.

Patch 1 adds support for setting tx scheduling classes.
Patch 2 adds support to bind/unbind queues to/from the scheduling classes.
Patch 3 implements the set_tx_maxrate NDO.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  cxgb4: add support for tx max rate limiting
Rahul Lakkireddy [Mon, 22 Aug 2016 10:59:08 +0000 (16:29 +0530)]
cxgb4: add support for tx max rate limiting

Implement set_tx_maxrate NDO to perform per queue tx rate limiting.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  cxgb4: add support for per queue tx scheduling
Rahul Lakkireddy [Mon, 22 Aug 2016 10:59:07 +0000 (16:29 +0530)]
cxgb4: add support for per queue tx scheduling

Add support to bind/unbind specified tx queues to/from scheduling
classes.  If a queue is already bound to a scheduling class, it is
unbound first and then bound to a new specified class.
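
The rebind rule can be sketched as below.  The structures, the UNBOUND marker
and the per-class reference count are assumptions made for this illustration
and do not describe the driver's actual data layout.

```c
#define UNBOUND (-1)

/* Hypothetical bookkeeping for the sketch. */
struct txq_sketch   { int class_idx; };  /* UNBOUND or index of bound class */
struct class_sketch { int refcnt; };     /* number of queues bound to it */

static void queue_unbind(struct txq_sketch *q, struct class_sketch *classes)
{
	if (q->class_idx != UNBOUND) {
		classes[q->class_idx].refcnt--;
		q->class_idx = UNBOUND;
	}
}

static void queue_bind(struct txq_sketch *q, struct class_sketch *classes,
		       int idx)
{
	queue_unbind(q, classes);   /* already bound? unbind first */
	classes[idx].refcnt++;
	q->class_idx = idx;
}
```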

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  cxgb4: add support for tx traffic scheduling classes
Rahul Lakkireddy [Mon, 22 Aug 2016 10:59:06 +0000 (16:29 +0530)]
cxgb4: add support for tx traffic scheduling classes

Add support to create tx traffic scheduling classes with specified
scheduling parameters.  Return an existing class if a match is found
with same scheduling parameters.
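
A rough sketch of the "reuse a matching class, otherwise create one" logic.
The table size, the struct layout and class_lookup_or_alloc() are hypothetical,
and max rate is the only scheduling parameter modelled.

```c
#include <stdint.h>

#define NUM_CLASSES 4   /* arbitrary table size for the sketch */

struct sched_class_sketch {
	int      in_use;
	uint32_t max_rate;
};

/* Return the index of a class with the requested rate, creating one in the
 * first free slot if none matches.  Returns -1 when the table is full. */
static int class_lookup_or_alloc(struct sched_class_sketch *tbl,
				 uint32_t max_rate)
{
	int free_slot = -1;

	for (int i = 0; i < NUM_CLASSES; i++) {
		if (tbl[i].in_use && tbl[i].max_rate == max_rate)
			return i;               /* reuse the matching class */
		if (!tbl[i].in_use && free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0) {
		tbl[free_slot].in_use = 1;
		tbl[free_slot].max_rate = max_rate;
	}
	return free_slot;
}
```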

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge branch 'qed-sriov-legacy'
David S. Miller [Tue, 23 Aug 2016 01:24:52 +0000 (18:24 -0700)]
Merge branch 'qed-sriov-legacy'

Yuval Mintz says:

====================
qed*: IOV patch series

Recent FW [8.10.10.0] enabled us to support sriov interaction
with legacy VF/PF. This patch series adds the necessary driver changes
to utilize this additional compatibility.
In addition, utilize the new FW ability to prevent pause floods by VFs,
and fix a bug that is [mostly] exposed by the added legacy support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  qed: Change locking scheme for VF channel
Yuval Mintz [Mon, 22 Aug 2016 10:25:12 +0000 (13:25 +0300)]
qed: Change locking scheme for VF channel

Each VF employs a lock that's supposed to serialize its usage of the
HW channel for communication with its PF, but the critical section is
ill-defined:

  - VFs currently release the lock whenever the PF response arrives,
    prior to actually processing the reply buffer [which was also supposed
    to have been protected by same lock].

  - The lock would be released on first response, ignoring the possibility
    the sw flow isn't over [as might be the case of the acquisition flow].
    As a result, the flow would run unprotected and would cause a double
    mutex release [as the additional message completion would release it
    while it's actually already free].

Change the flow to have a dedicated function to be called at end of each
flow and release the lock.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  qed*: Add support for VFs over legacy PFs
Yuval Mintz [Mon, 22 Aug 2016 10:25:11 +0000 (13:25 +0300)]
qed*: Add support for VFs over legacy PFs

Modern VFs can't run on old non-compatible PFs as the fastpath HSI is
slightly changed - but as the HSI is actually very close [basically,
a single bit whose meaning flipped] this can be supported with small
modifications.

The major differences would be in:
  - Recognizing that VF is running on top of a legacy PF.
  - Returning some slowpath configurations that are no longer needed
    on top of modern PFs, but would be required when working over
    the legacy ones.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  qed: Prevent VFs from pause flooding
Yuval Mintz [Mon, 22 Aug 2016 10:25:10 +0000 (13:25 +0300)]
qed: Prevent VFs from pause flooding

Firmware would silently drop any control frame sent by VF to prevent
a malicious VF from generating pause flood in the network.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  qed: Add support for legacy VFs
Yuval Mintz [Mon, 22 Aug 2016 10:25:09 +0000 (13:25 +0300)]
qed: Add support for legacy VFs

The 8.10.x FW added support for forward compatibility as well as
'future' backward compatibility, but only to those VFs that were
using HSI which was 8.10.x based or newer.

The latest firmware now supports backward compatibility for the
older VFs based on 8.7.x and 8.8.x firmware as well.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: phy: Add missing of_node_put() in xgmiitorgmii_probe()
Wei Yongjun [Sun, 21 Aug 2016 22:46:15 +0000 (22:46 +0000)]
net: phy: Add missing of_node_put() in xgmiitorgmii_probe()

This node pointer is returned by of_parse_phandle() with its
refcount incremented in this function. Call of_node_put() on it
before exiting this function.

This is detected by Coccinelle semantic patch.
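
The get/put pairing can be modelled in userspace as follows.  node_get() and
node_put() are hypothetical stand-ins for of_parse_phandle() and
of_node_put(); only the reference counting behaviour is modelled.

```c
#include <stddef.h>

struct node_sketch { int refcount; };

/* Stand-in for a lookup that returns the node with its refcount raised. */
static struct node_sketch *node_get(struct node_sketch *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node_sketch *n)
{
	if (n)
		n->refcount--;
}

static int probe_sketch(struct node_sketch *target)
{
	struct node_sketch *np = node_get(target);

	if (!np)
		return -1;
	/* ... use np to locate the attached device ... */
	node_put(np);   /* drop the reference before leaving the function */
	return 0;
}
```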

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Reviewed-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  mlx5/core: Use memdup_user() rather than duplicating its implementation
Markus Elfring [Sat, 20 Aug 2016 05:50:09 +0000 (07:50 +0200)]
mlx5/core: Use memdup_user() rather than duplicating its implementation

* Reuse existing functionality from memdup_user() instead of keeping
  duplicate source code.

  This issue was detected by using the Coccinelle software.

* Return directly if this copy operation failed.
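
A userspace analogue of the pattern, for illustration only: memdup_sketch()
and handle_request() are hypothetical, and where the kernel helper reports
failure through an error pointer this sketch simply returns NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a buffer and copy len bytes into it in one helper, instead of
 * open-coding the allocate/copy/free-on-error sequence in every caller. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Caller: return directly if the copy operation failed. */
static int handle_request(const void *ubuf, size_t len, unsigned char *first)
{
	unsigned char *buf;

	if (len == 0)
		return -1;
	buf = memdup_sketch(ubuf, len);
	if (!buf)
		return -1;
	*first = buf[0];
	free(buf);
	return 0;
}
```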

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net_sched: properly handle failure case of tcf_exts_init()
WANG Cong [Fri, 19 Aug 2016 19:36:54 +0000 (12:36 -0700)]
net_sched: properly handle failure case of tcf_exts_init()

After commit 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array")
we do dynamic allocation in tcf_exts_init(), therefore we need
to handle the ENOMEM case properly.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Revert "l2tp: Refactor the codes with existing macros instead of literal number"
David S. Miller [Sun, 21 Aug 2016 22:50:11 +0000 (15:50 -0700)]
Revert "l2tp: Refactor the codes with existing macros instead of literal number"

This reverts commit 5ab1fe72d5490978104fc493615ea29dd7238766.

This change still has problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  cxgb4: Simplify the return expression
Wei Yongjun [Sat, 20 Aug 2016 15:32:41 +0000 (15:32 +0000)]
cxgb4: Simplify the return expression

Simplify the return expression.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  l2tp: Refactor the codes with existing macros instead of literal number
Gao Feng [Sat, 20 Aug 2016 15:52:27 +0000 (23:52 +0800)]
l2tp: Refactor the codes with existing macros instead of literal number

Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff,
0x03, and 2 respectively.
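
For illustration, a header check written with the named constants might look
like this.  The two macros are redefined locally with the values quoted above
so the sketch compiles on its own, and ppp_addr_ctrl_len() is a hypothetical
helper, not a function from the l2tp code.

```c
#include <stddef.h>

/* Values as quoted in the commit message; redefined here for the sketch. */
#define PPP_ALLSTATIONS 0xff    /* All-Stations broadcast address */
#define PPP_UI          0x03    /* Unnumbered Information */

/* Number of bytes to skip if the frame starts with the PPP address/control
 * pair, or 0 if it does not. */
static size_t ppp_addr_ctrl_len(const unsigned char *data, size_t len)
{
	if (len >= 2 && data[0] == PPP_ALLSTATIONS && data[1] == PPP_UI)
		return 2;
	return 0;
}
```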

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: ethernet: renesas: ravb: use new api ethtool_{get|set}_link_ksettings
Philippe Reynes [Fri, 19 Aug 2016 22:52:19 +0000 (00:52 +0200)]
net: ethernet: renesas: ravb: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  net: ethernet: renesas: ravb: use phydev from struct net_device
Philippe Reynes [Fri, 19 Aug 2016 22:52:18 +0000 (00:52 +0200)]
net: ethernet: renesas: ravb: use phydev from struct net_device

The private structure contains a pointer to phydev, but the structure
net_device already contains such a pointer. So we can remove the pointer
phy_dev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Sun, 21 Aug 2016 04:58:49 +0000 (21:58 -0700)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2016-08-20

This series contains updates to ixgbe and ixgbevf.

Veola fixes how the backplane reports the media in ethtool, as KR, KX or
KX4 based on the backplane interface present.

Emil fixes ixgbevf since an incorrect size parameter for
ixgbevf_write_msg_read_ack() ended up only giving the PF the first 4
bytes of the MAC address, so correct the size by calculating it on the
fly for all instances where we call ixgbevf_write_msg_read_ack().  Added
geneve receive offload support for x550em_a.

Don fixes the LED interface for x557 since it uses a different interface.
Added support for the new x557 copper device.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  ixgbe: Add support for new X557 device
Don Skidmore [Thu, 18 Aug 2016 00:34:40 +0000 (20:34 -0400)]
ixgbe: Add support for new X557 device

This patch adds support for the new copper device X557.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  ixgbe: add device to MDIO speed setting
Don Skidmore [Wed, 17 Aug 2016 21:34:07 +0000 (17:34 -0400)]
ixgbe: add device to MDIO speed setting

This shouldn't matter as nothing should be attached, but to be
consistent, control the MDIO speed for these devices as well.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  ixgbe: Fix led interface for X557 devices
Don Skidmore [Wed, 17 Aug 2016 18:11:57 +0000 (14:11 -0400)]
ixgbe: Fix led interface for X557 devices

The X557 devices use a different interface to the LED for the port.
This patch reflects that change.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  ixgbe: add support for geneve Rx offload
Emil Tantilov [Wed, 10 Aug 2016 18:19:23 +0000 (11:19 -0700)]
ixgbe: add support for geneve Rx offload

Add geneve Rx offload support for x550em_a.

The implementation follows the vxlan code with the lower 16 bits of
the VXLANCTRL register holding the UDP port for VXLAN and the upper
for Geneve.
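
The register layout described above can be sketched as simple bit packing.
The helper names and macros are hypothetical, and any byte-order handling the
hardware needs is deliberately left out.

```c
#include <stdint.h>

#define UDP_PORT_MASK  0xffffu  /* low 16 bits: VXLAN UDP port */
#define GENEVE_SHIFT   16       /* high 16 bits: Geneve UDP port */

static uint32_t tunnel_ctrl_pack(uint16_t vxlan_port, uint16_t geneve_port)
{
	return ((uint32_t)geneve_port << GENEVE_SHIFT) | vxlan_port;
}

static uint16_t tunnel_ctrl_vxlan(uint32_t reg)
{
	return (uint16_t)(reg & UDP_PORT_MASK);
}

static uint16_t tunnel_ctrl_geneve(uint32_t reg)
{
	return (uint16_t)(reg >> GENEVE_SHIFT);
}
```

Because both ports share one register, updating one of them has to preserve
the other half.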

Disabled NFS filters in the RFCTL register which allows us to simplify
the check for VXLAN and Geneve packets in ixgbe_rx_checksum().

Removed vxlan from the name of the callback functions and replaced it
with udp_tunnel which is more in line with the new API.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  ixgbevf: fix incorrect MAC address on load
Emil Tantilov [Wed, 27 Jul 2016 17:55:08 +0000 (10:55 -0700)]
ixgbevf: fix incorrect MAC address on load

The PF driver was only receiving the first 4 bytes of the MAC due
to an incorrect size parameter for ixgbevf_write_msg_read_ack()
in ixgbevf_set_rar_vf().

Correct the size by calculating it on the fly for all instances where
we call ixgbevf_write_msg_read_ack().
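
To see why a hand-written size loses part of the address, here is a sketch
under an assumed message layout of one command word followed by the 6-byte MAC
padded to two 32-bit words.  struct set_mac_msg and fill_set_mac_msg() are
hypothetical; the point is only that the length is derived from the buffer
rather than typed in.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed layout: word 0 is the command, words 1-2 carry the MAC. */
struct set_mac_msg {
	uint32_t words[3];
};

/* Fill the message and return its length in 32-bit words, computed from the
 * buffer itself.  A hard-coded length of 2 would cover the command word and
 * only the first 4 bytes of the address. */
static unsigned int fill_set_mac_msg(struct set_mac_msg *m, uint32_t cmd,
				     const uint8_t mac[MAC_LEN])
{
	memset(m, 0, sizeof(*m));
	m->words[0] = cmd;
	memcpy(&m->words[1], mac, MAC_LEN);
	return ARRAY_SIZE(m->words);
}
```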

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  ixgbe: report correct media type for KR, KX and KX4 interfaces
Veola Nazareth [Sun, 21 Aug 2016 02:35:37 +0000 (19:35 -0700)]
ixgbe: report correct media type for KR, KX and KX4 interfaces

ethtool reports backplane type interfaces as 1000/10000baseT link modes.
This has been corrected to report the media as KR, KX or KX4 based on the backplane interface present.

Signed-off-by: Veola Nazareth <veola.nazareth@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  Merge branch 'tun-cleanups'
David S. Miller [Sun, 21 Aug 2016 02:11:34 +0000 (19:11 -0700)]
Merge branch 'tun-cleanups'

Markus Elfring says:

====================
tun: Fine-tuning for update_filter()

A few update suggestions were taken into account
from static source code analysis.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  tun: Rename a jump label in update_filter()
Markus Elfring [Sat, 20 Aug 2016 07:00:34 +0000 (09:00 +0200)]
tun: Rename a jump label in update_filter()

Adjust a jump target according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  tun: Use memdup_user() rather than duplicating its implementation
Markus Elfring [Sat, 20 Aug 2016 06:54:15 +0000 (08:54 +0200)]
tun: Use memdup_user() rather than duplicating its implementation

Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Sun, 21 Aug 2016 01:56:30 +0000 (18:56 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-08-19

This series contains updates to i40e and i40evf only.

Jake provides several patches, first just moves a function to co-locate
the two functions related to configuring RSS via the admin queue, which
should help in spotting bugs when comparing the two functions.  Fixed
an issue where commit e69ff813af35 ("i40e: rework the functions to
configure RSS with similar parameters") missed checking whether the seed
is NULL before using it and did not use the passed in *lut parameter.
Fixed an issue where a previous refactor missed i40e_vsi_config_rss()
and the values were being ignored, so checked for the fields and used
them instead of default values.  Lastly replaced calls to
create_singlethread_workqueue() with alloc_workqueue() to provide more
control over workqueue creation and to allow explicit setting of the
desired mode of operation.

Mitch adds link speed to log messages and reports speed through ethtool.

Carolyn refactors tail bump check and fixes byte ordering problems found
when enabling this feature support.  Adds support for HMC resources and
profile commands for x722 firmware.

Heinrich Schuchardt fixes format identifiers from %u to %d since the
variable is defined as an integer.

Catherine fixes an issue where there was a race condition between the
completion of the client open and calls to the client ops, so ensured
that client ops are not called until we are sure client is open.

Harshitha makes sure that i40e_client_release() does not try to use
an adapter pointer which may not be initialized, so make sure it is.

Joe Perches fixes the use of the local macro XSTRINGIFY() to use
__stringify() instead.

Avinash corrects the mutex usage in client_subtask().  Fixed the RDMA
client to open again after reset since it is closed during a PF reset.

Jeff (me) cleans up whitespace issues, where indentation was done
inconsistently and with spaces versus tabs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  samples/bpf: Add tunnel set/get tests.
William Tu [Fri, 19 Aug 2016 18:55:44 +0000 (11:55 -0700)]
samples/bpf: Add tunnel set/get tests.

The patch creates sample code exercising bpf_skb_{set,get}_tunnel_key,
and bpf_skb_{set,get}_tunnel_opt for GRE, VXLAN, and GENEVE.  A native
tunnel device is created in a namespace to interact with a lwtunnel
device out of the namespace, with metadata enabled.  The bpf_skb_set_*
program is attached to tc egress and bpf_skb_get_* is attached to egress
qdisc.  A ping between two tunnels is used to verify correctness and
the result of bpf_skb_get_* printed by bpf_trace_printk.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  hv_netvsc: Implement batching of receive completions
Haiyang Zhang [Fri, 19 Aug 2016 21:47:09 +0000 (14:47 -0700)]
hv_netvsc: Implement batching of receive completions

The existing code busy-retries when it is unable to send out receive
completions because the ring buffer is full. It also gives up retrying
after a limit is reached, which leaves receive buffer slots unrecycled.
This patch implements batching of receive completions. It also prevents
dropping receive completions due to full ring buffer.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years ago  i40evf: Open RDMA Client after reset
Avinash Dayanand [Wed, 17 Aug 2016 23:04:08 +0000 (16:04 -0700)]
i40evf: Open RDMA Client after reset

The RDMA client is closed during the PF reset and needs to be opened again.
Set the flag so that the RDMA client is opened in the watchdog() function.

Change-ID: I507b1e4cbd05528cdff68fd360ef3dcac8901263
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  i40e/i40evf: Fix indentation
Jeff Kirsher [Sat, 20 Aug 2016 04:47:41 +0000 (21:47 -0700)]
i40e/i40evf: Fix indentation

Several defines and code comments were indented with spaces instead
of tabs.  Correct the issue to make indentation consistent.

Change-ID: I0dc6bbb990ec4a9e856acc9ec526d876181f092c
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
8 years ago  i40e: Correcting mutex usage in client code
Avinash Dayanand [Wed, 17 Aug 2016 23:04:06 +0000 (16:04 -0700)]
i40e: Correcting mutex usage in client code

Correct the mutex usage: in client_subtask(), mutex_unlock() has to be
called just before client_del_instance(), since that function acquires
and later releases the same mutex again.
Similarly, remove the mutex from client_is_registered(), since it locks
the mutex twice.

This is a patch suggested by RDMA team.

Change-ID: Icce519c266e4221b8a2a72a15ba5bf01750e5852
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  i40e: Remove XSTRINGIFY macro definitions and uses
Joe Perches [Wed, 17 Aug 2016 10:37:31 +0000 (03:37 -0700)]
i40e: Remove XSTRINGIFY macro definitions and uses

Use __stringify instead.
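
For readers unfamiliar with the idiom, a stringify helper relies on two levels
of macro expansion so that the argument is expanded before it is turned into a
string.  The macros below are a local sketch of that technique, and the
version numbers are made up for the example.

```c
/* Two-level expansion: STRINGIFY(X) first expands X, then STRINGIFY_1()
 * applies the # operator to the result. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Hypothetical version numbers, for illustration only. */
#define DRV_VERSION_MAJOR 1
#define DRV_VERSION_MINOR 6

/* Adjacent string literals are concatenated by the compiler. */
static const char drv_version[] =
	STRINGIFY(DRV_VERSION_MAJOR) "." STRINGIFY(DRV_VERSION_MINOR);
```

With a single-level macro the result would be the macro names themselves
rather than their values, which is why a shared two-level helper is preferable
to per-driver copies.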

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  i40e: Initialize pointer in client_release function
Harshitha Ramamurthy [Mon, 15 Aug 2016 21:17:19 +0000 (14:17 -0700)]
i40e: Initialize pointer in client_release function

The function i40e_client_release has a print statement that uses an
adapter pointer which is not initialized if a previous if statement
is not true. Hence, initialize it in the right place.

Change-ID: I1afdaa2c46771ac42be56edcc41bb56b455b06c8
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years ago  i40e: Check client is open before calling client ops
Catherine Sullivan [Mon, 15 Aug 2016 21:17:18 +0000 (14:17 -0700)]
i40e: Check client is open before calling client ops

We were having a race between the completion of the client open and
calls to the client ops so don't call a client op unless we are sure the
client is open.

Testing Hints: Load IWARP driver and make sure it works as expected.

Change-Id: I741f4f2aa4fcbfdad3e40dabbbb1b005856c396b
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
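
The guard described in this commit can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions: the structures, the
atomic "opened" flag and every function name are hypothetical and are not the
i40e client code. It also covers only the open side of the race; a close that
runs concurrently with a call would need additional synchronisation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client model; not the i40e client structures. */
struct client_ops {
	void (*l2_param_changed)(void *ctx);
};

struct client_instance {
	atomic_bool opened;           /* set only after open has completed */
	const struct client_ops *ops;
	void *ctx;
};

/* Mark the instance open; called once the client's open has finished. */
static void client_mark_opened(struct client_instance *inst)
{
	atomic_store(&inst->opened, true);
}

/* Call the op only if the instance is fully open; report whether we did. */
static bool client_notify_l2_param(struct client_instance *inst)
{
	assert(inst != NULL);
	if (!atomic_load(&inst->opened))
		return false;         /* open still in flight: skip the op */
	if (!inst->ops || !inst->ops->l2_param_changed)
		return false;
	inst->ops->l2_param_changed(inst->ctx);
	return true;
}

/* Example op used for demonstration: counts how often it was invoked. */
static void count_calls(void *ctx)
{
	(*(int *)ctx)++;
}
```

Until client_mark_opened() has run, client_notify_l2_param() returns false and
the op is never invoked.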
8 years agoi40e: use matching format identifiers
Heinrich Schuchardt [Wed, 10 Aug 2016 23:07:22 +0000 (01:07 +0200)]
i40e: use matching format identifiers

i is defined as int but output as %u several times.
Adjust the format identifiers.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: Add support for HMC resource and profile for X722
Carolyn Wyborny [Thu, 4 Aug 2016 18:37:05 +0000 (11:37 -0700)]
i40e: Add support for HMC resource and profile for X722

This patch adds support for HMC resource and profile cmds for X722
firmware.

Change-ID: Icc332101f38ab15d1bfa167823100eb4f6822f7e
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: Fix byte ordering in ARP NS code for X722
Carolyn Wyborny [Thu, 4 Aug 2016 18:37:04 +0000 (11:37 -0700)]
i40e: Fix byte ordering in ARP NS code for X722

This patch fixes byte ordering problems found when enabling this feature
support. Without this patch, the feature will not work correctly. This
patch fixes the definitions to have the correct byte order.

Change-ID: Ic7489fbcbe2195df7be62ff5e359201b827cefe6
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: refactor tail_bump check
Carolyn Wyborny [Thu, 4 Aug 2016 18:37:03 +0000 (11:37 -0700)]
i40e: refactor tail_bump check

This patch refactors tail bump check.

Change-ID: Ide0e19171d67d90cb2b06b8dcd4fa791ae120160
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: report link speed
Mitch Williams [Thu, 4 Aug 2016 18:37:02 +0000 (11:37 -0700)]
i40evf: report link speed

The PF driver tells us the link speed, so do something with that
information. Add link speed to log messages, and report speed through
ethtool.

Change-Id: I279dc9540cc5203376406050a3e8d67e128d5882
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: use alloc_workqueue instead of create_singlethread_workqueue
Jacob Keller [Thu, 4 Aug 2016 18:37:01 +0000 (11:37 -0700)]
i40e: use alloc_workqueue instead of create_singlethread_workqueue

Replace calls to create_singlethread_workqueue with alloc_workqueue,
as is the style in other Intel drivers. This provides more control over
workqueue creation, and allows explicit setting of the desired mode of
operation. It also makes it more obvious that the driver name constant is
passed to a "%s" format.

Change-ID: I6192b44caf5140336cd54c5b350d51c73b541fdb
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: use configured RSS key and lookup table in i40e_vsi_config_rss
Jacob Keller [Tue, 19 Jul 2016 23:23:31 +0000 (16:23 -0700)]
i40e: use configured RSS key and lookup table in i40e_vsi_config_rss

A previous refactor added support to store user configuration for VSIs,
so that extra VSIs such as for VMDq can use this information when
configuring. Unfortunately the i40e_vsi_config_rss function was missed
in this refactor, and the values were being ignored. Fix this by
checking for the fields and using those instead of always using the
default values.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
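
The "use the stored user configuration, otherwise fall back to the default"
logic described above can be sketched in plain C. The struct, its field names
and both functions are hypothetical and exist only for illustration; the
round-robin default is an assumption about what a reasonable default table
looks like, not a copy of the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-VSI state; field names are illustrative only. */
struct vsi_rss_cfg {
	const uint8_t *user_lut;   /* NULL when the user configured nothing */
	uint16_t num_queues;
};

/* Default table: spread entries round-robin over the queues. */
static void fill_default_lut(uint8_t *lut, size_t lut_size, uint16_t num_queues)
{
	assert(num_queues > 0);
	for (size_t i = 0; i < lut_size; i++)
		lut[i] = (uint8_t)(i % num_queues);
}

/* Prefer the stored user table; fall back to the default otherwise. */
static void select_lut(const struct vsi_rss_cfg *cfg, uint8_t *lut,
		       size_t lut_size)
{
	if (cfg->user_lut)
		memcpy(lut, cfg->user_lut, lut_size);
	else
		fill_default_lut(lut, lut_size, cfg->num_queues);
}
```

The bug described in the commit corresponds to always taking the else branch,
regardless of what the user had configured.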
8 years agoi40e: fix broken i40e_config_rss_aq function
Jacob Keller [Tue, 19 Jul 2016 23:23:30 +0000 (16:23 -0700)]
i40e: fix broken i40e_config_rss_aq function

X722 hardware requires using the admin queue to configure RSS. This
function was previously re-written in commit e69ff813af35 ("i40e: rework
the functions to configure RSS with similar parameters").
However, the previous refactor did not work correctly for a few reasons:

(a) it does not check whether seed is NULL before using it, resulting in
a NULL pointer dereference

[  402.954721] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  402.955568] IP: [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e]
[  402.956402] PGD ad610067 PUD accc0067 PMD 0
[  402.957235] Oops: 0000 [#1] SMP
[  402.958064] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_filter ebtable_broute bridge stp llc ebtable_nat ebtables ip6table_mangle ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_security ip6table_filter ip6_tables iptable_mangle iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support shpchp sb_edac dcdbas pcspkr joydev ipmi_devintf wmi edac_core ipmi_ssif acpi_pad acpi_power_meter ipmi_si ipmi_msghandler mei_me nfsd lpc_ich mei ioatdma tpm_tis auth_rpcgss tpm nfs_acl lockd grace sunrpc xfs mgag200 i2c_algo_bit drm_kms_helper ttm drm ixgbe bnx2x i40e dca mdio ptp pps_core libcrc32c fjes crc32c_intel
[  402.965563] CPU: 22 PID: 2461 Comm: ethtool Not tainted 4.6.0-rc7_1.2-ABNidQ+ #20
[  402.966719] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 2.5.2 01/28/2015
[  402.967862] task: ffff880219b51dc0 ti: ffff8800b3408000 task.ti: ffff8800b3408000
[  402.969046] RIP: 0010:[<ffffffffa0090ccf>]  [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e]
[  402.970339] RSP: 0018:ffff8800b340ba90  EFLAGS: 00010246
[  402.971616] RAX: 0000000000000000 RBX: ffff88042ec14000 RCX: 0000000000000200
[  402.972961] RDX: ffff880428eb9200 RSI: 0000000000000000 RDI: ffff88042ec14000
[  402.974312] RBP: ffff8800b340baf8 R08: ffff880237ada8f0 R09: ffff880428eb9200
[  402.975709] R10: ffff880428eb9200 R11: 0000000000000000 R12: ffff88042ec2e000
[  402.977104] R13: ffff88042ec2e000 R14: ffff88042ec14000 R15: ffff88022ea00800
[  402.978541] FS:  00007f84fd054700(0000) GS:ffff880237ac0000(0000) knlGS:0000000000000000
[  402.980003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  402.981508] CR2: 0000000000000000 CR3: 000000003289e000 CR4: 00000000000406e0
[  402.983028] Stack:
[  402.984578]  0000000002000200 0000000000000000 ffff88023ffeda68 ffff88023ffef000
[  402.986187]  0000000000000268 ffff8800b340bbf8 ffff88023ffedd80 0000000088ce4f1d
[  402.987844]  ffff88042ec14000 ffff88022ea00800 ffff88042ec2e000 ffff88042ec14000
[  402.989509] Call Trace:
[  402.991200]  [<ffffffffa009636f>] i40e_config_rss+0x11f/0x1c0 [i40e]
[  402.992924]  [<ffffffffa00a1ae0>] i40e_set_rxfh+0xc0/0x130 [i40e]
[  402.994684]  [<ffffffff816d54b7>] ethtool_set_rxfh+0x1f7/0x300
[  402.996446]  [<ffffffff8136d02b>] ? cred_has_capability+0x6b/0x100
[  402.998203]  [<ffffffff8136d102>] ? selinux_capable+0x12/0x20
[  402.999968]  [<ffffffff8136277b>] ? security_capable+0x4b/0x70
[  403.001707]  [<ffffffff816d6da3>] dev_ethtool+0x1423/0x2290
[  403.003461]  [<ffffffff816eab41>] dev_ioctl+0x191/0x630
[  403.005186]  [<ffffffff811cf80a>] ? lru_cache_add+0x3a/0x80
[  403.006942]  [<ffffffff817f2a8e>] ? _raw_spin_unlock+0xe/0x20
[  403.008691]  [<ffffffff816adb95>] sock_do_ioctl+0x45/0x50
[  403.010421]  [<ffffffff816ae229>] sock_ioctl+0x209/0x2d0
[  403.012173]  [<ffffffff81262194>] do_vfs_ioctl+0xa4/0x6c0
[  403.013911]  [<ffffffff81262829>] SyS_ioctl+0x79/0x90
[  403.015710]  [<ffffffff817f2e72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[  403.017500] Code: 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 40 4c 8b a7 e0 05 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 <48> 8b 06 41 0f b7 bc 24 f2 0f 00 00 48 89 45 9c 48 8b 46 08 48
[  403.021454] RIP  [<ffffffffa0090ccf>] i40e_config_rss_aq.constprop.65+0x2f/0x1c0 [i40e]
[  403.023395]  RSP <ffff8800b340ba90>
[  403.025271] CR2: 0000000000000000
[  403.027169] ---[ end trace 64561b528cf61cf0 ]---

(b) it does not even bother to use the passed in *lut parameter which
defines the requested lookup table. Instead it uses its own round robin
table.

Fix these issues by re-writing it to be similar to i40e_config_rss_reg
and i40e_get_rss_aq.

Fixes: e69ff813af35 ("i40e: rework the functions to configure RSS with similar parameters", 2015-10-21)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
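
Points (a) and (b) above can be illustrated with a small userspace model: the
key is only programmed when a seed was actually passed, and the lookup table
that gets programmed is the one the caller supplied. This is a sketch, not the
driver function. struct rss_hw, config_rss() and the two size constants are
assumptions made for the example, and memcpy() stands in for the admin queue
commands.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSS_KEY_SIZE 52    /* illustrative sizes, not taken from the driver */
#define RSS_LUT_SIZE 512

/* Stand-in for the state the hardware would hold. */
struct rss_hw {
	uint8_t key[RSS_KEY_SIZE];
	uint8_t lut[RSS_LUT_SIZE];
};

/* Program key and/or lookup table; either pointer may be NULL. */
static int config_rss(struct rss_hw *hw, const uint8_t *seed,
		      const uint8_t *lut, size_t lut_size)
{
	assert(hw != NULL);

	/* (a) never dereference a missing seed */
	if (seed)
		memcpy(hw->key, seed, RSS_KEY_SIZE);

	/* (b) use the caller's table instead of a private round-robin one */
	if (lut) {
		if (lut_size != RSS_LUT_SIZE)
			return -EINVAL;
		memcpy(hw->lut, lut, lut_size);
	}
	return 0;
}
```

Passing a NULL seed together with a valid table, which is the combination that
triggered the oops above, now leaves the key untouched and updates only the
table.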
8 years agoi40e: move i40e_vsi_config_rss below i40e_get_rss_aq
Jacob Keller [Tue, 19 Jul 2016 23:23:29 +0000 (16:23 -0700)]
i40e: move i40e_vsi_config_rss below i40e_get_rss_aq

Move this function below the two functions related to configuring RSS
via the admin queue. This helps co-locate the two functions, and made it
easier to spot a bug in the first i40e_config_rss_aq function as
compared to the i40e_get_rss_aq function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agonet/irda: remove pointless assignment/check
Vegard Nossum [Fri, 19 Aug 2016 16:08:57 +0000 (18:08 +0200)]
net/irda: remove pointless assignment/check

We've already set sk to sock->sk and dereferenced it, so if it's NULL
we would have crashed already. Moreover, if it was NULL we would have
crashed anyway when jumping to 'out' and trying to unlock the sock.
Furthermore, if we had assigned a different value to 'sk' we would
have been calling lock_sock() and release_sock() on different sockets.

My conclusion is that these two lines are complete nonsense and only
serve to confuse the reader.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqed: utilize FW 8.10.10.0
Yuval Mintz [Fri, 19 Aug 2016 06:33:31 +0000 (09:33 +0300)]
qed: utilize FW 8.10.10.0

This new firmware for the qed* adapters fixes several issues:
 - Better blocking of malicious VFs.
 - After FLR, Tx-switching [internal routing] of packets might
   be incorrect.
 - Deletion of unicast MAC filters would sometimes have the side-effect
   of corrupting the MAC filters configured for a device.
It also contains fixes for future qed* drivers that *hopefully* would be
sent for review in the near future.

In addition, it would allow the driver some new functionality, including:
 - Allowing PF/VF driver compatibility with old drivers [running
   pre-8.10.5.0 firmware].
 - Better debug facilities.

This would also bump the qed* driver versions to 8.10.9.20.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Sat, 20 Aug 2016 00:19:20 +0000 (17:19 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2016-08-18

This series contains updates to ixgbe and ixgbevf.

Emil clears up confusion among users by turning an error message
into a debug message, since the TXDCTL.ENABLE (and comparable
VFTXDCTL.ENABLE for ixgbevf) bit is set only when the
transmit queue is actually enabled, which may not happen during the
configure phase even if we waited for it.  He converts ixgbevf to use the
netdev_dbg() macro instead of our home-brewed macro, and converts the
service task to use atomic bitwise operations when setting and checking
reset requests, to reduce the possibility of race conditions.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
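
As an illustration of why atomic bitwise operations help with reset requests,
here is a minimal userspace sketch using C11 atomics. The bit name and both
helpers are hypothetical; the drivers use the kernel's own bit operations
rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit position for a pending reset request. */
enum { RESET_REQUESTED_BIT = 0 };

/* Record a reset request; safe to call from several contexts at once. */
static void request_reset(atomic_ulong *state)
{
	assert(state != NULL);
	atomic_fetch_or(state, 1UL << RESET_REQUESTED_BIT);
}

/*
 * Atomically test and clear the request in a single step, so a request
 * raised between "test" and "clear" cannot be lost and one request is
 * never handled twice.
 */
static bool test_and_clear_reset(atomic_ulong *state)
{
	unsigned long mask = 1UL << RESET_REQUESTED_BIT;

	assert(state != NULL);
	return (atomic_fetch_and(state, ~mask) & mask) != 0;
}
```

A service task would call test_and_clear_reset() once per pass and perform the
reset only when it returns true.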
8 years agoMerge branch 'bcm_sf2-platform-dev'
David S. Miller [Sat, 20 Aug 2016 00:15:37 +0000 (17:15 -0700)]
Merge branch 'bcm_sf2-platform-dev'

Florian Fainelli says:

====================
net: dsa: bcm_sf2: Platform conversion

This patch series converts the bcm_sf2 driver from a traditional DSA driver
into a platform_device driver and makes it use the new DSA binding that Andrew
introduced in the latest merge window.

Prior attempts used to coerce the code in net/dsa/dsa2.c to accept the old
binding, while really there is only one broken single user out there: bcm_sf2,
so instead, just assume the new DT binding is deployed and use it accordingly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: bcm_sf2: Remove probing through old DSA binding
Florian Fainelli [Thu, 18 Aug 2016 22:30:16 +0000 (15:30 -0700)]
net: dsa: bcm_sf2: Remove probing through old DSA binding

Remove our dsa_switch_driver::drv_probe callback to prevent probing
through the old DSA binding, not that this could happen anymore now that
we have moved the matching compatible string from net/dsa/dsa.c to
drivers/net/dsa/bcm_sf2.c, so this is essentially dead code.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: bcm_sf2: Use device managed helpers
Florian Fainelli [Thu, 18 Aug 2016 22:30:15 +0000 (15:30 -0700)]
net: dsa: bcm_sf2: Use device managed helpers

Now that we have converted the drivers into a proper platform device
driver, we can use the device managed helper functions to simplify the
error paths a bit wrt. register resources and IRQs.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: bcm_sf2: Make it a real platform device driver
Florian Fainelli [Thu, 18 Aug 2016 22:30:14 +0000 (15:30 -0700)]
net: dsa: bcm_sf2: Make it a real platform device driver

The Broadcom Starfighter 2 switch driver should be a proper platform
driver, now that the DSA code has been updated to allow that, register a
switch device, feed it with the proper configuration data coming from
Device Tree and register our switch device with DSA.

The bulk of the changes consist in moving what bcm_sf2_sw_setup() did
into the platform driver probe function.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoDocumentation: dt: bindings: Update Broadcom 7445 switch document
Florian Fainelli [Thu, 18 Aug 2016 22:30:13 +0000 (15:30 -0700)]
Documentation: dt: bindings: Update Broadcom 7445 switch document

Document the updated binding which conforms to the new DSA binding in
net/dsa/dsa.txt.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: Export suspend/resume functions
Florian Fainelli [Thu, 18 Aug 2016 22:30:12 +0000 (15:30 -0700)]
net: dsa: Export suspend/resume functions

In preparation for allowing switch drivers to implement system-wide
suspend/resume functions, export dsa_switch_suspend and
dsa_switch_resume() such that these are callable from the appropriate
driver specific suspend/resume functions.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'mv88e6xxx-fix-wait'
David S. Miller [Sat, 20 Aug 2016 00:14:08 +0000 (17:14 -0700)]
Merge branch 'mv88e6xxx-fix-wait'

Andrew Lunn says:

====================
Fix mv88e6xxx wait function

The mv88e6xxx wait function can be upset if the system has lots of
other things to do and a sleep takes a lot longer than expected. Fix
this by using a fixed number of iterations, rather than a fixed
wallclock time.

With that change made, it is possible to consolidate another
wait function.

A wait actually timing out should not happen and when it does, it
means something serious is wrong. Make sure an error is logged,
since not all callers will log an error.
====================

Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodsa: mv88e6xxx: Make mv88e6xxx_wait() timeout verbose
Andrew Lunn [Thu, 18 Aug 2016 22:01:57 +0000 (00:01 +0200)]
dsa: mv88e6xxx: Make mv88e6xxx_wait() timeout verbose

When mv88e6xxx_wait() returns a timeout, something bad has
happened. Make sure it is noticed by logging an error.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodsa: mv88e6xxx: Use mv88e6xx_wait in mv88e6xxx_update()
Andrew Lunn [Thu, 18 Aug 2016 22:01:56 +0000 (00:01 +0200)]
dsa: mv88e6xxx: Use mv88e6xx_wait in mv88e6xxx_update()

Now that mv88e6xxx_wait() iterates on a counter rather than a fixed time
interval, it implements the same mechanism that mv88e6xxx_update() uses.
So use it in mv88e6xxx_update().

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodsa: mv88e6xxx: Timeout based on iterations, not time
Andrew Lunn [Thu, 18 Aug 2016 22:01:55 +0000 (00:01 +0200)]
dsa: mv88e6xxx: Timeout based on iterations, not time

The mv88e6xxx driver times out operations on the switch based on
looping until an elapsed wall clock time is reached. However, if
usleep_range() sleeps much longer than expected, it could time out with
an error without actually checking to see if the device has completed
the operation. So replace the elapsed time with a fixed upper bound on
the number of loops.

Testing on various switches has shown that switches take either 0 or
1 iteration, so a maximum of 16 iterations is a safe limit.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
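
The difference between the two timeout strategies can be shown with a small
userspace sketch of an iteration-bounded poll loop. It is not the driver code:
wait_not_busy(), the busy_fn callback type and busy_countdown() are
hypothetical names, and the sleep between polls is left out. Only the limit of
16 iterations is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define WAIT_MAX_ITERATIONS 16

/* Callback that reports whether the device is still busy. */
typedef bool (*busy_fn)(void *ctx);

/*
 * Poll up to WAIT_MAX_ITERATIONS times. The busy condition is checked on
 * every pass, so an overlong sleep can delay the result but can never
 * produce a timeout without the device having been polled.
 */
static int wait_not_busy(busy_fn is_busy, void *ctx)
{
	assert(is_busy != NULL);
	for (int i = 0; i < WAIT_MAX_ITERATIONS; i++) {
		if (!is_busy(ctx))
			return 0;
		/* a driver would sleep briefly here between polls */
	}
	return -ETIMEDOUT;
}

/* Demo device: reports busy for the next *ctx polls, then idle. */
static bool busy_countdown(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return true;
	}
	return false;
}
```

With a wall clock deadline, one long sleep can consume the whole budget before
the second poll happens. With a loop count, every one of the 16 polls takes
place no matter how long each sleep lasts.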
8 years agoMerge branch 'phy-next'
David S. Miller [Sat, 20 Aug 2016 00:11:50 +0000 (17:11 -0700)]
Merge branch 'phy-next'

Andrew Lunn says:

====================
PHY Kconfig and Makefile cleanup

The Ethernet PHY directory has slowly been getting more entries.
Split the entries in the Makefile and Kconfig into MDIO bus drivers
and PHYs. Within these two groups, sort them. This should reduce merge
conflicts and aid finding what one searches for.

The Kconfig text contains redundant "Driver for" and "Support for"
which add little value, make the vendor less obvious, and defeat the
shortcut key in the menu. Remove such text.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: phy: Kconfig: Remove redundant "Support for"
Andrew Lunn [Thu, 18 Aug 2016 21:56:06 +0000 (23:56 +0200)]
net: phy: Kconfig: Remove redundant "Support for"

Remove the redundant "Support for" and "Drivers for" from the Kconfig
short description. This makes the manufacturer much more prominent in
the list and makes the shortcut keys useful.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: phy: Sort Makefile and Kconfig
Andrew Lunn [Thu, 18 Aug 2016 21:56:05 +0000 (23:56 +0200)]
net: phy: Sort Makefile and Kconfig

Sort the files to reduce merge conflicts and to make it easier to find
drivers by name. Also separate the MDIO bus drivers from the PHY
drivers, again to help find what you need.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ipv4: fix sparse error in fib_good_nh()
Eric Dumazet [Thu, 18 Aug 2016 17:19:34 +0000 (10:19 -0700)]
net: ipv4: fix sparse error in fib_good_nh()

Fixes following sparse errors :

net/ipv4/fib_semantics.c:1579:61: warning: incorrect type in argument 2
(different base types)
net/ipv4/fib_semantics.c:1579:61:    expected unsigned int [unsigned]
[usertype] key
net/ipv4/fib_semantics.c:1579:61:    got restricted __be32 const
[usertype] nh_gw

Fixes: a6db4494d218c ("net: ipv4: Consider failed nexthops in multipath routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoudp: include addrconf.h
Eric Dumazet [Thu, 18 Aug 2016 16:59:12 +0000 (09:59 -0700)]
udp: include addrconf.h

Include ipv4_rcv_saddr_equal() definition to avoid this sparse error :

net/ipv4/udp.c:362:5: warning: symbol 'ipv4_rcv_saddr_equal' was not
declared. Should it be static?

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotcp: md5: remove tcp_md5_hash_header()
Eric Dumazet [Thu, 18 Aug 2016 16:49:55 +0000 (09:49 -0700)]
tcp: md5: remove tcp_md5_hash_header()

After commit 19689e38eca5 ("tcp: md5: use kmalloc() backed scratch
areas") this function is no longer used.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoibmvnic: Handle backing device failover and reinitialization
Thomas Falcon [Thu, 18 Aug 2016 16:37:51 +0000 (11:37 -0500)]
ibmvnic: Handle backing device failover and reinitialization

An upcoming feature of IBM VNIC protocol is the ability to configure
redundant backing devices for a VNIC client. In case of a failure
on the current backing device, the driver will receive a signal
from the hypervisor indicating that a failover will occur. The driver
will then wait for a message from the backing device before
establishing a new connection.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: hns: Add reset function support for RoCE driver
oulijun [Thu, 18 Aug 2016 12:32:52 +0000 (20:32 +0800)]
net: hns: Add reset function support for RoCE driver

Add a reset function for the RoCE driver. RoCE is a feature of hns.
On the hip06 SoC, the RoCE reset process needs to configure the dsaf
channel reset, port and sl map info. The RoCE reset function is
located in the dsaf module; the RoCE driver only calls it when needed.

This patch is used to fix the conflict, please refer to this link:
  https://www.spinics.net/lists/linux-rdma/msg39114.html

Signed-off-by: Wei Hu <xavier.huwei@huawei.com>
Signed-off-by: Nenglong Zhao <zhaonenglong@hisilicon.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Sheng Li <lisheng011@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'rhash-raw-walkers-remove-part-1'
David S. Miller [Fri, 19 Aug 2016 21:40:25 +0000 (14:40 -0700)]
Merge branch 'rhash-raw-walkers-remove-part-1'

Herbert Xu says:

====================
rhashtable: Get rid of raw table walkers part 1

This series starts the process of getting rid of all raw rhashtable
walkers (e.g., using any of the rht_for_each helpers) from the
kernel.

We need to do this before I can fix the resize kmalloc failure issue
by using multi-layered tables.

We should do this anyway because almost all raw table walkers are
already buggy in that they don't handle multiple rhashtables during
a resize.
====================

Dave/Tomas, please keep an eye out for any new patches that try
to introduce raw table walkers and nack them.

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonetlink: Use rhashtable walk interface in diag dump
Herbert Xu [Fri, 19 Aug 2016 08:21:37 +0000 (16:21 +0800)]
netlink: Use rhashtable walk interface in diag dump

This patch converts the diag dumping code to use the rhashtable
walk code instead of going through rhashtable by hand.  The lock
nl_table_lock is now only taken while we process the multicast
list as it's not needed for the rhashtable walk.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMAINTAINERS: Add extra rhashtable maintainer
Herbert Xu [Thu, 18 Aug 2016 08:50:57 +0000 (16:50 +0800)]
MAINTAINERS: Add extra rhashtable maintainer

As I'm working actively on rhashtable it helps if people CC me
when they work on it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agorhashtable: Remove GFP flag from rhashtable_walk_init
Herbert Xu [Thu, 18 Aug 2016 08:50:56 +0000 (16:50 +0800)]
rhashtable: Remove GFP flag from rhashtable_walk_init

The commit 8f6fd83c6c5ec66a4a70c728535ddcdfef4f3697 ("rhashtable:
accept GFP flags in rhashtable_walk_init") added a GFP flag argument
to rhashtable_walk_init because some users wish to use the walker
in an unsleepable context.

In fact we don't need to allocate memory in rhashtable_walk_init
at all.  The walker is always paired with an iterator so we could
just stash ourselves there.

This patch does that by introducing a new enter function to replace
the existing init function.  This way we don't have to churn all
the existing users again.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'cxgb-crypto'
David S. Miller [Fri, 19 Aug 2016 07:00:45 +0000 (00:00 -0700)]
Merge branch 'cxgb-crypto'

Hariprasad Shenai says:

====================
crypto/chcr: Add support for Chelsio Crypto Driver

This patch series adds support for Chelsio Crypto driver.

The patch series has been created against net-next tree and includes
patches for Chelsio Low Level Driver(cxgb4) and adds the new crypto
Upper Layer Driver(chcr) under a new directory drivers/crypto/chelsio.

Patch 1/4 ("cxgb4: Add support for dynamic allocation of resources for
ULD") adds support for dynamic allocation of resources for ULD. The
objective of this patch is to provide generic interface for upper layer
drivers to allocate and initialize hardware resources.

The present cxgb4 (network driver), apart from network functionality, also
initializes hardware and thus acts as a lower layer driver for other drivers
to use hardware resources. Thus it acts as both a Low level driver for
Upper layer drivers like iw_cxgb4, cxgb4i and cxgb4it and a Network Driver.

Right now the allocation of resources for Upper layer drivers is done
statically. Patch 1/4 adds a new infrastructure for dynamic allocation of
resources. cxgb4 will read the hardware capability through firmware and
allocate the queues for Upper layer drivers when the respective
drivers are loaded, and free them when they are unloaded.

Patches 2/4, 3/4 and 4/4 add support for the Chelsio Crypto Driver. The Crypto
driver will act as another ULD on top of cxgb4.

In this patch series, the ULD API framework is used only by crypto; other
ULDs will make use of it in the next series.

This patch series is only for review, if this looks ok we will test it
thoroughly and send request for merge.

We have included all the maintainers of respective drivers. Kindly
review the changes and provide feedback on the same.

V3: - Removed crypto queues from cxgb4 and added support for dynamic
      allocation of resources for Upper layer drivers
    - Dependency fix in Kconfig.

V2: - Some residual code cleanup
    - Adds pr_fmt with chcr (KBUILD_MODNAME)
    - Changes var names to accommodate them in <80 columns in chcr_register_alg
    - Support for printing the crypto queue stats
    - Fix compile warnings reported by kbuild bot for certain architectures
    - Dependency fix in Kconfig.
    - If the request has the MAY_BACKLOG bit set and hardware queue is
      full the request is queued up else -EBUSY is returned to throttle
      the user. The queue when executed and processed returns -EINPROGRESS
      in completion.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agocrypto: Added Chelsio Menu to the Kconfig file
Hariprasad Shenai [Wed, 17 Aug 2016 07:03:06 +0000 (12:33 +0530)]
crypto: Added Chelsio Menu to the Kconfig file

Adds the config entry for the Chelsio Crypto Driver, Makefile changes
for the same.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agochcr: Support for Chelsio's Crypto Hardware
Hariprasad Shenai [Wed, 17 Aug 2016 07:03:05 +0000 (12:33 +0530)]
chcr: Support for Chelsio's Crypto Hardware

Chelsio's Crypto Hardware can perform the following operations:
SHA1, SHA224, SHA256, SHA384 and SHA512, HMAC(SHA1), HMAC(SHA224),
HMAC(SHA256), HMAC(SHA384), HMAC(SHA512), AES-128-CBC, AES-192-CBC,
AES-256-CBC, AES-128-XTS, AES-256-XTS

This patch implements the driver for above mentioned features. This
driver is an Upper Layer Driver which is attached to Chelsio's LLD
(cxgb4) and uses the queue allocated by the LLD for sending the crypto
requests to the Hardware and receiving the responses from it.

The crypto operations can be performed by Chelsio's hardware from the
userspace applications and/or from within the kernel space using the
kernel's crypto API.

The above mentioned crypto features have been tested using kernel's
tests mentioned in testmgr.h. They also have been tested from user
space using libkcapi and Openssl.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agocxgb4: Register changes and fw defines for crypto
Hariprasad Shenai [Wed, 17 Aug 2016 07:03:04 +0000 (12:33 +0530)]
cxgb4: Register changes and fw defines for crypto

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agocxgb4: Add support for dynamic allocation of resources for ULD
Hariprasad Shenai [Wed, 17 Aug 2016 07:03:03 +0000 (12:33 +0530)]
cxgb4: Add support for dynamic allocation of resources for ULD

Add a new common infrastructure to allocate resources dynamically to
Upper layer drivers (ULDs) when they register with the cxgb4 driver and free
them during unregistering. All the queues and the interrupts for
them will be allocated during ULD probe only and freed during remove.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoatm: fore200e: Do not drop const qualifier
LABBE Corentin [Wed, 17 Aug 2016 13:56:45 +0000 (15:56 +0200)]
atm: fore200e: Do not drop const qualifier

The data member of struct firmware is const, and this constness is
dropped by some casts.
This patch adds const qualifiers to keep the const information.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
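
To show what keeping the qualifier looks like in practice, here is a minimal
userspace sketch. struct fw_blob models only the size and data fields of the
kernel's struct firmware; read_le32() and fw_first_word() are hypothetical
helpers and are not part of the fore200e driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the two fields of the kernel's struct firmware used here. */
struct fw_blob {
	size_t size;
	const uint8_t *data;
};

/* Read a little-endian 32-bit value without writing to the buffer. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the first 32-bit word of the image, or 0 if it is too short. */
static uint32_t fw_first_word(const struct fw_blob *fw)
{
	assert(fw != NULL);

	/* No cast needed: the pointer stays const all the way down. */
	const uint8_t *hdr = fw->data;

	if (fw->size < 4)
		return 0;
	return read_le32(hdr);
}
```

Because every pointer on the path is const-qualified, an accidental write to
the firmware image is rejected at compile time instead of being hidden by a
cast.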
8 years agoMerge branch 'bpf-next'
David S. Miller [Fri, 19 Aug 2016 06:38:17 +0000 (23:38 -0700)]
Merge branch 'bpf-next'

Daniel Borkmann says:

====================
BPF helper improvements and cleanups

This set adds various improvements to BPF helpers, a cleanup to use
skb_pkt_type_ok() helper, addition of bpf_skb_change_tail(), a follow
up for event output helper and removing ifdefs around the cgroupv2
helper bits. For details please see individual patches.

The set is based against net-next tree, but requires a merge of net
into net-next first.

Thanks a lot!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: get rid of cgroup helper related ifdefs
Daniel Borkmann [Wed, 17 Aug 2016 23:00:41 +0000 (01:00 +0200)]
bpf: get rid of cgroup helper related ifdefs

As recently discussed during the task_under_cgroup_hierarchy() addition,
we should get rid of the ifdefs surrounding the bpf_skb_under_cgroup()
helper. If related functionality is not built-in, the helper cannot be
used anyway, which is also in line with what we do for all other helpers.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: enable event output helper also for xdp types
Daniel Borkmann [Wed, 17 Aug 2016 23:00:40 +0000 (01:00 +0200)]
bpf: enable event output helper also for xdp types

Follow-up to 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for
event output") for also adding the event output helper for XDP typed
programs. The event output helper has been very useful, in particular for
debugging or event notification purposes, since it's much faster and more
flexible than regular trace printk because metadata can be attached
programmatically. The same flags structure applies as with tc BPF programs.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: add bpf_skb_change_tail helper
Daniel Borkmann [Wed, 17 Aug 2016 23:00:39 +0000 (01:00 +0200)]
bpf: add bpf_skb_change_tail helper

This work adds a bpf_skb_change_tail() helper for tc BPF programs. The
basic idea is to expand or shrink the skb in a controlled manner. The
eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(),
bpf_lX_csum_replace() and others rather than passing a raw buffer for
writing here.

bpf_skb_change_tail() is really a slow-path helper, intended for
replies with e.g. ICMP control messages. The concept is similar to other
helpers like bpf_skb_change_proto(): keep the helper free of protocol
specifics and let the BPF program mangle the remaining parts.
A flags field has been added and is reserved for now in case we extend
the helper in the future.
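
The resize-then-rewrite idea can be sketched in plain userspace C. This is
only a model under stated assumptions, not the kernel helper: struct pkt,
pkt_change_tail(), pkt_store_bytes(), PKT_MAX_LEN, the error codes and the
zero-filling of the grown region are all hypothetical choices made for the
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PKT_MAX_LEN 256	/* hypothetical upper bound for this model */

/* Hypothetical packet buffer standing in for an skb. */
struct pkt {
	uint8_t buf[PKT_MAX_LEN];
	size_t len;
};

/* Grow or shrink the packet tail; flags is reserved and must be zero. */
static int pkt_change_tail(struct pkt *p, size_t new_len, uint64_t flags)
{
	assert(p != NULL);
	if (flags || new_len > PKT_MAX_LEN)
		return -EINVAL;
	if (new_len > p->len)	/* assumption: grown bytes start zeroed */
		memset(p->buf + p->len, 0, new_len - p->len);
	p->len = new_len;
	return 0;
}

/* Separate step: the caller rewrites contents, bounded by the length. */
static int pkt_store_bytes(struct pkt *p, size_t off, const void *from,
			   size_t n)
{
	if (off > p->len || n > p->len - off)
		return -EFAULT;
	memcpy(p->buf + off, from, n);
	return 0;
}
```

The resize step knows nothing about protocols; whatever the caller wants in
the new tail is written afterwards through the store step.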

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobpf: use skb_pkt_type_ok helper in bpf_skb_change_type
Daniel Borkmann [Wed, 17 Aug 2016 23:00:38 +0000 (01:00 +0200)]
bpf: use skb_pkt_type_ok helper in bpf_skb_change_type

Since we have a skb_pkt_type_ok() helper for checking the type before
mangling, make use of it instead of open coding. Follow-up to commit
8b10cab64c13 ("net: simplify and make pkt_type_ok() available for other
users") that came in after d2485c4242a8 ("bpf: add bpf_skb_change_type
helper").
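
A userspace sketch of the check, for illustration only: the names below are
hypothetical, the constants are assumed to mirror the PACKET_* values in
linux/if_packet.h, and the predicate is assumed to accept types up to and
including OTHERHOST.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror PACKET_HOST .. PACKET_OUTGOING in linux/if_packet.h. */
enum {
	PKT_HOST = 0,
	PKT_BROADCAST = 1,
	PKT_MULTICAST = 2,
	PKT_OTHERHOST = 3,
	PKT_OUTGOING = 4,
};

/* One shared predicate instead of an open-coded range check. */
static bool pkt_type_ok(uint32_t ptype)
{
	return ptype <= PKT_OTHERHOST;
}

/* Change the type only if both the current and the new type are valid. */
static int change_type(uint32_t *cur, uint32_t new_type)
{
	if (!pkt_type_ok(*cur) || !pkt_type_ok(new_type))
		return -EINVAL;
	*cur = new_type;
	return 0;
}
```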

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotipc: add peer removal functionality
Richard Alpe [Thu, 18 Aug 2016 08:33:52 +0000 (10:33 +0200)]
tipc: add peer removal functionality

Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
an offline peer node from the internal data structures.

This will be supported by the tipc user space tool in iproute2.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotcp: refine tcp_prune_ofo_queue() to not drop all packets
Eric Dumazet [Wed, 17 Aug 2016 21:17:09 +0000 (14:17 -0700)]
tcp: refine tcp_prune_ofo_queue() to not drop all packets

Over the years, the TCP BDP has increased a lot, and is typically
on the order of ~10 Mbytes with the help of clever Congestion Control
modules.

In the presence of packet losses, TCP stores incoming packets in an
out-of-order queue, and the number of skbs sitting there waiting for the
missing packets to be received can match the BDP (~10 Mbytes).

In some cases, TCP needs to make room for incoming skbs, and the current
strategy can simply remove all skbs in the out-of-order queue as a last
resort, incurring a huge penalty for both receiver and sender.

Unfortunately these 'last resort events' are quite frequent, forcing the
sender to send all packets again, stalling the flow and wasting a lot of
resources.

This patch cleans only a part of the out of order queue in order
to meet the memory constraints.
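
The partial pruning can be modelled in a few lines of userspace C. This is a
sketch under assumptions, not the kernel code: struct ofo_queue, ofo_prune()
and OFO_MAX are hypothetical, memory is tracked as a plain byte count, and
dropping from the tail (highest sequence numbers) first is an assumption of
this model.

```c
#include <assert.h>
#include <stddef.h>

#define OFO_MAX 64	/* hypothetical queue capacity for this model */

/* Hypothetical out-of-order queue: segment sizes kept in sequence order. */
struct ofo_queue {
	size_t seg_bytes[OFO_MAX];
	size_t count;
	size_t mem_used;	/* sum of seg_bytes[0 .. count-1] */
};

/*
 * Drop segments from the tail until memory use fits mem_limit, rather
 * than purging the whole queue. Returns the number of segments dropped.
 */
static size_t ofo_prune(struct ofo_queue *q, size_t mem_limit)
{
	size_t dropped = 0;

	assert(q != NULL && q->count <= OFO_MAX);
	while (q->count > 0 && q->mem_used > mem_limit) {
		q->count--;
		q->mem_used -= q->seg_bytes[q->count];
		dropped++;
	}
	return dropped;
}
```

Stopping as soon as the limit is met keeps the remaining segments queued, so
the sender only has to retransmit what was actually dropped.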

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: C. Stephen Gun <csg@google.com>
Cc: Van Jacobson <vanj@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>