GitHub/moto-9609/android_kernel_motorola_exynos9610.git
7 years agonet: style cleanups
stephen hemminger [Fri, 18 Aug 2017 20:46:28 +0000 (13:46 -0700)]
net: style cleanups

Make code closer to current style. Mostly whitespace changes.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mark receive queue attributes ro_after_init
stephen hemminger [Fri, 18 Aug 2017 20:46:27 +0000 (13:46 -0700)]
net: mark receive queue attributes ro_after_init

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: make queue attributes ro_after_init
stephen hemminger [Fri, 18 Aug 2017 20:46:26 +0000 (13:46 -0700)]
net: make queue attributes ro_after_init

The XPS queue attributes can be ro_after_init.
Also use __ATTR_RX macros to simplify initialization.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: make BQL sysfs attributes ro_after_init
stephen hemminger [Fri, 18 Aug 2017 20:46:25 +0000 (13:46 -0700)]
net: make BQL sysfs attributes ro_after_init

Also fix macro to not have ; at end.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: drop unused attribute argument from sysfs queue funcs
stephen hemminger [Fri, 18 Aug 2017 20:46:24 +0000 (13:46 -0700)]
net: drop unused attribute argument from sysfs queue funcs

The show and store functions don't need/use the attribute.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: make net sysfs attributes ro_after_init
stephen hemminger [Fri, 18 Aug 2017 20:46:23 +0000 (13:46 -0700)]
net: make net sysfs attributes ro_after_init

The attributes of net devices are immutable.

Ideally, attribute groups would contain const attributes,
but too many places (in other code) modify the list
during startup to allow that.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: constify net_ns_type_operations
stephen hemminger [Fri, 18 Aug 2017 20:46:22 +0000 (13:46 -0700)]
net: constify net_ns_type_operations

This can be const.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: make net_class ro_after_init
stephen hemminger [Fri, 18 Aug 2017 20:46:21 +0000 (13:46 -0700)]
net: make net_class ro_after_init

The net_class in sysfs is only modified on init.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: constify netdev_class_file
stephen hemminger [Fri, 18 Aug 2017 20:46:20 +0000 (13:46 -0700)]
net: constify netdev_class_file

These functions are wrappers around class_create_file, which can take a
const attribute.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: don't decrement kobj reference count on init failure
stephen hemminger [Fri, 18 Aug 2017 20:46:19 +0000 (13:46 -0700)]
net: don't decrement kobj reference count on init failure

If kobject_init_and_add failed, then the failure path would
decrement the reference count of the queue kobject whose reference
count was already zero.

Fixes: 114cf5802165 ("bql: Byte queue limits")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'amd-xgbe-next'
David S. Miller [Fri, 18 Aug 2017 23:30:17 +0000 (16:30 -0700)]
Merge branch 'amd-xgbe-next'

Tom Lendacky says:

====================
amd-xgbe: AMD XGBE driver updates 2017-08-17

The following updates are included in this driver update series:

- Set the MDIO mode to clause 45 for the 10GBase-T configuration
- Set the MII control width to 8-bits for speeds less than 1Gbps
- Fix an issue related to module removal when the devices are up
- Fix ethtool statistics related to packet counting of TSO packets
- Add support for device renaming
- Add additional dynamic debug output for the PCS window calculation
- Optimize reading of DMA channel interrupt enablement register
- Add additional dynamic debug output about the hardware features
- Add per queue Tx and Rx ethtool statistics
- Add a macro to clear ethtool_link_ksettings modes
- Convert the driver to use the ethtool_link_ksettings
- Add support for VXLAN offload capabilities
- Add additional ethtool statistics related to VXLAN

This patch series is based on net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add additional ethtool statistics
Lendacky, Thomas [Fri, 18 Aug 2017 14:04:14 +0000 (09:04 -0500)]
amd-xgbe: Add additional ethtool statistics

Add some additional statistics for tracking VXLAN packets and checksum
errors.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add support for VXLAN offload capabilities
Lendacky, Thomas [Fri, 18 Aug 2017 14:04:04 +0000 (09:04 -0500)]
amd-xgbe: Add support for VXLAN offload capabilities

The hardware has the capability to perform checksum offload support
(both Tx and Rx) and TSO support for VXLAN packets. Add the support
required to enable this.

The hardware can only support a single VXLAN port for offload. If more
than one VXLAN port is added then the offload capabilities have to be
disabled and can no longer be advertised.
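
The single-port restriction above can be pictured with a small sketch. This is not the driver's code: the struct and function names are hypothetical, and re-enabling offload once the count drops back to one is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-device state; names are illustrative only. */
struct vxlan_offload_state {
    unsigned int port_count;    /* VXLAN UDP ports currently registered */
};

/* Offload may be advertised only while exactly one port is registered. */
static bool vxlan_offload_allowed(const struct vxlan_offload_state *st)
{
    return st->port_count == 1;
}

/* Register a port; returns whether offload may still be advertised. */
static bool vxlan_port_add(struct vxlan_offload_state *st)
{
    st->port_count++;
    return vxlan_offload_allowed(st);
}

/* Unregister a port; returns whether offload may be advertised
 * (assumption: going back to a single port re-enables it). */
static bool vxlan_port_del(struct vxlan_offload_state *st)
{
    if (st->port_count > 0)
        st->port_count--;
    return vxlan_offload_allowed(st);
}
```

Adding a second port makes `vxlan_port_add()` return false, which is the point at which the features would have to be withdrawn.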

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Convert to using the new link mode settings
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:55 +0000 (09:03 -0500)]
amd-xgbe: Convert to using the new link mode settings

Convert from using the old u32 supported, advertising, etc. link settings
to the new link mode settings that support bit positions / settings
greater than 32 bits.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethtool: Add macro to clear a link mode setting
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:44 +0000 (09:03 -0500)]
net: ethtool: Add macro to clear a link mode setting

There are currently macros to set and test an ETHTOOL_LINK_MODE_ setting,
but not to clear one. Add a macro to clear an ETHTOOL_LINK_MODE_ setting.
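
As a rough userspace illustration of the set/test/clear trio on a multi-word bitmap: the macro names below are hypothetical and are not the ones in the ethtool headers.

```c
#include <limits.h>

/* Hypothetical helpers operating on an array of unsigned long. */
#define LM_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define LM_WORD(bit)     ((bit) / LM_BITS_PER_LONG)
#define LM_MASK(bit)     (1UL << ((bit) % LM_BITS_PER_LONG))

/* Set, clear and test one link mode bit; bit numbers may exceed 31. */
#define link_mode_set(map, bit)   ((map)[LM_WORD(bit)] |= LM_MASK(bit))
#define link_mode_clear(map, bit) ((map)[LM_WORD(bit)] &= ~LM_MASK(bit))
#define link_mode_test(map, bit)  (((map)[LM_WORD(bit)] & LM_MASK(bit)) != 0)
```

Clearing is simply the AND with the inverted mask, the counterpart of the OR used for setting.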

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add per queue Tx and Rx statistics
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:35 +0000 (09:03 -0500)]
amd-xgbe: Add per queue Tx and Rx statistics

Add per queue Tx and Rx packet and byte counts.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add hardware features debug output
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:26 +0000 (09:03 -0500)]
amd-xgbe: Add hardware features debug output

Use the dynamic debug support to output information about the hardware
features reported by the device.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Optimize DMA channel interrupt enablement
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:17 +0000 (09:03 -0500)]
amd-xgbe: Optimize DMA channel interrupt enablement

Currently whenever the driver needs to enable or disable interrupts for
a DMA channel it reads the interrupt enable register (IER), updates the
value and then writes the new value back to the IER. Since the hardware
does not change the IER, software can track this value and eliminate the
need to read it each time.

Add the IER value to the channel related data structure and use that as
the base for enabling and disabling interrupts, thus removing the need
for the MMIO read.
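
A minimal sketch of the shadow-register idea, assuming a fake register with access counters in place of real MMIO; all names here are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Stand-in for a device register; a real driver would use MMIO accessors. */
struct fake_reg {
    uint32_t value;
    unsigned int reads;     /* number of (expensive) reads performed */
    unsigned int writes;
};

static uint32_t reg_read(struct fake_reg *r)
{
    r->reads++;
    return r->value;
}

static void reg_write(struct fake_reg *r, uint32_t v)
{
    r->writes++;
    r->value = v;
}

struct dma_channel {
    struct fake_reg *ier;
    uint32_t ier_shadow;    /* software copy of the interrupt enable bits */
};

static void channel_init(struct dma_channel *ch, struct fake_reg *ier)
{
    ch->ier = ier;
    ch->ier_shadow = reg_read(ier);     /* the only read, done once at setup */
}

/* Enable/disable update the shadow and write it out; no read-modify-write. */
static void channel_enable_int(struct dma_channel *ch, uint32_t bits)
{
    ch->ier_shadow |= bits;
    reg_write(ch->ier, ch->ier_shadow);
}

static void channel_disable_int(struct dma_channel *ch, uint32_t bits)
{
    ch->ier_shadow &= ~bits;
    reg_write(ch->ier, ch->ier_shadow);
}
```

This is only valid because the hardware never modifies the register on its own, as stated above.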

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add additional dynamic debug messages
Lendacky, Thomas [Fri, 18 Aug 2017 14:03:08 +0000 (09:03 -0500)]
amd-xgbe: Add additional dynamic debug messages

Add some additional dynamic debug messages to the driver. The new messages
will provide additional information about the PCS window calculation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Add support to handle device renaming
Lendacky, Thomas [Fri, 18 Aug 2017 14:02:57 +0000 (09:02 -0500)]
amd-xgbe: Add support to handle device renaming

Many of the names used by the driver are based upon the name of the device
found during device probe.  Move the formatting of the names into the
device open function so that any renaming that occurs before the device is
brought up will be accounted for.  This also means moving the creation of
some named workqueues into the device open path.

Add support to register for net events so that if a device is renamed
the corresponding debugfs directory can be renamed.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Update TSO packet statistics accuracy
Lendacky, Thomas [Fri, 18 Aug 2017 14:02:49 +0000 (09:02 -0500)]
amd-xgbe: Update TSO packet statistics accuracy

When transmitting a TSO packet, the driver only increments the TSO packet
statistic by one rather than the number of total packets that were sent.
Update the driver to record the total number of packets that resulted from
TSO transmit.
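
One way to picture the accounting change, as a sketch only: the helper and struct names are hypothetical, and deriving the packet count from payload length and MSS is an assumption (the driver may obtain the segment count differently).

```c
/* Hypothetical statistics block. */
struct tx_stats {
    unsigned long tx_tso_packets;
};

/* Number of wire packets produced by one TSO send: ceil(payload / mss).
 * An mss of 0 is treated as "not a TSO send", i.e. a single packet. */
static unsigned int tso_wire_packets(unsigned int payload_len, unsigned int mss)
{
    if (mss == 0)
        return 1;
    return (payload_len + mss - 1) / mss;
}

/* Account every resulting packet rather than adding 1 per TSO request. */
static void account_tso(struct tx_stats *st, unsigned int payload_len,
                        unsigned int mss)
{
    st->tx_tso_packets += tso_wire_packets(payload_len, mss);
}
```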

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Be sure driver shuts down cleanly on module removal
Lendacky, Thomas [Fri, 18 Aug 2017 14:02:40 +0000 (09:02 -0500)]
amd-xgbe: Be sure driver shuts down cleanly on module removal

Sometimes when the driver is being unloaded while the devices are still
up the driver can issue errors.  This is based on timing and the double
invocation of some routines.  The phy_exit() call needs to be run after
the network device has been closed and unregistered from the system.
Also, the phy_exit() does not need to invoke phy_stop() since that will
be called as part of the device closing, so remove that call.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Set the MII control width for the MAC interface
Lendacky, Thomas [Fri, 18 Aug 2017 14:02:27 +0000 (09:02 -0500)]
amd-xgbe: Set the MII control width for the MAC interface

When running in SGMII mode at speeds below 1000Mbps, the auto-negotiation
control register must set the MII control width for the MAC interface
to be 8-bits wide.  By default the width is 4-bits.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoamd-xgbe: Set the MDIO mode for 10000Base-T configuration
Lendacky, Thomas [Fri, 18 Aug 2017 14:02:18 +0000 (09:02 -0500)]
amd-xgbe: Set the MDIO mode for 10000Base-T configuration

Currently the MDIO mode is set to none for the 10000Base-T, which is
incorrect.  The MDIO mode for this configuration should be
clause 45.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx5: ensure 0 is returned when vport is zero
Colin Ian King [Fri, 18 Aug 2017 13:49:25 +0000 (14:49 +0100)]
mlx5: ensure 0 is returned when vport is zero

Currently, if vport is zero then an uninitialized return status
in err is returned.  Since the only return status at the end of the
function esw_add_uc_addr is zero for the current set of return paths
we may as well just return 0 rather than err to fix this issue.

Detected by CoverityScan, CID#1452698 ("Uninitialized scalar variable")

Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Fix map-in-map checking in the verifier
Martin KaFai Lau [Fri, 18 Aug 2017 01:14:43 +0000 (18:14 -0700)]
bpf: Fix map-in-map checking in the verifier

In check_map_func_compatibility(), a 'break' has been accidentally
removed for the BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS
cases.  This patch adds it back.
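
The effect of a lost 'break' can be shown with a reduced sketch. The enums and the function below are hypothetical stand-ins, not the verifier's actual code.

```c
#include <errno.h>

/* Hypothetical, reduced stand-ins for map types and helper IDs. */
enum map_type { MAP_TYPE_ARRAY_OF_MAPS, MAP_TYPE_HASH_OF_MAPS, MAP_TYPE_SOCKMAP };
enum func_id  { FUNC_MAP_LOOKUP_ELEM, FUNC_SK_REDIRECT_MAP, FUNC_SOCK_MAP_UPDATE };

/* Returns 0 if the helper may be used with the map type, -EINVAL otherwise. */
static int check_map_func(enum map_type type, enum func_id func)
{
    switch (type) {
    case MAP_TYPE_ARRAY_OF_MAPS:
    case MAP_TYPE_HASH_OF_MAPS:
        if (func != FUNC_MAP_LOOKUP_ELEM)
            return -EINVAL;
        break;  /* without this, control falls into the next case and a
                 * valid lookup on a map-in-map would be rejected */
    case MAP_TYPE_SOCKMAP:
        if (func != FUNC_SK_REDIRECT_MAP && func != FUNC_SOCK_MAP_UPDATE)
            return -EINVAL;
        break;
    }
    return 0;
}
```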

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'xdp-adjust-xdp-redirect-tracepoint'
David S. Miller [Fri, 18 Aug 2017 23:18:47 +0000 (16:18 -0700)]
Merge branch 'xdp-adjust-xdp-redirect-tracepoint'

Jesper Dangaard Brouer says:

====================
xdp: adjust xdp redirect tracepoint

Working on streamlining the tracepoints for XDP.  The eBPF programs
and XDP have no flow-control or queueing.  Investigating using a
tracepoint to provide feedback on XDP_REDIRECT xmit overflow events.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoxdp: adjust xdp redirect tracepoint to include return error code
Jesper Dangaard Brouer [Thu, 17 Aug 2017 16:22:37 +0000 (18:22 +0200)]
xdp: adjust xdp redirect tracepoint to include return error code

The return error code needs to be included in the tracepoint
xdp:xdp_redirect, else it's not possible to distinguish successful from
failed XDP_REDIRECT transmits.

XDP has no queuing mechanism. Thus, it is fairly easy to overrun a
NIC transmit queue.  The eBPF program invoking helpers (bpf_redirect
or bpf_redirect_map) to redirect a packet doesn't get any feedback
whether the packet was actually transmitted.

Information on failed transmits in the tracepoint xdp:xdp_redirect is
interesting, as it opens the way to providing a feedback loop to the
receiving XDP program.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoixgbe: change ndo_xdp_xmit return code on xmit errors
Jesper Dangaard Brouer [Thu, 17 Aug 2017 16:22:32 +0000 (18:22 +0200)]
ixgbe: change ndo_xdp_xmit return code on xmit errors

Use errno -ENOSPC ("No space left on device") when the XDP xmit
has no space left on the TX ring buffer, instead of -ENOMEM.
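
A toy version of the distinction, using a hypothetical ring structure rather than the ixgbe one: a full ring is a transient "no room" condition, not an allocation failure.

```c
#include <errno.h>

/* Hypothetical TX ring: only slot accounting, no descriptors. */
struct tx_ring {
    unsigned int size;  /* total slots */
    unsigned int used;  /* slots currently occupied */
};

/* Queue one frame; a full ring is reported as -ENOSPC so callers can
 * tell it apart from a memory allocation failure (-ENOMEM). */
static int xdp_xmit_one(struct tx_ring *ring)
{
    if (ring->used >= ring->size)
        return -ENOSPC;
    ring->used++;
    return 0;
}
```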

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: remove support for deprecated f/w cmd OCTNET_CMD_RESET_PF
Rick Farrington [Thu, 17 Aug 2017 01:30:13 +0000 (18:30 -0700)]
liquidio: remove support for deprecated f/w cmd OCTNET_CMD_RESET_PF

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: inet: diag: expose sockets cgroup classid
Levin, Alexander (Sasha Levin) [Thu, 17 Aug 2017 00:35:11 +0000 (00:35 +0000)]
net: inet: diag: expose sockets cgroup classid

This is useful for directly looking up a task based on class id rather than
having to scan through all open file descriptors.

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomacvlan: add offload features for encapsulation
Dimitris Michailidis [Wed, 16 Aug 2017 21:34:46 +0000 (14:34 -0700)]
macvlan: add offload features for encapsulation

Currently macvlan devices do not set their hw_enc_features making
encapsulated Tx packets resort to SW fallbacks. Add encapsulation GSO
offloads to ->features as is done for the other GSOs and set
->hw_enc_features.

Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: fix Smatch error
Intiyaz Basha [Fri, 18 Aug 2017 20:07:19 +0000 (13:07 -0700)]
liquidio: fix Smatch error

Fix Smatch error by not dereferencing iq pointer if it's NULL.

See http://marc.info/?l=kernel-janitors&m=150296723301129&w=2

Also, remove unnecessary parentheses.

Fixes: d314ac222829 ("liquidio: moved liquidio_napi_poll to lio_core.c")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv4: convert dst_metrics.refcnt from atomic_t to refcount_t
Eric Dumazet [Fri, 18 Aug 2017 19:08:07 +0000 (12:08 -0700)]
ipv4: convert dst_metrics.refcnt from atomic_t to refcount_t

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.
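
To illustrate why a dedicated reference counter type helps, here is a single-threaded sketch of a saturating counter. It is not the kernel's refcount_t implementation; the names are hypothetical and the real API is atomic.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, non-atomic counter that saturates instead of wrapping. */
struct sat_refcount {
    unsigned int refs;
};

#define SAT_REFCOUNT_MAX UINT_MAX

/* Take a reference; once saturated the counter stays pinned. */
static void sat_refcount_inc(struct sat_refcount *r)
{
    if (r->refs != SAT_REFCOUNT_MAX)
        r->refs++;
}

/* Drop a reference; returns true only when the last one is gone.
 * A saturated counter never reaches zero, so the object is leaked
 * rather than freed while references may still exist. */
static bool sat_refcount_dec_and_test(struct sat_refcount *r)
{
    if (r->refs == SAT_REFCOUNT_MAX || r->refs == 0)
        return false;
    return --r->refs == 0;
}
```

A plain wrapping counter would instead overflow to a small value, and a later put could free an object that is still in use.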

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Fri, 18 Aug 2017 18:07:46 +0000 (11:07 -0700)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2017-08-18

Here's one more bluetooth-next pull request for the 4.14 kernel:

 - Multiple fixes for Broadcom controllers
 - Fixes to the bluecard HCI driver
 - New USB ID for Realtek RTL8723BE controller
 - Fix static analyzer warning with kfree

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv6: fix false-postive maybe-uninitialized warning
Arnd Bergmann [Fri, 18 Aug 2017 11:34:22 +0000 (13:34 +0200)]
ipv6: fix false-postive maybe-uninitialized warning

Adding a lock around one of the assignments prevents gcc from
tracking the state of the local 'fibmatch' variable, so it can no
longer prove that 'dst' is always initialized, leading to a bogus
warning:

net/ipv6/route.c: In function 'inet6_rtm_getroute':
net/ipv6/route.c:3659:2: error: 'dst' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This moves the other assignment into the same lock to shut up the
warning.

Fixes: 121622dba8da ("ipv6: route: make rtm_getroute not assume rtnl is locked")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'hns3-bug-fixes'
David S. Miller [Fri, 18 Aug 2017 17:31:56 +0000 (10:31 -0700)]
Merge branch 'hns3-bug-fixes'

Salil Mehta says:

====================
Misc. Bug fixes for HNS3 Ethernet Driver

This patch-set fixes various bugs reported by community.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hns3: Fixes the static check warning due to missing unsupp L3 proto check
Salil [Fri, 18 Aug 2017 11:31:39 +0000 (12:31 +0100)]
net: hns3: Fixes the static check warning due to missing unsupp L3 proto check

This patch fixes the static check warning caused by the missing handling of
an unsupported L3 protocol type in the hns3_get_l4_protocol() function.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for
hip08 SoC")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hns3: Fixes the static checker error warning in hns3_get_link_ksettings()
Salil [Fri, 18 Aug 2017 11:31:38 +0000 (12:31 +0100)]
net: hns3: Fixes the static checker error warning in hns3_get_link_ksettings()

This patch fixes the static check error warning in hns3_get_link_ksettings()
function by re-arranging the code.

Fixes: 496d03e960ae ("net: hns3: Add Ethtool support to HNS3 Driver")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hns3: Fixes the missing u64_stats_fetch_begin_irq in 64-bit stats fetch
Salil [Fri, 18 Aug 2017 11:31:37 +0000 (12:31 +0100)]
net: hns3: Fixes the missing u64_stats_fetch_begin_irq in 64-bit stats fetch

This patch adds the missing u64_stats_fetch_begin_irq() call when trying to
atomically fetch the 64-bit RX/TX stats. We did not get any error during test
as our SoC is 64-bit, so all of these seq/lock operations result in a NOOP.

As such, this seq lock support has been added for the sake of completeness,
in case this code ever runs on a 32-bit platform and we are trying to do a
64-bit stats fetch.
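
The begin/retry pattern can be sketched in userspace with C11 atomics. This is not the kernel's u64_stats API: the names are hypothetical, a single writer is assumed, and the counter is split into two 32-bit halves to mimic what a 32-bit CPU has to do.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-ring counter stored as two 32-bit halves. */
struct ring_stats {
    atomic_uint seq;        /* even: stable, odd: update in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

/* Writer side (single writer assumed): bump seq around the update. */
static void ring_stats_add(struct ring_stats *s, uint64_t n)
{
    unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
    uint64_t v = ((uint64_t)atomic_load_explicit(&s->hi, memory_order_relaxed) << 32) |
                 atomic_load_explicit(&s->lo, memory_order_relaxed);

    v += n;
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both halves were read under one stable seq. */
static uint64_t ring_stats_read(struct ring_stats *s)
{
    unsigned int start;
    uint32_t lo, hi;

    do {
        start = atomic_load_explicit(&s->seq, memory_order_acquire);
        lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((start & 1u) ||
             start != atomic_load_explicit(&s->seq, memory_order_relaxed));

    return ((uint64_t)hi << 32) | lo;
}
```

Without the retry loop a reader could pair a new low half with an old high half; on a 64-bit machine a single load avoids the problem, which matches the NOOP observation above.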

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for
hip08 SoC")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/sched: Fix the logic error to decide the ingress qdisc
Chris Mi [Fri, 18 Aug 2017 11:24:20 +0000 (07:24 -0400)]
net/sched: Fix the logic error to decide the ingress qdisc

The offending commit used a newly added helper function.
But the logic is wrong. Without this fix, the affected NICs
can't do HW offload. Error -EOPNOTSUPP will be returned directly.

Fixes: a2e8da9378cc ("net/sched: use newly added classid identity helpers")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 's390-qeth-next'
David S. Miller [Fri, 18 Aug 2017 17:21:31 +0000 (10:21 -0700)]
Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/net: more updates for 4.14

please apply another batch of qeth patches for net-next.
This reworks the xmit path for L2 OSAs to use skb_cow_head() instead of
skb_realloc_headroom().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: use skb_cow_head() for L2 OSA xmit
Julian Wiedmann [Fri, 18 Aug 2017 08:19:10 +0000 (10:19 +0200)]
s390/qeth: use skb_cow_head() for L2 OSA xmit

Taking a full copy via skb_realloc_headroom() on every xmit is overkill
and wastes CPU time; all we actually need is to push on the qeth_hdr.
So rework the L2 OSA TX path to avoid the copy.
Minor complications arise because struct qeth_hdr must not cross a page
boundary. So add a new helper qeth_push_hdr() that catches this, and
falls back to the hdr cache that we already use for IQDs.
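
The page-boundary constraint can be sketched as follows. This is not qeth_push_hdr() itself: the names are hypothetical, a 4096-byte page is assumed, and sufficient headroom is taken as already guaranteed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* True if the byte range [start, start + len) spans two pages. */
static bool crosses_page(uintptr_t start, size_t len)
{
    if (len == 0)
        return false;
    return (start / SKETCH_PAGE_SIZE) !=
           ((start + len - 1) / SKETCH_PAGE_SIZE);
}

enum hdr_placement { HDR_IN_HEADROOM, HDR_FROM_CACHE };

/* Decide where a header of hdr_len bytes goes, given the address of the
 * packet data: directly in front of the data if that range stays within
 * one page, otherwise in a separately allocated header (the cache path). */
static enum hdr_placement place_hdr(uintptr_t data, size_t hdr_len)
{
    if (crosses_page(data - hdr_len, hdr_len))
        return HDR_FROM_CACHE;
    return HDR_IN_HEADROOM;
}
```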

This change uncovered that qeth's TX completion takes rather long.
Now that we no longer free the original skb straight away and thus call
skb->destructor later than before, throughput regresses significantly.
For now, restore old behaviour by adding an explicit skb_orphan(),
and a big TODO to improve the TX completion time.

Tested-by: Nils Hoppmann <niho@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: unify code to build header elements
Julian Wiedmann [Fri, 18 Aug 2017 08:19:09 +0000 (10:19 +0200)]
s390/qeth: unify code to build header elements

After plenty of refactoring, use hd_len as single indication that
the skb needs a dedicated header element.

This preserves existing behaviour for TSO, as 'hdr' always points
to skb->data.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: pass full IQD header length to fill_buffer()
Julian Wiedmann [Fri, 18 Aug 2017 08:19:08 +0000 (10:19 +0200)]
s390/qeth: pass full IQD header length to fill_buffer()

This is a prerequisite for unifying the code to build header elements.
The TSO header has a different size, so we can no longer rely on implicitly
adding the size of a normal qeth_hdr.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: pass TSO data offset to fill_buffer()
Julian Wiedmann [Fri, 18 Aug 2017 08:19:07 +0000 (10:19 +0200)]
s390/qeth: pass TSO data offset to fill_buffer()

For TSO we need to skip the skb's qeth/IP/TCP headers when mapping
it into buffer elements. Instead of (mis)using skb_pull(), pass a
corresponding offset to fill_buffer() like we already do for IQDs.

No actual change in the resulting TSO buffers.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: pass TSO header length to fill_buffer()
Julian Wiedmann [Fri, 18 Aug 2017 08:19:06 +0000 (10:19 +0200)]
s390/qeth: pass TSO header length to fill_buffer()

The TSO code already calculates the length of its header element,
no need to duplicate this in the low-level code again.

Use this opportunity to make hd_len unsigned, and for TSO match
its calculation to what tso_fill_header() does.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: pass full data length to l2_fill_header()
Julian Wiedmann [Fri, 18 Aug 2017 08:19:05 +0000 (10:19 +0200)]
s390/qeth: pass full data length to l2_fill_header()

For IQD we already need to fix up the qeth_hdr's length field, and
future changes will require more flexibility for OSA as well. The
device-specific path knows best what header length it requires, so just
pass it from there.
While at it, remove the unused qeth_card parameter.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agos390/qeth: split L2 xmit paths
Julian Wiedmann [Fri, 18 Aug 2017 08:19:04 +0000 (10:19 +0200)]
s390/qeth: split L2 xmit paths

l2_hard_start_xmit() actually doesn't contain much shared code,
and having device-specific paths makes isolated changes a lot easier.
So split it into three routines for IQD, OSN and OSD/OSM/OSX.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: fix a return in sockmap_get_from_fd()
Dan Carpenter [Fri, 18 Aug 2017 07:27:02 +0000 (10:27 +0300)]
bpf: fix a return in sockmap_get_from_fd()

"map" is a valid pointer.  We wanted to return "err" instead.  Also
let's return a zero literal at the end.

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'liquidio-initialization-fixes-for-embedded-firmware'
David S. Miller [Fri, 18 Aug 2017 17:14:26 +0000 (10:14 -0700)]
Merge branch 'liquidio-initialization-fixes-for-embedded-firmware'

Rick Farrington says:

====================
liquidio: initialization fixes for embedded firmware

Fix problems when using an adapter w/embedded f/w (param "fw_type=none").

1. Add support for PF FLR when exiting.
2. Skip some initialization (don't try to load f/w, activate consoles).
3. Issue credits BEFORE enabling DROQs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: with embedded f/w, issue droq credits before enablement
Rick Farrington [Fri, 18 Aug 2017 06:11:30 +0000 (23:11 -0700)]
liquidio: with embedded f/w, issue droq credits before enablement

1. Issue credits BEFORE enabling DROQ's; this prevents PKTPF_ERR interrupt.

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: with embedded f/w, don't reload f/w, issue pf flr at exit
Rick Farrington [Fri, 18 Aug 2017 06:11:25 +0000 (23:11 -0700)]
liquidio: with embedded f/w, don't reload f/w, issue pf flr at exit

1. Add support for PF FLR when exiting
   (enables CORE_DRV_ACTIVE upon next driver init)
2. Skip some initialization (don't try to load f/w, activate consoles).

Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoBluetooth: hci_bcm: Handle empty packet after firmware loading
Marcel Holtmann [Thu, 17 Aug 2017 19:41:09 +0000 (21:41 +0200)]
Bluetooth: hci_bcm: Handle empty packet after firmware loading

The Broadcom controller on the Raspberry Pi3 sends an empty packet with
packet type 0x00 after launching the firmware. This will cause logging
of errors.

  Bluetooth: hci0: Frame reassembly failed (-84)

Since this seems to be an intended behaviour of the controller, handle
it gracefully by parsing that empty packet with packet type 0x00 and
then just simply report it as diagnostic packet.

With that change no errors are logged and the packet itself is actually
recorded in the Bluetooth monitor traces.

  < HCI Command: Broadcom Launch RAM (0x3f|0x004e) plen 4
         Address: 0xffffffff
  > HCI Event: Command Complete (0x0e) plen 4
       Broadcom Launch RAM (0x3f|0x004e) ncmd 1
         Status: Success (0x00)
  = Vendor Diagnostic (len 0)
  < HCI Command: Broadcom Update UART Baud Rate (0x3f|0x0018) plen 6
         00 00 00 10 0e 00                                ......
  > HCI Event: Command Complete (0x0e) plen 4
       Broadcom Update UART Baud Rate (0x3f|0x0018) ncmd 1
         Status: Success (0x00)
  < HCI Command: Reset (0x03|0x0003) plen 0
  > HCI Event: Command Complete (0x0e) plen 4
       Reset (0x03|0x0003) ncmd 1
         Status: Success (0x00)

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
7 years agodt-bindings: net: bluetooth: Add broadcom-bluetooth
Loic Poulain [Thu, 17 Aug 2017 17:59:48 +0000 (19:59 +0200)]
dt-bindings: net: bluetooth: Add broadcom-bluetooth

Add binding document for serial bluetooth chips using
Broadcom protocol.

Signed-off-by: Loic Poulain <loic.poulain@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
7 years agoBluetooth: hci_bcm: Add serdev support
Loic Poulain [Thu, 17 Aug 2017 17:59:51 +0000 (19:59 +0200)]
Bluetooth: hci_bcm: Add serdev support

Add basic support for Broadcom serial slave devices.
Probe the serial device, retrieve its maximum speed and
register a new hci uart device.

Tested/compatible with bcm43438 (RPi3).

Signed-off-by: Loic Poulain <loic.poulain@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
7 years agoMerge branch 'bpf-smap-followups'
David S. Miller [Thu, 17 Aug 2017 17:25:19 +0000 (10:25 -0700)]
Merge branch 'bpf-smap-followups'

Daniel Borkmann says:

====================
Two BPF smap related followups

Fixing preemption imbalance and consolidating prologue
generation. Thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: reuse tc bpf prologue for sk skb progs
Daniel Borkmann [Thu, 17 Aug 2017 15:22:37 +0000 (17:22 +0200)]
bpf: reuse tc bpf prologue for sk skb progs

Given both program types are effectively doing the same in the
prologue, just reuse the one that we had for tc and only adapt
to the corresponding drop verdict value. That way, we don't need
to have the duplicate from 8a31db561566 ("bpf: add access to sock
fields and pkt data from sk_skb programs") to maintain.

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: don't enable preemption twice in smap_do_verdict
Daniel Borkmann [Thu, 17 Aug 2017 15:22:36 +0000 (17:22 +0200)]
bpf: don't enable preemption twice in smap_do_verdict

In smap_do_verdict(), the fall-through branch leads to
preempt_enable() being called twice for SK_REDIRECT, which creates
an imbalance. Only enable it again for the remaining cases.
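
A minimal userspace sketch of the imbalance described above. The
counter and every demo_* name are hypothetical stand-ins, not the
kernel's code, and the fall-through is modeled as unconditional (the
path where the redirect case continues into the shared exit):

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel preemption counter. */
static int demo_preempt_count;
static void demo_preempt_disable(void) { demo_preempt_count++; }
static void demo_preempt_enable(void)  { demo_preempt_count--; }

enum demo_verdict { DEMO_SK_DROP, DEMO_SK_REDIRECT };

/* Before: the redirect case re-enables preemption, then falls through
 * into the shared exit path, which re-enables it a second time.
 * Returns the leftover count; 0 means balanced. */
static int demo_verdict_before(enum demo_verdict v)
{
	demo_preempt_count = 0;
	demo_preempt_disable();
	assert(demo_preempt_count == 1);

	switch (v) {
	case DEMO_SK_REDIRECT:
		demo_preempt_enable();
		/* fall through */
	case DEMO_SK_DROP:
	default:
		demo_preempt_enable();
		break;
	}
	return demo_preempt_count;
}

/* After: the shared exit path only re-enables preemption for the
 * cases that have not already done so. */
static int demo_verdict_after(enum demo_verdict v)
{
	demo_preempt_count = 0;
	demo_preempt_disable();
	assert(demo_preempt_count == 1);

	switch (v) {
	case DEMO_SK_REDIRECT:
		demo_preempt_enable();
		/* fall through */
	case DEMO_SK_DROP:
	default:
		if (v != DEMO_SK_REDIRECT)
			demo_preempt_enable();
		break;
	}
	return demo_preempt_count;
}
```

In this model the "before" variant leaves the counter at -1 for the
redirect case, while the "after" variant returns 0 for both verdicts.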

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ibm: ibmvnic: constify vio_device_id
Arvind Yadav [Thu, 17 Aug 2017 13:22:54 +0000 (18:52 +0530)]
net: ibm: ibmvnic: constify vio_device_id

vio_device_id structures are not supposed to change at runtime. All functions
working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id. So mark the non-const structs as const.
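
The pattern can be sketched in plain C as follows. The structure
layout, table contents and demo_match() are assumptions made for
illustration; they are not the definitions from <asm/vio.h>:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ID structure, loosely modeled on a bus ID table entry. */
struct demo_device_id {
	char type[32];
	char compat[32];
};

/* Marked const: the table is never written after build time, so it
 * can live in read-only data. */
static const struct demo_device_id demo_id_table[] = {
	{ "network", "demo,l-lan" },
	{ "", "" },	/* empty entry terminates the table */
};

/* The lookup only reads the table, so it takes a const pointer and
 * accepts the const table without a cast. */
static const struct demo_device_id *
demo_match(const struct demo_device_id *ids, const char *type,
	   const char *compat)
{
	assert(ids && type && compat);

	for (; ids->type[0] != '\0'; ids++) {
		if (!strcmp(ids->type, type) && !strcmp(ids->compat, compat))
			return ids;
	}
	return NULL;
}
```

Because every consumer takes a const pointer, adding const to the
table is a declaration-only change.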

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ibm: ibmveth: constify vio_device_id
Arvind Yadav [Thu, 17 Aug 2017 13:22:53 +0000 (18:52 +0530)]
net: ibm: ibmveth: constify vio_device_id

vio_device_id structures are not supposed to change at runtime. All functions
working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: no need to nullify ri->map in xdp_do_redirect
Daniel Borkmann [Thu, 17 Aug 2017 13:07:22 +0000 (15:07 +0200)]
bpf: no need to nullify ri->map in xdp_do_redirect

We are guaranteed to have a NULL ri->map in this branch since
we test for it earlier, so we don't need to reset it here.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: fix liveness propagation to parent in spilled stack slots
Daniel Borkmann [Thu, 17 Aug 2017 12:59:40 +0000 (14:59 +0200)]
bpf: fix liveness propagation to parent in spilled stack slots

Using parent->regs[] when propagating REG_LIVE_READ for spilled regs
doesn't work since parent->regs[] denotes the set of normal registers
but not spilled ones. Propagate to the correct regs.

Fixes: dc503a8ad984 ("bpf/verifier: track liveness for pruning")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hns3: ensure media_type is unitialized
Colin Ian King [Thu, 17 Aug 2017 09:01:07 +0000 (10:01 +0100)]
net: hns3: ensure media_type is unitialized

Media type is only set if h->ae_algo->ops->get_media_type is called
so there is a possibility that media_type is uninitialized when it is
used in a switch statement.  Fix this by initializing media_type to
HNAE3_MEDIA_TYPE_UNKNOWN.
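
A minimal sketch of the pattern, assuming a simplified ops structure.
All demo_* names are hypothetical and only mirror the shape of the
driver code:

```c
#include <assert.h>
#include <stddef.h>

enum demo_media_type {
	DEMO_MEDIA_TYPE_UNKNOWN,
	DEMO_MEDIA_TYPE_FIBER,
	DEMO_MEDIA_TYPE_COPPER,
};

/* Hypothetical ops table; the callback is optional. */
struct demo_ops {
	void (*get_media_type)(enum demo_media_type *out);
};

static void demo_report_fiber(enum demo_media_type *out)
{
	*out = DEMO_MEDIA_TYPE_FIBER;
}

static const struct demo_ops demo_no_callback = { NULL };
static const struct demo_ops demo_fiber_ops = { demo_report_fiber };

/* Returns 1 for fiber, 2 for copper, 0 for anything else. */
static int demo_port_kind(const struct demo_ops *ops)
{
	/* The fix: start from a defined value, so the switch below is
	 * well defined even when the callback is absent. */
	enum demo_media_type media_type = DEMO_MEDIA_TYPE_UNKNOWN;

	assert(ops != NULL);
	if (ops->get_media_type)
		ops->get_media_type(&media_type);

	switch (media_type) {
	case DEMO_MEDIA_TYPE_FIBER:
		return 1;
	case DEMO_MEDIA_TYPE_COPPER:
		return 2;
	default:
		return 0;
	}
}
```

Without the initializer, the no-callback path would read an
indeterminate value in the switch.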

Detected by CoverityScan, CID#1452624("Uninitialized scalar variable")

Fixes: 496d03e960ae ("net: hns3: Add Ethtool support to HNS3 driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: fix spelling mistake: "interuupt" -> "interrupt"
Colin Ian King [Thu, 17 Aug 2017 08:19:30 +0000 (09:19 +0100)]
liquidio: fix spelling mistake: "interuupt" -> "interrupt"

Trivial fix to spelling mistake in dev_info message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoBluetooth: btbcm: Consolidate the controller information commands
Marcel Holtmann [Thu, 17 Aug 2017 09:02:40 +0000 (11:02 +0200)]
Bluetooth: btbcm: Consolidate the controller information commands

The commands that read the basic vendor information about the Broadcom
controller are duplicated for UART and USB devices. Combine them into a
single function to reduce the code complexity.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
7 years agoMerge branch 'vmbus-sendpacket-cleanups'
David S. Miller [Wed, 16 Aug 2017 23:27:45 +0000 (16:27 -0700)]
Merge branch 'vmbus-sendpacket-cleanups'

Stephen Hemminger says:

====================
vmbus sendpacket cleanups

These patches remove and consolidate vmbus_sendpacket functions.

They should go through the net-next tree since these APIs
were only used by the netvsc driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovmbus: remove unused vmbus_sendpacket_ctl
stephen hemminger [Wed, 16 Aug 2017 15:56:26 +0000 (08:56 -0700)]
vmbus: remove unused vmbus_sendpacket_ctl

The only usage of vmbus_sendpacket_ctl was by vmbus_sendpacket.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovmbus: remove unused vmubs_sendpacket_pagebuffer_ctl
stephen hemminger [Wed, 16 Aug 2017 15:56:25 +0000 (08:56 -0700)]
vmbus: remove unused vmubs_sendpacket_pagebuffer_ctl

The function vmbus_sendpacket_pagebuffer_ctl was never used directly.
Just have vmbus_send_pagebuffer

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovmbus: remove unused vmbus_sendpacket_multipagebuffer
stephen hemminger [Wed, 16 Aug 2017 15:56:24 +0000 (08:56 -0700)]
vmbus: remove unused vmbus_sendpacket_multipagebuffer

This function is not used anywhere in current code.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: Export tcp_{sendpage,sendmsg}_locked() for ipv6.
David S. Miller [Wed, 16 Aug 2017 22:40:44 +0000 (15:40 -0700)]
tcp: Export tcp_{sendpage,sendmsg}_locked() for ipv6.

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sockmap-build-fixes'
David S. Miller [Wed, 16 Aug 2017 22:34:13 +0000 (15:34 -0700)]
Merge branch 'sockmap-build-fixes'

John Fastabend says:

====================
bpf: sockmap build fixes

Two build fixes for sockmap, this should resolve the build errors
and warnings that were reported. Thanks everyone.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER
John Fastabend [Wed, 16 Aug 2017 22:02:32 +0000 (15:02 -0700)]
bpf: sock_map fixes for !CONFIG_BPF_SYSCALL and !STREAM_PARSER

Resolve issues with !CONFIG_BPF_SYSCALL and !STREAM_PARSER

net/core/filter.c: In function ‘do_sk_redirect_map’:
net/core/filter.c:1881:3: error: implicit declaration of function ‘__sock_map_lookup_elem’ [-Werror=implicit-function-declaration]
   sk = __sock_map_lookup_elem(ri->map, ri->ifindex);
   ^
net/core/filter.c:1881:6: warning: assignment makes pointer from integer without a cast [enabled by default]
   sk = __sock_map_lookup_elem(ri->map, ri->ifindex);

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: sockmap state change warning fix
John Fastabend [Wed, 16 Aug 2017 22:02:12 +0000 (15:02 -0700)]
bpf: sockmap state change warning fix

psock will be uninitialized in the default case; we need to do the same
psock lookup and check as in the other branch. Fixes the compile warning
below.

kernel/bpf/sockmap.c: In function ‘smap_state_change’:
kernel/bpf/sockmap.c:156:21: warning: ‘psock’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  struct smap_psock *psock;

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: cls_flower: fix ndo_setup_tc type for stats call
Jiri Pirko [Wed, 16 Aug 2017 15:15:18 +0000 (17:15 +0200)]
net: sched: cls_flower: fix ndo_setup_tc type for stats call

I made a stupid mistake using TC_CLSFLOWER_STATS instead of
TC_SETUP_CLSFLOWER. Funny thing is that both are defined as "2" so it
actually did not cause any harm. Anyway, fixing it now.
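
Why the mix-up was harmless can be sketched like this. The enums
below are hypothetical, chosen only so that the two constants share
the value 2 as stated above, and the type parameter is a plain int in
this sketch:

```c
#include <assert.h>

/* Hypothetical "what to set up" selector. */
enum demo_tc_setup_type {
	DEMO_TC_SETUP_MQPRIO,
	DEMO_TC_SETUP_CLSU32,
	DEMO_TC_SETUP_CLSFLOWER,	/* == 2 */
};

/* Hypothetical flower sub-command, a different namespace entirely. */
enum demo_flower_command {
	DEMO_TC_CLSFLOWER_REPLACE,
	DEMO_TC_CLSFLOWER_DESTROY,
	DEMO_TC_CLSFLOWER_STATS,	/* == 2 as well */
};

/* The two constants only happen to share a value. */
static_assert((int)DEMO_TC_CLSFLOWER_STATS == (int)DEMO_TC_SETUP_CLSFLOWER,
	      "constants coincide in this sketch");

/* Returns 0 when asked to set up a flower classifier, -1 otherwise. */
static int demo_setup_tc(int type)
{
	return type == DEMO_TC_SETUP_CLSFLOWER ? 0 : -1;
}
```

Passing DEMO_TC_CLSFLOWER_STATS where the setup type belongs is
accepted and behaves identically, which is why such a slip can go
unnoticed until one of the enums is reordered.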

Fixes: 2572ac53c46f ("net: sched: make type an argument for ndo_setup_tc")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotun: make tun_build_skb() thread safe
Eric Dumazet [Wed, 16 Aug 2017 14:14:33 +0000 (22:14 +0800)]
tun: make tun_build_skb() thread safe

tun_build_skb() is not thread safe since it uses a per-queue page frag;
this will break things when multiple threads are sending through the
same queue. Switch to a per-thread generator (no lock involved).
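
The idea of a per-thread generator can be sketched in userspace with
C11 thread-local storage. This is an illustration only: the demo_*
names are hypothetical, and the buffer is simply reused when full,
where real code would obtain a fresh page:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FRAG_SIZE 4096

struct demo_page_frag {
	char   buf[DEMO_FRAG_SIZE];
	size_t offset;
};

/* One fragment cursor per thread: threads never share the offset, so
 * advancing it needs no lock. */
static _Thread_local struct demo_page_frag demo_task_frag;

/* Carve len bytes out of the calling thread's fragment. */
static void *demo_frag_alloc(size_t len)
{
	struct demo_page_frag *f = &demo_task_frag;
	void *p;

	assert(f->offset <= DEMO_FRAG_SIZE);
	if (len > DEMO_FRAG_SIZE)
		return NULL;
	if (len > DEMO_FRAG_SIZE - f->offset)
		f->offset = 0;	/* sketch only: start over in the same buffer */

	p = f->buf + f->offset;
	f->offset += len;
	return p;
}
```

With a per-queue cursor, two threads sending on the same queue could
read the same offset and hand out overlapping memory; a per-thread
cursor removes that shared state.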

Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx4: fix spelling mistake: "availible" -> "available"
Colin Ian King [Wed, 16 Aug 2017 09:05:11 +0000 (10:05 +0100)]
net/mlx4: fix spelling mistake: "availible" -> "available"

Trivial fix to spelling mistakes in the mlx4 driver

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqdisc: add tracepoint qdisc:qdisc_dequeue for dequeued SKBs
Jesper Dangaard Brouer [Tue, 15 Aug 2017 19:11:03 +0000 (21:11 +0200)]
qdisc: add tracepoint qdisc:qdisc_dequeue for dequeued SKBs

The main purpose of this tracepoint is to monitor bulk dequeue
in the network qdisc layer, as it cannot be deduced from the
existing qdisc stats.

The txq_state can be used for determining the reason for zero packet
dequeues, see enum netdev_queue_state_t.

Note that not all packets necessarily activate this tracepoint, as
qdiscs with flag TCQ_F_CAN_BYPASS can directly invoke
sch_direct_xmit() when qdisc_qlen is zero.

Remember that perf record supports filters like:

 perf record -e qdisc:qdisc_dequeue \
  --filter 'ifindex == 4 && (packets > 1 || txq_state > 0)'

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'nfp-process-MTU-updates-from-firmware-flower-app'
David S. Miller [Wed, 16 Aug 2017 18:36:45 +0000 (11:36 -0700)]
Merge branch 'nfp-process-MTU-updates-from-firmware-flower-app'

Simon Horman says:

====================
nfp: process MTU updates from firmware flower app

The first patch of this series moves processing of control messages from a
BH handler to a workqueue. That change makes it safe to process MTU
updates from the firmware which is added by the second patch of this
series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonfp: process MTU updates from firmware flower app
Simon Horman [Wed, 16 Aug 2017 07:37:44 +0000 (09:37 +0200)]
nfp: process MTU updates from firmware flower app

Now that control message processing occurs in a workqueue rather than a BH
handler, MTU updates received from the firmware may be safely processed.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonfp: process control messages in workqueue in flower app
Simon Horman [Wed, 16 Aug 2017 07:37:43 +0000 (09:37 +0200)]
nfp: process control messages in workqueue in flower app

Processing of control messages is not time-critical and future processing
of some messages will require taking the RTNL which is not possible
in a BH handler. It seems simplest to move all control message processing
to a workqueue.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: devmap: remove unnecessary value size check
John Fastabend [Wed, 16 Aug 2017 06:35:12 +0000 (23:35 -0700)]
bpf: devmap: remove unnecessary value size check

In the devmap alloc map logic we check to ensure that the size of the
values is not greater than KMALLOC_MAX_SIZE. But in the dev map case
we ensure the value size is 4 bytes earlier in the function because all
values should be netdev ifindex values.

The second check is harmless but is not needed so remove it.
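
A small sketch of why the second check is dead code. The limit value
and both demo_* functions are assumptions for illustration, not the
devmap source:

```c
#include <assert.h>
#include <errno.h>

#define DEMO_KMALLOC_MAX_SIZE (4u << 20)	/* assumed limit, illustration only */

/* Before: the size limit is tested after the value size has already
 * been pinned to exactly 4 bytes. */
static int demo_check_before(unsigned int value_size)
{
	if (value_size != 4)
		return -EINVAL;
	if (value_size > DEMO_KMALLOC_MAX_SIZE)	/* can never be true here */
		return -E2BIG;
	return 0;
}

/* After: only the exact-size test remains; the assert documents the
 * invariant that made the second test redundant. */
static int demo_check_after(unsigned int value_size)
{
	if (value_size != 4)
		return -EINVAL;
	assert(value_size <= DEMO_KMALLOC_MAX_SIZE);
	return 0;
}
```

Both variants return the same result for every input, so removing
the second test does not change behavior.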

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bpf-sockmap'
David S. Miller [Wed, 16 Aug 2017 18:27:53 +0000 (11:27 -0700)]
Merge branch 'bpf-sockmap'

John Fastabend says:

====================
BPF: sockmap and sk redirect support

This series implements a sockmap and socket redirect helper for BPF
using a model similar to XDP netdev redirect. A sockmap is a BPF map
type that holds references to sock structs. Then with a new sk
redirect bpf helper BPF programs can use the map to redirect skbs
between sockets,

      bpf_sk_redirect_map(map, key, flags)

Finally, we need a call site to attach our BPF logic to do socket
redirects. We added hooks to recv_sock using the existing strparser
infrastructure to do this. The call site is added via the BPF attach
map call. To enable users to use this infrastructure a new BPF program
BPF_PROG_TYPE_SK_SKB is created that allows users to reference sock
details, such as port and ip address fields, to build useful socket
layer programs. The sockmap datapath is as follows,

     recv -> strparser -> verdict/action

where this series implements the drop and redirect actions.
Additional actions can be added as needed.

A sample program is provided to illustrate how a sockmap can
be integrated with cgroups and used to add/delete sockets in
a sockmap. The program is simple but should show many of the
key ideas.

To test this work test_maps in selftests/bpf was leveraged.
We added a set of tests to add sockets and do send/recv ops
on the sockets to ensure correct behavior. Additionally, the
selftests cover a series of negative test cases. We can expand
on this in the future.

I also have a basic test program I use with iperf/netperf
clients that could be sent as an additional sample if folks
want this. It needs a bit of cleanup to send to the list and
wasn't included in this series.

For people who prefer git over pulling patches out of their mail
editor I've posted the code here,

https://github.com/jrfastab/linux-kernel-xdp/tree/sockmap

For some background information on the genesis of this work
it might be helpful to review these slides from netconf 2017
by Thomas Graf,

http://vger.kernel.org/netconf2017.html
https://docs.google.com/a/covalent.io/presentation/d/1dwSKSBGpUHD3WO5xxzZWj8awV_-xL-oYhvqQMOBhhtk/edit?usp=sharing

Thanks to Daniel Borkmann for reviewing and providing initial
feedback.
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: selftests add sockmap tests
John Fastabend [Wed, 16 Aug 2017 05:34:22 +0000 (22:34 -0700)]
bpf: selftests add sockmap tests

This generates a set of sockets, attaches BPF programs, and sends some
simple traffic using basic send/recv pattern. Additionally, we do a bunch
of negative tests to ensure adding/removing socks out of the sockmap fail
correctly.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: selftests: add tests for new __sk_buff members
John Fastabend [Wed, 16 Aug 2017 05:33:56 +0000 (22:33 -0700)]
bpf: selftests: add tests for new __sk_buff members

This adds tests to access new __sk_buff members from sk skb program
type.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: sockmap sample program
John Fastabend [Wed, 16 Aug 2017 05:33:32 +0000 (22:33 -0700)]
bpf: sockmap sample program

This program binds a program to a cgroup and then matches hard
coded IP addresses and adds these to a sockmap.

This will receive messages from the backend and send them to
the client.

     client:X <---> frontend:10000 client:X <---> backend:10001

To keep things simple this is only designed for 1:1 connections
using hard coded values. A more complete example would allow many
backends and clients.

To run,

 # sockmap <cgroup2_dir>

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: add access to sock fields and pkt data from sk_skb programs
John Fastabend [Wed, 16 Aug 2017 05:33:09 +0000 (22:33 -0700)]
bpf: add access to sock fields and pkt data from sk_skb programs

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: sockmap with sk redirect support
John Fastabend [Wed, 16 Aug 2017 05:32:47 +0000 (22:32 -0700)]
bpf: sockmap with sk redirect support

Recently we added a new map type called dev map used to forward XDP
packets between ports (6093ec2dc313). This patch introduces a
similar notion for sockets.

A sockmap allows users to add participating sockets to a map. When
sockets are added to the map enough context is stored with the
map entry to use the entry with a new helper

  bpf_sk_redirect_map(map, key, flags)

This helper (analogous to bpf_redirect_map in XDP) is given the map
and an entry in the map. When called from a sockmap program, discussed
below, the skb will be sent on the socket using skb_send_sock().

With the above we need a bpf program to call the helper from that will
then implement the send logic. The initial site implemented in this
series is the recv_sock hook. For this to work we implemented a map
attach command to add attributes to a map. In sockmap we add two
programs: a parse program and a verdict program. The parse program
uses strparser to build messages and pass them to the verdict program.
The parse programs use the normal strparser semantics. The verdict
program is of type SK_SKB.

The verdict program returns a verdict of SK_DROP or SK_REDIRECT for
now. Additional actions may be added later. When SK_REDIRECT is
returned, expected when bpf program uses bpf_sk_redirect_map(), the
sockmap logic will consult per cpu variables set by the helper routine
and pull the sock entry out of the sock map. This pattern follows the
existing redirect logic in cls and xdp programs.

This gives the flow,

 recv_sock -> str_parser (parse_prog) -> verdict_prog -> skb_send_sock
                                                     \
                                                      -> kfree_skb

As an example use case a message based load balancer may use specific
logic in the verdict program to select the sock to send on.
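
The verdict step of that flow can be modeled in userspace roughly as
below. This is a sketch under stated assumptions: every demo_* name
is hypothetical, a single static struct stands in for the per cpu
variables, and a counter stands in for skb_send_sock():

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAP_SIZE 4

struct demo_sock { int sent; };
struct demo_sockmap { struct demo_sock *socks[DEMO_MAP_SIZE]; };

enum demo_verdict { DEMO_SK_DROP, DEMO_SK_REDIRECT };

/* Stand-in for the per cpu state written by the redirect helper. */
static struct {
	struct demo_sockmap *map;
	unsigned int key;
} demo_redirect;

/* Models the helper: record the target, return the verdict. */
static enum demo_verdict
demo_sk_redirect_map(struct demo_sockmap *map, unsigned int key)
{
	assert(map != NULL);
	demo_redirect.map = map;
	demo_redirect.key = key;
	return DEMO_SK_REDIRECT;
}

/* Models the sockmap side: act on the verdict.
 * Returns 1 if the message was forwarded, 0 if it was dropped. */
static int demo_do_verdict(enum demo_verdict v)
{
	struct demo_sock *sk = NULL;

	if (v == DEMO_SK_REDIRECT && demo_redirect.map &&
	    demo_redirect.key < DEMO_MAP_SIZE)
		sk = demo_redirect.map->socks[demo_redirect.key];
	demo_redirect.map = NULL;	/* consume the recorded target */

	if (!sk)
		return 0;	/* drop path (kfree_skb in the flow above) */
	sk->sent++;		/* send path (skb_send_sock in the flow above) */
	return 1;
}
```

A load balancer in this model would compute the key inside its
verdict logic and pass it to demo_sk_redirect_map(); an empty map
slot or an out-of-range key ends up on the drop path.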

Sample programs are provided in future patches that hopefully illustrate
the user interfaces. Also selftests are in follow-on patches.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: export bpf_prog_inc_not_zero
John Fastabend [Wed, 16 Aug 2017 05:32:22 +0000 (22:32 -0700)]
bpf: export bpf_prog_inc_not_zero

bpf_prog_inc_not_zero will be used by upcoming sockmap patches; this
patch simply exports it so we can pull it in.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: introduce new program type for skbs on sockets
John Fastabend [Wed, 16 Aug 2017 05:31:58 +0000 (22:31 -0700)]
bpf: introduce new program type for skbs on sockets

A class of programs, run from strparser and soon from a new map type
called sock map, are used with skb as the context but on established
sockets. By creating a specific program type for these we can use
bpf helpers that expect full sockets and get the verifier to ensure
these helpers are not used out of context.

The new type is BPF_PROG_TYPE_SK_SKB. This patch introduces the
infrastructure and type.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: fixes for skb_send_sock
John Fastabend [Wed, 16 Aug 2017 05:31:34 +0000 (22:31 -0700)]
net: fixes for skb_send_sock

A couple of fixes to new skb_send_sock infrastructure. However, no users
currently exist for this code (adding user in next handful of patches)
so it should not be possible to trigger a panic with existing in-kernel
code.

Fixes: 306b13eb3cf9 ("proto_ops: Add locked held versions of sendmsg and sendpage")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: add sendmsg_locked and sendpage_locked to af_inet6
John Fastabend [Wed, 16 Aug 2017 05:31:10 +0000 (22:31 -0700)]
net: add sendmsg_locked and sendpage_locked to af_inet6

To complete the sendmsg_locked and sendpage_locked implementation add
the hooks for af_inet6 as well.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: early init support for strparser
John Fastabend [Wed, 16 Aug 2017 05:30:47 +0000 (22:30 -0700)]
net: early init support for strparser

It is useful to allow strparser to init sockets before the read_sock
callback has been established.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: 3c509: constify pnp_device_id
Arvind Yadav [Wed, 16 Aug 2017 04:55:59 +0000 (10:25 +0530)]
net: 3c509: constify pnp_device_id

pnp_device_id structures are not supposed to change at runtime. All functions
working with pnp_device_id provided by <linux/pnp.h> work with
const pnp_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: update VF's netdev->max_mtu if there's a change in PF's MTU
Veerasenareddy Burru [Tue, 15 Aug 2017 23:26:22 +0000 (16:26 -0700)]
liquidio: update VF's netdev->max_mtu if there's a change in PF's MTU

A VF's MTU is capped at the parent PF's MTU.  So if there's a change in the
PF's MTU, then update the VF's netdev->max_mtu.

Also remove duplicate log messages for MTU change.

Signed-off-by: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'net-sizeof-cleanups'
David S. Miller [Wed, 16 Aug 2017 18:01:58 +0000 (11:01 -0700)]
Merge branch 'net-sizeof-cleanups'

Stephen Hemminger says:

====================
net: various sizeof cleanups

Noticed some places that were using sizeof as an operator.
This is legal C but is not the convention used in the kernel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlx4: sizeof style usage
stephen hemminger [Tue, 15 Aug 2017 17:29:19 +0000 (10:29 -0700)]
mlx4: sizeof style usage

The kernel coding style is to treat sizeof as a function
(i.e. with parentheses) not as an operator.

Also use kcalloc and kmalloc_array
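
A userspace sketch of both points, with calloc() standing in for
kcalloc(); the structure and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor type, for illustration only. */
struct demo_ring_entry {
	uint64_t addr;
	uint32_t len;
	uint32_t flags;
};

/* Allocate a zeroed array of n entries; the caller frees it. */
static struct demo_ring_entry *demo_alloc_ring(size_t n)
{
	struct demo_ring_entry *ring;

	/* Both spellings are legal C and yield the same value; the
	 * kernel convention is the parenthesized one. */
	assert(sizeof(*ring) == sizeof *ring);

	/* An array allocator takes count and element size separately,
	 * instead of an open-coded n * sizeof(*ring) multiplication. */
	ring = calloc(n, sizeof(*ring));
	return ring;
}
```

Keeping the count and the element size as separate arguments lets
the allocator itself guard against the multiplication overflowing.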

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoskge: add paren around sizeof arg
stephen hemminger [Tue, 15 Aug 2017 17:29:18 +0000 (10:29 -0700)]
skge: add paren around sizeof arg

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovirtio: put paren around sizeof
stephen hemminger [Tue, 15 Aug 2017 17:29:17 +0000 (10:29 -0700)]
virtio: put paren around sizeof

Kernel coding style is to put paren around operand of sizeof.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotun/tap: use paren's with sizeof
stephen hemminger [Tue, 15 Aug 2017 17:29:16 +0000 (10:29 -0700)]
tun/tap: use paren's with sizeof

Although sizeof is an operator in C, the kernel coding style convention
is to always use it like a function and add parentheses.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>