Ido Shamay [Thu, 18 Sep 2014 08:50:59 +0000 (11:50 +0300)]
net/mlx4_core: Enable CQE/EQE stride support
This feature is intended for archs with a cache line larger than 64B.
Since our CQE/EQEs are generally 64B in those systems, HW will write
twice to the same cache line consecutively, causing pipe locks due to
the hazard prevention mechanism. For elements in a cyclic buffer, writes
are consecutive, so entries smaller than a cache line should be
avoided, especially if they are written at a high rate.
Reduce consecutive writes to same cache line in CQs/EQs, by allowing the
driver to increase the distance between entries so that each will reside
in a different cache line. Until the introduction of this feature, there
were two types of CQE/EQE:
1. 32B stride and context in the [0-31] segment
2. 64B stride and context in the [32-63] segment
This feature introduces two additional types:
3. 128B stride and context in the [0-31] segment (128B cache line)
4. 256B stride and context in the [0-31] segment (256B cache line)
Modify the mlx4_core driver to query the device for the CQE/EQE cache
line stride capability and to enable that capability when the host
cache line size is larger than 64 bytes (supported cache lines are
128B and 256B).
The mlx4 IB driver and libmlx4 need not be aware of this change. The PF
context behaviour is changed to require this change in VF drivers
running on such archs.
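The stride selection described above can be sketched in plain C. This is only an illustration, not the driver's code: cqe_eqe_stride() and entry_offset() are hypothetical helpers, and the capability flag is passed in as a plain int rather than queried from the device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick the CQE/EQE stride for a host cache line size.
 * Returns 0 when the existing 32B/64B entry layout should be kept. */
static size_t cqe_eqe_stride(size_t cache_line_size, int dev_supports_stride)
{
    if (!dev_supports_stride)
        return 0;
    /* Only 128B and 256B cache lines are supported by the feature. */
    if (cache_line_size == 128 || cache_line_size == 256)
        return cache_line_size;
    return 0;
}

/* Byte offset of entry `index` in a cyclic buffer of `nent` entries.
 * With stride == cache line size, consecutive entries never share a line. */
static size_t entry_offset(size_t index, size_t nent, size_t stride)
{
    assert(nent > 0);
    return (index % nent) * stride;
}
```

With a 128B cache line and a capable device, cqe_eqe_stride(128, 1) yields 128, so entries 0 and 1 land at offsets 0 and 128, in different cache lines.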
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Wed, 17 Sep 2014 21:23:12 +0000 (23:23 +0200)]
net: fix sparse warnings in SNMP_UPD_PO_STATS(_BH)
ptr used to be a non __percpu pointer (result of a this_cpu_ptr
assignment, 7d720c3e4f0c4 ("percpu: add __percpu sparse annotations to
net")). Since d25398df59b56 ("net: avoid reloads in SNMP_UPD_PO_STATS"),
that's no longer the case, SNMP_UPD_PO_STATS uses this_cpu_add and ptr
is now __percpu.
Silence sparse warnings by preserving the original type and
annotation, and remove the out-of-date comment.
warning: incorrect type in initializer (different address spaces)
expected unsigned long long *ptr
got unsigned long long [noderef] <asn:3>*<noident>
warning: incorrect type in initializer (different address spaces)
expected void const [noderef] <asn:3>*__vpp_verify
got unsigned long long *<noident>
warning: incorrect type in initializer (different address spaces)
expected void const [noderef] <asn:3>*__vpp_verify
got unsigned long long *<noident>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Sep 2014 21:15:40 +0000 (17:15 -0400)]
Merge branch 'fou-next'
Tom Herbert says:
====================
net: foo-over-udp (fou)
This patch series implements foo-over-udp. The idea is that we can
encapsulate different IP protocols in UDP packets. The rationale for
this is that networking devices such as NICs and switches are usually
implemented with UDP (and TCP) specific mechanisms for processing. For
instance, many switches and routers will implement a 5-tuple hash
for UDP packets to perform Equal Cost Multipath Routing (ECMP) or
RSS (on NICs). Many NICs also only provide rudimentary checksum
offload (basic TCP and UDP packet); with foo-over-udp we may be
able to leverage these NICs to offload checksums of tunneled packets
(using checksum unnecessary conversion and eventually remote checksum
offload).
An example encapsulation of IPIP over FOU is diagrammed below. As
illustrated, the packet overhead for FOU is the 8 byte UDP header.
+------------------+
| IPv4 hdr |
+------------------+
| UDP hdr |
+------------------+
| IPv4 hdr |
+------------------+
| TCP hdr |
+------------------+
| TCP payload |
+------------------+
Conceptually, FOU should be able to encapsulate any IP protocol.
The FOU header (UDP hdr.) is essentially an inserted header between the
IP header and transport, so in the case of TCP or UDP encapsulation
the pseudo header would be based on the outer IP header and its length
field must not include the UDP header.
* Receive
In this patch set the RX path for FOU is implemented in a new fou
module. To enable FOU for a particular protocol, a UDP-FOU socket is
opened to the port to receive FOU packets. The socket is mapped to the
IP protocol for the packets. The XFRM mechanism (udp_encap_rcv) is used
to receive encapsulated packets for the port. Upon reception, the
UDP header is removed and the packet is reinjected in the stack for the
corresponding protocol associated with the socket (return -protocol
from udp_encap_rcv function).
GRO is provided with the appropriate fou_gro_receive and
fou_gro_complete. These routines need to know the encapsulation
protocol so we save that in udp_offloads structure with the port
and pass it in the napi_gro_cb structure.
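The port-to-protocol mapping on receive can be sketched as follows. This is a userspace illustration under stated assumptions, not the fou module: struct demo_fou_port and demo_fou_rcv() are hypothetical, and the positive DEMO_PASS_TO_UDP value for "no FOU mapping on this port" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PASS_TO_UDP 1 /* assumed: positive means "not a FOU port" */

/* Hypothetical table entry: a bound UDP port and the IP protocol it carries. */
struct demo_fou_port {
    uint16_t port;
    uint8_t ipproto;
};

/* Returns -ipproto so the caller can reinject the packet for that protocol,
 * mirroring the "return -protocol" convention described above. */
static int demo_fou_rcv(const struct demo_fou_port *tbl, size_t n,
                        uint16_t dport)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].port == dport)
            return -(int)tbl[i].ipproto;
    return DEMO_PASS_TO_UDP;
}
```

For a table that maps port 5555 to IP protocol 4 (IPIP), a packet arriving on port 5555 produces -4, and a packet on an unrelated port is left to normal UDP processing.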
* TX
This patch series implements FOU transmit encapsulation for IPIP, GRE, and
SIT. This is done by some common infrastructure in ip_tunnel including an
ip_tunnel_encap to perform FOU encapsulation and common configuration
to enable FOU on IP tunnels. FOU is configured on existing tunnels and
does not create any new interfaces. The transmit and receive paths are
independent, so use of FOU may be asymmetric between tunnel endpoints.
* Configuration
The fou module uses netlink to configure FOU receive ports. The ip
command can be augmented with a fou subcommand to support this, e.g. to
configure FOU for IPIP on port 5555:
ip fou add port 5555 ipproto 4
GRE, IPIP, and SIT have been modified with netlink commands to
configure use of FOU on transmit. The "ip link" command will be
augmented with an encap subcommand (for supporting various forms of
secondary encapsulation). For instance, to configure an ipip tunnel
with FOU on port 5555:
ip link add name tun1 type ipip \
remote 192.168.1.1 local 192.168.1.2 ttl 225 \
encap fou encap-sport auto encap-dport 5555
* Notes
- This patch set does not implement GSO for FOU. The UDP encapsulation
code assumes TEB, so that will need to be reimplemented.
- When a packet is received through FOU, the UDP header is not
actually removed from the skbuf; instead, the pointer to the transport
header and the length in the IP header are updated (like in ESP/UDP RX).
A side effect is that the IP header will now appear to have an incorrect
checksum to an external observer (e.g. tcpdump); it will be off
by sizeof UDP header. If necessary we could adjust the checksum
to compensate.
- Performance results are below. My expectation is that FOU should
entail little overhead (clearly there is some work to do :-) ).
Optimizing UDP socket lookup for encapsulation ports should help
significantly.
- I really don't expect/want devices to have special support for any
of this. Generic checksum offload mechanisms (NETIF_HW_CSUM
and use of CHECKSUM_COMPLETE) should be sufficient. RSS and flow
steering is provided by commonly implemented UDP hashing. GRO/GSO
seem fairly comparable with LRO/TSO already.
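The checksum side effect mentioned in the second note can be reproduced in userspace. The sketch below is an assumption-laden illustration, not kernel code: it builds a minimal 20-byte IPv4 header with hypothetical helpers, then lowers the total length by 8 without touching the checksum field, as the note describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum over `len` bytes; 0 means the header verifies. */
static uint16_t demo_ip_csum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Fill a minimal IPv4 header (no options, protocol UDP) with a valid checksum. */
static void demo_make_header(uint8_t hdr[20], uint16_t tot_len)
{
    static const uint8_t addrs[8] = { 192, 168, 1, 2, 192, 168, 1, 1 };
    uint16_t csum;

    memset(hdr, 0, 20);
    hdr[0] = 0x45;                       /* version 4, IHL 5 */
    hdr[2] = (uint8_t)(tot_len >> 8);
    hdr[3] = (uint8_t)(tot_len & 0xff);
    hdr[8] = 64;                         /* TTL */
    hdr[9] = 17;                         /* protocol: UDP */
    memcpy(hdr + 12, addrs, sizeof(addrs));
    csum = demo_ip_csum(hdr, 20);        /* checksum field is still zero here */
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
}

/* Lower the total length field without updating the checksum. */
static void demo_shrink_len(uint8_t hdr[20], uint16_t by)
{
    uint16_t len = (uint16_t)((hdr[2] << 8) | hdr[3]);

    len = (uint16_t)(len - by);
    hdr[2] = (uint8_t)(len >> 8);
    hdr[3] = (uint8_t)(len & 0xff);
}
```

For a header built with total length 84, demo_ip_csum() verifies to 0. After demo_shrink_len(hdr, 8) it returns 8, which is the size of the UDP header, matching the "off by sizeof UDP header" remark.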
* Performance
Ran netperf TCP_RR and TCP_STREAM tests across various configurations.
This was performed on bnx2x and I disabled TSO/GSO on sender to get
fair comparison for FOU versus non-FOU. CPU utilization is reported
for receive in TCP_STREAM.
GRE
  IPv4, FOU, UDP checksum enabled
    TCP_STREAM
      24.85% CPU utilization
      9310.6 Mbps
    TCP_RR
      94.2% CPU utilization
      155/249/460 90/95/99% latencies
      1.17018e+06 tps
  IPv4, FOU, UDP checksum disabled
    TCP_STREAM
      31.04% CPU utilization
      9302.22 Mbps
    TCP_RR
      94.13% CPU utilization
      154/239/419 90/95/99% latencies
      1.17555e+06 tps
  IPv4, no FOU
    TCP_STREAM
      23.13% CPU utilization
      9354.58 Mbps
    TCP_RR
      90.24% CPU utilization
      156/228/360 90/95/99% latencies
      1.18169e+06 tps
IPIP
  FOU, UDP checksum enabled
    TCP_STREAM
      24.13% CPU utilization
      9328 Mbps
    TCP_RR
      94.23% CPU utilization
      149/237/429 90/95/99% latencies
      1.19553e+06 tps
  FOU, UDP checksum disabled
    TCP_STREAM
      29.13% CPU utilization
      9370.25 Mbps
    TCP_RR
      94.13% CPU utilization
      149/232/398 90/95/99% latencies
      1.19225e+06 tps
  No FOU
    TCP_STREAM
      10.43% CPU utilization
      5302.03 Mbps
    TCP_RR
      51.53% CPU utilization
      215/324/475 90/95/99% latencies
      864998 tps
SIT
  FOU, UDP checksum enabled
    TCP_STREAM
      30.38% CPU utilization
      9176.76 Mbps
    TCP_RR
      96.9% CPU utilization
      170/281/581 90/95/99% latencies
      1.03372e+06 tps
  FOU, UDP checksum disabled
    TCP_STREAM
      39.6% CPU utilization
      9176.57 Mbps
    TCP_RR
      97.14% CPU utilization
      167/272/548 90/95/99% latencies
      1.03203e+06 tps
  No FOU
    TCP_STREAM
      11.2% CPU utilization
      4636.05 Mbps
    TCP_RR
      59.51% CPU utilization
      232/346/489 90/95/99% latencies
      813199 tps
v2:
- Removed encap IP tunnel ioctls, configuration is done by netlink
only.
- Don't export fou_create and fou_destroy, they are currently
intended to be called within fou module only.
- Filled in tunnel netlink structures and functions for new values.
v3:
- Fixed change logs for some of the patches.
- Remove inline from fou_gro_receive and fou_gro_complete, let
compiler decide on these.
v4:
- Don't need to cast void in fou_from_sock
- Removed incorrect htons for port in fou_destroy
- Some minor cleanup for readability
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:26:01 +0000 (12:26 -0700)]
gre: Setup and TX path for gre/UDP foo-over-udp encapsulation
Added netlink attrs to configure FOU encapsulation for GRE, netlink
handling of these flags, and properly adjust MTU for encapsulation.
ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU
encapsulation.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:26:00 +0000 (12:26 -0700)]
ipip: Setup and TX path for ipip/UDP foo-over-udp encapsulation
Add netlink handling for IP tunnel encapsulation parameters and
adjustment of MTU for encapsulation. ip_tunnel_encap is called
from ip_tunnel_xmit to actually perform FOU encapsulation.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:25:59 +0000 (12:25 -0700)]
sit: Setup and TX path for sit/UDP foo-over-udp encapsulation
Added netlink handling of IP tunnel encapsulation parameters and proper
adjustment of the MTU for encapsulation. Added ip_tunnel_encap call to
ipip6_tunnel_xmit to actually perform FOU encapsulation.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:25:58 +0000 (12:25 -0700)]
net: Changes to ip_tunnel to support foo-over-udp encapsulation
This patch changes IP tunnel to support (secondary) encapsulation,
Foo-over-UDP. Changes include:
1) Adding tun_hlen as the tunnel header length, encap_hlen as the
encapsulation header length, and hlen becomes the grand total
of these.
2) Added common netlink define to support FOU encapsulation.
3) Routines to perform FOU encapsulation.
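The header length bookkeeping in item 1 can be sketched as below. The struct and helpers are hypothetical, and the MTU formula assumes an outer IPv4 header without options (20 bytes) and an 8-byte UDP header for FOU; the real ip_tunnel code may account for more than this.

```c
#include <assert.h>

#define DEMO_IPV4_HDR_LEN 20 /* outer IPv4 header, no options (assumed) */
#define DEMO_UDP_HDR_LEN   8 /* FOU overhead is one UDP header */

/* Hypothetical mirror of the three length fields named in the patch. */
struct demo_tunnel {
    int tun_hlen;   /* tunnel header length, e.g. 4 for a base GRE header */
    int encap_hlen; /* secondary encapsulation header length */
    int hlen;       /* grand total of the two */
};

static void demo_tunnel_set_hlen(struct demo_tunnel *t, int tun_hlen,
                                 int use_fou)
{
    assert(tun_hlen >= 0);
    t->tun_hlen = tun_hlen;
    t->encap_hlen = use_fou ? DEMO_UDP_HDR_LEN : 0;
    t->hlen = t->tun_hlen + t->encap_hlen;
}

/* Payload MTU left for the inner packet on a link with the given MTU. */
static int demo_tunnel_mtu(int link_mtu, const struct demo_tunnel *t)
{
    return link_mtu - DEMO_IPV4_HDR_LEN - t->hlen;
}
```

Under these assumptions, an IPIP tunnel (tun_hlen 0) with FOU on a 1500-byte link leaves 1472 bytes for the inner packet, and a base GRE tunnel with FOU leaves 1468.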
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:25:57 +0000 (12:25 -0700)]
fou: Add GRO support
Implement fou_gro_receive and fou_gro_complete, and populate these
in the corresponding udp_offloads for the socket. Added ipproto to
udp_offloads and pass this from UDP to the fou GRO routine in proto
field of napi_gro_cb structure.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:25:56 +0000 (12:25 -0700)]
fou: Support for foo-over-udp RX path
This patch provides a receive path for foo-over-udp. This allows
direct encapsulation of IP protocols over UDP. The bound destination
port is used to map to an IP protocol, and the XFRM framework
(udp_encap_rcv) is used to receive encapsulated packets. Upon
reception, the encapsulation header is logically removed (pointer
to transport header is advanced) and the packet is reinjected into
the receive path with the IP protocol indicated by the mapping.
Netlink is used to configure FOU ports. The configuration information
includes the port number to bind to and the IP protocol corresponding
to that port.
This should support GRE/UDP
(http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02),
as well as the other IP tunneling protocols (IPIP, SIT).
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 17 Sep 2014 19:25:55 +0000 (12:25 -0700)]
net: Export inet_offloads and inet6_offloads
Want to be able to use these in foo-over-udp offloads, etc.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 17 Sep 2014 18:11:46 +0000 (11:11 -0700)]
net: sched: cls_u32: rcu can not be last node
tc_u32_sel 'sel' in tc_u_knode expects to be the last element in the
structure and pads the structure with tc_u32_key fields for each key.
kzalloc(sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key), GFP_KERNEL)
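Why the variable-length member has to come last can be seen in a small userspace sketch. The types here are hypothetical stand-ins: the key array is placed directly in the node as a C99 flexible array member (the kernel structure nests it inside 'sel'), and calloc stands in for kzalloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one match key. */
struct demo_key {
    uint32_t mask;
    uint32_t val;
    int off;
};

/* The trailing array is sized at allocation time, so any member declared
 * after it would overlap the keys. Fixed-size members must precede it. */
struct demo_knode {
    int handle;
    unsigned char nkeys;
    struct demo_key keys[]; /* must stay the last member */
};

/* Same sizing pattern as the kzalloc call above: node plus nkeys keys. */
static struct demo_knode *demo_knode_alloc(unsigned char nkeys)
{
    struct demo_knode *n =
        calloc(1, sizeof(*n) + nkeys * sizeof(struct demo_key));

    if (n)
        n->nkeys = nkeys;
    return n;
}

/* Bounds-checked accessor into the trailing key array. */
static struct demo_key *demo_knode_key(struct demo_knode *n, unsigned int i)
{
    assert(i < n->nkeys);
    return &n->keys[i];
}
```

A node allocated with demo_knode_alloc(3) has room for keys[0] through keys[2] directly behind the fixed fields, and is released with a single free().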
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 17 Sep 2014 15:05:05 +0000 (08:05 -0700)]
net: sched: use __skb_queue_head_init() where applicable
pfifo_fast and htb use skb lists, without needing their spinlocks.
(They instead use the standard qdisc lock)
We can use __skb_queue_head_init() instead of skb_queue_head_init()
to be consistent.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Sep 2014 20:31:13 +0000 (16:31 -0400)]
Merge branch 'bnx2x-next'
Yuval Mintz says:
====================
bnx2x: Support new Multi-function modes
This patch series adds support for 2 new Multi-function modes -
Unified Fabric Port [UFP] as well as nic partitioning 1.5 [NPAR1.5].
With the addition of the new multi-function modes, the series also
revises some of the storage-related multi-function macros.
[Do notice this series has several small issues with checkpatch]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 17 Sep 2014 13:24:38 +0000 (16:24 +0300)]
bnx2x: Add a fallback multi-function mode NPAR1.5
When using the new multi-function modes, it's possible that, due to an
incompatible configuration, management FW will fall back to an existing mode.
Notice that at the moment this fallback is exactly the same as the already
existing switch-independent multi-function mode, but we still use existing
infrastructure to hold this information [in case some small differences
arise in the future].
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 17 Sep 2014 13:24:37 +0000 (16:24 +0300)]
bnx2x: New multi-function mode: UFP
Add support for a new multi-function mode based on the Unified Fabric Port
system specifications.
Support includes configuration of:
1. Outer vlan tags.
2. Bandwidth settings.
3. Virtual link enable/disable.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Wed, 17 Sep 2014 13:24:36 +0000 (16:24 +0300)]
bnx2x: Changes with storage & MAC macros
Rearrange macros to query for storage-only modes in different MF environment.
Improves the readability and maintainability of the code. E.g.:
- if (IS_MF_STORAGE_SD(bp) || IS_MF_FCOE_AFEX(bp))
+ if (IS_MF_STORAGE_ONLY(bp))
In addition, this removes the need for bnx2x_is_valid_ether_addr().
Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Sep 2014 20:27:13 +0000 (16:27 -0400)]
Merge branch 'fec-next'
Florian Fainelli says:
====================
net: phy: Broadcom BCM7xxx PHY workaround update
This patch set contains the change to of_phy_connect() that you have seen
before, this time with the full context of why it is useful and applicable here.
Due to some design decision, the internal PHY on Broadcom BCM7xxx chips
is not entirely self contained and does not report its internal revision
through MII_PHYSID2, that is left to external PHY designs.
This forces us to get the PHY revision from the GENET and SF2 switch drivers
because those two peripherals integrate such a PHY and do contain the PHY
revision in their registers.
The approach taken here is hopefully easy to extend to similar needs for
other chips as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:56 +0000 (13:07 -0700)]
net: phy: bcm7xxx: utilize PHY revision in config_init
Now that the GENET and SF2 drivers have been updated to tell us
the revision of the BCM7xxx integrated PHY, utilize that
information in the config_init() callback to call into the appropriate
workaround function based on our revision.
While at it, we also print the revision and patch level to help debug
new chips.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:55 +0000 (13:07 -0700)]
net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver
The integrated BCM7xxx PHY contains no useful revision information
in its MII_PHYSID2 bits 3:0, that information is instead contained in
the SWITCH_REG_PHY_REVISION register.
Read this register, store its value, and return it by implementing the
dsa_switch::get_phy_flags() callback accordingly. The register layout
already matches what the BCM7xxx PHY driver expects to find.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:54 +0000 (13:07 -0700)]
net: dsa: allow switch drivers to specify phy_device::dev_flags
Some switch drivers (e.g: bcm_sf2) may have to communicate specific
workarounds or flags to the PHY device driver. Allow switch
drivers to be delegated that task by introducing a get_phy_flags()
callback which will do just that.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:53 +0000 (13:07 -0700)]
net: bcmgenet: communicate integrated PHY revision to PHY driver
The integrated BCM7xxx PHY contains no useful revision information in
its MII_PHYSID2 bits 3:0, that information is instead contained in the
GENET hardware block.
We already read the GENET 32-bit revision register, so store the
integrated PHY revision in the driver private structure, and then
communicate this revision value to the PHY driver by overriding the
phy_flags value.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:52 +0000 (13:07 -0700)]
net: bcmgenet: remove PHY_BRCM_100MBPS_WAR
Now that we have removed the need for the PHY_BRCM_100MBPS_WAR flag, we
can remove it from the GENET driver and the broadcom shared header file.
The PHY driver checks the PHY supported bitmask instead.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:51 +0000 (13:07 -0700)]
net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR
There is no need for the PHY driver to check PHY_BRCM_100MBPS_WAR since
that is redundant with checking the PHY device supported features. Get
rid of that workaround flag.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:50 +0000 (13:07 -0700)]
net: phy: broadcom: add helper for PHY revision and patch level
The Broadcom BCM7xxx internal PHYs do not contain any useful revision
information in the low 4-bits of their MII_PHYSID2 (MII register 3)
which could allow us to properly identify them.
As a result, we need the actual hardware block integrating these PHYs:
GENET or the SF2 switch to tell us what revision they are built with. To
assist with that, add two helper macros for fetching the PHY
revision and patch level from the struct phy_device::dev_flags.
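Helper macros of this kind can be sketched as follows. The names and the bit layout (revision in bits 15:8, patch level in bits 7:0 of the flags word) are assumptions made for illustration; the commit message does not spell out the layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the flags word: revision in bits 15:8, patch in 7:0. */
#define DEMO_PHY_REV(flags)   (((flags) >> 8) & 0xff)
#define DEMO_PHY_PATCH(flags) ((flags) & 0xff)

/* What a MAC or switch driver would do before handing flags to the PHY. */
static inline uint32_t demo_phy_pack(uint8_t rev, uint8_t patch)
{
    return ((uint32_t)rev << 8) | patch;
}
```

A PHY driver can then branch on DEMO_PHY_REV(flags) inside its config_init() callback to pick a revision-specific workaround.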
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 19 Sep 2014 20:07:49 +0000 (13:07 -0700)]
of: mdio: honor flags passed to of_phy_connect
Commit f9a8f83b04e0 ("net: phy: remove flags argument from phy_{attach,
connect, connect_direct}") removed the flags argument to the PHY library
calls to: phy_{attach,connect,connect_direct}.
Most Device Tree aware drivers call of_phy_connect() with the flag
argument set to 0, but some of them might want to set a different value
there in order for the PHY driver to key a specific behavior based on
the phy_device::phy_flags value.
Allow such drivers to set custom phy_flags as part of the
of_phy_connect() call. Since of_phy_connect() starts the PHY state
machine, it will call into the PHY driver config_init() callback, which
is usually where a specific phy_flags value is important.
Fixes: f9a8f83b04e0 ("net: phy: remove flags argument from phy_{attach, connect, connect_direct}")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 17 Sep 2014 11:49:49 +0000 (04:49 -0700)]
net: add alloc_skb_with_frags() helper
Extract from sock_alloc_send_pskb() code building skb with frags,
so that we can reuse this in other contexts.
Intent is to use it from tcp_send_rcvq(), tcp_collapse(), ...
We also want to replace some skb_linearize() calls with a more reliable
strategy in pathological cases where we need to reduce the number of frags.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 17 Sep 2014 10:14:42 +0000 (03:14 -0700)]
tcp: do not fake tcp headers in tcp_send_rcvq()
Now that we no longer rely on having tcp headers for skbs in the receive
queue, tcp repair does not need to build fake ones.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Sep 2014 19:57:46 +0000 (15:57 -0400)]
Merge branch 'udp-tunnel-common'
Andy Zhou says:
====================
Refactor vxlan and l2tp to use new common UDP tunnel APIs
This patch series adds a few more UDP tunnel APIs and refactors the current
UDP tunnel based protocols, vxlan and l2tp, to make use of the new APIs.
The added APIs are setup_udp_tunnel_sock(), udp_tunnel_xmit_skb() and
udp_tunnel_sock_release(). This logic already exists in the
current vxlan and l2tp implementations. Move it to common APIs to reduce
code duplication.
Also split udp_tunnel.c into net/ipv4/udp_tunnel.c and
net/ipv6/ip6_udp_tunnel.c to maintain proper IP protocol separation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Zhou [Wed, 17 Sep 2014 00:31:19 +0000 (17:31 -0700)]
l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functions
Simplify l2tp implementation using common UDP tunnel APIs.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Zhou [Wed, 17 Sep 2014 00:31:18 +0000 (17:31 -0700)]
vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.
Simplify vxlan implementation using common UDP tunnel APIs.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Zhou [Wed, 17 Sep 2014 00:31:17 +0000 (17:31 -0700)]
udp-tunnel: Add a few more UDP tunnel APIs
Added a few more UDP tunnel APIs that can be shared by UDP based
tunnel protocol implementation. The main ones are highlighted below.
setup_udp_tunnel_sock() configures UDP listener socket for
receiving UDP encapsulated packets.
udp_tunnel_xmit_skb() and udp_tunnel6_xmit_skb() transmit skb
using UDP encapsulation.
udp_tunnel_sock_release() closes the UDP tunnel listener socket.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Zhou [Wed, 17 Sep 2014 00:31:16 +0000 (17:31 -0700)]
udp_tunnel: Separate ipv6 functions into its own file.
Add ip6_udp_tunnel.c for ipv6 UDP tunnel functions to avoid ifdefs
in udp_tunnel.c
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Sep 2014 19:36:54 +0000 (15:36 -0400)]
Merge branch 'fec-next'
Frank Li says:
====================
net: fec: add interrupt coalescence
Improve error handling when parsing the queue number.
Add the interrupt coalescence feature.
Change from v2 to v3
- add error check in fec_enet_set_coalesce
- fix a run time warning to get clock rate in interrupt
- fix commit message use TKT number
Change from v1 to v2
- fix indentation
- use errata number instead of TKT
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Tue, 16 Sep 2014 21:18:54 +0000 (05:18 +0800)]
net: fec: Workaround for imx6sx enet tx hang when enable three queues
When three queues are enabled on imx6sx enet and a tx performance
test is run with the iperf tool, tx hangs after some time.
Found that:
If uDMA is running, software setting TDAR may cause a tx hang.
If uDMA is idle, software setting TDAR does not cause a tx hang.
There is a TDAR race condition for multiQ when the software sets TDAR
and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles).
This will cause the udma_tx and udma_tx_arbiter state machines to hang.
The issue exists in the i.MX6SX enet IP.
So the workaround is to check the TDAR status four times: if TDAR has been
cleared by hardware, write TDAR; otherwise don't set TDAR.
This patch is only a workaround for the issue ERR007885.
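The check-then-write logic of the workaround can be sketched against a simulated register. Everything here is hypothetical scaffolding: struct demo_reg replays a scripted sequence of read values in place of real MMIO, and the sketch stops at the first read that sees TDAR cleared, which is one plausible reading of "checking TDAR status four times".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated register: replays scripted read values and counts writes. */
struct demo_reg {
    const uint32_t *reads;
    int nreads;
    int pos;
    int writes;
};

static uint32_t demo_read(struct demo_reg *r)
{
    uint32_t v;

    assert(r->nreads > 0);
    v = r->reads[r->pos];
    if (r->pos < r->nreads - 1)
        r->pos++;
    return v;
}

static void demo_write(struct demo_reg *r)
{
    r->writes++;
}

/* Poll TDAR up to four times and set it only once hardware has cleared it.
 * Returns true if TDAR was written, false if it stayed set throughout. */
static bool demo_kick_tx(struct demo_reg *tdar)
{
    for (int i = 0; i < 4; i++) {
        if (demo_read(tdar) == 0) {
            demo_write(tdar);
            return true;
        }
    }
    return false;
}
```

If all four reads see TDAR still set, no write is issued, since the uDMA is presumably still running and will pick up the new descriptor on its own.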
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Tue, 16 Sep 2014 21:18:53 +0000 (05:18 +0800)]
net:fec: increase DMA queue number
When interrupt coalescing is enabled, 8 BDs are not enough.
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Tue, 16 Sep 2014 21:18:52 +0000 (05:18 +0800)]
net: fec: add interrupt coalescence feature support
i.MX6 SX supports the interrupt coalescence feature.
By default, init the interrupt coalescing frame count threshold and
timer threshold.
Supply the ethtool interfaces as below for user tuning to improve
enet performance:
rx_max_coalesced_frames
rx_coalesce_usecs
tx_max_coalesced_frames
tx_coalesce_usecs
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Tue, 16 Sep 2014 21:18:51 +0000 (05:18 +0800)]
net: fec: refine error handle of parser queue number from DT
Check the tx and rx queue numbers separately.
Fix typos, "Invalidate" and "fail".
Change pr_err to pr_warn.
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Tue, 16 Sep 2014 19:35:35 +0000 (12:35 -0700)]
sparc: bpf_jit: add SKF_AD_PKTTYPE support to JIT
Commit 233577a22089 ("net: filter: constify detection of pkt_type_offset")
allows us to implement simple PKTTYPE support in the sparc JIT.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Tue, 16 Sep 2014 18:34:18 +0000 (02:34 +0800)]
net: fec: fix build error at m68k platform
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 4d494cdc92b3b9a0f5fb9e1560810fa27d5a0489
make.cross ARCH=m68k m5272c3_defconfig
make.cross ARCH=m68k
drivers/net/ethernet/freescale/fec.h:262:0: warning: "FEC_R_DES_START" redefined
#define FEC_R_DES_START(X) ((X == 1) ? FEC_R_DES_START_1 : \
^
drivers/net/ethernet/freescale/fec.h:158:0: note: this is the location of the previous definition
#define FEC_R_DES_START 0x3d0 /* Receive descriptor ring */
^
drivers/net/ethernet/freescale/fec.h:265:0: warning: "FEC_X_DES_START" redefined
#define FEC_X_DES_START(X) ((X == 1) ? FEC_X_DES_START_1 : \
...
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 16 Sep 2014 07:33:42 +0000 (00:33 -0700)]
net: sched: cls_cgroup need tcf_exts_init in all cases
This ensures the tcf_exts_init() is called for all cases.
Fixes: 952313bd62589cae216a57 ("net: sched: cls_cgroup use RCU")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Sep 2014 20:21:48 +0000 (16:21 -0400)]
Merge branch 'net_next_ovs' of git://git./linux/kernel/git/pshelar/openvswitch
Pravin B Shelar says:
====================
Open vSwitch
The following patches add recirculation and hash actions to OVS.
The first patch removes a pointer to a stack object. The next three patches
do code restructuring which is required for the last patch.
The recirculation implementation is changed, according to comments from
David Miller, to avoid using recursive calls in OVS. It uses a
queue to record recirc actions, and deferred recirc is executed at
the end of the current actions execution.
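The idea of replacing recursion with a deferred queue can be sketched as below. This is a toy model, not the OVS datapath: the fixed-size FIFO, the `remaining` counter standing in for further recirc actions, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_FIFO_SIZE 8

/* One deferred recirculation: which packet, and how many recirc steps remain. */
struct demo_deferred {
    int packet_id;
    int remaining;
};

struct demo_fifo {
    struct demo_deferred items[DEMO_FIFO_SIZE];
    size_t head;
    size_t tail;
};

static bool demo_fifo_put(struct demo_fifo *f, struct demo_deferred d)
{
    if (f->head - f->tail == DEMO_FIFO_SIZE)
        return false; /* queue full */
    f->items[f->head++ % DEMO_FIFO_SIZE] = d;
    return true;
}

static bool demo_fifo_get(struct demo_fifo *f, struct demo_deferred *out)
{
    if (f->head == f->tail)
        return false; /* queue empty */
    *out = f->items[f->tail++ % DEMO_FIFO_SIZE];
    return true;
}

/* Run one action list. A recirc action is recorded, not executed in place. */
static void demo_execute(struct demo_fifo *f, int packet_id, int remaining)
{
    assert(remaining >= 0);
    if (remaining > 0) {
        struct demo_deferred d = { packet_id, remaining - 1 };

        demo_fifo_put(f, d);
    }
}

/* Top-level entry: run the actions, then drain deferred recircs in a loop.
 * Returns how many action lists were executed. */
static int demo_process(int packet_id, int recirc_depth)
{
    struct demo_fifo f = { .head = 0, .tail = 0 };
    struct demo_deferred d;
    int runs = 1;

    demo_execute(&f, packet_id, recirc_depth);
    while (demo_fifo_get(&f, &d)) {
        demo_execute(&f, d.packet_id, d.remaining);
        runs++;
    }
    return runs;
}
```

Stack depth stays constant no matter how many times a packet recirculates, because demo_execute() never calls itself; only the loop in demo_process() iterates.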
v1-v2:
Changed subsystem name in subject to openvswitch
v2-v3:
Added patch to remove pkt_key pointer from skb->cb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 16 Sep 2014 06:31:42 +0000 (23:31 -0700)]
net: sched: cls_fw: add missing tcf_exts_init call in fw_change()
When allocating a new structure we also need to call tcf_exts_init
to initialize exts.
A follow up patch might be in order to remove some of this code
and do tcf_exts_assign(). With this we could remove the
tcf_exts_init/tcf_exts_change pattern for some of the classifiers.
As part of the future tcf_actions RCU series this will need to be
done. For now fix the call here.
Fixes: e35a8ee5993ba81fd6c0 ("net: sched: fw use RCU")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 16 Sep 2014 06:31:17 +0000 (23:31 -0700)]
net: sched: cls_cgroup fix possible memory leak of 'new'
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: 54996b529ab70ca1d6f40677cd2698c4f7127e87
commit: c7953ef23042b7c4fc2be5ecdd216aacff6df5eb [625/646] net: sched: cls_cgroup use RCU
net/sched/cls_cgroup.c:130 cls_cgroup_change() warn: possible memory leak of 'new'
net/sched/cls_cgroup.c:135 cls_cgroup_change() warn: possible memory leak of 'new'
net/sched/cls_cgroup.c:139 cls_cgroup_change() warn: possible memory leak of 'new'
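The class of leak these warnings point at, an early return after allocating the replacement object, is commonly handled with a single error label. The sketch below is a generic userspace illustration with hypothetical names; it is not the cls_cgroup fix itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_head {
    int handle;
};

/* Stand-in for a validation step that can fail after the allocation. */
static int demo_validate(int handle)
{
    return handle > 0 ? 0 : -EINVAL;
}

/* Replace *slot with a freshly allocated head. Every failure after the
 * allocation goes through errout so the new object is always freed. */
static int demo_change(struct demo_head **slot, int handle)
{
    struct demo_head *new_head;
    int err;

    assert(slot != NULL);
    new_head = calloc(1, sizeof(*new_head));
    if (!new_head)
        return -ENOBUFS;

    err = demo_validate(handle);
    if (err < 0)
        goto errout;

    new_head->handle = handle;
    free(*slot); /* the old object is released only on success */
    *slot = new_head;
    return 0;

errout:
    free(new_head);
    return err;
}
```

On failure the caller's pointer is left untouched, so the old configuration stays in effect and nothing is leaked.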
Fixes: c7953ef23042b7c4fc2be5ecdd216aac ("net: sched: cls_cgroup use RCU")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 16 Sep 2014 06:30:49 +0000 (23:30 -0700)]
net: sched: cls_u32 add missing rcu_assign_pointer and annotation
Add missing rcu_assign_pointer and missing annotation for ht_up
in cls_u32.c
Caught by kbuild bot,
>> net/sched/cls_u32.c:378:36: sparse: incorrect type in initializer (different address spaces)
net/sched/cls_u32.c:378:36: expected struct tc_u_hnode *ht
net/sched/cls_u32.c:378:36: got struct tc_u_hnode [noderef] <asn:4>*ht_up
>> net/sched/cls_u32.c:610:54: sparse: incorrect type in argument 4 (different address spaces)
net/sched/cls_u32.c:610:54: expected struct tc_u_hnode *ht
net/sched/cls_u32.c:610:54: got struct tc_u_hnode [noderef] <asn:4>*ht_up
>> net/sched/cls_u32.c:684:18: sparse: incorrect type in assignment (different address spaces)
net/sched/cls_u32.c:684:18: expected struct tc_u_hnode [noderef] <asn:4>*ht_up
net/sched/cls_u32.c:684:18: got struct tc_u_hnode *[assigned] ht
>> net/sched/cls_u32.c:359:18: sparse: dereference of noderef expression
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 16 Sep 2014 06:30:26 +0000 (23:30 -0700)]
net: sched: fix unused cpu variable
The kbuild test robot reported an unused variable cpu in cls_u32.c
after the patch below. This happens when the PERF and MARK config
variables are disabled.
The fix is to use separate variables for perf and mark
and to define the cpu variable inside the ifdef logic.
Fixes: 459d5f626da7 ("net: sched: make cls_u32 per cpu")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 23:43:43 +0000 (16:43 -0700)]
net_sched: fix a null pointer dereference in tcindex_set_parms()
This patch fixes the following crash:
[ 42.199159] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 42.200027] IP: [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526
[ 42.200027] PGD d2319067 PUD d4ffe067 PMD 0
[ 42.200027] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 42.200027] CPU: 0 PID: 541 Comm: tc Not tainted 3.17.0-rc4+ #603
[ 42.200027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 42.200027] task: ffff8800d22d2670 ti: ffff8800ce790000 task.ti: ffff8800ce790000
[ 42.200027] RIP: 0010:[<ffffffff817e3fc4>] [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526
[ 42.200027] RSP: 0018:ffff8800ce793898 EFLAGS: 00010202
[ 42.200027] RAX: 0000000000000001 RBX: ffff8800d1786498 RCX: 0000000000000000
[ 42.200027] RDX: ffffffff82114ec8 RSI: ffffffff82114ec8 RDI: ffffffff82114ec8
[ 42.200027] RBP: ffff8800ce793958 R08: 00000000000080d0 R09: 0000000000000001
[ 42.200027] R10: ffff8800ce7939a0 R11: 0000000000000246 R12: ffff8800d017d238
[ 42.200027] R13: 0000000000000018 R14: ffff8800d017c6a0 R15: ffff8800d1786620
[ 42.200027] FS: 00007f4e24539740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
[ 42.200027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.200027] CR2: 0000000000000018 CR3: 00000000cff38000 CR4: 00000000000006f0
[ 42.200027] Stack:
[ 42.200027]  ffff8800ce0949f0 0000000000000000 0000000200000003 ffff880000000000
[ 42.200027]  ffff8800ce7938b8 ffff8800ce7938b8 0000000600000007 0000000000000000
[ 42.200027]  ffff8800ce7938d8 ffff8800ce7938d8 0000000600000007 ffff8800ce0949f0
[ 42.200027] Call Trace:
[ 42.200027] [<ffffffff817e4169>] tcindex_change+0xdb/0xee
[ 42.200027] [<ffffffff817c16ca>] tc_ctl_tfilter+0x44d/0x63f
[ 42.200027] [<ffffffff8179d161>] rtnetlink_rcv_msg+0x181/0x194
[ 42.200027] [<ffffffff8179cf9d>] ? rtnl_lock+0x17/0x19
[ 42.200027] [<ffffffff8179cfe0>] ? __rtnl_unlock+0x17/0x17
[ 42.200027] [<ffffffff817ee296>] netlink_rcv_skb+0x49/0x8b
[ 43.462494] [<ffffffff8179cfc2>] rtnetlink_rcv+0x23/0x2a
[ 43.462494] [<ffffffff817ec8df>] netlink_unicast+0xc7/0x148
[ 43.462494] [<ffffffff817ed413>] netlink_sendmsg+0x5cb/0x63d
[ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224
[ 43.462494] [<ffffffff817757b8>] __sock_sendmsg_nosec+0x25/0x27
[ 43.462494] [<ffffffff81778165>] sock_sendmsg+0x57/0x71
[ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4
[ 43.462494] [<ffffffff81152c06>] ? might_fault+0xa0/0xa4
[ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4
[ 43.462494] [<ffffffff817838fd>] ? verify_iovec+0x69/0xb7
[ 43.462494] [<ffffffff817784f8>] ___sys_sendmsg+0x21d/0x2bb
[ 43.462494] [<ffffffff81009db3>] ? native_sched_clock+0x35/0x37
[ 43.462494] [<ffffffff8109ab53>] ? sched_clock_local+0x12/0x72
[ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224
[ 43.462494] [<ffffffff8109ada4>] ? sched_clock_cpu+0xa0/0xb9
[ 43.462494] [<ffffffff810aee37>] ? __lock_acquire+0x5fe/0xde4
[ 43.462494] [<ffffffff8119f570>] ? rcu_read_lock_held+0x36/0x38
[ 43.462494] [<ffffffff8119f75a>] ? __fcheck_files.isra.7+0x4b/0x57
[ 43.462494] [<ffffffff8119fbf2>] ? __fget_light+0x30/0x54
[ 43.462494] [<ffffffff81779012>] __sys_sendmsg+0x42/0x60
[ 43.462494] [<ffffffff81779042>] SyS_sendmsg+0x12/0x1c
[ 43.462494] [<ffffffff819d24d2>] system_call_fastpath+0x16/0x1b
'p->h' could be NULL while 'cp->h' is always up to date.
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-By: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 23:43:42 +0000 (16:43 -0700)]
net_sched: fix memory leak in cls_tcindex
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-By: John Fastabend <john.r.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Zhou [Tue, 16 Sep 2014 02:37:25 +0000 (19:37 -0700)]
openvswitch: Add recirc and hash action.
The recirc action allows a packet to reenter openvswitch processing.
Currently openvswitch looks up a flow for each received packet and executes
a set of actions on that packet. With the help of the recirc action we can
process/modify the packet and recirculate it back into openvswitch
for another pass.
The OVS hash action calculates a 5-tuple hash and stores it in the flow-key
hash. This can be used along with recirculation for distributing
packets among different ports for bond devices.
For example:
OVS bonding can use following actions:
Match on: bond flow; Action: hash, recirc(id)
Match on: recirc-id == id and hash lower bits == a;
Action: output port_bond_a
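The two-pass bonding example above can be sketched in plain C. This is only
an illustration under stated assumptions: struct five_tuple, tuple_hash() and
bond_select_port() are hypothetical names, and FNV-1a stands in for whatever
hash the datapath actually computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 5-tuple; field names are illustrative, not the OVS flow key. */
struct five_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t  proto;
};

/* Mix the low nbytes of v into h using FNV-1a (a stand-in hash). */
static uint32_t fnv1a_step(uint32_t h, uint32_t v, int nbytes)
{
	for (int i = 0; i < nbytes; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* First pass ("hash" action): compute a hash over the 5-tuple. */
static uint32_t tuple_hash(const struct five_tuple *t)
{
	uint32_t h = 2166136261u;

	h = fnv1a_step(h, t->src_ip, 4);
	h = fnv1a_step(h, t->dst_ip, 4);
	h = fnv1a_step(h, t->src_port, 2);
	h = fnv1a_step(h, t->dst_port, 2);
	h = fnv1a_step(h, t->proto, 1);
	return h;
}

/* Second pass (after recirc): match on the lower hash bits to pick a port.
 * n_ports_pow2 must be a power of two so the mask selects the low bits. */
static unsigned int bond_select_port(uint32_t hash, unsigned int n_ports_pow2)
{
	assert(n_ports_pow2 != 0 && (n_ports_pow2 & (n_ports_pow2 - 1)) == 0);
	return hash & (n_ports_pow2 - 1);
}
```

Packets of the same flow produce the same hash, so they always leave through
the same bond member, while different flows spread across the members.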
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Andy Zhou [Tue, 16 Sep 2014 02:33:50 +0000 (19:33 -0700)]
openvswitch: simplify sample action implementation
The current sample() function implementation is more complicated
than necessary in handling single user space action optimization
and skb reference counting. There are no functional changes.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Pravin B Shelar [Tue, 16 Sep 2014 02:28:44 +0000 (19:28 -0700)]
openvswitch: Use tun_key only for egress tunnel path.
Currently tun_key is used for passing tunnel information
on both the ingress and egress paths, which causes confusion. This
patch removes its use on the ingress path, making it an egress-only parameter.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Pravin B Shelar [Tue, 16 Sep 2014 02:20:31 +0000 (19:20 -0700)]
openvswitch: refactor ovs flow extract API.
OVS flow extract is called on the packet receive or packet
execute code path. This patch defines a separate API
for extracting the flow-key in the packet execute code path.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Pravin B Shelar [Tue, 16 Sep 2014 02:15:28 +0000 (19:15 -0700)]
openvswitch: Remove pkt_key from OVS_CB
OVS keeps a pointer to the packet key in skb->cb, but the packet key is
stored on the stack. This could make the code a bit tricky, so it is better to
get rid of the pointer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Florian Fainelli [Mon, 15 Sep 2014 21:48:08 +0000 (14:48 -0700)]
net: dsa: fix mii_bus to host_dev replacement
dsa_of_probe() still used cd->mii_bus instead of cd->host_dev when
building with CONFIG_OF=y. Fix this by making the replacement here as
well.
Fixes: b4d2394d01b ("dsa: Replace mii_bus with a generic host device")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 21:06:49 +0000 (14:06 -0700)]
net_sched: use tcindex_filter_result_init()
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 21:06:48 +0000 (14:06 -0700)]
net_sched: fix suspicious RCU usage in tcindex_classify()
This patch fixes the following kernel warning:
[ 44.805900] [ INFO: suspicious RCU usage. ]
[ 44.808946] 3.17.0-rc4+ #610 Not tainted
[ 44.811831] -------------------------------
[ 44.814873] net/sched/cls_tcindex.c:84 suspicious rcu_dereference_check() usage!
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 21:06:46 +0000 (14:06 -0700)]
net_sched: fix an allocation bug in tcindex_set_parms()
Fixes: commit 331b72922c5f58d48fd ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 15 Sep 2014 21:21:50 +0000 (14:21 -0700)]
net_sched: fix suspicious RCU usage in cls_bpf_classify()
Fixes: commit 1f947bf151e90ec0baad2948 ("net: sched: rcu'ify cls_bpf")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 15 Sep 2014 21:24:29 +0000 (17:24 -0400)]
Merge branch 'dsa-next'
Alexander Duyck says:
====================
DSA Cleanups
This patch series does two things. First, it cleans up the tag_protocol and
protocol ops being configured separately. Second, it addresses the desire
to split DSA away from relying on a MII bus.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 15 Sep 2014 17:00:27 +0000 (13:00 -0400)]
dsa: Replace mii_bus with a generic host device
This change makes it so that instead of passing and storing a mii_bus we
instead pass and store a host_dev. From there we can test to determine the
exact type of device, and can verify it is the correct device for our switch.
So for example it would be possible to pass a device pointer from a pci_dev
and instead of checking for a PHY ID we could check for a vendor and/or device
ID.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 15 Sep 2014 17:00:19 +0000 (13:00 -0400)]
dsa: Split ops up, and avoid assigning tag_protocol and receive separately
This change addresses several issues.
First, it was possible to set tag_protocol without setting the ops pointer.
To correct that I have reordered things so that rcv is now populated before
we set tag_protocol.
Second, it didn't make much sense to keep setting the device ops each time a
new slave was registered. So by moving the receive portion out into root
switch initialization that issue should be addressed.
Third, I wanted to avoid sending tags if the rcv pointer was not registered
so I changed the tag check to verify if the rcv function pointer is set on
the root tree. If it is then we start sending DSA tagged frames.
Finally I split the device ops pointer in the structures into two spots. I
placed the rcv function pointer in the root switch since this makes it
easiest to access from there, and I placed the xmit function pointer in the
slave for the same reason.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 15 Sep 2014 21:19:55 +0000 (17:19 -0400)]
Merge branch 'bonding-cleanups'
Nikolay Aleksandrov says:
====================
bonding: style, comment and assertion changes
This is a small and simple patch-set that doesn't introduce (hopefully) any
functional changes, but only stylistic and semantic ones.
Patch 01 simply uses the already provided __rlb_next_rx_slave function inside
rlb_next_rx_slave(), thus removing the duplication of code.
Patch 02 changes all comments that I could find to netdev style, removes
some outdated ones and fixes a few more small cosmetic issues (new line
after declaration, braces around if; else and such)
Patch 03 removes one extra ASSERT_RTNL() because we already have it in the
parent function and consolidates two other ASSERT_RTNL()s to the function
that is exported and supposed to be called with RTNL anyway.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 15 Sep 2014 15:19:35 +0000 (17:19 +0200)]
bonding: consolidate ASSERT_RTNL()s and remove the unnecessary
Consolidate the calls to ASSERT_RTNL() before bond_select_active_slave()
inside bond_select_active_slave() itself and remove the ASSERT_RTNL()
from bond_hw_addr_swap() as it's not exported and its only caller -
bond_change_active_slave() already has an ASSERT_RTNL().
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 15 Sep 2014 15:19:34 +0000 (17:19 +0200)]
bonding: trivial: style and comment fixes
First adjust a couple of locking comments that were left inaccurate,
then adjust comments to use the netdev styling and remove extra new
lines where necessary and add a couple of new lines between declarations
and code. These are all trivial styling changes, no functional change.
Also removed a couple of outdated or obvious comments.
This patch is by no means a complete fix of all netdev style violations
but it gets the bonding closer.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 15 Sep 2014 15:19:33 +0000 (17:19 +0200)]
bonding: consolidate the two rlb_next_rx_slave functions into one
__rlb_next_rx_slave() is a copy of rlb_next_rx_slave() with the
difference that it uses rcu primitives to walk the slave list. We don't
need the two functions and can make rlb_next_rx_slave() a wrapper for
callers which hold RTNL.
So add a comment and ASSERT_RTNL() to make sure what is intended.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 15 Sep 2014 18:41:12 +0000 (14:41 -0400)]
Merge branch 'tcpflags'
Eric Dumazet says:
====================
tcp: no longer keep around headers in input path
Looking at tcp_try_coalesce() I was wondering why I did :
if (tcp_hdr(from)->fin)
return false;
The answer would be to allow the aggregation, if we simply OR the FIN and PSH
flags eventually present in @from to @to packet. (Note a change is also
needed in skb_try_coalesce() to avoid calling skb_put() with 0 len)
Then, looking at tcp_recvmsg(), I realized we access tcp_hdr(skb)->syn
(and maybe tcp_hdr(skb)->fin) for every packet we process from socket
receive queue.
We have to understand TCP flags are cold in cpu caches most of the time
(assuming TCP timestamps, and that application calls recvmsg() a long
time after incoming packet was processed), and bringing a whole
cache line only to access one bit is not very nice.
It would make sense to use in TCP input path TCP_SKB_CB(skb)->tcp_flags
as we do in output path.
This saves one cache line miss, and TCP tcp_collapse() can avoid dealing
with the headers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 15 Sep 2014 11:19:53 +0000 (04:19 -0700)]
tcp: do not copy headers in tcp_collapse()
tcp_collapse() wants to shrink skb so that the overhead is minimal.
Now we store tcp flags into TCP_SKB_CB(skb)->tcp_flags, we no longer
need to keep around full headers.
Whole available space is dedicated to the payload.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 15 Sep 2014 11:19:52 +0000 (04:19 -0700)]
tcp: allow segment with FIN in tcp_try_coalesce()
We can allow a segment with FIN to be aggregated,
if we take care to add tcp flags,
and if skb_try_coalesce() takes care of zero sized skbs.
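A minimal userspace sketch of the idea, assuming a hypothetical struct seg in
place of an skb and a hypothetical try_coalesce() in place of
tcp_try_coalesce(): instead of refusing a segment that carries FIN, the flags
of the merged segment are OR-ed into the target, and a zero-length payload is
tolerated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bit values as laid out in the TCP header. */
#define TCPHDR_FIN 0x01
#define TCPHDR_PSH 0x08

/* Hypothetical stand-in for an skb: only the fields this sketch needs. */
struct seg {
	uint32_t seq;       /* first sequence number */
	uint32_t end_seq;   /* one past the last sequence number */
	uint32_t len;       /* payload bytes */
	uint8_t  tcp_flags; /* cached flags, in the spirit of TCP_SKB_CB(skb)->tcp_flags */
};

/* Merge 'from' into 'to' when they are contiguous, carrying the flags over. */
static bool try_coalesce(struct seg *to, const struct seg *from)
{
	assert(to && from);

	if (from->seq != to->end_seq)
		return false;                 /* not adjacent in sequence space */

	to->len += from->len;                 /* may add 0 bytes for a bare FIN */
	to->end_seq = from->end_seq;
	to->tcp_flags |= from->tcp_flags;     /* FIN and PSH are not lost */
	return true;
}
```

With this shape, a bare FIN (zero payload, one sequence number) can be folded
into the preceding data segment and the receiver still sees the FIN flag.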
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 15 Sep 2014 11:19:51 +0000 (04:19 -0700)]
tcp: use TCP_SKB_CB(skb)->tcp_flags in input path
The TCP input path does not currently use TCP_SKB_CB(skb)->tcp_flags,
which is only used in the output path.
tcp_recvmsg() looks at tcp_hdr(skb)->syn for every skb found in the receive queue,
and it's unfortunate because this bit is located in a cache line right before
the payload.
We can simplify TCP by copying tcp flags into TCP_SKB_CB(skb)->tcp_flags.
This patch does so, and avoids the cache line miss in tcp_recvmsg().
Following patches will
- allow a segment with FIN being coalesced in tcp_try_coalesce()
- simplify tcp_collapse() by not copying the headers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rickard Strandqvist [Sun, 14 Sep 2014 17:34:47 +0000 (19:34 +0200)]
net: ethernet: neterion: vxge: vxge-main.c: Cleaning up missing null-terminate in conjunction with strncpy
Replace strncpy with strlcpy to avoid strings that lack a null terminator.
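For readers unfamiliar with the difference: strncpy() leaves the destination
unterminated when the source does not fit, while strlcpy()-style copies always
terminate. The sketch below shows those semantics with a hypothetical helper,
bounded_copy(), written here only for illustration (it is not the kernel's
strlcpy implementation).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with strlcpy-like semantics: copy at most size - 1
 * bytes, always terminate when size > 0, and return strlen(src). */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst && src);
	len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len; /* a result >= size tells the caller the copy was truncated */
}
```

A caller can compare the return value with the buffer size to detect
truncation, which strncpy() gives no direct way to do.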
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rickard Strandqvist [Sun, 14 Sep 2014 17:32:42 +0000 (19:32 +0200)]
net: ethernet: freescale: fec_main.c: Cleaning up missing null-terminate in conjunction with strncpy
Replace strncpy with strlcpy to avoid strings that lack a null terminator.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Sat, 13 Sep 2014 20:38:27 +0000 (22:38 +0200)]
bna: use container_of to resolve bufdesc_ex from bufdesc
Use container_of instead of casting first structure member.
Compiled but untested.
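A simplified sketch of the pattern, for illustration only. The container_of
macro below omits the type check that the kernel version performs, the struct
fields are hypothetical, and to_bufdesc_ex() is a name invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical descriptor layouts; the field names are illustrative. */
struct bufdesc {
	unsigned short status;
	unsigned short datlen;
	unsigned long  bufaddr;
};

struct bufdesc_ex {
	struct bufdesc desc;   /* embedded base descriptor */
	unsigned long  esc;    /* extended field */
};

/* Resolve the extended descriptor from a pointer to its embedded member. */
static struct bufdesc_ex *to_bufdesc_ex(struct bufdesc *bdp)
{
	assert(bdp);
	return container_of(bdp, struct bufdesc_ex, desc);
}
```

Unlike a plain cast, this keeps working if desc ever stops being the first
member of the enclosing structure.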
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Sat, 13 Sep 2014 20:38:26 +0000 (22:38 +0200)]
net: fec: use container_of to resolve bufdesc_ex from bufdesc
Use container_of instead of casting first structure member.
ARM cross-compiled but untested.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Levin [Sat, 13 Sep 2014 04:06:30 +0000 (00:06 -0400)]
net: bpf: correctly handle errors in sk_attach_filter()
Commit "net: bpf: make eBPF interpreter images read-only" has changed bpf_prog
to be vmalloc()ed but never handled some of the error paths of the old code.
On error within sk_attach_filter (which userspace can easily trigger), we'd
kfree() the vmalloc()ed memory, and leak the internal bpf_work_struct.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Sat, 13 Sep 2014 06:12:46 +0000 (23:12 -0700)]
netdevice: Support DSA tagging when DSA is built as a module
This change corrects an error seen when DSA tagging is built as a module.
Without this change it is not possible to get XDSA tagged frames as the
test for tagging is stripped by the #ifdef check.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bo Shen [Fri, 12 Sep 2014 23:57:49 +0000 (01:57 +0200)]
net/macb: Add hardware revision information during probe
Print the IP revision when probing.
Signed-off-by: Bo Shen <voice.shen@atmel.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 13 Sep 2014 21:32:29 +0000 (17:32 -0400)]
Merge branch 'fec-next'
Frank Li says:
====================
net: fec: imx6sx multiqueue support
These patches enable i.MX6SX multi queue support.
i.MX6SX supports 3 queues and the AVB feature.
Change from v3 to v4
- use "unsigned int" instead of "unsigned"
Change from v2 to v3
- fixed alignment requirement for ARM and non-ARM platforms
Change from v1 to v2.
- Change num_tx_queue to unsigned int
- Avoid blocking non-DT platforms
- remove call to netif_set_real_num_rx_queues
- separate the multiqueue patch into two parts: one handles tx and rx with a
fixed queue 0, the other initializes multiqueue
- use two different alignments for the tx and rx paths
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Fri, 12 Sep 2014 21:00:57 +0000 (05:00 +0800)]
ARM: dts: imx6sx: add multi-queue support enet
Enable 3-queue support for Ethernet.
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Fri, 12 Sep 2014 21:00:56 +0000 (05:00 +0800)]
ARM: Documentation: Update fec dts binding doc
This patch updates the fec devicetree binding doc to add the optional
properties "fsl,num-tx-queues" and "fsl,num-rx-queues".
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:55 +0000 (05:00 +0800)]
net: fec: init complete variable in early to avoid kernel dump
Software clears the MDIO interrupt before MDIO bus access, but the
MAC still generates an MDIO interrupt. The issue only happens on the
imx6slx chip.
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc1-00399-g0bcad17 #315
Backtrace:
[<800121fc>] (dump_backtrace) from [<800124e0>] (show_stack+0x18/0x1c)
 r6:8096e534 r5:8096e534 r4:00000000 r3:00000000
[<800124c8>] (show_stack) from [<806a4c60>] (dump_stack+0x8c/0xa4)
[<806a4bd4>] (dump_stack) from [<80060ab8>] (__lock_acquire+0x1814/0x1c40)
 r6:be078000 r5:be074000 r4:be03f6e4 r3:be078000
[<8005f2a4>] (__lock_acquire) from [<800616e0>] (lock_acquire+0x70/0x84)
 r10:809ada33 r9:be010600 r8:00000096 r7:00000001 r6:be074000 r5:00000000
 r4:60000193
[<80061670>] (lock_acquire) from [<806abb20>] (_raw_spin_lock_irqsave+0x40/0x54)
 r7:00000000 r6:8005a3f8 r5:00000193 r4:be03f6d4
[<806abae0>] (_raw_spin_lock_irqsave) from [<8005a3f8>] (complete+0x1c/0x4c)
 r6:80950904 r5:be03f6d0 r4:be03f6d4
[<8005a3dc>] (complete) from [<8041b4c0>] (fec_enet_interrupt+0x128/0x164)
 r6:80950904 r5:00800000 r4:be03f000 r3:00000000
[<8041b398>] (fec_enet_interrupt) from [<8006aeac>] (handle_irq_event_percpu+0x38/0x13c)
 r6:00000000 r5:be01065c r4:be399e00 r3:8041b398
[<8006ae74>] (handle_irq_event_percpu) from [<8006aff4>] (handle_irq_event+0x44/0x64)
 r10:be03f000 r9:80989fe0 r8:00000000 r7:00000096 r6:be399e00 r5:be01065c
 r4:be010600
[<8006afb0>] (handle_irq_event) from [<8006e3e8>] (handle_fasteoi_irq+0xc8/0x1bc)
 r6:8096e764 r5:be01065c r4:be010600 r3:00000000
[<8006e320>] (handle_fasteoi_irq) from [<8006a63c>] (generic_handle_irq+0x30/0x44)
 r6:be074010 r5:80945e4c r4:00000096 r3:8006e320
[<8006a60c>] (generic_handle_irq) from [<8000f218>] (handle_IRQ+0x54/0xbc)
 r4:80950d74 r3:00000180
[<8000f1c4>] (handle_IRQ) from [<800086cc>] (gic_handle_irq+0x30/0x68)
 r8:be3ab478 r7:c080e100 r6:be075bd8 r5:80950eec r4:c080e10c r3:000000a0
[<8000869c>] (gic_handle_irq) from [<80013064>] (__irq_svc+0x44/0x5c)
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:54 +0000 (05:00 +0800)]
net: fec: change FEC alignment according to i.mx6 sx requirement
i.MX6 SX changes the FEC alignment requirement.
i.MX6 SX changes the internal bus from AHB to AXI.
It requires RX buffers to be 64-byte aligned
and removes the TX buffer alignment requirement.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:53 +0000 (05:00 +0800)]
net:fec: Add fsl,imx6sx-fec compatible strings
Add compatible string "fsl,imx6sx-fec" for i.MX6SX.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Fri, 12 Sep 2014 21:00:52 +0000 (05:00 +0800)]
net: fec: add enet-avb IP support
i.MX6SX Enet-AVB supports 3 tx queues and 3 rx queues.
For tx queues:
ring 0 -> best effort
ring 1 -> Class A
ring 2 -> Class B
For rx queues:
ring 0 -> best effort
ring 1 -> receive VLAN packet with classification match
ring 2 -> receive VLAN packet with classification match
Add enet-avb IP multiqueue support for the driver.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:51 +0000 (05:00 +0800)]
net:fec: Disable enet-avb MAC instead of reset MAC
On i.MX6SX, enet uses the AXI bus; resetting the MAC will hang the system bus
if the ENET-AXI bus has a pending access (the AHB bus should not have this issue).
So disable enet with AVB MAC instead of resetting the MAC itself.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Fri, 12 Sep 2014 21:00:50 +0000 (05:00 +0800)]
net: fec: init multi queue data structure
Initialize all queues according to the queue number read from the DT file.
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: Duan Fugang <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:49 +0000 (05:00 +0800)]
net: fec: parser max queue number from dt file
By default, the tx/rx queue number is 1. The user can configure the queue
number in the DTS file like this:
fsl,num-tx-queues=<3>;
fsl,num-rx-queues=<3>;
Since the i.MX6SX enet-AVB IP supports multiple queues, use the multiqueue
interface to allocate and set up an Ethernet device.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:48 +0000 (05:00 +0800)]
net: fec: change data structure to support multiqueue
This patch only changes the data structures to support multi-queue.
Only 1 queue is enabled.
The Ethernet multiqueue mechanism can improve performance on SMP systems.
With a single hw queue, multiqueue can balance CPU load.
With multiple hw queues, multiple cores can process network packets in parallel.
See this article for the advantages of multiqueue in detail:
http://vger.kernel.org/~davem/davem_nyc09.pdf
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <frank.li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:47 +0000 (05:00 +0800)]
net:fec: add enet AVB feature macro define for imx6sx
Add enet AVB feature macro define for imx6sx.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fugang Duan [Fri, 12 Sep 2014 21:00:46 +0000 (05:00 +0800)]
net:fec: add enet reference clock for i.MX 6SX chip
i.MX6sx enet has the following clocks for user configuration:
clk_ipg: ipg_clk_s, ipg_clk_mac0_s, 66MHz
clk_ahb: enet system clock; it is the enet AXI clock for imx6sx.
For imx6sx, it is also the clock source of interrupt coalescing.
The clock range: 200MHz ~ 266MHz.
clk_ref: reference clock for tx and rx. For imx6sx enet RGMII mode,
the reference clock is 125MHz coming from internal PLL or external.
On the i.MX6sx-arm2 board, the clock is from internal PLL.
clk_ref is optional, depends on board.
clk_enet_out: The clock can be output from internal PLL. It can supply a 50MHz
clock for the phy. clk_enet_out is optional, depends on chip and board.
clk_ptp: 1588 ts clock. It is optional, depends on chip.
The patch adds clk_ref to distinguish the different clocks.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 12 Sep 2014 21:58:44 +0000 (23:58 +0200)]
net: DSA: Marvell mv88e6171 switch driver
This is the Marvell driver with some cleanups by Claudio Leite
and myself.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Cc: Claudio Leite <leitec@staticky.com>
Signed-off-by: Claudio Leite <leitec@staticky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 13 Sep 2014 21:12:25 +0000 (17:12 -0400)]
Merge branch 'be2net-next'
Sathya Perla says:
====================
be2net: patch set
Patch 1 fixes some minor issues with log messages in be2net.
Patch 2 replaces strcpy() calls with strlcpy() to avoid possible buffer
overflow.
Patch 3 improves the RX buffer posting scheme for jumbo frames.
Patch 4 replaces the use of v0 of SET_FLOW_CONTROL cmd with v1 to receive
a definitive completion status from FW.
Patch 5 adds support for the ethtool "-m" option.
Patch 6 fixes port-type reporting via ethtool get_settings for QSFP/SFP+
interfaces.
Patch 7 fixes the usage of MODIFY_EQD FW cmd to target a max of 8 EQs on
Lancer chip.
Patch 8 enables PCIe error reporting even for VFs.
Pls consider applying this patch set to net-next. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Fri, 12 Sep 2014 12:09:21 +0000 (17:39 +0530)]
be2net: enable PCIe error reporting on VFs too
Currently PCIe error reporting is enabled only on PFs. This patch enables
this feature on VFs too as Lancer VFs support it.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalesh AP [Fri, 12 Sep 2014 12:09:20 +0000 (17:39 +0530)]
be2net: send a max of 8 EQs to be_cmd_modify_eqd() on Lancer
The MODIFY_EQ_DELAY FW cmd on Lancer is supported for a max of 8 EQs per cmd.
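One way to respect such a per-command limit is to split the EQ list into
batches. The sketch below only plans the batch sizes; plan_eqd_batches() and
MAX_EQS_PER_CMD are hypothetical names used for illustration and do not
reflect how the driver actually issues the command.

```c
#include <assert.h>

#define MAX_EQS_PER_CMD 8 /* per-command limit stated in the commit message */

/* Fill sizes[] with the EQ count of each command and return the number of
 * commands planned. At most max_cmds entries are written. */
static int plan_eqd_batches(int num_eqs, int *sizes, int max_cmds)
{
	int cmds = 0;

	assert(num_eqs >= 0 && sizes);

	for (int done = 0; done < num_eqs && cmds < max_cmds; cmds++) {
		int n = num_eqs - done;

		if (n > MAX_EQS_PER_CMD)
			n = MAX_EQS_PER_CMD;
		sizes[cmds] = n;
		done += n;
	}
	return cmds;
}
```

For example, 20 EQs would be planned as three commands carrying 8, 8 and 4
entries.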
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ravikumar Nelavelli [Fri, 12 Sep 2014 12:09:19 +0000 (17:39 +0530)]
be2net: fix port-type reporting in get_settings
Report the ethtool port-type/supported/advertising values based on the
cable_type for QSFP and SFP+ interfaces. The cable_type is parsed from
the transceiver data fetched from the FW.
Signed-off-by: Ravikumar Nelavelli <ravikumar.nelavelli@emulex.com>
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Leonard [Fri, 12 Sep 2014 12:09:18 +0000 (17:39 +0530)]
be2net: add ethtool "-m" option support
This patch adds support for the dump-module-eeprom and module-info
ethtool options.
Signed-off-by: Mark Leonard <mark.leonard@emulex.com>
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Fri, 12 Sep 2014 12:09:17 +0000 (17:39 +0530)]
be2net: use v1 of SET_FLOW_CONTROL command
In some configurations the FW doesn't allow changing flow control settings
of a link. Unless a v1 version of the SET_FLOW_CONTROL cmd is used, the FW
doesn't report an error to the driver.
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Fri, 12 Sep 2014 12:09:16 +0000 (17:39 +0530)]
be2net: fix RX fragment posting for jumbo frames
In the RX path, the driver currently consumes up to 64 (budget) packets in
one NAPI sweep. When a received packet is larger than the fragment size
(2K), more than one fragment is consumed for that packet.
As the driver currently posts a max of 64 fragments, not all of the consumed
fragments may be replenished. This can cause avoidable drops in the RX path.
This patch fixes this by posting max(consumed_frags, 64) frags. This is
done only when there are at least 64 free slots in the RXQ.
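The posting rule can be sketched as a small helper. This is an illustration only: the function and macro names are hypothetical, and the final clamp to the number of free slots is an added assumption that the commit text does not state:

```c
#include <assert.h>

/* The "64" from the commit text; the macro name is hypothetical. */
#define MIN_POST_FRAGS 64u

/*
 * Decide how many RX fragments to post after one NAPI sweep.
 * consumed_frags: fragments used up by the packets just processed
 * free_slots:     empty descriptors currently available in the RXQ
 */
static unsigned int frags_to_post(unsigned int consumed_frags,
				  unsigned int free_slots)
{
	unsigned int want;

	/* Post only when at least 64 slots are free. */
	if (free_slots < MIN_POST_FRAGS)
		return 0;

	/* max(consumed_frags, 64), so jumbo frames are fully replenished. */
	want = consumed_frags > MIN_POST_FRAGS ? consumed_frags : MIN_POST_FRAGS;

	/* Assumption: never post more fragments than the ring can hold. */
	if (want > free_slots)
		want = free_slots;

	assert(want >= MIN_POST_FRAGS);
	return want;
}
```

For example, a sweep of 64 packets that each span three 2K fragments consumes 192 fragments, and the helper asks for 192 rather than 64 when the ring has room.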
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Fri, 12 Sep 2014 12:09:15 +0000 (17:39 +0530)]
be2net: replace strcpy with strlcpy
Replace strcpy with strlcpy, as it avoids a possible buffer overflow.
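To illustrate the difference, here is a minimal user-space sketch of a bounded copy with strlcpy-style semantics. bounded_copy() is a hypothetical local helper written for this example, since strlcpy is not available in every C library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most size bytes including the terminating
 * NUL. Returns strlen(src), so a return value >= size signals truncation.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(src != NULL);
	len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Unlike strcpy, the copy stops at the destination size, and the caller can detect truncation by comparing the return value with that size.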
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Fri, 12 Sep 2014 12:09:14 +0000 (17:39 +0530)]
be2net: fix some log messages
This patch fixes the following minor issues with log messages in be2net:
1) A period is not required at the end of a log message.
2) Remove "Unknown grp5 event" logs to reduce noise. The driver can safely
ignore async events from FW it's not interested in.
3) Reword a log message for better readability to say that SRIOV
"is disabled" rather than "not supported".
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Fri, 12 Sep 2014 12:04:43 +0000 (14:04 +0200)]
net: filter: constify detection of pkt_type_offset
Currently we have 2 pkt_type_offset functions doing the same thing and
spread across the architecture files. Remove those and replace them
with a PKT_TYPE_OFFSET macro helper which gets the constant value from a
zero-sized sk_buff member right in front of the bitfield with offsetof.
This new offset marker does not change the size of struct sk_buff.
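The marker technique can be sketched in user space as follows. struct demo_buff and its fields are hypothetical and much smaller than struct sk_buff, and the zero-length array is a GNU C extension, so this assumes GCC or Clang:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with a marker in front of its bitfield. */
struct demo_buff {
	unsigned int len;
	unsigned short protocol;
	/* Zero-sized marker: occupies no storage, but has an address. */
	unsigned char pkt_type_offset[0];
	unsigned char pkt_type:3,
		      flag_a:1,
		      flag_b:1;
	unsigned short queue;
};

/* The same layout without the marker, used only for the size check. */
struct demo_buff_plain {
	unsigned int len;
	unsigned short protocol;
	unsigned char pkt_type:3,
		      flag_a:1,
		      flag_b:1;
	unsigned short queue;
};

/*
 * offsetof() cannot be applied to a bitfield member, so the offset of the
 * byte holding pkt_type is taken from the marker placed just before it.
 */
#define DEMO_PKT_TYPE_OFFSET offsetof(struct demo_buff, pkt_type_offset)

/* The marker must not grow the struct. */
static_assert(sizeof(struct demo_buff) == sizeof(struct demo_buff_plain),
	      "offset marker changed the struct size");
```

Because the offset is a compile-time constant, code that needs the byte position of the bitfield can use it directly instead of probing for it at run time.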
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 12 Sep 2014 04:18:09 +0000 (21:18 -0700)]
net: dsa: change tag_protocol to an enum
Now that we introduced an additional multiplexing/demultiplexing layer
with commit
3e8a72d1dae37 ("net: dsa: reduce number of protocol hooks")
that lives within the DSA code, we no longer need a given switch driver's
tag_protocol to be an actual ethertype value; instead, we can replace it
with an enum: dsa_tag_protocol.
Do this replacement in the drivers, which allows us to get rid of the
cpu_to_be16()/htons() dance, and remove ETH_P_BRCMTAG since we do not
need it anymore.
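A minimal sketch of the idea, using hypothetical names throughout (the enumerators of the real dsa_tag_protocol are not listed in this message). The driver stores a plain enum value, and the lookup needs no byte-order conversion:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum standing in for dsa_tag_protocol. */
enum demo_tag_protocol {
	DEMO_TAG_PROTO_NONE = 0,
	DEMO_TAG_PROTO_A,
	DEMO_TAG_PROTO_B,
	DEMO_TAG_PROTO_COUNT	/* number of entries, not a protocol */
};

/* Hypothetical switch driver descriptor. */
struct demo_switch_driver {
	const char *name;
	enum demo_tag_protocol tag_protocol;	/* previously a __be16 ethertype */
};

/* Table indexed directly by the enum; no htons()/cpu_to_be16() needed. */
static const char *const demo_tagger_names[DEMO_TAG_PROTO_COUNT] = {
	[DEMO_TAG_PROTO_NONE] = "none",
	[DEMO_TAG_PROTO_A]    = "tagger-a",
	[DEMO_TAG_PROTO_B]    = "tagger-b",
};

/* Return the tagger for a driver, or NULL for an out-of-range value. */
static const char *demo_pick_tagger(const struct demo_switch_driver *drv)
{
	assert(drv != NULL);

	if ((unsigned int)drv->tag_protocol >= DEMO_TAG_PROTO_COUNT)
		return NULL;
	return demo_tagger_names[drv->tag_protocol];
}
```

Since the values are small and dense, they can index a table directly, which an ethertype in network byte order could not do.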
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>