Yuval Atias [Mon, 9 Jun 2014 07:24:39 +0000 (10:24 +0300)]
net/mlx4_en: Use affinity hint
The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.
We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores. If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Mon, 9 Jun 2014 07:24:38 +0000 (10:24 +0300)]
cpumask: Utility function to set n'th cpu - local cpu first
This function sets the n'th cpu - local cpu's first.
For example: in a 16 cores server with even cpu's local, will get the
following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set
Curently this function will be used by multi queue networking devices to
calculate the irq affinity mask, such that as many local cpu's as
possible will be utilized to handle the mq device irq's.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 19:25:12 +0000 (12:25 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-06-11
This series contains updates to igb, i40e and i40evf.
Todd makes a change to igb to un-hide invariant returns by getting rid of
the E1000_SUCCESS define and converting those returns to return 0.
Jacob separates the hardware logic from the set function, so that we can
re-use it during a ptp_reset in igb. This enables the reset to return
functionality to the last know timestamp mode, rather than resetting the
value.
Ashish implements context flags for headwb and headwb_addr so that we
do not have to keep them always enabled.
Shannon updates the admin queue API for the new firmware, which adds
set_pf_content, nvm_config_read/write, replaces set_phy_reset with
set_phy_debug and removes nvm_read/write_reg_se. Cleans up the driver
to use the stored base_queue value since there is no need to read the
PCI register for the PF's base queue on every single transmit queue
enable and disable as we already have the value stored from reading
the capability features at startup.
Anjali changes the notion of source and destination for FD_SB in ethtool
to align i40e with other drivers. Adds flow director statistics to
the PF stats. Fixes a bug in ethtool for flow director drop packet
filter where the drop action comes down as a ring_cookie value, so allow
it as a special value that can be used to configure destination control.
Mitch fixes the i40evf to keep the driver from going down when it is
already in a down state. This prevents a CPU soft lock in napi_disable().
Also change the i40evf to check the admin queue error bits since the
firmware can indicate any admin queue error states to the driver via
some bits in the length registers.
Neerav separates out the DCB capability and enabled flags because currently
if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver inserts
a tag when transmitting a packet from the port even if there are no DCB
traffic classes configured at the port. So by adding the additional flag,
I40E_FLAG_DCB_CAPABLE, that will be set when the DCB capability is present
and the existing enabled flag will only be set if there are more than one
traffic classes configured at the port.
Greg fixes the i40e driver to not automatically accept tagged packets by
default so that the system must request a VLAN tag packet filter to get
packets with that tag. Greg also converts i40e to use the in-kernel
ether_addr_copy() instead of mempcy().
Jesse removes the FTYPE field from the receive descriptor to match the
hardware implementation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 19:23:30 +0000 (12:23 -0700)]
Merge branch 'sctp-next'
Daniel Borkmann says:
====================
SCTP update
This set contains transport path selection improvements in
SCTP. Please see individual patches for details.
====================
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 11 Jun 2014 16:19:32 +0000 (18:19 +0200)]
net: sctp: fix incorrect type in gfp initializer
This fixes the following sparse warning:
net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types)
net/sctp/associola.c:1556:29: expected bool [unsigned] [usertype] preload
net/sctp/associola.c:1556:29: got restricted gfp_t
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 11 Jun 2014 16:19:31 +0000 (18:19 +0200)]
net: sctp: improve sctp_select_active_and_retran_path selection
In function sctp_select_active_and_retran_path(), we walk the
transport list in order to look for the two most recently used
ACTIVE transports (trans_pri, trans_sec). In case we didn't find
anything ACTIVE, we currently just camp on a possibly PF or
INACTIVE transport that is primary path; this behavior actually
dates back to linux-history tree of the very early days of
lksctp, and can yield a behavior that chooses suboptimal
transport paths.
Instead, be a bit more clever by reusing and extending the
recently introduced sctp_trans_elect_best() handler. In case
both transports are evaluated to have the same score resulting
from their states, break the tie by looking at: 1) transport
patch error count 2) last_time_heard value from each transport.
This is analogous to Nishida's Quick Failover draft [1],
section 5.1, 3:
The sender SHOULD avoid data transmission to PF destinations.
When all destinations are in either PF or Inactive state,
the sender MAY either move the destination from PF to active
state (and transmit data to the active destination) or the
sender MAY transmit data to a PF destination. In the former
scenario, (i) the sender MUST NOT notify the ULP about the
state transition, and (ii) MUST NOT clear the destination's
error counter. It is recommended that the sender picks the
PF destination with least error count (fewest consecutive
timeouts) for data transmission. In case of a tie (multiple PF
destinations with same error count), the sender MAY choose the
last active destination.
Thus for sctp_select_active_and_retran_path(), we keep track of
the best, if any, transport that is in PF state and in case no
ACTIVE transport has been found (hence trans_{pri,sec} is NULL),
we select the best out of the three: current primary_path and
retran_path as well as a possible PF transport.
The secondary may still camp on the original primary_path as
before. The change in sctp_trans_elect_best() with a more fine
grained tie selection also improves at the same time path selection
for sctp_assoc_update_retran_path() in case of non-ACTIVE states.
[1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 11 Jun 2014 16:19:30 +0000 (18:19 +0200)]
net: sctp: migrate most recently used transport to ktime
Be more precise in transport path selection and use ktime
helpers instead of jiffies to compare and pick the better
primary and secondary recently used transports. This also
avoids any side-effects during a possible roll-over, and
could lead to better path decision-making.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 11 Jun 2014 16:19:29 +0000 (18:19 +0200)]
net: sctp: refactor active path selection
This patch just refactors and moves the code for the active
path selection into its own helper function outside of
sctp_assoc_control_transport() which is already big enough.
No functional changes here.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 11 Jun 2014 16:19:28 +0000 (18:19 +0200)]
ktime: add ktime_after and ktime_before helper
Add two minimal helper functions analogous to time_before() and
time_after() that will later on both be needed by SCTP code.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 19:11:25 +0000 (12:11 -0700)]
Merge branch 'mac802154'
Phoebe Buckheister says:
====================
Recent llsec code introduced a memory leak on decryption failures during rx.
This fixes said leak, and optimizes the receive loops for monitor and wpan
devices to only deliver skbs to devices that are actually up. Also changes a
dev_kfree_skb to kfree_skb when an invalid packet is dropped before being
pushed into the stack.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Phoebe Buckheister [Wed, 11 Jun 2014 10:03:07 +0000 (12:03 +0200)]
mac802154: don't deliver packets to devices that are down
Only one WPAN devices can be active at any given time, so only deliver
packets to that one interface that is actually up. Multiple monitors may
be up at any given time, but we don't have to deliver to monitors that
are down either.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phoebe Buckheister [Wed, 11 Jun 2014 10:03:06 +0000 (12:03 +0200)]
mac802154: properly free incoming skbs on decryption failure
mac802154 RX did not free skbs on decryption failure, assuming that the
caller would when the local rx handler returned _DROP. This was false.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Thu, 22 May 2014 06:32:33 +0000 (06:32 +0000)]
i40e/i40evf: Bump i40e to version 0.4.10 and i40evf to 0.9.34
Bump versions.
Change-ID: Ic4a84354955061ca18321b1e97c9c30fe1563b5c
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 22 May 2014 06:32:28 +0000 (06:32 +0000)]
i40e: use stored base_queue value
No need to read the PCI register for the PF's base queue on every single Tx
queue enable and disable as we already have the value stored from reading
the capability features at startup.
Change-ID: Ic02fb622757742f43cb8269369c3d972d4f66555
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Thu, 22 May 2014 06:32:23 +0000 (06:32 +0000)]
i40e: Fix a bug in ethtool for FD drop packet filter action
A drop action comes down as a ring_cookie value, so allow it as
a special value that can be used to configure destination control.
Also fix the output to filter read command accordingly.
Change-ID: I9956723cee42f3194885403317dd21ed4a151144
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Thu, 22 May 2014 06:32:17 +0000 (06:32 +0000)]
i40e/i40evf: Add Flow director stats to PF stats
Add members to stat struct to keep track of Flow director ATR and
SideBand filter packet matches.
Change-ID: Ibbb31a53c7adcc2bb96991dd80565442a2f2513c
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Thu, 22 May 2014 06:32:12 +0000 (06:32 +0000)]
i40e/i40evf: remove FTYPE
This change drops the FTYPE field from the Rx descriptor, to
match the hardware implementation.
Change-ID: I66d31d2b43861da45e8ace4fb03df033abe88bab
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 22 May 2014 06:32:07 +0000 (06:32 +0000)]
i40evf: check admin queue error bits
FW can indicate any admin queue error states to the driver via some bits
in the length registers. Each time we process an admin queue message,
check these bits and log any errors we find. Since the VF really can't
do much, we just print the message and depend on the PF driver to clear
things up on our behalf.
Change-ID: I92bc6c53ce3b4400544e0ca19c5de2d27490bd0d
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Thu, 22 May 2014 06:32:02 +0000 (06:32 +0000)]
i40e/i40evf: User ether_addr_copy instead of memcpy
Linux gives us a function to copy Ethernet MAC addresses, let's use it.
Change-ID: I0c861900029ca5ea65a53ca39565852fb633f6fd
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Thu, 22 May 2014 06:31:56 +0000 (06:31 +0000)]
i40e: Do not accept tagged packets by default
Remove the filter created by the firmware with the default MAC address it
reads out of the NVM storage and a promiscuous VLAN tag and replace it
with a filter that will not accept tagged packets by default. The system
must request a VLAN tag packet filter to get packets with that tag.
Change-ID: I119e6c3603a039bd68282ba31bf26f33a575490a
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Neerav Parikh [Thu, 22 May 2014 06:31:51 +0000 (06:31 +0000)]
i40e: Separate out DCB capability and enabled flags
Currently if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver
inserts a tag when transmitting a packet from the port even if there
are no DCB traffic classes configured at the port.
This patch adds a new flag I40E_FLAG_DCB_CAPABLE that will be set
when the DCB capability is present and the existing flag
I40E_FLAG_DCB_ENABLED will be set only if there are more than one
traffic classes configured at the port.
Change-ID: I24ccbf53ef293db2eba80c8a9772acf729795bd5
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 22 May 2014 06:31:46 +0000 (06:31 +0000)]
i40evf: don't go further down
If the device is down, there's no place to go but up, so don't try to go
down even more. This prevents a CPU soft lock in napi_disable().
Change-ID: I8b058b9ee974dfa01c212fae2597f4f54b333314
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Thu, 22 May 2014 06:31:41 +0000 (06:31 +0000)]
i40e: Change the notion of src and dst for FD_SB in ethtool
In XL710 devices we program FD filter's fields from Tx perspective of the flow.
However the user interface exposed in ethtool should be compliant with the
previous generation of drivers where a filter src and dst field are from
the RX perspective. This patch changes the ethtool interface in this regard
to match the other drivers.
Change-ID: Iec6ccddd87357c4fb53ccf33aa0fae699faf70cf
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 22 May 2014 06:31:30 +0000 (06:31 +0000)]
i40e/i40evf: AdminQ API update for new FW
Add set_pf_context, replace set_phy_reset with set_phy_debug, add
nvm_config_read/write, remove nvm_read/write_reg_se and add some
PHY types.
With these changes we bump the API version to 1.2.
Change-ID: I4dc3aec175c2316f66fc9b726b3f7d594699d84e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ashish Shah [Thu, 22 May 2014 06:31:25 +0000 (06:31 +0000)]
i40e/i40evf: set headwb Tx context flags and use them
Set appropriate fields in Tx queue configuration virtchnl message
to pf to enable headwb and setup headwb addr.
Then use that info from the VF to set headwb and headwb_addr instead of
always enabling them.
Change-ID: I7d393d1b2b07f0f3355b3a4f7c2d3c6ee3b0d622
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 5 Jun 2014 07:25:10 +0000 (07:25 +0000)]
igb: separate hardware setting from the set_ts_config ioctl
This patch separates the hardware logic from the set function, so that
we can re-use it during a ptp_reset. This enables the reset to return
functionality to the last known timestamp mode, rather than resetting
the value. We initialize the mode to off during the ptp_init cycle.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Todd Fujinaka [Wed, 4 Jun 2014 07:12:15 +0000 (07:12 +0000)]
igb: unhide invariant returns
Return a 0 directly rather than a constant.
Reported-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lendacky, Thomas [Mon, 9 Jun 2014 14:19:32 +0000 (09:19 -0500)]
amd-xgbe: Rename MAX_DMA_CHANNELS to avoid powerpc conflict
MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc
architecture. Rename this #define in xgbe.h to avoid the
redefined warning issued during compilation.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sun, 8 Jun 2014 22:49:34 +0000 (23:49 +0100)]
farsync: Fix confusion about DMA address and buffer offset types
Use dma_addr_t for DMA address parameters and u32 for shared memory
offset parameters.
Do not assume that dma_addr_t is the same as unsigned long; it will
not be in PAE configurations. Truncate DMA addresses to 32 bits when
printing them. This is OK because the DMA mask for this device is
32-bit (per default).
Also rename the DMA address parameters from 'skb' to 'dma'.
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 07:32:53 +0000 (00:32 -0700)]
Merge branch 'mlx4'
Or Gerlitz says:
====================
mlx4 SRIOV fixes
The patch from Wei Yang is a designed fix to a regression introduced by earlier commit
of him. Jack added a fix to the resource management which we got from IBM.
Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yang [Sun, 8 Jun 2014 10:49:46 +0000 (13:49 +0300)]
net/mlx4_core: Keep only one driver entry release mlx4_priv
Following commit
befdf89 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.
This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.
During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.
Fix that by keeping only one driver entry which releases mlx4_priv.
Fixes:
befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Sun, 8 Jun 2014 10:49:45 +0000 (13:49 +0300)]
net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas
The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.
Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.
As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.
This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).
The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.
When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.
When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.
This fix was developed by Rick Kready <kready@us.ibm.com>
Reported-by: Rick Kready <kready@us.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rickard Strandqvist [Sat, 7 Jun 2014 11:26:37 +0000 (13:26 +0200)]
net: wimax: i2400m: control.c: Cleaning up conjunction always evaluates to false
Logical conjunction always evaluates to false: minor < 2 && minor > 1
I guess what you wanted is rather: minor > 2 || minor < 1
This was partly found using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rickard Strandqvist [Sat, 7 Jun 2014 10:22:08 +0000 (12:22 +0200)]
net: ethernet: toshiba: ps3_gelic_net.c: Cleaning up a check on a memory allocation
A check on a memory allocation is checked incorrectly.
This was partly found using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Sat, 7 Jun 2014 09:07:48 +0000 (11:07 +0200)]
amd-xgbe: fix unused variable compilation warning in phylib driver
Fix following compilation warning:
[...]
CC drivers/net/phy/amd-xgbe-phy.o
drivers/net/phy/amd-xgbe-phy.c:1353:30: warning:
‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable]
static struct mdio_device_id amd_xgbe_phy_ids[] = {
^
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Sat, 7 Jun 2014 00:48:20 +0000 (17:48 -0700)]
net: filter: fix nlattr and nlattr_nest BPF tests
- 'struct nlattr' must be 2 byte aligned
- provide big-endian input data for nlattr/nlattr_nest tests
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Fri, 6 Jun 2014 21:46:06 +0000 (14:46 -0700)]
net: filter: cleanup A/X name usage
The macro 'A' used in internal BPF interpreter:
#define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.
This patch is trying to clean up the naming and clarify its usage in the
following way:
- A and X are names of two classic BPF registers
- BPF_REG_A denotes internal BPF register R0 used to map classic register A
in internal BPF programs generated from classic
- BPF_REG_X denotes internal BPF register R7 used to map classic register X
in internal BPF programs generated from classic
- internal BPF instruction format:
struct sock_filter_int {
__u8 code; /* opcode */
__u8 dst_reg:4; /* dest register */
__u8 src_reg:4; /* source register */
__s16 off; /* signed offset */
__s32 imm; /* signed immediate constant */
};
- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
BPF_X - means use register X as source operand
BPF_K - means use 32-bit immediate as source operand
In internal:
BPF_X - means use 'src_reg' register as source operand
BPF_K - means use 32-bit immediate as source operand
Suggested-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 06:51:00 +0000 (23:51 -0700)]
Merge branch 'bridge_multicast_exports'
Linus Lüssing says:
====================
bridge: multicast snooping patches / exports
The first patch is simply a cosmetic patch. So far I (and maybe others
too?) have been regularly confusing these two structs, therefore I'd
suggest renaming them and therefore making the follow-up patches easier
to understand and nicer to fit in.
The second patch fixes a minor issue, but probably not worth for stable.
On the other hand the first two patches are also preparations for the
third and fourth patch:
These two patches are exporting functionality needed to marry the bridge
multicast snooping with the batman-adv multicast optimizations recently
added for the 3.15 kernel, allowing to use these optimzations in common
setups having a bridge on top of e.g. bat0, too. So far these bridged
setups would fall back to simple flooding through the batman-adv mesh
network for any multicast packet entering bat0.
More information about the batman-adv multicast optimizations currently
implemented can be found here:
http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations
The integration on the batman-adv side could afterwards look like this,
for instance:
http://git.open-mesh.org/batman-adv.git/commitdiff/
576b59dd3e34737c702e548b21fa72059262f796?hp=
f95ce7131746c65fbcdffcf2089cab59e2c2f7ac
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Sat, 7 Jun 2014 16:26:29 +0000 (18:26 +0200)]
bridge: memorize and export selected IGMP/MLD querier port
Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in IGMP/MLD queriers
to be able to reliably serve any multicast listener behind this same
bridge.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Sat, 7 Jun 2014 16:26:28 +0000 (18:26 +0200)]
bridge: add export of multicast database adjacent to net_dev
With this new, exported function br_multicast_list_adjacent(net_dev) a
list of IPv4/6 addresses is returned. This list contains all multicast
addresses sensed by the bridge multicast snooping feature on all bridge
ports of the bridge interface of net_dev, excluding addresses from the
specified net_device itself.
Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in multicast
listeners to be able to reliably serve them with multicast packets.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Sat, 7 Jun 2014 16:26:27 +0000 (18:26 +0200)]
bridge: adhere to querier election mechanism specified by RFCs
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2
(RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the
querier with lowest source address shall become the selected
querier.
So far the bridge stopped its querier as soon as it heard another
querier regardless of its source address. This results in the "wrong"
querier potentially becoming the active querier or a potential,
unnecessary querying delay.
With this patch the bridge memorizes the source address of the currently
selected querier and ignores queries from queriers with a higher source
address than the currently selected one. This slight optimization is
supposed to make it more RFC compliant (but is rather uncritical and
therefore probably not necessary to be queued for stable kernels).
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Sat, 7 Jun 2014 16:26:26 +0000 (18:26 +0200)]
bridge: rename struct bridge_mcast_query/querier
The current naming of these two structs is very random, in that
reversing their naming would not make any semantical difference.
This patch tries to make the naming less confusing by giving them a more
specific, distinguishable naming.
This is also useful for the upcoming patches reintroducing the
"struct bridge_mcast_querier" but for storing information about the
selected querier (no matter if our own or a foreign querier).
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 05:49:59 +0000 (22:49 -0700)]
Merge branch 'cxgb4'
Hariprasad Shenai says:
====================
Adds support for CIQ and other misc. fixes for rdma/cxgb4
This patch series adds support to allocate and use IQs specifically for
indirect interrupts, adds fixes to align ISS for iWARP connections & fixes
related to tcp snd/rvd window for Chelsio T4/T5 adapters on iw_cxgb4.
Also changes Interrupt Holdoff Packet Count threshold of response queues for
cxgb4 driver.
The patches series is created against 'net-next' tree.
And includes patches on cxgb4 and iw_cxgb4 driver.
Since this patch-series contains cxgb4 and iw_cxgb4 patches, we would like to
request this patch series to get merged via David Miller's 'net-next' tree.
We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Fri, 6 Jun 2014 16:10:45 +0000 (21:40 +0530)]
cxgb4: Change default Interrupt Holdoff Packet Count Threshold
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Fri, 6 Jun 2014 16:10:44 +0000 (21:40 +0530)]
iw_cxgb4: don't truncate the recv window size
Fixed a bug that shows up with recv window sizes that exceed the size of
the RCV_BUFSIZ field in opt0 (>= 1024K). If the recv window exceeds
this, then we specify the max possible in opt0, add add the rest in via
a RX_DATA_ACK credits.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Fri, 6 Jun 2014 16:10:43 +0000 (21:40 +0530)]
iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections
Select the appropriate hw mtu index and initial sequence number to optimize
hw memory performance.
Add new cxgb4_best_aligned_mtu() which allows callers to provide enough
information to be used to [possibly] select an MTU which will result in the
TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value.
If an RTR message exhange is required, then align the ISS to 8B - 1 + 4, so
that after the SYN the send seqno will align on a 4B boundary. The RTR
message exchange will leave the send seqno aligned on an 8B boundary.
If an RTR is not required, then align the ISS to 8B - 1. The goal is
to have the send seqno be 8B aligned when we send the first FPDU.
Based on original work by Casey Leedom <leeedom@chelsio.com> and
Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Fri, 6 Jun 2014 16:10:42 +0000 (21:40 +0530)]
iw_cxgb4: Allocate and use IQs specifically for indirect interrupts
Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA
RXQs, which also handle direct interrupts for offload CPLs during RDMA
connection setup/teardown. The intended T4 usage model, however, is to
have indirect interrupts flow through dedicated IQs. IE not to mix
indirect interrupts with CPL messages in an IQ. This patch adds the
concept of RDMA concentrator IQs, or CIQs, setup and maintained by the
LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will
flow through the LLD's RDMA RXQs, and CQ interrupts flow through the
CIQs.
Design:
cxgb4 creates and exports an array of CIQs for the RDMA ULD. These IQs
are sized according to the max available CQs available at adapter init.
In addition, these IQs don't need FL buffers since they only service
indirect interrupts. One CIQ is setup per RX channel similar to the
RDMA RXQs.
iw_cxgb4 will utilize these CIQs based on the vector value passed into
create_cq(). The num_comp_vectors advertised by iw_cxgb4 will be the
number of CIQs configured, and thus the vector value will be the index
into the array of CIQs.
Based on original work by Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Fri, 6 Jun 2014 15:10:00 +0000 (08:10 -0700)]
gre: allow changing mac address when device is up
There is no need to require forcing device down on a Ethernet GRE (gretap)
tunnel to change the MAC address.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Octavian Purdila [Fri, 6 Jun 2014 14:32:37 +0000 (17:32 +0300)]
tcp: add gfp parameter to tcp_fragment
tcp_fragment can be called from process context (from tso_fragment).
Add a new gfp parameter to allow it to preserve atomic memory if
possible.
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Jun 2014 03:25:52 +0000 (20:25 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-06-09
This series contains more updates to i40e and i40evf.
Shannon adds checks for error status bits on the admin event queue and
provides notification if seen. Cleans up unused variable and memory
allocation which was used earlier in driver development and is no longer
needed. Also fixes the driver to not complain about removing
non-existent MAC addresses. Bumps the driver versions for both i40e
and i40evf.
Catherine fixes a function header comment to make sure the comment correctly
reflects the function name.
Mitch adds code to allow for additional VSIs since the number of VSIs that
the firmware reports to us is a guaranteed minimum, not an absolute
maximum. The hardware actually supports for more than the reported value,
which we often need. Implements anti-spoofing for VFs for both MAC
addresses and VLANs, as well as enable this feature by default for all VFs.
Anjali changes the interrupt distribution policy to change the way
resources for special features are handled. Fixes the driver to not fall
back to one queue if the only feature enabled is ATR, since FD_SB
and FD_ATR need to be checked independently in order to decide if we
will support multiple queue or not. Allows the RSS table entry range
and GPS to be any number, not necessarily a power of 2 because hardware
does not restrict us to use a power of 2 GPS in the case of RSS as long as
we are not sharing the RSS table with another VSI (VMDq).
Frank modifies the driver to keep SR-IOV enabled in the case that RSS,
VMFq, FD_SB and DCB are disabled so that SR-IOV does not get turned off
unnecessarily.
Jesse fixes a bug in receive checksum where the driver was not marking
packets with bad checksums correctly, especially IPv6 packets with a bad
checksum. To do this correctly, we need a define that may be set by
hardware in rare cases.
Greg fixes the driver to delete all the old and stale MAC filters for the
VF VSI when the host administrator changes the VF MAC address from under
its feet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 20 May 2014 08:01:47 +0000 (08:01 +0000)]
i40e/i40evf: bump version to 0.4.7 for i40e and 0.9.31 for i40evf
Bumpity and Fred Worm say it's time to change the numbers again.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Change-ID: I658731d022ea23cedede4be2bfecd8b4cc68d270
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Tue, 20 May 2014 08:01:46 +0000 (08:01 +0000)]
i40e: Allow RSS table entry range and GPS to be any number, not necessarily power of 2
We tell the HW upper boundary of power of 2 in VSI config,
but the HW does not restrict us to use just power of 2 GPS in
case of RSS as long as we are not sharing the RSS table with
another VSI (VMDq). We at present are not doing RSS in VMDq
VSI.
If we were to enable that and if the system had CPU count which
was not power 2, the VMDq VSIs will see a little skewed distribution.
Change-ID: I3ea797ce9065a3ca4fc4d04251bf195463410473
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 20 May 2014 08:01:45 +0000 (08:01 +0000)]
i40e: Delete stale MAC filters after change
Delete all the old and stale MAC filters for the VF VSI when the host
administrator changes the VF MAC address from under its feet. Also don't
bother to add a filter for the VSI when its going to go away anyway.
Just record the new address and punch the VF reset.
Change-ID: Ic0d12055926f41989d1965ccf500053729c063ad
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Tue, 20 May 2014 08:01:44 +0000 (08:01 +0000)]
i40e: Do not fall back to one queue model if the only feature enabled is ATR
FD_SB and FD_ATR needs to be checked independently in order to decide if
we will support multiple queues or not.
Change-ID: I9d3274f5924c79e29efdbcf66a2fcca1fee2107f
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Tue, 20 May 2014 08:01:43 +0000 (08:01 +0000)]
i40e/i40evf: add PPRS bit to error bits and fix bug in Rx checksum
The driver was not marking packets with bad checksums
correctly, especially IPv6 packets with a bad checksum.
To do this correctly we need a define that may be set by
hardware in rare cases.
Change-ID: I1a997b72b491ded27a78ac3bce1197b2d2611130
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Frank Zhang [Tue, 20 May 2014 08:01:42 +0000 (08:01 +0000)]
i40e: keep SR-IOV enabled in the case that RSS, VMDQ, FD_SB and DCB are disabled
Modify the logic in i40e_determine_queue_usage() so that
SR-IOV doesn't get turned off unnecessarily.
Change-ID: I86ca304fa9f742a50e9ea831b887f358a6a9d53d
Signed-off-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Wed, 21 May 2014 23:32:43 +0000 (23:32 +0000)]
i40e: Changes to Interrupt distribution policy
This patch changes the way resources are distributed to special features.
Change-ID: I847e49d714a1d70e97f3f994cb39bfb5e02ab016
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 20 May 2014 08:01:40 +0000 (08:01 +0000)]
i40e: implement anti-spoofing for VFs
Our hardware supports VF antispoofing for both MAC addresses and VLANs.
Enable this feature by default for all VFs and implement the netdev op
to control it from the command line.
Change-ID: Ifb941da22785848aa3aba6b2231be135b8ea8f31
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Tue, 20 May 2014 08:01:39 +0000 (08:01 +0000)]
i40e: don't complain about removing non-existent addresses
We don't need to complain in the log about mac addresses that
can't be deleted because they don't exist.
Change-ID: I4e6370df175bf72726f06d2206c03bcbfded8387
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Tue, 20 May 2014 08:01:38 +0000 (08:01 +0000)]
i40e: remove unused variable and memory allocation
This was a vestige of early driver development that no longer
has any actual use.
Change-ID: I95b5b19c4bbfaff8759197af671ebaf716cb6ab5
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Tue, 20 May 2014 08:01:37 +0000 (08:01 +0000)]
i40e: allow for more VSIs
The number of VSIs that the firmware reports to us is a guaranteed
minimum, not an absolute maximum. The hardware actually supports far
more than the reported value, which we often need.
To allow for this, we allocate space for a larger number of VSIs than is
guaranteed by the firmware, with the knowledge that we may fail to get
them all in the future.
Note that we are just allocating pointers here, the actual (much larger)
VSI structures are allocated on demand.
Change-ID: I6f4e535ce39d3bf417aef78306e04fbc7505140e
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Catherine Sullivan [Tue, 20 May 2014 08:01:36 +0000 (08:01 +0000)]
i40evf: Fix function header
Fix function header comment to have the correct function name.
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Tue, 20 May 2014 08:01:35 +0000 (08:01 +0000)]
i40e: add checks for AQ error status bits
Check for error status bits on the AdminQ event queue and announce them
if seen. If the Firmware sets these bits, it will trigger an AdminQ
interrupt to get the driver's attention to process the ARQ, which will
likely be enough to clear the actual issue.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Change-ID: I009e0ebc8be764e40e193b29aed2863f43eb5cb0
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Sun, 8 Jun 2014 21:17:39 +0000 (14:17 -0700)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
pull request: wireless-next 2014-06-06
Please accept this batch of fixes intended for the 3.16 stream.
For the bluetooth bits, Gustavo says:
"Here some more patches for 3.16. We know that Linus already opened the merge
window, but this is fix only pull request, and most of the patches here are
also tagged for stable."
Along with that, Andrea Merello provides a fix for the broken scanning
in the venerable at76c50x driver...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 8 Jun 2014 21:07:45 +0000 (14:07 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-06-08
This series contains updates to i40e and i40evf.
Jesse fixes an issue reported by Eric Dumazet where the driver was not
masking the right bits in the receive descriptor before checking them.
Also fixes TSO accounting since the kernel now can send as much as 32kB
in a single skb->frag[.] entry, even on a system with 4kB pages.
Anjali cleans up registers which are no longer supported.
Akeem cleans up code comments and removes num_msix_entries from the
interrupt setup routine since it was not being used. Fixes an issue where
FD SB/ATR and NTUPLE configuration status were reported erroneously, so
now the driver reports FDir without further information. Fixes a coding
error where during the registration for NAPI, the driver was requesting
256 budget. The max recommended value for this NAPI_POLL_WEIGHT or 64.
Lastly, removed deprecated device IDs because they will not be shipped.
Mitch removes log messages which were redundant so therefore unnecessary.
Also removes a bogus code comment since VF drivers require MSI-X or they
won't get interrupts at all and cleans up the formatting of several log
messages. Mitch also fixes the possibility of null pointers in VSI, since
not all VSIs have transmit rings.
Shannon ensures to clear the PXE mode bit on each reset after the AdminQ
has been rebuilt.
Catherine bumps the driver versions for i40e and i40evf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Sat, 10 May 2014 04:49:15 +0000 (04:49 +0000)]
i40e/i40evf: Bump build version
Bump i40e to 0.4.5 and i40evf to 0.9.29.
Change-ID: I9faca5544446518c5425612e733499cf16ef20a1
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Sat, 10 May 2014 04:49:14 +0000 (04:49 +0000)]
i40e/i40evf: remove deprecated device IDs
Remove two device IDs 1582 and 1573, because they will not be shipped.
Change-ID: Ica2e550b5b21a69e3f353eba2fe5e1c532a548c4
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Sat, 10 May 2014 04:49:13 +0000 (04:49 +0000)]
i40e/i40evf: fix poll weight
Fix a coding error where during the registration for NAPI
the driver requested 256 budget. The max recommended
value for this is NAPI_POLL_WEIGHT or 64.
Change-ID: I03ea1e2934a84ff1b5d572988b18315d6d91c5c6
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Sat, 10 May 2014 04:49:12 +0000 (04:49 +0000)]
i40e/i40evf: fix TSO accounting
The TSO logic in the transmit path had some assumptions that
have been broken now that the kernel can send as much as 32kB
in a single skb->frag[.] entry, even on a system with 4kB pages.
This fixes the assumptions and allows the kernel to operate
as efficiently as possible with both SENDFILE and SEND.
In addition, the hardware limit of data contained in a descriptor is
changed to the next power of two below where it currently is in
order to align to a power of two value, preventing a single byte
of data in a descriptor.
Change-ID: I6af1f0b87c1458e10644dbd47541591075a52651
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:11 +0000 (04:49 +0000)]
i40e/i40evf: remove chatty reset messages
Both the PF side and the VF side of the VF reset process are too noisy.
We already warn the user that a reset is happening, and that is
sufficient.
Because some of these message are inside if statements, we have to
rejigger the brackets at the same time to keep our coding style
consistent.
Change-ID: Id175562fb0ec7c396d9de156b4890e136f52d5f4
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:10 +0000 (04:49 +0000)]
i40e: not all VSIs have rings
Once more, with feeling: not all VSIs have rings. To assume so is to
invite null pointers to your party.
Change-ID: I576858824468d9712d119fa1015a1f28c27712c4
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Sat, 10 May 2014 04:49:09 +0000 (04:49 +0000)]
i40e: clear pxe after adminq is rebuilt
Be sure to clear PXE mode bit on each reset after AdminQ has been rebuilt.
Change-ID: I992d8c79594f8ca0660c50844ace675ecb9c9bf2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Sat, 10 May 2014 04:49:08 +0000 (04:49 +0000)]
i40e: Fix incorrect feature configuration status
This patch fixes an issue where FD SB/ATR and NTUPLE configurations status are
reported erroneously. Without this patch, driver reports FDir without further
information.
Change-ID: I5bdd2871b7f2db1e5f5e76c741ae6a0dc603b453
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:07 +0000 (04:49 +0000)]
i40evf: use correct format for printing MAC addresses
The correct format is %pM, not %pMAC.
Change-ID: Idb335723a966fe56db3a72b9c07c08ca66f9db3c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:06 +0000 (04:49 +0000)]
i40evf: clean up log message formatting
Clean up inconsistent log messages, mostly related to punctuation. Based
on the dogma that "kernel messages are not sentences", remove all
trailing periods. Reword a few of the messages to make them less
sentence-like.
Change-ID: Ibd849aa7623a77549b0709988c66ab05d1311472
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:05 +0000 (04:49 +0000)]
i40evf: remove bogus comment
This comment is just plain false. VF drivers require MSI-X or they won't
get interrupts at all.
Change-ID: Iaea5e30b6926948aa834a3c506d9a9223d9e3e29
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 10 May 2014 04:49:04 +0000 (04:49 +0000)]
i40evf: remove unnecessary log messages
We don't need to print log messages when we encounter an out-of-memory
condition, as the allocator will do this for us. Also, remove a Tx hang
message that duplicates the one emitted by the netdev layer, and a
duplicate message in the watchdog.
Change-ID: If2056e6135fe248f66ea939778f9895660f4d189
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Sat, 10 May 2014 04:49:03 +0000 (04:49 +0000)]
i40e/i40evf: Clean up a few things
1. There is no ixgbe_watchdog_task function in the driver, so change
the comment to the correct function name, i40e_watchdog_subtask.
2. Remove num_msix_entries from interrupt set_up routine
because it is never used.
3. Remove some TBD comments that are not needed.
Change-ID: I37697a04007074b797f85fd83d626672e4df1ad1
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Sat, 10 May 2014 04:49:02 +0000 (04:49 +0000)]
i40e/i40evf: Fix code to accommodate i40e_register.h changes
Remove use of registers no longer supported.
Change-ID: I9d27399091cea78a926489d94f958edd762f5a20
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Sat, 10 May 2014 04:49:01 +0000 (04:49 +0000)]
i40e/i40evf: fix rx descriptor status
As reported by Eric Dumazet, the driver is not masking the right
bits in the receive descriptor before it starts checking them.
This patch extends the mask to allow for the right bits to be
checked, and fixes the issue permanently via a define.
CC: Eric Dumazet <eric.dumazet@gmail.com>
Change-ID: I3274f7619057a950f468143e6d7e11b129f54655
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Phoebe Buckheister [Fri, 6 Jun 2014 12:27:52 +0000 (14:27 +0200)]
mac802154: llsec: add forgotten list_del_rcu in key removal
During key removal, the key object is freed, but not taken out of the
llsec key list properly. Fix that.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 6 Jun 2014 12:17:01 +0000 (14:17 +0200)]
net: use ethtool_cmd_speed_set helper to set ethtool speed value
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 6 Jun 2014 12:17:00 +0000 (14:17 +0200)]
net: use SPEED_UNKNOWN and DUPLEX_UNKNOWN when appropriate
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Lendacky [Fri, 6 Jun 2014 13:52:19 +0000 (08:52 -0500)]
amd-xgbe: Remove unnecessary include
The include of asm/cputype.h breaks the powerpc build. This
include was accidentally left in from driver debugging and
can be removed.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>,
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 6 Jun 2014 19:55:37 +0000 (12:55 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-06-06
This series contains updates to i40e and i40evf.
Shannon sets the lan_veb index when the Virtual Ethernet Bridge (VEB) is
created for the basic LAN device and its Virtual Switch Interfaces (VSIs).
Also adds the VEB/VSI statistics in the ethtool stats output. Increases the
reset wait time because the original time was too optimistic and resets
were failing after EMPR. This won't delay the actual wait, just allows
us to poll more times as needed. He fixes the attempted removal of the
HMC space and unhook IRQs when a reset recovery fails because the HMC space
never got setup and IRQs are already unhooked in the first place, so the
removal is not necessary.
Jesse adds a log message for pre-production hardware so that the user is
notified that there may be issues with the hardware, yet does not prevent
the user from using the hardware. Add the printing of link messages
in the i40e driver like all the other Intel Ethernet drivers.
Mitch makes log message changes to i40evf to prevent confusion by making
the most common messages less scary by lowering them to a less terrifying
log level. This is due to the fact that depending on the timing of what
the PF driver is doing, it may take a few tries before the VF driver is able
to communicate with the PF driver on init or reset recovery.
Kamil provides a change to prevent the driver from getting into a endless
loop while trying to send GetVersion AQ command, since BO silicon blocks
AQ registers when in Blank Flash mode. So introduce a simple check for a
correct value in one of the AQ registers to be sure that AQ was configured
correctly.
Matt adds a function which indicates our intention to enable or disable a
particular transmit queue. Also adds a function to notify the device's
transmit unit that we are about to enable or disable a transmit queue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Fri, 6 Jun 2014 15:59:11 +0000 (11:59 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem
Catherine Sullivan [Wed, 23 Apr 2014 04:50:17 +0000 (04:50 +0000)]
i40e/i40evf: Bump build version
Bump i40e to 0.4.3 and i40evf to 0.9.27.
Change-ID: I4141e9f8615bdcfa3b1b5ecbc2ac62603a03b7ad
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:50:16 +0000 (04:50 +0000)]
i40e: remove irqs only when they are set up
Use an extra state variable to keep track of when the IRQs are fully
set up. This keeps us from trying to unhook IRQs that already were
left unhooked in a failed reset recovery, e.g. when firmware is broken.
Change-ID: I073eb081e4ef8aedcbdf1ee0717c0ed64fa172f2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:50:13 +0000 (04:50 +0000)]
i40e: don't remove HMC that doesn't exist
If a reset recovery failed (e.g. firmware is broken), the HMC space won't
get set up. We don't need to try to delete it if it didn't get set up.
This stops some needless error messages when we already know we need to
just tear things down.
Change-ID: Iac600481765e20b136052b43a544e55d7870268b
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Wed, 23 Apr 2014 04:50:12 +0000 (04:50 +0000)]
i40e: print full link message
The i40e driver should print link messages like all the
other Intel Ethernet drivers.
Change-ID: Ia88bdb96794e17a3962fcea94db176de01f921f7
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:50:09 +0000 (04:50 +0000)]
i40e: add xcast stats for port
Add the missing unicast, multicast, and broadcast stats for the port.
Change-ID: Ifc366d7b7745f70eaac9d00eeb0694eb9ec076a9
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:50:07 +0000 (04:50 +0000)]
i40e: add vsi x-cast stats
Add VSI HW stats for unicast, multicast, and broadcast for Rx and Tx.
Stop printing the netdev multicast value because it doesn't include Tx
and would be confusing.
Change-ID: I08278b6657e7c838fd29a4a1f305f78fe1b150be
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:50:06 +0000 (04:50 +0000)]
i40e: increase reset wait time
The wait time was originally too optimistic and the resets were failing
after EMPR. This increases the loop count to wait considerably longer.
This won't delay the actual wait longer than really needed, just allows
us to poll more times as needed.
Change-ID: If7b96f55cc25b8d06cbbe8665259d250188c53d7
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matt Jared [Wed, 23 Apr 2014 04:50:03 +0000 (04:50 +0000)]
i40e/i40evf: add Tx pre queue disable function
Add a function which indicates our intention to enable or disable a particular
Tx queue. Also add a function to notify the device's Tx unit that we're about
to enable or disable a Tx queue.
Change-ID: I6adf3cbb5bb3e3c984d1ec969e06577c19ef296d
Signed-off-by: Matt Jared <matthew.a.jared@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Kamil Krawczyk [Wed, 23 Apr 2014 04:50:02 +0000 (04:50 +0000)]
i40e/i40evf: check AQ register for valid data
B0 Si blocks AQ registers when in Blank Flash mode - write is dropped,
read gives 0xDEADBEEF. Introduce a simple check for a correct value in one
of the AQ registers to be sure that AQ was configured correctly.
Without this check we get into an endless loop while trying to send
GetVersion AQ cmd.
Change-ID: I00102b8c5fa6c16d14289be677aafadf87f10f0d
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Wed, 23 Apr 2014 04:50:00 +0000 (04:50 +0000)]
i40evf: make messages less dire
Depending on the timing of what the PF driver is doing, it make take a
few tries before the VF driver is able to communicate with the PF driver
on init or reset recovery. In order to prevent confusion, make the most
common messages less scary by lowering them to a less terrifying log
level and indicate that the driver will retry.
Change-ID: I1ec22aa59a68f4469aabe14775a1bfc1ab4b7f2f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Wed, 23 Apr 2014 04:49:57 +0000 (04:49 +0000)]
i40e: print message for pre-production hardware
The driver and hardware are not expected to work correctly
with revision_id 0 hardware. Don't prevent the user from
using it, but be sure to print a warning.
Change-ID: I3712d34752bfad458078a5f35dfd0aa0ae9fd20e
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:49:55 +0000 (04:49 +0000)]
i40e: add VEB stats to ethtool
Print the VEB statistics in the ethtool stats output.
Change-ID: Ic93d4c3922345c43e4cfd7f7e7a906844dd2f49f
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Wed, 23 Apr 2014 04:49:54 +0000 (04:49 +0000)]
i40e: set lan_veb index
When the VEB is created for the basic LAN device and its VSIs, we need to
set the tracking lan_veb index for later use.
Change-ID: I66bb74993bbda3621ca557437cb4b3517f9b315b
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sven Wegener [Thu, 29 May 2014 20:27:06 +0000 (20:27 +0000)]
ipv6: Shrink udp_v6_mcast_next() to one socket variable
To avoid the confusion of having two variables, shrink the function to
only use the parameter variable for looping.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>