Alexander Duyck [Tue, 30 Oct 2012 06:01:55 +0000 (06:01 +0000)]
ixgbe: Improve performance and reduce size of ixgbe_tx_map
This change is meant to both improve the performance and reduce the size of
ixgbe_tx_map. To do this I have expanded the work done in the main loop by
pushing first into tx_buffer. This allows us to pull in the dma_mapping_error
check, the tx_buffer value assignment, and the initial DMA value assignment to
the Tx descriptor. The net result is that the function reduces in size by a
little over 100 bytes and is about 1% or 2% faster.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 7 Nov 2012 02:34:28 +0000 (02:34 +0000)]
ixgbe: Update ixgbe Tx flags to improve code efficiency
This change is meant to improve the efficiency of the Tx flags in ixgbe by
aligning them with the values that will later be written into either the
cmd_type or olinfo. By doing this we are able to reduce most of these
functions to either just a simple shift followed by an or in the case of
cmd_type, or an and followed by an or in the case of olinfo.
To do this I also needed to change the logic and/or drop some flags. I
dropped the IXGBE_TX_FLAGS_FSO and it was replaced by IXGBE_TX_FLAGS_TSO since
the only place it was ever checked was in conjunction with IXGBE_TX_FLAGS_TSO.
I replaced IXGBE_TX_FLAGS_TXSW with IXGBE_TX_FLAGS_CC, this way we have a
clear point for what the flag is meant to do. Finally the
IXGBE_TX_FLAGS_NO_IFCS was dropped since we are already carrying the data
for that flag in the skb. Instead we can just check the bitflag in the skb.
In order to avoid type conversion errors I also adjusted the locations
where we were switching between CPU and little endian.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
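The flag alignment described in the Tx flags commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the driver's code: every flag name, bit position, and the shift amount below is hypothetical, chosen only to show how laying out software flags in the same order as the hardware bits lets a chain of tests collapse into a mask, a shift, and an OR.

```python
# Hypothetical software Tx flags (bit positions are illustrative only).
TX_FLAG_VLAN = 1 << 0
TX_FLAG_TSO = 1 << 1
TX_FLAG_TSTAMP = 1 << 2
TX_FLAG_MASK = TX_FLAG_VLAN | TX_FLAG_TSO | TX_FLAG_TSTAMP

# Hypothetical hardware command bits, in the same order as the software
# flags but starting at bit 8.
CMD_VLAN = 1 << 8
CMD_TSO = 1 << 9
CMD_TSTAMP = 1 << 10
CMD_SHIFT = 8
CMD_BASE = 0x3  # hypothetical bits that are always set


def cmd_type_branchy(tx_flags):
    """Unaligned style: one test and one OR per flag."""
    cmd = CMD_BASE
    if tx_flags & TX_FLAG_VLAN:
        cmd |= CMD_VLAN
    if tx_flags & TX_FLAG_TSO:
        cmd |= CMD_TSO
    if tx_flags & TX_FLAG_TSTAMP:
        cmd |= CMD_TSTAMP
    return cmd


def cmd_type_aligned(tx_flags):
    """Aligned style: mask the flags, shift them into place, OR them in."""
    return CMD_BASE | ((tx_flags & TX_FLAG_MASK) << CMD_SHIFT)
```

Both functions return the same value for every flag combination; the aligned version simply has no branches.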
Alexander Duyck [Tue, 30 Oct 2012 06:01:45 +0000 (06:01 +0000)]
ixgbe: Always use context 0, even for FCoE and TSO
We were spending cycles separating the FCoE and TSO contexts even though we
were always overwriting the context anyway. Instead of doing that we can just
use context 0 for all descriptors.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Tue, 30 Oct 2012 06:01:40 +0000 (06:01 +0000)]
ixgbe: Make TSO check for CHECKSUM_PARTIAL to avoid skb_is_gso check
This change is meant to reduce the overhead for workloads that are not
using either TSO or checksum offloads. Most of the time the compiler
should jump ahead after failing this check to the VLAN check since in the
ixgbe_tx_csum call we start with that check as well.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Fri, 19 Oct 2012 02:34:34 +0000 (02:34 +0000)]
ixgbe: SR-IOV: dynamic IEEE DCBx default priority changes
IEEE DCBx has a mechanism to change the default user priority. In
the normal case the OS can handle this via cgroups, iptables, socket
options, etc.
With SR-IOV and direct assigned VF devices the default priority
needs to be set by the PF device so the inserted VLAN tag is
correct.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Hannes Frederic Sowa [Fri, 18 Jan 2013 09:18:17 +0000 (09:18 +0000)]
ipv6: remove unneeded check to pskb_may_pull in ipip6_rcv
This is already checked by the caller (tunnel64_rcv) and brings ipip6_rcv
in line with ipip_rcv.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Fri, 18 Jan 2013 02:05:03 +0000 (02:05 +0000)]
ndisc: Check NS message length before access.
Check message length before accessing "target" field,
as we do for other types.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Fri, 18 Jan 2013 02:00:24 +0000 (02:00 +0000)]
ipv6: Remove unused neigh argument for icmp6_dst_alloc() and its callers.
Because of rt->n removal, we do not need neigh argument any more.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 17 Jan 2013 21:46:18 +0000 (21:46 +0000)]
enic: change sprintf() to snprintf()
These are copying data into 16 char arrays. They all specify that the
first string can't be more than 11 characters but once you add on the
"-rx-" and the NUL character there isn't space for the %d.
The first string is probably never going to be 11 characters, but if it
is then let's truncate the string instead of corrupting memory.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
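The length arithmetic behind the enic commit above can be checked with a small model. This is only a sketch in Python: the format string "%.11s-rx-%d" is an assumption reconstructed from the description (an 11-character limit, "-rx-", and a %d), and bounded_format() is a hypothetical helper that imitates how snprintf() truncates rather than overflows.

```python
BUF_SIZE = 16  # size of the destination char array, per the commit message

# Assumed format string, reconstructed from the commit description.
FMT = "%.11s-rx-%d"


def c_string_size(text):
    """Bytes a C string needs, including the trailing NUL."""
    return len(text.encode("ascii")) + 1


def bounded_format(name, index, size=BUF_SIZE):
    """Model snprintf(): keep at most size - 1 characters of the result."""
    return (FMT % (name, index))[: size - 1]


longest_name = "a" * 11
unbounded = FMT % (longest_name, 0)

# 11 name characters + 4 for "-rx-" + 1 digit + 1 NUL is one byte too many.
print(c_string_size(unbounded), "bytes needed,", BUF_SIZE, "available")
print(repr(bounded_format(longest_name, 0)))
```

With an 11-character name the bounded result loses the queue index, which is the truncation the commit accepts in exchange for not writing past the array.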
Fabio Estevam [Thu, 17 Jan 2013 16:46:02 +0000 (16:46 +0000)]
smsc: smc911x: Fix sparse warnings
ioremap returns 'void __iomem *' type.
Fix the following build warnings:
drivers/net/ethernet/smsc/smc911x.c:2079:14: warning: incorrect type in assignment (different address spaces)
drivers/net/ethernet/smsc/smc911x.c:2079:14: expected unsigned int *addr
drivers/net/ethernet/smsc/smc911x.c:2079:14: got void [noderef] <asn:2>*
drivers/net/ethernet/smsc/smc911x.c:2086:18: warning: incorrect type in assignment (different address spaces)
drivers/net/ethernet/smsc/smc911x.c:2086:18: expected void [noderef] <asn:2>*base
drivers/net/ethernet/smsc/smc911x.c:2086:18: got unsigned int *addr
drivers/net/ethernet/smsc/smc911x.c:2091:25: warning: incorrect type in argument 1 (different address spaces)
drivers/net/ethernet/smsc/smc911x.c:2091:25: expected void volatile [noderef] <asn:2>*addr
drivers/net/ethernet/smsc/smc911x.c:2091:25: got unsigned int *addr
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Que [Thu, 17 Jan 2013 09:29:49 +0000 (09:29 +0000)]
net: usb: initialize tmp in dm9601.c to avoid warning
In two places, tmp is initialized implicitly by being passed as a
pointer during a function call. However, this is not obvious to the
compiler, which logs a warning.
Signed-off-by: Simon Que <sque@chromium.org>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Thu, 17 Jan 2013 06:31:34 +0000 (06:31 +0000)]
net: ethernet: davinci_cpdma: Add boundary for rx and tx descriptors
When there is heavy transmit traffic in the CPDMA, Rx descriptor memory is
also used as Tx descriptor memory, so the driver loses all Rx descriptors
and stops working.
This patch adds a boundary between Tx and Rx descriptors in BD RAM, dividing
the descriptor memory so that Tx cannot use Rx descriptors during heavy
transmission.
This fix has already been applied to the davinci_emac driver. Since CPSW and
davinci_emac share the same CPDMA, this patch moves the boundary separation
from the Davinci EMAC driver to the CPDMA driver. The original change was
made in the following commit:
commit 86d8c07ff2448eb4e860e50f34ef6ee78e45c40c
Author: Sascha Hauer <s.hauer@pengutronix.de>
Date: Tue Jan 3 05:27:47 2012 +0000
net/davinci: do not use all descriptors for tx packets
The driver uses a shared pool for both rx and tx descriptors.
During open it queues fixed number of 128 descriptors for receive
packets. For each received packet it tries to queue another
descriptor. If this fails the descriptor is lost for rx.
The driver has no limitation on tx descriptors to use, so it
can happen during a nmap / ping -f attack that the driver
allocates all descriptors for tx and looses all rx descriptors.
The driver stops working then.
To fix this limit the number of tx descriptors used to half of
the descriptors available, the rx path uses the other half.
Tested on a custom board using nmap / ping -f to the board from
two different hosts.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
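The descriptor split described in the davinci_cpdma commit above can be modelled in a few lines. This is a toy sketch in Python, not the driver's allocator: the class and method names are hypothetical, and the even half-and-half split follows the quoted davinci_emac commit ("half of the descriptors available").

```python
class DescriptorPool:
    """Toy model of a descriptor pool shared by Rx and Tx."""

    def __init__(self, total):
        self.total = total
        self.tx_limit = total // 2          # Tx may use at most half
        self.rx_limit = total - self.tx_limit
        self.tx_in_use = 0
        self.rx_in_use = 0

    def alloc_tx(self):
        """Take one Tx descriptor; refuse once Tx has used its half."""
        if self.tx_in_use >= self.tx_limit:
            return False
        self.tx_in_use += 1
        return True

    def alloc_rx(self):
        """Take one Rx descriptor from the half reserved for Rx."""
        if self.rx_in_use >= self.rx_limit:
            return False
        self.rx_in_use += 1
        return True


pool = DescriptorPool(8)
tx_results = [pool.alloc_tx() for _ in range(10)]  # heavy transmit burst
rx_still_works = pool.alloc_rx()
```

Even after the transmit burst exhausts the Tx half, Rx can still obtain a descriptor, which is the property the boundary is meant to guarantee.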
Alan Ott [Wed, 16 Jan 2013 19:09:48 +0000 (19:09 +0000)]
6lowpan: Handle uncompressed IPv6 packets over 6LoWPAN
Handle the reception of uncompressed packets (dispatch type = IPv6).
Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Ott [Wed, 16 Jan 2013 19:09:47 +0000 (19:09 +0000)]
6lowpan: Refactor packet delivery into a function
Refactor the handing of the skb's to the individual lowpan devices into a
function.
Signed-off-by: Alan Ott <alan@signal11.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Li [Wed, 16 Jan 2013 16:55:58 +0000 (16:55 +0000)]
net: fec: enable pause frame to improve rx performance for 1G network
A limitation of the imx6 internal bus prevents the fec from achieving 1G performance.
Many packets are lost because of FIFO overruns.
This patch enables pause frame flow control.
Before this patch
iperf -s -i 1
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.192.242.153 port 5001 connected with 10.192.242.94 port 49773
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 1.0 sec 6.35 MBytes 53.3 Mbits/sec
[ 4] 1.0- 2.0 sec 3.39 MBytes 28.5 Mbits/sec
[ 4] 2.0- 3.0 sec 2.63 MBytes 22.1 Mbits/sec
[ 4] 3.0- 4.0 sec 1.10 MBytes 9.23 Mbits/sec
ifconfig
RX packets:46195 errors:1859 dropped:1 overruns:1859 frame:1859
After this patch
iperf -s -i 1
[ 4] local 10.192.242.153 port 5001 connected with 10.192.242.94 port 49757
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 1.0 sec 49.8 MBytes 418 Mbits/sec
[ 4] 1.0- 2.0 sec 50.1 MBytes 420 Mbits/sec
[ 4] 2.0- 3.0 sec 47.5 MBytes 399 Mbits/sec
[ 4] 3.0- 4.0 sec 45.9 MBytes 385 Mbits/sec
[ 4] 4.0- 5.0 sec 44.8 MBytes 376 Mbits/sec
ifconfig
RX packets:2348454 errors:0 dropped:16 overruns:0 frame:0
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lucas Stach [Wed, 16 Jan 2013 04:24:07 +0000 (04:24 +0000)]
net: asix: handle packets crossing URB boundaries
ASIX AX88772B started to pack data even more tightly. Packets and the ASIX packet
header may now cross URB boundaries. To handle this we have to introduce
some state between individual calls to asix_rx_fixup().
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lucas Stach [Wed, 16 Jan 2013 04:24:06 +0000 (04:24 +0000)]
net: asix: init ASIX AX88772B MAC from EEPROM
The device comes up with a MAC address of all zeros. We need to read the
initial device MAC from EEPROM so it can be set properly later.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 18 Jan 2013 19:12:11 +0000 (14:12 -0500)]
Merge branch 'intel'
Jeff Kirsher says:
====================
This series contains updates to e1000e and igb. Most notable is the
added timestamp support in e1000e and the additional software timestamp
support in igb, as well as the added thermal data support and SR-IOV
configuration support in igb.
v2- dropped the following patches from the previous 14 patch series
because changes were requested from the community:
e1000e: add support for IEEE-1588 PTP
igb: Report L4 Rx hash via skb->l4_rxhash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Vick [Thu, 13 Dec 2012 07:20:36 +0000 (07:20 +0000)]
igb: Use in-kernel PTP_EV_PORT #define
Rather than use an extra #define for something that already exists, use the
kernel #define for the PTP port.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Thu, 13 Dec 2012 07:20:37 +0000 (07:20 +0000)]
igb: Free any held skb that should have been timestamped on remove
To prevent a race condition where an skb has been saved to return the Tx
timestamp later and the driver is removed, add a check to determine if we
have an skb stored and, if so, free it.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Thu, 13 Dec 2012 07:20:35 +0000 (07:20 +0000)]
igb: Add mechanism for detecting latched hardware Rx timestamp
Add a check against possible Rx timestamp freezing in the hardware via
watchdog mechanism. This situation can occur when an Rx timestamp has been
latched, but the packet has been dropped because the Rx ring is full.
Whenever a packet comes in that should be timestamped, the Rx timestamp
gets latched into the hardware registers and we will store the jiffy value
in the rx_ring. The watchdog will keep track of its own jiffy timer
whenever there is no valid timestamp in the registers.
If the watchdog detects a valid timestamp in the registers, meaning that no
Rx packet has consumed it yet, it will check which time is most recent: the
last time in the watchdog or any time in the rx_rings. If the most recent
"event" was more than 5 seconds ago, it will flush the Rx timestamp and
print a warning message to the syslog.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
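The watchdog decision described in the latched Rx timestamp commit above might be sketched like this. It is a simplified model in Python, not the igb code: the function and parameter names are hypothetical, and times are plain seconds rather than jiffies.

```python
RX_TSTAMP_TIMEOUT = 5.0  # seconds, per the commit message


def should_flush_rx_timestamp(now, timestamp_latched, watchdog_last, rx_ring_last):
    """Decide whether a latched Rx timestamp looks stuck.

    now               -- current time in seconds
    timestamp_latched -- True if the registers hold an unconsumed timestamp
    watchdog_last     -- last time the watchdog saw no valid timestamp
    rx_ring_last      -- last timestamp-event time recorded by each Rx ring
    """
    if not timestamp_latched:
        return False
    # The most recent "event" is the newest of the watchdog time and any
    # of the per-ring times.
    most_recent = max([watchdog_last] + list(rx_ring_last))
    return now - most_recent > RX_TSTAMP_TIMEOUT
```

For example, with `now=100.0`, a watchdog time of 90.0 and ring times of 94.0 and 80.0, the newest event is 6 seconds old, so the model reports that the timestamp should be flushed.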
Matthew Vick [Thu, 13 Dec 2012 07:20:34 +0000 (07:20 +0000)]
igb: Add timeout for PTP Tx work item
When transmitting a packet that must return a Tx timestamp, a work item
gets scheduled to poll for the Tx timestamp being completed in hardware.
Add a timeout on this work item of 15 seconds from when the driver gets the
skb, after which it will stop polling. Report via stats and system log if
this occurs.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Thu, 13 Dec 2012 07:20:33 +0000 (07:20 +0000)]
igb: Add support for SW timestamping
Enable SW timestamping for situations where the user may prefer it over HW
timestamping or there may not be HW timestamping.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Jacob Keller <Jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Fri, 7 Dec 2012 03:01:42 +0000 (03:01 +0000)]
igb: Enable hwmon data output for thermal sensors via I2C.
Some of our adapters have internal sensors that report thermal data. This
patch enables reporting of that data via sysfs.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Fri, 7 Dec 2012 03:01:16 +0000 (03:01 +0000)]
igb: Add support functions to access thermal data.
Some of our devices have internal sensors for reporting thermal data.
This patch creates the interface to the sensors for exporting via sysfs.
A subsequent patch will actually export the data.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Fri, 7 Dec 2012 03:00:30 +0000 (03:00 +0000)]
igb: Add i2c interface to igb.
Some of our adapters have sensors on them accessible via i2c and a private
interface. This patch implements the kernel interface for i2c to those sensors.
Subsequent patches will provide functions to export that data.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Thu, 17 Jan 2013 09:03:06 +0000 (01:03 -0800)]
igb: Enable SR-IOV configuration via PCI sysfs interface
Implement callback in the driver for the new PCI bus driver
interface that allows the user to enable/disable SR-IOV
virtual functions in a device via the sysfs interface.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 27 Dec 2012 08:32:33 +0000 (08:32 +0000)]
e1000e: add support for hardware timestamping on some devices
On 82574, 82583, 82579, I217 and I218 add support for hardware time
stamping of all or no Rx packets and Tx packets which have the
SKBTX_HW_TSTAMP flag set. Update the .get_ts_info ethtool operation to
report the supported time stamping modes, and enable and disable hardware
time stamping with the SIOCSHWTSTAMP ioctl.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bjorn Helgaas [Thu, 6 Dec 2012 06:40:07 +0000 (06:40 +0000)]
e1000e: Use standard #defines for PCIe Capability ASPM fields
Use the standard #defines for PCIe Capability ASPM fields.
Previously we used PCIE_LINK_STATE_L0S and PCIE_LINK_STATE_L1 directly, but
these are defined for the Linux ASPM interfaces, e.g.,
pci_disable_link_state(), and only coincidentally match the actual register
bits. PCIE_LINK_STATE_CLKPM, also part of that interface, does not match
the register bit.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: e1000-devel@lists.sourceforge.net
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 08:40:59 +0000 (08:40 +0000)]
e1000e: add ethtool .get_eee/.set_eee
Add the ability to query and set Energy Efficient Ethernet parameters via
ethtool for applicable devices.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:54:05 +0000 (12:54 +0000)]
ipv6: Complete neighbour entry removal from dst_entry.
CC: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:54:00 +0000 (12:54 +0000)]
ipv6: Do not depend on rt->n in ip6_finish_output2().
If neigh is not found, create new one.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:55 +0000 (12:53 +0000)]
ipv6: Do not depend on rt->n in ip6_dst_lookup_tail().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:48 +0000 (12:53 +0000)]
ipv6: Introduce rt6_nexthop() to select nexthop address.
For RTF_GATEWAY route, return rt->rt6i_gateway.
Otherwise, return 2nd argument (destination address).
This will be used by the following patches, which remove the rt->n
dependency in ip6_dst_lookup_tail() and ip6_finish_output2().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
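The selection rule stated in the rt6_nexthop() commit above is small enough to model directly. This is a sketch in Python, not the kernel helper: the Route class is a hypothetical stand-in for rt6_info, and the RTF_GATEWAY value is included only so the example runs.

```python
from dataclasses import dataclass
from ipaddress import IPv6Address

RTF_GATEWAY = 0x0002  # flag value assumed for illustration


@dataclass
class Route:
    """Hypothetical stand-in for the fields of rt6_info used here."""
    flags: int
    gateway: IPv6Address


def rt6_nexthop(rt, daddr):
    """Return the gateway for gateway routes, else the destination itself."""
    if rt.flags & RTF_GATEWAY:
        return rt.gateway
    return daddr


via_router = Route(flags=RTF_GATEWAY, gateway=IPv6Address("fe80::1"))
on_link = Route(flags=0, gateway=IPv6Address("::"))
dest = IPv6Address("2001:db8::42")
```

Here `rt6_nexthop(via_router, dest)` yields the router address, while `rt6_nexthop(on_link, dest)` yields the destination, so neighbour lookup can proceed without a cached rt->n.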
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:43 +0000 (12:53 +0000)]
ipv6: Do not depend on rt->n in rt6_probe().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:38 +0000 (12:53 +0000)]
ipv6: Do not depend on rt->n in rt6_check_neigh().
CC: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:32 +0000 (12:53 +0000)]
ipv6: Do not depend on rt->n in ip6_pol_route().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:22 +0000 (12:53 +0000)]
ndisc: Introduce __ipv6_neigh_lookup_noref().
This function, which looks up neighbour entry for an IPv6 address
without touching refcnt, will be used for patches to remove
dependency on rt->n (neighbour entry in rt6_info).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:15 +0000 (12:53 +0000)]
ipv6 route: Dump gateway based on RTF_GATEWAY flag and rt->rt6i_gateway.
Do not depend on rt->n.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:09 +0000 (12:53 +0000)]
ndisc: Remove tbl argument for __ipv6_neigh_lookup().
We can refer to nd_tbl directly.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki / 吉藤英明 [Thu, 17 Jan 2013 12:53:02 +0000 (12:53 +0000)]
ndisc: Update neigh->updated with write lock.
neigh->nud_state and neigh->updated are under protection of
neigh->lock.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Thu, 17 Jan 2013 03:26:21 +0000 (03:26 +0000)]
bnx2x: fix GRO parameters
bnx2x does an internal GRO pass but doesn't provide gso_segs, thus
breaking qdisc_pkt_len_init() in case ingress qdisc is used.
We store gso_segs in NAPI_GRO_CB(skb)->count, where tcp_gro_complete()
expects to find the number of aggregated segments.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vipul Pandya [Wed, 16 Jan 2013 23:29:59 +0000 (23:29 +0000)]
cxgb3: Fix Tx csum stats
Signed-off-by: Jay Hernandez <jay@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Baltieri [Wed, 16 Jan 2013 21:30:17 +0000 (22:30 +0100)]
ipv6: fix ipv6_prefix_equal64_half mask conversion
Fix the 64bit optimized version of ipv6_prefix_equal to convert the
bitmask to network byte order only after the bit-shift.
The bug was introduced in:
3867517 ipv6: 64bit version of ipv6_prefix_equal().
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
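The ordering that the ipv6_prefix_equal64_half fix above insists on (shift first, convert to network byte order second) can be illustrated as follows. This is a sketch in Python over raw bytes, not the kernel function: the helper names are hypothetical, and struct.pack("!Q", ...) stands in for the byte-order conversion.

```python
import struct

ALL_ONES_64 = 0xFFFFFFFFFFFFFFFF


def prefix_mask_be(prefix_len):
    """Network-order mask covering the top prefix_len bits of 64."""
    # Shift in host integer arithmetic first ...
    host_mask = (ALL_ONES_64 << (64 - prefix_len)) & ALL_ONES_64
    # ... and only then convert to network (big-endian) byte order.
    return struct.pack("!Q", host_mask)


def prefix_equal64_half(a, b, prefix_len):
    """Compare the first prefix_len bits of two 8-byte address halves."""
    mask = prefix_mask_be(prefix_len)
    return all(((x ^ y) & m) == 0 for x, y, m in zip(a, b, mask))


a = bytes.fromhex("20010db800000000")
b = bytes.fromhex("20010dc800000000")  # first differs from a at bit 25
```

Converting the all-ones value before shifting would, on a little-endian host, shift the wrong bytes out of the mask; building the mask as a plain integer and converting last avoids that.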
Jesper Dangaard Brouer [Tue, 15 Jan 2013 07:16:35 +0000 (07:16 +0000)]
net: increase fragment memory usage limits
Increase the memory usage limits for incomplete
IP fragments.
Arguing for new thresh high/low values:
High threshold = 4 MBytes
Low threshold = 3 MBytes
The fragmentation memory accounting code tries to account for the
real memory usage by measuring both the size of the frag queue struct
(inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKB's truesize.
We want to be able to hold on to enough fragments to ensure
good performance, without letting incomplete fragments hurt
scalability by causing the number of inet_frag_queue entries to grow too
much (resulting in longer searches for frag queues).
For IPv4, how much memory does the largest frag consume?
Maximum size fragment is 64K, which is approx 44 fragments with
MTU(1500) sized packets. Sizeof(struct ipq) is 200. A 1500 byte
packet results in a truesize of 2944 (not 2048 as I first assumed)
(44*2944)+200 = 129736 bytes
The current default high thresh of 262144 bytes, is obviously
problematic, as only two 64K fragments can fit in the queue at the
same time.
How many 64K fragments can we fit into 4 MBytes?
4*2^20/((44*2944)+200) = 32.33 fragments in queues
An attacker could send separate/distinct fake fragment packets, one per
queue, causing us to allocate one inet_frag_queue per packet, and thus
attacking the hash table and its lists.
How many frag queues do we need to store and, given a current hash size
of 64, what is the average list length?
Using one MTU sized fragment per inet_frag_queue, each consuming
(2944+200) 3144 bytes.
4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length
An attacker could send small fragments; the smallest packet I could send
resulted in a truesize of 896 bytes (I'm a little surprised by this).
4*2^20/(896+200) = 3827 frag queues -> 59 avg list length
When increasing these numbers, we also need to follow up with
improvements that are going to help scalability. Simply increasing
the hash size is not enough, as the current implementation does not
have per hash bucket locking.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
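The sizing arithmetic in the fragment memory commit above can be reproduced with a short script. This is only a re-computation in Python of the figures quoted in the message (struct size 200, truesize 2944 and 896, 44 fragments per 64K, hash size 64); the variable names are made up for the example, and the commit rounds some results slightly differently.

```python
HIGH_THRESH = 4 * 2**20   # proposed high threshold, 4 MBytes
IPQ_SIZE = 200            # sizeof(struct ipq), per the commit message
TRUESIZE_MTU = 2944       # truesize of a 1500 byte packet
TRUESIZE_SMALL = 896      # truesize of the smallest packet tested
FRAGS_PER_64K = 44        # MTU sized fragments in a 64K datagram
HASH_BUCKETS = 64         # current hash size

# Memory held by one fully populated 64K fragment queue.
largest_queue = FRAGS_PER_64K * TRUESIZE_MTU + IPQ_SIZE

# How many such queues fit under the high threshold.
full_queues = HIGH_THRESH / largest_queue

# Attack case: one fragment per queue, MTU sized or as small as possible.
mtu_queues = HIGH_THRESH / (TRUESIZE_MTU + IPQ_SIZE)
small_queues = HIGH_THRESH / (TRUESIZE_SMALL + IPQ_SIZE)

print("largest queue bytes:", largest_queue)
print("64K queues in 4 MBytes: %.2f" % full_queues)
print("MTU frag queues: %.0f, avg list length %.1f"
      % (mtu_queues, mtu_queues / HASH_BUCKETS))
print("small frag queues: %.0f, avg list length %.1f"
      % (small_queues, small_queues / HASH_BUCKETS))
```

The average list lengths come out at roughly 21 and just under 60 entries per bucket, which is why the message argues that raising the limits must be followed by hash table improvements.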
Vincent Bernat [Wed, 16 Jan 2013 21:55:49 +0000 (22:55 +0100)]
sk-filter: Add ability to lock a socket filter program
While a privileged program can open a raw socket, attach some
restrictive filter and drop its privileges (or send the socket to an
unprivileged program through some Unix socket), the filter can still
be removed or modified by the unprivileged program. This commit adds a
socket option to lock the filter (SO_LOCK_FILTER) preventing any
modification of a socket filter program.
This is similar to OpenBSD BIOCLOCK ioctl on bpf sockets, except even
root is not allowed to change/drop the filter.
The state of the lock can be read with getsockopt(). No error is
triggered if the state is not changed. -EPERM is returned when a user
tries to remove the lock or to change/remove the filter while the lock
is active. The check is done directly in sk_attach_filter() and
sk_detach_filter(), so it does not affect only the setsockopt() syscall.
Signed-off-by: Vincent Bernat <bernat@luffy.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
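The locking semantics described in the SO_LOCK_FILTER commit above can be captured in a small state model. This is a sketch in Python of the behaviour as the message states it, not a use of the real socket API: the FilteredSocket class and its methods are hypothetical, and PermissionError stands in for -EPERM.

```python
class FilteredSocket:
    """Toy model of a socket with a lockable filter program."""

    def __init__(self):
        self.filter = None
        self.locked = False

    def set_lock(self, value):
        """Model setting the lock option; it can be set but never cleared."""
        value = bool(value)
        if self.locked and not value:
            raise PermissionError("filter lock cannot be removed")
        # Setting the same state again is not an error.
        self.locked = value

    def get_lock(self):
        """Model reading the lock state back."""
        return self.locked

    def attach_filter(self, program):
        """Attach or replace the filter; refused while locked."""
        if self.locked:
            raise PermissionError("filter is locked")
        self.filter = program

    def detach_filter(self):
        """Remove the filter; refused while locked."""
        if self.locked:
            raise PermissionError("filter is locked")
        self.filter = None
```

A privileged process would attach its filter, set the lock, and only then hand the socket over; every later attempt to unlock, replace, or remove the filter fails in this model.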
Cong Wang [Thu, 17 Jan 2013 04:21:08 +0000 (12:21 +0800)]
netpoll: fix a missing dev refcounting
__dev_get_by_name() doesn't refcount the network device,
so we have to do this by ourselves. Noticed by Eric.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YOSHIFUJI Hideaki [Thu, 17 Jan 2013 03:10:57 +0000 (12:10 +0900)]
ipv6: Fix endianness warning in ip6_flow_hdr().
Commit 3e4e4c1f ("ipv6: Introduce ip6_flow_hdr() to fill version,
tclass and flowlabel.") uses ntohl(), which should be htonl().
Found by Fengguang Wu <fengguang.wu@intel.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teräs [Tue, 15 Jan 2013 21:01:24 +0000 (21:01 +0000)]
r8169: remove unneeded dirty_rx index
After commit 6f0333b ("r8169: use 50% less ram for RX ring") the rx
ring buffers are always copied making dirty_rx useless.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 14 Jan 2013 23:34:06 +0000 (23:34 +0000)]
netpoll: fix a rtnl lock assertion failure
v4: hold rtnl lock for the whole netpoll_setup()
v3: remove the comment
v2: use RCU read lock
This patch fixes the following warning:
[ 72.013864] RTNL: assertion failed at net/core/dev.c (4955)
[ 72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
[ 72.019582] Call Trace:
[ 72.020295] [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
[ 72.022545] [<ffffffff81784edd>] netpoll_setup+0x61/0x340
[ 72.024846] [<ffffffff815d837e>] store_enabled+0x82/0xc3
[ 72.027466] [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
[ 72.029348] [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
[ 72.030959] [<ffffffff8115d239>] vfs_write+0xaf/0xf6
[ 72.032359] [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
[ 72.033824] [<ffffffff8115d453>] sys_write+0x5c/0x84
[ 72.035328] [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
In case of other races, hold rtnl lock for the entire netpoll_setup() function.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:35 +0000 (07:28 +0000)]
vmxnet3: better RSS support
The VMXNET3 device provides RSS hash value for received packets,
but it is not being used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:34 +0000 (07:28 +0000)]
vmxnet3: use static RSS key
Rather than generating a different RSS key on each boot, just use
a predetermined value that will map same flow to same value on
every device for more predictable testing. This is already done
on most hardware drivers.
The initial key value is just some arbitrary bits extracted once
from /dev/random.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:33 +0000 (07:28 +0000)]
vmxnet3: remove unused irq_share_mode
This static variable is never set, it initializes to 0 which
is VMXNET3_INTR_BUDDYSHARE, and never changes.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:32 +0000 (07:28 +0000)]
vmxnet3: remove device counter
An atomic counter of devices present is maintained but never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:31 +0000 (07:28 +0000)]
vmxnet3: remove VMXNET3_MAX_DEVICES
Defined but never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:30 +0000 (07:28 +0000)]
vmxnet3: use netdev_ printk wrappers
Use the standard netdev_xxx() and dev_xxx() wrappers to format
log messages.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:29 +0000 (07:28 +0000)]
vmxnet3: use netdev_dbg
Use netdev_dbg() rather than dev_dbg() because the former prints
the device name which is more useful than the pci name.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:28 +0000 (07:28 +0000)]
vmxnet3: fix messages printed before registration
Fix the messages that occur at boot time from this device
when netdev_err() is called before calling register_netdevice().
Switch to the dev_XXX macros, which correlate the message with the PCI info
that is available.
Rather than fixing the features message, just remove it since
the information is redundant and available through ethtool.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:27 +0000 (07:28 +0000)]
vmxnet3: remove unnecessary bookkeeping
The uncommitted[] array was set but never used except in a debug
message. Remove it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Tue, 15 Jan 2013 07:28:26 +0000 (07:28 +0000)]
vmxnet3: use netdev_alloc_skb_ip_align
Use netdev_alloc_skb_ip_align, rather than open coding it with dev_alloc_skb.
Change allocation at startup to use GFP_KERNEL.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
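As a rough illustration of why the vmxnet3 commit above prefers an IP-aligning allocator: reserving a small pad in front of the 14-byte Ethernet header lets the IP header start on a 4-byte boundary. The sketch below only models the offset arithmetic in userspace. The names `ETH_HLEN_SKETCH`, `IP_ALIGN_SKETCH`, `ip_header_offset` and `ip_header_aligned` are hypothetical, and the pad of 2 is an assumption (the kernel's NET_IP_ALIGN is architecture dependent).

```c
#include <stddef.h>

#define ETH_HLEN_SKETCH 14   /* Ethernet header length in bytes */
#define IP_ALIGN_SKETCH 2    /* assumed pad; the kernel's NET_IP_ALIGN varies by arch */

/* Offset of the IP header from the buffer start when `pad` bytes are reserved first. */
size_t ip_header_offset(size_t pad)
{
    return pad + ETH_HLEN_SKETCH;
}

/* Nonzero when the IP header would start on a 4-byte boundary. */
int ip_header_aligned(size_t pad)
{
    return ip_header_offset(pad) % 4 == 0;
}
```

With no pad the IP header lands at offset 14, which is not 4-byte aligned; with the assumed pad of 2 it lands at offset 16.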
David S. Miller [Wed, 16 Jan 2013 19:31:56 +0000 (14:31 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to e1000e only.
v2- updates patch 09/15 "e1000e: resolve checkpatch PREFER_PR_LEVEL warning"
based on feedback from Joe Perches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Wed, 5 Dec 2012 06:26:56 +0000 (06:26 +0000)]
e1000e: merge multiple conditional statements into one
Cleanup a set of conditional tests.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:51 +0000 (06:26 +0000)]
e1000e: cleanup code duplication
The removed code block is duplicated in e1000e_write_itr() so use that
instead.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:46 +0000 (06:26 +0000)]
e1000e: cleanup magic number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:40 +0000 (06:26 +0000)]
e1000e: cleanup unnecessary line wrap
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:35 +0000 (06:26 +0000)]
e1000e: cleanup unusual comment placement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:30 +0000 (06:26 +0000)]
e1000e: cleanup redundant statistics counter
rx_long_byte_count can be removed since it is duplicated in rx_bytes
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 12 Jan 2013 03:11:25 +0000 (03:11 +0000)]
e1000e: resolve checkpatch PREFER_PR_LEVEL warning
WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ...
then pr_info(... to printk(KERN_INFO ...
v2 - remove unnecessary "e1000e:" prefix as pointed out by Joe Perches
since that produces a redundant "e1000e:" in the log message
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:19 +0000 (06:26 +0000)]
e1000e: add missing bailout on error
...discovered during code inspection.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:14 +0000 (06:26 +0000)]
e1000e: unexpected "Reset adapter" message when cable pulled
When there is heavy traffic and the cable is pulled, the driver must reset
the adapter to flush the Tx queue in hardware. This causes the reset path
to be scheduled and logs the message "Reset adapter" which could be
misinterpreted as an error by the user. Change how the reset path is invoked
for this scenario by using the same method done in an existing work-around
for 80003es2lan (i.e. set a flag and if the flag is set in the reset code
do not log the "Reset adapter" message since the reset is expected).
Re-name the FLAG_RX_RESTART_NOW to FLAG_RESTART_NOW since it is used for
resets in both the Rx and Tx specific code.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:08 +0000 (06:26 +0000)]
e1000e: fix enabling of EEE on 82579 and I217
Energy Efficient Ethernet on 82579 and I217 should only be enabled if not
disabled by the user, if the link is full duplex and the link partner has
similar EEE capabilities (stored in different EMI registers on the two
different parts).
After enabling EEE, read the IEEE MMD register 3.1 (which is also stored in
different EMI registers on the two different parts) to clear the count of
received Tx/Rx LPI indications.
Also, rename I217_EEE_100_SUPPORTED to I82579_EEE_100_SUPPORTED to indicate
the bit is valid starting with I82579 (released before I217).
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:26:03 +0000 (06:26 +0000)]
e1000e: 82577: workaround for link drop issue
When connected to certain switches, the 82577 PHY might drop link
unexpectedly. Work around the issue by setting the Mean Square Error
higher than the hardware default.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:25:58 +0000 (06:25 +0000)]
e1000e: helper functions for accessing EMI registers
The Extended Management Interface (EMI) registers are accessed by first
writing the EMI register offset to the EMI_ADDR register and then either
reading or writing the data to/from the EMI_DATA register. Add helper
functions for performing these steps and convert existing EMI register
accesses accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
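The two-step EMI access described in the commit above (write the offset to EMI_ADDR, then read or write EMI_DATA) can be sketched in userspace against a mock PHY. Everything here is an assumption made for illustration: `struct mock_phy`, the register numbers, the size of the EMI space, and the helper names are hypothetical and are not taken from the e1000e driver.

```c
#include <stdint.h>

/* Hypothetical register numbers and EMI space size, for illustration only. */
enum { EMI_ADDR_REG = 0x10, EMI_DATA_REG = 0x11, EMI_SPACE = 0x1000 };

struct mock_phy {
    uint16_t emi_addr;         /* last offset written to EMI_ADDR */
    uint16_t emi[EMI_SPACE];   /* registers reachable only through the window */
};

/* Mock PHY register write: EMI_ADDR latches an offset, EMI_DATA stores through it. */
int phy_write(struct mock_phy *phy, int reg, uint16_t val)
{
    if (reg == EMI_ADDR_REG) {
        phy->emi_addr = val;
        return 0;
    }
    if (reg == EMI_DATA_REG && phy->emi_addr < EMI_SPACE) {
        phy->emi[phy->emi_addr] = val;
        return 0;
    }
    return -1;
}

/* Mock PHY register read: only EMI_DATA is readable in this sketch. */
int phy_read(struct mock_phy *phy, int reg, uint16_t *val)
{
    if (reg == EMI_DATA_REG && phy->emi_addr < EMI_SPACE) {
        *val = phy->emi[phy->emi_addr];
        return 0;
    }
    return -1;
}

/* Shared helper: select the EMI offset, then read or write the data window. */
int access_emi_reg(struct mock_phy *phy, uint16_t addr, uint16_t *data, int is_read)
{
    int ret = phy_write(phy, EMI_ADDR_REG, addr);

    if (ret)
        return ret;
    if (is_read)
        return phy_read(phy, EMI_DATA_REG, data);
    return phy_write(phy, EMI_DATA_REG, *data);
}

int read_emi_reg(struct mock_phy *phy, uint16_t addr, uint16_t *data)
{
    return access_emi_reg(phy, addr, data, 1);
}

int write_emi_reg(struct mock_phy *phy, uint16_t addr, uint16_t data)
{
    return access_emi_reg(phy, addr, &data, 0);
}
```

Keeping the address write inside one helper means callers cannot forget the first step or interleave it with another access.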
Eric Dumazet [Wed, 16 Jan 2013 05:14:21 +0000 (21:14 -0800)]
net_sched: fix qdisc_pkt_len_init()
commit 1def9238d4aa2 (net_sched: more precise pkt_len computation)
does a wrong computation of mac + network headers length, as it includes
the padding before the frame.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
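One way to picture the miscalculation described in the net_sched commit above: if the header length is measured from the start of the buffer rather than from the MAC header, any padding in front of the frame is counted as header bytes. The sketch below models that with plain offsets. `struct frame_sketch` and both function names are hypothetical, and this is not the kernel's actual code.

```c
#include <stddef.h>

struct frame_sketch {
    size_t mac_off;         /* MAC header offset from buffer start (padding before it) */
    size_t transport_off;   /* transport header offset from buffer start */
};

/* Over-counts: measured from the buffer start, so the padding is included. */
size_t hdr_len_with_padding(const struct frame_sketch *f)
{
    return f->transport_off;
}

/* MAC + network header length only. */
size_t hdr_len_mac_plus_net(const struct frame_sketch *f)
{
    return f->transport_off - f->mac_off;
}
```

For a frame with 64 bytes of padding, a 14-byte Ethernet header and a 20-byte IPv4 header, the first function yields 98 while the second yields 34.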
Bruce Allan [Wed, 9 Jan 2013 08:15:42 +0000 (08:15 +0000)]
e1000e: Invalid Image CSUM bit changed for I217
On I217, the bit that indicates an invalid EEPROM (NVM) image checksum has
changed from previous ICH/PCH LOMs. When validating the EEPROM checksum,
check the appropriate bit on different devices.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 5 Dec 2012 06:25:47 +0000 (06:25 +0000)]
e1000e: Acquire/release semaphore when writing each EEPROM page
When data blocks are written to the EEPROM, the HW/SW/FW semaphore must be
held for the duration. With large data blocks on 80003es2lan, 82571 and
82572, this can take too long and cause the firmware to take ownership of
the semaphore and consequently ownership of writes to the EEPROM.
Instead, acquire and release the semaphore for each page of the block
written.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
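The per-page locking pattern from the EEPROM commit above can be sketched as a loop that takes and drops the semaphore around each page, so no single hold covers the whole block. This is a userspace model under stated assumptions: `struct nvm_sketch`, the helper names, and the page size of 32 words are hypothetical and do not come from the driver.

```c
#include <stddef.h>

#define PAGE_WORDS 32   /* assumed page size in words; real parts differ */

struct nvm_sketch {
    int held;           /* 1 while the semaphore is owned */
    int acquisitions;   /* how many times it was taken */
    size_t max_hold;    /* most words written under a single hold */
};

int nvm_acquire(struct nvm_sketch *n)
{
    if (n->held)
        return -1;
    n->held = 1;
    n->acquisitions++;
    return 0;
}

void nvm_release(struct nvm_sketch *n)
{
    n->held = 0;
}

/* Write `words` words, acquiring and releasing the semaphore for each page. */
int nvm_write_paged(struct nvm_sketch *n, size_t words)
{
    size_t done = 0;

    while (done < words) {
        size_t chunk = words - done;

        if (chunk > PAGE_WORDS)
            chunk = PAGE_WORDS;
        if (nvm_acquire(n))
            return -1;
        /* ...the page write itself would go here... */
        if (chunk > n->max_hold)
            n->max_hold = chunk;
        nvm_release(n);
        done += chunk;
    }
    return 0;
}
```

The longest hold is bounded by one page regardless of the block size, which is the property the commit relies on.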
Bruce Allan [Wed, 5 Dec 2012 06:25:42 +0000 (06:25 +0000)]
e1000e: SerDes autoneg flow control
Enables flow control to be set in SerDes autoneg mode. This is what is
done for copper, but relies on a different set of register/bit checks
since this is all done within the Mac registers.
Remove inapplicable comment in defines.h
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Yuval Mintz [Mon, 14 Jan 2013 05:11:50 +0000 (05:11 +0000)]
bnx2x: Introduce 2013 and advance version to 1.78.02
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 14 Jan 2013 05:11:49 +0000 (05:11 +0000)]
bnx2x: Added FW GRO bridging support
Since commit 621b4d6 the bnx2x driver supports FW GRO.
However, when using the device with GRO enabled in bridging
scenarios throughput is very low, as the bridge expects all
incoming packets to be passed with CHECKSUM_PARTIAL -
a demand which is satisfied by the SW GRO implementation,
but was missed in the bnx2x driver implementation (which returned
CHECKSUM_UNNECESSARY).
Now, given that the traffic is supported by FW GRO (TCP/IP),
the bnx2x driver calculates the pseudo checksum by itself,
passing skbs with CHECKSUM_PARTIAL and giving a much better
throughput when receiving GRO traffic.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
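For readers unfamiliar with the pseudo checksum mentioned in the FW GRO commit above, the arithmetic for IPv4 TCP is a 16-bit ones-complement sum over the source address, destination address, protocol number and TCP length. The sketch below shows only that arithmetic on host-order values. The function names are hypothetical, and the driver's real helper, its byte-order handling and its complement convention are not shown in the source.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
uint16_t csum_fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement sum of the IPv4 TCP pseudo-header (inputs in host byte order). */
uint16_t tcp_pseudo_sum(uint32_t saddr, uint32_t daddr, uint16_t tcp_len)
{
    uint32_t sum = 0;

    sum += saddr >> 16;
    sum += saddr & 0xFFFF;
    sum += daddr >> 16;
    sum += daddr & 0xFFFF;
    sum += 6;           /* IPPROTO_TCP */
    sum += tcp_len;     /* TCP header plus payload, in bytes */
    return csum_fold16(sum);
}
```

With the pseudo-header part precomputed, whoever finishes the checksum later only has to add the TCP header and payload.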
Yuval Mintz [Mon, 14 Jan 2013 05:11:48 +0000 (05:11 +0000)]
bnx2x: Clean previous IGU status before ack
When enabling interrupts, acknowledge the interrupt only
after configuring the IGU to the correct interrupt mode
(otherwise it would dirty selftests)
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 14 Jan 2013 05:11:47 +0000 (05:11 +0000)]
bnx2x: improve stop-on-error
Get better control over interrupts during panic, and allow FW to
test outgoing Tx packets when stop-on-error is allowed.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miriam Shitrit [Mon, 14 Jan 2013 05:11:46 +0000 (05:11 +0000)]
bnx2x: add `ethtool -w' support.
This revises and enhances the bnx2x register dump facilities,
adding support for `ethtool -w' on top of `ethtool -d'.
Signed-off-by: Miriam Shitrit <miris@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 14 Jan 2013 05:11:45 +0000 (05:11 +0000)]
bnx2x: Added nvram personalities support
When a device is configured to act as either iscsi or fcoe
device in its nvram, prevent the other from being misused by
preventing its activation in the driver.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 14 Jan 2013 05:11:44 +0000 (05:11 +0000)]
bnx2x: Fix rare self-test failures
On rare occasions, self test link may fail since the link is
being sampled while it's still being stabilized.
To correct this behaviour, try to sample the link for 2 seconds
prior to declaring a failure.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
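The retry approach in the self-test commit above amounts to sampling the link several times before declaring failure. A minimal sketch, assuming a caller-supplied probe callback: `wait_for_link`, `probe_countdown` and the attempt count are hypothetical, and the real driver's timing and link query are not shown in the source.

```c
typedef int (*link_probe_fn)(void *ctx);

/* Sample the link up to `attempts` times; return 1 as soon as it reports up. */
int wait_for_link(link_probe_fn probe, void *ctx, int attempts)
{
    for (int i = 0; i < attempts; i++) {
        if (probe(ctx))
            return 1;
        /* a sleep between samples would go here */
    }
    return 0;
}

/* Example probe: reports link up once the countdown in *ctx has reached zero. */
int probe_countdown(void *ctx)
{
    int *left = ctx;

    if (*left > 0) {
        (*left)--;
        return 0;
    }
    return 1;
}
```

A link that settles after a few samples passes, while one that never comes up still fails once the attempts run out.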
Dmitry Kravkov [Mon, 14 Jan 2013 05:11:43 +0000 (05:11 +0000)]
bnx2x: use SAN Mac for FCoE.
Current logic causes chips running in switch dependent multi-function
FCoE mode not to configure their MAC, leading to an all 0s MAC.
This patch configures the interface with the SAN Mac instead.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 14 Jan 2013 05:11:42 +0000 (05:11 +0000)]
bnx2x: Add an additional fatal hw assertion - BRB_HW_INTERRUPT
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 14 Jan 2013 05:11:41 +0000 (05:11 +0000)]
bnx2x: Clear dirty status when booting after UNDI
Self-tests following boot from SAN have failed as the
UNDI driver might leave some NIG interrupt indications.
This patch does the clean-up, clearing those indications
and allowing the test to pass.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jan 2013 20:05:59 +0000 (15:05 -0500)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
Documentation/networking/ip-sysctl.txt
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
Both conflicts were simply overlapping context.
A build fix for qlcnic is in here too, simply removing the added
devinit annotations which no longer exist.
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Nayak Sujir [Mon, 14 Jan 2013 17:11:00 +0000 (17:11 +0000)]
tg3: Fix crc errors on jumbo frame receive
TG3_PHY_AUXCTL_SMDSP_ENABLE/DISABLE macros do a blind write to the phy
auxiliary control register and overwrite the EXT_PKT_LEN (bit 14) resulting
in intermittent crc errors on jumbo frames with some link partners. Change
the code to do a read/modify/write.
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
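The difference between the blind write and the read/modify/write described in the tg3 commit above can be sketched on a plain 16-bit value. Bit 14 for EXT_PKT_LEN is from the commit message; the position chosen for `SMDSP_ENABLE_BIT` and the function names are hypothetical and used only for illustration.

```c
#include <stdint.h>

#define EXT_PKT_LEN_BIT  (1u << 14)   /* bit 14, per the commit message */
#define SMDSP_ENABLE_BIT (1u << 11)   /* hypothetical position, illustration only */

/* Blind write: the previous register contents are discarded. */
uint16_t smdsp_enable_blind(uint16_t current)
{
    (void)current;
    return SMDSP_ENABLE_BIT;
}

/* Read/modify/write: only the target bit changes. */
uint16_t smdsp_enable_rmw(uint16_t current)
{
    return current | SMDSP_ENABLE_BIT;
}

uint16_t smdsp_disable_rmw(uint16_t current)
{
    return current & (uint16_t)~SMDSP_ENABLE_BIT;
}
```

Starting from a register with bit 14 set, the blind variant clears it, while the read/modify/write variants leave it untouched.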
Nithin Nayak Sujir [Mon, 14 Jan 2013 17:10:59 +0000 (17:10 +0000)]
tg3: Avoid null pointer dereference in tg3_interrupt in netconsole mode
When netconsole is enabled, logging messages generated during tg3_open
can result in a null pointer dereference for the uninitialized tg3
status block. Use the irq_sync flag to disable polling in the early
stages. irq_sync is cleared when the driver is enabling interrupts after
all initialization is completed.
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Tue, 15 Jan 2013 02:17:25 +0000 (21:17 -0500)]
drivers/net: delete orphaned MCA ibmlana driver content
In commit a5e371f61ad33c07b28e7c9b60c78d71fdd34e2a ("drivers/net: delete
all code/drivers depending on CONFIG_MCA") most of the MCA drivers went,
including the Kconfig/Makefile hooks for ibmlana, but it seems that I
missed the "git rm" on these actual driver files, and with the namespace
overlap with machine check architecture, it got missed by various git
grep type checking done at that time.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 14 Jan 2013 23:26:41 +0000 (18:26 -0500)]
Merge branch 'master' of git://1984.lsi.us.es/nf
Pablo Neira Ayuso says:
====================
The following patchset contains netfilter fixes for 3.8-rc3,
they are:
* fix possible BUG_ON if several netns are in use and the nf_conntrack
module is removed, initial patch from Gao feng, final patch from myself.
* fix unset return value if conntrack zone are disabled at
compile-time, reported by Borislav Petkov, fix from myself.
* fix display error message via dmesg for arp_tables, from Jan Engelhardt.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Mon, 14 Jan 2013 07:12:19 +0000 (07:12 +0000)]
tun: fix LSM/SELinux labeling of tun/tap devices
This patch corrects some problems with LSM/SELinux that were introduced
with the multiqueue patchset. The problem stems from the fact that the
multiqueue work changed the relationship between the tun device and its
associated socket; before the socket persisted for the life of the
device, however after the multiqueue changes the socket only persisted
for the life of the userspace connection (fd open). For non-persistent
devices this is not an issue, but for persistent devices this can cause
the tun device to lose its SELinux label.
We correct this problem by adding an opaque LSM security blob to the
tun device struct which allows us to have the LSM security state, e.g.
SELinux labeling information, persist for the lifetime of the tun
device. In the process we tweak the LSM hooks to work with this new
approach to TUN device/socket labeling and introduce a new LSM hook,
security_tun_dev_attach_queue(), to approve requests to attach to a
TUN queue via TUNSETQUEUE.
The SELinux code has been adjusted to match the new LSM hooks, the
other LSMs do not make use of the LSM TUN controls. This patch makes
use of the recently added "tun_socket:attach_queue" permission to
restrict access to the TUNSETQUEUE operation. On older SELinux
policies which do not define the "tun_socket:attach_queue" permission
the access control decision for TUNSETQUEUE will be handled according
to the SELinux policy's unknown permission setting.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Mon, 14 Jan 2013 07:12:13 +0000 (07:12 +0000)]
selinux: add the "attach_queue" permission to the "tun_socket" class
Add a new permission to align with the new TUN multiqueue support,
"tun_socket:attach_queue".
The corresponding SELinux reference policy patch is shown below:
diff --git a/policy/flask/access_vectors b/policy/flask/access_vectors
index 28802c5..a0664a1 100644
--- a/policy/flask/access_vectors
+++ b/policy/flask/access_vectors
@@ -827,6 +827,9 @@ class kernel_service
class tun_socket
inherits socket
+{
+ attach_queue
+}
class x_pointer
inherits x_device
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Tested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 13 Jan 2013 18:21:51 +0000 (18:21 +0000)]
tcp: fix a panic on UP machines in reqsk_fastopen_remove
spin_is_locked() on a !SMP build is kind of useless.
BUG_ON(!spin_is_locked(xx)) is guaranteed to crash.
Just remove this check in reqsk_fastopen_remove() as
the callers do hold the socket lock.
Reported-by: Ketan Kulkarni <ketkulka@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Dave Taht <dave.taht@gmail.com>
Acked-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
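To see why the assertion removed by the tcp commit above always fires on uniprocessor builds, consider a lock query that has no lock word to inspect and therefore returns a constant. The sketch below models both cases in userspace. `struct sketch_lock` and the function names are hypothetical, and the constant-zero behaviour is an assumption about UP builds without lock debugging.

```c
struct sketch_lock {
    int locked;
};

/* SMP-style query: the lock word really tracks ownership. */
int is_locked_smp(const struct sketch_lock *l)
{
    return l->locked;
}

/* UP-style query (assumed, no lock debugging): nothing to inspect, so always 0. */
int is_locked_up(const struct sketch_lock *l)
{
    (void)l;
    return 0;
}

/* Would an assertion of the form BUG_ON(!is_locked(l)) fire? */
int assertion_fires(int (*is_locked)(const struct sketch_lock *),
                    const struct sketch_lock *l)
{
    return !is_locked(l);
}
```

Even when the caller does hold the lock, the UP-style query reports it as unlocked, so the check can only produce false alarms there.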
Linus Torvalds [Mon, 14 Jan 2013 21:19:08 +0000 (13:19 -0800)]
Merge tag 'dt-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux
Pull devicetree fixes from Rob Herring:
"Two fixes to prevent unconditional re-compile of dts files on arm and
arm64."
* tag 'dt-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux:
ARM: dts: prevent *.dtb from always being rebuilt
arm64: dts: prevent *.dtb from always being rebuilt
Linus Torvalds [Mon, 14 Jan 2013 21:17:50 +0000 (13:17 -0800)]
vfs: add missing virtual cache flush after editing partial pages
Andrew Morton pointed this out a month ago, and then I completely forgot
about it.
If we read a partial last page of a block device, we will zero out the
end of the page, but since that page can then be mapped into user space,
we should also make sure to flush the cache on architectures that have
virtual caches. We have the flush_dcache_page() function for this, so
use it.
Now, in practice this really never matters, because nobody sane uses
virtual caches to begin with, and they largely exist on old broken RISC
architectures.
And even if you did run on one of those obsolete CPU's, the whole "mmap
and access the last partial page of a block device" behavior probably
doesn't actually exist. The normal IO functions (read/write) will never
see the zeroed-out part of the page that might not be coherent in the
cache, because they honor the size of the device.
So I'm marking this for stable (3.7 only), but I'm not sure anybody will
ever care.
Pointed-out-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # 3.7
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Sun, 13 Jan 2013 07:46:34 +0000 (07:46 +0000)]
ifb: dont hard code inet_net use
ifb should lookup devices in the appropriate namespace.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 14 Jan 2013 00:52:52 +0000 (00:52 +0000)]
net: phy: remove flags argument from phy_{attach, connect, connect_direct}
The flags argument of the phy_{attach,connect,connect_direct} functions
is then used to assign a struct phy_device dev_flags with its value.
All callers but the tg3 driver pass the flag 0, which results in the
underlying PHY drivers in drivers/net/phy/ not being able to actually
use any of the flags they would set in dev_flags. This patch gets rid of
the flags argument, and passes phydev->dev_flags to the internal PHY
library call phy_attach_direct() such that drivers which actually modify
a phy device dev_flags get the value preserved for use by the underlying
phy driver.
Acked-by: Kosta Zertsekel <konszert@marvell.com>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Mon, 14 Jan 2013 05:15:39 +0000 (05:15 +0000)]
pkt_sched: namespace aware act_mirred
Eric Dumazet pointed out that act_mirred needs to find the current net_ns,
and struct net pointer is not provided in the call chain. His original
patch made use of current->nsproxy->net_ns to find the network namespace,
but this fails to work correctly for userspace code that makes use of
netlink sockets in different network namespaces. Instead, pass the
"struct net *" down along the call chain to where it is needed.
This version removes the ifb changes as Eric has submitted that patch
separately, but is otherwise identical to the previous version.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>