Pablo Neira Ayuso [Tue, 8 May 2012 17:36:44 +0000 (19:36 +0200)]
netfilter: bridge: optionally set indev to vlan
if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge
netfilter removes the vlan header temporarily and then feeds the packet
to ip(6)tables.
When the new "bridge-nf-pass-vlan-input-device" sysctl is on
(default off), then bridge netfilter will also set the
in-interface to the vlan interface, if such an interface exists.
This is needed to make iptables REDIRECT target work with
"vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to
match the vlan device name.
Also update Documentation with current brnf default settings.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Dumazet [Wed, 18 Apr 2012 04:36:40 +0000 (06:36 +0200)]
netfilter: nf_conntrack: use this_cpu_inc()
this_cpu_inc() is IRQ safe and faster than
local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Wed, 18 Apr 2012 09:20:41 +0000 (11:20 +0200)]
netfilter: nf_ct_helper: allow to disable automatic helper assignment
This patch allows you to disable automatic conntrack helper
lookup based on TCP/UDP ports, e.g.
echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper
[ Note: flows that already got a helper will keep using it even
if automatic helper assignment has been disabled ]
Once this behaviour has been disabled, you have to explicitly
use the iptables CT target to attach a helper to flows.
There are good reasons to stop supporting automatic helper
assignment, for further information, please read:
http://www.netfilter.org/news.html#2012-04-03
This patch also adds one message to inform that automatic helper
assignment is deprecated and will be removed soon (this is
printed only once, with the first flow that gets a helper attached,
to make it as unobtrusive as possible).
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tony Zelenoff [Thu, 8 Mar 2012 23:35:39 +0000 (23:35 +0000)]
netfilter: nf_ct_ecache: refactor notifier registration
* removed the useless initialization of the ret variable
* merged similar lines of code, so the functions' code
flow became plainer
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Steve Glendinning [Fri, 4 May 2012 00:57:13 +0000 (00:57 +0000)]
smsc75xx: let EEPROM determine GPIO/LED settings
This patch allows the GPIO/LED settings to be configured by the
EEPROM if present, and only sets the default values (LED outputs
for link/activity) when an EEPROM is not detected.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Glendinning [Fri, 4 May 2012 00:57:12 +0000 (00:57 +0000)]
smsc75xx: eliminate unnecessary phy register read
Only a write is necessary to clear the interrupt status, and we
don't use the value from the preceding read operation. This
patch eliminates the unnecessary read.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Glendinning [Fri, 4 May 2012 00:57:11 +0000 (00:57 +0000)]
smsc75xx: replace 0xffff with PHY_INT_SRC_CLEAR_ALL
This patch defines PHY_INT_SRC_CLEAR_ALL to replace the value 0xffff
in order to be more self-documenting.
This patch should make no functional change, it is purely cosmetic.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 May 2012 03:35:40 +0000 (23:35 -0400)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 8 May 2012 03:05:13 +0000 (23:05 -0400)]
Merge branch 'vhost-net-next' of git://git./linux/kernel/git/mst/vhost
Michael S. Tsirkin says:
--------------------
There are mostly bugfixes here.
I hope to merge some more patches by 3.5, in particular
vlan support fixes are waiting for Eric's ack,
and a version of tracepoint patch might be
ready in time, but let's merge what's ready so it's testable.
This includes a ton of zerocopy fixes by Jason -
good stuff but too intrusive for 3.4 and zerocopy is experimental
anyway.
virtio has supported delayed interrupts for a while now,
so adding support to the virtio tool made sense
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 3 May 2012 22:37:45 +0000 (22:37 +0000)]
net: IP_MULTICAST_IF setsockopt now recognizes struct mreq
Until now, struct mreq was not recognized and its contents were
treated as a struct in_addr. That means imr_multiaddr was copied to
imr_address. Recognize struct mreq here and copy the fields correctly.
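The length-based decoding described above can be sketched in userspace C. This is a simplified model, not the kernel code: the `model_*` structures and `decode_multicast_if()` are illustrative names, and the field layout is an assumption chosen so that the three argument forms have distinct sizes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for the socket option structures (illustrative names). */
struct model_in_addr { uint32_t s_addr; };

struct model_ip_mreq {                 /* two-address form */
    struct model_in_addr imr_multiaddr;
    struct model_in_addr imr_interface;
};

struct model_ip_mreqn {                /* form the option is decoded into */
    struct model_in_addr imr_multiaddr;
    struct model_in_addr imr_address;
    int imr_ifindex;
};

/* Decode an option argument by its length: 0 on success, -1 if too short. */
static inline int decode_multicast_if(const void *optval, size_t optlen,
                                      struct model_ip_mreqn *out)
{
    assert(optval != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (optlen >= sizeof(struct model_ip_mreqn)) {
        memcpy(out, optval, sizeof(*out));
    } else if (optlen >= sizeof(struct model_ip_mreq)) {
        struct model_ip_mreq two;

        memcpy(&two, optval, sizeof(two));
        out->imr_multiaddr = two.imr_multiaddr;
        /* The interface address comes from the second field,
         * not from imr_multiaddr. */
        out->imr_address = two.imr_interface;
    } else if (optlen >= sizeof(struct model_in_addr)) {
        memcpy(&out->imr_address, optval, sizeof(struct model_in_addr));
    } else {
        return -1;
    }
    return 0;
}
```

Treating the two-address form as a bare address would instead place the first field (the multicast group) into imr_address, which is the behaviour the commit message describes as wrong.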
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 2 May 2012 15:16:39 +0000 (15:16 +0000)]
netdev/of/phy: Add MDIO bus multiplexer driven by GPIO lines.
The GPIO pins select which sub bus is connected to the master.
Initially tested with an sn74cbtlv3253 switch device wired into the
MDIO bus.
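As a rough illustration of GPIO-driven selection, the sketch below computes the level of each select line for a requested sub bus. It assumes the sub bus index is binary-encoded across the lines (line i carries bit i); that encoding, the `MUX_MAX_GPIOS` limit, and the function name are assumptions made for this example, not details taken from the driver.

```c
#include <assert.h>

#define MUX_MAX_GPIOS 8   /* illustrative limit, not taken from the driver */

/* Fill levels[i] with the value select line i must drive so that
 * sub bus `child` is connected to the master (assumed binary encoding). */
static inline void mux_gpio_levels(unsigned int child, unsigned int num_gpios,
                                   int levels[MUX_MAX_GPIOS])
{
    unsigned int i;

    assert(num_gpios <= MUX_MAX_GPIOS);
    for (i = 0; i < num_gpios; i++)
        levels[i] = (child >> i) & 1;
}
```

With two select lines, for example, sub bus 2 would map to line 0 low and line 1 high under this assumption.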
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 2 May 2012 15:16:38 +0000 (15:16 +0000)]
netdev/of/phy: Add MDIO bus multiplexer support.
This patch adds a somewhat generic framework for MDIO bus
multiplexers. It is modeled on the I2C multiplexer.
The multiplexer is needed if there are multiple PHYs with the same
address connected to the same MDIO bus adapter, or if there is
insufficient electrical drive capability for all the connected PHY
devices.
Conceptually it could look something like this:
                   ------------------
                   | Control Signal |
                   --------+---------
                           |
    ---------------   --------+------
    | MDIO MASTER |---| Multiplexer |
    ---------------   --+-------+----
                        |       |
                        C       C
                        h       h
                        i       i
                        l       l
                        d       d
                        |       |
        ---------       A       B   ---------
        |       |       |       |   |       |
        | PHY@1 +-------+       +---+ PHY@1 |
        |       |       |       |   |       |
        ---------       |       |   ---------
        ---------       |       |   ---------
        |       |       |       |   |       |
        | PHY@2 +-------+       +---+ PHY@2 |
        |       |                   |       |
        ---------                   ---------
This framework configures the bus topology from device tree data. The
mechanics of switching the multiplexer is left to device specific
drivers.
The follow-on patch contains a multiplexer driven by GPIO lines.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 2 May 2012 15:16:37 +0000 (15:16 +0000)]
netdev/of/phy: New function: of_mdio_find_bus().
Add of_mdio_find_bus() which allows an mii_bus to be located given its
associated device tree node.
This is needed by the follow-on patch to add a driver for MDIO bus
multiplexers.
The of_mdiobus_register() function is modified so that the device tree
node is recorded in the mii_bus. Then we can find it again by
iterating over all mdio_bus_class devices.
Because the OF device tree has now become an integral part of the
kernel, this can live in mdio_bus.c (which contains the needed
mdio_bus_class structure) instead of of_mdio.c.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/capi: eliminate capincci_find() in non-middleware case
If Kernel CAPI is compiled without CONFIG_ISDN_CAPI_MIDDLEWARE,
the structure retrieved via capincci_find() is never actually
used, so don't compile that function in that case.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/capi: fix readability damage
Fix up some of the readability deterioration caused by the recent
whitespace coding style cleanup.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: unify function return values
Various functions in the Gigaset driver were using different
conventions for the meaning of their int return values.
Align them to the usual negative error numbers convention.
Inspired-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: internal function name cleanup
Functions clear_at_state and free_strings did the same thing;
drop one of them, keeping the more descriptive name.
Drop a redundant call.
Rename function dealloc_at_states to dealloc_temp_at_states
to clarify its purpose.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: fix readability damage
Fix up some of the readability deterioration caused by the recent
whitespace coding style cleanup.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: improve error handling querying firmware version
An out-of-place "OK" response to the "AT+GMR" (get firmware version)
command turns out to be, more often than not, a delayed response to
a previous command rather than an actual error, so continue waiting
for the version number in that case.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: fix CAPI disconnect B3 handling
If DISCONNECT_B3_IND was synthesized because of a DISCONNECT_REQ
with existing logical connections, the connection state wasn't
updated accordingly. Also the emitted DISCONNECT_B3_IND message
wasn't included in the debug log as requested.
This patch fixes both of these issues.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 25 Apr 2012 13:02:19 +0000 (13:02 +0000)]
isdn/gigaset: ratelimit CAPI message dumps
Introduce a global ratelimit for CAPI message dumps to protect
against possible log flood.
Drop the ratelimit for ignored messages which is now covered by the
global one.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 7 May 2012 13:39:06 +0000 (15:39 +0200)]
net: compare_ether_addr[_64bits]() has no ordering
Neither compare_ether_addr() nor compare_ether_addr_64bits()
(as it can fall back to the former) have comparison semantics
like memcmp() where the sign of the return value indicates sort
order. We had a bug in the wireless code due to a blind memcmp
replacement because of this.
A cursory look suggests that the wireless bug was the only one
due to this semantic difference.
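The semantic difference can be seen in a small userspace model. The function below is a sketch of an equality-only comparison in the style described above (it ORs together the XOR of the address halves); the name `model_compare_ether_addr()` is made up for this example and the body is not copied from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Returns 0 if the two addresses are equal and 1 otherwise.
 * The result says nothing about which address sorts first. */
static inline unsigned int model_compare_ether_addr(const uint8_t *addr1,
                                                    const uint8_t *addr2)
{
    uint16_t a[3], b[3];

    assert(addr1 != NULL && addr2 != NULL);
    memcpy(a, addr1, ETH_ALEN);   /* memcpy avoids unaligned 16-bit loads */
    memcpy(b, addr2, ETH_ALEN);
    return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
}
```

For two different addresses the model returns 1 regardless of argument order, whereas memcmp() returns values of opposite sign when its arguments are swapped. Code that sorts or searches by address therefore cannot substitute one for the other.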
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 May 2012 15:47:51 +0000 (11:47 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Alexander Duyck [Fri, 4 May 2012 14:26:56 +0000 (14:26 +0000)]
skb: Add inline helper for getting the skb end offset from head
With the recent changes for how we compute the skb truesize it occurs to me
we are probably going to have a lot of calls to skb_end_pointer -
skb->head. Instead of running all over the place doing that it would make
more sense to just make it a separate inline skb_end_offset(skb); that way
we can return the correct value without gcc having to do all the
optimization to cancel out skb->head - skb->head.
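A minimal userspace sketch of the idea follows. The two `model_skb*` structures stand in for the two ways a buffer's end can be recorded (as a pointer, or as an offset from head); they are simplified assumptions for illustration and not the real sk_buff layout.

```c
#include <assert.h>
#include <stddef.h>

/* Variant 1: the end of the buffer is stored as a pointer. */
struct model_skb_ptr {
    unsigned char *head;
    unsigned char *end;
};

static inline unsigned int model_ptr_end_offset(const struct model_skb_ptr *skb)
{
    assert(skb->end >= skb->head);
    return (unsigned int)(skb->end - skb->head);
}

/* Variant 2: the end of the buffer is stored as an offset from head. */
struct model_skb_off {
    unsigned char *head;
    unsigned int end;
};

static inline unsigned char *model_off_end_pointer(const struct model_skb_off *skb)
{
    return skb->head + skb->end;
}

static inline unsigned int model_off_end_offset(const struct model_skb_off *skb)
{
    /* Equivalent to model_off_end_pointer(skb) - skb->head, but without
     * adding head only to subtract it again. */
    return skb->end;
}
```

In the offset variant, callers that wrote "end pointer minus head" were relying on the compiler to simplify head + end - head; a dedicated helper returns the stored value directly.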
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 4 May 2012 14:26:51 +0000 (14:26 +0000)]
skb: Drop "fastpath" variable for skb_cloned check in pskb_expand_head
Since there is now only one spot that actually uses "fastpath" there isn't
much point in carrying it. Instead we can just use a check for skb_cloned
to verify if we can perform the fast-path free for the head or not.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 4 May 2012 14:26:46 +0000 (14:26 +0000)]
skb: Drop bad code from pskb_expand_head
The fast-path for pskb_expand_head contains a check where the size plus the
unaligned size of skb_shared_info is compared against the size of the data
buffer. This code path has two issues. First is the fact that after the
recent changes by Eric Dumazet to __alloc_skb and build_skb the shared info
is always placed in the optimal spot for a buffer size making this check
unnecessary. The second issue is the fact that the check doesn't take into
account the aligned size of shared info. As a result the code burns cycles
doing a memcpy with nothing actually being shifted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 26 Apr 2012 02:35:10 +0000 (02:35 +0000)]
cdc_ether: Ignore bogus union descriptor for RNDIS devices
Some RNDIS devices include a bogus CDC Union descriptor pointing
to non-existing interfaces. The RNDIS code is already prepared
to handle devices without a CDC Union descriptor by hardwiring
the driver to use interfaces 0 and 1, which is correct for the
devices with the bogus descriptor as well. So we can reuse the
existing workaround.
Cc: Markus Kolb <linux-201011@tower-net.de>
Cc: Iker Salmón San Millán <shaola@esdebian.org>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: 655387@bugs.debian.org
Cc: stable@vger.kernel.org
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Sun, 6 May 2012 07:05:57 +0000 (07:05 +0000)]
bnx2x: bug fix when loading after SAN boot
This is a bug fix for an "interface fails to load" issue.
The issue occurs when bnx2x driver loads after UNDI driver was previously
loaded over the chip. In such a scenario the UNDI driver is loaded and operates
in the pre-boot kernel, within its own specific host memory address range.
When the pre-boot stage is complete, the real kernel is loaded, in a new and
distinct host memory address range. The transition from pre-boot stage to boot
is asynchronous from UNDI point of view.
A race condition occurs when UNDI driver triggers a DMAE transaction to valid
host addresses in the pre-boot stage, when control is diverted to the real
kernel. This results in access to illegal addresses by our HW as the addresses
which were valid in the preboot stage are no longer considered valid.
Specifically, the 'was_error' bit in the pci glue of our device is set. This
causes all following pci transactions from chip to host to timeout (in
accordance with the pci spec).
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Mon, 23 Apr 2012 22:27:28 +0000 (22:27 +0000)]
ixgbe: dcb: IEEE PFC stats and reset logic incorrect
PFC stats are only tabulated when PFC is enabled. However in IEEE
mode the ieee_pfc pfc_tc bits were not checked and the calculation
was aborted.
This results in statistics not being reported through ethtool and
possibly a false Tx hang occurring when receiving pause frames.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 May 2012 08:52:03 +0000 (08:52 +0000)]
e1000e: increase version number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Alpe [Fri, 20 Apr 2012 15:24:50 +0000 (15:24 +0000)]
e1000e: clear REQ and GNT in EECD (82571 && 82572)
Clear the REQ and GNT bit in the eeprom control register (EECD).
This is required if the eeprom is to be accessed with auto read
EERD register.
After a cold reset this doesn't matter but if PBIST MAC test was
executed before booting, the register was left in a dirty state
(the 2 bits were set), which caused the read operation to time out
and return 0.
Reference (page 312):
http://download.intel.com/design/network/manuals/316080.pdf
Reported-by: Aleksandar Igic <aleksandar.igic@dektech.com.au>
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:47:41 +0000 (03:47 +0000)]
e1000e: enable forced master/slave on 82577
Like other supported (igp) PHYs, the driver needs to be able to force the
master/slave mode on 82577. Since the code is the same as what already
exists in the code flow for igp PHYs, move it to a new function to be
called for both flows.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Fri, 4 May 2012 16:07:15 +0000 (12:07 -0400)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless
Eric Dumazet [Fri, 4 May 2012 05:14:02 +0000 (05:14 +0000)]
tcp: be more strict before accepting ECN negotiation
It appears some networks play bad games with the two bits reserved for
ECN. This can trigger false congestion notifications and very slow
transfers.
Since RFC 3168 (6.1.1) forbids SYN packets to carry CT bits, we can
disable TCP ECN negotiation if it happens we receive mangled CT bits in
the SYN packet.
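The check can be sketched as a small predicate. This is an illustrative model, not the kernel function: `syn_ecn_ok()` and `ECN_MASK` are names invented here, and the model assumes the two ECN bits are the low two bits of the IP TOS / traffic class byte and that an ECN-setup SYN carries both the ECE and CWR TCP flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECN_MASK 0x03u   /* low two bits of the TOS / traffic class byte */

/* Decide whether to accept ECN for a connection based on its SYN.
 * dsfield: TOS / traffic class byte of the SYN's IP header
 * ece/cwr: TCP header flags of the SYN */
static inline bool syn_ecn_ok(uint8_t dsfield, bool ece, bool cwr)
{
    /* The peer did not ask for ECN at all. */
    if (!(ece && cwr))
        return false;

    /* A SYN must not carry ECN bits in the IP header; if it does,
     * assume something on the path mangles them and refuse ECN. */
    return (dsfield & ECN_MASK) == 0;
}
```

Under this model a SYN with ECE and CWR set and a clean IP header negotiates ECN, while the same SYN arriving with either ECN bit set falls back to a non-ECN connection.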
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Perry Lorier <perryl@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Wilmer van der Gaast <wilmer@google.com>
Cc: Ankur Jain <jankur@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Dave Täht <dave.taht@bufferbloat.net>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:35 +0000 (04:15 +0000)]
mISDN: Help to identify the card
With multiple cards it is hard to figure out which port caused trouble
in the layer2 routines (e.g. got a timeout).
Now we have the information in the log output.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:34 +0000 (04:15 +0000)]
mISDN: Layer1 statemachine fix
The timer3 and the activation delay timer need to be independent.
If timer3 fires, do not request power up; we have to send only INFO 0.
Now layer1 passes TBR3 again.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:33 +0000 (04:15 +0000)]
mISDN: Make layer1 timer 3 value configurable
For certification test it is very useful to change the layer1
timer3 value at runtime.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:32 +0000 (04:15 +0000)]
mISDN: L2 timeouts need to be queued as L2 event
To be fully preemption safe, we cannot handle an L2 timeout in the timer
context itself; we should do all actions via the D-channel thread.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Fri, 4 May 2012 04:15:31 +0000 (04:15 +0000)]
mISDN: Fix refcounting bug
Under some configs it was still not possible to unload the driver,
because the module use count was screwed up.
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Fri, 4 May 2012 04:15:30 +0000 (04:15 +0000)]
mISDN: Added PH_* state info to tei manager.
Tei manager reports current layer 1 state on creation.
On state change it reports it to the socket interface.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 May 2012 04:37:21 +0000 (04:37 +0000)]
net: sched: factorize code (qdisc_drop())
Use qdisc_drop() helper where possible.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 4 May 2012 14:55:50 +0000 (10:55 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net
Andrei Emeltchenko [Sun, 25 Mar 2012 17:49:25 +0000 (17:49 +0000)]
e1000: Silence sparse warnings by correcting type
Silence sparse warnings shown below:
...
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
...
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Mon, 23 Apr 2012 12:22:39 +0000 (12:22 +0000)]
igb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path
igb and ixgbe incorrectly call netdev_tx_reset_queue() from
i{gb|xgbe}_clean_tx_ring(). This sort of works in most cases except
when the number of real tx queues changes. When the number of real
tx queues changes netdev_tx_reset_queue() only gets called on the
new number of queues so when we reduce the number of queues we risk
triggering the watchdog timer and repeated device resets.
So this is not only a cosmetic issue but causes real bugs. For
example enabling/disabling DCB or FCoE in ixgbe will trigger this.
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: John Bishop <johnx.bishop@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 19 Apr 2012 17:48:48 +0000 (17:48 +0000)]
ixgbe: Update link flow control to correctly handle multiple packet buffer DCB
This change updates the link flow control configuration so that we
correctly set the link flow control settings for DCB. Previously we would
have to call the fc_enable call 8 times, once for each packet buffer. If
we move that logic into the fc_enable call itself we can avoid multiple
unnecessary register writes.
This change also corrects an issue in which we were only shifting the water
marks for 82599 parts by 6 instead of 10. This was resulting in us only
using 1/16 of the packet buffer when flow control was enabled.
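The 1/16 figure follows from the difference between the two shifts: 2^10 / 2^6 = 16. The sketch below works this through, assuming the water mark is held in kilobytes and converted to bytes by the shift; that unit interpretation and the helper name are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a water mark (assumed to be in KB) by a left shift.
 * A shift of 10 converts KB to bytes; a shift of 6 yields a value
 * that is 16 times smaller. */
static inline uint32_t water_mark_scaled(uint32_t kb, unsigned int shift)
{
    assert(shift < 32);
    return kb << shift;
}
```

For a 512 KB water mark, a shift of 10 gives 524288 while a shift of 6 gives 32768, so the threshold would be reached after only 1/16 of the intended amount of buffered data.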
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 19 Apr 2012 17:49:56 +0000 (17:49 +0000)]
ixgbe: Reorder link flow control functions in ixgbe_common.c
We can avoid many of the forward declarations found in ixgbe_common.c by
just reordering things, so this patch does that to help clean up the code.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 6 Apr 2012 04:24:50 +0000 (04:24 +0000)]
ixgbe: Use __free_pages instead of put_page to release pages
This change replaces the calls to put_page with calls to __free_pages.
Since the FCoE code is able to access order 1 pages I thought it would be a
good idea to change things over to using __free_pages since that is the
preferred approach for freeing pages.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:48 +0000 (08:03 +0000)]
ixgbe: Make ixgbe_fc_autoneg return void and always set current_mode
This change makes it so that ixgbe_fc_autoneg is a void and always sets the
current_mode. Previously if the link was down we would return an error,
however there is no harm in simply treating a link down case as a case in
which autoneg simply failed. This allows us to rely on the return value of
the ixgbe_fc_enable call now since there should be no cases where it
returns an error that would normally be ignored.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:43 +0000 (08:03 +0000)]
ixgbe: Reorder the ring to q_vector mapping to improve performance
This change reorders the mapping of rings to q_vectors in the case that the
number of rings exceeds the number of q_vectors. Previously we would
allocate the first R/N queues to the first q_vector where R is the number
of rings and N is the number of q_vectors. Instead of doing this we can do
a better job of interleaving the rings to the CPUs by assigning every Nth
ring to the q_vector.
The below tables illustrate this change for the R = 16 N = 4 case.
            Before patch          After patch
q_vector:   0   1   2   3         0   1   2   3
Rings:      0   4   8  12         0   1   2   3
            1   5   9  13         4   5   6   7
            2   6  10  14         8   9  10  11
            3   7  11  15        12  13  14  15
This should improve the performance for both DCB or ATR when the number of
rings exceeds the number of q_vectors allocated by the adapter.
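The two mappings in the tables above can be written as simple index calculations. This is a sketch under the assumption that R is a multiple of N; the function names are invented for illustration and are not the driver's.

```c
#include <assert.h>

/* Before: the first R/N rings go to q_vector 0, the next R/N to
 * q_vector 1, and so on (contiguous blocks). */
static inline unsigned int vector_before(unsigned int ring, unsigned int rings,
                                         unsigned int vectors)
{
    assert(vectors > 0 && ring < rings && rings % vectors == 0);
    return ring / (rings / vectors);
}

/* After: every Nth ring goes to the same q_vector (interleaved). */
static inline unsigned int vector_after(unsigned int ring, unsigned int rings,
                                        unsigned int vectors)
{
    assert(vectors > 0 && ring < rings);
    return ring % vectors;
}
```

For R = 16 and N = 4, rings 0 to 3 all land on q_vector 0 with the old mapping, while the new mapping spreads them across q_vectors 0 to 3.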
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 28 Mar 2012 08:03:32 +0000 (08:03 +0000)]
ixgbe: Track instances of buffer available but no DMA resources present
This change makes it so that we can track instances of where a packet was
dropped due to a packet being received when there are no DMA buffers
available in the ring.
For some reason this was only being enabled with RSC, however it makes
more sense to always have this feature on so that we can track any cases
where we might drop a buffer due to an Rx ring being full.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 19 Apr 2012 03:21:47 +0000 (03:21 +0000)]
e1000e: initial support for i217
i217 is the next-generation LOM that will be available on systems with the
Lynx Point Platform Controller Hub (PCH) chipset from Intel. This patch
provides the initial support for the device.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 04:45:57 +0000 (04:45 +0000)]
e1000e: Update driver version number
Version bump to 1.11.3-k.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Linus Torvalds [Fri, 4 May 2012 00:21:05 +0000 (17:21 -0700)]
Merge tag 'mfd-for-linus-3.4-rc6' of git://git./linux/kernel/git/sameo/mfd-2.6
Pull second set of MFD fixes from Samuel Ortiz:
"This time we only have a one liner fixing an omap-usb build error."
* tag 'mfd-for-linus-3.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Fix build breakage in omap-usb-host.c
Linus Torvalds [Fri, 4 May 2012 00:19:48 +0000 (17:19 -0700)]
Merge branch 'efi-vars' from Matthew Garrett
* efi-vars:
efivars: Improve variable validation
Matthew Garrett [Thu, 3 May 2012 20:50:46 +0000 (16:50 -0400)]
efivars: Improve variable validation
Ben Hutchings pointed out that the validation in efivars was inadequate -
most obviously, an entry with size 0 would serve as a DoS against the
kernel. Improve this based on his suggestions.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 4 May 2012 00:16:52 +0000 (17:16 -0700)]
Merge tag 'tag/upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
Pull libata fixes from Jeff Garzik:
1) Fix regression that could cause a misdiagnosis, which in turn could
lead to an erroneous 3.0 Gbps -> 1.5 downshift, particularly when hotplug
and suspend/resume is involved.
2) Fix a regression that led to ata%d controller ids being numbered one
larger than in <= 3.4-rc3 (oh, the horror!). Controller ids should now be
as expected.
3) add some DT, PCI id's
4) ata/pata_arasan_cf: minor cpp fixing/cleaning
* tag 'tag/upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata: ahci_platform: Add synopsys ahci controller in DT's compatible list
ata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros
libata: init ata_print_id to 0
ahci: Detect Marvell 88SE9172 SATA controller
libata: skip old error history when counting probe trials
Linus Torvalds [Fri, 4 May 2012 00:15:47 +0000 (17:15 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
"Here are some typical i2c driver bugfixes for 3.4. Missed clock
handling, improper timeout fixes, hardware workarounds... All
patches have been in linux-next for a few days, too."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: mxs: disable QUEUE when sending is done
i2c: mxs: handle spurious interrupt
i2c-eg20t: Modify MODULE_AUTHOR's email address
i2c-eg20t: change timeout value 50msec to 1000msec
i2c: tegra: Add delay before resetting the controller after NACK
i2c: pnx: Disable clk in suspend
Linus Torvalds [Fri, 4 May 2012 00:14:55 +0000 (17:14 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Just some regression fixes from Ben along with a variable that gcc
failed to spot is uninitialised."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
nouveau: initialise has_optimus variable.
drm/nv10/gpio: fix thinko in mask for gpio lines 2-9
nvc0/fb: shut up PMFB interrupt after the first occurrence
drm/nouveau/hdmi: use correct hdmi regs for nvaa/nvac
drm/nouveau/bios: fix regression on some nv4x board
Linus Torvalds [Fri, 4 May 2012 00:10:39 +0000 (17:10 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Transfer padding was wrong for full-speed USB in ASIX driver, fix
from Ingo van Lil.
2) Propagate the negative packet offset fix into the PowerPC BPF JIT.
From Jan Seiffert.
3) dl2k driver's private ioctls were letting unprivileged tasks make
MII writes and other ugly bits like that. Fix from Jeff Mahoney.
4) Fix TX VLAN and RX packet drops in ucc_geth, from Joakim Tjernlund.
5) OOPS and network namespace fixes in IPVS from Hans Schillstrom and
Julian Anastasov.
6) Fix races and sleeping in locked context bugs in drop_monitor, from
Neil Horman.
7) Fix link status indication in smsc95xx driver, from Paolo Pisati.
8) Fix bridge netfilter OOPS, from Peter Huang.
9) L2TP sendmsg can return on error conditions with the socket lock
held, oops. Fix from Sasha Levin.
10) udp_diag should return meaningful values for socket memory usage,
from Shan Wei.
11) Eric Dumazet is so awesome he gets his own section:
Socket memory cgroup code (I never should have applied those
patches, grumble...) made erroneous changes to
sk_sockets_allocated_read_positive(). It was changed to
use percpu_counter_sum_positive (which requires BH disabling)
instead of percpu_counter_read_positive (which does not).
Revert back to avoid crashes and lockdep warnings.
Adjust the default tcp_adv_win_scale and tcp_rmem[2] values
to fix throughput regressions. This is necessary as a result
of our more precise skb->truesize tracking.
Fix SKB leak in netem packet scheduler.
12) New device IDs for various bluetooth devices, from Manoj Iyer,
AceLan Kao, and Steven Harms.
13) Fix command completion race in ipw2200, from Stanislav Yakovlev.
14) Fix rtlwifi oops on unload, from Larry Finger.
15) Fix hard_mtu when adjusting hard_header_len in smsc95xx driver.
From Stephane Fillod.
16) ehea driver registers its IRQ before all the necessary state is
set up, resulting in crashes. Fix from Thadeu Lima de Souza
Cascardo.
17) Fix PHY connection failures in davinci_emac driver, from Anatolij
Gustschin.
18) Missing break; in switch statement in bluetooth's
hci_cmd_complete_evt(). Fix from Szymon Janc.
19) Fix queue programming in iwlwifi, from Johannes Berg.
20) Interrupt throttling defaults not being actually programmed into the
hardware, fix from Jeff Kirsher and Ying Cai.
21) TLAN driver SKB encoding in descriptor busted on 64-bit, fix from
Benjamin Poirier.
22) Fix blind status block RX producer pointer deref in TG3 driver, from
Matt Carlson.
23) Promisc and multicast are busted on ehea, fixes from Thadeu Lima de
Souza Cascardo.
24) Fix crashes in 6lowpan, from Alexander Smirnov.
25) tcp_complete_cwr() needs to be careful to not rewind the CWND to
ssthresh if ssthresh has the "infinite" value. Fix from Yuchung
Cheng.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
sungem: Fix WakeOnLan
tcp: change tcp_adv_win_scale and tcp_rmem[2]
net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
drop_monitor: prevent init path from scheduling on the wrong cpu
usbnet: fix failure handling in usbnet_probe
usbnet: fix leak of transfer buffer of dev->interrupt
ucc_geth: Add 16 bytes to max TX frame for VLANs
net: ucc_geth, increase no. of HW RX descriptors
netem: fix possible skb leak
sky2: fix receive length error in mixed non-VLAN/VLAN traffic
sky2: propogate rx hash when packet is copied
net: fix two typos in skbuff.h
cxgb3: Don't call cxgb_vlan_mode until q locks are initialized
ixgbe: fix calling skb_put on nonlinear skb assertion bug
ixgbe: Fix a memory leak in IEEE DCB
igbvf: fix the bug when initializing the igbvf
smsc75xx: enable mac to detect speed/duplex from phy
smsc75xx: declare smsc75xx's MII as GMII capable
smsc75xx: fix phy interrupt acknowledge
smsc75xx: fix phy init reset loop
...
Linus Torvalds [Fri, 4 May 2012 00:08:58 +0000 (17:08 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Fix OOPS seen in coretemp driver if the CPU core ID is too large"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Increase CPU core limit
hwmon: (coretemp) fix oops on cpu unplug
Sebastian Andrzej Siewior [Thu, 3 May 2012 18:22:00 +0000 (20:22 +0200)]
net/niu: remove one superfluous dma mask check
The idea here seems to be to get a 44bit DMA mask working and if this
fails it should fallback to a 32bit DMA mask. The dma_mask variable is
assigned once to 44bit and never updated. pci_set_dma_mask() and
pci_set_consistent_dma_mask() are both implemented as functions so there
is no evil macro which might update dma_mask. Looking at the assembly, I
see a call to dma_set_mask() followed by dma_supported() and then a jump
past the second dma_set_mask(). The only way to get to the second
dma_set_mask() call is by an error code in the first one.
So I hereby remove the check since it looks superfluous. Please ignore
the patch if there is black magic involved.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Viresh Kumar [Sat, 21 Apr 2012 12:10:12 +0000 (17:40 +0530)]
ata: ahci_platform: Add synopsys ahci controller in DT's compatible list
SPEAr13xx series of SoCs contain Synopsys AHCI SATA Controller which shares
ahci_platform driver with other controller versions.
This patch updates DT compatible list for ahci_platform. It also updates and
renames binding documentation to more generic name.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Viresh Kumar [Sat, 21 Apr 2012 12:10:09 +0000 (17:40 +0530)]
ata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros
#ifdef, #endif is not required in definition/usage of arasan_cf_pm_ops. So, move
this definition and its usage outside of them.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tero Roponen [Sun, 22 Apr 2012 08:38:00 +0000 (11:38 +0300)]
libata: init ata_print_id to 0
When comparing the dmesg between 3.4-rc3 and 3.4-rc4 I found the
following differences:
-ata1: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
-ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
-ata3: DUMMY
+ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
+ata3: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
ata4: DUMMY
ata5: DUMMY
-ata6: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47
+ata6: DUMMY
+ata7: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47
The change of numbering comes from commit 85d6725b7c0d7e3f ("libata: make
ata_print_id atomic") that changed lines like
ap->print_id = ata_print_id++;
to
ap->print_id = atomic_inc_return(&ata_print_id);
As the latter behaves like ++ata_print_id, we must initialize
it to zero to start the numbering from one.
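The off-by-one described above can be reproduced in userspace. The following is only a sketch: inc_return() is a hypothetical stand-in for the kernel's atomic_inc_return(), built on C11 atomics, and first_id_old()/first_id_new() are illustrative helpers that do not exist in libata.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for atomic_inc_return():
 * increments the counter and returns the NEW value, like ++x. */
int inc_return(atomic_int *v)
{
    return atomic_fetch_add(v, 1) + 1;
}

/* Old behaviour: post-increment hands out the OLD value, like x++. */
int post_inc(int *v)
{
    return (*v)++;
}

/* First id handed out by the old code for a given initial counter value. */
int first_id_old(int start)
{
    int counter = start;
    return post_inc(&counter);
}

/* First id handed out by the atomic code for a given initial counter value. */
int first_id_new(int start)
{
    atomic_int counter;
    atomic_init(&counter, start);
    return inc_return(&counter);
}
```

With the counter starting at 1, the old code hands out 1 first while the atomic version hands out 2, which matches the shifted ataN numbers in the dmesg diff. Starting the atomic counter at 0 restores numbering from 1.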
Signed-off-by: Tero Roponen <tero.roponen@gmail.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Matt Johnson [Fri, 27 Apr 2012 06:42:30 +0000 (01:42 -0500)]
ahci: Detect Marvell 88SE9172 SATA controller
The Marvell 88SE9172 SATA controller (PCI ID 1b4b 917a) already worked
once it was detected, but was missing an ahci_pci_tbl entry.
Boot tested on a Gigabyte Z68X-UD3H-B3 motherboard.
Signed-off-by: Matt Johnson <johnso87@illinois.edu>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Lin Ming [Thu, 3 May 2012 14:15:07 +0000 (22:15 +0800)]
libata: skip old error history when counting probe trials
Commit d902747 ("[libata] Add ATA transport class") introduced
ATA_EFLAG_OLD_ER to mark entries in the error ring as cleared.
But ata_count_probe_trials_cb() didn't check this flag, so it still
counted the old error history. As a result, a wrong probe trial count was
returned, which causes problems; for example, the SATA link speed is
slowed down from 3.0Gbps to 1.5Gbps.
Fix it by checking ATA_EFLAG_OLD_ER in ata_count_probe_trials_cb().
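As a rough illustration of the idea, here is a simplified userspace model that ignores entries flagged as old when counting. The struct, the flag value and the function name are assumptions made for this sketch; the real callback in libata may be structured differently.

```c
/* Hypothetical flag bit and entry layout, for illustration only. */
#define EFLAG_OLD_ER (1u << 31)

struct err_ent {
    unsigned int eflags;
};

/* Count only the entries that have not been marked as cleared. */
int count_probe_trials(const struct err_ent *ring, int n)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (ring[i].eflags & EFLAG_OLD_ER)
            continue;   /* old history must not count as a trial */
        count++;
    }
    return count;
}
```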
Cc: stable <stable@vger.kernel.org> # 2.6.37+
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
David S. Miller [Thu, 3 May 2012 17:30:11 +0000 (13:30 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Alexander Duyck [Thu, 3 May 2012 01:09:42 +0000 (01:09 +0000)]
skb: Add skb_head_is_locked helper function
This patch adds support for a skb_head_is_locked helper function. It is
meant to be used any time we are considering transferring the head from
skb->head to a paged frag. If the head is locked it means we cannot remove
the head from the skb so it must be copied or we must take the skb as a
whole.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 2 May 2012 23:33:21 +0000 (23:33 +0000)]
net: Fix truesize accounting in skb_gro_receive()
GRO is very optimistic in skb truesize estimates, only taking into
account the used part of fragments.
Be conservative, and use more precise computation, so that bloated GRO
skbs can be collapsed eventually.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 3 May 2012 15:23:15 +0000 (11:23 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Eric Dumazet [Sat, 24 Mar 2012 04:29:46 +0000 (00:29 -0400)]
iwlwifi: fix skb truesize underestimation
By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
but doesn't say so in skb->truesize.
This makes it very possible to exhaust kernel memory since these skbs evade
normal socket memory accounting.
As struct ieee80211_hdr is going to be pulled before calling IP stack,
there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
alloc_skb() is ok in this driver, allowing more tailroom.
Pull beginning of frame in skb header, in the hope we can reuse order-1
pages in the driver immediately for small frames and reduce their
truesize to the minimum (linear skbs)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John Fastabend [Wed, 18 Apr 2012 22:42:27 +0000 (22:42 +0000)]
ixgbe: dcb: BIT_APP_UPCHG not set by ixgbe_copy_dcb_cfg()
After this commit:
commit aacc1bea190d731755a65cb8ec31dd756f4e263e
Author: Multanen, Eric W <eric.w.multanen@intel.com>
Date: Wed Mar 28 07:49:09 2012 +0000
ixgbe: driver fix for link flap
The BIT_APP_UPCHG bit is no longer set when ixgbe_dcbnl_set_all() is
called. This results in the FCoE app user priority never getting set
and the driver will not configure the tx_rings correctly for FCoE
packets which use the SAN MTU and FCoE offloads.
We resolve this regression by fixing ixgbe_copy_dcb_cfg() to also
check for FCoE application changes. Additionally, we can drop the
IEEE variants of get_dcb_app() because this path is never called
with the IEEE mode enabled.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Sat, 17 Mar 2012 05:51:52 +0000 (05:51 +0000)]
ixgbe: fix race condition with shutdown
It was possible for shutdown to pull the rug out from other driver entry
points. Now we just grab the rtnl lock before taking everything apart.
Thanks to Hariharan for noticing this tight race condition.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Hariharan Nagarajan <hanagara@cisco.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 17 Apr 2012 04:29:39 +0000 (04:29 +0000)]
ixgbevf: Update version string
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 17 Apr 2012 04:29:34 +0000 (04:29 +0000)]
ixgbevf: Make sure jumbo frames are set correctly after PF reset
If the Physical Function (PF) resets after the VF has set jumbo
frame MTU then the VF jumbo frame is overwritten. Make sure the
VF driver always requests proper MTU size after reset
synchronization.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Tue, 10 Apr 2012 01:56:37 +0000 (01:56 +0000)]
ixgbevf: Add support to recognize 100mb link speed
The X540 10Gig controller is capable of linking at 100Mbits - add
support for reporting that link speed.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Chris Boot [Tue, 24 Apr 2012 07:24:58 +0000 (07:24 +0000)]
e1000e: Remove special case for 82573/82574 ASPM L1 disablement
For the 82573, ASPM L1 gets disabled wholesale so this special-case code
is not required. For the 82574 the previous patch does the same as for
the 82573, disabling L1 on the adapter. Thus, this code is no longer
required and can be removed.
Signed-off-by: Chris Boot <bootc@bootc.net>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Chris Boot [Tue, 24 Apr 2012 07:24:52 +0000 (07:24 +0000)]
e1000e: Disable ASPM L1 on 82574
ASPM on the 82574 causes trouble. Currently the driver disables L0s for
this NIC but only disables L1 if the MTU is >1500. This patch simply
causes L1 to be disabled regardless of the MTU setting.
Signed-off-by: Chris Boot <bootc@bootc.net>
Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>
Cc: Nix <nix@esperi.org.uk>
Link: https://lkml.org/lkml/2012/3/19/362
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 08:01:05 +0000 (08:01 +0000)]
e1000e: Driver workaround for IPv6 Header Extension Erratum.
Previously, IPv6 extension header parsing was disabled for all devices
supported by e1000e when using packet split mode. However, as per a
silicon erratum, only certain devices need this restriction and will need
to disable IPv6 extension header parsing for all modes.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 25 Apr 2012 07:25:18 +0000 (07:25 +0000)]
e1000e: Resolve intermittent negotiation issue on 82574/82583.
For 82574 and 82583 devices, resolve an intermittent link issue where
the link negotiates to 100Mbps rather than 1Gbps when powering off the
PHY and powering on the PHY after several seconds.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 14 Apr 2012 04:21:52 +0000 (04:21 +0000)]
e1000e: cleanup long [read|write]_reg_locked PHY ops function pointers
Calling the locked versions of the read/write PHY ops function pointers
often produces excessively long lines. Shorten these as is done with
the non-locked versions of the PHY register read/write functions.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:48:08 +0000 (03:48 +0000)]
e1000e: suggest a possible workaround to a device hang on 82577/8
There is a known issue in the 82577 and 82578 device that can cause a hang
in the device hardware during traffic stress; the current workaround in the
driver is to disable transmit flow control by default. If the user enables
transmit flow control and the device hang occurs, provide a message in the
syslog suggesting to re-enable the workaround.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dave Airlie [Wed, 2 May 2012 19:26:24 +0000 (20:26 +0100)]
nouveau: initialise has_optimus variable.
We should initialise this to 0 really to avoid getting false positives.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alexander Duyck [Wed, 2 May 2012 21:19:14 +0000 (21:19 +0000)]
ixgbe: Fix use after free on module remove
While testing the TCP changes I had to fix an issue in order to be able to
load and unload the module.
The recent patch that added thermal sensor support added a use after free
bug on module unload with an 82598 adapter in the system. To resolve the
issue I have updated the code so that when we free the info_kobj we set it
back to NULL.
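The pattern of clearing a pointer after freeing it can be sketched in userspace as follows. struct adapter and adapter_sysfs_exit() are hypothetical stand-ins, and free() takes the place of whatever release call the driver actually uses.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data. */
struct adapter {
    void *info_kobj;
};

/* Release info_kobj and reset the pointer so later code sees NULL
 * instead of a dangling reference. Safe to call more than once. */
void adapter_sysfs_exit(struct adapter *a)
{
    if (a->info_kobj == NULL)
        return;

    free(a->info_kobj);
    a->info_kobj = NULL;
}
```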
I suspect there are likely other bugs present, but I will leave that for
another patch that can undergo more testing.
I am submitting this directly to net-next since this fixes a fairly serious
bug that will lock up the ixgbe module until the system is rebooted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 21:19:09 +0000 (21:19 +0000)]
tcp: move stats merge to the end of tcp_try_coalesce
This change cleans up the last bits of tcp_try_coalesce so that we only
need one goto which jumps to the end of the function. The idea is to make
the code more readable by putting things in a linear order so that we start
execution at the top of the function, and end it at the bottom.
I also made a slight tweak to the code for handling frags when we are a
clone. Instead of making it an if (clone) loop else nr_frags = 0 I changed
the logic so that if (!clone) we just set the number of frags to 0 which
disables the for loop anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 21:19:04 +0000 (21:19 +0000)]
tcp: Move code related to head frag in tcp_try_coalesce
This change reorders the code related to the use of an skb->head_frag so it
is placed before we check the rest of the frags. This allows the code to
read more linearly instead of like some sort of loop.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 21:18:59 +0000 (21:18 +0000)]
tcp: Fix truesize accounting in tcp_try_coalesce
This patch addresses several issues in the way we were tracking the
truesize in tcp_try_coalesce.
First it was using ksize which prevents us from having a 0 sized head frag
and getting a usable result. To resolve that this patch uses the end
pointer which is set based off either ksize, or the frag_size supplied in
build_skb. This allows us to compute the original truesize of the entire
buffer and remove that value leaving us with just what was added as pages.
The second issue was the use of skb->len if there is a mergeable head frag.
We should only need to remove the size of a data-aligned sk_buff from our
current skb->truesize to compute the delta for a buffer with a reused head.
By using skb->len the value of truesize was being artificially reduced
which means that head frags could use more memory than buffers using
standard allocations.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 3 May 2012 06:25:55 +0000 (02:25 -0400)]
net: Add missing linux/prefetch.h include to net/core/sock.c
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerard Lledo [Sat, 28 Apr 2012 08:52:37 +0000 (08:52 +0000)]
sungem: Fix WakeOnLan
WakeOnLan was broken in this driver because gp->asleep_wol is a 1-bit
bitfield and it was being assigned WAKE_MAGIC, which is (1 << 5).
gp->asleep_wol remains 0 and the machine never wakes up. Fixed by casting
gp->wake_on_lan to bool. Tested on an iBook G4.
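The truncation is easy to demonstrate outside the kernel. In the sketch below, struct wol_state and the two store helpers are hypothetical; only the 1-bit field width and the WAKE_MAGIC value of (1 << 5) come from the description above.

```c
#include <stdbool.h>

#define WAKE_MAGIC (1 << 5)   /* value quoted in the commit message */

/* Hypothetical structure holding a 1-bit flag like gp->asleep_wol. */
struct wol_state {
    unsigned int asleep_wol : 1;
};

/* Direct assignment: only bit 0 of the value survives. */
unsigned int store_raw(unsigned int wake_on_lan)
{
    struct wol_state s = { 0 };

    s.asleep_wol = wake_on_lan;
    return s.asleep_wol;
}

/* Assignment through bool: any non-zero value becomes 1. */
unsigned int store_as_bool(unsigned int wake_on_lan)
{
    struct wol_state s = { 0 };

    s.asleep_wol = (bool)wake_on_lan;
    return s.asleep_wol;
}
```

Since bit 0 of WAKE_MAGIC is clear, store_raw(WAKE_MAGIC) yields 0, while store_as_bool(WAKE_MAGIC) yields 1.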
Signed-off-by: Gerard Lledo <gerard.lledo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 2 May 2012 18:18:42 +0000 (18:18 +0000)]
net: Stop decapitating clones that have a head_frag
This change is meant to prevent stealing the skb->head to use as a page in
the event that the skb->head was cloned. This allows the other clones to
track each other via shinfo->dataref.
Without this we break down to two methods for tracking the reference count,
one being dataref, the other being the page count. As a result it becomes
difficult to track how many references there are to skb->head.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 2 May 2012 09:58:29 +0000 (09:58 +0000)]
net: implement tcp coalescing in tcp_queue_rcv()
Extend tcp coalescing implementing it from tcp_queue_rcv(), the main
receiver function when application is not blocked in recvmsg().
Function tcp_queue_rcv() is moved a bit to allow its call from
tcp_data_queue()
This gives good results especially if GRO could not kick in, and if skb
head is a fragment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 2 May 2012 07:55:58 +0000 (07:55 +0000)]
net: take care of cloned skbs in tcp_try_coalesce()
Before stealing fragments or skb head, we must make sure skbs are not
cloned.
Alexander was worried about destination skb being cloned : In bridge
setups, a driver could be fooled if skb->data_len would not match skb
nr_frags.
If source skb is cloned, we must take references on pages instead.
Bug happened using tcpdump (if not using mmap())
Introduce kfree_skb_partial() helper to cleanup code.
Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 2 May 2012 02:28:41 +0000 (02:28 +0000)]
tcp: change tcp_adv_win_scale and tcp_rmem[2]
tcp_adv_win_scale default value is 2, meaning we expect a good citizen
skb to have skb->len / skb->truesize ratio of 75% (3/4)
In 2.6 kernels we (mis)accounted for typical MSS=1460 frame :
1536 + 64 + 256 = 1856 'estimated truesize', and 1856 * 3/4 = 1392.
So these skbs were considered as not bloated.
With recent truesize fixes, a typical MSS=1460 frame truesize is now the
more precise :
2048 + 256 = 2304. But 2304 * 3/4 = 1728.
So these skbs are not good citizens anymore, because 1460 < 1728
(GRO can escape this problem because it builds skbs with a too low
truesize.)
This also means tcp advertises a too optimistic window for a given
allocated rcvspace : When receiving frames, sk_rmem_alloc can hit
sk_rcvbuf limit and we call tcp_prune_queue()/tcp_collapse() too often,
especially when application is slow to drain its receive queue or in
case of losses (netperf is fast, scp is slow). This is a major latency
source.
We should adjust the len/truesize ratio to 50% instead of 75%
This patch :
1) changes tcp_adv_win_scale default to 1 instead of 2
2) increases tcp_rmem[2] limit from 4MB to 6MB to take into account
better truesize tracking and to allow autotuning tcp receive window to
reach the same value as before. Note that the same amount of kernel memory
is consumed compared to 2.6 kernels.
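The arithmetic above can be checked with a small sketch. win_from_space() is a hypothetical userspace helper that models only positive tcp_adv_win_scale values, where the usable window is space minus space shifted right by the scale (3/4 for scale 2, 1/2 for scale 1). The MiB interpretation of "4MB" and "6MB" is an assumption.

```c
/* Share of a buffer that may be advertised as window, for scale > 0:
 * scale 2 keeps 3/4 of the space, scale 1 keeps 1/2. */
int win_from_space(int space, int scale)
{
    return space - (space >> scale);
}

/* A frame is a "good citizen" if its payload covers the expected share
 * of its truesize. */
int is_good_citizen(int len, int truesize, int scale)
{
    return len >= win_from_space(truesize, scale);
}
```

For an MSS=1460 frame, the old estimate of 1856 bytes passes at scale 2 (threshold 1392), the more precise truesize of 2304 fails at scale 2 (threshold 1728), and passes again at scale 1 (threshold 1152). The buffer change follows the same logic: 3/4 of 4MB and 1/2 of 6MB both give a 3MB window.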
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Wed, 2 May 2012 03:41:01 +0000 (03:41 +0000)]
be2net: Fix EEH error reset before a flash dump completes
An EEH error can cause the FW to trigger a flash debug dump.
Resetting the card while flash dump is in progress can cause it not to recover.
Wait for it to finish before letting EEH flow to reset the card.
Signed-off-by: Sathya Perla <Sathya.Perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Wed, 2 May 2012 03:40:49 +0000 (03:40 +0000)]
be2net: Record receive queue index in skb to aid RPS.
Signed-off-by: Sarveshwar Bandi <Sarveshwar.Bandi@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Wed, 2 May 2012 03:40:32 +0000 (03:40 +0000)]
be2net: Fix to apply duplex value as unknown when link is down.
Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Wed, 2 May 2012 03:40:16 +0000 (03:40 +0000)]
be2net: Fix to not set link speed for disabled functions of a UMC card
This renders the interface view somewhat inconsistent from the Host OS POV
considering the rest of the interfaces are showing their respective speeds
based on the bandwidth assigned to them.
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Levin [Wed, 2 May 2012 03:58:43 +0000 (03:58 +0000)]
net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
l2tp_ip_sendmsg could return without releasing socket lock, making it all the
way to userspace, and generating the following warning:
[ 130.891594] ================================================
[ 130.894569] [ BUG: lock held when returning to user space! ]
[ 130.897257] 3.4.0-rc5-next-20120501-sasha #104 Tainted: G W
[ 130.900336] ------------------------------------------------
[ 130.902996] trinity/8384 is leaving the kernel with locks still held!
[ 130.906106] 1 lock held by trinity/8384:
[ 130.907924] #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff82b9503f>] l2tp_ip_sendmsg+0x2f/0x550
Introduced by commit 2f16270 ("l2tp: Fix locking in l2tp_ip.c").
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Tue, 1 May 2012 08:18:02 +0000 (08:18 +0000)]
drop_monitor: prevent init path from scheduling on the wrong cpu
I just noticed after some recent updates that the init path for the drop
monitor protocol has a minor error. Drop monitor maintains a per-cpu structure
that gets initialized from a single cpu. Normally this is fine, as the protocol
isn't in use yet, but I recently made a change that causes a failed skb
allocation to reschedule itself. Given the current code, the implication is
that this workqueue reschedule will take place on the wrong cpu. If drop
monitor is used early during the boot process, it's possible that two cpus will
access a single per-cpu structure in parallel, possibly leading to data
corruption.
This patch fixes the situation by storing the cpu number that a given instance
of this per-cpu data should be accessed from. In the case of a need for a
reschedule, the cpu stored in the struct is assigned the reschedule, rather than
the currently executing cpu.
Tested successfully by myself.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Wed, 2 May 2012 13:30:04 +0000 (13:30 +0000)]
tcp: early retransmit: delayed fast retransmit
Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2).
Delays the fast retransmit by an interval of RTT/4. We borrow the
RTO timer to implement the delay. If we receive another ACK or send
a new packet, the timer is cancelled and restored to original RTO
value offset by time elapsed. When the delayed-ER timer fires,
we enter fast recovery and perform fast retransmit.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>