Eric Dumazet [Mon, 30 Apr 2012 08:10:34 +0000 (08:10 +0000)]
net: make GRO aware of skb->head_frag
GRO can check if skb to be merged has its skb->head mapped to a page
fragment, instead of a kmalloc() area.
We 'upgrade' skb->head as a fragment in itself
This avoids the frag_list fallback, and permits to build true GRO skb
(one sk_buff and up to 16 fragments), using less memory.
This reduces number of cache misses when user makes its copy, since a
single sk_buff is fetched.
This is a followup of patch "net: allow skb->head to be a page fragment"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 27 Apr 2012 00:34:49 +0000 (00:34 +0000)]
tg3: provide frags as skb head
This patch converts tg3 driver, one of our reference drivers, to use new
build_skb() api in frag mode.
Instead of using kmalloc() to allocate the memory block that will be
used by build_skb() as skb->head, we use a page fragment.
This is a followup of patch "net: allow skb->head to be a page fragment"
This allows GRO, TCP coalescing, and splice() to be more efficient.
Incidentally, this also removes SLUB slow path contention in kfree()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 27 Apr 2012 00:33:38 +0000 (00:33 +0000)]
net: allow skb->head to be a page fragment
skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback the data cannot be converted to a page fragment if
needed.
We have three spots were it hurts :
1) GRO aggregation
When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, very inefficient since we keep all struct sk_buff
around. So drivers enabling GRO but delivering linear skbs to network
stack aren't enabling full GRO power.
2) splice(socket -> pipe).
We must copy the linear part to a page fragment.
This kind of defeats splice() purpose (zero copy claim)
3) TCP coalescing.
Recently introduced, this permits to group several contiguous segments
into a single skb. This shortens queue lengths and save kernel memory,
and greatly reduce probabilities of TCP collapses. This coalescing
doesnt work on linear skbs (or we would need to copy data, this would be
too slow)
Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.
build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.
Then, on situations we need to convert the skb head to a frag in itself,
we can check if skb->head_frag is set and avoid the copies or various
fallbacks we have.
This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize). (thats 512 or 1024 bytes saved per skb). This also makes
bpf/netfilter faster since the 'first frag' will be part of skb linear
part, no need to copy data.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 27 Apr 2012 09:04:07 +0000 (09:04 +0000)]
forcedeth: add transmit timestamping support
Insert an skb_tx_timestamp call in both ndo_start_xmit routines
Tested to work for the nv_start_xmit_optimized case
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 27 Apr 2012 09:04:06 +0000 (09:04 +0000)]
bnx2x: add transmit timestamping support
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 27 Apr 2012 09:04:05 +0000 (09:04 +0000)]
e1000e: add transmit timestamping support
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 27 Apr 2012 09:04:04 +0000 (09:04 +0000)]
e1000: add transmit timestamping support
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 30 Apr 2012 00:22:56 +0000 (00:22 +0000)]
bridge: Fix fatal typo in setup of multicast_querier_expired
Unfortunately it seems that I didn't properly test the case of
an expired external querier in the recent multicast bridge series.
The setup of the timer in that case is completely broken and leads
to a NULL-pointer dereference. This patch fixes it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Apr 2012 17:21:28 +0000 (13:21 -0400)]
l2tp: Add missing net/net/ip6_checksum.h include.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Fri, 27 Apr 2012 08:24:18 +0000 (08:24 +0000)]
net/l2tp: add support for L2TP over IPv6 UDP
Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6.
Support has been tested with and without hardware offloading. This
version fixes the L2TP over localhost issue with incorrect checksums
being reported.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Fri, 27 Apr 2012 08:24:08 +0000 (08:24 +0000)]
net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6
Now that the sematics of udpv6_queue_rcv_skb() match IPv4's
udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Fri, 27 Apr 2012 08:23:59 +0000 (08:23 +0000)]
net/ipv6/udp: UDP encapsulation: move socket locking into udpv6_queue_rcv_skb()
In order to make sure that when the encap_rcv() hook is introduced it is
not called with the socket lock held, move socket locking from callers into
udpv6_queue_rcv_skb(), matching what happens in IPv4.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Fri, 27 Apr 2012 08:23:21 +0000 (08:23 +0000)]
net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb
This is the first step in reworking the IPv6 UDP code to be structured more
like the IPv4 UDP code. This patch creates __udpv6_queue_rcv_skb() with
the equivalent sematics to __udp_queue_rcv_skb(), and wires it up to the
backlog_rcv method.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 29 Apr 2012 02:06:17 +0000 (22:06 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
RongQing.Li [Thu, 26 Apr 2012 21:01:13 +0000 (21:01 +0000)]
drivers/net/oki-semi: Donot recompute IP header checksum
If I understand correct, NETIF_F_IP_CSUM only means the hardware
will compute the TCP/UDP checksum, IP checksum is always computed
in software
So as a workround of hardware unable to compute small packages
checksum, do not need to compute IP header checksum.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RongQing.Li [Thu, 26 Apr 2012 21:01:12 +0000 (21:01 +0000)]
drivers/net/oki-semi: Remove the definition of PCH_GBE_ETH_ALEN
PCH_GBE_ETH_ALEN is equal to ETH_ALEN, so we can replace it with
ETH_ALEN.
If they are not equal, it must be a bug, since this is ethernet,
and the address has been already stored to mc_addr_list as ETH_ALEN
bytes when call pch_gbe_mac_mc_addr_list_update.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Thu, 26 Apr 2012 00:30:43 +0000 (00:30 +0000)]
net/at91_ether: use gpio_to_irq for phy IRQ line
Use the gpio_to_irq() function to retrieve the phy IRQ line
from the GPIO pin specification.
This fix is needed now that we have moved to irqdomains on AT91.
Reported-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Victor [Thu, 26 Apr 2012 00:30:42 +0000 (00:30 +0000)]
AT91: Remove fixed mapping for AT91RM9200 ethernet
The AT91RM9200 Ethernet controller still has a fixed IO mapping.
So:
* Remove the fixed IO mapping and AT91_VA_BASE_EMAC definition.
* Pass the physical base-address via platform-resources to the driver.
* Convert at91_ether.c driver to perform an ioremap().
* Ethernet PHY detection needs to be performed during the driver
initialization process, it can no longer be done first.
Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeffrin Jose [Wed, 25 Apr 2012 13:47:29 +0000 (19:17 +0530)]
net: Fixed a coding style issue related to spaces.
Fixed a coding style issue relating to spaces
in net/core/sock.c
Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Sat, 21 Apr 2012 06:05:40 +0000 (06:05 +0000)]
ixgbe: check for WoL support in single function
This patch consolidates the case logic for checking whether a device supports
WoL into a single place. Previously ethtool and probe used similar logic that
was copied and maintained separately. This patch encapsulates the core logic
into a function so that a user only has to update one place.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 18 Apr 2012 02:57:44 +0000 (02:57 +0000)]
igb: Force flow control off during reset when forcing speed.
During igb_reset(), we initiate a hardware reset which will clear our
flow control settings. For auto-negotiation, we re-negotiate them when
linking up again, but we need to force them off properly for the forced
speed case.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:47:52 +0000 (03:47 +0000)]
e1000e: 82579 potential system hang on stress when ME enabled
Previously, a workaround was added to address a hardware bug in the
PCIm2PCI arbiter where a write by the driver of the Transmit/Receive
Descriptor Tail register could happen concurrently with a write of any
MAC CSR register by the Manageability Engine (ME) which could cause the
Tail register to have an incorrect value. The arbiter is supposed to
prevent the concurrent writes but there is a bug that can cause the Host
(driver) access to be acknowledged later than it should.
After further investigation, it was discovered that a driver write access
of any MAC CSR register after being idle for some time can be lost when
ME is accessing a MAC CSR register. When this happens, no further target
access is claimed by the MAC which could hang the system.
The workaround to check bit 24 in the FWSM register (set only when ME is
accessing a MAC CSR register) and delay for a limited amount of time until
it is cleared is now done for all driver writes of MAC CSR registers on
82579 with ME enabled. In the rare case when the driver is writing the
Tail register and ME is accessing any MAC CSR register for a duration
longer than the maximum delay, write the register and verify it has the
correct value before continuing, otherwise reset the device.
This patch also moves some pre-existing macros from the hardware-specific
header file to the more appropriate generic driver header file.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Tue, 20 Mar 2012 03:47:47 +0000 (03:47 +0000)]
e1000e: 82579 packet drop workaround
In K1 mode (a MAC/PHY interconnect power mode), the 82579 device shuts down
the Phase Lock Loop (PLL) of the interconnect to save power. When the PLL
starts working, the 82579 device may start to transfer the packet through
the interconnect before it is fully functional causing packet drops. This
workaround disables shutting down the PLL in K1 mode for 1G link speed.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Fri, 16 Mar 2012 09:02:59 +0000 (09:02 +0000)]
e1000e: Enable DMA Burst Mode on 82574 by default.
Performance testing has shown that enabling DMA burst on 82574
improves performance on small packets, so enable it by default.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Fri, 16 Mar 2012 09:02:58 +0000 (09:02 +0000)]
e1000e: Disable Far-End LoopBack following reset on 80003ES2LAN.
80003ES2LAN has an errata such that far-end loopback may be activated by
bit errors producing a reserved symbol. In order to disable far-end
loopback quickly enough, disable it immediately following a reset.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Eric Dumazet [Thu, 26 Apr 2012 20:07:59 +0000 (20:07 +0000)]
net: cleanups in sock_setsockopt()
Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to
match !!sock_flag()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shan Wei [Thu, 26 Apr 2012 16:52:52 +0000 (16:52 +0000)]
net: doc: merge /proc/sys/net/core/* documents into one place
All parameter descriptions in /proc/sys/net/core/* now is separated
two places. So, merge them into Documentation/sysctl/net.txt.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Thu, 26 Apr 2012 15:42:46 +0000 (15:42 +0000)]
be2net: update the driver version
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Thu, 26 Apr 2012 15:42:39 +0000 (15:42 +0000)]
be2net: fix speed displayed by ethtool on certain SKUs
logical speed returned by link_status_query needs to be multiplied by 10.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Thu, 26 Apr 2012 15:42:31 +0000 (15:42 +0000)]
be2net: Ignore status of some ioctls during driver load
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Thu, 26 Apr 2012 11:51:07 +0000 (11:51 +0000)]
NET: smsc-ircc2: mark non-experimental
This has been used by me and others for ages, let's stop calling it
experimental.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Jones [Thu, 26 Apr 2012 11:20:30 +0000 (11:20 +0000)]
bonding: bond_update_speed_duplex() can return void since no callers check its return
As none of the callers of bond_update_speed_duplex (need to) check its
return value, there is little point in it returning anything.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Thu, 26 Apr 2012 10:31:31 +0000 (10:31 +0000)]
qlcnic: Allow a predefined set of capture masks for FW dump
o 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F and 0xFF are the allowed capture masks.
o Updated driver version to 5.0.28
Signed-off-by: Manish chopra <manish.chopra@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Thu, 26 Apr 2012 10:31:30 +0000 (10:31 +0000)]
qlcnic: Adding mac statistics to ethtool.
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sucheta Chakraborty [Thu, 26 Apr 2012 10:31:29 +0000 (10:31 +0000)]
qlcnic: Register device in FAILED state.
o Without failing probe, register netdevice when device is in FAILED state.
o Device will come up with minimum functionality.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hartleys [Tue, 24 Apr 2012 14:38:37 +0000 (14:38 +0000)]
crush: include header for global symbols
Include the header to pickup the definitions of the global symbols.
Quiets the following sparse warnings:
warning: symbol 'crush_find_rule' was not declared. Should it be static?
warning: symbol 'crush_do_rule' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Sage Weil <sage@newdream.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
hartleys [Tue, 24 Apr 2012 12:56:03 +0000 (12:56 +0000)]
isdn/eicon: use standard __init,__exit function markup
Remove the custom DIVA_{INIT,EXIT}_FUNCTION defines and use
the standard __init,__exit markup.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Armin Schindler <mac@melware.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Apr 2012 07:37:38 +0000 (07:37 +0000)]
ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing
Quoting Tore Anderson from :
https://bugzilla.kernel.org/show_bug.cgi?id=42572
When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment
size does not take into account the size of the IPv6 Fragmentation
header that needs to be included in outbound packets, causing every
transmitted TCP segment to be fragmented across two IPv6 packets, the
latter of which will only contain 8 bytes of actual payload.
RTAX_FEATURE_ALLFRAG is typically set on a route in response to
receving a ICMPv6 Packet Too Big message indicating a Path MTU of less
than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6
PTBs with MTU < 1280 are still valid, in particular when an IPv6
packet is sent to an IPv4 destination through a stateless translator.
Any ICMPv4 Need To Fragment packets originated from the IPv4 part of
the path will be translated to ICMPv6 PTB which may then indicate an
MTU of less than 1280.
The Linux kernel refuses to reduce the effective MTU to anything below
1280 bytes, instead it sets it to exactly 1280 bytes, and
RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears
to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header),
instead of 1232 (additionally taking into account the 8 bytes required
by the IPv6 Fragmentation extension header).
This in turn results in rather inefficient transmission, as every
transmitted TCP segment now is split in two fragments containing
1232+8 bytes of payload.
After this patch, all the outgoing packets that includes a
Fragmentation header all are "atomic" or "non-fragmented" fragments,
i.e., they both have Offset=0 and More Fragments=0.
With help from David S. Miller
Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Apr 2012 19:54:45 +0000 (15:54 -0400)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next
John W. Linville [Thu, 26 Apr 2012 19:03:48 +0000 (15:03 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
drivers/net/wireless/iwlwifi/iwl-testmode.c
Pavel Emelyanov [Wed, 25 Apr 2012 23:43:04 +0000 (23:43 +0000)]
tcp repair: Fix unaligned access when repairing options (v2)
Don't pick __u8/__u16 values directly from raw pointers, but instead use
an array of structures of code:value pairs. This is OK, since the buffer
we take options from is not an skb memory, but a user-to-kernel one.
For those options which don't require any value now, require this to be
zero (for potential future extension of this API).
v2: Changed tcp_repair_opt to use two __u32-s as spotted by David Laight.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Wed, 25 Apr 2012 23:35:51 +0000 (23:35 +0000)]
6lowpan: duplicate definition of IEEE802154_ALEN
The same macros is defined in 'include/net/af_ieee802154.h' and is called
IEEE802154_ADDR_LEN. No need another one, so remove it.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Wed, 25 Apr 2012 23:35:50 +0000 (23:35 +0000)]
6lowpan: move frame allocation code to a separate function
Separate frame allocation routine from data processing function.
This makes code more human readable and easier for understanding.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:14 +0000 (20:52 +0000)]
mISDN: Added support for fragmentation of E1 interfaces of hfcmulti driver.
Fragmentation is usefull if multiple devices are connected to an E1
interface. Each fragment will have a subset of the available timeslots.
These devices require a cascde connection or a multiplexer.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:13 +0000 (20:52 +0000)]
mISDN: Rework of LED status display for HFC-4S/8S/E1 cards.
LEDs will show RED if layer 1 is disabled or fails.
LEDs will show GREEN if layer 1 is active.
LEDs will blink if traffic on D-channel.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:12 +0000 (20:52 +0000)]
mISDN: Using FLG_ACTIVE flag to determine if layer 1 is active or not.
We already have the flag for L1 active, so we should use it.
L2 will be solved in a later patch.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 20:52:11 +0000 (20:52 +0000)]
mISDN: Fixed false interruption of audio during bridging change.
Transmitted audio data was interrupted if a bridge was enabled or disabled.
Now transmission seamlessly continues during that action.
Fix in hfcmulti.ko
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:41:00 +0000 (20:41 +0000)]
atl1c: refine start/enable code for MAC module
merge TXQ/RXQ/MAC start/enable code to one function as they
are started/enabled at the same time, just like stop/disable them
in the function of atl1c_stop_mac.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:40:59 +0000 (20:40 +0000)]
atl1c: add function atl1c_power_saving
This function is used for suspend of S1/S3/S4 and driver remove.
It sets MAC/PHY based on the WoL configuation to get lower power
consumption.
atl1c_phy_power_saving is renamed to atl1c_phy_to_ps_link, this
function is just make PHY enter a link/speed mode to eat less
power.
REG_MAC_CTRL register is refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:15 +0000 (20:27 +0000)]
atl1c: remove PHY reset/init for link down event
it's unnecessary to reset/init phy when link down.
Only L1/L2 chip (supported by atlx) need such action.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:14 +0000 (20:27 +0000)]
atl1c: update PHY reset related routine
Many magic data are re-configured for PHY during its reset operation
based on chip type to get better compability and stability.
REG_PHY_CTRL register may be configured by BIOS before enter OS.
so, the driver can't directly write to it without any Read-Op.
this change also affect suspend and phy_disable routines.
PHY debug ports and extension registers are refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:13 +0000 (20:27 +0000)]
atl1c: remove PHY polling from atl1c_open
PHY polling code for FPGA is considered in every MDIO R/W API.
no need to add additional code to atl1c_open.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:12 +0000 (20:27 +0000)]
atl1c: refine SERDES-clock related code
bit 17/18 of reg1424 must be clear for l2cb 1.x, or it will cause
the write-reg operation fail without cable connected.
so, please do connect the cable when apply this patch to the driver
to make sure these 2bits are cleared by new driver.
The revised code is move to al1c_reset_mac.
SERDES register definition is refined as well.
when do reset MAC, speed/duplex control right should be transferred
to software before do PHY auto-neg -- by bit MASTER_CTRL_SPEED_MODE_SW.
SERDES register definition is refined as well.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:11 +0000 (20:27 +0000)]
atl1c: remove PHY contrl in atl1c_reset_pcie
atl1c_reset_phy follows atl1c_reset_pcie in the whole driver,
so, it's unnecessary to add PHY control code in atl1c_reset_pcie.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:10 +0000 (20:27 +0000)]
atl1c: refine phy-register read/write function
phy register is read/write via MDIO control module ---
that module will be affected by the hibernate status,
to access phy regs in hib stutus, slow frequency clk must
be selected.
To access phy extension register, the MDIO related
registers are refined/updated, a _core function is
re-wroted for both regular PHY regs and extension regs.
existing PHY r/w function is revised based on the _core.
PHY extension registers will be used for the comming
patches.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang, Xiong [Wed, 25 Apr 2012 20:27:09 +0000 (20:27 +0000)]
atl1c: remove REG_PHY_STATUS
this register is used for l1e(dev=1026)
l1c/l1d/l2cb don't use it.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Tested-by: Liu David <dwliu@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:47:15 +0000 (01:47 +0000)]
be2net: Fix FW download for BE
Skip flashing a FW component if that component is not present in a
particular FW UFI image.
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:47:03 +0000 (01:47 +0000)]
be2net: Fix wrong status getting returned for MCC commands
MCC Response CQEs are processed as part of NAPI poll routine and
also synchronously. If MCC completions are consumed by NAPI poll
routine, wrong status is returned to synchronously waiting routine.
Fix this by getting status of MCC command from command response
instead of response CQEs.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:52 +0000 (01:46 +0000)]
be2net: Fix Lancer statistics
Fix port num sent in command to get stats. Also skip unnecessary
parsing of stats for Lancer.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:39 +0000 (01:46 +0000)]
be2net: Fix traffic stall INTx mode
EQ is getting armed wrongly in INTx mode as INTx interrupt is taking
some time to deassert. This can cause another interrupt while NAPI is
scheduled and scheduling a NAPI in interrupt does not take effect.
This causes interrupt to be missed and traffic stalls. Fixing this by
preventing wrong arming of EQ.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:28 +0000 (01:46 +0000)]
be2net: Fix ethtool self test for Lancer
Lancer does not support DDR self test. Fix ethtool self test by
skipping this test for Lancer.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:18 +0000 (01:46 +0000)]
be2net: Fix FW download in Lancer
Increase time given by driver to adapter for completing FW download
to 30 seconds. Also return correct status when FW download times out.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Wed, 25 Apr 2012 01:46:06 +0000 (01:46 +0000)]
be2net: Fix VLAN/multicast packet reception
VLAN and multicast hardware filters are limited and can get
exhausted in adapters with many PCI functions. If setting
a VLAN or multicast filter fails due to lack of sufficient
hardware resources, these packets get dropped. Fix this by
switching to VLAN or multicast promiscous mode so that these
packets are not dropped.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Wed, 25 Apr 2012 20:54:48 +0000 (20:54 +0000)]
mISDN: DSP scheduling fix (version 2)
dsp_spl_jiffies need to be the same datatype as jiffies (which is ulong).
If not, on 64 bit systems it will fallback to schedule the DSP every jiffie
tic as soon jiffies become > 2^32.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Tue, 24 Apr 2012 02:51:51 +0000 (02:51 +0000)]
mISDN: Fix division by zero
If DTMF debug is set and tresh goes under 100, the printk will cause
a division by zero.
Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 02:51:50 +0000 (02:51 +0000)]
mISDN: Fixed hardware bridging/conference check routine of mISDN_dsp.ko.
In some cases the hardware bridging/conference (2-n parties) was selected,
but still pure software bridging/conference was used.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Eversberg [Tue, 24 Apr 2012 02:51:49 +0000 (02:51 +0000)]
mISDN: Fix NULL pointer bug in if-condition of mISDN_dsp
Fix a bug (was introduced by a cut & paste error)
in cases when dsp->conf was NULL.
Signed-off-by: Andreas Eversberg <jolly@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shan Wei [Tue, 24 Apr 2012 18:21:07 +0000 (18:21 +0000)]
net: sock_diag_handler structs can be const
read only, so change it to const.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Apr 2012 10:17:59 +0000 (10:17 +0000)]
ipv6: call consume_skb() in frag/reassembly
Some kfree_skb() calls should be replaced by consume_skb() to avoid
drop_monitor/dropwatch false positives.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 20 Apr 2012 09:49:23 +0000 (09:49 +0000)]
net: dcb: add CEE notify calls
This adds code to trigger CEE events when an APP change or setall
command is made from user space. This simplifies user space code
significantly by creating a single interface to listen on that
works with both firmware and userland agents.
And if we end up with multiple agents this keeps every thing in
sync userland agents, firmware agents, and kernel notifier consumers.
For an example agent that listens for these events see:
https://github.com/jrfastab/cgdcbxd
cgdcbxd is a daemon used to monitor DCB netlink events and manage
the net_prio control group sub-system.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrei Emeltchenko [Tue, 24 Apr 2012 11:18:28 +0000 (14:18 +0300)]
mac80211: Adds clean sdata helper
Adds hepler to clean sdata ieee80211_clean_sdata similar way as
ieee80211_setup_sdata is implemented. The function will be used by other
interfaces later.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sujith Manoharan [Tue, 24 Apr 2012 04:53:20 +0000 (10:23 +0530)]
ath9k: Fix IDLE Powersave
* PS_WAIT_FOR_TX_ACK is used in network-sleep mode and checking
it for handling IDLE transitions is incorrect. Fix this.
* RX PCU/DMA engines have to be stopped before setting the chip into
full-sleep mode - otherwise the chip becomes mute.
* Make things a bit clear by checking explicitly for network-sleep
mode in the tx() routine and add a couple of debug statements
to aid PS debugging.
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felix Fietkau [Mon, 23 Apr 2012 17:49:03 +0000 (19:49 +0200)]
mac80211: fix num_mcast_sta counting issues
Moving a STA to an AP VLAN prevents num_mcast_sta from being decremented
once the STA leaves, because sta->sdata changes. Fix this by checking
for AP VLANs as well.
Also exclude 4-addr VLAN stations from num_mcast_sta - remote 4-addr
stations ignore 3-address multicast frames anyway. In a typical bridge
configuration they receive the same packets as 4-address unicast.
This patch also fixes clearing the sdata->u.vlan.sta pointer when the
STA is removed from a 4-addr VLAN.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felix Fietkau [Mon, 23 Apr 2012 17:49:02 +0000 (19:49 +0200)]
mac80211: rename AP variable num_sta_authorized to num_mcast_sta
It is only used to test for BSS multicast receivers.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Wey-Yi Guy [Mon, 23 Apr 2012 16:30:32 +0000 (09:30 -0700)]
mac80211: check for non-managed interface
Average beacon signal only keep tracked by managed interface,
give warning and return 0 for the others.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Paul Gortmaker [Mon, 23 Apr 2012 04:49:13 +0000 (04:49 +0000)]
tipc: remove inline instances from C source files.
Untie gcc's hands and let it do what it wants within the
individual source files. There are two files, node.c and
port.c -- only the latter effectively changes (gcc-4.5.2).
Objdump shows gcc deciding to not inline port_peernode().
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 21:30:29 +0000 (21:30 +0000)]
af_netlink: drop_monitor/dropwatch friendly
Need to consume_skb() instead of kfree_skb() in netlink_dump() and
netlink_unicast_kernel() to avoid false dropwatch positives.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 21:30:21 +0000 (21:30 +0000)]
af_netlink: cleanups
netlink_destroy_callback() move to avoid forward reference
CodingStyle cleanups
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 23 Apr 2012 17:48:27 +0000 (17:48 +0000)]
net: skb_can_coalesce returns a boolean
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 23 Apr 2012 17:34:36 +0000 (17:34 +0000)]
tcp: tcp_try_coalesce returns a boolean
This clarifies code intention, as suggested by David.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Apr 2012 03:35:04 +0000 (23:35 -0400)]
net: make spd_fill_page() linear argument a bool
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 24 Apr 2012 03:14:36 +0000 (23:14 -0400)]
Merge git://git./linux/kernel/git/davem/net
Fix merge between commit
3adadc08cc1e ("net ax25: Reorder ax25_exit to
remove races") and commit
0ca7a4c87d27 ("net ax25: Simplify and
cleanup the ax25 sysctl handling")
The former moved around the sysctl register/unregister calls, the
later simply removed them.
With help from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 24 Apr 2012 03:06:11 +0000 (23:06 -0400)]
net: Use bool and remove inline in skb_splice_bits() code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 12:26:16 +0000 (12:26 +0000)]
net: speedup skb_splice_bits()
Commit
35f3d14db (pipe: add support for shrinking and growing pipes)
added a slowdown for splice(socket -> pipe), as we might grow the spd
used in skb_splice_bits() for each skb we process in splice() syscall.
Its not needed since skb lengths are capped. The default on-stack arrays
are more than enough.
Use MAX_SKB_FRAGS instead of PIPE_DEF_BUFFERS to describe the reasonable
limit per skb.
Add coalescing support to help splicing of GRO skbs built from linear
skbs (linked into frag_list)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 24 Apr 2012 02:45:19 +0000 (19:45 -0700)]
Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild
Pull build system failure fix from Michal Marek:
"This fixes build failure with newer gcc that adds some internal
symbols that end in "__mod_*_device_table", but are not actually the
tables themselves."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Fix modpost failures in fedora 17
Eric Dumazet [Mon, 23 Apr 2012 07:11:42 +0000 (07:11 +0000)]
tcp: introduce tcp_try_coalesce
commit
c8628155ece3 (tcp: reduce out_of_order memory use) took care of
coalescing tcp segments provided by legacy devices (linear skbs)
We extend this idea to fragged skbs, as their truesize can be heavy.
ixgbe for example uses 256+1024+PAGE_SIZE/2 = 3328 bytes per segment.
Use this coalescing strategy for receive queue too.
This contributes to reduce number of tcp collapses, at minimal cost, and
reduces memory overhead and packets drops.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowski [Mon, 23 Apr 2012 03:05:16 +0000 (03:05 +0000)]
bnx2x: Update driver version to 1.72.50-0
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 23 Apr 2012 03:05:11 +0000 (03:05 +0000)]
bnx2x: remove gro workaround
Removes GRO workaround, as issue is fixed in FW 7.2.51.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowski [Mon, 23 Apr 2012 03:04:46 +0000 (03:04 +0000)]
bnx2x: add afex support
Following patch adds afex multifunction support to the driver (afex
multifunction is based on vntag header) and updates FW version used to 7.2.51.
Support includes the following:
1. Configure vif parameters in firmware (default vlan, vif id, default
priority, allowed priorities) according to values received from NIC.
2. Configure FW to strip/add default vlan according to afex vlan mode.
3. Notify link up to OS only after vif is fully initialized.
4. Support vif list set/get requests and configure FW accordingly.
5. Supply afex statistics upon request from NIC.
6. Special handling to L2 interface in case of FCoE vif.
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:50 +0000 (02:18 +0000)]
mlx4_en: Byte Queue Limit support
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:39 +0000 (02:18 +0000)]
mlx4_en: Moving to Interrupts for TX completions
Moving to interrupts instead of polling fpr TX completions
Avoiding situations where skb can be held in by the driver for
a long time (till timer expires).
The change is also necessary for supporting BQL.
Removing comp_lock that was required because we could handle TX
completions from several contexts: Interrupts, timer, polling.
Now there is only interrupts
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 23 Apr 2012 02:18:33 +0000 (02:18 +0000)]
mlx4_en: Added Ethtool support for TX Interrupt coalescing
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 23:38:54 +0000 (23:38 +0000)]
tcp: sk_add_backlog() is too agressive for TCP
While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.
A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.
A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.
Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 22 Apr 2012 23:34:26 +0000 (23:34 +0000)]
net: add a limit parameter to sk_add_backlog()
sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.
No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Mon, 23 Apr 2012 14:25:49 +0000 (14:25 +0000)]
net ax25: Fix the build when sysctl support is disabled.
Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since
20120420:
>
>
> include/net/ax25.h:447:75: error: expected ';' before '}' token
>
> static inline int ax25_register_dev_sysctl(ax25_dev *ax25_dev) { return 0 };
> static inline void ax25_unregister_dev_sysctl(ax25_dev *ax25_dev) {};
>
> First function: move ';' inside braces.
> Second function: drop the ';'.
Put the semicolons where it makes sense.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 24 Apr 2012 01:25:01 +0000 (18:25 -0700)]
Merge tag 'md-3.4-fixes' of git://neil.brown.name/md
Pull a few more md bug fixes from NeilBrown:
"2 are tagged for -stable, one being for a fairly serious bug that can
corrupt metadata and make it hard to recovery an array. The other is
for a more recent regression since 3.3"
* tag 'md-3.4-fixes' of git://neil.brown.name/md:
md: fix possible corruption of array metadata on shutdown.
md: don't call ->add_disk unless there is good reason.
DM RAID: Use safe version of rdev_for_each
Linus Torvalds [Tue, 24 Apr 2012 01:22:42 +0000 (18:22 -0700)]
Merge tag 'dlm-fixes-3.4' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm fixes from David Teigland:
"This includes one short patch fixing the behavior of the QUECVT flag,
which the gfs2 folks are waiting on."
* tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix QUECVT when convert queue is empty
Hugh Dickins [Mon, 23 Apr 2012 18:14:50 +0000 (11:14 -0700)]
mm: fix s390 BUG by __set_page_dirty_no_writeback on swap
Mel reports a BUG_ON(slot == NULL) in radix_tree_tag_set() on s390
3.0.13: called from __set_page_dirty_nobuffers() when page_remove_rmap()
tries to transfer dirty flag from s390 storage key to struct page and
radix_tree.
That would be because of reclaim's shrink_page_list() calling
add_to_swap() on this page at the same time: first PageSwapCache is set
(causing page_mapping(page) to appear as &swapper_space), then
page->private set, then tree_lock taken, then page inserted into
radix_tree - so there's an interval before taking the lock when the
radix_tree slot is empty.
We could fix this by moving __add_to_swap_cache()'s spin_lock_irq up
before the SetPageSwapCache. But a better fix is simply to do what's
five years overdue: Ken Chen introduced __set_page_dirty_no_writeback()
(if !PageDirty TestSetPageDirty) for tmpfs to skip all the radix_tree
overhead, and swap is just the same - it ignores the radix_tree tag, and
does not participate in dirty page accounting, so should be using
__set_page_dirty_no_writeback() too.
s390 testing now confirms that this does indeed fix the problem.
Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ken Chen <kenchen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Tue, 24 Apr 2012 00:23:16 +0000 (10:23 +1000)]
md: fix possible corruption of array metadata on shutdown.
commit
c744a65c1e2d59acc54333ce8
md: don't set md arrays to readonly on shutdown.
removed the possibility of a 'BUG' when data is written to an array
that has just been switched to read-only, but also introduced the
possibility that the array metadata could be corrupted.
If, when md_notify_reboot gets the mddev lock, the array is
in a state where it is assembled but hasn't been started (as can
happen if the personality module is not available, or in other unusual
situations), then incorrect metadata will be written out making it
impossible to re-assemble the array.
So only call __md_stop_writes() if the array has actually been
activated.
This patch is needed for any stable kernel which has had the above
commit applied.
Cc: stable@vger.kernel.org
Reported-by: Christoph Nelles <evilazrael@evilazrael.de>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 24 Apr 2012 00:23:14 +0000 (10:23 +1000)]
md: don't call ->add_disk unless there is good reason.
Commit
7bfec5f35c68121e7b18
md/raid5: If there is a spare and a want_replacement device, start replacement.
cause md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.
So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.
This fix is suitable for 3.3.y:
Cc: stable@vger.kernel.org
Reported-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>