Masanari Iida [Fri, 13 Jul 2012 07:22:47 +0000 (07:22 +0000)]
irda: Fix typo in irda
Correct spelling typo in irda.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioan Orghici [Fri, 13 Jul 2012 07:16:37 +0000 (07:16 +0000)]
sctp: fix sparse warning for sctp_init_cause_fixed
Fix the following sparse warning:
* symbol 'sctp_init_cause_fixed' was not declared. Should it be
static?
Signed-off-by: Ioan Orghici <ioanorghici@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 16 Jul 2012 14:25:56 +0000 (14:25 +0000)]
bnx2: Try to recover from PCI block reset
If the PCI block has reset, the memory enable bit will be reset and
the device will not respond to MMIO access. bnx2_reset_task() currently
will not recover when this happens. Add code to detect this condition
and restore the PCI state. This scenario has been reported by some
users.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 16 Jul 2012 16:24:02 +0000 (16:24 +0000)]
tg3: Add hwmon support for temperature
Some tg3 devices have management firmware that can export sensor data.
Export temperature sensor reading via hwmon sysfs.
[hwmon interface suggested by Ben Hutchings <bhutchings@solarflare.com>]
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Mon, 16 Jul 2012 16:24:01 +0000 (16:24 +0000)]
tg3: Add APE scratchpad read function
for retreiving temperature sensor data.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Mon, 16 Jul 2012 16:24:00 +0000 (16:24 +0000)]
tg3: Add common function tg3_ape_event_lock()
by refactoring code in tg3_ape_send_event(). The common function will
be used in subsequent patches.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 16 Jul 2012 16:23:59 +0000 (16:23 +0000)]
tg3: Fix the setting of the APE_HAS_NCSI flag
The driver currently skips setting this flag if the VPD contains the
firmware version string. We fix this by separating the probing of NCSI
from the reading of the NCSI version string. The APE_HAS_NCSI flag is
needed to properly read sensor data.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 14 Jul 2012 03:16:27 +0000 (03:16 +0000)]
netem: refine early skb orphaning
netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
or any mechanism relying on socket sk_wmem_alloc feedback.
Ideally, we should perform this orphaning after the rate module and
before the delay module, to mimic what happens on a real link :
skb orphaning is indeed normally done at TX completion, before the
transit on the link.
+-------+ +--------+ +---------------+ +-----------------+
+ Qdisc +---> Device +--> TX completion +--> links / hops +->
+ + + xmit + + skb orphaning + + propagation +
+-------+ +--------+ +---------------+ +-----------------+
< rate limiting > < delay, drops, reorders >
If netem is used without delay feature (drops, reorders, rate
limiting), then we should avoid early skb orphaning, to keep pressure
on sockets as long as packets are still in qdisc queue.
Ideally, netem should be refactored to implement delay module
as the last stage. Current algorithm merges the two phases
(rate limiting + delay) so its not correct.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 17 Jul 2012 06:04:00 +0000 (23:04 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jett Kirsher says:
====================
This series contains updates to e1000e and ixgbe.
...
Alexander Duyck (5):
ixgbe: Simplify logic for getting traffic class from user priority
ixgbe: Cleanup unpacking code for DCB
ixgbe: Populate the prio_tc_map in ixgbe_setup_tc
ixgbe: Add function for obtaining FCoE TC based on FCoE user priority
ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 13 Jul 2012 03:19:41 +0000 (03:19 +0000)]
be2net: dont pull too much data in skb linear part
skb_fill_rx_data() pulls 64 byte of data in skb->data
Its too much for TCP (with no options) on IPv4, as total size of headers
is 14 + 40 = 54
This means tcp stack and splice() are suboptimal, since tcp payload
is in part in tcp->data, and in part in skb frag.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Fri, 13 Jul 2012 02:46:03 +0000 (02:46 +0000)]
be2net: update driver version
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Fri, 13 Jul 2012 02:45:51 +0000 (02:45 +0000)]
be2net: Add description about various RSS hash types
Incorporated review comment from Eric Dumazet. Added description
about different RSS hash types which adapter is capable of.
Will add support for ETHTOOL_GRXFH and ETHTOOL_SRXFX as suggested
by Ben Hutchings in a later patch.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 10 Jul 2012 22:29:19 +0000 (22:29 +0000)]
bridge: Fix enforcement of multicast hash_max limit
The hash size is doubled when it needs to grow and compared against
hash_max. The >= comparison will limit the hash table size to half
of what is expected i.e. the default 512 hash_max will not allow
the hash table to grow larger than 256.
Also print the hash table limit instead of the desirable size when
the limit is reached.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 10 Jul 2012 20:34:07 +0000 (20:34 +0000)]
net/mlx4_en: dereferencing freed memory
We dereferenced "mclist" after the kfree().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 10 Jul 2012 20:33:36 +0000 (20:33 +0000)]
net/mlx4: off by one in parse_trans_rule()
This should be ">=" here instead of ">". MLX4_NET_TRANS_RULE_NUM is 6.
We use "spec->id" as an array offset into the __rule_hw_sz[] and
__sw_id_hw[] arrays which have 6 elements.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Ovsienko [Tue, 10 Jul 2012 04:45:50 +0000 (04:45 +0000)]
ipv6: fix RTPROT_RA markup of RA routes w/nexthops
Userspace implementations of network routing protocols sometimes need to
tell RA-originated IPv6 routes from other kernel routes to make proper
routing decisions. This makes most sense for RA routes with nexthops,
namely, default routes and Route Information routes.
The intended mean of preserving RA route origin in a netlink message is
through indicating RTPROT_RA as protocol code. Function rt6_fill_node()
tried to do that for default routes, but its test condition was taken
wrong. This change is modeled after the original mailing list posting
by Jeff Haran. It fixes the test condition for default route case and
sets the same behaviour for Route Information case (both types use
nexthops). Handling of the 3rd RA route type, Prefix Information, is
left unchanged, as it stands for interface connected routes (without
nexthops).
Signed-off-by: Denis Ovsienko <infrastation@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Tue, 10 Jul 2012 07:19:22 +0000 (07:19 +0000)]
hyperv: Add support for setting MAC from within guests
This adds support for setting synthetic NIC MAC address from within Linux
guests. Before using this feature, the option "spoofing of MAC address"
should be enabled at the Hyper-V manager / Settings of the synthetic
NIC.
Thanks to Kin Cho <kcho@infoblox.com> for the initial implementation and
tests. And, thanks to Long Li <longli@microsoft.com> for the debugging
works.
Reported-and-tested-by: Kin Cho <kcho@infoblox.com>
Reported-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Cheneau [Wed, 11 Jul 2012 06:51:16 +0000 (06:51 +0000)]
6lowpan: Change byte order when storing/accessing to len field
Lenght field should be encoded using big endian byte order, such as intend in the specs.
As it is currently written, the len field would not be decoded properly on an implementation using the correct byte ordering. Hence, it could lead to interroperability issues.
Also, I rewrote the code so that iphc0 argument of lowpan_alloc_new_frame could be removed.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Cheneau [Wed, 11 Jul 2012 06:51:15 +0000 (06:51 +0000)]
6lowpan: Change byte order when storing/accessing u16 tag
The tag field should be stored and accessed using big endian byte order (as
intended in the specs). Or else, when displayed with a trafic analyser, such a
Wireshark, the field not properly displayed (e.g. 0x01 00 instead of 0x00 01,
and so on).
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Cheneau [Wed, 11 Jul 2012 06:51:14 +0000 (06:51 +0000)]
6lowpan: Fix null pointer dereference in UDP uncompression function
When a UDP packet gets fragmented, a crash will occur at reassembly time.
This is because skb->transport_header is not set during earlier period of fragment reassembly.
As a consequence, call to udp_hdr() return NULL and uh (which is NULL) gets
dereferenced without much test.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 13 Jul 2012 05:33:12 +0000 (22:33 -0700)]
arch: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 13 Jul 2012 05:33:11 +0000 (22:33 -0700)]
usb: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 13 Jul 2012 05:33:10 +0000 (22:33 -0700)]
s390: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 12 Jul 2012 19:33:09 +0000 (19:33 +0000)]
drivers/net: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 12 Jul 2012 19:33:08 +0000 (19:33 +0000)]
wireless: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 12 Jul 2012 19:33:07 +0000 (19:33 +0000)]
net: usb: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 12 Jul 2012 19:33:06 +0000 (19:33 +0000)]
ethernet: Use eth_random_addr
Convert the existing uses of random_ether_addr to
the new eth_random_addr.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 12 Jul 2012 19:33:05 +0000 (19:33 +0000)]
etherdevice: Rename random_ether_addr to eth_random_addr
Add some API symmetry to eth_broadcast_addr and
add a #define to the old name for backward compatibility.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 17 Jul 2012 05:33:32 +0000 (22:33 -0700)]
Merge branch 'tipc_net-next' of git://git./linux/kernel/git/paulg/linux
Paul Gortmaker says:
====================
This is the same eight commits as sent for review last week[1],
with just the incorporation of the pr_fmt change as suggested
by JoeP. There was no additional change requests, so unless you
can see something else you'd like me to change, please pull.
...
Erik Hugne (5):
tipc: use standard printk shortcut macros (pr_err etc.)
tipc: remove TIPC packet debugging functions and macros
tipc: simplify print buffer handling in tipc_printf
tipc: phase out most of the struct print_buf usage
tipc: remove print_buf and deprecated log buffer code
Paul Gortmaker (3):
tipc: factor stats struct out of the larger link struct
tipc: limit error messages relating to memory leak to one line
tipc: simplify link_print by divorcing it from using tipc_printf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey Vagin [Mon, 16 Jul 2012 04:28:49 +0000 (04:28 +0000)]
net: make sock diag per-namespace
Before this patch sock_diag works for init_net only and dumps
information about sockets from all namespaces.
This patch expands sock_diag for all name-spaces.
It creates a netlink kernel socket for each netns and filters
data during dumping.
v2: filter accoding with netns in all places
remove an unused variable.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pavel Emelyanov <xemul@parallels.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Mon, 16 Jul 2012 02:29:05 +0000 (02:29 +0000)]
lpc_eth: remove duplicated include
Remove duplicated #include <linux/delay.h> in
drivers/net/ethernet/nxp/lpc_eth.c
Signed-off-by: Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 16 Jul 2012 01:41:36 +0000 (01:41 +0000)]
tcp: add OFO snmp counters
Add three SNMP TCP counters, to better track TCP behavior
at global stage (netstat -s), when packets are received
Out Of Order (OFO)
TCPOFOQueue : Number of packets queued in OFO queue
TCPOFODrop : Number of packets meant to be queued in OFO
but dropped because socket rcvbuf limit hit.
TCPOFOMerge : Number of packets in OFO that were merged with
other packets.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Sat, 30 Jun 2012 00:14:01 +0000 (00:14 +0000)]
ixgbe: Merge FCoE set_num and cache_ring calls into RSS/DCB config
This change merges the ixgbe_cache_ring_fcoe and ixgbe_set_fcoe_queues
logic into the DCB and RSS initialization calls.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Sat, 2 Jun 2012 00:11:02 +0000 (00:11 +0000)]
ixgbe: Add function for obtaining FCoE TC based on FCoE user priority
In upcoming patches it will become increasingly common to need to determine
the FCoE traffic class in order to determine the correct queues for FCoE.
In order to make this easier I am adding a function for obtaining the FCoE
traffic class based on the user priority.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Fri, 18 May 2012 06:33:31 +0000 (06:33 +0000)]
ixgbe: Populate the prio_tc_map in ixgbe_setup_tc
There were cases where the prio_tc_map was not populated when we were
calling open. This will result in us incorrectly configuring the traffic
classes when DCB is enabled. In order to correct this I have updated the
code so that we now populate the values prior to allocating the q_vectors
and calling ixgbe_open.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 17 May 2012 05:14:39 +0000 (05:14 +0000)]
ixgbe: Cleanup unpacking code for DCB
This is meant to be a generic clean-up of the remaining functions for
unpacking data from the DCB structures. The only real changes are:
replaced the variable i with tc for functions that were looping through the
traffic classes, and added a pointer for tc_class instead of path since
that way we only need to pull the pointer once instead of once per loop.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Thu, 17 May 2012 05:14:34 +0000 (05:14 +0000)]
ixgbe: Simplify logic for getting traffic class from user priority
This patch is meant to help simplify the logic for getting traffic classes
from user priorities. To do this I am adding a function named
ixgbe_dcb_get_tc_from_up that will go through the traffic classes in
reverse order in order to determine which traffic class contains a bit for
a given user priority.
Adding a declaration for this new function to the header so that
we have a centralized means for sorting out traffic classes belonging to
features such as FCoE.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Thu, 12 Jul 2012 00:02:42 +0000 (00:02 +0000)]
e1000e: Program the correct register for ITR when using MSI-X.
When configuring interrupt throttling on 82574 in MSI-X mode, we need to
be programming the EITR registers instead of the ITR register.
-rc2: Renamed e1000_write_itr() to e1000e_write_itr(), fixed whitespace
issues, and removed unnecessary !! operation.
-rc3: Reduced the scope of the loop variable in e1000e_write_itr().
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tushar Dave [Thu, 12 Jul 2012 08:00:15 +0000 (08:00 +0000)]
e1000e: Cleanup code logic in e1000_check_for_serdes_link_82571()
Cleanup code to make it more clean and readable.
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Steffen Klassert [Thu, 5 Jul 2012 23:39:34 +0000 (23:39 +0000)]
xfrm: Initialize the struct xfrm_dst behind the dst_enty field
We start initializing the struct xfrm_dst at the first field
behind the struct dst_enty. This is error prone because it
might leave a new field uninitialized. So start initializing
the struct xfrm_dst right behind the dst_entry.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Thu, 5 Jul 2012 23:37:09 +0000 (23:37 +0000)]
ipv6: Initialize the struct rt6_info behind the dst_enty field
We start initializing the struct rt6_info at the first field
behind the struct dst_enty. This is error prone because it
might leave a new field uninitialized. So start initializing
the struct rt6_info right behind the dst_entry.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 14 Jul 2012 06:02:28 +0000 (23:02 -0700)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next
John Linville says:
====================
Several drivers see updates: mwifiex, ath9k, iwlwifi, brcmsmac,
wlcore/wl12xx/wl18xx, and a handful of others. The bcma bus got a
lot of attention from Hauke Mehrtens. The cfg80211 component gets
a flurry of patches for multi-channel support, and the mac80211
component gets the first few VHT (11ac) and 60GHz (11ad) patches.
This also includes the removal of the iwmc3200 drivers, since the
hardware never became available to normal people.
Additionally, the NFC subsystem gets a series of updates. According to
Samuel, "Here are the interesting bits:
- A better error management for the HCI stack.
- An LLCP "late" binding implementation for a better NFC SAP usage. SAPs are
now reserved only when there's a client for it.
- Support for Sony RC-S360 (a.k.a. PaSoRi) pn533 based dongle. We can read and
write NFC tags and also establish a p2p link with this dongle now.
- A few LLCP fixes."
Finally, this includes another pull of the fixes from the wireless
tree in order to resolve some merge issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Fri, 29 Jun 2012 04:50:24 +0000 (00:50 -0400)]
tipc: remove print_buf and deprecated log buffer code
The internal log buffer handling functions can now safely be
removed since there is no code using it anymore. Requests to
interact with the internal tipc log buffer over netlink (in
config.c) will report 'obsolete command'.
This represents the final removal of any references to a
struct print_buf, and the removal of the struct itself.
We also get rid of a TIPC specific Kconfig in the process.
Finally, log.h is removed since it is not needed anymore.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Erik Hugne [Fri, 29 Jun 2012 04:50:23 +0000 (00:50 -0400)]
tipc: phase out most of the struct print_buf usage
The tipc_printf is renamed to tipc_snprintf, as the new name
describes more what the function actually does. It is also
changed to take a buffer and length parameter and return
number of characters written to the buffer. All callers of
this function that used to pass a print_buf are updated.
Final removal of the struct print_buf itself will be done
synchronously with the pending removal of the deprecated
logging code that also was using it.
Functions that build up a response message with a list of
ports, nametable contents etc. are changed to return the number
of characters written to the output buffer. This information
was previously hidden in a field of the print_buf struct, and
the number of chars written was fetched with a call to
tipc_printbuf_validate. This function is removed since it
is no longer referenced nor needed.
A generic max size ULTRA_STRING_MAX_LEN is defined, named
in keeping with the existing TIPC_TLV_ULTRA_STRING, and the
various definitions in port, link and nametable code that
largely duplicated this information are removed. This means
that amount of link statistics that can be returned is now
increased from 2k to 32k.
The buffer overflow check is now done just before the reply
message is passed over netlink or TIPC to a remote node and
the message indicating a truncated buffer is changed to a less
dramatic one (less CAPS), placed at the end of the message.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Erik Hugne [Fri, 29 Jun 2012 04:50:22 +0000 (00:50 -0400)]
tipc: simplify print buffer handling in tipc_printf
tipc_printf was previously used both to construct debug traces
and to append data to buffers that should be sent over netlink
to the tipc-config application. A global print_buffer was
used to format the string before it was copied to the actual
output buffer. This could lead to concurrent access of the
global print_buffer, which then had to be lock protected.
This is simplified by changing tipc_printf to append data
directly to the output buffer using vscnprintf.
With the new implementation of tipc_printf, there is no longer
any risk of concurrent access to the internal log buffer, so
the lock (and the comments describing it) are no longer
strictly necessary. However, there are still a few functions
that do grab this lock before resizing/dumping the log
buffer. We leave the lock, and these functions untouched since
they will be removed with a subsequent commit that drops the
deprecated log buffer handling code
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Paul Gortmaker [Wed, 11 Jul 2012 23:27:56 +0000 (19:27 -0400)]
tipc: simplify link_print by divorcing it from using tipc_printf
To pave the way for a pending cleanup of tipc_printf, and
removal of struct print_buf entirely, we make that task simpler
by converting link_print to issue its messages with standard
printk infrastructure. [Original idea separated from a larger
patch from Erik Hugne <erik.hugne@ericsson.com>]
Cc: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Erik Hugne [Fri, 29 Jun 2012 04:50:21 +0000 (00:50 -0400)]
tipc: remove TIPC packet debugging functions and macros
The link queue traces and packet level debug functions served
a purpose during early development, but are now redundant
since there are other, more capable tools available for
debugging at the packet level.
The TIPC_DEBUG Kconfig option is removed since it does not
provide any extra debugging features anymore.
This gets rid of a lot of tipc_printf usages, which will
make the pending cleanup work of that function easier.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Erik Hugne [Fri, 29 Jun 2012 04:16:37 +0000 (00:16 -0400)]
tipc: use standard printk shortcut macros (pr_err etc.)
All messages should go directly to the kernel log. The TIPC
specific error, warning, info and debug trace macro's are
removed and all references replaced with pr_err, pr_warn,
pr_info and pr_debug.
Commonly used sub-strings are explicitly declared as a const
char to reduce .text size.
Note that this means the debug messages (changed to pr_debug),
are now enabled through dynamic debugging, instead of a TIPC
specific Kconfig option (TIPC_DEBUG). The latter will be
phased out completely
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: use pr_fmt as suggested by Joe Perches <joe@perches.com>]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
David S. Miller [Fri, 13 Jul 2012 15:21:29 +0000 (08:21 -0700)]
ipv4: Don't store a rule pointer in fib_result.
We only use it to fetch the rule's tclassid, so just store the
tclassid there instead.
This also decreases the size of fib_result by a full 8 bytes on
64-bit. On 32-bits it's a wash.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 12 Jul 2012 22:46:09 +0000 (22:46 +0000)]
tcp: add LAST_ACK as a valid state for TSQ
Socket state LAST_ACK should allow TSQ to send additional frames,
or else we rely on incoming ACKS or timers to send them.
Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg KH [Thu, 12 Jul 2012 15:39:44 +0000 (15:39 +0000)]
tg3: add device id of Apple Thunderbolt Ethernet device
The Apple Thunderbolt ethernet device is already listed in the driver,
but not hooked up in the MODULE_DEVICE_TABLE(). This fixes that and
allows it to work properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 12 Jul 2012 14:23:50 +0000 (14:23 +0000)]
net: Update alloc frag to reduce get/put page usage and recycle pages
This patch is meant to help improve performance by reducing the number of
locked operations required to allocate a frag on x86 and other platforms.
This is accomplished by using atomic_set operations on the page count
instead of calling get_page and put_page. It is based on work originally
provided by Eric Dumazet.
In addition it also helps to reduce memory overhead when using TCP. This
is done by recycling the page if the only holder of the frame is the
netdev_alloc_frag call itself. This can occur when skb heads are stolen by
either GRO or TCP and the driver providing the packets is using paged frags
to store all of the data for the packets.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 12 Jul 2012 17:44:50 +0000 (13:44 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem
David S. Miller [Thu, 12 Jul 2012 16:39:28 +0000 (09:39 -0700)]
ipv4: Remove tb_peers from fib_table.
No longer used.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 15:18:45 +0000 (08:18 -0700)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:47 +0000 (03:57 +0000)]
be2net: Enable RSS UDP hashing for Lancer and Skyhawk
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:35 +0000 (03:57 +0000)]
be2net: Fix port name in message during driver load
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:21 +0000 (03:57 +0000)]
be2net: Fix cleanup path when EQ creation fails
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:57:09 +0000 (03:57 +0000)]
be2net: Activate new FW after FW download for Lancer
After FW download, activate new FW by invoking FW reset.
Recreate rings once new FW is operational.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:58 +0000 (03:56 +0000)]
be2net: Fix initialization sequence for Lancer
Invoke only required initialization routines for Lancer.
Remove invocation of unnecessary routines.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:46 +0000 (03:56 +0000)]
be2net : Fix die temperature stat for Lancer
Query die temperature stat for Lancer to report it correctly
in ethtool.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:11 +0000 (03:56 +0000)]
be2net: Fix error while toggling autoneg of pause parameters
Autonegotiation of pause parameters is possible only on some PHYs.
Ability of autoneg of pause parameters is reported by adapter.
Autoneg of pause parameters cannot be changed from driver.
Fix driver to give error when autoneg mode is toggled by user.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Jul 2012 05:34:04 +0000 (05:34 +0000)]
team: make team_port_enabled() and team_port_txable() static inline
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Jul 2012 05:34:03 +0000 (05:34 +0000)]
team: add broadcast mode
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 11 Jul 2012 05:34:02 +0000 (05:34 +0000)]
team: use function team_port_txable() for determing enabled and up port
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 15:06:04 +0000 (08:06 -0700)]
ipv4: Put proper checks into icmp_socket_deliver().
All handler->err() routines expect that we've done a pskb_may_pull()
test to make sure that IP header length + 8 bytes can be safely
pulled.
Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 15:00:56 +0000 (08:00 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Florian Westphal [Wed, 11 Jul 2012 10:56:57 +0000 (10:56 +0000)]
net: sched: add ipset ematch
Can be used to match packets against netfilter ip sets created via ipset(8).
skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.
Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).
Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Wed, 11 Jul 2012 08:56:55 +0000 (08:56 +0000)]
netxen: fix link notification order
First update the adapter variables with the current speed and
mode before fire the notification. Otherwise, the get_settings()
may provide old values.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:48 +0000 (21:22 +0000)]
6lowpan: rework fragment-deleting routine
6lowpan module starts collecting incomming frames and fragments
right after lowpan_module_init() therefor it will be better to
clean unfinished fragments in lowpan_cleanup_module() function
instead of doing it when link goes down.
Changed spinlocks type to prevent deadlock with expired timer event
and removed unused one.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:47 +0000 (21:22 +0000)]
6lowpan: fix tag variable size
Function lowpan_alloc_new_frame() takes u8 tag as an argument. However,
its only caller, lowpan_process_data() passes down a u16. Hence,
the tag value can get corrupted. This prevent 6lowpan fragment reassembly of a
message when the fragment tag value is over 256.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:46 +0000 (21:22 +0000)]
mac802154: sparse warnings: make symbols static
Make symbols static to avoid the following warning shown up
by sparse:
warning: symbol ... was not declared. Should it be static?
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:45 +0000 (21:22 +0000)]
6lowpan: get extra headroom in allocated frame
Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some
extra headroom in case we need to forward this frame in a tunnel or
something else.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:44 +0000 (21:22 +0000)]
mac802154: add get short address method
Add method to get the device short 802.15.4 address. This call
needed by ieee802154 layer to satisfy 'iz list' request from
the user space.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:43 +0000 (21:22 +0000)]
drivers/ieee802154/at86rf230: rework irq handler
Fix LOCKDEP bug message for the irq handler spinlock.
Make the irq processing code more explicit and stable.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:42 +0000 (21:22 +0000)]
6lowpan: revert: add missing spin_lock_init()
Revert the commit
768f7c7c121e80f458a9d013b2e8b169e5dfb1e5 to initialize
spinlock in the more preferable way and make it static to avoid sparse
warning.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 10 Jul 2012 20:32:51 +0000 (20:32 +0000)]
smsc95xx: signedness bug in get_regs()
"retval" has to be a signed integer for the error handling to work.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Ungerer [Wed, 4 Jul 2012 13:50:00 +0000 (13:50 +0000)]
net: add support for NS8390 based eth controllers on some ColdFire CPU boards
A number of older ColdFire CPU based boards use NS8390 based network
controllers. Most use the Davicom 9008F or the UMC 9008F. This driver
provides the support code to get these devices working on these platforms.
Generally the NS8390 based eth device is direct connected via the general
purpose bus of the ColdFire CPU. So its addressing and interrupt setup is
fixed on each of the different platforms (classic platform setup).
This driver is based on the other drivers/net/ethernet/8390 drivers, and
includes the lib8390.c code. It uses the existing definitions of the
board NS8390 device addresses, interrupts and access types from the
arch/m68k/include/asm/mcf8390.h, but moves the IO access functions into
the driver code and out of that header.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Ungerer [Wed, 4 Jul 2012 13:49:59 +0000 (13:49 +0000)]
m68knommu: move the badly named mcfne.h to a better mcf8390.h
The mcfne.h include contains definitions to support NS8390 eth based hardware
on ColdFire based CPU boards. So change its name to reflect that better.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 14:40:05 +0000 (07:40 -0700)]
ipv4: Fix warnings in ip_do_redirect() for some configurations.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Wed, 11 Jul 2012 21:35:01 +0000 (17:35 -0400)]
tipc: limit error messages relating to memory leak to one line
With the default name table size of 1024, it is possible that
the sanity check in tipc_nametbl_stop could spam out 1024
essentially identical error messages if memory was corrupted
or similar. Limit it to issuing no more than a single message.
The actual chain number (i.e. 0 --> 1023) wouldn't provide any
useful insight if/when such an instance happened, so don't
bother printing out that value.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Paul Gortmaker [Wed, 11 Jul 2012 13:40:43 +0000 (09:40 -0400)]
tipc: factor stats struct out of the larger link struct
This is done to improve readability, and so that we can give
the struct a name that will allow us to declare a local
pointer to it in code, instead of having to always redirect
through the link struct to get to it.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
David S. Miller [Thu, 12 Jul 2012 10:49:19 +0000 (03:49 -0700)]
Merge branch 'redirect_via_sock'
As described in my patch series from the other day, we need to
rearrange redirect handling so that the local initiators of packets
(sockets, tunnels, xfrms, etc.) that implement the protocols compute
the route and pass this down into the ipv4/ipv6 routing code.
These changes here do so by implementing a new dst_ops->redirect
method.
No more do we have this funny code that tries several different sets
of routing keys to try and figure out which route the redirect should
actually be applied to.
No more do we have the problem wherein TOS rewriting causes problems
for us.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:41:25 +0000 (00:41 -0700)]
net: Remove checks for dst_ops->redirect being NULL.
No longer necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:39:24 +0000 (00:39 -0700)]
net: Add dummy dst_ops->redirect method where needed.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:33:37 +0000 (00:33 -0700)]
ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().
And delete rt6_redirect(), since it is no longer used.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:25:15 +0000 (00:25 -0700)]
ipv6: Add redirect support to all protocol icmp error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:08:07 +0000 (00:08 -0700)]
ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 07:05:02 +0000 (00:05 -0700)]
ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().
Hook it into dst_ops->redirect as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 06:43:53 +0000 (23:43 -0700)]
ipv6: Move bulk of redirect handling into rt6_redirect().
This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 06:26:46 +0000 (23:26 -0700)]
ipv6: Export ndisc option parsing from ndisc.c
This is going to be used internally by the rt6 redirect code.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 04:30:08 +0000 (21:30 -0700)]
ipv4: Kill ip_rt_redirect().
No longer needed, as the protocol handlers now all properly
propagate the redirect back into the routing code.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 04:27:49 +0000 (21:27 -0700)]
ipv4: Add redirect support to all protocol icmp error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 04:25:45 +0000 (21:25 -0700)]
ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 03:55:47 +0000 (20:55 -0700)]
ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.
All of the redirect acceptance policy is now contained within.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 03:38:08 +0000 (20:38 -0700)]
ipv4: Rearrange arguments to ip_rt_redirect()
Pass in the SKB rather than just the IP addresses, so that policy
and other aspects can reside in ip_rt_redirect() rather then
icmp_redirect().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 03:27:54 +0000 (20:27 -0700)]
ipv4: Pull redirect instantiation out into a helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 01:35:12 +0000 (18:35 -0700)]
ipv4: Deliver ICMP redirects to sockets too.
And thus, we can remove the ping_err() hack.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Jul 2012 01:32:17 +0000 (18:32 -0700)]
ipv4: Pull icmp socket delivery out into a helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Jul 2012 05:50:31 +0000 (05:50 +0000)]
tcp: TCP Small Queues
This introduce TSQ (TCP Small Queues)
TSQ goal is to reduce number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.
sk->sk_wmem_alloc not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.
TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.
As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2 : It can help to reduce
latencies of high prio packets, having smaller TSO packets.
This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.
Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.
Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)
I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and both side socket autotuning no longer use 4 Mbytes.
As skb destructor cannot restart xmit itself ( as qdisc lock might be
taken at this point ), we delegate the work to a tasklet. We use one
tasklest per cpu for performance reasons.
If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.
[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
but some drivers call it in their start_xmit() handler.
These drivers should at least use BQL, or else a single TCP
session can still fill the whole NIC TX ring, since TSQ will
have no effect.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>