GitHub/MotorolaMobilityLLC/kernel-slsi.git
12 years agobe2net : Fix die temperature stat for Lancer
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:46 +0000 (03:56 +0000)]
be2net : Fix die temperature stat for Lancer

Query die temperature stat for Lancer to report it correctly
in ethtool.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: Fix error while toggling autoneg of pause parameters
Padmanabh Ratnakar [Thu, 12 Jul 2012 03:56:11 +0000 (03:56 +0000)]
be2net: Fix error while toggling autoneg of pause parameters

Autonegotiation of pause parameters is possible only on some PHYs.
Ability of autoneg of pause parameters is reported by adapter.
Autoneg of pause parameters cannot be changed from driver.
Fix driver to give error when autoneg mode is toggled by user.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: make team_port_enabled() and team_port_txable() static inline
Jiri Pirko [Wed, 11 Jul 2012 05:34:04 +0000 (05:34 +0000)]
team: make team_port_enabled() and team_port_txable() static inline

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: add broadcast mode
Jiri Pirko [Wed, 11 Jul 2012 05:34:03 +0000 (05:34 +0000)]
team: add broadcast mode

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoteam: use function team_port_txable() for determining enabled and up port
Jiri Pirko [Wed, 11 Jul 2012 05:34:02 +0000 (05:34 +0000)]
team: use function team_port_txable() for determining enabled and up port

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Put proper checks into icmp_socket_deliver().
David S. Miller [Thu, 12 Jul 2012 15:06:04 +0000 (08:06 -0700)]
ipv4: Put proper checks into icmp_socket_deliver().

All handler->err() routines expect that we've done a pskb_may_pull()
test to make sure that IP header length + 8 bytes can be safely
pulled.
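
A minimal userspace sketch of the length check described above, assuming
a flat buffer instead of an skb. The helper name inner_header_fits() and
the constant ICMP_PAYLOAD_MIN are hypothetical; in the kernel the test is
done with pskb_may_pull(), which also makes the bytes linear.

```c
#include <stddef.h>

#define ICMP_PAYLOAD_MIN 8u  /* bytes past the IP header that handlers read */

/* Returns 1 when the embedded IPv4 header plus 8 bytes fit inside len bytes. */
static int inner_header_fits(const unsigned char *inner, size_t len)
{
    size_t ihl_bytes;

    if (len < 1)
        return 0;                               /* cannot even read the IHL */
    ihl_bytes = (size_t)(inner[0] & 0x0f) * 4;  /* IHL counts 32-bit words */
    if (ihl_bytes < 20)
        return 0;                               /* shorter than a minimal header */
    return len >= ihl_bytes + ICMP_PAYLOAD_MIN;
}
```

A handler would only be called when this kind of check succeeds.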

Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 12 Jul 2012 15:00:56 +0000 (08:00 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

12 years agonet: sched: add ipset ematch
Florian Westphal [Wed, 11 Jul 2012 10:56:57 +0000 (10:56 +0000)]
net: sched: add ipset ematch

Can be used to match packets against netfilter ip sets created via ipset(8).
skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.

Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).

Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetxen: fix link notification order
Flavio Leitner [Wed, 11 Jul 2012 08:56:55 +0000 (08:56 +0000)]
netxen: fix link notification order

First update the adapter variables with the current speed and
mode before firing the notification. Otherwise, get_settings()
may provide old values.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: rework fragment-deleting routine
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:48 +0000 (21:22 +0000)]
6lowpan: rework fragment-deleting routine

The 6lowpan module starts collecting incoming frames and fragments
right after lowpan_module_init(), therefore it is better to
clean up unfinished fragments in the lowpan_cleanup_module() function
instead of doing it when the link goes down.

Changed spinlocks type to prevent deadlock with expired timer event
and removed unused one.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: fix tag variable size
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:47 +0000 (21:22 +0000)]
6lowpan: fix tag variable size

Function lowpan_alloc_new_frame() takes u8 tag as an argument. However,
its only caller, lowpan_process_data() passes down a u16. Hence,
the tag value can get corrupted. This prevents 6lowpan fragment reassembly of a
message when the fragment tag value is over 255.
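
A small standalone sketch of the truncation, not the driver code itself.
store_tag_u8() and store_tag_u16() are hypothetical stand-ins for the old
and the fixed prototype.

```c
#include <stdint.h>

/* Old shape: the parameter is 8 bits wide, so a u16 argument is truncated. */
static uint16_t store_tag_u8(uint8_t tag)
{
    return tag;  /* the upper byte was already dropped at the call */
}

/* Fixed shape: the parameter matches the 16-bit tag passed by the caller. */
static uint16_t store_tag_u16(uint16_t tag)
{
    return tag;
}
```

With a tag of 0x0123 the first helper keeps only 0x23, so fragments of
different datagrams can end up with the same stored tag.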

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomac802154: sparse warnings: make symbols static
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:46 +0000 (21:22 +0000)]
mac802154: sparse warnings: make symbols static

Make symbols static to avoid the following warning reported
by sparse:

    warning: symbol ... was not declared. Should it be static?

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: get extra headroom in allocated frame
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:45 +0000 (21:22 +0000)]
6lowpan: get extra headroom in allocated frame

Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some
extra headroom in case we need to forward this frame in a tunnel or
something else.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomac802154: add get short address method
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:44 +0000 (21:22 +0000)]
mac802154: add get short address method

Add a method to get the device's short 802.15.4 address. This call
is needed by the ieee802154 layer to satisfy an 'iz list' request from
user space.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/ieee802154/at86rf230: rework irq handler
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:43 +0000 (21:22 +0000)]
drivers/ieee802154/at86rf230: rework irq handler

Fix LOCKDEP bug message for the irq handler spinlock.
Make the irq processing code more explicit and stable.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6lowpan: revert: add missing spin_lock_init()
alex.bluesman.smirnov@gmail.com [Tue, 10 Jul 2012 21:22:42 +0000 (21:22 +0000)]
6lowpan: revert: add missing spin_lock_init()

Revert the commit 768f7c7c121e80f458a9d013b2e8b169e5dfb1e5 to initialize
spinlock in the more preferable way and make it static to avoid sparse
warning.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc95xx: signedness bug in get_regs()
Dan Carpenter [Tue, 10 Jul 2012 20:32:51 +0000 (20:32 +0000)]
smsc95xx: signedness bug in get_regs()

"retval" has to be a signed integer for the error handling to work.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
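
A standalone sketch of the signedness problem fixed in the smsc95xx
commit above. The helper read_reg_fails() and the value -5 are
hypothetical; only the pattern (an error code stored in an unsigned
variable) comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical register read that fails with a negative error code. */
static int read_reg_fails(void)
{
    return -5;
}

/* Buggy pattern: an unsigned variable can never compare below zero. */
static int error_seen_unsigned(void)
{
    uint32_t retval = (uint32_t)read_reg_fails();

    if (retval < 0)  /* always false, the error is silently ignored */
        return 1;
    return 0;
}

/* Fixed pattern: a signed variable keeps the negative error code. */
static int error_seen_signed(void)
{
    int retval = read_reg_fails();

    if (retval < 0)
        return 1;
    return 0;
}
```

Compilers may warn that the unsigned comparison is always false, which is
the same hint the fix acts on.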
12 years agonet: add support for NS8390 based eth controllers on some ColdFire CPU boards
Greg Ungerer [Wed, 4 Jul 2012 13:50:00 +0000 (13:50 +0000)]
net: add support for NS8390 based eth controllers on some ColdFire CPU boards

A number of older ColdFire CPU based boards use NS8390 based network
controllers. Most use the Davicom 9008F or the UMC 9008F. This driver
provides the support code to get these devices working on these platforms.

Generally the NS8390 based eth device is directly connected via the general
purpose bus of the ColdFire CPU. So its addressing and interrupt setup is
fixed on each of the different platforms (classic platform setup).

This driver is based on the other drivers/net/ethernet/8390 drivers, and
includes the lib8390.c code. It uses the existing definitions of the
board NS8390 device addresses, interrupts and access types from the
arch/m68k/include/asm/mcf8390.h, but moves the IO access functions into
the driver code and out of that header.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agom68knommu: move the badly named mcfne.h to a better mcf8390.h
Greg Ungerer [Wed, 4 Jul 2012 13:49:59 +0000 (13:49 +0000)]
m68knommu: move the badly named mcfne.h to a better mcf8390.h

The mcfne.h include contains definitions to support NS8390 eth based hardware
on ColdFire based CPU boards. So change its name to reflect that better.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Fix warnings in ip_do_redirect() for some configurations.
David S. Miller [Thu, 12 Jul 2012 14:40:05 +0000 (07:40 -0700)]
ipv4: Fix warnings in ip_do_redirect() for some configurations.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'redirect_via_sock'
David S. Miller [Thu, 12 Jul 2012 10:49:19 +0000 (03:49 -0700)]
Merge branch 'redirect_via_sock'

As described in my patch series from the other day, we need to
rearrange redirect handling so that the local initiators of packets
(sockets, tunnels, xfrms, etc.) that implement the protocols compute
the route and pass this down into the ipv4/ipv6 routing code.

These changes here do so by implementing a new dst_ops->redirect
method.

No more do we have this funny code that tries several different sets
of routing keys to try and figure out which route the redirect should
actually be applied to.

No more do we have the problem wherein TOS rewriting causes problems
for us.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Remove checks for dst_ops->redirect being NULL.
David S. Miller [Thu, 12 Jul 2012 07:41:25 +0000 (00:41 -0700)]
net: Remove checks for dst_ops->redirect being NULL.

No longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Add dummy dst_ops->redirect method where needed.
David S. Miller [Thu, 12 Jul 2012 07:39:24 +0000 (00:39 -0700)]
net: Add dummy dst_ops->redirect method where needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().
David S. Miller [Thu, 12 Jul 2012 07:33:37 +0000 (00:33 -0700)]
ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().

And delete rt6_redirect(), since it is no longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Add redirect support to all protocol icmp error handlers.
David S. Miller [Thu, 12 Jul 2012 07:25:15 +0000 (00:25 -0700)]
ipv6: Add redirect support to all protocol icmp error handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.
David S. Miller [Thu, 12 Jul 2012 07:08:07 +0000 (00:08 -0700)]
ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().
David S. Miller [Thu, 12 Jul 2012 07:05:02 +0000 (00:05 -0700)]
ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().

Hook it into dst_ops->redirect as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Move bulk of redirect handling into rt6_redirect().
David S. Miller [Thu, 12 Jul 2012 06:43:53 +0000 (23:43 -0700)]
ipv6: Move bulk of redirect handling into rt6_redirect().

This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Export ndisc option parsing from ndisc.c
David S. Miller [Thu, 12 Jul 2012 06:26:46 +0000 (23:26 -0700)]
ipv6: Export ndisc option parsing from ndisc.c

This is going to be used internally by the rt6 redirect code.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Kill ip_rt_redirect().
David S. Miller [Thu, 12 Jul 2012 04:30:08 +0000 (21:30 -0700)]
ipv4: Kill ip_rt_redirect().

No longer needed, as the protocol handlers now all properly
propagate the redirect back into the routing code.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Add redirect support to all protocol icmp error handlers.
David S. Miller [Thu, 12 Jul 2012 04:27:49 +0000 (21:27 -0700)]
ipv4: Add redirect support to all protocol icmp error handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.
David S. Miller [Thu, 12 Jul 2012 04:25:45 +0000 (21:25 -0700)]
ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.
David S. Miller [Thu, 12 Jul 2012 03:55:47 +0000 (20:55 -0700)]
ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.

All of the redirect acceptance policy is now contained within.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Rearrange arguments to ip_rt_redirect()
David S. Miller [Thu, 12 Jul 2012 03:38:08 +0000 (20:38 -0700)]
ipv4: Rearrange arguments to ip_rt_redirect()

Pass in the SKB rather than just the IP addresses, so that policy
and other aspects can reside in ip_rt_redirect() rather than
icmp_redirect().

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Pull redirect instantiation out into a helper function.
David S. Miller [Thu, 12 Jul 2012 03:27:54 +0000 (20:27 -0700)]
ipv4: Pull redirect instantiation out into a helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Deliver ICMP redirects to sockets too.
David S. Miller [Thu, 12 Jul 2012 01:35:12 +0000 (18:35 -0700)]
ipv4: Deliver ICMP redirects to sockets too.

And thus, we can remove the ping_err() hack.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: Pull icmp socket delivery out into a helper function.
David S. Miller [Thu, 12 Jul 2012 01:32:17 +0000 (18:32 -0700)]
ipv4: Pull icmp socket delivery out into a helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: TCP Small Queues
Eric Dumazet [Wed, 11 Jul 2012 05:50:31 +0000 (05:50 +0000)]
tcp: TCP Small Queues

This introduces TSQ (TCP Small Queues)

TSQ goal is to reduce number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.

sk->sk_wmem_alloc is not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.

TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2 : It can help to reduce
latencies of high prio packets, having smaller TSO packets.
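
The arithmetic above can be sketched as follows. This is an illustration
only, not the code from the patch: tsq_throttled() and tso_size_cap() are
hypothetical names, and stopping exactly when the limit is reached is an
assumption.

```c
#include <stdint.h>

#define GSO_MAX_DEFAULT 65536u  /* standard gso max limit quoted above */

/* Nonzero when the bytes already queued below TCP have reached the limit. */
static int tsq_throttled(uint32_t queued_bytes, uint32_t limit)
{
    return queued_bytes >= limit;
}

/* TSO packets are capped to half the limit, and never above the gso max. */
static uint32_t tso_size_cap(uint32_t limit)
{
    uint32_t half = limit / 2;

    return half < GSO_MAX_DEFAULT ? half : GSO_MAX_DEFAULT;
}
```

With the ~128KB default the cap stays at 65536, and with a limit of 40000
it drops to 20000, matching the 40000/2 figure above.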

This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.

Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.

Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and both side socket autotuning no longer use 4 Mbytes.

As skb destructor cannot restart xmit itself ( as qdisc lock might be
taken at this point ), we delegate the work to a tasklet. We use one
tasklet per cpu for performance reasons.

If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
  but some drivers call it in their start_xmit() handler.
  These drivers should at least use BQL, or else a single TCP
  session can still fill the whole NIC TX ring, since TSQ will
  have no effect.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: Fix out of bounds access to tcpm_vals
Alexander Duyck [Thu, 12 Jul 2012 00:18:04 +0000 (17:18 -0700)]
tcp: Fix out of bounds access to tcpm_vals

The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
an out of bounds access due to what appears to be a typo.   I believe this
change should resolve the issue by replacing the access to RTAX_CWND with
TCP_METRIC_CWND.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Move ipv6 twsk accessors outside of CONFIG_IPV6 ifdefs.
David S. Miller [Wed, 11 Jul 2012 09:39:24 +0000 (02:39 -0700)]
ipv6: Move ipv6 twsk accessors outside of CONFIG_IPV6 ifdefs.

Fixes build when ipv6 is disabled.

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoixgbe: Merge RSS and flow director ring register caching and configuration
Alexander Duyck [Sat, 5 May 2012 05:31:04 +0000 (05:31 +0000)]
ixgbe: Merge RSS and flow director ring register caching and configuration

There are really only 3 modes that can control the number of queues.  Those
are RSS, DCB, and VMDq/SR-IOV.  Currently we have things much more broken
up than they need to be for how we are configuring the rings.  In order to
try and straighten some of this out I am going to start merging similar
functionality into single functions.  To start with I am merging the Flow
Director ring configuration into the RSS ring configuration since Flow
Director cannot function with DCB or SR-IOV.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Clean up a useless switch statement and dead code in configure_srrctl
Alexander Duyck [Sat, 5 May 2012 05:30:59 +0000 (05:30 +0000)]
ixgbe: Clean up a useless switch statement and dead code in configure_srrctl

This patch replaces a switch statement for an 82598 workaround with an if
statement that only applies to 82598. In addition I am pulling out several
dead pieces of code and instead of reading the SRRCTL register and then
modifying it we are just writing a value which we generate from scratch.
Finally I am also removing any drop enable related code since that was
moved to a function of its own.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Add feature offset value to ring features
Alexander Duyck [Sat, 5 May 2012 05:30:53 +0000 (05:30 +0000)]
ixgbe: Add feature offset value to ring features

The mask value for ring features was overloaded for FCoE which can lead to
some confusion.  In order to avoid any confusion I am splitting the mask
value and adding an offset value.  This can be used for the start of the
FCoE rings, and in the future I hope to use it to store the start of the
registers for SR-IOV.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Add upper limit to ring features
Alexander Duyck [Thu, 10 May 2012 00:01:46 +0000 (00:01 +0000)]
ixgbe: Add upper limit to ring features

We are currently using indices to indicate the upper limit on a ring
feature.  However since we can switch back and forth on features such as
DCB and that has effects on other features such as RSS it is preferable to
instead store the upper limit separate from the current value for the
number of rings related to the feature.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: count q_vectors instead of MSI-X vectors
Alexander Duyck [Sat, 5 May 2012 05:30:43 +0000 (05:30 +0000)]
ixgbe: count q_vectors instead of MSI-X vectors

It makes much more sense for us to count q_vectors instead of MSI-X
vectors.  We were using num_msix_vectors to find the number of q_vectors in
multiple places.  This was wasteful since we only had one place that
actually needs the number of MSI-X vectors and that is in slow path.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agobridge: fix endian
Li RongQing [Mon, 9 Jul 2012 23:56:12 +0000 (23:56 +0000)]
bridge: fix endian

mld->mld_maxdelay is net endian, so we should use ntohs, not htons

CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: fix endian issue
Li RongQing [Mon, 9 Jul 2012 22:02:42 +0000 (22:02 +0000)]
qlge: fix endian issue

commit 6d29b1ef introduces a bug, ntohs is __be16_to_cpu,
not cpu_to_be16.

We always use htons on IP_OFFSET and IP_MF, then compare
with the network packet.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoksz884x: fix Endian
Li RongQing [Mon, 9 Jul 2012 20:56:06 +0000 (20:56 +0000)]
ksz884x: fix Endian

ETH_P_IP is host endian and skb->protocol is big endian, so using
htons on skb->protocol when comparing them is wrong.

And fix two code style issues: indentation and remove
unnecessary parentheses.
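
A userspace sketch of the comparison rule, assuming POSIX htons()/ntohs()
from <arpa/inet.h>. PROTO_IPV4 and the helper names are hypothetical; the
point is that the conversion is applied to the host-order constant, or
ntohs() is applied to the wire value, never htons() to the wire value.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define PROTO_IPV4 0x0800  /* host-order constant, same value as ETH_P_IP */

/* Load a 16-bit field exactly as it sits on the wire (big endian). */
static uint16_t load_wire_field(const unsigned char bytes[2])
{
    uint16_t field;

    memcpy(&field, bytes, sizeof(field));
    return field;
}

/* Compare in network order: convert the constant, not the wire field. */
static int is_ipv4(uint16_t wire_proto)
{
    return wire_proto == htons(PROTO_IPV4);
}

/* Convert a wire value to host order when its numeric value is needed. */
static uint16_t wire_to_host(uint16_t wire_value)
{
    return ntohs(wire_value);
}
```

On common hosts htons() and ntohs() perform the same byte operation, so
mixing them up mostly hurts readability and sparse annotations.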

CC: Tristram Ha <Tristram.Ha@micrel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Joe Perches <joe@perches.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux
David S. Miller [Wed, 11 Jul 2012 08:28:36 +0000 (01:28 -0700)]
Merge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Wed, 11 Jul 2012 06:56:33 +0000 (23:56 -0700)]
Merge git://git./linux/kernel/git/davem/net

Conflicts:
net/batman-adv/bridge_loop_avoidance.c
net/batman-adv/bridge_loop_avoidance.h
net/batman-adv/soft-interface.c
net/mac80211/mlme.c

With merge help from Antonio Quartulli (batman-adv) and
Stephen Rothwell (drivers/net/usb/qmi_wwan.c).

The net/mac80211/mlme.c conflict seemed easy enough, accounting for a
conversion to some new tracing macros.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2: Fix bug in bnx2_free_tx_skbs().
Michael Chan [Tue, 10 Jul 2012 10:04:40 +0000 (10:04 +0000)]
bnx2: Fix bug in bnx2_free_tx_skbs().

In rare cases, bnx2_free_tx_skbs() can unmap the wrong DMA address
when it gets to the last entry of the tx ring.  We were not using
the proper macro to skip the last entry when advancing the tx index.
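
A sketch of the index-advance idea, under the assumption that the last
slot of each ring page is reserved and carries no packet data.
SLOTS_PER_PAGE and both helpers are hypothetical and are not the driver's
macros.

```c
#define SLOTS_PER_PAGE 8u  /* hypothetical; the last slot holds no packet */

/* Naive advance: walks straight onto the reserved last slot. */
static unsigned int next_naive(unsigned int idx)
{
    return idx + 1;
}

/* Advance that steps over the reserved last slot of each page. */
static unsigned int next_skip_last(unsigned int idx)
{
    if ((idx % SLOTS_PER_PAGE) == SLOTS_PER_PAGE - 2)
        return idx + 2;  /* jump from the last usable slot to the next page */
    return idx + 1;
}
```

From index 6 the naive helper lands on slot 7, the reserved entry, while
the skipping helper moves on to slot 8.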

Reported-by: Zongyun Lai <zlai@vmware.com>
Reviewed-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoIPoIB: fix skb truesize underestimation
Eric Dumazet [Tue, 10 Jul 2012 10:03:41 +0000 (10:03 +0000)]
IPoIB: fix skb truesize underestimation

Or Gerlitz reported triggering of WARN_ON_ONCE(delta < len); in
skb_try_coalesce()
This warning tracks drivers that incorrectly set skb->truesize

IPoIB indeed allocates a full page to store a fragment, but only
accounts in skb->truesize the used part of the page (frame length)

This patch fixes skb truesize underestimation, and
also fixes a performance issue, because RX skbs do not have enough tailroom
to allow IP and TCP stacks to pull their header in skb linear part
without an expensive call to pskb_expand_head()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Cc: Shlomo Pongartz <shlomop@mellanox.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Fix memory leak - vlan_info struct
Amir Hanania [Mon, 9 Jul 2012 20:47:19 +0000 (20:47 +0000)]
net: Fix memory leak - vlan_info struct

In driver reload test there is a memory leak.
The structure vlan_info was not freed when the driver was removed.
It was not released since the nr_vids var is one after last vlan was removed.
The nr_vids is one, since vlan zero is added to the interface when the interface
is being set, but the vlan zero is not deleted at unregister.
Fix - delete vlan zero when we unregister the device.
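
The counting problem can be sketched with a plain counter. struct
vlan_book and its helpers are hypothetical stand-ins for vlan_info and
nr_vids; the sketch only shows why an add without a matching delete keeps
the structure alive.

```c
#include <stdlib.h>

/* Hypothetical stand-in for vlan_info: freed when the last VID goes away. */
struct vlan_book {
    unsigned int nr_vids;
};

/* Add one VID, allocating the book on first use. Returns NULL on failure. */
static struct vlan_book *book_add_vid(struct vlan_book *book)
{
    if (!book) {
        book = calloc(1, sizeof(*book));
        if (!book)
            return NULL;
    }
    book->nr_vids++;
    return book;
}

/* Delete one VID. Returns NULL once the count hits zero and the book is freed. */
static struct vlan_book *book_del_vid(struct vlan_book *book)
{
    if (--book->nr_vids == 0) {
        free(book);
        return NULL;
    }
    return book;
}
```

If VID 0 is added at setup but never deleted, the count stays at one after
every other VID is removed and the book is never freed.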

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Wed, 11 Jul 2012 06:31:37 +0000 (23:31 -0700)]
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- fix a bug generated by the wrong interaction between the GW feature and the
  Bridge Loop Avoidance

12 years agoqlge: Bumped driver version to 1.00.00.31
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:39 +0000 (14:57 +0000)]
qlge: Bumped driver version to 1.00.00.31

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Refactoring of ethtool stats.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:38 +0000 (14:57 +0000)]
qlge: Refactoring of ethtool stats.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Moving low level frame error to ethtool statistics.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:37 +0000 (14:57 +0000)]
qlge: Moving low level frame error to ethtool statistics.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Fixed double pci free upon tx_ring->q allocation failure.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:36 +0000 (14:57 +0000)]
qlge: Fixed double pci free upon tx_ring->q allocation failure.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Added missing case statement to ethtool get_strings.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:35 +0000 (14:57 +0000)]
qlge: Added missing case statement to ethtool get_strings.

Missing case was causing ethtool self test to print garbage
value in extra info section.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Clean up ethtool set WOL routine.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:34 +0000 (14:57 +0000)]
qlge: Clean up ethtool set WOL routine.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Fix ethtool WOL calls to operate only on devices that support WOL.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:33 +0000 (14:57 +0000)]
qlge: Fix ethtool WOL calls to operate only on devices that support WOL.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Cleanup atomic queue threshold check.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:32 +0000 (14:57 +0000)]
qlge: Cleanup atomic queue threshold check.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoqlge: Fix TX queue stoppage due to full condition.
Jitendra Kalsaria [Tue, 10 Jul 2012 14:57:31 +0000 (14:57 +0000)]
qlge: Fix TX queue stoppage due to full condition.

TX queue was being stopped at beginning of send path instead
of at the end when last descriptor is used.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: calxedaxgmac: enable rx cut-thru mode
Rob Herring [Mon, 9 Jul 2012 14:16:10 +0000 (14:16 +0000)]
net: calxedaxgmac: enable rx cut-thru mode

Enabling RX cut-thru mode yields better performance as received frames
start getting written to memory before a whole frame is received.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: calxedaxgmac: set outstanding AXI bus transactions to 8
Rob Herring [Mon, 9 Jul 2012 14:16:09 +0000 (14:16 +0000)]
net: calxedaxgmac: set outstanding AXI bus transactions to 8

Increase the number of outstanding read and write AXI transactions from 1
to 8 for better performance.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: calxedaxgmac: fix hang on rx refill
Rob Herring [Mon, 9 Jul 2012 14:16:08 +0000 (14:16 +0000)]
net: calxedaxgmac: fix hang on rx refill

Fix intermittent hangs in xgmac_rx_refill. If a ring buffer entry already
had an skb allocated, then xgmac_rx_refill would get stuck in a loop. This
can happen on a rx error when we just leave the skb allocated to the entry.
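
A sketch of a refill loop that cannot stall on an occupied entry. This is
not the xgmac code: struct rx_ring, pool and rx_refill() are hypothetical,
and buffers come from a static pool instead of skb allocation.

```c
#include <stddef.h>

#define RING_SIZE 8u

static unsigned char pool[RING_SIZE][64];  /* stand-in for allocated skbs */

struct rx_ring {
    void *buf[RING_SIZE];
    unsigned int head;  /* next slot to refill */
    unsigned int tail;  /* refill stops when head catches up with tail */
};

/* Refill slots from head up to tail; returns how many buffers were attached. */
static unsigned int rx_refill(struct rx_ring *ring)
{
    unsigned int attached = 0;

    while (ring->head != ring->tail) {
        unsigned int slot = ring->head % RING_SIZE;

        if (ring->buf[slot] == NULL) {
            ring->buf[slot] = pool[slot];
            attached++;
        }
        /* Advance even when the slot was already filled, so the loop ends. */
        ring->head++;
    }
    return attached;
}
```

A loop that skips an occupied slot without advancing head would spin on
that slot forever, which is the kind of stall the trace below shows.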

[ 7884.510000] INFO: rcu_preempt detected stall on CPU 0 (t=727315 jiffies)
[ 7884.510000] [<c0010a59>] (unwind_backtrace+0x1/0x98) from [<c006fd93>] (__rcu_pending+0x11b/0x2c4)
[ 7884.510000] [<c006fd93>] (__rcu_pending+0x11b/0x2c4) from [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8)
[ 7884.510000] [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8) from [<c0036abb>] (update_process_times+0x2b/0x48)
[ 7884.510000] [<c0036abb>] (update_process_times+0x2b/0x48) from [<c004e8fd>] (tick_sched_timer+0x51/0x94)
[ 7884.510000] [<c004e8fd>] (tick_sched_timer+0x51/0x94) from [<c0045527>] (__run_hrtimer+0x4f/0x1e8)
[ 7884.510000] [<c0045527>] (__run_hrtimer+0x4f/0x1e8) from [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4)
[ 7884.510000] [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4) from [<c00101d3>] (twd_handler+0x17/0x24)
[ 7884.510000] [<c00101d3>] (twd_handler+0x17/0x24) from [<c006be39>] (handle_percpu_devid_irq+0x59/0x114)
[ 7884.510000] [<c006be39>] (handle_percpu_devid_irq+0x59/0x114) from [<c0069aab>] (generic_handle_irq+0x17/0x2c)
[ 7884.510000] [<c0069aab>] (generic_handle_irq+0x17/0x2c) from [<c000cc8d>] (handle_IRQ+0x35/0x7c)
[ 7884.510000] [<c000cc8d>] (handle_IRQ+0x35/0x7c) from [<c033b153>] (__irq_svc+0x33/0xb8)
[ 7884.510000] [<c033b153>] (__irq_svc+0x33/0xb8) from [<c0244b06>] (xgmac_rx_refill+0x3a/0x140)
[ 7884.510000] [<c0244b06>] (xgmac_rx_refill+0x3a/0x140) from [<c02458ed>] (xgmac_poll+0x265/0x3bc)
[ 7884.510000] [<c02458ed>] (xgmac_poll+0x265/0x3bc) from [<c029fcbf>] (net_rx_action+0xc3/0x200)
[ 7884.510000] [<c029fcbf>] (net_rx_action+0xc3/0x200) from [<c0030cab>] (__do_softirq+0xa3/0x1bc)

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: calxedaxgmac: fix net timeout recovery
Rob Herring [Mon, 9 Jul 2012 14:16:07 +0000 (14:16 +0000)]
net: calxedaxgmac: fix net timeout recovery

Fix net tx watchdog timeout recovery. The descriptor ring was reset,
but the DMA engine was not reset to the beginning of the ring.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ll_temac: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:35 +0000 (14:09 +0000)]
ll_temac: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set by the driver on packet receive.
eth_type_trans already sets skb->dev to the proper value and it is not
referenced anywhere else in the driver, thus making its setting unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago sunhme: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:34 +0000 (14:09 +0000)]
sunhme: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set during ring init and skb alloc in rx.  It is
already being set to the proper value when eth_type_trans is called on packet
receive, and the skb->dev is not referenced anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago sungem: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:33 +0000 (14:09 +0000)]
sungem: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set by the driver's skb alloc routine (which is
called in init and during rx).  It is already being set to the proper value when
eth_type_trans is called on packet receive, and the skb->dev is not referenced
anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago sunbmac: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:32 +0000 (14:09 +0000)]
sunbmac: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set during ring init and skb alloc in rx.  It is
already being set to the proper value when eth_type_trans is called on packet
receive, and the skb->dev is not referenced anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago qlge: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:31 +0000 (14:09 +0000)]
qlge: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set by the driver on packet receive.
eth_type_trans already sets skb->dev to the proper value and it is not
referenced anywhere else in the driver, thus making its setting unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Cc: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Cc: Ron Mercer <ron.mercer@qlogic.com>
Cc: linux-driver@qlogic.com
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago qlcnic: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:30 +0000 (14:09 +0000)]
qlcnic: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set before calling eth_type_trans.
eth_type_trans already sets skb->dev to the proper value, thus making this
unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: linux-driver@qlogic.com
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ksz884x: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:29 +0000 (14:09 +0000)]
ksz884x: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set during ring init.  It is already being set
to the proper value when eth_type_trans is called on packet receive, and the
skb->dev is not referenced anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago lantiq_etop: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:28 +0000 (14:09 +0000)]
lantiq_etop: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set before calling eth_type_trans.
eth_type_trans already sets skb->dev to the proper value, thus making this
unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago netxen: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:27 +0000 (14:09 +0000)]
netxen: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set by the driver on packet receive.
eth_type_trans already sets skb->dev to the proper value and it is not
referenced anywhere else in the driver, thus making its setting unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago enic: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:26 +0000 (14:09 +0000)]
enic: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set after calling eth_type_trans.
eth_type_trans already sets skb->dev to the proper value, thus making this
unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Cc: Christian Benvenuti <benve@cisco.com>
Cc: Roopa Prabhu <roprabhu@cisco.com>
Cc: Neel Patel <neepatel@cisco.com>
Cc: Nishank Trivedi <nistrive@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago lance: remove unnecessary setting of skb->dev
Jon Mason [Mon, 9 Jul 2012 14:09:25 +0000 (14:09 +0000)]
lance: remove unnecessary setting of skb->dev

skb->dev is being unnecessarily set during ring init.  It is already being set
to the proper value when eth_type_trans is called on packet receive, and the
skb->dev is not referenced anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago vxge/s2io: remove dead URLs
Jon Mason [Mon, 9 Jul 2012 14:07:57 +0000 (14:07 +0000)]
vxge/s2io: remove dead URLs

URLs to neterion.com and s2io.com no longer resolve.  Remove all references to
these URLs in the driver source and documentation.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv6: optimize ipv6 addresses compares
Eric Dumazet [Tue, 10 Jul 2012 19:05:57 +0000 (19:05 +0000)]
ipv6: optimize ipv6 addresses compares

On 64 bit arches having efficient unaligned accesses (e.g. x86_64) we can
use long words to reduce number of instructions for free.

Joe Perches suggested changing ipv6_masked_addr_cmp() to return a bool
instead of 'int', to make sure ipv6_masked_addr_cmp() cannot be used
in a sorting function.
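
As a rough illustration of the idea, the sketch below compares 128-bit addresses as two 64-bit words and returns a bool rather than an ordering. It is a user-space model under stated assumptions, not the kernel's implementation: `struct addr128`, `addr128_equal()` and `addr128_masked_differs()` are hypothetical names, and `memcpy()` is used for the word loads so the code stays well-defined on any alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte IPv6 address. */
struct addr128 {
	uint8_t b[16];
};

/* Load the two 64-bit halves of an address without alignment assumptions. */
static void addr128_words(const struct addr128 *a, uint64_t w[2])
{
	memcpy(&w[0], a->b, 8);
	memcpy(&w[1], a->b + 8, 8);
}

/* Two XORs and one OR instead of four 32-bit compares. */
static bool addr128_equal(const struct addr128 *a1, const struct addr128 *a2)
{
	uint64_t x[2], y[2];

	assert(a1 != NULL && a2 != NULL);
	addr128_words(a1, x);
	addr128_words(a2, y);
	return ((x[0] ^ y[0]) | (x[1] ^ y[1])) == 0;
}

/*
 * Returns true when the addresses differ under the mask. The result only
 * says "same" or "different", so it carries no ordering and cannot be
 * mistaken for a sort comparator.
 */
static bool addr128_masked_differs(const struct addr128 *a1,
				   const struct addr128 *mask,
				   const struct addr128 *a2)
{
	uint64_t x[2], m[2], y[2];

	assert(a1 != NULL && mask != NULL && a2 != NULL);
	addr128_words(a1, x);
	addr128_words(mask, m);
	addr128_words(a2, y);
	return (((x[0] ^ y[0]) & m[0]) | ((x[1] ^ y[1]) & m[1])) != 0;
}
```

Because XOR and AND act on each byte independently, the result does not depend on the host byte order.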

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago drivers/net/ethernet: Fix non-kernel-doc comments with kernel-doc start markers
Ben Hutchings [Tue, 10 Jul 2012 10:56:59 +0000 (10:56 +0000)]
drivers/net/ethernet: Fix non-kernel-doc comments with kernel-doc start markers

Convert doxygen (or similar) formatted comments to kernel-doc or
unformatted comment.  Delete a few that are content-free.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago drivers/net/ethernet: Fix (nearly-)kernel-doc comments for various functions
Ben Hutchings [Tue, 10 Jul 2012 10:56:00 +0000 (10:56 +0000)]
drivers/net/ethernet: Fix (nearly-)kernel-doc comments for various functions

Fix incorrect start markers, wrapped summary lines, missing section
breaks, incorrect separators, and some name mismatches.  Delete
a few that are content-free.
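
For reference, a comment that follows the kernel-doc conventions named above might look like the sketch below. The functions `clamp_mtu()` and `default_mtu()` are hypothetical and exist only to carry the comments; they are not taken from any driver touched by this change.

```c
#include <assert.h>

/**
 * clamp_mtu() - Limit a requested MTU to a device maximum
 * @requested: MTU asked for by the caller, in bytes
 * @dev_max: largest MTU the (hypothetical) device supports, in bytes
 *
 * The summary line is kept on one line here, the name in it matches the
 * function, each parameter uses the "@name:" separator, and a blank
 * comment line separates the parameter list from this description.
 *
 * Return: @requested if it fits, otherwise @dev_max.
 */
static unsigned int clamp_mtu(unsigned int requested, unsigned int dev_max)
{
	assert(dev_max > 0);
	return requested > dev_max ? dev_max : requested;
}

/*
 * An ordinary comment: it opens with a single asterisk, so kernel-doc
 * tooling does not try to parse it.
 */
static unsigned int default_mtu(void)
{
	return 1500;
}
```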

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago net: Fix non-kernel-doc comments with kernel-doc start marker
Ben Hutchings [Tue, 10 Jul 2012 10:55:35 +0000 (10:55 +0000)]
net: Fix non-kernel-doc comments with kernel-doc start marker

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago net: Fix (nearly-)kernel-doc comments for various functions
Ben Hutchings [Tue, 10 Jul 2012 10:55:09 +0000 (10:55 +0000)]
net: Fix (nearly-)kernel-doc comments for various functions

Fix incorrect start markers, wrapped summary lines, missing section
breaks, incorrect separators, and some name mismatches.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago net: Properly define functions with no parameters
Ben Hutchings [Tue, 10 Jul 2012 10:54:38 +0000 (10:54 +0000)]
net: Properly define functions with no parameters

Defining a function with no parameters as 'T foo()' is the deprecated
K&R style, and is not strictly equivalent to defining it as 'T foo(void)'.
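
A small sketch of the difference, using hypothetical function names. Before C23, the empty-parentheses form provides no prototype, so the compiler has nothing to check call arguments against; the `(void)` form does.

```c
#include <assert.h>

/*
 * Old-style definition: before C23 this provides no prototype, so a
 * mistaken call such as rx_budget_old(1, 2) is not rejected by the
 * compiler (its behaviour is undefined).
 */
static int rx_budget_old()
{
	return 64;
}

/* Prototype form: rx_budget_new(1, 2) is a compile-time error. */
static int rx_budget_new(void)
{
	return 64;
}

/* Both forms behave the same when called correctly. */
static void check_budgets(void)
{
	assert(rx_budget_old() == rx_budget_new());
}
```

Building with a warning such as GCC's `-Wstrict-prototypes` flags the first form.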

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago Merge branch 'metrics_restructure'
David S. Miller [Wed, 11 Jul 2012 05:53:57 +0000 (22:53 -0700)]
Merge branch 'metrics_restructure'

This patch series works towards the goal of minimizing the amount
of things that can change in an ipv4 route.

In a regime where the routing cache is removed, route changes will
lead to cloning in the FIB tables or similar.

The largest trigger of route metrics writes, TCP, now has its own
cache of dynamic metric state.  The timewait timestamps are stored
there now as well.

As a result of that, pre-cowing metrics is no longer necessary,
and therefore FLOWI_FLAG_PRECOW_METRICS is removed.

Redirect and PMTU handling is moved back into the ipv4 routes.  I'm
sorry for all the headaches that trying to do this in the inetpeer has
caused; it was the wrong approach for sure.

Since metrics become read-only for ipv4 we no longer need the inetpeer
hung off of the ipv4 routes either.  So those disappear too.

Also, timewait sockets no longer need to hold onto an inetpeer either.

After this series, we still have some details to resolve wrt. PMTU and
redirects for a route-cache-less system:

1) With just the plain route cache removal, PMTU will continue to
   work mostly fine.  This is because of how the local route users
   call down into the PMTU update code with the route they already
   hold.

   However, if we wish to cache pre-computed routes in fib_info
   nexthops (which we want for performance), then we need to add
   route cloning for PMTU events.

2) Redirects require more work.  First, redirects must be changed to
   be handled like PMTU, wherein we call down into the sockets and
   other entities, and then they call back into the routing code with
   the route they were using.

   So we'll be adding an ->update_nexthop() method alongside
   ->update_pmtu().

   And then, like for PMTU, we'll need cloning support once we start
   caching routes in the fib_info nexthops.

But that's it, we can completely pull the trigger and remove the
routing cache with minimal disruptions.

As it is, this patch series alone helps a lot of things.  For one,
routing cache entry creation should be a lot faster, because we no
longer do inetpeer lookups (even to check if an entry exists).

This patch series also opens the door for non-DST_HOST ipv4 routes,
because nothing fundamentally cares about rt->rt_dst any more.  It
can be removed with the base routing cache removal patch.  In fact,
that was the primary goal of this patch series.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv4: Remove inetpeer from routes.
David S. Miller [Tue, 10 Jul 2012 14:26:01 +0000 (07:26 -0700)]
ipv4: Remove inetpeer from routes.

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv4: Calling ->cow_metrics() now is a bug.
David S. Miller [Tue, 10 Jul 2012 14:08:18 +0000 (07:08 -0700)]
ipv4: Calling ->cow_metrics() now is a bug.

Nothing ever writes to ipv4 metrics any longer.

PMTU is stored in rt->rt_pmtu.

Dynamic TCP metrics are stored in a special TCP metrics cache,
completely outside of the routes.

Therefore ->cow_metrics() can simply be nothing more than a WARN_ON
trigger so we can catch anyone who tries to add new writes to
ipv4 route metrics.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route().
David S. Miller [Tue, 10 Jul 2012 14:03:43 +0000 (07:03 -0700)]
ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route().

Blackhole routes have a COW metrics operation that returns NULL
always, therefore this dst_copy_metrics() call did absolutely
nothing.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv4: Enforce max MTU metric at route insertion time.
David S. Miller [Tue, 10 Jul 2012 14:02:09 +0000 (07:02 -0700)]
ipv4: Enforce max MTU metric at route insertion time.

Rather than at every struct rtable creation.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago ipv4: Maintain redirect and PMTU info in struct rtable again.
David S. Miller [Tue, 10 Jul 2012 13:58:42 +0000 (06:58 -0700)]
ipv4: Maintain redirect and PMTU info in struct rtable again.

Maintaining this in the inetpeer entries was not the right way to do
this at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().
David S. Miller [Tue, 10 Jul 2012 12:06:14 +0000 (05:06 -0700)]
rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().

Nobody provides non-zero values any longer.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago inet: Kill FLOWI_FLAG_PRECOW_METRICS.
David S. Miller [Tue, 10 Jul 2012 11:01:57 +0000 (04:01 -0700)]
inet: Kill FLOWI_FLAG_PRECOW_METRICS.

No longer needed.  TCP writes metrics, but now in its own special
cache that does not dirty the route metrics.  Therefore there is no
longer any reason to pre-cow metrics in this way.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago inet: Minimize use of cached route inetpeer.
David S. Miller [Tue, 10 Jul 2012 10:58:16 +0000 (03:58 -0700)]
inet: Minimize use of cached route inetpeer.

Only use it in the absolutely required cases:

1) COW'ing metrics

2) ipv4 PMTU

3) ipv4 redirects

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago inet: Remove ->get_peer() method.
David S. Miller [Tue, 10 Jul 2012 10:32:59 +0000 (03:32 -0700)]
inet: Remove ->get_peer() method.

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago tcp: Remove tw->tw_peer
David S. Miller [Tue, 10 Jul 2012 10:27:56 +0000 (03:27 -0700)]
tcp: Remove tw->tw_peer

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago tcp: Move timestamps from inetpeer to metrics cache.
David S. Miller [Tue, 10 Jul 2012 10:14:24 +0000 (03:14 -0700)]
tcp: Move timestamps from inetpeer to metrics cache.

With help from Lin Ming.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago net: Kill set_dst_metric_rtt().
David S. Miller [Tue, 10 Jul 2012 07:53:48 +0000 (00:53 -0700)]
net: Kill set_dst_metric_rtt().

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago net: Don't report route RTT metric value in cache dumps.
David S. Miller [Tue, 10 Jul 2012 07:52:56 +0000 (00:52 -0700)]
net: Don't report route RTT metric value in cache dumps.

We don't maintain it dynamically any longer, so reporting it would
be extremely misleading.  Report zero instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago tcp: Maintain dynamic metrics in local cache.
David S. Miller [Tue, 10 Jul 2012 07:49:14 +0000 (00:49 -0700)]
tcp: Maintain dynamic metrics in local cache.

Maintain a local hash table of TCP dynamic metrics blobs.

Computed TCP metrics are no longer maintained in the route metrics.

The table uses RCU and an extremely simple hash so that it has low
latency and low overhead.  A simple hash is legitimate because we only
make metrics blobs for fully established connections.

The default hash table sizes, metric timeouts, and the hash chain
length limit certainly could use some tweaking.  But the basic design
seems sound.
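
The general shape of such a cache can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel code: it has no RCU or locking, keys on a single 32-bit address, and every name and constant in it (`struct metrics_entry`, `metrics_get()`, `TABLE_BITS`, `MAX_CHAIN`) is hypothetical. It shows a deliberately simple hash plus a chain length limit that recycles the oldest entry in a bucket instead of growing the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_BITS 4			/* hypothetical table size: 16 buckets */
#define TABLE_SIZE (1u << TABLE_BITS)
#define MAX_CHAIN  5			/* hypothetical chain length limit */

struct metrics_entry {
	struct metrics_entry *next;
	uint32_t daddr;			/* key: peer address */
	uint32_t rtt_us;		/* example of a cached dynamic metric */
	uint64_t stamp;			/* creation tick, used to find the oldest */
};

struct metrics_table {
	struct metrics_entry *bucket[TABLE_SIZE];
	uint64_t tick;
};

/* An intentionally simple hash: fold the address and mask to table size. */
static unsigned int metrics_hash(uint32_t daddr)
{
	return (daddr ^ (daddr >> 16)) & (TABLE_SIZE - 1);
}

/* Look up daddr; optionally create an entry, recycling the oldest if full. */
static struct metrics_entry *metrics_get(struct metrics_table *t,
					 uint32_t daddr, int create)
{
	unsigned int h = metrics_hash(daddr);
	struct metrics_entry *e, *oldest = NULL;
	unsigned int depth = 0;

	assert(t != NULL);

	for (e = t->bucket[h]; e != NULL; e = e->next) {
		if (e->daddr == daddr)
			return e;
		if (oldest == NULL || e->stamp < oldest->stamp)
			oldest = e;
		depth++;
	}
	if (!create)
		return NULL;

	if (depth >= MAX_CHAIN) {
		e = oldest;		/* chain is full: reuse in place */
	} else {
		e = calloc(1, sizeof(*e));
		if (e == NULL)
			return NULL;
		e->next = t->bucket[h];
		t->bucket[h] = e;
	}
	e->daddr = daddr;
	e->rtt_us = 0;			/* fresh entry starts with no history */
	e->stamp = ++t->tick;
	return e;
}
```

In this model, inserting a sixth address that hashes to the same bucket evicts the first one, so memory use per bucket stays bounded even with a weak hash.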

With help from Eric Dumazet and Joe Perches.

Signed-off-by: David S. Miller <davem@davemloft.net>