David S. Miller [Thu, 3 Feb 2011 07:06:31 +0000 (23:06 -0800)]
sch_choke: Need linux/vmalloc.h
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 2 Feb 2011 15:21:10 +0000 (15:21 +0000)]
sched: CHOKe flow scheduler
CHOKe ("CHOose and Kill" or "CHOose and Keep") is an alternative
packet scheduler based on the Random Exponential Drop (RED) algorithm.
The core idea is:
For every packet arrival:
Calculate Qave
if (Qave < minth)
Queue the new packet
else
Select randomly a packet from the queue
if (both packets from same flow)
then Drop both the packets
else if (Qave > maxth)
Drop packet
else
Admit packet with proability p (same as RED)
See also:
Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
queue management scheme for approximating fair bandwidth allocation",
Proceeding of INFOCOM'2000, March 2000.
Help from:
Eric Dumazet <eric.dumazet@gmail.com>
Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 2 Feb 2011 15:19:51 +0000 (15:19 +0000)]
sfq: deadlock in error path
The change to allow divisor to be a parameter (in 2.6.38-rc1)
commit
817fb15dfd988d8dda916ee04fa506f0c466b9d6
introduced a possible deadlock caught by sparse.
The scheduler tree lock was left locked in the case of an incorrect
divisor value. Simplest fix is to move test outside of lock
which also solves problem of partial update.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 3 Feb 2011 04:48:10 +0000 (20:48 -0800)]
ipv4: Fix fib_trie build in some configurations.
If we end up including include/linux/node.h (either explicitly
or implicitly) that header has a definition of "structt node"
too.
So rename the one we use in fib_trie to "rt_trie_node" to avoid
the conflict.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 3 Feb 2011 01:05:11 +0000 (17:05 -0800)]
tcp: Increase the initial congestion window to 10.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Ivan Vecera [Wed, 2 Feb 2011 04:37:02 +0000 (04:37 +0000)]
bna: use device model DMA API
Use DMA API as PCI equivalents will be deprecated.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Feb 2011 23:24:48 +0000 (15:24 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/kaber/nf-next-2.6
Patrick McHardy [Wed, 2 Feb 2011 23:05:43 +0000 (00:05 +0100)]
netfilter: xtables: add device group match
Add a new 'devgroup' match to match on the device group of the
incoming and outgoing network device of a packet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Wed, 2 Feb 2011 22:56:00 +0000 (23:56 +0100)]
netfilter: ipset: send error message manually
When a message carries multiple commands and one of them triggers
an error, we have to report to the userspace which one was that.
The line number of the command plays this role and there's an attribute
reserved in the header part of the message to be filled out with the error
line number. In order not to modify the original message received from
the userspace, we construct a new, complete netlink error message and
modifies the attribute there, then send it.
Netlink is notified not to send its ACK/error message.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Wed, 2 Feb 2011 22:50:01 +0000 (23:50 +0100)]
netfilter: ipset: fix linking with CONFIG_IPV6=n
Add a dummy ip_set_get_ip6_port function that unconditionally
returns false for CONFIG_IPV6=n and convert the real function
to ipv6_skip_exthdr() to avoid pulling in the ip6_tables module
when loading ipset.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Wed, 2 Feb 2011 08:31:37 +0000 (09:31 +0100)]
netfilter: ipset: add missing break statemtns in ip_set_get_ip_port()
Don't fall through in the switch statement, otherwise IPv4 headers
are incorrectly parsed again as IPv6 and the return value will always
be 'false'.
Signed-off-by: Patrick McHardy <kaber@trash.net>
David S. Miller [Tue, 1 Feb 2011 23:34:21 +0000 (15:34 -0800)]
ipv4: Rename fib_hash_* locals in fib_semantics.c
To avoid confusion with the recently deleted fib_hash.c
code, use "fib_info_hash_*" instead of plain "fib_hash_*".
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Feb 2011 23:30:56 +0000 (15:30 -0800)]
ipv4: Update some fib_hash centric interface names.
fib_hash_init() --> fib_trie_init()
fib_hash_table() --> fib_trie_table()
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Feb 2011 23:15:39 +0000 (15:15 -0800)]
ipv4: Remove fib_hash.
The time has finally come to remove the hash based routing table
implementation in ipv4.
FIB Trie is mature, well tested, and I've done an audit of it's code
to confirm that it implements insert, delete, and lookup with the same
identical semantics as fib_hash did.
If there are any semantic differences found in fib_trie, we should
simply fix them.
I've placed the trie statistic config option under advanced router
configuration.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Patrick McHardy [Tue, 1 Feb 2011 17:52:42 +0000 (18:52 +0100)]
netfilter: ipset: install ipset related header files
Signed-off-by: Patrick McHardy <kaber@trash.net>
Simon Horman [Tue, 1 Feb 2011 17:30:26 +0000 (18:30 +0100)]
IPVS: Remove ip_vs_sync_cleanup from section __exit
ip_vs_sync_cleanup() may be called from ip_vs_init() on error
and thus needs to be accesible from section __init
Reporte-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Simon Horman [Tue, 1 Feb 2011 17:29:04 +0000 (18:29 +0100)]
IPVS: Allow compilation with CONFIG_SYSCTL disabled
This is a rather naieve approach to allowing PVS to compile with
CONFIG_SYSCTL disabled. I am working on a more comprehensive patch which
will remove compilation of all sysctl-related IPVS code when CONFIG_SYSCTL
is disabled.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Simon Horman [Tue, 1 Feb 2011 17:27:51 +0000 (18:27 +0100)]
IPVS: Remove unused variables
These variables are unused as a result of the recent netns work.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Simon Horman [Tue, 1 Feb 2011 17:24:09 +0000 (18:24 +0100)]
IPVS: remove duplicate initialisation or rs_table
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Simon Horman [Tue, 1 Feb 2011 17:21:53 +0000 (18:21 +0100)]
IPVS: use z modifier for sizeof() argument
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Tue, 1 Feb 2011 16:26:37 +0000 (17:26 +0100)]
netfilter: ctnetlink: fix ctnetlink_parse_tuple() warning
net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_parse_tuple':
net/netfilter/nf_conntrack_netlink.c:832:11: warning: comparison between 'enum ctattr_tuple' and 'enum ctattr_type'
Use ctattr_type for the 'type' parameter since that's the type of all attributes
passed to this function.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Tue, 1 Feb 2011 15:57:37 +0000 (16:57 +0100)]
netfilter: ipset: remove unnecessary includes
None of the set types need uaccess.h since this is handled centrally
in ip_set_core. Most set types additionally don't need bitops.h and
spinlock.h since they use neither. tcp.h is only needed by those
using before(), udp.h is not needed at all.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy [Tue, 1 Feb 2011 15:27:25 +0000 (16:27 +0100)]
netfilter: ipset: use nla_parse_nested()
Replace calls of the form:
nla_parse(tb, ATTR_MAX, nla_data(attr), nla_len(attr), policy)
by:
nla_parse_nested(tb, ATTR_MAX, attr, policy)
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:56:00 +0000 (15:56 +0100)]
netfilter: xtables: "set" match and "SET" target support
The patch adds the combined module of the "SET" target and "set" match
to netfilter. Both the previous and the current revisions are supported.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:54:59 +0000 (15:54 +0100)]
netfilter: ipset: list:set set type support
The module implements the list:set type support in two flavours:
without and with timeout. The sets has two sides: for the userspace,
they store the names of other (non list:set type of) sets: one can add,
delete and test set names. For the kernel, it forms an ordered union of
the member sets: the members sets are tried in order when elements are
added, deleted and tested and the process stops at the first success.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:53:55 +0000 (15:53 +0100)]
netfilter: ipset: hash:net,port set type support
The module implements the hash:net,port type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are two dimensional: IPv4/IPv6 network address/prefix and protocol/port
pairs.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:52:54 +0000 (15:52 +0100)]
netfilter: ipset: hash:net set type support
The module implements the hash:net type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are one dimensional: IPv4/IPv6 network address/prefixes.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:51:00 +0000 (15:51 +0100)]
netfilter: ipset: hash:ip,port,net set type support
The module implements the hash:ip,port,net type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6
network address/prefix triples. The different prefixes are searched/matched
from the longest prefix to the shortes one (most specific to least).
In other words the processing time linearly grows with the number of
different prefixes in the set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:41:26 +0000 (15:41 +0100)]
netfilter: ipset: hash:ip,port,ip set type support
The module implements the hash:ip,port,ip type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6
address triples.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:39:52 +0000 (15:39 +0100)]
netfilter: ipset: hash:ip,port set type support
The module implements the hash:ip,port type support in four flavours:
for IPv4 and IPv6, both without and with timeout support. The elements
are two dimensional: IPv4/IPv6 address and protocol/port pairs. The port
is interpeted for TCP, UPD, ICMP and ICMPv6 (at the latters as type/code
of course).
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:38:36 +0000 (15:38 +0100)]
netfilter: ipset: hash:ip set type support
The module implements the hash:ip type support in four flavours:
for IPv4 or IPv6, both without and with timeout support.
All the hash types are based on the "array hash" or ahash structure
and functions as a good compromise between minimal memory footprint
and speed. The hashing uses arrays to resolve clashes. The hash table
is resized (doubled) when searching becomes too long. Resizing can be
triggered by userspace add commands only and those are serialized by
the nfnl mutex. During resizing the set is read-locked, so the only
possible concurrent operations are the kernel side readers. Those are
protected by RCU locking.
Because of the four flavours and the other hash types, the functions
are implemented in general forms in the ip_set_ahash.h header file
and the real functions are generated before compiling by macro expansion.
Thus the dereferencing of low-level functions and void pointer arguments
could be avoided: the low-level functions are inlined, the function
arguments are pointers of type-specific structures.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:37:04 +0000 (15:37 +0100)]
netfilter: ipset; bitmap:port set type support
The module implements the bitmap:port type in two flavours, without
and with timeout support to store TCP/UDP ports from a range.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:35:12 +0000 (15:35 +0100)]
netfilter: ipset: bitmap:ip,mac type support
The module implements the bitmap:ip,mac set type in two flavours,
without and with timeout support. In this kind of set one can store
IPv4 address and (source) MAC address pairs. The type supports elements
added without the MAC part filled out: when the first matching from kernel
happens, the MAC part is automatically filled out. The timing out of the
elements stars when an element is complete in the IP,MAC pair.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:33:17 +0000 (15:33 +0100)]
netfilter: ipset: bitmap:ip set type support
The module implements the bitmap:ip set type in two flavours, without
and with timeout support. In this kind of set one can store IPv4
addresses (or network addresses) from a given range.
In order not to waste memory, the timeout version does not rely on
the kernel timer for every element to be timed out but on garbage
collection. All set types use this mechanism.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:28:35 +0000 (15:28 +0100)]
netfilter: ipset: IP set core support
The patch adds the IP set core support to the kernel.
The IP set core implements a netlink (nfnetlink) based protocol by which
one can create, destroy, flush, rename, swap, list, save, restore sets,
and add, delete, test elements from userspace. For simplicity (and backward
compatibilty and for not to force ip(6)tables to be linked with a netlink
library) reasons a small getsockopt-based protocol is also kept in order
to communicate with the ip(6)tables match and target.
The netlink protocol passes all u16, etc values in network order with
NLA_F_NET_BYTEORDER flag. The protocol enforces the proper use of the
NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.
For other kernel subsystems (netfilter match and target) the API contains
the functions to add, delete and test elements in sets and the required calls
to get/put refereces to the sets before those operations can be performed.
The set types (which are implemented in independent modules) are stored
in a simple RCU protected list. A set type may have variants: for example
without timeout or with timeout support, for IPv4 or for IPv6. The sets
(i.e. the pointers to the sets) are stored in an array. The sets are
identified by their index in the array, which makes possible easy and
fast swapping of sets. The array is protected indirectly by the nfnl
mutex from nfnetlink. The content of the sets are protected by the rwlock
of the set.
There are functional differences between the add/del/test functions
for the kernel and userspace:
- kernel add/del/test: works on the current packet (i.e. one element)
- kernel test: may trigger an "add" operation in order to fill
out unspecified parts of the element from the packet (like MAC address)
- userspace add/del: works on the netlink message and thus possibly
on multiple elements from the IPSET_ATTR_ADT container attribute.
- userspace add: may trigger resizing of a set
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jozsef Kadlecsik [Tue, 1 Feb 2011 14:20:14 +0000 (15:20 +0100)]
netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros
The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
vanilla kernel.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Vladislav Zolotarov [Mon, 31 Jan 2011 14:39:17 +0000 (14:39 +0000)]
bnx2x, cnic: Consolidate iSCSI/FCoE shared mem logic in bnx2x
Move all shared mem code to bnx2x to avoid code duplication. bnx2x now
performs:
- Read the FCoE and iSCSI max connection information.
- Read the iSCSI and FCoE MACs from NPAR configuration in shmem.
- Block the CNIC for the current function if there is neither FCoE nor
iSCSI valid configuration by returning NULL from bnx2x_cnic_probe().
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Feb 2011 00:16:50 +0000 (16:16 -0800)]
ipv4: Consolidate all default route selection implementations.
Both fib_trie and fib_hash have a local implementation of
fib_table_select_default(). This is completely unnecessary
code duplication.
Since we now remember the fib_table and the head of the fib
alias list of the default route, we can implement one single
generic version of this routine.
Looking at the fib_hash implementation you may get the impression
that it's possible for there to be multiple top-level routes in
the table for the default route. The truth is, it isn't, the
insert code will only allow one entry to exist in the zero
prefix hash table, because all keys evaluate to zero and all
keys in a hash table must be unique.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Feb 2011 00:10:03 +0000 (16:10 -0800)]
ipv4: Remember FIB alias list head and table in lookup results.
This will be used later to implement fib_select_default() in a
completely generic manner, instead of the current situation where the
default route is re-looked up in the TRIE/HASH table and then the
available aliases are analyzed.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 31 Jan 2011 21:24:56 +0000 (13:24 -0800)]
Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge
Yaniv Rosner [Mon, 31 Jan 2011 04:22:57 +0000 (04:22 +0000)]
bnx2x: Update bnx2x version to 1.62.11-0
Update bnx2x version to 1.62.11-0
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:53 +0000 (04:22 +0000)]
bnx2x: Remove support for emulation/FPGA
Remove unneeded support for emulation/FPGA from the code
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:46 +0000 (04:22 +0000)]
bnx2x: Add CMS functionality for 848x3
Add CMS(Common Mode Sense) functionality for 848x3 as this reduces power consumption and allows a better 10G link stability
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:41 +0000 (04:22 +0000)]
bnx2x: Add support for new PHY BCM84833
Add support for new PHY BCM84833. This PHY is very similar to the BCM84823, only it has different register offset compared to the BCM84823, which needs to be handled correctly.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:28 +0000 (04:22 +0000)]
bnx2x: Enhance SFP+ module control
Add flexible support to control various SFP+ module features either throughout MDIO registers or GPIO pins according to NVRAM configuration
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:20 +0000 (04:22 +0000)]
bnx2x: Add and change some net_dev messages
Add and modify some net dev prints to improve error control
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:22:03 +0000 (04:22 +0000)]
bnx2x: Fix compilation warning messages
Fix annoying compilation warning, mainly related to static declarations
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:21:55 +0000 (04:21 +0000)]
bnx2x: Set comments according to preferred Linux style
This patch contains cosmetic changes only of restyling comments according to Linux coding standard, and add comment for get_emac_base function.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:21:45 +0000 (04:21 +0000)]
bnx2x: Rename CL45 macro
This patch contains cosmetic changes only of renaming CL45_WR_OVER_CL22 macro to CL22_WR_OVER_CL45 as it should be.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Mon, 31 Jan 2011 04:21:34 +0000 (04:21 +0000)]
bnx2x: Fix line indentation
This patch contains cosmetic changes only to fix code alignment, and update copyright comment year
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 31 Jan 2011 21:13:24 +0000 (13:13 -0800)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
David S. Miller [Mon, 31 Jan 2011 20:31:24 +0000 (12:31 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/kaber/nf-next-2.6
Sven Eckelmann [Thu, 27 Jan 2011 09:56:56 +0000 (10:56 +0100)]
batman-adv: Merge README of v2011.0.0 release
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 27 Jan 2011 09:38:15 +0000 (10:38 +0100)]
batman-adv: Update copyright years
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 27 Jan 2011 12:48:54 +0000 (13:48 +0100)]
batman-adv: Remove unused variables
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 27 Jan 2011 12:16:08 +0000 (13:16 +0100)]
batman-adv: Remove declaration of batman_skb_recv
batman_skb_recv can be defined in hard-interface.c as static because it is
never used outside of that file.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 27 Jan 2011 12:12:04 +0000 (13:12 +0100)]
batman-adv: Remove unused definitions
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 27 Jan 2011 12:10:23 +0000 (13:10 +0100)]
batman-adv: Remove dangling declaration of hash_remove_element
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Simon Wunderlich [Wed, 29 Dec 2010 16:15:19 +0000 (16:15 +0000)]
batman-adv: remove unused parameters
Some function parameters are obsolete now and can be removed.
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Tue, 25 Jan 2011 22:02:31 +0000 (22:02 +0000)]
batman-adv: Calculate correct size for merged packets
The routing algorithm must be able to decide if a fragment can be merged with
the missing part and still be passed to a forwarding interface. The fragments
can only differ by one byte in case that the original payload had an uneven
length. In that situation the sender has to inform all possible receivers that
the tail is one byte longer using the flag UNI_FRAG_LARGETAIL.
The combination of UNI_FRAG_LARGETAIL and UNI_FRAG_HEAD flag makes it possible
to calculate the correct length for even and uneven sized payloads.
The original formula missed to add the unicast header at all and forgot to
remove the fragment header of the second fragment. This made the results highly
unreliable and only useful for machines with large differences between the
configured MTUs.
Reported-by: Russell Senior <russell@personaltelco.net>
Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Tue, 25 Jan 2011 21:59:26 +0000 (21:59 +0000)]
batman-adv: Create roughly equal sized fragments
The routing algorithm must know how large two fragments are to be able to
decide that it is safe to merge them or if it should resubmit without waiting
for the second part. When these two fragments have a too different size, it is
not possible to guess right in every situation.
The user could easily configure the MTU of the attached cards so that one
fragment is forwarded and the other one is added to the fragments table to wait
for the missing part.
For even sized packets, it is possible to split it so that the resulting
packages are equal sized by ignoring the old non-fragment header at the
beginning of the original packet.
This still creates different sized fragments for uneven sized packets.
Reported-by: Russell Senior <russell@personaltelco.net>
Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Yaniv Rosner [Sun, 30 Jan 2011 04:15:13 +0000 (04:15 +0000)]
bnx2x: Update bnx2x version to 1.62.00-5
Update bnx2x version to 1.62.00-5
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sun, 30 Jan 2011 04:15:07 +0000 (04:15 +0000)]
bnx2x: Fix potential link loss in multi-function mode
All functions on a port should be set to take the MDC/MDIO lock to avoid contention on the bus
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sun, 30 Jan 2011 04:15:00 +0000 (04:15 +0000)]
bnx2x: Fix port swap for BCM8073
Fix link on BCM57712 + BCM8073 when port swap is enabled. Common PHY reset was done on the wrong port.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sun, 30 Jan 2011 04:14:55 +0000 (04:14 +0000)]
bnx2x: Fix LED blink rate on BCM84823
Fix blink rate of activity LED of the BCM84823 on 10G link
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yaniv Rosner [Sun, 30 Jan 2011 04:14:48 +0000 (04:14 +0000)]
bnx2x: Remove setting XAUI low-power for BCM8073
A rare link issue with the BCM8073 PHY may occur due to setting XAUI low power mode, while the PHY microcode already does that.
The fix is not to set set XAUI low power mode for this PHY.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 31 Jan 2011 06:16:34 +0000 (22:16 -0800)]
Merge branch 'batman-adv/merge-oopsonly' of git://git.open-mesh.org/ecsv/linux-merge
Sven Eckelmann [Fri, 28 Jan 2011 17:34:07 +0000 (18:34 +0100)]
batman-adv: Make vis info stack traversal threadsafe
The batman-adv vis server has to a stack which stores all information
about packets which should be send later. This stack is protected
with a spinlock that is used to prevent concurrent write access to it.
The send_vis_packets function has to take all elements from the stack
and send them to other hosts over the primary interface. The send will
be initiated without the lock which protects the stack.
The implementation using list_for_each_entry_safe has the problem that
it stores the next element as "safe ptr" to allow the deletion of the
current element in the list. The list may be modified during the
unlock/lock pair in the loop body which may make the safe pointer
not pointing to correct next element.
It is safer to remove and use the first element from the stack until no
elements are available. This does not need reduntant information which
would have to be validated each time the lock was removed.
Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Fri, 28 Jan 2011 17:34:06 +0000 (18:34 +0100)]
batman-adv: Remove vis info element in free_info
The free_info function will be called when no reference to the info
object exists anymore. It must be ensured that the allocated memory
gets freed and not only the elements which are managed by the info
object.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Fri, 28 Jan 2011 17:34:05 +0000 (18:34 +0100)]
batman-adv: Remove vis info on hashing errors
A newly created vis info object must be removed when it couldn't be
added to the hash. The old_info which has to be replaced was already
removed and isn't related to the hash anymore.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Eric W. Biederman [Sat, 29 Jan 2011 16:15:56 +0000 (16:15 +0000)]
net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
SIOCGETSGCNT is not a unique ioctl value as it it maps tio SIOCPROTOPRIVATE +1,
which unfortunately means the existing infrastructure for compat networking
ioctls is insufficient. A trivial compact ioctl implementation would conflict
with:
SIOCAX25ADDUID
SIOCAIPXPRISLT
SIOCGETSGCNT_IN6
SIOCGETSGCNT
SIOCRSSCAUSE
SIOCX25SSUBSCRIP
SIOCX25SDTEFACILITIES
To make this work I have updated the compat_ioctl decode path to mirror the
the normal ioctl decode path. I have added an ipv4 inet_compat_ioctl function
so that I can have ipv4 specific compat ioctls. I have added a compat_ioctl
function into struct proto so I can break out ioctls by which kind of ip socket
I am using. I have added a compat_raw_ioctl function because SIOCGETSGCNT only
works on raw sockets. I have added a ipmr_compat_ioctl that mirrors the normal
ipmr_ioctl.
This was necessary because unfortunately the struct layout for the SIOCGETSGCNT
has unsigned longs in it so changes between 32bit and 64bit kernels.
This change was sufficient to run a 32bit ip multicast routing daemon on a
64bit kernel.
Reported-by: Bill Fenner <fenner@aristanetworks.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Sat, 29 Jan 2011 14:57:22 +0000 (14:57 +0000)]
net: Fix ip link add netns oops
Ed Swierk <eswierk@bigswitch.com> writes:
> On 2.6.35.7
> ip link add link eth0 netns 9999 type macvlan
> where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang:
> [10663.821898] BUG: unable to handle kernel NULL pointer dereference at
000000000000006d
> [10663.821917] IP: [<
ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
> [10663.821933] PGD
1d3927067 PUD
22f5c5067 PMD 0
> [10663.821944] Oops: 0000 [#1] SMP
> [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
> [10663.821959] CPU 3
> [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class
> [10663.822155]
> [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO
> [10663.822167] RIP: 0010:[<
ffffffff8149c2fa>] [<
ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
> [10663.822177] RSP: 0018:
ffff88014aebf7b8 EFLAGS:
00010286
> [10663.822182] RAX:
00000000fffffff4 RBX:
ffff8801ad900800 RCX:
0000000000000000
> [10663.822187] RDX:
ffff880000000000 RSI:
0000000000000000 RDI:
ffff88014ad63000
> [10663.822191] RBP:
ffff88014aebf808 R08:
0000000000000041 R09:
0000000000000041
> [10663.822196] R10:
0000000000000000 R11:
dead000000200200 R12:
ffff88014aebf818
> [10663.822201] R13:
fffffffffffffffd R14:
ffff88014aebf918 R15:
ffff88014ad62000
> [10663.822207] FS:
00007f00c487f700(0000) GS:
ffff880001f80000(0000) knlGS:
0000000000000000
> [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
> [10663.822216] CR2:
000000000000006d CR3:
0000000231f19000 CR4:
00000000000026e0
> [10663.822221] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
> [10663.822226] DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
> [10663.822231] Process ip (pid: 6000, threadinfo
ffff88014aebe000, task
ffff88014afb16e0)
> [10663.822236] Stack:
> [10663.822240]
ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6
> [10663.822251] <0>
0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818
> [10663.822265] <0>
ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413
> [10663.822281] Call Trace:
> [10663.822290] [<
ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0
> [10663.822298] [<
ffffffff8149c413>] dev_alloc_name+0x43/0x90
> [10663.822307] [<
ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0
> [10663.822314] [<
ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570
> [10663.822321] [<
ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570
> [10663.822332] [<
ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20
> [10663.822339] [<
ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290
> [10663.822346] [<
ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290
> [10663.822354] [<
ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0
> [10663.822360] [<
ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40
> [10663.822367] [<
ffffffff814c223e>] netlink_unicast+0x2de/0x2f0
> [10663.822374] [<
ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0
> [10663.822383] [<
ffffffff81488533>] sock_sendmsg+0xf3/0x120
> [10663.822391] [<
ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
> [10663.822400] [<
ffffffff81168656>] ? __d_lookup+0x136/0x150
> [10663.822406] [<
ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
> [10663.822414] [<
ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80
> [10663.822422] [<
ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110
> [10663.822429] [<
ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70
> [10663.822435] [<
ffffffff81493308>] ? verify_iovec+0x88/0xe0
> [10663.822442] [<
ffffffff81489020>] sys_sendmsg+0x240/0x3a0
> [10663.822450] [<
ffffffff8111e2a9>] ? __do_fault+0x479/0x560
> [10663.822457] [<
ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
> [10663.822465] [<
ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150
> [10663.822473] [<
ffffffff8158d76e>] ? do_page_fault+0x15e/0x350
> [10663.822482] [<
ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
> [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55
> [10663.822618] RIP [<
ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
> [10663.822627] RSP <
ffff88014aebf7b8>
> [10663.822631] CR2:
000000000000006d
> [10663.822636] ---[ end trace
3dfd6c3ad5327ca7 ]---
This bug was introduced in:
commit
81adee47dfb608df3ad0b91d230fb3cef75f0060
Author: Eric W. Biederman <ebiederm@aristanetworks.com>
Date: Sun Nov 8 00:53:51 2009 -0800
net: Support specifying the network namespace upon device creation.
There is no good reason to not support userspace specifying the
network namespace during device creation, and it makes it easier
to create a network device and pass it to a child network namespace
with a well known name.
We have to be careful to ensure that the target network namespace
for the new device exists through the life of the call. To keep
that logic clear I have factored out the network namespace grabbing
logic into rtnl_link_get_net.
In addtion we need to continue to pass the source network namespace
to the rtnl_link_ops.newlink method so that we can find the base
device source network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Where apparently I forgot to add error handling to the path where we create
a new network device in a new network namespace, and pass in an invalid pid.
Cc: stable@kernel.org
Reported-by: Ed Swierk <eswierk@bigswitch.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Sat, 29 Jan 2011 13:10:37 +0000 (13:10 +0000)]
caif: bugfix - add caif headers for userspace usage.
Add caif_socket.h and if_caif.h to the kernel header files
exported for use by userspace.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Sun, 30 Jan 2011 09:09:37 +0000 (01:09 -0800)]
slcan: fix referenced website in Kconfig help text
Fix the referenced project website to www.mictronics.de in the Kconfig
help text for the slcan driver.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 30 Jan 2011 04:44:54 +0000 (20:44 -0800)]
gro: Reset dev pointer on reuse
On older kernels the VLAN code may zero skb->dev before dropping
it and causing it to be reused by GRO.
Unfortunately we didn't reset skb->dev in that case which causes
the next GRO user to get a bogus skb->dev pointer.
This particular problem no longer happens with the current upstream
kernel due to changes in VLAN processing.
However, for correctness we should still reset the skb->dev pointer
in the GRO reuse function in case a future user does the same thing.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 28 Jan 2011 22:07:16 +0000 (14:07 -0800)]
ipv4: If fib metrics are default, no need to grab ref to FIB info.
The fib metric memory in this case is static in the kernel image,
so we don't need to reference count it since it's never going
to go away on us.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 28 Jan 2011 22:05:05 +0000 (14:05 -0800)]
ipv4: Attach FIB info to dst_default_metrics when possible
If there are no explicit metrics attached to a route, hook
fi->fib_info up to dst_default_metrics.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 28 Jan 2011 22:01:25 +0000 (14:01 -0800)]
ipv4: Allocate fib metrics dynamically.
This is the initial gateway towards super-sharing metrics
if they are all set to zero for a route.
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Jacob [Fri, 28 Jan 2011 18:33:13 +0000 (19:33 +0100)]
netfilter: xt_iprange: add IPv6 match debug print code
Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
David S. Miller [Fri, 28 Jan 2011 06:01:53 +0000 (22:01 -0800)]
net: Pre-COW metrics for TCP.
TCP is going to record metrics for the connection,
so pre-COW the route metrics at route cache entry
creation time.
This avoids several atomic operations that have to
occur if we COW the metrics after the entry reaches
global visibility.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 28 Jan 2011 00:00:37 +0000 (16:00 -0800)]
Merge branch 'master' of ssh:///linux/kernel/git/linville/wireless-next-2.6
Denis Kirjanov [Thu, 27 Jan 2011 09:54:12 +0000 (09:54 +0000)]
sungem: Use net_device's internal stats
Use net_device_stats instance from the struct net_device.
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Jan 2011 19:28:23 +0000 (19:28 +0000)]
drivers/net: remove some rcu sparse warnings
Add missing __rcu annotations and helpers.
minor : Fix some rcu_dereference() calls in macvtap
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Jan 2011 18:08:02 +0000 (18:08 +0000)]
net: fix dev_seq_next()
Commit
c6d14c84566d (net: Introduce for_each_netdev_rcu() iterator)
added a race in dev_seq_next().
The rcu_dereference() call should be done _before_ testing the end of
list, or we might return a wrong net_device if a concurrent thread
changes net_device list under us.
Note : discovered thanks to a sparse warning :
net/core/dev.c:3919:9: error: incompatible types in comparison expression
(different address spaces)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 27 Jan 2011 22:58:42 +0000 (14:58 -0800)]
net: Store ipv4/ipv6 COW'd metrics in inetpeer cache.
Please note that the IPSEC dst entry metrics keep using
the generic metrics COW'ing mechanism using kmalloc/kfree.
This gives the IPSEC routes an opportunity to use metrics
which are unique to their encapsulated paths.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 27 Jan 2011 22:59:08 +0000 (14:59 -0800)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
David S. Miller [Thu, 27 Jan 2011 22:55:22 +0000 (14:55 -0800)]
ipv6: Remove route peer binding assertions.
They are bogus. The basic idea is that I wanted to make sure
that prefixed routes never bind to peers.
The test I used was whether RTF_CACHE was set.
But first of all, the RTF_CACHE flag is set at different spots
depending upon which ip6_rt_copy() caller you're talking about.
I've validated all of the code paths, and even in the future
where we bind peers more aggressively (for route metric COW'ing)
we never bind to prefix'd routes, only fully specified ones.
This even applies when addrconf or icmp6 routes are allocated.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 25 Jan 2011 23:18:38 +0000 (23:18 +0000)]
net: add kmemcheck annotation in __alloc_skb()
pskb_expand_head() triggers a kmemcheck warning when copy of
skb_shared_info is done in pskb_expand_head()
This is because destructor_arg field is not necessarily initialized at
this point. Add kmemcheck_annotate_variable() call in __alloc_skb() to
instruct kmemcheck this is a normal situation.
Resolves bugzilla.kernel.org 27212
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=27212
Reported-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kurt Van Dijck [Wed, 26 Jan 2011 04:55:24 +0000 (04:55 +0000)]
net: fix validate_link_af in rtnetlink core
I'm testing an API that uses IFLA_AF_SPEC attribute.
In the rtnetlink core , the set_link_af() member
of the rtnl_af_ops struct receives the nested attribute
(as I expected), but the validate_link_af() member
receives the parent attribute.
IMO, this patch fixes this.
Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stanislaw Gruszka [Wed, 26 Jan 2011 00:45:42 +0000 (00:45 +0000)]
dl2k: nulify fraginfo after unmap
Patch fixes: "DMA-API: device driver tries to free an invalid DMA
memory address" warning reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=639824
Reported-by: Frantisek Hanzlik <franta@hanzlici.cz>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Thu, 27 Jan 2011 04:14:03 +0000 (04:14 +0000)]
xen: netfront: handle incoming GSO SKBs which are not CHECKSUM_PARTIAL
The Linux network stack expects all GSO SKBs to have ip_summed ==
CHECKSUM_PARTIAL (which implies that the frame contains a partial
checksum) and the Xen network ring protocol similarly expects an SKB
which has GSO set to also have NETRX_csum_blank (which also implies a
partial checksum).
However there have been cases of buggy guests which mark a frame as
GSO but do not set csum_blank. If we detect that we a receiving such a
frame (which manifests as ip_summed != PARTIAL && skb_is_gso) then
force the SKB to partial and recalculate the checksum, since we cannot
rely on the peer having done so if they have not set csum_blank.
Add an ethtool stat to track occurances of this event.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: David Miller <davem@davemloft.net>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Jan 2011 00:04:18 +0000 (00:04 +0000)]
econet: remove compiler warnings
net/econet/af_econet.c: In function ‘econet_sendmsg’:
net/econet/af_econet.c:494: warning: label ‘error’ defined but not used
net/econet/af_econet.c:268: warning: unused variable ‘sk’
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 27 Jan 2011 21:52:16 +0000 (13:52 -0800)]
inetpeer: Mark metrics as "new" in fresh inetpeer entries.
Set the RTAX_LOCKED metric to INETPEER_METRICS_NEW (basically,
all ones) on fresh inetpeer entries.
This way code can determine if default metrics have been loaded
in from a routing table entry already.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 27 Jan 2011 04:55:53 +0000 (20:55 -0800)]
inetpeer: Add metrics storage to inetpeer entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Mon, 24 Jan 2011 18:11:54 +0000 (19:11 +0100)]
ath9k: fix misplaced debug code
The commit 'ath9k: Add more information to debugfs xmit file.' added more
debug counters to ath9k and also added some lines of code to ath9k_hw.
Since ath9k_hw is also used by ath9k_htc, its code must not depend on ath9k
data structures. In this case it was not fatal, but it's still wrong, so
the code needs to be moved back to ath9k.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Christian Lamparter [Sat, 22 Jan 2011 23:18:28 +0000 (00:18 +0100)]
carl9170: utilize fw seq counter for mgmt/non-QoS data frames
"mac80211 will properly assign sequence numbers to QoS-data
frames but cannot do so correctly for non-QoS-data and
management frames because beacons need them from that counter
as well and mac80211 cannot guarantee proper sequencing."
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Christian Lamparter [Sat, 22 Jan 2011 23:10:01 +0000 (00:10 +0100)]
carl9170: enable wake-on-lan feature testing
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Christian Lamparter [Sat, 22 Jan 2011 21:46:49 +0000 (22:46 +0100)]
carl9170: update fw/hw headers
This patch syncs up the header files with
the project's main firmware carl9170fw.git.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Thomas Jacob [Thu, 27 Jan 2011 09:56:32 +0000 (10:56 +0100)]
netfilter: xt_iprange: typo in IPv4 match debug print code
Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
David S. Miller [Thu, 27 Jan 2011 04:51:05 +0000 (20:51 -0800)]
net: Implement read-only protection and COW'ing of metrics.
Routing metrics are now copy-on-write.
Initially a route entry points it's metrics at a read-only location.
If a routing table entry exists, it will point there. Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.
The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.
For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing. Very likely
this "somewhere else" will be the inetpeer cache.
Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.
But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads. In those
cases the read-only metric copies stay in place and never get written
to.
TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit. But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.
Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.
Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.
The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline. This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.
Signed-off-by: David S. Miller <davem@davemloft.net>