Merav Sicron [Tue, 19 Jun 2012 07:48:30 +0000 (07:48 +0000)]
bnx2x: Add support for ethtool -L
Add support for ethtool -L/-l for setting and getting the number of RSS queues.
The 'combined' field is used as we don't support separate IRQ for Rx and Tx.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:29 +0000 (07:48 +0000)]
bnx2x: Allow up to 63 RSS queues
This patch removed the limitation in the code for 16 RSS queues.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowski [Tue, 19 Jun 2012 07:48:28 +0000 (07:48 +0000)]
bnx2x: Split the FP structure
This patch moves some fields out of the FP structure to different structures, in
order to minimize size of contigiuous memory allocated.
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:27 +0000 (07:48 +0000)]
bnx2x: Move the CNIC L2 CIDs to be right after the RSS CIDs
Currently the CNIC-related L2 CIDs (for sending control FCoE / iSCSI packets)
were at fixed position, according to the maximal number of RSS queues multiplied
by the number of traffic-classes. This change makes the CIDs dynamic, as they
are defined to be right after the highest RSS CID. This decreases the memory
allocated for the context.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:26 +0000 (07:48 +0000)]
bnx2x: Make the transmission queues adjacent
In the current scheme the transmission queues of traffic-class 0 were 0-15, the
transmission queues of traffic-class 1 were 16-31 and so on. If the number of
RSS queues was smaller than 16, there were gaps in transmission queues
numbering, as well as in CIDs numbering. This is both a waste (especially when
16 is increased to 64), and may causes problems with flushing queues when
reducing the number of RSS queues (using ethtool -L). The new scheme eliminates
the gaps.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:25 +0000 (07:48 +0000)]
bnx2x: Allow more than 64 L2 CIDs
With increased number of RSS queues, each multiplied by the number of traffic-
classes, we may have up to 64*3=192 CIDs. The current driver scheme with regard
to context allocation supports only 64 CIDs. The new scheme enables scatter-
gatehr list of pages for the context.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:24 +0000 (07:48 +0000)]
bnx2x: Add support for 4-tupple UDP RSS
This change enables to control via ethtool whether to do UDP RSS on 2-tupple
(IP source / destination only) or on 4-tupple (include UDP source / destination
port). It also enables to read back the RSS configuration.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:23 +0000 (07:48 +0000)]
bnx2x: Return only online tests for MF
1. In multi-function device, show only the online tests in self-test results as
only these test are performed (offline tests cannot be performed as they may
corrupt the traffic of other functions on the same physical port). Note that
multi-function mode cannot change while the driver is up.
2. Check result code in NIC load and act accordingly.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Tue, 19 Jun 2012 07:48:22 +0000 (07:48 +0000)]
bnx2x: Add support for external LB
This change enables to do self-test with external loopback via ethtool.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stigge@antcom.de [Mon, 18 Jun 2012 10:14:42 +0000 (10:14 +0000)]
net: lpc_eth: Driver cleanup
This patch removes some nowadays superfluous definitions (one unused define and
an obsolete function forward declaration) and corrects a netdev_err() to
netdev_dbg().
Signed-off-by: Roland Stigge <stigge@antcom.de>
Signed-off-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Tue, 19 Jun 2012 03:25:46 +0000 (05:25 +0200)]
netfilter: fix missing symbols if CONFIG_NETFILTER_NETLINK_QUEUE_CT unset
ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined!
We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in
include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Jun 2012 03:26:06 +0000 (20:26 -0700)]
Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo says:
====================
The following patchset provides fixes for issues that were recently introduced
by my new cthelper infrastructure. They have been spotted by Randy Dunlap,
Andrew Morton and Dan Carpenter.
The patches provide:
* compilation fixes if CONFIG_NF_CONNTRACK is disabled: I moved all the
conntrack code from nfnetlink_queue.c to nfnetlink_queue_ct.c to avoid
peppering the entire code with lots of ifdefs. I needed to rename
nfnetlink_queue.c to nfnetlink_queue_core.c to get it working with the
Makefile tweaks I've added.
* fix NULL pointer dereference via ctnetlink while trying to change the helper
for an existing conntrack entry. I don't find any reasonable use case for
changing the helper from one to another in run-time. Thus, now ctnetlink
returns -EOPNOTSUPP for this operation.
* fix possible out-of-bound zeroing of the conntrack extension area due to
the helper automatic assignation routine.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Jun 2012 03:23:55 +0000 (20:23 -0700)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
Included changes:
* major skb->data pointer usage fix
* interval version update
* added get_ethtool_stats() support
* endianess clean up
* routing protocol API improvement wrt TT commit code
* fix locking in hash table code
* minor cleanups and fixes
Pablo Neira Ayuso [Tue, 19 Jun 2012 00:10:57 +0000 (02:10 +0200)]
netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled
In "
9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink"
the compilation with NF_CONNTRACK disabled is broken. This patch fixes this
issue.
I have moved the conntrack part into nfnetlink_queue_ct.c to avoid
peppering the entire nfnetlink_queue.c code with ifdefs.
I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c
to update the net/netfilter/Makefile to support conditional compilation
of the conntrack integration.
This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly
disable the integration between nf_conntrack and nfnetlink_queue.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 18 Jun 2012 19:14:30 +0000 (21:14 +0200)]
netfilter: fix compilation of the nfnl_cthelper if NF_CONNTRACK is unset
This patch fixes the compilation of net/netfilter/nfnetlink_cthelper.c
if CONFIG_NF_CONNTRACK is not set.
This patch also moves the definition of the cthelper infrastructure to
the scope of NF_CONNTRACK things.
I have also renamed NETFILTER_NETLINK_CTHELPER by NF_CT_NETLINK_HELPER,
to use similar names to other nf_conntrack_netlink extensions. Better now
that this has been only for two days in David's tree.
Two new dependencies have been added:
* NF_CT_NETLINK
* NETFILTER_NETLINK_QUEUE
Since these infrastructure requires both ctnetlink and nfqueue.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 18 Jun 2012 15:29:53 +0000 (17:29 +0200)]
netfilter: nf_ct_helper: disable automatic helper re-assignment of different type
This patch modifies __nf_ct_try_assign_helper in a way that invalidates support
for the following scenario:
1) attach the helper A for first time when the conntrack is created
2) attach new (different) helper B due to changes the reply tuple caused by NAT
eg. port redirection from TCP/21 to TCP/5060 with both FTP and SIP helpers
loaded, which seems to be a quite unorthodox scenario.
I can provide a more elaborated patch to support this scenario but explicit
helper attachment provides a better solution for this since now the use can
attach the helpers consistently, without relying on the automatic helper
lookup magic.
This patch fixes a possible out of bound zeroing of the conntrack helper
extension if the helper B uses more memory for its private data than
helper A.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 18 Jun 2012 15:29:53 +0000 (17:29 +0200)]
netfilter: ctnetlink: fix NULL dereference while trying to change helper
The patch
1afc56794e03: "netfilter: nf_ct_helper: implement variable
length helper private data" from Jun 7, 2012, leads to the following
Smatch complaint:
net/netfilter/nf_conntrack_netlink.c:1231 ctnetlink_change_helper()
error: we previously assumed 'help->helper' could be null (see line 1228)
This NULL dereference can be triggered with the following sequence:
1) attach the helper for first time when the conntrack is created.
2) remove the helper module or detach the helper from the conntrack
via ctnetlink.
3) attach helper again (the same or different one, no matter) to the
that existing conntrack again via ctnetlink.
This patch fixes the problem by removing the use case that allows you
to re-assign again a helper for one conntrack entry via ctnetlink since
I cannot find any practical use for it.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Marek Lindner [Fri, 11 May 2012 08:10:50 +0000 (16:10 +0800)]
batman-adv: only store changed gw_bandwidth values
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Matthias Schiffer [Tue, 8 May 2012 20:31:57 +0000 (22:31 +0200)]
batman-adv: fix locking in hash_add()
To ensure an entry isn't added twice all comparisons have to be protected by the
hash line write spinlock. This doesn't really hurt as the case that it is tried
to add an element already present to the hash shouldn't occur very often, so in
most cases the lock would have have to be taken anyways.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Antonio Quartulli [Wed, 9 May 2012 07:50:45 +0000 (09:50 +0200)]
batman-adv: use DBG_ALL in log_level sysfs definition
Each time a new log level is added the developer must change either the DBG_ALL
enum definition and the hard coded value in the bat_sysfs.c for the log_level
attribute max value. This is extremely error prone.
With this patch the code directly uses DBG_ALL in the sysfs definition
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Marek Lindner [Sun, 6 May 2012 20:22:05 +0000 (04:22 +0800)]
batman-adv: turn tt commit code into routing protocol agnostic API
Prior to this patch the translation table code made assumptions about how
the routing protocol works and where its buffers are stored (to directly
modify them).
Each protocol now calls the tt code with the relevant pointers, thereby
abstracting the code.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Matthias Schiffer [Sat, 5 May 2012 15:51:53 +0000 (17:51 +0200)]
batman-adv: fix visualization output without neighbors on the primary interface
The primary entry and the corresponding secondary entries are missing when there
are no neighbors on the primary interface. This also causes the TT entries to
miss and makes nodes with multiply secondary interface fall apart since there
is no way to see they are related without a primary entry.
Fix this by always emitting a primary entry.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Al Viro [Sun, 22 Apr 2012 06:45:29 +0000 (07:45 +0100)]
batman-adv: don't bother flipping ->tt_crc
Keep it net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Al Viro [Sun, 22 Apr 2012 06:44:27 +0000 (07:44 +0100)]
batman-adv: don't bother flipping ->tt_data
just keep it net-endian all along
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[lindner_marek@yahoo.de: fix checkpatch warnings]
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Sat, 5 May 2012 11:27:28 +0000 (13:27 +0200)]
batman-adv: Return error codes instead of -1 on failures
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Al Viro [Sun, 22 Apr 2012 06:46:29 +0000 (07:46 +0100)]
batman-adv: keep batman_ogm_packet ->seqno net-endian all along
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Al Viro [Sun, 22 Apr 2012 06:47:50 +0000 (07:47 +0100)]
batman-adv: trivial endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Al Viro [Sun, 22 Apr 2012 06:50:29 +0000 (07:50 +0100)]
batman-adv: get rid of pointless cast in memcpy()
memcpy() arguments are void *, precisely to avoid that kind of pointless
casts.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Marek Lindner [Mon, 23 Apr 2012 08:32:55 +0000 (16:32 +0800)]
batman-adv: return added entries instead of number of possibly added entries
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Marek Lindner [Wed, 18 Apr 2012 09:16:39 +0000 (17:16 +0800)]
batman-adv: ignore trailing CR when comparing protocol names
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Marek Lindner [Wed, 18 Apr 2012 09:15:57 +0000 (17:15 +0800)]
batman-adv: avoid characters requiring shell escapes in protocol names
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Martin Hundebøll [Fri, 20 Apr 2012 15:02:45 +0000 (17:02 +0200)]
batman-adv: Add get_ethtool_stats() support
Added additional counters in a bat_stats structure, which are exported
through the ethtool api. The counters are specific to batman-adv and
includes:
forwarded packets and bytes
management packets and bytes (aggregated OGMs at this point)
translation table packets
New counters are added by extending "enum bat_counters" in types.h and
adding corresponding descriptive string(s) to bat_counters_strings in
soft-iface.c.
Counters are increased by calling batadv_add_counter() and incremented
by one by calling batadv_inc_counter().
Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Antonio Quartulli [Sat, 14 Apr 2012 11:15:26 +0000 (13:15 +0200)]
batman-adv: convert bat_priv->tt_crc from atomic_t to uint16_t
In the code we neever need to atomically check and set the bat_priv->tt_crc
field value. It is simply set and read once in different pieces of the code.
Therefore this field can be safely be converted from atomic_t to uint16_t.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Thu, 29 Mar 2012 10:38:20 +0000 (12:38 +0200)]
batman-adv: Initialize lockdep class keys for hashes
The hash for claim and backbone hash in the bridge loop avoidance code receive
the same key because they are getting initialized by hash_new with the same
key. Lockdep will create a backtrace when they are used recursively. This can
be avoided by reinitializing the key directly after the hash_new.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Antonio Quartulli [Thu, 14 Jun 2012 20:21:28 +0000 (22:21 +0200)]
batman-adv: fix skb->data assignment
skb_linearize(skb) possibly rearranges the skb internal data and then changes
the skb->data pointer value. For this reason any other pointer in the code that
was assigned skb->data before invoking skb_linearise(skb) must be re-assigned.
In the current tt_query message handling code this is not done and therefore, in
case of skb linearization, the pointer used to handle the packet header ends up
in pointing to poisoned memory. The packet is then dropped but the
translation-table mechanism is corrupted.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Sven Eckelmann [Sun, 17 Jun 2012 11:26:37 +0000 (13:26 +0200)]
batman-adv: update internal version number
Signed-off-by: Sven Eckelmann <sven@narfation.org>
David S. Miller [Mon, 18 Jun 2012 02:47:34 +0000 (19:47 -0700)]
ipv4: Cap ADVMSS metric in the FIB rather than the routing cache.
It makes no sense to execute this limit test every time we create a
routing cache entry.
We can't simply error out on these things since we've silently
accepted and truncated them forever.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 12 Jun 2012 23:58:16 +0000 (23:58 +0000)]
net: lpc_eth: free skbs in start_xmit
Transmitted skbs can be freed immediately in lpc_eth_hard_start_xmit()
instead of at TX completion, since driver copies the frames in DMA area.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Roland Stigge <stigge@antcom.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 17 Jun 2012 02:04:50 +0000 (02:04 +0000)]
bnx2x: correct LPI pass-through configuration
Commit
c8c60d88c59cbb48737732ba948663a3efe882aa contained
an incorrect logic which enabled a buffer overflow when accessing
an array during LPI pass-through configuration.
This patch fixes this issue by removing that logic altogether.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 16 Jun 2012 15:45:44 +0000 (15:45 +0000)]
bnx2: Update version 2.2.2
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 16 Jun 2012 15:45:43 +0000 (15:45 +0000)]
bnx2: Read PCI function number from internal register
so that it will work on any hypervisor.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 16 Jun 2012 15:45:42 +0000 (15:45 +0000)]
bnx2: Dump additional BC_STATE during firmware sync timeout.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 16 Jun 2012 15:45:41 +0000 (15:45 +0000)]
bnx2: Dump all FTQ_CTL registers during tx_timeout
to help debug tx timeouts reported in the field.
Reviewed-by Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 16 Jun 2012 22:23:35 +0000 (15:23 -0700)]
Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo says:
====================
This is the second batch of Netfilter updates for net-next. It contains the
kernel changes for the new user-space connection tracking helper
infrastructure.
More details on this infrastructure are provides here:
http://lwn.net/Articles/500196/
Still, I plan to provide some official documentation through the
conntrack-tools user manual on how to setup user-space utilities for this.
So far, it provides two helper in user-space, one for NFSv3 and another for
Oracle/SQLnet/TNS. Yet in my TODO list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eldad Zack [Sat, 16 Jun 2012 13:14:49 +0000 (15:14 +0200)]
include/net/dst.h: neaten asterisk placement
Fix code style - place the asterisk where it belongs.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Sun, 13 May 2012 19:44:54 +0000 (21:44 +0200)]
netfilter: add user-space connection tracking helper infrastructure
There are good reasons to supports helpers in user-space instead:
* Rapid connection tracking helper development, as developing code
in user-space is usually faster.
* Reliability: A buggy helper does not crash the kernel. Moreover,
we can monitor the helper process and restart it in case of problems.
* Security: Avoid complex string matching and mangling in kernel-space
running in privileged mode. Going further, we can even think about
running user-space helpers as a non-root process.
* Extensibility: It allows the development of very specific helpers (most
likely non-standard proprietary protocols) that are very likely not to be
accepted for mainline inclusion in the form of kernel-space connection
tracking helpers.
This patch adds the infrastructure to allow the implementation of
user-space conntrack helpers by means of the new nfnetlink subsystem
`nfnetlink_cthelper' and the existing queueing infrastructure
(nfnetlink_queue).
I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
two pieces. This change is required not to break NAT sequence
adjustment and conntrack confirmation for traffic that is enqueued
to our user-space conntrack helpers.
Basic operation, in a few steps:
1) Register user-space helper by means of `nfct':
nfct helper add ftp inet tcp
[ It must be a valid existing helper supported by conntrack-tools ]
2) Add rules to enable the FTP user-space helper which is
used to track traffic going to TCP port 21.
For locally generated packets:
iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp
For non-locally generated packets:
iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp
3) Run the test conntrackd in helper mode (see example files under
doc/helper/conntrackd.conf
conntrackd
4) Generate FTP traffic going, if everything is OK, then conntrackd
should create expectations (you can check that with `conntrack':
conntrack -E expect
[NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
[DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
This confirms that our test helper is receiving packets including the
conntrack information, and adding expectations in kernel-space.
The user-space helper can also store its private tracking information
in the conntrack structure in the kernel via the CTA_HELP_INFO. The
kernel will consider this a binary blob whose layout is unknown. This
information will be included in the information that is transfered
to user-space via glue code that integrates nfnetlink_queue and
ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 7 Jun 2012 12:19:42 +0000 (14:19 +0200)]
netfilter: ctnetlink: add CTA_HELP_INFO attribute
This attribute can be used to modify and to dump the internal
protocol information.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 7 Jun 2012 11:31:25 +0000 (13:31 +0200)]
netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled
User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually puzzles sequence tracking, leading to
traffic disruptions.
With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:
1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.
There are some records on the Internet complaning about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables
By now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 7 Jun 2012 10:13:39 +0000 (12:13 +0200)]
netfilter: add glue code to integrate nfnetlink_queue and ctnetlink
This patch allows you to include the conntrack information together
with the packet that is sent to user-space via NFQUEUE.
Previously, there was no integration between ctnetlink and
nfnetlink_queue. If you wanted to access conntrack information
from your libnetfilter_queue program, you required to query
ctnetlink from user-space to obtain it. Thus, delaying the packet
processing even more.
Including the conntrack information is optional, you can set it
via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 7 Jun 2012 10:11:50 +0000 (12:11 +0200)]
netfilter: nf_ct_helper: implement variable length helper private data
This patch uses the new variable length conntrack extensions.
Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.
This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 1 Feb 2012 15:18:31 +0000 (16:18 +0100)]
netfilter: nf_ct_ext: support variable length extensions
We can now define conntrack extensions of variable size. This
patch is useful to get rid of these unions:
union nf_conntrack_help
union nf_conntrack_proto
union nf_conntrack_nat_help
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Sun, 15 Jan 2012 15:34:08 +0000 (16:34 +0100)]
netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names
This patch modifies the struct nf_conntrack_helper to allocate
the room for the helper name. The maximum length is 16 bytes
(this was already introduced in 2.6.24).
For the maximum length for expectation policy names, I have
also selected 16 bytes.
This patch is required by the follow-up patch to support
user-space connection tracking helpers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Sat, 16 Jun 2012 08:23:04 +0000 (01:23 -0700)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
net/ipv6/route.c
Pull in 'net' again to get the revert of Thomas's change
which introduced regressions.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 16 Jun 2012 08:12:19 +0000 (01:12 -0700)]
Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route"
This reverts commit
2a0c451ade8e1783c5d453948289e4a978d417c9.
It causes crashes, because now ip6_null_entry is used before
it is initialized.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 16 Jun 2012 03:01:57 +0000 (20:01 -0700)]
ipv6: Fix types of ip6_update_pmtu().
The mtu should be a __be32, not the mark.
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2012 22:51:55 +0000 (15:51 -0700)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
net/ipv6/route.c
This deals with a merge conflict between the net-next addition of the
inetpeer network namespace ops, and Thomas Graf's bug fix in
2a0c451ade8e1783c5d453948289e4a978d417c9 which makes sure we don't
register /proc/net/ipv6_route before it is actually safe to do so.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2012 22:37:05 +0000 (15:37 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Thomas Graf [Thu, 14 Jun 2012 23:00:17 +0000 (23:00 +0000)]
ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route
/proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
handler is installed in ip6_route_net_init() whereas fib_table_hash is
allocated in fib6_net_init() _after_ the proc handler has been installed.
This opens up a short time frame to access fib_table_hash with its pants
down.
fib6_init() as a whole can't be moved to an earlier position as it also
registers the rtnetlink message handlers which should be registered at
the end. Therefore split it into fib6_init() which is run early and
fib6_init_late() to register the rtnetlink message handlers.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 14 Jun 2012 08:34:24 +0000 (08:34 +0000)]
qlcnic: off by one in qlcnic_init_pci_info()
The adapter->npars[] array has QLCNIC_MAX_PCI_FUNC elements. We
allocate it that way a few lines earlier in the function. So this test
is off by one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 14 Jun 2012 06:42:44 +0000 (06:42 +0000)]
net: remove skb_orphan_try()
Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
sk_sndbuf is not too big)
We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.
Reverts commits :
fc6055a5ba31 net: Introduce skb_orphan_try()
87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag
Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 13 Jun 2012 09:45:16 +0000 (09:45 +0000)]
bnx2x: fix panic when TX ring is full
There is a off by one error in the minimal number of BD in
bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming tx queue.
A full size GSO packet, with data included in skb->head really needs
(MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split()
This error triggers if BQL is disabled and heavy TCP transmit traffic
occurs.
bnx2x_tx_split() definitely can be called, remove a wrong comment.
Reported-by: Tomas Hruby <thruby@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Yaniv Rosner <yanivr@broadcom.com>
Cc: Merav Sicron <meravs@broadcom.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Robert Evans <evansr@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 15 Jun 2012 00:20:44 +0000 (00:20 +0000)]
can: c_can: precedence error in c_can_chip_config()
(CAN_CTRLMODE_LISTENONLY & CAN_CTRLMODE_LOOPBACK) is (0x02 & 0x01) which
is zero so the condition is never true. The intent here was to test
that both flags were set.
Cc: <stable@kernel.org> # 2.6.39+
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2012 21:54:11 +0000 (14:54 -0700)]
ipv6: Handle PMTU in ICMP error handlers.
One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
to handle the error pass the 32-bit info cookie in network byte order
whereas ipv4 passes it around in host byte order.
Like the ipv4 side, we have two helper functions. One for when we
have a socket context and one for when we do not.
ip6ip6 tunnels are not handled here, because they handle PMTU events
by essentially relaying another ICMP packet-too-big message back to
the original sender.
This patch allows us to get rid of rt6_do_pmtu_disc(). It handles all
kinds of situations that simply cannot happen when we do the PMTU
update directly using a fully resolved route.
In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
likely be removed or changed into a BUG_ON() check. We should never
have a prefixed ipv6 route when we get there.
Another piece of strange history here is that TCP and DCCP, unlike in
ipv4, never invoke the update_pmtu() method from their ICMP error
handlers. This is incredibly astonishing since this is the context
where we have the most accurate context in which to make a PMTU
update, namely we have a fully connected socket and associated cached
socket route.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 15 Jun 2012 05:21:46 +0000 (22:21 -0700)]
ipv4: Handle PMTU in all ICMP error handlers.
With ip_rt_frag_needed() removed, we have to explicitly update PMTU
information in every ICMP error handler.
Create two helper functions to facilitate this.
1) ipv4_sk_update_pmtu()
This updates the PMTU when we have a socket context to
work with.
2) ipv4_update_pmtu()
Raw version, used when no socket context is available. For this
interface, we essentially just pass in explicit arguments for
the flow identity information we would have extracted from the
socket.
And you'll notice that ipv4_sk_update_pmtu() is simply implemented
in terms of ipv4_update_pmtu()
Note that __ip_route_output_key() is used, rather than something like
ip_route_output_flow() or ip_route_output_key(). This is because we
absolutely do not want to end up with a route that does IPSEC
encapsulation and the like. Instead, we only want the route that
would get us to the node described by the outermost IP header.
Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 14 Jun 2012 12:46:59 +0000 (15:46 +0300)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Marcelo Tosatti:
"Fix a spurious warning on CPU offline path"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86: kvmclock: remove check_and_clear_guest_paused warning
Linus Torvalds [Thu, 14 Jun 2012 12:43:32 +0000 (15:43 +0300)]
Merge tag 'pinctrl-fixes-for-v3.5' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fixes from Linus Walleij:
- section markup fixes
- clk_prepare() fix to conform to the clk API
- memory leaks
- incorrect debug messages
- bad errorpaths
- typos
* tag 'pinctrl-fixes-for-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: pinctrl-mxs: set platform driver data to NULL at errpath and at unregister
pinctrl: pinctrl-mxs: Take care of frees if the kzalloc fails
pinctrl: pinctrl-imx: fix incorrect debug message of maps
pinctrl: pinctrl-imx: free if of_get_parent fails to get the parent node
pinctrl: pinctrl-imx: free allocated pinctrl_map structure only once and use kernel facilities for IMX_PMX_DUMP
pinctrl: nomadik: fix up typo
pinctrl: nomadik: add clk_prepare() call
pinctrl: fix a minor harmless typo
pinctrl: sirf: mark of_device_id match table as __devinitconst
Linus Torvalds [Thu, 14 Jun 2012 12:38:48 +0000 (15:38 +0300)]
Merge tag 'sound-3.5' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
- Fix a regression of USB-audio PCM assignment since 3.4
- A few VGA-switcheroo-related fixes for proper HDMI audio enablement
- Fixed the missing initializations of HD-audio verbs, which may have
resulted in various breakage
- Some driver-specific ASoC updates
- A few fixes for the dynamic PCM code
- The addition of pinctrl support for the i.MX audmux which didn't make
it into -rc1 due to cross tree dependency issues
- A few minor fixes in compress API codes
* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Don't forget to call init verbs added by fixup list
ALSA: HDA: Pin fixup for Zotac Z68 motherboard
ALSA: compress_core: cleanup pointers on stop
ALSA: compress_core: don't wake up on pause
ALSA: hda - Fix detection of Creative SoundCore3D controllers
vga_switcheroo: Enable/disable audio clients at the right time
ALSA: hda - HDMI Audio init all connectors when VGA-switcheroo is off
vga_switcheroo: Fix error without CONFIG_VGA_SWITCHEROO
ALSA: hda - Fix uninitialized HDMI controllers with VGA-switcheroo
vga_switcheroo: Add a helper function to get the client state
ALSA: usb-audio: Fix substream assignments
ASoC: tegra: add MODULE_DEVICE_TABLE to tegra30_ahub
ASoC: wm2000: Always use a 4s timeout for the firmware
ASoC: dapm: Fix input list to use source widgets
ASoC: dpcm: Fix dpcm_get_be() to check that DAI is BE
ASoC: wm8994: Apply volume updates with clocks enabled
ASoC: wm8994: Ensure all AIFnCLK events are run from the _late variants
ASoC: imx-audmux: add pinctrl support
ASoC: dapm: Fix connected widget capture path query.
Linus Torvalds [Thu, 14 Jun 2012 12:33:55 +0000 (15:33 +0300)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David S. Miller:
This has the fix for the wireless issues I ran into the other week as
well as:
1) Fix CAN c_can driver transmit handling resulting in BUG check
triggers, from AnilKumar Ch.
2) Fix packet drop monitor sleeping in atomic context, from Eric
Dumazet.
3) Fix mv643xx_eth driver build regression, from Andrew Lunn.
4) Inetpeer freeing needs an RCU grace period in order to avoid races
during tree invalidation. From Eric Dumazet.
5) Fix endianness bugs in xt_HMARK netfilter module, from Hans
Schillstrom.
6) Add proper module refcounting to l2tp_eth to avoid crash on module
unload, from Eric Dumazet.
7) Fix truncation of neighbour entry dumps due to logic errors in
neigh_dump_info() and friends, from Eric Dumazet.
8) The conversion of fib6_age() to dst_neigh_lookup() accidently
reversed the logic of a flags test, fix from Thomas Graf.
9) Fix checksum configuration in newer sky2 chips, from Stephen
Hemminger.
10) Revert BQL support in NIU driver, doesn't work.
11) l2tp_ip_sendmsg() illegally uses a route without a proper reference.
From Eric Dumazet.
12) be2net driver references an SKB after it's potentially been freed,
also from Eric Dumazet.
13) Fix RCU stalls in dummy net driver init. Also from Eric Dumazet.
14) lpc_eth has several bugs in it's transmit engine leading to packet
leaks and improper queue wakes, from Eric Dumazet.
15) Apply short DMA workaround to more tg3 chips, from Matt Carlson.
16) Add tilegx network driver.
17) Bonding queue mapping for a packet can get corrupted, fix from Eric
Dumazet.
18) Fix bug in netpoll_send_udp() SKB management that can leave garbage
in the payload in certain situations. From Eric Dumazet.
19) bnx2x driver interprets chip RX checksum offload incorrectly in
encapsulation situations. Fix from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
bnx2x: fix checksum validation
netpoll: fix netpoll_send_udp() bugs
bonding: Fix corrupted queue_mapping
bonding:record primary when modify it via sysfs
tilegx network driver: initial support
tg3: Apply short DMA frag workaround to 5906
net: stmmac: Fix clock en-/disable calls
lpc_eth: fix tx completion
lpc_eth: add missing ndo_change_mtu()
dummy: fix rcu_sched self-detected stalls
net: Reorder initialization in ip_route_output to fix gcc warning
virtio-net: fix a race on 32bit arches
r8169: avoid NAPI scheduling delay.
net: Make linux/tcp.h C++ friendly (trivial)
netdev: fix drivers/net/phy/ kernel-doc warnings
net/core: fix kernel-doc warnings
be2net: fix a race in be_xmit()
l2tp: fix a race in l2tp_ip_sendmsg()
mac80211: add back channel change flag
NFC: Fix possible NULL ptr deref when getting the name of a socket
...
Jacob Keller [Tue, 22 May 2012 06:18:08 +0000 (06:18 +0000)]
ixgbe: Check PTP Rx timestamps via BPF filter
This patch fixes a potential Rx timestamp deadlock that causes the Rx
timestamping to stall indefinitely. The issue could occur when a PTP packet is
timestamped by hardware but never reaches the Rx queue. In order to prevent a
permanent loss of timestamping, the RXSTMP(L/H) registers have to be read to
unlock them. (This used to only occur when a packet that was timestamped
reached the software.) However the registers can't be read early otherwise
there is no way to correlate them to the packet.
This patch introduces a filter function which can be used to determine if a
packet should have been timestamped. Supplied with the filter setup by the
hwtstamp ioctl, check to make sure the PTP protocol and message type match the
expected values. If so, then read the timestamp registers (to free them.) At
this point check the descriptor bit, if the bit is set then we know this
packet correlates to the timestamp stored in the RXTSTAMP registers.
Otherwise, assume that packet was dropped by the hardware, and ignore this
timestamp value. However, we have at least unlocked the rxtstamp registers for
future timestamping.
Due to the way the driver handles skb data, it cannot be directly accessed. In
order to work around this, a copy of the skb data into a linear buffer is
made. From this buffer it becomes possible to read the data correctly
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Tue, 22 May 2012 06:08:37 +0000 (06:08 +0000)]
ixgbe: PTP Fix hwtstamp mode settings
When enabling the hwtstamp mode for Rx timestamping the V2 ptp event type
specific modes (Delay Request and Sync) have been rolled into the V2 all event
packet modes, in order to more accurately represent what hardware is doing.
Hardware always timestamps the Path delay packets when a V2 mode is selected,
regardless of what type was selected (in order to always support Path delay
mode). However this means the user selected modes of timestamping only Sync or
Delay Request is not truly supported. This patch correctly sets the mode for
the hwtstamp config and returns to the user that all V2 event packets will be
timestamped.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Tue, 22 May 2012 06:08:32 +0000 (06:08 +0000)]
ixgbe: ptp code cleanup
This patch fixes two minor nits from Richard Cochran. The first is a case of
ambitious line wrapping that wasn't necessary. The second is to re-order the
flag checks for PPS support. Previously, the hardware test was done first, and
the interrupt flag test was done second. Now, test the interrupt flag and use
the unlikely macro.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Wed, 16 May 2012 07:06:38 +0000 (07:06 +0000)]
ixgbe: do not compile ixgbe_sysfs.c when CONFIG_IXGBE_HWMON is not set
ixgbe_sysfs.c is only needed when CONFIG_IXGBE_HWMON is configured in the
kernel.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Don Skidmore <Donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 28 Mar 2012 11:42:45 +0000 (11:42 +0000)]
ixgbe: align flow control DV macros with datasheet
The flow control DV macros are used to calculate the flow control
high and low thresholds. This patch annotates these macros slightly
better and fixes the issues below.
The macro variables are renamed LINK to _max_frame_link and TC to
_max_frame_tc. This was to avoid confusion and make them more
readable. It was found that people auditing the code read TC to be
'traffic class' in the 802.1Q definition instead of the max frame
size of the tc. Hopefully it is clear now.
This audit also found the following real deviations from the
theoretical values. Fixed in this patch.
* I multiplied the DV calculations by (36/25) which always
evaluates to 1. This does not match the intended theoretical
value of 1.44.
* IXGBE_BT2KB added 1023 to account for rounding however this
really should be 8 * 1023 - 1 to account for division by 8k.
* x2 multiplication of max frame in DV calculations to account
for updated hardware recommendations.
With this patch the DV values are inline with the recommendations
in the 82599 and 82598 data sheets. Its worth noting I did not
see any dropped frames with flow control on in my experiments without
this patch. However aligning with the hardware specs and
recommendations seems like a good idea here to account for worst
case scenarios.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 7 Jun 2012 02:23:37 +0000 (02:23 +0000)]
e1000e: use more informative logging macros when netdev not yet registered
Based on a report from Ethan Zhao, before calling register_netdev() the
driver should be using logging macros that do not display the potentially
confusing "(unregistered net_device)" yet still display the useful driver
name and PCI bus/device/function.
Reported-by: Ethan Zhao <ethan.kernel@gmail.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Thomas Graf [Wed, 13 Jun 2012 22:40:15 +0000 (22:40 +0000)]
dcbnl: Use BUG_ON() instead of BUG()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 22:34:03 +0000 (22:34 +0000)]
dcbnl: Silence harmless gcc warning about uninitialized reply_nlh
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 13 Jun 2012 05:30:07 +0000 (05:30 +0000)]
bonding: drop_monitor aware
When packets are dropped in TX path, its better to use kfree_skb()
instead of dev_kfree_skb() to give proper drop_monitor events.
Also move the kfree_skb() call after read_unlock() in bond_alb_xmit()
and bond_xmit_activebackup()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 12 Jun 2012 23:50:04 +0000 (23:50 +0000)]
bnx2x: fix checksum validation
bnx2x driver incorrectly sets ip_summed to CHECKSUM_UNNECESSARY on
encapsulated segments. TCP stack happily accepts frames with bad
checksums, if they are inside a GRE or IPIP encapsulation.
Our understanding is that if no IP or L4 csum validation was done by the
hardware, we should leave ip_summed as is (CHECKSUM_NONE), since
hardware doesn't provide CHECKSUM_COMPLETE support in its cqe.
Then, if IP/L4 checksumming was done by the hardware, set
CHECKSUM_UNNECESSARY if no error was flagged.
Patch based on findings and analysis from Robert Evans
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Yaniv Rosner <yanivr@broadcom.com>
Cc: Merav Sicron <meravs@broadcom.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Robert Evans <evansr@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 12 Jun 2012 19:30:21 +0000 (19:30 +0000)]
netpoll: fix netpoll_send_udp() bugs
Bogdan Hamciuc diagnosed and fixed following bug in netpoll_send_udp() :
"skb->len += len;" instead of "skb_put(skb, len);"
Meaning that _if_ a network driver needs to call skb_realloc_headroom(),
only packet headers would be copied, leaving garbage in the payload.
However the skb_realloc_headroom() must be avoided as much as possible
since it requires memory and netpoll tries hard to work even if memory
is exhausted (using a pool of preallocated skbs)
It appears netpoll_send_udp() reserved 16 bytes for the ethernet header,
which happens to work for typicall drivers but not all.
Right thing is to use LL_RESERVED_SPACE(dev)
(And also add dev->needed_tailroom of tailroom)
This patch combines both fixes.
Many thanks to Bogdan for raising this issue.
Reported-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:55:01 +0000 (02:55 +0000)]
dcbnl: Use type safe nlmsg_data()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:55:00 +0000 (02:55 +0000)]
dcbnl: Move dcb app allocation into dcb_app_add()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:59 +0000 (02:54 +0000)]
dcbnl: Move dcb app lookup code into dcb_app_lookup()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:58 +0000 (02:54 +0000)]
dcbnl: Return consistent error codes
EMSGSIZE - ran out of space while constructing message
EOPNOTSUPP - driver/hardware does not support operation
ENODEV - network device not found
EINVAL - invalid message
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:57 +0000 (02:54 +0000)]
dcbnl: Use dcbnl_newmsg() where possible
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:56 +0000 (02:54 +0000)]
dcbnl: Remove now unused dcbnl_reply()
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:55 +0000 (02:54 +0000)]
dcbnl: Shorten all command handling functions
Allocating and sending the skb in dcb_doit() allows for much
shorter and cleaner command handling functions.
The huge switch statement is replaced with an array based definition
of the handling function and reply message type.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Wed, 13 Jun 2012 02:54:54 +0000 (02:54 +0000)]
dcbnl: Prepare framework to shorten handling functions
There is no need to allocate and send the reply message in each
handling function separately. Instead, the reply skb can be allocated
and sent in dcb_doit() directly.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 13 Jun 2012 20:19:34 +0000 (23:19 +0300)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: Kill off additional asm-generic wrappers.
sh: Setup CROSS_COMPILE at the top
sh: Fix up link time defsym warnings.
sh: use the new generic strnlen_user() function
sh: switch to generic strncpy_from_user().
sh: Kill off last dead UBC header
serial: sh-sci: Make probe fail for ports that exceed the maximum count
serial: sh-sci: Fix probe error paths
clocksource: sh_tmu: Use clockevents_config_and_register().
clocksource: sh_tmu: Convert timer lock to raw spinlock.
clocksource: sh_mtu2: Convert timer lock to raw spinlock.
clocksource: sh_cmt: Convert timer lock to raw spinlock.
bug.h: need linux/kernel.h for TAINT_WARN.
sh: convert to kbuild asm-generic support.
sh64: Fix up fallout from generic init_task conversion.
sh: arch/sh/kernel/process.c needs asm/fpu.h for unlazy_fpu().
Linus Torvalds [Wed, 13 Jun 2012 20:18:41 +0000 (23:18 +0300)]
Merge branch 'fixes-for-3.5' of git://git./linux/kernel/git/cooloney/linux-leds
Pull led fixes from Bryan Wu.
* 'fixes-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: Make LEDS_ASIC3 and LEDS_RENESAS_TPU depend on LEDS_CLASS=y
leds: fixed a coding style issue.
leds: don't disable blinking when writing the same value to delay_on or delay_off
Linus Torvalds [Wed, 13 Jun 2012 20:17:12 +0000 (23:17 +0300)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k update from Geert Uytterhoeven.
This makes m68k use the generic library functions for the user-space
strn[cpy|len] functions.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Use generic strncpy_from_user(), strlen_user(), and strnlen_user()
Linus Torvalds [Wed, 13 Jun 2012 14:57:30 +0000 (17:57 +0300)]
Merge tag 'omapdss-for-3.5-rc2' of git://gitorious.org/linux-omap-dss2/linux
Pull omapdss build problem fix from Tomi Valkeinen:
"Small fixes for omapdss driver. Most importantly, fixes a build
problem when debugfs or omapdss debug support is turned off, and fixes
a suspend related crash."
This has apparently been annoying rmk for a while..
* tag 'omapdss-for-3.5-rc2' of git://gitorious.org/linux-omap-dss2/linux:
OMAPDSS: fix registration of DPI and SDI devices
OMAPDSS: DSI: Fix bug when calculating LP command interleaving parameters
OMAPDSS: fix bogus WARN_ON in dss_runtime_put()
OMAPDSS: Taal: fix compilation warning
OMAPDSS: fix build when DEBUG_FS or DSS_DEBUG_SUPPORT disabled
Takashi Iwai [Wed, 13 Jun 2012 14:47:32 +0000 (16:47 +0200)]
ALSA: hda - Don't forget to call init verbs added by fixup list
During the split to the auto-parser helper functions, the actual call
of init verbs was lost.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43366
Signed-off-by: Takashi Iwai <tiwai@suse.de>
David S. Miller [Wed, 13 Jun 2012 04:59:18 +0000 (21:59 -0700)]
Merge git://git./linux/kernel/git/davem/net
Conflicts:
MAINTAINERS
drivers/net/wireless/iwlwifi/pcie/trans.c
The iwlwifi conflict was resolved by keeping the code added
in 'net' that turns off the buggy chip feature.
The MAINTAINERS conflict was merely overlapping changes, one
change updated all the wireless web site URLs and the other
changed some GIT trees to be Johannes's instead of John's.
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mundt [Wed, 13 Jun 2012 03:01:33 +0000 (12:01 +0900)]
Merge branches 'sh/urgent', 'sh/core', 'sh/clockevents', 'sh/asm-generic' and 'sh/trivial' into sh-fixes-for-linus
Paul Mundt [Wed, 13 Jun 2012 02:59:47 +0000 (11:59 +0900)]
sh: Kill off additional asm-generic wrappers.
A few wrappers were overlooked in the initial conversion, take care of
them now.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Geert Uytterhoeven [Sat, 12 May 2012 20:39:07 +0000 (22:39 +0200)]
sh: Setup CROSS_COMPILE at the top
CROSS_COMPILE must be setup before using e.g. cc-option (and a few other
as-*, cc-*, ld-* macros), else they will check against the wrong compiler
when cross-compiling, and may invoke the cross compiler with wrong or
suboptimal compiler options.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Wed, 13 Jun 2012 02:36:36 +0000 (11:36 +0900)]
sh: Fix up link time defsym warnings.
sh-linux-gnu-ld:--defsym 'jiffies=jiffies_64': ignoring invalid character `'' in expression
For some reason ld has recently started complaining about the quotes, so just
get rid of them, we don't need them for anything anyways.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Ben Hutchings [Tue, 12 Jun 2012 13:05:41 +0000 (13:05 +0000)]
ethtool: Make more commands available to unprivileged processes
'Get' commands should generally not require CAP_NET_ADMIN, with
the exception of those that expose internal state.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michel Machado [Tue, 12 Jun 2012 10:16:35 +0000 (10:16 +0000)]
net-next: add dev_loopback_xmit() to avoid duplicate code
Add dev_loopback_xmit() in order to deduplicate functions
ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and
ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c).
I was about to reinvent the wheel when I noticed that
ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I
need and are not IP-only functions, but they were not available to reuse
elsewhere.
ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I
understand that this is harmless, and should be in dev_loopback_xmit().
Signed-off-by: Michel Machado <michel@digirati.com.br>
CC: "David S. Miller" <davem@davemloft.net>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
CC: James Morris <jmorris@namei.org>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jpirko@redhat.com>
CC: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 11 Jun 2012 19:23:07 +0000 (19:23 +0000)]
bonding: remove packet cloning in recv_probe()
Cloning all packets in input path have a significant cost.
Use skb_header_pointer()/skb_copy_bits() instead of pskb_may_pull() so
that recv_probe handlers (bond_3ad_lacpdu_recv / bond_arp_rcv /
rlb_arp_recv ) dont touch input skb.
bond_handle_frame() can avoid the skb_clone()/dev_kfree_skb()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Cc: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>