Florian Westphal [Tue, 4 Jun 2013 22:22:16 +0000 (22:22 +0000)]
netfilter: nfnetlink_queue: only add CAP_LEN attr when needed
CAP_LEN contains the size of the network packet we're queueing to
userspace, i.e. normally it is the same as the NFQA_PAYLOAD attribute len.
Include it only in the unlikely case when NFQA_PAYLOAD is truncated due
to copy_range limitations.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
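The rule described in the commit above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: copied_len() and needs_cap_len() are hypothetical names, and copy_range is assumed to be non-zero already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: number of payload bytes actually copied to
 * userspace for a packet of pkt_len bytes under a given copy_range. */
static size_t copied_len(size_t pkt_len, size_t copy_range)
{
    assert(copy_range > 0); /* assumption: 0 was normalised at config time */
    return pkt_len < copy_range ? pkt_len : copy_range;
}

/* CAP_LEN carries the full packet size, so it only adds information
 * when the copied payload is shorter than the packet. */
static bool needs_cap_len(size_t pkt_len, size_t copy_range)
{
    return copied_len(pkt_len, copy_range) < pkt_len;
}
```

With these helpers, a 100-byte packet under a 64-byte copy_range would carry the extra attribute, while a 40-byte packet would not.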
Florian Westphal [Tue, 4 Jun 2013 22:22:15 +0000 (22:22 +0000)]
netfilter: nfnetlink_queue: cleanup copy_range usage
For every packet queued, we check if configured copy_range
is 0, and treat that as 'copy entire packet'.
We can move this check to the queue configuration, and can
set copy_range appropriately.
Also, convert repetitive '0xffff - NLA_HDRLEN' to a macro.
[ queue initialization still used 0xffff, although it's harmless
since the initial setting is overwritten on queue config ]
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
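Moving the check to configuration time, as the commit above describes, might look roughly like the following sketch. The names MAX_COPY_RANGE, SKETCH_NLA_HDRLEN and normalise_copy_range() are illustrative assumptions and are not taken from the patch; the header size of 4 bytes is also assumed here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NLA_HDRLEN 4                      /* assumed attribute header size */
#define MAX_COPY_RANGE (0xffff - SKETCH_NLA_HDRLEN)

/* Hypothetical config-time helper: treat 0 as "copy entire packet" and
 * clamp oversized requests, so the per-packet path never has to check. */
static uint32_t normalise_copy_range(uint32_t requested)
{
    uint32_t range = requested;

    if (range == 0 || range > MAX_COPY_RANGE)
        range = MAX_COPY_RANGE;

    assert(range > 0 && range <= MAX_COPY_RANGE);
    return range;
}
```

The design point is that the special value is resolved once per configuration change instead of once per queued packet.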
Jeff Mahoney [Wed, 22 May 2013 12:59:10 +0000 (14:59 +0200)]
netfilter: Implement RFC 1123 for FTP conntrack
The FTP conntrack code currently only accepts the following format for
the 227 response for PASV:
227 Entering Passive Mode (148,100,81,40,31,161).
It doesn't accept the following format from an obscure server:
227 Data transfer will passively listen to 67,218,99,134,50,144
From RFC 1123:
The format of the 227 reply to a PASV command is not
well standardized. In particular, an FTP client cannot
assume that the parentheses shown on page 40 of RFC-959
will be present (and in fact, Figure 3 on page 43 omits
them). Therefore, a User-FTP program that interprets
the PASV reply must scan the reply for the first digit
of the host and port numbers.
This patch adds support for the RFC 1123 clarification by:
- Allowing a search filter to specify NUL as the terminator so that
try_number will return successfully if the array of numbers has been
filled when an unexpected character is encountered.
- Using space as the separator for the 227 reply and then scanning for
the first digit of the number sequence. The number sequence is parsed
out using the existing try_rfc959 but with a NUL terminator.
References: https://bugzilla.novell.com/show_bug.cgi?id=466279
References: http://bugzilla.netfilter.org/show_bug.cgi?id=574
Reported-by: Mark Post <mpost@novell.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netfilter-devel@vger.kernel.org
Cc: netfilter@vger.kernel.org
Cc: coreteam@netfilter.org
Cc: netdev@vger.kernel.org
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
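The RFC 1123 scanning rule quoted in the commit above can be illustrated with a small userspace parser. This is a sketch under stated assumptions, not the kernel's try_number() or try_rfc959(): parse_pasv_reply() is a hypothetical name, and the sketch accepts any character (or end of string) after the sixth number.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: parse "h1,h2,h3,h4,p1,p2" from a 227 reply.
 * Returns 1 and fills out[6] on success, 0 otherwise. */
static int parse_pasv_reply(const char *reply, unsigned int out[6])
{
    const char *p = reply;
    size_t i;

    assert(reply != NULL && out != NULL);

    /* The status code must be "227" followed by a space separator. */
    if (p[0] != '2' || p[1] != '2' || p[2] != '7' || p[3] != ' ')
        return 0;
    p += 4;

    /* RFC 1123: scan for the first digit of the host and port numbers. */
    while (*p && !isdigit((unsigned char)*p))
        p++;
    if (!*p)
        return 0;

    for (i = 0; i < 6; i++) {
        unsigned int v = 0;

        if (!isdigit((unsigned char)*p))
            return 0;
        while (isdigit((unsigned char)*p)) {
            v = v * 10 + (unsigned int)(*p - '0');
            if (v > 255)
                return 0;          /* each field is a single octet */
            p++;
        }
        out[i] = v;
        if (i < 5) {
            if (*p != ',')
                return 0;
            p++;
        }
    }
    return 1;                      /* no closing parenthesis required */
}
```

Both reply formats shown in the commit message parse with this sketch, because the parentheses are never required.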
Florian Westphal [Sat, 25 May 2013 01:46:10 +0000 (01:46 +0000)]
netfilter: nfnetlink_queue: avoid peer_portid test
The portid is set to NETLINK_CB(skb).portid at create time.
The run-time check will always be false.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Zhang Yanfei [Mon, 29 Apr 2013 18:55:10 +0000 (11:55 -0700)]
ipvs: change type of netns_ipvs->sysctl_sync_qlen_max
This member of struct netns_ipvs is calculated from nr_free_buffer_pages,
so change its type to unsigned long to avoid overflow. The type of
the related proc variable sync_qlen_max and the return type of the function
sysctl_sync_qlen_max() should be changed to unsigned long, too.
The type of ipvs_master_sync_state->sync_queue_len should also be
changed to unsigned long accordingly.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
Simon Horman [Wed, 22 May 2013 05:50:32 +0000 (14:50 +0900)]
ipvs: use cond_resched_rcu() helper when walking connections
This avoids the situation where walking of a large number of connections
may prevent scheduling for a long time while also avoiding excessive
calls to rcu_read_unlock() and rcu_read_lock().
Note that in the case of !CONFIG_PREEMPT_RCU this will
add a call to cond_resched().
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Simon Horman [Wed, 22 May 2013 05:50:31 +0000 (14:50 +0900)]
sched: add cond_resched_rcu() helper
This is intended for use in loops which read data protected by RCU and may
have a large number of iterations. Such an example is dumping the list of
connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
The benefits are for CONFIG_PREEMPT_RCU=y where we save CPU cycles
by moving rcu_read_lock and rcu_read_unlock out of large loops
but still allowing the current task to be preempted after every
loop iteration for the CONFIG_PREEMPT_RCU=n case.
The call to cond_resched() is not needed when CONFIG_PREEMPT_RCU=y.
Thanks to Paul E. McKenney for explaining this and for the
final version that checks the context with CONFIG_DEBUG_ATOMIC_SLEEP=y
for all possible configurations.
The function can be empty in the CONFIG_PREEMPT_RCU case:
rcu_read_lock and rcu_read_unlock are not needed in this case
because the task can be preempted on indication from the scheduler.
Thanks to Peter Zijlstra for catching this and for his help
in trying a solution that changes __might_sleep.
Initial cond_resched_rcu_lock() function suggested by Eric Dumazet.
Tested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
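The shape of the helper described above can be modelled in userspace with stand-in primitives. Everything below is a sketch under stated assumptions: the three stubs only count calls, SKETCH_PREEMPT_RCU is a hypothetical macro standing in for CONFIG_PREEMPT_RCU=y, and the CONFIG_DEBUG_ATOMIC_SLEEP handling mentioned in the commit message is left out.

```c
#include <assert.h>

/* Userspace stand-ins; the real primitives exist only in the kernel. */
static int rcu_depth;      /* models RCU read-side nesting */
static int resched_calls;  /* counts chances given to the scheduler */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }
static void cond_resched(void)    { assert(rcu_depth == 0); resched_calls++; }

/* Sketch of the helper: in the preemptible-RCU model it is empty,
 * otherwise it briefly leaves the read-side section to reschedule. */
static void cond_resched_rcu(void)
{
#ifndef SKETCH_PREEMPT_RCU
    rcu_read_unlock();
    cond_resched();
    rcu_read_lock();
#endif
}

/* A long walk takes the read lock once and calls the helper per item. */
static int walk_entries(int n)
{
    int visited = 0;

    rcu_read_lock();
    for (int i = 0; i < n; i++) {
        visited++;
        cond_resched_rcu();
    }
    rcu_read_unlock();
    return visited;
}
```

The caller's loop stays the same in both configurations; only the helper body changes.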
Pablo Neira Ayuso [Wed, 22 May 2013 22:42:37 +0000 (22:42 +0000)]
netfilter: {ipt,ebt}_ULOG: raise warning on deprecation
This target has been superseded by NFLOG. Emit a warning
so that we can prepare for its removal in a couple of years.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Pablo Neira Ayuso [Wed, 22 May 2013 22:42:36 +0000 (22:42 +0000)]
netfilter: don't panic on error while walking through the init path
Don't panic if we hit an error while adding the nf_log or pernet
netfilter support, just bail out.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
Chen Gang [Thu, 16 May 2013 22:07:22 +0000 (22:07 +0000)]
bridge: netfilter: using strlcpy() instead of strncpy()
'name' is already zero-initialized where it is defined, so there is no
need for strncpy() to pad it again.
'name' is a string and should always be NUL-terminated, so use
strlcpy() instead of strncpy().
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
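The difference the commit above relies on is that a strlcpy()-style copy always terminates the destination. Since strlcpy() is not available in every C library, the sketch below uses a hypothetical stand-in, bounded_copy(), written to follow the same contract as I understand it: copy at most size - 1 bytes, always NUL-terminate, and return the length of the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-like semantics. */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';     /* unlike strncpy(), always terminated */
    }
    return len;            /* a value >= size signals truncation */
}
```

A caller can detect truncation by comparing the return value against the buffer size.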
Eric Dumazet [Wed, 22 May 2013 11:01:06 +0000 (11:01 +0000)]
netfilter: xt_socket: use IP early demux
With IP early demux added in linux-3.6, we perform TCP lookup in IP
layer before iptables hooks.
We can avoid doing a second lookup in xt_socket.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Dumazet [Wed, 22 May 2013 11:10:57 +0000 (11:10 +0000)]
netfilter: xt_CT: optimize XT_CT_NOTRACK
The percpu untracked ct are not currently used for XT_CT_NOTRACK.
xt_ct_tg_check()/xt_ct_target() provides a single ct.
That's not optimal, as the ct->ct_general.use cache line will bounce among
CPUs.
Do what was intended [1]: xt_ct_target() should select the percpu
object.
[1] Refs:
commit 5bfddbd46a95c97 ("netfilter: nf_conntrack: IPS_UNTRACKED bit")
commit b3c5163fe0193a7 ("netfilter: nf_conntrack: per_cpu untracking")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cong Wang [Wed, 22 May 2013 05:52:22 +0000 (05:52 +0000)]
ipv6: use ipv6_addr_scope() helper
ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK could be replaced
by ipv6_addr_scope(), which is slightly faster.
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 22 May 2013 05:41:06 +0000 (05:41 +0000)]
ipv6: use ipv6_addr_any() helper
ipv6_addr_any() is a faster way to determine whether an address
is the IPv6 any address; there is no need to compute the address type.
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
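Why a dedicated "any address" test can be cheaper than classifying the address is easy to see in a sketch: it only needs to check that all 128 bits are zero. The type addr6 and the function addr6_is_any() below are hypothetical userspace stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IPv6 address: 16 raw bytes. */
struct addr6 {
    uint8_t bytes[16];
};

/* The unspecified ("any") address is all zeroes, so OR-ing the four
 * 32-bit words together is enough; no type classification is needed. */
static bool addr6_is_any(const struct addr6 *a)
{
    uint32_t w[4];

    assert(a != NULL);
    memcpy(w, a->bytes, sizeof(w));
    return (w[0] | w[1] | w[2] | w[3]) == 0;
}
```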
Eric Dumazet [Tue, 21 May 2013 08:16:46 +0000 (08:16 +0000)]
sch_tbf: segment too big GSO packets
If a GSO packet has a length above tbf burst limit, the packet
is currently silently dropped.
The current workarounds are to put the device in non-GSO/TSO mode or to
configure high bursts, and both are suboptimal.
We can actually segment too big GSO packets, and send individual
segments as tbf parameters allow, allowing for better interoperability.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Tue, 21 May 2013 05:44:26 +0000 (05:44 +0000)]
fec: Use DIV_ROUND_UP macro
Use the standard DIV_ROUND_UP macro in order to provide better readability.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
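For readers unfamiliar with the macro named above, it rounds an integer division up instead of down. The sketch below defines it locally so that it compiles in userspace; buffers_needed() is a hypothetical example and is not taken from the fec driver.

```c
#include <assert.h>

/* Local definition for the sketch: ceiling division for positive d. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical use: how many fixed-size buffers hold len bytes. */
static unsigned int buffers_needed(unsigned int len, unsigned int buf_size)
{
    assert(buf_size > 0);
    return DIV_ROUND_UP(len, buf_size);
}
```

Compared with an open-coded (len + buf_size - 1) / buf_size, the macro name states the intent directly, which is the readability gain the commit refers to.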
Santosh Rastapur [Tue, 21 May 2013 04:21:29 +0000 (04:21 +0000)]
cxgb3: Check and handle the dma mapping errors
This patch adds checks at appropriate places for whether the *dma_map*() call has
succeeded or not.
Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Reviewed-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jay Fenlason [Tue, 21 May 2013 04:21:28 +0000 (04:21 +0000)]
cxgb3: Fix warning about using rcu_dereference when not in a rcu-locked section
The warning is about using rcu_dereference() outside an RCU-locked section. It only
happens during initialization, so fix the initialization to not use rcu_dereference().
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rutland [Tue, 21 May 2013 00:22:37 +0000 (00:22 +0000)]
net: smsc911x: don't artificially limit build
Currently the SMSC911X driver may only be built for a specific set of
architectures, being limited to do so by a Kconfig depends line. This
means that if a platform wishes to use the driver, its architecture must
be added to the list explicitly, introducing pointless churn.
This may have been due to the driver's use of the {read,write}s{b,w,l}
functions, which have since been replaced with the more standard
io{read,write}{8,16,32}_rep. We can instead depend on HAS_IOMEM, which
should prevent build issues while allowing the driver to be built for
currently unlisted architectures, including x86 and arm64.
This patch removes the explicit list of architectures from the driver's
depend line, and replaces it with a dependency on HAS_IOMEM.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Sun, 19 May 2013 15:46:49 +0000 (15:46 +0000)]
net: Loosen constraints for recalculating checksum in skb_segment()
This is a generic solution to resolve a specific problem that I have observed.
If the encapsulation of an skb changes then the ability to offload checksums
may also change. In particular it may be necessary to perform checksumming
in software.
An example of such a case is where a non-GRE packet is received but
is to be encapsulated and transmitted as GRE.
Another example relates to my proposed support for packets
that are non-MPLS when received but MPLS when transmitted.
The cost of this change is that the value of the csum variable may be
checked when it previously was not. In the case where the csum variable is
true this is pure overhead. In the case where the csum variable is false it
leads to software checksumming, which I believe also leads to correct
checksums in transmitted packets for the cases described above.
Further analysis:
This patch relies on the return value of can_checksum_protocol()
being correct and in turn the return value of skb_network_protocol(),
used to provide the protocol parameter of can_checksum_protocol(),
being correct. It also relies on the features passed to skb_segment()
and in turn to can_checksum_protocol() being correct.
I believe that this problem has not been observed for VLANs because it
appears that almost all drivers, the exception being xgbe, set
vlan_features such that the checksum offload support for VLAN packets
is greater than or equal to that of non-VLAN packets.
I wonder if the code in xgbe may be an oversight and the hardware does
support checksumming of VLAN packets. If so it may be worth updating the
vlan_features of the driver as this patch will force such checksums to be
performed in software rather than hardware.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 21 May 2013 21:52:56 +0000 (21:52 +0000)]
bridge: send query as soon as leave is received
Continue sending queries when leave is received if the user marks
it as a querier.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Adam Baker <linux@baker-net.org.uk>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 21 May 2013 21:52:55 +0000 (21:52 +0000)]
bridge: only expire the mdb entry when query is received
Currently we arm the expire timer when the mdb entry is added,
however, this causes a problem when no query is sent
out after that.
So we should only arm the timer when a corresponding query is
received, as suggested by Herbert.
And he also mentioned "if there is no querier then group
subscriptions shouldn't expire. There has to be at least one querier
in the network for this thing to work. Otherwise it just degenerates
into a non-snooping switch, which is OK."
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Adam Baker <linux@baker-net.org.uk>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 21 May 2013 21:52:54 +0000 (21:52 +0000)]
bridge: use the bridge IP addr as source addr for querier
Quote from Adam:
"If it is believed that the use of 0.0.0.0
as the IP address is what is causing strange behaviour on other devices
then is there a good reason that a bridge rather than a router shouldn't
be the active querier? If not then using the bridge IP address and
having the querier enabled by default may be a reasonable solution
(provided that our querier obeys the election rules and shuts up if it
sees a query from a lower IP address that isn't 0.0.0.0). Just because a
device is the elected querier for IGMP doesn't appear to mean it is
required to perform any other routing functions."
And introduce a new toggle for it, as suggested by Herbert.
Suggested-by: Adam Baker <linux@baker-net.org.uk>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Adam Baker <linux@baker-net.org.uk>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 13:42:56 +0000 (13:42 +0000)]
net/ethernet/nvidia/forcedeth: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
The name of the pci_driver struct had to be changed in order to prevent
a build failure.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
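The boilerplate removed by this series of module_pci_driver conversions can be pictured with a userspace mock. Nothing below is the kernel's PCI API: mock_driver, mock_register(), mock_unregister() and MOCK_MODULE_DRIVER() are hypothetical stand-ins. The mock assumes that the generated function names are derived from the driver variable's name, which would be one plausible reason a struct had to be renamed to avoid a build failure.

```c
#include <assert.h>

/* Hypothetical stand-ins for a driver object and its registration calls. */
struct mock_driver {
    const char *name;
    int registered;
};

static int mock_register(struct mock_driver *d)
{
    d->registered = 1;
    return 0;
}

static void mock_unregister(struct mock_driver *d)
{
    assert(d->registered);
    d->registered = 0;
}

/* Generates <drv>_init() and <drv>_exit(), replacing hand-written
 * init/exit functions that only register and unregister the driver. */
#define MOCK_MODULE_DRIVER(drv)                                   \
    static int drv##_init(void) { return mock_register(&drv); }   \
    static void drv##_exit(void) { mock_unregister(&drv); }

static struct mock_driver example_pci_driver = { "example", 0 };
MOCK_MODULE_DRIVER(example_pci_driver)
```

In this mock, a driver variable whose name already has matching _init or _exit functions elsewhere in the file would produce duplicate definitions.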
Peter Hüwe [Tue, 21 May 2013 13:42:55 +0000 (13:42 +0000)]
net/ethernet/chelsio/cxgb/cxgb2: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
The name of the pci_driver struct had to be changed in order to prevent
a build failure.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 13:42:54 +0000 (13:42 +0000)]
net/fddi/skfp/skfddi: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 13:42:53 +0000 (13:42 +0000)]
net/hippi/rrunner: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:10 +0000 (12:58 +0000)]
net/ethernet/amd/amd8111e: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:09 +0000 (12:58 +0000)]
net/ethernet/sun/sungem: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:08 +0000 (12:58 +0000)]
net/ethernet/qlogic/qlge/qlge_main: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:07 +0000 (12:58 +0000)]
net/ethernet/sgi/ioc3-eth: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:06 +0000 (12:58 +0000)]
net/ethernet/broadcom/tg3: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:58:05 +0000 (12:58 +0000)]
net/ethernet/broadcom/bnx2: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:14 +0000 (12:42 +0000)]
net/ethernet/alteon/acenic: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:13 +0000 (12:42 +0000)]
net/ethernet/icplus/ipg: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:12 +0000 (12:42 +0000)]
net/ethernet/toshiba/tc35815: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:11 +0000 (12:42 +0000)]
net/ethernet/dec/tulip/xircom_cb: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:10 +0000 (12:42 +0000)]
net/ethernet/sis/sis190: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:09 +0000 (12:42 +0000)]
net/ethernet/atheros/atlx/atl1: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:08 +0000 (12:42 +0000)]
net/ethernet/atheros/atl1e/atl1e_main: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:07 +0000 (12:42 +0000)]
net/ethernet/atheros/atl1c/atl1c_main: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Hüwe [Tue, 21 May 2013 12:42:06 +0000 (12:42 +0000)]
net/ethernet/silan/sc92031: Use module_pci_driver to register driver
Removing some boilerplate by using module_pci_driver instead of calling
register and unregister in the otherwise empty init/exit functions.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 22 May 2013 20:56:56 +0000 (13:56 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
This series contains updates to e1000e, igb and ixgbe.
Bruce Allan provides 2 minor cleanups for e1000e to resolve whitespace
issues and build warnings about unused parameters.
Carolyn provides a couple of fixes for igb, one being a fix for a
possible panic when the interface is down and receive traffic
arrives. The second fix resolves an issue on newer parts which have
multiple checksum fields and set_ethtool was only checking to update
the first checksum of the NVM image.
Akeem provides the majority of the changes in this patch set. Akeem
provides a fix for e1000e, for an issue reported by the community, to
resolve the issue of unlocking swflag_mutex for 82574 and 82583
devices regardless of whether the hardware semaphore was successfully
acquired.
The other patches from Akeem are against igb, where he adds support
for SFP module discovery, an LED blink mechanism for devices using
cathodes, LED support for i210/i211 parts and cleanup of an i2c
function which was not being used.
Matthew provides an update for igb to support a more accurate check
for a PTP RX hang.
Amir provides a patch for ixgbe to set the software prio_tc values at
initialization to the hardware setting to remove the need to reset the
device at the first time we call ixgbe_dcbnl_ieee_setets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Hanania [Thu, 18 Apr 2013 04:23:52 +0000 (04:23 +0000)]
IXGBE: Set the SW prio_tc values at initialization to the HW setting.
Set the SW prio_tc values at initialization to the HW setting.
Setting the SW prio_tc default values to be the HW setting by reading the
rtrup2tc register. For any TC change we need to reset the device.
This will remove the need to reset the device at the first
time we call ixgbe_dcbnl_ieee_setets.
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Wed, 15 May 2013 07:41:30 +0000 (07:41 +0000)]
igb: Removed unused i2c function
This patch removes unused i2c function definition.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Wed, 1 May 2013 05:44:45 +0000 (05:44 +0000)]
igb: Implementation of i210/i211 LED support
This patch fixes LED issues with i210 and i211 devices, due to changes in the
device registers.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Tue, 30 Apr 2013 00:21:32 +0000 (00:21 +0000)]
igb: Fix possible panic caused by Rx traffic arrival while interface is down
This patch reorders disabling napi and irqs during igb_down.
This is done to avoid possible panics found in other Intel drivers
when Rx traffic arrives while the interface is going down.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Thu, 25 Apr 2013 17:22:34 +0000 (17:22 +0000)]
igb: Fix set_ethtool function to call update nvm for entire image
This patch fixes a problem where we were only checking whether to update
the checksum on the first part of the NVM image. Newer parts have multiple
checksum fields, and the checksum function will accommodate that as long as
we call it in the first place for any changes made.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Fri, 29 Mar 2013 15:22:17 +0000 (15:22 +0000)]
igb: SerDes flow control setting
This patch allows users to get the appropriate flow control setting on SerDes
devices, based on the original implementation for copper devices.
Also, since 100baseFX does not support setting flow control, exclude
it from the setting mechanism.
Signed-off-by: Akeem G. Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Wed, 24 Apr 2013 16:54:50 +0000 (16:54 +0000)]
igb: Support for SFP modules discovery
This patch adds support for SFP modules media type discovery for
SGMII, which will enable driver to detect supported external PHYs,
including the 100baseFX SFP module.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Wed, 24 Apr 2013 07:42:06 +0000 (07:42 +0000)]
igb: Add update to last_rx_timestamp in Rx rings
In order to support a more accurate check for a PTP Rx hang where the
device can no longer timestamp received packets, we need to update, per
ring, when the last Rx timestamp was. Because of how the PTP Rx hang logic
works, the current logic is valid, but properly updating the ring variable
increases the accuracy of the check.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Fri, 29 Mar 2013 08:22:25 +0000 (08:22 +0000)]
igb: Changed LEDs blink mechanism to include designs using cathode
This patch addresses the changes needed to make LEDs work properly with
negative logic. This implementation uses LED Invert bit to reverse the
logic issue that occurred when LEDs are driven by cathode. Keep LEDs
blinking for SerDes devices. Also made changes to magic number and the
for loop to reduce number of shifts.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G. Abodunrin [Thu, 2 May 2013 02:57:44 +0000 (02:57 +0000)]
e1000e: Release mutex lock only if it has been initially acquired
This patch fixes the issue of unlocking swflag_mutex for 82574 and 82583
devices regardless of if the hw semaphore has been successfully acquired via
e1000_get_hw_semaphore_82574(). With this patch, unlocking mutex now depends
on if the hw semaphore was successfully acquired before. And 82574/82583
devices are reset regardless of whether e1000_get_hw_semaphore_82574()
returns success or failure.
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
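The locking rule in the e1000e fix above can be modelled with a flag in place of the mutex. This is a sketch only: lock_held, get_semaphore(), put_semaphore() and reset_device() are hypothetical names, and the assumption that a failed acquire has already dropped the lock is mine, not something the commit message states.

```c
#include <assert.h>

static int lock_held; /* models swflag_mutex */

/* Hypothetical acquire: takes the lock, then drops it again when the
 * hardware semaphore cannot be obtained (assumed behaviour). */
static int get_semaphore(int hw_ok)
{
    assert(!lock_held);
    lock_held = 1;
    if (!hw_ok) {
        lock_held = 0;
        return -1;
    }
    return 0;
}

static void put_semaphore(void)
{
    assert(lock_held);  /* unlocking a lock we do not hold is a bug */
    lock_held = 0;
}

/* The reset happens either way; the release only follows a
 * successful acquire. Returns 1 once the reset has been issued. */
static int reset_device(int hw_ok)
{
    int ret = get_semaphore(hw_ok);
    int did_reset = 1;

    if (!ret)
        put_semaphore();
    return did_reset;
}
```

Calling put_semaphore() unconditionally in reset_device() would trip the assertion on the failure path, which mirrors the bug being fixed.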
Bruce Allan [Wed, 1 May 2013 03:48:11 +0000 (03:48 +0000)]
e1000e: prevent warning from -Wunused-parameter
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 1 May 2013 01:19:46 +0000 (01:19 +0000)]
e1000e: cleanup whitespace
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Florian Fainelli [Sun, 19 May 2013 22:53:43 +0000 (22:53 +0000)]
phy: add phy_mac_interrupt() to use with PHY_IGNORE_INTERRUPT
There is currently no way for an Ethernet MAC driver servicing PHY link
interrupts to notify this to the PHY state machine without defining its
own state machine. Since most drivers are not so special, introduce a
helper: phy_mac_interrupt() which can be called from a link up/down
interrupt routine to update the PHY state machine. To avoid code
duplication some refactoring has been done to expose the workqueue and
its corresponding callback internally.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sun, 19 May 2013 22:53:42 +0000 (22:53 +0000)]
phy: fix the use of PHY_IGNORE_INTERRUPT
When a PHY device is registered with the special IRQ value
PHY_IGNORE_INTERRUPT (-2) it will not properly be handled by the PHY
library:
- it continues to poll its register, while we do not want this
because such PHY link events or register changes are serviced by an
Ethernet MAC
- it will still try to configure PHY interrupts at the PHY level, such
interrupts do not exist at the PHY but at the MAC level
- the state machine only handles PHY_POLL, but should also handle
PHY_IGNORE_INTERRUPT similarly
This patch updates the PHY state machine and initialization paths to
account for the specific PHY_IGNORE_INTERRUPT. Based on an earlier patch
by Thomas Petazzoni, and reworked to add the missing bits. Add a helper
phy_interrupt_is_valid() which specifically tests for a PHY interrupt
not to be PHY_POLL or PHY_IGNORE_INTERRUPT and use it throughout the
code.
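As a rough illustration of the helper described above, the check could look like the following userspace sketch. The struct is a hypothetical stand-in for the real PHY device structure, PHY_POLL is assumed to be -1, and phy_should_poll() is an illustrative name that is not taken from the patch.

```c
#include <stdbool.h>

/* PHY_IGNORE_INTERRUPT is -2 per the commit text; PHY_POLL = -1 is an
 * assumption made for this sketch. */
#define PHY_POLL              (-1)
#define PHY_IGNORE_INTERRUPT  (-2)

/* Hypothetical stand-in for the PHY device: only the irq field matters. */
struct phy_dev_sketch {
	int irq;
};

/* True only when irq names a real PHY-level interrupt line. */
static bool phy_interrupt_is_valid(const struct phy_dev_sketch *phydev)
{
	return phydev->irq != PHY_POLL &&
	       phydev->irq != PHY_IGNORE_INTERRUPT;
}

/* Illustrative helper (not from the patch): register polling applies only
 * to PHY_POLL. With PHY_IGNORE_INTERRUPT the MAC reports link events, so
 * the PHY is neither polled nor given an interrupt of its own. */
static bool phy_should_poll(const struct phy_dev_sketch *phydev)
{
	return phydev->irq == PHY_POLL;
}
```

With this split, a PHY registered with PHY_IGNORE_INTERRUPT is neither polled nor set up for PHY-level interrupts, which matches the three problems listed in the commit message.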
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasesh Mody [Mon, 20 May 2013 10:08:04 +0000 (10:08 +0000)]
bna: Driver and Firmware Updated
Driver and Firmware versions updated to 3.2.21.1.
Signed-off-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasesh Mody [Mon, 20 May 2013 10:08:03 +0000 (10:08 +0000)]
bna: Enhancement to Identify Default IOC Function
User should not be allowed to delete base function of eth port. Add a new field
to the bfa ioc attributes structure to indicate if the given ioc is default
function on the port or not.
Signed-off-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasesh Mody [Mon, 20 May 2013 10:08:02 +0000 (10:08 +0000)]
bna: Fix Ucast Failure Handling
The UCAST set for the base mac address fails when the user configures a
duplicate mac address that matches that of another vNIC on the same port.
The bna does not handle the ucast failure and keeps this address in cache.
On disable of the vNIC, bna tries to delete the failed base mac address and the
fw asserts.
On failure of ucast address, mark ucast address set to false.
Signed-off-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasesh Mody [Mon, 20 May 2013 10:08:01 +0000 (10:08 +0000)]
bna: Clear Driver Config Flags When HW Resets
Driver configuration flags are retained across open/stop operations, preventing
configurations from being set in the next open/stop. Setting MTU on a 1020 causes
network to fail until a reboot is performed on the host.
Clear the flags when configuration resets in hardware.
Signed-off-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tomasz Figa [Mon, 20 May 2013 09:16:58 +0000 (09:16 +0000)]
net: dm9000: Allow instantiation using device tree
This patch adds Device Tree support to dm9000 driver.
Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
Reviewed-by: Sylwester Nawrocki <sylvester.nawrocki@gmail.com>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 20 May 2013 08:05:51 +0000 (08:05 +0000)]
arm: bpf_jit: can call module_free() from any context
Follow-up on module_free()/vfree() that takes care of the rest, so this
workaround with work_struct is no longer needed.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 20 May 2013 08:05:50 +0000 (08:05 +0000)]
ppc: bpf_jit: can call module_free() from any context
Follow-up patch on module_free()/vfree() that takes care of the rest, so
this workaround with work_struct is no longer needed.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matt Evans <matt@ozlabs.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 20 May 2013 06:52:26 +0000 (06:52 +0000)]
tcp: md5: remove spinlock usage in fast path
TCP md5 code uses per cpu variables but protects access to them with
a shared spinlock, which is a contention point.
[ tcp_md5sig_pool_lock is locked twice per incoming packet ]
Makes things much simpler, by allocating crypto structures once, first
time a socket needs md5 keys, and not deallocating them as they are
really small.
Next step would be to allow crypto allocations being done in a NUMA
aware way.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 20 May 2013 04:53:38 +0000 (04:53 +0000)]
net: ipv6: remove 'next' member from inet6_dev
The next pointer within the inet6_dev structure seems not to be used
anywhere. So just remove it. Tested with allmodconfig on x86_64.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Mon, 20 May 2013 04:02:32 +0000 (04:02 +0000)]
rps: selective flow shedding during softnet overflow
A cpu executing the network receive path sheds packets when its input
queue grows to netdev_max_backlog. A single high rate flow (such as a
spoofed source DoS) can exceed a single cpu processing rate and will
degrade throughput of other flows hashed onto the same cpu.
This patch adds a more fine grained hashtable. If the netdev backlog
is above a threshold, IRQ cpus track the ratio of total traffic of
each flow (using 4096 buckets, configurable). The ratio is measured
by counting the number of packets per flow over the last 256 packets
from the source cpu. Any flow that occupies a large fraction of this
(set at 50%) will see packet drop while above the threshold.
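The per-flow accounting described above can be sketched in userspace C roughly as follows. This is a simplified model, not the patch itself: the struct and function names are hypothetical, the bucket count is fixed at the 4096 default, and the "backlog above a threshold" test is assumed here to mean half of the maximum backlog.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOW_HISTORY 256   /* window of recent packets, per the text */
#define FLOW_BUCKETS 4096  /* default bucket count, per the text (power of two) */

/* Hypothetical per-cpu state; must be zero-initialised before use. */
struct flow_limit_sketch {
	uint32_t head;                   /* next history slot to overwrite */
	uint16_t history[FLOW_HISTORY];  /* bucket index of each recent packet */
	uint16_t buckets[FLOW_BUCKETS];  /* packets per bucket inside the window */
};

/* Returns true when the packet's flow holds more than 50% of the window
 * while the backlog is above the (assumed) threshold. */
static bool flow_limit_should_drop(struct flow_limit_sketch *fl,
				   uint32_t flow_hash,
				   unsigned int qlen,
				   unsigned int max_backlog)
{
	uint16_t new_flow, old_flow;

	/* Assumption: accounting only starts above half of the backlog. */
	if (qlen < max_backlog / 2)
		return false;

	new_flow = (uint16_t)(flow_hash & (FLOW_BUCKETS - 1));

	/* Slide the window: forget the oldest packet, record the new one. */
	old_flow = fl->history[fl->head];
	fl->history[fl->head] = new_flow;
	fl->head = (fl->head + 1) & (FLOW_HISTORY - 1);

	if (fl->buckets[old_flow])
		fl->buckets[old_flow]--;

	return ++fl->buckets[new_flow] > FLOW_HISTORY / 2;
}
```

Because only the sliding window of the last 256 packets is counted, a flow stops being shed soon after its share of recent traffic falls back below half.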
Tested:
Setup is a multi-threaded UDP echo server with network rx IRQ on cpu0,
kernel receive (RPS) on cpu0 and application threads on cpus 2--7
each handling 20k req/s. Throughput halves when hit with a 400 kpps
antagonist storm. With this patch applied, antagonist overload is
dropped and the server processes its complete load.
The patch is effective when kernel receive processing is the
bottleneck. The above RPS scenario is an extreme, but the same is
reached with RFS and sufficient kernel processing (iptables, packet
socket tap, ..).
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Mon, 20 May 2013 03:06:17 +0000 (03:06 +0000)]
fec: Let device core handle pinctrl
Since commit
ab78029 (drivers/pinctrl: grab default handles from device core)
we can rely on device core for handling pinctrl, so remove
devm_pinctrl_get_select_default() from the driver.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Mon, 20 May 2013 01:05:12 +0000 (01:05 +0000)]
xen-netfront: avoid leaking resources when setup_netfront fails
We should correctly free related resources (grant ref, memory page, evtchn)
when setup_netfront fails.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Prisk [Sat, 18 May 2013 09:39:07 +0000 (09:39 +0000)]
net: velocity: Add platform device support to VIA velocity driver
Add support for the VIA Velocity network driver to be bound to an
OF-created platform device.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Prisk [Sat, 18 May 2013 09:39:06 +0000 (09:39 +0000)]
net: velocity: Convert to generic dma functions
Remove the pci_* dma functions and replace with the more generic
versions.
In preparation of adding platform support, a new struct device *dev
is added to struct velocity_info which can be used by both the pci
and platform code.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Prisk [Sat, 18 May 2013 09:39:05 +0000 (09:39 +0000)]
net: velocity: Rename vptr->dev to vptr->netdev
Improve the clarity of the code in preparation for converting the
dma functions to generic versions, which require a struct device *.
This makes it possible to store a 'struct device *dev' in the
velocity_info structure.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sun, 19 May 2013 10:17:13 +0000 (10:17 +0000)]
3c59x: remove useless VORTEX_PCI() invocations
It's suboptimal to invoke quite complex VORTEX_PCI() macro every time we want
to get a 'struct pci_dev *' when we already have it in a variable...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rolf Eike Beer [Sat, 18 May 2013 11:50:17 +0000 (11:50 +0000)]
ThunderLAN: remove is_eisa flag
These 2 places are the only matches for is_eisa in the whole tree.
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 18 May 2013 07:14:53 +0000 (07:14 +0000)]
net-bnx2x: dont reload on GRO change
bnx2x_set_features() forces a driver reload if GRO setting is changed.
A reload makes the ethernet port unresponsive for about 5 seconds.
This is not needed in the common case LRO is enabled, as LRO
(TPA_ENABLE_FLAG) has precedence over GRO (GRO_ENABLE_FLAG)
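The precedence rule above can be expressed as a small decision function. The sketch below is an illustration only: the flag names come from the commit text, but the bit values and the function name needs_reload() are placeholders chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names are from the commit text; the bit values are placeholders. */
#define TPA_ENABLE_FLAG (1u << 0)  /* LRO */
#define GRO_ENABLE_FLAG (1u << 1)  /* GRO */

/* Decide whether moving from old_flags to new_flags needs a driver reload. */
static bool needs_reload(uint32_t old_flags, uint32_t new_flags)
{
	uint32_t changes = old_flags ^ new_flags;

	/* LRO has precedence over GRO, so toggling GRO while LRO stays
	 * enabled changes nothing the hardware cares about. */
	if ((changes & GRO_ENABLE_FLAG) && (new_flags & TPA_ENABLE_FLAG))
		changes &= ~GRO_ENABLE_FLAG;

	return changes != 0;
}
```

Any change to the LRO setting itself, or a GRO change while LRO is off, still reports that a reload is needed in this sketch.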
Tested:
Verified that "ethtool -K eth0 gro {on|off}" doesn't blackout
the NIC anymore
Google-Bug-Id: 8440442
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Kravkov <dmitry@broadcom.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 May 2013 07:13:54 +0000 (00:13 -0700)]
Merge branch 'tg3_eee'
Nithin Nayak Sujir says:
====================
This series adds support for modifying EEE settings via ethtool. Since this can
impact Link Flap Avoidance, the driver pulls the current hardware settings if
LFA is enabled. This is similar to how we do the link settings to avoid a flap.
v2: Fixes pointed out by Ben Hutchings.
- Use MDIO_AN_EEE_LPABLE to set the lp_advertised field.
- Check that tx_lpi_timer is within valid range.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Sat, 18 May 2013 06:26:55 +0000 (06:26 +0000)]
tg3: Implement set/get_eee handlers
Reviewed-by: Ben Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Sat, 18 May 2013 06:26:54 +0000 (06:26 +0000)]
tg3: Simplify tg3_phy_eee_config_ok() by reusing tg3_eee_pull_config()
eee_config_ok() was checking only for mismatch in advertised settings.
This patch expands the scope of eee_config_ok() to check for mismatch in
the other eee settings. On mismatch we will require a call to
tg3_setup_eee() to push the configured settings to the hardware.
Reviewed-by: Ben Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Sat, 18 May 2013 06:26:53 +0000 (06:26 +0000)]
tg3: Add tg3_eee_pull_config() function
Add tg3_eee_pull_config() to pull the settings from the hardware and
populate the eee structure.
If Link Flap Avoidance is enabled, we pull the eee settings from the hw
so as not to cause a phy reset on eee config mismatch later. This
requires moving down tg3_setup_eee() below the tg3_pull_config() to not
trample existing settings.
Reviewed-by: Ben Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Sat, 18 May 2013 06:26:52 +0000 (06:26 +0000)]
tg3: Add ethtool_eee struct and tg3_setup_eee()
Add an eee structure and update it with eee settings. This will be used
for set/get_eee operations. Add common function tg3_setup_eee() that
will be used in the subsequent patches.
Reviewed-by: Ben Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 May 2013 16:57:37 +0000 (16:57 +0000)]
filter: do not output bpf image address for security reason
Do not leak starting address of BPF JIT code for non root users,
as it might help intruders to perform an attack.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 May 2013 16:37:03 +0000 (16:37 +0000)]
x86: bpf_jit_comp: secure bpf jit against spraying attacks
hpa brought to my attention some security related issues
with BPF JIT on x86.
This patch makes sure the bpf generated code is marked read only,
as other kernel text sections.
It also splits the unused space (we vmalloc() and only use a fraction of
the page) in two parts, so that the generated bpf code does not start at a
known offset in the page, but a pseudo random one.
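The idea of splitting the unused space can be sketched as a small offset computation. This is a simplified userspace model under stated assumptions: the header struct, the 4096-byte page size, and the function names are hypothetical, and the caller is assumed to supply the random value. The real patch may size and fill the hole differently.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical bookkeeping header at the start of the allocation. */
struct image_header_sketch {
	uint32_t pages;
};

/* Round len up to a whole number of pages. */
static size_t round_up_page(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}

/* Returns the offset, from the start of the allocation, at which the
 * generated code should begin. The slack left after header and code is
 * split in two by a random value: part before the code, rest after it. */
static size_t jit_image_offset(size_t proglen, uint32_t rnd, size_t *alloc_size)
{
	size_t used = sizeof(struct image_header_sketch) + proglen;
	size_t total = round_up_page(used);
	size_t slack = total - used;

	*alloc_size = total;
	/* rnd % (slack + 1) is in [0, slack], so the code always fits. */
	return sizeof(struct image_header_sketch) + rnd % (slack + 1);
}
```

An attacker who can load many filters then no longer knows where in each page the generated instructions begin.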
Refs:
http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
Reported-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Fri, 17 May 2013 13:45:05 +0000 (13:45 +0000)]
tcp: remove bad timeout logic in fast recovery
tcp_timeout_skb() was intended to trigger fast recovery on timeout,
unfortunately in reality it often causes spurious retransmission
storms during fast recovery. The particular sign is a fast retransmit
over the highest sacked sequence (SND.FACK).
Currently the RTO timer re-arming (as in RFC6298) offers a nice cushion
to avoid spurious timeout: when SND.UNA advances the sender re-arms
RTO and extends the timeout by icsk_rto. The sender does not offset
the time elapsed since the packet at SND.UNA was sent.
But if the next (DUP)ACK arrives later than ~RTTVAR and triggers
tcp_fastretrans_alert(), then tcp_timeout_skb() will mark any packet
sent before the icsk_rto interval lost, including one that's above the
highest sacked sequence. Most likely a large part of scoreboard will be
marked.
If most packets are not lost then the subsequent DUPACKs with new SACK
blocks will cause the sender to continue to retransmit packets beyond
SND.FACK spuriously. Even if only one packet is lost the sender may
falsely retransmit almost the entire window.
The situation becomes common in the world of bufferbloat: the RTT
continues to grow as the queue builds up but RTTVAR remains small and
close to the minimum 200ms. If a data packet is lost and the DUPACK
triggered by the next data packet is slightly delayed, then a spurious
retransmission storm forms.
As the original comment on tcp_timeout_skb() suggests: the usefulness
of this feature is questionable. It also wastes cycles walking the
sack scoreboard and is actually harmful because of false recovery.
It's time to remove this.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Fri, 17 May 2013 09:10:34 +0000 (09:10 +0000)]
Documentation/sysctl/net.txt: fix (attribute removal).
This patch removes the mention of the sysfs net_device weight attribute
(class/net/<device>/weight)
in Documentation/sysctl/net.txt, since the net sysfs weight attribute
was removed by the following patch:
[NET]: Make NAPI polling independent of struct net_device objects
bea3348eef27e6044b6161fd04c3152215f96411
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 16 May 2013 22:32:00 +0000 (22:32 +0000)]
ipv6: add support of peer address
This patch adds the support of peer address for IPv6. For example, it is
possible to specify the remote end of a 6inY tunnel.
This was already possible in IPv4:
ip addr add ip1 peer ip2 dev dev1
The peer address is specified with IFA_ADDRESS and the local address with
IFA_LOCAL (like explained in include/uapi/linux/if_addr.h).
Note that the API is not changed, because before this patch, it was not
possible to specify two different addresses in IFA_LOCAL and IFA_ADDRESS.
There is a small change for the dump: if the peer is different from ::,
IFA_ADDRESS will contain the peer address instead of the local address.
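One way to read the attribute semantics above is the following userspace sketch. It is an interpretation, not the patch's code: struct addr6_sketch and pick_local() are hypothetical names, and plain pointers stand in for the netlink attributes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 128-bit address type standing in for an IPv6 address. */
struct addr6_sketch {
	unsigned char b[16];
};

/* Given the optional IFA_ADDRESS and IFA_LOCAL payloads, return the local
 * address and set *peer to the peer address, or NULL if there is none.
 * A peer exists only when both are present and they differ. */
static const struct addr6_sketch *
pick_local(const struct addr6_sketch *ifa_address,
	   const struct addr6_sketch *ifa_local,
	   const struct addr6_sketch **peer)
{
	const struct addr6_sketch *local = ifa_address;

	*peer = NULL;
	if (ifa_local) {
		if (ifa_address &&
		    memcmp(ifa_local, ifa_address, sizeof(*ifa_local)) != 0)
			*peer = ifa_address;
		local = ifa_local;
	}
	return local;
}
```

In this reading, a request carrying only one of the two attributes, or the same address in both, behaves as it did before, which is consistent with the note that the API is unchanged.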
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 May 2013 12:12:34 +0000 (12:12 +0000)]
sparc: bpf_jit_comp: can call module_free() from any context
module_free()/vfree() takes care of details, we no longer need a wrapper
and a work_struct.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 16 May 2013 23:36:32 +0000 (23:36 +0000)]
dev: remove duplicate 'skb->dev = dev' in dev_forward_skb()
This was added by commit
59b9997baba5 (Revert "net: maintain namespace
isolation between vlan and real device").
In fact, before the initial commit - the one that is reverted -, this
statement was not present.
'skb->dev = dev' is already done in eth_type_trans(), which is called just
after.
Spotted-by: Alain Ritoux <alain.ritoux@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Thu, 16 May 2013 23:26:11 +0000 (23:26 +0000)]
xen-netback: enable user to unload netback module
This patch enables user to unload netback module, which is useful when user
wants to upgrade to a newer netback module without rebooting the host.
Netfront cannot handle netback removal event. As we cannot fix all possible
frontends we add module get / put along with vif get / put to avoid
mis-unloading of netback. To unload netback module, user needs to shut down all
VMs or migrate them to another host or unplug all vifs beforehand.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Thu, 16 May 2013 23:24:28 +0000 (23:24 +0000)]
xen-netback: remove dead code
The array mmap_pages is never touched in the initialization function. This is
a remnant of a mapping mechanism, which does not exist upstream. In current
upstream code this array only tracks usage of pages inside netback. Those
pages are allocated when constructing an SKB and passed directly to network
subsystem.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 May 2013 19:45:30 +0000 (19:45 +0000)]
x86: bpf_jit_comp: can call module_free() from any context
It looks like we can call module_free()/vfree() from softirq context,
so no longer need a wrapper and a work_struct.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Thu, 16 May 2013 17:48:08 +0000 (17:48 +0000)]
net/usb: r8152: Use module_usb_driver()
module_usb_driver() eliminates boilerplate and simplifies the code.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sachin Kamat [Thu, 16 May 2013 17:48:07 +0000 (17:48 +0000)]
net/usb: r8152: Remove redundant version.h header inclusion
version.h header inclusion is not necessary as detected by
checkversion.pl.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Thu, 16 May 2013 11:35:20 +0000 (11:35 +0000)]
vxlan: listen on multiple ports
The commit
823aa873bc782f1c51b1ce8ec6da7cfcaf93836e
Author: stephen hemminger <stephen@networkplumber.org>
Date: Sat Apr 27 11:31:57 2013 +0000
vxlan: allow choosing destination port per vxlan
introduced per-vxlan UDP port configuration but only did half of the
necessary work. It added per vxlan destination for sending, but
overlooked the handling of multiple ports for incoming traffic.
This patch changes the listening port management to handle multiple
incoming UDP ports. The earlier per-namespace structure is now a hash
list per namespace.
It is also now possible to define the same virtual network id
but with different UDP port values which can be useful for migration.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emilio López [Fri, 17 May 2013 10:42:56 +0000 (10:42 +0000)]
net: ethernet: korina: initialize variables directly
Clean up the code a bit to initialize the variables directly when
defining them.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emilio López [Fri, 17 May 2013 10:42:55 +0000 (10:42 +0000)]
net: ethernet: davicom: dm9000: initialize variables directly
Clean up the code a bit to initialize the variables directly when
defining them.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emilio López [Fri, 17 May 2013 10:42:54 +0000 (10:42 +0000)]
net: ethernet: apple: initialize variables directly
Clean up the code a bit to initialize the variables directly when
defining them.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emilio López [Fri, 17 May 2013 10:42:53 +0000 (10:42 +0000)]
net: ethernet: sun: initialize variables directly
Clean up the code a bit to initialize the variables directly when
defining them.
Signed-off-by: Emilio López <emilio@elopez.com.ar>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 17 May 2013 00:07:46 +0000 (17:07 -0700)]
Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
Marc Kleine-Budde says:
====================
this is a pull-request for net-next/master. It consists of 4 patches by
Jingoo Han, which remove the unnecessary platform_set_drvdata() and a
patch by Laurent Navet converting the grcan driver to use
devm_ioremap_resource().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
govindarajulu.v [Thu, 16 May 2013 06:24:41 +0000 (06:24 +0000)]
net: 3com: 3c509: remove unnecessary code
This patch removes unnecessary #if 0 code from 3c509.c
Signed-off-by: govindarajulu.v <govindarajulu90@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfram Sang [Thu, 16 May 2013 01:15:41 +0000 (01:15 +0000)]
drivers/net/ethernet/renesas: don't check resource with devm_ioremap_resource
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: David S. Miller <davem@davemloft.net>