Paul Durrant [Wed, 11 Dec 2013 10:57:16 +0000 (10:57 +0000)]
xen-netback: napi: don't prematurely request a tx event
This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in
xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the
former call has the side effect of advancing the ring event pointer and
therefore inviting another interrupt from the frontend before the napi
poll has actually finished, thereby defeating the point of napi.
The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in
xenvif_poll, the napi poll function, if the work done is less than the
budget i.e. when actually transitioning back to interrupt mode.
Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Wed, 11 Dec 2013 10:57:15 +0000 (10:57 +0000)]
xen-netback: napi: fix abuse of budget
netback seems to be somewhat confused about the napi budget parameter. The
parameter is supposed to limit the number of skbs processed in each poll,
but netback has this confused with grant operations.
This patch fixes that, properly limiting the work done in each poll. Note
that this limit makes sure we do not process any more data from the shared
ring than we intend to pass back from the poll. This is important to
prevent tx_queue potentially growing without bound.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 12 Dec 2013 02:57:22 +0000 (10:57 +0800)]
sch_tbf: use do_div() for 64-bit divide
It's doing a 64-bit divide which is not supported
on 32-bit architectures in psched_ns_t2l(). The
correct way to do this is to use do_div().
It's introduced by commit
cc106e441a63
("net: sched: tbf: fix the calculation of max_size")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Dec 2013 22:46:51 +0000 (14:46 -0800)]
udp: ipv4: must add synchronization in udp_sk_rx_dst_set()
Unlike TCP, UDP input path does not hold the socket lock.
Before messing with sk->sk_rx_dst, we must use a spinlock, otherwise
multiple cpus could leak a refcount.
This patch also takes care of renewing a stale dst entry.
(When the sk->sk_rx_dst would not be used by IP early demux)
Fixes:
421b3885bf6d ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe De Muyter [Wed, 11 Dec 2013 22:50:52 +0000 (23:50 +0100)]
net:fec: remove duplicate lines in comment about errata ERR006358
commit
031916568a1aa2ef1809f86d26f0bcfa215ff5c0 worked around
errata ERR006358, but comment contains duplicated lines, impairing
the readability. Remove them.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 22:20:31 +0000 (17:20 -0500)]
Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature"
This reverts commit
99023e90fe5c147ea0665bda86764ea44f08a622.
Accidently checked this into 'net' instead of 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Whitehead [Wed, 11 Dec 2013 22:00:59 +0000 (17:00 -0500)]
8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature
Removed the shared ei_debug variable. Replaced it by adding u32 msg_enable to
the private struct ei_device. Now each 8390 ethernet instance has a per-device
logging variable.
Changed older style printk() calls to more canonical forms.
Tested on: ne, ne2k-pci, smc-ultra, and wd hardware.
V4.0
- Substituted pr_info() and pr_debug() for printk() KERN_INFO and KERN_DEBUG
V3.0
- Checked for cases where pr_cont() was most appropriate choice.
- Changed module parameter from 'debug' to 'msg_enable' because debug was
no longer the best description.
V2.0
- Changed netif_msg_(drv|probe|ifdown|rx_err|tx_err|tx_queued|intr|rx_status|hw)
to netif_(dbg|info|warn|err) where possible.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Wed, 11 Dec 2013 16:37:40 +0000 (16:37 +0000)]
xen-netback: make sure skb linear area covers checksum field
skb_partial_csum_set requires that the linear area of the skb covers the
checksum field. The checksum setup code in netback was only doing that
pullup in the case when the pseudo header checksum was being recalculated
though. This patch makes that pullup unconditional. (I pullup the whole
transport header just for simplicity; the requirement is only for the check
field but in the case of UDP this is the last field in the header and in the
case of TCP it's the last but one).
The lack of pullup manifested as failures running Microsoft HCK network
tests on a pair of Windows 8 VMs and it has been verified that this patch
fixes the problem.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Wed, 11 Dec 2013 21:04:27 +0000 (13:04 -0800)]
net: smc91x: Fix device tree based configuration so it's usable
Commit
89ce376c6bdc (drivers/net: Use of_match_ptr() macro in smc91x.c)
added minimal device tree support to smc91x, but it's not working on
many platforms because of the lack of some key configuration bits.
Fix the issue by parsing the necessary configuration like the
smc911x driver is doing. As most smc91x users seem to use 16-bit
access, let's default to that if no reg-io-width is specified.
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: netdev@vger.kernel.org
Cc: devicetree@vger.kernel.org
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 11 Dec 2013 16:10:05 +0000 (08:10 -0800)]
udp: ipv4: fix potential use after free in udp_v4_early_demux()
pskb_may_pull() can reallocate skb->head, we need to move the
initialization of iph and uh pointers after its call.
Fixes:
421b3885bf6d ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 11 Dec 2013 05:08:34 +0000 (13:08 +0800)]
macvtap: signal truncated packets
macvtap_put_user() never return a value grater than iov length, this in fact
bypasses the truncated checking in macvtap_recvmsg(). Fix this by always
returning the size of packet plus the possible vlan header to let the trunca
checking work.
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 11 Dec 2013 05:08:33 +0000 (13:08 +0800)]
tun: unbreak truncated packet signalling
Commit
6680ec68eff47d36f67b4351bc9836fd6cba9532
(tuntap: hardware vlan tx support) breaks the truncated packet signal by nev
return a length greater than iov length in tun_put_user(). This patch fixes
by always return the length of packet plus possible vlan header. Caller can
detect the truncated packet by comparing the return value and the size of io
length.
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 06:59:28 +0000 (14:59 +0800)]
net: sched: htb: fix the calculation of quantum
Now, 32bit rates may be not the true rate.
So use rate_bytes_ps which is from
max(rate32, rate64) to calcualte quantum.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Tue, 10 Dec 2013 06:59:27 +0000 (14:59 +0800)]
net: sched: tbf: fix the calculation of max_size
Current max_size is caluated from rate table. Now, the rate table
has been replaced and it's wrong to caculate max_size based on this
rate table. It can lead wrong calculation of max_size.
The burst in kernel may be lower than user asked, because burst may gets
some loss when transform it to buffer(E.g. "burst 40kb rate 30mbit/s")
and it seems we cannot avoid this loss. Burst's value(max_size) based on
rate table may be equal user asked. If a packet's length is max_size, this
packet will be stalled in tbf_dequeue() because its length is above the
burst in kernel so that it cannot get enough tokens. The max_size guards
against enqueuing packet sizes above q->buffer "time" in tbf_enqueue().
To make consistent with the calculation of tokens, this patch add a helper
psched_ns_t2l() to calculate burst(max_size) directly to fix this problem.
After this fix, we can support to using 64bit rates to calculate burst as well.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 9 Dec 2013 23:20:41 +0000 (02:20 +0300)]
micrel: add support for KSZ8041RNLI
Renesas R-Car development boards use KSZ8041RNLI PHY which for some reason has
ID of 0x00221537 that is not documented for KSZ8041-family PHYs and does not
match the documented ID of 0x0022151x (where 'x' is the revision). We have
to add the new #define PHY_ID_* and new ksphy_driver[] entry, almost the same
as KSZ8041 one, differing only in the 'phy_id' and 'name' fields.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 17:57:57 +0000 (12:57 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless
John W. Linville says:
====================
Just one patch this time -- a fix from Felix Fietkau to fix the
duration calculation for non-aggregated packets in ath9k. This is
a small change and it is obviously specific to ath9k.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Wed, 11 Dec 2013 15:41:56 +0000 (10:41 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Eric Dumazet [Wed, 11 Dec 2013 02:07:23 +0000 (18:07 -0800)]
udp: ipv4: fix an use after free in __udp4_lib_rcv()
Dave Jones reported a use after free in UDP stack :
[ 5059.434216] =========================
[ 5059.434314] [ BUG: held lock freed! ]
[ 5059.434420] 3.13.0-rc3+ #9 Not tainted
[ 5059.434520] -------------------------
[ 5059.434620] named/863 is freeing memory
ffff88005e960000-
ffff88005e96061f, with a lock still held there!
[ 5059.434815] (slock-AF_INET){+.-...}, at: [<
ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435012] 3 locks held by named/863:
[ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [<
ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
[ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [<
ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
[ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [<
ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435734]
stack backtrace:
[ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
[ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
[ 5059.436223]
0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
[ 5059.436390]
ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
[ 5059.436554]
ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
[ 5059.436718] Call Trace:
[ 5059.436769] <IRQ> [<
ffffffff8153a372>] dump_stack+0x4d/0x66
[ 5059.436904] [<
ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
[ 5059.437037] [<
ffffffff8141c490>] ? __sk_free+0x110/0x230
[ 5059.437147] [<
ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
[ 5059.437260] [<
ffffffff8141c490>] __sk_free+0x110/0x230
[ 5059.437364] [<
ffffffff8141c5c9>] sk_free+0x19/0x20
[ 5059.437463] [<
ffffffff8141cb25>] sock_edemux+0x25/0x40
[ 5059.437567] [<
ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
[ 5059.437685] [<
ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.437805] [<
ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
[ 5059.437925] [<
ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
[ 5059.438038] [<
ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
[ 5059.438155] [<
ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
[ 5059.438269] [<
ffffffff8149d7f5>] udp_rcv+0x15/0x20
[ 5059.438367] [<
ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
[ 5059.438492] [<
ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
[ 5059.438621] [<
ffffffff81468653>] ip_local_deliver+0x43/0x80
[ 5059.438733] [<
ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
[ 5059.438843] [<
ffffffff81468926>] ip_rcv+0x296/0x3f0
[ 5059.438945] [<
ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
[ 5059.439074] [<
ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
[ 5059.442231] [<
ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
[ 5059.442231] [<
ffffffff81430d83>] __netif_receive_skb+0x13/0x60
[ 5059.442231] [<
ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
[ 5059.442231] [<
ffffffff814334e0>] napi_gro_receive+0x70/0xa0
[ 5059.442231] [<
ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
[ 5059.442231] [<
ffffffff81432bc9>] net_rx_action+0x129/0x1e0
[ 5059.442231] [<
ffffffff810478cd>] __do_softirq+0xed/0x240
[ 5059.442231] [<
ffffffff81047e25>] irq_exit+0x125/0x140
[ 5059.442231] [<
ffffffff81004241>] do_IRQ+0x51/0xc0
[ 5059.442231] [<
ffffffff81542bef>] common_interrupt+0x6f/0x6f
We need to keep a reference on the socket, by using skb_steal_sock()
at the right place.
Note that another patch is needed to fix a race in
udp_sk_rx_dst_set(), as we hold no lock protecting the dst.
Fixes:
421b3885bf6d ("udp: ipv4: Add udp early demux")
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 03:54:39 +0000 (22:54 -0500)]
Merge branch 'sctp'
Wang Weidong says:
====================
sctp: check the rto_min and rto_max
v6 -> v7:
-patch2: fix the whitespace issues which pointed out by Daniel
v5 -> v6:
split the v5' first patch to patch1 and patch2, and remove the
macro in constants.h
-patch1: do rto_min/max socket option handling in its own patch, and
fix the check of rto_min/max.
-patch2: do rto_min/max sysctl handling in its own patch.
-patch3: add Suggested-by Daniel.
v4 -> v5:
- patch1: add marco in constants.h and fix up spacing as
suggested by Daniel
- patch2: add a patch for fix up do_hmac_alg for according
to do_rto_min[max]
v3 -> v4:
-patch1: fix use init_net directly which suggested by Vlad.
v2 -> v3:
-patch1: add proc_handler for check rto_min and rto_max which suggested
by Vlad
v1 -> v2:
-patch1: fix the From Name which pointed out by David, and
add the ACK by Neil
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Wed, 11 Dec 2013 01:50:40 +0000 (09:50 +0800)]
sctp: fix up a spacing
fix up spacing of proc_sctp_do_hmac_alg for according to the
proc_sctp_do_rto_min[max] in sysctl.c
Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Wed, 11 Dec 2013 01:50:39 +0000 (09:50 +0800)]
sctp: add check rto_min and rto_max in sysctl
rto_min should be smaller than rto_max while rto_max should be larger
than rto_min. Add two proc_handler for the checking.
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Wed, 11 Dec 2013 01:50:38 +0000 (09:50 +0800)]
sctp: check the rto_min and rto_max in setsockopt
When we set 0 to rto_min or rto_max, just not change the value. Also
we should check the rto_min > rto_max.
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Tue, 10 Dec 2013 14:15:46 +0000 (15:15 +0100)]
ipv6: do not erase dst address with flow label destination
This patch is following
b579035ff766c9412e2b92abf5cab794bff102b6
"ipv6: remove old conditions on flow label sharing"
Since there is no reason to restrict a label to a
destination, we should not erase the destination value of a
socket with the value contained in the flow label storage.
This patch allows to really have the same flow label to more
than one destination.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Tue, 10 Dec 2013 11:48:15 +0000 (06:48 -0500)]
sctp: properly latch and use autoclose value from sock to association
Currently, sctp associations latch a sockets autoclose value to an association
at association init time, subject to capping constraints from the max_autoclose
sysctl value. This leads to an odd situation where an application may set a
socket level autoclose timeout, but sliently sctp will limit the autoclose
timeout to something less than that.
Fix this by modifying the autoclose setsockopt function to check the limit, cap
it and warn the user via syslog that the timeout is capped. This will allow
getsockopt to return valid autoclose timeout values that reflect what subsequent
associations actually use.
While were at it, also elimintate the assoc->autoclose variable, it duplicates
whats in the timeout array, which leads to multiple sources for the same
information, that may differ (as the former isn't subject to any capping). This
gives us the timeout information in a canonical place and saves some space in
the association structure as well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
CC: Wang Weidong <wangweidong1@huawei.com>
CC: David Miller <davem@davemloft.net>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 03:36:00 +0000 (22:36 -0500)]
Merge branch 'tipc'
Jon Maloy says:
====================
tipc: corrections related to tasklet job mechanism
These commits correct two bugs related to tipc' service for launching
functions for asynchronous execution in a separate tasklet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Tue, 10 Dec 2013 06:54:47 +0000 (22:54 -0800)]
tipc: protect handler_enabled variable with qitem_lock spin lock
'handler_enabled' is a global flag indicating whether the TIPC
signal handling service is enabled or not. The lack of lock
protection for this flag incurs a risk for contention, so that
a tipc_k_signal() call might queue a signal handler to a destroyed
signal queue, with unpredictable results. To correct this, we let
the already existing 'qitem_lock' protect the flag, as it already
does with the queue itself. This way, we ensure that the flag
always is consistent across all cores.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Tue, 10 Dec 2013 06:54:46 +0000 (22:54 -0800)]
tipc: correct the order of stopping services at rmmod
The 'signal handler' service in TIPC is a mechanism that makes it
possible to postpone execution of functions, by launcing them into
a job queue for execution in a separate tasklet, independent of
the launching execution thread.
When we do rmmod on the tipc module, this service is stopped after
the network service. At the same time, the stopping of the network
service may itself launch jobs for execution, with the risk that these
functions may be scheduled for execution after the data structures
meant to be accessed by the job have already been deleted. We have
seen this happen, most often resulting in an oops.
This commit ensures that the signal handler is the very first to be
stopped when TIPC is shut down, so there are no surprises during
the cleanup of the other services.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nat Gurumoorthy [Mon, 9 Dec 2013 18:43:21 +0000 (10:43 -0800)]
tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
uninitialized. From power on reset this register may have garbage in it. The
Register Base Address register defines the device local address of a
register. The data pointed to by this location is read or written using
the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
garbage any read or write of Register Data Register (PCI 128) will cause the
PCI bus to lock up. The TCO watchdog will fire and bring down the system.
Signed-off-by: Nat Gurumoorthy <natg@google.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 03:10:21 +0000 (22:10 -0500)]
net: Revert macvtap/tun truncation signalling changes.
Jason Wang and Michael S. Tsirkin are still discussing how
to properly fix this.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Mon, 9 Dec 2013 10:25:17 +0000 (18:25 +0800)]
macvtap: signal truncated packets
macvtap_put_user() never return a value grater than iov length, this in fact
bypasses the truncated checking in macvtap_recvmsg(). Fix this by always
returning the size of packet plus the possible vlan header to let the truncated
checking work.
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Mon, 9 Dec 2013 10:25:16 +0000 (18:25 +0800)]
tun: unbreak truncated packet signalling
Commit
6680ec68eff47d36f67b4351bc9836fd6cba9532
(tuntap: hardware vlan tx support) breaks the truncated packet signal by never
return a length greater than iov length in tun_put_user(). This patch fixes this
by always return the length of packet plus possible vlan header. Caller can
detect the truncated packet by comparing the return value and the size of iov
length.
Reported-by: Vlad Yasevich <vyasevich@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fan Du [Mon, 9 Dec 2013 02:33:53 +0000 (10:33 +0800)]
vxlan: release rt when found circular route
Otherwise causing dst memory leakage.
Have Checked all other type tunnel device transmit implementation,
no such things happens anymore.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Levin [Sat, 7 Dec 2013 22:26:27 +0000 (17:26 -0500)]
net: unix: allow set_peek_off to fail
unix_dgram_recvmsg() will hold the readlock of the socket until recv
is complete.
In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until
unix_dgram_recvmsg() will complete (which can take a while) without allowing
us to break out of it, triggering a hung task spew.
Instead, allow set_peek_off to fail, this way userspace will not hang.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2013 02:19:42 +0000 (21:19 -0500)]
Merge branch 'sfc-3.13' of git://git./linux/kernel/git/bwh/sfc
Ben Hutchings says:
====================
Several fixes for the PTP hardware support added in 3.7:
1. Fix filtering of PTP packets on the TX path to be robust against bad
header lengths.
2. Limit logging on the RX path in case of a PTP packet flood, partly
from Laurence Evans.
3. Disable PTP hardware when the interface is down so that we don't
receive RX timestamp events, from Alexandre Rames.
4. Maintain clock frequency adjustment when a time offset is applied.
Also fixes for the SFC9100 family support added in 3.12:
5. Take the RX prefix length into account when applying NET_IP_ALIGN,
from Andrew Rybchenko.
6. Work around a bug that breaks communication between the driver and
firmware, from Robert Stonehouse.
Please also queue these up for the appropriate stable branches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Ripard [Tue, 10 Dec 2013 18:40:43 +0000 (19:40 +0100)]
net: allwinner: emac: Add missing free_irq
The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on
the managed device mechanism to actually free it. This causes an issue whenever
someone wants to restart the interface, the interrupt still being held, and not
yet released.
Fall back to using the regular request_irq at .ndo_open time, and introduce a
free_irq during .ndo_stop.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Tomanek [Tue, 10 Dec 2013 22:21:25 +0000 (23:21 +0100)]
inet: fix NULL pointer Oops in fib(6)_rule_suppress
This changes ensures that the routing entry investigated by the suppress
function actually does point to a device struct before following that pointer,
fixing a possible kernel oops situation when verifying the interface group
associated with a routing table entry.
According to Daniel Golle, this Oops can be triggered by a user process trying
to establish an outgoing IPv6 connection while having no real IPv6 connectivity
set up (only autoassigned link-local addresses).
Fixes:
6ef94cfafba15 ("fib_rules: add route suppression based on ifgroup")
Reported-by: Daniel Golle <daniel.golle@gmail.com>
Tested-by: Daniel Golle <daniel.golle@gmail.com>
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changli Gao [Sun, 8 Dec 2013 14:36:56 +0000 (09:36 -0500)]
net: drop_monitor: fix the value of maxattr
maxattr in genl_family should be used to save the max attribute
type, but not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.
Signed-off-by: David S. Miller <davem@davemloft.net>
Srikanth Thokala [Sat, 7 Dec 2013 08:10:49 +0000 (13:40 +0530)]
net: emaclite: add barriers to support Xilinx Zynq platform
This patch adds barriers at appropriate places to ensure the driver
works on Xilinx Zynq ARM-based SoC platform.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Srikanth Thokala [Sat, 7 Dec 2013 08:10:48 +0000 (13:40 +0530)]
net: emaclite: Remove unnecessary code that enables/disables interrupts on PONG buffers
There are no specific interrupts for the PONG buffer on both
transmit and receive side, same interrupt is valid for both
buffers. So, this patch removes this code.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Reviewed-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Sat, 7 Dec 2013 02:33:45 +0000 (03:33 +0100)]
ipv6: don't count addrconf generated routes against gc limit
Brett Ciphery reported that new ipv6 addresses failed to get installed
because the addrconf generated dsts where counted against the dst gc
limit. We don't need to count those routes like we currently don't count
administratively added routes.
Because the max_addresses check enforces a limit on unbounded address
generation first in case someone plays with router advertisments, we
are still safe here.
Reported-by: Brett Ciphery <brett.ciphery@windriver.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 Dec 2013 01:43:21 +0000 (20:43 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
The following patchset contains three Netfilter fixes for your net tree,
they are:
* fix incorrect comparison in the new netnet hash ipset type, from
Dave Jones.
* fix splat in hashlimit due to missing removal of the content of its
proc entry in netnamespaces, from Sergey Popovich.
* fix missing rule flushing operation by table in nf_tables. Table
flushing was already discussed back in October but this got lost and
no patch has hit the tree to address this issue so far, from me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 6 Dec 2013 10:36:15 +0000 (11:36 +0100)]
packet: fix send path when running with proto == 0
Commit
e40526cb20b5 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.
We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.
So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.
While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from
e40526cb20b5.
[1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
Fixes:
e40526cb20b5 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Thu, 5 Dec 2013 14:20:53 +0000 (15:20 +0100)]
ath9k: fix duration calculation for non-aggregated packets
When not aggregating packets, fi->framelen should be passed in as length
to calculate the duration. Before the tx path rework, ath_tx_fill_desc
was called for either one aggregate, or one single frame, with the
length of the packet or the aggregate as a parameter.
After the rework, ath_tx_sched_aggr can pass a burst of single frames to
ath_tx_fill_desc and sets len=0.
Fix broken duration calculation by overriding the length in ath_tx_fill_desc
before passing it to ath_buf_set_rate.
Cc: stable@vger.kernel.org
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Pablo Neira Ayuso [Sun, 24 Nov 2013 19:39:10 +0000 (20:39 +0100)]
netfilter: nf_tables: fix missing rules flushing per table
This patch allows you to atomically remove all rules stored in
a table via the NFT_MSG_DELRULE command. You only need to indicate
the specific table and no chain to flush all rules stored in that
table.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sergey Popovich [Fri, 6 Dec 2013 08:57:19 +0000 (10:57 +0200)]
netfilter: xt_hashlimit: fix proc entry leak in netns destroy path
In (
32263dd1b netfilter: xt_hashlimit: fix namespace destroy path)
the hashlimit_net_exit() function is always called right before
hashlimit_mt_destroy() to release netns data. If you use xt_hashlimit
with IPv4 and IPv6 together, this produces the following splat via
netconsole in the netns destroy path:
Pid: 9499, comm: kworker/u:0 Tainted: G WC O 3.2.0-5-netctl-amd64-core2
Call Trace:
[<
ffffffff8104708d>] ? warn_slowpath_common+0x78/0x8c
[<
ffffffff81047139>] ? warn_slowpath_fmt+0x45/0x4a
[<
ffffffff81144a99>] ? remove_proc_entry+0xd8/0x22e
[<
ffffffff810ebbaa>] ? kfree+0x5b/0x6c
[<
ffffffffa043c501>] ? hashlimit_net_exit+0x45/0x8d [xt_hashlimit]
[<
ffffffff8128ab30>] ? ops_exit_list+0x1c/0x44
[<
ffffffff8128b28e>] ? cleanup_net+0xf1/0x180
[<
ffffffff810369fc>] ? should_resched+0x5/0x23
[<
ffffffff8105b8f9>] ? process_one_work+0x161/0x269
[<
ffffffff8105aea5>] ? cwq_activate_delayed_work+0x3c/0x48
[<
ffffffff8105c8c2>] ? worker_thread+0xc2/0x145
[<
ffffffff8105c800>] ? manage_workers.isra.25+0x15b/0x15b
[<
ffffffff8105fa01>] ? kthread+0x76/0x7e
[<
ffffffff813581f4>] ? kernel_thread_helper+0x4/0x10
[<
ffffffff8105f98b>] ? kthread_worker_fn+0x139/0x139
[<
ffffffff813581f0>] ? gs_change+0x13/0x13
---[ end trace
d8c3cc0ad163ef79 ]---
------------[ cut here ]------------
WARNING: at /usr/src/linux-3.2.52/debian/build/source_netctl/fs/proc/generic.c:849
remove_proc_entry+0x217/0x22e()
Hardware name:
remove_proc_entry: removing non-empty directory 'net/ip6t_hashlimit', leaking at least 'IN-REJECT'
This is due to lack of removal net/ip6t_hashlimit/* entries in
hashlimit_proc_net_exit(), since only IPv4 entries are deleted. Fix
it by always removing the IPv4 and IPv6 entries and their parent
directories in the netns destroy path.
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Robert Stonehouse [Wed, 9 Oct 2013 10:52:48 +0000 (11:52 +0100)]
sfc: Poll for MCDI completion once before timeout occurs
There is an as-yet unexplained bug that sometimes prevents (or delays)
the driver seeing the completion event for a completed MCDI request on
the SFC9120. The requested configuration change will have happened
but the driver assumes it to have failed, and this can result in
further failures. We can mitigate this by polling for completion
after unsuccessfully waiting for an event.
Fixes:
8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Robert Stonehouse [Wed, 9 Oct 2013 10:52:43 +0000 (11:52 +0100)]
sfc: Refactor efx_mcdi_poll() by introducing efx_mcdi_poll_once()
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Andrew Rybchenko [Sat, 16 Nov 2013 07:02:27 +0000 (11:02 +0400)]
sfc: RX buffer allocation takes prefix size into account in IP header alignment
rx_prefix_size is 4-bytes aligned on Falcon/Siena (16 bytes), but it is equal
to 14 on EF10. So, it should be taken into account if arch requires IP header
to be 4-bytes aligned (via NET_IP_ALIGN).
Fixes:
8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Thu, 5 Dec 2013 17:24:06 +0000 (17:24 +0000)]
sfc: Maintain current frequency adjustment when applying a time offset
There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock. When
applying a time offset we should not change the frequency adjustment.
These two operations can now be requested separately but this requires
a flash firmware update. Keep using the single operation, but
remember and repeat the previous frequency adjustment.
Fixes:
7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Alexandre Rames [Fri, 8 Nov 2013 10:20:31 +0000 (10:20 +0000)]
sfc: Stop/re-start PTP when stopping/starting the datapath.
This disables PTP when we bring the interface down to avoid getting
unmatched RX timestamp events, and tries to re-enable it when bringing
the interface up.
[bwh: Make efx_ptp_stop() safe on Falcon. Introduce
efx_ptp_{start,stop}_datapath() functions; we'll expand them later.]
Fixes:
7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Ben Hutchings [Fri, 6 Dec 2013 22:10:46 +0000 (22:10 +0000)]
sfc: Rate-limit log message for PTP packets without a matching timestamp event
In case of a flood of PTP packets, the timestamp peripheral and MC
firmware on the SFN[56]322F boards may not be able to provide
timestamp events for all packets. Don't complain too much about this.
Fixes:
7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Laurence Evans [Mon, 28 Jan 2013 14:51:17 +0000 (14:51 +0000)]
sfc: PTP: Moderate log message on event queue overflow
Limit syslog flood if a PTP packet storm occurs.
Fixes:
7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Michael Dalton [Thu, 5 Dec 2013 21:14:05 +0000 (13:14 -0800)]
virtio-net: free bufs correctly on invalid packet length
When a packet with invalid length arrives, ensure that the packet
is freed correctly if mergeable packet buffers and big packets
(GUEST_TSO4) are both enabled.
Signed-off-by: Michael Dalton <mwdalton@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ezequiel Garcia [Thu, 5 Dec 2013 16:35:37 +0000 (13:35 -0300)]
net: mvneta: Fix incorrect DMA unmapping size
The current code unmaps the DMA mapping created for rx skb_buff's by
using the data_size as the the mapping size. This is wrong since the
correct size to specify should match the size used to create the mapping.
This commit removes the following DMA_API_DEBUG warning:
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860()
mvneta
d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
3.10.21-01444-ga88ae13-dirty #92
[<
c0013600>] (unwind_backtrace+0x0/0xf8) from [<
c0010fb8>] (show_stack+0x10/0x14)
[<
c0010fb8>] (show_stack+0x10/0x14) from [<
c001afa0>] (warn_slowpath_common+0x48/0x68)
[<
c001afa0>] (warn_slowpath_common+0x48/0x68) from [<
c001b01c>] (warn_slowpath_fmt+0x30/0x40)
[<
c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<
c018d0fc>] (check_unmap+0x3a8/0x860)
[<
c018d0fc>] (check_unmap+0x3a8/0x860) from [<
c018d734>] (debug_dma_unmap_page+0x64/0x70)
[<
c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<
c0233f78>] (mvneta_rx+0xec/0x468)
[<
c0233f78>] (mvneta_rx+0xec/0x468) from [<
c023436c>] (mvneta_poll+0x78/0x16c)
[<
c023436c>] (mvneta_poll+0x78/0x16c) from [<
c02db468>] (net_rx_action+0x94/0x160)
[<
c02db468>] (net_rx_action+0x94/0x160) from [<
c0021e68>] (__do_softirq+0xe8/0x1d0)
[<
c0021e68>] (__do_softirq+0xe8/0x1d0) from [<
c0021ff8>] (do_softirq+0x4c/0x58)
[<
c0021ff8>] (do_softirq+0x4c/0x58) from [<
c0022228>] (irq_exit+0x58/0x90)
[<
c0022228>] (irq_exit+0x58/0x90) from [<
c000e7c8>] (handle_IRQ+0x3c/0x94)
[<
c000e7c8>] (handle_IRQ+0x3c/0x94) from [<
c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4)
[<
c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<
c000dc20>] (__irq_svc+0x40/0x50)
Exception stack(0xc04f1f70 to 0xc04f1fb8)
1f60:
c1fe46f8 00000000 00001d92 00001d92
1f80:
c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000
1fa0:
00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff
[<
c000dc20>] (__irq_svc+0x40/0x50) from [<
c004c048>] (cpu_startup_entry+0x54/0x128)
[<
c004c048>] (cpu_startup_entry+0x54/0x128) from [<
c04c1a14>] (start_kernel+0x29c/0x2f0)
[<
c04c1a14>] (start_kernel+0x29c/0x2f0) from [<
00008074>] (0x8074)
---[ end trace
d4955f6acd178110 ]---
Mapped at:
[<
c018d600>] debug_dma_map_page+0x4c/0x11c
[<
c0235d6c>] mvneta_setup_rxqs+0x398/0x598
[<
c0236084>] mvneta_open+0x40/0x17c
[<
c02dbbd4>] __dev_open+0x9c/0x100
[<
c02dbe58>] __dev_change_flags+0x7c/0x134
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 5 Dec 2013 15:27:37 +0000 (16:27 +0100)]
br: fix use of ->rx_handler_data in code executed on non-rx_handler path
br_stp_rcv() is reached by non-rx_handler path. That means there is no
guarantee that dev is bridge port and therefore simple NULL check of
->rx_handler_data is not enough. There is need to check if dev is really
bridge port and since only rcu read lock is held here, do it by checking
->rx_handler pointer.
Note that synchronize_net() in netdev_rx_handler_unregister() ensures
this approach as valid.
Introduced originally by:
commit
f350a0a87374418635689471606454abc7beaa3a
"bridge: use rx_handler_data pointer to store net_bridge_port pointer"
Fixed but not in the best way by:
commit
b5ed54e94d324f17c97852296d61a143f01b227a
"bridge: fix RCU races with bridge port"
Reintroduced by:
commit
716ec052d2280d511e10e90ad54a86f5b5d4dcc2
"bridge: fix NULL pointer deref of br_port_get_rcu"
Please apply to stable trees as well. Thanks.
RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=
1025770
Reported-by: Laine Stump <laine@redhat.com>
Debugged-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 6 Dec 2013 19:26:40 +0000 (19:26 +0000)]
sfc: Add length checks to efx_xmit_with_hwtstamp() and efx_ptp_is_ptp_tx()
efx_ptp_is_ptp_tx() must be robust against skbs from raw sockets that
have invalid IPv4 and UDP headers.
Add checks that:
- the transport header has been found
- there is enough space between network and transport header offset
for an IPv4 header
- there is enough space after the transport header offset for a
UDP header
Fixes:
7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Andrey Vagin [Thu, 5 Dec 2013 14:36:21 +0000 (18:36 +0400)]
virtio: delete napi structures from netdev before releasing memory
free_netdev calls netif_napi_del too, but it's too late, because napi
structures are placed on vi->rq. netif_napi_add() is called from
virtnet_alloc_queues.
general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii
CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task:
ffff8800b779c420 ti:
ffff8800379e0000 task.ti:
ffff8800379e0000
RIP: 0010:[<
ffffffff81322e19>] [<
ffffffff81322e19>] __list_del_entry+0x29/0xd0
RSP: 0018:
ffff8800379e1dd0 EFLAGS:
00010a83
RAX:
6b6b6b6b6b6b6b6b RBX:
ffff8800379c2fd0 RCX:
dead000000200200
RDX:
6b6b6b6b6b6b6b6b RSI:
0000000000000001 RDI:
ffff8800379c2fd0
RBP:
ffff8800379e1dd0 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000001 R12:
ffff8800379c2f90
R13:
ffff880037839160 R14:
0000000000000000 R15:
00000000013352f0
FS:
00007f1400e34740(0000) GS:
ffff8800bfb00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
00007f464124c763 CR3:
00000000b68cf000 CR4:
00000000000006e0
Stack:
ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0
ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20
ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388
Call Trace:
[<
ffffffff8155beab>] netif_napi_del+0x1b/0x80
[<
ffffffff8156499b>] free_netdev+0x8b/0x110
[<
ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net]
[<
ffffffff813ae323>] virtio_dev_remove+0x23/0x80
[<
ffffffff813f62ef>] __device_release_driver+0x7f/0xf0
[<
ffffffff813f6ca0>] driver_detach+0xc0/0xd0
[<
ffffffff813f5f28>] bus_remove_driver+0x58/0xd0
[<
ffffffff813f72ec>] driver_unregister+0x2c/0x50
[<
ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10
[<
ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net]
[<
ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
[<
ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
[<
ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
[<
ffffffff81677f69>] system_call_fastpath+0x16/0x1b
Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00
RIP [<
ffffffff81322e19>] __list_del_entry+0x29/0xd0
RSP <
ffff8800379e1dd0>
---[ end trace
d5931cd3f87c9763 ]---
Fixes:
986a4f4d452d (virtio_net: multiqueue support)
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey Vagin [Thu, 5 Dec 2013 14:36:20 +0000 (18:36 +0400)]
virtio-net: determine type of bufs correctly
free_unused_bufs must check vi->mergeable_rx_bufs before
vi->big_packets, because we use this sequence in other places.
Otherwise we allocate buffer of one type, then free it as another
type.
general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables pcspkr virtio_balloon virtio_net(-) i2c_pii
CPU: 0 PID: 400 Comm: rmmod Not tainted 3.13.0-rc2+ #170
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task:
ffff8800b6d2a210 ti:
ffff8800aed32000 task.ti:
ffff8800aed32000
RIP: 0010:[<
ffffffffa00345f3>] [<
ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
RSP: 0018:
ffff8800aed33dd8 EFLAGS:
00010202
RAX:
ffff8800b1fe2c00 RBX:
ffff8800b66a7240 RCX:
6b6b6b6b6b6b6b6b
RDX:
6b6b6b6b6b6b6b6b RSI:
ffff8800b8419a68 RDI:
ffff8800b66a1148
RBP:
ffff8800aed33e00 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000001 R12:
0000000000000000
R13:
ffff8800b66a1148 R14:
0000000000000000 R15:
000077ff80000000
FS:
00007fc4f9c4e740(0000) GS:
ffff8800bfa00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
00007f63f432f000 CR3:
00000000b6538000 CR4:
00000000000006f0
Stack:
ffff8800b66a7240 ffff8800b66a7380 ffff8800377bd3f0 0000000000000000
00000000023302f0 ffff8800aed33e18 ffffffffa00346e2 ffff8800b66a7240
ffff8800aed33e38 ffffffffa003474d ffff8800377bd388 ffff8800377bd390
Call Trace:
[<
ffffffffa00346e2>] remove_vq_common+0x22/0x40 [virtio_net]
[<
ffffffffa003474d>] virtnet_remove+0x4d/0x90 [virtio_net]
[<
ffffffff813ae303>] virtio_dev_remove+0x23/0x80
[<
ffffffff813f62cf>] __device_release_driver+0x7f/0xf0
[<
ffffffff813f6c80>] driver_detach+0xc0/0xd0
[<
ffffffff813f5f08>] bus_remove_driver+0x58/0xd0
[<
ffffffff813f72cc>] driver_unregister+0x2c/0x50
[<
ffffffff813ae63e>] unregister_virtio_driver+0xe/0x10
[<
ffffffffa0036852>] virtio_net_driver_exit+0x10/0x7be [virtio_net]
[<
ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
[<
ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
[<
ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
[<
ffffffff81677f69>] system_call_fastpath+0x16/0x1b
Code: c0 74 55 0f 1f 44 00 00 80 7b 30 00 74 7a 48 8b 50 30 4c 89 e6 48 03 73 20 48 85 d2 0f 84 bb 00 00 00 66 0f
RIP [<
ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
RSP <
ffff8800aed33dd8>
---[ end trace
edb570ea923cce9c ]---
Fixes:
2613af0ed18a (virtio_net: migrate mergeable rx buffers to page frag allocators)
Cc: Michael Dalton <mwdalton@google.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Thu, 5 Dec 2013 10:36:58 +0000 (11:36 +0100)]
bonding: fix packets_per_slave showing
There's an issue when showing the value of packets_per_slave due to
using signed integer. The value may be < 0 and thus not put through
reciprocal_value() before showing. This patch makes it use unsigned
integer when showing it.
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Thu, 5 Dec 2013 06:38:16 +0000 (12:08 +0530)]
be2net: Free/delete pmacs (in be_clear()) only if they exist
During suspend-resume and lancer error recovery we will cleanup and
re-initialize the resources through be_clear() and be_setup() respectively.
During re-initialisation in be_setup(), if be_get_config() fails, we'll again
call be_clear() which will cause a NULL pointer dereference as adapter->pmac_id is
already freed.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Somnath Kotur [Thu, 5 Dec 2013 06:37:55 +0000 (12:07 +0530)]
be2net: Fix Lancer error recovery to distinguish FW download
The Firmware update would be detected by looking at the sliport_error1/
sliport_error2 register values(0x02/0x00). If its not a FW reset the current
messaging would take place. If the error is due to FW reset, log a message to
user that "Firmware update in progress" and also do not log sliport_status and
sliport_error register values.
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhi Yong Wu [Fri, 6 Dec 2013 06:16:51 +0000 (14:16 +0800)]
tun: update file current position
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhi Yong Wu [Fri, 6 Dec 2013 06:16:50 +0000 (14:16 +0800)]
macvtap: update file current position
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Thu, 5 Dec 2013 22:29:19 +0000 (23:29 +0100)]
net: clear local_df when passing skb between namespaces
We must clear local_df when passing the skb between namespaces as the
packet is not local to the new namespace any more and thus may not get
fragmented by local rules. Fred Templin noticed that other namespaces
do fragment IPv6 packets while forwarding. Instead they should have send
back a PTB.
The same problem should be present when forwarding DF-IPv4 packets
between namespaces.
Reported-by: Templin, Fred L <Fred.L.Templin@boeing.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Thu, 5 Dec 2013 04:12:04 +0000 (20:12 -0800)]
tcp_memcontrol: Cleanup/fix cg_proto->memory_pressure handling.
kill memcg_tcp_enter_memory_pressure. The only function of
memcg_tcp_enter_memory_pressure was to reduce deal with the
unnecessary abstraction that was tcp_memcontrol. Now that struct
tcp_memcontrol is gone remove this unnecessary function, the
unnecessary function pointer, and modify sk_enter_memory_pressure to
set this field directly, just as sk_leave_memory_pressure cleas this
field directly.
This fixes a small bug I intruduced when killing struct tcp_memcontrol
that caused memcg_tcp_enter_memory_pressure to never be called and
thus failed to ever set cg_proto->memory_pressure.
Remove the cg_proto enter_memory_pressure function as it now serves
no useful purpose.
Don't test cg_proto->memory_presser in sk_leave_memory_pressure before
clearing it. The test was originally there to ensure that the pointer
was non-NULL. Now that cg_proto is not a pointer the pointer does not
matter.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Wed, 4 Dec 2013 17:06:51 +0000 (18:06 +0100)]
forcedeth: run loopback test only on chipsets that support it
The driver incorrectly run loopback test on chips that don't support it.
Loopback test is only supported by chips that has DEV_HAS_TEST_EXTENDED
flag and returns 4 (NV_TEST_COUNT_EXTENDED) as test count.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Wed, 4 Dec 2013 10:59:31 +0000 (18:59 +0800)]
bonding: add arp_ip_target checks when install the module
When I install the bonding with the wrong arp_ip_target,
just like arp_ip_target=500.500.500.500, the arp_ip_target
was transfored to 245.245.245.244 and stored in the ip
target success, it is uncorrect, so I add checks to avoid
adding wrong address.
The in4_pton() will set wrong ip address to 0.0.0.0 and
return 0, also use the micro IS_IP_TARGET_UNUSABLE_ADDRESS
to simplify the code.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Wed, 4 Dec 2013 10:04:54 +0000 (12:04 +0200)]
bnx2x: avoid null pointer dereference when enabling SR-IOV
Fixed NULL pointer dereference when dynamically activating SR-IOV after vf
database failed to be allocated in probe stage (for example due to no ARI
support in pci hub).
Signed-off-by: Michal Kalderon <michals@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wangweidong [Wed, 4 Dec 2013 09:32:39 +0000 (17:32 +0800)]
sctp: disable max_burst when the max_burst is 0
As Michael pointed out that when max_burst is 0, it just disable
max_burst. It declared in rfc6458#section-8.1.24. so add the check
in sctp_transport_burst_limited, when it 0, just do nothing.
Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Suggested-by: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Tue, 3 Dec 2013 23:13:02 +0000 (15:13 -0800)]
net: davinci_emac: Fix platform data handling and make usable for am3517
When booted with device tree, we may still have platform data passed
as auxdata. For am3517 this is needed for passing the interrupt_enable
and interrupt_disable callbacks that access the omap system control module
registers. These callback functions will eventually go away when we have
a separate system control module driver.
Some of the things that are currently passed as platform data we don't need
to set up as device tree properties as they are always the same on am3517.
So let's use a new compatible flag for those so we can get those from
the device tree match data.
Also note that we need to fix setting of phy_dev to NULL instead of an empty
string as the code later on uses that to find the first phy on the mdio bus.
This seems to have been caused by
5d69e0076a72 (net: davinci_emac: switch to
new mdio).
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Thu, 5 Dec 2013 23:11:24 +0000 (18:11 -0500)]
qlge: Update version to 1.00.00.34
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Thu, 5 Dec 2013 23:11:23 +0000 (18:11 -0500)]
qlge: Allow enable/disable rx/tx vlan acceleration independently
o Fix the driver to allow user to enable/disable rx/tx vlan acceleration independently.
For example:
ethtool -K ethX rxvlan on/off
ethtool -K ethX txvlan on/off
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jitendra Kalsaria [Thu, 5 Dec 2013 23:11:22 +0000 (18:11 -0500)]
qlge: Fix ethtool statistics
o Receive mac error stat was getting overwritten by other stats.
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Tue, 3 Dec 2013 17:39:29 +0000 (17:39 +0000)]
xen-netback: fix fragment detection in checksum setup
The code to detect fragments in checksum_setup() was missing for IPv4 and
too eager for IPv6. (It transpires that Windows seems to send IPv6 packets
with a fragment header even if they are not a fragment - i.e. offset is zero,
and M bit is not set).
This patch also incorporates a fix to callers of maybe_pull_tail() where
skb->network_header was being erroneously added to the length argument.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
cc: David Miller <davem@davemloft.net>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 6 Dec 2013 00:25:20 +0000 (16:25 -0800)]
MAINTAINERS: Update Intel Wired Ethernet LAN Maintainers
Remove Tushar and Peter (PJ) as maintainers, since they have moved out
of our group and no longer work on Wired Ethernet.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 6 Dec 2013 00:29:26 +0000 (19:29 -0500)]
Merge branch 'net_sched_actions'
Jamal Hadi Salim says:
====================
net_sched: actions - Add default lookup and walker
This set of patches provide defaults for lookup and walkers for actions.
Users can override when needed.
It also enforces mandatory action methods to be passed.
Thanks to David Miller who paid attention and made this patch set
much better.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Wed, 4 Dec 2013 14:26:56 +0000 (09:26 -0500)]
net_sched: Use default action walker methods
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Wed, 4 Dec 2013 14:26:55 +0000 (09:26 -0500)]
net_sched: Provide default walker function for actions
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Wed, 4 Dec 2013 14:26:54 +0000 (09:26 -0500)]
net_sched: Use default action lookup functions
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Wed, 4 Dec 2013 14:26:53 +0000 (09:26 -0500)]
net_sched: Default action lookup method for actions
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Wed, 4 Dec 2013 14:26:52 +0000 (09:26 -0500)]
net_sched: Fail if missing mandatory action operation methods
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 5 Dec 2013 21:02:56 +0000 (16:02 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless
John W. Linville says:
====================
Please pull this batch of fixes intende for the 3.13 stream!
For the mac80211 bits, Johannes says:
"For now I have various fixes all over, mostly for issues introduced in
relatively recent patches. There's no real pattern to it. Some of the
issues like go back longer, but still seemed 3.13 material."
And...
"These are just two patches disabling the broken CSA code. Once this
goes into your tree I'll merge it into mac80211-next and revert there
(since we fixed the bugs there)."
For the iwlwifi bits, Emmanuel says:
"I have here a few fixes for BT Coex. One of them is a NULL pointer
dereference. Another one avoids to enable a feature that can make the
firmware unhappy since the firmware isn't ready for it yet. WE also
avoid a WARNING that can be triggered upon association in not-so-bad
cases even if the association succeeded. We add support for new NICs
(not yet on the market) and bump the API so that 3.13 will be able to
work with the new firmware that will be out soon hopefully.
I also have a boundary check from Johannes."
In addition to those...
- Arend van Spriel fixes a brcmfmac problem that could use an
uninitialized variable in an error path.
- Borislav Petkov fixes a Kconfig-based build breakage problem for
brcmsmac.
- Michal Nazarewicz fixes a couple of NULL pointer dereference problems
in ath9k and wcn36xx.
- Sujith Manoharan fixes a couple of ath9k problems related to
incorrect interpretation of EEPROM configuration data.
- Ujjal Roy fixes a memory leak in mwifiex.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 5 Dec 2013 14:29:56 +0000 (09:29 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
David S. Miller [Tue, 3 Dec 2013 21:55:49 +0000 (16:55 -0500)]
Merge branch 'cxgb4'
Hariprasad Shenai says:
====================
Fixes T5 adapter init, due to incorrect FW version check
This patch series fixes, Chelsio T5 adapter initialization failure due to
incorrect firmware version check. This patch series modifies the firmware
flashing mechanism for T4/T5 adapter.
The patch series moves chip type from struct adapter to struct adapter_params.
It changes the references of chip type in cxgb4 and cxgb4vf drivers such that
build failure is avoided.
Patch 3/3 is dependent on patch 1/3
Patch 2/3 is also dependent on patch 1/3
We would like to request this patch series to get merged via David Miller's
'net' tree.
We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:58 +0000 (17:05 +0530)]
cxgb4: Add new scheme to update T4/T5 firmware
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:57 +0000 (17:05 +0530)]
cxgb4vf: added much cleaner implementation of is_t4()
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 3 Dec 2013 11:35:56 +0000 (17:05 +0530)]
cxgb4: Much cleaner implementation of is_t4()/is_t5()
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yang [Tue, 3 Dec 2013 02:04:10 +0000 (10:04 +0800)]
net/mlx4_core: destroy workqueue when driver fails to register
When driver registration fails, we need to clean up the resources allocated
before. mlx4_core missed destroying the workqueue allocated.
This patch destroys the workqueue when registration fails.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Venkat Venkatsubra [Mon, 2 Dec 2013 23:41:39 +0000 (15:41 -0800)]
rds: prevent BUG_ON triggered on congestion update to loopback
After congestion update on a local connection, when rds_ib_xmit returns
less bytes than that are there in the message, rds_send_xmit calls
back rds_ib_xmit with an offset that causes BUG_ON(off & RDS_FRAG_SIZE)
to trigger.
For a 4Kb PAGE_SIZE rds_ib_xmit returns min(8240,4096)=4096 when actually
the message contains 8240 bytes. rds_send_xmit thinks there is more to send
and calls rds_ib_xmit again with a data offset "off" of 4096-48(rds header)
=4048 bytes thus hitting the BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].
The commit
6094628bfd94323fc1cea05ec2c6affd98c18f7f
"rds: prevent BUG_ON triggering on congestion map updates" introduced
this regression. That change was addressing the triggering of a different
BUG_ON in rds_send_xmit() on PowerPC architecture with 64Kbytes PAGE_SIZE:
BUG_ON(ret != 0 &&
conn->c_xmit_sg == rm->data.op_nents);
This was the sequence it was going through:
(rds_ib_xmit)
/* Do not send cong updates to IB loopback */
if (conn->c_loopback
&& rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
}
rds_ib_xmit returns 8240
rds_send_xmit:
c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
= 8192
c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
rds_ib_xmit returns 8240
rds_send_xmit:
c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
rds_ib_xmit returns 8240
On this iteration this sequence causes the BUG_ON in rds_send_xmit:
while (ret) {
tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
[tmp = 65536 - 57632 = 7904]
conn->c_xmit_data_off += tmp;
[c_xmit_data_off = 57632 + 7904 = 65536]
ret -= tmp;
[ret = 8240 - 7904 = 336]
if (conn->c_xmit_data_off == sg->length) {
conn->c_xmit_data_off = 0;
sg++;
conn->c_xmit_sg++;
BUG_ON(ret != 0 &&
conn->c_xmit_sg == rm->data.op_nents);
[c_xmit_sg = 1, rm->data.op_nents = 1]
What the current fix does:
Since the congestion update over loopback is not actually transmitted
as a message, all that rds_ib_xmit needs to do is let the caller think
the full message has been transmitted and not return partial bytes.
It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4Kb.
And 64Kb+48 when page size is 64Kb.
Reported-by: Josh Hunt <joshhunt00@gmail.com>
Tested-by: Honggang Li <honli@redhat.com>
Acked-by: Bang Nguyen <bang.nguyen@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Tue, 3 Dec 2013 14:06:25 +0000 (14:06 +0000)]
xen-netback: clear vif->task on disconnect
xenvif_start_xmit() relies on checking vif->task for NULL to determine
whether the vif is ready to accept packets. The task thread is stopped in
xenvif_disconnect() but task is not set to NULL. Thus, on a re-connect the
check will give a false positive.
Also since commit
ea732dff5cfa10789007bf4a5b935388a0bb2a8f (Handle backend
state transitions in a more robust way) it should not be possible for
xenvif_connect() to be called if the vif is already connected so change the
check of vif->tx_irq to a BUG_ON() and also add a BUG_ON(vif->task).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 2 Dec 2013 22:26:05 +0000 (17:26 -0500)]
Revert "net: Handle CHECKSUM_COMPLETE more adequately in pskb_trim_rcsum()."
This reverts commit
018c5bba052b3a383d83cf0c756da0e7bc748397.
It causes regressions for people using chips driven by the sungem
driver. Suspicion is that the skb->csum value isn't being adjusted
properly.
The change also has a bug in that if __pskb_trim() fails, we'll leave
a corruped skb->csum value in there. We would really need to revert
it to it's original value in that case.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 2 Dec 2013 16:51:13 +0000 (08:51 -0800)]
net: do not pretend FRAGLIST support
Few network drivers really supports frag_list : virtual drivers.
Some drivers wrongly advertise NETIF_F_FRAGLIST feature.
If skb with a frag_list is given to them, packet on the wire will be
corrupt.
Remove this flag, as core networking stack will make sure to
provide packets that can be sent without corruption.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Anirudha Sarangi <anirudh@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kamala R [Mon, 2 Dec 2013 14:25:21 +0000 (19:55 +0530)]
IPv6: Fixed support for blackhole and prohibit routes
The behaviour of blackhole and prohibit routes has been corrected by setting
the input and output pointers of the dst variable appropriately. For
blackhole routes, they are set to dst_discard and to ip6_pkt_discard and
ip6_pkt_discard_out respectively for prohibit routes.
ipv6: ip6_pkt_prohibit(_out) should not depend on
CONFIG_IPV6_MULTIPLE_TABLES
We need ip6_pkt_prohibit(_out) available without
CONFIG_IPV6_MULTIPLE_TABLES
Signed-off-by: Kamala R <kamala@aristanetworks.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
François-Xavier Le Bail [Mon, 2 Dec 2013 10:28:49 +0000 (11:28 +0100)]
ipv6: fix third arg of anycast_dst_alloc(), must be bool.
Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Siewior [Mon, 2 Dec 2013 09:52:55 +0000 (10:52 +0100)]
net: fec_main: dma_map() only the length of the skb
On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048)
bytes. This works because we don't overwrite any memory after the data buffer,
we remove it from cache if it was there. So we hurt performace in case the
mapping of a smaller area makes a difference.
There is also a bug: If the data area starts shortly before the end of
RAM say 0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough
space to fit the data area (according to skb->len) but we would map beyond
end of ram if we are using 2048. In v2.6.31 (against which kernel this patch
made) there is the following check in dma_cache_maint():
|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));
Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
per
2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").
This patch was tested on v2.6.31 and then forward-ported and compile
tested only against the net tree. I think it is still worth fixing
mainline even after the BUG() statement is gone.
Tested-by: Fugang Duan <B38611@freescale.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 2 Dec 2013 07:23:39 +0000 (12:53 +0530)]
drivers: net: cpsw: fix dt probe for one port ethernet
When only one port of the two port is pinned out, then dt probe is failing
because second port phy is not found. fixing this by checking the number of
slaves and breaking the loop.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafael J. Wysocki [Sun, 1 Dec 2013 01:34:37 +0000 (02:34 +0100)]
PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not present
Modify tg3_chip_reset() and tg3_close() to check if the PCI network
adapter device is accessible at all in order to skip poking it or
trying to handle a carrier loss in vain when that's not the case.
Introduce a special PCI helper function pci_device_is_present()
for this purpose.
Of course, this uncovers the lack of the appropriate RTNL locking
in tg3_suspend() and tg3_resume(), so add that locking in there
too.
These changes prevent tg3 from burning a CPU at 100% load level for
solid several seconds after the Thunderbolt link is disconnected from
a Matrox DS1 docking station.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Tue, 26 Nov 2013 07:46:56 +0000 (15:46 +0800)]
ipv6: judge the accept_ra_defrtr before calling rt6_route_rcv
when dealing with a RA message, if accept_ra_defrtr is false,
the kernel will not add the default route, and then deal with
the following route information options. Unfortunately, those
options maybe contain default route, so let's judge the
accept_ra_defrtr before calling rt6_route_rcv.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Mon, 2 Dec 2013 18:20:03 +0000 (13:20 -0500)]
Merge branch 'for-john' of git://git./linux/kernel/git/jberg/mac80211
Linus Torvalds [Mon, 2 Dec 2013 18:15:39 +0000 (10:15 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
- Correction of fuzzy and fragile IRQ_RETVAL macro
- IRQ related resume fix affecting only XEN
- ARM/GIC fix for chained GIC controllers
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Gic: fix boot for chained gics
irq: Enable all irqs unconditionally in irq_resume
genirq: Correct fuzzy and fragile IRQ_RETVAL() definition