Toshiaki Makita [Fri, 7 Feb 2014 07:48:23 +0000 (16:48 +0900)]
bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address
br_fdb_change_mac_address() doesn't check if the local entry has the
same address as any of bridge ports.
Although I'm not sure when it is beneficial, the current implementation allows
the bridge device to receive any mac address of its ports.
To preserve this behavior, we have to check if the mac address of the
entry being deleted is identical to that of any port.
As this check is almost the same as that in br_fdb_changeaddr(), create
a common function fdb_delete_local() and call it from
br_fdb_changeaddr() and br_fdb_change_mac_address().
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Fri, 7 Feb 2014 07:48:22 +0000 (16:48 +0900)]
bridge: Fix the way to check if a local fdb entry can be deleted
We should take into account the following when deleting a local fdb
entry.
- nbp_vlan_find() can be used only when vid != 0 to check if an entry is
deletable, because a fdb entry with vid 0 can exist at any time while
nbp_vlan_find() always returns false with vid 0.
Example of problematic case:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address 12:34:56:78:90:ab
brctl addif br0 eth0
brctl addif br0 eth1
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the fdb entry 12:34:56:78:90:ab will be deleted even though the
bridge port eth1 still has that address.
- The port to which the bridge device is attached might need a local entry
if its mac address is set manually.
Example of problematic case:
ip link set eth0 address 12:34:56:78:90:ab
brctl addif br0 eth0
ip link set br0 address 12:34:56:78:90:ab
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the fdb still must have the entry 12:34:56:78:90:ab, but it will be
deleted.
We can use br->dev->addr_assign_type to check if the address is manually
set or not, but I propose another approach.
Since we delete and insert local entries whenever changing mac address
of the bridge device, we can change dst of the entry to NULL regardless of
addr_assign_type when deleting an entry associated with a certain port,
and if it is found to be unnecessary later, then delete it.
That is, if changing mac address of a port, the entry might be changed
to its dst being NULL first, but is eventually deleted when recalculating
and changing bridge id.
This approach is especially useful when we want to share the code with
deleting vlan in which the bridge device might want such an entry regardless
of addr_assign_type, and makes things easy because we don't have to consider
if mac address of the bridge device will be changed or not at the time we
delete a local entry of a port, which means fdb code will not be bothered
even if the bridge id calculating logic is changed in the future.
Also, this change reduces inconsistent state, where frames whose dst is the
mac address of the bridge can't reach the bridge because of premature fdb
entry deletion. This change reduces the possibility that the bridge device
replies with an unreachable mac address to arp requests, which could occur during
the short window between calling del_nbp() and br_stp_recalculate_bridge_id()
in br_del_if(). This will be effective after br_fdb_delete_by_port() starts to
use the same code in the following patch.
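The deletion rule described above can be sketched as a small model. This is only an illustrative Python model, not the kernel code: the function name delete_local, the dict-based fdb and the return strings are hypothetical, and vlan handling is left out.

```python
def delete_local(fdb, ports, bridge_mac, port, mac):
    """Model of deleting the local entry `mac` that belonged to `port`.

    fdb:        dict mapping mac -> owning port name, or None (models dst == NULL)
    ports:      dict mapping port name -> current mac address of that port
    bridge_mac: current mac address of the bridge device
    """
    # Another port still has this address: hand the entry over to it.
    for name, addr in ports.items():
        if name != port and addr == mac:
            fdb[mac] = name
            return "moved"
    # The bridge device has this address: keep the entry with dst == NULL.
    # If the bridge address changes later, the entry is removed at that point.
    if bridge_mac == mac:
        fdb[mac] = None
        return "kept-for-bridge"
    # Nobody needs the entry any more.
    del fdb[mac]
    return "deleted"
```

With the first problematic case above (eth0 and eth1 sharing an address), the model moves the entry to eth1 instead of deleting it; with the second case, the entry stays with dst == NULL for the bridge device.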
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Fri, 7 Feb 2014 07:48:21 +0000 (16:48 +0900)]
bridge: Change local fdb entries whenever mac address of bridge device changes
Vlan code may need fdb change when changing mac address of bridge device
even if it is caused by the mac address changing of a bridge port.
Example configuration:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
bridge vlan add dev br0 vid 10 self
bridge vlan add dev eth0 vid 10
We will have fdb entry such that f->dst == NULL, f->vlan_id == 10 and
f->addr == 12:34:56:78:90:ab at this time.
Next, change the mac address of eth0 to a greater value.
ip link set eth0 address ee:ff:12:34:56:78
Then, mac address of br0 will be recalculated and set to aa:bb:cc:dd:ee:ff.
However, an entry aa:bb:cc:dd:ee:ff will not be created and we will not be
able to communicate using br0 on vlan 10.
Address this issue by deleting and adding local entries whenever
changing the mac address of the bridge device.
If there already exists an entry that has the same address, for example,
in case that br_fdb_changeaddr() has already inserted it,
br_fdb_change_mac_address() will simply fail to insert it and no
duplicated entry will be made, as it was.
This approach also needs br_add_if() to call br_fdb_insert() before
br_stp_recalculate_bridge_id() so that we don't create an entry whose
dst == NULL in this function to preserve previous behavior.
Note that this is a slight change in behavior where the bridge device can
receive the traffic to the new address before calling
br_stp_recalculate_bridge_id() in br_add_if().
However, it is not a problem because we already have the address on the
new port and such a way to insert new one before recalculating bridge id
is taken in br_device_event() as well.
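The example above can be replayed with a small model. This is a sketch under assumptions, not the kernel code: it assumes the bridge address is recalculated as the numerically smallest port address (which matches the example), and the helper names mac_key, recalculate_bridge_mac and change_bridge_mac are hypothetical.

```python
def mac_key(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to bytes so addresses compare numerically."""
    return bytes.fromhex(mac.replace(":", ""))


def recalculate_bridge_mac(port_macs):
    """Assumption: the bridge takes the smallest mac address among its ports."""
    return min(port_macs, key=mac_key)


def change_bridge_mac(local_entries, old_mac, new_mac, vids):
    """Delete and re-add the bridge's own entries (dst == NULL) for each vid.

    local_entries is a set of (mac, vid) tuples. Adding to a set is a no-op
    if the entry already exists, which models "no duplicated entry is made".
    """
    for vid in vids:
        local_entries.discard((old_mac, vid))
        local_entries.add((new_mac, vid))


entries = {("12:34:56:78:90:ab", 10)}
new_mac = recalculate_bridge_mac(["ee:ff:12:34:56:78", "aa:bb:cc:dd:ee:ff"])
change_bridge_mac(entries, "12:34:56:78:90:ab", new_mac, [10])
```

After the last call the set holds the entry for aa:bb:cc:dd:ee:ff on vlan 10, which is the entry that was missing before this change.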
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Fri, 7 Feb 2014 07:48:20 +0000 (16:48 +0900)]
bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address
We have always failed to delete the old entry at
br_fdb_change_mac_address() because br_set_mac_address() updates
dev->dev_addr before calling br_fdb_change_mac_address() and
br_fdb_change_mac_address() uses dev->dev_addr to find the old entry.
That update of dev_addr is completely unnecessary because the same work
is done in br_stp_change_bridge_id() which is called right away after
calling br_fdb_change_mac_address().
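The ordering problem can be shown with a toy model. This is a simplified sketch, not the kernel code: the Bridge class and the set_mac_buggy/set_mac_fixed helpers are hypothetical, and the final assignment to dev_addr merely stands in for what br_stp_change_bridge_id() does.

```python
class Bridge:
    def __init__(self, mac):
        self.dev_addr = mac
        self.fdb = {mac}  # set of local entry addresses


def fdb_change_mac_address(br, new_mac):
    # The old entry is looked up through the current dev_addr.
    br.fdb.discard(br.dev_addr)
    br.fdb.add(new_mac)


def set_mac_buggy(br, new_mac):
    br.dev_addr = new_mac              # premature update: old address is lost
    fdb_change_mac_address(br, new_mac)
    br.dev_addr = new_mac              # stand-in for br_stp_change_bridge_id()


def set_mac_fixed(br, new_mac):
    fdb_change_mac_address(br, new_mac)
    br.dev_addr = new_mac              # stand-in for br_stp_change_bridge_id()


buggy = Bridge("12:34:56:78:90:ab")
set_mac_buggy(buggy, "aa:bb:cc:dd:ee:ff")
fixed = Bridge("12:34:56:78:90:ab")
set_mac_fixed(fixed, "aa:bb:cc:dd:ee:ff")
```

In the buggy ordering the stale entry 12:34:56:78:90:ab stays in the table next to the new one; in the fixed ordering only the new address remains.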
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Fri, 7 Feb 2014 07:48:19 +0000 (16:48 +0900)]
bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr
Since commit bc9a25d21ef8 ("bridge: Add vlan support for local fdb entries"),
br_fdb_changeaddr() has inserted a new local fdb entry only if it can
find the old one. But if we have two ports where they have the same address
or the user has deleted a local entry, there will be no entry for one of the
ports.
Example of problematic case:
ip link set eth0 address aa:bb:cc:dd:ee:ff
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
brctl addif br0 eth1 # eth1 will not have a local entry due to dup.
ip link set eth1 address 12:34:56:78:90:ab
Then, the new entry for the address 12:34:56:78:90:ab will not be
created, and the bridge device will not be able to communicate.
Insert new entries regardless of whether we can find old entries or not.
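A minimal model of the difference, assuming a dict-based fdb that maps an address to its owning port. The names changeaddr_old and changeaddr_new are hypothetical, and the model ignores vlans and the question of whether the old entry is still needed elsewhere.

```python
def changeaddr_old(fdb, port, old_mac, new_mac):
    """Old behavior: the new entry is inserted only when the old one is found."""
    if fdb.get(old_mac) == port:
        del fdb[old_mac]
        fdb[new_mac] = port


def changeaddr_new(fdb, port, old_mac, new_mac):
    """New behavior: always insert, whether or not the old entry was found."""
    if fdb.get(old_mac) == port:
        del fdb[old_mac]
    # setdefault() leaves an already existing entry untouched (no duplicate).
    fdb.setdefault(new_mac, port)


# eth0 and eth1 share aa:bb:cc:dd:ee:ff, so only eth0 owns a local entry.
before = {"aa:bb:cc:dd:ee:ff": "eth0"}
old_result = dict(before)
changeaddr_old(old_result, "eth1", "aa:bb:cc:dd:ee:ff", "12:34:56:78:90:ab")
new_result = dict(before)
changeaddr_new(new_result, "eth1", "aa:bb:cc:dd:ee:ff", "12:34:56:78:90:ab")
```

In this model old_result still lacks 12:34:56:78:90:ab, while new_result contains it for eth1 and keeps the entry owned by eth0.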
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Fri, 7 Feb 2014 07:48:18 +0000 (16:48 +0900)]
bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
br_fdb_changeaddr() assumes that there is at most one local entry per port
per vlan. It used to be true, but since commit 36fd2b63e3b4 ("bridge: allow
creating/deleting fdb entries via netlink"), it has not been so.
Therefore, the function might fail to find the correct previous address
to be deleted and delete an arbitrary local entry if the user has added local
entries manually.
Example of problematic case:
ip link set eth0 address ee:ff:12:34:56:78
brctl addif br0 eth0
bridge fdb add 12:34:56:78:90:ab dev eth0 master
ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the address 12:34:56:78:90:ab might be deleted instead of
ee:ff:12:34:56:78, the original mac address of eth0.
Address this issue by introducing a new flag, added_by_user, to struct
net_bridge_fdb_entry.
Note that br_fdb_delete_by_port() has to set added_by_user to 0 in cases
like:
ip link set eth0 address 12:34:56:78:90:ab
ip link set eth1 address aa:bb:cc:dd:ee:ff
brctl addif br0 eth0
bridge fdb add aa:bb:cc:dd:ee:ff dev eth0 master
brctl addif br0 eth1
brctl delif br0 eth0
In this case, the kernel should delete the user-added entry aa:bb:cc:dd:ee:ff,
but it also should have been added by "brctl addif br0 eth1" originally,
so we don't delete it and treat it as a new kernel-created entry.
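How the flag disambiguates the lookup can be sketched as follows. This is an illustrative model only: the Entry dataclass and find_kernel_local() are hypothetical names, and only the added_by_user field corresponds to something named in this commit.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    mac: str
    port: str
    is_local: bool = True
    added_by_user: bool = False


def find_kernel_local(entries, port):
    """Return the local entry the kernel created for `port`, skipping user-added ones."""
    for e in entries:
        if e.port == port and e.is_local and not e.added_by_user:
            return e
    return None


entries = [
    # Added with "bridge fdb add 12:34:56:78:90:ab dev eth0 master".
    Entry("12:34:56:78:90:ab", "eth0", added_by_user=True),
    # Created by the kernel when eth0 joined the bridge.
    Entry("ee:ff:12:34:56:78", "eth0"),
]
old = find_kernel_local(entries, "eth0")
```

Without the flag, a search for "the local entry of eth0" could stop at the first element; with it, the search returns ee:ff:12:34:56:78, the original address of eth0.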
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Sun, 9 Feb 2014 22:30:32 +0000 (23:30 +0100)]
tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min
As far as I can tell we have used a default of 60 seconds for
FIN_WAIT2 timeout for ages (since 2.x times??).
In any case, the timeout these days is 60 seconds, so the 3 min
comment is wrong (and cost me a few minutes of my life when I was
debugging a FIN_WAIT2 related problem in a userspace application and
checked the kernel source for details).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Engelmayer [Sat, 8 Feb 2014 23:16:17 +0000 (00:16 +0100)]
net: vxge: Remove unused device pointer
Remove occurrences of unused struct __vxge_hw_device pointer in functions
vxge_learn_mac() and vxge_rem_isr().
Detected by Coverity: CID 139839, CID 139842.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Raymond Wanyoike [Sat, 8 Feb 2014 21:01:02 +0000 (00:01 +0300)]
net: qmi_wwan: add ZTE MF667
The driver description files give these descriptions to the vendor specific
ports on this modem:
VID_19D2&PID_1270&MI_00: "ZTE MF667 Diagnostics Port"
VID_19D2&PID_1270&MI_01: "ZTE MF667 AT Port"
VID_19D2&PID_1270&MI_02: "ZTE MF667 ATExt2 Port"
VID_19D2&PID_1270&MI_03: "ZTE MF667 ATExt Port"
VID_19D2&PID_1270&MI_04: "ZTE MF667 USB Modem"
VID_19D2&PID_1270&MI_05: "ZTE MF667 Network Adapter"
Signed-off-by: Raymond Wanyoike <raymond.wanyoike@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Engelmayer [Sat, 8 Feb 2014 17:11:17 +0000 (18:11 +0100)]
3c59x: Remove unused pointer in vortex_eisa_cleanup()
Remove unused network device private data pointer 'vp' in function
vortex_eisa_cleanup(). Detected by Coverity: CID 139826.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maciej Żenczykowski [Sat, 8 Feb 2014 00:23:48 +0000 (16:23 -0800)]
net: fix 'ip rule' iif/oif device rename
ip rules with iif/oif references do not update
(detach/attach) across interface renames.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
CC: Willem de Bruijn <willemb@google.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Chris Davis <chrismd@google.com>
CC: Carlo Contavalli <ccontavalli@google.com>
Google-Bug-Id: 12936021
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Engelmayer [Fri, 7 Feb 2014 23:21:10 +0000 (00:21 +0100)]
wan: dlci: Remove unused netdev_priv pointer
Remove occurrences of unused pointer to network device private data in
functions dlci_header() and dlci_receive().
Detected by Coverity: CID 139844, CID 139845.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Engelmayer [Fri, 7 Feb 2014 21:58:38 +0000 (22:58 +0100)]
6lowpan: Remove unused pointer in lowpan_header_create()
Commit 8df8c56a (6lowpan: Moving generic compression code into 6lowpan_iphc.c)
left pointer 'hdr' unused - remove it.
Detected by Coverity: CID 1164868.
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
FX Le Bail [Fri, 7 Feb 2014 10:22:37 +0000 (11:22 +0100)]
ipv6: icmp6_send: fix Oops when pinging a not set up IPv6 peer on a sit tunnel
The patch 446fab59333dea91e54688f033dd8d788d0486fb ("ipv6: enable anycast addresses
as source addresses in ICMPv6 error messages") causes an Oops when pinging a not
set up IPv6 peer on a sit tunnel.
The problem is that ipv6_anycast_destination() unconditionally uses skb_dst(skb),
which is NULL in this case.
The solution is to use the ipv6_chk_acast_addr_src() function instead.
Here are the steps to reproduce it:
modprobe sit
ip link add sit1 type sit remote 10.16.0.121 local 10.16.0.249
ip l s sit1 up
ip -6 a a dev sit1 2001:1234::123 remote 2001:1234::121
ping6 2001:1234::121
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 17:03:03 +0000 (22:33 +0530)]
net: Mark functions as static in net/sunrpc/svc_xprt.c
Mark functions as static in net/sunrpc/svc_xprt.c because they are not
used outside this file.
This eliminates the following warning in net/sunrpc/svc_xprt.c:
net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 17:01:42 +0000 (22:31 +0530)]
net: Include appropriate header file in netfilter/nft_lookup.c
Include appropriate header file net/netfilter/nf_tables_core.h in
net/netfilter/nft_lookup.c because it has prototype declaration of
functions defined in net/netfilter/nft_lookup.c.
This eliminates the following warning in net/netfilter/nft_lookup.c:
net/netfilter/nft_lookup.c:133:12: warning: no previous prototype for ‘nft_lookup_module_init’ [-Wmissing-prototypes]
net/netfilter/nft_lookup.c:138:6: warning: no previous prototype for ‘nft_lookup_module_exit’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:59:14 +0000 (22:29 +0530)]
net: Move prototype declaration to header file include/net/net_namespace.h from net/ipx/af_ipx.c
Move prototype declaration of function to header file
include/net/net_namespace.h from net/ipx/af_ipx.c because they are used
by more than one file.
This eliminates the following warning in net/ipx/sysctl_net_ipx.c:
net/ipx/sysctl_net_ipx.c:33:6: warning: no previous prototype for ‘ipx_register_sysctl’ [-Wmissing-prototypes]
net/ipx/sysctl_net_ipx.c:38:6: warning: no previous prototype for ‘ipx_unregister_sysctl’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:57:41 +0000 (22:27 +0530)]
net: Move prototype declaration to header file include/net/datalink.h from net/ipx/af_ipx.c
Move prototype declarations of function to header file
include/net/datalink.h from net/ipx/af_ipx.c because they are used by
more than one file.
This eliminates the following warning in net/ipx/pe2.c:
net/ipx/pe2.c:20:24: warning: no previous prototype for ‘make_EII_client’ [-Wmissing-prototypes]
net/ipx/pe2.c:32:6: warning: no previous prototype for ‘destroy_EII_client’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:56:32 +0000 (22:26 +0530)]
net: Move prototype declaration to header file include/net/ipx.h from net/ipx/af_ipx.c
Move prototype declaration of functions to header file include/net/ipx.h
from net/ipx/af_ipx.c because they are used by more than one file.
This eliminates the following warning in
net/ipx/ipx_route.c:33:19: warning: no previous prototype for ‘ipxrtr_lookup’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:52:5: warning: no previous prototype for ‘ipxrtr_add_route’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:94:6: warning: no previous prototype for ‘ipxrtr_del_routes’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:149:5: warning: no previous prototype for ‘ipxrtr_route_skb’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:171:5: warning: no previous prototype for ‘ipxrtr_route_packet’ [-Wmissing-prototypes]
net/ipx/ipx_route.c:261:5: warning: no previous prototype for ‘ipxrtr_ioctl’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:54:33 +0000 (22:24 +0530)]
net: Move prototype declaration to include/net/ipx.h from net/ipx/ipx_route.c
Move prototype definition of function to header file include/net/ipx.h
from net/ipx/ipx_route.c because they are used by more than one file.
This eliminates the following warning from net/ipx/af_ipx.c:
net/ipx/af_ipx.c:193:23: warning: no previous prototype for ‘ipxitf_find_using_net’ [-Wmissing-prototypes]
net/ipx/af_ipx.c:577:5: warning: no previous prototype for ‘ipxitf_send’ [-Wmissing-prototypes]
net/ipx/af_ipx.c:1219:8: warning: no previous prototype for ‘ipx_cksum’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:52:53 +0000 (22:22 +0530)]
net: Move prototype declaration to header file include/net/dn.h from net/decnet/af_decnet.c
Move prototype declaration of functions to header file include/net/dn.h
from net/decnet/af_decnet.c because they are used by more than one file.
This eliminates the following warning in net/decnet/sysctl_net_decnet.c:
net/decnet/sysctl_net_decnet.c:354:6: warning: no previous prototype for ‘dn_register_sysctl’ [-Wmissing-prototypes]
net/decnet/sysctl_net_decnet.c:359:6: warning: no previous prototype for ‘dn_unregister_sysctl’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 16:50:09 +0000 (22:20 +0530)]
net: Move prototype declaration to appropriate header file from decnet/af_decnet.c
Move prototype declaration of functions to header file include/net/dn_route.h
from net/decnet/af_decnet.c because it is used by more than one file.
This eliminates the following warning in net/decnet/dn_route.c:
net/decnet/dn_route.c:629:5: warning: no previous prototype for ‘dn_route_rcv’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 14:56:25 +0000 (20:26 +0530)]
net: Mark functions as static in core/dev.c
Mark functions as static in core/dev.c because they are not used outside
this file.
This eliminates the following warning in core/dev.c:
net/core/dev.c:2806:5: warning: no previous prototype for ‘__dev_queue_xmit’ [-Wmissing-prototypes]
net/core/dev.c:4640:5: warning: no previous prototype for ‘netdev_adjacent_sysfs_add’ [-Wmissing-prototypes]
net/core/dev.c:4650:6: warning: no previous prototype for ‘netdev_adjacent_sysfs_del’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 14:32:16 +0000 (20:02 +0530)]
net: Include appropriate header file in caif/cfsrvl.c
Include appropriate header file net/caif/caif_dev.h in caif/cfsrvl.c
because it has prototype declaration of functions defined in
caif/cfsrvl.c.
This eliminates the following warning in caif/cfsrvl.c:
net/caif/cfsrvl.c:198:6: warning: no previous prototype for ‘caif_free_client’ [-Wmissing-prototypes]
net/caif/cfsrvl.c:208:6: warning: no previous prototype for ‘caif_client_register_refcnt’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 14:29:04 +0000 (19:59 +0530)]
net: Include appropriate header file in caif/caif_dev.c
Include appropriate header file net/caif/caif_dev.h in caif/caif_dev.c
because it has prototype declarations of function defined in
caif/caif_dev.c.
This eliminates the following warning in caif/caif_dev.c:
net/caif/caif_dev.c:303:6: warning: no previous prototype for ‘caif_enroll_dev’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rashika Kheria [Sun, 9 Feb 2014 14:27:33 +0000 (19:57 +0530)]
net: Mark function as static in 9p/client.c
Mark function as static in net/9p/client.c because it is not used
outside this file.
This eliminates the following warning in net/9p/client.c:
net/9p/client.c:207:18: warning: no previous prototype for ‘p9_fcall_alloc’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 9 Feb 2014 22:21:25 +0000 (14:21 -0800)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless
John W. Linville says:
====================
Please pull this batch of fixes intended for the 3.14 stream!
For the mac80211 bits, Johannes says:
"This is just a collection of small fixes, the commit logs explain the
details. The only thing that isn't strictly a fix is the 5/10 MHz
enabling, I had forgotten this and there's little point in waiting
longer. The patch simply removes the force-disable code that I put in
when there was a problem with the userspace API (that has long been
fixed.)"
For the iwlwifi bits, Emmanuel says:
"I have an important fix that disables A band in case the driver thought
it was enabled, and the firmware disagreed. We ended up making the
firmware unhappy. I also fix the station table in AP mode and fix the
scan while we have BT working.
Johannes removes a static variable that could potentially lead to
issues on multi-device setups and disables scheduled scan to avoid
issues with old versions of wpa_supplicant.
A small fix from David on scan and a few new device IDs for 7265."
On top of that...
Oleksij Rempel adds a USB ID to the ar5523 driver and changes the
default powersave setting for ath9k_htc to "off", due to observed
stability issues (based on an equivalent ath9k patch).
Stanislaw Gruszka similarly disables powersave for a couple of rt2x00
drivers. He also fixes a couple of scheduling while atomic issues
in ath9k_htc.
Sujith Manoharan rounds-out the powersave disables with one for ath9k.
He also fixes a build problem with ath9k on ARM and fixes an ath9k Tx
power calculation.
Finally, Andrea Merello fixes a couple of lingering DMA mapping
problems in the rtl8180 driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 9 Feb 2014 22:20:00 +0000 (14:20 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/nftables/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes, mostly nftables
fixes, most relevantly they are:
* Fix a crash in the h323 conntrack NAT helper due to expectation list
corruption, from Alexey Dobriyan.
* A couple of RCU race fixes for conntrack, one manifests by hitting BUG_ON
in nf_nat_setup_info() and the destroy path, patches from Andrey Vagin and
me.
* Dump direction attribute in nft_ct only if it is set, from Arturo
Borrero.
* Fix IPVS bug in its own connection tracking system that may lead to
copying only 4 bytes of the IPv6 address when initializing the
ip_vs_conn object, from Michal Kubecek.
* Fix -EBUSY errors in nftables when deleting the rules, chain and tables
in a row due to a mixture of asynchronous and synchronous object releasing,
from me.
* Three fixes for the nf_tables set infrastructure when using intervals and
mappings, from me.
* Four patches fixing the nf_tables log, reject and ct expressions from
the new inet table, from Patrick McHardy.
* Fix memory overrun in the map that is used to dynamically allocate names
from anonymous sets, also from Patrick.
* Fix a potential oops if you dump a set with NFPROTO_UNSPEC and a table
name, from Patrick McHardy.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Fri, 7 Feb 2014 18:44:14 +0000 (13:44 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Patrick McHardy [Thu, 9 Jan 2014 18:42:42 +0000 (18:42 +0000)]
netfilter: nf_tables: uninline nft_trace_packet()
It makes no sense to inline a rarely used function meant for debugging
only that is called a total of five times in the main evaluation loop.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 7 Feb 2014 13:45:01 +0000 (14:45 +0100)]
netfilter: nf_tables: fix loop checking with end interval elements
Fix access to uninitialized data for end interval elements. The
element data part is uninitialized in interval end elements.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 6 Feb 2014 15:15:39 +0000 (16:15 +0100)]
netfilter: nft_rbtree: fix data handling of end interval elements
This patch fixes several things related to the handling of
end interval elements:
* Chain use underflow with intervals and map: If you add a rule
using intervals+map that introduces a loop, the error path of the
rbtree set decrements the chain refcount for each side of the
interval, leading to a chain use counter underflow.
* Don't copy the data part of the end interval element since this
area is uninitialized and this confuses the loop detection code.
* Don't allocate room for the data part of end interval elements
since this is unused.
So, after this patch the idea is that end interval elements don't
have a data part.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Pablo Neira Ayuso [Fri, 7 Feb 2014 11:53:07 +0000 (12:53 +0100)]
netfilter: nf_tables: do not allow NFT_SET_ELEM_INTERVAL_END flag and data
This combination is not allowed since end interval elements cannot
contain data.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Eric Dumazet [Thu, 6 Feb 2014 23:57:10 +0000 (15:57 -0800)]
tcp: remove 1ms offset in srtt computation
TCP pacing depends on an accurate srtt estimation.
The current srtt estimation uses jiffy resolution,
and has an artificial offset of at least 1 ms, which can produce
slowdowns when FQ/pacing is used, especially in DC world,
where typical rtt is below 1 ms.
We are planning a switch to usec resolution for linux-3.15,
but in the meantime, this patch removes the 1 ms offset.
All we need is to have tp->srtt minimal value of 1 to differentiate
the case of srtt being initialized or not, not 8.
The problematic behavior was observed on a 40Gbit testbed,
where 32 concurrent netperf were reaching 12Gbps of aggregate
speed, instead of line speed.
This patch also has the effect of reporting more accurate srtt and send
rates to iproute2 ss command as in:
$ ss -i dst cca2
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
tcp ESTAB 0 0 10.244.129.1:56984 10.244.129.2:12865
cubic wscale:6,6 rto:200 rtt:0.25/0.25 ato:40 mss:1448 cwnd:10 send 463.4Mbps rcv_rtt:1 rcv_space:29200
tcp ESTAB 0 390960 10.244.129.1:60247 10.244.129.2:50204
cubic wscale:6,6 rto:200 rtt:0.875/0.75 mss:1448 cwnd:73 ssthresh:51 send 966.4Mbps unacked:73 retrans:0/121 rcv_space:29200
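The effect of the offset can be illustrated with a simplified model of a smoothed rtt kept in units of 1/8 jiffy. This is a sketch under assumptions, not the kernel function: update_srtt() and run() are hypothetical names, the variance tracking is omitted, and "floor_sample" models forcing every sample to at least one jiffy.

```python
def update_srtt(srtt8, sample, floor_sample):
    """One update step of srtt stored scaled by 8 (gain of 1/8 per sample).

    floor_sample=True  models the old behavior: every sample is at least 1 jiffy.
    floor_sample=False models the new behavior: only the stored value is kept >= 1
    so that 0 can still mean "not initialized".
    """
    m = max(sample, 1) if floor_sample else sample
    if srtt8 != 0:
        m -= srtt8 >> 3   # error against the current estimate
        srtt8 += m        # srtt = 7/8 srtt + 1/8 sample
    else:
        srtt8 = m << 3    # first measurement
    return max(srtt8, 1)


def run(samples, floor_sample):
    srtt8 = 0
    for s in samples:
        srtt8 = update_srtt(srtt8, s, floor_sample)
    return srtt8


old_value = run([0] * 5, floor_sample=True)
new_value = run([0] * 5, floor_sample=False)
```

For a path whose rtt always measures 0 jiffies, the old model settles at 8 (one full jiffy), while the new one settles at 1, the smallest value that still marks srtt as initialized.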
Reported-by: Vytautas Valancius <valas@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Thu, 6 Feb 2014 23:00:52 +0000 (15:00 -0800)]
bridge: fix netconsole setup over bridge
Commit 93d8bf9fb8f3 ("bridge: cleanup netpoll code") introduced
a check in br_netpoll_enable(), but this check is incorrect for
br_netpoll_setup(). This patch moves the code after the check
into __br_netpoll_enable() and calls it in br_netpoll_setup().
For br_add_if(), the check is still needed.
Fixes: 93d8bf9fb8f3 ("bridge: cleanup netpoll code")
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Tested-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 6 Feb 2014 18:42:42 +0000 (10:42 -0800)]
net: use __GFP_NORETRY for high order allocations
sock_alloc_send_pskb() & sk_page_frag_refill()
have a loop trying high order allocations to prepare
skb with low number of fragments as this increases performance.
The problem is that under memory pressure/fragmentation, this can
trigger OOM while the intent was only to try the high order
allocations, then fall back to order-0 allocations.
We had various reports of unexpected regressions.
According to David, setting __GFP_NORETRY should be fine,
as the asynchronous compaction is still enabled, and this
will prevent OOM from kicking in, as in:
CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
CFSClientEventm
Call Trace:
[<ffffffff8043766c>] dump_header+0xe1/0x23e
[<ffffffff80437a02>] oom_kill_process+0x6a/0x323
[<ffffffff80438443>] out_of_memory+0x4b3/0x50d
[<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
[<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
[<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
[<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
[<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
[<ffffffff802a5037>] inet_sendmsg+0x67/0x100
[<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
[<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
[<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
[<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
[<ffffffff80711472>] system_call_fastpath+0x16/0x1b
Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child
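The intended allocation pattern can be modeled roughly as below. This is a sketch, not the kernel code: alloc_frag() and the try_alloc callback are hypothetical, and the may_oom argument is only a stand-in for "the request was made without __GFP_NORETRY".

```python
def alloc_frag(try_alloc, max_order=3):
    """Try high orders opportunistically, then fall back to order-0.

    try_alloc(order, may_oom) -> bool is supplied by the caller and models
    a page allocator; it returns True when the allocation succeeds.
    Returns the order that was allocated, or None on failure.
    """
    order = max_order
    while order > 0:
        # High-order attempts are best effort only: they must fail quickly
        # instead of invoking the OOM killer.
        if try_alloc(order, may_oom=False):
            return order
        order -= 1
    # Order-0 is the real request and may apply full reclaim pressure.
    return 0 if try_alloc(0, may_oom=True) else None
```

A caller that only ever gets order-0 pages back still succeeds, and in this model none of the failed high-order attempts is allowed to trigger OOM.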
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Thu, 6 Feb 2014 17:34:12 +0000 (18:34 +0100)]
netpoll: fix netconsole IPv6 setup
Currently, to make netconsole start over IPv6, the source address
needs to be specified. Without a source address, netpoll_parse_options
assumes we're setting up over IPv4 and the destination IPv6 address is
rejected.
Check if the IP version has been forced by a source address before
checking for a version mismatch when parsing the destination address.
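The corrected check can be sketched with Python's standard ipaddress module. This is a model of the logic only, not the netpoll parser: parse_addrs() is a hypothetical helper, and the real option string format is not reproduced here.

```python
import ipaddress


def parse_addrs(local, remote):
    """Return the IP version (4 or 6) implied by the given addresses.

    local may be None or empty when no source address was specified. The
    version is only "forced" when a source address is present; otherwise
    the destination address alone decides it.
    """
    forced = None
    if local:
        forced = ipaddress.ip_address(local).version
    remote_version = ipaddress.ip_address(remote).version
    # Report a mismatch only if a source address actually forced a version.
    if forced is not None and forced != remote_version:
        raise ValueError("source and destination IP versions differ")
    return remote_version
```

With this logic parse_addrs(None, "fe80::1") is accepted as IPv6, whereas assuming IPv4 whenever the source is missing would reject it.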
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Thu, 6 Feb 2014 16:45:12 +0000 (11:45 -0500)]
drivers/net: fix build warning in ethernet/sfc/tx.c
Commit ee45fd92c739db5b7950163d91dfe5f016af6d24 ("sfc: Use TX PIO
for sufficiently small packets") introduced the following warning:
drivers/net/ethernet/sfc/tx.c: In function 'efx_enqueue_skb':
drivers/net/ethernet/sfc/tx.c:432:1: warning: label 'finish_packet' defined but not used
Put the label inside the same #ifdef as the code that jumps to
it. Note that the warning is only seen on architectures that do not set
ARCH_HAS_IOREMAP_WC, such as arm, mips, sparc, ..., as the others
enable the write combining code and hence use the label.
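The pattern can be illustrated with a small stand-alone function. TOY_HAS_WC is a hypothetical switch standing in for ARCH_HAS_IOREMAP_WC, and toy_enqueue() is not the sfc code; the point is only that the label and the goto share one condition, so neither exists without the other.

```c
/* Hypothetical feature switch; leave undefined to model arm, mips, etc. */
/* #define TOY_HAS_WC 1 */

/* Returns 1 if the fast path was taken, 0 for the normal path. */
static int toy_enqueue(int len)
{
    int path = 0;

    (void)len;  /* unused when the fast path is compiled out */

#ifdef TOY_HAS_WC
    if (len <= 256) {
        path = 1;
        goto finish_packet;
    }
#endif
    /* normal-path work would go here */

#ifdef TOY_HAS_WC
finish_packet:  /* only defined when the goto above is compiled in */
#endif
    return path;
}
```

With the second #ifdef removed, a build without TOY_HAS_WC would define the label without any goto, which is what triggers the "defined but not used" warning.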
Cc: Jon Cooper <jcooper@solarflare.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 6 Feb 2014 12:53:19 +0000 (15:53 +0300)]
hso: remove some dead code
It seems like this function was intended to have special handling for
urb statuses of -ENOENT and -ECONNRESET. But now it just prints some
debugging and returns at the start of the function.
I have removed the dead code, it's still in the git history if anyone
wants to revive it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Moskyto Matejka [Thu, 6 Feb 2014 11:10:00 +0000 (12:10 +0100)]
inet: defines IPPROTO_* needed for module alias generation
Commit cfd280c91253 ("net: sync some IP headers with glibc") changed a set of
defines to an enum (with no explanation why), which introduced a bug
in module mip6, where aliases are generated using the IPPROTO_* defines;
mip6 doesn't load when request_module() is called with the aliases from
xfrm_get_type().
Revert this change back to defines to fix the aliases.
modinfo mip6 (before this change)
alias: xfrm-type-10-IPPROTO_DSTOPTS
alias: xfrm-type-10-IPPROTO_ROUTING
modinfo mip6 (after this change)
alias: xfrm-type-10-43
alias: xfrm-type-10-60
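The modinfo output is what one would expect if the alias string is built by preprocessor stringification, which is assumed here. The sketch below uses hypothetical TOY_* names rather than the real headers: a macro is expanded to its number before being stringified, while an enumerator is invisible to the preprocessor and is stringified by name.

```c
#include <string.h>

#define TOY_STR_1(x) #x
#define TOY_STR(x)   TOY_STR_1(x)   /* expand x first, then stringify */

enum { TOY_ENUM_DSTOPTS = 60 };     /* enumerator: the preprocessor never sees 60 */
#define TOY_DEF_DSTOPTS 60          /* macro: expanded before stringification */

static const char *toy_alias_from_enum(void)
{
    return "xfrm-type-10-" TOY_STR(TOY_ENUM_DSTOPTS);
}

static const char *toy_alias_from_define(void)
{
    return "xfrm-type-10-" TOY_STR(TOY_DEF_DSTOPTS);
}
```

The enum variant yields a string ending in the identifier itself, which can never match a numeric module request; the define variant yields "xfrm-type-10-60".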
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 6 Feb 2014 08:03:08 +0000 (11:03 +0300)]
isdn/hisax: hex vs decimal typo in prfeatureind()
This is a static checker fix, but judging from the context I think
hexadecimal 0x80 is intended here instead of decimal 80.
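To see why the two constants differ in practice, assume the value is used as a bit mask (the commit text does not show the code, so this is only an illustration with hypothetical helper names). Decimal 80 is 0x50, which tests bits 6 and 4 rather than the top bit.

```c
/* Tests the top bit of a byte, as 0x80 (binary 1000 0000) does. */
static int toy_top_bit_hex(unsigned char b)
{
    return (b & 0x80) != 0;
}

/* Same expression with decimal 80, i.e. 0x50 (binary 0101 0000). */
static int toy_top_bit_dec(unsigned char b)
{
    return (b & 80) != 0;
}
```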
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matija Glavinic Pecotic [Thu, 6 Feb 2014 07:30:10 +0000 (08:30 +0100)]
net: sctp: fix initialization of local source address on accepted ipv6 sockets
Commit efe4208f47f907b86f528788da711e8ab9dea44d ("ipv6: make lookups
simpler and faster") broke initialization of the local source
address on accepted ipv6 sockets. Before that commit, the receive address
was copied along with the contents of ipv6_pinfo in sctp_v6_create_accept_sk.
Now that it has been moved, it has to be copied separately.
This also fixes lksctp's ipv6 regression in the sense that test_getname_v6, TC5 -
'getsockname on a connected server socket' now passes.
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Thu, 6 Feb 2014 03:55:48 +0000 (11:55 +0800)]
r8152: fix the submission of the interrupt transfer
The interrupt transfer should be submitted after setting
the WORK_ENABLE bit; otherwise the callback function may
return immediately.
Clear the WORK_ENABLE bit before killing the interrupt transfer.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 5 Feb 2014 14:07:12 +0000 (16:07 +0200)]
bnx2x: Allow VF rss on higher PFs
The bnx2x driver uses an incorrect PF identifier to configure (in HW) the VF
interrupt scheme. As a result, in multi-function mode the configuration
for PFs with a high index (4+) will overflow and the PF will erroneously
configure a single ISR scheme for its VFs.
Consequently, if such a VF uses multiple queues, interrupt generation will
stop after the VF receives an Rx packet or sends a Tx packet on a queue
other than queue[0].
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nithin Sujir [Thu, 6 Feb 2014 22:13:05 +0000 (14:13 -0800)]
tg3: Fix deadlock in tg3_change_mtu()
Quoting David Vrabel -
"5780 cards cannot have jumbo frames and TSO enabled together. When
jumbo frames are enabled by setting the MTU, the TSO feature must be
cleared. This is done indirectly by calling netdev_update_features()
which will call tg3_fix_features() to actually clear the flags.
netdev_update_features() will also trigger a new netlink message for the
feature change event which will result in a call to tg3_get_stats64()
which deadlocks on the tg3 lock."
tg3_set_mtu() does not need to be under the tg3 lock since converting
the flags to use set_bit(). Move it out to after tg3_netif_stop().
Reported-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 5 Feb 2014 13:29:21 +0000 (16:29 +0300)]
tg3: cleanup an error path in tg3_phy_reset_5703_4_5()
In the original code, if tg3_readphy() fails then it does an unnecessary
check to verify "err" is still zero and then returns -EBUSY.
My static checker complains about the unnecessary "if (!err)" check and
anyway it is better to propagate the -EBUSY error code from
tg3_readphy() instead of hard coding it here. And really the original
code is confusing to look at.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Wed, 5 Feb 2014 07:38:25 +0000 (08:38 +0100)]
ipv4: Fix runtime WARNING in rtmsg_ifa()
On m68k/ARAnyM:
WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99()
Modules linked in:
CPU: 0 PID: 407 Comm: ifconfig Not tainted 3.13.0-atari-09263-g0c71d68014d1 #1378
Stack from 10c4fdf0:
10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f
00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080
c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001
00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002
008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90
eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020
Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c
[<00024416>] warn_slowpath_null+0x14/0x1a
[<002501b4>] rtmsg_ifa+0xdc/0xf0
[<00250a5a>] __inet_insert_ifa+0xd6/0x1c2
[<0024f9b4>] inet_abc_len+0x0/0x42
[<00250b52>] inet_insert_ifa+0xc/0x12
[<00252390>] devinet_ioctl+0x2ae/0x5d6
Adding some debugging code reveals that inet_fill_ifaddr() fails in
put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp,
preferred, valid))
nla_put complains:
lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20
Apparently commit 5c766d642bcaffd0c2a5b354db2068515b3846cf ("ipv4:
introduce address lifetime") forgot to take into account the addition of
struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, as is already
done for ipv6.
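The figure of 20 bytes in the nla_put complaint can be reproduced with a small calculation. This is a user-space sketch with hypothetical TOY_* names, assuming the usual netlink attribute layout (a 4-byte header, 4-byte alignment) and a cacheinfo payload of four 32-bit fields.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NLA_ALIGNTO  4
#define TOY_NLA_ALIGN(n) (((n) + TOY_NLA_ALIGNTO - 1) & ~(size_t)(TOY_NLA_ALIGNTO - 1))
#define TOY_NLA_HDRLEN   4   /* 2-byte length + 2-byte type */

/* Stand-in for the cacheinfo payload: four 32-bit fields, 16 bytes. */
struct toy_cacheinfo {
    uint32_t prefered;
    uint32_t valid;
    uint32_t cstamp;
    uint32_t tstamp;
};

/* Space one attribute occupies in the message: header + payload, aligned. */
static size_t toy_nla_total_size(size_t payload)
{
    return TOY_NLA_ALIGN(TOY_NLA_HDRLEN + payload);
}
```

Under these assumptions the attribute needs 4 + 16 = 20 bytes, more than the 12 bytes of tailroom reported, so a message sized without this attribute cannot hold it.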
Suggested-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
James M Leddy [Tue, 4 Feb 2014 20:10:59 +0000 (15:10 -0500)]
bnx2[x]: Make module parameters readable
Occasionally users want to know what parameters their Broadcom drivers
are running with. For example, a user may want to know if MSI is
disabled.
This patch has been compile tested.
Signed-off-by: James M Leddy <james.leddy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Shiyan [Tue, 4 Feb 2014 15:43:32 +0000 (19:43 +0400)]
net: irda: ep7211-sir: Remove driver
This patch removes the old and unsupported CLPS711X IrDA driver.
IrDA support for the CLPS711X serial port is now provided by commit
4a33f1f59abd (serial: clps711x: Add support for N_IRDA line
discipline), so IrDA mode can be turned on with the "irattach" tool
through the "irtty" driver.
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Ripard [Sun, 2 Feb 2014 13:49:13 +0000 (14:49 +0100)]
ARM: sunxi: dt: Convert to the new net compatibles
Switch the device tree to the new compatibles introduced in the ethernet and
mdio drivers to have a common pattern across all Allwinner SoCs.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Ripard [Sun, 2 Feb 2014 13:49:12 +0000 (14:49 +0100)]
net: phy: sunxi: Add new compatibles
The Allwinner A10 compatibles were following a slightly different compatible
pattern than the rest of the SoCs for historical reasons. Add compatibles
matching the other pattern to the mdio driver for consistency, and keep the
older one for backward compatibility.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Ripard [Sun, 2 Feb 2014 13:49:11 +0000 (14:49 +0100)]
net: ethernet: sunxi: Add new compatibles
The Allwinner A10 compatibles were following a slightly different compatible
pattern than the rest of the SoCs for historical reasons. Add compatibles
matching the other pattern to the ethernet driver for consistency, and keep the
older one for backward compatibility.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
andrea.merello [Wed, 5 Feb 2014 21:38:06 +0000 (22:38 +0100)]
rtl8180: Add error check for pci_map_single return value in TX path
The original code does not detect a DMA mapping failure, causing the HW
to attempt a DMA from an invalid address.
This patch adds the error check and simply drops the TX
packet if we can't map it for DMA.
Signed-off-by: andrea merello <andrea.merello@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
andrea.merello [Wed, 5 Feb 2014 21:38:05 +0000 (22:38 +0100)]
rtl8180: Add error check for pci_map_single return value in RX path
In the original code, the old RX DMA buffer is unmapped and processed, and at the end
of the ISR a new buffer is mapped with pci_map_single and attached to the RX
descriptor.
If pci_map_single fails, the RX descriptor is left with no valid DMA buffer
attached.
In this condition the DMA will target memory it shouldn't, with obvious evil
consequences.
Simply not re-arming the descriptor would prevent the buggy DMA, but it would
soon result in an RX stall.
This patch moves the DMA mapping of the new buffer to the beginning of the ISR
(and adds an error check for pci_map_single success/failure).
If the DMA mapping fails, we do not unmap the old buffer and we re-arm the
descriptor without processing it, with the old DMA buffer still attached.
This way we lose the packet currently being received, but as soon as later calls to
pci_map_single succeed again, RX processing continues without stalling.
Signed-off-by: andrea merello <andrea.merello@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
John W. Linville [Thu, 6 Feb 2014 19:34:31 +0000 (14:34 -0500)]
Merge branch 'for-john' of git://git./linux/kernel/git/jberg/mac80211
Pablo Neira Ayuso [Sat, 25 Jan 2014 13:03:51 +0000 (14:03 +0100)]
netfilter: nf_tables: fix racy rule deletion
We may lose a race if we flush the rule-set (which happens asynchronously
via call_rcu) and then try to remove the table (that userspace assumes
to be empty).
Fix this by restoring synchronous rule and chain deletion. Asynchronous deletion was
introduced some time ago, before we had batch support, when synchronous
rule deletion performance was not good. Now that we have batch
support, we can just postpone the purge of old rules to a second step
in the commit phase. All object deletions are synchronous after this
patch.
As a side effect, we save memory as we don't need rcu_head per rule
anymore.
Cc: Patrick McHardy <kaber@trash.net>
Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Thu, 6 Feb 2014 09:17:41 +0000 (09:17 +0000)]
netfilter: nf_tables: fix log/queue expressions for NFPROTO_INET
The log and queue expressions both store the family during ->init() and
use it to deliver packets. This is wrong when used in NFPROTO_INET since
they should both deliver to the actual AF of the packet, not the dummy
NFPROTO_INET.
Use the family from the hook ops to fix this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Johannes Berg [Wed, 29 Jan 2014 12:28:02 +0000 (13:28 +0100)]
mac80211: fix virtual monitor interface iteration
During channel context assignment, the interface should
be found by interface iteration, so we need to assign the
pointer before the channel context.
Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Tested-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 31 Jan 2014 23:16:23 +0000 (00:16 +0100)]
mac80211: fix fragmentation code, particularly for encryption
The "new" fragmentation code (since my rewrite almost 5 years ago)
erroneously sets skb->len rather than using skb_trim() to adjust
the length of the first fragment after copying out all the others.
This leaves the skb tail pointer pointing to after where the data
originally ended, and thus causes the encryption MIC to be written
at that point, rather than where it belongs: immediately after the
data.
The impact of this is that if software encryption is done, then
a) encryption doesn't work for the first fragment, the connection
becomes unusable as the first fragment will never be properly
verified at the receiver, the MIC is practically guaranteed to
be wrong
b) we leak up to 8 bytes of plaintext (!) of the packet out into
the air
This is only mitigated by the fact that many devices are capable
of doing encryption in hardware, in which case this can't happen
as the tail pointer is irrelevant in that case. Additionally,
fragmentation is not used very frequently and would normally have
to be configured manually.
Fix this by using skb_trim() properly.
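The difference between assigning the length and trimming can be modelled with a two-field toy buffer. struct toy_skb and its helpers are hypothetical simplifications, not the kernel's sk_buff; the only property modelled is that appended data (such as a MIC) lands at the tail offset, which a bare length assignment leaves unchanged.

```c
#include <stdbool.h>

struct toy_skb {
    unsigned int len;   /* bytes of valid data */
    unsigned int tail;  /* offset at which appended data is written */
};

/* Shrinks the buffer and moves the tail with it. */
static void toy_trim(struct toy_skb *skb, unsigned int len)
{
    if (len < skb->len) {
        skb->len = len;
        skb->tail = len;
    }
}

/* Offset at which a trailer would be written after shortening the frame. */
static unsigned int toy_mic_offset(bool use_trim, unsigned int orig_len,
                                   unsigned int frag_len)
{
    struct toy_skb skb = { orig_len, orig_len };

    if (use_trim)
        toy_trim(&skb, frag_len);
    else
        skb.len = frag_len;   /* the buggy variant: tail is left behind */

    return skb.tail;
}
```

In this model, shortening a 1500-byte frame to 500 bytes by assignment leaves the trailer offset at 1500, while trimming moves it to 500, immediately after the data.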
Cc: stable@vger.kernel.org
Fixes: 2de8e0d999b8 ("mac80211: rewrite fragmentation")
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Sujith Manoharan [Thu, 30 Jan 2014 08:47:28 +0000 (14:17 +0530)]
mac80211: Fix IBSS disconnect
Currently, when a station leaves an IBSS network, the
corresponding BSS is not dropped from cfg80211 if there are
other active stations in the network. But, the small
window that is present when trying to determine a station's
status based on IEEE80211_IBSS_MERGE_INTERVAL introduces
a race.
Instead of trying to keep the BSS, always remove it when
leaving an IBSS network. There is not much benefit to retain
the BSS entry since it will be added with a subsequent join
operation.
This fixes an issue where a dangling BSS entry causes ath9k
to wait for a beacon indefinitely.
Cc: <stable@vger.kernel.org>
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Mon, 27 Jan 2014 09:07:42 +0000 (11:07 +0200)]
mac80211: release the channel in error path in start_ap
When the driver cannot start the AP or when the assignment
of the beacon goes wrong, we need to unassign the vif.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 22 Jan 2014 09:14:19 +0000 (11:14 +0200)]
cfg80211: send scan results from work queue
Due to the previous commit, when a scan finishes, it is in theory
possible to hit the following sequence:
1. interface starts being removed
2. scan is cancelled by driver and cfg80211 is notified
3. scan done work is scheduled
4. interface is removed completely, rdev->scan_req is freed,
event sent to userspace but scan done work remains pending
5. new scan is requested on another virtual interface
6. scan done work runs, freeing the still-running scan
To fix this situation, hang on to the scan done message and block
new scans while that is the case, and only send the message from
the work function, regardless of whether the scan_req is already
freed from interface removal. This makes step 5 above impossible
and changes step 6 to be
5. scan done work runs, sending the scan done message
This can't work for wext, so there we send the message immediately,
but this shouldn't be an issue since we still return -EBUSY.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 22 Jan 2014 09:14:18 +0000 (11:14 +0200)]
cfg80211: fix scan done race
When an interface/wdev is removed, any ongoing scan should be
cancelled by the driver. This will make it call cfg80211, which
only queues a work struct. If interface/wdev removal is quick
enough, this can leave the scan request pending and processed
only after the interface is gone, causing a use-after-free.
Fix this by making sure the scan request is not pending after
the interface is destroyed. We can't flush or cancel the work
item due to locking concerns, but when it'll run it shouldn't
find anything to do. This leaves a potential issue, if a new
scan gets requested before the work runs, it prematurely stops
the running scan, potentially causing another crash. I'll fix
that in the next patch.
This was particularly observed with P2P_DEVICE wdevs, likely
because freeing them is quicker than freeing netdevs.
Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Fixes: 4a58e7c38443 ("cfg80211: don't "leak" uncompleted scans")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Thu, 23 Jan 2014 12:28:16 +0000 (14:28 +0200)]
mac80211: avoid deadlock revealed by lockdep
sdata->u.ap.request_smps_work can’t be flushed synchronously
under wdev_lock(wdev) since ieee80211_request_smps_ap_work
itself locks the same lock.
While at it, reset the driver_smps_mode when the ap is
stopped to its default: OFF.
This solves:
======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0-ipeer+ #2 Tainted: G O
-------------------------------------------------------
rmmod/2867 is trying to acquire lock:
((&sdata->u.ap.request_smps_work)){+.+...}, at: [<c105b8d0>] flush_work+0x0/0x90
but task is already holding lock:
(&wdev->mtx){+.+.+.}, at: [<f9b32626>] cfg80211_stop_ap+0x26/0x230 [cfg80211]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&wdev->mtx){+.+.+.}:
[<c10aefa9>] lock_acquire+0x79/0xe0
[<c1607a1a>] mutex_lock_nested+0x4a/0x360
[<fb06288b>] ieee80211_request_smps_ap_work+0x2b/0x50 [mac80211]
[<c105cdd8>] process_one_work+0x198/0x450
[<c105d469>] worker_thread+0xf9/0x320
[<c10669ff>] kthread+0x9f/0xb0
[<c1613397>] ret_from_kernel_thread+0x1b/0x28
-> #0 ((&sdata->u.ap.request_smps_work)){+.+...}:
[<c10ae9df>] __lock_acquire+0x183f/0x1910
[<c10aefa9>] lock_acquire+0x79/0xe0
[<c105b917>] flush_work+0x47/0x90
[<c105d867>] __cancel_work_timer+0x67/0xe0
[<c105d90f>] cancel_work_sync+0xf/0x20
[<fb0765cc>] ieee80211_stop_ap+0x8c/0x340 [mac80211]
[<f9b3268c>] cfg80211_stop_ap+0x8c/0x230 [cfg80211]
[<f9b0d8f9>] cfg80211_leave+0x79/0x100 [cfg80211]
[<f9b0da72>] cfg80211_netdev_notifier_call+0xf2/0x4f0 [cfg80211]
[<c160f2c9>] notifier_call_chain+0x59/0x130
[<c106c6de>] __raw_notifier_call_chain+0x1e/0x30
[<c106c70f>] raw_notifier_call_chain+0x1f/0x30
[<c14f8213>] call_netdevice_notifiers_info+0x33/0x70
[<c14f8263>] call_netdevice_notifiers+0x13/0x20
[<c14f82a4>] __dev_close_many+0x34/0xb0
[<c14f83fe>] dev_close_many+0x6e/0xc0
[<c14f9c77>] rollback_registered_many+0xa7/0x1f0
[<c14f9dd4>] unregister_netdevice_many+0x14/0x60
[<fb06f4d9>] ieee80211_remove_interfaces+0xe9/0x170 [mac80211]
[<fb055116>] ieee80211_unregister_hw+0x56/0x110 [mac80211]
[<fa3e9396>] iwl_op_mode_mvm_stop+0x26/0xe0 [iwlmvm]
[<f9b9d8ca>] _iwl_op_mode_stop+0x3a/0x70 [iwlwifi]
[<f9b9d96f>] iwl_opmode_deregister+0x6f/0x90 [iwlwifi]
[<fa405179>] __exit_compat+0xd/0x19 [iwlmvm]
[<c10b8bf9>] SyS_delete_module+0x179/0x2b0
[<c1613421>] sysenter_do_call+0x12/0x32
Fixes: 687da132234f ("mac80211: implement SMPS for AP")
Cc: <stable@vger.kernel.org> [3.13]
Reported-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 23 Jan 2014 15:32:29 +0000 (16:32 +0100)]
cfg80211: re-enable 5/10 MHz support
Unfortunately I forgot this during the merge window, but the
patch seems small enough to go in as a fix. The userspace API
bug that was the reason for disabling it has long been fixed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pontus Fuchs [Thu, 16 Jan 2014 14:00:40 +0000 (15:00 +0100)]
nl80211: Reset split_start when netlink skb is exhausted
When the netlink skb is exhausted split_start is left set. In the
subsequent retry, with a larger buffer, the dump is continued from the
failing point instead of from the beginning.
This was causing my rt28xx based USB dongle to not show up when
running "iw list" with an old iw version without split dump support.
Cc: stable@vger.kernel.org
Fixes: 3713b4e364ef ("nl80211: allow splitting wiphy information in dumps")
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
[avoid the entire workaround when state->split is set]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eliad Peller [Sun, 12 Jan 2014 09:06:37 +0000 (11:06 +0200)]
mac80211: move roc cookie assignment earlier
ieee80211_start_roc_work() might add a new roc
to existing roc, and tell cfg80211 it has already
started.
However, this might happen before the roc cookie
was set, resulting in REMAIN_ON_CHANNEL (started)
event with null cookie. Consequently, it can make
wpa_supplicant go out of sync.
Fix it by setting the roc cookie earlier.
Cc: stable@vger.kernel.org
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Patrick McHardy [Wed, 5 Feb 2014 15:03:39 +0000 (15:03 +0000)]
netfilter: nf_tables: add reject module for NFPROTO_INET
Add a reject module for NFPROTO_INET. It does nothing but dispatch
to the AF-specific modules based on the hook family.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Wed, 5 Feb 2014 15:03:38 +0000 (15:03 +0000)]
netfilter: nft_reject: split up reject module into IPv4 and IPv6 specifc parts
Currently the nft_reject module depends on symbols from ipv6. This is
wrong since no generic module should force IPv6 support to be loaded.
Split up the module into AF-specific and a generic part.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Thu, 6 Feb 2014 00:25:53 +0000 (16:25 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/jesse/openvswitch
Jesse Gross says:
====================
Open vSwitch
A handful of bug fixes for net/3.14. High level fixes are:
* Regressions introduced by the zerocopy changes, particularly with
old userspaces.
* A few bugs lingering from the introduction of megaflows.
* Overly zealous error checking that is now being triggered frequently
in common cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss [Tue, 4 Feb 2014 19:54:37 +0000 (19:54 +0000)]
xen-netback: Fix Rx stall due to race condition
The recent patch to fix receive side flow control
(11b57f90257c1d6a91cee720151b69e0c2020cf6: xen-netback: stop vif thread
spinning if frontend is unresponsive) solved the spinning thread problem,
but caused another one. The receive side can stall if:
- [THREAD] xenvif_rx_action sets rx_queue_stopped to true
- [INTERRUPT] interrupt happens, and sets rx_event to true
- [THREAD] then xenvif_kthread sets rx_event to false
- [THREAD] rx_work_todo doesn't return true anymore
Also, if an interrupt is sent but there is still no room in the ring, it takes quite a
long time until xenvif_rx_action realizes it. This patch ditches those two variables
and reworks rx_work_todo. If the thread finds it can't fit more skbs into the
ring, it saves the last slot estimate in rx_last_skb_slots; otherwise it's
kept as 0. Then rx_work_todo will check if:
- there is something to send to the ring (like before)
- there is space for the topmost packet in the queue
I think that's a more natural and optimal thing to test than two bools which are
set somewhere else.
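The reworked check can be sketched as a pure function of the queue and ring state. This is a simplified model with hypothetical parameter names, not the xen-netback implementation; it encodes only the two conditions listed above, with a saved slot estimate of 0 meaning "no failed attempt recorded".

```c
#include <stdbool.h>

/*
 * queued_skbs:        packets waiting to be placed on the ring
 * rx_last_skb_slots:  slots the topmost packet needed when it last failed
 *                     to fit, or 0 if nothing has failed
 * free_slots:         slots currently available in the ring
 */
static bool toy_rx_work_todo(unsigned int queued_skbs,
                             unsigned int rx_last_skb_slots,
                             unsigned int free_slots)
{
    return queued_skbs > 0 && free_slots >= rx_last_skb_slots;
}
```

Because the result depends only on current state, there is no flag that an interrupt and the thread can set and clear in the wrong order.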
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 5 Feb 2014 15:03:37 +0000 (15:03 +0000)]
netfilter: nf_tables: add AF specific expression support
For the reject module, we need to add AF-specific implementations to
get rid of incorrect module dependencies. Try to load an AF-specific
module first and fall back to generic modules.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Wed, 5 Feb 2014 15:03:36 +0000 (15:03 +0000)]
netfilter: nft_ct: fix missing NFT_CT_L3PROTOCOL key in validity checks
The key was missing in the list of valid keys, add it.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Wed, 5 Feb 2014 15:03:35 +0000 (15:03 +0000)]
netfilter: nf_tables: fix potential oops when dumping sets
Commit c9c8e48597 (netfilter: nf_tables: dump sets in all existing families)
changed nft_ctx_init_from_setattr() to only look up the address family if it
is not NFPROTO_UNSPEC. However if it is NFPROTO_UNSPEC and a table attribute
is given, nftables_afinfo_lookup() will dereference the NULL afi pointer.
Fix by checking for non-NULL afi and also move a check added by that commit
to the proper position.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Wed, 5 Feb 2014 11:26:22 +0000 (12:26 +0100)]
netfilter: nf_tables: fix overrun in nf_tables_set_alloc_name()
The map that is used to allocate anonymous sets is indeed
BITS_PER_BYTE * PAGE_SIZE long.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 3 Feb 2014 19:01:53 +0000 (20:01 +0100)]
netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt
With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the lists, so we fulfill
Eric's golden rule, which is that all released objects always have a
refcount that equals zero.
Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().
A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.
CPU1 CPU2
____nf_conntrack_find()
nf_ct_put()
destroy_conntrack()
...
init_conntrack
__nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
if (!l4proto->new(ct, skb, dataoff, timeouts))
nf_conntrack_free(ct); (use = 2 !!!)
...
__nf_conntrack_alloc (set use = 1)
if (!nf_ct_key_equal(h, tuple, zone))
nf_ct_put(ct); (use = 0)
destroy_conntrack()
/* continue to work with CT */
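Why a zero initial refcount closes this window can be shown with a user-space analogue of the "increment unless zero" lookup step, written with C11 atomics. toy_inc_not_zero() and toy_lookup_result() are hypothetical helpers for illustration, not the kernel's atomic_inc_not_zero() or conntrack code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Takes a reference only if the object is currently referenced. */
static bool toy_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Models a lockless reader meeting an object whose use count is
 * initial_use. Returns the new count, or -1 if no reference was taken.
 */
static int toy_lookup_result(int initial_use)
{
    atomic_int use;

    atomic_init(&use, initial_use);
    return toy_inc_not_zero(&use) ? atomic_load(&use) : -1;
}
```

An object that starts at 1 can be grabbed by a concurrent reader (count becomes 2) before it is fully set up; one that starts at 0 cannot be grabbed until it has been added to a list and its count bumped.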
After applying the patch "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():
<4>[67096.759334] ------------[ cut here ]------------
<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G C --------------- 2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>] [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255] Call Trace:
<4>[67096.760255] [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255] [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255] [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255] [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255] [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255] [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255] [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255] [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255] [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255] [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255] [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255] [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255] [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255] [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255] [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255] [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255] [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255] [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255] [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Vagin <avagin@parallels.com>
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Alexey Dobriyan [Mon, 3 Feb 2014 12:07:24 +0000 (13:07 +0100)]
netfilter: nf_nat_h323: fix crash in nf_ct_unlink_expect_report()
Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix
incorrect handling of EBUSY for RTCP expectation").
BUG: unable to handle kernel paging request at 00100104
IP: [<f8214f07>] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack]
...
Call Trace:
[<c0244bd8>] ? del_timer+0x48/0x70
[<f8215687>] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack]
[<f8211c99>] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack]
[<f8212e5e>] death_by_timeout+0x14e/0x1c0 [nf_conntrack]
[<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
[<c024442d>] call_timer_fn+0x1d/0x80
[<c024461e>] run_timer_softirq+0x18e/0x1a0
[<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
[<c023e6f3>] __do_softirq+0xa3/0x170
[<c023e650>] ? __local_bh_enable+0x70/0x70
<IRQ>
[<c023e587>] ? irq_exit+0x67/0xa0
[<c0202af6>] ? do_IRQ+0x46/0xb0
[<c027ad05>] ? clockevents_notify+0x35/0x110
[<c066ac6c>] ? common_interrupt+0x2c/0x40
[<c056e3c1>] ? cpuidle_enter_state+0x41/0xf0
[<c056e6fb>] ? cpuidle_idle_call+0x8b/0x100
[<c02085f8>] ? arch_cpu_idle+0x8/0x30
[<c027314b>] ? cpu_idle_loop+0x4b/0x140
[<c0273258>] ? cpu_startup_entry+0x18/0x20
[<c066056d>] ? rest_init+0x5d/0x70
[<c0813ac8>] ? start_kernel+0x2ec/0x2f2
[<c081364f>] ? repair_env_string+0x5b/0x5b
[<c0813269>] ? i386_start_kernel+0x33/0x35
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Andrey Vagin [Wed, 29 Jan 2014 18:34:14 +0000 (19:34 +0100)]
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
Let's look at destroy_conntrack:
hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
        kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but at that moment a few readers
may still be using it. The conntrack is then released, and another
thread creates a conntrack at the same address with an equal tuple.
After this, a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created.
* nf_ct_tuple_equal() returns true.
But this conntrack is not initialized yet, so it cannot be used by two
threads concurrently. In this case, BUG_ON may be triggered from
nf_nat_setup_info().
Florian Westphal suggested checking the confirm bit too. I think that's
right.
task 1                  task 2                  task 3
                        nf_conntrack_find_get
                         ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
                                                __nf_conntrack_alloc
                                                 kmem_cache_alloc
                                                 memset(&ct->tuplehash[IP_CT_DIR_MAX],
                         if (nf_ct_is_dying(ct))
                         if (!nf_ct_tuple_equal()
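The recheck a reader performs after taking a reference can be sketched in
plain C. This is a minimal user-space model under stated assumptions, not
the kernel code: struct ct, ct_lookup_valid() and its fields are
hypothetical stand-ins for the conntrack entry, the validation done in
nf_conntrack_find_get(), and the dying/confirmed status bits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a conntrack entry that lives in a slab whose
 * memory may be recycled while lockless readers still hold a pointer. */
struct ct {
    unsigned int tuple;   /* simplified lookup key */
    bool dying;           /* entry is being destroyed */
    bool confirmed;       /* entry is fully initialized and hashed */
};

/* Model of the reader-side recheck done after a reference was taken.
 * The entry is usable only if the key still matches, it is not dying,
 * and (the extra check discussed above) it has been confirmed. */
static bool ct_lookup_valid(const struct ct *ct, unsigned int wanted)
{
    if (ct->tuple != wanted)
        return false;     /* memory was reused for a different flow */
    if (ct->dying)
        return false;     /* old entry is on its way out */
    if (!ct->confirmed)
        return false;     /* recycled entry, not yet initialized */
    return true;
}
```

With only the first two checks, a recycled entry carrying an equal tuple
would pass validation; the third check is what rejects it in this model.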
I'm not sure that I have ever seen this race condition in real life.
We are currently investigating a bug that reproduces on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently;
we don't have any other explanation for this.
<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>] [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622] [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697] [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770] [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843] [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919] [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063] [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207] [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277] [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346] [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419] [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491] [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562] [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638] [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712] [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785] [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858] [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081] [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151] [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229] [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303] [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378] [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454] [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531] [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Sat, 25 Jan 2014 08:04:07 +0000 (08:04 +0000)]
netfilter: nf_tables: fix oops when deleting a chain with references
The following commands trigger an oops:
# nft -i
nft> add table filter
nft> add chain filter input { type filter hook input priority 0; }
nft> add chain filter test
nft> add rule filter input jump test
nft> delete chain filter test
We need to check the chain use counter before allowing destruction since
we might have references from sets or jump rules.
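The use-counter check can be illustrated with a small stand-alone sketch.
The struct and function below are hypothetical simplifications, and the
-EBUSY return value is an assumption of this sketch rather than something
stated in the commit message.

```c
#include <errno.h>

/* Hypothetical, simplified chain object: `use` counts references held by
 * jump rules or sets that point at this chain. */
struct chain {
    unsigned int use;
    int deleted;
};

/* Refuse to destroy a chain that is still referenced elsewhere. */
static int chain_delete(struct chain *c)
{
    if (c->use > 0)
        return -EBUSY;   /* still a jump target, keep it alive */
    c->deleted = 1;
    return 0;
}
```

In the command sequence above, the "jump test" rule would leave the test
chain with use == 1, so the delete is rejected instead of freeing a chain
that a rule still points to.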
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69341
Reported-by: Matthew Ife <deleriux1@gmail.com>
Tested-by: Matthew Ife <deleriux1@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Arturo Borrero [Fri, 17 Jan 2014 01:28:45 +0000 (02:28 +0100)]
netfilter: nft_ct: fix unconditional dump of 'dir' attr
We want to make sure that the information that we get from the kernel can
be reinjected without trouble. The kernel shouldn't return an attribute
that is not required, or even prohibited.
Unconditionally dumping NFTA_CT_DIRECTION could lead a userspace
application to conclude that the attribute was originally set when it
was not.
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Andy Zhou [Mon, 3 Feb 2014 01:08:06 +0000 (17:08 -0800)]
openvswitch: Suppress error messages on megaflow updates
With subfacets, we'd expect megaflow update messages to carry
the original micro flow. If not, EINVAL is returned and the kernel
logs an error message. Now that the user space subfacet layer is
removed, it is expected that flow updates can arrive with a
micro flow other than the original. Change the return code to
EEXIST and remove the kernel error log message.
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Pravin B Shelar [Fri, 31 Jan 2014 17:43:23 +0000 (09:43 -0800)]
openvswitch: Fix ovs_flow_free() ovs-lock assert.
ovs_flow_free() is not called under ovs-lock during the packet
execute path (ovs_packet_cmd_execute()). Since packet execute
does not touch flow->mask, there is no need to take that
lock either. So move the assert into the case where flow->mask is checked.
Found by code inspection.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Daniele Di Proietto [Thu, 23 Jan 2014 18:47:35 +0000 (10:47 -0800)]
openvswitch: Fix ovs_dp_cmd_msg_size()
commit 43d4be9cb55f3bac5253e9289996fd9d735531db (openvswitch: Allow user
space to announce ability to accept unaligned Netlink messages) introduced
the OVS_DP_ATTR_USER_FEATURES netlink attribute in datapath responses,
but the attribute size was not taken into account in ovs_dp_cmd_msg_size().
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Andy Zhou [Tue, 21 Jan 2014 17:31:04 +0000 (09:31 -0800)]
openvswitch: Fix kernel panic on ovs_flow_free
Both the mega flow mask's reference counter and the per flow table mask
list should only be accessed while holding the ovs_mutex() lock. However,
this is not true of ovs_flow_table_flush(). This patch fixes the bug.
Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Thomas Graf [Tue, 14 Jan 2014 16:27:49 +0000 (16:27 +0000)]
openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
While the zerocopy method is correctly omitted if user space
does not support unaligned Netlink messages, the attribute is
still not padded correctly, as skb_zerocopy() will not ensure
padding and the attribute size is no longer precalculated
through nla_reserve(), which ensured padding previously.
This patch applies appropriate padding if a linear data copy
was performed in skb_zerocopy().
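As a worked illustration of the padding arithmetic, the sketch below
computes how many zero bytes must follow a payload so that the next
Netlink attribute starts on a 4-byte boundary. ATTR_ALIGNTO and
attr_padding() are names made up for this example; only the 4-byte
Netlink attribute alignment is taken as given.

```c
#include <stddef.h>

/* Netlink attributes are aligned to 4-byte boundaries. */
#define ATTR_ALIGNTO 4u

/* Number of zero bytes to append after `len` bytes of data so that the
 * following attribute starts on an aligned boundary. */
static size_t attr_padding(size_t len)
{
    size_t aligned = (len + ATTR_ALIGNTO - 1) & ~(size_t)(ATTR_ALIGNTO - 1);

    return aligned - len;
}
```

For example, a 61-byte packet needs 3 bytes of padding, while a 64-byte
packet needs none.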
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
David Vrabel [Tue, 4 Feb 2014 18:50:26 +0000 (18:50 +0000)]
xen-netfront: handle backend CLOSED without CLOSING
Backend drivers shouldn't transition to CLOSED unless the frontend is
CLOSED. If a backend does transition to CLOSED too soon, the
frontend may not see the CLOSING state and will not shut down properly.
So, treat an unexpected backend CLOSED state the same as CLOSING.
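The rule can be sketched as a small state check. The enum and function
below are hypothetical and only model the decision described above; they
are not the driver's actual state handler.

```c
#include <stdbool.h>

/* Hypothetical subset of backend states, named after those in the text. */
enum backend_state { BACKEND_CONNECTED, BACKEND_CLOSING, BACKEND_CLOSED };

/* Returns true when the frontend should run its shutdown path.
 * A backend CLOSED seen while the frontend is not yet closed means the
 * CLOSING state was missed, so it is handled the same as CLOSING. */
static bool frontend_should_close(enum backend_state backend,
                                  bool frontend_closed)
{
    switch (backend) {
    case BACKEND_CLOSED:
        return !frontend_closed;  /* unexpected CLOSED acts like CLOSING */
    case BACKEND_CLOSING:
        return true;
    default:
        return false;
    }
}
```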
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Tue, 4 Feb 2014 15:43:03 +0000 (17:43 +0200)]
bnx2x: fix L2-GRE TCP issues
When configuring a GRE tunnel using OVS, a TCP stream is distributed over
all RSS queues, which may cause TCP reordering. This happens because OVS
uses the L2GRE protocol while the kernel gre uses IPGRE.
This patch defaults the gre tunnel to L2GRE, which allows proper RSS for
L2GRE packets and (implicitly) disables RSS for IPGRE traffic.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Tue, 4 Feb 2014 12:04:33 +0000 (13:04 +0100)]
net: qmi_wwan: add Netgear Aircard 340U
This device was mentioned in an OpenWRT forum. Seems to have a "standard"
Sierra Wireless ifnumber to function layout:
0: qcdm
2: nmea
3: modem
8: qmi
9: storage
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fernando Luis Vazquez Cao [Tue, 4 Feb 2014 10:35:02 +0000 (19:35 +0900)]
rtnetlink: fix oops in rtnl_link_get_slave_info_data_size
We should check whether rtnetlink link operations
are defined before calling get_slave_size().
Without this, the following oops can occur when
adding a tap device to OVS.
[ 87.839553] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
[ 87.839595] IP: [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220
[...]
[ 87.840651] Call Trace:
[ 87.840664] [<ffffffff813d694b>] ? rtmsg_ifinfo+0x2b/0x100
[ 87.840688] [<ffffffff813c8340>] ? __netdev_adjacent_dev_insert+0x150/0x1a0
[ 87.840718] [<ffffffff813d6a50>] ? rtnetlink_event+0x30/0x40
[ 87.840742] [<ffffffff814b4144>] ? notifier_call_chain+0x44/0x70
[ 87.840768] [<ffffffff813c8946>] ? __netdev_upper_dev_link+0x3c6/0x3f0
[ 87.840798] [<ffffffffa0678d6c>] ? netdev_create+0xcc/0x160 [openvswitch]
[ 87.840828] [<ffffffffa06781ea>] ? ovs_vport_add+0x4a/0xd0 [openvswitch]
[ 87.840857] [<ffffffffa0670139>] ? new_vport+0x9/0x50 [openvswitch]
[ 87.840884] [<ffffffffa067279e>] ? ovs_vport_cmd_new+0x11e/0x210 [openvswitch]
[ 87.840915] [<ffffffff813f3efa>] ? genl_family_rcv_msg+0x19a/0x360
[ 87.840941] [<ffffffff813f40c0>] ? genl_family_rcv_msg+0x360/0x360
[ 87.840967] [<ffffffff813f4139>] ? genl_rcv_msg+0x79/0xc0
[ 87.840991] [<ffffffff813b6cf9>] ? __kmalloc_reserve.isra.25+0x29/0x80
[ 87.841018] [<ffffffff813f2389>] ? netlink_rcv_skb+0xa9/0xc0
[ 87.841042] [<ffffffff813f27cf>] ? genl_rcv+0x1f/0x30
[ 87.841064] [<ffffffff813f1988>] ? netlink_unicast+0xe8/0x1e0
[ 87.841088] [<ffffffff813f1d9a>] ? netlink_sendmsg+0x31a/0x750
[ 87.841113] [<ffffffff813aee96>] ? sock_sendmsg+0x86/0xc0
[ 87.841136] [<ffffffff813c960d>] ? __netdev_update_features+0x4d/0x200
[ 87.841163] [<ffffffff813ca94e>] ? ethtool_get_value+0x2e/0x50
[ 87.841188] [<ffffffff813af269>] ? ___sys_sendmsg+0x359/0x370
[ 87.841212] [<ffffffff813da686>] ? dev_ioctl+0x1a6/0x5c0
[ 87.841236] [<ffffffff8109c210>] ? autoremove_wake_function+0x30/0x30
[ 87.841264] [<ffffffff813ac59d>] ? sock_do_ioctl+0x3d/0x50
[ 87.841288] [<ffffffff813aca68>] ? sock_ioctl+0x1e8/0x2c0
[ 87.841312] [<ffffffff811934bf>] ? do_vfs_ioctl+0x2cf/0x4b0
[ 87.841335] [<ffffffff813afeb9>] ? __sys_sendmsg+0x39/0x70
[ 87.841362] [<ffffffff814b86f9>] ? system_call_fastpath+0x16/0x1b
[ 87.841386] Code: c0 74 10 48 89 ef ff d0 83 c0 07 83 e0 fc 48 98 49 01 c7 48 89 ef e8 d0 d6 fe ff 48 85 c0 0f 84 df 00 00 00 48 8b 90 08 07 00 00 <48> 8b 8a a8 00 00 00 31 d2 48 85 c9 74 0c 48 89 ee 48 89 c7 ff
[ 87.841529] RIP [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220
[ 87.841555] RSP <ffff880221aa5950>
[ 87.841569] CR2: 00000000000000a8
[ 87.851442] ---[ end trace e42ab217691b4fc2 ]---
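The guard described at the top of this message can be sketched in user
space as follows. The structures are hypothetical, pared-down models of a
master device and its link operations; only the idea of checking that the
operations exist before calling get_slave_size() comes from the text.

```c
#include <stddef.h>

/* Hypothetical, pared-down versions of the structures involved. */
struct link_ops {
    size_t (*get_slave_size)(void);
};

struct netdev {
    const struct link_ops *ops;   /* may be NULL for some master devices */
};

/* Example callback used only to exercise the non-NULL path. */
static size_t example_slave_size(void)
{
    return 8;
}

/* Check both the ops table and the callback before calling through it. */
static size_t slave_info_size(const struct netdev *master)
{
    if (!master || !master->ops || !master->ops->get_slave_size)
        return 0;
    return master->ops->get_slave_size();
}
```

Without the middle check, a master whose ops pointer is NULL would be
dereferenced at a small offset, which is consistent with the faulting
address 00000000000000a8 in the trace above.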
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 4 Feb 2014 07:46:36 +0000 (08:46 +0100)]
ptp: Allow selecting trigger/event index in testptp
Currently the trigger/event index is hardcoded to 0. This patch adds
a new command line argument, -i, to select an arbitrary trigger/event.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Max Filippov [Mon, 3 Feb 2014 23:33:10 +0000 (03:33 +0400)]
net: ethoc: set up MII management bus clock
MII management bus clock is derived from the MAC clock by dividing it by
MIIMODER register CLKDIV field value. This value may need to be set up
in case it is undefined or its default value is too high (and
communication with PHY is too slow) or too low (and communication with
PHY is impossible). The value of CLKDIV is not specified directly, but
is derived from the MAC clock for the default MII management bus frequency
of 2.5MHz. The MAC clock may be specified in the platform data, or in
the 'clocks' device tree attribute.
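The divider arithmetic can be illustrated with a short sketch. The 2.5MHz
target comes from the text above; the function name and the choice to
round up (so the bus clock never exceeds the target) are assumptions of
this example and may differ from what the driver does.

```c
/* Default MII management bus frequency named in the text. */
#define MDIO_TARGET_HZ 2500000UL

/* Smallest divider that keeps the management bus clock at or below the
 * target frequency. Rounding up is an assumption of this sketch. */
static unsigned long mii_clkdiv(unsigned long mac_clk_hz)
{
    return (mac_clk_hz + MDIO_TARGET_HZ - 1) / MDIO_TARGET_HZ;
}
```

For a 50MHz MAC clock this gives a divider of 20, which yields exactly
2.5MHz; for a 33MHz MAC clock it gives 14, which yields roughly 2.36MHz.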
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Max Filippov [Mon, 3 Feb 2014 23:33:09 +0000 (03:33 +0400)]
net: ethoc: don't advertise gigabit speed on attached PHY
The OpenCores 10/100 Mbps MAC does not support speeds above 100 Mbps, but
does not disable their advertisement when the PHY supports them. This
results in a non-functioning network when the MAC is connected to a gigabit
PHY connected to a gigabit switch.
The fix is to disable gigabit speed advertisement on the attached PHY
unconditionally.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 3 Feb 2014 20:35:46 +0000 (12:35 -0800)]
net: phy: ensure Gigabit features are masked off if requested
When a Gigabit PHY device is connected to a 10/100Mbits capable Ethernet
MAC, the driver will restrict the phydev->supported modes to mask off
Gigabit. If the Gigabit PHY comes out of reset with the Gigabit features
set by default in MII_CTRL1000, it will keep advertising these features.
By the time we call genphy_config_advert(), the condition on
phydev->supported having the Gigabit features on is false, so we do not
update MII_CTRL1000 with updated values and we keep advertising Gigabit
features, eventually configuring the PHY for Gigabit whilst the Ethernet
MAC does not support that.
This patch fixes the problem by ensuring that the Gigabit feature bits
are always cleared in MII_CTRL1000 if the PHY happens to be a Gigabit
PHY, and then, if Gigabit features are supported, setting those and
updating MII_CTRL1000 accordingly.
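The clear-then-set approach can be sketched with plain bit operations.
The constants below are defined locally and are intended to mirror the
values in linux/mii.h and linux/ethtool.h; ctrl1000_advert() is a
hypothetical helper for illustration, not the function changed by the
patch.

```c
#include <stdint.h>

/* Bit values intended to mirror linux/mii.h and linux/ethtool.h. */
#define ADVERTISE_1000HALF        0x0100u
#define ADVERTISE_1000FULL        0x0200u
#define SUPPORTED_1000baseT_Half  (1u << 4)
#define SUPPORTED_1000baseT_Full  (1u << 5)

/* Compute the new MII_CTRL1000 value: always clear the gigabit bits
 * first, then set only those that the `supported` mask still allows. */
static uint16_t ctrl1000_advert(uint16_t ctrl1000, uint32_t supported)
{
    ctrl1000 &= (uint16_t)~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);
    if (supported & SUPPORTED_1000baseT_Half)
        ctrl1000 |= ADVERTISE_1000HALF;
    if (supported & SUPPORTED_1000baseT_Full)
        ctrl1000 |= ADVERTISE_1000FULL;
    return ctrl1000;
}
```

Because the bits are cleared unconditionally, a PHY that powers up with
both gigabit bits set ends up advertising none of them once the MAC
driver has masked gigabit out of the supported modes.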
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Mon, 3 Feb 2014 14:36:58 +0000 (15:36 +0100)]
net:phy:dp83640: Initialize PTP clocks at device init.
The trigger and events functionality can be useful even if packet
timestamping is not used, but the required PTP clock is only enabled
when packet timestamping is started. This patch moves the clock enable
to when the interface is configured.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Mon, 3 Feb 2014 14:36:50 +0000 (15:36 +0100)]
net:phy:dp83640: Do not hardcode timestamping event edge
Currently the external timestamping code is hardcoded to use the
rising edge even though the hardware has configurable event edge
detection. This patch changes the code to use falling edge detection
if PTP_FALLING_EDGE is set in the user supplied flags.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Mon, 3 Feb 2014 14:36:35 +0000 (15:36 +0100)]
net:phy:dp83640: Declare that TX timestamping possible
Set the SKBTX_IN_PROGRESS bit in tx_flags in dp83640_txtstamp() when doing
tx timestamps, as per Documentation/networking/timestamping.txt.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shlomo Pongratz [Sun, 2 Feb 2014 13:42:10 +0000 (15:42 +0200)]
net/ipv4: Use proper RCU APIs for writer-side in udp_offload.c
RCU writer side should use rcu_dereference_protected() and not
rcu_dereference(), fix that. This also removes the "suspicious RCU usage"
warning seen when running with CONFIG_PROVE_RCU.
Also, don't use rcu_assign_pointer/rcu_dereference for pointers
which are invisible beyond the udp offload code.
Fixes: b582ef0 ('net: Add GRO support for UDP encapsulating protocols')
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 5 Feb 2014 03:47:55 +0000 (19:47 -0800)]
Merge branch 'bonding_fail_over_mac'
Ding Tianhong says:
====================
bonding: Fix some issues for fail_over_mac
The parameter fail_over_mac only affects active-backup mode. If it is
set to active or follow in other modes, such as RR or XOR, the bonding
driver cannot set all slaves to the master's address, so the slaves do
not work well with the master.
v1->v2: Following Jay's suggestion that we should permit setting an option
at any time, but only have it take effect in active-backup mode,
I added mode checking together with fail_over_mac during enslavement and
rebuilt the patches.
v2->v3: The correct way to fix the problem is not to add restrictions when
setting options, but to modify the bond enslave and removal processing
to check the mode in addition to fail_over_mac when setting a slave's MAC during
enslavement. The change active slave processing already only calls the fail_over_mac
function when in active-backup mode.
Removed the cleanup patch because net-next is frozen now.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
dingtianhong [Sat, 25 Jan 2014 05:00:57 +0000 (13:00 +0800)]
bonding: fail_over_mac should only affect AB mode in bond_set_mac_address()
fail_over_mac can be set to active or follow at any time in all modes,
so if fail_over_mac is not none and the current mode is not active-backup,
bond_set_mac_address() cannot change the master's and slaves' MAC addresses.
In bond_set_mac_address(), fail_over_mac should only affect AB mode, so
check the mode in addition to fail_over_mac when setting the bond's MAC address.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Sat, 25 Jan 2014 05:00:29 +0000 (13:00 +0800)]
bonding: fail_over_mac should only affect AB mode at enslave and removal processing
According to bonding.txt, fail_over_mac should only affect active-backup mode,
but I found that fail_over_mac can be set to active or follow in all
modes. As a result, a new slave cannot be set to the bond's MAC address at
enslave processing, nor restore its own MAC address at removal processing.
The correct way to fix the problem is not to add restrictions when
setting options, but to modify the bond enslave and removal processing
to check the mode in addition to fail_over_mac when setting a slave's MAC during
enslavement. The change active slave processing already only calls the fail_over_mac
function when in active-backup mode.
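The combined condition can be sketched as a small predicate. The enums and
the function name are hypothetical, modelled on the option values named in
the text, and are not the bonding driver's own definitions.

```c
#include <stdbool.h>

/* Hypothetical enums modelled on the option values named in the text. */
enum bond_mode { MODE_ROUNDROBIN, MODE_ACTIVEBACKUP, MODE_XOR };
enum fail_over_mac { FOM_NONE, FOM_ACTIVE, FOM_FOLLOW };

/* On enslavement, a slave takes the bond's MAC address unless
 * fail_over_mac is in effect, and fail_over_mac is only in effect
 * when the bond is in active-backup mode. */
static bool slave_takes_bond_mac(enum bond_mode mode, enum fail_over_mac fom)
{
    bool fom_in_effect = (fom != FOM_NONE) && (mode == MODE_ACTIVEBACKUP);

    return !fom_in_effect;
}
```

With this check, setting fail_over_mac to active on a round-robin bond no
longer prevents a new slave from being given the bond's MAC address.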
Thanks to Jay for the suggestion.
The patch also changes pr_warning() to pr_warn().
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>