Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]
isdn/gigaset: clarify gigaset_modem_fill control structure
Replace the flag-controlled retry loop by explicit goto statements
in the error branches to make the control structure easier to
understand.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 10 Dec 2014 12:41:55 +0000 (13:41 +0100)]
isdn/gigaset: drop duplicate declaration
Function gigaset_skb_sent was declared twice, identically, in gigaset.h.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Alpe [Wed, 10 Dec 2014 08:46:54 +0000 (09:46 +0100)]
tipc: fix broadcast wakeup contention after congestion
commit
908344cdda80 ("tipc: fix bug in multicast congestion handling")
introduced a race in the broadcast link wakeup functionality.
This patch eliminates this broadcast link wakeup race caused by
operation on the wakeup list without proper locking. If this race
hit and corrupted the list all subsequent wakeup messages would be
lost, resulting in a considerable memory leak.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Wed, 10 Dec 2014 08:10:23 +0000 (13:40 +0530)]
enic: add support for set/get rss hash key
This patch adds support for setting/getting rss hash key using ethtool.
v2:
respin patch to support RSS hash function changes.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Dec 2014 18:32:02 +0000 (13:32 -0500)]
Merge branch 'napi_page_frags'
Alexander Duyck says:
====================
net: Alloc NAPI page frags from their own pool
This patch series implements a means of allocating page fragments without
the need for the local_irq_save/restore in __netdev_alloc_frag. By doing
this I am able to decrease packet processing time by 11ns per packet in my
test environment.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:41:17 +0000 (19:41 -0800)]
ethernet/broadcom: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align
This patch replaces the calls to netdev_alloc_skb_ip_align in the
copybreak paths.
Cc: Gary Zambrano <zambrano@broadcom.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:41:09 +0000 (19:41 -0800)]
ethernet/realtek: use napi_alloc_skb instead of netdev_alloc_skb_ip_align
This replaces most of the calls to netdev_alloc_skb_ip_align in the Realtek
drivers. The one instance I didn't replace in 8139cp.c is because it was
called as a part of init and as such is not always accessed from the
softirq context.
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:41:03 +0000 (19:41 -0800)]
cxgb: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align
In order to use napi_alloc_skb I needed to pass a pointer to struct adapter
instead of struct pci_dev. This allowed me to access &adapter->napi.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:40:56 +0000 (19:40 -0800)]
ethernet/intel: Use napi_alloc_skb
This change replaces calls to netdev_alloc_skb_ip_align with
napi_alloc_skb. The advantage of napi_alloc_skb is currently the fact that
the page allocation doesn't make use of any irq disable calls.
There are few spots where I couldn't replace the calls as the buffer
allocation routine is called as a part of init which is outside of the
softirq context.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:40:49 +0000 (19:40 -0800)]
net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb
This change pulls the core functionality out of __netdev_alloc_skb and
places them in a new function named __alloc_rx_skb. The reason for doing
this is to make these bits accessible to a new function __napi_alloc_skb.
In addition __alloc_rx_skb now has a new flags value that is used to
determine which page frag pool to allocate from. If the SKB_ALLOC_NAPI
flag is set then the NAPI pool is used. The advantage of this is that we
do not have to use local_irq_save/restore when accessing the NAPI pool from
NAPI context.
In my test setup I saw at least 11ns of savings using the napi_alloc_skb
function versus the netdev_alloc_skb function, most of this being due to
the fact that we didn't have to call local_irq_save/restore.
The main use case for napi_alloc_skb would be for things such as copybreak
or page fragment based receive paths where an skb is allocated after the
data has been received instead of before.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 10 Dec 2014 03:40:42 +0000 (19:40 -0800)]
net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag
This patch splits the netdev_alloc_frag function up so that it can be used
on one of two page frag pools instead of being fixed on the
netdev_alloc_cache. By doing this we can add a NAPI specific function
__napi_alloc_frag that accesses a pool that is only used from softirq
context. The advantage to this is that we do not need to call
local_irq_save/restore which can be a significant savings.
I also took the opportunity to refactor the core bits that were placed in
__alloc_page_frag. First I updated the allocation to do either a 32K
allocation or an order 0 page. This is based on the changes in commmit
d9b2938aa where it was found that latencies could be reduced in case of
failures. Then I also rewrote the logic to work from the end of the page to
the start. By doing this the size value doesn't have to be used unless we
have run out of space for page fragments. Finally I cleaned up the atomic
bits so that we just do an atomic_sub_and_test and if that returns true then
we set the page->_count via an atomic_set. This way we can remove the extra
conditional for the atomic_read since it would have led to an atomic_inc in
the case of success anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 10 Dec 2014 18:17:23 +0000 (13:17 -0500)]
Merge branch 'for-davem-2' of git://git./linux/kernel/git/viro/vfs
More iov_iter work for the networking from Al Viro.
Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Wed, 10 Dec 2014 00:41:48 +0000 (22:41 -0200)]
dummy: use MODULE_VERSION
Use MODULE_VERSION() now that dummy driver has a version.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 9 Dec 2014 21:23:29 +0000 (22:23 +0100)]
net: sched: cls: use nla_nest_cancel instead of nlmsg_trim
To cancel nesting, this function is more convenient.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lendacky, Thomas [Tue, 9 Dec 2014 20:54:08 +0000 (14:54 -0600)]
amd-xgbe: Use disable_irq_nosync when in IRQ context
The disable_irq_nosync function, not the disable_irq function, must be
used to disable the DMA channel interrupt from within the interrupt
service routine. Change the disable_irq call to disable_irq_nosync.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nimrod Andy [Tue, 9 Dec 2014 10:46:56 +0000 (18:46 +0800)]
net: fec: avoid kernal crash by NULL pointer when no phy connection
On i.MX6SX sabreauto board, when there have no phy daughter board connection,
there have kernel crash by NULL pointer:
fec
2188000.ethernet eth0: could not attach to PHY
Unable to handle kernel NULL pointer dereference at virtual address
00000220
pgd =
80004000
[
00000220] *pgd=
00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted
3.14.24-01042-g27eaeea-dirty #405
task:
d8078000 ti:
d8076000 task.ti:
d8076000
PC is at mutex_lock+0x10/0x54
LR is at phy_start+0x14/0x68
pc : [<
806ad4e4>] lr : [<
803b0f90>] psr:
60000113
sp :
d8077d80 ip :
00000000 fp :
d83cc000
r10:
0000100c r9 :
d83cc800 r8 :
00000000
r7 :
d83bcd0c r6 :
00000200 r5 :
00000220 r4 :
00000220
r3 :
00000000 r2 :
00000000 r1 :
d83bcd90 r0 :
00000220
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control:
10c5387d Table:
8000404a DAC:
00000015
Process swapper/0 (pid: 1, stack limit = 0xd8076240)
Stack: (0xd8077d80 to 0xd8078000)
7d80:
00000000 803b0f90 00000001 00000000 d83bc800 803be034 00000007 805c3fb4
7da0:
00000003 80d4e0bc 805efcb8 fffffff1 fffffff0 00000000 00000000 d8077dfc
7dc0:
0000000d 80d6ce80 80d126b0 800499c8 d83bc800 d83bc800 806f0f40 d83bc82c
7de0:
00000000 00000000 80d6ce80 80d126b0 0000016b 80540250 d8076008 d83bc800
7e00:
0000016b d83bc800 00001003 00000001 00001002 805404d4 d83bc800 00000120
7e20:
00001002 00001002 00000000 805405d4 d83bc800 00000001 80d126c0 00001002
7e40:
80dbc5dc 80d02024 00000000 806ae360 00000002 d6128420 d6127198 12400000
7e60:
00000000 00000000 00000002 d61271e8 00000000 12400000 d801674c 800e49f0
7e80:
d6127198 d6124e58 00000000 80238848 d61271c4 00000000 00000001 d8016700
7ea0:
80dd2e00 80d752c0 80d752c0 80cfdaec 0000010c 80239430 806c2e90 d800f080
7ec0:
d800f380 804e46b4 ffffffbc 80d15cb0 00000007 80d752c0 80d752c0 80d01e94
7ee0:
0000010c d8076030 00000000 800088cc 80dbaba4 80bd411c d80a6f00 806b1e04
7f00:
00000000 00000000 00000000 80125b84 00000000 80d2c56c 60000113 00000001
7f20:
ef7ff9df 806c80cc 0000010c 80043f5c 80c95eb8 00000007 ef7ffa1d 00000007
7f40:
80d2c55c 80d15cb0 00000007 80d752c0 80d752c0 80ccc50c 0000010c 80d0a114
7f60:
80d0a10c 80cccc04 00000007 00000007 80ccc50c 806ae410 00000000 8004cb84
7f80:
80d17bc0 00000000 806a4bd4 00000000 00000000 00000000 00000000 00000000
7fa0:
00000000 806a4bdc 00000000 8000e5f8 00000000 00000000 00000000 00000000
7fc0:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0:
00000000 00000000 00000000 00000000 00000013 00000000 1e79a7bb e5337f77
[<
806ad4e4>] (mutex_lock) from [<
803b0f90>] (phy_start+0x14/0x68)
[<
803b0f90>] (phy_start) from [<
803be034>] (fec_enet_open+0x448/0x5dc)
[<
803be034>] (fec_enet_open) from [<
80540250>] (__dev_open+0xa8/0x110)
[<
80540250>] (__dev_open) from [<
805404d4>] (__dev_change_flags+0x88/0x170)
[<
805404d4>] (__dev_change_flags) from [<
805405d4>] (dev_change_flags+0x18/0x48)
[<
805405d4>] (dev_change_flags) from [<
80d02024>] (ip_auto_config+0x190/0xf94)
[<
80d02024>] (ip_auto_config) from [<
800088cc>] (do_one_initcall+0xe8/0x144)
[<
800088cc>] (do_one_initcall) from [<
80cccc04>] (kernel_init_freeable+0x104/0x1c8)
[<
80cccc04>] (kernel_init_freeable) from [<
806a4bdc>] (kernel_init+0x8/0xec)
[<
806a4bdc>] (kernel_init) from [<
8000e5f8>] (ret_from_fork+0x14/0x3c)
Code:
e92d4010 e3a03000 e1a04000 ee073fba (
e1903f9f)
Add phydev check to fix the issue.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Tue, 9 Dec 2014 07:17:56 +0000 (15:17 +0800)]
tipc: avoid double lock 'spin_lock:&seq->lock'
The commit
fb9962f3cefe ("tipc: ensure all name sequences are properly
protected with its lock") involves below errors:
net/tipc/name_table.c:980 tipc_purge_publications() error: double lock 'spin_lock:&seq->lock'
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 8 Dec 2014 23:59:18 +0000 (15:59 -0800)]
net: systemport: allow changing MAC address
Hook a ndo_set_mac_address callback, update the internal Ethernet MAC in
the netdevice structure, and finally write that address down to the
UniMAC registers. If the interface is down, and most likely clock gated,
we do not update the registers but just the local copy, such that next
ndo_open() call will effectively write down the address.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 23:24:56 +0000 (18:24 -0500)]
Merge branch 'bridge_mode'
Roopa Prabhu says:
====================
remove bridge mode BRIDGE_MODE_SWDEV
BRIDGE_MODE_SWDEV was introduced to indicate switchdev offloads
for bridging from user space (In other words to call into the hw switch
port driver directly). But user can use existing BRIDGE_FLAGS_SELF
to call into the hw switch port driver today. swdev mode is not required
anymore. So, this patch removes it.
v4 - v5
incorporate comments
- Define BRIDGE_MODE_UNDEF to handle cases where mode is not defined
- reverse the order of patches
- include patch comments in all patches
====================
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Mon, 8 Dec 2014 22:04:21 +0000 (14:04 -0800)]
bridge: remove mode BRIDGE_MODE_SWDEV
This patch removes bridge mode swdev.
Users can use BRIDGE_FLAGS_SELF to indicate swdev offload
if needed.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Mon, 8 Dec 2014 22:04:20 +0000 (14:04 -0800)]
rocker: remove swdev mode
Remove use of 'swdev' mode in rocker. rocker dev offloads
can use the BRIDGE_FLAGS_SELF to indicate offload to hardware.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Mon, 8 Dec 2014 22:04:19 +0000 (14:04 -0800)]
bridge: new mode flag to indicate mode 'undefined'
This patch adds mode BRIDGE_MODE_UNDEF for cases where mode is not needed.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 23:12:03 +0000 (18:12 -0500)]
Merge tag 'master-2014-12-08' of git://git./linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
pull request: wireless-next 2014-12-08
Please pull this last batch of pending wireless updates for the 3.19 tree...
For the wireless bits, Johannes says:
"This time I have Felix's no-status rate control work, which will allow
drivers to work better with rate control even if they don't have perfect
status reporting. In addition to this, a small hwsim fix from Patrik,
one of the regulatory patches from Arik, and a number of cleanups and
fixes I did myself.
Of note is a patch where I disable CFG80211_WEXT so that compatibility
is no longer selectable - this is intended as a wake-up call for anyone
who's still using it, and is still easily worked around (it's a one-line
patch) before we fully remove the code as well in the future."
For the Bluetooth bits, Johan says:
"Here's one more bluetooth-next pull request for 3.19:
- Minor cleanups for ieee802154 & mac802154
- Fix for the kernel warning with !TASK_RUNNING reported by Kirill A.
Shutemov
- Support for another ath3k device
- Fix for tracking link key based security level
- Device tree bindings for btmrvl + a state update fix
- Fix for wrong ACL flags on LE links"
And...
"In addition to the previous one this contains two more cleanups to
mac802154 as well as support for some new HCI features from the
Bluetooth 4.2 specification.
From the original request:
'Here's what should be the last bluetooth-next pull request for 3.19.
It's rather large but the majority of it is the Low Energy Secure
Connections feature that's part of the Bluetooth 4.2 specification. The
specification went public only this week so we couldn't publish the
corresponding code before that. The code itself can nevertheless be
considered fairly mature as it's been in development for over 6 months
and gone through several interoperability test events.
Besides LE SC the pull request contains an important fix for command
complete events for mgmt sockets which also fixes some leaks of hci_conn
objects when powering off or unplugging Bluetooth adapters.
A smaller feature that's part of the pull request is service discovery
support. This is like normal device discovery except that devices not
matching specific UUIDs or strong enough RSSI are filtered out.
Other changes that the pull request contains are firmware dump support
to the btmrvl driver, firmware download support for Broadcom BCM20702A0
variants, as well as some coding style cleanups in 6lowpan &
ieee802154/mac802154 code.'"
For the NFC bits, Samuel says:
"With this one we get:
- NFC digital improvements for DEP support: Chaining, NACK and ATN
support added.
- NCI improvements: Support for p2p target, SE IO operand addition,
SE operands extensions to support proprietary implementations, and
a few fixes.
- NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support,
and SE IO operand addition.
- A bunch of minor improvements and fixes for STMicro st21nfcb and
st21nfca"
For the iwlwifi bits, Emmanuel says:
"Major works are CSA and TDLS. On top of that I have a new
firmware API for scan and a few rate control improvements.
Johannes find a few tricks to improve our CPU utilization
and adds support for a new spin of 7265 called 7265D.
Along with this a few random things that don't stand out."
And...
"I deprecate here -8.ucode since -9 has been published long ago.
Along with that I have a new activity, we have now better
a infrastructure for firmware debugging. This will allow to
have configurable probes insides the firmware.
Luca continues his work on NetDetect, this feature is now
complete. All the rest is minor fixes here and there."
For the Atheros bits, Kalle says:
"Only ath10k changes this time and no major changes. Most visible are:
o new debugfs interface for runtime firmware debugging (Yanbo)
o fix shared WEP (Sujith)
o don't rebuild whenever kernel version changes (Johannes)
o lots of refactoring to make it easier to add new hw support (Michal)
There's also smaller fixes and improvements with no point of listing
here."
In addition, there are a few last minute updates to ath5k,
ath9k, brcmfmac, brcmsmac, mwifiex, rt2x00, rtlwifi, and wil6210.
Also included is a pull of the wireless tree to pick-up the fixes
originally included in "pull request: wireless 2014-12-03"...
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitsuhiro Kimura [Mon, 8 Dec 2014 10:46:21 +0000 (19:46 +0900)]
sh_eth: Remove redundant alignment adjustment
PTR_ALIGN macro after skb_reserve is redundant, because skb_reserve
function adjusts the alignment of skb->data.
Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mitsuhiro Kimura [Tue, 9 Dec 2014 12:23:42 +0000 (21:23 +0900)]
sh_eth: Optimization for RX excess judgement
Both of 'boguscnt' and 'quota' have nearly meaning as the condition of
the reception loop.
In order to cut down redundant processing, this patch changes excess
judgement.
Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Mon, 8 Dec 2014 01:42:55 +0000 (09:42 +0800)]
net: avoid to call skb_queue_len again
the queue length of sd->input_pkt_queue has been put into qlen,
and impossible to change, since hold the lock
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 22:01:21 +0000 (17:01 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-12-09
This series contains updates to i40e and i40evf.
Jeff (me) provides a single patch to convert a macro to a static inline
function based on feedback from Joe Perches on a previous patch.
Shannon provides the remaining twelve patches against i40e. Almost all
of Shannon's patches cleanup/fix NVM issues varying in range from
adding more detail to debug messages, to removing dead code, to fixing
NVM state transitions after an error. Change the handy decoder interface
for admin queue return code to help catch and properly report the condition
as a useful errno rather than returning a misleading '0'. Added a range
check to avoid any possible array index-out-of-bound issues.
v2:
- fixed up patch 05 in the series to use the ARRAY_SIZE() macro as suggested
by Sergei Shtylyov
- fix up patch 13 to remove unnecessary parens in the return statement
as suggested by Sergei Shtylyov
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 21:49:00 +0000 (16:49 -0500)]
Merge tag 'linux-can-next-for-3.19-
20141207' of git://gitorious.org/linux-can/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2014-12-07
this is a pull request of 8 patches for net-next/master.
Andri Yngvason contributes 4 patches in which the CAN state change
handling is consolidated and unified among the sja1000, mscan and
flexcan driver. The three patches by Jeremiah Mahler fix spelling
mistakes and eliminate the banner[] variable in various parts. And a
patch by me that switches on sparse endianess checking by default.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 7 Dec 2014 20:22:18 +0000 (12:22 -0800)]
tcp: refine TSO autosizing
Commit
95bd09eb2750 ("tcp: TSO packets automatic sizing") tried to
control TSO size, but did this at the wrong place (sendmsg() time)
At sendmsg() time, we might have a pessimistic view of flow rate,
and we end up building very small skbs (with 2 MSS per skb).
This is bad because :
- It sends small TSO packets even in Slow Start where rate quickly
increases.
- It tends to make socket write queue very big, increasing tcp_ack()
processing time, but also increasing memory needs, not necessarily
accounted for, as fast clones overhead is currently ignored.
- Lower GRO efficiency and more ACK packets.
Servers with a lot of small lived connections suffer from this.
Lets instead fill skbs as much as possible (64KB of payload), but split
them at xmit time, when we have a precise idea of the flow rate.
skb split is actually quite efficient.
Patch looks bigger than necessary, because TCP Small Queue decision now
has to take place after the eventual split.
As Neal suggested, introduce a new tcp_tso_autosize() helper, so that
tcp_tso_should_defer() can be synchronized on same goal.
Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable
contains number of mss that we can put in GSO packet, and is not
related to the autosizing goal anymore.
Tested:
40 ms rtt link
nstat >/dev/null
netperf -H remote -l -
2000000 -- -s
1000000
nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"
Before patch :
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/s
87380
2000000 2000000 0.36 44.22
IpInReceives 600 0.0
IpOutRequests 599 0.0
TcpOutSegs 1397 0.0
IpExtOutOctets
2033249 0.0
After patch :
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380
2000000 2000000 0.36 44.27
IpInReceives 221 0.0
IpOutRequests 232 0.0
TcpOutSegs 1397 0.0
IpExtOutOctets
2013953 0.0
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Tue, 25 Nov 2014 00:45:05 +0000 (19:45 -0500)]
bury memcpy_toiovec()
no users left
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 23:29:54 +0000 (18:29 -0500)]
skb_copy_datagram_iovec() can die
no callers other than itself.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 22:48:04 +0000 (17:48 -0500)]
ppp_read(): switch to skb_copy_datagram_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 23:17:55 +0000 (18:17 -0500)]
switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
... making both non-draining. That means that tcp_recvmsg() becomes
non-draining. And _that_ would break iscsit_do_rx_data() unless we
a) make sure tcp_recvmsg() is uniformly non-draining (it is)
b) make sure it copes with arbitrary (including shifted)
iov_iter (it does, all it uses is iov_iter primitives)
c) make iscsit_do_rx_data() initialize ->msg_iter only once.
Fortunately, (c) is doable with minimal work and we are rid of one
the two places where kernel send/recvmsg users would be unhappy with
non-draining behaviour.
Actually, that makes all but one of ->recvmsg() instances iov_iter-clean.
The exception is skcipher_recvmsg() and it also isn't hard to convert
to primitives (iov_iter_get_pages() is needed there). That'll wait
a bit - there's some interplay with ->sendmsg() path for that one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 22:07:38 +0000 (17:07 -0500)]
first fruits - kill l2cap ->memcpy_fromiovec()
Just use copy_from_iter(). That's what this method is trying to do
in all cases, in a very convoluted fashion.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 15:42:55 +0000 (10:42 -0500)]
put iov_iter into msghdr
Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 25 Nov 2014 00:32:50 +0000 (19:32 -0500)]
vmci: propagate msghdr all way down to __qp_memcpy_from_queue()
... and switch it to memcpy_to_msg()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 21:44:09 +0000 (16:44 -0500)]
switch l2cap ->memcpy_fromiovec() to msghdr
it'll die soon enough - now that kvec-backed iov_iter works regardless
of set_fs(), both instances will become copy_from_iter() as soon as
we introduce ->msg_iter...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 18:26:06 +0000 (13:26 -0500)]
switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 18:23:40 +0000 (13:23 -0500)]
ip_generic_getfrag, udplite_getfrag: switch to passing msghdr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 17:10:46 +0000 (12:10 -0500)]
ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Mon, 24 Nov 2014 15:52:29 +0000 (10:52 -0500)]
raw.c: stick msghdr into raw_frag_vec
we'll want access to ->msg_iter
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 9 Dec 2014 21:27:52 +0000 (16:27 -0500)]
Merge branch 'iov_iter' into for-davem-2
Julia Lawall [Sun, 7 Dec 2014 19:20:56 +0000 (20:20 +0100)]
chelsio: fix misspelling of current function in string
Replace a misspelled function name by %s and then __func__.
This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 7 Dec 2014 19:20:54 +0000 (20:20 +0100)]
hp100: fix misspelling of current function in string
Replace a misspelled function name by %s and then __func__.
This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 7 Dec 2014 19:20:52 +0000 (20:20 +0100)]
uli526x: fix misspelling of current function in string
Replace a misspelled function name by %s and then __func__.
This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 7 Dec 2014 19:20:47 +0000 (20:20 +0100)]
isdn: fix misspelling of current function in string
Replace a misspelled function name by %s and then __func__.
In the first case, the print is just dropped, because kmalloc itself does
enough error reporting.
This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 7 Dec 2014 19:20:46 +0000 (20:20 +0100)]
dmfe: fix misspelling of current function in string
The function name contains cleanup, not clean.
This was done using Coccinelle, including the use of Levenshtein distance,
as proposed by Rasmus Villemoes.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 6 Dec 2014 23:53:46 +0000 (15:53 -0800)]
macvlan: play well with ipvlan device
If device is already used as an ipvlan port then refuse to
use it as a macvlan port at early stage of port creation.
thost1:~# ip link add link eth0 ipvl0 type ipvlan
thost1:~# echo $?
0
thost1:~# ip link add link eth0 mvl0 type macvlan
RTNETLINK answers: Device or resource busy
thost1:~# echo $?
2
thost1:~#
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 6 Dec 2014 23:53:33 +0000 (15:53 -0800)]
ipvlan: move the device check function into netdevice.h
Move the port check [ipvlan_dev_master()] and device check
[ipvlan_dev_slave()] functions to netdevice.h and rename them
netif_is_ipvlan_port() and netif_is_ipvlan() resp. to be
consistent with macvlan api naming.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 6 Dec 2014 23:53:19 +0000 (15:53 -0800)]
ipvlan: play well with macvlan device
If a device is already a macvlan port then refuse to use it as
an ipvlan port in the early stage of port creation.
thost1:~# ip link add link eth0 mvl0 type macvlan
thost1:~# echo $?
0
thost1:~# ip link add link eth0 ipvl0 type ipvlan
RTNETLINK answers: Device or resource busy
thost1:~# echo $?
2
thost1:~#
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 6 Dec 2014 23:53:04 +0000 (15:53 -0800)]
netdevice: Add a function to check macvlan port
Similar to a check for macvlan device, netif_is_macvlan(), add
another function to check if a device is used as macvlan port.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Frederic Sowa [Sat, 6 Dec 2014 18:19:42 +0000 (19:19 +0100)]
dst: no need to take reference on DST_NOCACHE dsts
Since commit
f8864972126899 ("ipv4: fix dst race in sk_dst_get()")
DST_NOCACHE dst_entries get freed by RCU. So there is no need to get a
reference on them when we are in rcu protected sections.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flavio Leitner [Sat, 6 Dec 2014 00:13:24 +0000 (22:13 -0200)]
dummy: add support for ethtool get_drvinfo
The command 'ethtool -i' is useful to find details
about the interface like the device driver being used.
This was missing for dummy driver.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Fri, 5 Dec 2014 17:35:43 +0000 (19:35 +0200)]
Documentation (ixgbe.txt): use a decimal address.
This patch fixes the erronous usage of an hexadecimal address in the
example, by replacing it with a decimal address.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Fri, 5 Dec 2014 16:24:28 +0000 (17:24 +0100)]
openvswitch: set correct protocol on route lookup
Respect what the caller passed to ovs_tunnel_get_egress_info.
Fixes:
8f0aad6f35f7e ("openvswitch: Extend packet attribute for egress tunnel info")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kubeček [Fri, 5 Dec 2014 16:05:49 +0000 (17:05 +0100)]
macvlan: allow setting LRO independently of lower device
Since commit
fbe168ba91f7 ("net: generic dev_disable_lro() stacked
device handling"), dev_disable_lro() zeroes NETIF_F_LRO feature flag
first for a macvlan device and then for its lower device. As an attempt
to set NETIF_F_LRO to zero is ignored, dev_disable_lro() issues a
warning and taints kernel.
Allowing NETIF_F_LRO to be set independently of the lower device
consists of three parts:
- add the flag to hw_features to allow toggling it
- allow setting it to 0 even if lower device has the flag set
- add the flag to MACVLAN_FEATURES to restore copying from lower
device on macvlan creation
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Tue, 9 Dec 2014 10:31:16 +0000 (02:31 -0800)]
i40e/i40evf: Convert macro to static inline
Inline functions are preferred over macros when they can be used
interchangeably.
CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:23 +0000 (08:23 +0000)]
i40e: add to NVM update debug message
Add a little more state context to an NVM update debug message.
Change-ID: I512160259052bcdbe5bdf1adf403ab2bf7984970
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:22 +0000 (08:23 +0000)]
i40e: check for AQ timeout in aq_rc decode
Decoding the AQ return code is great except when the AQ send timed out
and there's no return code set. This changes the handy decoder
interface to help catch and properly report the condition as a useful
errno rather than returning a misleading '0'.
Change-ID: I07a1f94f921606da49ffac7837bcdc37cd8222eb
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:21 +0000 (08:23 +0000)]
i40e: poll on NVM semaphore only if not other error
Only poll on the NVM semaphore if there's time left on a previous
reservation. Also, add a little more info to debug messages.
Change-ID: I2439bf870b95a28b810dcb5cca1c06440463cf8a
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:20 +0000 (08:23 +0000)]
i40e: fix up NVM update sm error handling
The state transitions after an error were not managed well, so
these changes get us back to the INIT state or don't transition
out of the INIT state after most errors.
Change-ID: I90aa0e4e348dc4f58cbcdce9c5d4b7fd35981c6c
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:19 +0000 (08:23 +0000)]
i40e: set max limit for access polling
Don't bother trying to set a smaller timeout on the polling,
just simplify the code and always use the max limit. Also,
rename a variable for clarity and fix a comment.
Change-ID: I0300c3562ccc4fd5fa3088f8ae52db0c1eb33af5
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:18 +0000 (08:23 +0000)]
i40e: remove unused nvm_semaphore_wait
The nvm_semaphore_wait field is set but never used, so let's
just get rid of it.
Change-ID: I2107bd29b69f99b1a61d7591d087429527c9d8fa
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:17 +0000 (08:23 +0000)]
i40e: init NVM update state on adminq init
The adminq init is run after the EMPR that is triggered by the
NVM update. The final write command will cause the reset and
will want to wait for the ARQ event that signals the end of the
update, but the reset precludes the event being sent. The state
is probably already at INIT, but we set it so here anyway, and
clear the release_on_done flag as well.
Change-ID: Ie9d724a39e71f988741abc3d51b4cb198c7e0272
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Acked-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:16 +0000 (08:23 +0000)]
i40e: add range check to i40e_aq_rc_to_posix
Just to be sure, add a range check to avoid any possible
array index-out-of-bound issues.
CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Change-ID: I9323bee6732c2a47599816e1d6c6b3a1f8dcbf54
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Michal Kosiarz <michal.kosiarz@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:15 +0000 (08:23 +0000)]
i40e: rework debug messages for NVM update
Rework the debug messages in the NVM update state machine so that we can
turn them on and off dynamically rather than forcing a recompile/reload.
These can now be turned on with something like:
ethtool -s eth1 msglvl 0xf000008f
and off with:
ethtool -s eth1 msglvl 0xf000000f
The high 0xf0000000 gets the driver's attention that we want to change the
internal debug flags, and the 0x80 bit is the NVM debug.
Change-ID: I5efb9039400304b29a0fd6ddea3f47bb362e6661
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:14 +0000 (08:23 +0000)]
i40e: let firmware catch the NVM busy error
The NVM update operations take time finish asynchronously, and follow-on
update requests need to wait for the current one to finish. Early
firmware didn't handle this well, so the code had to track the busy state.
The released firmware handles the busy state correctly, returning
I40E_AQ_RC_EBUSY if an update is still in progress, so the code no longer
needs to track this.
Change-ID: I6e6b4adc26d6dcc5fd7adfee5763423858a7d921
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:13 +0000 (08:23 +0000)]
i40e: better error messages for NVM update issues
Add more detail to the NVM update error messages so folks
have a better chance at diagnosing issues without having to
resort to heroic measures to reproduce an issue.
Change-ID: I270d1a9c903baceaef0bebcc55d29108ac08b0bd
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Thu, 13 Nov 2014 08:23:12 +0000 (08:23 +0000)]
i40e: clear NVM update state on ethtool test
Once in a great while the NVMUpdate tools and the driver get out
of phase with each other. This gives us a way to reset things
without having to unload the driver.
Change-ID: I353f688236249a666a90ba3e7233e0ed8c1a04e9
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jiri Pirko [Fri, 5 Dec 2014 14:50:22 +0000 (15:50 +0100)]
net: sched: cls_basic: fix error path in basic_change()
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
James Byrne [Fri, 5 Dec 2014 13:03:53 +0000 (13:03 +0000)]
net: macb: Remove obsolete comment from Kconfig
The Kconfig file says that Gigabit mode is not supported, but it has been
supported since commit
140b7552fdff04bbceeb842f0e04f0b4015fe97b ("net/macb:
Add support for Gigabit Ethernet mode").
Signed-off-by: James Byrne <james.byrne@origamienergy.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Ruprecht [Fri, 5 Dec 2014 08:28:03 +0000 (09:28 +0100)]
net: ethernet: rocker: Add dependency to CONFIG_BRIDGE in Kconfig
In a configuration with CONFIG_BRIDGE set to 'm' and CONFIG_ROCKER
set to 'y', undefined references occur at link time:
> drivers/built-in.o: In function `rocker_port_fdb_learn_work':
> /home/jim/linux/drivers/net/ethernet/rocker/rocker.c:3014: undefined
> reference to `br_fdb_external_learn_del'
> /home/jim/linux/drivers/net/ethernet/rocker/rocker.c:3016: undefined
> reference to `br_fdb_external_learn_add'
This patch fixes these by declaring CONFIG_ROCKER as being dependent
on CONFIG_BRIDGE.
Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andreas Ruprecht <rupran@einserver.de>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gu Zheng [Fri, 5 Dec 2014 07:14:23 +0000 (15:14 +0800)]
net/socket.c : introduce helper function do_sock_sendmsg to replace reduplicate code
Introduce helper function do_sock_sendmsg() to simplify sock_sendmsg{_nosec},
and replace reduplicate code.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 5 Dec 2014 00:13:49 +0000 (16:13 -0800)]
tcp_cubic: refine Hystart delay threshold
In commit
2b4636a5f8ca ("tcp_cubic: make the delay threshold of HyStart
less sensitive"), HYSTART_DELAY_MIN was changed to 4 ms.
The remaining problem is that using delay_min + (delay_min/16) as the
threshold is too sensitive.
6.25 % of variation is too small for rtt above 60 ms, which are not
uncommon.
Lets use 12.5 % instead (delay_min + (delay_min/8))
Tested:
80 ms RTT between peers, FQ/pacing packet scheduler on sender.
10 bulk transfers of 10 seconds :
nstat >/dev/null
for i in `seq 1 10`
do
netperf -H remote -- -k THROUGHPUT | grep THROUGHPUT
done
nstat | grep Hystart
With the 6.25 % threshold :
THROUGHPUT=20.66
THROUGHPUT=249.38
THROUGHPUT=254.10
THROUGHPUT=14.94
THROUGHPUT=251.92
THROUGHPUT=237.73
THROUGHPUT=19.18
THROUGHPUT=252.89
THROUGHPUT=21.32
THROUGHPUT=15.58
TcpExtTCPHystartTrainDetect 2 0.0
TcpExtTCPHystartTrainCwnd 4756 0.0
TcpExtTCPHystartDelayDetect 5 0.0
TcpExtTCPHystartDelayCwnd 180 0.0
With the 12.5 % threshold
THROUGHPUT=251.09
THROUGHPUT=247.46
THROUGHPUT=250.92
THROUGHPUT=248.91
THROUGHPUT=250.88
THROUGHPUT=249.84
THROUGHPUT=250.51
THROUGHPUT=254.15
THROUGHPUT=250.62
THROUGHPUT=250.89
TcpExtTCPHystartTrainDetect 1 0.0
TcpExtTCPHystartTrainCwnd 3175 0.0
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 5 Dec 2014 00:13:23 +0000 (16:13 -0800)]
tcp_cubic: add SNMP counters to track how effective is Hystart
When deploying FQ pacing, one thing we noticed is that CUBIC Hystart
triggers too soon.
Having SNMP counters to have an idea of how often the various Hystart
methods trigger is useful prior to any modifications.
This patch adds SNMP counters tracking, how many time "ack train" or
"Delay" based Hystart triggers, and cumulative sum of cwnd at the time
Hystart decided to end SS (Slow Start)
myhost:~# nstat -a | grep Hystart
TcpExtTCPHystartTrainDetect 9 0.0
TcpExtTCPHystartTrainCwnd 20650 0.0
TcpExtTCPHystartDelayDetect 10 0.0
TcpExtTCPHystartDelayCwnd 360 0.0
->
Train detection was triggered 9 times, and average cwnd was
20650/9=2294,
Delay detection was triggered 10 times and average cwnd was 36
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 5 Dec 2014 01:01:24 +0000 (17:01 -0800)]
x86: bpf_jit_comp: Remove inline from static function definitions
Let the compiler decide instead.
No change in object size x86-64 -O2 no profiling
Signed-off-by: Joe Perches <joe@perches.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 4 Dec 2014 23:00:48 +0000 (15:00 -0800)]
x86: bpf_jit_comp: Reduce is_ereg() code size
Use the (1 << reg) & mask trick to reduce code size.
x86-64 size difference -O2 without profiling for various
gcc versions:
$ size arch/x86/net/bpf_jit_comp.o*
text data bss dec hex filename
9266 4 0 9270 2436 arch/x86/net/bpf_jit_comp.o.4.4.new
10042 4 0 10046 273e arch/x86/net/bpf_jit_comp.o.4.4.old
9109 4 0 9113 2399 arch/x86/net/bpf_jit_comp.o.4.6.new
9717 4 0 9721 25f9 arch/x86/net/bpf_jit_comp.o.4.6.old
8789 4 0 8793 2259 arch/x86/net/bpf_jit_comp.o.4.7.new
10245 4 0 10249 2809 arch/x86/net/bpf_jit_comp.o.4.7.old
9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.4.9.new
10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.4.9.old
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 4 Dec 2014 20:41:18 +0000 (21:41 +0100)]
net: sched: cls: remove unused op put from tcf_proto_ops
It is never called and implementations are void. So just remove it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Thu, 4 Dec 2014 10:52:06 +0000 (12:52 +0200)]
bnx2x: Use correct fastpath version for VFs.
Our FW can support several fastpath HSI [for backward compatibility] but up
until now VFs were always configured to use latest fastpath HSI [although VF
driver might be older and use an older fastpath HSI].
For linux drivers, the differences are insignificant since driver never
utilized features that were overridden by the HSI change. But for VMs running
other operating systems this might be a problem.
In addition, eventually FW might change fastpath HSI in such a manner that
backward compatibility WILL break unless configured with proper version.
This patch fixes the issue for other operating system VMs, as well as lays
the ground work for forward compatibility in regard to the fastpath HSI.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Thu, 4 Dec 2014 10:30:40 +0000 (11:30 +0100)]
net: tulip: Remove private "strncmp"
The comment says that the built-in strncmp didn't work. That is not
surprising, as apparently "str" semantics are not really what is
wanted (hint: de4x5_strncmp only stops when two different bytes are
encountered or the end is reached; not if either byte happens to be
0). de4x5_strncmp is actually a memcmp (except for the signature and
that bytes are not necessarily treated as unsigned char); since only
the boolean value of the result is used we can just replace
de4x5_strncmp with memcmp.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lokesh Vutla [Thu, 4 Dec 2014 04:54:29 +0000 (10:24 +0530)]
drivers: net : cpsw: Update Kconfig for CPSW
CPSW is present in AM33xx, AM43xx, DRA7xx.
Updating the Kconfig to depend on ARCH_OMAP2PLUS instead of listing
all SoC's.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 3 Dec 2014 15:58:40 +0000 (16:58 +0100)]
tipc: fix missing spinlock init and nullptr oops
commit
908344cdda80 ("tipc: fix bug in multicast congestion
handling") introduced two bugs with the bclink wakeup
function. This commit fixes the missing spinlock init for the
waiting_sks list. We also eliminate the race condition
between the waiting_sks length check/dequeue operations in
tipc_bclink_wakeup_users by simply removing the redundant
length check.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Tero Aho <Tero.Aho@coriant.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Thu, 4 Dec 2014 02:43:11 +0000 (10:43 +0800)]
r8152: redefine REALTEK_USB_DEVICE
Redefine REALTEK_USB_DEVICE for the desired USB interface for probe().
There are three USB interfaces for the device. USB_CLASS_COMM and
USB_CLASS_CDC_DATA are for ECM mode (config #2). USB_CLASS_VENDOR_SPEC
is for the vendor mode (config #1). However, we are not interesting
in USB_CLASS_CDC_DATA for probe(), so redefine REALTEK_USB_DEVICE
to ignore the USB interface class of USB_CLASS_CDC_DATA.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 4 Dec 2014 01:04:39 +0000 (17:04 -0800)]
net: avoid two atomic operations in fast clones
Commit
ce1a4ea3f125 ("net: avoid one atomic operation in skb_clone()")
took the wrong way to save one atomic operation.
It is actually possible to avoid two atomic operations, if we
do not change skb->fclone values, and only rely on clone_ref
content to signal if the clone is available or not.
skb_clone() can simply use the fast clone if clone_ref is 1.
kfree_skbmem() can avoid the atomic_dec_and_test() if clone_ref is 1.
Note that because we usually free the clone before the original skb,
this particular attempt is only done for the original skb to have better
branch prediction.
SKB_FCLONE_FREE is removed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 3 Dec 2014 21:46:24 +0000 (13:46 -0800)]
rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()
The commit
56bfa7ee7c ("unregister_netdevice : move RTM_DELLINK to
until after ndo_uninit") tried to do this ealier but while doing so
it created a problem. Unfortunately the delayed rtmsg_ifinfo() also
delayed call to fill_info(). So this translated into asking driver
to remove private state and then query it's private state. This
could have catastropic consequences.
This change breaks the rtmsg_ifinfo() into two parts - one takes the
precise snapshot of the device by called fill_info() before calling
the ndo_uninit() and the second part sends the notification using
collected snapshot.
It was brought to notice when last link is deleted from an ipvlan device
when it has free-ed the port and the subsequent .fill_info() call is
trying to get the info from the port.
kernel: [ 255.139429] ------------[ cut here ]------------
kernel: [ 255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110()
kernel: [ 255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c
kernel: [ 255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167
kernel: [ 255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012
kernel: [ 255.139515]
0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0
kernel: [ 255.139516]
0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000
kernel: [ 255.139518]
00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011
kernel: [ 255.139520] Call Trace:
kernel: [ 255.139527] [<
ffffffff815d87f4>] dump_stack+0x46/0x58
kernel: [ 255.139531] [<
ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0
kernel: [ 255.139540] [<
ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20
kernel: [ 255.139544] [<
ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110
kernel: [ 255.139547] [<
ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0
kernel: [ 255.139549] [<
ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0
kernel: [ 255.139551] [<
ffffffff8150acab>] rtnl_dellink+0xbb/0x110
kernel: [ 255.139553] [<
ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240
kernel: [ 255.139557] [<
ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80
kernel: [ 255.139558] [<
ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20
kernel: [ 255.139562] [<
ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0
kernel: [ 255.139563] [<
ffffffff8150a495>] rtnetlink_rcv+0x25/0x40
kernel: [ 255.139565] [<
ffffffff8152c398>] netlink_unicast+0x178/0x230
kernel: [ 255.139567] [<
ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420
kernel: [ 255.139571] [<
ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0
kernel: [ 255.139575] [<
ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130
kernel: [ 255.139577] [<
ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0
kernel: [ 255.139578] [<
ffffffff814e1774>] ___sys_sendmsg+0x304/0x310
kernel: [ 255.139581] [<
ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0
kernel: [ 255.139585] [<
ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70
kernel: [ 255.139589] [<
ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500
kernel: [ 255.139597] [<
ffffffff811e8336>] ? dput+0xb6/0x190
kernel: [ 255.139606] [<
ffffffff811f05f6>] ? mntput+0x26/0x40
kernel: [ 255.139611] [<
ffffffff811d2b94>] ? __fput+0x174/0x1e0
kernel: [ 255.139613] [<
ffffffff814e2129>] __sys_sendmsg+0x49/0x90
kernel: [ 255.139615] [<
ffffffff814e2182>] SyS_sendmsg+0x12/0x20
kernel: [ 255.139617] [<
ffffffff815df092>] system_call_fastpath+0x12/0x17
kernel: [ 255.139619] ---[ end trace
5e6703e87d984f6b ]---
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 3 Dec 2014 17:33:17 +0000 (09:33 -0800)]
tc_act: export uapi header file
This file is used by iproute2 and should be exported.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 18:32:04 +0000 (13:32 -0500)]
Merge branch 'cxgb4-next'
Hariprasad Shenai says:
====================
cxgb4/cxgb4vf: T5 BAR2 and ethtool related fixes
This series adds new interface to calculate BAR2 SGE queue register address for
cxgb4 and cxgb4vf driver and some more sge related fixes for T5. Also adds a
patch which updates the FW version displayed by ethtool after firmware flash.
The patches series is created against 'net-next' tree.
And includes patches on cxgb4 and cxgb4vf driver.
We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 3 Dec 2014 14:02:54 +0000 (19:32 +0530)]
cxgb4: Update firmware version after flashing it via ethtool
After successfully loading new firmware, reload the new firmware's version
number information so "ethtool -i", etc. will report the right value
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 3 Dec 2014 14:02:53 +0000 (19:32 +0530)]
cxgb4/cxgb4vf: Use new interfaces to calculate BAR2 SGE Queue Register addresses
Use BAR2 Going To Sleep (GTS) for T5 and later. Use new BAR2 User Doorbells for
T5 for both cxgb4 and cxgb4vf driver.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 3 Dec 2014 14:02:52 +0000 (19:32 +0530)]
cxgb4/cxgb4vf: Add code to calculate T5 BAR2 Offsets for SGE Queue Registers
Add new Common Code facilities for calculating T5 BAR2 Offsets for SGE Queue
Registers. This new code can handle situations where
Queues Per Page * SGE BAR2 Queue Register Area Size > Page Size
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 3 Dec 2014 14:02:51 +0000 (19:32 +0530)]
cxgb4vf: Add and initialize some sge params for VF driver
Add sge_vf_eq_qpp and sge_vf_iq_qpp to (struct sge_params), initialize
sge_queues_per_page and sge_vf_qpp in t4vf_get_sge_params(), add new
t4vf_prep_adapter() which initializes basic adapter parameters.
Grab both SGE_EGRESS_QUEUES_PER_PAGE_VF and SGE_INGRESS_QUEUES_PER_PAGE_VF
for VF Drivers since we need both to calculate the User Doorbell area
offsets for Egress and Ingress Queues.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 3 Dec 2014 13:44:44 +0000 (14:44 +0100)]
tipc: drop tx side permission checks
Part of the old remote management feature is a piece of code
that checked permissions on the local system to see if a certain
operation was permitted, and if so pass the command to a remote
node. This serves no purpose after the removal of remote management
with commit
5902385a2440 ("tipc: obsolete the remote management
feature") so we remove it.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 3 Dec 2014 13:14:54 +0000 (14:14 +0100)]
rocker: fix eth_type type in struct rocker_ctrl
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 3 Dec 2014 13:14:53 +0000 (14:14 +0100)]
rocker: introduce be put/get variants and use it when appropriate
This kills the sparse warnings.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Wed, 3 Dec 2014 02:29:40 +0000 (10:29 +0800)]
ipv6: remove useless spin_lock/spin_unlock
xchg is atomic, so there is no necessary to use spin_lock/spin_unlock
to protect it. At last, remove the redundant
opt = xchg(&inet6_sk(sk)->opt, opt); statement.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lendacky, Thomas [Wed, 3 Dec 2014 00:07:18 +0000 (18:07 -0600)]
amd-xgbe: IRQ names require allocated memory
When requesting an irq, the name passed in must be (part of) allocated
memory. The irq name was a local variable and resulted in random
characters when listing /proc/interrupts. Add a character field to the
xgbe_channel structure to hold the irq name and use that.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David L Stevens [Tue, 9 Dec 2014 02:46:09 +0000 (21:46 -0500)]
sunvnet: fix incorrect rcu_read_unlock() in vnet_start_xmit()
This patch removes an extra rcu_read_unlock() on an allocation failure
in vnet_skb_shape(). The needed rcu_read_unlock() is already done in
the out_dropped label.
Reported-by: Rashmi Narasimhan <rashmi.narasimhan@oracle.com>
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Dec 2014 02:33:36 +0000 (21:33 -0500)]
Merge branch 'genet-gphy'
Florian Fainelli says:
====================
net: bcmgenet: support for new GPHY revision scheme
These two patches update the GENET GPHY revision logic to account for some
of our newer designs starting with GPHY rev G0.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 3 Dec 2014 17:57:00 +0000 (09:57 -0800)]
net: phy: bcm7xxx: add an explicit version check for GPHY rev G0
GPHY revision G0 has its version rolled over to 0x10, introduce an
explicit check for that revision and invoke the proper workaround
function for it.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 3 Dec 2014 17:56:59 +0000 (09:56 -0800)]
net: bcmgenet: add support for new GENET PHY revision scheme
Starting with GPHY revision G0, the GENET register layout has changed to
use the same numbering scheme as the Starfighter 2 switch. This means
that GPHY major revision is in bits 15:12, minor in bits 11:8 and patch
level is in bits 7:4.
Introduce a small heuristic which checks for the old scheme first, tests
for the new scheme and finally attempts to catch reserved values and
aborts.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>