Marcel Holtmann [Thu, 8 Sep 2016 16:07:07 +0000 (18:07 +0200)]
Bluetooth: Increase the subsystem minor version number
While the subsystem version information are purely informational,
increase the minor number due to the addition of user channel and
management control monitoring suppport. It is helpful for debugging
purposes to see the version numbers change.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Frédéric Dalleau [Thu, 8 Sep 2016 10:00:11 +0000 (12:00 +0200)]
Bluetooth: Fix reason code used for rejecting SCO connections
A comment in the code states that SCO connection should be rejected
with the proper error value between 0xd-0xf. The code uses
HCI_ERROR_REMOTE_LOW_RESOURCES which is 0x14.
This led to following error:
< HCI Command: Reject Synchronous Co.. (0x01|0x002a) plen 7
Address: 34:51:C9:EF:02:CA (Apple, Inc.)
Reason: Remote Device Terminated due to Low Resources (0x14)
> HCI Event: Command Status (0x0f) plen 4
Reject Synchronous Connection Request (0x01|0x002a) ncmd 1
Status: Invalid HCI Command Parameters (0x12)
Instead make use of HCI_ERROR_REJ_LIMITED_RESOURCES which is 0xd.
Signed-off-by: Frédéric Dalleau <frederic.dalleau@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Colin Ian King [Tue, 6 Sep 2016 12:15:49 +0000 (13:15 +0100)]
Bluetooth: btqca: remove null checks on edl->data as it is an array
edl->data is an array of __u8 so the null check is unneccessary,
so remove it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Sun, 4 Sep 2016 03:13:46 +0000 (05:13 +0200)]
Bluetooth: Fix wrong New Settings event when closing HCI User Channel
When closing HCI User Channel, the New Settings event was send out to
inform about changed settings. However such event is wrong since the
exclusive HCI User Channel access is active until the Index Added event
has been sent.
@ USER Close: test
@ MGMT Event: New Settings (0x0006) plen 4
Current settings: 0x00000ad0
Bondable
Secure Simple Pairing
BR/EDR
Low Energy
Secure Connections
= Close Index: 00:14:EF:22:04:12
@ MGMT Event: Index Added (0x0004) plen 0
Calling __mgmt_power_off from hci_dev_do_close requires an extra check
for an active HCI User Channel.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Thu, 1 Sep 2016 17:48:28 +0000 (19:48 +0200)]
Bluetooth: Send control open and close messages for HCI user channels
When opening and closing HCI user channel, send monitoring messages to
be able to trace its behavior.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Alexander Aring [Thu, 1 Sep 2016 09:24:57 +0000 (11:24 +0200)]
fakelb: fix schedule while atomic
This patch changes the spinlock to mutex for the available fakelb phy
list. When holding the spinlock the ieee802154_unregister_hw is called
which holding the rtnl_mutex, in that case we get a "BUG: sleeping function
called from invalid context" error. We simple change the spinlock to
mutex which allows to hold the rtnl lock there.
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Michał Narajowski [Thu, 1 Sep 2016 14:46:24 +0000 (16:46 +0200)]
Bluetooth: Append local name and CoD to Extended Controller Info
This adds device class, complete local name and short local name
to EIR data in Extended Controller Info as specified in docs.
Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Thu, 1 Sep 2016 14:46:23 +0000 (16:46 +0200)]
Bluetooth: Add framework for Extended Controller Information
This command is used to retrieve the current state and basic
information of a controller. It is typically used right after
getting the response to the Read Controller Index List command
or an Index Added event (or its extended counterparts).
When any of the values in the EIR_Data field changes, the event
Extended Controller Information Changed will be used to inform
clients about the updated information.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl>
Szymon Janc [Thu, 1 Sep 2016 15:22:37 +0000 (17:22 +0200)]
Bluetooth: btusb: Mark CW6622 devices to have broken link key commands
Conwise CW6622 seems to have a problem with the stored link key
commands so just mark it as broken.
< HCI Command: Read Local Supported Features (0x04|0x0003) plen 0
> HCI Event: Command Complete (0x0e) plen 12
Read Local Supported Features (0x04|0x0003) ncmd 1
status 0x00
Features: 0xff 0x3e 0x85 0x38 0x18 0x18 0x00 0x00
< HCI Command: Read Local Version Information (0x04|0x0001) plen 0
> HCI Event: Command Complete (0x0e) plen 12
Read Local Version Information (0x04|0x0001) ncmd 1
status 0x00
HCI Version: 2.0 (0x3) HCI Revision: 0x1f4
LMP Version: 2.0 (0x3) LMP Subversion: 0x1f4
Manufacturer: CONWISE Technology Corporation Ltd (66)
...
< HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0
> HCI Event: Command Complete (0x0e) plen 68
Read Local Supported Commands (0x04|0x0002) ncmd 1
status 0x00
Commands:
7fffef03cedfffffffffff1ff20ff8ff3f
...
< HCI Command: Read Stored Link Key (0x03|0x000d) plen 7
bdaddr 00:00:00:00:00:00 all 1
> HCI Event: Command Complete (0x0e) plen 8
Read Stored Link Key (0x03|0x000d) ncmd 1
status 0x11 max 0 num 0
Error: Unsupported Feature or Parameter Value
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Bhaktipriya Shridhar [Tue, 30 Aug 2016 17:12:53 +0000 (22:42 +0530)]
Bluetooth: Remove deprecated create_singlethread_workqueue
The workqueue "workqueue" queues multiple work items viz &qca->ws_awake_rx
&qca->ws_rx_vote_off, &qca->ws_awake_device, &qca->ws_tx_vote_off which
require strict execution ordering. Hence, an ordered dedicated workqueue
has been used to replace the deprecated create_singlethread_workqueue
instance.
WQ_MEM_RECLAIM has not been set since the driver is not being used on a
memory reclaim path.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:40 +0000 (05:00 +0200)]
Bluetooth: Handle HCI raw socket transition from unbound to bound
In case an unbound HCI raw socket is later on bound, ensure that the
monitor notification messages indicate a close and re-open. None of
the userspace tools use the socket this, but it is actually possible
to use an ioctl on an unbound socket and then later bind it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:39 +0000 (05:00 +0200)]
Bluetooth: Send control open and close messages for HCI raw sockets
When opening and closing HCI raw sockets their main usage is for legacy
userspace. To track interaction with the modern mgmt interface, send
open and close monitoring messages for these action.
The HCI raw sockets is special since it supports unbound ioctl operation
and for that special case delay the notification message until at least
one ioctl has been executed. The difference between a bound and unbound
socket will be detailed by the fact the HCI index is present or not.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:38 +0000 (05:00 +0200)]
Bluetooth: Add extra channel checks for control open/close messages
The control open and close monitoring events require special channel
checks to ensure messages are only send when the right events happen.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:37 +0000 (05:00 +0200)]
Bluetooth: Assign the channel early when binding HCI sockets
Assignment of the hci_pi(sk)->channel should be done early when binding
the HCI socket. This avoids confusion with the RAW channel that is used
for legacy access.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:36 +0000 (05:00 +0200)]
Bluetooth: Send control open and close only when cookie is present
Only when the cookie has been assigned, then send the open and close
monitor messages. Also if the socket is bound to a device, then include
the index into the message.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:35 +0000 (05:00 +0200)]
Bluetooth: Use numbers for subsystem version string
Instead of keeping a version string around, use version and revision
numbers and then stringify them for use as module parameter.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Tue, 30 Aug 2016 03:00:34 +0000 (05:00 +0200)]
Bluetooth: Introduce helper functions for socket cookie handling
Instead of manually allocating cookie information each time, use helper
functions for generating and releasing cookies.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Arnd Bergmann [Mon, 29 Aug 2016 12:36:18 +0000 (14:36 +0200)]
Bluetooth: add WCNSS dependency for HCI driver
The newly added bluetooth driver is based on the soc-specific support,
but lacks the obvious compile-time dependency on that:
drivers/bluetooth/btqcomsmd.o: In function `btqcomsmd_probe':
btqcomsmd.c:(.text.btqcomsmd_probe+0x40): undefined reference to `qcom_wcnss_open_channel'
btqcomsmd.c:(.text.btqcomsmd_probe+0x5c): undefined reference to `qcom_wcnss_open_channel'
Makefile:969: recipe for target 'vmlinux' failed
Fixes:
90c107dc8b2c ("Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Mon, 29 Aug 2016 04:31:57 +0000 (06:31 +0200)]
Bluetooth: Use command status event for Set IO Capability errors
In case of failure, the Set IO Capability command is suppose to return
command status and not command complete.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Mon, 29 Aug 2016 04:19:47 +0000 (06:19 +0200)]
Bluetooth: Fix wrong Get Clock Information return parameters
The address information of the Get Clock Information return parameters
is copying from a different memory location. It uses &cmd->param while
it actually needs to be cmd->param.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Mon, 29 Aug 2016 04:19:46 +0000 (06:19 +0200)]
Bluetooth: Use individual flags for certain management events
Instead of hiding everything behind a general managment events flag,
introduce indivdual flags that allow fine control over which events are
send to a given management channel.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Johan Hedberg [Sun, 28 Aug 2016 17:53:34 +0000 (20:53 +0300)]
Bluetooth: mgmt: Fix sending redundant event for Advertising Instance
When an Advertising Instance is removed, the Advertising Removed event
shouldn't be sent to the same socket that issued the Remove
Advertising command (it gets a command complete event instead). The
mgmt_advertising_removed() function already has a parameter for
skipping a specific socket, but there was no code to propagate the
right value to this parameter. This patch fixes the issue by making
sure the intermediate hci_req_clear_adv_instance() function gets the
socket pointer.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Sat, 27 Aug 2016 18:23:41 +0000 (20:23 +0200)]
Bluetooth: Add support for sending MGMT commands and events to monitor
This adds support for tracing all management commands and events via the
monitor interface.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Sat, 27 Aug 2016 18:23:40 +0000 (20:23 +0200)]
Bluetooth: Add support for sending MGMT open and close to monitor
This sends new notifications to the monitor support whenever a
management channel has been opened or closed. This allows tracing of
control channels really easily.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Sat, 27 Aug 2016 18:23:39 +0000 (20:23 +0200)]
Bluetooth: Introduce helper to pack mgmt version information
The mgmt version information will be also needed for the control
changell tracing feature. This provides a helper to pack them.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Sat, 27 Aug 2016 18:23:38 +0000 (20:23 +0200)]
Bluetooth: Store control socket cookie and comm information
To further allow unique identification and tracking of control socket,
store cookie and comm information when binding the socket.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Sat, 27 Aug 2016 18:23:37 +0000 (20:23 +0200)]
Bluetooth: Check SOL_HCI for raw socket options
The SOL_HCI level should be enforced when using socket options on the
HCI raw socket interface.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Wolfram Sang [Thu, 11 Aug 2016 21:00:31 +0000 (23:00 +0200)]
Bluetooth: bcm203x: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.
Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Kai-Heng Feng [Tue, 16 Aug 2016 04:50:06 +0000 (12:50 +0800)]
Bluetooth: btusb: Add support for 0cf3:e009
Device 0cf3:e009 is one of the QCA ROME family.
T: Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#= 4 Spd=12 MxCh= 0
D: Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=e009 Rev=00.01
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Nicolas Iooss [Fri, 29 Jul 2016 11:28:25 +0000 (13:28 +0200)]
Bluetooth: add printf format attribute to hci_set_[fh]w_info()
Commit
5177a83827cd ("Bluetooth: Add debugfs fields for hardware and
firmware info") introduced hci_set_hw_info() and hci_set_fw_info().
These functions use kvasprintf_const() but are not marked with a
__printf attribute. Adding such an attribute helps detecting issues
related to printf-formatting at build time.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Bart Van Assche [Thu, 11 Aug 2016 23:02:44 +0000 (16:02 -0700)]
Bluetooth: btusb, hci_intel: Fix wait_on_bit_timeout() return value checks
wait_on_bit_timeout() returns one of the following three values:
* 0 to indicate success.
* -EINTR to indicate that a signal has been received;
* -EAGAIN to indicate timeout;
Make the wait_on_bit_timeout() callers check for these values.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Bjorn Andersson [Sat, 13 Aug 2016 00:01:28 +0000 (17:01 -0700)]
Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver
The Qualcomm WCNSS chip provides two SMD channels to the BT core; one
for command and one for event packets. This driver exposes the two
channels as a hci device.
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Bjorn Andersson [Sat, 13 Aug 2016 00:01:27 +0000 (17:01 -0700)]
Bluetooth: Add HCI device identifier for Qualcomm SMD
This patch assigns the next free HCI device identifier to Bluetooth
devices based on the Qualcomm Shared Memory channels.
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Aristeu Rozanski [Mon, 25 Jul 2016 15:46:41 +0000 (11:46 -0400)]
mac802154: use rate limited warnings for malformed frames
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Aristeu Rozanski [Mon, 25 Jul 2016 15:46:40 +0000 (11:46 -0400)]
mac802154: don't warn on unsupported frames
Just because we don't support certain types of frames yet doesn't mean
we have to flood the message log with warnings about "invalid" frames.
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Alexander Aring [Sun, 24 Jul 2016 14:12:25 +0000 (16:12 +0200)]
6lowpan: ndisc: no overreact if no short address is available
This patch removes handling to remove short address for a neigbour entry
if RS/RA/NS/NA doesn't contain a short address. If these messages
doesn't has any short address option, the existing short address from
ndisc cache will be used. The current behaviour will set that the
neigbour doesn't has a short address anymore.
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Alexander Aring [Sun, 24 Jul 2016 14:12:24 +0000 (16:12 +0200)]
mac802154: set phy net namespace for new ifaces
This patch sets the net namespace when creating SoftMAC interfaces. This
is important if the namespace at phy layer was switched before.
Currently we losing interfaces in some namespace and it's not possible
to recover that.
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Thu, 21 Jul 2016 12:12:41 +0000 (14:12 +0200)]
Bluetooth: Add combined LED trigger for controller power
Instead of just having a LED trigger for power on a specific controller,
this adds the LED trigger "bluetooth-power" that combines the power
states of all controllers into a single trigger. This simplifies the
trigger selection and also supports multiple controllers per host
system via a single LED.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Marcel Holtmann [Thu, 21 Jul 2016 12:12:40 +0000 (14:12 +0200)]
Bluetooth: Put led_trigger field behind CONFIG_BT_LEDS
The led_trigger field in hci_dev should be conditional based on if
CONFIG_BT_LEDS is set or not.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
David S. Miller [Mon, 19 Sep 2016 05:52:21 +0000 (01:52 -0400)]
Merge tag 'rxrpc-rewrite-
20160917-2' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Tracepoint addition and improvement
Here is a set of patches that add some more tracepoints and improve a couple
of existing ones. New additions include:
(1) Connection refcount tracking.
(2) Client connection state machine tracking.
(3) Tx and Rx packet lifecycle.
(4) ACK reception and transmission.
(5) recvmsg processing.
Updates include:
(1) Print the symbolic packet name in the Rx packet tracepoint.
(2) Additional call refcount trace events.
(3) Improvements to sk_buff tracking with AF_RXRPC.
In addition:
(1) Config option to inject packet loss during both transmission and
reception.
(2) Removal of some printks.
This series needs to be applied on top of the previously posted fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 05:51:21 +0000 (01:51 -0400)]
Merge tag 'rxrpc-rewrite-
20160917-1' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fixes & miscellany
Here are some more AF_RXRPC fix patches with a couple of miscellaneous
changes also. Fixes include:
(1) Make RxRPC IPv6 support conditional on IPv6 being available.
(2) Move the condition check in rxrpc_locate_data() into the caller and
check the error return.
(3) Fix the detection of the last received packet in recvmsg.
(4) Account calls that need acceptance and clean up any unaccepted ones if
the socket gets closed.
(5) Fix the cleanup of client connections.
(6) Fix the soft-ACK parsing and the retransmission of packets based on
those ACKs.
(7) Suppress transmission of an ACK when there's no pending ACK to
transmit because another thread stole it.
And some miscellany:
(8) Whitespace removal.
(9) Switch-value consistency in rxrpc_send_call_packet().
(10) Fix the basic transmission packet size to allow for spur-of-the-moment
jumbo DATA packet production.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 05:47:23 +0000 (01:47 -0400)]
Merge branch 'net-sched-singly-linked-list'
Florian Westphal says:
====================
sched: convert queues to single-linked list
During Netfilter Workshop 2016 Eric Dumazet pointed out that qdisc
schedulers use doubly-linked lists, even though single-linked list
would be enough.
The double-linked skb lists incur one extra write on enqueue/dequeue
operations (to change ->prev pointer of next list elem).
This series converts qdiscs to single-linked version, listhead
maintains pointers to first (for dequeue) and last skb (for enqueue).
Most qdiscs don't queue at all and instead use a leaf qdisc (typically
pfifo_fast) so only a few schedulers needed changes.
I briefly tested netem and htb and they seemed fine.
UDP_STREAM netperf with 64 byte packets via veth+pfifo_fast shows
a small (~2%) improvement.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 17 Sep 2016 22:57:34 +0000 (00:57 +0200)]
sched: add and use qdisc_skb_head helpers
This change replaces sk_buff_head struct in Qdiscs with new qdisc_skb_head.
Its similar to the skb_buff_head api, but does not use skb->prev pointers.
Qdiscs will commonly enqueue at the tail of a list and dequeue at head.
While skb_buff_head works fine for this, enqueue/dequeue needs to also
adjust the prev pointer of next element.
The ->prev pointer is not required for qdiscs so we can just leave
it undefined and avoid one cacheline write access for en/dequeue.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 17 Sep 2016 22:57:33 +0000 (00:57 +0200)]
sched: replace __skb_dequeue with __qdisc_dequeue_head
After previous patch these functions are identical.
Replace __skb_dequeue in qdiscs with __qdisc_dequeue_head.
Next patch will then make __qdisc_dequeue_head handle
single-linked list instead of strcut sk_buff_head argument.
Doesn't change generated code.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 17 Sep 2016 22:57:32 +0000 (00:57 +0200)]
sched: remove qdisc arg from __qdisc_dequeue_head
Moves qdisc stat accouting to qdisc_dequeue_head.
The only direct caller of the __qdisc_dequeue_head version open-codes
this now.
This allows us to later use __qdisc_dequeue_head as a replacement
of __skb_dequeue() (which operates on sk_buff_head list).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 17 Sep 2016 22:57:31 +0000 (00:57 +0200)]
sched: don't use skb queue helpers
A followup change will replace the sk_buff_head in the qdisc
struct with a slightly different list.
Use of the sk_buff_head helpers will thus cause compiler
warnings.
Open-code these accesses in an extra change to ease review.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 17 Sep 2016 22:57:30 +0000 (00:57 +0200)]
pie: use qdisc_dequeue_head wrapper
Doesn't change generated code.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Sat, 17 Sep 2016 15:52:17 +0000 (15:52 +0000)]
cxgb4: Fix return value check in cfg_queues_uld()
Fix the retrn value check which testing the wrong variable
in cfg_queues_uld().
Fixes:
94cdb8bb993a ("cxgb4: Add support for dynamic allocation of
resources for ULD")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 05:40:54 +0000 (01:40 -0400)]
Merge branch 'mediatek-hw-lro'
Nelson Chang says:
====================
net: ethernet: mediatek: add HW LRO functions
The series add the large receive offload (LRO) functions by hardware and
the ethtool functions to configure RX flows of HW LRO.
changes since v3:
- Respin the patch by the newer driver
- Move the dts description of hwlro to optional properties
changes since v2:
- Add ndo_fix_features to prevent NETIF_F_LRO off while RX flow is programmed
- Rephrase the dts property is a capability if the hardware supports LRO
changes since v1:
- Add HW LRO support
- Add ethtool hooks to set LRO RX flows
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nelson Chang [Sat, 17 Sep 2016 15:50:57 +0000 (23:50 +0800)]
net: ethernet: mediatek: add the dts property to set if the HW supports LRO
Add the dts property for the capability if the hardware supports LRO.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nelson Chang [Sat, 17 Sep 2016 15:50:56 +0000 (23:50 +0800)]
net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO
The codes add ethtool functions to set RX flows for HW LRO. Because the
HW LRO hardware can only recognize the destination IP of TCP/IP RX flows,
the ethtool command to add HW LRO flow is as below:
ethtool -N [devname] flow-type tcp4 dst-ip [ip_addr] loc [0~1]
Otherwise, cause the hardware can set total four destination IPs, each
GMAC (GMAC1/GMAC2) can set two IPs separately at most.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nelson Chang [Sat, 17 Sep 2016 15:50:55 +0000 (23:50 +0800)]
net: ethernet: mediatek: add HW LRO functions of PDMA RX rings
The codes add the large receive offload (LRO) functions by hardware as below:
1) PDMA has total four RX rings that one is the normal ring, and others can
be configured as LRO rings.
2) Only TCP/IP RX flows can be offloaded. The hardware can set four IP
addresses at most, if the destination IP of the RX flow matches one of
them, it has the chance to be offloaded.
3) There three RX flows can be offloaded at most, and one flow is mapped to
one RX ring.
4) If there are more than three candidate RX flows, the hardware can
choose three of them by throughput comparison results.
Signed-off-by: Nelson Chang <nelson.chang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Sat, 17 Sep 2016 02:42:39 +0000 (08:12 +0530)]
chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's
Allocate resources dynamically to cxgb4's Upper layer driver's(ULD) like
cxgbit, iw_cxgb4 and cxgb4i. Allocate resources when they register with
cxgb4 driver and free them while unregistering. All the queues and the
interrupts for them will be allocated during ULD probe only and freed
during remove.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Fri, 16 Sep 2016 21:05:35 +0000 (23:05 +0200)]
sctp: Remove some redundant code
In commit
311b21774f13 ("sctp: simplify sk_receive_queue locking"), a call
to 'skb_queue_splice_tail_init()' has been made explicit. Previously it was
hidden in 'sctp_skb_list_tail()'
Now, the code around it looks redundant. The '_init()' part of
'skb_queue_splice_tail_init()' should already do the same.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Fri, 16 Sep 2016 20:36:12 +0000 (22:36 +0200)]
mlx4: fix XDP_TX is acting like XDP_PASS on TX ring full
The XDP_TX action can fail transmitting the frame in case the TX ring
is full or port is down. In case of TX failure it should drop the
frame, and not as now call 'break' which is the same as XDP_PASS.
Fixes:
9ecc2d86171a ("net/mlx4_en: add xdp forwarding and data write support")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 05:25:30 +0000 (01:25 -0400)]
Merge branch 'ipvlan-l3'
Mahesh Bandewar says:
====================
IPvlan introduce l3s mode
Same old problem with new approach especially from suggestions from
earlier patch-series.
First thing is that this is introduced as a new mode rather than
modifying the old (L3) mode. So the behavior of the existing modes is
preserved as it is and the new L3s mode obeys iptables so that intended
conn-tracking can work.
To do this, the code uses newly added l3mdev_rcv() handler and an
Iptables hook. l3mdev_rcv() to perform an inbound route lookup with the
correct (IPvlan slave) interface and then IPtable-hook at LOCAL_INPUT
to change the input device from master to the slave to complete the
formality.
Supporting stack changes are trivial changes to export symbol to get
IPv4 equivalent code exported for IPv6 and to allow netfilter hook
registration code to allow caller to hold RTNL. Please look into
individual patches for details.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Fri, 16 Sep 2016 19:59:19 +0000 (12:59 -0700)]
ipvlan: Introduce l3s mode
In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.
This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb->dev just before handing over skb to L4.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Fri, 16 Sep 2016 19:59:13 +0000 (12:59 -0700)]
net: Add _nf_(un)register_hooks symbols
Add _nf_register_hooks() and _nf_unregister_hooks() calls which allow
caller to hold RTNL mutex.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Fri, 16 Sep 2016 19:59:08 +0000 (12:59 -0700)]
ipv6: Export p6_route_input_lookup symbol
Make ip6_route_input_lookup available outside of ipv6 the module
similar to ip_route_input_noref in the IPv4 world.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 02:33:47 +0000 (22:33 -0400)]
Merge branch 'net-offloaded-stats'
Jiri Pirko says:
====================
net: return offloaded stats as default and expose original sw stats
The problem we try to handle is about offloaded forwarded packets
which are not seen by kernel. Let me try to draw it:
port1 port2 (HW stats are counted here)
\ /
\ /
\ /
--(A)---- ASIC --(B)--
|
(C)
|
CPU (SW stats are counted here)
Now we have couple of flows for TX and RX (direction does not matter here):
1) port1->A->ASIC->C->CPU
For this flow, HW and SW stats are equal.
2) port1->A->ASIC->C->CPU->C->ASIC->B->port2
For this flow, HW and SW stats are equal.
3) port1->A->ASIC->B->port2
For this flow, SW stats are 0.
The purpose of this patchset is to provide facility for user to
find out the difference between flows 1+2 and 3. In other words, user
will be able to see the statistics for the slow-path (through kernel).
Also note that HW stats are what someone calls "accumulated" stats.
Every packet counted by SW is also counted by HW. Not the other way around.
As a default the accumulated stats (HW) will be exposed to user
so the userspace apps can react properly.
This patchset add the SW stats (flows 1+2) under offload related stats, so
in the future we can expose other offload related stat in a similar way.
---
v9->v10:
- patch 2/3
- removed unnecessary ()s as pointed out by Nik
v8->v9:
- patch 2/3
- add using of idxattr and prividx
v7->v8:
- patch 2/3
- move helping const from uapi to rtnetlink
- cancel driver xstat nesting if it is empty
v6->v7:
- patch 1/3:
- ndo interface changed to get the wanted stats type as an input.
- change commit message.
- patch 2/3:
- create a nesting for offloaded stat and put SW stats under it.
- change the ndo call to indicate which offload stats we wants.
- change commit message.
- patch 3/3:
- change ndo implementation to match the changes in the previous patches.
- change commit message.
v5->v6:
- patch 2/4 was dropped as requested by Roopa
- patch 1/3:
- comment changed to indicate that default stats are combined stats
- commit massage changed
- patch 2/3: (previously 3/4)
- SW stats return nothing if there is no SW stats ndo
v4->v5:
- updated cover letter
- patch3/4:
- using memcpy directly to copy stats as requested by DaveM
v3->v4:
- patch1/4:
- fixed "return ()" pointed out by EricD
- patch2/4:
- fixed if_nlmsg_size as pointed out by EricD
v2->v3:
- patch1/4:
- added dev_have_sw_stats helper
- patch2/4:
- avoided memcpy as requested by DaveM
- patch3/4:
- use new dev_have_sw_stats helper
v1->v2:
- patch3/4:
- fixed NULL initialization
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Fri, 16 Sep 2016 13:05:38 +0000 (15:05 +0200)]
mlxsw: spectrum: Implement offload stats ndo and expose HW stats by default
Change the default statistics ndo to return HW statistics
(like the one returned by ethtool_ops).
The HW stats are collected to a cache by delayed work every 1 sec.
Implement the offload stat ndo.
Add a function to get SW statistics, to be called from this function.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Fri, 16 Sep 2016 13:05:37 +0000 (15:05 +0200)]
net: core: Add offload stats to if_stats_msg
Add a nested attribute of offload stats to if_stats_msg
named IFLA_STATS_LINK_OFFLOAD_XSTATS.
Under it, add SW stats, meaning stats only per packets that went via
slowpath to the cpu, named IFLA_OFFLOAD_XSTATS_CPU_HIT.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nogah Frankel [Fri, 16 Sep 2016 13:05:36 +0000 (15:05 +0200)]
netdevice: Add offload statistics ndo
Add a new ndo to return statistics for offloaded operation.
Since there can be many different offloaded operation with many
stats types, the ndo gets an attribute id by which it knows which
stats are wanted. The ndo also gets a void pointer to be cast according
to the attribute id.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 02:29:08 +0000 (22:29 -0400)]
Merge tag 'mac80211-next-for-davem-2016-09-16' of git://git./linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
This time we have various things - all across the board:
* MU-MIMO sniffer support in mac80211
* a create_singlethread_workqueue() cleanup
* interface dump filtering that was documented but not implemented
* support for the new radiotap timestamp field
* send delBA in two unexpected conditions (as required by the spec)
* connect keys cleanups - allow only WEP with index 0-3
* per-station aggregation limit to work around broken APs
* debugfs improvement for the integrated codel algorithm
and various other small improvements and cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 16 Sep 2016 09:43:38 +0000 (10:43 +0100)]
net: r6040: add in missing white space in error message text
A couple of dev_err messages span two lines and the literal
string is missing a white space between words. Add the white
space and join the two lines into one.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: FLorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 15 Sep 2016 23:20:01 +0000 (16:20 -0700)]
pkt_sched: fq: use proper locking in fq_dump_stats()
When fq is used on 32bit kernels, we need to lock the qdisc before
copying 64bit fields.
Otherwise "tc -s qdisc ..." might report bogus values.
Fixes:
afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thadeu Lima de Souza Cascardo [Thu, 15 Sep 2016 22:11:53 +0000 (19:11 -0300)]
openvswitch: use percpu flow stats
Instead of using flow stats per NUMA node, use it per CPU. When using
megaflows, the stats lock can be a bottleneck in scalability.
On a E5-2690 12-core system, usual throughput went from ~4Mpps to
~15Mpps when forwarding between two 40GbE ports with a single flow
configured on the datapath.
This has been tested on a system with possible CPUs 0-7,16-23. After
module removal, there were no corruption on the slab cache.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: pravin shelar <pshelar@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thadeu Lima de Souza Cascardo [Thu, 15 Sep 2016 22:11:52 +0000 (19:11 -0300)]
openvswitch: fix flow stats accounting when node 0 is not possible
On a system with only node 1 as possible, all statistics is going to be
accounted on node 0 as it will have a single writer.
However, when getting and clearing the statistics, node 0 is not going
to be considered, as it's not a possible node.
Tested that statistics are not zero on a system with only node 1
possible. Also compile-tested with CONFIG_NUMA off.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Sep 2016 02:02:40 +0000 (22:02 -0400)]
Merge branch 'sctp-transmit-errs'
Xin Long says:
====================
sctp: fix the transmit err process
This patchset is to improve the transmit err process and also fix some
issues.
After this patchset, once the chunks are enqueued successfully, even
if the chunks fail to send out, no matter because of nodst or nomem,
no err retruns back to users any more. Instead, they are taken care
of by retransmit.
v1->v2:
- add more details to the changelog in patch 1/6
- add Fixes: tag in patch 2/6, 3/6
- also revert
69b5777f2e57 in patch 3/6
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:23 +0000 (02:04 +0800)]
sctp: not return ENOMEM err back in sctp_packet_transmit
As David and Marcelo's suggestion, ENOMEM err shouldn't return back to
user in transmit path. Instead, sctp's retransmit would take care of
the chunks that fail to send because of ENOMEM.
This patch is only to do some release job when alloc_skb fails, not to
return ENOMEM back any more.
Besides, it also cleans up sctp_packet_transmit's err path, and fixes
some issues in err path:
- It didn't free the head skb in nomem: path.
- No need to check nskb in no_route: path.
- It should goto err: path if alloc_skb fails for head.
- Not all the NOMEMs should free nskb.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:22 +0000 (02:04 +0800)]
sctp: make sctp_outq_flush/tail/uncork return void
sctp_outq_flush return value is meaningless now, this patch is
to make sctp_outq_flush return void, as well as sctp_outq_fail
and sctp_outq_uncork.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:21 +0000 (02:04 +0800)]
sctp: save transmit error to sk_err in sctp_outq_flush
Every time when sctp calls sctp_outq_flush, it sends out the chunks of
control queue, retransmit queue and data queue. Even if some trunks are
failed to transmit, it still has to flush all the transports, as it's
the only chance to clean that transmit_list.
So the latest transmit error here should be returned back. This transmit
error is an internal error of sctp stack.
I checked all the places where it uses the transmit error (the return
value of sctp_outq_flush), most of them are actually just save it to
sk_err.
Except for sctp_assoc/endpoint_bh_rcv, they will drop the chunk if
it's failed to send a REPLY, which is actually incorrect, as we can't
be sure the error that sctp_outq_flush returns is from sending that
REPLY.
So it's meaningless for sctp_outq_flush to return error back.
This patch is to save transmit error to sk_err in sctp_outq_flush, the
new error can update the old value. Eventually, sctp_wait_for_* would
check for it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:20 +0000 (02:04 +0800)]
sctp: free msg->chunks when sctp_primitive_SEND return err
Last patch "sctp: do not return the transmit err back to sctp_sendmsg"
made sctp_primitive_SEND return err only when asoc state is unavailable.
In this case, chunks are not enqueued, they have no chance to be freed if
we don't take care of them later.
This Patch is actually to revert commit
1cd4d5c4326a ("sctp: remove the
unused sctp_datamsg_free()"), commit
69b5777f2e57 ("sctp: hold the chunks
only after the chunk is enqueued in outq") and commit
8b570dc9f7b6 ("sctp:
only drop the reference on the datamsg after sending a msg"), to use
sctp_datamsg_free to free the chunks of current msg.
Fixes:
8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:19 +0000 (02:04 +0800)]
sctp: do not return the transmit err back to sctp_sendmsg
Once a chunk is enqueued successfully, sctp queues can take care of it.
Even if it is failed to transmit (like because of nomem), it should be
put into retransmit queue.
If sctp report this error to users, it confuses them, they may resend
that msg, but actually in kernel sctp stack is in charge of retransmit
it already.
Besides, this error probably is not from the failure of transmitting
current msg, but transmitting or retransmitting another msg's chunks,
as sctp_outq_flush just tries to send out all transports' chunks.
This patch is to make sctp_cmd_send_msg return avoid, and not return the
transmit err back to sctp_sendmsg
Fixes:
8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Sep 2016 18:04:18 +0000 (02:04 +0800)]
sctp: remove the unnecessary state check in sctp_outq_tail
Data Chunks are only sent by sctp_primitive_SEND, in which sctp checks
the asoc's state through statetable before calling sctp_outq_tail. So
there's no need to check the asoc's state again in sctp_outq_tail.
Besides, sctp_do_sm is protected by lock_sock, even if sending msg is
interrupted by timer events, the event's processes still need to acquire
lock_sock first. It means no others CMDs can be enqueue into side effect
list before CMD_SEND_MSG to change asoc->state, so it's safe to remove it.
This patch is to remove redundant asoc->state check from sctp_outq_tail.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Sep 2016 14:13:16 +0000 (10:13 -0400)]
Merge branch 'ip_tunnel-collect_md'
Alexei Starovoitov says:
====================
ip_tunnel: add collect_md mode to IPv4/IPv6 tunnels
Similar to geneve, vxlan, gre tunnels implement 'collect metadata' mode
in ipip, ipip6, ip6ip6 tunnels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 15 Sep 2016 20:00:32 +0000 (13:00 -0700)]
samples/bpf: add comprehensive ipip, ipip6, ip6ip6 test
the test creates 3 namespaces with veth connected via bridge.
First two namespaces simulate two different hosts with the same
IPv4 and IPv6 addresses configured on the tunnel interface and they
communicate with outside world via standard tunnels.
Third namespace creates collect_md tunnel that is driven by BPF
program which selects different remote host (either first or
second namespace) based on tcp dest port number while tcp dst
ip is the same.
This scenario is rough approximation of load balancer use case.
The tests check both traditional tunnel configuration and collect_md mode.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 15 Sep 2016 20:00:31 +0000 (13:00 -0700)]
samples/bpf: extend test_tunnel_bpf.sh with IPIP test
extend existing tests for vxlan, geneve, gre to include IPIP tunnel.
It tests both traditional tunnel configuration and
dynamic via bpf helpers.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 15 Sep 2016 20:00:30 +0000 (13:00 -0700)]
ip6_tunnel: add collect_md mode to IPv6 tunnels
Similar to gre, vxlan, geneve tunnels allow IPIP6 and IP6IP6 tunnels
to operate in 'collect metadata' mode.
Unlike ipv4 code here it's possible to reuse ip6_tnl_xmit() function
for both collect_md and traditional tunnels.
bpf_skb_[gs]et_tunnel_key() helpers and ovs (in the future) are the users.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 15 Sep 2016 20:00:29 +0000 (13:00 -0700)]
ip_tunnel: add collect_md mode to IPIP tunnel
Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 15 Sep 2016 20:23:26 +0000 (22:23 +0200)]
l2tp: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p = { ... };
@ok@
identifier r.i;
struct net_device e;
position p;
@@
e.netdev_ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct net_device_ops i = { ... };
// </smpl>
The result of size on this file before the change is:
text data bss dec hex filename
3401 931 44 4376 1118 net/l2tp/l2tp_eth.o
and after the change it is:
text data bss dec hex filename
3993 347 44 4384 1120 net/l2tp/l2tp_eth.o
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 15 Sep 2016 20:23:25 +0000 (22:23 +0200)]
dwc_eth_qos: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p = { ... };
@ok@
identifier r.i;
struct net_device e;
position p;
@@
e.netdev_ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct net_device_ops i = { ... };
// </smpl>
The result of size on this file before the change is:
text data bss dec hex filename
21623 1316 40 22979 59c3
drivers/net/ethernet/synopsys/dwc_eth_qos.o
and after the change it is:
text data bss dec hex filename
22199 724 40 22963 59b3
drivers/net/ethernet/synopsys/dwc_eth_qos.o
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Thu, 15 Sep 2016 20:23:24 +0000 (22:23 +0200)]
hisilicon: constify net_device_ops structures
Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure. This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p = { ... };
@ok@
identifier r.i;
struct net_device e;
position p;
@@
e.netdev_ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct net_device_ops i = { ... };
// </smpl>
The result of size on this file before the change is:
text data bss dec hex filename
7995 848 8 8851 2293
drivers/net/ethernet/hisilicon/hip04_eth.o
and after the change it is:
text data bss dec hex filename
8571 256 8 8835 2283
drivers/net/ethernet/hisilicon/hip04_eth.o
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Cox [Thu, 15 Sep 2016 17:51:25 +0000 (18:51 +0100)]
llc: switch type to bool as the timeout is only tested versus 0
(As asked by Dave in Februrary)
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 15 Sep 2016 17:18:45 +0000 (10:18 -0700)]
net: l3mdev: Remove netif_index_is_l3_master
No longer used after
e0d56fdd73422 ("net: l3mdev: remove redundant calls")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 15 Sep 2016 17:13:47 +0000 (10:13 -0700)]
net: vrf: Remove RT_FL_TOS
No longer used after
d66f6c0a8f3c0 ("net: ipv4: Remove l3mdev_get_saddr")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 15 Sep 2016 16:33:02 +0000 (09:33 -0700)]
tcp: prepare skbs for better sack shifting
With large BDP TCP flows and lossy networks, it is very important
to keep a low number of skbs in the write queue.
RACK and SACK processing can perform a linear scan of it.
We should avoid putting any payload in skb->head, so that SACK
shifting can be done if needed.
With this patch, we allow to pack ~0.5 MB per skb instead of
the 64KB initially cooked at tcp_sendmsg() time.
This gives a reduction of number of skbs in write queue by eight.
tcp_rack_detect_loss() likes this.
We still allow payload in skb->head for first skb put in the queue,
to not impact RPC workloads.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Sep 2016 13:53:29 +0000 (09:53 -0400)]
Merge tag 'wireless-drivers-next-for-davem-2016-09-15' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.9
Major changes:
iwlwifi
* preparation for new a000 HW continues
* some DQA improvements
* add support for GMAC
* add support for 9460, 9270 and 9170 series
mwifiex
* support random MAC address for scanning
* add HT aggregation support for adhoc mode
* add custom regulatory domain support
* add manufacturing mode support via nl80211 testmode interface
bcma
* support BCM53573 series of wireless SoCs
bitfield.h
* add FIELD_PREP() and FIELD_GET() macros
mt7601u
* convert to use the new bitfield.h macros
brcmfmac
* add support for bcm4339 chip with modalias sdio:c00v02D0d4339
ath10k
* add nl80211 testmode support for 10.4 firmware
* hide kernel addresses from logs using %pK format specifier
* implement NAPI support
* enable peer stats by default
ath9k
* use ieee80211_tx_status_noskb where possible
wil6210
* extract firmware capabilities from the firmware file
ath6kl
* enable firmware crash dumps on the AR6004
ath-current is also merged to fix a conflict in ath10k.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 17 Sep 2016 13:51:48 +0000 (09:51 -0400)]
Merge branch 'mlx5e-order-0'
Tariq Toukan says:
====================
mlx5e Order-0 pages for Striding RQ
In this series, we refactor our Striding RQ receive-flow to always use
fragmented WQEs (Work Queue Elements) using order-0 pages, omitting the
flow that allocates and splits high-order pages which would fragment
and deplete high-order pages in the system.
The first patch gives a slight degradation, but opens the opportunity
to using a simple page-cache mechanism of a fair size.
The page-cache, implemented in patch 3, not only closes the performance
gap but even gives a gain.
In patch 2 we re-organize the code to better manage the calls for
alloc/de-alloc pages in the RX flow.
Series generated against net-next commit:
bed806cb266e "Merge branch 'mlxsw-ethtool'"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Thu, 15 Sep 2016 13:08:38 +0000 (16:08 +0300)]
net/mlx5e: Implement RX mapped page cache for page recycle
Instead of reallocating and mapping pages for RX data-path,
recycle already used pages in a per ring cache.
Performance tests:
The following results were measured on a freshly booted system,
giving optimal baseline performance, as high-order pages are yet to
be fragmented and depleted.
We ran pktgen single-stream benchmarks, with iptables-raw-drop:
Single stride, 64 bytes:
* 4,739,057 - baseline
* 4,749,550 - order0 no cache
* 4,786,899 - order0 with cache
1% gain
Larger packets, no page cross, 1024 bytes:
* 3,982,361 - baseline
* 3,845,682 - order0 no cache
* 4,127,852 - order0 with cache
3.7% gain
Larger packets, every 3rd packet crosses a page, 1500 bytes:
* 3,731,189 - baseline
* 3,579,414 - order0 no cache
* 3,931,708 - order0 with cache
5.4% gain
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Thu, 15 Sep 2016 13:08:37 +0000 (16:08 +0300)]
net/mlx5e: Introduce API for RX mapped pages
Manage the allocation and deallocation of mapped RX pages only
through dedicated API functions.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Thu, 15 Sep 2016 13:08:36 +0000 (16:08 +0300)]
net/mlx5e: Single flow order-0 pages for Striding RQ
To improve the memory consumption scheme, we omit the flow that
demands and splits high-order pages in Striding RQ, and stay
with a single Striding RQ flow that uses order-0 pages.
Moving to fragmented memory allows the use of larger MPWQEs,
which reduces the number of UMR posts and filler CQEs.
Moving to a single flow allows several optimizations that improve
performance, especially in production servers where we would
anyway fallback to order-0 allocations:
- inline functions that were called via function pointers.
- improve the UMR post process.
This patch alone is expected to give a slight performance reduction.
However, the new memory scheme gives the possibility to use a page-cache
of a fair size, that doesn't inflate the memory footprint, which will
dramatically fix the reduction and even give a performance gain.
Performance tests:
The following results were measured on a freshly booted system,
giving optimal baseline performance, as high-order pages are yet to
be fragmented and depleted.
We ran pktgen single-stream benchmarks, with iptables-raw-drop:
Single stride, 64 bytes:
* 4,739,057 - baseline
* 4,749,550 - this patch
no reduction
Larger packets, no page cross, 1024 bytes:
* 3,982,361 - baseline
* 3,845,682 - this patch
3.5% reduction
Larger packets, every 3rd packet crosses a page, 1500 bytes:
* 3,731,189 - baseline
* 3,579,414 - this patch
4% reduction
Fixes:
461017cb006a ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Fixes:
bc77b240b3c5 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Sat, 17 Sep 2016 09:49:15 +0000 (10:49 +0100)]
rxrpc: Add config to inject packet loss
Add a configuration option to inject packet loss by discarding
approximately every 8th packet received and approximately every 8th DATA
packet transmitted.
Note that no locking is used, but it shouldn't really matter.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:14 +0000 (10:49 +0100)]
rxrpc: Improve skb tracing
Improve sk_buff tracing within AF_RXRPC by the following means:
(1) Use an enum to note the event type rather than plain integers and use
an array of event names rather than a big multi ?: list.
(2) Distinguish Rx from Tx packets and account them separately. This
requires the call phase to be tracked so that we know what we might
find in rxtx_buffer[].
(3) Add a parameter to rxrpc_{new,see,get,free}_skb() to indicate the
event type.
(4) A pair of 'rotate' events are added to indicate packets that are about
to be rotated out of the Rx and Tx windows.
(5) A pair of 'lost' events are added, along with rxrpc_lose_skb() for
packet loss injection recording.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:14 +0000 (10:49 +0100)]
rxrpc: Remove printks from rxrpc_recvmsg_data() to fix uninit var
Remove _enter/_debug/_leave calls from rxrpc_recvmsg_data() of which one
uses an uninitialised variable.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 10:13:31 +0000 (11:13 +0100)]
rxrpc: Add a tracepoint to follow what recvmsg does
Add a tracepoint to follow what recvmsg does within AF_RXRPC.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:13 +0000 (10:49 +0100)]
rxrpc: Add a tracepoint to follow packets in the Rx buffer
Add a tracepoint to follow the life of packets that get added to a call's
receive buffer.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:13 +0000 (10:49 +0100)]
rxrpc: Add a tracepoint to log ACK transmission
Add a tracepoint to log information about ACK transmission.
Signed-off-by: David Howels <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:13 +0000 (10:49 +0100)]
rxrpc: Add a tracepoint to log received ACK packets
Add a tracepoint to log information from received ACK packets.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Sat, 17 Sep 2016 09:49:13 +0000 (10:49 +0100)]
rxrpc: Add a tracepoint to follow the life of a packet in the Tx buffer
Add a tracepoint to follow the insertion of a packet into the transmit
buffer, its transmission and its rotation out of the buffer.
Signed-off-by: David Howells <dhowells@redhat.com>