Pontus Fuchs [Tue, 19 Apr 2016 05:00:44 +0000 (22:00 -0700)]
wcn36xx: Add helper macros to cast sta to priv
While poking at this I also change two related things. I rename one
variable to make the names consistent. I also move one assignment of
priv_sta to the declaration to save a few lines.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Pontus Fuchs [Tue, 19 Apr 2016 05:00:43 +0000 (22:00 -0700)]
wcn36xx: Use define for invalid index and fix typo
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Pontus Fuchs [Tue, 19 Apr 2016 05:00:42 +0000 (22:00 -0700)]
wcn36xx: Use consistent name for private vif
Some code used priv_vif and some used vif_priv. Convert all to vif_priv
for consistency.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Pontus Fuchs [Tue, 19 Apr 2016 05:00:41 +0000 (22:00 -0700)]
wcn36xx: Add helper macros to cast vif to private vif and vice versa
Makes the code a little easier to read.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
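The kind of cast helpers this commit describes can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: struct demo_vif, struct demo_vif_priv and all demo_* functions are hypothetical stand-ins, and the layout (a generic object with driver-private storage at its tail) is an assumption modelled on how such private areas are commonly embedded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stack-owned interface object. */
struct demo_vif {
    int type;
    /* Driver-private storage, allocated together with the object. */
    _Alignas(void *) unsigned char drv_priv[];
};

/* Hypothetical driver-private state that lives inside drv_priv. */
struct demo_vif_priv {
    uint8_t bss_index;
    uint8_t self_sta_index;
};

/* Generic object -> driver-private view. */
static inline struct demo_vif_priv *demo_vif_to_priv(struct demo_vif *vif)
{
    return (struct demo_vif_priv *)vif->drv_priv;
}

/* Driver-private view -> enclosing generic object (container_of idiom). */
static inline struct demo_vif *demo_priv_to_vif(struct demo_vif_priv *vif_priv)
{
    return (struct demo_vif *)((unsigned char *)vif_priv -
                               offsetof(struct demo_vif, drv_priv));
}

/* Allocate the generic object with room for the private part behind it. */
static struct demo_vif *demo_vif_alloc(void)
{
    return calloc(1, sizeof(struct demo_vif) + sizeof(struct demo_vif_priv));
}
```

Wrapping both directions in named helpers keeps the pointer arithmetic in one place, so call sites read as demo_vif_to_priv(vif) instead of repeating the cast.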
Pontus Fuchs [Tue, 19 Apr 2016 05:00:40 +0000 (22:00 -0700)]
wcn36xx: Pad TIM PVM if needed
The wcn36xx FW expects a fixed size TIM PVM in the beacon template. If
supplied with a shorter than expected PVM it will overwrite the IE
following the TIM.
Squashed with fix from Jason Mobarak <jam@cozybit.com>:
Patch "wcn36xx: Pad TIM PVM if needed" has caused a regression in mesh
beaconing. The field tim_off is always 0 for mesh mode, and thus
pvm_len (referring to the TIM length field) and pad are both incorrectly
calculated. Thus, msg_body.beacon_length is incorrectly calculated for
mesh mode. Fix this.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Jason Mobarak <jam@cozybit.com>
[bjorn: squashed in Jason's fixup]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
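The padding idea from the commit above can be sketched as follows. This is an illustration under stated assumptions, not the driver's implementation: demo_pad_tim_pvm() and its parameters are hypothetical, the minimum PVM size is passed in by the caller because the log does not state the firmware's value, and tim_off == 0 is treated as "no TIM to pad", matching the mesh case described in the squashed fix.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* TIM element body before the PVM: DTIM count, DTIM period, bitmap control. */
#define DEMO_TIM_FIXED_LEN 3

/*
 * Pad the partial virtual bitmap (PVM) of the TIM element at tim_off with
 * zero bytes until it is at least min_pvm bytes long, shifting the elements
 * that follow. Returns the new frame length, or -1 if the frame is malformed
 * or the buffer (capacity cap) is too small.
 */
static long demo_pad_tim_pvm(uint8_t *buf, size_t len, size_t cap,
                             size_t tim_off, size_t min_pvm)
{
    size_t ie_len, pvm_len, pvm_end, pad;

    if (tim_off == 0)                 /* no TIM (e.g. mesh): nothing to pad */
        return (long)len;
    if (tim_off + 2 > len)
        return -1;

    ie_len = buf[tim_off + 1];        /* length field of the TIM element */
    if (ie_len < DEMO_TIM_FIXED_LEN || tim_off + 2 + ie_len > len)
        return -1;

    pvm_len = ie_len - DEMO_TIM_FIXED_LEN;
    if (pvm_len >= min_pvm)           /* already long enough */
        return (long)len;

    pad = min_pvm - pvm_len;
    if (len + pad > cap || ie_len + pad > 255)
        return -1;

    pvm_end = tim_off + 2 + ie_len;   /* first byte after the TIM element */
    memmove(buf + pvm_end + pad, buf + pvm_end, len - pvm_end);
    memset(buf + pvm_end, 0, pad);    /* zero bits: no buffered traffic */
    buf[tim_off + 1] = (uint8_t)(ie_len + pad);
    return (long)(len + pad);
}
```

The caller has to use the returned length for everything that follows, which is the step the mesh regression got wrong when the pad was computed from a TIM offset of 0.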
Pontus Fuchs [Tue, 19 Apr 2016 05:00:39 +0000 (22:00 -0700)]
wcn36xx: Clean up wcn36xx_smd_send_beacon
Needed for coming improvements. No functional changes.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
[bjorn: restored BEACON_TEMPLATE_SIZE define to 0x180]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Vittorio Gambaletta (VittGam) [Mon, 11 Apr 2016 02:48:55 +0000 (04:48 +0200)]
ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.
The Wistron DNMA-92 and Compex WLM200NX have inverted LED polarity
(active high instead of active low).
The same PCI Subsystem ID is used by both cards, which are based on
the same Atheros MB92 design.
Cc: <linux-wireless@vger.kernel.org>
Cc: <ath9k-devel@qca.qualcomm.com>
Cc: <ath9k-devel@lists.ath9k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Vittorio Gambaletta (VittGam) [Mon, 11 Apr 2016 02:48:54 +0000 (04:48 +0200)]
ath9k: Add a module parameter to invert LED polarity.
The LED can be active high instead of active low on some hardware.
Add the led_active_high module parameter. It defaults to -1 to obey
platform data as before.
Setting the parameter to 1 or 0 will force the LED respectively
active high or active low.
Cc: <linux-wireless@vger.kernel.org>
Cc: <ath9k-devel@qca.qualcomm.com>
Cc: <ath9k-devel@lists.ath9k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
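The tri-state semantics of the parameter (-1 obeys platform data, 0 and 1 force a polarity) can be sketched like this. The helper names are hypothetical and the code is a userspace illustration of the decision only, not the ath9k LED code.

```c
#include <stdbool.h>

/*
 * Resolve the effective polarity. param < 0 means "not set by the user",
 * so the platform-provided default wins; otherwise the user's value is
 * forced.
 */
static bool demo_led_is_active_high(int param, bool platform_active_high)
{
    if (param < 0)
        return platform_active_high;
    return param != 0;
}

/* GPIO level to drive for the requested LED state under that polarity. */
static int demo_led_gpio_level(bool led_on, bool active_high)
{
    return led_on == active_high ? 1 : 0;
}
```

Using -1 as the default keeps existing setups unchanged: only users who pass led_active_high=0 or led_active_high=1 when loading the module override what the platform data says.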
Kalle Valo [Wed, 20 Apr 2016 16:46:16 +0000 (19:46 +0300)]
ath10k: remove enum ath10k_swap_code_seg_bin_type
It's not needed for anything so just get rid of it.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:46:01 +0000 (19:46 +0300)]
ath10k: switch testmode to use ath10k_core_fetch_firmware_api_n()
Now that all firmware-N.bin related data is within struct ath10k_fw_file, we can
switch to ath10k_core_fetch_firmware_api_n() and delete the almost identical
ath10k_tm_fetch_utf_firmware_api_2().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:45:47 +0000 (19:45 +0300)]
ath10k: move htt_op_version to struct ath10k_fw_file
Preparation for testmode.c to use ath10k_core_fetch_board_data_api_n().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:45:33 +0000 (19:45 +0300)]
ath10k: move wmi_op_version to struct ath10k_fw_file
Preparation for testmode.c to use ath10k_core_fetch_board_data_api_n().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:45:18 +0000 (19:45 +0300)]
ath10k: move fw_features to struct ath10k_fw_file
Preparation for testmode.c to use ath10k_core_fetch_board_data_api_n().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:45:05 +0000 (19:45 +0300)]
ath10k: move fw_version inside struct ath10k_fw_file
Preparation for testmode.c to use ath10k_core_fetch_board_data_api_n().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:44:51 +0000 (19:44 +0300)]
ath10k: refactor firmware images to struct ath10k_fw_components
To make it easier to share ath10k_core_fetch_board_data_api_n() with testmode.c
refactor all firmware components to struct ath10k_fw_components. This structure
will hold firmware related files, for example firmware-N.bin and board-N.bin.
For firmware-N.bin create a new struct ath10k_fw_file which contains the actual
firmware image as well as the parsed data from the image.
Modify ath10k_core_start() to take struct ath10k_fw_components as an argument,
which makes it possible in following patches to drop some ugly hacks from
testmode.c.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 20 Apr 2016 16:44:36 +0000 (19:44 +0300)]
ath10k: remove deprecated firmware API 1 support
This was deprecated years ago, I haven't heard of anyone using it since and
most likely it won't even work anymore. So just remove all of it.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Colin Ian King [Sun, 10 Apr 2016 11:25:31 +0000 (12:25 +0100)]
ath9k: remove duplicate assignment of variable ah
ah is written twice with the same value, remove one of the
redundant assignments to ah.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Zefir Kurtisi [Fri, 1 Apr 2016 09:37:08 +0000 (11:37 +0200)]
ath9k: interpret requested txpower in EIRP domain
Tx power limitations at upper layers are interpreted in
the EIRP domain. When the user requests a given maximum
txpower, e.g. with: 'iw phy0 set txpower fixed 1500',
he expects the EIRP to be at or below 15dBm.
In ath9k_hw_apply_txpower(), the interpretation is
different: the antenna-gain is capped against the
current txpower limit in the regulatory, but not
against the user set value. It ensures that the
resulting EIRP is below the limit defined by the
active countrycode, but not below the value the
user requested.
In a scenario such as
a) antenna_gain=6
b) countrycode limits to eirp=18
c) user set txpower=15
this will cause a setting for AR_PHY_POWER_TX_RATE
regs resulting in an EIRP > 15.
This patch ensures that antenna-gain is considered
whenever the txpower limit is adjusted and with that
the user set limits are kept.
Signed-off-by: Zefir Kurtisi <zefir.kurtisi@neratec.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
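The arithmetic of the scenario in this commit message can be worked through in a few lines of C. This is one reading of the description, not the ath9k code: the function names are hypothetical, values are whole dBm/dBi for readability, and the "before" formula is an assumption derived from the statement that antenna gain was only applied against the regulatory limit.

```c
static int demo_min(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Assumed old behaviour: antenna gain is subtracted from the regulatory
 * EIRP limit only, and the user limit is compared as conducted power.
 */
static int demo_conducted_before(int user_limit, int reg_eirp_limit,
                                 int antenna_gain)
{
    return demo_min(user_limit, reg_eirp_limit - antenna_gain);
}

/*
 * Behaviour after the patch as described: both limits are EIRP values,
 * so the gain is subtracted after taking the stricter of the two.
 */
static int demo_conducted_after(int user_limit, int reg_eirp_limit,
                                int antenna_gain)
{
    return demo_min(user_limit, reg_eirp_limit) - antenna_gain;
}

/* Radiated power is conducted power plus antenna gain. */
static int demo_eirp(int conducted, int antenna_gain)
{
    return conducted + antenna_gain;
}
```

With antenna_gain=6, a regulatory limit of 18 and a user limit of 15, the assumed old formula yields a conducted power of 12 and an EIRP of 18, while the new one yields 9 and an EIRP of 15, which is what the user asked for.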
Markus Elfring [Fri, 1 Jan 2016 18:09:32 +0000 (19:09 +0100)]
ath9k_htc: Replace a variable initialisation by an assignment in ath9k_htc_set_channel()
Replace an explicit initialisation for one local variable at the beginning
by a conditional assignment.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Oleksij Rempel <linux@rempel-privat.de>
Reviewed-by: Julian Calaby <julian.calaby@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Raja Mani [Tue, 12 Apr 2016 14:45:53 +0000 (20:15 +0530)]
ath10k: add dynamic tx mode switch config support for qca4019
Push-pull mode needs a certain amount of host driver involvement for
managing queues in host memory and delivering packets to firmware.
qca4019 wifi firmware has an option to stay in push mode while the
number of active traffic flows is small and then switch to push-pull
mode when the number of active flows goes beyond a certain limit.
The advantage of staying in push mode with few active flows is that
host cpu consumption is reduced. qca4019 firmware supports this
flexibility of the mode switch. It takes the host driver's preference
(LOW_PERF/HIGH_PERF) via WMI_EXT_RESOURCE_CFG_CMDID:
LOW_PERF - fw would stay in push mode and switch to push-pull
based on demand.
HIGH_PERF - fw would stay in push-pull mode from boot.
To make this configuration generic, new WMI services
WMI_SERVICE_TX_MODE_PUSH_ONLY, WMI_SERVICE_TX_MODE_PUSH_PULL and
WMI_SERVICE_TX_MODE_DYNAMIC are introduced to advertise the availability
of dynamic tx mode switch support in firmware.
Based on WMI_SERVICE_TX_MODE_DYNAMIC, LOW_PERF or HIGH_PERF is
configured to the firmware.
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Dan Carpenter [Mon, 11 Apr 2016 08:15:20 +0000 (11:15 +0300)]
ath10k: add some sanity checks to peer_map_event() functions
Smatch complains that since "ev->peer_id" comes from skb->data that
means we can't trust it and have to do a bounds check on it to prevent
an array overflow.
Fixes: 6942726f7f7b ('ath10k: add fast peer_map lookup')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
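The sanity check this commit calls for follows a common pattern: an index taken from received data is validated before it is used to address a table. A minimal sketch, with a hypothetical table size and hypothetical demo_* names rather than the ath10k definitions:

```c
#include <stdint.h>

#define DEMO_MAX_PEER_IDS 16   /* hypothetical table size */

struct demo_peer {
    int vdev_id;
};

struct demo_ctx {
    struct demo_peer *peer_map[DEMO_MAX_PEER_IDS];
};

/*
 * peer_id comes from a received event and is therefore untrusted.
 * Reject out-of-range values before indexing the map.
 * Returns 0 on success, -1 if the id is out of range.
 */
static int demo_peer_map_event(struct demo_ctx *ctx, uint16_t peer_id,
                               struct demo_peer *peer)
{
    if (peer_id >= DEMO_MAX_PEER_IDS)
        return -1;

    ctx->peer_map[peer_id] = peer;
    return 0;
}
```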
Rajkumar Manoharan [Thu, 7 Apr 2016 06:41:54 +0000 (12:11 +0530)]
ath10k: fix rx_channel during hw reconfigure
Upon firmware assert, restart work will be triggered so that mac80211
will reconfigure the driver. An issue is reported that after restart
work, survey dump data do not contain in-use (SURVEY_INFO_IN_USE) info
for operating channel. During reconfigure, since mac80211 already has
a valid channel context for the given radio, channel context iteration
returns num_chanctx > 0. Hence rx_channel is always NULL. Fix this by
assigning the channel context to rx_channel when driver restart is in progress.
Cc: stable@vger.kernel.org
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Mohammed Shafi Shajakhan [Tue, 5 Apr 2016 15:28:26 +0000 (20:58 +0530)]
ath10k: fix return value for btcoex and peer stats debugfs
The return value is incorrect for the btcoex and peer stats debugfs
'write' entries if the user provides a value that matches the value
already set. This results in the debugfs write getting stuck, and the
operation has to be terminated manually. Fix this by returning 'count',
as we do for other debugfs entries like pktlog.
Fixes: cc61a1bbbc0e ("ath10k: enable debugfs provision to enable Peer Stats feature")
Fixes: c28e6f06ff40 ("ath10k: fix sanity check on enabling btcoex via debugfs")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
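The return-value rule behind this fix can be sketched in userspace C. The handler below is hypothetical (demo_write_bool() and struct demo_state are not ath10k names); it only illustrates that a write handler should report the input as consumed even when the requested value is already in effect. Presumably the hang came from the writer seeing no progress and retrying, though the log does not spell that out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_state {
    bool enabled;
    int reconfigs;   /* counts how often the setting was actually changed */
};

/*
 * Parse "0" or "1" from buf. Returns the number of bytes consumed on
 * success or a negative errno value on bad input.
 */
static long demo_write_bool(struct demo_state *st, const char *buf,
                            size_t count)
{
    bool val;

    if (count == 0 || (buf[0] != '0' && buf[0] != '1'))
        return -EINVAL;
    val = (buf[0] == '1');

    /* Value already set: nothing to do, but the input was still consumed. */
    if (st->enabled == val)
        return (long)count;

    st->enabled = val;
    st->reconfigs++;
    return (long)count;
}
```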
Kalle Valo [Wed, 13 Apr 2016 11:14:16 +0000 (14:14 +0300)]
ath10k: fix parenthesis alignment
Found by checkpatch:
drivers/net/wireless/ath/ath10k/mac.c:6800: Alignment should match open parenthesis
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 13 Apr 2016 11:14:02 +0000 (14:14 +0300)]
ath10k: prefer ether_addr_copy() over memcpy()
Fixes checkpatch warning:
drivers/net/wireless/ath/ath10k/wmi.c:5800: Prefer ether_addr_copy() over memcpy() if the Ethernet addresses are __aligned(2)
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 13 Apr 2016 11:13:49 +0000 (14:13 +0300)]
ath10k: prefer ether_addr_equal() or ether_addr_equal_unaligned() over memcmp()
Fixes checkpatch warnings:
drivers/net/wireless/ath/ath10k/mac.c:452: Prefer ether_addr_equal() or ether_addr_equal_unaligned() over memcmp()
drivers/net/wireless/ath/ath10k/mac.c:455: Prefer ether_addr_equal() or ether_addr_equal_unaligned() over memcmp()
drivers/net/wireless/ath/ath10k/txrx.c:133: Prefer ether_addr_equal() or ether_addr_equal_unaligned() over memcmp()
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 13 Apr 2016 11:13:35 +0000 (14:13 +0300)]
ath10k: prefer kernel type 'u64' over 'u_int64_t'
Fixes checkpatch warnings:
drivers/net/wireless/ath/ath10k/htt.h:1477: Prefer kernel type 'u64' over 'u_int64_t'
drivers/net/wireless/ath/ath10k/htt.h:1480: Prefer kernel type 'u64' over 'u_int64_t'
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Kalle Valo [Wed, 13 Apr 2016 11:13:21 +0000 (14:13 +0300)]
ath10k: fix checkpatch warnings related to spaces
Fix checkpatch warnings about use of spaces with operators:
spaces preferred around that '*' (ctx:VxV)
This has been recently added to checkpatch.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Rajkumar Manoharan [Thu, 7 Apr 2016 06:40:58 +0000 (12:10 +0530)]
ath10k: remove MSI range support
MSI-X has never been well tested, might contain bugs and generally isn't
really all that useful to maintain. Also, ath10k is mainly used with
shared/single-MSI interrupt systems. Hence remove MSI range support.
This change will be useful for further cleanup in copy engine lock
and to add NAPI support.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Peter Oh [Mon, 4 Apr 2016 23:19:16 +0000 (16:19 -0700)]
ath10k: enable set_tsf vdev command to WMI 10.4
10.4 firmware has added the set_tsf vdev parameter, hence enable it.
The set_tsf function can be used to shift TBTT, which helps avoid
the clock drift that happens when beacons collide.
Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Peter Oh [Mon, 4 Apr 2016 23:19:15 +0000 (16:19 -0700)]
ath10k: update 10.4 WMI vdev parameters
Update 10.4 WMI vdev param to sync to current 10.4 firmware
as of 2/23/2016.
Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Peter Oh [Mon, 4 Apr 2016 23:19:14 +0000 (16:19 -0700)]
ath10k: add a support of set_tsf on vdev interface
10.2.4.70.24 firmware introduces a new feature to set TSF
via a vdev parameter, hence implement the relevant function.
The set_tsf function can be used to shift TBTT, which helps avoid
the clock drift that happens when beacons collide.
Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Ben Greear [Fri, 5 Feb 2016 23:10:02 +0000 (15:10 -0800)]
ath10k: Document alloc_frag_desc_for_data_pkt config option.
This will help anyone trying to use the ack-rssi reporting
feature with the host-specified TX-rate option in 10.4 firmware.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
David S. Miller [Mon, 11 Apr 2016 15:58:12 +0000 (11:58 -0400)]
Merge tag 'wireless-drivers-next-for-davem-2016-04-11' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers patches for 4.7
Major changes:
iwlwifi
* support for Link Quality measurement
* more work on 9000 devices and MSIx
* continuation of the Dynamic Queue Allocation work
* make the paging less memory hungry
* 9000 new Rx path
* removal of IWLWIFI_UAPSD Kconfig option
ath10k
* implement push-pull tx model using mac80211 software queuing support
* enable scan in AP mode (NL80211_FEATURE_AP_SCAN)
wil6210
* add basic PBSS (Personal Basic Service Set) support
* add initial P2P support
* add oob_mode module parameter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 7 Apr 2016 02:39:21 +0000 (19:39 -0700)]
bpf: simplify verifier register state assignments
The verifier uses the following structure to track the state of registers:
    struct reg_state {
        enum bpf_reg_type type;
        union {
            int imm;
            struct bpf_map *map_ptr;
        };
    };
and later on in states_equal() does memcmp(&old->regs[i], &cur->regs[i],..)
to find equivalent states.
Throughout the verifier code there are assignments to the 'imm' and 'map_ptr'
fields, and it's not obvious that most of the assignments into 'imm' don't
need to clear the extra 4 bytes (like mark_reg_unknown_value() does) to make sure
that memcmp doesn't go over junk left from a 'map_ptr' assignment.
Simplify the code by converting 'int' into 'long'.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
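The stale-bytes problem described in this commit can be reproduced in userspace. The structs below are hypothetical copies of the layout quoted in the message, with 'int' standing in for the enum; the sketch assumes, as is typical for common compilers, that storing to the narrower union member leaves the remaining bytes untouched (the C standard leaves them unspecified).

```c
#include <stdint.h>
#include <string.h>

/* Layout as quoted in the commit message: 'imm' is narrower than the pointer. */
struct demo_reg_int {
    int type;
    union {
        int imm;
        void *map_ptr;
    };
};

/* Layout after the change: 'imm' is widened to 'long'. */
struct demo_reg_long {
    int type;
    union {
        long imm;
        void *map_ptr;
    };
};

/* Returns 1 if two registers with equal 'imm' compare unequal bytewise. */
static int demo_int_imm_leaves_junk(void)
{
    struct demo_reg_int a, b;

    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    a.map_ptr = (void *)UINTPTR_MAX;  /* register used to hold a pointer */
    a.imm = 0;                        /* then reused for an immediate */
    b.imm = 0;                        /* fresh register, same logical state */
    return memcmp(&a, &b, sizeof(a)) != 0;
}

/* Same experiment with the widened 'imm'. */
static int demo_long_imm_leaves_junk(void)
{
    struct demo_reg_long a, b;

    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    a.map_ptr = (void *)UINTPTR_MAX;
    a.imm = 0;
    b.imm = 0;
    return memcmp(&a, &b, sizeof(a)) != 0;
}
```

On platforms where 'long' and pointers have the same size, which holds for the kernel, the second function finds no difference because the store to 'imm' covers the whole union.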
David S. Miller [Sat, 9 Apr 2016 21:41:41 +0000 (17:41 -0400)]
Merge git://git./linux/kernel/git/davem/net
Eric Dumazet [Sat, 9 Apr 2016 15:01:13 +0000 (08:01 -0700)]
ipv6: fix inet6_lookup_listener()
A stupid refactoring bug in inet6_lookup_listener() needs to be fixed
in order to get proper SO_REUSEPORT behavior.
Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 9 Apr 2016 19:32:44 +0000 (12:32 -0700)]
Merge tag 'tty-4.6-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty fixes from Greg KH:
"Here are two tty fixes for issues found.
One was due to a merge error in 4.6-rc1, and the other a regression
fix for UML consoles that broke in 4.6-rc1.
Both have been in linux-next for a while"
* tag 'tty-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: Fix merge of "tty: Refactor tty_open()"
tty: Fix UML console breakage
Linus Torvalds [Sat, 9 Apr 2016 19:23:02 +0000 (12:23 -0700)]
Merge tag 'usb-4.6-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB fixes and new device ids for 4.6-rc3.
Nothing major, the normal USB gadget fixes and usb-serial driver ids,
along with some other fixes mixed in. All except the USB serial ids
have been tested in linux-next, the id additions should be fine as
they are 'trivial'"
* tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
USB: option: add "D-Link DWM-221 B1" device id
USB: serial: cp210x: Adding GE Healthcare Device ID
USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
usb: dwc3: keystone: drop dma_mask configuration
usb: gadget: udc-core: remove manual dma configuration
usb: dwc3: pci: add ID for one more Intel Broxton platform
usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
usb: dwc2: do not override forced dr_mode in gadget setup
usb: gadget: f_midi: unlock on error
USB: digi_acceleport: do sanity checking for the number of ports
USB: cypress_m8: add endpoint sanity check
USB: mct_u232: add sanity checking in probe
usb: fix regression in SuperSpeed endpoint descriptor parsing
USB: usbip: fix potential out-of-bounds write
usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
usb: gadget: f_midi: Fixed a bug when buflen was smaller than wMaxPacketSize
usb: phy: qcom-8x16: fix regulator API abuse
usb: ch9: Fix SSP Device Cap wFunctionalitySupport type
usb: gadget: composite: Access SSP Dev Cap fields properly
...
Linus Torvalds [Sat, 9 Apr 2016 19:09:37 +0000 (12:09 -0700)]
Merge tag 'staging-4.6-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging and IIO driver fixes from Greg KH:
"Here are some IIO driver fixes, along with two staging driver fixes
for 4.6-rc3.
One staging driver patch reverts the deletion of a driver that
happened in 4.6-rc1. We thought that laptop.org was dead, but it's
still alive and kicking, and has users that were mad we broke their
hardware by deleting a driver for their machines. So that driver is
added back and everyone is happy again.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Revert "Staging: olpc_dcon: Remove obsolete driver"
staging/rdma/hfi1: select CRC32
iio: gyro: bmg160: fix buffer read values
iio: gyro: bmg160: fix endianness when reading axes
iio: accel: bmc150: fix endianness when reading axes
iio: st_magn: always define ST_MAGN_TRIGGER_SET_STATE
iio: fix config watermark initial value
iio: health: max30100: correct FIFO check condition
iio: imu: Fix inv_mpu6050 dependencies
iio: adc: Fix build error of missing devm_ioremap_resource on UM
iio: light: apds9960: correct FIFO check condition
iio: adc: max1363: correct reference voltage
iio: adc: max1363: add missing adc to max1363_id
Linus Torvalds [Sat, 9 Apr 2016 19:00:42 +0000 (12:00 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of eight fixes.
Two are trivial gcc-6 updates (brace additions and unused variable
removal). There's a couple of cxlflash regressions, a correction for
sd being overly chatty on revalidation (causing excess log increases).
A VPD issue which could crash USB devices because they seem very
intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer
overrun fix"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: Do not attach VPD to devices that don't support it
sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
scsi_dh_alua: Fix a recently introduced deadlock
scsi: Declare local symbols static
cxlflash: Move to exponential back-off when cmd_room is not available
cxlflash: Fix regression issue with re-ordering patch
mpt3sas: Don't overreach ioc->reply_post[] during initialization
aacraid: add missing curly braces
Linus Torvalds [Sat, 9 Apr 2016 18:23:27 +0000 (11:23 -0700)]
Merge tag 'md/4.6-rc2-fix' of git://git./linux/kernel/git/shli/md
Pull MD fixes from Shaohua Li:
"This update mainly fixes bugs:
- fix error handling (Guoqing)
- fix a crash when a disk is hotremoved (me)
- fix a dead loop (Wei Fang)"
* tag 'md/4.6-rc2-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md/bitmap: clear bitmap if bitmap_create failed
MD: add rdev reference for super write
md: fix a trivial typo in comments
md:raid1: fix a dead loop when read from a WriteMostly disk
Linus Torvalds [Sat, 9 Apr 2016 18:03:48 +0000 (11:03 -0700)]
Merge tag 'pm+acpi-4.6-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for some issues discovered after recent changes and for some
that have just been found lately regardless of those changes
(intel_pstate, intel_idle, PM core, mailbox/pcc, turbostat) plus
support for some new CPU models (intel_idle, Intel RAPL driver,
turbostat) and documentation updates (intel_pstate, PM core).
Specifics:
- intel_pstate fixes for two issues exposed by the recent switch over
from using timers and for one issue introduced during the 4.4 cycle
plus new comments describing data structures used by the driver
(Rafael Wysocki, Srinivas Pandruvada).
- intel_idle fixes related to CPU offline/online (Richard Cochran).
- intel_idle support (new CPU IDs and state definitions mostly) for
Skylake-X and Kabylake processors (Len Brown).
- PCC mailbox driver fix for an out-of-bounds memory access that may
cause the kernel to panic() (Shanker Donthineni).
- New (missing) CPU ID for one apparently overlooked Haswell model in
the Intel RAPL power capping driver (Srinivas Pandruvada).
- Fix for the PM core's wakeup IRQs framework to make it work after
wakeup settings reconfiguration from sysfs (Grygorii Strashko).
- Runtime PM documentation update to make it describe what needs to
be done during device removal more precisely (Krzysztof Kozlowski).
- Stale comment removal cleanup in the cpufreq-dt driver (Viresh
Kumar).
- turbostat utility fixes and support for Broxton, Skylake-X and
Kabylake processors (Len Brown)"
* tag 'pm+acpi-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
tools/power turbostat: work around RC6 counter wrap
tools/power turbostat: initial KBL support
tools/power turbostat: initial SKX support
tools/power turbostat: decode BXT TSC frequency via CPUID
tools/power turbostat: initial BXT support
tools/power turbostat: print IRTL MSRs
tools/power turbostat: SGX state should print only if --debug
intel_idle: Add KBL support
intel_idle: Add SKX support
intel_idle: Clean up all registered devices on exit.
intel_idle: Propagate hot plug errors.
intel_idle: Don't overreact to a cpuidle registration failure.
intel_idle: Setup the timer broadcast only on successful driver load.
intel_idle: Avoid a double free of the per-CPU data.
intel_idle: Fix dangling registration on error path.
intel_idle: Fix deallocation order on the driver exit path.
intel_idle: Remove redundant initialization calls.
intel_idle: Fix a helper function's return value.
intel_idle: remove useless return from void function.
...
Linus Torvalds [Sat, 9 Apr 2016 17:50:44 +0000 (10:50 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP,
from Haishuang Yan.
2) Fix multicast frame handling in mac80211 AP code, from Felix
Fietkau.
3) mac80211 station hashtable insert errors not handled properly, fix
from Johannes Berg.
4) Fix TX descriptor count limit handling in e1000, from Alexander
Duyck.
5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas.
6) Must assign rtnl_link_ops of the device before registering it, fix
in ip6_tunnel from Thadeu Lima de Souza Cascardo.
7) Memory leak fix in tc action net exit, from WANG Cong.
8) Add missing AF_KCM entries to name tables, from Dexuan Cui.
9) Fix regression in GRE handling of csums wrt. FOU, from Alexander
Duyck.
10) Fix memory allocation alignment and congestion map corruption in
RDS, from Shamir Rabinovitch.
11) Fix default qdisc regression in tuntap driver, from Jason Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
bridge, netem: mark mailing lists as moderated
tuntap: restore default qdisc
mpls: find_outdev: check for err ptr in addition to NULL check
ipv6: Count in extension headers in skb->network_header
RDS: fix congestion map corruption for PAGE_SIZE > 4k
RDS: memory allocated must be align to 8
GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
net: add the AF_KCM entries to family name tables
MAINTAINERS: intel-wired-lan list is moderated
lib/test_bpf: Add additional BPF_ADD tests
lib/test_bpf: Add test to check for result of 32-bit add that overflows
lib/test_bpf: Add tests for unsigned BPF_JGT
lib/test_bpf: Fix JMP_JSET tests
VSOCK: Detach QP check should filter out non matching QPs.
stmmac: fix adjust link call in case of a switch is attached
af_packet: tone down the Tx-ring unsupported spew.
net_sched: fix a memory leak in tc action
samples/bpf: Enable powerpc support
samples/bpf: Use llc in PATH, rather than a hardcoded value
samples/bpf: Fix build breakage with map_perf_test_user.c
...
Linus Torvalds [Sat, 9 Apr 2016 17:41:34 +0000 (10:41 -0700)]
Merge branch 'for-linus-4.6' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are bug fixes, including a really old fsync bug, and a few trace
points to help us track down problems in the quota code"
* 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix file/data loss caused by fsync after rename and new inode
btrfs: Reset IO error counters before start of device replacing
btrfs: Add qgroup tracing
Btrfs: don't use src fd for printk
btrfs: fallback to vmalloc in btrfs_compare_tree
btrfs: handle non-fatal errors in btrfs_qgroup_inherit()
btrfs: Output more info for enospc_debug mount option
Btrfs: fix invalid reference in replace_path
Btrfs: Improve FL_KEEP_SIZE handling in fallocate
Linus Torvalds [Sat, 9 Apr 2016 17:33:58 +0000 (10:33 -0700)]
Merge tag 'for-linus-4.6-ofs1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs fixes from Mike Marshall:
"Orangefs cleanups and a strncpy vulnerability fix.
Cleanups:
- remove an unused variable from orangefs_readdir.
- clean up printk wrapper used for ofs "gossip" debugging.
- clean up truncate ctime and mtime setting in inode.c
- remove a useless null check found by coccinelle.
- optimize some memcpy/memset boilerplate code.
- remove some useless sanity checks from xattr.c
Fix:
- fix a potential strncpy vulnerability"
* tag 'for-linus-4.6-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: remove unused variable
orangefs: Add KERN_<LEVEL> to gossip_<level> macros
orangefs: strncpy -> strscpy
orangefs: clean up truncate ctime and mtime setting
Orangefs: fix ifnullfree.cocci warnings
Orangefs: optimize boilerplate code.
Orangefs: xattr.c cleanup
Linus Torvalds [Sat, 9 Apr 2016 17:23:45 +0000 (10:23 -0700)]
Merge tag 'iommu-fixes-v4.6-rc2' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- compile-time fixes (warnings and failures)
- a bug in iommu core code which could cause the group->domain pointer
to be falsely cleared
- fix in scatterlist handling of the ARM common DMA-API code
- stall detection fix for the Rockchip IOMMU driver
* tag 'iommu-fixes-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Silence an uninitialized variable warning
iommu/rockchip: Fix "is stall active" check
iommu: Don't overwrite domain pointer when there is no default_domain
iommu/dma: Restore scatterlist offsets correctly
iommu: provide of_xlate pointer unconditionally
John Allen [Wed, 6 Apr 2016 16:49:55 +0000 (11:49 -0500)]
ibmvnic: Enable use of multiple tx/rx scrqs
Enables the use of multiple transmit and receive scrqs allowing the ibmvnic
driver to take advantage of multiqueue functionality. To achieve this, the
driver must implement the process of negotiating the maximum number of
queues allowed by the server. Initially, the driver will attempt to login
with the maximum number of tx and rx queues supported by the server. If
the server fails to allocate the requested number of scrqs, it will return
partial success in the login response. In this case, we must reinitiate
the login process from the request capabilities stage and attempt to login
requesting fewer scrqs.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
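The negotiation described in this commit (ask for the maximum, retry with fewer queues on partial success) can be sketched as a retry loop. Everything here is hypothetical: the fake server, the demo_* names, and in particular the policy of halving the request on each retry, which the log does not specify.

```c
enum demo_login_rc {
    DEMO_LOGIN_SUCCESS,
    DEMO_LOGIN_PARTIAL_SUCCESS
};

/* Fake server: grants a login only if it can back all requested queues. */
struct demo_server {
    unsigned int scrq_capacity;
    unsigned int logins_seen;
};

static enum demo_login_rc demo_server_login(struct demo_server *srv,
                                            unsigned int tx, unsigned int rx)
{
    srv->logins_seen++;
    return tx + rx <= srv->scrq_capacity ? DEMO_LOGIN_SUCCESS
                                         : DEMO_LOGIN_PARTIAL_SUCCESS;
}

/*
 * Start from the advertised maximum and back off until the server accepts.
 * Returns 0 and fills tx_out/rx_out on success, -1 if even one queue of
 * each kind is refused.
 */
static int demo_negotiate_queues(struct demo_server *srv,
                                 unsigned int max_tx, unsigned int max_rx,
                                 unsigned int *tx_out, unsigned int *rx_out)
{
    unsigned int tx = max_tx, rx = max_rx;

    if (tx == 0 || rx == 0)
        return -1;

    for (;;) {
        if (demo_server_login(srv, tx, rx) == DEMO_LOGIN_SUCCESS) {
            *tx_out = tx;
            *rx_out = rx;
            return 0;
        }
        if (tx == 1 && rx == 1)
            return -1;              /* nothing smaller left to request */
        tx = tx > 1 ? tx / 2 : 1;   /* assumed back-off policy */
        rx = rx > 1 ? rx / 2 : 1;
    }
}
```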
Greg Kroah-Hartman [Fri, 8 Apr 2016 22:41:58 +0000 (15:41 -0700)]
Merge tag 'usb-serial-4.6-rc3' of git://git./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fixes for v4.6-rc3
Here are some new device ids.
Signed-off-by: Johan Hovold <johan@kernel.org>
David S. Miller [Fri, 8 Apr 2016 20:50:41 +0000 (16:50 -0400)]
Merge branch 'dsa-voidify-ops'
Vivien Didelot says:
====================
net: dsa: voidify STP setter and FDB/VLAN add ops
Neither the DSA layer nor the bridge code (see br_set_state) really cares
about possible errors from STP state setters, so make them void.
The DSA layer separates the prepare and commit phases of switchdev in
two different functions. Logical errors must not happen in commit
routines, so make them void.
Changes v1 -> v2:
- rename port_stp_update to port_stp_state_set
- don't change code flow of bcm_sf2_sw_br_set_stp_state
- prefer netdev_err over netdev_warn
====================
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 6 Apr 2016 15:55:05 +0000 (11:55 -0400)]
net: dsa: make the VLAN add function return void
The switchdev design implies that a software error should not happen in
the commit phase since it must have been previously reported in the
prepare phase. If a hardware error occurs during the commit phase,
there is nothing switchdev can do about it.
The DSA layer separates port_vlan_prepare and port_vlan_add for
simplicity and convenience. If a hardware error occurs during the
commit phase, there is no need to report it outside the driver itself.
Make the DSA port_vlan_add routine return void for explicitness.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
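The prepare/commit split that motivates the void return type can be sketched as follows. This is a user-space toy, not the DSA API: toy_switch, toy_port_vlan_prepare(), toy_port_vlan_add() and toy_vlan_add() are hypothetical names chosen to mirror the commit's wording.

```c
#include <errno.h>
#include <stdio.h>

#define TOY_MAX_VID 4094

/* Hypothetical switch state, for illustration only. */
struct toy_switch {
	int hw_ok;            /* 0 simulates a hardware failure */
	int vlans_programmed; /* number of VLANs written to "hardware" */
};

/* Prepare phase: every logical (software) error is reported here. */
static int toy_port_vlan_prepare(const struct toy_switch *sw, int vid)
{
	(void)sw;
	if (vid < 1 || vid > TOY_MAX_VID)
		return -EINVAL;
	return 0;
}

/* Commit phase: returns void. A hardware failure is logged inside the
 * driver and not propagated, because the caller cannot act on it. */
static void toy_port_vlan_add(struct toy_switch *sw, int vid)
{
	if (!sw->hw_ok) {
		fprintf(stderr, "toy: failed to program VLAN %d\n", vid);
		return;
	}
	sw->vlans_programmed++;
}

/* Caller: only the prepare phase can make the operation fail. */
static int toy_vlan_add(struct toy_switch *sw, int vid)
{
	int err = toy_port_vlan_prepare(sw, vid);

	if (err)
		return err;
	toy_port_vlan_add(sw, vid);
	return 0;
}
```

The point of the design is visible in toy_vlan_add(): once prepare has succeeded, the commit step has no error path left for the caller to handle.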
Vivien Didelot [Wed, 6 Apr 2016 15:55:04 +0000 (11:55 -0400)]
net: dsa: make the FDB add function return void
The switchdev design implies that a software error should not happen in
the commit phase since it must have been previously reported in the
prepare phase. If a hardware error occurs during the commit phase,
there is nothing switchdev can do about it.
The DSA layer separates port_fdb_prepare and port_fdb_add for simplicity
and convenience. If a hardware error occurs during the commit phase,
there is no need to report it outside the DSA driver itself.
Make the DSA port_fdb_add routine return void for explicitness.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 6 Apr 2016 15:55:03 +0000 (11:55 -0400)]
net: dsa: make the STP state function return void
The DSA layer doesn't care about the return code of the port_stp_update
routine, so make it void in the layer and the DSA drivers.
Replace the useless dsa_slave_stp_update function with a
dsa_slave_stp_state function used to reply to the switchdev
SWITCHDEV_ATTR_ID_PORT_STP_STATE attribute.
In the meantime, rename port_stp_update to port_stp_state_set to
make the state change explicit.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 6 Apr 2016 15:06:20 +0000 (11:06 -0400)]
net: dsa: document missing functions
Add description for the missing port_vlan_prepare, port_fdb_prepare,
port_fdb_dump functions in the DSA documentation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Apr 2016 20:42:31 +0000 (16:42 -0400)]
Merge tag 'mac80211-next-for-davem-2016-04-06' of git://git./linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
For the 4.7 cycle, we have a number of changes:
* Bob's mesh mode rhashtable conversion, this includes
the rhashtable API change for allocation flags
* BSSID scan, connect() command reassoc support (Jouni)
* fast (optimised data only) and support for RSS in mac80211 (myself)
* various smaller changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Apr 2016 20:41:28 +0000 (16:41 -0400)]
Merge tag 'mac80211-for-davem-2016-04-06' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
For the current RC series, we have the following fixes:
* TDLS fixes from Arik and Ilan
* rhashtable fixes from Ben and myself
* documentation fixes from Luis
* U-APSD fixes from Emmanuel
* a TXQ fix from Felix
* and a compiler warning suppression from Jeff
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Tue, 5 Apr 2016 20:43:53 +0000 (13:43 -0700)]
bridge, netem: mark mailing lists as moderated
I moderate these (lightly loaded) lists to block spam.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 5 Apr 2016 20:33:17 +0000 (22:33 +0200)]
bpf, verifier: further improve search pruning
The verifier needs to go through every path of the program in
order to check that it terminates safely, which can be quite a
lot of instructions that need to be processed f.e. in cases with
more branchy programs. With search pruning from f1bca824dabb ("bpf:
add search pruning optimization to verifier") the search space can
already be reduced significantly when the verifier detects that
a previously walked path with same register and stack contents
terminated already (see verifier's states_equal()), so the search
can skip walking those states.
When working with larger programs of > ~2000 (out of max 4096)
insns, we found that the current limit of 32k instructions is easily
hit. For example, a case we ran into is that the search space cannot
be pruned due to branches at the beginning of the program that make
use of certain stack space slots (STACK_MISC), which are never used
in the remaining program (STACK_INVALID). Therefore, the verifier
needs to walk paths for the slots in STACK_INVALID state, but also
all remaining paths with a stack structure, where the slots are in
STACK_MISC, which can nearly double the search space needed. After
various experiments, we find that a limit of 64k processed insns is
a more reasonable choice when dealing with larger programs in practice.
This still allows rejecting extreme crafted cases that can have a
much higher complexity (f.e. > ~300k) within the 4096 insns limit
due to search pruning not being able to take effect.
Furthermore, we found that a lot of states can be pruned after a
call instruction, f.e. we were able to reduce the search state by
~35% in some cases with this heuristic, trade-off is to keep a bit
more states in env->explored_states. Usually, call instructions
have a number of preceding register assignments and/or stack stores,
where search pruning has a better chance to succeed in states_equal()
test. The current code marks the branch targets with STATE_LIST_MARK
in case of conditional jumps, and the next (t + 1) instruction in
case of unconditional jump so that f.e. a backjump will walk it. We
also did experiments with using t + insns[t].off + 1 as a marker in
the unconditional jump case instead of t + 1 with the rationale
that these two branches of execution that converge after the label
might have more potential of pruning. We found that it was a bit
better, but not necessarily significantly better than the current
state, perhaps also due to clang not generating back jumps often.
Hence, we left that as is for now.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 8 Apr 2016 05:26:48 +0000 (13:26 +0800)]
tuntap: restore default qdisc
After commit f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using
alloc_netdev"), default qdisc was changed to noqueue because
tuntap does not set tx_queue_len during .setup(). This patch restores
default qdisc by setting tx_queue_len in tun_setup().
Fixes: f84bb1eac027 ("net: fix IFF_NO_QUEUE for drivers using alloc_netdev")
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
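The effect of setting tx_queue_len in the setup callback can be modeled in user-space C. This sketch assumes, based only on the commit text above, that the allocation path marks a device as queue-less when tx_queue_len is still zero after .setup() has run. All toy_ names and the value 500 are illustrative, not taken from the kernel sources.

```c
#define TOY_IFF_NO_QUEUE   0x1u
#define TOY_TUN_QUEUE_LEN  500	/* illustrative value only */

/* Hypothetical, heavily reduced net_device. */
struct toy_netdev {
	unsigned int tx_queue_len;
	unsigned int priv_flags;
};

/* Assumed rule: a device whose setup callback leaves tx_queue_len at
 * zero is flagged as having no queue, which selects noqueue. */
static void toy_alloc_netdev(struct toy_netdev *dev,
			     void (*setup)(struct toy_netdev *))
{
	dev->tx_queue_len = 0;
	dev->priv_flags = 0;
	setup(dev);
	if (!dev->tx_queue_len)
		dev->priv_flags |= TOY_IFF_NO_QUEUE;
}

/* Before the fix: setup does not touch tx_queue_len. */
static void toy_tun_setup_old(struct toy_netdev *dev)
{
	(void)dev;
}

/* After the fix: setup sets a non-zero tx_queue_len. */
static void toy_tun_setup_fixed(struct toy_netdev *dev)
{
	dev->tx_queue_len = TOY_TUN_QUEUE_LEN;
}
```

Under this model the old setup ends up flagged TOY_IFF_NO_QUEUE, while the fixed one keeps a normal queue length and therefore the default qdisc.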
Martin Brandenburg [Mon, 4 Apr 2016 20:26:38 +0000 (16:26 -0400)]
orangefs: remove unused variable
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Rafael J. Wysocki [Fri, 8 Apr 2016 19:46:56 +0000 (21:46 +0200)]
Merge branches 'pm-core', 'powercap' and 'pm-tools'
* pm-core:
PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
PM / runtime: Document steps for device removal
* powercap:
powercap: intel_rapl: Add missing Haswell model
* pm-tools:
tools/power turbostat: work around RC6 counter wrap
tools/power turbostat: initial KBL support
tools/power turbostat: initial SKX support
tools/power turbostat: decode BXT TSC frequency via CPUID
tools/power turbostat: initial BXT support
tools/power turbostat: print IRTL MSRs
tools/power turbostat: SGX state should print only if --debug
Rafael J. Wysocki [Fri, 8 Apr 2016 19:46:05 +0000 (21:46 +0200)]
Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'acpi-cppc'
* pm-cpufreq:
cpufreq: dt: Drop stale comment
cpufreq: intel_pstate: Documenation for structures
cpufreq: intel_pstate: fix inconsistency in setting policy limits
intel_pstate: Avoid extra invocation of intel_pstate_sample()
intel_pstate: Do not set utilization update hook too early
* pm-cpuidle:
intel_idle: Add KBL support
intel_idle: Add SKX support
intel_idle: Clean up all registered devices on exit.
intel_idle: Propagate hot plug errors.
intel_idle: Don't overreact to a cpuidle registration failure.
intel_idle: Setup the timer broadcast only on successful driver load.
intel_idle: Avoid a double free of the per-CPU data.
intel_idle: Fix dangling registration on error path.
intel_idle: Fix deallocation order on the driver exit path.
intel_idle: Remove redundant initialization calls.
intel_idle: Fix a helper function's return value.
intel_idle: remove useless return from void function.
* acpi-cppc:
mailbox: pcc: Don't access an unmapped memory address space
Jiri Pirko [Fri, 8 Apr 2016 17:12:48 +0000 (19:12 +0200)]
devlink: share user_ptr pointer for both devlink and devlink_port
Ptr to devlink structure can be easily obtained from
devlink_port->devlink. So share user_ptr[0] pointer for both and leave
user_ptr[1] free for other users.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Apr 2016 19:38:43 +0000 (15:38 -0400)]
Merge branch 'mlxsw-next'
Jiri Pirko says:
====================
mlxsw: small driver update + one tiny devlink dependency
Cosmetics, in preparation to sharedbuffer patchset.
First patch is here to allow patch number two.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:25 +0000 (19:11 +0200)]
mlxsw: reg: Fix SBPM register name
Fix copy&paste error and state the name of SBPM register correctly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:24 +0000 (19:11 +0200)]
mlxsw: reg: Share direction enum between SBPR, SBCM, SBPM
Same field, same values, so share the same enum.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:23 +0000 (19:11 +0200)]
mlxsw: Do not pass around driver_priv directly
Instead of that, pass mlxsw_core and use a helper to get driver priv
from driver code. Looks much cleaner that way.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:22 +0000 (19:11 +0200)]
mlxsw: Pass mlxsw_core as a param of mlxsw_core_skb_transmit*
Instead of passing around driver priv, pass struct mlxsw_core *
directly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:21 +0000 (19:11 +0200)]
mlxsw: Move devlink port registration into common core code
Remove devlink port reg/unreg from spectrum and switchx2 code and rather
do the common work in core. That also ensures code separation where
devlink is only used in core.c.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 8 Apr 2016 17:11:20 +0000 (19:11 +0200)]
devlink: remove implicit type set in port register
As we rely on the caller zeroing or correctly setting the struct before the
call, this implicit type set is either a no-op (DEVLINK_PORT_TYPE_NOTSET is 0)
or it overwrites the wanted value. So remove this.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Apr 2016 19:26:07 +0000 (15:26 -0400)]
Merge branch 'nfp-mtu-buffer-reconfig'
Jakub Kicinski says:
====================
MTU/buffer reconfig changes
I re-discussed MPLS/MTU internally, dropped it from the patch 1,
re-tested everything, found out I forgot about debugfs pointers,
fixed that as well.
v5:
- don't reserve space in RX buffers for MPLS label stack
(patch 1);
- fix debugfs pointers to ring structures (patch 5).
v4:
- cut down on unrelated patches;
- don't "close" the device on error path.
--- v4 cover letter
Previous series included some not entirely related patches,
this one is cut down. Main issue I'm trying to solve here
is that .ndo_change_mtu() in nfpvf driver is doing full
close/open to reallocate buffers - which if open fails
can result in device being basically closed even though
the interface is started. As suggested by you I try to move
towards a paradigm where the resources are allocated first
and the MTU change is only done once I'm certain (almost)
nothing can fail. Almost because I need to communicate
with FW and that can always time out.
Patch 1 fixes small issue. Next 10 patches reorganize things
so that I can easily allocate new rings and sets of buffers
while the device is running. Patches 13 and 15 reshape the
.ndo_change_mtu() and ethtool's ring-resize operation into
desired form.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:48 +0000 (19:39 +0100)]
nfp: allow ring size reconfiguration at runtime
Since much of the required changes have already been made for
changing MTU at runtime let's use it for ring size changes as
well.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:47 +0000 (19:39 +0100)]
nfp: pass ring count as function parameter
Soon ring resize will call these functions with values
different from the current configuration, so we need to
explicitly pass the ring count as a parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:46 +0000 (19:39 +0100)]
nfp: convert .ndo_change_mtu() to prepare/commit paradigm
When changing the MTU on a running device, first allocate new rings
and buffers, and once that succeeds proceed with changing the MTU.
Allocation of new rings is not really necessary for this
operation - it's done to keep the code simple and because
size of the extra ring memory is quite small compared to
the size of buffers.
Operation can still fail midway through if FW communication
times out. In that case we retry with old MTU (rings).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
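The allocate-first, switch-later ordering described in the .ndo_change_mtu() commit above might look like this in a user-space sketch. toy_nic, toy_fw_reconfig() and toy_change_mtu() are hypothetical, a single malloc() stands in for the rings and buffers, and the fallback is modeled as a simple rollback to the old resources.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state, for illustration only. */
struct toy_nic {
	unsigned int mtu;
	void *rx_buf;	/* stands in for rings and RX buffers */
	int fw_ok;	/* 0 simulates a FW communication timeout */
};

/* Hypothetical FW reconfiguration step; the only step that can still
 * fail after the allocations have succeeded. */
static int toy_fw_reconfig(const struct toy_nic *nic)
{
	return nic->fw_ok ? 0 : -ETIMEDOUT;
}

static int toy_change_mtu(struct toy_nic *nic, unsigned int new_mtu)
{
	unsigned int old_mtu = nic->mtu;
	void *old_buf = nic->rx_buf;
	void *new_buf;
	int err;

	/* Prepare: allocate what the new MTU needs. On failure the
	 * device is left exactly as it was. */
	new_buf = malloc(new_mtu);
	if (!new_buf)
		return -ENOMEM;

	/* Commit: switch to the new resources and reconfigure. */
	nic->rx_buf = new_buf;
	nic->mtu = new_mtu;
	err = toy_fw_reconfig(nic);
	if (err) {
		/* Fall back to the old MTU and buffers, still intact. */
		nic->rx_buf = old_buf;
		nic->mtu = old_mtu;
		free(new_buf);
		return err;
	}

	free(old_buf);
	return 0;
}
```

Because the old resources are released only after the commit step succeeds, a failure never leaves the device without usable buffers.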
Jakub Kicinski [Thu, 7 Apr 2016 18:39:45 +0000 (19:39 +0100)]
nfp: propagate list buffer size in struct rx_ring
Free list buffer size needs to be propagated to a few functions
as a parameter and added to struct nfp_net_rx_ring since soon
some of the functions will be reused to manage rings with
buffers of size different than nn->fl_bufsz.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:44 +0000 (19:39 +0100)]
nfp: sync ring state during FW reconfiguration
FW reconfiguration in .ndo_open()/.ndo_stop() should reset/
restore queue state. Since we need IRQs to be disabled when
filling rings on RX path we have to move disable_irq() from
.ndo_open() all the way up to IRQ allocation.
nfp_net_start_vec() becomes trivial now so it's inlined.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:43 +0000 (19:39 +0100)]
nfp: slice .ndo_open() and .ndo_stop() up
Divide .ndo_open() and .ndo_stop() into logical, callable
chunks. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:42 +0000 (19:39 +0100)]
nfp: move filling ring information to FW config
nfp_net_[rt]x_ring_{alloc,free} should only allocate or free
ring resources without touching the device. Move setting
parameters in the BAR to separate functions. This will make
it possible to reuse alloc/free functions to allocate new
rings while the device is running.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:41 +0000 (19:39 +0100)]
nfp: preallocate RX buffers early in .ndo_open
We want the .ndo_open() to have the following structure:
- allocate resources;
- configure HW/FW;
- enable the device from stack perspective.
Therefore filling RX rings needs to be moved to the beginning
of .ndo_open().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:40 +0000 (19:39 +0100)]
nfp: reorganize initial filling of RX rings
Separate allocation of buffers from giving them to FW.
Thanks to this it will be possible to move allocation
earlier on the .ndo_open() path and reuse buffers during
runtime reconfiguration.
Similar to TX side clean up the spill of functionality
from flush to freeing the ring. Unlike on TX side,
RX ring reset does not free buffers from the ring.
Ring reset means only that FW pointers are zeroed and
buffers on the ring must be placed in [0, cnt - 1)
positions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:39 +0000 (19:39 +0100)]
nfp: cleanup tx ring flush and rename to reset
Since we never used flush without freeing the ring later
the functionality of the two operations is mixed.
Rename flush to ring reset and move there all the things
which have to be done after FW ring state is cleared.
While at it do some clean-ups.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:38 +0000 (19:39 +0100)]
nfp: allocate ring SW structs dynamically
To be able to switch rings more easily on config changes
allocate them dynamically, separately from nfp_net structure.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:37 +0000 (19:39 +0100)]
nfp: make *x_ring_init do all the init
nfp_net_[rt]x_ring_init functions used to be called from probe
path only and some of their functionality was spilled to the
call site. In order to reuse them for ring reconfiguration
we need them to do all the init.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:36 +0000 (19:39 +0100)]
nfp: break up nfp_net_{alloc|free}_rings
nfp_net_{alloc|free}_rings contained strange mix of allocations
and vector initialization. Remove it, declare vector init as
a separate function and handle allocations explicitly.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:35 +0000 (19:39 +0100)]
nfp: move link state interrupt request/free calls
We need to be able to disable the link state interrupt when
the device is brought down. We used to just free the IRQ
at the beginning of .ndo_stop(). As we now move towards
more ordered .ndo_open()/.ndo_stop() paths LSC allocation
should be placed in the "allocate resource" section.
Since the IRQ can't be freed early in .ndo_stop(), it is
disabled instead.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Apr 2016 18:39:34 +0000 (19:39 +0100)]
nfp: correct RX buffer length calculation
When calculating the RX buffer length we need to account
for up to 2 VLAN tags. Rounding up to 1k is a relic of
a distant past and can be removed. While at it also remove
trivial print statement.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
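As a worked example of the length calculation in the commit above: with a 14-byte Ethernet header and 4 bytes per VLAN tag, a 1500-byte MTU needs 1500 + 14 + 2 * 4 = 1522 bytes. The helper below is a sketch under that assumption. toy_rx_bufsz() and the TOY_ constants are hypothetical, and any driver-specific offsets or headroom are deliberately left out.

```c
#define TOY_ETH_HLEN	14	/* Ethernet header length in bytes */
#define TOY_VLAN_HLEN	4	/* length of one VLAN tag in bytes */

/* Frame bytes needed for a given MTU, allowing for up to two VLAN
 * tags and without any rounding up to 1k. */
static unsigned int toy_rx_bufsz(unsigned int mtu)
{
	return mtu + TOY_ETH_HLEN + 2 * TOY_VLAN_HLEN;
}
```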
Joe Perches [Sun, 27 Mar 2016 21:34:52 +0000 (14:34 -0700)]
orangefs: Add KERN_<LEVEL> to gossip_<level> macros
Emit the logging messages at the appropriate levels.
Miscellanea:
o Change format to fmt
o Use the more common ##__VA_ARGS__
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Martin Brandenburg [Fri, 8 Apr 2016 17:33:21 +0000 (13:33 -0400)]
orangefs: strncpy -> strscpy
It would have been possible for a rogue client-core to send in a symlink
target which is not NUL terminated. This returns EIO if the client-core
gives us corrupt data.
Leave debugfs and superblock code as is for now.
Other dcache.c and namei.c strncpy instances are safe because
ORANGEFS_NAME_MAX = NAME_MAX + 1; there is always enough space for a
name plus a NUL byte.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
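The behaviour the strscpy conversion relies on can be imitated in user space. This is a sketch only: toy_strscpy() mimics the documented contract of a bounded copy that always NUL-terminates and reports truncation, and toy_copy_link_target() is a hypothetical caller that turns truncation into EIO as the commit describes.

```c
#include <errno.h>
#include <stddef.h>

/* Copies at most size - 1 characters from src, always NUL-terminates
 * dst when size > 0, and reads at most size bytes of src. Returns the
 * number of characters copied, or -E2BIG if src did not fit. */
static long toy_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/* Hypothetical caller: data from the untrusted peer that is not
 * NUL-terminated within the buffer is treated as corrupt. */
static int toy_copy_link_target(char *dst, size_t dst_size,
				const char *from_client)
{
	if (toy_strscpy(dst, from_client, dst_size) < 0)
		return -EIO;
	return 0;
}
```

Unlike strncpy, the copy never leaves dst unterminated, and the caller can tell from the return value that the input was too long or unterminated.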
Martin Brandenburg [Mon, 4 Apr 2016 20:26:36 +0000 (16:26 -0400)]
orangefs: clean up truncate ctime and mtime setting
The ctime and mtime are always updated on a successful ftruncate and
only updated on a successful truncate where the size changed.
We handle the ``if the size changed'' bit.
This matches FUSE's behavior.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
kbuild test robot [Sat, 26 Mar 2016 18:54:23 +0000 (02:54 +0800)]
Orangefs: fix ifnullfree.cocci warnings
fs/orangefs/orangefs-debugfs.c:130:2-26: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
NULL check before some freeing functions is not needed.
Based on checkpatch warning
"kfree(NULL) is safe this check is probably not required"
and kfreeaddr.cocci by Julia Lawall.
Generated by: scripts/coccinelle/free/ifnullfree.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Mike Marshall [Wed, 6 Apr 2016 15:19:37 +0000 (11:19 -0400)]
Orangefs: optimize boilerplate code.
Suggested by David Binderman <dcb314@hotmail.com>
The former can potentially be a performance win over the latter.
	memcpy(d, s, len);
	memset(d+len, c, size-len);

	memset(d, c, size);
	memcpy(d, s, len);
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
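The two sequences quoted in the commit above produce the same buffer contents; the first simply avoids writing the leading len bytes twice. A small user-space check, with hypothetical helper names and assuming len <= size:

```c
#include <string.h>

/* Former: copy the payload, then pad only the remaining tail. */
static void fill_copy_then_pad(char *d, size_t size,
			       const char *s, size_t len, int c)
{
	memcpy(d, s, len);
	memset(d + len, c, size - len);
}

/* Latter: pad the whole buffer, then overwrite the head. The first
 * len bytes are written twice. */
static void fill_pad_then_copy(char *d, size_t size,
			       const char *s, size_t len, int c)
{
	memset(d, c, size);
	memcpy(d, s, len);
}
```

Whether the former is measurably faster depends on the sizes involved; the commit only claims it can potentially be a win.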
Mike Marshall [Wed, 6 Apr 2016 14:52:38 +0000 (10:52 -0400)]
Orangefs: xattr.c cleanup
1. It is nonsense to test for negative size_t, suggested by
David Binderman <dcb314@hotmail.com>
2. By the time Orangefs gets called, the vfs has ensured that
name != NULL, and that buffer and size are sane.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Roopa Prabhu [Fri, 8 Apr 2016 04:28:38 +0000 (21:28 -0700)]
mpls: find_outdev: check for err ptr in addition to NULL check
find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to
find the output device. In case of an error, inet{,6}_fib_lookup_dev()
returns an error pointer and dev_get_by_index() returns NULL. But the function
only checks for NULL and thus can end up calling dev_put on an ERR_PTR.
This patch adds an additional check for err ptr after the NULL check.
Before: Trying to add an mpls route with no oif from user, no available
path to 10.1.1.8 and no default route:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
[ 822.337195] BUG: unable to handle kernel NULL pointer dereference at 00000000000003a3
[ 822.340033] IP: [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[ 822.340033] PGD 1db38067 PUD 1de9e067 PMD 0
[ 822.340033] Oops: 0000 [#1] SMP
[ 822.340033] Modules linked in:
[ 822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54
[ 822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[ 822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti: ffff88001dad4000
[ 822.340033] RIP: 0010:[<ffffffff8148781e>] [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[ 822.340033] RSP: 0018:ffff88001dad7a88 EFLAGS: 00010282
[ 822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX: 0000000000000002
[ 822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI: 0000000000000000
[ 822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09: 0000000000000000
[ 822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12: ffffffff8187e0c0
[ 822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15: 0000000000000004
[ 822.340033] FS: 00007ff9ed798700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[ 822.340033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4: 00000000000006f0
[ 822.340033] Stack:
[ 822.340033] 0000000000000000 0000000100000000 0000000000000000 0000000000000000
[ 822.340033] 0000000000000000 0801010a00000000 0000000000000000 0000000000000000
[ 822.340033] 0000000000000004 ffffffff8148749b ffffffff8187e0c0 000000000000001c
[ 822.340033] Call Trace:
[ 822.340033] [<ffffffff8148749b>] ? mpls_rt_alloc+0x2b/0x3e
[ 822.340033] [<ffffffff81488e66>] ? mpls_rtm_newroute+0x358/0x3e2
[ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[ 822.340033] [<ffffffff813b7d94>] ? rtnetlink_rcv_msg+0x17e/0x191
[ 822.340033] [<ffffffff8111794e>] ? __kmalloc_track_caller+0x8c/0x9e
[ 822.340033] [<ffffffff813c9393>] ? rht_key_hashfn.isra.20.constprop.57+0x14/0x1f
[ 822.340033] [<ffffffff813b7c16>] ? __rtnl_unlock+0xc/0xc
[ 822.340033] [<ffffffff813cb794>] ? netlink_rcv_skb+0x36/0x82
[ 822.340033] [<ffffffff813b4507>] ? rtnetlink_rcv+0x1f/0x28
[ 822.340033] [<ffffffff813cb2b1>] ? netlink_unicast+0x106/0x189
[ 822.340033] [<ffffffff813cb5b3>] ? netlink_sendmsg+0x27f/0x2c8
[ 822.340033] [<ffffffff81392ede>] ? sock_sendmsg_nosec+0x10/0x1b
[ 822.340033] [<ffffffff81393df1>] ? ___sys_sendmsg+0x182/0x1e3
[ 822.340033] [<ffffffff810e4f35>] ? __alloc_pages_nodemask+0x11c/0x1e4
[ 822.340033] [<ffffffff8110619c>] ? PageAnon+0x5/0xd
[ 822.340033] [<ffffffff811062fe>] ? __page_set_anon_rmap+0x45/0x52
[ 822.340033] [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[ 822.340033] [<ffffffff810e85ab>] ? __lru_cache_add+0x1a/0x3a
[ 822.340033] [<ffffffff81087ea9>] ? current_kernel_time64+0x9/0x30
[ 822.340033] [<ffffffff813940c4>] ? __sys_sendmsg+0x3c/0x5a
[ 822.340033] [<ffffffff8148f597>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[ 822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db 74 15 <48> 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07
[ 822.340033] RIP [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[ 822.340033] RSP <ffff88001dad7a88>
[ 822.340033] CR2: 00000000000003a3
[ 822.435363] ---[ end trace 98cc65e6f6b8bf11 ]---
After patch:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
RTNETLINK answers: Network is unreachable
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
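The double check described in the find_outdev commit above (NULL first, then error pointer) can be illustrated in user space. The toy_ helpers below imitate the kernel's error-pointer convention but are hypothetical re-implementations, as are toy_dev and toy_assign_dev().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* An error pointer encodes a small negative errno in the pointer
 * value itself, so it is neither NULL nor a valid address. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical device with a reference count. */
struct toy_dev {
	int refcnt;
};

/* A lookup may fail with NULL (dev_get_by_index() style) or with an
 * error pointer (fib lookup style), so both must be rejected before
 * the device is touched. */
static int toy_assign_dev(struct toy_dev *dev)
{
	if (!dev)
		return -ENODEV;
	if (toy_is_err(dev))
		return (int)toy_ptr_err(dev);
	dev->refcnt++;
	return 0;
}
```

In the oops above, RAX and RBX hold ffffffffffffff9b, which is -101 (ENETUNREACH) stored as a pointer and matches the "Network is unreachable" answer after the patch. With only the NULL check, such a value would be dereferenced as if it were a device.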
David S. Miller [Fri, 8 Apr 2016 16:13:30 +0000 (12:13 -0400)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2016-04-07
This series contains updates to ixgbe and ixgbevf.
This entire series (except for one patch from Alex) comes from Mark and
is mainly to add support for our new MAC (x550em_a).
So let's get Alex's patch out of the way first before we cover Mark's
many changes. Alex enables bulk free in the transmit cleanup path for
ixgbe and ixgbevf, as he has done for all of our other drivers.
First Mark cleans up registers that were not being used, so do some
house cleaning. Then to avoid casting lan_id and func fields, just
make them u8 since they only hold small values anyways. Found and
fixed an issue where on read operations it could be possible to
modify locations beyond the length passed in, so change the check
to round up in the same way. Cleaned up the interface for issuing
firmware commands to use a void * instead of a u32 * which eliminates
a number of casts. Added support for the new MAC and provided method
pointers and use them to access IOSF-attached devices, since the
new MAC will also need a new access method. Added support for SFPs
with an external retimer and for an SGMII backplane interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Sitnicki [Tue, 5 Apr 2016 16:41:08 +0000 (18:41 +0200)]
ipv6: Count in extension headers in skb->network_header
When sending a UDPv6 message longer than MTU, account for the length
of fragmentable IPv6 extension headers in skb->network_header offset.
Same as we do in alloc_new_skb path in __ip6_append_data().
This ensures that later on __ip6_make_skb() will make space in
headroom for fragmentable extension headers:
	/* move skb->data to ip header from ext header */
	if (skb->data < skb_network_header(skb))
		__skb_pull(skb, skb_network_offset(skb));
Prevents a splat due to skb_under_panic:
skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] KASAN
CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
[...]
Call Trace:
[<ffffffff813eb7b9>] skb_push+0x79/0x80
[<ffffffff8143397b>] eth_header+0x2b/0x100
[<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310
[<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0
[<ffffffff814efe3a>] ip6_output+0x16a/0x280
[<ffffffff815440c1>] ip6_local_out+0xb1/0xf0
[<ffffffff814f1115>] ip6_send_skb+0x45/0xd0
[<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0
[<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090
[...]
Reported-by: Ji Jianwen <jiji@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bart Van Assche [Thu, 7 Apr 2016 22:55:04 +0000 (15:55 -0700)]
Revert "ib_srpt: Convert to percpu_ida tag allocation"
This reverts commit 0fd10721fe3664f7549e74af9d28a509c9a68719.
That patch causes the ib_srpt driver to crash as soon as the first SCSI
command is received:
kernel BUG at drivers/infiniband/ulp/srpt/ib_srpt.c:1439!
invalid opcode: 0000 [#1] SMP
Workqueue: target_completion target_complete_ok_work [target_core_mod]
RIP: srpt_queue_response+0x437/0x4a0 [ib_srpt]
Call Trace:
srpt_queue_data_in+0x9/0x10 [ib_srpt]
target_complete_ok_work+0x152/0x2b0 [target_core_mod]
process_one_work+0x197/0x480
worker_thread+0x49/0x490
kthread+0xea/0x100
ret_from_fork+0x22/0x40
Aside from the crash, the shortcomings of that patch are as follows:
- It makes the ib_srpt driver use I/O contexts allocated by
transport_alloc_session_tags() but it does not initialize these I/O
contexts properly. All the initializations performed by
srpt_alloc_ioctx() are skipped.
- It swaps the order of the send ioctx allocation and the transition to
RTR mode which is wrong.
- The amount of memory that is needed for I/O contexts is doubled.
- srpt_rdma_ch.free_list is no longer used but is not removed.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Fri, 8 Apr 2016 01:04:27 +0000 (21:04 -0400)]
Merge branch 'bpf-tracepoints'
Alexei Starovoitov says:
====================
allow bpf attach to tracepoints
Hi Steven, Peter,
v1->v2: addressed Peter's comments:
- fixed wording in patch 1, added ack
- refactored 2nd patch into 3:
2/10 remove unused __perf_addr macro which frees up
an argument in perf_trace_buf_submit
3/10 split perf_trace_buf_prepare into alloc and update parts, so that bpf
programs don't have to pay performance penalty for update of struct trace_entry
which is not going to be accessed by bpf
4/10 actual addition of bpf filter to perf tracepoint handler is now trivial
and bpf prog can be used as proper filter of tracepoints
v1 cover:
last time we discussed bpf+tracepoints it was a year ago [1] and the reason
we didn't proceed with that approach was that bpf would make arguments
arg1, arg2 to trace_xx(arg1, arg2) call to be exposed to bpf program
and that was considered unnecessary extension of abi. Back then I wanted
to avoid the cost of buffer alloc and field assign part in all
of the tracepoints, but looks like when optimized the cost is acceptable.
So this new approach doesn't expose any new abi to bpf program.
The program is looking at tracepoint fields after they were copied
by perf_trace_xx() and described in /sys/kernel/debug/tracing/events/xxx/format
We made a tool [2] that takes arguments from /sys/.../format and works as:
$ tplist.py -v random:urandom_read
int got_bits;
int pool_left;
int input_left;
Then these fields can be copy-pasted into bpf program like:
struct urandom_read {
__u64 hidden_pad;
int got_bits;
int pool_left;
int input_left;
};
and the program can use it:
SEC("tracepoint/random/urandom_read")
int bpf_prog(struct urandom_read *ctx)
{
return ctx->pool_left > 0 ? 1 : 0;
}
This way the program can access tracepoint fields faster than
equivalent bpf+kprobe program, which is the main goal of these patches.
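As a rough userspace illustration of the layout described above, the sketch below mirrors the struct urandom_read context and checks where the tracepoint-specific fields start. It is not part of the patch set: uint64_t stands in for __u64, the helper names pool_has_entropy() and first_field_offset() are made up for this example, and the claim that hidden_pad covers the 8 bytes of common trace header is an assumption based on the struct shown in the cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the context struct from the cover letter. uint64_t stands in
 * for the kernel's __u64 so the sketch builds in plain userspace. */
struct urandom_read {
	uint64_t hidden_pad;	/* assumed to cover the common trace header */
	int got_bits;
	int pool_left;
	int input_left;
};

/* Same predicate as bpf_prog() above, written as an ordinary C function
 * (hypothetical helper, for illustration only). */
static int pool_has_entropy(const struct urandom_read *ctx)
{
	return ctx->pool_left > 0 ? 1 : 0;
}

/* Byte offset at which the tracepoint-specific fields begin. */
static size_t first_field_offset(void)
{
	return offsetof(struct urandom_read, got_bits);
}
```

The offsets produced this way can be compared by hand against the offset: values listed in the tracepoint's format file.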
Patch 1-4 are simple changes in perf core side, please review.
I'd like to take the whole set via net-next tree, since the rest of
the patches might conflict with other bpf work going on in net-next
and we want to avoid cross-tree merge conflicts.
Alternatively we can put patches 1-4 into both tip and net-next.
Patch 9 is an example of access to tracepoint fields from bpf prog.
Patch 10 is a micro benchmark for bpf+kprobe vs bpf+tracepoint.
Note that for actual tracing tools the user doesn't need to
run tplist.py and copy-paste fields manually. The tools do it
automatically. For example, the argdist tool [3] can be used as:
$ argdist -H 't:block:block_rq_complete():u32:nr_sector'
where 'nr_sector' is name of tracepoint field taken from
/sys/kernel/debug/tracing/events/block/block_rq_complete/format
and appropriate bpf program is generated on the fly.
[1] http://thread.gmane.org/gmane.linux.kernel.api/8127/focus=8165
[2] https://github.com/iovisor/bcc/blob/master/tools/tplist.py
[3] https://github.com/iovisor/bcc/blob/master/tools/argdist.py
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 7 Apr 2016 01:43:31 +0000 (18:43 -0700)]
samples/bpf: add tracepoint vs kprobe performance tests
the first microbenchmark does
fd=open("/proc/self/comm");
for() {
write(fd, "test");
}
and on 4 cpus in parallel:
writes per sec
base (no tracepoints, no kprobes) 930k
with kprobe at __set_task_comm() 420k
with tracepoint at task:task_rename 730k
For kprobe, the full bpf program manually fetches oldcomm and newcomm via bpf_probe_read.
For tracepoint, the bpf program does nothing, since arguments are copied by the tracepoint.
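The write loop above is pseudocode. A minimal runnable sketch of it might look like the following. This is not the benchmark from samples/bpf: write_loop() and ops_per_sec() are hypothetical helper names, and the per-cpu parallelism and any bpf program attachment are left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Write "test" to path n times through one fd. Returns the number of
 * complete writes, or -1 if the file cannot be opened. */
static long write_loop(const char *path, long n)
{
	const char *msg = "test";
	size_t len = strlen(msg);
	long ok = 0;
	long i;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	for (i = 0; i < n; i++)
		if (write(fd, msg, len) == (ssize_t)len)
			ok++;
	close(fd);
	return ok;
}

/* Operations per second between two timestamps, 0 if no time elapsed. */
static double ops_per_sec(long ops, const struct timespec *t0,
			  const struct timespec *t1)
{
	double secs = (double)(t1->tv_sec - t0->tv_sec) +
		      (double)(t1->tv_nsec - t0->tv_nsec) / 1e9;

	return secs > 0 ? (double)ops / secs : 0.0;
}
```

A caller would take clock_gettime(CLOCK_MONOTONIC, ...) readings around write_loop("/proc/self/comm", n) and pass both timestamps to ops_per_sec().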
2nd microbenchmark does:
fd=open("/dev/urandom");
for() {
read(fd, buf);
}
and on 4 cpus in parallel:
reads per sec
base (no tracepoints, no kprobes) 300k
with kprobe at urandom_read() 279k
with tracepoint at random:urandom_read 290k
bpf progs attached to kprobe and tracepoint are noop.
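To make the two tables easier to compare, the relative slowdown against the base rate can be computed as below. slowdown_pct() is a hypothetical helper, not something from the patch. With the figures above it gives roughly 55% (kprobe) and 22% (tracepoint) for the comm benchmark, and 7% and 3% for the urandom benchmark.

```c
#include <assert.h>

/* Percentage drop in throughput relative to the base rate.
 * Both arguments must use the same unit (e.g. k ops/sec). */
static double slowdown_pct(double base, double measured)
{
	return (base - measured) / base * 100.0;
}
```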
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 7 Apr 2016 01:43:30 +0000 (18:43 -0700)]
samples/bpf: tracepoint example
modify offwaketime to work with sched/sched_switch tracepoint
instead of kprobe into finish_task_switch
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Thu, 7 Apr 2016 01:43:29 +0000 (18:43 -0700)]
samples/bpf: add tracepoint support to bpf loader
Recognize "tracepoint/" section name prefix and attach the program
to that tracepoint.
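A minimal sketch of the prefix check, assuming the loader maps a section name such as "tracepoint/random/urandom_read" to the event's id file under /sys/kernel/debug/tracing/events/ and then opens a perf event with that id. The helper name tracepoint_id_path() and the macros are hypothetical; the actual loader code may be organized differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TP_PREFIX	"tracepoint/"
#define EVENTS_DIR	"/sys/kernel/debug/tracing/events/"

/* If section starts with "tracepoint/", write the path of the event's id
 * file into buf and return 0. Return -1 for any other section name, for
 * an empty event name, or when buf is too small. */
static int tracepoint_id_path(const char *section, char *buf, size_t len)
{
	size_t plen = strlen(TP_PREFIX);
	int n;

	if (strncmp(section, TP_PREFIX, plen) != 0)
		return -1;
	if (section[plen] == '\0')
		return -1;

	n = snprintf(buf, len, EVENTS_DIR "%s/id", section + plen);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

The number read from that id file is what a caller would typically place in the config field of a PERF_TYPE_TRACEPOINT perf event before attaching the program.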
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>