Tony Lindgren [Wed, 11 Jan 2017 22:13:34 +0000 (14:13 -0800)]
pinctrl: core: Fix regression caused by delayed work for hogs
Commit
df61b366af26 ("pinctrl: core: Use delayed work for hogs") caused a
regression at least with sh-pfc that is also a GPIO controller as
noted by Geert Uytterhoeven <geert@linux-m68k.org>.
As the original pinctrl_register() has issues calling pin controller
driver functions early before the controller has finished registering,
we can't just revert commit
df61b366af26. That would break the drivers
using GENERIC_PINCTRL_GROUPS or GENERIC_PINMUX_FUNCTIONS.
So let's fix the issue with the following steps as a single patch:
1. Revert the late_init parts of commit
df61b366af26.
The late_init clearly won't work and we have to just give up
on fixing pinctrl_register() for GENERIC_PINCTRL_GROUPS and
GENERIC_PINMUX_FUNCTIONS.
2. Split pinctrl_register() into two parts
By splitting pinctrl_register() into pinctrl_init_controller()
and pinctrl_create_and_start() we have better control over when
it's safe to call pinctrl_create().
3. Introduce a new pinctrl_register_and_init() function
As suggested by Linus Walleij <linus.walleij@linaro.org>, we
can just introduce a new function for the controllers that need
pinctrl_create() called later.
4. Convert the four known problem cases to use new function
Let's convert pinctrl-imx, pinctrl-single, sh-pfc and ti-iodelay
to use the new function to fix the issues. The rest of the drivers
can be converted later. Let's also update Documentation/pinctrl.txt
accordingly because of the known issues with pinctrl_register().
Fixes:
df61b366af26 ("pinctrl: core: Use delayed work for hogs")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Icenowy Zheng [Tue, 3 Jan 2017 15:16:27 +0000 (23:16 +0800)]
pinctrl: sunxi: add driver for V3s SoC
V3s SoC features only a pin controller (for the lack of CPUs part).
Add a driver for this controller.
Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dan Carpenter [Sat, 7 Jan 2017 06:32:15 +0000 (09:32 +0300)]
pinctrl/amd: white space cleanups in amd_gpio_dbg_show()
We accidentally deleted two tabs from the first line, but even with that
fixed the conditions were not really kernel style. Put the && at the
end of the line so we can align the condition clauses. Also add spaces
around the "+" operator.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Fabio Estevam [Fri, 6 Jan 2017 00:55:43 +0000 (22:55 -0200)]
pinctrl: imx7d-pinctrl: Fix a typo
Fix a typo in "Peripherals".
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Bjorn Andersson [Thu, 5 Jan 2017 16:07:55 +0000 (08:07 -0800)]
pinctrl: Drop error prints on kzalloc() failure
Upon failing kzalloc() will print an error message in the log, so
there's no need for additional printouts. Also standardizes the "!ptr"
vs "ptr == NULL" while I'm touching those lines.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
David Lechner [Thu, 5 Jan 2017 23:43:41 +0000 (17:43 -0600)]
pinctrl: da850-pupd: Add to module device table
This adds the pintrol-da850-pupd driver to the module device table so that
udev will automatically bind the driver to the device.
Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andy Shevchenko [Tue, 10 Jan 2017 20:11:39 +0000 (22:11 +0200)]
pinctrl: baytrail: Convert to use devm_*()
This simplifies error handling and allows us to drop error path handlers
completely.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Mika Westerberg [Tue, 10 Jan 2017 14:31:57 +0000 (17:31 +0300)]
pinctrl: intel: Convert to use devm_gpiochip_add_data()
This simplifies error handling and allows us to drop intel_pinctrl_remove()
completely.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Nishanth Menon [Thu, 5 Jan 2017 18:54:14 +0000 (10:54 -0800)]
pinctrl: Introduce TI IOdelay configuration driver
SoC family such as DRA7 family of processors have, in addition
to the regular muxing of pins (as done by pinctrl-single), a separate
hardware module called IODelay which is also expected to be configured.
The "IODelay" module has it's own register space that is independent
of the control module and the padconf register area.
With recent changes to the pinctrl framework, we can now support
this hardware with a reasonably minimal driver by using #pinctrl-cells,
GENERIC_PINCTRL_GROUPS and GENERIC_PINMUX_FUNCTIONS.
It is advocated strongly in TI's official documentation considering
the existing design of the DRA7 family of processors during mux or
IODelay reconfiguration, there is a potential for a significant glitch
which may cause functional impairment to certain hardware. It is
hence recommended to do as little of muxing as absolutely necessary
without I/O isolation (which can only be done in initial stages of
bootloader).
NOTE: with the system wide I/O isolation scheme present in DRA7 SoC
family, it is not reasonable to do stop all I/O operations for every
such pad configuration scheme. So, we will let it glitch when used in
this mode.
Even with the above limitation, certain functionality such as MMC has
mandatory need for IODelay reconfiguration requirements, depending on
speed of transfer. In these cases, with careful examination of usecase
involved, the expected glitch can be controlled such that it does not
impact functionality.
In short, IODelay module support as a padconf driver being introduced
here is not expected to do SoC wide I/O Isolation and is meant for
a limited subset of IODelay configuration requirements that need to
be dynamic and whose glitchy behavior will not cause functionality
failure for that interface.
IMPORTANT NOTE: we take the approach of keeping LOCK_BITs cleared
to 0x0 at all times, even when configuring Manual IO Timing Modes.
This is done by eliminating the LOCK_BIT=1 setting from Step
of the Manual IO timing Mode configuration procedure. This option
leaves the CFG_* registers unprotected from unintended writes to the
CTRL_CORE_PAD_* registers while Manual IO Timing Modes are configured.
This approach is taken to allow for a generic driver to exist in kernel
world that has to be used carefully in required usecases.
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[tony@atomide.com: updated to use generic pinctrl functions, added
binding documentation, updated comments]
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Fri, 30 Dec 2016 18:37:31 +0000 (10:37 -0800)]
pinctrl: core: Make dt_free_map optional
If the pin controller driver is using devm_kzalloc, there may not be
anything to do for dt_free_map. Let's make it optional to avoid
unncessary boilerplate code.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Maxime Ripard [Sun, 8 Jan 2017 21:31:17 +0000 (22:31 +0100)]
pinctrl: sunxi: Remove old sun5i pinctrl drivers
Now that we have a common pinctrl driver for all the sun5i SoCs, we can
remove the old, separate drivers.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Maxime Ripard [Sun, 8 Jan 2017 21:31:16 +0000 (22:31 +0100)]
pinctrl: sunxi: Add common sun5i pinctrl driver
The sun5i SoCs (A10s, A13, GR8) are all based on the same die fit in
different packages. Hence, the pins and functions available are just the
based on the same set, each SoC having a different subset.
Introduce a common pinctrl driver that supports multiple variants to allow
to put as much as we can in common.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Maxime Ripard [Sun, 8 Jan 2017 21:31:15 +0000 (22:31 +0100)]
pinctrl: sunxi: Add pinctrl variants
Some SoCs are either supposed to be pin compatible (A10 and A20 for
example), or are just repackaged versions of the same die (A10s, A13, GR8).
In those case, having a full blown pinctrl driver just introduces
duplication in both data size and maintainance effort.
Add a variant option to both pins and functions to be able to limit the
pins and functions described only to a subset of the SoC we support with a
given driver.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Jon Hunter [Thu, 5 Jan 2017 15:52:55 +0000 (15:52 +0000)]
pinctrl: Fix panic when pinctrl devices with hogs are unregistered
Commit
df61b366af26 ('pinctrl: core: Use delayed work for hogs')
deferred part of the registration for pinctrl devices if the pinctrl
device has hogs. This introduced a window where if the pinctrl device
with hogs was sucessfully registered, but then unregistered again
(which could be caused by parent device being probe deferred) before
the delayed work has chanced to run, then this will cause a kernel
panic to occur because:
1. The 'pctldev->p' has not yet been initialised and when unregistering
the pinctrl device we only check to see if it is an error value, but
now it could also be NULL.
2. The pinctrl device may not have been added to the 'pinctrldev_list'
list and we don't check to see if it was added before removing.
Fix up the above by checking to see if the 'pctldev->p' pointer is an
error value or NULL before putting the pinctrl device and verifying
that the pinctrl device is present in 'pinctrldev_list' before removing.
Fixes:
df61b366af26 ('pinctrl: core: Use delayed work for hogs')
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Tue, 3 Jan 2017 08:18:58 +0000 (09:18 +0100)]
pinctrl: amd: fix compilation warning
3bfd44306c65 ("pinctrl: amd: Add support for additional GPIO")
created the following warning:
drivers/pinctrl/pinctrl-amd.c: In function 'amd_gpio_dbg_show':
drivers/pinctrl/pinctrl-amd.c:210:3: warning: 'pin_num' may be used uninitialized in this function [-Wmaybe-uninitialized]
for (; i < pin_num; i++) {
^
drivers/pinctrl/pinctrl-amd.c:172:21: warning: 'i' may be used uninitialized in this function [-Wmaybe-uninitialized]
unsigned int bank, i, pin_num;
^
Fix this by adding a guarding default case for illegal
bank numbers.
Cc: S-k Shyam-sundar <Shyam-sundar.S-k@amd.com>
Cc: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gary Bisson [Mon, 2 Jan 2017 18:20:22 +0000 (19:20 +0100)]
pinctrl: imx: use generic pinmux helpers for managing functions
Now using function_desc structure instead of imx_pmx_func.
Also leveraging generic functions to retrieve functions count/name/groups.
The imx_free_funcs function can be removed since it is now handled by
the core driver during unregister.
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gary Bisson [Mon, 2 Jan 2017 18:20:21 +0000 (19:20 +0100)]
pinctrl: imx: use generic pinctrl helpers for managing groups
Now using group_desc structure instead of imx_pin_group.
Also leveraging generic functions to retrieve groups count/name/pins.
The imx_free_pingroups function can be removed since it is now handled by
the core driver during unregister.
Finally the device tree parsing is moved after the pinctrl driver registration
since this latter initializes the radix trees.
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Mon, 2 Jan 2017 08:42:28 +0000 (09:42 +0100)]
pinctrl: qcom: msm8660: rename some SDC1->SDC4
These four pins are for SDC4, not SDC1. They are grouped for
SDC4 later in the file so this must be a typo.
Reviewed-by: Björn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Tue, 27 Dec 2016 17:20:03 +0000 (09:20 -0800)]
pinctrl: single: Use generic pinmux helpers for managing functions
We can now drop the driver specific code for managing functions.
Signed-off-by: Tony Lindgren <tony@atomide.com>
[Replaces GENERIC_PINMUX with GENERIC_PINMUX_FUNCTIONS]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Tue, 27 Dec 2016 17:20:02 +0000 (09:20 -0800)]
pinctrl: single: Use generic pinctrl helpers for managing groups
We can now drop the driver specific code for managing groups.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Tue, 27 Dec 2016 17:20:01 +0000 (09:20 -0800)]
pinctrl: core: Add generic pinctrl functions for managing groups
We can add generic helpers for function handling for cases where the pin
controller driver does not need to use static arrays.
Signed-off-by: Tony Lindgren <tony@atomide.com>
[Renamed the Kconfig item and moved things around]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Fri, 30 Dec 2016 14:04:43 +0000 (15:04 +0100)]
pinctrl: stricten up generic group code
Rename the symbol PINCTRL_GENERIC to PINCTRL_GENERIC_GROUPS since
it all pertains to groups. Replace everywhere.
ifdef out the radix tree and the struct when not using the
generic groups.
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Tue, 27 Dec 2016 17:20:00 +0000 (09:20 -0800)]
pinctrl: core: Add generic pinctrl functions for managing groups
We can add generic helpers for pin group handling for cases where the pin
controller driver does not need to use static arrays.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Fri, 30 Dec 2016 13:44:18 +0000 (14:44 +0100)]
pinctrl: add some comments to the hog/late init code
It confused me a bit so it may confuse others. Make it crystal
clear what is going on here for any future readers.
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Tue, 27 Dec 2016 17:19:59 +0000 (09:19 -0800)]
pinctrl: core: Use delayed work for hogs
Having the pin control framework call pin controller functions
before it's probe has finished is not nice as the pin controller
device driver does not yet have struct pinctrl_dev handle.
Let's fix this issue by adding deferred work for late init. This is
needed to be able to add pinctrl generic helper functions that expect
to know struct pinctrl_dev handle. Note that we now need to call
create_pinctrl() directly as we don't want to add the pin controller
to the list of controllers until the hogs are claimed. We also need
to pass the pinctrl_dev to the device tree parser functions as they
otherwise won't find the right controller at this point.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gary Bisson [Fri, 2 Dec 2016 16:35:19 +0000 (17:35 +0100)]
pinctrl: imx: use radix trees for groups and functions
This change is inspired from the pinctrl-single architecture.
The problem with current implementation is that it isn't possible
to add/remove functions and/or groups dynamically. The radix tree
offers an easy way to do so. The intent is to offer a follow-up
patch later that will enable the use of pinctrl nodes in dt-overlays.
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gary Bisson [Fri, 2 Dec 2016 16:35:18 +0000 (17:35 +0100)]
pinctrl: imx: remove const qualifier of imx_pinctrl_soc_info
Otherwise can't dynamically update fields such as ngroups which can
change over time (with a dt-overlay for instance).
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Colin Ian King [Fri, 23 Dec 2016 00:47:14 +0000 (00:47 +0000)]
pinctrl: single: fix spelling mistakes on "Ivalid"
Trivial fixe to spelling mistake "Ivalid" to "Invalid" in
dev_err error message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Vladimir Zapolskiy [Sun, 25 Dec 2016 00:59:28 +0000 (02:59 +0200)]
pinctrl: simplify check for pin request conflicts
This is a non-functional change, which deletes code duplication in two
of four if-if branches by reordering the checks. Functional identity
of the code change can be shown by running through the whole truth table
of boolean arguments.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
John Crispin [Tue, 20 Dec 2016 18:55:41 +0000 (19:55 +0100)]
pinctrl: update my email address
This patch updates my email address as I no longer have access to the old
one.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Christophe JAILLET [Tue, 20 Dec 2016 05:41:22 +0000 (06:41 +0100)]
pinctrl: sirf: atlas7: Improve code layout
Add some tab in order to improve indentation.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Christophe JAILLET [Tue, 20 Dec 2016 05:40:43 +0000 (06:40 +0100)]
pinctrl: sirf: atlas7: Add missing 'of_node_put()'
Reference to 'sys2pci_np' should be dropped in all cases here, not only in
error handling path.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gabriel Fernandez [Wed, 14 Dec 2016 14:24:16 +0000 (15:24 +0100)]
pinctrl: stm32: activate strict mux mode
This activates strict mode muxing for the STM32 pin controllers,
as these do not allow GPIO and functions to use the same pin
simultaneously.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andreas Klinger [Tue, 13 Dec 2016 23:08:27 +0000 (00:08 +0100)]
pinctrl: fix DT bindings for marvell,kirkwood-pinctrl
On Marvell mv88f6180 mpp pins range from 0 to 19 as well as from 35 to 44.
This is already fixed in commit:
9573e7923007961799beff38bc5c5a7635634eef
This is the documentation change for above commit.
Signed-off-by: Andreas Klinger <ak@it-klinger.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Shah, Nehal-bakulchandra [Tue, 6 Dec 2016 06:47:48 +0000 (12:17 +0530)]
pinctrl: amd: Add support for additional GPIO
This patch adds support for new Bank and adds IRQCHIP_SKIP_SET_WAKE flag.
Reviewed-by: S-k, Shyam-sundar <Shyam-sundar.S-k@amd.com>
Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 20 Dec 2016 07:35:51 +0000 (18:05 +1030)]
pinctrl: aspeed: Fix kerneldoc return descriptions
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 20 Dec 2016 07:35:50 +0000 (18:05 +1030)]
pinctrl: aspeed-g5: Add mux configuration for all pins
The patch introducing the g5 pinctrl driver implemented a smattering of
pins to flesh out the implementation of the core and provide bare-bones
support for some OpenPOWER platforms and the AST2500 evaluation board.
Now, update the bindings document to reflect the complete functionality
and implement the necessary pin configuration tables in the driver.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Joel Stanley <joel@jms.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 20 Dec 2016 07:35:49 +0000 (18:05 +1030)]
pinctrl: aspeed-g4: Add mux configuration for all pins
The patch introducing the g4 pinctrl driver implemented a smattering of
pins to flesh out the implementation of the core and provide bare-bones
support for some OpenPOWER platforms. Now, update the bindings document
to reflect the complete functionality and implement the necessary pin
configuration tables in the driver.
Cc: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Joel Stanley <joel@jms.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 20 Dec 2016 07:35:48 +0000 (18:05 +1030)]
pinctrl: aspeed: Read and write bits in LPC and GFX controllers
The System Control Unit IP block in the Aspeed SoCs is typically where
the pinmux configuration is found, but not always. A number of pins
depend on state in one of LPC Host Control (LHC) or SoC Display
Controller (GFX) IP blocks, so the Aspeed pinmux drivers should have the
means to adjust these as necessary.
We use syscon to cast a regmap over the GFX and LPC blocks, which is
used as an arbitration layer between the relevant driver and the pinctrl
subsystem. The regmaps are then exposed to the SoC-specific pinctrl
drivers by phandles in the devicetree, and are selected during a mux
request by querying a new 'ip' member in struct aspeed_sig_desc.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 20 Dec 2016 07:35:47 +0000 (18:05 +1030)]
pinctrl: aspeed: dt: Fix compatibles for the System Control Unit
Reference the SoC-specific compatible string in the examples as
required.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Joel Stanley <joel@jms.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Torvalds [Mon, 26 Dec 2016 00:13:08 +0000 (16:13 -0800)]
Linux 4.10-rc1
Larry Finger [Fri, 23 Dec 2016 03:06:53 +0000 (21:06 -0600)]
powerpc: Fix build warning on 32-bit PPC
I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:
AS arch/powerpc/kernel/misc_32.o
arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
This problem is evident after commit
989cea5c14be ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created. That was with commit
9994a33865f4 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").
The offending line does not make a lot of sense. This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.
Thanks to Nicholas Piggin for suggesting this solution.
Fixes:
9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 25 Dec 2016 22:56:58 +0000 (14:56 -0800)]
avoid spurious "may be used uninitialized" warning
The timer type simplifications caused a new gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_suspend’:
drivers/base/power/domain.c:562:14: warning: ‘time_start’ may be used uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
despite the actual use of "time_start" not having changed in any way.
It appears that simply changing the type of ktime_t from a union to a
plain scalar type made gcc check the use.
The variable wasn't actually used uninitialized, but gcc apparently
failed to notice that the conditional around the use was exactly the
same as the conditional around the initialization of that variable.
Add an unnecessary initialization just to shut up the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 25 Dec 2016 22:30:04 +0000 (14:30 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
Linus Torvalds [Sun, 25 Dec 2016 22:05:56 +0000 (14:05 -0800)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine
Linus Torvalds [Sun, 25 Dec 2016 22:01:28 +0000 (14:01 -0800)]
Merge branch 'turbostat' of git://git./linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: remove obsolete -M, -m, -C, -c options
tools/power turbostat: Make extensible via the --add parameter
tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
tools/power turbostat: line up headers when -M is used
tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
tools/power turbostat: Support Knights Mill (KNM)
tools/power turbostat: Display HWP OOB status
tools/power turbostat: fix Denverton BCLK
tools/power turbostat: use intel-family.h model strings
tools/power/turbostat: Add Denverton RAPL support
tools/power/turbostat: Add Denverton support
tools/power/turbostat: split core MSR support into status + limit
tools/power turbostat: fix error case overflow read of slm_freq_table[]
tools/power turbostat: Allocate correct amount of fd and irq entries
tools/power turbostat: switch to tab delimited output
tools/power turbostat: Gracefully handle ACPI S3
tools/power turbostat: tidy up output on Joule counter overflow
Nicholas Piggin [Sun, 25 Dec 2016 03:00:30 +0000 (13:00 +1000)]
mm: add PageWaiters indicating tasks are waiting for a page bit
Add a new page flag, PageWaiters, to indicate the page waitqueue has
tasks waiting. This can be tested rather than testing waitqueue_active
which requires another cacheline load.
This bit is always set when the page has tasks on page_waitqueue(page),
and is set and cleared under the waitqueue lock. It may be set when
there are no tasks on the waitqueue, which will cause a harmless extra
wakeup check that will clears the bit.
The generic bit-waitqueue infrastructure is no longer used for pages.
Instead, waitqueues are used directly with a custom key type. The
generic code was not flexible enough to have PageWaiters manipulation
under the waitqueue lock (which simplifies concurrency).
This improves the performance of page lock intensive microbenchmarks by
2-3%.
Putting two bits in the same word opens the opportunity to remove the
memory barrier between clearing the lock bit and testing the waiters
bit, after some work on the arch primitives (e.g., ensuring memory
operand widths match and cover both bits).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicholas Piggin [Sun, 25 Dec 2016 03:00:29 +0000 (13:00 +1000)]
mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked
A page is not added to the swap cache without being swap backed,
so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Sun, 25 Dec 2016 11:43:07 +0000 (12:43 +0100)]
ktime: Get rid of ktime_equal()
No point in going through loops and hoops instead of just comparing the
values.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Sun, 25 Dec 2016 11:30:41 +0000 (12:30 +0100)]
ktime: Cleanup ktime_set() usage
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Sun, 25 Dec 2016 10:38:40 +0000 (11:38 +0100)]
ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
variant for 32bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
become completely pointless.
Get rid of the union and just keep ktime_t as simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Wed, 21 Dec 2016 19:32:01 +0000 (20:32 +0100)]
clocksource: Use a plain u64 instead of cycle_t
There is no point in having an extra type for extra confusion. u64 is
unambiguous.
Conversion was done with the following coccinelle script:
@rem@
@@
-typedef u64 cycle_t;
@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:57 +0000 (20:19 +0100)]
irqchip/armada-xp: Consolidate hotplug state space
The mpic is either the main interrupt controller or is cascaded behind a
GIC. The mpic is single instance and the modes are mutually exclusive, so
there is no reason to have seperate cpu hotplug states.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20161221192112.333161745@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:56 +0000 (20:19 +0100)]
irqchip/gic: Consolidate hotplug state space
Even if both drivers are compiled in only one instance can run on a given
system depending on the available GIC version.
So having seperate hotplug states for them is pointless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.252416267@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:55 +0000 (20:19 +0100)]
coresight/etm3/4x: Consolidate hotplug state space
Even if both drivers are compiled in only one instance can run on a given
system depending on the available tracer cell.
So having seperate hotplug states for them is pointless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lkml.kernel.org/r/20161221192112.162765484@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:54 +0000 (20:19 +0100)]
cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.
Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:53 +0000 (20:19 +0100)]
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
hotcpu_notifier(), cpu_notifier(), __hotcpu_notifier(), __cpu_notifier(),
register_hotcpu_notifier(), register_cpu_notifier(),
__register_hotcpu_notifier(), __register_cpu_notifier(),
unregister_hotcpu_notifier(), unregister_cpu_notifier(),
__unregister_hotcpu_notifier(), __unregister_cpu_notifier()
are unused now. Remove them and all related code.
Remove also the now pointless cpu notifier error injection mechanism. The
states can be executed step by step and error rollback is the same as cpu
down, so any state transition can be tested w/o requiring the notifier
error injection.
Some CPU hotplug states are kept as they are (ab)used for hotplug state
tracking.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161221192112.005642358@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Anna-Maria Gleixner [Wed, 21 Dec 2016 19:19:52 +0000 (20:19 +0100)]
staging/lustre/libcfs: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: devel@driverdev.osuosl.org
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: rt@linutronix.de
Cc: lustre-devel@lists.lustre.org
Link: http://lkml.kernel.org/r/20161202110027.htzzeervzkoc4muv@linutronix.de
Link: http://lkml.kernel.org/r/20161221192111.922872524@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sebastian Andrzej Siewior [Wed, 21 Dec 2016 19:19:51 +0000 (20:19 +0100)]
scsi/bnx2i: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.836895753@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sebastian Andrzej Siewior [Wed, 21 Dec 2016 19:19:50 +0000 (20:19 +0100)]
scsi/bnx2fc: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.757309869@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:49 +0000 (20:19 +0100)]
cpu/hotplug: Prevent overwriting of callbacks
Developers manage to overwrite states blindly without thought. That's fatal
and hard to debug. Add sanity checks to make it fail.
This requries to restructure the code so that the dynamic state allocation
happens in the same lock protected section as the actual store. Otherwise
the previous assignment of 'Reserved' to the name field would trigger the
overwrite check.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192111.675234535@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 22 Dec 2016 09:32:38 +0000 (10:32 +0100)]
x86/msr: Remove bogus cleanup from the error path
The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.
Fixes:
8fba38c937cd ("x86/msr: Convert to hotplug state machine")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Thomas Gleixner [Thu, 22 Dec 2016 10:14:06 +0000 (11:14 +0100)]
bus: arm-ccn: Prevent hotplug callback leak
In case the driver registration fails, the hotplug callback is leaked.
Not fatal, because it's never invoked as there are no instances registered,
but wrong nevertheless.
Fixes:
fdc15a36d84e ("bus/arm-ccn: Convert to hotplug statemachine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Thomas Gleixner [Thu, 22 Dec 2016 10:02:08 +0000 (11:02 +0100)]
perf/x86/intel/cstate: Prevent hotplug callback leak
If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.
Replace the nonsensical state names with proper ones while at it.
Fixes:
77c34ef1c319 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Thomas Gleixner [Wed, 21 Dec 2016 19:19:48 +0000 (20:19 +0100)]
ARM/imx/mmcd: Fix broken cpu hotplug handling
The cpu hotplug support of this perf driver is broken in several ways:
1) It adds a instance before setting up the state.
2) The state for the instance is different from the state of the
callback. It's just a randomly chosen state.
3) The instance registration is not error checked so nobody noticed that
the call can never succeed.
4) The state for the multi install callbacks is chosen randomly and
overwrites existing state. This is now prevented by the core code so the
call is guaranteed to fail.
5) The error exit path in the init function leaves the instance registered
and then frees the memory which contains the enqueued hlist node.
6) The remove function is removing the state and not the instance.
Fix it by:
- Setting up the state before adding instances. Use a dynamically allocated
state for it.
- Installing instances after the state has been set up
- Removing the instance in the error path before freeing memory
- Removing the instance not the state in the driver remove callback
While at is use raw_cpu_processor_id(), because cpu_processor_id() cannot
be used in preemptible context, and set the driver data after successful
registration of the pmu.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Frank Li <frank.li@nxp.com>
Cc: Zhengyu Shen <zhengyu.shen@nxp.com>
Link: http://lkml.kernel.org/r/20161221192111.596204211@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sat, 24 Dec 2016 11:34:02 +0000 (12:34 +0100)]
scsi: qedi: Convert to hotplug state machine
The CPU hotplug code is a trainwreck. It leaks a notifier in case of driver
registration error and the per cpu loop is racy against cpu hotplug. Aside
of that the driver should have been written and merged with the new state
machine interfaces in the first place.
Mop up the mess and Convert it to the hotplug state machine.
Signed-off-by: Thomas Grumpy Gleixner <tglx@linutronix.de>
Cc: Nilesh Javali <nilesh.javali@cavium.com>
Cc: Adheer Chandravanshi <adheer.chandravanshi@qlogic.com>
Cc: Chad Dupuis <chad.dupuis@cavium.com>
Cc: Saurav Kashyap <saurav.kashyap@cavium.com>
Cc: Arun Easi <arun.easi@cavium.com>
Cc: Manish Rangankar <manish.rangankar@cavium.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Len Brown [Sat, 24 Dec 2016 20:18:37 +0000 (15:18 -0500)]
tools/power turbostat: remove obsolete -M, -m, -C, -c options
The new --add option has replaced the -M, -m, -C, -c options
Eg.
-M 0x10 is now --add msr0x10,raw
-m 0x10 is now --add msr0x10,raw,u32
-C 0x10 is now --add msr0x10,delta
-c 0x10 is now --add msr0x10,delta,u32
The --add option can be repeated to add any number of counters,
while the previous options were limited to adding one of each type.
In addition, the --add option can accept a column label,
and can also display a counter as a percentage of elapsed cycles.
Eg. --add msr0x3fe,core,percent,MY_CC3
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 23 Dec 2016 04:57:55 +0000 (23:57 -0500)]
tools/power turbostat: Make extensible via the --add parameter
Create the "--add" parameter. This can be used to teach an existing
turbostat binary about any number of any type of counter.
turbostat(8) details the syntax for --add.
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Sat, 24 Dec 2016 19:46:01 +0000 (11:46 -0800)]
Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 24 Dec 2016 19:37:18 +0000 (11:37 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"This ncludes various cifs/smb3 bug fixes, mostly for stable as well.
In the next week I expect that Germano will have some reconnection
fixes, and also I expect to have the remaining pieces of the snapshot
enablement and SMB3 ACLs, but wanted to get this set of bug fixes in"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_get_root shouldn't use path with tree name
Fix default behaviour for empty domains and add domainauto option
cifs: use %16phN for formatting md5 sum
cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack
CIFS: Fix a possible double locking of mutex during reconnect
CIFS: Fix a possible memory corruption during reconnect
CIFS: Fix a possible memory corruption in push locks
CIFS: Fix missing nls unload in smb2_reconnect()
CIFS: Decrease verbosity of ioctl call
SMB3: parsing for new snapshot timestamp mount parm
Linus Torvalds [Sat, 24 Dec 2016 19:27:45 +0000 (11:27 -0800)]
Merge tag 'watchdog-for-linus-v4.10' of git://git./linux/kernel/git/groeck/linux-staging
Pull watchdog updates from Wim Van Sebroeck and Guenter Roeck:
- new driver for Add Loongson1 SoC
- minor cleanup and fixes in various drivers
* tag 'watchdog-for-linus-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
watchdog: it87_wdt: add IT8620E ID
watchdog: mpc8xxx: Remove unneeded linux/miscdevice.h include
watchdog: octeon: Remove unneeded linux/miscdevice.h include
watchdog: bcm2835_wdt: set WDOG_HW_RUNNING bit when appropriate
watchdog: loongson1: Add Loongson1 SoC watchdog driver
watchdog: cpwd: remove memory allocate failure message
watchdog: da9062/61: watchdog driver
intel-mid_wdt: Error code is just an integer
intel-mid_wdt: make sure watchdog is not running at startup
watchdog: mei_wdt: request stop on reboot to prevent false positive event
watchdog: hpwdt: changed maintainer information
watchdog: jz4740: Fix modular build
watchdog: qcom: fix kernel panic due to external abort on non-linefetch
watchdog: davinci: add support for deferred probing
watchdog: meson: Remove unneeded platform MODULE_ALIAS
watchdog: Standardize leading tabs and spaces in Kconfig file
watchdog: max77620_wdt: fix module autoload
watchdog: bcm7038_wdt: fix module autoload
Linus Torvalds [Sat, 24 Dec 2016 19:23:24 +0000 (11:23 -0800)]
Merge tag 'ntb-4.10' of git://github.com/jonmason/ntb
Pull NTB update from Jon Mason:
- NTB bug fixes for removing an unnecessary call to ntb_peer_spad_read,
and correcting a free_irq inconsistency
- add Intel SKX support
- change the AMD NTB maintainer, and fix some bugs present there
* tag 'ntb-4.10' of git://github.com/jonmason/ntb:
ntb_transport: Remove unnecessary call to ntb_peer_spad_read
NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy
ntb: fix SKX NTB config space size register offsets
NTB: correct ntb_peer_spad_read for case when callback is not supplied.
MAINTAINERS: Change in maintainer for AMD NTB
ntb_transport: Limit memory windows based on available, scratchpads
NTB: Register and offset values fix for memory window
NTB: add support for hotplug feature
ntb: Adding Skylake Xeon NTB support
Linus Torvalds [Sat, 24 Dec 2016 00:54:46 +0000 (16:54 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"There's a number of fixes:
- a round of fixes for CPUID-less legacy CPUs
- a number of microcode loader fixes
- i8042 detection robustization fixes
- stack dump/unwinder fixes
- x86 SoC platform driver fixes
- a GCC 7 warning fix
- virtualization related fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
Revert "x86/unwind: Detect bad stack return address"
x86/paravirt: Mark unused patch_default label
x86/microcode/AMD: Reload proper initrd start address
x86/platform/intel/quark: Add printf attribute to imr_self_test_result()
x86/platform/intel-mid: Switch MPU3050 driver to IIO
x86/alternatives: Do not use sync_core() to serialize I$
x86/topology: Document cpu_llc_id
x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
x86/asm: Rewrite sync_core() to use IRET-to-self
x86/microcode/intel: Replace sync_core() with native_cpuid()
Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
x86/tools: Fix gcc-7 warning in relocs.c
x86/unwind: Dump stack data on warnings
x86/unwind: Adjust last frame check for aligned function stacks
x86/init: Fix a couple of comment typos
x86/init: Remove i8042_detect() from platform ops
Input: i8042 - Trust firmware a bit more when probing on X86
x86/init: Add i8042 state to the platform data
...
Linus Torvalds [Sat, 24 Dec 2016 00:51:16 +0000 (16:51 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"ARM/MOXA SoC clocksource driver fixes"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/moxart: Plug memory and mapping leaks
Linus Torvalds [Sat, 24 Dec 2016 00:49:12 +0000 (16:49 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
plus on the tooling side there's a number of fixes and some late
updates"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf sched timehist: Fix invalid period calculation
perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
perf sched timehist: Enlarge default 'comm_width'
perf sched timehist: Honour 'comm_width' when aligning the headers
perf/x86: Fix overlap counter scheduling bug
perf/x86/pebs: Fix handling of PEBS buffer overflows
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
tools lib bpf: Add bpf_prog_{attach,detach}
samples/bpf: Switch over to libbpf
perf diff: Do not overwrite valid build id
perf annotate: Don't throw error for zero length symbols
perf bench futex: Fix lock-pi help string
perf trace: Check if MAP_32BIT is defined (again)
samples/bpf: Make perf_event_read() static
uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
samples/bpf: Make samples more libbpf-centric
tools lib bpf: Add flags to bpf_create_map()
tools lib bpf: use __u32 from linux/types.h
...
Linus Torvalds [Sat, 24 Dec 2016 00:47:25 +0000 (16:47 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"A build warning fix with certain .config's"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/st: Mark st_irq_syscfg_resume() __maybe_unused
Steve Wahl [Wed, 21 Dec 2016 16:45:22 +0000 (11:45 -0500)]
ntb_transport: Remove unnecessary call to ntb_peer_spad_read
The results were previously ignored, anyway.
Signed-off-by: Steve Wahl <Steve.Wahl@dell.com>
Fixes:
e26a5843f7f5014ae4460030ca4de029a3ac35d3
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Christophe JAILLET [Mon, 19 Dec 2016 05:52:55 +0000 (06:52 +0100)]
NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy
'request_irq()' and 'free_irq()' should have the same 'dev_id'.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Tue, 13 Dec 2016 16:03:13 +0000 (09:03 -0700)]
ntb: fix SKX NTB config space size register offsets
The offsets for the SZ registers are wrong. Updated.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Sandeep Mann <sandeep@purestorage.com>
Tested-by: Zachary Ross <zacharyx.ross@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Steven Wahl [Thu, 8 Dec 2016 17:02:28 +0000 (17:02 +0000)]
NTB: correct ntb_peer_spad_read for case when callback is not supplied.
Correct ntb_peer_spad_read for case when callback is not supplied
Signed-off-by: Steve Wahl <Steve.Wahl@dell.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Shyam Sundar S K [Wed, 7 Dec 2016 17:18:39 +0000 (22:48 +0530)]
MAINTAINERS: Change in maintainer for AMD NTB
I would like to take maintainership for AMD NTB
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Shyam Sundar S K [Wed, 7 Dec 2016 17:07:05 +0000 (22:37 +0530)]
ntb_transport: Limit memory windows based on available, scratchpads
When the underlying NTB H/W driver advertises more memory windows
than the number of scratchpads available to setup MW's, it is likely
that we may end up filling the remaining memory windows with garbage.
So to avoid that, lets limit the memory windows that transport driver
can setup based on the available scratchpads.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Shyam Sundar S K [Thu, 1 Dec 2016 19:14:28 +0000 (00:44 +0530)]
NTB: Register and offset values fix for memory window
Due to incorrect limit and translation register values, NTB link was
going down when the memory window was setup. Made appropriate changes
as per spec.
Fix limit register values for BAR1, which was overlapping
with the BAR23 address.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Xiangliang Yu [Fri, 18 Nov 2016 09:21:41 +0000 (14:51 +0530)]
NTB: add support for hotplug feature
AMD NTB support hotplug under B2B mode. NTB will trigger link
up/down interrupt event when doing plug add/remove, this patch
implements the two interrupt event to support B2B hotplug function.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Wed, 16 Nov 2016 21:03:38 +0000 (14:03 -0700)]
ntb: Adding Skylake Xeon NTB support
The Skylake Xeon NTB hardware has made some changes to the register name,
offset, and the way doorbells work. Adding driver support for the new
hardware.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Josh Poimboeuf [Thu, 22 Dec 2016 15:02:49 +0000 (09:02 -0600)]
Revert "x86/unwind: Detect bad stack return address"
Revert the following commit:
b6959a362177 ("x86/unwind: Detect bad stack return address")
... because Andrey Konovalov reported an unwinder warning:
WARNING: unrecognized kernel stack return address
ffffffffa0000001 at
ffff88006377fa18 in a.out:4467
The unwind was initiated from an interrupt which occurred while running in the
generated code for a kprobe. The unwinder printed the warning because it
expected regs->ip to point to a valid text address, but instead it pointed to
the generated code.
Eventually we may want come up with a way to identify generated kprobe
code so the unwinder can know that it's a valid return address. Until
then, just remove the warning.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 23 Dec 2016 19:23:29 +0000 (20:23 +0100)]
Merge tag 'perf-urgent-for-mingo-
20161222' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
Fixes for 'perf sched timehist': (Namhyung Kim)
- Define a larger initial alignment value for the COMM column and
make it be more consistently honoured, for instance in the header.
- Fix invalid period calculation when using the --time option to
select a time slice, when events outside that slice were being
considered for the per cpu idle stats summary.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Fri, 23 Dec 2016 19:23:25 +0000 (11:23 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) We have to be careful to not try and place a checksum after the end
of a rawv6 packet, fix from Dave Jones with help from Hannes
Frederic Sowa.
2) Missing memory barriers in tcp_tasklet_func() lead to crashes, from
Eric Dumazet.
3) Several bug fixes for the new XDP support in virtio_net, from Jason
Wang.
4) Increase headroom in RX skbs in be2net driver to accomodate
encapsulations such as geneve. From Kalesh A P.
5) Fix SKB frag unmapping on TX in mvpp2, from Thomas Petazzoni.
6) Pre-pulling UDP headers created a regression in RECVORIGDSTADDR
socket option support, from Willem de Bruijn.
7) UID based routing added a potential OOPS in ip_do_redirect() when we
see an SKB without a socket attached. We just need it for the
network namespace which we can get from skb->dev instead. Fix from
Lorenzo Colitti.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
sctp: fix recovering from 0 win with small data chunks
sctp: do not loose window information if in rwnd_over
virtio-net: XDP support for small buffers
virtio-net: remove big packet XDP codes
virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support
virtio-net: make rx buf size estimation works for XDP
virtio-net: unbreak csumed packets for XDP_PASS
virtio-net: correctly handle XDP_PASS for linearized packets
virtio-net: fix page miscount during XDP linearizing
virtio-net: correctly xmit linearized page on XDP_TX
virtio-net: remove the warning before XDP linearizing
mlxsw: spectrum_router: Correctly remove nexthop groups
mlxsw: spectrum_router: Don't reflect dead neighs
neigh: Send netevent after marking neigh as dead
ipv6: handle -EFAULT from skb_copy_bits
inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets
net/sched: cls_flower: Mandate mask when matching on flags
net/sched: act_tunnel_key: Fix setting UDP dst port in metadata under IPv6
stmmac: CSR clock configuration fix
net: ipv4: Don't crash if passing a null sk to ip_do_redirect.
...
Marcelo Ricardo Leitner [Fri, 23 Dec 2016 16:29:37 +0000 (14:29 -0200)]
sctp: fix recovering from 0 win with small data chunks
Currently if SCTP closes the receive window with window pressure, mostly
caused by excessive skb overhead on payload/overheads ratio, SCTP will
close the window abruptly while saving the delta on rwnd_press. It will
start recovering rwnd as the chunks are consumed by the application and
the rwnd_press will be only recovered after rwnd reach the same value as
of rwnd_press, mostly to prevent silly window syndrome.
Thing is, this is very inefficient with small data chunks, as with those
it will never reach back that value, and thus it will never recover from
such pressure. This means that we will not issue window updates when
recovering from 0 window and will rely on a sender retransmit to notice
it.
The fix here is to remove such threshold, as no value is good enough: it
depends on the (avg) chunk sizes being used.
Test with netperf -t SCTP_STREAM -- -m 1, and trigger 0 window by
sending SIGSTOP to netserver, sleep 1.2, and SIGCONT.
Rate limited to 845kbps, for visibility. Capture done at netserver side.
Previously:
01.500751 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632372996] [a_rwnd 99153] [
01.500752 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632372997] [SID: 0] [SS
01.517471 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373010] [SID: 0] [SS
01.517483 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373009] [a_rwnd 0] [#gap
01.517485 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373083] [SID: 0] [SS
01.517488 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373009] [a_rwnd 0] [#gap
01.534168 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373096] [SID: 0] [SS
01.534180 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373009] [a_rwnd 0] [#gap
01.534181 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373169] [SID: 0] [SS
01.534185 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373009] [a_rwnd 0] [#gap
02.525978 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373010] [SID: 0] [SS
02.526021 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373009] [a_rwnd 0] [#gap
(window update missed)
04.573807 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373010] [SID: 0] [SS
04.779370 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373082] [a_rwnd 859] [#g
04.789162 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373083] [SID: 0] [SS
04.789323 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN:
632373156] [SID: 0] [SS
04.789372 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack
632373228] [a_rwnd 786] [#g
After:
02.568957 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098728] [a_rwnd 99153]
02.568961 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098729] [SID: 0] [S
02.585631 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098742] [SID: 0] [S
02.585666 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 0] [#ga
02.585671 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098815] [SID: 0] [S
02.585683 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 0] [#ga
02.602330 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098828] [SID: 0] [S
02.602359 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 0] [#ga
02.602363 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098901] [SID: 0] [S
02.602372 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 0] [#ga
03.600788 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098742] [SID: 0] [S
03.600830 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 0] [#ga
03.619455 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 13508]
03.619479 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 27017]
03.619497 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 40526]
03.619516 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 54035]
03.619533 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 67544]
03.619552 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 81053]
03.619570 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098741] [a_rwnd 94562]
(following data transmission triggered by window updates above)
03.633504 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098742] [SID: 0] [S
03.836445 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098814] [a_rwnd 100000]
03.843125 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098815] [SID: 0] [S
03.843285 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098888] [SID: 0] [S
03.843345 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack
2490098960] [a_rwnd 99894]
03.856546 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490098961] [SID: 0] [S
03.866450 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN:
2490099011] [SID: 0] [S
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 23 Dec 2016 16:29:02 +0000 (14:29 -0200)]
sctp: do not loose window information if in rwnd_over
It's possible that we receive a packet that is larger than current
window. If it's the first packet in this way, it will cause it to
increase rwnd_over. Then, if we receive another data chunk (specially as
SCTP allows you to have one data chunk in flight even during 0 window),
rwnd_over will be overwritten instead of added to.
In the long run, this could cause the window to grow bigger than its
initial size, as rwnd_over would be charged only for the last received
data chunk while the code will try open the window for all packets that
were received and had its value in rwnd_over overwritten. This, then,
can lead to the worsening of payload/buffer ratio and cause rwnd_press
to kick in more often.
The fix is to sum it too, same as is done for rwnd_press, so that if we
receive 3 chunks after closing the window, we still have to release that
same amount before re-opening it.
Log snippet from sctp_test exhibiting the issue:
[ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:
ffff88013928e000
rwnd decreased by 1 to (0, 1, 114221)
[ 146.209232] sctp: sctp_assoc_rwnd_decrease:
association:
ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
[ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:
ffff88013928e000
rwnd decreased by 1 to (0, 1, 114221)
[ 146.209232] sctp: sctp_assoc_rwnd_decrease:
association:
ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
[ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:
ffff88013928e000
rwnd decreased by 1 to (0, 1, 114221)
[ 146.209232] sctp: sctp_assoc_rwnd_decrease:
association:
ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1!
[ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:
ffff88013928e000
rwnd decreased by 1 to (0, 1, 114221)
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 23 Dec 2016 18:52:43 +0000 (10:52 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull final vfs updates from Al Viro:
"Assorted cleanups and fixes all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sg_write()/bsg_write() is not fit to be called under KERNEL_DS
ufs: fix function declaration for ufs_truncate_blocks
fs: exec: apply CLOEXEC before changing dumpable task flags
seq_file: reset iterator to first record for zero offset
vfs: fix isize/pos/len checks for reflink & dedupe
[iov_iter] fix iterate_all_kinds() on empty iterators
move aio compat to fs/aio.c
reorganize do_make_slave()
clone_private_mount() doesn't need to touch namespace_sem
remove a bogus claim about namespace_sem being held by callers of mnt_alloc_id()
David S. Miller [Fri, 23 Dec 2016 18:48:56 +0000 (13:48 -0500)]
Merge branch 'virtio-net-xdp-fixes'
Jason Wang says:
====================
several fixups for virtio-net XDP
Merry Xmas and a Happy New year to all:
This series tries to fixes several issues for virtio-net XDP which
could be categorized into several parts:
- fix several issues during XDP linearizing
- allow csumed packet to work for XDP_PASS
- make EWMA rxbuf size estimation works for XDP
- forbid XDP when GUEST_UFO is support
- remove big packet XDP support
- add XDP support or small buffer
Please see individual patches for details.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:32 +0000 (22:37 +0800)]
virtio-net: XDP support for small buffers
Commit
f600b6905015 ("virtio_net: Add XDP support") leaves the case of
small receive buffer untouched. This will confuse the user who want to
set XDP but use small buffers. Other than forbid XDP in small buffer
mode, let's make it work. XDP then can only work at skb->data since
virtio-net create skbs during refill, this is sub optimal which could
be optimized in the future.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:31 +0000 (22:37 +0800)]
virtio-net: remove big packet XDP codes
Now we in fact don't allow XDP for big packets, remove its codes.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:30 +0000 (22:37 +0800)]
virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support
When VIRTIO_NET_F_GUEST_UFO is negotiated, host could still send UFO
packet that exceeds a single page which could not be handled
correctly by XDP. So this patch forbids setting XDP when GUEST_UFO is
supported. While at it, forbid XDP for ECN (which comes only from GRO)
too to prevent user from misconfiguration.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:29 +0000 (22:37 +0800)]
virtio-net: make rx buf size estimation works for XDP
We don't update ewma rx buf size in the case of XDP. This will lead
underestimation of rx buf size which causes host to produce more than
one buffers. This will greatly increase the possibility of XDP page
linearization.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:28 +0000 (22:37 +0800)]
virtio-net: unbreak csumed packets for XDP_PASS
We drop csumed packet when do XDP for packets. This breaks
XDP_PASS when GUEST_CSUM is supported. Fix this by allowing csum flag
to be set. With this patch, simple TCP works for XDP_PASS.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:27 +0000 (22:37 +0800)]
virtio-net: correctly handle XDP_PASS for linearized packets
When XDP_PASS were determined for linearized packets, we try to get
new buffers in the virtqueue and build skbs from them. This is wrong,
we should create skbs based on existed buffers instead. Fixing them by
creating skb based on xdp_page.
With this patch "ping 192.168.100.4 -s 3900 -M do" works for XDP_PASS.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:26 +0000 (22:37 +0800)]
virtio-net: fix page miscount during XDP linearizing
We don't put page during linearizing, the would cause leaking when
xmit through XDP_TX or the packet exceeds PAGE_SIZE. Fix them by
put page accordingly. Also decrease the number of buffers during
linearizing to make sure caller can free buffers correctly when packet
exceeds PAGE_SIZE. With this patch, we won't get OOM after linearize
huge number of packets.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 23 Dec 2016 14:37:25 +0000 (22:37 +0800)]
virtio-net: correctly xmit linearized page on XDP_TX
After we linearize page, we should xmit this page instead of the page
of first buffer which may lead unexpected result. With this patch, we
can see correct packet during XDP_TX.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>