Laxman Dewangan [Tue, 19 Apr 2016 07:22:00 +0000 (12:52 +0530)]
thermal: generic-adc: Add DT binding for ADC based thermal sensor
Sometimes, thermal sensors like NCT thermistors are connected to
the ADC channel. The temperature is read by reading the voltage
across the sensor resistance via ADC and referring the lookup
table for ADC value to temperature.
Add DT binding doc for the ADC based thermal sensor driver to
detail the DT property and provide the example for how to use it.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Acked-by: Rob Herring <rob@kernel.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Wed, 6 Apr 2016 09:48:04 +0000 (17:48 +0800)]
thermal: tegra: fix static checker warning
There has a static checker warning:
warn: variable dereferenced before check 'dev' (see line 222)
Since check 'dev' is unnecessary, so remove this check.
Fixes:
ee6d79f202a4 ("thermal: tegra: add thermtrip function")
Signed-off-by: Wei Ni <wni@nvidia.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Arnd Bergmann [Sat, 16 Apr 2016 20:19:33 +0000 (22:19 +0200)]
thermal: tegra: mark PM functions __maybe_unused
After the PM support has been added to this driver, we get
a harmless warning when that support is disabled at compile
time:
drivers/thermal/tegra/soctherm.c:641:12: error: 'soctherm_resume' defined but not used [-Werror=unused-function]
static int soctherm_resume(struct device *dev)
This marks the two PM functions as __maybe_unused to shut up
the warning. This is preferred over adding an #ifdef around
them, as it is harder to get wrong, and provides better
compile-time coverage.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes:
a134b4143b65 ("thermal: tegra: add PM support")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Marc Gonzalez [Tue, 19 Apr 2016 14:21:16 +0000 (16:21 +0200)]
thermal: add temperature sensor support for tango SoC
The Tango thermal driver provides support for the primitive temperature
sensor embedded in Tango chips since the SMP8758.
This sensor only generates a 1-bit signal to indicate whether the die
temperature exceeds a programmable threshold.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Leo Yan [Tue, 29 Mar 2016 11:27:13 +0000 (19:27 +0800)]
thermal: hisilicon: fix IRQ imbalance enabling
When register sensors into thermal zone during initialization phase, it
reports error for IRQ imbalance enabling:
[ 2.040713] WARNING: at kernel/irq/manage.c:513
[ 2.040719] Modules linked in:
[ 2.040721]
[ 2.040729] CPU: 1 PID: 804 Comm: irq/33-hisi_the Not tainted 4.5.0-rc4+ #505
[ 2.040732] Hardware name: HiKey Development Board (DT)
[ 2.040736] task:
ffffffc03ae82580 ti:
ffffffc0379c8000 task.ti:
ffffffc0379c8000
[ 2.040745] PC is at __enable_irq+0x74/0x84
[ 2.040749] LR is at __enable_irq+0x74/0x84
This warning is for IRQ imbalance enabling, which is caused by
enable_irq() twice. During sensor's initialization it tries to enable
IRQ, the driver will call thermal_zone_of_sensor_register() to bind
sensors and read sensor's temperature. But at this moment the flag
"data->irq_enabled" has been not initialized as correct state, so it
finally introduces the function enabled_irq() to be called twice. In
essentially this is caused by the flag "data->irq_enabled" is
inconsistent with real hardware IRQ enabling state.
So this patch is to fix this issue, firstly init "irq_enabled" flag
before binding sensors to thermal zone. Also change to use the function
irq_get_irqchip_state() to read back real interrupt line state.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Leo Yan [Tue, 29 Mar 2016 11:27:12 +0000 (19:27 +0800)]
thermal: hisilicon: support to use any sensor
In current code sensor driver registers all 4 sensors together and if
any of them has not bound to thermal zone successfully then driver will
return failure for driver's initialization. As a result, if DT binds
thermal zone with only one sensor, then the thermal driver will not work
well anymore.
So this patch is to fix this issue. It allows the thermal sensor driver
can register any number sensors at initialization phase, and fix up code
for other related code to skip related sensor's accessing if the sensor
has not been enabled in initialization phase.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Simon Horman [Tue, 29 Mar 2016 05:07:45 +0000 (14:07 +0900)]
thermal: rcar: Remove binding docs for r8a7794
The latest information that I have is that there is no thermal IP
block present on the r8a7794 SoC so remove the corresponding binding.
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:22 +0000 (18:29 +0800)]
thermal: tegra: add PM support
Add suspend/resume function in soctherm driver.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:21 +0000 (18:29 +0800)]
thermal: tegra: handle HW initialization in one funcotion
Handle HW initialization in one function soctherm_init(),
so that the codes are more clear.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:20 +0000 (18:29 +0800)]
thermal: tegra: handle clocks in one function
Handle clock enable/disable codes in one function
soctherm_clk_enable(), so that the codes are more clear.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:19 +0000 (18:29 +0800)]
thermal: tegra: add thermtrip function
Add support for hardware critical thermal limits to the
SOC_THERM driver. It use the Linux thermal framework to
create critical trip temp, and set it to SOC_THERM hardware.
If these limits are breached, the chip will reset, and if
appropriately configured, will turn off the PMIC.
This support is critical for safe usage of the chip.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:18 +0000 (18:29 +0800)]
of: add notes of critical trips for soctherm
The "critical" type trip in thermal zone can be
set to SOC_THERM hardware, it can trigger shut down
or reset event from hardware.
Signed-off-by: Wei Ni <wni@nvidia.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:17 +0000 (18:29 +0800)]
thermal: of-thermal: allow setting trip_temp on hardware
In current of-thermal, the .set_trip_temp only support to
set trip_temp for SW. But some sensors support to set
trip_temp on hardware, so that can trigger interrupt,
shutdown or any other events.
This patch adds .set_trip_temp() callback in
thermal_zone_of_device_ops{}, so that the sensor device can
use it to set trip_temp on hardware.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:16 +0000 (18:29 +0800)]
thermal: tegra: add a debugfs to show registers
Add a debugfs interface to show register contents for debug.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:15 +0000 (18:29 +0800)]
thermal: tegra: add Tegra210 specific SOC_THERM driver
Add Tegra210 specific SOC_THERM driver.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:14 +0000 (18:29 +0800)]
thermal: tegra: split tegra_soctherm driver
Split most of the Tegra124 data and code into a Tegra124-specific
file.
Split most of the fuse-related code into a fuse-related source file.
This is in preparation for adding a Tegra210-specific driver in a
future patch.
Beyond the maintainability improvements, this is intended to separate
chip-specific ATE and characterization-related hacks into chip-specific
files, in the hopes that they won't pollute code for other chips.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:13 +0000 (18:29 +0800)]
thermal: tegra: get rid of PDIV/HOTSPOT hack
Get rid of T124-specific PDIV/HOTSPOT hack.
tegra-soctherm.c contained a hack to set the SENSOR_PDIV and
SENSOR_HOTSPOT_OFFSET registers - it just did two writes of
T124-specific opaque values. Convert these into a form that can be
substituted on a per-chip basis, and into structure fields that have
at least some independent meaning.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:12 +0000 (18:29 +0800)]
thermal: tegra: combine sensor group-related data
Combine sensor group-related data structures into struct
tegra_tsensor_group. This provides a single location for
sensor group data storage.
More sensor group data will be added in subsequent patches.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Wei Ni [Tue, 29 Mar 2016 10:29:11 +0000 (18:29 +0800)]
thermal: tegra: move tegra thermal files into tegra directory
Move Tegra soctherm driver to tegra directory, it's easy to maintain
and add more new function support for Tegra platforms.
This will also help to split soctherm driver into common parts and
chip specific data related parts.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:12:52 +0000 (13:12 -0800)]
thermal: convert ti-thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:12:13 +0000 (13:12 -0800)]
thermal: convert tegra_thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:10:28 +0000 (13:10 -0800)]
thermal: convert rockchip_thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: linux-pm@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Caesar Wang <wxt@rock-chips.com>
Reviewed-by: Caesar Wang <wxt@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:09:43 +0000 (13:09 -0800)]
thermal: convert rcar_thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:08:48 +0000 (13:08 -0800)]
thermal: convert qcom-spmi to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:08:14 +0000 (13:08 -0800)]
thermal: convert mtk_thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:07:13 +0000 (13:07 -0800)]
thermal: convert hisi_thermal to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:05:42 +0000 (13:05 -0800)]
input: convert sun4i-ts to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Lukasz Majewski <l.majewski@samsung.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Jens Thiele <karme@karme.de>
Cc: linux-input@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:03:17 +0000 (13:03 -0800)]
hwmon: convert scpi-hwmon to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: lm-sensors@lm-sensors.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:02:54 +0000 (13:02 -0800)]
hwmon: convert tmp102 to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: lm-sensors@lm-sensors.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 21:02:22 +0000 (13:02 -0800)]
hwmon: convert ntc_thermistor to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: lm-sensors@lm-sensors.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Eduardo Valentin [Wed, 9 Mar 2016 20:39:58 +0000 (12:39 -0800)]
hwmon: convert lm75 to use devm_thermal_zone_of_sensor_register
This changes the driver to use the devm_ version
of thermal_zone_of_sensor_register and cleans
up the local points and unregister calls.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: lm-sensors@lm-sensors.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Keerthy [Wed, 16 Mar 2016 06:12:58 +0000 (11:42 +0530)]
MAINTAINERS: ti-soc-thermal: add a co-maintainer and update the entry
Add myself as a co-maintainer for ti-soc-thermal
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Ulises Brindis [Fri, 25 Mar 2016 19:55:41 +0000 (12:55 -0700)]
thermal: of: fix cleanup when building a thermal zone
of_node_put is iterating through all terms in the tbps array even though
the bind has failed. We need to only iterate through the terms that have
already passed the binding step.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ulises Brindis <brindisu@lab126.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Andy Champ [Tue, 22 Mar 2016 12:37:25 +0000 (12:37 +0000)]
thermal: Syntactic and factual errors in the API document
There are several places where the English in the document is syntactically
invalid, or unclear. There are also one or two factual errors.
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Andy Champ <andycham@amazon.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Linus Torvalds [Sun, 1 May 2016 22:52:31 +0000 (15:52 -0700)]
Linux 4.6-rc6
Linus Torvalds [Sun, 1 May 2016 01:57:42 +0000 (18:57 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal fixes from Eduardo Valentin:
"A couple of minor fixes for the thermal subsystem.
Specifics in this pull request:
- Fixes in hisilicon thermal driver
- More fixes of unsigned to int type change in thermal_core.c"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: use %d to print S32 parameters
thermal: hisilicon: increase temperature resolution
Linus Torvalds [Sat, 30 Apr 2016 01:50:08 +0000 (18:50 -0700)]
Merge tag 'powerpc-4.6-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A few more powerpc fixes for 4.6:
- cxl: Keep IRQ mappings on context teardown from Michael Neuling
- cxl: Poll for outstanding IRQs when detaching a context from
Michael Neuling
- Wire up preadv2 and pwritev2 syscalls from Rui Salvaterra"
* tag 'powerpc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: wire up preadv2 and pwritev2 syscalls
cxl: Poll for outstanding IRQs when detaching a context
cxl: Keep IRQ mappings on context teardown
Linus Torvalds [Sat, 30 Apr 2016 00:59:26 +0000 (17:59 -0700)]
Merge tag 'edac_fix_for_4.6' of git://git./linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
"Make sure sb_edac and i7core_edac do not terminate MCE processing on
the decoding callchain prematurely"
* tag 'edac_fix_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
Linus Torvalds [Sat, 30 Apr 2016 00:39:51 +0000 (17:39 -0700)]
Merge tag 'pm+acpi-4.6-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"One revert of a recent cpufreq commit that introduced a regression and
a fix for intel_pstate's Turbo Activation Ratio handling code.
Specifics:
- Revert cpufreq commit that attempted to fix a problem in the
ondemand/conservative governor code, but did that incorrectly and
introduced another problem instead (Rafael Wysocki).
- Fix incorrect decoding of MSR contents related to the Turbo
Activation Ratio (TAR) handling in the intel_pstate driver
(Srinivas Pandruvada)"
* tag 'pm+acpi-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix processing for turbo activation ratio
Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
Linus Torvalds [Sat, 30 Apr 2016 00:32:19 +0000 (17:32 -0700)]
Merge tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC fixes from Ulf Hansson:
"Here are a two MMC host fixes:
- sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
- sunxi: Disable eMMC HS-DDR for Allwinner A80"
* tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sunxi: Disable eMMC HS-DDR (MMC_CAP_1_8V_DDR) for Allwinner A80
mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
Linus Torvalds [Sat, 30 Apr 2016 00:18:55 +0000 (17:18 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A few fixes all over the place:
radeon is probably the biggest standout, it's a fix for screen
corruption or hung black outputs so I thought it was worth pulling in.
Otherwise some amdgpu power control fixes, some misc vmwgfx fixes, one
etnaviv fix, one virtio-gpu fix, two DP MST fixes, and a single TTM
fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/vmwgfx: Fix order of operation
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
drm/virtio: send vblank event after crtc updates
drm/dp/mst: Restore primary hub guid on resume
drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
drm/etnaviv: don't move linear memory window on 3D cores without MC2.0
Linus Torvalds [Sat, 30 Apr 2016 00:07:54 +0000 (17:07 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Final set of -rc fixes for 4.6.
I've collected up a number of patches that are all pretty small with
the exception of only a couple. The hfi1 driver has a number of
important patches, and it is what really drives the line count of this
pull request up. These are all small and I've got this kernel built
and running in the test lab (I have most of the hardware, I think nes
is the only thing in this patch set that I can't say I've personally
tested and have up and running).
Summary:
- A number of collected fixes for oopses, memory corruptions,
deadlocks, etc. All of these fixes are small (many only 5-10
lines), obvious, and tested.
- Fix for the security issue related to the use of write for
bi-directional communications"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/nes: don't leak skb if carrier down
IB/security: Restrict use of the write() interface
IB/hfi1: Use kernel default llseek for ui device
IB/hfi1: Don't attempt to free resources if initialization failed
IB/hfi1: Fix missing lock/unlock in verbs drain callback
IB/rdmavt: Fix send scheduling
IB/hfi1: Prevent unpinning of wrong pages
IB/hfi1: Fix deadlock caused by locking with wrong scope
IB/hfi1: Prevent NULL pointer deferences in caching code
MAINTAINERS: Update iser/isert maintainer contact info
IB/mlx5: Expose correct max_sge_rd limit
RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
iw_cxgb4: handle draining an idle qp
iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
IB/core: Don't drain non-existent rq queue-pair
IB/core: Fix oops in ib_cache_gid_set_default_gid
Linus Torvalds [Fri, 29 Apr 2016 18:21:22 +0000 (11:21 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"20 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Documentation/sysctl/vm.txt: update numa_zonelist_order description
lib/stackdepot.c: allow the stack trace hash to be zero
rapidio: fix potential NULL pointer dereference
mm/memory-failure: fix race with compound page split/merge
ocfs2/dlm: return zero if deref_done message is successfully handled
Ananth has moved
kcov: don't profile branches in kcov
kcov: don't trace the code coverage code
mm: wake kcompactd before kswapd's short sleep
.mailmap: add Frank Rowand
mm/hwpoison: fix wrong num_poisoned_pages accounting
mm: call swap_slot_free_notify() with page lock held
mm: vmscan: reclaim highmem zone if buffer_heads is over limit
numa: fix /proc/<pid>/numa_maps for THP
mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
mailmap: fix Krzysztof Kozlowski's misspelled name
thp: keep huge zero page pinned until tlb flush
mm: exclude HugeTLB pages from THP page_mapped() logic
kexec: export OFFSET(page.compound_head) to find out compound tail page
kexec: update VMCOREINFO for compound_order/dtor
Tony Luck [Fri, 29 Apr 2016 13:42:25 +0000 (15:42 +0200)]
EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
Both of these drivers can return NOTIFY_BAD, but this terminates
processing other callbacks that were registered later on the chain.
Since the driver did nothing to log the error it seems wrong to prevent
other interested parties from seeing it. E.g. neither of them had even
bothered to check the type of the error to see if it was a memory error
before the return NOTIFY_BAD.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Rafael J. Wysocki [Fri, 29 Apr 2016 12:22:25 +0000 (14:22 +0200)]
Merge branch 'pm-cpufreq-fixes'
* pm-cpufreq-fixes:
cpufreq: intel_pstate: Fix processing for turbo activation ratio
Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
Dave Airlie [Fri, 29 Apr 2016 04:31:44 +0000 (14:31 +1000)]
Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.6.
- revert amdgpu PX commit that was previously reverted on the radeon side
- cleaned up version of the NI+ MC update display fix for radeon
- TTM kref fix
* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
Dave Airlie [Fri, 29 Apr 2016 04:27:50 +0000 (14:27 +1000)]
Merge branch 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux into drm-fixes
three misc vmwgfx fixes
* 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux:
drm/vmwgfx: Fix order of operation
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
Linus Torvalds [Fri, 29 Apr 2016 03:24:27 +0000 (20:24 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two boot crash fixes and an IRQ handling crash fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Handle zero vector gracefully in clear_vector_irq()
Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging"
xen/qspinlock: Don't kick CPU if IRQ is not initialized
Linus Torvalds [Fri, 29 Apr 2016 03:19:04 +0000 (20:19 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"x86 PMU driver fixes plus a core code race fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix incorrect lbr_sel_mask value
perf/x86/intel/pt: Don't die on VMXON
perf/core: Fix perf_event_open() vs. execve() race
perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX
perf/core: Make sysctl_perf_cpu_time_max_percent conform to documentation
perf/x86/intel/rapl: Add missing Haswell model
perf/x86/intel: Add model number for Skylake Server to perf
Linus Torvalds [Fri, 29 Apr 2016 02:59:17 +0000 (19:59 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Two lockdep fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Fix lock_chain::base size
locking/lockdep: Fix ->irq_context calculation
Linus Torvalds [Fri, 29 Apr 2016 02:54:50 +0000 (19:54 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
"This fixes a bug in the efivars code"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Fix out-of-bounds read in variable_matches()
Linus Torvalds [Fri, 29 Apr 2016 02:44:47 +0000 (19:44 -0700)]
Merge tag 'media/v4.6-4' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Some regression fixes:
- videobuf2 core: avoid the risk of going past buffer on multi-planes
and fix rw mode
- fix support for 4K formats at V4L2 core
- fix a trouble at davinci_fpe, caused by a bad patch
- usbvision: revert a patch with a partial fixup. The fixup patch
was merged already, and this one has some issues"
* tag 'media/v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] vb2-memops: Fix over allocation of frame vectors
[media] media: vb2: Fix regression on poll() for RW mode
[media] v4l2-dv-timings.h: fix polarity for 4k formats
[media] davinci_vpfe: Revert "staging: media: davinci_vpfe: remove,unnecessary ret variable"
[media] usbvision: revert commit
588afcc1
[media] videobuf2-v4l2: Verify planes array in buffer dequeueing
[media] videobuf2-core: Check user space planes array in dqbuf
Linus Torvalds [Fri, 29 Apr 2016 02:38:45 +0000 (19:38 -0700)]
Merge tag 'sound-4.6-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Usually we get a big collection of fixes for ASoC once during rc. And
this is it.
At this time, most of fixes are about Intel Skylake ASoC driver, which
is a new and still on-going development. Along with it, a slight
large LOC is seen in legacy HD-audio driver, but it's merely a code
move to the upper layer.
Other than that, the rest are small or trivial fixes to various
drivers, in addition to an ASoC dapm debugfs code fix"
* tag 'sound-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
ALSA: hda - Update BCLK also at hotplug for i915 HSW/BDW
ALSA: hda - Add dock support for ThinkPad X260
ASoC: wm5102: Free compressed IRQ in CODEC remove
ASoC: arizona: Free speaker thermal IRQs in CODEC remove
ASoC: Intel: Skylake: Fix ibs/obs calc for non-integral sampling rates
ASoC: Intel: sst: fix a loop timeout in sst_hsw_stream_reset()
ASoC: Intel: Skylake: Fix to turn OFF codec power when entering S3
ASoC: hdac_hdmi: Fix codec power state in S3 during playback
ASoC: hdac_hdmi: Fix to use dev_pm ops instead soc pm
ASoC: wm8962: Correct typo when setting DSPCLK rate
ASoC: nau8825: Fix jack detection across suspend
ASoC: Intel: Skylake: Fix DSP resource de-allocation
ASoC: Intel: Skylake: Fix for unloading module only when it is loaded
ASoC: Intel: Skylake: Fix kbuild dependency
ASoC: dapm: Make sure we have a card when displaying component widgets
ASoC: rt5640: Correct the digital interface data select
ASoC: Intel: Skylake: remove call to pci_dev_put
ASoC: Intel: Skylake: Call i915 exit last
ASoC: Intel: Skylake: Unmap the address last
ASoC: Intel: Skylake: Freeup properly on skl_dsp_free
...
Xishi Qiu [Thu, 28 Apr 2016 23:19:11 +0000 (16:19 -0700)]
Documentation/sysctl/vm.txt: update numa_zonelist_order description
Commit
3193913ce62c ("mm: page_alloc: default node-ordering on 64-bit
NUMA, zone-ordering on 32-bit") changes the default value of
numa_zonelist_order. Update the document.
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Thu, 28 Apr 2016 23:19:09 +0000 (16:19 -0700)]
lib/stackdepot.c: allow the stack trace hash to be zero
Do not bail out from depot_save_stack() if the stack trace has zero hash.
Initially depot_save_stack() silently dropped stack traces with zero
hashes, however there's actually no point in reserving this zero value.
Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Zapolskiy [Thu, 28 Apr 2016 23:19:06 +0000 (16:19 -0700)]
rapidio: fix potential NULL pointer dereference
The change fixes improper check for a returned error value by
class_create() function, which on error returns ERR_PTR() value, thus the
original check always results in a dead code on error path.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Thu, 28 Apr 2016 23:19:03 +0000 (16:19 -0700)]
mm/memory-failure: fix race with compound page split/merge
get_hwpoison_page() must recheck relation between head and tail pages.
n-horiguchi said: without this recheck, the race causes kernel to pin an
irrelevant page, and finally makes kernel crash for refcount mismatch.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xuejiufei [Thu, 28 Apr 2016 23:19:01 +0000 (16:19 -0700)]
ocfs2/dlm: return zero if deref_done message is successfully handled
dlm_deref_lockres_done_handler() should return zero if the message is
successfully handled.
Fixes:
60d663cb5273 ("ocfs2/dlm: add DEREF_DONE message").
Signed-off-by: xuejiufei <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ananth N Mavinakayanahalli [Thu, 28 Apr 2016 23:18:58 +0000 (16:18 -0700)]
Ananth has moved
The current ID is going away soon... update email address
Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Thu, 28 Apr 2016 23:18:55 +0000 (16:18 -0700)]
kcov: don't profile branches in kcov
Profiling 'if' statements in __sanitizer_cov_trace_pc() leads to
unbound recursion and crash:
__sanitizer_cov_trace_pc() ->
ftrace_likely_update ->
__sanitizer_cov_trace_pc() ...
Define DISABLE_BRANCH_PROFILING to disable this tracer.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Morse [Thu, 28 Apr 2016 23:18:52 +0000 (16:18 -0700)]
kcov: don't trace the code coverage code
Kcov causes the compiler to add a call to __sanitizer_cov_trace_pc() in
every basic block. Ftrace patches in a call to _mcount() to each
function it has annotated.
Letting these mechanisms annotate each other is a bad thing. Break the
loop by adding 'notrace' to __sanitizer_cov_trace_pc() so that ftrace
won't try to patch this code.
This patch lets arm64 with KCOV and STACK_TRACER boot.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Thu, 28 Apr 2016 23:18:49 +0000 (16:18 -0700)]
mm: wake kcompactd before kswapd's short sleep
When kswapd goes to sleep it checks if the node is balanced and at first
it sleeps only for HZ/10 time, then rechecks if the node is still
balanced and nobody has woken it during the initial sleep. Only then it
goes fully sleep until an allocation slowpath wakes it up again.
For higher-order allocations, waking up kcompactd is done only before
the full sleep. This turns out to be an issue in case another
high-order allocation fails during the initial sleep. It will wake
kswapd up, however kswapd considers the zone balanced from the order-0
perspective, and will just quickly try to sleep again. So if there's a
longer stream of high-order allocations hitting the slowpath and waking
up kswapd, it might never actually wake up kcompactd, which may be
considered a regression from kswapd-based compaction. In the worst
case, it might be that a single allocation that cannot direct
reclaim/compact itself is waking kswapd in the retry loop and preventing
kcompactd from being woken up and unblocking it.
This patch makes sure kcompactd is woken up in such situations by simply
moving the wakeup before the short initial sleep. More efficient
solution would be to wake kcompactd immediately instead of kswapd if the
node is already order-0 balanced, but in that case we should also move
reset_isolation_suitable() call to kcompactd so it's not adding to the
allocator's latency. Since it's late in the 4.6 cycle, let's go with
the simpler change for now.
Fixes:
accf62422b3a ("mm, kswapd: replace kswapd compaction with waking up kcompactd")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frank Rowand [Thu, 28 Apr 2016 23:18:47 +0000 (16:18 -0700)]
.mailmap: add Frank Rowand
Set current email address to replace obsolete email addresses.
Signed-off-by: Frank Rowand <frank.rowand@sonymobile.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 28 Apr 2016 23:18:44 +0000 (16:18 -0700)]
mm/hwpoison: fix wrong num_poisoned_pages accounting
Currently, migration code increses num_poisoned_pages on *failed*
migration page as well as successfully migrated one at the trial of
memory-failure. It will make the stat wrong. As well, it marks the
page as PG_HWPoison even if the migration trial failed. It would mean
we cannot recover the corrupted page using memory-failure facility.
This patches fixes it.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 28 Apr 2016 23:18:41 +0000 (16:18 -0700)]
mm: call swap_slot_free_notify() with page lock held
Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in
page_swap_info. The reason is that page_endio in rw_page unlocks the
page if read I/O is completed so we need to hold a PG_lock again to
check PageSwapCache. Otherwise, the page can be removed from swapcache.
Kernel BUG at
c00f9040 [verbose debug info unavailable]
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 4 PID: 13446 Comm: RenderThread Tainted: G W
3.10.84-g9f14aec-dirty #73
task:
c3b73200 ti:
dd192000 task.ti:
dd192000
PC is at page_swap_info+0x10/0x2c
LR is at swap_slot_free_notify+0x18/0x6c
pc : [<
c00f9040>] lr : [<
c00f5560>] psr:
400f0113
sp :
dd193d78 ip :
c2deb1e4 fp :
da015180
r10:
00000000 r9 :
000200da r8 :
c120fe08
r7 :
00000000 r6 :
00000000 r5 :
c249a6c0 r4 : =
c249a6c0
r3 :
00000000 r2 :
40080009 r1 :
200f0113 r0 : =
c249a6c0
..<snip> ..
Call Trace:
page_swap_info+0x10/0x2c
swap_slot_free_notify+0x18/0x6c
swap_readpage+0x90/0x11c
read_swap_cache_async+0x134/0x1ac
swapin_readahead+0x70/0xb0
handle_pte_fault+0x320/0x6fc
handle_mm_fault+0xc0/0xf0
do_page_fault+0x11c/0x36c
do_DataAbort+0x34/0x118
Fixes:
3f2b1a04f44933f2 ("zram: revive swap_slot_free_notify")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Tested-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 28 Apr 2016 23:18:38 +0000 (16:18 -0700)]
mm: vmscan: reclaim highmem zone if buffer_heads is over limit
We have been reclaimed highmem zone if buffer_heads is over limit but
commit
6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from
shrink_zone()") changed the behavior so it doesn't reclaim highmem zone
although buffer_heads is over the limit. This patch restores the logic.
Fixes:
6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Thu, 28 Apr 2016 23:18:35 +0000 (16:18 -0700)]
numa: fix /proc/<pid>/numa_maps for THP
In gather_pte_stats() a THP pmd is cast into a pte, which is wrong
because the layouts may differ depending on the architecture. On s390
this will lead to inaccurate numa_maps accounting in /proc because of
misguided pte_present() and pte_dirty() checks on the fake pte.
On other architectures pte_present() and pte_dirty() may work by chance,
but there may be an issue with direct-access (dax) mappings w/o
underlying struct pages when HAVE_PTE_SPECIAL is set and THP is
available. In vm_normal_page() the fake pte will be checked with
pte_special() and because there is no "special" bit in a pmd, this will
always return false and the VM_PFNMAP | VM_MIXEDMAP checking will be
skipped. On dax mappings w/o struct pages, an invalid struct page
pointer would then be returned that can crash the kernel.
This patch fixes the numa_maps THP handling by introducing new "_pmd"
variants of the can_gather_numa_stats() and vm_normal_page() functions.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Thu, 28 Apr 2016 23:18:32 +0000 (16:18 -0700)]
mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
Khugepaged detects own VMAs by checking vm_file and vm_ops but this way
it cannot distinguish private /dev/zero mappings from other special
mappings like /dev/hpet which has no vm_ops and popultes PTEs in mmap.
This fixes false-positive VM_BUG_ON and prevents installing THP where
they are not expected.
Link: http://lkml.kernel.org/r/CACT4Y+ZmuZMV5CjSFOeXviwQdABAgT7T+StKfTqan9YDtgEi5g@mail.gmail.com
Fixes:
78f11a255749 ("mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Thu, 28 Apr 2016 23:18:30 +0000 (16:18 -0700)]
mailmap: fix Krzysztof Kozlowski's misspelled name
Patchwork introduced a garbled Polish character in commit
1e3012d0fdc5
("crypto: s5p-sss - Use memcpy_toio for iomem annotated memory") so fix
the mail mapping. Additionally prefer to use kernel.org account for
personal work, instead of my gmail address.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Thu, 28 Apr 2016 23:18:27 +0000 (16:18 -0700)]
thp: keep huge zero page pinned until tlb flush
Andrea has found[1] a race condition on MMU-gather based TLB flush vs
split_huge_page() or shrinker which frees huge zero under us (patch 1/2
and 2/2 respectively).
With new THP refcounting, we don't need patch 1/2: mmu_gather keeps the
page pinned until flush is complete and the pin prevents the page from
being split under us.
We still need patch 2/2. This is simplified version of Andrea's patch.
We don't need fancy encoding.
[1] http://lkml.kernel.org/r/
1447938052-22165-1-git-send-email-aarcange@redhat.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steve Capper [Thu, 28 Apr 2016 23:18:24 +0000 (16:18 -0700)]
mm: exclude HugeTLB pages from THP page_mapped() logic
HugeTLB pages cannot be split, so we use the compound_mapcount to track
rmaps.
Currently page_mapped() will check the compound_mapcount, but will also
go through the constituent pages of a THP compound page and query the
individual _mapcount's too.
Unfortunately, page_mapped() does not distinguish between HugeTLB and
THP compound pages and assumes that a compound page always needs to have
HPAGE_PMD_NR pages querying.
For most cases when dealing with HugeTLB this is just inefficient, but
for scenarios where the HugeTLB page size is less than the pmd block
size (e.g. when using contiguous bit on ARM) this can lead to crashes.
This patch adjusts the page_mapped function such that we skip the
unnecessary THP reference checks for HugeTLB pages.
Fixes:
e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages")
Signed-off-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atsushi Kumagai [Thu, 28 Apr 2016 23:18:21 +0000 (16:18 -0700)]
kexec: export OFFSET(page.compound_head) to find out compound tail page
PageAnon() always look at head page to check PAGE_MAPPING_ANON and tail
page's page->mapping has just a poisoned data since commit
1c290f642101
("mm: sanitize page->mapping for tail pages").
If makedumpfile checks page->mapping of a compound tail page to
distinguish anonymous page as usual, it must fail in newer kernel. So
it's necessary to export OFFSET(page.compound_head) to avoid checking
compound tail pages.
The problem is that unnecessary hugepages won't be removed from a dump
file in kernels 4.5.x and later. This means that extra disk space would
be consumed. It's a problem, but not critical.
Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atsushi Kumagai [Thu, 28 Apr 2016 23:18:18 +0000 (16:18 -0700)]
kexec: update VMCOREINFO for compound_order/dtor
makedumpfile refers page.lru.next to get the order of compound pages for
page filtering.
However, now the order is stored in page.compound_order, hence
VMCOREINFO should be updated to export the offset of
page.compound_order.
The fact is, page.compound_order was introduced already in kernel 4.0,
but the offset of it was the same as page.lru.next until kernel 4.3, so
this was not actual problem.
The above can be said also for page.lru.prev and page.compound_dtor,
it's necessary to detect hugetlbfs pages. Further, the content was
changed from direct address to the ID which means dtor.
The problem is that unnecessary hugepages won't be removed from a dump
file in kernels 4.4.x and later. This means that extra disk space would
be consumed. It's a problem, but not critical.
Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 29 Apr 2016 01:59:24 +0000 (18:59 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"There is a lifecycle fix in the auth code, a fix for a narrow race
condition on map, and a helpful message in the log when there is a
feature mismatch (which happens frequently now that the default
server-side options have changed)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: report unsupported features to syslog
rbd: fix rbd map vs notify races
libceph: make authorizer destruction independent of ceph_auth_client
Linus Torvalds [Fri, 29 Apr 2016 01:52:11 +0000 (18:52 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Three more bug fixes for 4.6
- Due to a race in the dynamic page table code a multi-threaded
program can cause a translation specification exception. With
panic_on_oops a user space program can crash the system.
- An information leak with the /dev/sclp device.
- A use after free in the s390 PCI code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/sclp_ctl: fix potential information leak with /dev/sclp
s390/mm: fix asce_bits handling with dynamic pagetable levels
s390/pci: fix use after free in dma_init
Florian Westphal [Sun, 24 Apr 2016 20:18:59 +0000 (22:18 +0200)]
RDMA/nes: don't leak skb if carrier down
Alternatively one could free the skb, OTOH I don't think this test is
useful so just remove it.
Cc: <linux-rdma@vger.kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sinclair Yeh [Thu, 21 Apr 2016 18:29:31 +0000 (11:29 -0700)]
drm/vmwgfx: Fix order of operation
mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
the division, potentially making the pitch larger than it should
be.
Since the original intention is to do a div-round-up, just use
the macro instead.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Charmaine Lee [Tue, 12 Apr 2016 15:19:08 +0000 (08:19 -0700)]
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
Instead of calling vmw_cmd_ok, call vmw_cmd_dx_cid_check to
validate the context id for query commands.
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Charmaine Lee [Tue, 12 Apr 2016 15:14:23 +0000 (08:14 -0700)]
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
Fixes piglit tests nv_conditional_render-* crashes.
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Jason Gunthorpe [Mon, 11 Apr 2016 01:13:13 +0000 (19:13 -0600)]
IB/security: Restrict use of the write() interface
The drivers/infiniband stack uses write() as a replacement for
bi-directional ioctl(). This is not safe. There are ways to
trigger write calls that result in the return structure that
is normally written to user space being shunted off to user
specified kernel memory instead.
For the immediate repair, detect and deny suspicious accesses to
the write API.
For long term, update the user space libraries and the kernel API
to something that doesn't present the same security vulnerabilities
(likely a structured ioctl() interface).
The impacted uAPI interfaces are generally only available if
hardware from drivers/infiniband is installed in the system.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[ Expanded check to all known write() entry points ]
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Fri, 22 Apr 2016 18:17:03 +0000 (11:17 -0700)]
IB/hfi1: Use kernel default llseek for ui device
The ui device llseek had a mistake with SEEK_END and did
not fully follow seek semantics. Correct all this by
using a kernel supplied function for fixed size devices.
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Wed, 20 Apr 2016 13:05:36 +0000 (06:05 -0700)]
IB/hfi1: Don't attempt to free resources if initialization failed
Attempting to free resources which have not been allocated and
initialized properly led to the following kernel backtrace:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1]
PGD
852a43067 PUD
85d4a6067 PMD 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 2831 Comm: osu_bw Tainted: G IO 3.12.18-wfr+ #1
task:
ffff88085b15b540 ti:
ffff8808588fe000 task.ti:
ffff8808588fe000
RIP: 0010:[<
ffffffffa09658fe>] [<
ffffffffa09658fe>] unlock_exp_tids.isra.8+0x2e/0x120 [hfi1]
RSP: 0018:
ffff8808588ffde0 EFLAGS:
00010282
RAX:
0000000000000000 RBX:
ffff880858a31800 RCX:
0000000000000000
RDX:
ffff88085d971bc0 RSI:
ffff880858a318f8 RDI:
ffff880858a318c0
RBP:
ffff8808588ffe20 R08:
0000000000000000 R09:
0000000000000000
R10:
ffff88087ffd6f40 R11:
0000000001100348 R12:
ffff880852900000
R13:
ffff880858a318c0 R14:
0000000000000000 R15:
ffff88085d971be8
FS:
00007f4674e83740(0000) GS:
ffff88087f400000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000000 CR3:
000000085c377000 CR4:
00000000001407f0
Stack:
ffffffffa0941a71 ffff880858a318f8 ffff88085d971bc0 ffff880858a31800
ffff880852900000 ffff880858a31800 00000000003ffff7 ffff88085d971bc0
ffff8808588ffe60 ffffffffa09663fc ffff8808588ffe60 ffff880858a31800
Call Trace:
[<
ffffffffa0941a71>] ? find_mmu_handler+0x51/0x70 [hfi1]
[<
ffffffffa09663fc>] hfi1_user_exp_rcv_free+0x6c/0x120 [hfi1]
[<
ffffffffa0932809>] hfi1_file_close+0x1a9/0x340 [hfi1]
[<
ffffffff8116c189>] __fput+0xe9/0x270
[<
ffffffff8116c35e>] ____fput+0xe/0x10
[<
ffffffff81065707>] task_work_run+0xa7/0xe0
[<
ffffffff81002969>] do_notify_resume+0x59/0x80
[<
ffffffff814ffc1a>] int_signal+0x12/0x17
This commit re-arranges the context initialization code in a way that
would allow for context event flags to be used to determine whether
the context has been successfully initialized.
In turn, this can be used to skip the resource de-allocation if they
were never allocated in the first place.
Fixes:
3abb33ac6521 ("staging/hfi1: Add TID cache receive init and free funcs")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Wed, 20 Apr 2016 13:05:30 +0000 (06:05 -0700)]
IB/hfi1: Fix missing lock/unlock in verbs drain callback
The iowait_sdma_drained() callback lacked locking to
protect the qp s_flags field.
This causes the s_flags to be out of sync
on multiple CPUs, potentially corrupting the s_flags.
Fixes:
a545f5308b6c ("staging/rdma/hfi: fix CQ completion order issue")
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Tue, 12 Apr 2016 17:47:00 +0000 (10:47 -0700)]
IB/rdmavt: Fix send scheduling
call_send is used to determine whether to send immediately or schedule
a send for later. The current logic in rdmavt is inverted and has a
negative impact on the latency of the hfi1 and qib drivers. Fix this
regression by correctly calling send immediately when call_send is set.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 12 Apr 2016 17:46:16 +0000 (10:46 -0700)]
IB/hfi1: Prevent unpinning of wrong pages
The routine used by the SDMA cache to handle already
cached nodes can extend an already existing node.
In its error handling code, the routine will unpin pages
when not all pages of the buffer extension were pinned.
There was a bug in that part of the routine, which would
mistakenly unpin pages from the original set rather than
the newly pinned pages.
This commit fixes that bug by offsetting the page array
to the proper place pointing at the beginning of the newly
pinned pages.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 12 Apr 2016 17:46:03 +0000 (10:46 -0700)]
IB/hfi1: Fix deadlock caused by locking with wrong scope
The locking around the interval RB tree is designed to prevent
access to the tree while it's being modified. The locking in its
current form is too overzealous, which is causing a deadlock in
certain cases with the following backtrace:
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
CPU: 0 PID: 5836 Comm: IMB-MPI1 Tainted: G O 3.12.18-wfr+ #1
0000000000000000 ffff88087f206c50 ffffffff814f1caa ffffffff817b53f0
ffff88087f206cc8 ffffffff814ecd56 0000000000000010 ffff88087f206cd8
ffff88087f206c78 0000000000000000 0000000000000000 0000000000001662
Call Trace:
<NMI> [<
ffffffff814f1caa>] dump_stack+0x45/0x56
[<
ffffffff814ecd56>] panic+0xc2/0x1cb
[<
ffffffff810d4370>] ? restart_watchdog_hrtimer+0x50/0x50
[<
ffffffff810d4432>] watchdog_overflow_callback+0xc2/0xd0
[<
ffffffff81109b4e>] __perf_event_overflow+0x8e/0x2b0
[<
ffffffff8110a714>] perf_event_overflow+0x14/0x20
[<
ffffffff8101c906>] intel_pmu_handle_irq+0x1b6/0x390
[<
ffffffff814f927b>] perf_event_nmi_handler+0x2b/0x50
[<
ffffffff814f8ad8>] nmi_handle.isra.3+0x88/0x180
[<
ffffffff814f8d39>] do_nmi+0x169/0x310
[<
ffffffff814f8177>] end_repeat_nmi+0x1e/0x2e
[<
ffffffff81272600>] ? unmap_single+0x30/0x30
[<
ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[<
ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[<
ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
<<EOE>> <IRQ> [<
ffffffffa056c4a8>] hfi1_mmu_rb_search+0x38/0x70 [hfi1]
[<
ffffffffa05919cb>] user_sdma_free_request+0xcb/0x120 [hfi1]
[<
ffffffffa0593393>] user_sdma_txreq_cb+0x263/0x350 [hfi1]
[<
ffffffffa057fad7>] ? sdma_txclean+0x27/0x1c0 [hfi1]
[<
ffffffffa0593130>] ? user_sdma_send_pkts+0x1710/0x1710 [hfi1]
[<
ffffffffa057fdd6>] sdma_make_progress+0x166/0x480 [hfi1]
[<
ffffffff810762c9>] ? ttwu_do_wakeup+0x19/0xd0
[<
ffffffffa0581c7e>] sdma_engine_interrupt+0x8e/0x100 [hfi1]
[<
ffffffffa0546bdd>] sdma_interrupt+0x5d/0xa0 [hfi1]
[<
ffffffff81097e57>] handle_irq_event_percpu+0x47/0x1d0
[<
ffffffff81098017>] handle_irq_event+0x37/0x60
[<
ffffffff8109aa5f>] handle_edge_irq+0x6f/0x120
[<
ffffffff810044af>] handle_irq+0xbf/0x150
[<
ffffffff8104c9b7>] ? irq_enter+0x17/0x80
[<
ffffffff8150168d>] do_IRQ+0x4d/0xc0
[<
ffffffff814f7c6a>] common_interrupt+0x6a/0x6a
<EOI> [<
ffffffff81073524>] ? finish_task_switch+0x54/0xe0
[<
ffffffff814f56c6>] __schedule+0x3b6/0x7e0
[<
ffffffff810763a6>] __cond_resched+0x26/0x30
[<
ffffffff814f5eda>] _cond_resched+0x3a/0x50
[<
ffffffff814f4f82>] down_write+0x12/0x30
[<
ffffffffa0591619>] hfi1_release_user_pages+0x69/0x90 [hfi1]
[<
ffffffffa059173a>] sdma_rb_remove+0x9a/0xc0 [hfi1]
[<
ffffffffa056c00d>] __mmu_rb_remove.isra.5+0x5d/0x70 [hfi1]
[<
ffffffffa056c536>] hfi1_mmu_rb_remove+0x56/0x70 [hfi1]
[<
ffffffffa059427b>] hfi1_user_sdma_process_request+0x74b/0x1160 [hfi1]
[<
ffffffffa055c763>] hfi1_aio_write+0xc3/0x100 [hfi1]
[<
ffffffff8116a14c>] do_sync_readv_writev+0x4c/0x80
[<
ffffffff8116b58b>] do_readv_writev+0xbb/0x230
[<
ffffffff811a9da1>] ? fsnotify+0x241/0x320
[<
ffffffff81073524>] ? finish_task_switch+0x54/0xe0
[<
ffffffff8116b795>] vfs_writev+0x35/0x60
[<
ffffffff8116b8c9>] SyS_writev+0x49/0xc0
[<
ffffffff810cd876>] ? __audit_syscall_exit+0x1f6/0x2a0
[<
ffffffff814ff992>] system_call_fastpath+0x16/0x1b
As evident from the backtrace above, the process was being put to sleep
while holding the lock.
Limiting the scope of the lock only to the RB tree operation fixes the
above error allowing for proper locking and the process being put to
sleep when needed.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 12 Apr 2016 17:45:57 +0000 (10:45 -0700)]
IB/hfi1: Prevent NULL pointer deferences in caching code
There is a potential kernel crash when the MMU notifier calls the
invalidation routines in the hfi1 pinned page caching code for sdma.
The invalidation routine could call the remove callback
for the node, which in turn ends up dereferencing the
current task_struct to get a pointer to the mm_struct.
However, the mm_struct pointer could be NULL resulting in
the following backtrace:
BUG: unable to handle kernel NULL pointer dereference at
00000000000000a8
IP: [<
ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
15
task:
ffff88085e66e080 ti:
ffff88085c244000 task.ti:
ffff88085c244000
RIP: 0010:[<
ffffffffa041f75a>] [<
ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
RSP: 0000:
ffff88085c245878 EFLAGS:
00010002
RAX:
0000000000000000 RBX:
ffff88105b9bbd40 RCX:
ffffea003931a830
RDX:
0000000000000004 RSI:
ffff88105754a9c0 RDI:
ffff88105754a9c0
RBP:
ffff88085c245890 R08:
ffff88105b9bbd70 R09:
00000000fffffffb
R10:
ffff88105b9bbd58 R11:
0000000000000013 R12:
ffff88105754a9c0
R13:
0000000000000001 R14:
0000000000000001 R15:
ffff88105b9bbd40
FS:
0000000000000000(0000) GS:
ffff88107ef40000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000000a8 CR3:
0000000001a0b000 CR4:
00000000001407e0
Stack:
ffff88105b9bbd40 ffff88080ec481a8 ffff88080ec481b8 ffff88085c2458c0
ffffffffa03fa00e ffff88080ec48190 ffff88080ed9cd00 0000000001024000
0000000000000000 ffff88085c245920 ffffffffa03fa0e7 0000000000000282
Call Trace:
[<
ffffffffa03fa00e>] __mmu_rb_remove.isra.5+0x5e/0x70 [hfi1]
[<
ffffffffa03fa0e7>] mmu_notifier_mem_invalidate+0xc7/0xf0 [hfi1]
[<
ffffffffa03fa143>] mmu_notifier_page+0x13/0x20 [hfi1]
[<
ffffffff81156dd0>] __mmu_notifier_invalidate_page+0x50/0x70
[<
ffffffff81140bbb>] try_to_unmap_one+0x20b/0x470
[<
ffffffff81141ee7>] try_to_unmap_anon+0xa7/0x120
[<
ffffffff81141fad>] try_to_unmap+0x4d/0x60
[<
ffffffff8111fd7b>] shrink_page_list+0x2eb/0x9d0
[<
ffffffff81120ab3>] shrink_inactive_list+0x243/0x490
[<
ffffffff81121491>] shrink_lruvec+0x4c1/0x640
[<
ffffffff81121641>] shrink_zone+0x31/0x100
[<
ffffffff81121b0f>] kswapd_shrink_zone.constprop.62+0xef/0x1c0
[<
ffffffff811229e3>] kswapd+0x403/0x7e0
[<
ffffffff811225e0>] ? shrink_all_memory+0xf0/0xf0
[<
ffffffff81068ac0>] kthread+0xc0/0xd0
[<
ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
[<
ffffffff814ff8ec>] ret_from_fork+0x7c/0xb0
[<
ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
To correct this, the mm_struct passed to us by the MMU notifier is
used (which is what should have been done to begin with). This avoids
the broken derefences and ensures that the correct mm_struct is used.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Sun, 3 Apr 2016 12:03:12 +0000 (15:03 +0300)]
MAINTAINERS: Update iser/isert maintainer contact info
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Thu, 31 Mar 2016 16:03:25 +0000 (19:03 +0300)]
IB/mlx5: Expose correct max_sge_rd limit
mlx5 devices (Connect-IB, ConnectX-4, ConnectX-4-LX) has a limitation
where rdma read work queue entries cannot exceed 512 bytes.
A rdma_read wqe needs to fit in 512 bytes:
- wqe control segment (16 bytes)
- rdma segment (16 bytes)
- scatter elements (16 bytes each)
So max_sge_rd should be: (512 - 16 - 16) / 16 = 30.
Cc: linux-stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Chen-Yu Tsai [Tue, 26 Apr 2016 15:37:26 +0000 (23:37 +0800)]
mmc: sunxi: Disable eMMC HS-DDR (MMC_CAP_1_8V_DDR) for Allwinner A80
eMMC HS-DDR no longer works on the A80, despite it working when support
for this developed.
Disable it for now.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Kan Liang [Thu, 21 Apr 2016 09:30:10 +0000 (02:30 -0700)]
perf/x86/intel: Fix incorrect lbr_sel_mask value
This patch fixes a bug which was introduced by:
b16a5b52eb90 ("perf/x86: Add option to disable reading branch flags/cycles")
In this patch, lbr_sel_mask is used to mask the lbr_select. But LBR_SEL_MASK
doesn't include the bit for LBR_CALL_STACK. So LBR call stack will never be
set in lbr_select.
This patch corrects the LBR_SEL_MASK by including all valid bits in
LBR_SELECT. Also, the LBR_CALL_STACK bit is different as other bit in
LBR_SELECT. It does not operate in suppress mode, so it needs to be
specially handled in intel_pmu_setup_hw_lbr_filter.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1461231010-4399-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Alexander Shishkin [Tue, 29 Mar 2016 14:43:10 +0000 (17:43 +0300)]
perf/x86/intel/pt: Don't die on VMXON
Some versions of Intel PT do not support tracing across VMXON, more
specifically, VMXON will clear TraceEn control bit and any attempt to
set it before VMXOFF will throw a #GP, which in the current state of
things will crash the kernel. Namely:
$ perf record -e intel_pt// kvm -nographic
on such a machine will kill it.
To avoid this, notify the intel_pt driver before VMXON and after
VMXOFF so that it knows when not to enable itself.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/87oa9dwrfk.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 26 Apr 2016 09:36:53 +0000 (11:36 +0200)]
perf/core: Fix perf_event_open() vs. execve() race
Jann reported that the ptrace_may_access() check in
find_lively_task_by_vpid() is racy against exec().
Specifically:
perf_event_open() execve()
ptrace_may_access()
commit_creds()
... if (get_dumpable() != SUID_DUMP_USER)
perf_event_exit_task();
perf_install_in_context()
would result in installing a counter across the creds boundary.
Fix this by wrapping lots of perf_event_open() in cred_guard_mutex.
This should be fine as perf_event_exit_task() is already called with
cred_guard_mutex held, so all perf locks already nest inside it.
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Adam Borowski [Wed, 27 Apr 2016 09:35:31 +0000 (11:35 +0200)]
perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX
The entry for PERF_COUNT_HW_REF_CPU_CYCLES is not used on AMD, but is
referenced by filter_events() which expects undefined events to have a
value of 0.
Found via KASAN:
UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:30
index 9 is out of range for type 'u64 [9]'
UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:9
load of address
ffffffff81c021c8 with insufficient space for an object of type 'const u64'
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1461749731-30979-1-git-send-email-kilobyte@angband.pl
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ilya Dryomov [Wed, 13 Apr 2016 12:15:50 +0000 (14:15 +0200)]
rbd: report unsupported features to syslog
... instead of just returning an error.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Ilya Dryomov [Fri, 15 Apr 2016 14:22:16 +0000 (16:22 +0200)]
rbd: fix rbd map vs notify races
A while ago, commit
9875201e1049 ("rbd: fix use-after free of
rbd_dev->disk") fixed rbd unmap vs notify race by introducing
an exported wrapper for flushing notifies and sticking it into
do_rbd_remove().
A similar problem exists on the rbd map path, though: the watch is
registered in rbd_dev_image_probe(), while the disk is set up quite
a few steps later, in rbd_dev_device_setup(). Nothing prevents
a notify from coming in and crashing on a NULL rbd_dev->disk:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000050
Call Trace:
[<
ffffffffa0508344>] rbd_watch_cb+0x34/0x180 [rbd]
[<
ffffffffa04bd290>] do_event_work+0x40/0xb0 [libceph]
[<
ffffffff8109d5db>] process_one_work+0x17b/0x470
[<
ffffffff8109e3ab>] worker_thread+0x11b/0x400
[<
ffffffff8109e290>] ? rescuer_thread+0x400/0x400
[<
ffffffff810a5acf>] kthread+0xcf/0xe0
[<
ffffffff810b41b3>] ? finish_task_switch+0x53/0x170
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[<
ffffffff81645dd8>] ret_from_fork+0x58/0x90
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
RIP [<
ffffffffa050828a>] rbd_dev_refresh+0xfa/0x180 [rbd]
If an error occurs during rbd map, we have to error out, potentially
tearing down a watch. Just like on rbd unmap, notifies have to be
flushed, otherwise rbd_watch_cb() may end up trying to read in the
image header after rbd_dev_image_release() has run:
Assertion failure in rbd_dev_header_info() at line 4722:
rbd_assert(rbd_image_format_valid(rbd_dev->image_format));
Call Trace:
[<
ffffffff81cccee0>] ? rbd_parent_request_create+0x150/0x150
[<
ffffffff81cd4e59>] rbd_dev_refresh+0x59/0x390
[<
ffffffff81cd5229>] rbd_watch_cb+0x69/0x290
[<
ffffffff81fde9bf>] do_event_work+0x10f/0x1c0
[<
ffffffff81107799>] process_one_work+0x689/0x1a80
[<
ffffffff811076f7>] ? process_one_work+0x5e7/0x1a80
[<
ffffffff81132065>] ? finish_task_switch+0x225/0x640
[<
ffffffff81107110>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
[<
ffffffff81108c69>] worker_thread+0xd9/0x1320
[<
ffffffff81108b90>] ? process_one_work+0x1a80/0x1a80
[<
ffffffff8111b02d>] kthread+0x21d/0x2e0
[<
ffffffff8111ae10>] ? kthread_stop+0x550/0x550
[<
ffffffff82022802>] ret_from_fork+0x22/0x40
[<
ffffffff8111ae10>] ? kthread_stop+0x550/0x550
RIP [<
ffffffff81ccd8f9>] rbd_dev_header_info+0xa19/0x1e30
To fix this, a) check if RBD_DEV_FLAG_EXISTS is set before calling
revalidate_disk(), b) move ceph_osdc_flush_notifies() call into
rbd_dev_header_unwatch_sync() to cover rbd map error paths and c) turn
header read-in into a critical section. The latter also happens to
take care of rbd map foo@bar vs rbd snap rm foo@bar race.
Fixes: http://tracker.ceph.com/issues/15490
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Keith Busch [Wed, 27 Apr 2016 20:22:32 +0000 (14:22 -0600)]
x86/apic: Handle zero vector gracefully in clear_vector_irq()
If x86_vector_alloc_irq() fails x86_vector_free_irqs() is invoked to cleanup
the already allocated vectors. This subsequently calls clear_vector_irq().
The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in
clear_vector_irq().
We cannot suppress the call to x86_vector_free_irqs() for the failed
interrupt, because the other data related to this irq must be cleaned up as
well. So calling clear_vector_irq() with vector == 0 is legitimate.
Remove the BUG_ON and return if vector is zero,
[ tglx: Massaged changelog ]
Fixes:
b5dc8e6c21e7 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Leo Yan [Tue, 29 Mar 2016 11:24:15 +0000 (19:24 +0800)]
thermal: use %d to print S32 parameters
Power allocator's parameters are S32 type, so use %d to print them.
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Leo Yan [Tue, 29 Mar 2016 11:23:32 +0000 (19:23 +0800)]
thermal: hisilicon: increase temperature resolution
When calculate temperature, old code firstly do division and then
convert to "millicelsius" unit. This will lose resolution and only can
read back temperature with "Celsius" unit.
So firstly scale step value to "millicelsius" and then do division, so
finally we can increase resolution for temperature value. Also refine
the calculation from temperature value to step value.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Linus Torvalds [Wed, 27 Apr 2016 19:03:59 +0000 (12:03 -0700)]
Merge branch 'for-4.6-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"So, it turns out we had a silly bug in the most fundamental part of
workqueue for a very long time. AFAICS, this dates back to pre-git
era and has quite likely been there from the time workqueue was first
introduced.
A work item uses its PENDING bit to synchronize multiple queuers.
Anyone who wins the PENDING bit owns the pending state of the work
item. Whether a queuer wins or loses the race, one thing should be
guaranteed - there will soon be at least one execution of the work
item - where "after" means that the execution instance would be able
to see all the changes that the queuer has made prior to the queueing
attempt.
Unfortunately, we were missing a smp_mb() after clearing PENDING for
execution, so nothing guaranteed visibility of the changes that a
queueing loser has made, which manifested as a reproducible blk-mq
stall.
Lots of kudos to Roman for debugging the problem. The patch for
-stable is the minimal one. For v3.7, Peter is working on a patch to
make the code path slightly more efficient and less fragile"
* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix ghost PENDING flag while doing MQ IO