Dan Carpenter [Tue, 19 Nov 2019 06:17:05 +0000 (09:17 +0300)]
Bluetooth: delete a stray unlock
commit
df66499a1fab340c167250a5743931dc50d5f0fa upstream.
We used to take a lock in amp_physical_cfm() but then we moved it to
the caller function. Unfortunately the unlock on this error path was
overlooked so it leads to a double unlock.
Fixes:
a514b17fab51 ("Bluetooth: Refactor locking in amp_physical_cfm")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Neukum [Thu, 14 Nov 2019 15:01:18 +0000 (16:01 +0100)]
Bluetooth: btusb: fix PM leak in error case of setup
commit
3d44a6fd0775e6215e836423e27f8eedf8c871ea upstream.
If setup() fails a reference for runtime PM has already
been taken. Proper use of the error handling in btusb_open() is needed.
You cannot just return.
Fixes:
ace31982585a3 ("Bluetooth: btusb: Add setup callback for chip init on USB")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Haener [Fri, 29 Nov 2019 09:16:49 +0000 (10:16 +0100)]
platform/x86: pmc_atom: Add Siemens CONNECT X300 to critclk_systems DMI table
commit
e8796c6c69d129420ee94a1906b18d86b84644d4 upstream.
The CONNECT X300 uses the PMC clock for on-board components and gets
stuck during boot if the clock is disabled. Therefore, add this
device to the critical systems list.
Tested on CONNECT X300.
Fixes:
648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL")
Signed-off-by: Michael Haener <michael.haener@siemens.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Omar Sandoval [Wed, 27 Nov 2019 00:58:08 +0000 (16:58 -0800)]
xfs: don't check for AG deadlock for realtime files in bunmapi
commit
69ffe5960df16938bccfe1b65382af0b3de51265 upstream.
Commit
5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") added
a check in __xfs_bunmapi() to stop early if we would touch multiple AGs
in the wrong order. However, this check isn't applicable for realtime
files. In most cases, it just makes us do unnecessary commits. However,
without the fix from the previous commit ("xfs: fix realtime file data
space leak"), if the last and second-to-last extents also happen to have
different "AG numbers", then the break actually causes __xfs_bunmapi()
to return without making any progress, which sends
xfs_itruncate_extents_flags() into an infinite loop.
Fixes:
5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Bolshakov [Mon, 25 Nov 2019 16:56:53 +0000 (19:56 +0300)]
scsi: qla2xxx: Drop superfluous INIT_WORK of del_work
commit
600954e6f2df695434887dfc6a99a098859990cf upstream.
del_work is already initialized inside qla2x00_alloc_fcport, there's no
need to overwrite it. Indeed, it might prevent complete traversal of
workqueue list.
Fixes:
a01c77d2cbc45 ("scsi: qla2xxx: Move session delete to driver work queue")
Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-5-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Scott Mayhew [Wed, 9 Oct 2019 19:11:37 +0000 (15:11 -0400)]
nfsd4: fix up replay_matches_cache()
commit
6e73e92b155c868ff7fce9d108839668caf1d9be upstream.
When running an nfs stress test, I see quite a few cached replies that
don't match up with the actual request. The first comment in
replay_matches_cache() makes sense, but the code doesn't seem to
match... fix it.
This isn't exactly a bugfix, as the server isn't required to catch every
case of a false retry. So, we may as well do this, but if this is
fixing a problem then that suggests there's a client bug.
Fixes:
53da6a53e1d4 ("nfsd4: catch some false session retries")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leonard Crestez [Tue, 24 Sep 2019 07:26:53 +0000 (10:26 +0300)]
PM / devfreq: Check NULL governor in available_governors_show
commit
d68adc8f85cd757bd33c8d7b2660ad6f16f7f3dc upstream.
The governor is initialized after sysfs attributes become visible so in
theory the governor field can be NULL here.
Fixes:
bcf23c79c4e46 ("PM / devfreq: Fix available_governor sysfs")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Catalin Marinas [Mon, 6 Jan 2020 14:35:39 +0000 (14:35 +0000)]
arm64: Revert support for execute-only user mappings
commit
24cecc37746393432d994c0dbc251fb9ac7c5d72 upstream.
The ARMv8 64-bit architecture supports execute-only user permissions by
clearing the PTE_USER and PTE_UXN bits, practically making it a mostly
privileged mapping but from which user running at EL0 can still execute.
The downside, however, is that the kernel at EL1 inadvertently reading
such mapping would not trip over the PAN (privileged access never)
protection.
Revert the relevant bits from commit
cab15ce604e5 ("arm64: Introduce
execute-only page access permissions") so that PROT_EXEC implies
PROT_READ (and therefore PTE_USER) until the architecture gains proper
support for execute-only user mappings.
Fixes:
cab15ce604e5 ("arm64: Introduce execute-only page access permissions")
Cc: <stable@vger.kernel.org> # 4.9.x-
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wen Yang [Fri, 3 Jan 2020 03:02:48 +0000 (11:02 +0800)]
ftrace: Avoid potential division by zero in function profiler
commit
e31f7939c1c27faa5d0e3f14519eaf7c89e8a69d upstream.
The ftrace_profile->counter is unsigned long and
do_div truncates it to 32 bits, which means it can test
non-zero and be truncated to zero for division.
Fix this issue by using div64_ul() instead.
Link: http://lkml.kernel.org/r/20200103030248.14516-1-wenyang@linux.alibaba.com
Cc: stable@vger.kernel.org
Fixes:
e330b3bcd8319 ("tracing: Show sample std dev in function profiling")
Fixes:
34886c8bc590f ("tracing: add average time in function to function profiler")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
chenqiwu [Thu, 19 Dec 2019 06:29:53 +0000 (14:29 +0800)]
exit: panic before exit_mm() on global init exit
commit
43cf75d96409a20ef06b756877a2e72b10a026fc upstream.
Currently, when global init and all threads in its thread-group have exited
we panic via:
  do_exit()
  -> exit_notify()
     -> forget_original_parent()
        -> find_child_reaper()
This makes it hard to extract a useable coredump for global init from a
kernel crashdump because by the time we panic exit_mm() will have already
released global init's mm.
This patch moves the panic further up before exit_mm() is called. As was the
case previously, we only panic when global init and all its threads in the
thread-group have exited.
Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
[christian.brauner@ubuntu.com: fix typo, rewrite commit message]
Link: https://lore.kernel.org/r/1576736993-10121-1-git-send-email-qiwuchen55@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Wed, 30 Oct 2019 10:09:21 +0000 (11:09 +0100)]
ALSA: firewire-motu: Correct a typo in the clock proc string
commit
0929249e3be3bb82ee6cfec0025f4dde952210b3 upstream.
Just fix a typo of "S/PDIF" in the clock name string.
Fixes:
4638ec6ede08 ("ALSA: firewire-motu: add proc node to show current statuc of clock and packet formats")
Acked-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20191030100921.3826-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Fri, 22 Nov 2019 13:13:54 +0000 (13:13 +0000)]
ALSA: cs4236: fix error return comparison of an unsigned integer
commit
d60229d84846a8399257006af9c5444599f64361 upstream.
The return from pnp_irq is an unsigned integer type resource_size_t
and hence the error check for a positive non-error code is always
going to be true. A check for a non-failure return from pnp_irq
should in fact be for (resource_size_t)-1 rather than >= 0.
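A small illustration of the two checks, assuming a stand-in unsigned
type. res_size_t, lookup_irq() and the irq_valid_*() helpers are
hypothetical names used only for this sketch; they are not part of the
PnP API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's unsigned resource_size_t. */
typedef unsigned long long res_size_t;

/* Hypothetical lookup: returns an IRQ number, or (res_size_t)-1 if none. */
static res_size_t lookup_irq(bool present)
{
    return present ? 5 : (res_size_t)-1;
}

/* Buggy form: an unsigned value is always >= 0, so this never fails. */
static bool irq_valid_buggy(res_size_t irq)
{
    return irq >= 0;
}

/* Fixed form: compare against the all-ones failure value. */
static bool irq_valid_fixed(res_size_t irq)
{
    bool ok = irq != (res_size_t)-1;

    assert(!ok || irq_valid_buggy(irq)); /* the buggy check accepts everything */
    return ok;
}
```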
Addresses-Coverity: ("Unsigned compared against 0")
Fixes:
a9824c868a2c ("[ALSA] Add CS4232 PnP BIOS support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20191122131354.58042-1-colin.king@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steven Rostedt (VMware) [Wed, 11 Dec 2019 20:44:22 +0000 (15:44 -0500)]
tracing: Have the histogram compare functions convert to u64 first
commit
106f41f5a302cb1f36c7543fae6a05de12e96fa4 upstream.
The compare functions of the histogram code would be specific for the size
of the value being compared (byte, short, int, long long). It would
reference the value from the array via the type of the compare, but the
value was stored in a 64 bit number. This is fine for little endian
machines, but for big endian machines, it would end up comparing zeros or
all ones (depending on the sign) for anything but 64 bit numbers.
To fix this, first dereference the value as a u64 then convert it to the type
being compared.
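The difference between the two reads can be sketched in portable C.
The function names below are made up for illustration and the real
compare functions are generated differently; memcpy() is used here
only to keep the sketch free of aliasing problems.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old behaviour: interpret the first 4 bytes of the 64-bit slot as the
 * value. This picks the low half only on little endian machines. */
static int32_t read_as_s32_direct(const void *slot)
{
    int32_t v;

    assert(slot != NULL);
    memcpy(&v, slot, sizeof(v));
    return v;
}

/* Fixed behaviour: read the whole 64-bit slot, then narrow it. */
static int32_t read_as_s32_via_u64(const void *slot)
{
    uint64_t v;

    assert(slot != NULL);
    memcpy(&v, slot, sizeof(v));
    return (int32_t)v;
}

/* Three-way compare built on the endian-independent read. */
static int cmp_s32(const void *a, const void *b)
{
    int32_t x = read_as_s32_via_u64(a);
    int32_t y = read_as_s32_via_u64(b);

    return (x > y) - (x < y);
}
```

On a big endian machine read_as_s32_direct() would see only the upper
half of the slot, i.e. zeros for small positive values and all ones
for small negative values.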
Link: http://lkml.kernel.org/r/20191211103557.7bed6928@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes:
08d43a5fa063e ("tracing: Add lock-free tracing_map")
Acked-by: Tom Zanussi <zanussi@kernel.org>
Reported-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prateek Sood [Tue, 10 Dec 2019 09:15:16 +0000 (09:15 +0000)]
tracing: Fix lock inversion in trace_event_enable_tgid_record()
commit
3a53acf1d9bea11b57c1f6205e3fe73f9d8a3688 upstream.
       Task T2                             Task T3
trace_options_core_write()            subsystem_open()
 mutex_lock(trace_types_lock)           mutex_lock(event_mutex)
 set_tracer_flag()
   trace_event_enable_tgid_record()       mutex_lock(trace_types_lock)
    mutex_lock(event_mutex)
This gives a circular dependency deadlock between trace_types_lock and
event_mutex. To fix this invert the usage of trace_types_lock and
event_mutex in trace_options_core_write(). This keeps the sequence of
lock usage consistent.
Link: http://lkml.kernel.org/r/0101016eef175e38-8ca71caf-a4eb-480d-a1e6-6f0bbc015495-000000@us-west-2.amazonses.com
Cc: stable@vger.kernel.org
Fixes:
d914ba37d7145 ("tracing: Add support for recording tgid of tasks")
Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Sat, 7 Dec 2019 16:20:18 +0000 (16:20 +0000)]
gpiolib: fix up emulated open drain outputs
commit
256efaea1fdc4e38970489197409a26125ee0aaa upstream.
gpiolib has a corner case with open drain outputs that are emulated.
When such outputs are outputting a logic 1, emulation will set the
hardware to input mode, which will cause gpiod_get_direction() to
report that it is in input mode. This is different from the behaviour
with a true open-drain output.
Unify the semantics here.
Cc: <stable@vger.kernel.org>
Suggested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Tue, 10 Dec 2019 18:53:45 +0000 (10:53 -0800)]
ata: ahci_brcm: Fix AHCI resources management
commit
c0cdf2ac4b5bf3e5ef2451ea29fb4104278cdabc upstream.
The AHCI resources management within ahci_brcm.c is a little
convoluted, largely because it historically had a dedicated clock that
was managed within this file in the downstream tree. Once brought
upstream though, the clock was left to be managed by libahci_platform.c
which is entirely appropriate.
This patch series ensures that the AHCI resources are fetched and
enabled before any register access is done, thus avoiding bus errors on
platforms which clock gate the controller by default.
As a result we need to re-arrange the suspend() and resume() functions
in order to avoid accessing registers after the clocks have been turned
off respectively before the clocks have been turned on. Finally, we can
refactor brcm_ahci_get_portmask() in order to fetch the number of ports
from hpriv->mmio which is now accessible without jumping through hoops
like we used to do.
The commit pointed to by the Fixes tag is both old and new enough not to
require major headaches for backporting of this patch.
Fixes:
eba68f829794 ("ata: ahci_brcmstb: rename to support across Broadcom SoC's")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Mon, 1 Oct 2018 17:33:00 +0000 (10:33 -0700)]
ata: ahci_brcm: Allow optional reset controller to be used
commit
2b2c47d9e1fe90311b725125d6252a859ee87a79 upstream.
On BCM63138, we need to reset the AHCI core before we start using it;
grab the reset controller device cookie and do that.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Tue, 10 Dec 2019 18:53:44 +0000 (10:53 -0800)]
ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys()
commit
84b032dbfdf1c139cd2b864e43959510646975f8 upstream.
This reverts commit
6bb86fefa086faba7b60bb452300b76a47cde1a5
("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()") we are
going to need ahci_platform_{enable,disable}_phys() in a subsequent
commit for ahci_brcm.c in order to properly control the PHY
initialization order.
Also make sure the function prototypes are declared in
include/linux/ahci_platform.h as a result.
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Fri, 29 Nov 2019 10:28:22 +0000 (11:28 +0100)]
compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE
commit
673bdf8ce0a387ef585c13b69a2676096c6edfe9 upstream.
These were added to blkdev_ioctl() but not blkdev_compat_ioctl,
so add them now.
Cc: <stable@vger.kernel.org> # v4.10+
Fixes:
3ed05a987e0f ("blk-zoned: implement ioctls")
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Fri, 29 Nov 2019 10:28:22 +0000 (11:28 +0100)]
compat_ioctl: block: handle Persistent Reservations
commit
b2c0fcd28772f99236d261509bcd242135677965 upstream.
These were added to blkdev_ioctl() in linux-5.5 but not
blkdev_compat_ioctl, so add them now.
Cc: <stable@vger.kernel.org> # v4.4+
Fixes:
bbd3e064362e ("block: add an API for Persistent Reservations")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fold in followup patch from Arnd with missing pr.h header include.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Lukas Wunner [Thu, 5 Dec 2019 11:54:49 +0000 (12:54 +0100)]
dmaengine: Fix access to uninitialized dma_slave_caps
commit
53a256a9b925b47c7e67fc1f16ca41561a7b877c upstream.
dmaengine_desc_set_reuse() allocates a struct dma_slave_caps on the
stack, populates it using dma_get_slave_caps() and then accesses one
of its members.
However dma_get_slave_caps() may fail and this isn't accounted for,
leading to a legitimate warning of gcc-4.9 (but not newer versions):
In file included from drivers/spi/spi-bcm2835.c:19:0:
drivers/spi/spi-bcm2835.c: In function 'dmaengine_desc_set_reuse':
>> include/linux/dmaengine.h:1370:10: warning: 'caps.descriptor_reuse' is used uninitialized in this function [-Wuninitialized]
if (caps.descriptor_reuse) {
Fix it, thereby also silencing the gcc-4.9 warning.
The issue has been present for 4 years but surfaces only now that
the first caller of dmaengine_desc_set_reuse() has been added in
spi-bcm2835.c. Another user of reusable DMA descriptors has existed
for a while in pxa_camera.c, but it sets the DMA_CTRL_REUSE flag
directly instead of calling dmaengine_desc_set_reuse(). Nevertheless,
tag this commit for stable in case there are out-of-tree users.
Fixes:
272420214d26 ("dmaengine: Add DMA_CTRL_REUSE")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.3+
Link: https://lore.kernel.org/r/ca92998ccc054b4f2bfd60ef3adbab2913171eac.1575546234.git.lukas@wunner.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Amir Goldstein [Sun, 22 Dec 2019 18:45:28 +0000 (20:45 +0200)]
locks: print unsigned ino in /proc/locks
commit
98ca480a8f22fdbd768e3dad07024c8d4856576c upstream.
An ino is unsigned, so display it as such in /proc/locks.
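As a user-space illustration only (format_ino() is a hypothetical
helper, and the exact /proc/locks format string is not reproduced
here), printing with an unsigned conversion keeps large inode numbers
from showing up as negative values:

```c
#include <assert.h>
#include <stdio.h>

/* Formats an inode number as unsigned; returns snprintf()'s result. */
static int format_ino(char *buf, size_t len, unsigned long ino)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%lu", ino);
}
```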
Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aleksandr Yashkin [Mon, 23 Dec 2019 13:38:16 +0000 (18:38 +0500)]
pstore/ram: Write new dumps to start of recycled zones
commit
9e5f1c19800b808a37fb9815a26d382132c26c3d upstream.
The ram_core.c routines treat przs as circular buffers. When writing a
new crash dump, the old buffer needs to be cleared so that the new dump
doesn't end up in the wrong place (i.e. at the end).
The solution to this problem is to reset the circular buffer state before
writing a new Oops dump.
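A toy model of the idea, assuming a zone is a small ring buffer with a
write cursor. struct zone, zone_reset() and zone_write() are
hypothetical and much simpler than the ram_core.c routines.

```c
#include <assert.h>
#include <stddef.h>

#define ZONE_SIZE 8

/* Hypothetical, simplified zone: a ring buffer with a write cursor. */
struct zone {
    char data[ZONE_SIZE];
    size_t start;   /* next write position */
    size_t size;    /* number of valid bytes */
};

/* Forget the old contents so the next write begins at offset 0. */
static void zone_reset(struct zone *z)
{
    assert(z != NULL);
    z->start = 0;
    z->size = 0;
}

/* Append bytes, wrapping around at the end of the buffer. */
static void zone_write(struct zone *z, const char *s, size_t n)
{
    assert(z != NULL && s != NULL);
    for (size_t i = 0; i < n; i++) {
        z->data[z->start] = s[i];
        z->start = (z->start + 1) % ZONE_SIZE;
        if (z->size < ZONE_SIZE)
            z->size++;
    }
}
```

Without the reset, a new dump written into a recycled zone starts
wherever the previous one stopped; with it, the dump starts at the
beginning of the zone.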
Signed-off-by: Aleksandr Yashkin <a.yashkin@inango-systems.com>
Signed-off-by: Nikolay Merinov <n.merinov@inango-systems.com>
Signed-off-by: Ariel Gilman <a.gilman@inango-systems.com>
Link: https://lore.kernel.org/r/20191223133816.28155-1-n.merinov@inango-systems.com
Fixes:
896fc1f0c4c6 ("pstore/ram: Switch to persistent_ram routines")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shakeel Butt [Sat, 4 Jan 2020 20:59:43 +0000 (12:59 -0800)]
memcg: account security cred as well to kmemcg
commit
84029fd04c201a4c7e0b07ba262664900f47c6f5 upstream.
The cred_jar kmem_cache is already memcg accounted in the current kernel
but cred->security is not. Account cred->security to kmemcg.
Recently we saw high root slab usage in production and, on further
inspection, found a buggy application leaking processes. Although that
buggy application was contained within its memcg, we observed much
more system memory overhead, a couple of GiBs, during that period. This
overhead can adversely impact the isolation on the system.
One source of high overhead we found was cred->security objects, which
have a lifetime of at least the life of the process which allocated
them.
Link: http://lkml.kernel.org/r/20191205223721.40034-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chanho Min [Sat, 4 Jan 2020 20:59:36 +0000 (12:59 -0800)]
mm/zsmalloc.c: fix the migrated zspage statistics.
commit
ac8f05da5174c560de122c499ce5dfb5d0dfbee5 upstream.
When zspage is migrated to the other zone, the zone page state should be
updated as well, otherwise the NR_ZSPAGE for each zone shows wrong
counts including proc/zoneinfo in practice.
Link: http://lkml.kernel.org/r/1575434841-48009-1-git-send-email-chanho.min@lge.com
Fixes:
91537fee0013 ("mm: add NR_ZSMALLOC to vmstat")
Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Jinsuk Choi <jjinsuk.choi@lge.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Sat, 7 Dec 2019 22:48:09 +0000 (23:48 +0100)]
media: cec: avoid decrementing transmit_queue_sz if it is 0
commit
95c29d46ab2a517e4c26d0a07300edca6768db17 upstream.
WARN if transmit_queue_sz is 0 but do not decrement it.
The CEC adapter will become unresponsive if it goes below
0 since then it thinks there are 4 billion messages in the
queue.
Obviously this should not happen, but a driver bug could
cause this.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: <stable@vger.kernel.org> # for v4.12 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Wed, 4 Dec 2019 07:52:08 +0000 (08:52 +0100)]
media: cec: CEC 2.0-only bcast messages were ignored
commit
cec935ce69fc386f13959578deb40963ebbb85c3 upstream.
Some messages are allowed to be a broadcast message in CEC 2.0
only, and should be ignored by CEC 1.4 devices.
Unfortunately, the check was wrong, causing such messages to be
marked as invalid under CEC 2.0.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: <stable@vger.kernel.org> # for v4.10 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Sat, 7 Dec 2019 22:43:23 +0000 (23:43 +0100)]
media: pulse8-cec: fix lost cec_transmit_attempt_done() call
commit
e5a52a1d15c79bb48a430fb263852263ec1d3f11 upstream.
The periodic PING command could interfere with the result of
a CEC transmit, causing a lost cec_transmit_attempt_done()
call.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: <stable@vger.kernel.org> # for v4.10 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Burton [Thu, 2 Jan 2020 04:50:38 +0000 (20:50 -0800)]
MIPS: Avoid VDSO ABI breakage due to global register variable
commit
bbcc5672b0063b0e9d65dc8787a4f09c3b5bb5cc upstream.
Declaring __current_thread_info as a global register variable has the
effect of preventing GCC from saving & restoring its value in cases
where the ABI would typically do so.
To quote GCC documentation:
> If the register is a call-saved register, call ABI is affected: the
> register will not be restored in function epilogue sequences after the
> variable has been assigned. Therefore, functions cannot safely return
> to callers that assume standard ABI.
When our position independent VDSO is built for the n32 or n64 ABIs all
functions it exposes should be preserving the value of $gp/$28 for their
caller, but in the presence of the __current_thread_info global register
variable GCC stops doing so & simply clobbers $gp/$28 when calculating
the address of the GOT.
In cases where the VDSO returns success this problem will typically be
masked by the caller in libc returning & restoring $gp/$28 itself, but
that is by no means guaranteed. In cases where the VDSO returns an error
libc will typically contain a fallback path which will now fail
(typically with a bad memory access) if it attempts anything which
relies upon the value of $gp/$28 - eg. accessing anything via the GOT.
One fix for this would be to move the declaration of
__current_thread_info inside the current_thread_info() function,
demoting it from global register variable to local register variable &
avoiding inadvertently creating a non-standard calling ABI for the VDSO.
Unfortunately this causes issues for clang, which doesn't support local
register variables as pointed out by commit
fe92da0f355e ("MIPS: Changed
current_thread_info() to an equivalent supported by both clang and GCC")
which introduced the global register variable before we had a VDSO to
worry about.
Instead, fix this by continuing to use the global register variable for
the kernel proper but declare __current_thread_info as a simple extern
variable when building the VDSO. It should never be referenced, and will
cause a link error if it is. This resolves the calling convention issue
for the VDSO without having any impact upon the build of the kernel
itself for either clang or gcc.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Fixes:
ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <christian.brauner@canonical.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org> # v4.4+
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Mavrodiev [Tue, 17 Dec 2019 12:46:32 +0000 (14:46 +0200)]
drm/sun4i: hdmi: Remove duplicate cleanup calls
commit
57177d214ee0816c4436c23d6c933ccb32c571f1 upstream.
When the HDMI unbinds, drm_connector_cleanup() and drm_encoder_cleanup()
are called. This also happens when the connector and the encoder are
destroyed. This double call triggers a NULL pointer exception.
The patch fixes this by removing the cleanup calls in the unbind
function.
Cc: <stable@vger.kernel.org>
Fixes:
9c5681011a0c ("drm/sun4i: Add HDMI support")
Signed-off-by: Stefan Mavrodiev <stefan@olimex.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217124632.20820-1-stefan@olimex.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Wed, 18 Dec 2019 19:26:06 +0000 (20:26 +0100)]
ALSA: ice1724: Fix sleep-in-atomic in Infrasonic Quartet support code
commit
0aec96f5897ac16ad9945f531b4bef9a2edd2ebd upstream.
Jia-Ju Bai reported a possible sleep-in-atomic scenario in the ice1724
driver with Infrasonic Quartet support code: namely, ice->set_rate
callback gets called inside ice->reg_lock spinlock, while the callback
in quartet.c holds ice->gpio_mutex.
This patch fixes the invalid call: it simply moves the calls of
ice->set_rate and ice->set_mclk callbacks outside the spinlock.
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/5d43135e-73b9-a46a-2155-9e91d0dcdf83@gmail.com
Link: https://lore.kernel.org/r/20191218192606.12866-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Vetter [Thu, 5 Dec 2019 00:52:37 +0000 (16:52 -0800)]
drm: limit to INT_MAX in create_blob ioctl
[ Upstream commit
5bf8bec3f4ce044a223c40cbce92590d938f0e9c ]
The hardened usercpy code is too paranoid ever since commit
6a30afa8c1fb
("uaccess: disallow > INT_MAX copy sizes")
Code itself should have been fine as-is.
Link: http://lkml.kernel.org/r/20191106164755.31478-1-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reported-by: syzbot+fb77e97ebf0612ee6914@syzkaller.appspotmail.com
Fixes:
6a30afa8c1fb ("uaccess: disallow > INT_MAX copy sizes")
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christian Brauner [Wed, 9 Oct 2019 11:48:09 +0000 (13:48 +0200)]
taskstats: fix data-race
[ Upstream commit
0b8d616fb5a8ffa307b1d3af37f55c15dae14f28 ]
When assigning and testing taskstats in taskstats_exit() there's a race
when setting up and reading sig->stats when a thread-group with more
than one thread exits:
write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
do_exit+0x2c2/0x18e0 kernel/exit.c:864
do_group_exit+0xb4/0x1c0 kernel/exit.c:983
get_signal+0x2a2/0x1320 kernel/signal.c:2734
do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
entry_SYSCALL_64_after_hwframe+0x44/0xa9
read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
do_exit+0x2c2/0x18e0 kernel/exit.c:864
do_group_exit+0xb4/0x1c0 kernel/exit.c:983
__do_sys_exit_group kernel/exit.c:994 [inline]
__se_sys_exit_group kernel/exit.c:992 [inline]
__x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix this by using smp_load_acquire() and smp_store_release().
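The acquire/release pairing can be illustrated with C11 atomics. This
is a user-space sketch, not the kernel code: shared_stats and
stats_get() are hypothetical, and a compare-and-exchange stands in for
the locking used around the store in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct stats { long counter; };

/* Hypothetical stand-in for the shared per-thread-group pointer. */
static _Atomic(struct stats *) shared_stats;

/* Returns the shared stats object, allocating it on first use. */
static struct stats *stats_get(void)
{
    /* Acquire pairs with the release half of the exchange below: a
     * reader that sees the pointer also sees the zeroed contents. */
    struct stats *s = atomic_load_explicit(&shared_stats,
                                           memory_order_acquire);
    if (s)
        return s;

    struct stats *fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;

    struct stats *expected = NULL;
    if (atomic_compare_exchange_strong_explicit(&shared_stats, &expected,
                                                fresh,
                                                memory_order_acq_rel,
                                                memory_order_acquire))
        return fresh;           /* we published our allocation */

    free(fresh);                /* another thread won the race */
    assert(expected != NULL);   /* it now points at the winner's object */
    return expected;
}
```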
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes:
34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Will Deacon <will@kernel.org>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20191009114809.8643-1-christian.brauner@ubuntu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Brian Foster [Tue, 3 Dec 2019 15:53:15 +0000 (07:53 -0800)]
xfs: fix mount failure crash on invalid iclog memory access
[ Upstream commit
798a9cada4694ca8d970259f216cec47e675bfd5 ]
syzbot (via KASAN) reports a use-after-free in the error path of
xlog_alloc_log(). Specifically, the iclog freeing loop doesn't
handle the case of a fully initialized ->l_iclog linked list.
Instead, it assumes that the list is partially constructed and NULL
terminated.
This bug manifested because there was no possible error scenario
after iclog list setup when the original code was added. Subsequent
code and associated error conditions were added some time later,
while the original error handling code was never updated. Fix up the
error loop to terminate either on a NULL iclog or reaching the end
of the list.
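The two termination conditions can be sketched on a generic list.
struct node, build() and free_all() are hypothetical and far simpler
than the iclog code; the sketch only shows a free loop that copes with
both a NULL-terminated chain and a fully closed ring.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified node; the real iclog carries much more state. */
struct node { struct node *next; };

/* Builds a chain of n nodes. If close_ring is non-zero the last node
 * points back at the first, otherwise the chain is NULL terminated. */
static struct node *build(int n, int close_ring)
{
    struct node *head = NULL, *prev = NULL;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        struct node *cur = calloc(1, sizeof(*cur));

        if (!cur)
            abort();
        if (prev)
            prev->next = cur;
        else
            head = cur;
        prev = cur;
    }
    if (close_ring)
        prev->next = head;
    return head;
}

/* Frees every node exactly once; returns how many were freed. */
static int free_all(struct node *head)
{
    int freed = 0;
    struct node *cur;

    if (!head)
        return 0;
    cur = head->next;
    while (cur && cur != head) {        /* stop on NULL or on wrap-around */
        struct node *next = cur->next;  /* read the link before freeing */

        free(cur);
        freed++;
        cur = next;
    }
    free(head);                         /* the head goes last */
    return freed + 1;
}
```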
Reported-by: syzbot+c732f8644185de340492@syzkaller.appspotmail.com
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andy Whitcroft [Wed, 25 Sep 2019 14:39:12 +0000 (15:39 +0100)]
PM / hibernate: memory_bm_find_bit(): Tighten node optimisation
[ Upstream commit
da6043fe85eb5ec621e34a92540735dcebbea134 ]
When looking for a bit by number we make use of the cached result from the
preceding lookup to speed up operation. Firstly we check if the requested
pfn is within the cached zone and, if not, look up the new zone. We then
check if the offset for that pfn falls within the existing cached node.
This happens regardless of whether the node is within the zone we are
now scanning. With certain memory layouts this check can trigger falsely,
creating a temporary alias for the pfn to a different bit.
This leads the hibernation code to free memory which it never allocated,
with the expected fallout.
Ensure the zone we are scanning matches the cached zone before considering
the cached node.
Deep thanks go to Andrea for many, many, many hours of hacking and testing
that went into cornering this bug.
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Juergen Gross [Thu, 12 Dec 2019 14:17:50 +0000 (15:17 +0100)]
xen/balloon: fix ballooned page accounting without hotplug enabled
[ Upstream commit
c673ec61ade89bf2f417960f986bc25671762efb ]
When CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not defined
reserve_additional_memory() will set balloon_stats.target_pages to a
wrong value in case there are still some ballooned pages allocated via
alloc_xenballooned_pages().
This will result in balloon_process() no longer being triggered when
ballooned pages are freed in batches.
Reported-by: Nicholas Tsirakis <niko.tsirakis@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Paul Durrant [Tue, 10 Dec 2019 14:53:05 +0000 (14:53 +0000)]
xen-blkback: prevent premature module unload
[ Upstream commit
fa2ac657f9783f0891b2935490afe9a7fd29d3fa ]
Objects allocated by xen_blkif_alloc come from the 'blkif_cache' kmem
cache. This cache is destroyed when xen-blkif is unloaded so it is
necessary to wait for the deferred free routine used for such objects to
complete. This necessity was missed in commit
14855954f636 "xen-blkback:
allow module to be cleanly unloaded". This patch fixes the problem by
taking/releasing extra module references in xen_blkif_alloc/free()
respectively.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Parav Pandit [Thu, 12 Dec 2019 09:12:13 +0000 (11:12 +0200)]
IB/mlx4: Follow mirror sequence of device add during device removal
[ Upstream commit
89f988d93c62384758b19323c886db917a80c371 ]
The current device add sequence is:
ib_register_device()
ib_mad_init()
init_sriov_init()
register_netdev_notifier()
Therefore, the remove sequence should be:
unregister_netdev_notifier()
close_sriov()
mad_cleanup()
ib_unregister_device()
However, the current code does not remove things in this order.
Hence, make the remove sequence mirror the add sequence.
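A mirrored teardown can be kept honest by recording setup steps as they
complete and undoing them last-in, first-out. The sketch below is a generic
userspace illustration, not mlx4 code; struct step_log, step_done() and
next_to_undo() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 8

/* Records which setup steps have completed, in the order they ran. */
struct step_log {
	const char *done[MAX_STEPS];
	size_t count;
};

/* Call after a setup step has succeeded. */
static void step_done(struct step_log *log, const char *name)
{
	assert(log->count < MAX_STEPS);
	log->done[log->count++] = name;
}

/* Returns the most recently completed step, or NULL when none remain. */
static const char *next_to_undo(struct step_log *log)
{
	return log->count ? log->done[--log->count] : NULL;
}
```

Undoing in the order returned by next_to_undo() yields exactly the reverse of
the add sequence, and also covers a setup that failed part-way.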
Fixes:
fa417f7b520ee ("IB/mlx4: Add support for IBoE")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20191212091214.315005-3-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thomas Richter [Fri, 29 Nov 2019 14:24:25 +0000 (15:24 +0100)]
s390/cpum_sf: Avoid SBD overflow condition in irq handler
[ Upstream commit
0539ad0b22877225095d8adef0c376f52cc23834 ]
The s390 CPU Measurement sampling facility has an overflow condition
which fires when all entries in an SDB are used.
The measurement alert interrupt is triggered and reads out all samples
in this SDB. It then tests the successor SDB. If this SDB is not full,
the interrupt handler does not read any samples at all from this SDB.
The design waits for the hardware to fill this SDB and then trigger
another measurement alert interrupt.
This scheme works nicely until
a perf_event_overflow() function call discards the sample due to
a too high sampling rate.
The interrupt handler has logic to read out a partially filled SDB
when the perf event overflow condition in linux common code is met.
This causes the CPUM sampling measurement hardware and the PMU
device driver to operate on the same SDB's trailer entry.
This should not happen.
This can be seen here using this trace:
cpumsf_pmu_add: tear:0xb5286000
hw_perf_event_update: sdbt 0xb5286000 full 1 over 0 flush_all:0
hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0
above shows 1. interrupt
hw_perf_event_update: sdbt 0xb5286008 full 1 over 0 flush_all:0
hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0
above shows 2. interrupt
... this goes on fine until...
hw_perf_event_update: sdbt 0xb5286068 full 1 over 0 flush_all:0
perf_push_sample1: overflow
one or more samples read from the IRQ handler are rejected by
perf_event_overflow() and the IRQ handler advances to the next SDB
and modifies the trailer entry of a partially filled SDB.
hw_perf_event_update: sdbt 0xb5286070 full 0 over 0 flush_all:1
timestamp: 14:32:52.519953
Next time the IRQ handler is called for this SDB the trailer entry shows
an overflow count of 19 missed entries.
hw_perf_event_update: sdbt 0xb5286070 full 1 over 19 flush_all:1
timestamp: 14:32:52.970058
Remove access to a follow on SDB when event overflow happened.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thomas Richter [Thu, 28 Nov 2019 09:26:41 +0000 (10:26 +0100)]
s390/cpum_sf: Adjust sampling interval to avoid hitting sample limits
[ Upstream commit
39d4a501a9ef55c57b51e3ef07fc2aeed7f30b3b ]
Functions perf_event_ever_overflow() and perf_event_account_interrupt()
are called every time samples are processed by the interrupt handler.
However, function perf_event_account_interrupt() has checks to avoid being
flooded with interrupts (more than 1000 samples are received per
task_tick). Samples are then dropped and a PERF_RECORD_THROTTLED is
added to the perf data. The perf subsystem limit calculation is:
maximum sample frequency := 100000 --> 1 samples per 10 us
task_tick = 10ms = 10000us --> 1000 samples per task_tick
The work flow is
measurement_alert() uses SDBT head and each SDBT points to 511
SDB pages, each with 126 sample entries. After processing 8 SDBs
and for each valid sample calling:
perf_event_overflow()
perf_event_account_interrupts()
there is a considerable amount of samples being dropped, especially when
the sample frequency is very high and near the 100000 limit.
To avoid the high amount of samples being dropped near the end of a
task_tick time frame, increment the sampling interval in case of
dropped events. The CPU Measurement sampling facility on the s390
supports only intervals, specifying how many CPU cycles have to be
executed before a sample is generated. Increase the interval when the
samples being generated hit the task_tick limit.
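The limit calculation quoted in this message can be checked with a few lines
of arithmetic. The constants are the ones stated above (100000 Hz maximum
sample frequency, 10 ms task_tick); the macro and function names are
illustrative only and are not taken from the perf code.

```c
#include <assert.h>

/* Values quoted in the commit message; the names are illustrative. */
#define MAX_SAMPLE_FREQ_HZ	100000UL	/* maximum sample frequency */
#define TASK_TICK_US		10000UL		/* task_tick = 10 ms */
#define DEMO_USEC_PER_SEC	1000000UL

/* Microseconds between two samples at the given frequency. */
static unsigned long sample_period_us(unsigned long freq_hz)
{
	assert(freq_hz > 0 && freq_hz <= DEMO_USEC_PER_SEC);
	return DEMO_USEC_PER_SEC / freq_hz;
}

/* Number of samples that fit into one task_tick at the given frequency. */
static unsigned long samples_per_tick(unsigned long freq_hz)
{
	return TASK_TICK_US / sample_period_us(freq_hz);
}
```

At MAX_SAMPLE_FREQ_HZ this gives one sample per 10 us and 1000 samples per
task_tick, matching the figures in the message.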
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Zhiqiang Liu [Tue, 10 Dec 2019 02:42:25 +0000 (10:42 +0800)]
md: raid1: check rdev before reference in raid1_sync_request func
[ Upstream commit
028288df635f5a9addd48ac4677b720192747944 ]
In raid1_sync_request(), rdev should be checked before it is dereferenced.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jens Axboe [Tue, 10 Dec 2019 03:58:56 +0000 (20:58 -0700)]
net: make socket read/write_iter() honor IOCB_NOWAIT
[ Upstream commit
ebfcd8955c0b52eb793bcbc9e71140e3d0cdb228 ]
The socket read/write helpers only look at the file O_NONBLOCK, not
the iocb IOCB_NOWAIT flag. This breaks users like preadv2/pwritev2
and io_uring that rely on not having the file itself marked nonblocking,
but rather the iocb itself.
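The intended decision can be sketched as a small pure function: the request
is treated as non-blocking if either the file or the individual iocb asks
for it. The DEMO_* flag values and msg_flags_for() are hypothetical and
chosen only for this illustration; the real kernel constants and helper
signatures differ.

```c
#include <assert.h>

/* Illustrative flag values; the real kernel constants differ. */
#define DEMO_O_NONBLOCK		0x1u	/* set on the file */
#define DEMO_IOCB_NOWAIT	0x2u	/* set on the individual request */
#define DEMO_MSG_DONTWAIT	0x4u	/* passed down to the socket layer */

/* Derive the message flags from both the file and the request flags. */
static unsigned int msg_flags_for(unsigned int file_flags,
				  unsigned int iocb_flags)
{
	unsigned int msg_flags = 0;

	if ((file_flags & DEMO_O_NONBLOCK) || (iocb_flags & DEMO_IOCB_NOWAIT))
		msg_flags |= DEMO_MSG_DONTWAIT;
	return msg_flags;
}

/* Usage: a blocking file with a NOWAIT request must still not block. */
void nowait_self_check(void)
{
	assert(msg_flags_for(0, DEMO_IOCB_NOWAIT) == DEMO_MSG_DONTWAIT);
	assert(msg_flags_for(DEMO_O_NONBLOCK, 0) == DEMO_MSG_DONTWAIT);
	assert(msg_flags_for(0, 0) == 0);
}
```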
Cc: netdev@vger.kernel.org
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
EJ Hsu [Wed, 4 Dec 2019 07:34:56 +0000 (23:34 -0800)]
usb: gadget: fix wrong endpoint desc
[ Upstream commit
e5b5da96da50ef30abb39cb9f694e99366404d24 ]
Gadget driver should always use config_ep_by_speed() to initialize
usb_ep struct according to usb device's operating speed. Otherwise,
usb_ep struct may be wrong if usb device's operating speed is changed.
The key point in this patch is that we want to make sure the desc pointer
in usb_ep struct will be set to NULL when gadget is disconnected.
This will force it to call config_ep_by_speed() to correctly initialize
usb_ep struct based on the new operating speed when gadget is
re-connected later.
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: EJ Hsu <ejh@nvidia.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hans de Goede [Thu, 24 Oct 2019 08:52:52 +0000 (10:52 +0200)]
drm/nouveau: Move the declaration of struct nouveau_conn_atom up a bit
[ Upstream commit
37a68eab4cd92b507c9e8afd760fdc18e4fecac6 ]
Place the declaration of struct nouveau_conn_atom above that of
struct nouveau_connector. This commit makes no changes to the moved
block whatsoever, it just moves it up a bit.
This is a preparation patch to fix some issues with connector handling
on pre nv50 displays (which do not use atomic modesetting).
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jason Yan [Fri, 6 Dec 2019 01:11:18 +0000 (09:11 +0800)]
scsi: libsas: stop discovering if oob mode is disconnected
[ Upstream commit
f70267f379b5e5e11bdc5d72a56bf17e5feed01f ]
The discovery of a sas port is driven by a workqueue in libsas. When libsas
is processing port events or phy events in the workqueue, new events may
arise and change the state of some structures such as asd_sas_phy. This may
cause problems such as the following:
==>thread 1 ==>thread 2
==>phy up
==>phy_up_v3_hw()
==>oob_mode = SATA_OOB_MODE;
==>phy down quickly
==>hisi_sas_phy_down()
==>sas_ha->notify_phy_event()
==>sas_phy_disconnected()
==>oob_mode = OOB_NOT_CONNECTED
==>workqueue wakeup
==>sas_form_port()
==>sas_discover_domain()
==>sas_get_port_device()
==>oob_mode is OOB_NOT_CONNECTED and device
is wrongly taken as expander
This eventually leads to a panic when libsas tries to issue a command to
discover the device.
[183047.614035] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000058
[183047.622896] Mem abort info:
[183047.625762] ESR = 0x96000004
[183047.628893] Exception class = DABT (current EL), IL = 32 bits
[183047.634888] SET = 0, FnV = 0
[183047.638015] EA = 0, S1PTW = 0
[183047.641232] Data abort info:
[183047.644189] ISV = 0, ISS = 0x00000004
[183047.648100] CM = 0, WnR = 0
[183047.651145] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000b7df67be
[183047.657834] [0000000000000058] pgd=0000000000000000
[183047.662789] Internal error: Oops: 96000004 [#1] SMP
[183047.667740] Process kworker/u16:2 (pid: 31291, stack limit =
0x00000000417c4974)
[183047.675208] CPU: 0 PID: 3291 Comm: kworker/u16:2 Tainted: G
W OE 4.19.36-vhulk1907.1.0.h410.eulerosv2r8.aarch64 #1
[183047.687015] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10,
BIOS 0.15 10/22/2019
[183047.695007] Workqueue: 0000:74:02.0_disco_q sas_discover_domain
[183047.700999] pstate: 20c00009 (nzCv daif +PAN +UAO)
[183047.705864] pc : prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.711510] lr : prep_ata_v3_hw+0xb0/0x230 [hisi_sas_v3_hw]
[183047.717153] sp : ffff00000f28ba60
[183047.720541] x29: ffff00000f28ba60 x28: ffff8026852d7228
[183047.725925] x27: ffff8027dba3e0a8 x26: ffff8027c05fc200
[183047.731310] x25: 0000000000000000 x24: ffff8026bafa8dc0
[183047.736695] x23: ffff8027c05fc218 x22: ffff8026852d7228
[183047.742079] x21: ffff80007c2f2940 x20: ffff8027c05fc200
[183047.747464] x19: 0000000000f80800 x18: 0000000000000010
[183047.752848] x17: 0000000000000000 x16: 0000000000000000
[183047.758232] x15: ffff000089a5a4ff x14: 0000000000000005
[183047.763617] x13: ffff000009a5a50e x12: ffff8026bafa1e20
[183047.769001] x11: ffff0000087453b8 x10: ffff00000f28b870
[183047.774385] x9 : 0000000000000000 x8 : ffff80007e58f9b0
[183047.779770] x7 : 0000000000000000 x6 : 000000000000003f
[183047.785154] x5 : 0000000000000040 x4 : ffffffffffffffe0
[183047.790538] x3 : 00000000000000f8 x2 : 0000000002000007
[183047.795922] x1 : 0000000000000008 x0 : 0000000000000000
[183047.801307] Call trace:
[183047.803827] prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.809127] hisi_sas_task_prep+0x750/0x888 [hisi_sas_main]
[183047.814773] hisi_sas_task_exec.isra.7+0x88/0x1f0 [hisi_sas_main]
[183047.820939] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main]
[183047.826757] smp_execute_task_sg+0xec/0x218
[183047.831013] smp_execute_task+0x74/0xa0
[183047.834921] sas_discover_expander.part.7+0x9c/0x5f8
[183047.839959] sas_discover_root_expander+0x90/0x160
[183047.844822] sas_discover_domain+0x1b8/0x1e8
[183047.849164] process_one_work+0x1b4/0x3f8
[183047.853246] worker_thread+0x54/0x470
[183047.856981] kthread+0x134/0x138
[183047.860283] ret_from_fork+0x10/0x18
[183047.863931] Code: f9407a80 528000e2 39409281 72a04002 (b9405800)
[183047.870097] kernel fault(0x1) notification starting on CPU 0
[183047.875828] kernel fault(0x1) notification finished on CPU 0
[183047.881559] Modules linked in: unibsp(OE) hns3(OE) hclge(OE)
hnae3(OE) mem_drv(OE) hisi_sas_v3_hw(OE) hisi_sas_main(OE)
[183047.892418] ---[ end trace 4cc26083fc11b783 ]---
[183047.897107] Kernel panic - not syncing: Fatal exception
[183047.902403] kernel fault(0x5) notification starting on CPU 0
[183047.908134] kernel fault(0x5) notification finished on CPU 0
[183047.913865] SMP: stopping secondary CPUs
[183047.917861] Kernel Offset: disabled
[183047.921422] CPU features: 0x2,a2a00a38
[183047.925243] Memory Limit: none
[183047.928372] kernel reboot(0x2) notification starting on CPU 0
[183047.934190] kernel reboot(0x2) notification finished on CPU 0
[183047.940008] ---[ end Kernel panic - not syncing: Fatal exception
]---
Fixes:
2908d778ab3e ("[SCSI] aic94xx: new driver")
Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com
Reported-by: Gao Chuan <gaochuan4@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Tue, 3 Dec 2019 09:45:09 +0000 (12:45 +0300)]
scsi: iscsi: qla4xxx: fix double free in probe
[ Upstream commit
fee92f25777789d73e1936b91472e9c4644457c8 ]
On this error path we call qla4xxx_mem_free() and then the caller also
calls qla4xxx_free_adapter() which calls qla4xxx_mem_free(). It leads to a
couple of double frees:
drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->chap_dma_pool' double freed
drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->fw_ddb_dma_pool' double freed
Fixes:
afaf5a2d341d ("[SCSI] Initial Commit of qla4xxx")
Link: https://lore.kernel.org/r/20191203094421.hw7ex7qr3j2rbsmx@kili.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Roman Bolshakov [Mon, 25 Nov 2019 16:56:56 +0000 (19:56 +0300)]
scsi: qla2xxx: Don't call qlt_async_event twice
[ Upstream commit
2c2f4bed9b6299e6430a65a29b5d27b8763fdf25 ]
MBA_PORT_UPDATE generates duplicate log lines in target mode because
qlt_async_event is called twice. Drop the calls within the case as the
function will be called right after the switch statement.
Cc: Quinn Tran <qutran@marvell.com>
Link: https://lore.kernel.org/r/20191125165702.1013-8-r.bolshakov@yadro.com
Acked-by: Himanshu Madhani <hmadhani@marvel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bo Wu [Sat, 7 Dec 2019 03:22:46 +0000 (03:22 +0000)]
scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func
[ Upstream commit
9a1b0b9a6dab452fb0e39fe96880c4faf3878369 ]
When phba->mbox_ext_buf_ctx.seqNum != phba->mbox_ext_buf_ctx.numBuf,
dd_data should be freed before returning SLI_CONFIG_HANDLED.
When lpfc_sli_issue_mbox() fails, pmboxq should also be freed at the
job_error label.
Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E7A966@DGGEML525-MBS.china.huawei.com
Signed-off-by: Bo Wu <wubo40@huawei.com>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Steve Wise [Tue, 3 Dec 2019 02:03:20 +0000 (20:03 -0600)]
rxe: correctly calculate iCRC for unaligned payloads
[ Upstream commit
2030abddec6884aaf5892f5724c48fc340e6826f ]
If RoCE PDUs being sent or received contain pad bytes, then the iCRC
is miscalculated, resulting in PDUs being emitted by RXE with an incorrect
iCRC, as well as ingress PDUs being dropped due to erroneously detecting
a bad iCRC in the PDU. The fix is to include the pad bytes, if any,
in iCRC computations.
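The effect of leaving out pad bytes can be shown with a plain CRC-32. This
is only a sketch: the real iCRC also covers (partly masked) headers, which
is not modelled here, and pad_bytes(), crc_with_pad() and crc_without_pad()
are hypothetical helpers. It assumes payloads are padded with zero bytes up
to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pad bytes needed to round a payload up to a 4-byte boundary. */
static size_t pad_bytes(size_t payload_len)
{
	return (4 - (payload_len & 3)) & 3;
}

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), continuing from crc. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc;
}

/* CRC over the payload only: the behaviour described as the bug. */
static uint32_t crc_without_pad(const uint8_t *buf, size_t payload_len)
{
	return ~crc32_update(0xFFFFFFFFu, buf, payload_len);
}

/*
 * CRC over the payload and its pad bytes: the behaviour described as the
 * fix. buf must hold payload_len + pad_bytes(payload_len) bytes.
 */
static uint32_t crc_with_pad(const uint8_t *buf, size_t payload_len)
{
	size_t total = payload_len + pad_bytes(payload_len);

	return ~crc32_update(0xFFFFFFFFu, buf, total);
}
```

For an aligned payload both functions agree; for an unaligned one they
differ, which is why a sender and receiver that disagree on this point drop
each other's packets.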
Note: This bug has caused broken on-the-wire compatibility with actual
hardware RoCE devices since the soft-RoCE driver was first put into the
mainstream kernel. Fixing it will create an incompatibility with the
original soft-RoCE devices, but is necessary to be compatible with real
hardware devices.
Fixes:
8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Steve Wise <larrystevenwise@gmail.com>
Link: https://lore.kernel.org/r/20191203020319.15036-2-larrystevenwise@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chuhong Yuan [Fri, 6 Dec 2019 01:24:26 +0000 (09:24 +0800)]
RDMA/cma: add missed unregister_pernet_subsys in init failure
[ Upstream commit
44a7b6759000ac51b92715579a7bba9e3f9245c2 ]
The driver forgets to call unregister_pernet_subsys() in the error path
of cma_init().
Add the missed call to fix it.
Fixes:
4be74b42a6d0 ("IB/cma: Separate port allocation to network namespaces")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Link: https://lore.kernel.org/r/20191206012426.12744-1-hslester96@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Leonard Crestez [Wed, 13 Nov 2019 23:21:31 +0000 (01:21 +0200)]
PM / devfreq: Don't fail devfreq_dev_release if not in list
[ Upstream commit
42a6b25e67df6ee6675e8d1eaf18065bd73328ba ]
Right now devfreq_dev_release will print a warning and abort the rest of
the cleanup if the devfreq instance is not part of the global
devfreq_list. But this is a valid scenario, for example it can happen if
the governor can't be found or on any other init error that happens
after device_register.
Initialize devfreq->node to an empty list head in devfreq_add_device so
that list_del becomes a safe noop inside devfreq_dev_release and we can
continue the rest of the cleanup.
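Why an initialized list head makes deletion harmless can be seen in a small
userspace imitation of a circular doubly linked list. The names below are
hypothetical, and list_del_node() re-initializes the node after unlinking,
whereas the kernel's list_del() poisons the pointers instead; only the
unlink step is the point here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace imitation of a circular doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

/* An initialized node points at itself, i.e. it is an empty list. */
static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
static void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n from whatever list it is on and leave it self-linked. */
static void list_del_node(struct list_node *n)
{
	/* Uninitialized (zeroed) nodes would be dereferenced as NULL here. */
	assert(n->next != NULL && n->prev != NULL);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}
```

Deleting a self-linked node only rewrites its own pointers with the same
values, so cleanup code can unlink unconditionally.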
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Geert Uytterhoeven [Mon, 2 Dec 2019 08:55:46 +0000 (09:55 +0100)]
iio: adc: max9611: Fix too short conversion time delay
[ Upstream commit
9fd229c478fbf77c41c8528aa757ef14210365f6 ]
As of commit
b9ddd5091160793e ("iio: adc: max9611: Fix temperature
reading in probe"), max9611 initialization sometimes fails on the
Salvator-X(S) development board with:
max9611 4-007f: Invalid value received from ADC 0x8000: aborting
max9611: probe of 4-007f failed with error -5
The max9611 driver tests communications with the chip by reading the die
temperature during the probe function, which returns an invalid value.
According to the datasheet, the typical ADC conversion time is 2 ms, but
no minimum or maximum values are provided. Maxim Technical Support
confirmed this was tested with temperature Ta=25 degreeC, and promised
to inform me if a maximum/minimum value is available (they didn't get
back to me, so I assume it is not).
However, the driver assumes a 1 ms conversion time. Usually the
usleep_range() call returns after more than 1.8 ms, hence it succeeds.
When it returns earlier, the data register may be read too early, and
the previous measurement value will be returned. After boot, this is
the temperature POR (power-on reset) value, causing the failure above.
Fix this by increasing the delay from 1000-2000 µs to 3000-3300 µs.
Note that this issue has always been present, but it was exposed by the
aforementioned commit.
Fixes:
69780a3bbc0b1e7e ("iio: adc: Add Maxim max9611 ADC driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
James Smart [Thu, 14 Nov 2019 23:15:26 +0000 (15:15 -0800)]
nvme_fc: add module to ops template to allow module references
[ Upstream commit
863fbae929c7a5b64e96b8a3ffb34a29eefb9f8f ]
In nvme-fc: it's possible to have connected active controllers
and as no references are taken on the LLDD, the LLDD can be
unloaded. The controller would enter a reconnect state and as
long as the LLDD resumed within the reconnect timeout, the
controller would resume. But if a namespace on the controller
is the root device, allowing the driver to unload can be problematic.
To reload the driver, it may require new io to the boot device,
and as it's no longer connected we get into a catch-22 that
eventually fails, and the system locks up.
Fix this issue by taking a module reference for every connected
controller (which is what the core layer did to the transport
module). Reference is cleared when the controller is removed.
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Greg Kroah-Hartman [Sat, 4 Jan 2020 13:00:23 +0000 (14:00 +0100)]
Linux 4.14.162
Christophe Leroy [Thu, 12 Dec 2019 17:47:24 +0000 (17:47 +0000)]
spi: fsl: use platform_get_irq() instead of of_irq_to_resource()
commit
63aa6a692595d47a0785297b481072086b9272d2 upstream.
Unlike irq_of_parse_and_map() which has a dummy definition on SPARC,
of_irq_to_resource() hasn't.
But as platform_get_irq() can be used instead and is generic, use it.
Reported-by: kbuild test robot <lkp@intel.com>
Suggested-by: Mark Brown <broonie@kernel.org>
Fixes:
3194d2533eff ("spi: fsl: don't map irq during probe")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Link: https://lore.kernel.org/r/091a277fd0b3356dca1e29858c1c96983fc9cb25.1576172743.git.christophe.leroy@c-s.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Wed, 11 Dec 2019 08:23:48 +0000 (08:23 +0000)]
gtp: avoid zero size hashtable
[ Upstream commit
6a902c0f31993ab02e1b6ea7085002b9c9083b6a ]
The GTP default hashtable size is 1024 and userspace can set a specific
hashtable size with IFLA_GTP_PDP_HASHSIZE. If the hashtable size is set to 0
from userspace, the hashtable will not work and a panic will occur.
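One way to guard against a zero size is to fall back to the default before
the size is ever used as a divisor. The message does not say whether the
patch falls back or rejects the request, so the fallback below is an
assumption; effective_hashsize() and bucket_for() are hypothetical helpers.

```c
#include <assert.h>

#define DEMO_DEFAULT_HASHSIZE 1024u	/* default quoted in the message */

/* Treat a requested size of 0 as "use the default". */
static unsigned int effective_hashsize(unsigned int requested)
{
	return requested ? requested : DEMO_DEFAULT_HASHSIZE;
}

/* Bucket selection; a size of 0 would make the modulo undefined. */
static unsigned int bucket_for(unsigned int hash, unsigned int hashsize)
{
	assert(hashsize != 0);
	return hash % hashsize;
}
```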
Fixes:
459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Wed, 11 Dec 2019 08:23:34 +0000 (08:23 +0000)]
gtp: fix an use-after-free in ipv4_pdp_find()
[ Upstream commit
94dc550a5062030569d4aa76e10e50c8fc001930 ]
ipv4_pdp_find() is called in TX packet path of GTP.
ipv4_pdp_find() internally uses gtp->tid_hash to lookup pdp context.
In the current code, gtp->tid_hash and gtp->addr_hash are freed by
->dellink(), which is gtp_dellink().
But gtp_dellink() may be called while packets are still being processed.
So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash.
Instead, dev->priv_destructor() is used, because this callback
is called only after all packet processing has finished.
Test commands:
ip link add veth1 type veth peer name veth2
ip a a 172.0.0.1/24 dev veth1
ip link set veth1 up
ip a a 172.99.0.1/32 dev lo
gtp-link add gtp1 &
gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2
ip r a 172.99.0.2/32 dev gtp1
ip link set gtp1 mtu 1500
ip netns add ns2
ip link set veth2 netns ns2
ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2
ip netns exec ns2 ip link set veth2 up
ip netns exec ns2 ip a a 172.99.0.2/32 dev lo
ip netns exec ns2 ip link set lo up
ip netns exec ns2 gtp-link add gtp2 &
ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1
ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2
ip netns exec ns2 ip link set gtp2 mtu 1500
hping3 172.99.0.2 -2 --flood &
ip link del gtp1
Splat looks like:
[ 72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.568916][ T1195] Read of size 8 at addr ffff8880b9a35d28 by task hping3/1195
[ 72.569631][ T1195]
[ 72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199
[ 72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 72.571438][ T1195] Call Trace:
[ 72.571764][ T1195] dump_stack+0x96/0xdb
[ 72.572171][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.572761][ T1195] print_address_description.constprop.5+0x1be/0x360
[ 72.573400][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.573971][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.574544][ T1195] __kasan_report+0x12a/0x16f
[ 72.575014][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.575593][ T1195] kasan_report+0xe/0x20
[ 72.576004][ T1195] ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.576577][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ ... ]
[ 72.647671][ T1195] BUG: unable to handle page fault for address: ffff8880b9a35d28
[ 72.648512][ T1195] #PF: supervisor read access in kernel mode
[ 72.649158][ T1195] #PF: error_code(0x0000) - not-present page
[ 72.649849][ T1195] PGD a6c01067 P4D a6c01067 PUD 11fb07067 PMD 11f939067 PTE 800fffff465ca060
[ 72.652958][ T1195] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G B 5.5.0-rc1 #199
[ 72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp]
[ ... ]
[ 72.679168][ T1195] Call Trace:
[ 72.679603][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ 72.681915][ T1195] ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp]
[ 72.682513][ T1195] ? lock_acquire+0x164/0x3b0
[ 72.682966][ T1195] ? gtp_dev_xmit+0x35e/0x890 [gtp]
[ 72.683481][ T1195] gtp_dev_xmit+0x3c2/0x890 [gtp]
[ ... ]
Fixes:
459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Wed, 11 Dec 2019 08:23:17 +0000 (08:23 +0000)]
gtp: fix wrong condition in gtp_genl_dump_pdp()
[ Upstream commit
94a6d9fb88df43f92d943c32b84ce398d50bf49f ]
gtp_genl_dump_pdp() is the ->dumpit() callback of the GTP module and it is
used to dump pdp contexts. It may be re-executed because of the dump packet
size. If the dump packet size is too big, it saves the current dump pointer
(gtp interface pointer, bucket, TID value) and then restarts the dump from
the last pointer.
The current GTP code allows adding a zero TID pdp context, but the dump code
ignores a zero TID value. So, the last dump pointer will not be found.
In addition, this patch adds missing rcu_read_lock() in
gtp_genl_dump_pdp().
Fixes:
459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 12 Dec 2019 20:55:29 +0000 (12:55 -0800)]
tcp: do not send empty skb from tcp_write_xmit()
[ Upstream commit
1f85e6267caca44b30c54711652b0726fadbb131 ]
Backport of commit
fdfc5c8594c2 ("tcp: remove empty skb from
write queue in error cases") in linux-4.14 stable triggered
various bugs. One of them has been fixed in commit
ba2ddb43f270
("tcp: Don't dequeue SYN/FIN-segments from write-queue"), but
we still have crashes on some occasions.
The root cause is that when tcp_sendmsg() has allocated a fresh
skb and could not append a fragment before being blocked
in sk_stream_wait_memory(), tcp_write_xmit() might be called
and decide to send this fresh and empty skb.
Sending an empty packet is not only silly, it might have caused
many issues we had in the past with tp->packets_out being
out of sync.
Fixes:
c65f7f00c587 ("[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Sat, 14 Dec 2019 02:20:41 +0000 (18:20 -0800)]
tcp/dccp: fix possible race __inet_lookup_established()
commit
8dbd76e79a16b45b2ccb01d2f2e08dbf64e71e40 upstream.
Michal Kubecek and Firo Yang did a very nice analysis of crashes
happening in __inet_lookup_established().
Since a TCP socket can go from TCP_ESTABLISHED to TCP_LISTEN
(via a close()/socket()/listen() cycle) without a RCU grace period,
I should not have changed listeners linkage in their hash table.
They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt),
so that a lookup can detect that a socket in a hash list was moved to
another one.
Since we added code in commit
d296ba60d8e2 ("soreuseport: Resolve
merge conflict for v4/v6 ordering fix"), we have to add
hlist_nulls_add_tail_rcu() helper.
Fixes:
3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Firo Yang <firo.yang@suse.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[stable-4.14: we also need to update code in __inet_lookup_listener() and
inet6_lookup_listener() which has been removed in 5.0-rc1.]
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taehee Yoo [Wed, 11 Dec 2019 08:23:00 +0000 (08:23 +0000)]
gtp: do not allow adding duplicate tid and ms_addr pdp context
[ Upstream commit
6b01b1d9b2d38dc84ac398bfe9f00baff06a31e5 ]
The GTP RX packet path looks up the pdp context by TID. If duplicate TID
pdp contexts exist in the list, it cannot select the correct pdp context.
So the TID value must be unique.
The GTP TX packet path looks up the pdp context by ms_addr. If duplicate
ms_addr pdp contexts exist in the list, it cannot select the correct pdp
context. So the ms_addr value must be unique.
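A minimal sketch of the uniqueness rule, assuming a flat array instead of the driver's hash tables. struct pdp_ctx, struct pdp_table, pdp_add() and MAX_PDP are hypothetical names used only for illustration; the driver's real structures and error codes differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PDP 8	/* arbitrary table size for this sketch */

/* Hypothetical, simplified PDP context; the driver's struct differs. */
struct pdp_ctx {
	uint64_t tid;		/* key used by the RX path */
	uint32_t ms_addr;	/* key used by the TX path */
};

struct pdp_table {
	struct pdp_ctx ctx[MAX_PDP];
	size_t count;
};

/* Add a context only if neither lookup key is already present. */
static bool pdp_add(struct pdp_table *t, uint64_t tid, uint32_t ms_addr)
{
	assert(t != NULL);
	if (t->count == MAX_PDP)
		return false;
	for (size_t i = 0; i < t->count; i++) {
		if (t->ctx[i].tid == tid || t->ctx[i].ms_addr == ms_addr)
			return false;	/* duplicate key: refuse the add */
	}
	t->ctx[t->count].tid = tid;
	t->ctx[t->count].ms_addr = ms_addr;
	t->count++;
	return true;
}
```

Rejecting at insert time keeps both lookups unambiguous, because each key can match at most one context.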
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:15 +0000 (10:51 +0800)]
sit: do not confirm neighbor when do pmtu update
[ Upstream commit 4d42df46d6372ece4cb4279870b46c2ea7304a47 ]
When doing an IPv6 tunnel PMTU update that ends up calling
__ip6_rt_update_pmtu(), we should not call dst_confirm_neigh(), as there is
no two-way communication.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:14 +0000 (10:51 +0800)]
vti: do not confirm neighbor when do pmtu update
[ Upstream commit 8247a79efa2f28b44329f363272550c1738377de ]
When doing an IPv6 tunnel PMTU update that ends up calling
__ip6_rt_update_pmtu(), we should not call dst_confirm_neigh(), as there is
no two-way communication.
As Guillaume pointed out, vti and vti6 are immune to this problem because
they are IFF_NOARP interfaces. Still, it makes no sense to confirm the
neighbour here.
v5: Update commit description.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:13 +0000 (10:51 +0800)]
tunnel: do not confirm neighbor when do pmtu update
[ Upstream commit 7a1592bcb15d71400a98632727791d1e68ea0ee8 ]
When doing a tunnel PMTU update that ends up calling
__ip6_rt_update_pmtu(), we should not call dst_confirm_neigh(), as there is
no two-way communication.
v5: No Change.
v4: Update commit description
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Fixes: 0dec879f636f ("net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP")
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Tested-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:12 +0000 (10:51 +0800)]
net/dst: add new function skb_dst_update_pmtu_no_confirm
[ Upstream commit 07dc35c6e3cc3c001915d05f5bf21f80a39a0970 ]
Add a new function skb_dst_update_pmtu_no_confirm() for callers that need
to update the pmtu but should not confirm the neighbor.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:11 +0000 (10:51 +0800)]
gtp: do not confirm neighbor when do pmtu update
[ Upstream commit 6e9105c73f8d2163d12d5dfd762fd75483ed30f5 ]
When doing an IPv6 tunnel PMTU update that ends up calling
__ip6_rt_update_pmtu(), we should not call dst_confirm_neigh(), as there is
no two-way communication.
Although GTP only supports IPv4 right now, and __ip_rt_update_pmtu() does not
call dst_confirm_neigh(), we still set it to false to stay consistent with
the IPv6 code.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:10 +0000 (10:51 +0800)]
ip6_gre: do not confirm neighbor when do pmtu update
[ Upstream commit 675d76ad0ad5bf41c9a129772ef0aba8f57ea9a7 ]
When we do an ipv6 gre pmtu update, we currently also do a neigh confirm.
This causes the neigh cache to be refreshed and set to REACHABLE before
xmit.
But if the remote mac address changed, e.g. the device was deleted and
recreated, we will not be able to notice this and will still use the old mac
address, as the neigh cache is REACHABLE.
Fix this by disabling neigh confirm when doing a pmtu update.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hangbin Liu [Sun, 22 Dec 2019 02:51:09 +0000 (10:51 +0800)]
net: add bool confirm_neigh parameter for dst_ops.update_pmtu
[ Upstream commit bd085ef678b2cc8c38c105673dfe8ff8f5ec0c57 ]
The MTU update code is supposed to be invoked in response to real
networking events that update the PMTU. In IPv6 PMTU update function
__ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
confirmed time.
But the tunnel code calls the pmtu update before xmit, like:
- tnl_update_pmtu()
  - skb_dst_update_pmtu()
    - ip6_rt_update_pmtu()
      - __ip6_rt_update_pmtu()
        - dst_confirm_neigh()
If the tunnel remote dst mac address changed and we still do the neigh
confirm, we will not be able to update the neigh cache and ping6 to the
remote will fail.
So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
should not be invoking dst_confirm_neigh() as we have no evidence
of successful two-way communication at this point.
On the other hand it is also important to keep the neigh reachability fresh
for TCP flows, so we cannot remove this dst_confirm_neigh() call.
To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
to choose whether we should do neigh update or not. I will add the parameter
in this patch and set all the callers to true to comply with the previous
way, and fix the tunnel code one by one on later patches.
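The shape of the change can be illustrated with a toy model. Everything below is a hypothetical stand-in (struct toy_dst and the toy_* functions are not kernel symbols): one update routine gains a bool, and two thin wrappers let callers state whether they have evidence of two-way communication.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a route entry; not the kernel's dst_entry. */
struct toy_dst {
	unsigned int pmtu;
	unsigned int neigh_confirms;	/* counts reachability confirmations */
};

/* Update the path MTU; refresh the neighbour only when asked to. */
static void toy_update_pmtu(struct toy_dst *dst, unsigned int mtu,
			    bool confirm_neigh)
{
	assert(dst != NULL);
	if (confirm_neigh)
		dst->neigh_confirms++;
	if (mtu < dst->pmtu)
		dst->pmtu = mtu;
}

/* Caller with evidence of two-way traffic, e.g. a TCP flow. */
static void toy_update_pmtu_confirm(struct toy_dst *dst, unsigned int mtu)
{
	toy_update_pmtu(dst, mtu, true);
}

/* Tunnel transmit path: no evidence that the peer is reachable. */
static void toy_update_pmtu_no_confirm(struct toy_dst *dst, unsigned int mtu)
{
	toy_update_pmtu(dst, mtu, false);
}
```

Keeping the confirming variant as the default preserves the old behaviour for every caller that is not converted.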
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Suggested-by: David Miller <davem@davemloft.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefano Garzarella [Fri, 6 Dec 2019 14:39:12 +0000 (15:39 +0100)]
vhost/vsock: accept only packets with the right dst_cid
[ Upstream commit 8a3cc29c316c17de590e3ff8b59f3d6cbfd37b0a ]
When we receive a new packet from the guest, we check if the
src_cid is correct, but we forgot to check the dst_cid.
The host should accept only packets where dst_cid is
equal to the host CID.
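As a rough sketch of the combined check, assuming a reduced header that carries only the two address fields. struct toy_vsock_hdr and toy_accept_from_guest() are hypothetical names; HOST_CID uses the well-known host CID value 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_CID 2u	/* same value as VMADDR_CID_HOST */

/* Hypothetical, reduced packet header with only the two address fields. */
struct toy_vsock_hdr {
	uint64_t src_cid;
	uint64_t dst_cid;
};

/* Accept a guest packet only if both the source and destination match. */
static bool toy_accept_from_guest(const struct toy_vsock_hdr *hdr,
				  uint64_t guest_cid)
{
	assert(hdr != NULL);
	return hdr->src_cid == guest_cid && hdr->dst_cid == HOST_CID;
}
```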
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Antonio Messina [Thu, 19 Dec 2019 14:08:03 +0000 (15:08 +0100)]
udp: fix integer overflow while computing available space in sk_rcvbuf
[ Upstream commit feed8a4fc9d46c3126fb9fcae0e9248270c6321a ]
When the size of the receive buffer for a socket is close to 2^31, we
might hit an integer overflow when computing whether there is enough
space in the buffer to copy a packet from the queue.
When a user sets net.core.rmem_default to a value close to 2^31, UDP
packets are dropped because of this overflow. This can be visible, for
instance, as a failure to resolve hostnames.
This can be fixed by casting sk_rcvbuf (which is an int) to unsigned
int, similarly to how it is done in TCP.
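The arithmetic can be checked in userspace. This sketch assumes a 32-bit int; int_sum_overflows() and over_limit() are hypothetical helpers, not the kernel's code. The first shows that the signed sum no longer fits in an int, the second does the comparison with the limit computed in unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True when size + rcvbuf does not fit in an int (signed overflow). */
static bool int_sum_overflows(int size, int rcvbuf)
{
	assert(size >= 0 && rcvbuf >= 0);
	return (int64_t)size + (int64_t)rcvbuf > INT_MAX;
}

/* Drop check with the limit computed in unsigned int arithmetic. */
static bool over_limit(int rmem, int size, int rcvbuf)
{
	assert(rmem >= 0 && size >= 0 && rcvbuf >= 0);
	return (unsigned int)rmem > (unsigned int)size + (unsigned int)rcvbuf;
}
```

With rcvbuf just below 2^31 the signed sum overflows, while the unsigned sum of two non-negative int values always fits, so the comparison stays meaningful.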
Signed-off-by: Antonio Messina <amessina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vladis Dronov [Fri, 27 Dec 2019 02:26:27 +0000 (03:26 +0100)]
ptp: fix the race between the release of ptp_clock and cdev
[ Upstream commit a33121e5487b424339636b25c35d3a180eaa5f5e ]
In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
device is removed, closing this file leads to a race. This reproduces
easily in a kvm virtual machine:
ts# cat openptp0.c
int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
ts# uname -r
5.5.0-rc3-46cf053e
ts# cat /proc/cmdline
... slub_debug=FZP
ts# modprobe ptp_kvm
ts# ./openptp0 &
[1] 670
opened /dev/ptp0, sleeping 10s...
ts# rmmod ptp_kvm
ts# ls /dev/ptp*
ls: cannot access '/dev/ptp*': No such file or directory
ts# ...woken up
[ 48.010809] general protection fault: 0000 [#1] SMP
[ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
[ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
[ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80
[ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
[ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
[ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
[ 48.019470] ... ^^^ a slub poison
[ 48.023854] Call Trace:
[ 48.024050] __fput+0x21f/0x240
[ 48.024288] task_work_run+0x79/0x90
[ 48.024555] do_exit+0x2af/0xab0
[ 48.024799] ? vfs_write+0x16a/0x190
[ 48.025082] do_group_exit+0x35/0x90
[ 48.025387] __x64_sys_exit_group+0xf/0x10
[ 48.025737] do_syscall_64+0x3d/0x130
[ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 48.026479] RIP: 0033:0x7f53b12082f6
[ 48.026792] ...
[ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
[ 48.045001] Fixing recursive fault but reboot is needed!
This happens in:
static void __fput(struct file *file)
{   ...
    if (file->f_op->release)
        file->f_op->release(inode, file); <<< cdev is kfree'd here
    if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
                 !(mode & FMODE_PATH))) {
        cdev_put(inode->i_cdev); <<< cdev fields are accessed here
Namely:
__fput()
  posix_clock_release()
    kref_put(&clk->kref, delete_clock) <<< the last reference
      delete_clock()
        delete_ptp_clock()
          kfree(ptp) <<< cdev is embedded in ptp
  cdev_put
    module_put(p->owner) <<< *p is kfree'd, bang!
Here cdev is embedded in posix_clock which is embedded in ptp_clock.
The race happens because ptp_clock's lifetime is controlled by two
refcounts: kref and cdev.kobj in posix_clock. This is wrong.
Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
created especially for such cases. This way the parent device with its
ptp_clock is not released until all references to the cdev are released.
This adds a requirement that an initialized but not exposed struct
device should be provided to posix_clock_register() by a caller instead
of a simple dev_t.
This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix
the race between the release of watchdog_core_data and cdev"). See
details of the implementation in the commit 233ed09d7fda ("chardev: add
helper function to register char devs with a struct device").
Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
Analyzed-by: Stephen Johnston <sjohnsto@redhat.com>
Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vladyslav Tarasiuk [Thu, 26 Dec 2019 08:41:56 +0000 (10:41 +0200)]
net/mlxfw: Fix out-of-memory error in mfa2 flash burning
[ Upstream commit a5bcd72e054aabb93ddc51ed8cde36a5bfc50271 ]
The burning process requires internal allocations of large chunks of
memory. This memory doesn't need to be contiguous and can be safely
allocated by vzalloc() instead of kzalloc(). This patch changes these
allocations to avoid a possible out-of-memory failure.
Fixes: 410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Netanel Belgazal [Tue, 10 Dec 2019 11:27:44 +0000 (11:27 +0000)]
net: ena: fix napi handler misbehavior when the napi budget is zero
[ Upstream commit 24dee0c7478d1a1e00abdf5625b7f921467325dc ]
In netpoll the napi handler could be called with budget equal to zero.
Current ENA napi handler doesn't take that into consideration.
The napi handler handles Rx packets in a do-while loop.
Currently, the budget check happens only after decrementing the
budget, therefore the napi handler, in rare cases, could run over
MAX_INT packets.
In addition to that, this moves all budget related variables to int
calculation and stops mixing in u32, to avoid ambiguity.
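The loop-shape problem can be modelled in a few lines. This is a hypothetical model, not the ENA driver's code: poll_check_after() tests an unsigned budget only after decrementing it, poll_check_before() uses an int budget and tests it before doing any work. The pending argument stands for the packets waiting in the ring.

```c
#include <assert.h>

/* Old loop shape: the budget is only tested after it is decremented. */
static unsigned int poll_check_after(unsigned int budget, unsigned int pending)
{
	unsigned int done = 0;

	do {
		if (pending == 0)
			break;
		pending--;
		done++;
	} while (--budget);	/* a budget of 0 wraps to UINT_MAX here */

	return done;
}

/* Fixed loop shape: int budget, tested before any work is done. */
static int poll_check_before(int budget, int pending)
{
	int done = 0;

	assert(pending >= 0);
	while (done < budget && pending > 0) {
		pending--;
		done++;
	}
	return done;
}
```

With a budget of zero the first variant still drains every pending packet, while the second returns immediately without touching the ring.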
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans de Goede [Tue, 19 Nov 2019 15:46:41 +0000 (16:46 +0100)]
pinctrl: baytrail: Really serialize all register accesses
[ Upstream commit 40ecab551232972a39cdd8b6f17ede54a3fdb296 ]
Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
added a spinlock around all register accesses because:
"There is a hardware issue in Intel Baytrail where concurrent GPIO register
access might result reads of 0xffffffff and writes might get dropped
completely."
Testing has shown that this does not catch all cases; 2 problems
remain:
1) The original fix uses a spinlock per byt_gpio device / struct.
Additional testing has shown that this is not sufficient: concurrent
accesses to 2 different GPIO banks also suffer from the same problem.
This commit fixes this by moving to a single global lock.
2) The original fix did not add a lock around the register accesses in
the suspend/resume handling.
Since pinctrl-baytrail.c is using normal suspend/resume handlers,
interrupts are still enabled during suspend/resume handling. Nothing
should be using the GPIOs when they are being taken down, _but_ the
GPIOs themselves may still cause interrupts, which are likely to
use (read) the triggering GPIO. So we need to protect against
concurrent GPIO register accesses in the suspend/resume handlers too.
This commit fixes this by adding the missing spin_lock / unlock calls.
The 2 fixes together fix the Acer Switch 10 SW5-012 getting completely
confused after a suspend resume. The DSDT for this device has a bug
in its _LID method which reprograms the home and power button trigger-
flags requesting both high and low _level_ interrupts so the IRQs for
these 2 GPIOs continuously fire. This combined with the saving of
registers during suspend, triggers concurrent GPIO register accesses
resulting in saving 0xffffffff as pconf0 value during suspend and then
when restoring this on resume the pinmux settings get all messed up,
resulting in various I2C busses being stuck, the wifi no longer working
and often the tablet simply not coming out of suspend at all.
Cc: stable@vger.kernel.org
Fixes: 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Engraf [Mon, 16 Dec 2019 08:54:03 +0000 (09:54 +0100)]
tty/serial: atmel: fix out of range clock divider handling
[ Upstream commit cb47b9f8630ae3fa3f5fbd0c7003faba7abdf711 ]
Use MCK_DIV8 when the clock divider is > 65535. Unfortunately the mode
register was already written, so the clock selection was ignored.
Fix by doing the baud rate calculation before setting the mode.
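The ordering issue can be sketched as follows. The names are hypothetical (struct toy_uart, toy_set_baud(), TOY_MODE_MCK_DIV8), the bit value is not the real register layout, and the divider formula mck / (16 * baud) is a simplification that ignores the fractional part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MODE_MCK_DIV8 0x10u	/* hypothetical bit, not the real layout */
#define TOY_CD_MAX 65535u	/* largest value a 16-bit divider holds */

struct toy_uart {
	uint32_t mode;	/* stands in for the mode register */
	uint32_t cd;	/* stands in for the baud rate divider */
};

/* Pick the divider and clock source first, then write the mode once. */
static void toy_set_baud(struct toy_uart *u, uint32_t mck, uint32_t baud,
			 uint32_t mode)
{
	uint32_t cd;

	assert(u != NULL && baud != 0);
	cd = mck / (16 * baud);
	if (cd > TOY_CD_MAX) {
		cd /= 8;			/* out of range: use MCK / 8 */
		mode |= TOY_MODE_MCK_DIV8;
	}
	u->mode = mode;	/* written only after the clock source is known */
	u->cd = cd;
}
```

If the mode were stored before the range check, the DIV8 bit set afterwards would never reach the register, which is the failure described above.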
Fixes: 5bf5635ac170 ("tty/serial: atmel: add fractional baud rate support")
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Acked-by: Richard Genoud <richard.genoud@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191216085403.17050-1-david.engraf@sysgo.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe Leroy [Mon, 9 Dec 2019 15:27:27 +0000 (15:27 +0000)]
spi: fsl: don't map irq during probe
[ Upstream commit 3194d2533efffae8b815d84729ecc58b6a9000ab ]
With the latest kernel, the following warning is observed at startup:
[ 1.500609] ------------[ cut here ]------------
[ 1.505225] remove_proc_entry: removing non-empty directory 'irq/22', leaking at least 'fsl_spi'
[ 1.514234] WARNING: CPU: 0 PID: 1 at fs/proc/generic.c:682 remove_proc_entry+0x198/0x1c0
[ 1.522403] CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.0-s3k-dev-02248-g93532430a4ff #2564
[ 1.530724] NIP: c0197694 LR: c0197694 CTR: c0050d80
[ 1.535762] REGS: df4a5af0 TRAP: 0700 Not tainted (5.4.0-02248-g93532430a4ff)
[ 1.543818] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 22028222 XER: 00000000
[ 1.550524]
[ 1.550524] GPR00: c0197694 df4a5ba8 df4a0000 00000054 00000000 00000000 00004a38 00000010
[ 1.550524] GPR08: c07c5a30 00000800 00000000 00001032 22000208 00000000 c0004b14 00000000
[ 1.550524] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c0830000 c07fc078
[ 1.550524] GPR24: c08e8ca0 df665d10 df60ea98 c07c9db8 00000001 df5d5ae3 df5d5a80 df43f8e3
[ 1.585327] NIP [c0197694] remove_proc_entry+0x198/0x1c0
[ 1.590628] LR [c0197694] remove_proc_entry+0x198/0x1c0
[ 1.595829] Call Trace:
[ 1.598280] [df4a5ba8] [c0197694] remove_proc_entry+0x198/0x1c0 (unreliable)
[ 1.605321] [df4a5bd8] [c0067acc] unregister_irq_proc+0x5c/0x70
[ 1.611238] [df4a5bf8] [c005fbc4] free_desc+0x3c/0x80
[ 1.616286] [df4a5c18] [c005fe2c] irq_free_descs+0x70/0xa8
[ 1.621778] [df4a5c38] [c033d3fc] of_fsl_spi_probe+0xdc/0x3cc
[ 1.627525] [df4a5c88] [c02f0f64] platform_drv_probe+0x44/0xa4
[ 1.633350] [df4a5c98] [c02eee44] really_probe+0x1ac/0x418
[ 1.638829] [df4a5cc8] [c02ed3e8] bus_for_each_drv+0x64/0xb0
[ 1.644481] [df4a5cf8] [c02ef950] __device_attach+0xd4/0x128
[ 1.650132] [df4a5d28] [c02ed61c] bus_probe_device+0xa0/0xbc
[ 1.655783] [df4a5d48] [c02ebbe8] device_add+0x544/0x74c
[ 1.661096] [df4a5d88] [c0382b78] of_platform_device_create_pdata+0xa4/0x100
[ 1.668131] [df4a5da8] [c0382cf4] of_platform_bus_create+0x120/0x20c
[ 1.674474] [df4a5df8] [c0382d50] of_platform_bus_create+0x17c/0x20c
[ 1.680818] [df4a5e48] [c0382e88] of_platform_bus_probe+0x9c/0xf0
[ 1.686907] [df4a5e68] [c0751404] __machine_initcall_cmpcpro_cmpcpro_declare_of_platform_devices+0x74/0x1a4
[ 1.696629] [df4a5e98] [c072a4cc] do_one_initcall+0x8c/0x1d4
[ 1.702282] [df4a5ef8] [c072a768] kernel_init_freeable+0x154/0x204
[ 1.708455] [df4a5f28] [c0004b2c] kernel_init+0x18/0x110
[ 1.713769] [df4a5f38] [c00122ac] ret_from_kernel_thread+0x14/0x1c
[ 1.719926] Instruction dump:
[ 1.722889] 2c030000 4182004c 3863ffb0 3c80c05f 80e3005c 388436a0 3c60c06d 7fa6eb78
[ 1.730630] 7fe5fb78 38840280 38634178 4be8c611 <0fe00000> 4bffff6c 3c60c071 7fe4fb78
[ 1.738556] ---[ end trace 05d0720bf2e352e2 ]---
The problem comes from the error path which calls
irq_dispose_mapping() while the IRQ has been requested with
devm_request_irq().
IRQ doesn't need to be mapped with irq_of_parse_and_map(). The only
need is to get the IRQ virtual number. For that, use
of_irq_to_resource() instead of the
irq_of_parse_and_map()/irq_dispose_mapping() pair.
Fixes: 500a32abaf81 ("spi: fsl: Call irq_dispose_mapping in err path")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Link: https://lore.kernel.org/r/518cfb83347d5372748e7fe72f94e2e9443d0d4a.1575905123.git.christophe.leroy@c-s.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric Dumazet [Wed, 6 Nov 2019 17:48:04 +0000 (09:48 -0800)]
hrtimer: Annotate lockless access to timer->state
commit 56144737e67329c9aaed15f942d46a6302e2e3d8 upstream.
syzbot reported various data races caused by hrtimer_is_queued() reading
timer->state. A READ_ONCE() is required there to silence the warning.
Also add the corresponding WRITE_ONCE() when timer->state is set.
In remove_hrtimer() the hrtimer_is_queued() helper is open coded to avoid
loading timer->state twice.
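A userspace analogue of the annotated accesses, assuming C11 relaxed atomics as a stand-in for READ_ONCE()/WRITE_ONCE(). The kernel macros are not C11 atomics, and struct toy_timer, toy_set_state() and toy_is_queued() are hypothetical names; the point is only that both the lockless reader and the writer mark the access.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_STATE_INACTIVE 0x00
#define TOY_STATE_ENQUEUED 0x01

/* Hypothetical timer; only the state byte matters here. */
struct toy_timer {
	_Atomic unsigned char state;
};

/* Writer side, analogous to WRITE_ONCE(timer->state, new_state). */
static void toy_set_state(struct toy_timer *t, unsigned char new_state)
{
	assert(t != NULL);
	atomic_store_explicit(&t->state, new_state, memory_order_relaxed);
}

/* Lockless reader, analogous to READ_ONCE(timer->state). */
static bool toy_is_queued(struct toy_timer *t)
{
	assert(t != NULL);
	return atomic_load_explicit(&t->state, memory_order_relaxed) &
	       TOY_STATE_ENQUEUED;
}
```

A caller that needs the state for more than one test would load it once into a local variable and test that copy, which mirrors the open-coded check mentioned for remove_hrtimer().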
KCSAN reported these cases:
BUG: KCSAN: data-race in __remove_hrtimer / tcp_pacing_check
write to 0xffff8880b2a7d388 of 1 bytes by interrupt on cpu 0:
__remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
__run_hrtimer kernel/time/hrtimer.c:1496 [inline]
__hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
__do_softirq+0x115/0x33f kernel/softirq.c:292
run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
read to 0xffff8880b2a7d388 of 1 bytes by task 24652 on cpu 1:
tcp_pacing_check net/ipv4/tcp_output.c:2235 [inline]
tcp_pacing_check+0xba/0x130 net/ipv4/tcp_output.c:2225
tcp_xmit_retransmit_queue+0x32c/0x5a0 net/ipv4/tcp_output.c:3044
tcp_xmit_recovery+0x7c/0x120 net/ipv4/tcp_input.c:3558
tcp_ack+0x17b6/0x3170 net/ipv4/tcp_input.c:3717
tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696
tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
sk_backlog_rcv include/net/sock.h:945 [inline]
__release_sock+0x135/0x1e0 net/core/sock.c:2435
release_sock+0x61/0x160 net/core/sock.c:2951
sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0x9f/0xc0 net/socket.c:657
BUG: KCSAN: data-race in __remove_hrtimer / __tcp_ack_snd_check
write to 0xffff8880a3a65588 of 1 bytes by interrupt on cpu 0:
__remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
__run_hrtimer kernel/time/hrtimer.c:1496 [inline]
__hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
__do_softirq+0x115/0x33f kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0xbb/0xe0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:536 [inline]
smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
read to 0xffff8880a3a65588 of 1 bytes by task 22891 on cpu 1:
__tcp_ack_snd_check+0x415/0x4f0 net/ipv4/tcp_input.c:5265
tcp_ack_snd_check net/ipv4/tcp_input.c:5287 [inline]
tcp_rcv_established+0x750/0xf50 net/ipv4/tcp_input.c:5708
tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
sk_backlog_rcv include/net/sock.h:945 [inline]
__release_sock+0x135/0x1e0 net/core/sock.c:2435
release_sock+0x61/0x160 net/core/sock.c:2951
sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0x9f/0xc0 net/socket.c:657
__sys_sendto+0x21f/0x320 net/socket.c:1952
__do_sys_sendto net/socket.c:1964 [inline]
__se_sys_sendto net/socket.c:1960 [inline]
__x64_sys_sendto+0x89/0xb0 net/socket.c:1960
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24652 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ tglx: Added comments ]
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191106174804.74723-1-edumazet@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 8 Nov 2019 18:34:47 +0000 (10:34 -0800)]
net: icmp: fix data-race in cmp_global_allow()
commit bbab7ef235031f6733b5429ae7877bfa22339712 upstream.
This code reads two global variables without protection
of a lock. We need READ_ONCE()/WRITE_ONCE() pairs to
avoid load/store-tearing and better document the intent.
KCSAN reported :
BUG: KCSAN: data-race in icmp_global_allow / icmp_global_allow
read to 0xffffffff861a8014 of 4 bytes by task 11201 on cpu 0:
icmp_global_allow+0x36/0x1b0 net/ipv4/icmp.c:254
icmpv6_global_allow net/ipv6/icmp.c:184 [inline]
icmpv6_global_allow net/ipv6/icmp.c:179 [inline]
icmp6_send+0x493/0x1140 net/ipv6/icmp.c:514
icmpv6_send+0x71/0xb0 net/ipv6/ip6_icmp.c:43
ip6_link_failure+0x43/0x180 net/ipv6/route.c:2640
dst_link_failure include/net/dst.h:419 [inline]
vti_xmit net/ipv4/ip_vti.c:243 [inline]
vti_tunnel_xmit+0x27f/0xa50 net/ipv4/ip_vti.c:279
__netdev_start_xmit include/linux/netdevice.h:4420 [inline]
netdev_start_xmit include/linux/netdevice.h:4434 [inline]
xmit_one net/core/dev.c:3280 [inline]
dev_hard_start_xmit+0xef/0x430 net/core/dev.c:3296
__dev_queue_xmit+0x14c9/0x1b60 net/core/dev.c:3873
dev_queue_xmit+0x21/0x30 net/core/dev.c:3906
neigh_direct_output+0x1f/0x30 net/core/neighbour.c:1530
neigh_output include/net/neighbour.h:511 [inline]
ip6_finish_output2+0x7a6/0xec0 net/ipv6/ip6_output.c:116
__ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
__ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
NF_HOOK_COND include/linux/netfilter.h:294 [inline]
ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
dst_output include/net/dst.h:436 [inline]
ip6_local_out+0x74/0x90 net/ipv6/output_core.c:179
write to 0xffffffff861a8014 of 4 bytes by task 11183 on cpu 1:
icmp_global_allow+0x174/0x1b0 net/ipv4/icmp.c:272
icmpv6_global_allow net/ipv6/icmp.c:184 [inline]
icmpv6_global_allow net/ipv6/icmp.c:179 [inline]
icmp6_send+0x493/0x1140 net/ipv6/icmp.c:514
icmpv6_send+0x71/0xb0 net/ipv6/ip6_icmp.c:43
ip6_link_failure+0x43/0x180 net/ipv6/route.c:2640
dst_link_failure include/net/dst.h:419 [inline]
vti_xmit net/ipv4/ip_vti.c:243 [inline]
vti_tunnel_xmit+0x27f/0xa50 net/ipv4/ip_vti.c:279
__netdev_start_xmit include/linux/netdevice.h:4420 [inline]
netdev_start_xmit include/linux/netdevice.h:4434 [inline]
xmit_one net/core/dev.c:3280 [inline]
dev_hard_start_xmit+0xef/0x430 net/core/dev.c:3296
__dev_queue_xmit+0x14c9/0x1b60 net/core/dev.c:3873
dev_queue_xmit+0x21/0x30 net/core/dev.c:3906
neigh_direct_output+0x1f/0x30 net/core/neighbour.c:1530
neigh_output include/net/neighbour.h:511 [inline]
ip6_finish_output2+0x7a6/0xec0 net/ipv6/ip6_output.c:116
__ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
__ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
NF_HOOK_COND include/linux/netfilter.h:294 [inline]
ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 11183 Comm: syz-executor.2 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 4cdf507d5452 ("icmp: add a global rate limitation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 8 Nov 2019 02:49:43 +0000 (18:49 -0800)]
net: add a READ_ONCE() in skb_peek_tail()
commit f8cc62ca3e660ae3fdaee533b1d554297cd2ae82 upstream.
skb_peek_tail() can be used without protection of a lock,
as spotted by KCSAN [1].
In order to avoid load-tearing, add a READ_ONCE().
Note that the corresponding WRITE_ONCE() are already there.
[1]
BUG: KCSAN: data-race in sk_wait_data / skb_queue_tail
read to 0xffff8880b36a4118 of 8 bytes by task 20426 on cpu 1:
skb_peek_tail include/linux/skbuff.h:1784 [inline]
sk_wait_data+0x15b/0x250 net/core/sock.c:2477
kcm_wait_data+0x112/0x1f0 net/kcm/kcmsock.c:1103
kcm_recvmsg+0xac/0x320 net/kcm/kcmsock.c:1130
sock_recvmsg_nosec net/socket.c:871 [inline]
sock_recvmsg net/socket.c:889 [inline]
sock_recvmsg+0x92/0xb0 net/socket.c:885
___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
__sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
__do_sys_recvmmsg net/socket.c:2703 [inline]
__se_sys_recvmmsg net/socket.c:2696 [inline]
__x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9
write to 0xffff8880b36a4118 of 8 bytes by task 451 on cpu 0:
__skb_insert include/linux/skbuff.h:1852 [inline]
__skb_queue_before include/linux/skbuff.h:1958 [inline]
__skb_queue_tail include/linux/skbuff.h:1991 [inline]
skb_queue_tail+0x7e/0xc0 net/core/skbuff.c:3145
kcm_queue_rcv_skb+0x202/0x310 net/kcm/kcmsock.c:206
kcm_rcv_strparser+0x74/0x4b0 net/kcm/kcmsock.c:370
__strp_recv+0x348/0xf50 net/strparser/strparser.c:309
strp_recv+0x84/0xa0 net/strparser/strparser.c:343
tcp_read_sock+0x174/0x5c0 net/ipv4/tcp.c:1639
strp_read_sock+0xd4/0x140 net/strparser/strparser.c:366
do_strp_work net/strparser/strparser.c:414 [inline]
strp_work+0x9a/0xe0 net/strparser/strparser.c:423
process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
worker_thread+0xa0/0x800 kernel/workqueue.c:2415
kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 451 Comm: kworker/u4:3 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: kstrp strp_work
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 7 Nov 2019 18:30:42 +0000 (10:30 -0800)]
inetpeer: fix data-race in inet_putpeer / inet_putpeer
commit 71685eb4ce80ae9c49eff82ca4dd15acab215de9 upstream.
We need to explicitly forbid read/store tearing in inet_peer_gc()
and inet_putpeer().
The following syzbot report reminds us about inet_putpeer()
running without a lock held.
BUG: KCSAN: data-race in inet_putpeer / inet_putpeer
write to 0xffff888121fb2ed0 of 4 bytes by interrupt on cpu 0:
inet_putpeer+0x37/0xa0 net/ipv4/inetpeer.c:240
ip4_frag_free+0x3d/0x50 net/ipv4/ip_fragment.c:102
inet_frag_destroy_rcu+0x58/0x80 net/ipv4/inet_fragment.c:228
__rcu_reclaim kernel/rcu/rcu.h:222 [inline]
rcu_do_batch+0x256/0x5b0 kernel/rcu/tree.c:2157
rcu_core+0x369/0x4d0 kernel/rcu/tree.c:2377
rcu_core_si+0x12/0x20 kernel/rcu/tree.c:2386
__do_softirq+0x115/0x33f kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0xbb/0xe0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:536 [inline]
smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71
arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571
default_idle_call+0x1e/0x40 kernel/sched/idle.c:94
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x1af/0x280 kernel/sched/idle.c:263
write to 0xffff888121fb2ed0 of 4 bytes by interrupt on cpu 1:
inet_putpeer+0x37/0xa0 net/ipv4/inetpeer.c:240
ip4_frag_free+0x3d/0x50 net/ipv4/ip_fragment.c:102
inet_frag_destroy_rcu+0x58/0x80 net/ipv4/inet_fragment.c:228
__rcu_reclaim kernel/rcu/rcu.h:222 [inline]
rcu_do_batch+0x256/0x5b0 kernel/rcu/tree.c:2157
rcu_core+0x369/0x4d0 kernel/rcu/tree.c:2377
rcu_core_si+0x12/0x20 kernel/rcu/tree.c:2386
__do_softirq+0x115/0x33f kernel/softirq.c:292
run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
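For illustration only, here is a minimal userspace sketch of what forbidding
read/store tearing on a lockless timestamp can look like. The READ_ONCE() and
WRITE_ONCE() macros below are simplified stand-ins for the kernel helpers of
the same name, and struct peer, peer_put() and peer_age() are hypothetical
names, not the inetpeer code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(): the
 * volatile access asks the compiler for a single full-width load or store.
 * Uses the GNU C __typeof__ extension, as the kernel does. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, trimmed-down peer: only the timestamp matters here. */
struct peer {
    uint32_t dtime;     /* time of last release, written without a lock */
};

/* Writer side: may run concurrently on several CPUs. */
static void peer_put(struct peer *p, uint32_t now)
{
    assert(p != NULL);
    WRITE_ONCE(p->dtime, now);
}

/* Reader side: age of the entry, tolerant of counter wraparound. */
static uint32_t peer_age(struct peer *p, uint32_t now)
{
    assert(p != NULL);
    return now - READ_ONCE(p->dtime);
}
```

Both writers in the report above store the same kind of value, so the race is
benign once each access is a single untorn load or store.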
Fixes:
4b9d9be839fd ("inetpeer: remove unused list")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Sat, 7 Dec 2019 22:43:39 +0000 (14:43 -0800)]
netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
commit
5604285839aaedfb23ebe297799c6e558939334d upstream.
syzbot is kind enough to remind us we need to call pskb_may_pull().
BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
__msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512
nf_hook include/linux/netfilter.h:260 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
__br_forward+0x78f/0xe30 net/bridge/br_forward.c:109
br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234
br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162
nf_hook_bridge_pre net/bridge/br_input.c:245 [inline]
br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348
__netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830
__netif_receive_skb_one_core net/core/dev.c:4927 [inline]
__netif_receive_skb net/core/dev.c:5043 [inline]
process_backlog+0x610/0x13c0 net/core/dev.c:5874
napi_poll net/core/dev.c:6311 [inline]
net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379
__do_softirq+0x4a1/0x83a kernel/softirq.c:293
do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091
</IRQ>
do_softirq kernel/softirq.c:338 [inline]
__local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190
local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
__dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819
dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825
packet_snd net/packet/af_packet.c:2959 [inline]
packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
__sys_sendto+0xc44/0xc70 net/socket.c:1952
__do_sys_sendto net/socket.c:1964 [inline]
__se_sys_sendto+0x107/0x130 net/socket.c:1960
__x64_sys_sendto+0x6e/0x90 net/socket.c:1960
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45a679
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f0a3c9e5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 000000000045a679
RDX: 000000000000000e RSI: 0000000020000200 RDI: 0000000000000003
RBP: 000000000075bf20 R08: 00000000200000c0 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0a3c9e66d4
R13: 00000000004c8ec1 R14: 00000000004dfe28 R15: 00000000ffffffff
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
slab_alloc_node mm/slub.c:2773 [inline]
__kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
__kmalloc_reserve net/core/skbuff.c:141 [inline]
__alloc_skb+0x306/0xa10 net/core/skbuff.c:209
alloc_skb include/linux/skbuff.h:1049 [inline]
alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662
sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244
packet_alloc_skb net/packet/af_packet.c:2807 [inline]
packet_snd net/packet/af_packet.c:2902 [inline]
packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
__sys_sendto+0xc44/0xc70 net/socket.c:1952
__do_sys_sendto net/socket.c:1964 [inline]
__se_sys_sendto+0x107/0x130 net/socket.c:1960
__x64_sys_sendto+0x6e/0x90 net/socket.c:1960
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
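The idea behind the fix can be sketched in userspace: confirm that the header
bytes are really present before reading any field from them. This is a model
under stated assumptions, not the bridge netfilter code. struct pkt,
pkt_may_pull() and arp_is_ipv4() are hypothetical names, and the model only
checks the length, whereas the kernel's pskb_may_pull() can also pull data
into the linear area.

```c
#include <assert.h>
#include <stddef.h>

#define ARP_HDR_LEN 8   /* fixed part of an ARP header: hrd, pro, hln, pln, op */
#define ARP_PLN_OFF 5   /* byte offset of the protocol address length field */

/* Hypothetical stand-in for a packet buffer; not the kernel's sk_buff. */
struct pkt {
    const unsigned char *data;
    size_t len;             /* bytes actually present and initialized */
};

/* Model of the guard: are at least n bytes available to read? */
static int pkt_may_pull(const struct pkt *p, size_t n)
{
    assert(p != NULL);
    return p->len >= n;
}

/* Returns 1 if the ARP protocol address length is 4 (IPv4),
 * 0 if it is not, and -1 if the packet is too short to tell. */
static int arp_is_ipv4(const struct pkt *p)
{
    if (!pkt_may_pull(p, ARP_HDR_LEN))
        return -1;          /* drop instead of reading uninitialized bytes */
    return p->data[ARP_PLN_OFF] == 4;
}
```

A caller would treat the -1 result as "drop the packet", which is the role
NF_DROP plays in a netfilter hook.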
Fixes:
c4e70a87d975 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 12 Dec 2019 18:32:13 +0000 (10:32 -0800)]
6pack,mkiss: fix possible deadlock
commit
5c9934b6767b16ba60be22ec3cbd4379ad64170d upstream.
We got another syzbot report [1] that tells us we must use
write_lock_irq()/write_unlock_irq() to avoid possible deadlock.
[1]
WARNING: inconsistent lock state
5.5.0-rc1-syzkaller #0 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage.
syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
__raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
_raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319
sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657
tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489
tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585
tiocsetd drivers/tty/tty_io.c:2337 [inline]
tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597
vfs_ioctl fs/ioctl.c:47 [inline]
file_ioctl fs/ioctl.c:545 [inline]
do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732
ksys_ioctl+0xab/0xd0 fs/ioctl.c:749
__do_sys_ioctl fs/ioctl.c:756 [inline]
__se_sys_ioctl fs/ioctl.c:754 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
irq event stamp: 3946
hardirqs last enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199
hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42
softirqs last enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline]
softirqs last enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222
softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline]
softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(disc_data_lock);
<Interrupt>
lock(disc_data_lock);
*** DEADLOCK ***
5 locks held by syz-executor826/9605:
#0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
#1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413
#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116
#3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823
#4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288
stack backtrace:
CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101
valid_state kernel/locking/lockdep.c:3112 [inline]
mark_lock_irq kernel/locking/lockdep.c:3309 [inline]
mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666
mark_usage kernel/locking/lockdep.c:3554 [inline]
__lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
__raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
_raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223
sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402
tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536
tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50
tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387
uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104
serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761
serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834
serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline]
serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850
serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126
__handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149
handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189
handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206
handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830
generic_handle_irq_desc include/linux/irqdesc.h:156 [inline]
do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250
common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607
</IRQ>
RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline]
RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579
Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7
RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7
RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd
RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000
RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899
R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138
R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000
mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
__mutex_lock_common kernel/locking/mutex.c:962 [inline]
__mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121
tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665
__fput+0x2ff/0x890 fs/file_table.c:280
____fput+0x16/0x20 fs/file_table.c:313
task_work_run+0x145/0x1c0 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x8e7/0x2ef0 kernel/exit.c:797
do_group_exit+0x135/0x360 kernel/exit.c:895
__do_sys_exit_group kernel/exit.c:906 [inline]
__se_sys_exit_group kernel/exit.c:904 [inline]
__x64_sys_exit_group+0x44/0x50 kernel/exit.c:904
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43fef8
Code: Bad RIP value.
RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
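As a reading aid for the "inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R}" line
in the report above, here is a toy model of the usage-state bookkeeping. It is
not lockdep's implementation: it tracks only two states, ignores the
read/write distinction, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Two of the usage states tracked per lock class (toy subset). */
enum {
    USED_HARDIRQ_ON = 1 << 0,  /* taken while hard interrupts were enabled */
    USED_IN_HARDIRQ = 1 << 1   /* taken from hard interrupt context */
};

struct lock_class {
    unsigned int usage;
};

/* Record one acquisition. Returns -1 once both states have been seen,
 * because an interrupt could then arrive while the lock is held and
 * try to take it again on the same CPU. Returns 0 otherwise. */
static int record_acquire(struct lock_class *l, int in_hardirq, int hardirqs_on)
{
    assert(l != NULL);
    if (in_hardirq)
        l->usage |= USED_IN_HARDIRQ;
    else if (hardirqs_on)
        l->usage |= USED_HARDIRQ_ON;

    if ((l->usage & USED_HARDIRQ_ON) && (l->usage & USED_IN_HARDIRQ))
        return -1;
    return 0;
}
```

In this model a process-context acquisition made with only bottom halves
disabled still has hard interrupts on, so a later acquisition from the serial
interrupt path is flagged. An acquisition made with hard interrupts off never
sets the first state, which is why the _irq lock variants avoid the warning.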
Fixes:
6e4e2f811bad ("6pack,mkiss: fix lock inconsistency")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Westphal [Sun, 15 Dec 2019 02:49:25 +0000 (03:49 +0100)]
netfilter: ebtables: compat: reject all padding in matches/watchers
commit
e608f631f0ba5f1fc5ee2e260a3a35d13107cbfe upstream.
syzbot reported the following splat:
BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937
CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0
size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249
compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333
[..]
Because padding isn't considered during computation of ->buf_user_offset,
"total" is decremented by fewer bytes than it should be.
Therefore, the first part of
if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry))
will pass when it should not. This causes an out-of-bounds access:
entry->next_offset is past the vmalloced size.
Reject padding and check that the computed user offset (the sum of the
ebt_entry structure plus all individual matches/watchers/targets) is the
same value that userspace gave us as the offset of the next entry.
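The consistency check described above can be sketched as follows. This is a
simplified userspace model: struct entry_hdr and check_entry_size() are
hypothetical, and the real ebt_entry layout and the compat translation code
are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified entry header; the real ebt_entry has more fields. */
struct entry_hdr {
    unsigned int next_offset;   /* offset of the next entry, as given by userspace */
};

/* Sum the header plus every match/watcher/target blob and require that the
 * result equals next_offset exactly, so that no unaccounted padding can hide
 * between the last blob and the next entry.
 * Returns 0 when consistent, -1 (think -EINVAL) otherwise. */
static int check_entry_size(const struct entry_hdr *e,
                            const unsigned int *blob_sizes, size_t nblobs)
{
    unsigned int computed = sizeof(*e);
    size_t i;

    assert(e != NULL);
    if (e->next_offset < computed)
        return -1;
    for (i = 0; i < nblobs; i++) {
        /* computed <= next_offset holds here, so the subtraction cannot wrap */
        if (blob_sizes[i] > e->next_offset - computed)
            return -1;
        computed += blob_sizes[i];
    }
    return computed == e->next_offset ? 0 : -1;
}
```

An entry whose blobs add up to less than next_offset is rejected just like one
whose blobs overrun it, which is the "reject all padding" part of the fix.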
Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com
Fixes:
81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Fri, 18 Oct 2019 22:41:16 +0000 (18:41 -0400)]
filldir[64]: remove WARN_ON_ONCE() for bad directory entries
commit
b9959c7a347d6adbb558fba7e36e9fef3cba3b07 upstream.
This was always meant to be a temporary thing, just for testing and to
see if it actually ever triggered.
The only thing that reported it was syzbot doing disk image fuzzing, and
then that warning is expected. So let's just remove it before -rc4,
because the extra sanity testing should probably go to -stable, but we
don't want the warning to do so.
Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com
Fixes:
8a23eb804ca4 ("Make filldir[64]() verify the directory entry filename is valid")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Sat, 5 Oct 2019 18:32:52 +0000 (11:32 -0700)]
Make filldir[64]() verify the directory entry filename is valid
commit
8a23eb804ca4f2be909e372cf5a9e7b30ae476cd upstream.
This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().
This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.
There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.
There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption. It's really more
"protect user space from bad behavior" as pointed out by Jann. But
since Eric Biederman looked up the POSIX wording, here it is for context:
"From readdir:
The readdir() function shall return a pointer to a structure
representing the directory entry at the current position in the
directory stream specified by the argument dirp, and position the
directory stream at the next entry. It shall return a null pointer
upon reaching the end of the directory stream. The structure dirent
defined in the <dirent.h> header describes a directory entry.
From definitions:
3.129 Directory Entry (or Link)
An object that associates a filename with a file. Several directory
entries can associate names with the same file.
...
3.169 Filename
A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
characters composing the name may be selected from the set of all
character values excluding the slash character and the null byte. The
filenames dot and dot-dot have special meaning. A filename is
sometimes referred to as a 'pathname component'."
Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.
Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.
We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')
See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.
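A minimal userspace sketch of such a name check follows. Note the assumption:
it rejects both '/' and NUL, the stricter variant discussed above, whereas the
text says the in-kernel code currently checks only for '/'. The function name
verify_name() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* A valid directory entry name is non-empty and contains neither '/' nor a
 * NUL byte within its stated length. memchr() looks for one byte value per
 * pass, so checking both patterns costs two passes over the name. */
static int verify_name(const char *name, size_t len)
{
    assert(name != NULL);
    if (len == 0)
        return -EIO;
    if (memchr(name, '/', len))
        return -EIO;
    if (memchr(name, '\0', len))
        return -EIO;
    return 0;
}
```

The two memchr() passes are the cost that a word-at-a-time scan, as mentioned
for hash_name(), would avoid.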
Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mattias Jacobsson [Sat, 29 Dec 2018 14:17:50 +0000 (15:17 +0100)]
perf strbuf: Remove redundant va_end() in strbuf_addv()
commit
099be748865eece21362aee416c350c0b1ae34df upstream.
Each call to va_copy() should have one, and only one, corresponding call
to va_end(). In strbuf_addv() some code paths result in va_end() getting
called multiple times. Remove the superfluous va_end().
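To make the pairing rule concrete, here is a small self-contained sketch of a
format-and-allocate helper in which the copy made by va_copy() is released by
exactly one va_end() on every path. It is not the perf strbuf code; the names
vfmt_alloc() and fmt_alloc() are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string. The first vsnprintf() pass only
 * measures the output and consumes a copy of the argument list; the second
 * pass uses the caller's list. */
static char *vfmt_alloc(const char *fmt, va_list ap)
{
    va_list ap_saved;
    char *buf;
    int len;

    assert(fmt != NULL);
    va_copy(ap_saved, ap);
    len = vsnprintf(NULL, 0, fmt, ap_saved);
    va_end(ap_saved);           /* the single va_end() for ap_saved */
    if (len < 0)
        return NULL;

    buf = malloc((size_t)len + 1);
    if (!buf)
        return NULL;
    vsnprintf(buf, (size_t)len + 1, fmt, ap);
    return buf;
}

static char *fmt_alloc(const char *fmt, ...)
{
    va_list ap;
    char *s;

    va_start(ap, fmt);
    s = vfmt_alloc(fmt, ap);
    va_end(ap);                 /* pairs with va_start() */
    return s;
}
```

Calling va_end() immediately after the last use of the copy, before any early
return, keeps the error paths from needing their own cleanup.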
Signed-off-by: Mattias Jacobsson <2pi@mok.nu>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sanskriti Sharma <sansharm@redhat.com>
Link: http://lkml.kernel.org/r/20181229141750.16945-1-2pi@mok.nu
Fixes:
ce49d8436cff ("perf strbuf: Match va_{add,copy} with va_end")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mahesh Bandewar [Fri, 6 Dec 2019 23:44:55 +0000 (15:44 -0800)]
bonding: fix active-backup transition after link failure
[ Upstream commit
5d485ed88d48f8101a2067348e267c0aaf4ed486 ]
After the recent fix in commit 1899bb325149 ("bonding: fix state
transition issue in link monitoring"), the active-backup mode with
miimon initially comes up fine, but after a link failure both members
transition into the backup state.
The following steps reproduce the scenario (eth1 and eth2 are the
slaves of the bond):
ip link set eth1 up
ip link set eth2 down
sleep 1
ip link set eth2 up
ip link set eth1 down
cat /sys/class/net/eth1/bonding_slave/state
cat /sys/class/net/eth2/bonding_slave/state
Fixes:
1899bb325149 ("bonding: fix state transition issue in link monitoring")
CC: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Takashi Iwai [Mon, 16 Dec 2019 15:12:24 +0000 (16:12 +0100)]
ALSA: hda - Downgrade error message for single-cmd fallback
[ Upstream commit
475feec0c41ad71cb7d02f0310e56256606b57c5 ]
We made the error message for the CORB/RIRB communication clearer by
upgrading it to dev_WARN() so that users would notice it more easily. But
this came back at us like a boomerang: syzbot caught it and reported it as
a fatal issue, although it is not a bug serious enough to warrant
stopping the whole system.
OK, OK, let's be soft and downgrade it to the standard dev_err() again.
Fixes:
dd65f7e19c69 ("ALSA: hda - Show the fatal CORB/RIRB error more clearly")
Reported-by: syzbot+b3028ac3933f5c466389@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20191216151224.30013-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Marco Oliverio [Mon, 2 Dec 2019 18:54:30 +0000 (19:54 +0100)]
netfilter: nf_queue: enqueue skbs with NULL dst
[ Upstream commit
0b9173f4688dfa7c5d723426be1d979c24ce3d51 ]
Bridge packets that are forwarded have skb->dst == NULL and get
dropped by the check introduced by
b60a77386b1d4868f72f6353d35dabe5fbe981f2 (net: make skb_dst_force
return true when dst is refcounted).
To fix this we check skb_dst() before skb_dst_force(), so we don't
drop skbs with dst == NULL. This also holds for skbs at the
PRE_ROUTING hook, so we remove the second check.
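The ordering of the two checks can be modelled in a few lines of userspace C.
Everything here is a hypothetical stand-in (struct dst, struct skb,
dst_force(), enqueue_check()); only the shape of the condition reflects the
description above, and the -ENETDOWN return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's dst_entry or sk_buff. */
struct dst {
    int alive;          /* nonzero if a reference can still be taken */
};

struct skb {
    struct dst *dst;    /* NULL for forwarded bridge packets */
};

/* Model of "force a reference": succeeds only for a live, non-NULL dst. */
static int dst_force(struct skb *skb)
{
    return skb->dst != NULL && skb->dst->alive;
}

/* Only packets that HAVE a dst are required to keep it; a packet without
 * one is queued as-is instead of being dropped. */
static int enqueue_check(struct skb *skb)
{
    assert(skb != NULL);
    if (skb->dst != NULL && !dst_force(skb))
        return -ENETDOWN;
    return 0;
}
```

Without the leading NULL test, the model's dst_force() would fail for every
bridged packet, which is the drop this commit describes.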
Fixes:
b60a77386b1d ("net: make skb_dst_force return true when dst is refcounted")
Signed-off-by: Marco Oliverio <marco.oliverio@tanaza.com>
Signed-off-by: Rocco Folino <rocco.folino@tanaza.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Alexander Lobakin [Wed, 18 Dec 2019 09:18:21 +0000 (12:18 +0300)]
net, sysctl: Fix compiler warning when only cBPF is present
[ Upstream commit
1148f9adbe71415836a18a36c1b4ece999ab0973 ]
proc_dointvec_minmax_bpf_restricted() was first introduced
in commit 2e4a30983b0f ("bpf: restrict access to core bpf sysctls")
under CONFIG_HAVE_EBPF_JIT. Then this ifdef was removed in
ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv
allocations"), because a new sysctl, bpf_jit_limit, made use of it.
Finally, this parameter became long instead of integer with
fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K"),
and thus a new proc_dolongvec_minmax_bpf_restricted() was
added.
With this last change, proc_dointvec_minmax_bpf_restricted() is
once again used only under CONFIG_HAVE_EBPF_JIT, but the
corresponding ifdef has not been brought back.
So, in configurations like CONFIG_BPF_JIT=y && CONFIG_HAVE_EBPF_JIT=n,
since v4.20 we have:
CC net/core/sysctl_net_core.o
net/core/sysctl_net_core.c:292:1: warning: ‘proc_dointvec_minmax_bpf_restricted’ defined but not used [-Wunused-function]
292 | proc_dointvec_minmax_bpf_restricted(struct ctl_table *table, int write,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Suppress this by guarding it with CONFIG_HAVE_EBPF_JIT again.
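The general pattern, guarding a static helper with the same condition as its
only caller, can be sketched as below. HAVE_FEATURE_X is a hypothetical macro
standing in for CONFIG_HAVE_EBPF_JIT, and both function names are invented for
the example.

```c
#include <assert.h>

/* Define HAVE_FEATURE_X on the compiler command line (-DHAVE_FEATURE_X)
 * to build the configuration that uses the helper. */

#ifdef HAVE_FEATURE_X
/* Built only when its sole caller is built, so -Wunused-function stays quiet. */
static int clamp_to_range(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}
#endif

int handle_value(int v)
{
#ifdef HAVE_FEATURE_X
    return clamp_to_range(v, 0, 2);
#else
    return v;       /* helper is not compiled in this configuration */
#endif
}
```

Dropping the first #ifdef while keeping the second reproduces the warning
shown above when HAVE_FEATURE_X is undefined and -Wunused-function is enabled.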
Fixes:
fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191218091821.7080-1-alobakin@dlink.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan H. Schönherr [Tue, 10 Dec 2019 00:07:30 +0000 (01:07 +0100)]
x86/mce: Fix possibly incorrect severity calculation on AMD
[ Upstream commit
a3a57ddad061acc90bef39635caf2b2330ce8f21 ]
The function mce_severity_amd_smca() requires m->bank to be initialized
for correct operation. Fix the one case where mce_severity() is called
without doing so.
Fixes:
6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes:
d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mike Rapoport [Sun, 1 Dec 2019 01:58:01 +0000 (17:58 -0800)]
userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
[ Upstream commit
3c1c24d91ffd536de0a64688a9df7f49e58fadbc ]
A while ago Andy noticed
(http://lkml.kernel.org/r/CALCETrWY+5ynDct7eU_nDUqx=okQvjm=Y5wJvA4ahBja=CQXGw@mail.gmail.com)
that UFFD_FEATURE_EVENT_FORK used by an unprivileged user may have
security implications.
As the first step of the solution the following patch limits the availability
of UFFD_FEATURE_EVENT_FORK to those having CAP_SYS_PTRACE.
The usage of CAP_SYS_PTRACE ensures compatibility with CRIU.
Yet, if there are other users of non-cooperative userfaultfd that run
without CAP_SYS_PTRACE, they would be broken :(
Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
descriptor table from the read() implementation of uffd, which may have
security implications for unprivileged use of the userfaultfd.
Limit availability of UFFD_FEATURE_EVENT_FORK to callers that have
CAP_SYS_PTRACE.
Link: http://lkml.kernel.org/r/1572967777-8812-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Nosh Minwalla <nosh@google.com>
Cc: Pavel Emelyanov <ovzxemul@gmail.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Johannes Weiner [Sun, 1 Dec 2019 01:56:08 +0000 (17:56 -0800)]
kernel: sysctl: make drop_caches write-only
[ Upstream commit
204cb79ad42f015312a5bbd7012d09c93d9b46fb ]
Currently, the drop_caches proc file and sysctl read back the last value
written, suggesting this is somehow a stateful setting instead of a
one-time command. Make it write-only, like e.g. compact_memory.
While mitigating a VM problem at scale in our fleet, there was confusion
about whether writing to this file will permanently switch the kernel into
a non-caching mode. This influences the decision making in a tense
situation, where tens of people are trying to fix tens of thousands of
affected machines: Do we need a rollback strategy? What are the
performance implications of operating in a non-caching state for several
days? It also caused confusion when the kernel team said we may need to
write the file several times to make sure it's effective ("But it already
reads back 3?").
Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ding Xiang [Sun, 1 Dec 2019 01:49:12 +0000 (17:49 -0800)]
ocfs2: fix passing zero to 'PTR_ERR' warning
[ Upstream commit
188c523e1c271d537f3c9f55b6b65bf4476de32f ]
Fix a static code checker warning:
fs/ocfs2/acl.c:331
ocfs2_acl_chmod() warn: passing zero to 'PTR_ERR'
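Why a checker complains about this can be seen with userspace copies of the
error-pointer helpers. The helper definitions below mirror the kernel's
include/linux/err.h in simplified form; lookup_flagged() and lookup_explicit()
are hypothetical and only illustrate the pattern, since the changed lines are
not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace copies of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr) { return IS_ERR_VALUE((unsigned long)ptr); }
static int IS_ERR_OR_NULL(const void *ptr) { return !ptr || IS_ERR(ptr); }
static long PTR_ERR_OR_ZERO(const void *ptr)
{
    return IS_ERR(ptr) ? PTR_ERR(ptr) : 0;
}

/* Pattern a checker flags: on the NULL branch PTR_ERR() is handed zero,
 * so the function returns 0 and the intent is unclear to the reader. */
static long lookup_flagged(void *obj)
{
    if (IS_ERR(obj) || !obj)
        return PTR_ERR(obj);
    return 1;       /* obj is usable */
}

/* Same behavior, with the NULL-means-zero case spelled out. */
static long lookup_explicit(void *obj)
{
    if (IS_ERR_OR_NULL(obj))
        return PTR_ERR_OR_ZERO(obj);
    return 1;       /* obj is usable */
}
```

Both variants return the same values; the second one simply makes it explicit
that a NULL object is reported as 0 rather than as an error code.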
Link: http://lkml.kernel.org/r/1dee278b-6c96-eec2-ce76-fe6e07c6e20f@linux.alibaba.com
Fixes:
5ee0fbd50fd ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thomas Richter [Fri, 22 Nov 2019 15:43:15 +0000 (16:43 +0100)]
s390/cpum_sf: Check for SDBT and SDB consistency
[ Upstream commit
247f265fa502e7b17a0cb0cc330e055a36aafce4 ]
Each SDBT is located in a 4KB page and contains 512 entries.
Each entry of an SDBT points to an SDB, a 4KB page containing
sampled data. The last entry is a link to another SDBT page.
When an event is created the function sequence executed is:
__hw_perf_event_init()
+--> allocate_buffers()
     +--> realloc_sampling_buffers()
          +---> alloc_sample_data_block()
Both functions realloc_sampling_buffers() and
alloc_sample_data_block() allocate pages and the allocation
can fail. This is handled correctly and all allocated
pages are freed and error -ENOMEM is returned to the
top calling function. Finally the event is not created.
Once the event has been created, the number of initially
allocated SDBTs and SDBs can be too low. This is detected
during measurement interrupt handling, where the number
of lost samples is calculated. If the number of lost samples
is too high considering the sampling frequency and the already
allocated SDBs, the number of SDBs is enlarged during the next
execution of cpumsf_pmu_enable().
If more SDBs need to be allocated, the functions
realloc_sampling_buffers()
+---> alloc_sample_data_block()
are called to allocate more pages. Page allocation may fail
and the returned error is ignored. An SDBT and SDB setup
already exists.
However, the modified SDBTs and SDBs might end up in a situation
where the first entry of an SDBT does not point to an SDB
but to another SDBT, basically an SDBT without payload.
This cannot be handled by the interrupt handler, where an SDBT
must have at least one entry pointing to an SDB.
Add a check to avoid SDBTs without payload (SDBs) when enlarging
the buffer setup.
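A sketch of the kind of check this calls for, under loudly stated assumptions:
table entries are 64-bit words (4096 / 8 = 512 per page), and a link to
another SDBT is tagged by the low bit of the entry. The bit encoding and the
names LINK_BIT, is_link_entry() and sdbt_has_payload() are assumptions made
for the example, not taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SDBT_ENTRIES (4096 / sizeof(uint64_t))  /* 512 entries per 4KB page */
#define LINK_BIT     UINT64_C(0x1)  /* assumption: low bit marks a link entry */

/* Does this entry point onward to another table rather than to an SDB? */
static int is_link_entry(uint64_t entry)
{
    return (entry & LINK_BIT) != 0;
}

/* A table is usable by the interrupt handler only if its first entry
 * points to an SDB, not straight on to the next table. */
static int sdbt_has_payload(const uint64_t *sdbt)
{
    assert(sdbt != NULL);
    return !is_link_entry(sdbt[0]);
}
```

When enlarging the buffer, a table that fails this test would have to be
rejected or completed before it is linked into the chain.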
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masahiro Yamada [Wed, 13 Nov 2019 07:12:02 +0000 (16:12 +0900)]
libfdt: define INT32_MAX and UINT32_MAX in libfdt_env.h
[ Upstream commit
a8de1304b7df30e3a14f2a8b9709bb4ff31a0385 ]
DTC v1.5.1 added references to (U)INT32_MAX.
This is no problem for user-space programs since <stdint.h> defines
(U)INT32_MAX along with (u)int32_t.
For kernel space, libfdt_env.h needs to be adjusted before we
pull in the changes.
In the kernel, we usually use s/u32 instead of (u)int32_t for the
fixed-width types.
Accordingly, we already have S/U32_MAX for their max values.
So, we should not add (U)INT32_MAX to <linux/limits.h> any more.
Instead, add them to the in-kernel libfdt_env.h to compile the
latest libfdt.
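A minimal sketch of the idea, written so that it compiles in user space:
the demo_s32/demo_u32 types and the DEMO_*_MAX macros are stand-ins for
the kernel's s32/u32 and S32_MAX/U32_MAX, and demo_len_fits_int32() is a
hypothetical caller. The exact lines added to libfdt_env.h are not shown
in this message, so treat this as an illustration only.

```c
#include <assert.h>

/* Stand-ins for the kernel's fixed-width types and limits (assumed here). */
typedef int demo_s32;
typedef unsigned int demo_u32;
static_assert(sizeof(demo_u32) == 4, "sketch assumes a 32-bit int");

#define DEMO_U32_MAX ((demo_u32)~0U)
#define DEMO_S32_MAX ((demo_s32)(DEMO_U32_MAX >> 1))

/* Map the C99 names onto the kernel-style limits unless already defined. */
#ifndef INT32_MAX
#define INT32_MAX DEMO_S32_MAX
#endif
#ifndef UINT32_MAX
#define UINT32_MAX DEMO_U32_MAX
#endif

/* Hypothetical caller: check that a length fits a signed 32-bit value. */
static int demo_len_fits_int32(unsigned long len)
{
	return len <= (unsigned long)INT32_MAX;
}
```

Keeping the mapping local to the environment header means the shared
libfdt sources can use the C99 names unchanged in both builds.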
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Harald Freudenberger [Wed, 20 Nov 2019 10:44:31 +0000 (11:44 +0100)]
s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISOR
[ Upstream commit 6733775a92eacd612ac88afa0fd922e4ffeb2bc7 ]
This patch introduces support for a new architected reply
code 0x8B indicating that a hypervisor layer (if any) has
rejected an AP message.
Linux may run as a guest on top of a hypervisor like z/VM
or KVM. So the crypto hardware seen by the AP bus may be
restricted by the hypervisor; for example, only a subset such
as clear key crypto requests may be supported. Other
requests will be filtered out, that is, rejected by the hypervisor.
The new reply code 0x8B will appear in such cases and needs
to be recognized by the AP bus and zcrypt device driver zoo.
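As a rough sketch of what recognizing a reply code can look like, the
following maps codes to errno values. Only the value 0x8B comes from the
text above. The macro and function names, the 0x00 "no error" code, and
the chosen errno values are assumptions for illustration and do not
describe the real driver.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_REPLY_OK                     0x00 /* assumed "no error" value */
#define DEMO_REPLY_FILTERED_BY_HYPERVISOR 0x8B /* value from the commit text */

/* Translate a one-byte reply code into 0 or a negative errno. */
static int demo_reply_to_errno(unsigned int reply_code)
{
	assert(reply_code <= 0xFF); /* reply codes are modelled as one byte */

	switch (reply_code) {
	case DEMO_REPLY_OK:
		return 0;
	case DEMO_REPLY_FILTERED_BY_HYPERVISOR:
		/* Request rejected by the hypervisor; errno is an assumption. */
		return -EINVAL;
	default:
		/* Any other code is treated as retryable in this sketch. */
		return -EAGAIN;
	}
}
```

Giving the new code its own case keeps a filtered request from being
handled like an unknown or retryable error.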
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnaldo Carvalho de Melo [Wed, 27 Nov 2019 13:13:34 +0000 (10:13 -0300)]
perf regs: Make perf_reg_name() return "unknown" instead of NULL
[ Upstream commit 5b596e0ff0e1852197d4c82d3314db5e43126bf7 ]
To avoid breaking the build on arches where this is not wired up,
return "unknown" instead of NULL. All the other features then remain
available, and when this specific routine is used, the "unknown" string
points the user/developer to the need to wire this up on this
particular hardware architecture.
Detected in a container mipsel debian cross build environment, where it
shows up as:
In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2:
/usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cross compiler details:
mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909
Also on mips64:
In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_user__printf' at util/session.c:1139:3,
inlined from 'dump_sample' at util/session.c:1246:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_intr__printf' at util/session.c:1147:3,
inlined from 'dump_sample' at util/session.c:1249:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cross compiler details:
mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909
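The '%-5s' errors above are triggered when the compiler can prove that
a NULL pointer reaches a "%s" style directive. A minimal sketch of the
fallback follows; demo_reg_name() and demo_format_reg() are hypothetical
helpers, and the two-entry name table is made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup: known ids get a name, everything else "unknown". */
static const char *demo_reg_name(int id)
{
	static const char *const names[] = { "r0", "r1" };

	if (id >= 0 && (size_t)id < sizeof(names) / sizeof(names[0]))
		return names[id];
	return "unknown"; /* never NULL, so "%s" always gets a valid string */
}

/* Format one register line; returns what snprintf() returns. */
static int demo_format_reg(char *buf, size_t size, int id, unsigned long value)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%-5s 0x%lx", demo_reg_name(id), value);
}
```

Because the lookup always returns a valid string, callers can pass the
result straight to a printf-style function without a NULL check.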
Fixes: 2bcd355b71da ("perf tools: Add interface to arch registers sets")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Wed, 27 Nov 2019 09:53:21 +0000 (11:53 +0200)]
perf script: Fix brstackinsn for AUXTRACE
[ Upstream commit 0cd032d3b5fcebf5454315400ab310746a81ca53 ]
brstackinsn must be allowed to be set by the user when AUX area data has
been captured because, in that case, the branch stack might be
synthesized on the fly. This fixes the following error:
Before:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
After:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
grep 13759 [002] 8091.310257: 1862 instructions:uH:
5641d58069eb bmexec+0x86b (/bin/grep)
bmexec+2485:
00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
00005641d5806bd6 add %rdi, %rax
00005641d5806bd9 movzxb -0x1(%rax), %edx
00005641d5806bdd cmp %rax, %r14
00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
mismatch of LBR data and executable
00005641d58069c0 movzxb (%r13,%rdx,1), %edi
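The rule behind the before/after behaviour can be sketched as a simple
predicate. The struct and function names here are hypothetical and only
model the decision described in this message; they are not taken from
the perf sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what was recorded in a perf.data file. */
struct demo_session {
	bool all_branch_filter; /* recorded with an all-branch filter (perf record -b) */
	bool has_aux_data;      /* AUX area data (e.g. Intel PT) was captured */
};

/*
 * brstackinsn is usable if real branch stacks with an all-branch filter
 * were recorded, or if AUX area data exists from which a branch stack
 * might be synthesized on the fly.
 */
static bool demo_brstackinsn_allowed(const struct demo_session *s)
{
	assert(s != NULL);
	return s->all_branch_filter || s->has_aux_data;
}
```

In this model the "Before" case corresponds to rejecting a session that
only has AUX area data, which the fix now accepts.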
Fixes: 48d02a1d5c13 ("perf script: Add 'brstackinsn' for branch stacks")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Diego Elio Pettenò [Tue, 19 Nov 2019 21:37:08 +0000 (21:37 +0000)]
cdrom: respect device capabilities during opening action
[ Upstream commit 366ba7c71ef77c08d06b18ad61b26e2df7352338 ]
Reading the TOC only works if the device can play audio; otherwise
these commands fail (and possibly bring the device to an unhealthy
state).
Similarly, cdrom_mmc3_profile() should only be called if the device
supports generic packet commands.
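The two conditions can be sketched as capability checks that decide
which steps run while opening the device. All names and flag values
below are hypothetical and chosen for the example; they are not the
constants used by the cdrom driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits advertised by a device. */
#define DEMO_CAP_PLAY_AUDIO     (1u << 0)
#define DEMO_CAP_GENERIC_PACKET (1u << 1)

/* Hypothetical steps that may run during open. */
#define DEMO_STEP_READ_TOC      (1u << 0)
#define DEMO_STEP_MMC3_PROFILE  (1u << 1)

struct demo_cdrom {
	unsigned int caps; /* bitmask of DEMO_CAP_* values */
};

/* Return the set of open-time steps that are safe for this device. */
static unsigned int demo_open_steps(const struct demo_cdrom *dev)
{
	unsigned int steps = 0;

	assert(dev != NULL);
	if (dev->caps & DEMO_CAP_PLAY_AUDIO)
		steps |= DEMO_STEP_READ_TOC;      /* TOC needs audio support */
	if (dev->caps & DEMO_CAP_GENERIC_PACKET)
		steps |= DEMO_STEP_MMC3_PROFILE;  /* profile needs packet commands */
	return steps;
}
```

A device that advertises neither capability gets no steps at all, so no
unsupported command is ever sent to it.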
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Diego Elio Pettenò <flameeyes@flameeyes.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>