Martin Vuille [Sun, 11 Feb 2018 21:24:20 +0000 (16:24 -0500)]
perf unwind: Unwind with libdw doesn't take symfs into account
[ Upstream commit
3d20c6246690219881786de10d2dda93f616d0ac ]
Path passed to libdw for unwinding doesn't include symfs path
if specified, so unwinding fails because ELF file is not found.
Similar to unwinding with libunwind, pass symsrc_filename instead
of long_name. If there is no symsrc_filename, fallback to long_name.
Signed-off-by: Martin Vuille <jpmv27@aim.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180211212420.18388-1-jpmv27@aim.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nicolas Pitre [Wed, 9 Jan 2019 03:55:01 +0000 (22:55 -0500)]
vt: invoke notifier on screen size change
commit
0c9b1965faddad7534b6974b5b36c4ad37998f8e upstream.
User space using poll() on /dev/vcs devices are not awaken when a
screen size change occurs. Let's fix that.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Hartkopp [Sun, 13 Jan 2019 18:31:43 +0000 (19:31 +0100)]
can: bcm: check timer values before ktime conversion
commit
93171ba6f1deffd82f381d36cb13177872d023f6 upstream.
Kyungtae Kim detected a potential integer overflow in bcm_[rx|tx]_setup()
when the conversion into ktime multiplies the given value with NSEC_PER_USEC
(1000).
Reference: https://marc.info/?l=linux-can&m=
154732118819828&w=2
Add a check for the given tv_usec, so that the value stays below one second.
Additionally limit the tv_sec value to a reasonable value for CAN related
use-cases of 400 days and ensure all values to be positive.
Reported-by: Kyungtae Kim <kt0755@gmail.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org> # >= 2.6.26
Tested-by: Kyungtae Kim <kt0755@gmail.com>
Acked-by: Andre Naujoks <nautsch2@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Schlaegl [Wed, 19 Dec 2018 18:39:58 +0000 (19:39 +0100)]
can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it
commit
7b12c8189a3dc50638e7d53714c88007268d47ef upstream.
This patch revert commit
7da11ba5c506
("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
After introduction of this change we encountered following new error
message on various i.MX plattforms (flexcan):
| flexcan
53fc8000.can can0: __can_get_echo_skb: BUG! Trying to echo non
| existing skb: can_priv::echo_skb[0]
The introduction of the message was a mistake because
priv->echo_skb[idx] = NULL is a perfectly valid in following case: If
CAN_RAW_LOOPBACK is disabled (setsockopt) in applications, the pkt_type
of the tx skb's given to can_put_echo_skb is set to PACKET_LOOPBACK. In
this case can_put_echo_skb will not set priv->echo_skb[idx]. It is
therefore kept NULL.
As additional argument for revert: The order of check and usage of idx
was changed. idx is used to access an array element before checking it's
boundaries.
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
Fixes:
7da11ba5c506 ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Drake [Mon, 7 Jan 2019 03:40:24 +0000 (11:40 +0800)]
x86/kaslr: Fix incorrect i8254 outb() parameters
commit
7e6fc2f50a3197d0e82d1c0e86282976c9e6c8a4 upstream.
The outb() function takes parameters value and port, in that order. Fix
the parameters used in the kalsr i8254 fallback code.
Fixes:
5bfce5ef55cb ("x86, kaslr: Provide randomness functions")
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: linux@endlessm.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190107034024.15005-1-drake@endlessm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Hansen [Wed, 2 Jan 2019 21:56:57 +0000 (13:56 -0800)]
x86/selftests/pkeys: Fork() to check for state being preserved
commit
e1812933b17be7814f51b6c310c5d1ced7a9a5f5 upstream.
There was a bug where the per-mm pkey state was not being preserved across
fork() in the child. fork() is performed in the pkey selftests, but all of
the pkey activity is performed in the parent. The child does not perform
any actions sensitive to pkey state.
To make the test more sensitive to these kinds of bugs, add a fork() where
the parent exits, and execution continues in the child.
To achieve this let the key exhaustion test not terminate at the first
allocation failure and fork after 2*NR_PKEYS loops and continue in the
child.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: peterz@infradead.org
Cc: mpe@ellerman.id.au
Cc: will.deacon@arm.com
Cc: luto@kernel.org
Cc: jroedel@suse.de
Cc: stable@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20190102215657.585704B7@viggo.jf.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Popov [Mon, 21 Jan 2019 12:48:40 +0000 (15:48 +0300)]
KVM: x86: Fix single-step debugging
commit
5cc244a20b86090c087073c124284381cdf47234 upstream.
The single-step debugging of KVM guests on x86 is broken: if we run
gdb 'stepi' command at the breakpoint when the guest interrupts are
enabled, RIP always jumps to native_apic_mem_write(). Then other
nasty effects follow.
Long investigation showed that on Jun 7, 2017 the
commit
c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
be called without X86_EFLAGS_TF set.
Let's fix it. Please consider that for -stable.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: stable@vger.kernel.org
Fixes:
c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Tue, 15 Jan 2019 18:27:01 +0000 (13:27 -0500)]
dm thin: fix passdown_double_checking_shared_status()
commit
d445bd9cec1a850c2100fcf53684c13b3fd934f2 upstream.
Commit
00a0ea33b495 ("dm thin: do not queue freed thin mapping for next
stage processing") changed process_prepared_discard_passdown_pt1() to
increment all the blocks being discarded until after the passdown had
completed to avoid them being prematurely reused.
IO issued to a thin device that breaks sharing with a snapshot, followed
by a discard issued to snapshot(s) that previously shared the block(s),
results in passdown_double_checking_shared_status() being called to
iterate through the blocks double checking their reference count is zero
and issuing the passdown if so. So a side effect of commit
00a0ea33b495
is passdown_double_checking_shared_status() was broken.
Fix this by checking if the block reference count is greater than 1.
Also, rename dm_pool_block_is_used() to dm_pool_block_is_shared().
Fixes:
00a0ea33b495 ("dm thin: do not queue freed thin mapping for next stage processing")
Cc: stable@vger.kernel.org # 4.9+
Reported-by: ryan.p.norwood@gmail.com
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Williams [Sat, 19 Jan 2019 18:55:04 +0000 (10:55 -0800)]
acpi/nfit: Fix command-supported detection
commit
11189c1089da413aa4b5fd6be4c4d47c78968819 upstream.
The _DSM function number validation only happens to succeed when the
generic Linux command number translation corresponds with a
DSM-family-specific function number. This breaks NVDIMM-N
implementations that correctly implement _LSR, _LSW, and _LSI, but do
not happen to publish support for DSM function numbers 4, 5, and 6.
Recall that the support for _LS{I,R,W} family of methods results in the
DIMM being marked as supporting those command numbers at
acpi_nfit_register_dimms() time. The DSM function mask is only used for
ND_CMD_CALL support of non-NVDIMM_FAMILY_INTEL devices.
Fixes:
31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command...")
Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/issues/78
Reported-by: Sujith Pandel <sujith_pandel@dell.com>
Tested-by: Sujith Pandel <sujith_pandel@dell.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Williams [Mon, 14 Jan 2019 22:07:19 +0000 (14:07 -0800)]
acpi/nfit: Block function zero DSMs
commit
5e9e38d0db1d29efed1dd4cf9a70115d33521be7 upstream.
In preparation for using function number 0 as an error value, prevent it
from being considered a valid function value by acpi_nfit_ctl().
Cc: <stable@vger.kernel.org>
Cc: stuart hayes <stuart.w.hayes@gmail.com>
Fixes:
e02fb7264d8a ("nfit: add Microsoft NVDIMM DSM command set...")
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Torokhov [Mon, 14 Jan 2019 21:54:55 +0000 (13:54 -0800)]
Input: uinput - fix undefined behavior in uinput_validate_absinfo()
commit
d77651a227f8920dd7ec179b84e400cce844eeb3 upstream.
An integer overflow may arise in uinput_validate_absinfo() if "max - min"
can't be represented by an "int". We should check for overflow before
trying to use the result.
Reported-by: Kyungtae Kim <kt0755@gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rasmus Villemoes [Mon, 7 May 2018 22:36:27 +0000 (00:36 +0200)]
compiler.h: enable builtin overflow checkers and add fallback code
commit
f0907827a8a9152aedac2833ed1b674a7b2a44f2 upstream.
This adds wrappers for the __builtin overflow checkers present in gcc
5.1+ as well as fallback implementations for earlier compilers. It's not
that easy to implement the fully generic __builtin_X_overflow(T1 a, T2
b, T3 *d) in macros, so the fallback code assumes that T1, T2 and T3 are
the same. We obviously don't want the wrappers to have different
semantics depending on $GCC_VERSION, so we also insist on that even when
using the builtins.
There are a few problems with the 'a+b < a' idiom for checking for
overflow: For signed types, it relies on undefined behaviour and is
not actually complete (it doesn't check underflow;
e.g. INT_MIN+INT_MIN == 0 isn't caught). Due to type promotion it
is wrong for all types (signed and unsigned) narrower than
int. Similarly, when a and b does not have the same type, there are
subtle cases like
u32 a;
if (a + sizeof(foo) < a)
return -EOVERFLOW;
a += sizeof(foo);
where the test is always false on 64 bit platforms. Add to that that it
is not always possible to determine the types involved at a glance.
The new overflow.h is somewhat bulky, but that's mostly a result of
trying to be type-generic, complete (e.g. catching not only overflow
but also signed underflow) and not relying on undefined behaviour.
Linus is of course right [1] that for unsigned subtraction a-b, the
right way to check for overflow (underflow) is "b > a" and not
"__builtin_sub_overflow(a, b, &d)", but that's just one out of six cases
covered here, and included mostly for completeness.
So is it worth it? I think it is, if nothing else for the documentation
value of seeing
if (check_add_overflow(a, b, &d))
return -EGOAWAY;
do_stuff_with(d);
instead of the open-coded (and possibly wrong and/or incomplete and/or
UBsan-tickling)
if (a+b < a)
return -EGOAWAY;
do_stuff_with(a+b);
While gcc does recognize the 'a+b < a' idiom for testing unsigned add
overflow, it doesn't do nearly as good for unsigned multiplication
(there's also no single well-established idiom). So using
check_mul_overflow in kcalloc and friends may also make gcc generate
slightly better code.
[1] https://lkml.org/lkml/2015/11/2/658
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tom Panfil [Sat, 12 Jan 2019 01:49:40 +0000 (17:49 -0800)]
Input: xpad - add support for SteelSeries Stratus Duo
commit
fe2bfd0d40c935763812973ce15f5764f1c12833 upstream.
Add support for the SteelSeries Stratus Duo, a wireless Xbox 360
controller. The Stratus Duo ships with a USB dongle to enable wireless
connectivity, but it can also function as a wired controller by connecting
it directly to a PC via USB, hence the need for two USD PIDs. 0x1430 is the
dongle, and 0x1431 is the controller.
Signed-off-by: Tom Panfil <tom@steelseries.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pavel Shilovsky [Thu, 17 Jan 2019 16:21:24 +0000 (08:21 -0800)]
CIFS: Fix possible hang during async MTU reads and writes
commit
acc58d0bab55a50e02c25f00bd6a210ee121595f upstream.
When doing MTU i/o we need to leave some credits for
possible reopen requests and other operations happening
in parallel. Currently we leave 1 credit which is not
enough even for reopen only: we need at least 2 credits
if durable handle reconnect fails. Also there may be
other operations at the same time including compounding
ones which require 3 credits at a time each. Fix this
by leaving 8 credits which is big enough to cover most
scenarios.
Was able to reproduce this when server was configured
to give out fewer credits than usual.
The proper fix would be to reconnect a file handle first
and then obtain credits for an MTU request but this leads
to bigger code changes and should happen in other patches.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Fulghum [Tue, 1 Jan 2019 20:28:53 +0000 (12:28 -0800)]
tty/n_hdlc: fix __might_sleep warning
commit
fc01d8c61ce02c034e67378cd3e645734bc18c8c upstream.
Fix __might_sleep warning[1] in tty/n_hdlc.c read due to copy_to_user
call while current is TASK_INTERRUPTIBLE. This is a false positive
since the code path does not depend on current state remaining
TASK_INTERRUPTIBLE. The loop breaks out and sets TASK_RUNNING after
calling copy_to_user.
This patch supresses the warning by setting TASK_RUNNING before calling
copy_to_user.
[1] https://syzkaller.appspot.com/bug?id=
17d5de7f1fcab794cb8c40032f893f52de899324
Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Reported-by: syzbot <syzbot+c244af085a0159d22879@syzkaller.appspotmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: stable <stable@vger.kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Samir Virmani [Wed, 16 Jan 2019 18:28:07 +0000 (10:28 -0800)]
uart: Fix crash in uart_write and uart_put_char
commit
aff9cf5955185d1f183227e46c5f8673fa483813 upstream.
We were experiencing a crash similar to the one reported as part of
commit:
a5ba1d95e46e ("uart: fix race between uart_put_char() and
uart_shutdown()") in our testbed as well. We continue to observe the same
crash after integrating the commit
a5ba1d95e46e ("uart: fix race between
uart_put_char() and uart_shutdown()")
On reviewing the change, the port lock should be taken prior to checking for
if (!circ->buf) in fn. __uart_put_char and other fns. that update the buffer
uart_state->xmit.
Traceback:
[11/27/2018 06:24:32.4870] Unable to handle kernel NULL pointer dereference
at virtual address
0000003b
[11/27/2018 06:24:32.4950] PC is at memcpy+0x48/0x180
[11/27/2018 06:24:32.4950] LR is at uart_write+0x74/0x120
[11/27/2018 06:24:32.4950] pc : [<
ffffffc0002e6808>]
lr : [<
ffffffc0003747cc>] pstate:
000001c5
[11/27/2018 06:24:32.4950] sp :
ffffffc076433d30
[11/27/2018 06:24:32.4950] x29:
ffffffc076433d30 x28:
0000000000000140
[11/27/2018 06:24:32.4950] x27:
ffffffc0009b9d5e x26:
ffffffc07ce36580
[11/27/2018 06:24:32.4950] x25:
0000000000000000 x24:
0000000000000140
[11/27/2018 06:24:32.4950] x23:
ffffffc000891200 x22:
ffffffc01fc34000
[11/27/2018 06:24:32.4950] x21:
0000000000000fff x20:
0000000000000076
[11/27/2018 06:24:32.4950] x19:
0000000000000076 x18:
0000000000000000
[11/27/2018 06:24:32.4950] x17:
000000000047cf08 x16:
ffffffc000099e68
[11/27/2018 06:24:32.4950] x15:
0000000000000018 x14:
776d726966205948
[11/27/2018 06:24:32.4950] x13:
50203a6c6974755f x12:
74647075205d3333
[11/27/2018 06:24:32.4950] x11:
3a35323a36203831 x10:
30322f37322f3131
[11/27/2018 06:24:32.4950] x9 :
5b205d303638342e x8 :
746164206f742070
[11/27/2018 06:24:32.4950] x7 :
7520736920657261 x6 :
000000000000003b
[11/27/2018 06:24:32.4950] x5 :
000000000000817a x4 :
0000000000000008
[11/27/2018 06:24:32.4950] x3 :
2f37322f31312a5b x2 :
000000000000006e
[11/27/2018 06:24:32.4950] x1 :
ffffffc0009b9cf0 x0 :
000000000000003b
[11/27/2018 06:24:32.4950] CPU2: stopping
[11/27/2018 06:24:32.4950] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P D O 4.1.51 #3
[11/27/2018 06:24:32.4950] Hardware name: Broadcom-v8A (DT)
[11/27/2018 06:24:32.4950] Call trace:
[11/27/2018 06:24:32.4950] [<
ffffffc0000883b8>] dump_backtrace+0x0/0x150
[11/27/2018 06:24:32.4950] [<
ffffffc00008851c>] show_stack+0x14/0x20
[11/27/2018 06:24:32.4950] [<
ffffffc0005ee810>] dump_stack+0x90/0xb0
[11/27/2018 06:24:32.4950] [<
ffffffc00008e844>] handle_IPI+0x18c/0x1a0
[11/27/2018 06:24:32.4950] [<
ffffffc000080c68>] gic_handle_irq+0x88/0x90
Fixes:
a5ba1d95e46e ("uart: fix race between uart_put_char() and uart_shutdown()")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Samir Virmani <samir@embedur.com>
Acked-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sun, 20 Jan 2019 09:46:58 +0000 (10:46 +0100)]
tty: Handle problem if line discipline does not have receive_buf
commit
27cfb3a53be46a54ec5e0bd04e51995b74c90343 upstream.
Some tty line disciplines do not have a receive buf callback, so
properly check for that before calling it. If they do not have this
callback, just eat the character quietly, as we can't fail this call.
Reported-by: Jann Horn <jannh@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Straube [Mon, 7 Jan 2019 17:28:58 +0000 (18:28 +0100)]
staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
commit
5f74a8cbb38d10615ed46bc3e37d9a4c9af8045a upstream.
This device was added to the stand-alone driver on github.
Add it to the staging driver as well.
Link: https://github.com/lwfinger/rtl8188eu/commit/a0619a07cd1e
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gustavo A. R. Silva [Wed, 9 Jan 2019 19:02:36 +0000 (13:02 -0600)]
char/mwave: fix potential Spectre v1 vulnerability
commit
701956d4018e5d5438570e39e8bda47edd32c489 upstream.
ipcnum is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/char/mwave/mwavedd.c:299 mwave_ioctl() warn: potential spectre issue 'pDrvData->IPCs' [w] (local cap)
Fix this by sanitizing ipcnum before using it to index pDrvData->IPCs.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=
152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gerald Schaefer [Wed, 9 Jan 2019 12:00:03 +0000 (13:00 +0100)]
s390/smp: fix CPU hotplug deadlock with CPU rescan
commit
b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream.
smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a dedlock when a new CPU is found and immediately set online by a udev
rule.
This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.
For reference, this was the deadlock with the old cpu_hotplug_begin() loop:
chcpu (rescan) systemd-udevd
echo 1 > /sys/../rescan
-> smp_rescan_cpus()
-> (*) get_online_cpus()
(increases refcount)
-> smp_add_present_cpu()
(new CPU found)
-> register_cpu()
-> device_add()
-> udev "add" event triggered -----------> udev rule sets CPU online
-> echo 1 > /sys/.../online
-> lock_device_hotplug_sysfs()
(this is missing in rescan path)
-> device_online()
-> (**) device_lock(new CPU dev)
-> cpu_up()
-> cpu_hotplug_begin()
(loops until refcount == 0)
-> deadlock with (*)
-> bus_probe_device()
-> device_attach()
-> device_lock(new CPU dev)
-> deadlock with (**)
Fix this by taking the device_hotplug_lock in the CPU rescan path.
Cc: <stable@vger.kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christian Borntraeger [Fri, 9 Nov 2018 08:21:47 +0000 (09:21 +0100)]
s390/early: improve machine detection
commit
03aa047ef2db4985e444af6ee1c1dd084ad9fb4c upstream.
Right now the early machine detection code check stsi 3.2.2 for "KVM"
and set MACHINE_IS_VM if this is different. As the console detection
uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux
early for any non z/VM system that sets a different value than KVM.
So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR,
MACHINE_IS_VM, or MACHINE_IS_KVM.
CC: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eugeniy Paltsev [Mon, 17 Dec 2018 09:54:23 +0000 (12:54 +0300)]
ARC: perf: map generic branches to correct hardware condition
commit
3affbf0e154ee351add6fcc254c59c3f3947fa8f upstream.
So far we've mapped branches to "ijmp" which also counts conditional
branches NOT taken. This makes us different from other architectures
such as ARM which seem to be counting only taken branches.
So use "ijmptak" hardware condition which only counts (all jump
instructions that are taken)
'ijmptak' event is available on both ARCompact and ARCv2 ISA based
cores.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: reworked changelog]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eugeniy Paltsev [Mon, 14 Jan 2019 15:16:48 +0000 (18:16 +0300)]
ARCv2: lib: memeset: fix doing prefetchw outside of buffer
commit
e6a72b7daeeb521753803550f0ed711152bb2555 upstream.
ARCv2 optimized memset uses PREFETCHW instruction for prefetching the
next cache line but doesn't ensure that the line is not past the end of
the buffer. PRETECHW changes the line ownership and marks it dirty,
which can cause issues in SMP config when next line was already owned by
other core. Fix the issue by avoiding the PREFETCHW
Some more details:
The current code has 3 logical loops (ignroing the unaligned part)
(a) Big loop for doing aligned 64 bytes per iteration with PREALLOC
(b) Loop for 32 x 2 bytes with PREFETCHW
(c) any left over bytes
loop (a) was already eliding the last 64 bytes, so PREALLOC was
safe. The fix was removing PREFETCW from (b).
Another potential issue (applicable to configs with 32 or 128 byte L1
cache line) is that PREALLOC assumes 64 byte cache line and may not do
the right thing specially for 32b. While it would be easy to adapt,
there are no known configs with those lie sizes, so for now, just
compile out PREALLOC in such cases.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: stable@vger.kernel.org #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog, used asm .macro vs. "C" macro]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gustavo A. R. Silva [Tue, 15 Jan 2019 17:57:23 +0000 (11:57 -0600)]
ASoC: rt5514-spi: Fix potential NULL pointer dereference
commit
060d0bf491874daece47053c4e1fb0489eb867d2 upstream.
There is a potential NULL pointer dereference in case devm_kzalloc()
fails and returns NULL.
Fix this by adding a NULL check on rt5514_dsp.
This issue was detected with the help of Coccinelle.
Fixes:
6eebf35b0e4a ("ASoC: rt5514: add rt5514 SPI driver")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kangjie Lu [Wed, 26 Dec 2018 02:29:48 +0000 (20:29 -0600)]
ASoC: atom: fix a missing check of snd_pcm_lib_malloc_pages
commit
44fabd8cdaaa3acb80ad2bb3b5c61ae2136af661 upstream.
snd_pcm_lib_malloc_pages() may fail, so let's check its status and
return its error code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Charles Yeh [Tue, 15 Jan 2019 15:13:56 +0000 (23:13 +0800)]
USB: serial: pl2303: add new PID to support PL2303TB
commit
4dcf9ddc9ad5ab649abafa98c5a4d54b1a33dabb upstream.
Add new PID to support PL2303TB (TYPE_HX)
Signed-off-by: Charles Yeh <charlesyeh522@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Max Schulze [Mon, 7 Jan 2019 07:31:49 +0000 (08:31 +0100)]
USB: serial: simple: add Motorola Tetra TPG2200 device id
commit
b81c2c33eab79dfd3650293b2227ee5c6036585c upstream.
Add new Motorola Tetra device id for Motorola Solutions TETRA PEI device
T: Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=0cad ProdID=9016 Rev=24.16
S: Manufacturer=Motorola Solutions, Inc.
S: Product=TETRA PEI interface
C: #Ifs= 2 Cfg#= 1 Atr=80 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=usb_serial_simple
I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=usb_serial_simple
Signed-off-by: Max Schulze <max.schulze@posteo.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paolo Abeni [Fri, 6 Jul 2018 10:30:20 +0000 (12:30 +0200)]
ipfrag: really prevent allocation on netns exit
[ Upstream commit
f6f2a4a2eb92bc73671204198bb2f8ab53ff59fb ]
Setting the low threshold to 0 has no effect on frags allocation,
we need to clear high_thresh instead.
The code was pre-existent to commit
648700f76b03 ("inet: frags:
use rhashtables for reassembly units"), but before the above,
such assignment had a different role: prevent concurrent eviction
from the worker and the netns cleanup helper.
Fixes:
648700f76b03 ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cong Wang [Sat, 12 Jan 2019 02:55:42 +0000 (18:55 -0800)]
net_sched: refetch skb protocol for each filter
[ Upstream commit
cd0c4e70fc0ccfa705cdf55efb27519ce9337a26 ]
Martin reported a set of filters don't work after changing
from reclassify to continue. Looking into the code, it
looks like skb protocol is not always fetched for each
iteration of the filters. But, as demonstrated by Martin,
TC actions could modify skb->protocol, for example act_vlan,
this means we have to refetch skb protocol in each iteration,
rather than using the one we fetch in the beginning of the loop.
This bug is _not_ introduced by commit
3b3ae880266d
("net: sched: consolidate tc_classify{,_compat}"), technically,
if act_vlan is the only action that modifies skb protocol, then
it is commit
c7e2b9689ef8 ("sched: introduce vlan action") which
introduced this bug.
Reported-by: Martin Olsson <martin.olsson+netdev@sentorsecurity.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ido Schimmel [Wed, 9 Jan 2019 09:57:39 +0000 (09:57 +0000)]
net: ipv4: Fix memory leak in network namespace dismantle
[ Upstream commit
f97f4dd8b3bb9d0993d2491e0f22024c68109184 ]
IPv4 routing tables are flushed in two cases:
1. In response to events in the netdev and inetaddr notification chains
2. When a network namespace is being dismantled
In both cases only routes associated with a dead nexthop group are
flushed. However, a nexthop group will only be marked as dead in case it
is populated with actual nexthops using a nexthop device. This is not
the case when the route in question is an error route (e.g.,
'blackhole', 'unreachable').
Therefore, when a network namespace is being dismantled such routes are
not flushed and leaked [1].
To reproduce:
# ip netns add blue
# ip -n blue route add unreachable 192.0.2.0/24
# ip netns del blue
Fix this by not skipping error routes that are not marked with
RTNH_F_DEAD when flushing the routing tables.
To prevent the flushing of such routes in case #1, add a parameter to
fib_table_flush() that indicates if the table is flushed as part of
namespace dismantle or not.
Note that this problem does not exist in IPv6 since error routes are
associated with the loopback device.
[1]
unreferenced object 0xffff888066650338 (size 56):
comm "ip", pid 1206, jiffies
4294786063 (age 26.235s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 b0 1c 62 61 80 88 ff ff ..........ba....
e8 8b a1 64 80 88 ff ff 00 07 00 08 fe 00 00 00 ...d............
backtrace:
[<
00000000856ed27d>] inet_rtm_newroute+0x129/0x220
[<
00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
[<
00000000cb85801a>] netlink_rcv_skb+0x132/0x380
[<
00000000ebc991d2>] netlink_unicast+0x4c0/0x690
[<
0000000014f62875>] netlink_sendmsg+0x929/0xe10
[<
00000000bac9d967>] sock_sendmsg+0xc8/0x110
[<
00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
[<
000000002e94f880>] __sys_sendmsg+0xf7/0x250
[<
00000000ccb1fa72>] do_syscall_64+0x14d/0x610
[<
00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<
000000003a8b605b>] 0xffffffffffffffff
unreferenced object 0xffff888061621c88 (size 48):
comm "ip", pid 1206, jiffies
4294786063 (age 26.235s)
hex dump (first 32 bytes):
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
6b 6b 6b 6b 6b 6b 6b 6b d8 8e 26 5f 80 88 ff ff kkkkkkkk..&_....
backtrace:
[<
00000000733609e3>] fib_table_insert+0x978/0x1500
[<
00000000856ed27d>] inet_rtm_newroute+0x129/0x220
[<
00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
[<
00000000cb85801a>] netlink_rcv_skb+0x132/0x380
[<
00000000ebc991d2>] netlink_unicast+0x4c0/0x690
[<
0000000014f62875>] netlink_sendmsg+0x929/0xe10
[<
00000000bac9d967>] sock_sendmsg+0xc8/0x110
[<
00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
[<
000000002e94f880>] __sys_sendmsg+0xf7/0x250
[<
00000000ccb1fa72>] do_syscall_64+0x14d/0x610
[<
00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<
000000003a8b605b>] 0xffffffffffffffff
Fixes:
8cced9eff1d4 ("[NETNS]: Enable routing configuration in non-initial namespace.")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Wed, 16 Jan 2019 08:54:42 +0000 (16:54 +0800)]
vhost: log dirty page correctly
[ Upstream commit
cc5e710759470bc7f3c61d11fd54586f15fdbdf4 ]
Vhost dirty page logging API is designed to sync through GPA. But we
try to log GIOVA when device IOTLB is enabled. This is wrong and may
lead to missing data after migration.
To solve this issue, when logging with device IOTLB enabled, we will:
1) reuse the device IOTLB translation result of GIOVA->HVA mapping to
get HVA, for writable descriptor, get HVA through iovec. For used
ring update, translate its GIOVA to HVA
2) traverse the GPA->HVA mapping to get the possible GPA and log
through GPA. Pay attention this reverse mapping is not guaranteed
to be unique, so we should log each possible GPA in this case.
This fix the failure of scp to guest during migration. In -next, we
will probably support passing GIOVA->GPA instead of GIOVA->HVA.
Fixes:
6b1e6cc7855b ("vhost: new device IOTLB API")
Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Cc: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ross Lagerwall [Mon, 14 Jan 2019 09:16:56 +0000 (09:16 +0000)]
openvswitch: Avoid OOB read when parsing flow nlattrs
[ Upstream commit
04a4af334b971814eedf4e4a413343ad3287d9a9 ]
For nested and variable attributes, the expected length of an attribute
is not known and marked by a negative number. This results in an OOB
read when the expected length is later used to check if the attribute is
all zeros. Fix this by using the actual length of the attribute rather
than the expected length.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ross Lagerwall [Thu, 17 Jan 2019 15:34:38 +0000 (15:34 +0000)]
net: Fix usage of pskb_trim_rcsum
[ Upstream commit
6c57f0458022298e4da1729c67bd33ce41c14e7a ]
In certain cases, pskb_trim_rcsum() may change skb pointers.
Reinitialize header pointers afterwards to avoid potential
use-after-frees. Add a note in the documentation of
pskb_trim_rcsum(). Found by KASAN.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yunjian Wang [Thu, 17 Jan 2019 01:46:41 +0000 (09:46 +0800)]
net: bridge: Fix ethernet header pointer before check skb forwardable
[ Upstream commit
28c1382fa28f2e2d9d0d6f25ae879b5af2ecbd03 ]
The skb header should be set to ethernet header before using
is_skb_forwardable. Because the ethernet header length has been
considered in is_skb_forwardable(including dev->hard_header_len
length).
To reproduce the issue:
1, add 2 ports on linux bridge br using following commands:
$ brctl addbr br
$ brctl addif br eth0
$ brctl addif br eth1
2, the MTU of eth0 and eth1 is 1500
3, send a packet(Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4)
from eth0 to eth1
So the expect result is packet larger than 1500 cannot pass through
eth0 and eth1. But currently, the packet passes through success, it
means eth1's MTU limit doesn't take effect.
Fixes:
f6367b4660dd ("bridge: use is_skb_forwardable in forward path")
Cc: bridge@lists.linux-foundation.org
Cc: Nkolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sat, 26 Jan 2019 08:38:36 +0000 (09:38 +0100)]
Linux 4.9.153
Dave Airlie [Thu, 24 Jan 2019 18:54:15 +0000 (18:54 +0000)]
locking/qspinlock: Pull in asm/byteorder.h to ensure correct endianness
This commit is not required upstream, but is required for the 4.9.y
stable series.
Upstream commit
101110f6271c ("Kbuild: always define endianess in
kconfig.h") ensures that either __LITTLE_ENDIAN or __BIG_ENDIAN is
defined to reflect the endianness of the target CPU architecture
regardless of whether or not <asm/byteorder.h> has been #included. The
upstream definition of 'struct qspinlock' relies on this property.
Unfortunately, the 4.9.y stable series does not provide this guarantee,
so the 'spin_unlock()' routine can erroneously treat the underlying
lockword as big-endian on little-endian architectures using native
qspinlock (i.e. x86_64 without PV) if the caller has not included
<asm/byteorder.h>. This can lead to hangs such as the one in
'i915_gem_request()' reported via bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=202063
Fix the issue by ensuring that <asm/byteorder.h> is #included in
<asm/qspinlock_types.h>, where 'struct qspinlock' is defined.
Cc: <stable@vger.kernel.org> # 4.9
Signed-off-by: Dave Airlie <airlied@redhat.com>
[will: wrote commit message]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Corey Minyard [Fri, 16 Nov 2018 15:59:21 +0000 (09:59 -0600)]
ipmi:ssif: Fix handling of multi-part return messages
commit
7d6380cd40f7993f75c4bde5b36f6019237e8719 upstream.
The block number was not being compared right, it was off by one
when checking the response.
Some statistics wouldn't be incremented properly in some cases.
Check to see if that middle-part messages always have 31 bytes of
data.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Hocko [Fri, 28 Dec 2018 08:38:17 +0000 (00:38 -0800)]
mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
[ Upstream commit
7550c6079846a24f30d15ac75a941c8515dbedfb ]
Patch series "THP eligibility reporting via proc".
This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable. The trigger for the change is a
regression report [2] and the long follow up discussion. In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.
A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings. Again a lack of a proper API led to an
abuse.
The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.
The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well. [1]
http://lkml.kernel.org/r/
20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.
1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/
20181002100531.GC4135@quack2.suse.cz
This patch (of 3):
Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags. And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.
Let's consider two recent examples:
http://lkml.kernel.org/r/
20181002100531.GC4135@quack2.suse.cz
: commit
e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.
http://lkml.kernel.org/r/alpine.DEB.2.21.
1809241054050.224429@chino.kir.corp.google.com
: Commit
1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps. After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.
In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.
Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Brian Foster [Fri, 28 Dec 2018 08:37:20 +0000 (00:37 -0800)]
mm/page-writeback.c: don't break integrity writeback on ->writepage() error
[ Upstream commit
3fa750dcf29e8606e3969d13d8e188cc1c0f511d ]
write_cache_pages() is used in both background and integrity writeback
scenarios by various filesystems. Background writeback is mostly
concerned with cleaning a certain number of dirty pages based on various
mm heuristics. It may not write the full set of dirty pages or wait for
I/O to complete. Integrity writeback is responsible for persisting a set
of dirty pages before the writeback job completes. For example, an
fsync() call must perform integrity writeback to ensure data is on disk
before the call returns.
write_cache_pages() unconditionally breaks out of its processing loop in
the event of a ->writepage() error. This is fine for background
writeback, which had no strict requirements and will eventually come
around again. This can cause problems for integrity writeback on
filesystems that might need to clean up state associated with failed page
writeouts. For example, XFS performs internal delayed allocation
accounting before returning a ->writepage() error, where applicable. If
the current writeback happens to be associated with an unmount and
write_cache_pages() completes the writeback prematurely due to error, the
filesystem is unmounted in an inconsistent state if dirty+delalloc pages
still exist.
To handle this problem, update write_cache_pages() to always process the
full set of pages for integrity writeback regardless of ->writepage()
errors. Save the first encountered error and return it to the caller once
complete. This facilitates XFS (or any other fs that expects integrity
writeback to process the entire set of dirty pages) to clean up its
internal state completely in the event of persistent mapping errors.
Background writeback continues to exit on the first error encountered.
[akpm@linux-foundation.org: fix typo in comment]
Link: http://lkml.kernel.org/r/20181116134304.32440-1-bfoster@redhat.com
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Junxiao Bi [Fri, 28 Dec 2018 08:32:50 +0000 (00:32 -0800)]
ocfs2: fix panic due to unrecovered local alloc
[ Upstream commit
532e1e54c8140188e192348c790317921cb2dc1c ]
mount.ocfs2 ignore the inconsistent error that journal is clean but
local alloc is unrecovered. After mount, local alloc not empty, then
reserver cluster didn't alloc a new local alloc window, reserveration
map is empty(ocfs2_reservation_map.m_bitmap_len = 0), that triggered the
following panic.
This issue was reported at
https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html
and was advised to fixed during mount. But this is a very unusual
inconsistent state, usually journal dirty flag should be cleared at the
last stage of umount until every other things go right. We may need do
further debug to check that. Any way to avoid possible futher
corruption, mount should be abort and fsck should be run.
(mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered!
found = 6518, set = 6518, taken = 8192, off =
15912372
ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode.
o2dlm: Joining domain
89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes
ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode.
o2hb: Region
89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device
o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777
o2dlm: Node 7 joins domain
64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
o2dlm: Node 7 joins domain
89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
------------[ cut here ]------------
kernel BUG at fs/ocfs2/reservations.c:507!
invalid opcode: 0000 [#1] SMP
Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2
Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018
task:
ffff8803fb04e200 ti:
ffff8800ea4d8000 task.ti:
ffff8800ea4d8000
RIP: 0010:[<
ffffffffa05e96a8>] [<
ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
Call Trace:
ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2]
ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2]
__ocfs2_claim_clusters+0x178/0x360 [ocfs2]
ocfs2_claim_clusters+0x1f/0x30 [ocfs2]
ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2]
ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2]
ocfs2_write_begin+0x13e/0x230 [ocfs2]
generic_perform_write+0xbf/0x1c0
__generic_file_write_iter+0x19c/0x1d0
ocfs2_file_write_iter+0x589/0x1360 [ocfs2]
__vfs_write+0xb8/0x110
vfs_write+0xa9/0x1b0
SyS_write+0x46/0xb0
system_call_fastpath+0x18/0xd7
Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85
RIP __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
RSP <
ffff8800ea4db668>
---[ end trace
566f07529f2edf3c ]---
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
Link: http://lkml.kernel.org/r/20181121020023.3034-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Qian Cai [Thu, 13 Dec 2018 13:27:27 +0000 (08:27 -0500)]
scsi: megaraid: fix out-of-bound array accesses
[ Upstream commit
c7a082e4242fd8cd21a441071e622f87c16bdacc ]
UBSAN reported those with MegaRAID SAS-3 3108,
[ 77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
[ 77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
[ 77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1
[ 77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
[ 77.495791] Workqueue: events work_for_cpu_fn
[ 77.500154] Call trace:
[ 77.502610] dump_backtrace+0x0/0x2c8
[ 77.506279] show_stack+0x24/0x30
[ 77.509604] dump_stack+0x118/0x19c
[ 77.513098] ubsan_epilogue+0x14/0x60
[ 77.516765] __ubsan_handle_out_of_bounds+0xfc/0x13c
[ 77.521767] mr_update_load_balance_params+0x150/0x158 [megaraid_sas]
[ 77.528230] MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas]
[ 77.533825] megasas_get_map_info+0x244/0x2f0 [megaraid_sas]
[ 77.539505] megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas]
[ 77.545794] megasas_init_fw+0x1ab4/0x3518 [megaraid_sas]
[ 77.551212] megasas_probe_one+0x2c4/0xbe0 [megaraid_sas]
[ 77.556614] local_pci_probe+0x7c/0xf0
[ 77.560365] work_for_cpu_fn+0x34/0x50
[ 77.564118] process_one_work+0x61c/0xf08
[ 77.568129] worker_thread+0x534/0xa70
[ 77.571882] kthread+0x1c8/0x1d0
[ 77.575114] ret_from_fork+0x10/0x1c
[ 89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
[ 89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
[ 89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1
[ 89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
[ 89.268903] Workqueue: events_unbound async_run_entry_fn
[ 89.274222] Call trace:
[ 89.276680] dump_backtrace+0x0/0x2c8
[ 89.280348] show_stack+0x24/0x30
[ 89.283671] dump_stack+0x118/0x19c
[ 89.287167] ubsan_epilogue+0x14/0x60
[ 89.290835] __ubsan_handle_out_of_bounds+0xfc/0x13c
[ 89.295828] MR_LdRaidGet+0x50/0x58 [megaraid_sas]
[ 89.300638] megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas]
[ 89.306576] megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas]
[ 89.313468] megasas_queue_command+0x398/0x3d0 [megaraid_sas]
[ 89.319222] scsi_dispatch_cmd+0x1dc/0x8a8
[ 89.323321] scsi_request_fn+0x8e8/0xdd0
[ 89.327249] __blk_run_queue+0xc4/0x158
[ 89.331090] blk_execute_rq_nowait+0xf4/0x158
[ 89.335449] blk_execute_rq+0xdc/0x158
[ 89.339202] __scsi_execute+0x130/0x258
[ 89.343041] scsi_probe_and_add_lun+0x2fc/0x1488
[ 89.347661] __scsi_scan_target+0x1cc/0x8c8
[ 89.351848] scsi_scan_channel.part.3+0x8c/0xc0
[ 89.356382] scsi_scan_host_selected+0x130/0x1f0
[ 89.361002] do_scsi_scan_host+0xd8/0xf0
[ 89.364927] do_scan_async+0x9c/0x320
[ 89.368594] async_run_entry_fn+0x138/0x420
[ 89.372780] process_one_work+0x61c/0xf08
[ 89.376793] worker_thread+0x13c/0xa70
[ 89.380546] kthread+0x1c8/0x1d0
[ 89.383778] ret_from_fork+0x10/0x1c
This is because when populating Driver Map using firmware raid map, all
non-existing VDs set their ldTgtIdToLd to 0xff, so it can be skipped later.
From drivers/scsi/megaraid/megaraid_sas_base.c ,
memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS);
From drivers/scsi/megaraid/megaraid_sas_fp.c ,
/* For non existing VDs, iterate to next VD*/
if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1))
continue;
However, there are a few places that failed to skip those non-existing VDs
due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff,
map) and triggered the out-of-bound accesses.
Fixes:
51087a8617fe ("megaraid_sas : Extended VD support")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kevin Barnett [Fri, 7 Dec 2018 22:29:51 +0000 (16:29 -0600)]
scsi: smartpqi: correct lun reset issues
[ Upstream commit
2ba55c9851d74eb015a554ef69ddf2ef061d5780 ]
Problem:
The Linux kernel takes a logical volume offline after a LUN reset. This is
generally accompanied by this message in the dmesg output:
Device offlined - not ready after error recovery
Root Cause:
The root cause is a "quirk" in the timeout handling in the Linux SCSI
layer. The Linux kernel places a 30-second timeout on most media access
commands (reads and writes) that it send to device drivers. When a media
access command times out, the Linux kernel goes into error recovery mode
for the LUN that was the target of the command that timed out. Every
command that timed out is kept on a list inside of the Linux kernel to be
retried later. The kernel attempts to recover the command(s) that timed out
by issuing a LUN reset followed by a TEST UNIT READY. If the LUN reset and
TEST UNIT READY commands are successful, the kernel retries the command(s)
that timed out.
Each SCSI command issued by the kernel has a result field associated with
it. This field indicates the final result of the command (success or
error). When a command times out, the kernel places a value in this result
field indicating that the command timed out.
The "quirk" is that after the LUN reset and TEST UNIT READY commands are
completed, the kernel checks each command on the timed-out command list
before retrying it. If the result field is still "timed out", the kernel
treats that command as not having been successfully recovered for a
retry. If the number of commands that are in this state are greater than
two, the kernel takes the LUN offline.
Fix:
When our RAIDStack receives a LUN reset, it simply waits until all
outstanding commands complete. Generally, all of these outstanding commands
complete successfully. Therefore, the fix in the smartpqi driver is to
always set the command result field to indicate success when a request
completes successfully. This normally isn’t necessary because the result
field is always initialized to success when the command is submitted to the
driver. So when the command completes successfully, the result field is
left untouched. But in this case, the kernel changes the result field
behind the driver’s back and then expects the field to be changed by the
driver as the commands that timed-out complete.
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Daniel Vetter [Wed, 19 Dec 2018 12:39:09 +0000 (13:39 +0100)]
sysfs: Disable lockdep for driver bind/unbind files
[ Upstream commit
4f4b374332ec0ae9c738ff8ec9bed5cd97ff9adc ]
This is the much more correct fix for my earlier attempt at:
https://lkml.org/lkml/2018/12/10/118
Short recap:
- There's not actually a locking issue, it's just lockdep being a bit
too eager to complain about a possible deadlock.
- Contrary to what I claimed the real problem is recursion on
kn->count. Greg pointed me at sysfs_break_active_protection(), used
by the scsi subsystem to allow a sysfs file to unbind itself. That
would be a real deadlock, which isn't what's happening here. Also,
breaking the active protection means we'd need to manually handle
all the lifetime fun.
- With Rafael we discussed the task_work approach, which kinda works,
but has two downsides: It's a functional change for a lockdep
annotation issue, and it won't work for the bind file (which needs
to get the errno from the driver load function back to userspace).
- Greg also asked why this never showed up: To hit this you need to
unregister a 2nd driver from the unload code of your first driver. I
guess only gpus do that. The bug has always been there, but only
with a recent patch series did we add more locks so that lockdep
built a chain from unbinding the snd-hda driver to the
acpi_video_unregister call.
Full lockdep splat:
[12301.898799] ============================================
[12301.898805] WARNING: possible recursive locking detected
[12301.898811] 4.20.0-rc7+ #84 Not tainted
[12301.898815] --------------------------------------------
[12301.898821] bash/5297 is trying to acquire lock:
[12301.898826]
00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80
[12301.898841] but task is already holding lock:
[12301.898847]
000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898856] other info that might help us debug this:
[12301.898862] Possible unsafe locking scenario:
[12301.898867] CPU0
[12301.898870] ----
[12301.898874] lock(kn->count#39);
[12301.898879] lock(kn->count#39);
[12301.898883] *** DEADLOCK ***
[12301.898891] May be due to missing lock nesting notation
[12301.898899] 5 locks held by bash/5297:
[12301.898903] #0:
00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0
[12301.898915] #1:
000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190
[12301.898925] #2:
000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
[12301.898936] #3:
00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240
[12301.898950] #4:
000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40
[12301.898960] stack backtrace:
[12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ #84
[12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[12301.898982] Call Trace:
[12301.898989] dump_stack+0x67/0x9b
[12301.898997] __lock_acquire+0x6ad/0x1410
[12301.899003] ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899010] ? find_held_lock+0x2d/0x90
[12301.899017] ? mutex_spin_on_owner+0xe4/0x150
[12301.899023] ? find_held_lock+0x2d/0x90
[12301.899030] ? lock_acquire+0x90/0x180
[12301.899036] lock_acquire+0x90/0x180
[12301.899042] ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899049] __kernfs_remove+0x296/0x310
[12301.899055] ? kernfs_remove_by_name_ns+0x3b/0x80
[12301.899060] ? kernfs_name_hash+0xd/0x80
[12301.899066] ? kernfs_find_ns+0x6c/0x100
[12301.899073] kernfs_remove_by_name_ns+0x3b/0x80
[12301.899080] bus_remove_driver+0x92/0xa0
[12301.899085] acpi_video_unregister+0x24/0x40
[12301.899127] i915_driver_unload+0x42/0x130 [i915]
[12301.899160] i915_pci_remove+0x19/0x30 [i915]
[12301.899169] pci_device_remove+0x36/0xb0
[12301.899176] device_release_driver_internal+0x185/0x240
[12301.899183] unbind_store+0xaf/0x180
[12301.899189] kernfs_fop_write+0x104/0x190
[12301.899195] __vfs_write+0x31/0x180
[12301.899203] ? rcu_read_lock_sched_held+0x6f/0x80
[12301.899209] ? rcu_sync_lockdep_assert+0x29/0x50
[12301.899216] ? __sb_start_write+0x13c/0x1a0
[12301.899221] ? vfs_write+0x17f/0x1b0
[12301.899227] vfs_write+0xb9/0x1b0
[12301.899233] ksys_write+0x50/0xc0
[12301.899239] do_syscall_64+0x4b/0x180
[12301.899247] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[12301.899253] RIP: 0033:0x7f452ac7f7a4
[12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
[12301.899273] RSP: 002b:
00007ffceafa6918 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
[12301.899282] RAX:
ffffffffffffffda RBX:
000000000000000d RCX:
00007f452ac7f7a4
[12301.899288] RDX:
000000000000000d RSI:
00005612a1abf7c0 RDI:
0000000000000001
[12301.899295] RBP:
00005612a1abf7c0 R08:
000000000000000a R09:
00005612a1c46730
[12301.899301] R10:
000000000000000a R11:
0000000000000246 R12:
000000000000000d
[12301.899308] R13:
0000000000000001 R14:
00007f452af4a740 R15:
000000000000000d
Looking around I've noticed that usb and i2c already handle similar
recursion problems, where a sysfs file can unbind the same type of
sysfs somewhere else in the hierarchy. Relevant commits are:
commit
356c05d58af05d582e634b54b40050c73609617b
Author: Alan Stern <stern@rowland.harvard.edu>
Date: Mon May 14 13:30:03 2012 -0400
sysfs: get rid of some lockdep false positives
commit
e9b526fe704812364bca07edd15eadeba163ebfb
Author: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Date: Fri May 17 14:56:35 2013 +0200
i2c: suppress lockdep warning on delete_device
Implement the same trick for driver bind/unbind.
v2: Put the macro into bus.c (Greg).
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Vivek Gautam <vivek.gautam@codeaurora.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Takashi Sakamoto [Wed, 19 Dec 2018 11:00:42 +0000 (20:00 +0900)]
ALSA: bebob: fix model-id of unit for Apogee Ensemble
[ Upstream commit
644b2e97405b0b74845e1d3c2b4fe4c34858062b ]
This commit fixes hard-coded model-id for an unit of Apogee Ensemble with
a correct value. This unit uses DM1500 ASIC produced ArchWave AG (formerly
known as BridgeCo AG).
I note that this model supports three modes in the number of data channels
in tx/rx streams; 8 ch pairs, 10 ch pairs, 18 ch pairs. The mode is
switched by Vendor-dependent AV/C command, like:
$ cd linux-firewire-utils
$ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0600000000 (8ch pairs)
$ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0601000000 (10ch pairs)
$ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0602000000 (18ch pairs)
When switching between different mode, the unit disappears from IEEE 1394
bus, then appears on the bus with different combination of stream formats.
In a mode of 18 ch pairs, available sampling rate is up to 96.0 kHz, else
up to 192.0 kHz.
$ ./hinawa-config-rom-printer /dev/fw1
{ 'bus-info': { 'adj': False,
'bmc': True,
'chip_ID':
21474898341,
'cmc': True,
'cyc_clk_acc': 100,
'generation': 2,
'imc': True,
'isc': True,
'link_spd': 2,
'max_ROM': 1,
'max_rec': 512,
'name': '1394',
'node_vendor_ID': 987,
'pmc': False},
'root-directory': [ ['HARDWARE_VERSION', 19],
[ 'NODE_CAPABILITIES',
{ 'addressing': {'64': True, 'fix': True, 'prv': False},
'misc': {'int': False, 'ms': False, 'spt': True},
'state': { 'atn': False,
'ded': False,
'drq': True,
'elo': False,
'init': False,
'lst': True,
'off': False},
'testing': {'bas': False, 'ext': False}}],
['VENDOR', 987],
['DESCRIPTOR', 'Apogee Electronics'],
['MODEL', 126702],
['DESCRIPTOR', 'Ensemble'],
['VERSION', 5297],
[ 'UNIT',
[ ['SPECIFIER_ID', 41005],
['VERSION', 65537],
['MODEL', 126702],
['DESCRIPTOR', 'Ensemble']]],
[ 'DEPENDENT_INFO',
[ ['SPECIFIER_ID', 2037],
['VERSION', 1],
[(58, 'IMMEDIATE'),
16777159],
[(59, 'IMMEDIATE'),
1048576],
[(60, 'IMMEDIATE'),
16777159],
[(61, 'IMMEDIATE'),
6291456]]]]}
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nikos Tsironis [Wed, 31 Oct 2018 21:53:08 +0000 (17:53 -0400)]
dm snapshot: Fix excessive memory usage and workqueue stalls
[ Upstream commit
721b1d98fb517ae99ab3b757021cf81db41e67be ]
kcopyd has no upper limit to the number of jobs one can allocate and
issue. Under certain workloads this can lead to excessive memory usage
and workqueue stalls. For example, when creating multiple dm-snapshot
targets with a 4K chunk size and then writing to the origin through the
page cache. Syncing the page cache causes a large number of BIOs to be
issued to the dm-snapshot origin target, which itself issues an even
larger (because of the BIO splitting taking place) number of kcopyd
jobs.
Running the following test, from the device mapper test suite [1],
dmtest run --suite snapshot -n many_snapshots_of_same_volume_N
, with 8 active snapshots, results in the kcopyd job slab cache growing
to 10G. Depending on the available system RAM this can lead to the OOM
killer killing user processes:
[463.492878] kthreadd invoked oom-killer: gfp_mask=0x6040c0(GFP_KERNEL|__GFP_COMP),
nodemask=(null), order=1, oom_score_adj=0
[463.492894] kthreadd cpuset=/ mems_allowed=0
[463.492948] CPU: 7 PID: 2 Comm: kthreadd Not tainted 4.19.0-rc7 #3
[463.492950] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[463.492952] Call Trace:
[463.492964] dump_stack+0x7d/0xbb
[463.492973] dump_header+0x6b/0x2fc
[463.492987] ? lockdep_hardirqs_on+0xee/0x190
[463.493012] oom_kill_process+0x302/0x370
[463.493021] out_of_memory+0x113/0x560
[463.493030] __alloc_pages_slowpath+0xf40/0x1020
[463.493055] __alloc_pages_nodemask+0x348/0x3c0
[463.493067] cache_grow_begin+0x81/0x8b0
[463.493072] ? cache_grow_begin+0x874/0x8b0
[463.493078] fallback_alloc+0x1e4/0x280
[463.493092] kmem_cache_alloc_node+0xd6/0x370
[463.493098] ? copy_process.part.31+0x1c5/0x20d0
[463.493105] copy_process.part.31+0x1c5/0x20d0
[463.493115] ? __lock_acquire+0x3cc/0x1550
[463.493121] ? __switch_to_asm+0x34/0x70
[463.493129] ? kthread_create_worker_on_cpu+0x70/0x70
[463.493135] ? finish_task_switch+0x90/0x280
[463.493165] _do_fork+0xe0/0x6d0
[463.493191] ? kthreadd+0x19f/0x220
[463.493233] kernel_thread+0x25/0x30
[463.493235] kthreadd+0x1bf/0x220
[463.493242] ? kthread_create_on_cpu+0x90/0x90
[463.493248] ret_from_fork+0x3a/0x50
[463.493279] Mem-Info:
[463.493285] active_anon:20631 inactive_anon:4831 isolated_anon:0
[463.493285] active_file:80216 inactive_file:80107 isolated_file:435
[463.493285] unevictable:0 dirty:51266 writeback:109372 unstable:0
[463.493285] slab_reclaimable:31191 slab_unreclaimable:
3483521
[463.493285] mapped:526 shmem:4903 pagetables:1759 bounce:0
[463.493285] free:33623 free_pcp:2392 free_cma:0
...
[463.493489] Unreclaimable slab info:
[463.493513] Name Used Total
[463.493522] bio-6 1028KB 1028KB
[463.493525] bio-5 1028KB 1028KB
[463.493528] dm_snap_pending_exception 236783KB 243789KB
[463.493531] dm_exception 41KB 42KB
[463.493534] bio-4 1216KB 1216KB
[463.493537] bio-3 439396KB 439396KB
[463.493539] kcopyd_job 6973427KB 6973427KB
...
[463.494340] Out of memory: Kill process 1298 (ruby2.3) score 1 or sacrifice child
[463.494673] Killed process 1298 (ruby2.3) total-vm:435740kB, anon-rss:20180kB, file-rss:4kB, shmem-rss:0kB
[463.506437] oom_reaper: reaped process 1298 (ruby2.3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Moreover, issuing a large number of kcopyd jobs results in kcopyd
hogging the CPU, while processing them. As a result, processing of work
items, queued for execution on the same CPU as the currently running
kcopyd thread, is stalled for long periods of time, hurting performance.
Running the aforementioned test we get, in dmesg, messages like the
following:
[67501.194592] BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 27s!
[67501.195586] Showing busy workqueues and worker pools:
[67501.195591] workqueue events: flags=0x0
[67501.195597] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195611] pending: cache_reap
[67501.195641] workqueue mm_percpu_wq: flags=0x8
[67501.195645] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195656] pending: vmstat_update
[67501.195682] workqueue kblockd: flags=0x18
[67501.195687] pwq 5: cpus=2 node=0 flags=0x0 nice=-20 active=1/256
[67501.195698] pending: blk_timeout_work
[67501.195753] workqueue kcopyd: flags=0x8
[67501.195757] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195768] pending: do_work [dm_mod]
[67501.195802] workqueue kcopyd: flags=0x8
[67501.195806] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195817] pending: do_work [dm_mod]
[67501.195834] workqueue kcopyd: flags=0x8
[67501.195838] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195848] pending: do_work [dm_mod]
[67501.195881] workqueue kcopyd: flags=0x8
[67501.195885] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
[67501.195896] pending: do_work [dm_mod]
[67501.195920] workqueue kcopyd: flags=0x8
[67501.195924] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=2/256
[67501.195935] in-flight: 67:do_work [dm_mod]
[67501.195945] pending: do_work [dm_mod]
[67501.195961] pool 8: cpus=4 node=0 flags=0x0 nice=0 hung=27s workers=3 idle: 129 23765
The root cause for these issues is the way dm-snapshot uses kcopyd. In
particular, the lack of an explicit or implicit limit to the maximum
number of in-flight COW jobs. The merging path is not affected because
it implicitly limits the in-flight kcopyd jobs to one.
Fix these issues by using a semaphore to limit the maximum number of
in-flight kcopyd jobs. We grab the semaphore before allocating a new
kcopyd job in start_copy() and start_full_bio() and release it after the
job finishes in copy_callback().
The initial semaphore value is configurable through a module parameter,
to allow fine tuning the maximum number of in-flight COW jobs. Setting
this parameter to zero initializes the semaphore to INT_MAX.
A default value of 2048 maximum in-flight kcopyd jobs was chosen. This
value was decided experimentally as a trade-off between memory
consumption, stalling the kernel's workqueues and maintaining a high
enough throughput.
Re-running the aforementioned test:
* Workqueue stalls are eliminated
* kcopyd's job slab cache uses a maximum of 130MB
* The time taken by the test to write to the snapshot-origin target is
reduced from 05m20.48s to 03m26.38s
[1] https://github.com/jthornber/device-mapper-test-suite
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnaldo Carvalho de Melo [Tue, 11 Dec 2018 18:00:52 +0000 (15:00 -0300)]
tools lib subcmd: Don't add the kernel sources to the include path
[ Upstream commit
ece9804985b57e1ccd83b1fb6288520955a29d51 ]
At some point we decided not to directly include kernel sources files
when building tools/perf/, but when tools/lib/subcmd/ was forked from
tools/perf it somehow ended up adding it via these two lines in its
Makefile:
CFLAGS += -I$(srctree)/include/uapi
CFLAGS += -I$(srctree)/include
As $(srctree) points to the kernel sources.
Removing those lines and keeping just:
CFLAGS += -I$(srctree)/tools/include/
Is enough to build tools/perf and tools/objtool.
This fixes the build when building from the sources in environments such
as the Android NDK crossbuilding from a fedora:26 system:
subcmd-util.h:11:15: error: expected ',' or ';' before 'void'
static inline void report(const char *prefix, const char *err, va_list params)
^
In file included from /git/perf/include/uapi/linux/stddef.h:2:0,
from /git/perf/include/uapi/linux/posix_types.h:5,
from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h:36,
from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/unistd.h:33,
from run-command.c:2:
subcmd-util.h:18:17: error: '__no_instrument_function__' attribute applies only to functions
The /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h
file that includes linux/posix_types.h ends up getting the one in the kernel
sources causing the breakage. Fix it.
Test built tools/objtool/ too.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes:
4b6ab94eabe4 ("perf subcmd: Create subcmd library")
Link: https://lkml.kernel.org/n/tip-5lhaoecrj12t0bqwvpiu14sm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nikos Tsironis [Wed, 31 Oct 2018 21:53:09 +0000 (17:53 -0400)]
dm kcopyd: Fix bug causing workqueue stalls
[ Upstream commit
d7e6b8dfc7bcb3f4f3a18313581f67486a725b52 ]
When using kcopyd to run callbacks through dm_kcopyd_do_callback() or
submitting copy jobs with a source size of 0, the jobs are pushed
directly to the complete_jobs list, which could be under processing by
the kcopyd thread. As a result, the kcopyd thread can continue running
completed jobs indefinitely, without releasing the CPU, as long as
someone keeps submitting new completed jobs through the aforementioned
paths. Processing of work items, queued for execution on the same CPU as
the currently running kcopyd thread, is thus stalled for excessive
amounts of time, hurting performance.
Running the following test, from the device mapper test suite [1],
dmtest run --suite snapshot -n parallel_io_to_many_snaps_N
, with 8 active snapshots, we get, in dmesg, messages like the
following:
[68899.948523] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 95s!
[68899.949282] Showing busy workqueues and worker pools:
[68899.949288] workqueue events: flags=0x0
[68899.949295] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
[68899.949306] pending: vmstat_shepherd, cache_reap
[68899.949331] workqueue mm_percpu_wq: flags=0x8
[68899.949337] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[68899.949345] pending: vmstat_update
[68899.949387] workqueue dm_bufio_cache: flags=0x8
[68899.949392] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
[68899.949400] pending: work_fn [dm_bufio]
[68899.949423] workqueue kcopyd: flags=0x8
[68899.949429] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[68899.949437] pending: do_work [dm_mod]
[68899.949452] workqueue kcopyd: flags=0x8
[68899.949458] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
[68899.949466] in-flight: 13:do_work [dm_mod]
[68899.949474] pending: do_work [dm_mod]
[68899.949487] workqueue kcopyd: flags=0x8
[68899.949493] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[68899.949501] pending: do_work [dm_mod]
[68899.949515] workqueue kcopyd: flags=0x8
[68899.949521] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[68899.949529] pending: do_work [dm_mod]
[68899.949541] workqueue kcopyd: flags=0x8
[68899.949547] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[68899.949555] pending: do_work [dm_mod]
[68899.949568] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=95s workers=4 idle: 27130 27223 1084
Fix this by splitting the complete_jobs list into two parts: A user
facing part, named callback_jobs, and one used internally by kcopyd,
retaining the name complete_jobs. dm_kcopyd_do_callback() and
dispatch_job() now push their jobs to the callback_jobs list, which is
spliced to the complete_jobs list once, every time the kcopyd thread
wakes up. This prevents kcopyd from hogging the CPU indefinitely and
causing workqueue stalls.
Re-running the aforementioned test:
* Workqueue stalls are eliminated
* The maximum writing time among all targets is reduced from 09m37.10s
to 06m04.85s and the total run time of the test is reduced from
10m43.591s to 7m19.199s
[1] https://github.com/jthornber/device-mapper-test-suite
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnaldo Carvalho de Melo [Thu, 6 Dec 2018 16:52:13 +0000 (13:52 -0300)]
perf parse-events: Fix unchecked usage of strncpy()
[ Upstream commit
bd8d57fb7e25e9fcf67a9eef5fa13aabe2016e07 ]
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.
This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
util/parse-events.c: In function 'print_symbol_events':
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'print_symbol_events.constprop',
inlined from 'print_events' at util/parse-events.c:2508:2:
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'print_symbol_events.constprop',
inlined from 'print_events' at util/parse-events.c:2511:2:
util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy(name, syms->symbol, MAX_NAME_LEN);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes:
947b4ad1d198 ("perf list: Fix max event string size")
Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnaldo Carvalho de Melo [Thu, 6 Dec 2018 14:29:48 +0000 (11:29 -0300)]
perf svghelper: Fix unchecked usage of strncpy()
[ Upstream commit
2f5302533f306d5ee87bd375aef9ca35b91762cb ]
The strncpy() function may leave the destination string buffer
unterminated, better use strlcpy() that we have a __weak fallback
implementation for systems without it.
In this specific case this would only happen if fgets() was buggy, as
its man page states that it should read one less byte than the size of
the destination buffer, so that it can put the nul byte at the end of
it, so it would never copy 255 non-nul chars, as fgets reads into the
orig buffer at most 254 non-nul chars and terminates it. But lets just
switch to strlcpy to keep the original intent and silence the gcc 8.2
warning.
This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
In function 'cpu_model',
inlined from 'svg_cpu_box' at util/svghelper.c:378:2:
util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation]
strncpy(cpu_m, &buf[13], 255);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Fixes:
f48d55ce7871 ("perf: Add a SVG helper library file")
Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Adrian Hunter [Mon, 26 Nov 2018 12:12:52 +0000 (14:12 +0200)]
perf intel-pt: Fix error with config term "pt=0"
[ Upstream commit
1c6f709b9f96366cc47af23c05ecec9b8c0c392d ]
Users should never use 'pt=0', but if they do it may give a meaningless
error:
$ perf record -e intel_pt/pt=0/u uname
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
event (intel_pt/pt=0/u).
Fix that by forcing 'pt=1'.
Committer testing:
# perf record -e intel_pt/pt=0/u uname
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
/bin/dmesg | grep -i perf may provide additional information.
# perf record -e intel_pt/pt=0/u uname
pt=0 doesn't make sense, forcing pt=1
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data ]
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sergey Senozhatsky [Thu, 13 Dec 2018 04:58:39 +0000 (13:58 +0900)]
tty/serial: do not free trasnmit buffer page under port lock
[ Upstream commit
d72402145ace0697a6a9e8e75a3de5bf3375f78d ]
LKP has hit yet another circular locking dependency between uart
console drivers and debugobjects [1]:
CPU0 CPU1
rhltable_init()
__init_work()
debug_object_init
uart_shutdown() /* db->lock */
/* uart_port->lock */ debug_print_object()
free_page() printk()
call_console_drivers()
debug_check_no_obj_freed() /* uart_port->lock */
/* db->lock */
debug_print_object()
So there are two dependency chains:
uart_port->lock -> db->lock
And
db->lock -> uart_port->lock
This particular circular locking dependency can be addressed in several
ways:
a) One way would be to move debug_print_object() out of db->lock scope
and, thus, break the db->lock -> uart_port->lock chain.
b) Another one would be to free() transmit buffer page out of db->lock
in UART code; which is what this patch does.
It makes sense to apply a) and b) independently: there are too many things
going on behind free(), none of which depend on uart_port->lock.
The patch fixes transmit buffer page free() in uart_shutdown() and,
additionally, in uart_port_startup() (as was suggested by Dmitry Safonov).
[1] https://lore.kernel.org/lkml/
20181211091154.GL23332@shao2-debian/T/#u
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jonas Danielsson [Fri, 19 Oct 2018 14:40:05 +0000 (16:40 +0200)]
mmc: atmel-mci: do not assume idle after atmci_request_end
[ Upstream commit
ae460c115b7aa50c9a36cf78fced07b27962c9d0 ]
On our AT91SAM9260 board we use the same sdio bus for wifi and for the
sd card slot. This caused the atmel-mci to give the following splat on
the serial console:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44
Modules linked in:
CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 #14
Hardware name: Atmel AT91SAM9
[<
c000fccc>] (unwind_backtrace) from [<
c000d3dc>] (show_stack+0x10/0x14)
[<
c000d3dc>] (show_stack) from [<
c0017644>] (__warn+0xd8/0xf4)
[<
c0017644>] (__warn) from [<
c0017704>] (warn_slowpath_null+0x1c/0x24)
[<
c0017704>] (warn_slowpath_null) from [<
c033bb9c>] (atmci_send_command+0x24/0x44)
[<
c033bb9c>] (atmci_send_command) from [<
c033e984>] (atmci_start_request+0x1f4/0x2dc)
[<
c033e984>] (atmci_start_request) from [<
c033f3b4>] (atmci_request+0xf0/0x164)
[<
c033f3b4>] (atmci_request) from [<
c0327108>] (mmc_start_request+0x280/0x2d0)
[<
c0327108>] (mmc_start_request) from [<
c032800c>] (mmc_start_areq+0x230/0x330)
[<
c032800c>] (mmc_start_areq) from [<
c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310)
[<
c03366f8>] (mmc_blk_issue_rw_rq) from [<
c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac)
[<
c03372c4>] (mmc_blk_issue_rq) from [<
c033781c>] (mmc_queue_thread+0xc4/0x118)
[<
c033781c>] (mmc_queue_thread) from [<
c002daf8>] (kthread+0x100/0x118)
[<
c002daf8>] (kthread) from [<
c000a580>] (ret_from_fork+0x14/0x34)
---[ end trace
594371ddfa284bd6 ]---
This is:
WARN_ON(host->cmd);
This was fixed on our board by letting atmci_request_end determine what
state we are in. Instead of unconditionally setting it to STATE_IDLE on
STATE_END_REQUEST.
Signed-off-by: Jonas Danielsson <jonas@orbital-systems.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masahiro Yamada [Tue, 11 Dec 2018 11:00:45 +0000 (20:00 +0900)]
kconfig: fix memory leak when EOF is encountered in quotation
[ Upstream commit
fbac5977d81cb2b2b7e37b11c459055d9585273c ]
An unterminated string literal followed by new line is passed to the
parser (with "multi-line strings not supported" warning shown), then
handled properly there.
On the other hand, an unterminated string literal at end of file is
never passed to the parser, then results in memory leak.
[Test Code]
----------(Kconfig begin)----------
source "Kconfig.inc"
config A
bool "a"
-----------(Kconfig end)-----------
--------(Kconfig.inc begin)--------
config B
bool "b\No new line at end of file
---------(Kconfig.inc end)---------
[Summary from Valgrind]
Before the fix:
LEAK SUMMARY:
definitely lost: 16 bytes in 1 blocks
...
After the fix:
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
...
Eliminate the memory leak path by handling this case. Of course, such
a Kconfig file is wrong already, so I will add an error message later.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masahiro Yamada [Tue, 11 Dec 2018 11:00:44 +0000 (20:00 +0900)]
kconfig: fix file name and line number of warn_ignored_character()
[ Upstream commit
77c1c0fa8b1477c5799bdad65026ea5ff676da44 ]
Currently, warn_ignore_character() displays invalid file name and
line number.
The lexer should use current_file->name and yylineno, while the parser
should use zconf_curname() and zconf_lineno().
This difference comes from that the lexer is always going ahead
of the parser. The parser needs to look ahead one token to make a
shift/reduce decision, so the lexer is requested to scan more text
from the input file.
This commit fixes the warning message from warn_ignored_character().
[Test Code]
----(Kconfig begin)----
/
-----(Kconfig end)-----
[Output]
Before the fix:
<none>:0:warning: ignoring unsupported character '/'
After the fix:
Kconfig:1:warning: ignoring unsupported character '/'
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lucas Stach [Thu, 15 Nov 2018 14:30:26 +0000 (15:30 +0100)]
clk: imx6q: reset exclusive gates on init
[ Upstream commit
f7542d817733f461258fd3a47d77da35b2d9fc81 ]
The exclusive gates may be set up in the wrong way by software running
before the clock driver comes up. In that case the exclusive setup is
locked in its initial state, as the complementary function can't be
activated without disabling the initial setup first.
To avoid this lock situation, reset the exclusive gates to the off
state and allow the kernel to provide the proper setup.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Dong Aisheng <Aisheng.dong@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Disseldorp [Wed, 5 Dec 2018 12:18:34 +0000 (13:18 +0100)]
scsi: target: use consistent left-aligned ASCII INQUIRY data
[ Upstream commit
0de263577de5d5e052be5f4f93334e63cc8a7f0b ]
spc5r17.pdf specifies:
4.3.1 ASCII data field requirements
ASCII data fields shall contain only ASCII printable characters (i.e.,
code values 20h to 7Eh) and may be terminated with one or more ASCII null
(00h) characters. ASCII data fields described as being left-aligned
shall have any unused bytes at the end of the field (i.e., highest
offset) and the unused bytes shall be filled with ASCII space characters
(20h).
LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT
IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT
REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR
IDENTIFICATION field in the INQUIRY Device Identification VPD Page are
zero-terminated/zero-padded.
Fix this inconsistency by using space-padding for all of the above fields.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
yupeng [Thu, 6 Dec 2018 02:56:28 +0000 (18:56 -0800)]
net: call sk_dst_reset when set SO_DONTROUTE
[ Upstream commit
0fbe82e628c817e292ff588cd5847fc935e025f2 ]
after set SO_DONTROUTE to 1, the IP layer should not route packets if
the dest IP address is not in link scope. But if the socket has cached
the dst_entry, such packets would be routed until the sk_dst_cache
expires. So we should clean the sk_dst_cache when a user set
SO_DONTROUTE option. Below are server/client python scripts which
could reprodue this issue:
server side code:
==========================================================================
import socket
import struct
import time
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('0.0.0.0', 9000))
s.listen(1)
sock, addr = s.accept()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
while True:
sock.send(b'foo')
time.sleep(1)
==========================================================================
client side code:
==========================================================================
import socket
import time
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('server_address', 9000))
while True:
data = s.recv(1024)
print(data)
==========================================================================
Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Thu, 18 Oct 2018 20:03:06 +0000 (16:03 -0400)]
media: firewire: Fix app_info parameter type in avc_ca{,_app}_info
[ Upstream commit
b2e9a4eda11fd2cb1e6714e9ad3f455c402568ff ]
Clang warns:
drivers/media/firewire/firedtv-avc.c:999:45: warning: implicit
conversion from 'int' to 'char' changes value from 159 to -97
[-Wconstant-conversion]
app_info[0] = (EN50221_TAG_APP_INFO >> 16) & 0xff;
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
drivers/media/firewire/firedtv-avc.c:1000:45: warning: implicit
conversion from 'int' to 'char' changes value from 128 to -128
[-Wconstant-conversion]
app_info[1] = (EN50221_TAG_APP_INFO >> 8) & 0xff;
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
drivers/media/firewire/firedtv-avc.c:1040:44: warning: implicit
conversion from 'int' to 'char' changes value from 159 to -97
[-Wconstant-conversion]
app_info[0] = (EN50221_TAG_CA_INFO >> 16) & 0xff;
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
drivers/media/firewire/firedtv-avc.c:1041:44: warning: implicit
conversion from 'int' to 'char' changes value from 128 to -128
[-Wconstant-conversion]
app_info[1] = (EN50221_TAG_CA_INFO >> 8) & 0xff;
~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
4 warnings generated.
Change app_info's type to unsigned char to match the type of the
member msg in struct ca_msg, which is the only thing passed into the
app_info parameter in this function.
Link: https://github.com/ClangBuiltLinux/linux/issues/105
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Breno Leitao [Fri, 23 Nov 2018 16:30:11 +0000 (14:30 -0200)]
powerpc/pseries/cpuidle: Fix preempt warning
[ Upstream commit
2b038cbc5fcf12a7ee1cc9bfd5da1e46dacdee87 ]
When booting a pseries kernel with PREEMPT enabled, it dumps the
following warning:
BUG: using smp_processor_id() in preemptible [
00000000] code: swapper/0/1
caller is pseries_processor_idle_init+0x5c/0x22c
CPU: 13 PID: 1 Comm: swapper/0 Not tainted
4.20.0-rc3-00090-g12201a0128bc-dirty #828
Call Trace:
[
c000000429437ab0] [
c0000000009c8878] dump_stack+0xec/0x164 (unreliable)
[
c000000429437b00] [
c0000000005f2f24] check_preemption_disabled+0x154/0x160
[
c000000429437b90] [
c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c
[
c000000429437c10] [
c000000000010ed4] do_one_initcall+0x64/0x300
[
c000000429437ce0] [
c000000000c54500] kernel_init_freeable+0x3f0/0x500
[
c000000429437db0] [
c0000000000112dc] kernel_init+0x2c/0x160
[
c000000429437e20] [
c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c
This happens because the code calls get_lppaca() which calls
get_paca() and it checks if preemption is disabled through
check_preemption_disabled().
Preemption should be disabled because the per CPU variable may make no
sense if there is a preemption (and a CPU switch) after it reads the
per CPU data and when it is used.
In this device driver specifically, it is not a problem, because this
code just needs to have access to one lppaca struct, and it does not
matter if it is the current per CPU lppaca struct or not (i.e. when
there is a preemption and a CPU migration).
That said, the most appropriate fix seems to be related to avoiding
the debug_smp_processor_id() call at get_paca(), instead of calling
preempt_disable() before get_paca().
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Breno Leitao [Thu, 8 Nov 2018 17:12:42 +0000 (15:12 -0200)]
powerpc/xmon: Fix invocation inside lock region
[ Upstream commit
8d4a862276a9c30a269d368d324fb56529e6d5fd ]
Currently xmon needs to get devtree_lock (through rtas_token()) during its
invocation (at crash time). If there is a crash while devtree_lock is being
held, then xmon tries to get the lock but spins forever and never get into
the interactive debugger, as in the following case:
int *ptr = NULL;
raw_spin_lock_irqsave(&devtree_lock, flags);
*ptr = 0xdeadbeef;
This patch avoids calling rtas_token(), thus trying to get the same lock,
at crash time. This new mechanism proposes getting the token at
initialization time (xmon_init()) and just consuming it at crash time.
This would allow xmon to be possible invoked independent of devtree_lock
being held or not.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Joel Fernandes (Google) [Sat, 3 Nov 2018 23:38:18 +0000 (16:38 -0700)]
pstore/ram: Do not treat empty buffers as valid
[ Upstream commit
30696378f68a9e3dad6bfe55938b112e72af00c2 ]
The ramoops backend currently calls persistent_ram_save_old() even
if a buffer is empty. While this appears to work, it is does not seem
like the right thing to do and could lead to future bugs so lets avoid
that. It also prevents misleading prints in the logs which claim the
buffer is valid.
I got something like:
found existing buffer, size 0, start 0
When I was expecting:
no valid data in buffer (sig = ...)
This bails out early (and reports with pr_debug()), since it's an
acceptable state.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Daniel Santos [Fri, 19 Oct 2018 08:30:20 +0000 (03:30 -0500)]
jffs2: Fix use of uninitialized delayed_work, lockdep breakage
[ Upstream commit
a788c5272769ddbcdbab297cf386413eeac04463 ]
jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER
is defined then a write buffer is available and has been initialized.
However, this does is not the case when the mtd device has no
out-of-band buffer:
int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
{
if (!c->mtd->oobsize)
return 0;
...
The resulting call to cancel_delayed_work_sync passing a uninitialized
(but zeroed) delayed_work struct forces lockdep to become disabled.
[ 90.050639] overlayfs: upper fs does not support tmpfile.
[ 90.652264] INFO: trying to register non-static key.
[ 90.662171] the code is fine but needs lockdep annotation.
[ 90.673090] turning off the locking correctness validator.
[ 90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0
[ 90.696672] Stack :
00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7
[ 90.713349]
80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff
[ 90.730020]
00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420
[ 90.746690]
6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90
[ 90.763362]
80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000
[ 90.780033] ...
[ 90.784902] Call Trace:
[ 90.789793] [<
8000f0d8>] show_stack+0xb8/0x148
[ 90.798659] [<
8005a000>] register_lock_class+0x270/0x55c
[ 90.809247] [<
8005cb64>] __lock_acquire+0x13c/0xf7c
[ 90.818964] [<
8005e314>] lock_acquire+0x194/0x1dc
[ 90.828345] [<
8003f27c>] flush_work+0x200/0x24c
[ 90.837374] [<
80041dfc>] __cancel_work_timer+0x158/0x210
[ 90.847958] [<
801a8770>] jffs2_sync_fs+0x20/0x54
[ 90.857173] [<
80125cf4>] iterate_supers+0xf4/0x120
[ 90.866729] [<
80158fc4>] sys_sync+0x44/0x9c
[ 90.875067] [<
80014424>] syscall_common+0x34/0x58
Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chuck Lever [Sun, 25 Nov 2018 22:13:08 +0000 (17:13 -0500)]
rxe: IB_WR_REG_MR does not capture MR's iova field
[ Upstream commit
b024dd0eba6e6d568f69d63c5e3153aba94c23e3 ]
FRWR memory registration is done with a series of calls and WRs.
1. ULP invokes ib_dma_map_sg()
2. ULP invokes ib_map_mr_sg()
3. ULP posts an IB_WR_REG_MR on the Send queue
Step 2 generates an iova. It is permissible for ULPs to change this
iova (with certain restrictions) between steps 2 and 3.
rxe_map_mr_sg captures the MR's iova but later when rxe processes the
REG_MR WR, it ignores the MR's iova field. If a ULP alters the MR's iova
after step 2 but before step 3, rxe never captures that change.
When the remote sends an RDMA Read targeting that MR, rxe looks up the
R_key, but the altered iova does not match the iova stored in the MR,
causing the RDMA Read request to fail.
Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ondrej Mosnacek [Fri, 16 Nov 2018 13:12:02 +0000 (14:12 +0100)]
selinux: always allow mounting submounts
[ Upstream commit
2cbdcb882f97a45f7475c67ac6257bbc16277dfe ]
If a superblock has the MS_SUBMOUNT flag set, we should always allow
mounting it. These mounts are done automatically by the kernel either as
part of mounting some parent mount (e.g. debugfs always mounts tracefs
under "tracing" for compatibility) or they are mounted automatically as
needed on subdirectory accesses (e.g. NFS crossmnt mounts). Since such
automounts are either an implicit consequence of the parent mount (which
is already checked) or they can happen during regular accesses (where it
doesn't make sense to check against the current task's context), the
mount permission check should be skipped for them.
Without this patch, attempts to access contents of an automounted
directory can cause unexpected SELinux denials.
In the current kernel tree, the MS_SUBMOUNT flag is set only via
vfs_submount(), which is called only from the following places:
- AFS, when automounting special "symlinks" referencing other cells
- CIFS, when automounting "referrals"
- NFS, when automounting subtrees
- debugfs, when automounting tracefs
In all cases the submounts are meant to be transparent to the user and
it makes sense that if mounting the master is allowed, then so should be
the automounts. Note that CAP_SYS_ADMIN capability checking is already
skipped for (SB_KERNMOUNT|SB_SUBMOUNT) in:
- sget_userns() in fs/super.c:
if (!(flags & (SB_KERNMOUNT|SB_SUBMOUNT)) &&
!(type->fs_flags & FS_USERNS_MOUNT) &&
!capable(CAP_SYS_ADMIN))
return ERR_PTR(-EPERM);
- sget() in fs/super.c:
/* Ensure the requestor has permissions over the target filesystem */
if (!(flags & (SB_KERNMOUNT|SB_SUBMOUNT)) && !ns_capable(user_ns, CAP_SYS_ADMIN))
return ERR_PTR(-EPERM);
Verified internally on patched RHEL 7.6 with a reproducer using
NFS+httpd and selinux-tesuite.
Fixes:
93faccbbfa95 ("fs: Better permission checking for submounts")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anders Roxell [Wed, 17 Oct 2018 15:26:22 +0000 (17:26 +0200)]
arm64: perf: set suppress_bind_attrs flag to true
[ Upstream commit
81e9fa8bab381f8b6eb04df7cdf0f71994099bd4 ]
The armv8_pmuv3 driver doesn't have a remove function, and when the test
'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled, the following Call trace
can be seen.
[ 1.424287] Failed to register pmu: armv8_pmuv3, reason -17
[ 1.424870] WARNING: CPU: 0 PID: 1 at ../kernel/events/core.c:11771 perf_event_sysfs_init+0x98/0xdc
[ 1.425220] Modules linked in:
[ 1.425531] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W
4.19.0-rc7-next-20181012-00003-ge7a97b1ad77b-dirty #35
[ 1.425951] Hardware name: linux,dummy-virt (DT)
[ 1.426212] pstate:
80000005 (Nzcv daif -PAN -UAO)
[ 1.426458] pc : perf_event_sysfs_init+0x98/0xdc
[ 1.426720] lr : perf_event_sysfs_init+0x98/0xdc
[ 1.426908] sp :
ffff00000804bd50
[ 1.427077] x29:
ffff00000804bd50 x28:
ffff00000934e078
[ 1.427429] x27:
ffff000009546000 x26:
0000000000000007
[ 1.427757] x25:
ffff000009280710 x24:
00000000ffffffef
[ 1.428086] x23:
ffff000009408000 x22:
0000000000000000
[ 1.428415] x21:
ffff000009136008 x20:
ffff000009408730
[ 1.428744] x19:
ffff80007b20b400 x18:
000000000000000a
[ 1.429075] x17:
0000000000000000 x16:
0000000000000000
[ 1.429418] x15:
0000000000000400 x14:
2e79726f74636572
[ 1.429748] x13:
696420656d617320 x12:
656874206e692065
[ 1.430060] x11:
6d616e20656d6173 x10:
2065687420687469
[ 1.430335] x9 :
ffff00000804bd50 x8 :
206e6f7361657220
[ 1.430610] x7 :
2c3376756d705f38 x6 :
ffff00000954d7ce
[ 1.430880] x5 :
0000000000000000 x4 :
0000000000000000
[ 1.431226] x3 :
0000000000000000 x2 :
ffffffffffffffff
[ 1.431554] x1 :
4d151327adc50b00 x0 :
0000000000000000
[ 1.431868] Call trace:
[ 1.432102] perf_event_sysfs_init+0x98/0xdc
[ 1.432382] do_one_initcall+0x6c/0x1a8
[ 1.432637] kernel_init_freeable+0x1bc/0x280
[ 1.432905] kernel_init+0x18/0x160
[ 1.433115] ret_from_fork+0x10/0x18
[ 1.433297] ---[ end trace
27fd415390eb9883 ]---
Rework to set suppress_bind_attrs flag to avoid removing the device when
CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, since there's no real reason to
remove the armv8_pmuv3 driver.
Cc: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Maciej W. Rozycki [Tue, 13 Nov 2018 22:42:44 +0000 (22:42 +0000)]
MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur
[ Upstream commit
e4849aff1e169b86c561738daf8ff020e9de1011 ]
The Broadcom SiByte BCM1250, BCM1125, and BCM1125H SOCs have an onchip
DRAM controller that supports memory amounts of up to 16GiB, and due to
how the address decoder has been wired in the SOC any memory beyond 1GiB
is actually mapped starting from 4GiB physical up, that is beyond the
32-bit addressable limit[1]. Consequently if the maximum amount of
memory has been installed, then it will span up to 19GiB.
Many of the evaluation boards we support that are based on one of these
SOCs have their memory soldered and the amount present fits in the
32-bit address range. The BCM91250A SWARM board however has actual DIMM
slots and accepts, depending on the peripherals revision of the SOC, up
to 4GiB or 8GiB of memory in commercially available JEDEC modules[2].
I believe this is also the case with the BCM91250C2 LittleSur board.
This means that up to either 3GiB or 7GiB of memory requires 64-bit
addressing to access.
I believe the BCM91480B BigSur board, which has the BCM1480 SOC instead,
accepts at least as much memory, although I have no documentation or
actual hardware available to verify that.
Both systems have PCI slots installed for use by any PCI option boards,
including ones that only support 32-bit addressing (additionally the
32-bit PCI host bridge of the BCM1250, BCM1125, and BCM1125H SOCs limits
addressing to 32-bits), and there is no IOMMU available. Therefore for
PCI DMA to work in the presence of memory beyond enable swiotlb for the
affected systems.
All the other SOC onchip DMA devices use 40-bit addressing and therefore
can address the whole memory, so only enable swiotlb if PCI support and
support for DMA beyond 4GiB have been both enabled in the configuration
of the kernel.
This shows up as follows:
Broadcom SiByte BCM1250 B2 @ 800 MHz (SB1 rev 2)
Board type: SiByte BCM91250A (SWARM)
Determined physical RAM map:
memory:
000000000fe7fe00 @
0000000000000000 (usable)
memory:
000000001ffffe00 @
0000000080000000 (usable)
memory:
000000000ffffe00 @
00000000c0000000 (usable)
memory:
0000000087fffe00 @
0000000100000000 (usable)
software IO TLB: mapped [mem 0xcbffc000-0xcfffc000] (64MB)
in the bootstrap log and removes failures like these:
defxx 0000:02:00.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask
ffffffff bus mask 0
fddi0: Receive buffer allocation failed
fddi0: Adapter open failed!
IP-Config: Failed to open fddi0
defxx 0000:09:08.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask
ffffffff bus mask 0
fddi1: Receive buffer allocation failed
fddi1: Adapter open failed!
IP-Config: Failed to open fddi1
when memory beyond 4GiB is handed out to devices that can only do 32-bit
addressing.
This updates commit
cce335ae47e2 ("[MIPS] 64-bit Sibyte kernels need
DMA32.").
References:
[1] "BCM1250/BCM1125/BCM1125H User Manual", Revision 1250_1125-UM100-R,
Broadcom Corporation, 21 Oct 2002, Section 3: "System Overview",
"Memory Map", pp. 34-38
[2] "BCM91250A User Manual", Revision 91250A-UM100-R, Broadcom
Corporation, 18 May 2004, Section 3: "Physical Description",
"Supported DRAM", p. 23
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
[paul.burton@mips.com: Remove GPL text from dma.c; SPDX tag covers it]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Patchwork: https://patchwork.linux-mips.org/patch/21108/
References:
cce335ae47e2 ("[MIPS] 64-bit Sibyte kernels need DMA32.")
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Takashi Sakamoto [Tue, 13 Nov 2018 03:01:30 +0000 (12:01 +0900)]
ALSA: oxfw: add support for APOGEE duet FireWire
[ Upstream commit
fba43f454cdf9caa3185219d116bd2a6e6354552 ]
This commit adds support for APOGEE duet FireWire, launched 2007, already
discontinued. This model uses Oxford Semiconductor FW971 as its
communication engine. Below is information on Configuration ROM of this
unit. The unit supports some AV/C commands defined by Audio subunit
specification and vendor dependent commands.
$ ./hinawa-config-rom-printer /dev/fw1
{ 'bus-info': { 'adj': False,
'bmc': False,
'chip_ID':
42949742248,
'cmc': False,
'cyc_clk_acc': 255,
'generation': 0,
'imc': False,
'isc': True,
'link_spd': 3,
'max_ROM': 0,
'max_rec': 64,
'name': '1394',
'node_vendor_ID': 987,
'pmc': False},
'root-directory': [ ['VENDOR', 987],
['DESCRIPTOR', 'Apogee Electronics'],
['MODEL', 122333],
['DESCRIPTOR', 'Duet'],
[ 'NODE_CAPABILITIES',
{ 'addressing': {'64': True, 'fix': True, 'prv': False},
'misc': {'int': False, 'ms': False, 'spt': True},
'state': { 'atn': False,
'ded': False,
'drq': True,
'elo': False,
'init': False,
'lst': True,
'off': False},
'testing': {'bas': False, 'ext': False}}],
[ 'UNIT',
[ ['SPECIFIER_ID', 41005],
['VERSION', 65537],
['MODEL', 122333],
['DESCRIPTOR', 'Duet']]]]}
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anders Roxell [Tue, 30 Oct 2018 11:35:44 +0000 (12:35 +0100)]
serial: set suppress_bind_attrs flag only if builtin
[ Upstream commit
646097940ad35aa2c1f2012af932d55976a9f255 ]
When the test 'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled,
arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init).
devtmpfs gets killed because we try to remove a file and decrement the
wb reference count before the noop_backing_device_info gets initialized.
[ 0.332075] Serial: AMBA PL011 UART driver
[ 0.485276]
9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 39, base_baud = 0) is a PL011 rev1
[ 0.502382] console [ttyAMA0] enabled
[ 0.515710] Unable to handle kernel paging request at virtual address
0000800074c12000
[ 0.516053] Mem abort info:
[ 0.516222] ESR = 0x96000004
[ 0.516417] Exception class = DABT (current EL), IL = 32 bits
[ 0.516641] SET = 0, FnV = 0
[ 0.516826] EA = 0, S1PTW = 0
[ 0.516984] Data abort info:
[ 0.517149] ISV = 0, ISS = 0x00000004
[ 0.517339] CM = 0, WnR = 0
[ 0.517553] [
0000800074c12000] user address but active_mm is swapper
[ 0.517928] Internal error: Oops:
96000004 [#1] PREEMPT SMP
[ 0.518305] Modules linked in:
[ 0.518839] CPU: 0 PID: 13 Comm: kdevtmpfs Not tainted
4.19.0-rc5-next-20180928-00002-g2ba39ab0cd01-dirty #82
[ 0.519307] Hardware name: linux,dummy-virt (DT)
[ 0.519681] pstate:
80000005 (Nzcv daif -PAN -UAO)
[ 0.519959] pc : __destroy_inode+0x94/0x2a8
[ 0.520212] lr : __destroy_inode+0x78/0x2a8
[ 0.520401] sp :
ffff0000098c3b20
[ 0.520590] x29:
ffff0000098c3b20 x28:
00000000087a3714
[ 0.520904] x27:
0000000000002000 x26:
0000000000002000
[ 0.521179] x25:
ffff000009583000 x24:
0000000000000000
[ 0.521467] x23:
ffff80007bb52000 x22:
ffff80007bbaa7c0
[ 0.521737] x21:
ffff0000093f9338 x20:
0000000000000000
[ 0.522033] x19:
ffff80007bbb05d8 x18:
0000000000000400
[ 0.522376] x17:
0000000000000000 x16:
0000000000000000
[ 0.522727] x15:
0000000000000400 x14:
0000000000000400
[ 0.523068] x13:
0000000000000001 x12:
0000000000000001
[ 0.523421] x11:
0000000000000000 x10:
0000000000000970
[ 0.523749] x9 :
ffff0000098c3a60 x8 :
ffff80007bbab190
[ 0.524017] x7 :
ffff80007bbaa880 x6 :
0000000000000c88
[ 0.524305] x5 :
ffff0000093d96c8 x4 :
61c8864680b583eb
[ 0.524567] x3 :
ffff0000093d6180 x2 :
ffffffffffffffff
[ 0.524872] x1 :
0000800074c12000 x0 :
0000800074c12000
[ 0.525207] Process kdevtmpfs (pid: 13, stack limit = 0x(____ptrval____))
[ 0.525529] Call trace:
[ 0.525806] __destroy_inode+0x94/0x2a8
[ 0.526108] destroy_inode+0x34/0x88
[ 0.526370] evict+0x144/0x1c8
[ 0.526636] iput+0x184/0x230
[ 0.526871] dentry_unlink_inode+0x118/0x130
[ 0.527152] d_delete+0xd8/0xe0
[ 0.527420] vfs_unlink+0x240/0x270
[ 0.527665] handle_remove+0x1d8/0x330
[ 0.527875] devtmpfsd+0x138/0x1c8
[ 0.528085] kthread+0x14c/0x158
[ 0.528291] ret_from_fork+0x10/0x18
[ 0.528720] Code:
92800002 aa1403e0 d538d081 8b010000 (
c85f7c04)
[ 0.529367] ---[ end trace
5a3dee47727f877c ]---
Rework to set suppress_bind_attrs flag to avoid removing the device when
CONFIG_DEBUG_TEST_DRIVER_REMOVE=y. This applies for pic32_uart and
xilinx_uartps as well.
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anders Roxell [Tue, 30 Oct 2018 11:35:45 +0000 (12:35 +0100)]
writeback: don't decrement wb->refcnt if !wb->bdi
[ Upstream commit
347a28b586802d09604a149c1a1f6de5dccbe6fa ]
This happened while running in qemu-system-aarch64, the AMBA PL011 UART
driver when enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE.
arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init),
devtmpfs' handle_remove() crashes because the reference count is a NULL
pointer only because wb->bdi hasn't been initialized yet.
Rework so that wb_put have an extra check if wb->bdi before decrement
wb->refcnt and also add a WARN_ON_ONCE to get a warning if it happens again
in other drivers.
Fixes:
52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks")
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Miroslav Lichvar [Tue, 23 Oct 2018 12:37:39 +0000 (14:37 +0200)]
e1000e: allow non-monotonic SYSTIM readings
[ Upstream commit
e1f65b0d70e9e5c80e15105cd96fa00174d7c436 ]
It seems with some NICs supported by the e1000e driver a SYSTIM reading
may occasionally be few microseconds before the previous reading and if
enabled also pass e1000e_sanitize_systim() without reaching the maximum
number of rereads, even if the function is modified to check three
consecutive readings (i.e. it doesn't look like a double read error).
This causes an underflow in the timecounter and the PHC time jumps hours
ahead.
This was observed on 82574, I217 and I219. The fastest way to reproduce
it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
on the PHC.
Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
timecounter_read() in order to allow non-monotonic SYSTIM readings and
prevent the PHC from jumping.
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
João Paulo Rechi Vita [Thu, 1 Nov 2018 00:21:26 +0000 (17:21 -0700)]
platform/x86: asus-wmi: Tell the EC the OS will handle the display off hotkey
[ Upstream commit
78f3ac76d9e5219589718b9e4733bee21627b3f5 ]
In the past, Asus firmwares would change the panel backlight directly
through the EC when the display off hotkey (Fn+F7) was pressed, and
only notify the OS of such change, with 0x33 when the LCD was ON and
0x34 when the LCD was OFF. These are currently mapped to
KEY_DISPLAYTOGGLE and KEY_DISPLAY_OFF, respectively.
Most recently the EC on Asus most machines lost ability to toggle the
LCD backlight directly, but unless the OS informs the firmware it is
going to handle the display toggle hotkey events, the firmware still
tries change the brightness through the EC, to no effect. The end result
is a long list (at Endless we counted 11) of Asus laptop models where
the display toggle hotkey does not perform any action. Our firmware
engineers contacts at Asus were surprised that there were still machines
out there with the old behavior.
Calling WMNB(ASUS_WMI_DEVID_BACKLIGHT==0x00050011, 2) on the _WDG device
tells the firmware that it should let the OS handle the display toggle
event, in which case it will simply notify the OS of a key press with
0x35, as shown by the DSDT excerpts bellow.
Scope (_SB)
{
(...)
Device (ATKD)
{
(...)
Name (_WDG, Buffer (0x28)
{
/* 0000 */ 0xD0, 0x5E, 0x84, 0x97, 0x6D, 0x4E, 0xDE, 0x11,
/* 0008 */ 0x8A, 0x39, 0x08, 0x00, 0x20, 0x0C, 0x9A, 0x66,
/* 0010 */ 0x4E, 0x42, 0x01, 0x02, 0x35, 0xBB, 0x3C, 0x0B,
/* 0018 */ 0xC2, 0xE3, 0xED, 0x45, 0x91, 0xC2, 0x4C, 0x5A,
/* 0020 */ 0x6D, 0x19, 0x5D, 0x1C, 0xFF, 0x00, 0x01, 0x08
})
Method (WMNB, 3, Serialized)
{
CreateDWordField (Arg2, Zero, IIA0)
CreateDWordField (Arg2, 0x04, IIA1)
Local0 = (Arg1 & 0xFFFFFFFF)
(...)
If ((Local0 == 0x53564544))
{
(...)
If ((IIA0 == 0x00050011))
{
If ((IIA1 == 0x02))
{
^^PCI0.SBRG.EC0.SPIN (0x72, One)
^^PCI0.SBRG.EC0.BLCT = One
}
Return (One)
}
}
(...)
}
(...)
}
(...)
}
(...)
Scope (_SB.PCI0.SBRG.EC0)
{
(...)
Name (BLCT, Zero)
(...)
Method (_Q10, 0, NotSerialized) // _Qxx: EC Query
{
If ((BLCT == Zero))
{
Local0 = One
Local0 = RPIN (0x72)
Local0 ^= One
SPIN (0x72, Local0)
If (ATKP)
{
Local0 = (0x34 - Local0)
^^^^ATKD.IANE (Local0)
}
}
ElseIf ((BLCT == One))
{
If (ATKP)
{
^^^^ATKD.IANE (0x35)
}
}
}
(...)
}
Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Ahern [Sat, 5 Jan 2019 15:35:04 +0000 (07:35 -0800)]
ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses
[ Upstream commit
d4a7e9bb74b5aaf07b89f6531c080b1130bdf019 ]
I realized the last patch calls dev_get_by_index_rcu in a branch not
holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock.
Fixes:
ec90ad334986 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Ahern [Sat, 5 Jan 2019 00:58:15 +0000 (16:58 -0800)]
ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address
[ Upstream commit
ec90ad334986fa5856d11dd272f7f22fa86c55c4 ]
Similar to
c5ee066333eb ("ipv6: Consider sk_bound_dev_if when binding a
socket to an address"), binding a socket to v4 mapped addresses needs to
consider if the socket is bound to a device.
This problem also exists from the beginning of git history.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kai-Heng Feng [Wed, 2 Jan 2019 06:45:07 +0000 (14:45 +0800)]
r8169: Add support for new Realtek Ethernet
[ Upstream commit
36352991835ce99e46b4441dd0eb6980f9a83e8f ]
There are two new Realtek Ethernet devices which are re-branded r8168h.
Add the IDs to to support them.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 23 Jan 2019 07:10:57 +0000 (08:10 +0100)]
Linux 4.9.152
Jan Kara [Mon, 14 Jan 2019 08:48:09 +0000 (09:48 +0100)]
nbd: Use set_blocksize() to set device blocksize
commit
c8a83a6b54d0ca078de036aafb3f6af58c1dc5eb upstream.
NBD can update block device block size implicitely through
bd_set_size(). Make it explicitely set blocksize with set_blocksize() as
this behavior of bd_set_size() is going away.
CC: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josef Bacik [Mon, 13 Feb 2017 15:39:47 +0000 (10:39 -0500)]
nbd: set the logical and physical blocksize properly
commit
e544541b0765c341174613b416d4b074fa7571c2 upstream.
We noticed when trying to do O_DIRECT to an export on the server side
that we were getting requests smaller than the 4k sectorsize of the
device. This is because the client isn't setting the logical and
physical blocksizes properly for the underlying device. Fix this up by
setting the queue blocksizes and then calling bd_set_size.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mauro Carvalho Chehab [Fri, 23 Nov 2018 12:05:58 +0000 (07:05 -0500)]
media: vb2: be sure to unlock mutex on errors
commit
c06ef2e9acef4cda1feee2ce055b8086e33d251a upstream.
As reported by smatch:
drivers/media/common/videobuf2/videobuf2-core.c: drivers/media/common/videobuf2/videobuf2-core.c:2159 vb2_mmap() warn: inconsistent returns 'mutex:&q->mmap_lock'.
Locked on: line 2148
Unlocked on: line 2100
line 2108
line 2113
line 2118
line 2156
line 2159
There is one error condition that doesn't unlock a mutex.
Fixes:
cd26d1c4d1bc ("media: vb2: vb2_mmap: move lock up")
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Hocko [Tue, 8 Jan 2019 23:23:07 +0000 (15:23 -0800)]
mm, memcg: fix reclaim deadlock with writeback
commit
63f3655f950186752236bb88a22f8252c11ce394 upstream.
Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the
ext4 writeback
task1:
wait_on_page_bit+0x82/0xa0
shrink_page_list+0x907/0x960
shrink_inactive_list+0x2c7/0x680
shrink_node_memcg+0x404/0x830
shrink_node+0xd8/0x300
do_try_to_free_pages+0x10d/0x330
try_to_free_mem_cgroup_pages+0xd5/0x1b0
try_charge+0x14d/0x720
memcg_kmem_charge_memcg+0x3c/0xa0
memcg_kmem_charge+0x7e/0xd0
__alloc_pages_nodemask+0x178/0x260
alloc_pages_current+0x95/0x140
pte_alloc_one+0x17/0x40
__pte_alloc+0x1e/0x110
alloc_set_pte+0x5fe/0xc20
do_fault+0x103/0x970
handle_mm_fault+0x61e/0xd10
__do_page_fault+0x252/0x4d0
do_page_fault+0x30/0x80
page_fault+0x28/0x30
task2:
__lock_page+0x86/0xa0
mpage_prepare_extent_to_map+0x2e7/0x310 [ext4]
ext4_writepages+0x479/0xd60
do_writepages+0x1e/0x30
__writeback_single_inode+0x45/0x320
writeback_sb_inodes+0x272/0x600
__writeback_inodes_wb+0x92/0xc0
wb_writeback+0x268/0x300
wb_workfn+0xb4/0x390
process_one_work+0x189/0x420
worker_thread+0x4e/0x4b0
kthread+0xe6/0x100
ret_from_fork+0x41/0x50
He adds
"task1 is waiting for the PageWriteback bit of the page that task2 has
collected in mpd->io_submit->io_bio, and tasks2 is waiting for the
LOCKED bit the page which tasks1 has locked"
More precisely task1 is handling a page fault and it has a page locked
while it charges a new page table to a memcg. That in turn hits a
memory limit reclaim and the memcg reclaim for legacy controller is
waiting on the writeback but that is never going to finish because the
writeback itself is waiting for the page locked in the #PF path. So
this is essentially ABBA deadlock:
lock_page(A)
SetPageWriteback(A)
unlock_page(A)
lock_page(B)
lock_page(B)
pte_alloc_pne
shrink_page_list
wait_on_page_writeback(A)
SetPageWriteback(B)
unlock_page(B)
# flush A, B to clear the writeback
This accumulating of more pages to flush is used by several filesystems
to generate a more optimal IO patterns.
Waiting for the writeback in legacy memcg controller is a workaround for
pre-mature OOM killer invocations because there is no dirty IO
throttling available for the controller. There is no easy way around
that unfortunately. Therefore fix this specific issue by pre-allocating
the page table outside of the page lock. We have that handy
infrastructure for that already so simply reuse the fault-around pattern
which already does this.
There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations
from under a fs page locked but they should be really rare. I am not
aware of a better solution unfortunately.
[akpm@linux-foundation.org: fix mm/memory.c:__do_fault()]
[akpm@linux-foundation.org: coding-style fixes]
[mhocko@kernel.org: enhance comment, per Johannes]
Link: http://lkml.kernel.org/r/20181214084948.GA5624@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org
Fixes:
c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Liu Bo <bo.liu@linux.alibaba.com>
Debugged-by: Liu Bo <bo.liu@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ivan Mironov [Tue, 8 Jan 2019 07:23:53 +0000 (12:23 +0500)]
drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock
commit
66a8d5bfb518f9f12d47e1d2dce1732279f9451e upstream.
Strict requirement of pixclock to be zero breaks support of SDL 1.2
which contains hardcoded table of supported video modes with non-zero
pixclock values[1].
To better understand which pixclock values are considered valid and how
driver should handle these values, I briefly examined few existing fbdev
drivers and documentation in Documentation/fb/. And it looks like there
are no strict rules on that and actual behaviour varies:
* some drivers treat (pixclock == 0) as "use defaults" (uvesafb.c);
* some treat (pixclock == 0) as invalid value which leads to
-EINVAL (clps711x-fb.c);
* some pass converted pixclock value to hardware (uvesafb.c);
* some are trying to find nearest value from predefined table
(vga16fb.c, video_gx.c).
Given this, I believe that it should be safe to just ignore this value if
changing is not supported. It seems that any portable fbdev application
which was not written only for one specific device working under one
specific kernel version should not rely on any particular behaviour of
pixclock anyway.
However, while enabling SDL1 applications to work out of the box when
there is no /etc/fb.modes with valid settings, this change affects the
video mode choosing logic in SDL. Depending on current screen
resolution, contents of /etc/fb.modes and resolution requested by
application, this may lead to user-visible difference (not always):
image will be displayed in a right way, but it will be aligned to the
left instead of center. There is no "right behaviour" here as well, as
emulated fbdev, opposing to old fbdev drivers, simply ignores any
requsts of video mode changes with resolutions smaller than current.
The easiest way to reproduce this problem is to install sdl-sopwith[2],
remove /etc/fb.modes file if it exists, and then try to run sopwith
from console without X. At least in Fedora 29, sopwith may be simply
installed from standard repositories.
[1] SDL 1.2.15 source code, src/video/fbcon/SDL_fbvideo.c, vesa_timings
[2] http://sdl-sopwith.sourceforge.net/
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Cc: stable@vger.kernel.org
Fixes:
79e539453b34e ("DRM: i915: add mode setting support")
Fixes:
771fe6b912fca ("drm/radeon: introduce kernel modesetting for radeon hardware")
Fixes:
785b93ef8c309 ("drm/kms: move driver specific fb common code to helper functions (v2)")
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108072353.28078-3-mironov.ivan@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tetsuo Handa [Mon, 12 Nov 2018 15:42:14 +0000 (08:42 -0700)]
loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()
commit
628bd85947091830a8c4872adfd5ed1d515a9cf2 upstream.
Commit
0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to
remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when
replacing loop_index_mutex with loop_ctl_mutex.
Fixes:
0a42e99b58a20883 ("loop: Get rid of loop_index_mutex")
Reported-by: syzbot <syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Thu, 8 Nov 2018 13:01:04 +0000 (14:01 +0100)]
loop: Get rid of loop_index_mutex
commit
0a42e99b58a208839626465af194cfe640ef9493 upstream.
Now that loop_ctl_mutex is global, just get rid of loop_index_mutex as
there is no good reason to keep these two separate and it just
complicates the locking.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Thu, 8 Nov 2018 13:01:03 +0000 (14:01 +0100)]
loop: Fold __loop_release into loop_release
commit
967d1dc144b50ad005e5eecdfadfbcfb399ffff6 upstream.
__loop_release() has a single call site. Fold it there. This is
currently not a huge win but it will make following replacement of
loop_index_mutex more obvious.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tetsuo Handa [Thu, 8 Nov 2018 13:01:02 +0000 (14:01 +0100)]
block/loop: Use global lock for ioctl() operation.
commit
310ca162d779efee8a2dc3731439680f3e9c1e86 upstream.
syzbot is reporting NULL pointer dereference [1] which is caused by
race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus
ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other
loop devices at loop_validate_file() without holding corresponding
lo->lo_ctl_mutex locks.
Since ioctl() request on loop devices is not frequent operation, we don't
need fine grained locking. Let's use global lock in order to allow safe
traversal at loop_validate_file().
Note that syzbot is also reporting circular locking dependency between
bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling
blkdev_reread_part() with lock held. This patch does not address it.
[1] https://syzkaller.appspot.com/bug?id=
f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=
bf154052f0eea4bc7712499e4569505907d15889
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+bf89c128e05dd6c62523@syzkaller.appspotmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ying Xue [Mon, 14 Jan 2019 09:22:29 +0000 (17:22 +0800)]
tipc: fix uninit-value in tipc_nl_compat_doit
commit
2753ca5d9009c180dbfd4c802c80983b4b6108d1 upstream.
BUG: KMSAN: uninit-value in tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335
CPU: 0 PID: 4514 Comm: syz-executor485 Not tainted 4.16.0+ #87
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:53
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335
tipc_nl_compat_recv+0x164b/0x2700 net/tipc/netlink_compat.c:1153
genl_family_rcv_msg net/netlink/genetlink.c:599 [inline]
genl_rcv_msg+0x1686/0x1810 net/netlink/genetlink.c:624
netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2447
genl_rcv+0x63/0x80 net/netlink/genetlink.c:635
netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline]
netlink_unicast+0x166b/0x1740 net/netlink/af_netlink.c:1337
netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg net/socket.c:640 [inline]
___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
__sys_sendmsg net/socket.c:2080 [inline]
SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
SyS_sendmsg+0x54/0x80 net/socket.c:2087
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x43fda9
RSP: 002b:
00007ffd0c184ba8 EFLAGS:
00000213 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
000000000043fda9
RDX:
0000000000000000 RSI:
0000000020023000 RDI:
0000000000000003
RBP:
00000000006ca018 R08:
00000000004002c8 R09:
00000000004002c8
R10:
00000000004002c8 R11:
0000000000000213 R12:
00000000004016d0
R13:
0000000000401760 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
slab_post_alloc_hook mm/slab.h:445 [inline]
slab_alloc_node mm/slub.c:2737 [inline]
__kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
__kmalloc_reserve net/core/skbuff.c:138 [inline]
__alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
alloc_skb include/linux/skbuff.h:984 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline]
netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg net/socket.c:640 [inline]
___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
__sys_sendmsg net/socket.c:2080 [inline]
SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
SyS_sendmsg+0x54/0x80 net/socket.c:2087
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
In tipc_nl_compat_recv(), when the len variable returned by
nlmsg_attrlen() is 0, the message is still treated as a valid one,
which is obviously unresonable. When len is zero, it means the
message not only doesn't contain any valid TLV payload, but also
TLV header is not included. Under this stituation, tlv_type field
in TLV header is still accessed in tipc_nl_compat_dumpit() or
tipc_nl_compat_doit(), but the field space is obviously illegal.
Of course, it is not initialized.
Reported-by: syzbot+bca0dc46634781f08b38@syzkaller.appspotmail.com
Reported-by: syzbot+6bdb590321a7ae40c1a6@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ying Xue [Mon, 14 Jan 2019 09:22:28 +0000 (17:22 +0800)]
tipc: fix uninit-value in tipc_nl_compat_name_table_dump
commit
974cb0e3e7c963ced06c4e32c5b2884173fa5e01 upstream.
syzbot reported:
BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826
CPU: 0 PID: 6290 Comm: syz-executor848 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x306/0x460 lib/dump_stack.c:113
kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
__msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
__arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
__fswab32 include/uapi/linux/swab.h:59 [inline]
tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826
__tipc_nl_compat_dumpit+0x59e/0xdb0 net/tipc/netlink_compat.c:205
tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:270
tipc_nl_compat_handle net/tipc/netlink_compat.c:1151 [inline]
tipc_nl_compat_recv+0x1402/0x2760 net/tipc/netlink_compat.c:1210
genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x307/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007ffecec49318 EFLAGS:
00000213 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
0000000000440179
RDX:
0000000000000000 RSI:
0000000020000100 RDI:
0000000000000003
RBP:
00000000006ca018 R08:
0000000000000000 R09:
00000000004002c8
R10:
0000000000000000 R11:
0000000000000213 R12:
0000000000401a00
R13:
0000000000401a90 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
slab_post_alloc_hook mm/slab.h:446 [inline]
slab_alloc_node mm/slub.c:2727 [inline]
__kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
__kmalloc_reserve net/core/skbuff.c:138 [inline]
__alloc_skb+0x422/0xe90 net/core/skbuff.c:206
alloc_skb include/linux/skbuff.h:996 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x307/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
We cannot take for granted the thing that the length of data contained
in TLV is longer than the size of struct tipc_name_table_query in
tipc_nl_compat_name_table_dump().
Reported-by: syzbot+06e771a754829716a327@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ying Xue [Mon, 14 Jan 2019 09:22:27 +0000 (17:22 +0800)]
tipc: fix uninit-value in tipc_nl_compat_link_set
commit
edf5ff04a45750ac8ce2435974f001dc9cfbf055 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
__msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
strlen+0x3b/0xa0 lib/string.c:486
nla_put_string include/net/netlink.h:1154 [inline]
__tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline]
tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744
__tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x305/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in
nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)
This is because lc->name string is not validated before it's used.
Reported-by: syzbot+d78b8a29241a195aefb8@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ying Xue [Mon, 14 Jan 2019 09:22:26 +0000 (17:22 +0800)]
tipc: fix uninit-value in tipc_nl_compat_bearer_enable
commit
0762216c0ad2a2fccd63890648eca491f2c83d9a upstream.
syzbot reported:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:484
CPU: 1 PID: 6371 Comm: syz-executor652 Not tainted 4.19.0-rc8+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x306/0x460 lib/dump_stack.c:113
kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917
__msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500
strlen+0x3b/0xa0 lib/string.c:484
nla_put_string include/net/netlink.h:1011 [inline]
tipc_nl_compat_bearer_enable+0x238/0x7b0 net/tipc/netlink_compat.c:389
__tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
tipc_nl_compat_doit+0x39f/0xae0 net/tipc/netlink_compat.c:344
tipc_nl_compat_recv+0x147c/0x2760 net/tipc/netlink_compat.c:1107
genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626
netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454
genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343
netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x307/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440179
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007fffef7beee8 EFLAGS:
00000213 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
0000000000440179
RDX:
0000000000000000 RSI:
0000000020000100 RDI:
0000000000000003
RBP:
00000000006ca018 R08:
0000000000000000 R09:
00000000004002c8
R10:
0000000000000000 R11:
0000000000000213 R12:
0000000000401a00
R13:
0000000000401a90 R14:
0000000000000000 R15:
0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline]
kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180
kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104
kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113
slab_post_alloc_hook mm/slab.h:446 [inline]
slab_alloc_node mm/slub.c:2727 [inline]
__kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360
__kmalloc_reserve net/core/skbuff.c:138 [inline]
__alloc_skb+0x422/0xe90 net/core/skbuff.c:206
alloc_skb include/linux/skbuff.h:996 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xe47/0x1200 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x307/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
The root cause is that we don't validate whether bear name is a valid
string in tipc_nl_compat_bearer_enable().
Meanwhile, we also fix the same issue in the following functions:
tipc_nl_compat_bearer_disable()
tipc_nl_compat_link_stat_dump()
tipc_nl_compat_media_set()
tipc_nl_compat_bearer_set()
Reported-by: syzbot+b33d5cae0efd35dbfe77@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ying Xue [Mon, 14 Jan 2019 09:22:25 +0000 (17:22 +0800)]
tipc: fix uninit-value in tipc_nl_compat_link_reset_stats
commit
8b66fee7f8ee18f9c51260e7a43ab37db5177a05 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486
CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
__msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295
strlen+0x3b/0xa0 lib/string.c:486
nla_put_string include/net/netlink.h:1154 [inline]
tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760
__tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline]
tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344
tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline]
tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210
genl_family_rcv_msg net/netlink/genetlink.c:601 [inline]
genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626
netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477
genl_rcv+0x63/0x80 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336
netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x305/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007f2557338c78 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
0000000000457ec9
RDX:
0000000000000000 RSI:
00000000200001c0 RDI:
0000000000000003
RBP:
000000000073bf00 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f25573396d4
R13:
00000000004cb478 R14:
00000000004d86c8 R15:
00000000ffffffff
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
slab_post_alloc_hook mm/slab.h:446 [inline]
slab_alloc_node mm/slub.c:2759 [inline]
__kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
__kmalloc_reserve net/core/skbuff.c:137 [inline]
__alloc_skb+0x309/0xa20 net/core/skbuff.c:205
alloc_skb include/linux/skbuff.h:998 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116
__sys_sendmsg net/socket.c:2154 [inline]
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg+0x305/0x460 net/socket.c:2161
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in tipc_nl_compat_link_reset_stats:
nla_put_string(skb, TIPC_NLA_LINK_NAME, name)
This is because name string is not validated before it's used.
Reported-by: syzbot+e01d94b5a4c266be6e4c@syzkaller.appspotmail.com
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Xin Long [Mon, 14 Jan 2019 10:34:02 +0000 (18:34 +0800)]
sctp: allocate sctp_sockaddr_entry with kzalloc
commit
400b8b9a2a17918f8ce00786f596f530e7f30d50 upstream.
The similar issue as fixed in Commit
4a2eb0c37b47 ("sctp: initialize
sin6_flowinfo for ipv6 addrs in sctp_inet6addr_event") also exists
in sctp_inetaddr_event, as Alexander noticed.
To fix it, allocate sctp_sockaddr_entry with kzalloc for both sctp
ipv4 and ipv6 addresses, as does in sctp_v4/6_copy_addrlist().
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reported-by: syzbot+ae0c70c0c2d40c51bb92@syzkaller.appspotmail.com
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Mon, 14 Jan 2019 08:48:10 +0000 (09:48 +0100)]
blockdev: Fix livelocks on loop device
commit
04906b2f542c23626b0ef6219b808406f8dddbe9 upstream.
bd_set_size() updates also block device's block size. This is somewhat
unexpected from its name and at this point, only blkdev_open() uses this
functionality. Furthermore, this can result in changing block size under
a filesystem mounted on a loop device which leads to livelocks inside
__getblk_gfp() like:
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106
...
Call Trace:
init_page_buffers+0x3e2/0x530 fs/buffer.c:904
grow_dev_page fs/buffer.c:947 [inline]
grow_buffers fs/buffer.c:1009 [inline]
__getblk_slow fs/buffer.c:1036 [inline]
__getblk_gfp+0x906/0xb10 fs/buffer.c:1313
__bread_gfp+0x2d/0x310 fs/buffer.c:1347
sb_bread include/linux/buffer_head.h:307 [inline]
fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75
fat_ent_read_block fs/fat/fatent.c:441 [inline]
fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489
fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101
__fat_get_block fs/fat/inode.c:148 [inline]
...
Trivial reproducer for the problem looks like:
truncate -s 1G /tmp/image
losetup /dev/loop0 /tmp/image
mkfs.ext4 -b 1024 /dev/loop0
mount -t ext4 /dev/loop0 /mnt
losetup -c /dev/loop0
l /mnt
Fix the problem by moving initialization of a block device block size
into a separate function and call it when needed.
Thanks to Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> for help with
debugging the problem.
Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stephen Smalley [Wed, 9 Jan 2019 15:55:10 +0000 (10:55 -0500)]
selinux: fix GPF on invalid policy
commit
5b0e7310a2a33c06edc7eb81ffc521af9b2c5610 upstream.
levdatum->level can be NULL if we encounter an error while loading
the policy during sens_read prior to initializing it. Make sure
sens_destroy handles that case correctly.
Reported-by: syzbot+6664500f0f18f07a5c0e@syzkaller.appspotmail.com
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shakeel Butt [Thu, 3 Jan 2019 03:14:31 +0000 (19:14 -0800)]
netfilter: ebtables: account ebt_table_info to kmemcg
commit
e2c8d550a973bb34fc28bc8d0ec996f84562fb8a upstream.
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
memory is already accounted to kmemcg. Do the same for ebtables. The
syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
whole system from a restricted memcg, a potential DoS.
By accounting the ebt_table_info, the memory used for ebt_table_info can
be contained within the memcg of the allocating process. However the
lifetime of ebt_table_info is independent of the allocating process and
is tied to the network namespace. So, the oom-killer will not be able to
relieve the memory pressure due to ebt_table_info memory. The memory for
ebt_table_info is allocated through vmalloc. Currently vmalloc does not
handle the oom-killed allocating process correctly and one large
allocation can bypass memcg limit enforcement. So, with this patch,
at least the small allocations will be contained. For large allocations,
we need to fix vmalloc.
Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
J. Bruce Fields [Thu, 20 Dec 2018 15:35:11 +0000 (10:35 -0500)]
sunrpc: handle ENOMEM in rpcb_getport_async
commit
81c88b18de1f11f70c97f28ced8d642c00bb3955 upstream.
If we ignore the error we'll hit a null dereference a little later.
Reported-by: syzbot+4b98281f2401ab849f4b@syzkaller.appspotmail.com
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Tue, 13 Nov 2018 14:06:46 +0000 (09:06 -0500)]
media: vb2: vb2_mmap: move lock up
commit
cd26d1c4d1bc947b56ae404998ae2276df7b39b7 upstream.
If a filehandle is dup()ped, then it is possible to close it from one fd
and call mmap from the other. This creates a race condition in vb2_mmap
where it is using queue data that __vb2_queue_free (called from close())
is in the process of releasing.
By moving up the mutex_lock(mmap_lock) in vb2_mmap this race is avoided
since __vb2_queue_free is called with the same mutex locked. So vb2_mmap
now reads consistent buffer data.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Reported-by: syzbot+be93025dd45dccd8923c@syzkaller.appspotmail.com
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Morris [Wed, 16 Jan 2019 23:41:11 +0000 (15:41 -0800)]
LSM: Check for NULL cred-security on free
commit
a5795fd38ee8194451ba3f281f075301a3696ce2 upstream.
From: Casey Schaufler <casey@schaufler-ca.com>
Check that the cred security blob has been set before trying
to clean it up. There is a case during credential initialization
that could result in this.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Reported-by: syzbot+69ca07954461f189e808@syzkaller.appspotmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Mon, 29 Oct 2018 17:32:38 +0000 (13:32 -0400)]
media: vivid: set min width/height to a value > 0
commit
9729d6d282a6d7ce88e64c9119cecdf79edf4e88 upstream.
The capture DV timings capabilities allowed for a minimum width and
height of 0. So passing a timings struct with 0 values is allowed
and will later cause a division by zero.
Ensure that the width and height must be >= 16 to avoid this.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: syzbot+57c3d83d71187054d56f@syzkaller.appspotmail.com
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Mon, 29 Oct 2018 10:15:31 +0000 (06:15 -0400)]
media: vivid: fix error handling of kthread_run
commit
701f49bc028edb19ffccd101997dd84f0d71e279 upstream.
kthread_run returns an error pointer, but elsewhere in the code
dev->kthread_vid_cap/out is checked against NULL.
If kthread_run returns an error, then set the pointer to NULL.
I chose this method over changing all kthread_vid_cap/out tests
elsewhere since this is more robust.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: syzbot+53d5b2df0d9744411e2e@syzkaller.appspotmail.com
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vlad Tsyrklevich [Fri, 11 Jan 2019 13:34:38 +0000 (14:34 +0100)]
omap2fb: Fix stack memory disclosure
commit
a01421e4484327fe44f8e126793ed5a48a221e24 upstream.
Using [1] for static analysis I found that the OMAPFB_QUERY_PLANE,
OMAPFB_GET_COLOR_KEY, OMAPFB_GET_DISPLAY_INFO, and OMAPFB_GET_VRAM_INFO
cases could all leak uninitialized stack memory--either due to
uninitialized padding or 'reserved' fields.
Fix them by clearing the shared union used to store copied out data.
[1] https://github.com/vlad902/kernel-uninitialized-memory-checker
Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes:
b39a982ddecf ("OMAP: DSS2: omapfb driver")
Cc: security@kernel.org
[b.zolnierkie: prefix patch subject with "omap2fb: "]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
YunQiang Su [Tue, 8 Jan 2019 05:45:10 +0000 (13:45 +0800)]
Disable MSI also when pcie-octeon.pcie_disable on
commit
a214720cbf50cd8c3f76bbb9c3f5c283910e9d33 upstream.
Octeon has an boot-time option to disable pcie.
Since MSI depends on PCI-E, we should also disable MSI also with
this option is on in order to avoid inadvertently accessing PCIe
registers.
Signed-off-by: YunQiang Su <ysu@wavecomp.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: pburton@wavecomp.com
Cc: linux-mips@vger.kernel.org
Cc: aaro.koskinen@iki.fi
Cc: stable@vger.kernel.org # v3.3+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>