git.stricted.de - GitHub/exynos8895/android_kernel_samsung

MIPS: KGDB: Use kernel context for sleeping threads

commit 162b270c664dca2e0944308e92f9fcc887151a72 upstream.

KGDB is a kernel debug stub and it can't be used to debug userland as it
can only safely access kernel memory.

On MIPS however KGDB has always got the register state of sleeping
processes from the userland register context at the beginning of the
kernel stack. This is meaningless for kernel threads (which never enter
userland), and for user threads it prevents the user seeing what it is
doing while in the kernel:

(gdb) info threads
  Id   Target Id         Frame
  ...
  3    Thread 2 (kthreadd) 0x0000000000000000 in ?? ()
  2    Thread 1 (init)   0x000000007705c4b4 in ?? ()
  1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201

Get the register state instead from the (partial) kernel register
context stored in the task's thread_struct for resume() to restore. All
threads now correctly appear to be in context_switch():

(gdb) info threads
  Id   Target Id         Frame
  ...
  3    Thread 2 (kthreadd) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
  2    Thread 1 (init)   context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
  1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201

Call clobbered registers which aren't saved and exception registers
(BadVAddr & Cause) which can't be easily determined without stack
unwinding are reported as 0. The PC is taken from the return address,
such that the state presented matches that found immediately after
returning from resume().

Fixes: 8854700115ec ("[MIPS] kgdb: add arch support for the kernel's kgdb core")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15829/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: seq: Don't break snd_use_lock_sync() loop by timeout

commit 4e7655fd4f47c23e5249ea260dc802f909a64611 upstream.

The snd_use_lock_sync() (thus its implementation
snd_use_lock_sync_helper()) has the 5 seconds timeout to break out of
the sync loop. It was introduced from the beginning, just to be
"safer", in terms of avoiding the stupid bugs.

However, as Ben Hutchings suggested, this timeout rather introduces a
potential leak or use-after-free that was apparently fixed by the
commit 2d7d54002e39 ("ALSA: seq: Fix race during FIFO resize"):
for example, snd_seq_fifo_event_in() -> snd_seq_event_dup() ->
copy_from_user() could block for a long time, and snd_use_lock_sync()
goes timeout and still leaves the cell at releasing the pool.

For fixing such a problem, we remove the break by the timeout while
still keeping the warning.

Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: firewire-lib: fix inappropriate assignment between signed/unsigned type

commit dfb00a56935186171abb5280b3407c3f910011f1 upstream.

An abstraction of asynchronous transaction for transmission of MIDI
messages was introduced in Linux v4.4. Each driver can utilize this
abstraction to transfer MIDI messages via fixed-length payload of
transaction to a certain unit address. Filling payload of the transaction
is done by callback. In this callback, each driver can return negative
error code, however current implementation assigns the return value to
unsigned variable.

This commit changes type of the variable to fix the bug.

Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Fixes: 585d7cba5e1f ("ALSA: firewire-lib: add helper functions for asynchronous transactions to transfer MIDI messages")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: check raw payload size correctly in ioctl

[ Upstream commit 105f5528b9bbaa08b526d3405a5bcd2ff0c953c8 ]

In situations where an skb is paged, the transport header pointer and
tail pointer can be the same because the skb contents are in frags.

This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
length of 0 when the length to receive is actually greater than zero.

skb->len is already correctly set in ip6_input_finish() with
pskb_pull(), so use skb->len as it always returns the correct result
for both linear and paged data.

Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: check skb->protocol before lookup for nexthop

[ Upstream commit 199ab00f3cdb6f154ea93fa76fd80192861a821d ]

Andrey reported a out-of-bound access in ip6_tnl_xmit(), this
is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4
neigh key as an IPv6 address:

        neigh = dst_neigh_lookup(skb_dst(skb),
                                 &ipv6_hdr(skb)->daddr);
        if (!neigh)
                goto tx_err_link_failure;

        addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
        addr_type = ipv6_addr_type(addr6);

        if (addr_type == IPV6_ADDR_ANY)
                addr6 = &ipv6_hdr(skb)->daddr;

        memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));

Also the network header of the skb at this point should be still IPv4
for 4in6 tunnels, we shold not just use it as IPv6 header.

This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
is, we are safe to do the nexthop lookup using skb_dst() and
ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
dest address we can pick here, we have to rely on callers to fill it
from tunnel config, so just fall to ip6_route_output() to make the
decision.

Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

macvlan: Fix device ref leak when purging bc_queue

[ Upstream commit f6478218e6edc2a587b8f132f66373baa7b2497c ]

When a parent macvlan device is destroyed we end up purging its
broadcast queue without dropping the device reference count on
the packet source device. This causes the source device to linger.

This patch drops that reference count.

Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...")
Reported-by: Joe Ghalam <Joe.Ghalam@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ip6mr: fix notification device destruction

[ Upstream commit 723b929ca0f79c0796f160c2eeda4597ee98d2b8 ]

Andrey Konovalov reported a BUG caused by the ip6mr code which is caused
because we call unregister_netdevice_many for a device that is already
being destroyed. In IPv4's ipmr that has been resolved by two commits
long time ago by introducing the "notify" parameter to the delete
function and avoiding the unregister when called from a notifier, so
let's do the same for ip6mr.

The trace from Andrey:
------------[ cut here ]------------
kernel BUG at net/core/dev.c:6813!
invalid opcode: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 1165 Comm: kworker/u4:3 Not tainted 4.11.0-rc7+ #251
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
Workqueue: netns cleanup_net
task: ffff880069208000 task.stack: ffff8800692d8000
RIP: 0010:rollback_registered_many+0x348/0xeb0 net/core/dev.c:6813
RSP: 0018:ffff8800692de7f0 EFLAGS: 00010297
RAX: ffff880069208000 RBX: 0000000000000002 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006af90569
RBP: ffff8800692de9f0 R08: ffff8800692dec60 R09: 0000000000000000
R10: 0000000000000006 R11: 0000000000000000 R12: ffff88006af90070
R13: ffff8800692debf0 R14: dffffc0000000000 R15: ffff88006af90000
FS: 0000000000000000(0000) GS:ffff88006cb00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe7e897d870 CR3: 00000000657e7000 CR4: 00000000000006e0
Call Trace:
unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
unregister_netdevice_many+0xc8/0x120 net/core/dev.c:7880
ip6mr_device_event+0x362/0x3f0 net/ipv6/ip6mr.c:1346
notifier_call_chain+0x145/0x2f0 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394
raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1647
call_netdevice_notifiers net/core/dev.c:1663
rollback_registered_many+0x919/0xeb0 net/core/dev.c:6841
unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
unregister_netdevice_many net/core/dev.c:7880
default_device_exit_batch+0x4fa/0x640 net/core/dev.c:8333
ops_exit_list.isra.4+0x100/0x150 net/core/net_namespace.c:144
cleanup_net+0x5a8/0xb40 net/core/net_namespace.c:463
process_one_work+0xc04/0x1c10 kernel/workqueue.c:2097
worker_thread+0x223/0x19c0 kernel/workqueue.c:2231
kthread+0x35e/0x430 kernel/kthread.c:231
ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Code: 3c 32 00 0f 85 70 0b 00 00 48 b8 00 02 00 00 00 00 ad de 49 89
47 78 e9 93 fe ff ff 49 8d 57 70 49 8d 5f 78 eb 9e e8 88 7a 14 fe <0f>
0b 48 8b 9d 28 fe ff ff e8 7a 7a 14 fe 48 b8 00 00 00 00 00
RIP: rollback_registered_many+0x348/0xeb0 RSP: ffff8800692de7f0
---[ end trace e0b29c57e9b3292c ]---

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netpoll: Check for skb->queue_mapping

[ Upstream commit c70b17b775edb21280e9de7531acf6db3b365274 ]

Reducing real_num_tx_queues needs to be in sync with skb queue_mapping
otherwise skbs with queue_mapping greater than real_num_tx_queues
can be sent to the underlying driver and can result in kernel panic.

One such event is running netconsole and enabling VF on the same
device. Or running netconsole and changing number of tx queues via
ethtool on same device.

e.g.
Unable to handle kernel NULL pointer dereference
tsk->{mm,active_mm}->context = 0000000000001525
tsk->{mm,active_mm}->pgd = fff800130ff9a000
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
kworker/48:1(475): Oops [#1]
CPU: 48 PID: 475 Comm: kworker/48:1 Tainted: G           OE
4.11.0-rc3-davem-net+ #7
Workqueue: events queue_process
task: fff80013113299c0 task.stack: fff800131132c000
TSTATE: 0000004480e01600 TPC: 00000000103f9e3c TNPC: 00000000103f9e40 Y:
00000000    Tainted: G           OE
TPC: <ixgbe_xmit_frame_ring+0x7c/0x6c0 [ixgbe]>
g0: 0000000000000000 g1: 0000000000003fff g2: 0000000000000000 g3:
0000000000000001
g4: fff80013113299c0 g5: fff8001fa6808000 g6: fff800131132c000 g7:
00000000000000c0
o0: fff8001fa760c460 o1: fff8001311329a50 o2: fff8001fa7607504 o3:
0000000000000003
o4: fff8001f96e63a40 o5: fff8001311d77ec0 sp: fff800131132f0e1 ret_pc:
000000000049ed94
RPC: <set_next_entity+0x34/0xb80>
l0: 0000000000000000 l1: 0000000000000800 l2: 0000000000000000 l3:
0000000000000000
l4: 000b2aa30e34b10d l5: 0000000000000000 l6: 0000000000000000 l7:
fff8001fa7605028
i0: fff80013111a8a00 i1: fff80013155a0780 i2: 0000000000000000 i3:
0000000000000000
i4: 0000000000000000 i5: 0000000000100000 i6: fff800131132f1a1 i7:
00000000103fa4b0
I7: <ixgbe_xmit_frame+0x30/0xa0 [ixgbe]>
Call Trace:
[00000000103fa4b0] ixgbe_xmit_frame+0x30/0xa0 [ixgbe]
[0000000000998c74] netpoll_start_xmit+0xf4/0x200
[0000000000998e10] queue_process+0x90/0x160
[0000000000485fa8] process_one_work+0x188/0x480
[0000000000486410] worker_thread+0x170/0x4c0
[000000000048c6b8] kthread+0xd8/0x120
[0000000000406064] ret_from_fork+0x1c/0x2c
[0000000000000000]           (null)
Disabling lock debugging due to kernel taint
Caller[00000000103fa4b0]: ixgbe_xmit_frame+0x30/0xa0 [ixgbe]
Caller[0000000000998c74]: netpoll_start_xmit+0xf4/0x200
Caller[0000000000998e10]: queue_process+0x90/0x160
Caller[0000000000485fa8]: process_one_work+0x188/0x480
Caller[0000000000486410]: worker_thread+0x170/0x4c0
Caller[000000000048c6b8]: kthread+0xd8/0x120
Caller[0000000000406064]: ret_from_fork+0x1c/0x2c
Caller[0000000000000000]:           (null)

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: ipv6: RTF_PCPU should not be settable from userspace

[ Upstream commit 557c44be917c322860665be3d28376afa84aa936 ]

Andrey reported a fault in the IPv6 route code:

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880069809600 task.stack: ffff880062dc8000
RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
RSP: 0018:ffff880062dced30 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
FS: 00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
Call Trace:
ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128
ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212
...

Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit
set. Flags passed to the kernel are blindly copied to the allocated
rt6_info by ip6_route_info_create making a newly inserted route appear
as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set
and expects rt->dst.from to be set - which it is not since it is not
really a per-cpu copy. The subsequent call to __ip6_dst_alloc then
generates the fault.

Fix by checking for the flag and failing with EINVAL.

Fixes: d52d3997f843f ("ipv6: Create percpu rt6_info")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dp83640: don't recieve time stamps twice

[ Upstream commit 9d386cd9a755c8293e8916264d4d053878a7c9c7 ]

This patch is prompted by a static checker warning about a potential
use after free.  The concern is that netif_rx_ni() can free "skb" and we
call it twice.

When I look at the commit that added this, it looks like some stray
lines were added accidentally.  It doesn't make sense to me that we
would recieve the same data two times.  I asked the author but never
recieved a response.

I can't test this code, but I'm pretty sure my patch is correct.

Fixes: 4b063258ab93 ("dp83640: Delay scheduled work.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tcp: clear saved_syn in tcp_disconnect()

[ Upstream commit 17c3060b1701fc69daedb4c90be6325d3d9fca8e ]

In the (very unlikely) case a passive socket becomes a listener,
we do not want to duplicate its saved SYN headers.

This would lead to double frees, use after free, and please hackers and
various fuzzers

Tested:
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, IPPROTO_TCP, TCP_SAVE_SYN, [1], 4) = 0
   +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 5) = 0

   +0 < S 0:0(0) win 32972 <mss 1460,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <...>
  +.1 < . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

   +0 connect(4, AF_UNSPEC, ...) = 0
   +0 close(3) = 0
   +0 bind(4, ..., ...) = 0
   +0 listen(4, 5) = 0

   +0 < S 0:0(0) win 32972 <mss 1460,nop,wscale 7>
   +0 > S. 0:0(0) ack 1 <...>
  +.1 < . 1:1(0) ack 1 win 257

Fixes: cd8ae85299d5 ("tcp: provide SYN headers for passive connections")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: listen on the sock only when it's state is listening or closed

[ Upstream commit 34b2789f1d9bf8dcca9b5cb553d076ca2cd898ee ]

Now sctp doesn't check sock's state before listening on it. It could
even cause changing a sock with any state to become a listening sock
when doing sctp_listen.

This patch is to fix it by checking sock's state in sctp_listen, so
that it will listen on the sock with right state.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: ipv4: fix multipath RTM_GETROUTE behavior when iif is given

[ Upstream commit a8801799c6975601fd58ae62f48964caec2eb83f ]

inet_rtm_getroute synthesizes a skeletal ICMP skb, which is passed to
ip_route_input when iif is given. If a multipath route is present for
the designated destination, ip_multipath_icmp_hash ends up being called,
which uses the source/destination addresses within the skb to calculate
a hash. However, those are not set in the synthetic skb, causing it to
return an arbitrary and incorrect result.

Instead, use UDP, which gets no such special treatment.

Signed-off-by: Florian Larysch <fl@n621.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

l2tp: fix PPP pseudo-wire auto-loading

[ Upstream commit 249ee819e24c180909f43c1173c8ef6724d21faf ]

PPP pseudo-wire type is 7 (11 is L2TP_PWTYPE_IP).

Fixes: f1f39f911027 ("l2tp: auto load type modules")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

l2tp: take reference on sessions being dumped

[ Upstream commit e08293a4ccbcc993ded0fdc46f1e57926b833d63 ]

Take a reference on the sessions returned by l2tp_session_find_nth()
(and rename it l2tp_session_get_nth() to reflect this change), so that
caller is assured that the session isn't going to disappear while
processing it.

For procfs and debugfs handlers, the session is held in the .start()
callback and dropped in .show(). Given that pppol2tp_seq_session_show()
dereferences the associated PPPoL2TP socket and that
l2tp_dfs_seq_session_show() might call pppol2tp_show(), we also need to
call the session's .ref() callback to prevent the socket from going
away from under us.

Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Fixes: 0ad6614048cf ("l2tp: Add debugfs files for dumping l2tp debug info")
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/packet: fix overflow in check for tp_reserve

[ Upstream commit bcc5364bdcfe131e6379363f089e7b4108d35b70 ]

When calculating po->tp_hdrlen + po->tp_reserve the result can overflow.

Fix by checking that tp_reserve <= INT_MAX on assign.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/packet: fix overflow in check for tp_frame_nr

[ Upstream commit 8f8d28e4d6d815a391285e121c3a53a0b6cb9e7b ]

When calculating rb->frames_per_block * req->tp_block_nr the result
can overflow.

Add a check that tp_block_size * tp_block_nr <= UINT_MAX.

Since frames_per_block <= tp_block_size, the expression would
never overflow.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

l2tp: purge socket queues in the .destruct() callback

[ Upstream commit e91793bb615cf6cdd59c0b6749fe173687bb0947 ]

The Rx path may grab the socket right before pppol2tp_release(), but
nothing guarantees that it will enqueue packets before
skb_queue_purge(). Therefore, the socket can be destroyed without its
queues fully purged.

Fix this by purging queues in pppol2tp_session_destruct() where we're
guaranteed nothing is still referencing the socket.

Fixes: 9e9cb6221aa7 ("l2tp: fix userspace reception on plain L2TP sockets")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: phy: handle state correctly in phy_stop_machine

[ Upstream commit 49d52e8108a21749dc2114b924c907db43358984 ]

If the PHY is halted on stop, then do not set the state to PHY_UP. This
ensures the phy will be restarted later in phy_start when the machine is
started again.

Fixes: 00db8189d984 ("This patch adds a PHY Abstraction Layer to the Linux Kernel, enabling ethernet drivers to remain as ignorant as is reasonable of the connected PHY's design and operation details.")
Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Xander Huff <xander.huff@ni.com>
Acked-by: Kyle Roeschley <kyle.roeschley@ni.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: neigh: guard against NULL solicit() method

[ Upstream commit 48481c8fa16410ffa45939b13b6c53c2ca609e5f ]

Dmitry posted a nice reproducer of a bug triggering in neigh_probe()
when dereferencing a NULL neigh->ops->solicit method.

This can happen for arp_direct_ops/ndisc_direct_ops and similar,
which can be used for NUD_NOARP neighbours (created when dev->header_ops
is NULL). Admin can then force changing nud_state to some other state
that would fire neigh timer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write()

[ Upstream commit 9ae34dbd8afd790cb5f52467e4f816434379eafa ]

This commit moves sparc64's prototype of pmd_write() outside
of the CONFIG_TRANSPARENT_HUGEPAGE ifdef.

In 2013, commit a7b9403f0e6d ("sparc64: Encode huge PMDs using PTE
encoding.") exposed a path where pmd_write() could be called without
CONFIG_TRANSPARENT_HUGEPAGE defined.  This can result in the panic below.

The diff is awkward to read, but the changes are straightforward.
pmd_write() was moved outside of #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
Also, __HAVE_ARCH_PMD_WRITE was defined.

kernel BUG at include/asm-generic/pgtable.h:576!
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
oracle_8114_cdb(8114): Kernel bad sw trap 5 [#1]
CPU: 120 PID: 8114 Comm: oracle_8114_cdb Not tainted
4.1.12-61.7.1.el6uek.rc1.sparc64 #1
task: fff8400700a24d60 ti: fff8400700bc4000 task.ti: fff8400700bc4000
TSTATE: 0000004411e01607 TPC: 00000000004609f8 TNPC: 00000000004609fc Y:
00000005    Not tainted
TPC: <gup_huge_pmd+0x198/0x1e0>
g0: 000000000001c000 g1: 0000000000ef3954 g2: 0000000000000000 g3: 0000000000000001
g4: fff8400700a24d60 g5: fff8001fa5c10000 g6: fff8400700bc4000 g7: 0000000000000720
o0: 0000000000bc5058 o1: 0000000000000240 o2: 0000000000006000 o3: 0000000000001c00
o4: 0000000000000000 o5: 0000048000080000 sp: fff8400700bc6ab1 ret_pc: 00000000004609f0
RPC: <gup_huge_pmd+0x190/0x1e0>
l0: fff8400700bc74fc l1: 0000000000020000 l2: 0000000000002000 l3: 0000000000000000
l4: fff8001f93250950 l5: 000000000113f800 l6: 0000000000000004 l7: 0000000000000000
i0: fff8400700ca46a0 i1: bd0000085e800453 i2: 000000026a0c4000 i3: 000000026a0c6000
i4: 0000000000000001 i5: fff800070c958de8 i6: fff8400700bc6b61 i7: 0000000000460dd0
I7: <gup_pud_range+0x170/0x1a0>
Call Trace:
[0000000000460dd0] gup_pud_range+0x170/0x1a0
[0000000000460e84] get_user_pages_fast+0x84/0x120
[00000000006f5a18] iov_iter_get_pages+0x98/0x240
[00000000005fa744] do_direct_IO+0xf64/0x1e00
[00000000005fbbc0] __blockdev_direct_IO+0x360/0x15a0
[00000000101f74fc] ext4_ind_direct_IO+0xdc/0x400 [ext4]
[00000000101af690] ext4_ext_direct_IO+0x1d0/0x2c0 [ext4]
[00000000101af86c] ext4_direct_IO+0xec/0x220 [ext4]
[0000000000553bd4] generic_file_read_iter+0x114/0x140
[00000000005bdc2c] __vfs_read+0xac/0x100
[00000000005bf254] vfs_read+0x54/0x100
[00000000005bf368] SyS_pread64+0x68/0x80

Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sparc64: kern_addr_valid regression

[ Upstream commit adfae8a5d833fa2b46577a8081f350e408851f5b ]

I encountered this bug when using /proc/kcore to examine the kernel. Plus a
coworker inquired about debugging tools. We computed pa but did
not use it during the maximum physical address bits test. Instead we used
the identity mapped virtual address which will always fail this test.

I believe the defect came in here:
[bpicco@zareason linus.git]$ git describe --contains bb4e6e85daa52
v3.18-rc1~87^2~4
.

Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen/x86: don't lose event interrupts

commit c06b6d70feb32d28f04ba37aa3df17973fd37b6b upstream.

On slow platforms with unreliable TSC, such as QEMU emulated machines,
it is possible for the kernel to request the next event in the past. In
that case, in the current implementation of xen_vcpuop_clockevent, we
simply return -ETIME. To be precise the Xen returns -ETIME and we pass
it on. However the result of this is a missed event, which simply causes
the kernel to hang.

Instead it is better to always ask the hypervisor for a timer event,
even if the timeout is in the past. That way there are no lost
interrupts and the kernel survives. To do that, remove the
VCPU_SSHOTTMR_future flag.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: f_midi: Fixed a bug when buflen was smaller than wMaxPacketSize

commit 03d27ade4941076b34c823d63d91dc895731a595 upstream.

buflen by default (256) is smaller than wMaxPacketSize (512) in high-speed
devices.

That caused the OUT endpoint to freeze if the host send any data packet of
length greater than 256 bytes.

This is an example dump of what happended on that enpoint:
HOST:   [DATA][Length=260][...]
DEVICE: [NAK]
HOST:   [PING]
DEVICE: [NAK]
HOST:   [PING]
DEVICE: [NAK]
...
HOST:   [PING]
DEVICE: [NAK]

This patch fixes this problem by setting the minimum usb_request's buffer size
for the OUT endpoint as its wMaxPacketSize.

Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Felipe F. Tonello <eu@felipetonello.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

regulator: core: Clear the supply pointer if enabling fails

commit 8e5356a73604f53da6a1e0756727cb8f9f7bba17 upstream.

During the resolution of a regulator's supply, we may attempt to enable
the supply if the regulator itself is already enabled. If enabling the
supply fails, then we will call _regulator_put() for the supply.
However, the pointer to the supply has not been cleared for the
regulator and this will cause a crash if we then unregister the
regulator and attempt to call regulator_put() a second time for the
supply. Fix this by clearing the supply pointer if enabling the supply
after fails when resolving the supply for a regulator.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RDS: Fix the atomicity for congestion map update

commit e47db94e10447fc467777a40302f2b393e9af2fa upstream.

Two different threads with different rds sockets may be in
rds_recv_rcvbuf_delta() via receive path. If their ports
both map to the same word in the congestion map, then
using non-atomic ops to update it could cause the map to
be incorrect. Lets use atomics to avoid such an issue.

Full credit to Wengang <wen.gang.wang@oracle.com> for
finding the issue, analysing it and also pointing out
to offending code with spin lock based fix.

Reviewed-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net_sched: close another race condition in tcf_mirred_release()

commit dc327f8931cb9d66191f489eb9a852fc04530546 upstream.

We saw the following extra refcount release on veth device:

  kernel: [7957821.463992] unregister_netdevice: waiting for mesos50284 to become free. Usage count = -1

Since we heavily use mirred action to redirect packets to veth, I think
this is caused by the following race condition:

CPU0:
tcf_mirred_release(): (in RCU callback)
struct net_device *dev = rcu_dereference_protected(m->tcfm_dev, 1);

CPU1:
mirred_device_event():
        spin_lock_bh(&mirred_list_lock);
        list_for_each_entry(m, &mirred_list, tcfm_list) {
                if (rcu_access_pointer(m->tcfm_dev) == dev) {
                        dev_put(dev);
                        /* Note : no rcu grace period necessary, as
                         * net_device are already rcu protected.
                         */
                        RCU_INIT_POINTER(m->tcfm_dev, NULL);
                }
        }
        spin_unlock_bh(&mirred_list_lock);

CPU0:
tcf_mirred_release():
        spin_lock_bh(&mirred_list_lock);
        list_del(&m->tcfm_list);
        spin_unlock_bh(&mirred_list_lock);
        if (dev)               // <======== Stil refers to the old m->tcfm_dev
                dev_put(dev);  // <======== dev_put() is called on it again

The action init code path is good because it is impossible to modify
an action that is being removed.

So, fix this by moving everything under the spinlock.

Fixes: 2ee22a90c7af ("net_sched: act_mirred: remove spinlock in fast path")
Fixes: 6bd00b850635 ("act_mirred: fix a race condition on mirred_list")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata

commit 8e6ce7ebeb34f0992f56de078c3744fb383657fa upstream.

The label lio_xmit_failed is used 3 times through liquidio_xmit() but it
always makes a call to dma_unmap_single() using potentially
uninitialized variables from "ndata" variable. Out of the 3 gotos, 2 run
after ndata has been initialized, and had a prior dma_map_single() call.

Fix this by adding a new error label: lio_xmit_dma_failed which does
this dma_unmap_single() and then processed with the lio_xmit_failed
fallthrough.

Fixes: f21fb3ed364bb ("Add support of Cavium Liquidio ethernet adapters")
Reported-by: coverity (CID 1309740)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: Fix crash registers on non-crashing CPUs

commit c80e1b62ffca52e2d1d865ee58bc79c4c0c55005 upstream.

As part of handling a crash on an SMP system, an IPI is send to
all other CPUs to save their current registers and stop. It was
using task_pt_regs(current) to get the registers, but that will
only be accurate if the CPU was interrupted running in userland.
Instead allow the architecture to pass in the registers (all
pass NULL now, but allow for the future) and then use get_irq_regs()
which should be accurate as we are in an interrupt. Fall back to
task_pt_regs(current) if nothing else is available.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13050/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md:raid1: fix a dead loop when read from a WriteMostly disk

commit 816b0acf3deb6d6be5d0519b286fdd4bafade905 upstream.

If first_bad == this_sector when we get the WriteMostly disk
in read_balance(), valid disk will be returned with zero
max_sectors. It'll lead to a dead loop in make_request(), and
OOM will happen because of endless allocation of struct bio.

Since we can't get data from this disk in this case, so
continue for another disk.

Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()

commit 9e92f48c34eb2b9af9d12f892e2fe1fce5e8ce35 upstream.

We aren't checking to see if the in-inode extended attribute is
corrupted before we try to expand the inode's extra isize fields.

This can lead to potential crashes caused by the BUG_ON() check in
ext4_xattr_shift_entries().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: fix array out of bounds

commit 484f689fc9d4eb91c68f53e97dc355b1b06c3edb upstream.

When the initial value of i is greater than zero,
it may cause endless loop, resulting in array out
of bounds, fix it.

This is a port of the radeon fix to amdgpu.

Signed-off-by: tom will <os@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: testmgr - fix out of bound read in __test_aead()

commit abfa7f4357e3640fdee87dfc276fd0f379fb5ae6 upstream.

__test_aead() reads MAX_IVLEN bytes from template[i].iv, but the
actual length of the initialisation vector can be shorter.
The length of the IV is already calculated earlier in the
function. Let's just reuses that. Also the IV length is currently
calculated several time for no reason. Let's fix that too.
This fix an out-of-bound error detected by KASan.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: sunxi: Add apb0 gates for H3

commit 6e17b4181603d183d20c73f4535529ddf2a2a020 upstream.

This patch adds support for APB0 in H3. It seems to be compatible with
earlier SOCs. apb0 gates controls R_ block peripherals (R_PIO, R_IR,
etc).

Since this gates behave just like any Allwinner clock gate, add a generic
compatible that can be reused if we don't have any clock to protect.

Signed-off-by: Krzysztof Adamski <k@japko.eu>
[Maxime: Removed the H3 compatible from the simple-gates driver, reworked
the commit log a bit]
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: OMAP2+: timer: add probe for clocksources

commit 970f9091d25df14e9540ec7ff48a2f709e284cd1 upstream.

A few platforms are currently missing clocksource_probe() completely
in their time_init functionality. On OMAP3430 for example, this is
causing cpuidle to be pretty much dead, as the counter32k is not
going to be registered and instead a gptimer is used as a clocksource.
This will tick in periodic mode, preventing any deeper idle states.

While here, also drop one unnecessary check for populated DT before
existing clocksource_probe() call.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xc2028: unlock on error in xc2028_set_config()

commit 210bd104c6acd31c3c6b8b075b3f12d4a9f6b60d upstream.

We have to unlock before returning -ENOMEM.

Fixes: 8dfbcc4351a0 ('[media] xc2028: avoid use after free')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f2fs: do more integrity verification for superblock

commit 9a59b62fd88196844cee5fff851bee2cfd7afb6e upstream.

Do more sanity check for superblock during ->mount.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 4.4.65

perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race

commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream.

Di Shen reported a race between two concurrent sys_perf_event_open()
calls where both try and move the same pre-existing software group
into a hardware context.

The problem is exactly that described in commit:

f63a8daa5812 ("perf: Fix event->ctx locking")

... where, while we wait for a ctx->mutex acquisition, the event->ctx
relation can have changed under us.

That very same commit failed to recognise sys_perf_event_context() as an
external access vector to the events and thereby didn't apply the
established locking rules correctly.

So while one sys_perf_event_open() call is stuck waiting on
mutex_lock_double(), the other (which owns said locks) moves the group
about. So by the time the former sys_perf_event_open() acquires the
locks, the context we've acquired is stale (and possibly dead).

Apply the established locking rules as per perf_event_ctx_lock_nested()
to the mutex_lock_double() for the 'move_group' case. This obviously means
we need to validate state after we acquire the locks.

Reported-by: Di Shen (Keen Lab)
Tested-by: John Dias <joaodias@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Min Chong <mchong@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: f63a8daa5812 ("perf: Fix event->ctx locking")
Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 4.4:
- Test perf_event::group_flags instead of group_caps
- Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ping: implement proper locking

commit 43a6684519ab0a6c52024b5e25322476cabad893 upstream.

We got a report of yet another bug in ping

http://www.openwall.com/lists/oss-security/2017/03/24/6

->disconnect() is not called with socket lock held.

Fix this by acquiring ping rwlock earlier.

Thanks to Daniel, Alexander and Andrey for letting us know this problem.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Daniel Jiang <danieljiang0415@gmail.com>
Reported-by: Solar Designer <solar@openwall.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging/android/ion : fix a race condition in the ion driver

commit 9590232bb4f4cc824f3425a6e1349afbe6d6d2b7 upstream.

There is a use-after-free problem in the ion driver.
This is caused by a race condition in the ion_ioctl()
function.

A handle has ref count of 1 and two tasks on different
cpus calls ION_IOC_FREE simultaneously.

cpu 0                                   cpu 1
-------------------------------------------------------
ion_handle_get_by_id()
(ref == 2)
                            ion_handle_get_by_id()
                            (ref == 3)

ion_free()
(ref == 2)

ion_handle_put()
(ref == 1)

                            ion_free()
                            (ref == 0 so ion_handle_destroy() is
                            called
                            and the handle is freed.)

                            ion_handle_put() is called and it
                            decreases the slub's next free pointer

The problem is detected as an unaligned access in the
spin lock functions since it uses load exclusive
instruction. In some cases it corrupts the slub's
free pointer which causes a mis-aligned access to the
next free pointer.(kmalloc returns a pointer like
ffffc0745b4580aa). And it causes lots of other
hard-to-debug problems.

This symptom is caused since the first member in the
ion_handle structure is the reference count and the
ion driver decrements the reference after it has been
freed.

To fix this problem client->lock mutex is extended
to protect all the codes that uses the handle.

Signed-off-by: Eun Taik Lee <eun.taik.lee@samsung.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
index 7ff2a7ec871f..33b390e7ea31

vfio/pci: Fix integer overflows, bitmask check

commit 05692d7005a364add85c6e25a6c4447ce08f913a upstream.

The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize
user-supplied integers, potentially allowing memory corruption. This
patch adds appropriate integer overflow checks, checks the range bounds
for VFIO_IRQ_SET_DATA_NONE, and also verifies that only single element
in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set.
VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in
vfio_pci_set_irqs_ioctl().

Furthermore, a kzalloc is changed to a kcalloc because the use of a
kzalloc with an integer multiplication allowed an integer overflow
condition to be reached without this patch. kcalloc checks for overflow
and should prevent a similar occurrence.

Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: check minimum bearer MTU

commit 3de81b758853f0b29c61e246679d20b513c4cfec upstream.

Qian Zhang (张谦) reported a potential socket buffer overflow in
tipc_msg_build() which is also known as CVE-2016-8632: due to
insufficient checks, a buffer overflow can occur if MTU is too short for
even tipc headers. As anyone can set device MTU in a user/net namespace,
this issue can be abused by a regular user.

As agreed in the discussion on Ben Hutchings' original patch, we should
check the MTU at the moment a bearer is attached rather than for each
processed packet. We also need to repeat the check when bearer MTU is
adjusted to new device MTU. UDP case also needs a check to avoid
overflow when calculating bearer MTU.

Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Qian Zhang (张谦) <zhangqian-c@360.cn>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 4.4:
- Adjust context
- NETDEV_GOING_DOWN and NETDEV_CHANGEMTU cases in net notifier were combined]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nfnetlink: correctly validate length of batch messages

commit c58d6c93680f28ac58984af61d0a7ebf4319c241 upstream.

If nlh->nlmsg_len is zero then an infinite loop is triggered because
'skb_pull(skb, msglen);' pulls zero bytes.

The calculation in nlmsg_len() underflows if 'nlh->nlmsg_len <
NLMSG_HDRLEN' which bypasses the length validation and will later
trigger an out-of-bound read.

If the length validation does fail then the malformed batch message is
copied back to userspace. However, we cannot do this because the
nlh->nlmsg_len can be invalid. This leads to an out-of-bounds read in
netlink_ack:

    [   41.455421] ==================================================================
    [   41.456431] BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr ffff880119e79340
    [   41.456431] Read of size 4294967280 by task a.out/987
    [   41.456431] =============================================================================
    [   41.456431] BUG kmalloc-512 (Not tainted): kasan: bad access detected
    [   41.456431] -----------------------------------------------------------------------------
    ...
    [   41.456431] Bytes b4 ffff880119e79310: 00 00 00 00 d5 03 00 00 b0 fb fe ff 00 00 00 00  ................
    [   41.456431] Object ffff880119e79320: 20 00 00 00 10 00 05 00 00 00 00 00 00 00 00 00   ...............
    [   41.456431] Object ffff880119e79330: 14 00 0a 00 01 03 fc 40 45 56 11 22 33 10 00 05  .......@EV."3...
    [   41.456431] Object ffff880119e79340: f0 ff ff ff 88 99 aa bb 00 14 00 0a 00 06 fe fb  ................
                                            ^^ start of batch nlmsg with
                                               nlmsg_len=4294967280
    ...
    [   41.456431] Memory state around the buggy address:
    [   41.456431]  ffff880119e79400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   41.456431]  ffff880119e79480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   41.456431] >ffff880119e79500: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
    [   41.456431]                                ^
    [   41.456431]  ffff880119e79580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   41.456431]  ffff880119e79600: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb
    [   41.456431] ==================================================================

Fix this with better validation of nlh->nlmsg_len and by setting
NFNL_BATCH_FAILURE if any batch message fails length validation.

CAP_NET_ADMIN is required to trigger the bugs.

Fixes: 9ea2aa8b7dba ("netfilter: nfnetlink: validate nfnetlink header from batch")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xc2028: avoid use after free

commit 8dfbcc4351a0b6d2f2d77f367552f48ffefafe18 upstream.

If struct xc2028_config is passed without a firmware name,
the following trouble may happen:

[11009.907205] xc2028 5-0061: type set to XCeive xc2028/xc3028 tuner
[11009.907491] ==================================================================
[11009.907750] BUG: KASAN: use-after-free in strcmp+0x96/0xb0 at addr ffff8803bd78ab40
[11009.907992] Read of size 1 by task modprobe/28992
[11009.907994] =============================================================================
[11009.907997] BUG kmalloc-16 (Tainted: G        W      ): kasan: bad access detected
[11009.907999] -----------------------------------------------------------------------------

[11009.908008] INFO: Allocated in xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd] age=0 cpu=3 pid=28992
[11009.908012] ___slab_alloc+0x581/0x5b0
[11009.908014] __slab_alloc+0x51/0x90
[11009.908017] __kmalloc+0x27b/0x350
[11009.908022] xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd]
[11009.908026] usb_hcd_submit_urb+0x1e8/0x1c60
[11009.908029] usb_submit_urb+0xb0e/0x1200
[11009.908032] usb_serial_generic_write_start+0xb6/0x4c0
[11009.908035] usb_serial_generic_write+0x92/0xc0
[11009.908039] usb_console_write+0x38a/0x560
[11009.908045] call_console_drivers.constprop.14+0x1ee/0x2c0
[11009.908051] console_unlock+0x40d/0x900
[11009.908056] vprintk_emit+0x4b4/0x830
[11009.908061] vprintk_default+0x1f/0x30
[11009.908064] printk+0x99/0xb5
[11009.908067] kasan_report_error+0x10a/0x550
[11009.908070] __asan_report_load1_noabort+0x43/0x50
[11009.908074] INFO: Freed in xc2028_set_config+0x90/0x630 [tuner_xc2028] age=1 cpu=3 pid=28992
[11009.908077] __slab_free+0x2ec/0x460
[11009.908080] kfree+0x266/0x280
[11009.908083] xc2028_set_config+0x90/0x630 [tuner_xc2028]
[11009.908086] xc2028_attach+0x310/0x8a0 [tuner_xc2028]
[11009.908090] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
[11009.908094] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
[11009.908098] em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
[11009.908101] em28xx_register_extension+0xd9/0x190 [em28xx]
[11009.908105] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
[11009.908108] do_one_initcall+0x141/0x300
[11009.908111] do_init_module+0x1d0/0x5ad
[11009.908114] load_module+0x6666/0x9ba0
[11009.908117] SyS_finit_module+0x108/0x130
[11009.908120] entry_SYSCALL_64_fastpath+0x16/0x76
[11009.908123] INFO: Slab 0xffffea000ef5e280 objects=25 used=25 fp=0x          (null) flags=0x2ffff8000004080
[11009.908126] INFO: Object 0xffff8803bd78ab40 @offset=2880 fp=0x0000000000000001

[11009.908130] Bytes b4 ffff8803bd78ab30: 01 00 00 00 2a 07 00 00 9d 28 00 00 01 00 00 00  ....*....(......
[11009.908133] Object ffff8803bd78ab40: 01 00 00 00 00 00 00 00 b0 1d c3 6a 00 88 ff ff  ...........j....
[11009.908137] CPU: 3 PID: 28992 Comm: modprobe Tainted: G    B   W       4.5.0-rc1+ #43
[11009.908140] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015
[11009.908142]  ffff8803bd78a000 ffff8802c273f1b8 ffffffff81932007 ffff8803c6407a80
[11009.908148]  ffff8802c273f1e8 ffffffff81556759 ffff8803c6407a80 ffffea000ef5e280
[11009.908153]  ffff8803bd78ab40 dffffc0000000000 ffff8802c273f210 ffffffff8155ccb4
[11009.908158] Call Trace:
[11009.908162]  [<ffffffff81932007>] dump_stack+0x4b/0x64
[11009.908165]  [<ffffffff81556759>] print_trailer+0xf9/0x150
[11009.908168]  [<ffffffff8155ccb4>] object_err+0x34/0x40
[11009.908171]  [<ffffffff8155f260>] kasan_report_error+0x230/0x550
[11009.908175]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
[11009.908179]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908182]  [<ffffffff8155f5c3>] __asan_report_load1_noabort+0x43/0x50
[11009.908185]  [<ffffffff8155ea00>] ? __asan_register_globals+0x50/0xa0
[11009.908189]  [<ffffffff8194cea6>] ? strcmp+0x96/0xb0
[11009.908192]  [<ffffffff8194cea6>] strcmp+0x96/0xb0
[11009.908196]  [<ffffffffa13ba4ac>] xc2028_set_config+0x15c/0x630 [tuner_xc2028]
[11009.908200]  [<ffffffffa13bac90>] xc2028_attach+0x310/0x8a0 [tuner_xc2028]
[11009.908203]  [<ffffffff8155ea78>] ? memset+0x28/0x30
[11009.908206]  [<ffffffffa13ba980>] ? xc2028_set_config+0x630/0x630 [tuner_xc2028]
[11009.908211]  [<ffffffffa157a59a>] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
[11009.908215]  [<ffffffffa157aa2a>] ? em28xx_dvb_init.part.3+0x37c/0x5cf4 [em28xx_dvb]
[11009.908219]  [<ffffffffa157a3a1>] ? hauppauge_hvr930c_init+0x487/0x487 [em28xx_dvb]
[11009.908222]  [<ffffffffa01795ac>] ? lgdt330x_attach+0x1cc/0x370 [lgdt330x]
[11009.908226]  [<ffffffffa01793e0>] ? i2c_read_demod_bytes.isra.2+0x210/0x210 [lgdt330x]
[11009.908230]  [<ffffffff812e87d0>] ? ref_module.part.15+0x10/0x10
[11009.908233]  [<ffffffff812e56e0>] ? module_assert_mutex_or_preempt+0x80/0x80
[11009.908238]  [<ffffffffa157af92>] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
[11009.908242]  [<ffffffffa157a6ae>] ? em28xx_attach_xc3028.constprop.7+0x30d/0x30d [em28xx_dvb]
[11009.908245]  [<ffffffff8195222d>] ? string+0x14d/0x1f0
[11009.908249]  [<ffffffff8195381f>] ? symbol_string+0xff/0x1a0
[11009.908253]  [<ffffffff81953720>] ? uuid_string+0x6f0/0x6f0
[11009.908257]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
[11009.908260]  [<ffffffff8104b02f>] ? print_context_stack+0x7f/0xf0
[11009.908264]  [<ffffffff812e9846>] ? __module_address+0xb6/0x360
[11009.908268]  [<ffffffff8137fdc9>] ? is_ftrace_trampoline+0x99/0xe0
[11009.908271]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
[11009.908275]  [<ffffffff81240a70>] ? debug_check_no_locks_freed+0x290/0x290
[11009.908278]  [<ffffffff8104a24b>] ? dump_trace+0x11b/0x300
[11009.908282]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
[11009.908285]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
[11009.908289]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
[11009.908292]  [<ffffffff812404dd>] ? trace_hardirqs_on+0xd/0x10
[11009.908296]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
[11009.908299]  [<ffffffff822dcbb0>] ? mutex_trylock+0x400/0x400
[11009.908302]  [<ffffffff810021a1>] ? do_one_initcall+0x131/0x300
[11009.908306]  [<ffffffff81296dc7>] ? call_rcu_sched+0x17/0x20
[11009.908309]  [<ffffffff8159e708>] ? put_object+0x48/0x70
[11009.908314]  [<ffffffffa1579f11>] em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
[11009.908317]  [<ffffffffa13e81f9>] em28xx_register_extension+0xd9/0x190 [em28xx]
[11009.908320]  [<ffffffffa0150000>] ? 0xffffffffa0150000
[11009.908324]  [<ffffffffa0150010>] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
[11009.908327]  [<ffffffff810021b1>] do_one_initcall+0x141/0x300
[11009.908330]  [<ffffffff81002070>] ? try_to_run_init_process+0x40/0x40
[11009.908333]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
[11009.908337]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908340]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908343]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908346]  [<ffffffff8155ea37>] ? __asan_register_globals+0x87/0xa0
[11009.908350]  [<ffffffff8144da7b>] do_init_module+0x1d0/0x5ad
[11009.908353]  [<ffffffff812f2626>] load_module+0x6666/0x9ba0
[11009.908356]  [<ffffffff812e9c90>] ? symbol_put_addr+0x50/0x50
[11009.908361]  [<ffffffffa1580037>] ? em28xx_dvb_init.part.3+0x5989/0x5cf4 [em28xx_dvb]
[11009.908366]  [<ffffffff812ebfc0>] ? module_frob_arch_sections+0x20/0x20
[11009.908369]  [<ffffffff815bc940>] ? open_exec+0x50/0x50
[11009.908374]  [<ffffffff811671bb>] ? ns_capable+0x5b/0xd0
[11009.908377]  [<ffffffff812f5e58>] SyS_finit_module+0x108/0x130
[11009.908379]  [<ffffffff812f5d50>] ? SyS_init_module+0x1f0/0x1f0
[11009.908383]  [<ffffffff81004044>] ? lockdep_sys_exit_thunk+0x12/0x14
[11009.908394]  [<ffffffff822e6936>] entry_SYSCALL_64_fastpath+0x16/0x76
[11009.908396] Memory state around the buggy address:
[11009.908398]  ffff8803bd78aa00: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908401]  ffff8803bd78aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908403] >ffff8803bd78ab00: fc fc fc fc fc fc fc fc 00 00 fc fc fc fc fc fc
[11009.908405]                                            ^
[11009.908407]  ffff8803bd78ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908409]  ffff8803bd78ac00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908411] ==================================================================

In order to avoid it, let's set the cached value of the firmware
name to NULL after freeing it. While here, return an error if
the memory allocation fails.

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mnt: Add a per mount namespace limit on the number of mounts

commit d29216842a85c7970c536108e093963f02714498 upstream.

CAI Qian <caiqian@redhat.com> pointed out that the semantics
of shared subtrees make it possible to create an exponentially
increasing number of mounts in a mount namespace.

    mkdir /tmp/1 /tmp/2
    mount --make-rshared /
    for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done

Will create create 2^20 or 1048576 mounts, which is a practical problem
as some people have managed to hit this by accident.

As such CVE-2016-6213 was assigned.

Ian Kent <raven@themaw.net> described the situation for autofs users
as follows:

> The number of mounts for direct mount maps is usually not very large because of
> the way they are implemented, large direct mount maps can have performance
> problems. There can be anywhere from a few (likely case a few hundred) to less
> than 10000, plus mounts that have been triggered and not yet expired.
>
> Indirect mounts have one autofs mount at the root plus the number of mounts that
> have been triggered and not yet expired.
>
> The number of autofs indirect map entries can range from a few to the common
> case of several thousand and in rare cases up to between 30000 and 50000. I've
> not heard of people with maps larger than 50000 entries.
>
> The larger the number of map entries the greater the possibility for a large
> number of active mounts so it's not hard to expect cases of a 1000 or somewhat
> more active mounts.

So I am setting the default number of mounts allowed per mount
namespace at 100,000.  This is more than enough for any use case I
know of, but small enough to quickly stop an exponential increase
in mounts.  Which should be perfect to catch misconfigurations and
malfunctioning programs.

For anyone who needs a higher limit this can be changed by writing
to the new /proc/sys/fs/mount-max sysctl.

Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
[bwh: Backported to 4.4: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: fix socket timer deadlock

commit f1d048f24e66ba85d3dabf3d076cefa5f2b546b0 upstream.

We sometimes observe a 'deadly embrace' type deadlock occurring
between mutually connected sockets on the same node. This happens
when the one-hour peer supervision timers happen to expire
simultaneously in both sockets.

The scenario is as follows:

CPU 1:                          CPU 2:
--------                        --------
tipc_sk_timeout(sk1)            tipc_sk_timeout(sk2)
  lock(sk1.slock)                 lock(sk2.slock)
  msg_create(probe)               msg_create(probe)
  unlock(sk1.slock)               unlock(sk2.slock)
  tipc_node_xmit_skb()            tipc_node_xmit_skb()
    tipc_node_xmit()                tipc_node_xmit()
      tipc_sk_rcv(sk2)                tipc_sk_rcv(sk1)
        lock(sk2.slock)                 lock((sk1.slock)
        filter_rcv()                    filter_rcv()
          tipc_sk_proto_rcv()             tipc_sk_proto_rcv()
            msg_create(probe_rsp)           msg_create(probe_rsp)
            tipc_sk_respond()               tipc_sk_respond()
              tipc_node_xmit_skb()            tipc_node_xmit_skb()
                tipc_node_xmit()                tipc_node_xmit()
                  tipc_sk_rcv(sk1)                tipc_sk_rcv(sk2)
                    lock((sk1.slock)                lock((sk2.slock)
                    ===> DEADLOCK                   ===> DEADLOCK

Further analysis reveals that there are three different locations in the
socket code where tipc_sk_respond() is called within the context of the
socket lock, with ensuing risk of similar deadlocks.

We now solve this by passing a buffer queue along with all upcalls where
sk_lock.slock may potentially be held. Response or rejected message
buffers are accumulated into this queue instead of being sent out
directly, and only sent once we know we are safely outside the slock
context.

Reported-by: GUNA <gbalasun@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: fix random link resets while adding a second bearer

commit d2f394dc4816b7bd1b44981d83509f18f19c53f0 upstream.

In a dual bearer configuration, if the second tipc link becomes
active while the first link still has pending nametable "bulk"
updates, it randomly leads to reset of the second link.

When a link is established, the function named_distribute(),
fills the skb based on node mtu (allows room for TUNNEL_PROTOCOL)
with NAME_DISTRIBUTOR message for each PUBLICATION.
However, the function named_distribute() allocates the buffer by
increasing the node mtu by INT_H_SIZE (to insert NAME_DISTRIBUTOR).
This consumes the space allocated for TUNNEL_PROTOCOL.

When establishing the second link, the link shall tunnel all the
messages in the first link queue including the "bulk" update.
As size of the NAME_DISTRIBUTOR messages while tunnelling, exceeds
the link mtu the transmission fails (-EMSGSIZE).

Thus, the synch point based on the message count of the tunnel
packets is never reached leading to link timeout.

In this commit, we adjust the size of name distributor message so that
they can be tunnelled.

Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gfs2: avoid uninitialized variable warning

commit 67893f12e5374bbcaaffbc6e570acbc2714ea884 upstream.

We get a bogus warning about a potential uninitialized variable
use in gfs2, because the compiler does not figure out that we
never use the leaf number if get_leaf_nr() returns an error:

fs/gfs2/dir.c: In function 'get_first_leaf':
fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/gfs2/dir.c: In function 'dir_split_leaf':
fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]

Changing the 'if (!error)' to 'if (!IS_ERR_VALUE(error))' is
sufficient to let gcc understand that this is exactly the same
condition as in IS_ERR() so it can optimize the code path enough
to understand it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

hostap: avoid uninitialized variable use in hfa384x_get_rid

commit 48dc5fb3ba53b20418de8514700f63d88c5de3a3 upstream.

The driver reads a value from hfa384x_from_bap(), which may fail,
and then assigns the value to a local variable. gcc detects that
in in the failure case, the 'rlen' variable now contains
uninitialized data:

In file included from ../drivers/net/wireless/intersil/hostap/hostap_pci.c:220:0:
drivers/net/wireless/intersil/hostap/hostap_hw.c: In function 'hfa384x_get_rid':
drivers/net/wireless/intersil/hostap/hostap_hw.c:842:5: warning: 'rec' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (le16_to_cpu(rec.len) == 0) {

This restructures the function as suggested by Russell King, to
make it more readable and get more reliable error handling, by
handling each failure mode using a goto.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty: nozomi: avoid a harmless gcc warning

commit a4f642a8a3c2838ad09fe8313d45db46600e1478 upstream.

The nozomi wireless data driver has its own helper function to
transfer data from a FIFO, doing an extra byte swap on big-endian
architectures, presumably to bring the data back into byte-serial
order after readw() or readl() perform their implicit byteswap.

This helper function is used in the receive_data() function to
first read the length into a 32-bit variable, which causes
a compile-time warning:

drivers/tty/nozomi.c: In function 'receive_data':
drivers/tty/nozomi.c:857:9: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized]

The problem is that gcc is unsure whether the data was actually
read or not. We know that it is at this point, so we can replace
it with a single readl() to shut up that warning.

I am leaving the byteswap in there, to preserve the existing
behavior, even though this seems fishy: Reading the length of
the data into a cpu-endian variable should normally not use
a second byteswap on big-endian systems, unless the hardware
is aware of the CPU endianess.

There appears to be a lot more confusion about endianess in this
driver, so it probably has not worked on big-endian systems in
a long time, if ever, and I have no way to test it. It's well
possible that this driver has not been used by anyone in a while,
the last patch that looks like it was tested on the hardware is
from 2008.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: correct error in node fsm

commit c4282ca76c5b81ed73ef4c5eb5c07ee397e51642 upstream.

commit 88e8ac7000dc ("tipc: reduce transmission rate of reset messages
when link is down") revealed a flaw in the node FSM, as defined in
the log of commit 66996b6c47ed ("tipc: extend node FSM").

We see the following scenario:
1: Node B receives a RESET message from node A before its link endpoint
   is fully up, i.e., the node FSM is in state SELF_UP_PEER_COMING. This
   event will not change the node FSM state, but the (distinct) link FSM
   will move to state RESETTING.
2: As an effect of the previous event, the local endpoint on B will
   declare node A lost, and post the event SELF_DOWN to the its node
   FSM. This moves the FSM state to SELF_DOWN_PEER_LEAVING, meaning
   that no messages will be accepted from A until it receives another
   RESET message that confirms that A's endpoint has been reset. This
   is  wasteful, since we know this as a fact already from the first
   received RESET, but worse is that the link instance's FSM has not
   wasted this information, but instead moved on to state ESTABLISHING,
   meaning that it repeatedly sends out ACTIVATE messages to the reset
   peer A.
3: Node A will receive one of the ACTIVATE messages, move its link FSM
   to state ESTABLISHED, and start repeatedly sending out STATE messages
   to node B.
4: Node B will consistently drop these messages, since it can only accept
   accept a RESET according to its node FSM.
5: After four lost STATE messages node A will reset its link and start
   repeatedly sending out RESET messages to B.
6: Because of the reduced send rate for RESET messages, it is very
   likely that A will receive an ACTIVATE (which is sent out at a much
   higher frequency) before it gets the chance to send a RESET, and A
   may hence quickly move back to state ESTABLISHED and continue sending
   out STATE messages, which will again be dropped by B.
7: GOTO 5.
8: After having repeated the cycle 5-7 a number of times, node A will
   by chance get in between with sending a RESET, and the situation is
   resolved.

Unfortunately, we have seen that it may take a substantial amount of
time before this vicious loop is broken, sometimes in the order of
minutes.

We correct this by making a small correction to the node FSM: When a
node in state SELF_UP_PEER_COMING receives a SELF_DOWN event, it now
moves directly back to state SELF_DOWN_PEER_DOWN, instead of as now
SELF_DOWN_PEER_LEAVING. This is logically consistent, since we don't
need to wait for RESET confirmation from of an endpoint that we alread
know has been reset. It also means that node B in the scenario above
will not be dropping incoming STATE messages, and the link can come up
immediately.

Finally, a symmetry comparison reveals that the  FSM has a similar
error when receiving the event PEER_DOWN in state PEER_UP_SELF_COMING.
Instead of moving to PERR_DOWN_SELF_LEAVING, it should move directly
to SELF_DOWN_PEER_DOWN. Although we have never seen any negative effect
of this logical error, we choose fix this one, too.

The node FSM looks as follows after those changes:

                           +----------------------------------------+
                           |                           PEER_DOWN_EVT|
                           |                                        |
  +------------------------+----------------+                       |
  |SELF_DOWN_EVT           |                |                       |
  |                        |                |                       |
  |              +-----------+          +-----------+               |
  |              |NODE_      |          |NODE_      |               |
  |   +----------|FAILINGOVER|<---------|SYNCHING   |-----------+   |
  |   |SELF_     +-----------+ FAILOVER_+-----------+   PEER_   |   |
  |   |DOWN_EVT   |          A BEGIN_EVT  A         |   DOWN_EVT|   |
  |   |           |          |            |         |           |   |
  |   |           |          |            |         |           |   |
  |   |           |FAILOVER_ |FAILOVER_   |SYNCH_   |SYNCH_     |   |
  |   |           |END_EVT   |BEGIN_EVT   |BEGIN_EVT|END_EVT    |   |
  |   |           |          |            |         |           |   |
  |   |           |          |            |         |           |   |
  |   |           |         +--------------+        |           |   |
  |   |           +-------->|   SELF_UP_   |<-------+           |   |
  |   |   +-----------------|   PEER_UP    |----------------+   |   |
  |   |   |SELF_DOWN_EVT    +--------------+   PEER_DOWN_EVT|   |   |
  |   |   |                    A        A                   |   |   |
  |   |   |                    |        |                   |   |   |
  |   |   |         PEER_UP_EVT|        |SELF_UP_EVT        |   |   |
  |   |   |                    |        |                   |   |   |
  V   V   V                    |        |                   V   V   V
+------------+       +-----------+    +-----------+       +------------+
|SELF_DOWN_  |       |SELF_UP_   |    |PEER_UP_   |       |PEER_DOWN   |
|PEER_LEAVING|       |PEER_COMING|    |SELF_COMING|       |SELF_LEAVING|
+------------+       +-----------+    +-----------+       +------------+
       |               |       A        A       |                |
       |               |       |        |       |                |
       |       SELF_   |       |SELF_   |PEER_  |PEER_           |
       |       DOWN_EVT|       |UP_EVT  |UP_EVT |DOWN_EVT        |
       |               |       |        |       |                |
       |               |       |        |       |                |
       |               |    +--------------+    |                |
       |PEER_DOWN_EVT  +--->|  SELF_DOWN_  |<---+   SELF_DOWN_EVT|
       +------------------->|  PEER_DOWN   |<--------------------+
                            +--------------+

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: re-enable compensation for socket receive buffer double counting

commit 7c8bcfb1255fe9d929c227d67bdcd84430fd200b upstream.

In the refactoring commit d570d86497ee ("tipc: enqueue arrived buffers
in socket in separate function") we did by accident replace the test

if (sk->sk_backlog.len == 0)
atomic_set(&tsk->dupl_rcvcnt, 0);

with

if (sk->sk_backlog.len)
atomic_set(&tsk->dupl_rcvcnt, 0);

This effectively disables the compensation we have for the double
receive buffer accounting that occurs temporarily when buffers are
moved from the backlog to the socket receive queue. Until now, this
has gone unnoticed because of the large receive buffer limits we are
applying, but becomes indispensable when we reduce this buffer limit
later in this series.

We now fix this by inverting the mentioned condition.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: make dist queue pernet

commit 541726abe7daca64390c2ec34e6a203145f1686d upstream.

Nametable updates received from the network that cannot be applied
immediately are placed on a defer queue. This queue is global to the
TIPC module, which might cause problems when using TIPC in containers.
To prevent nametable updates from escaping into the wrong namespace,
we make the queue pernet instead.

Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tipc: make sure IPv6 header fits in skb headroom

commit 9bd160bfa27fa41927dbbce7ee0ea779700e09ef upstream.

Expand headroom further in order to be able to fit the larger IPv6
header. Prior to this patch this caused a skb under panic for certain
tipc packets when using IPv6 UDP bearer(s).

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 4.4.64

tipc: fix crash during node removal

commit d25a01257e422a4bdeb426f69529d57c73b235fe upstream.

When the TIPC module is unloaded, we have identified a race condition
that allows a node reference counter to go to zero and the node instance
being freed before the node timer is finished with accessing it. This
leads to occasional crashes, especially in multi-namespace environments.

The scenario goes as follows:

CPU0:(node_stop)                       CPU1:(node_timeout)  // ref == 2

1:                                          if(!mod_timer())
2: if (del_timer())
3:   tipc_node_put()                                        // ref -> 1
4: tipc_node_put()                                          // ref -> 0
5:   kfree_rcu(node);
6:                                               tipc_node_get(node)
7:                                               // BOOM!

We now clean up this functionality as follows:

1) We remove the node pointer from the node lookup table before we
   attempt deactivating the timer. This way, we reduce the risk that
   tipc_node_find() may obtain a valid pointer to an instance marked
   for deletion; a harmless but undesirable situation.

2) We use del_timer_sync() instead of del_timer() to safely deactivate
   the node timer without any risk that it might be reactivated by the
   timeout handler. There is no risk of deadlock here, since the two
   functions never touch the same spinlocks.

3: We remove a pointless tipc_node_get() + tipc_node_put() from the
   timeout handler.

Reported-by: Zhijiang Hu <huzhijiang@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: fix del_gendisk() vs blkdev_ioctl crash

commit ac34f15e0c6d2fd58480052b6985f6991fb53bcc upstream.

When tearing down a block device early in its lifetime, userspace may
still be performing discovery actions like blkdev_ioctl() to re-read
partitions.

The nvdimm_revalidate_disk() implementation depends on
disk->driverfs_dev to be valid at entry.  However, it is set to NULL in
del_gendisk() and fatally this is happening *before* the disk device is
deleted from userspace view.

There's no reason for del_gendisk() to clear ->driverfs_dev.  That
device is the parent of the disk.  It is guaranteed to not be freed
until the disk, as a child, drops its ->parent reference.

We could also fix this issue locally in nvdimm_revalidate_disk() by
using disk_to_dev(disk)->parent, but lets fix it globally since
->driverfs_dev follows the lifetime of the parent.  Longer term we
should probably just add a @parent parameter to add_disk(), and stop
carrying this pointer in the gendisk.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffffa00340a8>] nvdimm_revalidate_disk+0x18/0x90 [libnvdimm]
CPU: 2 PID: 538 Comm: systemd-udevd Tainted: G           O    4.4.0-rc5 #2257
[..]
Call Trace:
  [<ffffffff8143e5c7>] rescan_partitions+0x87/0x2c0
  [<ffffffff810f37f9>] ? __lock_is_held+0x49/0x70
  [<ffffffff81438c62>] __blkdev_reread_part+0x72/0xb0
  [<ffffffff81438cc5>] blkdev_reread_part+0x25/0x40
  [<ffffffff8143982d>] blkdev_ioctl+0x4fd/0x9c0
  [<ffffffff811246c9>] ? current_kernel_time64+0x69/0xd0
  [<ffffffff812916dd>] block_ioctl+0x3d/0x50
  [<ffffffff81264c38>] do_vfs_ioctl+0x308/0x560
  [<ffffffff8115dbd1>] ? __audit_syscall_entry+0xb1/0x100
  [<ffffffff810031d6>] ? do_audit_syscall_entry+0x66/0x70
  [<ffffffff81264f09>] SyS_ioctl+0x79/0x90
  [<ffffffff81902672>] entry_SYSCALL_64_fastpath+0x12/0x76

Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@fb.com>
Reported-by: Robert Hu <robert.hu@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions

commit 11e63f6d920d6f2dfd3cd421e939a4aec9a58dcd upstream.

Before we rework the "pmem api" to stop abusing __copy_user_nocache()
for memcpy_to_pmem() we need to fix cases where we may strand dirty data
in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
for arbitrary data transfers from userspace. There is no guarantee that
these transfers, performed by dax_iomap_actor(), will have aligned
destinations or aligned transfer lengths. Backstop the usage
__copy_user_nocache() with explicit cache management in these unaligned
cases.

Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
that is saved for a later patch that moves the entirety of the "pmem
api" into the pmem driver directly.

Fixes: 5de490daec8b ("pmem: add copy_from_iter_pmem() and clear_pmem()")
Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

hv: don't reset hv_context.tsc_page on crash

commit 56ef6718a1d8d77745033c5291e025ce18504159 upstream.

It may happen that secondary CPUs are still alive and resetting
hv_context.tsc_page will cause a consequent crash in read_hv_clock_tsc()
as we don't check for it being not NULL there. It is safe as we're not
freeing this page anyways.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Drivers: hv: balloon: account for gaps in hot add regions

commit cb7a5724c7e1bfb5766ad1c3beba14cc715991cf upstream.

I'm observing the following hot add requests from the WS2012 host:

hot_add_req: start_pfn = 0x108200 count = 330752
hot_add_req: start_pfn = 0x158e00 count = 193536
hot_add_req: start_pfn = 0x188400 count = 239616

As the host doesn't specify hot add regions we're trying to create
128Mb-aligned region covering the first request, we create the 0x108000 -
0x160000 region and we add 0x108000 - 0x158e00 memory. The second request
passes the pfn_covered() check, we enlarge the region to 0x108000 -
0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the
third request as it starts at 0x188400 so there is a 0x200 gap which is
not covered. As the end of our region is 0x190000 now it again passes the
pfn_covered() check were we just adjust the covered_end_pfn and make it
0x188400 instead of 0x188200 which means that we'll try to online
0x188200-0x188400 pages but these pages were never assigned to us and we
crash.

We can't react to such requests by creating new hot add regions as it may
happen that the whole suggested range falls into the previously identified
128Mb-aligned area so we'll end up adding nothing or create intersecting
regions and our current logic doesn't allow that. Instead, create a list of
such 'gaps' and check for them in the page online callback.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Drivers: hv: balloon: keep track of where ha_region starts

commit 7cf3b79ec85ee1a5bbaaf936bb1d050dc652983b upstream.

Windows 2012 (non-R2) does not specify hot add region in hot add requests
and the logic in hot_add_req() is trying to find a 128Mb-aligned region
covering the request. It may also happen that host's requests are not 128Mb
aligned and the created ha_region will start before the first specified
PFN. We can't online these non-present pages but we don't remember the real
start of the region.

This is a regression introduced by the commit 5abbbb75d733 ("Drivers: hv:
hv_balloon: don't lose memory when onlining order is not natural"). While
the idea of keeping the 'moving window' was wrong (as there is no guarantee
that hot add requests come ordered) we should still keep track of
covered_start_pfn. This is not a revert, the logic is different.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Tools: hv: kvp: ensure kvp device fd is closed on exec

commit 26840437cbd6d3625ea6ab34e17cd34bb810c861 upstream.

KVP daemon does fork()/exec() (with popen()) so we need to close our fds
to avoid sharing them with child processes. The immediate implication of
not doing so I see is SELinux complaining about 'ip' trying to access
'/dev/vmbus/hv_kvp'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd

commit 8b3405e345b5a098101b0c31b264c812bba045d9 upstream.

In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g, munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire Guest memory range
holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.

Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Cc: Paolo Bonzini <pbonzin@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs

commit 29f72ce3e4d18066ec75c79c857bee0618a3504b upstream.

MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.

Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.

Tested on AMD Fam15h and Fam17h systems.

  WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
  kobject: (ffff88085bb256c0): attempted to be registered with empty name!
  ...
  Call Trace:
   kobject_add_internal
   kobject_add
   kobject_create_and_add
   threshold_create_device
   threshold_init_device

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction

commit 9e1ba4f27f018742a1aa95d11e35106feba08ec1 upstream.

If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
OOPS:

  Bad kernel stack pointer cd93c840 at c000000000009868
  Oops: Bad kernel stack pointer, sig: 6 [#1]
  ...
  GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
  ...
  NIP [c000000000009868] resume_kernel+0x2c/0x58
  LR [c000000000006208] program_check_common+0x108/0x180

On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
not emulate actual store in emulate_step() because it may corrupt the exception
frame. So the kernel does the actual store operation in exception return code
i.e. resume_kernel().

resume_kernel() loads the saved stack pointer from memory using lwz, which only
loads the low 32-bits of the address, causing the kernel crash.

Fix this by loading the 64-bit value instead.

Fixes: be96f63375a1 ("powerpc: Split out instruction analysis part of emulate_step()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
[mpe: Change log massage, add stable tag]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ubi/upd: Always flush after prepared for an update

commit 9cd9a21ce070be8a918ffd3381468315a7a76ba6 upstream.

In commit 6afaf8a484cb ("UBI: flush wl before clearing update marker") I
managed to trigger and fix a similar bug. Now here is another version of
which I assumed it wouldn't matter back then but it turns out UBI has a
check for it and will error out like this:

|ubi0 warning: validate_vid_hdr: inconsistent used_ebs
|ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592

All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a
powercut in the middle of the operation.
ubi_start_update() sets the update-marker and puts all EBs on the erase
list. After that userland can proceed to write new data while the old EB
aren't erased completely. A powercut at this point is usually not that
much of a tragedy. UBI won't give read access to the static volume
because it has the update marker. It will most likely set the corrupted
flag because it misses some EBs.
So we are all good. Unless the size of the image that has been written
differs from the old image in the magnitude of at least one EB. In that
case UBI will find two different values for `used_ebs' and refuse to
attach the image with the error message mentioned above.

So in order not to get in the situation, the patch will ensure that we
wait until everything is removed before it tries to write any data.
The alternative would be to detect such a case and remove all EBs at the
attached time after we processed the volume-table and see the
update-marker set. The patch looks bigger and I doubt it is worth it
since usually the write() will wait from time to time for a new EB since
usually there not that many spare EB that can be used.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mac80211: reject ToDS broadcast data frames

commit 3018e947d7fd536d57e2b550c33e456d921fff8c upstream.

AP/AP_VLAN modes don't accept any real 802.11 multicast data
frames, but since they do need to accept broadcast management
frames the same is currently permitted for data frames. This
opens a security problem because such frames would be decrypted
with the GTK, and could even contain unicast L3 frames.

Since the spec says that ToDS frames must always have the BSSID
as the RA (addr1), reject any other data frames.

The problem was originally reported in "Predicting, Decrypting,
and Abusing WPA2/802.11 Group Keys" at usenix
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef
and brought to my attention by Jouni.

Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--

mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card

commit 9f327845358d3dd0d8a5a7a5436b0aa5c432e757 upstream.

Currently for DDR50 card, it need tuning in default. We meet tuning fail
issue for DDR50 card and some data CRC error when DDR50 sd card works.

This is because the default pad I/O drive strength can't make sure DDR50
card work stable. So increase the pad I/O drive strength for DDR50 card,
and use pins_100mhz.

This fixes DDR50 card support for IMX since DDR50 tuning was enabled from
commit 9faac7b95ea4 ("mmc: sdhci: enable tuning for DDR50")

Tested-and-reported-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / power: Avoid maybe-uninitialized warning

commit fe8c470ab87d90e4b5115902dd94eced7e3305c3 upstream.

gcc -O2 cannot always prove that the loop in acpi_power_get_inferred_state()
is enterered at least once, so it assumes that cur_state might not get
initialized:

drivers/acpi/power.c: In function 'acpi_power_get_inferred_state':
drivers/acpi/power.c:222:9: error: 'cur_state' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This sets the variable to zero at the start of the loop, to ensure that
there is well-defined behavior even for an empty list. This gets rid of
the warning.

The warning first showed up when the -Os flag got removed in a bug fix
patch in linux-4.11-rc5.

I would suggest merging this addon patch on top of that bug fix to avoid
introducing a new warning in the stable kernels.

Fixes: 61b79e16c68d (ACPI: Fix incompatibility with mcount-based function graph tracing)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled

commit 704de489e0e3640a2ee2d0daf173e9f7375582ba upstream.

Temporary got a Lifebook E547 into my hands and noticed the touchpad
only works after running:

echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled

Add it to the list of machines that need this workaround.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

VSOCK: Detach QP check should filter out non matching QPs.

commit 8ab18d71de8b07d2c4d6f984b718418c09ea45c5 upstream.

The check in vmci_transport_peer_detach_cb should only allow a
detach when the qp handle of the transport matches the one in
the detach message.

Testing: Before this change, a detach from a peer on a different
socket would cause an active stream socket to register a detach.

Reviewed-by: George Zhang <georgezhang@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg()

commit 8de0d7e951826d7592e0ba1da655b175c4aa0923 upstream.

The current delay between retries is unnecessarily high and is negatively
affecting the time it takes to boot the system.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Drivers: hv: get rid of timeout in vmbus_open()

commit 396e287fa2ff46e83ae016cdcb300c3faa3b02f6 upstream.

vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
second timeout in vmbus_open(). The issue is caused by the fact that gpadl
teardown operation won't ever succeed for an opened channel and the timeout
isn't always enough. As a guest, we can always trust the host to respond to
our request (and there is nothing we can do if it doesn't).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Drivers: hv: don't leak memory in vmbus_establish_gpadl()

commit 7cc80c98070ccc7940fc28811c92cca0a681015d upstream.

In some cases create_gpadl_header() allocates submessages but we never
free them.

[sumits] Note for stable:
Upstream commit 4d63763296ab7865a98bc29cc7d77145815ef89f:
(Drivers: hv: get rid of redundant messagecount in create_gpadl_header())
changes the list usage to initialize list header in all cases; that patch
isn't added to stable, so the current patch is modified a little bit from
the upstream commit to check if the list is valid or not.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/mm: fix CMMA vs KSM vs others

commit a8f60d1fadf7b8b54449fcc9d6b15248917478ba upstream.

On heavy paging with KSM I see guest data corruption. Turns out that
KSM will add pages to its tree, where the mapping return true for
pte_unused (or might become as such later). KSM will unmap such pages
and reinstantiate with different attributes (e.g. write protected or
special, e.g. in replace_page or write_protect_page)). This uncovered
a bug in our pagetable handling: We must remove the unused flag as
soon as an entry becomes present again.

Signed-of-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

CIFS: remove bad_network_name flag

commit a0918f1ce6a43ac980b42b300ec443c154970979 upstream.

STATUS_BAD_NETWORK_NAME can be received during node failover,
causing the flag to be set and making the reconnect thread
always unsuccessful, thereafter.

Once the only place where it is set is removed, the remaining
bits are rendered moot.

Removing it does not prevent "mount" from failing when a non
existent share is passed.

What happens when the share really ceases to exist while the
share is mounted is undefined now as much as it was before.

Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: Do not send echoes before Negotiate is complete

commit 62a6cfddcc0a5313e7da3e8311ba16226fe0ac10 upstream.

commit 4fcd1813e640 ("Fix reconnect to not defer smb3 session reconnect
long after socket reconnect") added support for Negotiate requests to
be initiated by echo calls.

To avoid delays in calling echo after a reconnect, I added the patch
introduced by the commit b8c600120fc8 ("Call echo service immediately
after socket reconnect").

This has however caused a regression with cifs shares which do not have
support for echo calls to trigger Negotiate requests. On connections
which need to call Negotiation, the echo calls trigger an error which
triggers a reconnect which in turn triggers another echo call. This
results in a loop which is only broken when an operation is performed on
the cifs share. For an idle share, it can DOS a server.

The patch uses the smb_operation can_echo() for cifs so that it is
called only if connection has been already been setup.

kernel bz: 194531

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ring-buffer: Have ring_buffer_iter_empty() return true when empty

commit 78f7a45dac2a2d2002f98a3a95f7979867868d73 upstream.

I noticed that reading the snapshot file when it is empty no longer gives a
status. It suppose to show the status of the snapshot buffer as well as how
to allocate and use it. For example:

># cat snapshot
# tracer: nop
#
#
# * Snapshot is allocated *
#
# Snapshot commands:
# echo 0 > snapshot : Clears and frees snapshot buffer
# echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
#                      Takes a snapshot of the main buffer.
# echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
#                      (Doesn't have to be '2' works with any number that
#                       is not a '0' or '1')

But instead it just showed an empty buffer:

># cat snapshot
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0   #P:4
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |

What happened was that it was using the ring_buffer_iter_empty() function to
see if it was empty, and if it was, it showed the status. But that function
was returning false when it was empty. The reason was that the iter header
page was on the reader page, and the reader page was empty, but so was the
buffer itself. The check only tested to see if the iter was on the commit
page, but the commit page was no longer pointing to the reader page, but as
all pages were empty, the buffer is also.

Fixes: 651e22f2701b ("ring-buffer: Always reset iterator to reader page")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tracing: Allocate the snapshot buffer before enabling probe

commit df62db5be2e5f070ecd1a5ece5945b590ee112e0 upstream.

Currently the snapshot trigger enables the probe and then allocates the
snapshot. If the probe triggers before the allocation, it could cause the
snapshot to fail and turn tracing off. It's best to allocate the snapshot
buffer first, and then enable the trigger. If something goes wrong in the
enabling of the trigger, the snapshot buffer is still allocated, but it can
also be freed by the user by writting zero into the snapshot buffer file.

Also add a check of the return status of alloc_snapshot().

Fixes: 77fd5c15e3 ("tracing: Add snapshot trigger to function probes")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings

commit c9f838d104fed6f2f61d68164712e3204bf5271b upstream.

This fixes CVE-2017-7472.

Running the following program as an unprivileged user exhausts kernel
memory by leaking thread keyrings:

#include <keyutils.h>

int main()
{
for (;;)
keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
}

Fix it by only creating a new thread keyring if there wasn't one before.
To make things more consistent, make install_thread_keyring_to_cred()
and install_process_keyring_to_cred() both return 0 if the corresponding
keyring is already present.

Fixes: d84f4f992cbd ("CRED: Inaugurate COW credentials")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KEYS: Change the name of the dead type to ".dead" to prevent user access

commit c1644fe041ebaf6519f6809146a77c3ead9193af upstream.

This fixes CVE-2017-6951.

Userspace should not be able to do things with the "dead" key type as it
doesn't have some of the helper functions set upon it that the kernel
needs. Attempting to use it may cause the kernel to crash.

Fix this by changing the name of the type to ".dead" so that it's rejected
up front on userspace syscalls by key_get_type_from_user().

Though this doesn't seem to affect recent kernels, it does affect older
ones, certainly those prior to:

commit c06cfb08b88dfbe13be44a69ae2fdc3a7c902d81
Author: David Howells <dhowells@redhat.com>
Date: Tue Sep 16 17:36:06 2014 +0100
KEYS: Remove key_type::match in favour of overriding default by match_preparse

which went in before 3.18-rc1.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings

commit ee8f844e3c5a73b999edf733df1c529d6503ec2f upstream.

This fixes CVE-2016-9604.

Keyrings whose name begin with a '.' are special internal keyrings and so
userspace isn't allowed to create keyrings by this name to prevent
shadowing.  However, the patch that added the guard didn't fix
KEYCTL_JOIN_SESSION_KEYRING.  Not only can that create dot-named keyrings,
it can also subscribe to them as a session keyring if they grant SEARCH
permission to the user.

This, for example, allows a root process to set .builtin_trusted_keys as
its session keyring, at which point it has full access because now the
possessor permissions are added.  This permits root to add extra public
keys, thereby bypassing module verification.

This also affects kexec and IMA.

This can be tested by (as root):

keyctl session .builtin_trusted_keys
keyctl add user a a @s
keyctl list @s

which on my test box gives me:

2 keys in keyring:
180010936: ---lswrv     0     0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
801382539: --alswrv     0     0 user: a

Fix this by rejecting names beginning with a '.' in the keyctl.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: linux-ima-devel@lists.sourceforge.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 4.4.63

MIPS: fix Select HAVE_IRQ_EXIT_ON_IRQ_STACK patch.

Commit f017e58da4aba293e4a6ab62ca5d4801f79cc929 which was commit
3cc3434fd6307d06b53b98ce83e76bf9807689b9 upstream, was misapplied to the
4.4 stable kernel.

This patch fixes this and moves the chunk to the proper Kconfig area.

Reported-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: deny peeloff operation on asocs with threads sleeping on it

commit dfcb9f4f99f1e9a49e43398a7bfbf56927544af1 upstream.

commit 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
attempted to avoid a BUG_ON call when the association being used for a
sendmsg() is blocked waiting for more sndbuf and another thread did a
peeloff operation on such asoc, moving it to another socket.

As Ben Hutchings noticed, then in such case it would return without
locking back the socket and would cause two unlocks in a row.

Further analysis also revealed that it could allow a double free if the
application managed to peeloff the asoc that is created during the
sendmsg call, because then sctp_sendmsg() would try to free the asoc
that was created only for that call.

This patch takes another approach. It will deny the peeloff operation
if there is a thread sleeping on the asoc, so this situation doesn't
exist anymore. This avoids the issues described above and also honors
the syscalls that are already being handled (it can be multiple sendmsg
calls).

Joint work with Xin Long.

Fixes: 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: ipv6: check route protocol when deleting routes

commit c2ed1880fd61a998e3ce40254a99a2ad000f1a7d upstream.

The protocol field is checked when deleting IPv4 routes, but ignored for
IPv6, which causes problems with routing daemons accidentally deleting
externally set routes (observed by multiple bird6 users).

This can be verified using `ip -6 route del <prefix> proto something`.

Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty/serial: atmel: RS485 half duplex w/DMA: enable RX after TX is done

commit b389f173aaa1204d6dc1f299082a162eb0491545 upstream.

When using RS485 in half duplex, RX should be enabled when TX is
finished, and stopped when TX starts.

Before commit 0058f0871efe7b01c6 ("tty/serial: atmel: fix RS485 half
duplex with DMA"), RX was not disabled in atmel_start_tx() if the DMA
was used. So, collisions could happened.

But disabling RX in atmel_start_tx() uncovered another bug:
RX was enabled again in the wrong place (in atmel_tx_dma) instead of
being enabled when TX is finished (in atmel_complete_tx_dma), so the
transmission simply stopped.

This bug was not triggered before commit 0058f0871efe7b01c6
("tty/serial: atmel: fix RS485 half duplex with DMA") because RX was
never disabled before.

Moving atmel_start_rx() in atmel_complete_tx_dma() corrects the problem.

Reported-by: Gil Weber <webergil@gmail.com>
Fixes: 0058f0871efe7b01c6
Tested-by: Gil Weber <webergil@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Tested-by: Bryan Evenson <bevenson@melinkcorp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

SUNRPC: fix refcounting problems with auth_gss messages.

commit 1cded9d2974fe4fe339fc0ccd6638b80d465ab2c upstream.

There are two problems with refcounting of auth_gss messages.

First, the reference on the pipe->pipe list (taken by a call
to rpc_queue_upcall()) is not counted.  It seems to be
assumed that a message in pipe->pipe will always also be in
pipe->in_downcall, where it is correctly reference counted.

However there is no guaranty of this.  I have a report of a
NULL dereferences in rpc_pipe_read() which suggests a msg
that has been freed is still on the pipe->pipe list.

One way I imagine this might happen is:
- message is queued for uid=U and auth->service=S1
- rpc.gssd reads this message and starts processing.
  This removes the message from pipe->pipe
- message is queued for uid=U and auth->service=S2
- rpc.gssd replies to the first message. gss_pipe_downcall()
  calls __gss_find_upcall(pipe, U, NULL) and it finds the
  *second* message, as new messages are placed at the head
  of ->in_downcall, and the service type is not checked.
- This second message is removed from ->in_downcall and freed
  by gss_release_msg() (even though it is still on pipe->pipe)
- rpc.gssd tries to read another message, and dereferences a pointer
  to this message that has just been freed.

I fix this by incrementing the reference count before calling
rpc_queue_upcall(), and decrementing it if that fails, or normally in
gss_pipe_destroy_msg().

It seems strange that the reply doesn't target the message more
precisely, but I don't know all the details.  In any case, I think the
reference counting irregularity became a measureable bug when the
extra arg was added to __gss_find_upcall(), hence the Fixes: line
below.

The second problem is that if rpc_queue_upcall() fails, the new
message is not freed. gss_alloc_msg() set the ->count to 1,
gss_add_msg() increments this to 2, gss_unhash_msg() decrements to 1,
then the pointer is discarded so the memory never gets freed.

Fixes: 9130b8dbc6ac ("SUNRPC: allow for upcalls for same uid but different gss service")
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1011250
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ibmveth: calculate gso_segs for large packets

commit 94acf164dc8f1184e8d0737be7125134c2701dbe upstream.

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

catc: Use heap buffer for memory size test

commit 2d6a0e9de03ee658a9adc3bfb2f0ca55dff1e478 upstream.

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

catc: Combine failure cleanup code in catc_probe()

commit d41149145f98fe26dcd0bfd1d6cc095e6e041418 upstream.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtl8150: Use heap buffers for all register access

commit 7926aff5c57b577ab0f43364ff0c59d968f6a414 upstream.

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pegasus: Use heap buffers for all register access

commit 5593523f968bc86d42a035c6df47d5e0979b5ace upstream.

Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
References: https://bugs.debian.org/852556
Reported-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
Tested-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

virtio-console: avoid DMA from stack

commit c4baad50297d84bde1a7ad45e50c73adae4a2192 upstream.

put_chars() stuffs the buffer it gets into an sg, but that buffer may be
on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it
manifested as printks getting turned into NUL bytes).

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dvb-usb-firmware: don't do DMA on stack

commit 67b0503db9c29b04eadfeede6bebbfe5ddad94ef upstream.

The buffer allocation for the firmware data was changed in
commit 43fab9793c1f ("[media] dvb-usb: don't use stack for firmware load")
but the same applies for the reset value.

Fixes: 43fab9793c1f ("[media] dvb-usb: don't use stack for firmware load")
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dvb-usb: don't use stack for firmware load

commit 43fab9793c1f44e665b4f98035a14942edf03ddc upstream.

As reported by Marc Duponcheel <marc@offline.be>, firmware load on
dvb-usb is using the stack, with is not allowed anymore on default
Kernel configurations:

[ 1025.958836] dvb-usb: found a 'WideView WT-220U PenType Receiver (based on ZL353)' in cold state, will try to load a firmware
[ 1025.958853] dvb-usb: downloading firmware from file 'dvb-usb-wt220u-zl0353-01.fw'
[ 1025.958855] dvb-usb: could not stop the USB controller CPU.
[ 1025.958856] dvb-usb: error while transferring firmware (transferred size: -11, block size: 3)
[ 1025.958856] dvb-usb: firmware download failed at 8 with -22
[ 1025.958867] usbcore: registered new interface driver dvb_usb_dtt200u

[    2.789902] dvb-usb: downloading firmware from file 'dvb-usb-wt220u-zl0353-01.fw'
[    2.789905] ------------[ cut here ]------------
[    2.789911] WARNING: CPU: 3 PID: 2196 at drivers/usb/core/hcd.c:1584 usb_hcd_map_urb_for_dma+0x430/0x560 [usbcore]
[    2.789912] transfer buffer not dma capable
[    2.789912] Modules linked in: btusb dvb_usb_dtt200u(+) dvb_usb_af9035(+) btrtl btbcm dvb_usb dvb_usb_v2 btintel dvb_core bluetooth rc_core rfkill x86_pkg_temp_thermal intel_powerclamp coretemp crc32_pclmul aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd drm_kms_helper syscopyarea sysfillrect pcspkr i2c_i801 sysimgblt fb_sys_fops drm i2c_smbus i2c_core r8169 lpc_ich mfd_core mii thermal fan rtc_cmos video button acpi_cpufreq processor snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd crc32c_intel ahci libahci libata xhci_pci ehci_pci xhci_hcd ehci_hcd usbcore usb_common dm_mirror dm_region_hash dm_log dm_mod
[    2.789936] CPU: 3 PID: 2196 Comm: systemd-udevd Not tainted 4.9.0-gentoo #1
[    2.789937] Hardware name: ASUS All Series/H81I-PLUS, BIOS 0401 07/23/2013
[    2.789938]  ffffc9000339b690 ffffffff812bd397 ffffc9000339b6e0 0000000000000000
[    2.789939]  ffffc9000339b6d0 ffffffff81055c86 000006300339b6a0 ffff880116c0c000
[    2.789941]  0000000000000000 0000000000000000 0000000000000001 ffff880116c08000
[    2.789942] Call Trace:
[    2.789945]  [<ffffffff812bd397>] dump_stack+0x4d/0x66
[    2.789947]  [<ffffffff81055c86>] __warn+0xc6/0xe0
[    2.789948]  [<ffffffff81055cea>] warn_slowpath_fmt+0x4a/0x50
[    2.789952]  [<ffffffffa006d460>] usb_hcd_map_urb_for_dma+0x430/0x560 [usbcore]
[    2.789954]  [<ffffffff814ed5a8>] ? io_schedule_timeout+0xd8/0x110
[    2.789956]  [<ffffffffa006e09c>] usb_hcd_submit_urb+0x9c/0x980 [usbcore]
[    2.789958]  [<ffffffff812d0ebf>] ? copy_page_to_iter+0x14f/0x2b0
[    2.789960]  [<ffffffff81126818>] ? pagecache_get_page+0x28/0x240
[    2.789962]  [<ffffffff8118c2a0>] ? touch_atime+0x20/0xa0
[    2.789964]  [<ffffffffa006f7c4>] usb_submit_urb+0x2c4/0x520 [usbcore]
[    2.789967]  [<ffffffffa006feca>] usb_start_wait_urb+0x5a/0xe0 [usbcore]
[    2.789969]  [<ffffffffa007000c>] usb_control_msg+0xbc/0xf0 [usbcore]
[    2.789970]  [<ffffffffa067903d>] usb_cypress_writemem+0x3d/0x40 [dvb_usb]
[    2.789972]  [<ffffffffa06791cf>] usb_cypress_load_firmware+0x4f/0x130 [dvb_usb]
[    2.789973]  [<ffffffff8109dbbe>] ? console_unlock+0x2fe/0x5d0
[    2.789974]  [<ffffffff8109e10c>] ? vprintk_emit+0x27c/0x410
[    2.789975]  [<ffffffff8109e40a>] ? vprintk_default+0x1a/0x20
[    2.789976]  [<ffffffff81124d76>] ? printk+0x43/0x4b
[    2.789977]  [<ffffffffa0679310>] dvb_usb_download_firmware+0x60/0xd0 [dvb_usb]
[    2.789979]  [<ffffffffa0679898>] dvb_usb_device_init+0x3d8/0x610 [dvb_usb]
[    2.789981]  [<ffffffffa069e302>] dtt200u_usb_probe+0x92/0xd0 [dvb_usb_dtt200u]
[    2.789984]  [<ffffffffa007420c>] usb_probe_interface+0xfc/0x270 [usbcore]
[    2.789985]  [<ffffffff8138bf95>] driver_probe_device+0x215/0x2d0
[    2.789986]  [<ffffffff8138c0e6>] __driver_attach+0x96/0xa0
[    2.789987]  [<ffffffff8138c050>] ? driver_probe_device+0x2d0/0x2d0
[    2.789988]  [<ffffffff81389ffb>] bus_for_each_dev+0x5b/0x90
[    2.789989]  [<ffffffff8138b7b9>] driver_attach+0x19/0x20
[    2.789990]  [<ffffffff8138b33c>] bus_add_driver+0x11c/0x220
[    2.789991]  [<ffffffff8138c91b>] driver_register+0x5b/0xd0
[    2.789994]  [<ffffffffa0072f6c>] usb_register_driver+0x7c/0x130 [usbcore]
[    2.789994]  [<ffffffffa06a5000>] ? 0xffffffffa06a5000
[    2.789996]  [<ffffffffa06a501e>] dtt200u_usb_driver_init+0x1e/0x20 [dvb_usb_dtt200u]
[    2.789997]  [<ffffffff81000408>] do_one_initcall+0x38/0x140
[    2.789998]  [<ffffffff8116001c>] ? __vunmap+0x7c/0xc0
[    2.789999]  [<ffffffff81124fb0>] ? do_init_module+0x22/0x1d2
[    2.790000]  [<ffffffff81124fe8>] do_init_module+0x5a/0x1d2
[    2.790002]  [<ffffffff810c96b1>] load_module+0x1e11/0x2580
[    2.790003]  [<ffffffff810c68b0>] ? show_taint+0x30/0x30
[    2.790004]  [<ffffffff81177250>] ? kernel_read_file+0x100/0x190
[    2.790005]  [<ffffffff810c9ffa>] SyS_finit_module+0xba/0xc0
[    2.790007]  [<ffffffff814f13e0>] entry_SYSCALL_64_fastpath+0x13/0x94
[    2.790008] ---[ end trace c78a74e78baec6fc ]---

So, allocate the structure dynamically.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: Tighten x86 /dev/mem with zeroing reads

commit a4866aa812518ed1a37d8ea0c881dc946409de94 upstream.

Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
disallowed. However, on x86, the first 1MB was always allowed for BIOS
and similar things, regardless of it actually being System RAM. It was
possible for heap to end up getting allocated in low 1MB RAM, and then
read by things like x86info or dd, which would trip hardened usercopy:

usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)

This changes the x86 exception for the low 1MB by reading back zeros for
System RAM areas instead of blindly allowing them. More work is needed to
extend this to mmap, but currently mmap doesn't go through usercopy, so
hardened usercopy won't Oops the kernel.

Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc: tegra: Implement clock handling

commit 5fa4086987506b2ab8c92f8f99f2295db9918856 upstream.

Accessing the registers of the RTC block on Tegra requires the module
clock to be enabled. This only works because the RTC module clock will
be enabled by default during early boot. However, because the clock is
unused, the CCF will disable it at late_init time. This causes the RTC
to become unusable afterwards. This can easily be reproduced by trying
to use the RTC:

$ hwclock --rtc /dev/rtc1

This will hang the system. I ran into this by following up on a report
by Martin Michlmayr that reboot wasn't working on Tegra210 systems. It
turns out that the rtc-tegra driver's ->shutdown() implementation will
hang the CPU, because of the disabled clock, before the system can be
rebooted.

What confused me for a while is that the same driver is used on prior
Tegra generations where the hang can not be observed. However, as Peter
De Schrijver pointed out, this is because on 32-bit Tegra chips the RTC
clock is enabled by the tegra20_timer.c clocksource driver, which uses
the RTC to provide a persistent clock. This code is never enabled on
64-bit Tegra because the persistent clock infrastructure does not exist
on 64-bit ARM.

The proper fix for this is to add proper clock handling to the RTC
driver in order to ensure that the clock is enabled when the driver
requires it. All device trees contain the clock already, therefore
no additional changes are required.

Reported-by: Martin Michlmayr <tbm@cyrius.com>
Acked-By Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

platform/x86: acer-wmi: setup accelerometer when machine has appropriate notify event

commit 98d610c3739ac354319a6590b915f4624d9151e6 upstream.

The accelerometer event relies on the ACERWMID_EVENT_GUID notify.
So, this patch changes the codes to setup accelerometer input device
when detected ACERWMID_EVENT_GUID. It avoids that the accel input
device created on every Acer machines.

In addition, patch adds a clearly parsing logic of accelerometer hid
to acer_wmi_get_handle_cb callback function. It is positive matching
the "SENR" name with "BST0001" device to avoid non-supported hardware.

Reported-by: Bjørn Mork <bjorn@mork.no>
Cc: Darren Hart <dvhart@infradead.org>
Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
[andy: slightly massage commit message]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>