David Herrmann [Fri, 8 Aug 2014 21:25:36 +0000 (14:25 -0700)]
shm: wait for pins to be released when sealing
If we set SEAL_WRITE on a file, we must make sure there cannot be any
ongoing write-operations on the file. For write() calls, we simply lock
the inode mutex, for mmap() we simply verify there're no writable
mappings. However, there might be pages pinned by AIO, Direct-IO and
similar operations via GUP. We must make sure those do not write to the
memfd file after we set SEAL_WRITE.
As there is no way to notify GUP users to drop pages or to wait for them
to be done, we implement the wait ourself: When setting SEAL_WRITE, we
check all pages for their ref-count. If it's bigger than 1, we know
there's some user of the page. We then mark the page and wait for up to
150ms for those ref-counts to be dropped. If the ref-counts are not
dropped in time, we refuse the seal operation.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I44afbd3f0af72777702c317737f8d16c566bd240
David Herrmann [Fri, 8 Aug 2014 21:25:25 +0000 (14:25 -0700)]
mm: allow drivers to prevent new writable mappings
This patch (of 6):
The i_mmap_writable field counts existing writable mappings of an
address_space. To allow drivers to prevent new writable mappings, make
this counter signed and prevent new writable mappings if it is negative.
This is modelled after i_writecount and DENYWRITE.
This will be required by the shmem-sealing infrastructure to prevent any
new writable mappings after the WRITE seal has been set. In case there
exists a writable mapping, this operation will fail with EBUSY.
Note that we rely on the fact that iff you already own a writable mapping,
you can increase the counter without using the helpers. This is the same
that we do for i_writecount.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ic852afffd43f8e75333d8182ad2ab045c78996f4
Oleg Nesterov [Wed, 11 Sep 2013 21:20:20 +0000 (14:20 -0700)]
mm: mmap_region: kill correct_wcount/inode, use allow_write_access()
correct_wcount and inode in mmap_region() just complicate the code. This
boolean was needed previously, when deny_write_access() was called before
vma_merge(), now we can simply check VM_DENYWRITE and do
allow_write_access() if it is set.
allow_write_access() checks file != NULL, so this is safe even if it was
possible to use VM_DENYWRITE && !file. Just we need to ensure we use the
same file which was deny_write_access()'ed, so the patch also moves "file
= vma->vm_file" down after allow_write_access().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@android.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I4274f2f7e24b51ce8e2eafd5a28bdafb106fbe5e
Oleg Nesterov [Wed, 11 Sep 2013 21:20:19 +0000 (14:20 -0700)]
mm: do_mmap_pgoff: cleanup the usage of file_inode()
Simple cleanup. Move "struct inode *inode" variable into "if (file)"
block to simplify the code and avoid the unnecessary check.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@android.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I1bf739a9e3175cf5d3c1d3b421bf5daa48d5f1b5
Oleg Nesterov [Wed, 11 Sep 2013 21:20:18 +0000 (14:20 -0700)]
mm: shift VM_GROWS* check from mmap_region() to do_mmap_pgoff()
mmap() doesn't allow the non-anonymous mappings with VM_GROWS* bit set.
In particular this means that mmap_region()->vma_merge(file, vm_flags)
must always fail if "vm_flags & VM_GROWS" is set incorrectly.
So it does not make sense to check VM_GROWS* after we already allocated
the new vma, the only caller, do_mmap_pgoff(), which can pass this flag
can do the check itself.
And this looks a bit more correct, mmap_region() already unmapped the
old mapping at this stage. But if mmap() is going to fail, it should
avoid do_munmap() if possible.
Note: we check VM_GROWS at the end to ensure that do_mmap_pgoff() won't
return EINVAL in the case when it currently returns another error code.
Many thanks to Hugh who nacked the buggy v1.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ic81c1919adf051b9308125fcb87ae6a46e71b580
Oleg Nesterov [Wed, 11 Sep 2013 21:20:14 +0000 (14:20 -0700)]
mm: mempolicy: turn vma_set_policy() into vma_dup_policy()
Simple cleanup. Every user of vma_set_policy() does the same work, this
looks a bit annoying imho. And the new trivial helper which does
mpol_dup() + vma_set_policy() to simplify the callers.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ice9406b849ca330fc0b6bc2436a685fa6fd50217
David Herrmann [Fri, 8 Aug 2014 21:25:27 +0000 (14:25 -0700)]
shm: add sealing API
If two processes share a common memory region, they usually want some
guarantees to allow safe access. This often includes:
- one side cannot overwrite data while the other reads it
- one side cannot shrink the buffer while the other accesses it
- one side cannot grow the buffer beyond previously set boundaries
If there is a trust-relationship between both parties, there is no need
for policy enforcement. However, if there's no trust relationship (eg.,
for general-purpose IPC) sharing memory-regions is highly fragile and
often not possible without local copies. Look at the following two
use-cases:
1) A graphics client wants to share its rendering-buffer with a
graphics-server. The memory-region is allocated by the client for
read/write access and a second FD is passed to the server. While
scanning out from the memory region, the server has no guarantee that
the client doesn't shrink the buffer at any time, requiring rather
cumbersome SIGBUS handling.
2) A process wants to perform an RPC on another process. To avoid huge
bandwidth consumption, zero-copy is preferred. After a message is
assembled in-memory and a FD is passed to the remote side, both sides
want to be sure that neither modifies this shared copy, anymore. The
source may have put sensible data into the message without a separate
copy and the target may want to parse the message inline, to avoid a
local copy.
While SIGBUS handling, POSIX mandatory locking and MAP_DENYWRITE provide
ways to achieve most of this, the first one is unproportionally ugly to
use in libraries and the latter two are broken/racy or even disabled due
to denial of service attacks.
This patch introduces the concept of SEALING. If you seal a file, a
specific set of operations is blocked on that file forever. Unlike locks,
seals can only be set, never removed. Hence, once you verified a specific
set of seals is set, you're guaranteed that no-one can perform the blocked
operations on this file, anymore.
An initial set of SEALS is introduced by this patch:
- SHRINK: If SEAL_SHRINK is set, the file in question cannot be reduced
in size. This affects ftruncate() and open(O_TRUNC).
- GROW: If SEAL_GROW is set, the file in question cannot be increased
in size. This affects ftruncate(), fallocate() and write().
- WRITE: If SEAL_WRITE is set, no write operations (besides resizing)
are possible. This affects fallocate(PUNCH_HOLE), mmap() and
write().
- SEAL: If SEAL_SEAL is set, no further seals can be added to a file.
This basically prevents the F_ADD_SEAL operation on a file and
can be set to prevent others from adding further seals that you
don't want.
The described use-cases can easily use these seals to provide safe use
without any trust-relationship:
1) The graphics server can verify that a passed file-descriptor has
SEAL_SHRINK set. This allows safe scanout, while the client is
allowed to increase buffer size for window-resizing on-the-fly.
Concurrent writes are explicitly allowed.
2) For general-purpose IPC, both processes can verify that SEAL_SHRINK,
SEAL_GROW and SEAL_WRITE are set. This guarantees that neither
process can modify the data while the other side parses it.
Furthermore, it guarantees that even with writable FDs passed to the
peer, it cannot increase the size to hit memory-limits of the source
process (in case the file-storage is accounted to the source).
The new API is an extension to fcntl(), adding two new commands:
F_GET_SEALS: Return a bitset describing the seals on the file. This
can be called on any FD if the underlying file supports
sealing.
F_ADD_SEALS: Change the seals of a given file. This requires WRITE
access to the file and F_SEAL_SEAL may not already be set.
Furthermore, the underlying file must support sealing and
there may not be any existing shared mapping of that file.
Otherwise, EBADF/EPERM is returned.
The given seals are _added_ to the existing set of seals
on the file. You cannot remove seals again.
The fcntl() handler is currently specific to shmem and disabled on all
files. A file needs to explicitly support sealing for this interface to
work. A separate syscall is added in a follow-up, which creates files that
support sealing. There is no intention to support this on other
file-systems. Semantics are unclear for non-volatile files and we lack any
use-case right now. Therefore, the implementation is specific to shmem.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Angelo G. Del Regno <kholk11@gmail.com>
Change-Id: Ib71a640ebcc010c1ac2ec384bc292dd9dc7a5a26
Roman Birg [Mon, 18 Aug 2014 21:04:44 +0000 (14:04 -0700)]
input: gpio_keys: report SW_LID instead of SW_FLIP
* Android expects SW_LID for lid events. It also expects a different
sequence of lid state since windowed covers are not yet supported.
Change-Id: Iebffbabdbb3748eec4f887ebd227c67adf01d8ef
Signed-off-by: Roman Birg <roman@cyngn.com>
Eric Dumazet [Mon, 16 Mar 2015 04:12:12 +0000 (21:12 -0700)]
net: add sk_fullsock() helper
We have many places where we want to check if a socket is
not a timewait or request socket. Use a helper to avoid
hard coding this.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[backported from net-next
1d0ab253872cdd3d8e7913f59c266c7fd01771d0]
[lorenzo@google.com: removed TCPF_NEW_SYN_RECV, and added a comment to add it back.]
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Bug:
24163529
Change-Id: Ibf09017e1ab00af5e6925273117c335d7f515d73
Harout Hedeshian [Mon, 2 Feb 2015 20:30:42 +0000 (13:30 -0700)]
net: tcp: Scale the TCP backlog queue to absorb packet bursts
A large momentary influx of packets flooding the TCP layer may cause
packets to get dropped at the socket's backlog queue. Bump this up to
prevent these drops. Note that this change may cause the socket memory
accounting to allow the total backlog queue length to exceed the user
space configured values, sometimes by a substantial amount, which can
lead to out of order packets to be dropped instead of being queued. To
avoid these ofo drops, the condition to drop an out of order packet is
modified to allow out of order queuing to continue as long as it falls
within the now increased backlog queue limit.
Change-Id: I447ffc8560cb149fe84193c72bf693862f7ec740
Signed-off-by: Harout Hedeshian <harouth@codeaurora.org>
Eric Dumazet [Mon, 5 Oct 2015 04:08:09 +0000 (21:08 -0700)]
ipv6: inet6_sk() should use sk_fullsock()
SYN_RECV & TIMEWAIT sockets are not full blown, they do not have a pinet6
pointer.
Bug:
24163529
Change-Id: I6ce67a190d67d200c6ebeb81d2daeb9c86cd7581
Fixes:
ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Eric Dumazet [Tue, 19 Sep 2017 17:05:57 +0000 (10:05 -0700)]
tcp: fastopen: fix on syn-data transmit failure
[ Upstream commit
b5b7db8d680464b1d631fd016f5e093419f0bfd9 ]
Our recent change exposed a bug in TCP Fastopen Client that syzkaller
found right away [1]
When we prepare skb with SYN+DATA, we attempt to transmit it,
and we update socket state as if the transmit was a success.
In socket RTX queue we have two skbs, one with the SYN alone,
and a second one containing the DATA.
When (malicious) ACK comes in, we now complain that second one had no
skb_mstamp.
The proper fix is to make sure that if the transmit failed, we do not
pretend we sent the DATA skb, and make it our send_head.
When 3WHS completes, we can now send the DATA right away, without having
to wait for a timeout.
[1]
WARNING: CPU: 0 PID: 100189 at net/ipv4/tcp_input.c:3117 tcp_clean_rtx_queue+0x2057/0x2ab0 net/ipv4/tcp_input.c:3117()
WARN_ON_ONCE(last_ackt == 0);
Modules linked in:
CPU: 0 PID: 100189 Comm: syz-executor1 Not tainted
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
0000000000000000 ffff8800b35cb1d8 ffffffff81cad00d 0000000000000000
ffffffff828a4347 ffff88009f86c080 ffffffff8316eb20 0000000000000d7f
ffff8800b35cb220 ffffffff812c33c2 ffff8800baad2440 00000009d46575c0
Call Trace:
[<
ffffffff81cad00d>] __dump_stack
[<
ffffffff81cad00d>] dump_stack+0xc1/0x124
[<
ffffffff812c33c2>] warn_slowpath_common+0xe2/0x150
[<
ffffffff812c361e>] warn_slowpath_null+0x2e/0x40
[<
ffffffff828a4347>] tcp_clean_rtx_queue+0x2057/0x2ab0 n
[<
ffffffff828ae6fd>] tcp_ack+0x151d/0x3930
[<
ffffffff828baa09>] tcp_rcv_state_process+0x1c69/0x4fd0
[<
ffffffff828efb7f>] tcp_v4_do_rcv+0x54f/0x7c0
[<
ffffffff8258aacb>] sk_backlog_rcv
[<
ffffffff8258aacb>] __release_sock+0x12b/0x3a0
[<
ffffffff8258ad9e>] release_sock+0x5e/0x1c0
[<
ffffffff8294a785>] inet_wait_for_connect
[<
ffffffff8294a785>] __inet_stream_connect+0x545/0xc50
[<
ffffffff82886f08>] tcp_sendmsg_fastopen
[<
ffffffff82886f08>] tcp_sendmsg+0x2298/0x35a0
[<
ffffffff82952515>] inet_sendmsg+0xe5/0x520
[<
ffffffff8257152f>] sock_sendmsg_nosec
[<
ffffffff8257152f>] sock_sendmsg+0xcf/0x110
Fixes:
8c72c65b426b ("tcp: update skb->skb_mstamp more carefully")
Fixes:
783237e8daf1 ("net-tcp: Fast Open client - sending SYN-data")
Change-Id: I1ee49ef4b2ab363fd9f10a518c1ce8bfa71ad7d1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lorenzo Colitti [Tue, 29 Nov 2016 17:56:47 +0000 (02:56 +0900)]
net: ipv4: Don't crash if passing a null sk to ip_rt_update_pmtu.
Commit
e2d118a1cb5e ("net: inet: Support UID-based routing in IP
protocols.") made __build_flow_key call sock_net(sk) to determine
the network namespace of the passed-in socket. This crashes if sk
is NULL.
Fix this by getting the network namespace from the skb instead.
[Backport of net-next
d109e61bfe7a468fd8df4a7ceb65635e7aa909a0]
Bug:
16355602
Change-Id: I23b43db5adb8546833e013c268f31111d0e53c69
Fixes:
e2d118a1cb5e ("net: inet: Support UID-based routing in IP protocols.")
Reported-by: Erez Shitrit <erezsh@dev.mellanox.co.il>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Thu, 3 Nov 2016 17:23:43 +0000 (02:23 +0900)]
net: inet: Support UID-based routing in IP protocols.
- Use the UID in routing lookups made by protocol connect() and
sendmsg() functions.
- Make sure that routing lookups triggered by incoming packets
(e.g., Path MTU discovery) take the UID of the socket into
account.
- For packets not associated with a userspace socket, (e.g., ping
replies) use UID 0 inside the user namespace corresponding to
the network namespace the socket belongs to. This allows
all namespaces to apply routing and iptables rules to
kernel-originated traffic in that namespaces by matching UID 0.
This is better than using the UID of the kernel socket that is
sending the traffic, because the UID of kernel sockets created
at namespace creation time (e.g., the per-processor ICMP and
TCP sockets) is the UID of the user that created the socket,
which might not be mapped in the namespace.
[Backport of net-next
e2d118a1cb5e60d077131a09db1d81b90a5295fe]
Bug:
16355602
Change-Id: I126f8359887b5b5bbac68daf0ded89e899cb7cb0
Tested: compiles allnoconfig, allyesconfig, allmodconfig
Tested: https://android-review.googlesource.com/253302
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Thu, 15 May 2014 23:38:41 +0000 (16:38 -0700)]
net: ipv6: make "ip -6 route get mark xyz" work.
Currently, "ip -6 route get mark xyz" ignores the mark passed in
by userspace. Make it honour the mark, just like IPv4 does.
[net-next commit
2e47b291953c35afa4e20a65475954c1a1b9afe1]
Change-Id: Idaae7338506d1785a80159bfe4f0cc3c2a9b6827
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Thu, 3 Nov 2016 17:23:41 +0000 (02:23 +0900)]
net: core: Add a UID field to struct sock.
Protocol sockets (struct sock) don't have UIDs, but most of the
time, they map 1:1 to userspace sockets (struct socket) which do.
Various operations such as the iptables xt_owner match need
access to the "UID of a socket", and do so by following the
backpointer to the struct socket. This involves taking
sk_callback_lock and doesn't work when there is no socket
because userspace has already called close().
Simplify this by adding a sk_uid field to struct sock whose value
matches the UID of the corresponding struct socket. The semantics
are as follows:
1. Whenever sk_socket is non-null: sk_uid is the same as the UID
in sk_socket, i.e., matches the return value of sock_i_uid.
Specifically, the UID is set when userspace calls socket(),
fchown(), or accept().
2. When sk_socket is NULL, sk_uid is defined as follows:
- For a socket that no longer has a sk_socket because
userspace has called close(): the previous UID.
- For a cloned socket (e.g., an incoming connection that is
established but on which userspace has not yet called
accept): the UID of the socket it was cloned from.
- For a socket that has never had an sk_socket: UID 0 inside
the user namespace corresponding to the network namespace
the socket belongs to.
Kernel sockets created by sock_create_kern are a special case
of #1 and sk_uid is the user that created them. For kernel
sockets created at network namespace creation time, such as the
per-processor ICMP and TCP sockets, this is the user that created
the network namespace.
[Backport of net-next
86741ec25462e4c8cdce6df2f41ead05568c7d5e]
Bug:
16355602
Change-Id: I73e1a57dfeedf672f4c2dfc9ce6867838b55974b
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 9 Dec 2014 17:56:08 +0000 (09:56 -0800)]
tcp: fix more NULL deref after prequeue changes
When I cooked commit
c3658e8d0f1 ("tcp: fix possible NULL dereference in
tcp_vX_send_reset()") I missed other spots we could deref a NULL
skb_dst(skb)
Again, if a socket is provided, we do not need skb_dst() to get a
pointer to network namespace : sock_net(sk) is good enough.
[Backport of net-next
0f85feae6b710ced3abad5b2b47d31dfcb956b62]
Bug:
16355602
Change-Id: I72c9f7dae8da4451112a20ea36183365303bd389
Reported-by: Dann Frazier <dann.frazier@canonical.com>
Bisected-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes:
ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Fri, 12 Aug 2016 16:13:38 +0000 (01:13 +0900)]
net: ipv6: Fix ping to link-local addresses.
ping_v6_sendmsg does not set flowi6_oif in response to
sin6_scope_id or sk_bound_dev_if, so it is not possible to use
these APIs to ping an IPv6 address on a different interface.
Instead, it sets flowi6_iif, which is incorrect but harmless.
Stop setting flowi6_iif, and support various ways of setting oif
in the same priority order used by udpv6_sendmsg.
[Backport of net
5e457896986e16c440c97bb94b9ccd95dd157292]
Bug:
29370996
Change-Id: I2c8bc213c417a4427f64439e0954138cb30416c2
Tested: https://android-review.googlesource.com/#/c/254470/
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Corinna Vinschen <xda@vinschen.de>
Chris Clark [Tue, 27 Aug 2013 18:02:15 +0000 (12:02 -0600)]
ipv4: sendto/hdrincl: don't use destination address found in header
ipv4: raw_sendmsg: don't use header's destination address
A sendto() regression was bisected and found to start with commit
f8126f1d5136be1 (ipv4: Adjust semantics of rt->rt_gateway.)
The problem is that it tries to ARP-lookup the constructed packet's
destination address rather than the explicitly provided address.
Fix this using FLOWI_FLAG_KNOWN_NH so that given nexthop is used.
cf. commit
2ad5b9e4bd314fc685086b99e90e5de3bc59e26b
Reported-by: Chris Clark <chris.clark@alcatel-lucent.com>
Bisected-by: Chris Clark <chris.clark@alcatel-lucent.com>
Tested-by: Chris Clark <chris.clark@alcatel-lucent.com>
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Chris Clark <chris.clark@alcatel-lucent.com>
Change-Id: I06c9f3a0bca97a4b190e31543345e5accbf73a6d
Danny Wood [Mon, 6 Jul 2020 14:06:03 +0000 (15:06 +0100)]
universal7580: a7xelte-dts: remove second touchkey entry
* fixes touchkeys using our abov_touchkey_ft1804 combined driver
Change-Id: Ie82812adf7a13b2c4d05ea70d319cae83e2e566f
Sourajit Karmakar [Tue, 21 Apr 2020 13:48:24 +0000 (09:48 -0400)]
defconfig: Import a7xelte defconfig.
Thanks @danwood76.
Change-Id: I3e68f291b872d1e493d662d9cab699fb0f472a2c
Dario Trombello [Thu, 20 Feb 2020 18:41:57 +0000 (18:41 +0000)]
sensors: k2hh: Fix accelerometer
Using the STMicroelectronics K2HH driver from SM-J700F Android 6.0 kernel source (J700FXXU4BQE3) makes the sensor work.
Change-Id: I9f50ea5096b56617b171d5cc64c2ed1b01a3e205
Dario Trombello [Thu, 20 Feb 2020 18:34:39 +0000 (18:34 +0000)]
arm64: Add lineageos_j7elte_defconfig
Change-Id: I85146aabf467c93cc6713b63445dfac667186212
Will Deacon [Mon, 11 Aug 2014 13:24:47 +0000 (14:24 +0100)]
asm-generic: add memfd_create system call to unistd.h
Commit
9183df25fe7b ("shm: add memfd_create() syscall") added a new
system call (memfd_create) but didn't update the asm-generic unistd
header.
This patch adds the new system call to the asm-generic version of
unistd.h so that it can be used by architectures such as arm64.
Change-Id: I173b1e5b6087fcea7d226a9f55f792432515897d
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
David Herrmann [Fri, 8 Aug 2014 21:25:29 +0000 (14:25 -0700)]
shm: add memfd_create() syscall
memfd_create() is similar to mmap(MAP_ANON), but returns a file-descriptor
that you can pass to mmap(). It can support sealing and avoids any
connection to user-visible mount-points. Thus, it's not subject to quotas
on mounted file-systems, but can be used like malloc()'ed memory, but with
a file-descriptor to it.
memfd_create() returns the raw shmem file, so calls like ftruncate() can
be used to modify the underlying inode. Also calls like fstat() will
return proper information and mark the file as regular file. If you want
sealing, you can specify MFD_ALLOW_SEALING. Otherwise, sealing is not
supported (like on all other regular files).
Compared to O_TMPFILE, it does not require a tmpfs mount-point and is not
subject to a filesystem size limit. It is still properly accounted to
memcg limits, though, and to the same overcommit or no-overcommit
accounting as all user memory.
Change-Id: Iaf959293e2c490523aeb46d56cc45b0e7bbe7bf5
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Angelo G. Del Regno <kholk11@gmail.com>
Corinna Vinschen [Sun, 18 Nov 2018 18:21:35 +0000 (19:21 +0100)]
universal7580: fix commit "ANDROID: sdcardfs: Hold i_mutex for
i_size_write"
I accidentally merged the 3.18 patch, using a different way to access
the lower file's inode. Use the 3.10 technique instead.
Change-Id: Iea18abcb24cce9afa23e870af8beb31767d67250
Signed-off-by: Corinna Vinschen <xda@vinschen.de>
Daniel Rosenberg [Thu, 25 Oct 2018 23:25:15 +0000 (16:25 -0700)]
ANDROID: sdcardfs: Add option to not link obb
Add mount option unshared_obb to not link the obb
folders of multiple users together.
Bug:
27915347
Test: mount with option. Check if altering one obb
alters the other
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I3956e06bd0a222b0bbb2768c9a8a8372ada85e1e
Signed-off-by: Paul Keith <javelinanddart@gmail.com>
Daniel Rosenberg [Thu, 25 Oct 2018 23:22:50 +0000 (16:22 -0700)]
ANDROID: sdcardfs: Add sandbox
Android/sandbox is treated the same as Android/data
Bug:
27915347
Test: ls -l /sdcard/Android/sandbox/*somepackage* after
creating the folder.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I7ef440a88df72198303c419e1f2f7c4657f9c170
Signed-off-by: Paul Keith <javelinanddart@gmail.com>
Daniel Rosenberg [Fri, 6 Jul 2018 23:24:27 +0000 (16:24 -0700)]
ANDROID: sdcardfs: Add option to drop unused dentries
This adds the nocache mount option, which will cause sdcardfs to always
drop dentries that are not in use, preventing cached entries from
holding on to lower dentries, which could cause strange behavior when
bypassing the sdcardfs layer and directly changing the lower fs.
Change-Id: I70268584a20b989ae8cfdd278a2e4fa1605217fb
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Paul Keith <javelinanddart@gmail.com>
Daniel Rosenberg [Fri, 20 Jul 2018 23:11:40 +0000 (16:11 -0700)]
ANDROID: sdcardfs: Change current->fs under lock
bug:
111641492
Change-Id: I79e9894f94880048edaf0f7cfa2d180f65cbcf3b
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Paul Keith <javelinanddart@gmail.com>
Daniel Rosenberg [Fri, 20 Jul 2018 01:08:35 +0000 (18:08 -0700)]
ANDROID: sdcardfs: Don't use OVERRIDE_CRED macro
The macro hides some control flow, making it easier
to run into bugs.
bug:
111642636
Change-Id: I37ec207c277d97c4e7f1e8381bc9ae743ad78435
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Paul Keith <javelinanddart@gmail.com>
Danny Wood [Thu, 31 Oct 2019 14:35:27 +0000 (14:35 +0000)]
Revert "FROMLIST: android: binder: Move buffer out of area shared with user space"
This commit causes the Samsung a5xelte fingerprint blobs to stop working
This reverts commit
35852b611c5af888e0ac979391099fe2035a06be.
Change-Id: I4c9e3c551deb98b793cb6a7de9ef2a14f3a46067
Todd Kjos [Wed, 12 Jun 2019 20:29:27 +0000 (13:29 -0700)]
binder: fix possible UAF when freeing buffer
commit
a370003cc301d4361bae20c9ef615f89bf8d1e8a upstream
There is a race between the binder driver cleaning
up a completed transaction via binder_free_transaction()
and a user calling binder_ioctl(BC_FREE_BUFFER) to
release a buffer. It doesn't matter which is first but
they need to be protected against running concurrently
which can result in a UAF.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org> # 4.14 4.19
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change-Id: I1b9c9bdc52df8ddbc5fe7c6d8308f1068265f8ae
Martijn Coenen [Tue, 9 Jul 2019 11:09:23 +0000 (13:09 +0200)]
BACKPORT: binder: Set end of SG buffer area properly.
In case the target node requests a security context, the
extra_buffers_size is increased with the size of the security context.
But, that size is not available for use by regular scatter-gather
buffers; make sure the ending of that buffer is marked correctly.
Bug:
136210786
Acked-by: Todd Kjos <tkjos@google.com>
Fixes:
ec74136ded79 ("binder: create node flag to request sender's security context")
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable@vger.kernel.org # 5.1+
Link: https://lore.kernel.org/r/20190709110923.220736-1-maco@android.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
a56587065094fd96eb4c2b5ad65571daad32156d)
Change-Id: Ib4d3a99e7a881992c1313169f902cfad02a508a6
Todd Kjos [Wed, 24 Apr 2019 19:31:18 +0000 (12:31 -0700)]
UPSTREAM: binder: check for overflow when alloc for security context
commit
0b0509508beff65c1d50541861bc0d4973487dc5 upstream.
When allocating space in the target buffer for the security context,
make sure the extra_buffers_size doesn't overflow. This can only
happen if the given size is invalid, but an overflow can turn it
into a valid size. Fail the transaction if an overflow is detected.
Bug:
130571081
Change-Id: Ibaec652d2073491cc426a4a24004a848348316bf
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Todd Kjos [Mon, 14 Jan 2019 17:10:21 +0000 (09:10 -0800)]
FROMGIT: binder: create node flag to request sender's security context
To allow servers to verify client identity, allow a node
flag to be set that causes the sender's security context
to be delivered with the transaction. The BR_TRANSACTION
command is extended in BR_TRANSACTION_SEC_CTX to
contain a pointer to the security context string.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
ec74136ded792deed80780a2f8baf3521eeb72f9
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
master)
Change-Id: I44496546e2d0dc0022f818a45cd52feb1c1a92cb
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Tue, 6 Nov 2018 23:55:32 +0000 (15:55 -0800)]
UPSTREAM: binder: fix race that allows malicious free of live buffer
commit
7bada55ab50697861eee6bb7d60b41e68a961a9c upstream
Malicious code can attempt to free buffers using the BC_FREE_BUFFER
ioctl to binder. There are protections against a user freeing a buffer
while in use by the kernel, however there was a window where
BC_FREE_BUFFER could be used to free a recently allocated buffer that
was not completely initialized. This resulted in a use-after-free
detected by KASAN with a malicious test program.
This window is closed by setting the buffer's allow_user_free attribute
to 0 when the buffer is allocated or when the user has previously freed
it instead of waiting for the caller to set it. The problem was that
when the struct buffer was recycled, allow_user_free was stale and set
to 1 allowing a free to go through.
Bug:
116855682
Change-Id: I0b38089f6fdb1adbf7e1102747e4119c9a05b191
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Arve Hjønnevåg <arve@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Todd Kjos [Mon, 27 Nov 2017 17:32:33 +0000 (09:32 -0800)]
UPSTREAM: binder: fix proc->files use-after-free
proc->files cleanup is initiated by binder_vma_close. Therefore
a reference on the binder_proc is not enough to prevent the
files_struct from being released while the binder_proc still has
a reference. This can lead to an attempt to dereference the
stale pointer obtained from proc->files prior to proc->files
cleanup. This has been seen once in task_get_unused_fd_flags()
when __alloc_fd() is called with a stale "files".
The fix is to protect proc->files with a mutex to prevent cleanup
while in use.
Signed-off-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
7f3dc0088b98533f17128058fac73cd8b2752ef1)
Change-Id: I40982bb0b4615bda5459538c20eb2a913964042c
Martijn Coenen [Sat, 25 Aug 2018 20:50:56 +0000 (13:50 -0700)]
FROMLIST: ANDROID: binder: Add BINDER_GET_NODE_INFO_FOR_REF ioctl.
This allows the context manager to retrieve information about nodes
that it holds a reference to, such as the current number of
references to those nodes.
Such information can for example be used to determine whether the
servicemanager is the only process holding a reference to a node.
This information can then be passed on to the process holding the
node, which can in turn decide whether it wants to shut down to
reduce resource usage.
Signed-off-by: Martijn Coenen <maco@android.com>
Change-Id: I2fa9b6e2b1d1d6c84fca954125c3ec776dc2c04f
Martijn Coenen [Wed, 28 Mar 2018 09:14:50 +0000 (11:14 +0200)]
UPSTREAM: ANDROID: binder: prevent transactions into own process.
This can't happen with normal nodes (because you can't get a ref
to a node you own), but it could happen with the context manager;
to make the behavior consistent with regular nodes, reject
transactions into the context manager by the process owning it.
Reported-by: syzbot+09e05aba06723a94d43d@syzkaller.appspotmail.com
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
7aa135fcf26377f92dc0680a57566b4c7f3e281b)
Change-Id: I3f6c0528fb2d3f8b835255b2a0ec603cab94626a
Martijn Coenen [Fri, 16 Feb 2018 08:47:15 +0000 (09:47 +0100)]
UPSTREAM: ANDROID: binder: synchronize_rcu() when using POLLFREE.
To prevent races with ep_remove_waitqueue() removing the
waitqueue at the same time.
Reported-by: syzbot+a2a3c4909716e271487e@syzkaller.appspotmail.com
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
5eeb2ca02a2f6084fc57ae5c244a38baab07033a)
Change-Id: Ia0089448079c78d0ab0b57303faf838e9e5ee797
Martijn Coenen [Fri, 5 Jan 2018 10:27:07 +0000 (11:27 +0100)]
UPSTREAM: ANDROID: binder: remove waitqueue when thread exits.
binder_poll() passes the thread->wait waitqueue that
can be slept on for work. When a thread that uses
epoll explicitly exits using BINDER_THREAD_EXIT,
the waitqueue is freed, but it is never removed
from the corresponding epoll data structure. When
the process subsequently exits, the epoll cleanup
code tries to access the waitlist, which results in
a use-after-free.
Prevent this by using POLLFREE when the thread exits.
(cherry picked from commit
f5cb779ba16334b45ba8946d6bfa6d9834d1527f)
Change-Id: Ib34b1cbb8ab2192d78c3d9956b2f963a66ecad2e
Signed-off-by: Martijn Coenen <maco@android.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martijn Coenen [Wed, 20 Dec 2017 15:21:00 +0000 (16:21 +0100)]
ANDROID: binder: Remove obsolete proc waitqueue.
It was no longer being used.
Change-Id: I7fc42b76f688a459ad990f59fbd7006b96bb91a6
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Mon, 27 Nov 2017 17:24:33 +0000 (09:24 -0800)]
UPSTREAM: android: binder: fix type mismatch warning
Allowing binder to expose the 64-bit API on 32-bit kernels caused a
build warning:
drivers/android/binder.c: In function
'binder_transaction_buffer_release':
drivers/android/binder.c:2220:15: error: cast to pointer from integer of
different size [-Werror=int-to-pointer-cast]
fd_array = (u32 *)(parent_buffer + fda->parent_offset);
^
drivers/android/binder.c: In function 'binder_translate_fd_array':
drivers/android/binder.c:2445:13: error: cast to pointer from integer of
different size [-Werror=int-to-pointer-cast]
fd_array = (u32 *)(parent_buffer + fda->parent_offset);
^
drivers/android/binder.c: In function 'binder_fixup_parent':
drivers/android/binder.c:2511:18: error: cast to pointer from integer of
different size [-Werror=int-to-pointer-cast]
This adds extra type casts to avoid the warning.
However, there is another problem with the Kconfig option: turning
it on or off creates two incompatible ABI versions, a kernel that
has this enabled cannot run user space that was built without it
or vice versa. A better solution might be to leave the option hidden
until the binder code is fixed to deal with both ABI versions.
Fixes:
e8d2ed7db7c3 ("Revert "staging: Fix build issues with new binder
API"")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit
1c363eaece2752c5f8b1b874cb4ae435de06aa66)
Change-Id: Id09185a6f86905926699e92a2b30201b8a5e83e5
Martijn Coenen [Tue, 14 Nov 2017 16:04:12 +0000 (17:04 +0100)]
ANDROID: binder: clarify deferred thread work.
Rename the function to more accurately reflect what
it does, and add a comment explaining why we use it.
Change-Id: I8d011c017dfc6e24b5b54fc462578f8e153e5926
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Thu, 19 Oct 2017 13:04:46 +0000 (15:04 +0200)]
ANDROID: binder: Add thread->process_todo flag.
This flag determines whether the thread should currently
process the work in the thread->todo worklist.
The prime usecase for this is improving the performance
of synchronous transactions: all synchronous transactions
post a BR_TRANSACTION_COMPLETE to the calling thread,
but there's no reason to return that command to userspace
right away - userspace anyway needs to wait for the reply.
Likewise, a synchronous transaction that contains a binder
object can cause a BC_ACQUIRE/BC_INCREFS to be returned to
userspace; since the caller must anyway hold a strong/weak
ref for the duration of the call, postponing these commands
until the reply comes in is not a problem.
Note that this flag is not used to determine whether a
thread can handle process work; a thread should never pick
up process work when thread work is still pending.
Before patch:
------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------
BM_sendVec_binderize/4 45959 ns 20288 ns 34351
BM_sendVec_binderize/8 45603 ns 20080 ns 34909
BM_sendVec_binderize/16 45528 ns 20113 ns 34863
BM_sendVec_binderize/32 45551 ns 20122 ns 34881
BM_sendVec_binderize/64 45701 ns 20183 ns 34864
BM_sendVec_binderize/128 45824 ns 20250 ns 34576
BM_sendVec_binderize/256 45695 ns 20171 ns 34759
BM_sendVec_binderize/512 45743 ns 20211 ns 34489
BM_sendVec_binderize/1024 46169 ns 20430 ns 34081
After patch:
------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------
BM_sendVec_binderize/4 42939 ns 17262 ns 40653
BM_sendVec_binderize/8 42823 ns 17243 ns 40671
BM_sendVec_binderize/16 42898 ns 17243 ns 40594
BM_sendVec_binderize/32 42838 ns 17267 ns 40527
BM_sendVec_binderize/64 42854 ns 17249 ns 40379
BM_sendVec_binderize/128 42881 ns 17288 ns 40427
BM_sendVec_binderize/256 42917 ns 17297 ns 40429
BM_sendVec_binderize/512 43184 ns 17395 ns 40411
BM_sendVec_binderize/1024 43119 ns 17357 ns 40432
Signed-off-by: Martijn Coenen <maco@android.com>
Change-Id: Ia70287066d62aba64e98ac44ff1214e37ca75693
Sherry Yang [Thu, 5 Oct 2017 21:13:47 +0000 (17:13 -0400)]
FROMLIST: android: binder: Fix null ptr dereference in debug msg
(from https://patchwork.kernel.org/patch/
9990323/)
Don't access next->data in kernel debug message when the
next buffer is null.
Bug:
36007193
Change-Id: Ib8240d7e9a7087a2256e88c0ae84b9df0f2d0224
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
Ganesh Mahendran [Tue, 26 Sep 2017 09:56:25 +0000 (17:56 +0800)]
ANDROID: binder: fix node sched policy calculation
We should use FLAT_BINDER_FLAG_SCHED_POLICY_MASK as
the mask to calculate sched policy.
Change-Id: Ic252fd7c68495830690130d792802c02f99fc8fc
Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Ganesh Mahendran [Wed, 27 Sep 2017 07:12:25 +0000 (15:12 +0800)]
ANDROID: binder: init desired_prio.sched_policy before use it
In function binder_transaction_priority(), we access
desired_prio before initialzing it.
This patch fix this.
Change-Id: I9d14d50f9a128010476a65b52631630899a44633
Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Martijn Coenen [Thu, 24 Aug 2017 13:23:36 +0000 (15:23 +0200)]
ANDROID: binder: fix transaction leak.
If a call to put_user() fails, we failed to
properly free a transaction and send a failed
reply (if necessary).
Bug:
63117588
Test: binderLibTest
Change-Id: Ia98db8cd82ce354a4cdc8811c969988d585c7e31
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Mon, 8 May 2017 16:33:22 +0000 (09:33 -0700)]
ANDROID: binder: Add tracing for binder priority inheritance.
Bug:
34461621
Change-Id: I5ebb1c0c49fd42a89ee250a1d70221f767c82c7c
Signed-off-by: Martijn Coenen <maco@google.com>
Todd Kjos [Mon, 25 Sep 2017 15:55:09 +0000 (08:55 -0700)]
FROMLIST: binder: fix use-after-free in binder_transaction()
(from https://patchwork.kernel.org/patch/
9978801/)
User-space normally keeps the node alive when creating a transaction
since it has a reference to the target. The local strong ref keeps it
alive if the sending process dies before the target process processes
the transaction. If the source process is malicious or has a reference
counting bug, this can fail.
In this case, when we attempt to decrement the node in the failure
path, the node has already been freed.
This is fixed by taking a tmpref on the node while constructing
the transaction. To avoid re-acquiring the node lock and inner
proc lock to increment the proc's tmpref, a helper is used that
does the ref increments on both the node and proc.
Bug:
66899329
Change-Id: Iad40e1e0bccee88234900494fb52a510a37fe8d7
Signed-off-by: Todd Kjos <tkjos@google.com>
Xu YiPing [Tue, 5 Sep 2017 17:00:59 +0000 (10:00 -0700)]
FROMLIST: binder: fix an ret value override
(from https://patchwork.kernel.org/patch/
9939409/)
commit
372e3147df70 ("binder: guarantee txn complete / errors delivered
in-order") incorrectly defined a local ret value. This ret value will
be invalid when out of the if block
Change-Id: If7bd963ac7e67d135aa949133263aac27bf15d1a
Signed-off-by: Xu YiPing <xuyiping@hislicon.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Xu YiPing [Mon, 22 May 2017 18:26:23 +0000 (11:26 -0700)]
FROMLIST: binder: fix memory corruption in binder_transaction binder
(from https://patchwork.kernel.org/patch/
9939405/)
commit
7a4408c6bd3e ("binder: make sure accesses to proc/thread are
safe") made a change to enqueue tcomplete to thread->todo before
enqueuing the transaction. However, in err_dead_proc_or_thread case,
the tcomplete is directly freed, without dequeued. It may cause the
thread->todo list to be corrupted.
So, dequeue it before freeing.
Bug:
65333488
Change-Id: Id063a4db18deaa634f4d44aa6ebca47bea32537a
Signed-off-by: Xu YiPing <xuyiping@hisilicon.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Sherry Yang [Thu, 3 Aug 2017 18:33:53 +0000 (11:33 -0700)]
FROMLIST: android: binder: Move buffer out of area shared with user space
(from https://patchwork.kernel.org/patch/
9928607/)
Binder driver allocates buffer meta data in a region that is mapped
in user space. These meta data contain pointers in the kernel.
This patch allocates buffer meta data on the kernel heap that is
not mapped in user space, and uses a pointer to refer to the data mapped.
Also move alloc->buffers initialization from mmap to init since it's
now used even when mmap failed or was not called.
Bug:
36007193
Change-Id: Id5136048bdb7b796f59de066de7ea7df410498f5
Signed-off-by: Sherry Yang <sherryy@android.com>
Sherry Yang [Thu, 22 Jun 2017 21:37:45 +0000 (14:37 -0700)]
FROMLIST: android: binder: Add allocator selftest
(from https://patchwork.kernel.org/patch/
9928609/)
binder_alloc_selftest tests that alloc_new_buf handles page allocation and
deallocation properly when allocate and free buffers. The test allocates 5
buffers of various sizes to cover all possible page alignment cases, and
frees the buffers using a list of exhaustive freeing order.
Test: boot the device with ANDROID_BINDER_IPC_SELFTEST config option
enabled. Allocator selftest passes.
Bug:
36007193
Change-Id: I2fe396232b7dfe4bbc50bdba99ca0de9be63cc37
Signed-off-by: Sherry Yang <sherryy@android.com>
Sherry Yang [Fri, 30 Jun 2017 17:22:23 +0000 (10:22 -0700)]
FROMLIST: android: binder: Refactor prev and next buffer into a helper function
(from https://patchwork.kernel.org/patch/
9928605/)
Use helper functions buffer_next and buffer_prev instead
of list_entry to get the next and previous buffers.
Bug:
36007193
Change-Id: I422dce84afde3d2138a6d976593b109a9cc49003
Signed-off-by: Sherry Yang <sherryy@android.com>
Martijn Coenen [Wed, 15 Mar 2017 17:22:52 +0000 (18:22 +0100)]
FROMLIST: binder: add more debug info when allocation fails.
(from https://patchwork.kernel.org/patch/
9817797/)
Bug:
36088202
Test: tested manually
Change-Id: Ib526a9c375e6136669b72f341e0b54d896fd1cec
Signed-off-by: Martijn Coenen <maco@android.com>
Signed-off-by: Siqi Lin <siqilin@google.com>
Michal Nazarewicz [Tue, 12 Nov 2013 23:08:41 +0000 (15:08 -0800)]
gen_init_cpio: avoid NULL pointer dereference and rework env expanding
getenv() may return NULL if given environment variable does not exist
which leads to NULL dereference when calling strncat.
Besides that, the environment variable name was copied to a temporary
env_var buffer, but this copying can be avoided by simply using the input
string.
Lastly, the whole loop can be greatly simplified by using the snprintf
function instead of the playing with strncat.
By the way, the current implementation allows a recursive variable
expansion, as in:
$ echo 'out ${A} out ' | A='a ${B} a' B=b /tmp/a
out a b a out
I'm assuming this is just a side effect and not a conscious decision
(especially as this may lead to infinite loop), but I didn't want to
change this behaviour without consulting.
If the current behaviour is deamed incorrect, I'll be happy to send
a patch without recursive processing.
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Jesper Juhl <jj@codesealer.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I575a65c6261288ffe5c1166dc40271fbbc4d11cf
Ramkumar Ramachandra [Wed, 10 Jul 2013 18:03:38 +0000 (23:33 +0530)]
scripts: remove unused function in sortextable.c
Change-Id: Ifc81f354582e43d71cfe8a7f0ce4a5e50778bd81
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Nathan Chancellor [Sat, 2 Jun 2018 16:02:09 +0000 (09:02 -0700)]
kconfig: Avoid format overflow warning from GCC 8.1
In file included from scripts/kconfig/zconf.tab.c:2485:
scripts/kconfig/confdata.c: In function ‘conf_write’:
scripts/kconfig/confdata.c:773:22: warning: ‘%s’ directive writing likely 7 or more bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
sprintf(newname, "%s%s", dirname, basename);
^~
scripts/kconfig/confdata.c:773:19: note: assuming directive output of 7 bytes
sprintf(newname, "%s%s", dirname, basename);
^~~~~~
scripts/kconfig/confdata.c:773:2: note: ‘sprintf’ output 1 or more bytes (assuming 4104) into a destination of size 4097
sprintf(newname, "%s%s", dirname, basename);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
scripts/kconfig/confdata.c:776:23: warning: ‘.tmpconfig.’ directive writing 11 bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
^~~~~~~~~~~
scripts/kconfig/confdata.c:776:3: note: ‘sprintf’ output between 13 and 4119 bytes into a destination of size 4097
sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Increase the size of tmpname and newname to make GCC happy.
Change-Id: Ie3a8689e3982734be63d15e1ad98416ab13d4b48
Cc: stable@vger.kernel.org
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tim Gardner [Tue, 24 Nov 2015 08:04:15 +0000 (09:04 +0100)]
scripts/sortextable: suppress warning: `relocs_size' may be used uninitialized
In file included from scripts/sortextable.c:194:0:
scripts/sortextable.c: In function `main':
scripts/sortextable.h:176:3: warning: `relocs_size' may be used uninitialized in this function [-Wmaybe-uninitialized]
memset(relocs, 0, relocs_size);
^
scripts/sortextable.h:106:6: note: `relocs_size' was declared here
int relocs_size;
^
In file included from scripts/sortextable.c:192:0:
scripts/sortextable.h:176:3: warning: `relocs_size' may be used uninitialized in this function [-Wmaybe-uninitialized]
memset(relocs, 0, relocs_size);
^
scripts/sortextable.h:106:6: note: `relocs_size' was declared here
int relocs_size;
^
gcc 4.9.1
Change-Id: I50749f95f1e212ed8c913547bffed14f847b8929
Stricted [Mon, 29 Jul 2019 18:15:51 +0000 (18:15 +0000)]
Revert "Decon update max_win to 7"
* we only use 3 windows
This reverts commit
7f52706418f42c16f384496aecbf961c11299a4c.
Daniel Micay [Wed, 15 Jun 2016 10:11:48 +0000 (06:11 -0400)]
minimal port of grsecurity's DENYUSB feature
Danny Wood [Fri, 1 Feb 2019 09:53:56 +0000 (09:53 +0000)]
universal7580: Enable CT target for iptables
Florian Westphal [Thu, 8 Mar 2018 11:54:19 +0000 (12:54 +0100)]
netfilter: ebtables: fix erroneous reject of last rule
The last rule in the blob has next_entry offset that is same as total size.
This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on 64 bit kernel.
Change-Id: Idaf15f1b3ad0aabdff3e995131a0fdd8bffb1e4a
Fixes:
b71812168571fa ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jan Altensen <info@stricted.net>
Utkarsh Saxena [Wed, 11 Jan 2017 11:00:00 +0000 (16:30 +0530)]
nf_conntrack: Null pointer check added prior deleting sip node
Prior to null pointer check, SIP node was deleted from the list
a null pointer check is added to confront the exception.
Bug:
111529827
Change-Id: Ia12fa468eed7d1a91fad96840fa27cb0e4e208a8
Acked-by: Rishav LNU <rna@qti.qualcomm.com>
Signed-off-by: Utkarsh Saxena <usaxena@codeaurora.org>
Signed-off-by: Jan Altensen <info@stricted.net>
Ravinder Konka [Tue, 9 Dec 2014 10:52:47 +0000 (16:22 +0530)]
netfilter: Changes to handle segmentation in SIP ALG
Linux Kernel SIP ALG did not handle Segmented TCP Packets
because of which SIP communication could not be established
for some clients. This change fixes that issue.
Change-Id: I8c77322f69cf4d9ad4c7b4971da924ffd585dea0
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
Signed-off-by: Tyler Wear <twear@codeaurora.org>
Signed-off-by: Jan Altensen <info@stricted.net>
Peter Zijlstra [Thu, 26 Mar 2015 16:45:37 +0000 (17:45 +0100)]
locking: Remove atomicy checks from {READ,WRITE}_ONCE
The fact that volatile allows for atomic load/stores is a special case
not a requirement for {READ,WRITE}_ONCE(). Their primary purpose is to
force the compiler to emit load/stores _once_.
Change-Id: I86e34ae44535576859d66411208e7c8a13c0ec3a
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-commit:
7bd3e239d6c6d1cad276e8f130b386df4234dcd7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Andrew Morton [Wed, 3 Jul 2013 22:02:11 +0000 (15:02 -0700)]
UPSTREAM: include/linux/mm.h: add PAGE_ALIGNED() helper
To test whether an address is aligned to PAGE_SIZE.
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit
0fa73b86ef0797ca4fde5334117ca0b330f08030)
Bug:
36007193
Change-Id: I7e912bb0dbd8c9737fb13c5b48acb54ee39dd5fc
Al Viro [Sat, 9 May 2015 02:53:15 +0000 (22:53 -0400)]
path_openat(): fix double fput()
[ Upstream commit
f15133df088ecadd141ea1907f2c96df67c729f0 ]
path_openat() jumps to the wrong place after do_tmpfile() - it has
already done path_cleanup() (as part of path_lookupat() called by
do_tmpfile()), so doing that again can lead to double fput().
Change-Id: Ia74c130ae5e379b512532c0feebea871b5f73668
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Al Viro [Tue, 11 Jun 2013 04:23:01 +0000 (08:23 +0400)]
allow build_open_flags() to return an error
Change-Id: I6e900fc3facf5a3febefe138fea7db493bc383d8
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 6 Jun 2013 13:12:33 +0000 (09:12 -0400)]
do_last(): fix missing checks for LAST_BIND case
/proc/self/cwd with O_CREAT should fail with EISDIR. /proc/self/exe, OTOH,
should fail with ENOTDIR when opened with O_DIRECTORY.
Change-Id: I01c85a6a3894c6854c604f192f221175edc19867
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ethan Chen [Tue, 4 Sep 2018 02:11:39 +0000 (19:11 -0700)]
uapi: Define __BITS_PER_LONG based on compiler target
* We may compile 32-bit ARM code against these kernel headers in many
situations, so provide a compiler-defined method of obtaining the width
of long.
Change-Id: Iac5e48200d70f1258ab3caca1a8f1eb6e8f7f2d3
Eric Rannaud [Thu, 30 Oct 2014 08:51:01 +0000 (01:51 -0700)]
fs: allow open(dir, O_TMPFILE|..., 0) with mode 0
The man page for open(2) indicates that when O_CREAT is specified, the
'mode' argument applies only to future accesses to the file:
Note that this mode applies only to future accesses of the newly
created file; the open() call that creates a read-only file
may well return a read/write file descriptor.
The man page for open(2) implies that 'mode' is treated identically by
O_CREAT and O_TMPFILE.
O_TMPFILE, however, behaves differently:
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
assert(fd == -1);
assert(errno == EACCES);
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
assert(fd > 0);
For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:
if (*opened & FILE_CREATED) {
/* Don't check for write permission, don't truncate */
open_flag &= ~O_TRUNC;
will_truncate = false;
acc_mode = MAY_OPEN;
path_to_nameidata(path, nd);
goto finish_open_created;
}
But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
may_open().
This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
inode is created, may_open() is called with acc_mode = MAY_OPEN, in
do_tmpfile().
A different, but related glibc bug revealed the discrepancy:
https://sourceware.org/bugzilla/show_bug.cgi?id=17523
The glibc lazily loads the 'mode' argument of open() and openat() using
va_arg() only if O_CREAT is present in 'flags' (to support both the 2
argument and the 3 argument forms of open; same idea for openat()).
However, the glibc ignores the 'mode' argument if O_TMPFILE is in
'flags'.
On x86_64, for open(), it magically works anyway, as 'mode' is in
RDX when entering open(), and is still in RDX on SYSCALL, which is where
the kernel looks for the 3rd argument of a syscall.
But openat() is not quite so lucky: 'mode' is in RCX when entering the
glibc wrapper for openat(), while the kernel looks for the 4th argument
of a syscall in R10. Indeed, the syscall calling convention differs from
the regular calling convention in this respect on x86_64. So the kernel
sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
fails with EACCES.
Change-Id: If41388897576291d3cfb0515fd71efb70bf2b3df
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Fri, 2 Aug 2013 04:07:52 +0000 (21:07 -0700)]
fs: Fix file mode for O_TMPFILE
O_TMPFILE, like O_CREAT, should respect the requested mode and should
create regular files.
This fixes two bugs: O_TMPFILE required privilege (because the mode
ended up as 000) and it produced bogus inodes with no type.
Change-Id: I4d045c5b3a07e3d3114897c5f3d2448ab6c3a0a5
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Zheng Liu [Thu, 25 Jul 2013 00:13:19 +0000 (08:13 +0800)]
vfs: add missing check for __O_TMPFILE in fcntl_init()
As comment in include/uapi/asm-generic/fcntl.h described, when
introducing new O_* bits, we need to check its uniqueness in
fcntl_init(). But __O_TMPFILE bit is missing. So fix it.
Change-Id: I372d41d2bc15b007595eaf763e200c4c06de177f
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 19 Jul 2013 23:11:32 +0000 (03:11 +0400)]
allow O_TMPFILE to work with O_WRONLY
Change-Id: If75a4f1b8f1ba485f6073be4058b59126cef034b
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 13 Jul 2013 09:26:37 +0000 (13:26 +0400)]
Safer ABI for O_TMPFILE
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...
Change-Id: I4818563d79ca1abf9ea99f5ccea9317eb2f3b678
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 7 Jun 2013 05:20:27 +0000 (01:20 -0400)]
it's still short a few helpers, but infrastructure should be OK now...
Change-Id: I9e003fabb858fd901fd922cd891ca29966ccdf3a
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Lorenzo Colitti [Thu, 3 Nov 2016 17:23:42 +0000 (02:23 +0900)]
net: core: add UID to flows, rules, and routes
- Define a new FIB rule attributes, FRA_UID_RANGE, to describe a
range of UIDs.
- Define a RTA_UID attribute for per-UID route lookups and dumps.
- Support passing these attributes to and from userspace via
rtnetlink. The value INVALID_UID indicates no UID was
specified.
- Add a UID field to the flow structures.
[Backport of net-next
622ec2c9d52405973c9f1ca5116eb1c393adfc7d]
Bug:
16355602
Change-Id: I7e3ab388ed862c4b7e39dc8b0209d977cb1129ac
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stricted [Fri, 7 Sep 2018 09:46:28 +0000 (11:46 +0200)]
Revert "net: core: Support UID-based routing."
This reverts commit
99a6ea48b591877d1cd6a51732c40a1d5321d961.
Stricted [Fri, 7 Sep 2018 09:46:22 +0000 (11:46 +0200)]
Revert "Handle 'sk' being NULL in UID-based routing."
This reverts commit
455b09d66a9ccfc572497ae88375ae343ff9ae66.
Change-Id: I1d95864d8b48ae3ca418cfd790cfd62c07f54f66
Cong Wang [Tue, 15 Apr 2014 23:25:34 +0000 (16:25 -0700)]
ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
As suggested by Julian:
Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.
because in fib_rule_match() we do:
if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
goto out;
flowi4_iif should be LOOPBACK_IFINDEX by default.
We need to move LOOPBACK_IFINDEX to include/net/flow.h:
1) It is mostly used by flowi_iif
2) Fix the following compile error if we use it in flow.h
by the patches latter:
In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
[Backport of net-next
6a662719c9868b3d6c7d26b3a085f0cd3cc15e64]
Change-Id: Ib7a0a08d78c03800488afa1b2c170cb70e34cfd9
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Stricted [Tue, 6 Aug 2019 11:33:01 +0000 (11:33 +0000)]
defconfig: s5neolte: disable cpusets
Stricted [Mon, 29 Jul 2019 04:10:08 +0000 (04:10 +0000)]
defconfig: s5neolte: regenerate defconfig
Ruchi Kandoi [Thu, 23 Apr 2015 19:09:09 +0000 (12:09 -0700)]
nf: IDLETIMER: Adds the uid field in the msg
Message notifications contains an additional uid field. This field
represents the uid that was responsible for waking the radio. And hence
it is present only in notifications stating that the radio is now
active.
Change-Id: I18fc73eada512e370d7ab24fc9f890845037b729
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Bug:
20264396
Danny Wood [Fri, 3 May 2019 10:07:52 +0000 (11:07 +0100)]
bcmdhd: Fix android version check
Danny Wood [Fri, 3 May 2019 08:57:38 +0000 (09:57 +0100)]
dma: dmaengine: Do not skip opening a 'busy' DMA channel as it crashes the kernel and doesn't seem to be necessary
Danny Wood [Sat, 4 May 2019 09:36:48 +0000 (10:36 +0100)]
sec_battery: update prev_safety_time to fix the non-charging issue when the device is not plugged in for 20+ hours
Danny Wood [Thu, 25 Apr 2019 10:16:33 +0000 (11:16 +0100)]
a5xelte: add initial defconfig
Danny Wood [Mon, 8 Apr 2019 08:59:40 +0000 (09:59 +0100)]
cpufreq: interactive: check speedchange_task pointer before waking it to avoid a kernel panic
Danny Wood [Sun, 7 Apr 2019 08:21:44 +0000 (09:21 +0100)]
trace: exynos-ss: fix mode permissions in calls to 'ptrace_may_access'
Rohit Gupta [Sat, 7 Mar 2015 02:46:04 +0000 (18:46 -0800)]
cpufreq: interactive: Rearm governor timer at max freq
Interactive governor doesn't rearm per-cpu timer if target_freq is
equal to policy->max. However, this does not have clear performance
benefits. Profiling doesn't show any difference in benchmarks, games
or other workloads, if timers are always rearmed.
At same time, there are a few issues caused by not rearming timer
at policy->max.
1) min_sample_time enforcement is inconsistent
For target frequency that is lower than policy->max, it will not
drop until min_sample_time has passed since last frequency evaluation
selected current frequency. However, for policy->max, it will
always drop immediately as long as CPU has been run for longer than
min_sample_time. This is because timer is not running and thus
floor_freq and floor_validate_time is not updated.
Example: assume min_sample_time is 59ms and timer_rate is 20ms.
Frequency X < Y. Let's say CPU would pick the following frequencies
before accounting for min_sample_time in each 20ms sampling window.
Y, Y, Y, Y, X, X, X, X, X
If Y is not policy->max, the final target_freq after considering
min_sample_time will be Y, Y, Y, Y, *Y, *Y, X, X, X
* marks the windows where frequency is prevented from dropping.
If Y is policy->max, the final target_freq will be
Y, Y, Y, Y, X, X, X, X, X
2) Rearm timer in IDLE_START does not work as intended
IDLE_START/END is sent in arch_cpu_idle_enter/exit(). However, next
wake up is decided in tick_nohz_idle_enter(), which traverses the
timer list before idle notification is sent out. Therefore, rearming
timer in idle notification won't take effect until CPU wakes up at
least once. In rare scenarios when a CPU goes to idle and sleeps for a
long time immediately after a heavy load stops, it may not wake up
to drop its frequency vote for a long time, defeating the purpose of
having a slack_timer.
3) Need to rearm timer for policy->max change
commit
535a553fc1c4b4c3627c73214ade6326615a7463
(cpufreq: interactive: restructure CPUFREQ_GOV_LIMITS) mentions the
problem of timer getting indefinitely pushed back due to frequency
changes in policy->min/max. However, it still cancels and rearms timer
if policy->max is increased, and same problem could still happen if
policy->max is frequently changing after the fix. The best solution is
to always rearm timer for each CPU even if it's running at
policy->max.
Rearming timers even if target_freq is policy->max solves these
problems cleanly. It also simplifies the design and code of interactive
governor.
Change-Id: I973853d2375ea6f697fa4cee04a89efe6b8bf735
Reviewed-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Sultan Alsawaf [Sun, 3 Jun 2018 17:47:51 +0000 (10:47 -0700)]
kernel: Fix massive cpufreq stats memory leaks
Every time _cpu_up() is called for a CPU, idle_thread_get() is called which
then re-initializes a CPU's idle thread that was already previously created
and cached in a global variable in smpboot.c. idle_thread_get() calls
init_idle() which then calls __sched_fork(). __sched_fork() is where
cpufreq_task_stats_init() is, and cpufreq_task_stats_init() allocates
512 bytes of memory to a pointer in the task struct.
Since idle_thread_get() reuses a task struct instance that was already
previously created, this means that every time it calls init_idle(),
cpufreq_task_stats_init() allocates 512 bytes again and overwrites the
existing 512-byte allocation that the idle thread already had.
This causes 512 bytes to be leaked every time a CPU is onlined. This is
significant when non-boot CPUs are enabled during resume from suspend; this
means that (NR_CPUS - 1) * 512 bytes are leaked every time the device exits
suspend (this turned out to be ~500 kiB leaked in 20 minutes with the
device left on a desk with the screen off).
In order to fix this, don't initialize cpufreq stats at all for the idle
threads. The cpufreq stats interface is intended to be used for tracking
userspace tasks, so we can safely remove it from the kernel's idle threads
without killing any functionality.
Change-Id: I12fe7611fc88eb7f6c39f8f7629ad27b6ec4722c
Signed-off-by: Sultan Alsawaf <sultanxda@gmail.com>
NeilBrown [Fri, 13 Feb 2015 04:49:17 +0000 (15:49 +1100)]
sched: Prevent recursion in io_schedule()
commit
9cff8adeaa34b5d2802f03f89803da57856b3b72 upstream.
io_schedule() calls blk_flush_plug() which, depending on the
contents of current->plug, can initiate arbitrary blk-io requests.
Note that this contrasts with blk_schedule_flush_plug() which requires
all non-trivial work to be handed off to a separate thread.
This makes it possible for io_schedule() to recurse, and initiating
block requests could possibly call mempool_alloc() which, in times of
memory pressure, uses io_schedule().
Apart from any stack usage issues, io_schedule() will not behave
correctly when called recursively as delayacct_blkio_start() does
not allow for repeated calls.
So:
- use ->in_iowait to detect recursion. Set it earlier, and restore
it to the old value.
- move the call to "raw_rq" after the call to blk_flush_plug().
As this is some sort of per-cpu thing, we want some chance that
we are on the right CPU
- When io_schedule() is called recurively, use blk_schedule_flush_plug()
which cannot further recurse.
- as this makes io_schedule() a lot more complex and as io_schedule()
must match io_schedule_timeout(), but all the changes in io_schedule_timeout()
and make io_schedule a simple wrapper for that.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Moved the now rudimentary io_schedule() into sched.h. ]
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change-Id: Ic32d78863e35db89c3e8f0b29dcf20cb563dc70f
Junjie Wu [Tue, 5 Jan 2016 19:09:41 +0000 (11:09 -0800)]
cpufreq: interactive: Use wake_up_process_no_notif to wake up tasks
Scheduler could send a notification to governor each time a task wakes
up. If governor wakes up another task as a response to such a
notification, it could result in endless recursive notifications.
Use wake_up_process_no_notif to ensure scheduler won't send another
notification for speedchange task woken up by the governor.
Change-Id: I697affcbdf79e2ad0cfe843eb880d304960682f4
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Junjie Wu [Tue, 5 Jan 2016 18:53:30 +0000 (10:53 -0800)]
sched: Provide a wake up API without sending freq notifications
Each time a task wakes up, scheduler evaluates its load and notifies
governor if the resulting frequency of destination CPU is larger than
a threshold. However, some governor wakes up a separate task that
handles frequency change, which again calls wake_up_process().
This is dangerous because if the task being woken up meets the
threshold and ends up being moved around, there is a potential for
endless recursive notifications.
Introduce a new API for waking up a task without triggering
frequency notification.
Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Danny Wood [Tue, 2 Apr 2019 08:27:16 +0000 (09:27 +0100)]
arm64: rebuild configs to reflect CONFIG_UID_SYS_STATS change
Danny Wood [Sun, 31 Mar 2019 08:41:21 +0000 (09:41 +0100)]
cpufreq: interactive: use CPUFREQ_RELATION_C to choose closest frequency insted of lowest