GitHub/LineageOS/android_kernel_motorola_exynos9610.git
6 months agobpf/verifier: disallow pointer subtraction
Alexei Starovoitov [Mon, 31 May 2021 18:25:50 +0000 (18:25 +0000)]
bpf/verifier: disallow pointer subtraction

commit dd066823db2ac4e22f721ec85190817b58059a54 upstream.

Subtraction of pointers was accidentally allowed for unpriv programs
by commit 82abbf8d2fc4. Revert that part of commit.

Fixes: 82abbf8d2fc4 ("bpf: do not allow root to mangle valid pointers")
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[fllinden@amazon.com: backport to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: do not allow root to mangle valid pointers
Alexei Starovoitov [Mon, 31 May 2021 18:25:49 +0000 (18:25 +0000)]
bpf: do not allow root to mangle valid pointers

commit 82abbf8d2fc46d79611ab58daa7c608df14bb3ee upstream.

Do not allow root to convert valid pointers into unknown scalars.
In particular disallow:
 ptr &= reg
 ptr <<= reg
 ptr += ptr
and explicitly allow:
 ptr -= ptr
since pkt_end - pkt == length

1.
This minimizes amount of address leaks root can do.
In the future may need to further tighten the leaks with kptr_restrict.

2.
If program has such pointer math it's likely a user mistake and
when verifier complains about it right away instead of many instructions
later on invalid memory access it's easier for users to fix their progs.

3.
when register holding a pointer cannot change to scalar it allows JITs to
optimize better. Like 32-bit archs could use single register for pointers
instead of a pair required to hold 64-bit scalars.

4.
reduces architecture dependent behavior. Since code:
r1 = r10;
r1 &= 0xff;
if (r1 ...)
will behave differently arm64 vs x64 and offloaded vs native.

A significant chunk of ptr mangling was allowed by
commit f1174f77b50c ("bpf/verifier: rework value tracking")
yet some of it was allowed even earlier.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[fllinden@amazon.com: backport to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Update selftests to reflect new error states
Daniel Borkmann [Mon, 31 May 2021 18:25:48 +0000 (18:25 +0000)]
bpf: Update selftests to reflect new error states

commit d7a5091351756d0ae8e63134313c455624e36a13 upstream.

Update various selftest error messages:

 * The 'Rx tried to sub from different maps, paths, or prohibited types'
   is reworked into more specific/differentiated error messages for better
   guidance.

 * The change into 'value -4294967168 makes map_value pointer be out of
   bounds' is due to moving the mixed bounds check into the speculation
   handling and thus occuring slightly later than above mentioned sanity
   check.

 * The change into 'math between map_value pointer and register with
   unbounded min value' is similarly due to register sanity check coming
   before the mixed bounds check.

 * The case of 'map access: known scalar += value_ptr from different maps'
   now loads fine given masks are the same from the different paths (despite
   max map value size being different).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com - 4.14 backport, account for split test_verifier and
different / missing tests]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Tighten speculative pointer arithmetic mask
Daniel Borkmann [Mon, 31 May 2021 18:25:47 +0000 (18:25 +0000)]
bpf: Tighten speculative pointer arithmetic mask

commit 7fedb63a8307dda0ec3b8969a3b233a1dd7ea8e0 upstream.

This work tightens the offset mask we use for unprivileged pointer arithmetic
in order to mitigate a corner case reported by Piotr and Benedict where in
the speculative domain it is possible to advance, for example, the map value
pointer by up to value_size-1 out-of-bounds in order to leak kernel memory
via side-channel to user space.

Before this change, the computed ptr_limit for retrieve_ptr_limit() helper
represents largest valid distance when moving pointer to the right or left
which is then fed as aux->alu_limit to generate masking instructions against
the offset register. After the change, the derived aux->alu_limit represents
the largest potential value of the offset register which we mask against which
is just a narrower subset of the former limit.

For minimal complexity, we call sanitize_ptr_alu() from 2 observation points
in adjust_ptr_min_max_vals(), that is, before and after the simulated alu
operation. In the first step, we retieve the alu_state and alu_limit before
the operation as well as we branch-off a verifier path and push it to the
verification stack as we did before which checks the dst_reg under truncation,
in other words, when the speculative domain would attempt to move the pointer
out-of-bounds.

In the second step, we retrieve the new alu_limit and calculate the absolute
distance between both. Moreover, we commit the alu_state and final alu_limit
via update_alu_sanitation_state() to the env's instruction aux data, and bail
out from there if there is a mismatch due to coming from different verification
paths with different states.

Reported-by: Piotr Krysiuk <piotras@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Benedict Schlueter <benedict.schlueter@rub.de>
[fllinden@amazon.com: backported to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Move sanitize_val_alu out of op switch
Daniel Borkmann [Mon, 31 May 2021 18:25:46 +0000 (18:25 +0000)]
bpf: Move sanitize_val_alu out of op switch

commit f528819334881fd622fdadeddb3f7edaed8b7c9b upstream.

Add a small sanitize_needed() helper function and move sanitize_val_alu()
out of the main opcode switch. In upcoming work, we'll move sanitize_ptr_alu()
as well out of its opcode switch so this helps to streamline both.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: backported to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Refactor and streamline bounds check into helper
Daniel Borkmann [Mon, 31 May 2021 18:25:45 +0000 (18:25 +0000)]
bpf: Refactor and streamline bounds check into helper

commit 073815b756c51ba9d8384d924c5d1c03ca3d1ae4 upstream.

Move the bounds check in adjust_ptr_min_max_vals() into a small helper named
sanitize_check_bounds() in order to simplify the former a bit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: backport to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Improve verifier error messages for users
Daniel Borkmann [Mon, 31 May 2021 18:25:44 +0000 (18:25 +0000)]
bpf: Improve verifier error messages for users

commit a6aaece00a57fa6f22575364b3903dfbccf5345d upstream.

Consolidate all error handling and provide more user-friendly error messages
from sanitize_ptr_alu() and sanitize_val_alu().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: backport to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Rework ptr_limit into alu_limit and add common error path
Daniel Borkmann [Mon, 31 May 2021 18:25:43 +0000 (18:25 +0000)]
bpf: Rework ptr_limit into alu_limit and add common error path

commit b658bbb844e28f1862867f37e8ca11a8e2aa94a3 upstream.

Small refactor with no semantic changes in order to consolidate the max
ptr_limit boundary check.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Ensure off_reg has no mixed signed bounds for all types
Daniel Borkmann [Mon, 31 May 2021 18:25:42 +0000 (18:25 +0000)]
bpf: Ensure off_reg has no mixed signed bounds for all types

commit 24c109bb1537c12c02aeed2d51a347b4d6a9b76e upstream.

The mixed signed bounds check really belongs into retrieve_ptr_limit()
instead of outside of it in adjust_ptr_min_max_vals(). The reason is
that this check is not tied to PTR_TO_MAP_VALUE only, but to all pointer
types that we handle in retrieve_ptr_limit() and given errors from the latter
propagate back to adjust_ptr_min_max_vals() and lead to rejection of the
program, it's a better place to reside to avoid anything slipping through
for future types. The reason why we must reject such off_reg is that we
otherwise would not be able to derive a mask, see details in 9d7eceede769
("bpf: restrict unknown scalars of mixed signed bounds for unprivileged").

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: backport to 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Move off_reg into sanitize_ptr_alu
Daniel Borkmann [Mon, 31 May 2021 18:25:41 +0000 (18:25 +0000)]
bpf: Move off_reg into sanitize_ptr_alu

commit 6f55b2f2a1178856c19bbce2f71449926e731914 upstream.

Small refactor to drag off_reg into sanitize_ptr_alu(), so we later on can
use off_reg for generalizing some of the checks for all pointer types.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: fix minor contextual conflict for 4.14]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf, selftests: Fix up some test_verifier cases for unprivileged
Piotr Krysiuk [Mon, 31 May 2021 18:25:40 +0000 (18:25 +0000)]
bpf, selftests: Fix up some test_verifier cases for unprivileged

commit 0a13e3537ea67452d549a6a80da3776d6b7dedb3 upstream.

Fix up test_verifier error messages for the case where the original error
message changed, or for the case where pointer alu errors differ between
privileged and unprivileged tests. Also, add alternative tests for keeping
coverage of the original verifier rejection error message (fp alu), and
newly reject map_ptr += rX where rX == 0 given we now forbid alu on these
types for unprivileged. All test_verifier cases pass after the change. The
test case fixups were kept separate to ease backporting of core changes.

Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
[fllinden@amazon.com: backport to 4.14, skipping non-existent tests]
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Set mac_len in bpf_skb_change_head
Jussi Maki [Wed, 19 May 2021 15:47:42 +0000 (15:47 +0000)]
bpf: Set mac_len in bpf_skb_change_head

[ Upstream commit 84316ca4e100d8cbfccd9f774e23817cb2059868 ]

The skb_change_head() helper did not set "skb->mac_len", which is
problematic when it's used in combination with skb_redirect_peer().
Without it, redirecting a packet from a L3 device such as wireguard to
the veth peer device will cause skb->data to point to the middle of the
IP header on entry to tcp_v4_rcv() since the L2 header is not pulled
correctly due to mac_len=0.

Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210519154743.2554771-2-joamaki@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples/bpf: Fix broken tracex1 due to kprobe argument change
Yaqi Chen [Fri, 16 Apr 2021 15:48:03 +0000 (23:48 +0800)]
samples/bpf: Fix broken tracex1 due to kprobe argument change

[ Upstream commit 137733d08f4ab14a354dacaa9a8fc35217747605 ]

>From commit c0bbbdc32feb ("__netif_receive_skb_core: pass skb by
reference"), the first argument passed into __netif_receive_skb_core
has changed to reference of a skb pointer.

This commit fixes by using bpf_probe_read_kernel.

Signed-off-by: Yaqi Chen <chendotjs@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210416154803.37157-1-chendotjs@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: Fix masking negation logic upon negative dst register
Daniel Borkmann [Fri, 30 Apr 2021 14:21:46 +0000 (16:21 +0200)]
bpf: Fix masking negation logic upon negative dst register

commit b9b34ddbe2076ade359cd5ce7537d5ed019e9807 upstream.

The negation logic for the case where the off_reg is sitting in the
dst register is not correct given then we cannot just invert the add
to a sub or vice versa. As a fix, perform the final bitwise and-op
unconditionally into AX from the off_reg, then move the pointer from
the src to dst and finally use AX as the source for the original
pointer arithmetic operation such that the inversion yields a correct
result. The single non-AX mov in between is possible given constant
blinding is retaining it as it's not an immediate based operation.

Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: fix up selftests after backports were fixed
Frank van der Linden [Sat, 1 May 2021 18:05:06 +0000 (18:05 +0000)]
bpf: fix up selftests after backports were fixed

After the backport of the changes to fix CVE 2019-7308, the
selftests also need to be fixed up, as was done originally
in mainline 80c9b2fae87b ("bpf: add various test cases to selftests").

4.14 commit 03f11a51a19 ("bpf: Fix selftests are changes for CVE 2019-7308")
did that, but since there was an error in the backport, some
selftests did not change output. So, add them now that this error
has been fixed, and their output has actually changed as expected.

This adds the rest of the changed test outputs from 80c9b2fae87b.

Fixes: 03f11a51a19 ("bpf: Fix selftests are changes for CVE 2019-7308")
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Fix backport of "bpf: restrict unknown scalars of mixed signed bounds for unpriv...
Samuel Mendoza-Jonas [Sat, 1 May 2021 18:05:05 +0000 (18:05 +0000)]
bpf: Fix backport of "bpf: restrict unknown scalars of mixed signed bounds for unprivileged"

The 4.14 backport of 9d7eceede ("bpf: restrict unknown scalars of mixed
signed bounds for unprivileged") adds the PTR_TO_MAP_VALUE check to the
wrong location in adjust_ptr_min_max_vals(), most likely because 4.14
doesn't include the commit that updates the if-statement to a
switch-statement (aad2eeaf4 "bpf: Simplify ptr_min_max_vals adjustment").

Move the check to the proper location in adjust_ptr_min_max_vals().

Fixes: 17efa65350c5a ("bpf: restrict unknown scalars of mixed signed bounds for unprivileged")
Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
Reviewed-by: Frank van der Linden <fllinden@amazon.com>
Reviewed-by: Ethan Chen <yishache@amazon.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Remove MTU check in __bpf_skb_max_len
Jesper Dangaard Brouer [Tue, 9 Feb 2021 13:38:09 +0000 (14:38 +0100)]
bpf: Remove MTU check in __bpf_skb_max_len

commit 6306c1189e77a513bf02720450bb43bd4ba5d8ae upstream.

Multiple BPF-helpers that can manipulate/increase the size of the SKB uses
__bpf_skb_max_len() as the max-length. This function limit size against
the current net_device MTU (skb->dev->mtu).

When a BPF-prog grow the packet size, then it should not be limited to the
MTU. The MTU is a transmit limitation, and software receiving this packet
should be allowed to increase the size. Further more, current MTU check in
__bpf_skb_max_len uses the MTU from ingress/current net_device, which in
case of redirects uses the wrong net_device.

This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit
is elsewhere in the system. Jesper's testing[1] showed it was not possible
to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting
factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for
SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is
in-effect due to this being called from softirq context see code
__gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed
that frames above 16KiB can cause NICs to reset (but not crash). Keep this
sanity limit at this level as memory layer can differ based on kernel
config.

[1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agoRevert "ANDROID: net: bpf: permit redirect from ingress L3 to egress L2 devices at...
Cosmin Tanislav [Thu, 16 May 2024 07:52:16 +0000 (10:52 +0300)]
Revert "ANDROID: net: bpf: permit redirect from ingress L3 to egress L2 devices at near max mtu"

This reverts commit 69eb478ff932476b208d2e7e97137ced5665fa04.

6 months agolibbpf: Fix INSTALL flag order
Georgi Valkov [Mon, 8 Mar 2021 18:30:38 +0000 (10:30 -0800)]
libbpf: Fix INSTALL flag order

[ Upstream commit e7fb6465d4c8e767e39cbee72464e0060ab3d20c ]

It was reported ([0]) that having optional -m flag between source and
destination arguments in install command breaks bpftools cross-build
on MacOS. Move -m to the front to fix this issue.

  [0] https://github.com/openwrt/openwrt/pull/3959

Fixes: 7110d80d53f4 ("libbpf: Makefile set specified permission mode")
Signed-off-by: Georgi Valkov <gvalkov@abv.bg>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210308183038.613432-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: Prohibit alu ops for pointer types not defining ptr_limit
Piotr Krysiuk [Tue, 16 Mar 2021 08:47:02 +0000 (09:47 +0100)]
bpf: Prohibit alu ops for pointer types not defining ptr_limit

commit f232326f6966cf2a1d1db7bc917a4ce5f9f55f76 upstream.

The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.

The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.

Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c6967326267 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.

Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Add sanity check for upper ptr_limit
Piotr Krysiuk [Tue, 16 Mar 2021 08:47:02 +0000 (09:47 +0100)]
bpf: Add sanity check for upper ptr_limit

commit 1b1597e64e1a610c7a96710fc4717158e98a08b3 upstream.

Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.

Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Simplify alu_limit masking for pointer arithmetic
Piotr Krysiuk [Tue, 16 Mar 2021 07:26:25 +0000 (08:26 +0100)]
bpf: Simplify alu_limit masking for pointer arithmetic

commit b5871dca250cd391885218b99cc015aca1a51aea upstream.

Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.

Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: Fix off-by-one for area size in creating mask to left
Piotr Krysiuk [Tue, 16 Mar 2021 07:20:16 +0000 (08:20 +0100)]
bpf: Fix off-by-one for area size in creating mask to left

commit 10d2bb2e6b1d8c4576c56a748f697dbeb8388899 upstream.

retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.

When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.

Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf_lru_list: Read double-checked variable once without lock
Marco Elver [Tue, 9 Feb 2021 11:27:01 +0000 (12:27 +0100)]
bpf_lru_list: Read double-checked variable once without lock

[ Upstream commit 6df8fb83301d68ea0a0c0e1cbcc790fcc333ed12 ]

For double-checked locking in bpf_common_lru_push_free(), node->type is
read outside the critical section and then re-checked under the lock.
However, concurrent writes to node->type result in data races.

For example, the following concurrent access was observed by KCSAN:

  write to 0xffff88801521bc22 of 1 bytes by task 10038 on cpu 1:
   __bpf_lru_node_move_in        kernel/bpf/bpf_lru_list.c:91
   __local_list_flush            kernel/bpf/bpf_lru_list.c:298
   ...
  read to 0xffff88801521bc22 of 1 bytes by task 10043 on cpu 0:
   bpf_common_lru_push_free      kernel/bpf/bpf_lru_list.c:507
   bpf_lru_push_free             kernel/bpf/bpf_lru_list.c:555
   ...

Fix the data races where node->type is read outside the critical section
(for double-checked locking) by marking the access with READ_ONCE() as
well as ensuring the variable is only accessed once.

Fixes: 3a08c2fd7634 ("bpf: LRU List")
Reported-by: syzbot+3536db46dfa58c573458@syzkaller.appspotmail.com
Reported-by: syzbot+516acdb03d3e27d91bcd@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210209112701.3341724-1-elver@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: Check for integer overflow when using roundup_pow_of_two()
Bui Quang Minh [Wed, 27 Jan 2021 06:36:53 +0000 (06:36 +0000)]
bpf: Check for integer overflow when using roundup_pow_of_two()

[ Upstream commit 6183f4d3a0a2ad230511987c6c362ca43ec0055f ]

On 32-bit architecture, roundup_pow_of_two() can return 0 when the argument
has upper most bit set due to resulting 1UL << 32. Add a check for this case.

Fixes: d5a3b1f69186 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210127063653.3576-1-minhquangbui99@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoBACKPORT: bpf: add bpf_ktime_get_boot_ns()
Maciej Żenczykowski [Sun, 26 Apr 2020 16:15:25 +0000 (09:15 -0700)]
BACKPORT: bpf: add bpf_ktime_get_boot_ns()

On a device like a cellphone which is constantly suspending
and resuming CLOCK_MONOTONIC is not particularly useful for
keeping track of or reacting to external network events.
Instead you want to use CLOCK_BOOTTIME.

Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns()
based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(cherry picked from commit 71d19214776e61b33da48f7c1b46e522c7f78221)
Change-Id: Ifd62c410dcc5112fd1a473a7e1f70231ca514bc0

6 months agoUPSTREAM: net: bpf: Make bpf_ktime_get_ns() available to non GPL programs
Maciej Żenczykowski [Mon, 20 Apr 2020 18:47:50 +0000 (11:47 -0700)]
UPSTREAM: net: bpf: Make bpf_ktime_get_ns() available to non GPL programs

The entire implementation is in kernel/bpf/helpers.c:

BPF_CALL_0(bpf_ktime_get_ns) {
       /* NMI safe access to clock monotonic */
       return ktime_get_mono_fast_ns();
}

const struct bpf_func_proto bpf_ktime_get_ns_proto = {
       .func           = bpf_ktime_get_ns,
       .gpl_only       = false,
       .ret_type       = RET_INTEGER,
};

and this was presumably marked GPL due to kernel/time/timekeeping.c:
  EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);

and while that may make sense for kernel modules (although even that
is doubtful), there is currently AFAICT no other source of time
available to ebpf.

Furthermore this is really just equivalent to clock_gettime(CLOCK_MONOTONIC)
which is exposed to userspace (via vdso even to make it performant)...

As such, I see no reason to keep the GPL restriction.
(In the future I'd like to have access to time from Apache licensed ebpf code)

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(cherry picked from commit 082b57e3eb09810d357083cca5ee2df02c16aec9)
Change-Id: I76f763c64fcd56e7149f94625146486ba00db6c1

6 months agosamples: bpf: Fix lwt_len_hist reusing previous BPF map
Daniel T. Lee [Tue, 24 Nov 2020 09:03:09 +0000 (09:03 +0000)]
samples: bpf: Fix lwt_len_hist reusing previous BPF map

[ Upstream commit 0afe0a998c40085a6342e1aeb4c510cccba46caf ]

Currently, lwt_len_hist's map lwt_len_hist_map is uses pinning, and the
map isn't cleared on test end. This leds to reuse of that map for
each test, which prevents the results of the test from being accurate.

This commit fixes the problem by removing of pinned map from bpffs.
Also, this commit add the executable permission to shell script
files.

Fixes: f74599f7c5309 ("bpf: Add tests and samples for LWT-BPF")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-7-danieltimlee@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: Remove recursion prevention from rcu free callback
Thomas Gleixner [Mon, 24 Feb 2020 14:01:39 +0000 (15:01 +0100)]
bpf: Remove recursion prevention from rcu free callback

[ Upstream commit 8a37963c7ac9ecb7f86f8ebda020e3f8d6d7b8a0 ]

If an element is freed via RCU then recursion into BPF instrumentation
functions is not a concern. The element is already detached from the map
and the RCU callback does not hold any locks on which a kprobe, perf event
or tracepoint attached BPF program could deadlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200224145643.259118710@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: Fix map leak in HASH_OF_MAPS map
Andrii Nakryiko [Wed, 29 Jul 2020 04:09:12 +0000 (21:09 -0700)]
bpf: Fix map leak in HASH_OF_MAPS map

[ Upstream commit 1d4e1eab456e1ee92a94987499b211db05f900ea ]

Fix HASH_OF_MAPS bug of not putting inner map pointer on bpf_map_elem_update()
operation. This is due to per-cpu extra_elems optimization, which bypassed
free_htab_elem() logic doing proper clean ups. Make sure that inner map is put
properly in optimized case as well.

Fixes: 8c290e60fa2a ("bpf: fix hashmap extra_elems logic")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200729040913.2815687-1-andriin@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples: bpf: Fix build error
Matteo Croce [Mon, 11 May 2020 11:32:34 +0000 (13:32 +0200)]
samples: bpf: Fix build error

[ Upstream commit 23ad04669f81f958e9a4121b0266228d2eb3c357 ]

GCC 10 is very strict about symbol clash, and lwt_len_hist_user contains
a symbol which clashes with libbpf:

/usr/bin/ld: samples/bpf/lwt_len_hist_user.o:(.bss+0x0): multiple definition of `bpf_log_buf'; samples/bpf/bpf_load.o:(.bss+0x8c0): first defined here
collect2: error: ld returned 1 exit status

bpf_log_buf here seems to be a leftover, so removing it.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200511113234.80722-1-mcroce@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoarm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0
Luke Nelson [Wed, 8 Apr 2020 18:12:29 +0000 (18:12 +0000)]
arm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0

commit bb9562cf5c67813034c96afb50bd21130a504441 upstream.

The current arm BPF JIT does not correctly compile RSH or ARSH when the
immediate shift amount is 0. This causes the "rsh64 by 0 imm" and "arsh64
by 0 imm" BPF selftests to hang the kernel by reaching an instruction
the verifier determines to be unreachable.

The root cause is in how immediate right shifts are encoded on arm.
For LSR and ASR (logical and arithmetic right shift), a bit-pattern
of 00000 in the immediate encodes a shift amount of 32. When the BPF
immediate is 0, the generated code shifts by 32 instead of the expected
behavior (a no-op).

This patch fixes the bugs by adding an additional check if the BPF
immediate is 0. After the change, the above mentioned BPF selftests pass.

Fixes: 39c13c204bb11 ("arm: eBPF JIT compiler")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agovti[6]: fix packet tx through bpf_redirect() in XinY cases
Nicolas Dichtel [Tue, 4 Feb 2020 16:00:27 +0000 (17:00 +0100)]
vti[6]: fix packet tx through bpf_redirect() in XinY cases

commit f1ed10264ed6b66b9cd5e8461cffce69be482356 upstream.

I forgot the 4in6/6in4 cases in my previous patch. Let's fix them.

Fixes: 95224166a903 ("vti[6]: fix packet tx through bpf_redirect()")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agoUPSTREAM: bpf: Explicitly memset some bpf info structures declared on the stack
Greg Kroah-Hartman [Fri, 20 Mar 2020 16:22:58 +0000 (17:22 +0100)]
UPSTREAM: bpf: Explicitly memset some bpf info structures declared on the stack

Trying to initialize a structure with "= {};" will not always clean out
all padding locations in a structure. So be explicit and call memset to
initialize everything for a number of bpf information structures that
are then copied from userspace, sometimes from smaller memory locations
than the size of the structure.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com
(cherry picked from commit 269efb7fc478563a7e7b22590d8076823f4ac82a)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I52a2cab20aa310085ec104bd811ac4f2b83657b6

6 months agoUPSTREAM: bpf: Explicitly memset the bpf_attr structure
Greg Kroah-Hartman [Fri, 20 Mar 2020 09:48:13 +0000 (10:48 +0100)]
UPSTREAM: bpf: Explicitly memset the bpf_attr structure

For the bpf syscall, we are relying on the compiler to properly zero out
the bpf_attr union that we copy userspace data into. Unfortunately that
doesn't always work properly, padding and other oddities might not be
correctly zeroed, and in some tests odd things have been found when the
stack is pre-initialized to other values.

Fix this by explicitly memsetting the structure to 0 before using it.

Reported-by: Maciej Żenczykowski <maze@google.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Alexander Potapenko <glider@google.com>
Reported-by: Alistair Delva <adelva@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://android-review.googlesource.com/c/kernel/common/+/1235490
Link: https://lore.kernel.org/bpf/20200320094813.GA421650@kroah.com
(cherry picked from commit 8096f229421f7b22433775e928d506f0342e5907)
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2dc28cd45024da5cc6861ff4a9b25fae389cc6d8

6 months agosamples/bpf: Don't try to remove user's homedir on clean
Toke Høiland-Jørgensen [Mon, 20 Jan 2020 13:06:41 +0000 (14:06 +0100)]
samples/bpf: Don't try to remove user's homedir on clean

commit b2e5e93ae8af6a34bca536cdc4b453ab1e707b8b upstream.

The 'clean' rule in the samples/bpf Makefile tries to remove backup
files (ending in ~). However, if no such files exist, it will instead try
to remove the user's home directory. While the attempt is mostly harmless,
it does lead to a somewhat scary warning like this:

rm: cannot remove '~': Is a directory

Fix this by using find instead of shell expansion to locate any actual
backup files that need to be removed.

Fixes: b62a796c109c ("samples/bpf: allow make to be run from samples/bpf/ directory")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/157952560126.1683545.7273054725976032511.stgit@toke.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agovti[6]: fix packet tx through bpf_redirect()
Nicolas Dichtel [Mon, 13 Jan 2020 08:32:46 +0000 (09:32 +0100)]
vti[6]: fix packet tx through bpf_redirect()

[ Upstream commit 95224166a9032ff5d08fca633d37113078ce7d01 ]

With an ebpf program that redirects packets through a vti[6] interface,
the packets are dropped because no dst is attached.

This could also be reproduced with an AF_PACKET socket, with the following
python script (vti1 is an ip_vti interface):

 import socket
 send_s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, 0)
 # scapy
 # p = IP(src='10.100.0.2', dst='10.200.0.1')/ICMP(type='echo-request')
 # raw(p)
 req = b'E\x00\x00\x1c\x00\x01\x00\x00@\x01e\xb2\nd\x00\x02\n\xc8\x00\x01\x08\x00\xf7\xff\x00\x00\x00\x00'
 send_s.sendto(req, ('vti1', 0x800, 0, 0))

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoANDROID: fix bpf jit + cfi interactions
Maciej Żenczykowski [Wed, 29 Jan 2020 14:45:56 +0000 (06:45 -0800)]
ANDROID: fix bpf jit + cfi interactions

change from:
  https://android-review.googlesource.com/c/kernel/common/+/1126406
  ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI

was incorrectly reverted in:
  https://android-review.googlesource.com/c/kernel/common/+/1184358
  UPSTREAM: bpf: multi program support for cgroup+bpf

Test: builds
Bug: 121213201
Bug: 138317270
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I2b238de61340e58eb71aaa6cf6b59945a8740a08

6 months agobpf: Fix passing modified ctx to ld/abs/ind instruction
Daniel Borkmann [Mon, 6 Jan 2020 21:51:57 +0000 (22:51 +0100)]
bpf: Fix passing modified ctx to ld/abs/ind instruction

commit 6d4f151acf9a4f6fab09b615f246c717ddedcf0c upstream.

Anatoly has been fuzzing with kBdysch harness and reported a KASAN
slab oob in one of the outcomes:

  [...]
  [   77.359642] BUG: KASAN: slab-out-of-bounds in bpf_skb_load_helper_8_no_cache+0x71/0x130
  [   77.360463] Read of size 4 at addr ffff8880679bac68 by task bpf/406
  [   77.361119]
  [   77.361289] CPU: 2 PID: 406 Comm: bpf Not tainted 5.5.0-rc2-xfstests-00157-g2187f215eba #1
  [   77.362134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  [   77.362984] Call Trace:
  [   77.363249]  dump_stack+0x97/0xe0
  [   77.363603]  print_address_description.constprop.0+0x1d/0x220
  [   77.364251]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
  [   77.365030]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
  [   77.365860]  __kasan_report.cold+0x37/0x7b
  [   77.366365]  ? bpf_skb_load_helper_8_no_cache+0x71/0x130
  [   77.366940]  kasan_report+0xe/0x20
  [   77.367295]  bpf_skb_load_helper_8_no_cache+0x71/0x130
  [   77.367821]  ? bpf_skb_load_helper_8+0xf0/0xf0
  [   77.368278]  ? mark_lock+0xa3/0x9b0
  [   77.368641]  ? kvm_sched_clock_read+0x14/0x30
  [   77.369096]  ? sched_clock+0x5/0x10
  [   77.369460]  ? sched_clock_cpu+0x18/0x110
  [   77.369876]  ? bpf_skb_load_helper_8+0xf0/0xf0
  [   77.370330]  ___bpf_prog_run+0x16c0/0x28f0
  [   77.370755]  __bpf_prog_run32+0x83/0xc0
  [   77.371153]  ? __bpf_prog_run64+0xc0/0xc0
  [   77.371568]  ? match_held_lock+0x1b/0x230
  [   77.371984]  ? rcu_read_lock_held+0xa1/0xb0
  [   77.372416]  ? rcu_is_watching+0x34/0x50
  [   77.372826]  sk_filter_trim_cap+0x17c/0x4d0
  [   77.373259]  ? sock_kzfree_s+0x40/0x40
  [   77.373648]  ? __get_filter+0x150/0x150
  [   77.374059]  ? skb_copy_datagram_from_iter+0x80/0x280
  [   77.374581]  ? do_raw_spin_unlock+0xa5/0x140
  [   77.375025]  unix_dgram_sendmsg+0x33a/0xa70
  [   77.375459]  ? do_raw_spin_lock+0x1d0/0x1d0
  [   77.375893]  ? unix_peer_get+0xa0/0xa0
  [   77.376287]  ? __fget_light+0xa4/0xf0
  [   77.376670]  __sys_sendto+0x265/0x280
  [   77.377056]  ? __ia32_sys_getpeername+0x50/0x50
  [   77.377523]  ? lock_downgrade+0x350/0x350
  [   77.377940]  ? __sys_setsockopt+0x2a6/0x2c0
  [   77.378374]  ? sock_read_iter+0x240/0x240
  [   77.378789]  ? __sys_socketpair+0x22a/0x300
  [   77.379221]  ? __ia32_sys_socket+0x50/0x50
  [   77.379649]  ? mark_held_locks+0x1d/0x90
  [   77.380059]  ? trace_hardirqs_on_thunk+0x1a/0x1c
  [   77.380536]  __x64_sys_sendto+0x74/0x90
  [   77.380938]  do_syscall_64+0x68/0x2a0
  [   77.381324]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [   77.381878] RIP: 0033:0x44c070
  [...]

After further debugging, turns out while in case of other helper functions
we disallow passing modified ctx, the special case of ld/abs/ind instruction
which has similar semantics (except r6 being the ctx argument) is missing
such check. Modified ctx is impossible here as bpf_skb_load_helper_8_no_cache()
and others are expecting skb fields in original position, hence, add
check_ctx_reg() to reject any modified ctx. Issue was first introduced back
in f1174f77b50c ("bpf/verifier: rework value tracking").

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200106215157.3553-1-daniel@iogearbox.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: reject passing modified ctx to helper functions
Daniel Borkmann [Thu, 7 Jun 2018 15:40:03 +0000 (17:40 +0200)]
bpf: reject passing modified ctx to helper functions

commit 58990d1ff3f7896ee341030e9a7c2e4002570683 upstream.

As commit 28e33f9d78ee ("bpf: disallow arithmetic operations on
context pointer") already describes, f1174f77b50c ("bpf/verifier:
rework value tracking") removed the specific white-listed cases
we had previously where we would allow for pointer arithmetic in
order to further generalize it, and allow e.g. context access via
modified registers. While the dereferencing of modified context
pointers had been forbidden through 28e33f9d78ee, syzkaller did
recently manage to trigger several KASAN splats for slab out of
bounds access and use after frees by simply passing a modified
context pointer to a helper function which would then do the bad
access since verifier allowed it in adjust_ptr_min_max_vals().

Rejecting arithmetic on ctx pointer in adjust_ptr_min_max_vals()
generally could break existing programs as there's a valid use
case in tracing in combination with passing the ctx to helpers as
bpf_probe_read(), where the register then becomes unknown at
verification time due to adding a non-constant offset to it. An
access sequence may look like the following:

  offset = args->filename;  /* field __data_loc filename */
  bpf_probe_read(&dst, len, (char *)args + offset); // args is ctx

There are two options: i) we could special case the ctx and as
soon as we add a constant or bounded offset to it (hence ctx type
wouldn't change) we could turn the ctx into an unknown scalar, or
ii) we generalize the sanity test for ctx member access into a
small helper and assert it on the ctx register that was passed
as a function argument. Fwiw, latter is more obvious and less
complex at the same time, and one case that may potentially be
legitimate in future for ctx member access at least would be for
ctx to carry a const offset. Therefore, fix follows approach
from ii) and adds test cases to BPF kselftests.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Reported-by: syzbot+3d0b2441dbb71751615e@syzkaller.appspotmail.com
Reported-by: syzbot+c8504affd4fdd0c1b626@syzkaller.appspotmail.com
Reported-by: syzbot+e5190cb881d8660fb1a3@syzkaller.appspotmail.com
Reported-by: syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agosamples: bpf: fix syscall_tp due to unused syscall
Daniel T. Lee [Thu, 5 Dec 2019 08:01:14 +0000 (17:01 +0900)]
samples: bpf: fix syscall_tp due to unused syscall

[ Upstream commit fe3300897cbfd76c6cb825776e5ac0ca50a91ca4 ]

Currently, open() is called from the user program and it calls the syscall
'sys_openat', not the 'sys_open'. This leads to an error of the program
of user side, due to the fact that the counter maps are zero since no
function such 'sys_open' is called.

This commit adds the kernel bpf program which are attached to the
tracepoint 'sys_enter_openat' and 'sys_enter_openat'.

Fixes: 1da236b6be963 ("bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples: bpf: Replace symbol compare of trace_event
Daniel T. Lee [Thu, 5 Dec 2019 08:01:13 +0000 (17:01 +0900)]
samples: bpf: Replace symbol compare of trace_event

[ Upstream commit bba1b2a890253528c45aa66cf856f289a215bfbc ]

Previously, when this sample is added, commit 1c47910ef8013
("samples/bpf: add perf_event+bpf example"), a symbol 'sys_read' and
'sys_write' has been used without no prefixes. But currently there are
no exact symbols with these under kallsyms and this leads to failure.

This commit changes exact compare to substring compare to keep compatible
with exact symbol or prefixed symbol.

Fixes: 1c47910ef8013 ("samples/bpf: add perf_event+bpf example")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191205080114.19766-2-danieltimlee@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf, mips: Limit to 33 tail calls
Paul Chaignon [Mon, 9 Dec 2019 18:52:52 +0000 (19:52 +0100)]
bpf, mips: Limit to 33 tail calls

[ Upstream commit e49e6f6db04e915dccb494ae10fa14888fea6f89 ]

All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls
limit at runtime.  In addition, a test was recently added, in tailcalls2,
to check this limit.

This patch updates the tail call limit in MIPS' JIT compiler to allow
33 tail calls.

Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.")
Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/b8eb2caac1c25453c539248e56ca22f74b5316af.1575916815.git.paul.chaignon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agonet, sysctl: Fix compiler warning when only cBPF is present
Alexander Lobakin [Wed, 18 Dec 2019 09:18:21 +0000 (12:18 +0300)]
net, sysctl: Fix compiler warning when only cBPF is present

[ Upstream commit 1148f9adbe71415836a18a36c1b4ece999ab0973 ]

proc_dointvec_minmax_bpf_restricted() has been firstly introduced
in commit 2e4a30983b0f ("bpf: restrict access to core bpf sysctls")
under CONFIG_HAVE_EBPF_JIT. Then, this ifdef has been removed in
ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv
allocations"), because a new sysctl, bpf_jit_limit, made use of it.
Finally, this parameter has become long instead of integer with
fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
and thus, a new proc_dolongvec_minmax_bpf_restricted() has been
added.

With this last change, we got back to that
proc_dointvec_minmax_bpf_restricted() is used only under
CONFIG_HAVE_EBPF_JIT, but the corresponding ifdef has not been
brought back.

So, in configurations like CONFIG_BPF_JIT=y && CONFIG_HAVE_EBPF_JIT=n
since v4.20 we have:

  CC      net/core/sysctl_net_core.o
net/core/sysctl_net_core.c:292:1: warning: â€˜proc_dointvec_minmax_bpf_restricted’ defined but not used [-Wunused-function]
  292 | proc_dointvec_minmax_bpf_restricted(struct ctl_table *table, int write,
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppress this by guarding it with CONFIG_HAVE_EBPF_JIT again.

Fixes: fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191218091821.7080-1-alobakin@dlink.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoselftests/bpf: Correct path to include msg + path
Ivan Khoronzhuk [Wed, 2 Oct 2019 12:04:04 +0000 (15:04 +0300)]
selftests/bpf: Correct path to include msg + path

[ Upstream commit c588146378962786ddeec817f7736a53298a7b01 ]

The "path" buf is supposed to contain path + printf msg up to 24 bytes.
It will be cut anyway, but compiler generates truncation warns like:

"
samples/bpf/../../tools/testing/selftests/bpf/cgroup_helpers.c: In
function â€˜setup_cgroup_environment’:
samples/bpf/../../tools/testing/selftests/bpf/cgroup_helpers.c:52:34:
warning: â€˜/cgroup.controllers’ directive output may be truncated
writing 19 bytes into a region of size between 1 and 4097
[-Wformat-truncation=]
snprintf(path, sizeof(path), "%s/cgroup.controllers", cgroup_path);
  ^~~~~~~~~~~~~~~~~~~
samples/bpf/../../tools/testing/selftests/bpf/cgroup_helpers.c:52:2:
note: â€˜snprintf’ output between 20 and 4116 bytes into a destination
of size 4097
snprintf(path, sizeof(path), "%s/cgroup.controllers", cgroup_path);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
samples/bpf/../../tools/testing/selftests/bpf/cgroup_helpers.c:72:34:
warning: â€˜/cgroup.subtree_control’ directive output may be truncated
writing 23 bytes into a region of size between 1 and 4097
[-Wformat-truncation=]
snprintf(path, sizeof(path), "%s/cgroup.subtree_control",
  ^~~~~~~~~~~~~~~~~~~~~~~
cgroup_path);
samples/bpf/../../tools/testing/selftests/bpf/cgroup_helpers.c:72:2:
note: â€˜snprintf’ output between 24 and 4120 bytes into a destination
of size 4097
snprintf(path, sizeof(path), "%s/cgroup.subtree_control",
cgroup_path);
"

In order to avoid warns, lets decrease buf size for cgroup workdir on
24 bytes with assumption to include also "/cgroup.subtree_control" to
the address. The cut will never happen anyway.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191002120404.26962-3-ivan.khoronzhuk@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoUPSTREAM: bpf: permit multiple bpf attachments for a single perf event
Yonghong Song [Tue, 24 Oct 2017 06:53:08 +0000 (23:53 -0700)]
UPSTREAM: bpf: permit multiple bpf attachments for a single perf event

This patch enables multiple bpf attachments for a
kprobe/uprobe/tracepoint single trace event.
Each trace_event keeps a list of attached perf events.
When an event happens, all attached bpf programs will
be executed based on the order of attachment.

A global bpf_event_mutex lock is introduced to protect
prog_array attaching and detaching. An alternative will
be introduce a mutex lock in every trace_event_call
structure, but it takes a lot of extra memory.
So a global bpf_event_mutex lock is a good compromise.

The bpf prog detachment involves allocation of memory.
If the allocation fails, a dummy do-nothing program
will replace to-be-detached program in-place.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e87c6bc3852b981e71c757be20771546ce9f76f3)
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 121213201
Bug: 138317270
Test: build & boot cuttlefish; attach 2 progs to 1 tracepoint
Change-Id: I25ce1ed6c9512d0a6f2db7547e109958fe1619b6

6 months agoUPSTREAM: bpf: use the same condition in perf event set/free bpf handler
Yonghong Song [Tue, 24 Oct 2017 06:53:07 +0000 (23:53 -0700)]
UPSTREAM: bpf: use the same condition in perf event set/free bpf handler

This is a cleanup such that doing the same check in
perf_event_free_bpf_prog as we already do in
perf_event_set_bpf_prog step.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0b4c6841fee03e096b735074a0c4aab3a8e92986)
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 121213201
Bug: 138317270
Test: build & boot cuttlefish
Change-Id: Id64d5a025d383fa3d3b16c5c74e8f9e86148efaa

6 months agoUPSTREAM: bpf: multi program support for cgroup+bpf
Alexei Starovoitov [Tue, 3 Oct 2017 05:50:21 +0000 (22:50 -0700)]
UPSTREAM: bpf: multi program support for cgroup+bpf

introduce BPF_F_ALLOW_MULTI flag that can be used to attach multiple
bpf programs to a cgroup.

The difference between three possible flags for BPF_PROG_ATTACH command:
- NONE(default): No further bpf programs allowed in the subtree.
- BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program,
  the program in this cgroup yields to sub-cgroup program.
- BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program,
  that cgroup program gets run in addition to the program in this cgroup.

NONE and BPF_F_ALLOW_OVERRIDE existed before. This patch doesn't
change their behavior. It only clarifies the semantics in relation
to new flag.

Only one program is allowed to be attached to a cgroup with
NONE or BPF_F_ALLOW_OVERRIDE flag.
Multiple programs are allowed to be attached to a cgroup with
BPF_F_ALLOW_MULTI flag. They are executed in FIFO order
(those that were attached first, run first)
The programs of sub-cgroup are executed first, then programs of
this cgroup and then programs of parent cgroup.
All eligible programs are executed regardless of return code from
earlier programs.

To allow efficient execution of multiple programs attached to a cgroup
and to avoid penalizing cgroups without any programs attached
introduce 'struct bpf_prog_array' which is RCU protected array
of pointers to bpf programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
for cgroup bits
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 324bda9e6c5add86ba2e1066476481c48132aca0)
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug: 121213201
Bug: 138317270
Test: build & boot cuttlefish
Change-Id: If17b11a773f73d45ea565a947fc1bf7e158db98d

6 months agobpf: drop refcount if bpf_map_new_fd() fails in map_create()
Peng Sun [Wed, 27 Feb 2019 14:36:25 +0000 (22:36 +0800)]
bpf: drop refcount if bpf_map_new_fd() fails in map_create()

[ Upstream commit 352d20d611414715353ee65fc206ee57ab1a6984 ]

In bpf/syscall.c, map_create() first set map->usercnt to 1, a file
descriptor is supposed to return to userspace. When bpf_map_new_fd()
fails, drop the refcount.

Fixes: bd5f5f4ecb78 ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun <sironhide0null@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()
Peng Sun [Tue, 26 Feb 2019 14:15:37 +0000 (22:15 +0800)]
bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()

[ Upstream commit 781e62823cb81b972dc8652c1827205cda2ac9ac ]

In bpf/syscall.c, bpf_map_get_fd_by_id() use bpf_map_inc_not_zero()
to increase the refcount, both map->refcnt and map->usercnt. Then, if
bpf_map_new_fd() fails, should handle map->usercnt too.

Fixes: bd5f5f4ecb78 ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun <sironhide0null@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: devmap: fix wrong interface selection in notifier_call
Taehee Yoo [Wed, 24 Oct 2018 11:15:17 +0000 (20:15 +0900)]
bpf: devmap: fix wrong interface selection in notifier_call

[ Upstream commit f592f804831f1cf9d1f9966f58c80f150e6829b5 ]

The dev_map_notification() removes interface in devmap if
unregistering interface's ifindex is same.
But only checking ifindex is not enough because other netns can have
same ifindex. so that wrong interface selection could occurred.
Hence netdev pointer comparison code is added.

v2: compare netdev pointer instead of using net_eq() (Daniel Borkmann)
v1: Initial patch

Fixes: 2ddf71e23cc2 ("net: add notifier hooks for devmap bpf map")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples/bpf: fix compilation failure
Prashant Bhole [Thu, 20 Sep 2018 07:52:03 +0000 (16:52 +0900)]
samples/bpf: fix compilation failure

[ Upstream commit 32c009798385ce21080beaa87a9b95faad3acd1e ]

following commit:
commit d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
added struct bpf_flow_keys which conflicts with the struct with
same name in sockex2_kern.c and sockex3_kern.c

similar to commit:
commit 534e0e52bc23 ("samples/bpf: fix a compilation failure")
we tried the rename it "flow_keys" but it also conflicted with struct
having same name in include/net/flow_dissector.h. Hence renaming the
struct to "flow_key_record". Also, this commit doesn't fix the
compilation error completely because the similar struct is present in
sockex3_kern.c. Hence renaming it in both files sockex3_user.c and
sockex3_kern.c

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples/bpf: fix a compilation failure
Yonghong Song [Tue, 18 Sep 2018 05:08:13 +0000 (22:08 -0700)]
samples/bpf: fix a compilation failure

[ Upstream commit 534e0e52bc23de588e81b5a6f75e10c8c4b189fc ]

samples/bpf build failed with the following errors:

  $ make samples/bpf/
  ...
  HOSTCC  samples/bpf/sockex3_user.o
  /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:16:8: error: redefinition of â€˜struct bpf_flow_keys’
   struct bpf_flow_keys {
          ^
  In file included from /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:4:0:
  ./usr/include/linux/bpf.h:2338:9: note: originally defined here
    struct bpf_flow_keys *flow_keys;
           ^
  make[3]: *** [samples/bpf/sockex3_user.o] Error 1

Commit d58e468b1112d ("flow_dissector: implements flow dissector BPF hook")
introduced struct bpf_flow_keys in include/uapi/linux/bpf.h and hence
caused the naming conflict with samples/bpf/sockex3_user.c.

The fix is to rename struct bpf_flow_keys in samples/bpf/sockex3_user.c
to flow_keys to avoid the conflict.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: fix use after free in prog symbol exposure
Daniel Borkmann [Fri, 4 Oct 2019 17:41:12 +0000 (10:41 -0700)]
bpf: fix use after free in prog symbol exposure

commit c751798aa224fadc5124b49eeb38fb468c0fa039 upstream.

syzkaller managed to trigger the warning in bpf_jit_free() which checks via
bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
in kallsyms, and subsequently trips over GPF when walking kallsyms entries:

  [...]
  8021q: adding VLAN 0 to HW filter on device batadv0
  8021q: adding VLAN 0 to HW filter on device batadv0
  WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events bpf_prog_free_deferred
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x113/0x167 lib/dump_stack.c:113
   panic+0x212/0x40b kernel/panic.c:214
   __warn.cold.8+0x1b/0x38 kernel/panic.c:571
   report_bug+0x1a4/0x200 lib/bug.c:186
   fixup_bug arch/x86/kernel/traps.c:178 [inline]
   do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
   do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
   invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
  RIP: 0010:bpf_jit_free+0x1e8/0x2a0
  Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
  RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
  RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
  RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
  RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
  R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
  R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
  BUG: unable to handle kernel paging request at fffffbfff400d000
  #PF error: [normal kernel read fault]
  PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
  Oops: 0000 [#1] PREEMPT SMP KASAN
  CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events bpf_prog_free_deferred
  RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
  RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
  RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
  RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
  RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
  Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
  [...]

Upon further debugging, it turns out that whenever we trigger this
issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/
but yet bpf_jit_free() reported that the entry is /in use/.

Problem is that symbol exposure via bpf_prog_kallsyms_add() but also
perf_event_bpf_event() were done /after/ bpf_prog_new_fd(). Once the
fd is exposed to the public, a parallel close request came in right
before we attempted to do the bpf_prog_kallsyms_add().

Given at this time the prog reference count is one, we start to rip
everything underneath us via bpf_prog_release() -> bpf_prog_put().
The memory is eventually released via deferred free, so we're seeing
that bpf_jit_free() has a kallsym entry because we added it from
bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.

Therefore, move both notifications /before/ we install the fd. The
issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
because upon bpf_prog_get_fd_by_id() we'll take another reference to
the BPF prog, so we're still holding the original reference from the
bpf_prog_load().

Fixes: 6ee52e2a3fe4 ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
Fixes: 74451e66d516 ("bpf: make jited programs visible in traces")
Reported-by: syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agoANDROID: arm64: bpf: implement arch_bpf_jit_check_func
Sami Tolvanen [Wed, 4 Sep 2019 21:56:40 +0000 (14:56 -0700)]
ANDROID: arm64: bpf: implement arch_bpf_jit_check_func

Implement arch_bpf_jit_check_func to check that pointers to jited BPF
functions are correctly aligned and point to the BPF JIT region. This
narrows down the attack surface on the stored pointer.

Bug: 140377409
Change-Id: I10c448eda6a8b0bf4c16ee591fc65974696216b9
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
6 months agoANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI
Sami Tolvanen [Wed, 4 Sep 2019 21:08:16 +0000 (14:08 -0700)]
ANDROID: bpf: validate bpf_func when BPF_JIT is enabled with CFI

With CONFIG_BPF_JIT, the kernel makes indirect calls to dynamically
generated code, which the compile-time Control-Flow Integrity (CFI)
checking cannot validate. This change adds basic sanity checking to
ensure we are jumping to a valid location, which narrows down the
attack surface on the stored pointer.

In addition, this change adds a weak arch_bpf_jit_check_func function,
which architectures that implement BPF JIT can override to perform
additional validation, such as verifying that the pointer points to
the correct memory region.

Bug: 140377409
Change-Id: I8ebac6637ab6bd9db44716b1c742add267298669
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
6 months agoUPSTREAM: kcm: use BPF_PROG_RUN
Sami Tolvanen [Wed, 4 Sep 2019 19:38:57 +0000 (12:38 -0700)]
UPSTREAM: kcm: use BPF_PROG_RUN

Instead of invoking struct bpf_prog::bpf_func directly, use the
BPF_PROG_RUN macro.

Bug: 140377409
Change-Id: I26abeccc8d25af0f412935ed97aebb5c64f52a2a
(cherry picked from commit a2c11b034142 ("kcm: use BPF_PROG_RUN"))
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
6 months agobpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
Daniel Borkmann [Tue, 11 Dec 2018 11:14:12 +0000 (12:14 +0100)]
bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K

[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]

Michael and Sandipan report:

  Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
  JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
  and is adjusted again at init time if MODULES_VADDR is defined.

  For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
  the compile-time default at boot-time, which is 0x9c400000 when
  using 64K page size. This overflows the signed 32-bit bpf_jit_limit
  value:

  root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
  -1673527296

  and can cause various unexpected failures throughout the network
  stack. In one case `strace dhclient eth0` reported:

  setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
             16) = -1 ENOTSUPP (Unknown error 524)

  and similar failures can be seen with tools like tcpdump. This doesn't
  always reproduce however, and I'm not sure why. The more consistent
  failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
  host would time out on systemd/netplan configuring a virtio-net NIC
  with no noticeable errors in the logs.

Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().

Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.

Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.

Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: add bpf_jit_limit knob to restrict unpriv allocations
Daniel Borkmann [Fri, 16 Aug 2019 22:05:36 +0000 (23:05 +0100)]
bpf: add bpf_jit_limit knob to restrict unpriv allocations

commit ede95a63b5e84ddeea6b0c473b36ab8bfd8c6ce3 upstream.

Rick reported that the BPF JIT could potentially fill the entire module
space with BPF programs from unprivileged users which would prevent later
attempts to load normal kernel modules or privileged BPF programs, for
example. If JIT was enabled but unsuccessful to generate the image, then
before commit 290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
we would always fall back to the BPF interpreter. Nowadays in the case
where the CONFIG_BPF_JIT_ALWAYS_ON could be set, then the load will abort
with a failure since the BPF interpreter was compiled out.

Add a global limit and enforce it for unprivileged users such that in case
of BPF interpreter compiled out we fail once the limit has been reached
or we fall back to BPF interpreter earlier w/o using module mem if latter
was compiled in. In a next step, fair share among unprivileged users can
be resolved in particular for the case where we would fail hard once limit
is reached.

Fixes: 290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
Co-Developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: restrict access to core bpf sysctls
Daniel Borkmann [Fri, 16 Aug 2019 22:05:20 +0000 (23:05 +0100)]
bpf: restrict access to core bpf sysctls

commit 2e4a30983b0f9b19b59e38bbf7427d7fdd480d98 upstream.

Given BPF reaches far beyond just networking these days, it was
never intended to allow setting and in some cases reading those
knobs out of a user namespace root running without CAP_SYS_ADMIN,
thus tighten such access.

Also the bpf_jit_enable = 2 debugging mode should only be allowed
if kptr_restrict is not set since it otherwise can leak addresses
to the kernel log. Dump a note to the kernel log that this is for
debugging JITs only when enabled.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[bwh: Backported to 4.14: We don't have bpf_dump_raw_ok(), so drop the
 condition based on it. This condition only made it a bit harder for a
 privileged user to do something silly.]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: get rid of pure_initcall dependency to enable jits
Daniel Borkmann [Fri, 16 Aug 2019 22:04:32 +0000 (23:04 +0100)]
bpf: get rid of pure_initcall dependency to enable jits

commit fa9dd599b4dae841924b022768354cfde9affecb upstream.

Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[bwh: Backported to 4.14 as dependency of commit 2e4a30983b0f
 "bpf: restrict access to core bpf sysctls":
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agotcp: fix tcp_set_congestion_control() use from bpf hook
Eric Dumazet [Fri, 19 Jul 2019 02:28:14 +0000 (19:28 -0700)]
tcp: fix tcp_set_congestion_control() use from bpf hook

[ Upstream commit 8d650cdedaabb33e85e9b7c517c0c71fcecc1de9 ]

Neal reported incorrect use of ns_capable() from bpf hook.

bpf_setsockopt(...TCP_CONGESTION...)
  -> tcp_set_congestion_control()
   -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
    -> ns_capable_common()
     -> current_cred()
      -> rcu_dereference_protected(current->cred, 1)

Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.

As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.

The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.

Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: silence warning messages in core
Valdis Klētnieks [Fri, 7 Jun 2019 02:39:27 +0000 (22:39 -0400)]
bpf: silence warning messages in core

[ Upstream commit aee450cbe482a8c2f6fa5b05b178ef8b8ff107ca ]

Compiling kernel/bpf/core.c with W=1 causes a flood of warnings:

kernel/bpf/core.c:1198:65: warning: initialized field overwritten [-Woverride-init]
 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
      |                                                                 ^~~~
kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
 1087 |  INSN_3(ALU, ADD,  X),   \
      |  ^~~~~~
kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
 1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
      |   ^~~~~~~~~~~~
kernel/bpf/core.c:1198:65: note: (near initialization for 'public_insntable[12]')
 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
      |                                                                 ^~~~
kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
 1087 |  INSN_3(ALU, ADD,  X),   \
      |  ^~~~~~
kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
 1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
      |   ^~~~~~~~~~~~

98 copies of the above.

The attached patch silences the warnings, because we *know* we're overwriting
the default initializer. That leaves bpf/core.c with only 6 other warnings,
which become more visible in comparison.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: sockmap, fix use after free from sleep in psock backlog workqueue
John Fastabend [Fri, 24 May 2019 15:01:00 +0000 (08:01 -0700)]
bpf: sockmap, fix use after free from sleep in psock backlog workqueue

[ Upstream commit bd95e678e0f6e18351ecdc147ca819145db9ed7b ]

Backlog work for psock (sk_psock_backlog) might sleep while waiting
for memory to free up when sending packets. However, while sleeping
the socket may be closed and removed from the map by the user space
side.

This breaks an assumption in sk_stream_wait_memory, which expects the
wait queue to be still there when it wakes up resulting in a
use-after-free shown below. To fix his mark sendmsg as MSG_DONTWAIT
to avoid the sleep altogether. We already set the flag for the
sendpage case but we missed the case were sendmsg is used.
Sockmap is currently the only user of skb_send_sock_locked() so only
the sockmap paths should be impacted.

==================================================================
BUG: KASAN: use-after-free in remove_wait_queue+0x31/0x70
Write of size 8 at addr ffff888069a0c4e8 by task kworker/0:2/110

CPU: 0 PID: 110 Comm: kworker/0:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3d4fe-dirty #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
Workqueue: events sk_psock_backlog
Call Trace:
 print_address_description+0x6e/0x2b0
 ? remove_wait_queue+0x31/0x70
 kasan_report+0xfd/0x177
 ? remove_wait_queue+0x31/0x70
 ? remove_wait_queue+0x31/0x70
 remove_wait_queue+0x31/0x70
 sk_stream_wait_memory+0x4dd/0x5f0
 ? sk_stream_wait_close+0x1b0/0x1b0
 ? wait_woken+0xc0/0xc0
 ? tcp_current_mss+0xc5/0x110
 tcp_sendmsg_locked+0x634/0x15d0
 ? tcp_set_state+0x2e0/0x2e0
 ? __kasan_slab_free+0x1d1/0x230
 ? kmem_cache_free+0x70/0x140
 ? sk_psock_backlog+0x40c/0x4b0
 ? process_one_work+0x40b/0x660
 ? worker_thread+0x82/0x680
 ? kthread+0x1b9/0x1e0
 ? ret_from_fork+0x1f/0x30
 ? check_preempt_curr+0xaf/0x130
 ? iov_iter_kvec+0x5f/0x70
 ? kernel_sendmsg_locked+0xa0/0xe0
 skb_send_sock_locked+0x273/0x3c0
 ? skb_splice_bits+0x180/0x180
 ? start_thread+0xe0/0xe0
 ? update_min_vruntime.constprop.27+0x88/0xc0
 sk_psock_backlog+0xb3/0x4b0
 ? strscpy+0xbf/0x1e0
 process_one_work+0x40b/0x660
 worker_thread+0x82/0x680
 ? process_one_work+0x660/0x660
 kthread+0x1b9/0x1e0
 ? __kthread_create_on_node+0x250/0x250
 ret_from_fork+0x1f/0x30

Fixes: 20bf50de3028c ("skbuff: Function to send an skbuf on a socket")
Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agosamples, bpf: fix to change the buffer size for read()
Chang-Hsien Tsai [Sun, 19 May 2019 09:05:44 +0000 (09:05 +0000)]
samples, bpf: fix to change the buffer size for read()

[ Upstream commit f7c2d64bac1be2ff32f8e4f500c6e5429c1003e0 ]

If the trace for read is larger than 4096, the return
value sz will be 4096. This results in off-by-one error
on buf:

    static char buf[4096];
    ssize_t sz;

    sz = read(trace_fd, buf, sizeof(buf));
    if (sz > 0) {
        buf[sz] = 0;
        puts(buf);
    }

Signed-off-by: Chang-Hsien Tsai <luke.tw@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd
Daniel Borkmann [Fri, 26 Apr 2019 19:48:22 +0000 (21:48 +0200)]
bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd

commit 34b8ab091f9ef57a2bb3c8c8359a0a03a8abf2f9 upstream.

Since ARMv8.1 supplement introduced LSE atomic instructions back in 2016,
lets add support for STADD and use that in favor of LDXR / STXR loop for
the XADD mapping if available. STADD is encoded as an alias for LDADD with
XZR as the destination register, therefore add LDADD to the instruction
encoder along with STADD as special case and use it in the JIT for CPUs
that advertise LSE atomics in CPUID register. If immediate offset in the
BPF XADD insn is 0, then use dst register directly instead of temporary
one.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err
Martin KaFai Lau [Fri, 31 May 2019 22:29:11 +0000 (15:29 -0700)]
bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err

commit 4ac30c4b3659efac031818c418beb51e630d512d upstream.

__udp6_lib_err() may be called when handling icmpv6 message. For example,
the icmpv6 toobig(type=2).  __udp6_lib_lookup() is then called
which may call reuseport_select_sock().  reuseport_select_sock() will
call into a bpf_prog (if there is one).

reuseport_select_sock() is expecting the skb->data pointing to the
transport header (udphdr in this case).  For example, run_bpf_filter()
is pulling the transport header.

However, in the __udp6_lib_err() path, the skb->data is pointing to the
ipv6hdr instead of the udphdr.

One option is to pull and push the ipv6hdr in __udp6_lib_err().
Instead of doing this, this patch follows how the original
commit 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
was done in IPv4, which has passed a NULL skb pointer to
reuseport_select_sock().

Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Cc: Craig Gallek <kraig@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: udp: Avoid calling reuseport's bpf_prog from udp_gro
Martin KaFai Lau [Fri, 31 May 2019 22:29:13 +0000 (15:29 -0700)]
bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro

commit 257a525fe2e49584842c504a92c27097407f778f upstream.

When the commit a6024562ffd7 ("udp: Add GRO functions to UDP socket")
added udp[46]_lib_lookup_skb to the udp_gro code path, it broke
the reuseport_select_sock() assumption that skb->data is pointing
to the transport header.

This patch follows an earlier __udp6_lib_err() fix by
passing a NULL skb to avoid calling the reuseport's bpf_prog.

Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agopowerpc/bpf: use unsigned division instruction for 64-bit operations
Naveen N. Rao [Wed, 12 Jun 2019 18:51:40 +0000 (00:21 +0530)]
powerpc/bpf: use unsigned division instruction for 64-bit operations

commit 758f2046ea040773ae8ea7f72dd3bbd8fa984501 upstream.

BPF_ALU64 div/mod operations are currently using signed division, unlike
BPF_ALU32 operations. Fix the same. DIV64 and MOD64 overflow tests pass
with this fix.

Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agolibbpf: fix samples/bpf build failure due to undefined UINT32_MAX
Daniel T. Lee [Tue, 23 Apr 2019 20:24:56 +0000 (05:24 +0900)]
libbpf: fix samples/bpf build failure due to undefined UINT32_MAX

[ Upstream commit 32e621e55496a0009f44fe4914cd4a23cade4984 ]

Currently, building bpf samples will cause the following error.

    ./tools/lib/bpf/bpf.h:132:27: error: 'UINT32_MAX' undeclared here (not in a function) ..
     #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
                               ^
    ./samples/bpf/bpf_load.h:31:25: note: in expansion of macro 'BPF_LOG_BUF_SIZE'
     extern char bpf_log_buf[BPF_LOG_BUF_SIZE];
                             ^~~~~~~~~~~~~~~~

Due to commit 4519efa6f8ea ("libbpf: fix BPF_LOG_BUF_SIZE off-by-one error")
hard-coded size of BPF_LOG_BUF_SIZE has been replaced with UINT32_MAX which is
defined in <stdint.h> header.

Even with this change, bpf selftests are running fine since these are built
with clang and it includes header(-idirafter) from clang/6.0.0/include.
(it has <stdint.h>)

    clang -I. -I./include/uapi -I../../../include/uapi -idirafter /usr/local/include -idirafter /usr/include \
    -idirafter /usr/lib/llvm-6.0/lib/clang/6.0.0/include -idirafter /usr/include/x86_64-linux-gnu \
    -Wno-compare-distinct-pointer-types -O2 -target bpf -emit-llvm -c progs/test_sysctl_prog.c -o - | \
    llc -march=bpf -mcpu=generic  -filetype=obj -o /linux/tools/testing/selftests/bpf/test_sysctl_prog.o

But bpf samples are compiled with GCC, and it only searches and includes
headers declared at the target file. As '#include <stdint.h>' hasn't been
declared in tools/lib/bpf/bpf.h, it causes build failure of bpf samples.

    gcc -Wp,-MD,./samples/bpf/.sockex3_user.o.d -Wall -Wmissing-prototypes -Wstrict-prototypes \
    -O2 -fomit-frame-pointer -std=gnu89 -I./usr/include -I./tools/lib/ -I./tools/testing/selftests/bpf/ \
    -I./tools/  lib/ -I./tools/include -I./tools/perf -c -o ./samples/bpf/sockex3_user.o ./samples/bpf/sockex3_user.c;

This commit add declaration of '#include <stdint.h>' to tools/lib/bpf/bpf.h
to fix this problem.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agotools/bpf: fix perf build error with uClibc (seen on ARC)
Vineet Gupta [Thu, 2 May 2019 15:56:50 +0000 (08:56 -0700)]
tools/bpf: fix perf build error with uClibc (seen on ARC)

[ Upstream commit ca31ca8247e2d3807ff5fa1d1760616a2292001c ]

When build perf for ARC recently, there was a build failure due to lack
of __NR_bpf.

| Auto-detecting system features:
|
| ...                     get_cpuid: [ OFF ]
| ...                           bpf: [ on  ]
|
| #  error __NR_bpf not defined. libbpf does not support your arch.
    ^~~~~
| bpf.c: In function 'sys_bpf':
| bpf.c:66:17: error: '__NR_bpf' undeclared (first use in this function)
|  return syscall(__NR_bpf, cmd, attr, size);
|                 ^~~~~~~~
|                 sys_bpf

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
6 months agobpf: devmap: fix use-after-free Read in __dev_map_entry_free
Eric Dumazet [Mon, 13 May 2019 16:59:16 +0000 (09:59 -0700)]
bpf: devmap: fix use-after-free Read in __dev_map_entry_free

commit 2baae3545327632167c0180e9ca1d467416f1919 upstream.

synchronize_rcu() is fine when the rcu callbacks only need
to free memory (kfree_rcu() or direct kfree() call rcu call backs)

__dev_map_entry_free() is a bit more complex, so we need to make
sure that call queued __dev_map_entry_free() callbacks have completed.

sysbot report:

BUG: KASAN: use-after-free in dev_map_flush_old kernel/bpf/devmap.c:365
[inline]
BUG: KASAN: use-after-free in __dev_map_entry_free+0x2a8/0x300
kernel/bpf/devmap.c:379
Read of size 8 at addr ffff8801b8da38c8 by task ksoftirqd/1/18

CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.17.0+ #39
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  print_address_description+0x6c/0x20b mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
  dev_map_flush_old kernel/bpf/devmap.c:365 [inline]
  __dev_map_entry_free+0x2a8/0x300 kernel/bpf/devmap.c:379
  __rcu_reclaim kernel/rcu/rcu.h:178 [inline]
  rcu_do_batch kernel/rcu/tree.c:2558 [inline]
  invoke_rcu_callbacks kernel/rcu/tree.c:2818 [inline]
  __rcu_process_callbacks kernel/rcu/tree.c:2785 [inline]
  rcu_process_callbacks+0xe9d/0x1760 kernel/rcu/tree.c:2802
  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284
  run_ksoftirqd+0x86/0x100 kernel/softirq.c:645
  smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
  kthread+0x345/0x410 kernel/kthread.c:240
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

Allocated by task 6675:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
  kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
  kmalloc include/linux/slab.h:513 [inline]
  kzalloc include/linux/slab.h:706 [inline]
  dev_map_alloc+0x208/0x7f0 kernel/bpf/devmap.c:102
  find_and_alloc_map kernel/bpf/syscall.c:129 [inline]
  map_create+0x393/0x1010 kernel/bpf/syscall.c:453
  __do_sys_bpf kernel/bpf/syscall.c:2351 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:2328 [inline]
  __x64_sys_bpf+0x303/0x510 kernel/bpf/syscall.c:2328
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 26:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kfree+0xd9/0x260 mm/slab.c:3813
  dev_map_free+0x4fa/0x670 kernel/bpf/devmap.c:191
  bpf_map_free_deferred+0xba/0xf0 kernel/bpf/syscall.c:262
  process_one_work+0xc64/0x1b70 kernel/workqueue.c:2153
  worker_thread+0x181/0x13a0 kernel/workqueue.c:2296
  kthread+0x345/0x410 kernel/kthread.c:240
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

The buggy address belongs to the object at ffff8801b8da37c0
  which belongs to the cache kmalloc-512 of size 512
The buggy address is located 264 bytes inside of
  512-byte region [ffff8801b8da37c0ffff8801b8da39c0)
The buggy address belongs to the page:
page:ffffea0006e368c0 count:1 mapcount:0 mapping:ffff8801da800940
index:0xffff8801b8da3540
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0007217b88 ffffea0006e30cc8 ffff8801da800940
raw: ffff8801b8da3540 ffff8801b8da3040 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8801b8da3780: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
  ffff8801b8da3800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801b8da3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                               ^
  ffff8801b8da3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8801b8da3980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc

Fixes: 546ac1ffb70d ("bpf: add devmap, a map for storing net device references")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+457d3e2ffbcf31aee5c0@syzkaller.appspotmail.com
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf, lru: avoid messing with eviction heuristics upon syscall lookup
Daniel Borkmann [Mon, 13 May 2019 23:18:56 +0000 (01:18 +0200)]
bpf, lru: avoid messing with eviction heuristics upon syscall lookup

commit 50b045a8c0ccf44f76640ac3eea8d80ca53979a3 upstream.

One of the biggest issues we face right now with picking LRU map over
regular hash table is that a map walk out of user space, for example,
to just dump the existing entries or to remove certain ones, will
completely mess up LRU eviction heuristics and wrong entries such
as just created ones will get evicted instead. The reason for this
is that we mark an entry as "in use" via bpf_lru_node_set_ref() from
system call lookup side as well. Thus upon walk, all entries are
being marked, so information of actual least recently used ones
are "lost".

In case of Cilium where it can be used (besides others) as a BPF
based connection tracker, this current behavior causes disruption
upon control plane changes that need to walk the map from user space
to evict certain entries. Discussion result from bpfconf [0] was that
we should simply just remove marking from system call side as no
good use case could be found where it's actually needed there.
Therefore this patch removes marking for regular LRU and per-CPU
flavor. If there ever should be a need in future, the behavior could
be selected via map creation flag, but due to mentioned reason we
avoid this here.

  [0] http://vger.kernel.org/bpfconf.html

Fixes: 29ba732acbee ("bpf: Add BPF_MAP_TYPE_LRU_HASH")
Fixes: 8f8449384ec3 ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf: add map_lookup_elem_sys_only for lookups from syscall side
Daniel Borkmann [Mon, 13 May 2019 23:18:55 +0000 (01:18 +0200)]
bpf: add map_lookup_elem_sys_only for lookups from syscall side

commit c6110222c6f49ea68169f353565eb865488a8619 upstream.

Add a callback map_lookup_elem_sys_only() that map implementations
could use over map_lookup_elem() from system call side in case the
map implementation needs to handle the latter differently than from
the BPF data path. If map_lookup_elem_sys_only() is set, this will
be preferred pick for map lookups out of user space. This hook is
used in a follow-up fix for LRU map, but once development window
opens, we can convert other map types from map_lookup_elem() (here,
the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use
the callback to simplify and clean up the latter.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 months agobpf, arm64: remove prefetch insn in xadd mapping
Daniel Borkmann [Fri, 26 Apr 2019 19:48:21 +0000 (21:48 +0200)]
bpf, arm64: remove prefetch insn in xadd mapping

commit 8968c67a82ab7501bc3b9439c3624a49b42fe54c upstream.

Prefetch-with-intent-to-write is currently part of the XADD mapping in
the AArch64 JIT and follows the kernel's implementation of atomic_add.
This may interfere with other threads executing the LDXR/STXR loop,
leading to potential starvation and fairness issues. Drop the optional
prefetch instruction.

Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 months agoANDROID: arm64: vdso: wrap -n in ld-option
Nick Desaulniers [Thu, 4 Jun 2020 18:49:55 +0000 (11:49 -0700)]
ANDROID: arm64: vdso: wrap -n in ld-option

ld.lld distributed in clang-r353983c AOSP LLVM release (the final AOSP
LLVM release for Android Q) did not support `-n` linker flag. It was
eventually added to clang-r360593.

Android OEM's may wish to still use ld.lld to link their kernels for Q.
This flag was disabled for Pixel 4 in go/pag/1258086. This patch is
equivalent, but rebased on upstream changes that removed cc-ldoption in
favor of ld-option.

For Android R, the final AOSP LLVM release, clang-r383902 has long
supported `-n` for ld.lld.

Change-Id: Iab41c9e1039e163113b428fc487a4a0708822faa
Bug: 63740206
Bug: 157279372
Link: https://github.com/ClangBuiltLinux/linux/issues/340
Link: https://bugs.llvm.org/show_bug.cgi?id=40542
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
7 months agoBACKPORT: arm64: vdso: Explicitly add build-id option
Laura Abbott [Fri, 29 May 2020 17:54:25 +0000 (10:54 -0700)]
BACKPORT: arm64: vdso: Explicitly add build-id option

Commit 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to
link VDSO") switched to using LD explicitly. The --build-id option
needs to be passed explicitly, similar to x86. Add this option.

Fixes: 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO")
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
[will: drop redundant use of 'call ld-option' as requested by Masahiro]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Change-Id: I4a0f5c1bb60bda682221a7ff96a783bf8731cc00
[nd: conflict due to ANDROID LTO and CFI]
(cherry picked from commit 7a0a93c51799edc45ee57c6cc1679aa94f1e03d5)
Bug: 153418016
Bug: 157279372
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
(cherry picked from commit a9ee8bba814d956404c12b1c2e2c24cf4b710f08)

7 months agoBACKPORT: arm64: vdso: use $(LD) instead of $(CC) to link VDSO
Masahiro Yamada [Fri, 29 May 2020 17:52:09 +0000 (10:52 -0700)]
BACKPORT: arm64: vdso: use $(LD) instead of $(CC) to link VDSO

We use $(LD) to link vmlinux, modules, decompressors, etc.

VDSO is the only exceptional case where $(CC) is used as the linker
driver, but I do not know why we need to do so. VDSO uses a special
linker script, and does not link standard libraries at all.

I changed the Makefile to use $(LD) rather than $(CC). I tested this,
and VDSO worked for me.

Users will be able to use their favorite linker (e.g. lld instead of
of bfd) by passing LD= from the command line.

My plan is to rewrite all VDSO Makefiles to use $(LD), then delete
cc-ldoption.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Change-Id: I8a14d6dd51d46b6942e68720e24217d1564b7869
[nd: conflicts due to ANDROID patches for LTO and SCS]
(cherry picked from commit 691efbedc60d2a7364a90e38882fc762f06f52c4)
Bug: 153418016
Bug: 157279372
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
(cherry picked from commit 64ea9b4b072b37bd624dd98b963161fd22c1be34)

7 months agoRevert "[RAMEN9610-10883][HACK] Add cc-ldoption without checking"
Cosmin Tanislav [Tue, 16 Apr 2024 20:57:21 +0000 (23:57 +0300)]
Revert "[RAMEN9610-10883][HACK] Add cc-ldoption without checking"

This reverts commit 5b37cf28c672d6decfcd7fbeb5446baca8501e0a.

7 months agoBACKPORT: bpf: Use char in prog and map name backups/20240423-1546/lineage-21
Martin KaFai Lau [Fri, 6 Oct 2017 04:52:12 +0000 (21:52 -0700)]
BACKPORT: bpf: Use char in prog and map name

Instead of u8, use char for prog and map name.  It can avoid the
userspace tool getting compiler's signess warning.  The
bpf_prog_aux, bpf_map, bpf_attr, bpf_prog_info and
bpf_map_info are changed.

Change-Id: I599a8f1eccb0d63aa8d680b771fff1580c69cf75
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agobpf: Change bpf_obj_name_cpy() to better ensure map's name is init by 0
Martin KaFai Lau [Fri, 6 Oct 2017 04:52:11 +0000 (21:52 -0700)]
bpf: Change bpf_obj_name_cpy() to better ensure map's name is init by 0

During get_info_by_fd, the prog/map name is memcpy-ed.  It depends
on the prog->aux->name and map->name to be zero initialized.

bpf_prog_aux is easy to guarantee that aux->name is zero init.

The name in bpf_map may be harder to be guaranteed in the future when
new map type is added.

Hence, this patch makes bpf_obj_name_cpy() to always zero init
the prog/map name.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Change-Id: Ib3bb6efbda0bd682e0cdad8617f587320d7dd397
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agoBACKPORT: bpf: Add map_name to bpf_map_info
Martin KaFai Lau [Wed, 27 Sep 2017 21:37:53 +0000 (14:37 -0700)]
BACKPORT: bpf: Add map_name to bpf_map_info

This patch allows userspace to specify a name for a map
during BPF_MAP_CREATE.

The map's name can later be exported to user space
via BPF_OBJ_GET_INFO_BY_FD.

Change-Id: I96b8d74b09c14f2413d421bba61cfa63d1730bc3
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 months agoBACKPORT: bpf: Add name, load_time, uid and map_ids to bpf_prog_info
Martin KaFai Lau [Wed, 27 Sep 2017 21:37:52 +0000 (14:37 -0700)]
BACKPORT: bpf: Add name, load_time, uid and map_ids to bpf_prog_info

The patch adds name and load_time to struct bpf_prog_aux.  They
are also exported to bpf_prog_info.

The bpf_prog's name is passed by userspace during BPF_PROG_LOAD.
The kernel only stores the first (BPF_PROG_NAME_LEN - 1) bytes
and the name stored in the kernel is always \0 terminated.

The kernel will reject name that contains characters other than
isalnum() and '_'.  It will also reject name that is not null
terminated.

The existing 'user->uid' of the bpf_prog_aux is also exported to
the bpf_prog_info as created_by_uid.

The existing 'used_maps' of the bpf_prog_aux is exported to
the newly added members 'nr_map_ids' and 'map_ids' of
the bpf_prog_info.  On the input, nr_map_ids tells how
big the userspace's map_ids buffer is.  On the output,
nr_map_ids tells the exact user_map_cnt and it will only
copy up to the userspace's map_ids buffer is allowed.

Change-Id: I85270047bd427a4f00259541a08868df62168959
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoARM64: defconfig: FCM required changes
Nolen Johnson [Fri, 16 Feb 2024 20:58:47 +0000 (15:58 -0500)]
ARM64: defconfig: FCM required changes

* Solves the "There is an internal problem with your device.".

Change-Id: I06c77d1945a8995d2a6a0e173434129594db4c3d

10 months agoUPSTREAM: security: selinux: allow per-file labeling for bpffs
Connor O'Brien [Fri, 7 Feb 2020 18:01:49 +0000 (10:01 -0800)]
UPSTREAM: security: selinux: allow per-file labeling for bpffs

Add support for genfscon per-file labeling of bpffs files. This allows
for separate permissions for different pinned bpf objects, which may
be completely unrelated to each other.

Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Steven Moreland <smoreland@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
(cherry picked from commit 4ca54d3d3022ce27170b50e4bdecc3a42f05dbdc)
[which is v5.6-rc1-10-g4ca54d3d3022 and thus already included in 5.10]
Bug: 200440527
Change-Id: I8234b9047f29981b8140bd81bb2ff070b3b0b843
(cherry picked from commit d52ac987ad2ae16ff313d7fb6185bc412cb221a4)

10 months agoFROMLIST: security: selinux: allow per-file labelling for binderfs
Hridya Valsaraju [Sun, 8 Dec 2019 20:43:44 +0000 (12:43 -0800)]
FROMLIST: security: selinux: allow per-file labelling for binderfs

This patch allows genfscon per-file labeling for binderfs.
This is required to have separate permissions to allow
access to binder, hwbinder and vndbinder devices which are
relocating to binderfs.

Acked-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Bug: 136497735
(cherry picked from commit 7a4b51947475a7f67e2bd06c4a4c768e2e64a975
git: //git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
master)
Link: https://lore.kernel.org/patchwork/patch/1175776/
Change-Id: I105cc54b30ddd4120dc23a363bddc2f9d00e4dc4

11 months agoARM: dts: msm: Move to second stage init
Sebastiano Barezzi [Thu, 18 Aug 2022 13:01:45 +0000 (15:01 +0200)]
ARM: dts: msm: Move to second stage init

Change-Id: Ia7642d730b0a68e045f19c6a3fd00ce3c8844379

11 months agoARM64: configs: Enable CONFIG_PROC_CMDLINE_APPEND_ANDROID_FORCE_NORMAL_BOOT
jabashque [Mon, 23 Oct 2023 06:21:09 +0000 (23:21 -0700)]
ARM64: configs: Enable CONFIG_PROC_CMDLINE_APPEND_ANDROID_FORCE_NORMAL_BOOT

Change-Id: I89f115c5757cdb7edb74bd45f881c5b8a7acd644

11 months agoARM64: configs: Enable CONFIG_INITRAMFS_IGNORE_SKIP_FLAG
Sebastiano Barezzi [Sun, 22 Oct 2023 21:05:43 +0000 (14:05 -0700)]
ARM64: configs: Enable CONFIG_INITRAMFS_IGNORE_SKIP_FLAG

Change-Id: I22b5eb981c886d9280957f6bd1047fc76ac9916b

11 months agofs: proc: Add PROC_CMDLINE_APPEND_ANDROID_FORCE_NORMAL_BOOT
Yumi Yukimura [Mon, 23 Oct 2023 06:06:53 +0000 (23:06 -0700)]
fs: proc: Add PROC_CMDLINE_APPEND_ANDROID_FORCE_NORMAL_BOOT

For Android 9 launch A/B devices migrating to Android 10 style
system-as-root, `androidboot.force_normal_boot=1` must be passed in
cmdline when booting into normal or charger mode. However, it is not
always possible for one to modify the bootloader to adhere to these
changes. As a workaround, one can use the presence of the
`skip_initramfs` flag in cmdline to to decide whether to append the new
flag to cmdline on the kernel side.

Co-authored-by: jabashque <jabashque@gmail.com>
Change-Id: Ia00ea2c54e2a7d2275e552837039033adb98d0ff

11 months agoinit: Add CONFIG_INITRAMFS_IGNORE_SKIP_FLAG
Sebastiano Barezzi [Tue, 28 Jun 2022 21:10:29 +0000 (23:10 +0200)]
init: Add CONFIG_INITRAMFS_IGNORE_SKIP_FLAG

* Ignoring an ignore flag, yikes
* Also replace skip_initramf with want_initramf (omitting last letter for Magisk since it binary patches that out of kernel, I'm not even sure why we're supporting that mess)

Co-authored-by: Erfan Abdi <erfangplus@gmail.com>
Change-Id: Ifdf726f128bc66bf860bbb71024f94f56879710f

2 years agokbuild: use HOSTLDFLAGS for single .c executables lineage-20
Robin Jarry [Mon, 26 Feb 2018 18:41:47 +0000 (19:41 +0100)]
kbuild: use HOSTLDFLAGS for single .c executables

When compiling executables from a single .c file, the linker is also
invoked. Pass the HOSTLDFLAGS like for other linker commands.

Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Change-Id: I8332132775e7cc220eb7d1e20911223bca65e1e2

2 years agocert host tools: Stop complaining about deprecated OpenSSL functions lineage-19.1
Linus Torvalds [Wed, 8 Jun 2022 20:18:39 +0000 (13:18 -0700)]
cert host tools: Stop complaining about deprecated OpenSSL functions

OpenSSL 3.0 deprecated the OpenSSL's ENGINE API.  That is as may be, but
the kernel build host tools still use it.  Disable the warning about
deprecated declarations until somebody who cares fixes it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I655a75665920983abc9c3b0d73abfcb34f41b610

2 years agomisc: samsung: scsc_bt: Always set transport unit size to 16
Tim Zimmermann [Mon, 18 Apr 2022 14:33:58 +0000 (16:33 +0200)]
misc: samsung: scsc_bt: Always set transport unit size to 16

* Our firmware wants this to be 16 (PCM) instead of 0x00 (HCI)
* Removes the need for patching userspace

Change-Id: Ie88a682b5acc6ef73db5d35dbd336544ab0a1a2a

2 years agodrivers: soc: cal-if: Prevent optimization of structs on fvmap_copy_from_sram
John Vincent [Thu, 5 Aug 2021 04:37:08 +0000 (12:37 +0800)]
drivers: soc: cal-if: Prevent optimization of structs on fvmap_copy_from_sram

Newer versions of Clang tend to apply heavier optimizations than GCC, especially if -mcpu is set. Because we are applying optimizations to the LITTLE core (see b92e1e70898f515646142736df8d72c43e97e251) kernel panics during boot, citing inability to access memory regions on fvmap_init.

<6>[    0.664609]  [0:      swapper/0:    1] fvmap_init:fvmap initialize 0000000000000000
<0>[    0.664625]  [0:      swapper/0:    1] Unable to handle kernel paging request at virtual address ffffff800b2f2402
<2>[    0.664640]  [0:      swapper/0:    1] sec_debug_set_extra_info_fault = KERN / 0xffffff800b2f2402
<6>[    0.664657]  [0:      swapper/0:    1] search_item_by_key: (FTYPE) extra_info is not ready
<2>[    0.664666]  [0:      swapper/0:    1] set_item_val: fail to find FTYPE
<6>[    0.664680]  [0:      swapper/0:    1] search_item_by_key: (FAULT) extra_info is not ready
<2>[    0.664688]  [0:      swapper/0:    1] set_item_val: fail to find FAULT
<6>[    0.664702]  [0:      swapper/0:    1] search_item_by_key: (PC) extra_info is not ready
<2>[    0.664710]  [0:      swapper/0:    1] set_item_val: fail to find PC
<6>[    0.664724]  [0:      swapper/0:    1] search_item_by_key: (LR) extra_info is not ready
<2>[    0.664732]  [0:      swapper/0:    1] set_item_val: fail to find LR
<1>[    0.664746]  [0:      swapper/0:    1] Mem abort info:
<1>[    0.664760]  [0:      swapper/0:    1]   Exception class = DABT (current EL), IL = 32 bits
<1>[    0.664774]  [0:      swapper/0:    1]   SET = 0, FnV = 0
<1>[    0.664787]  [0:      swapper/0:    1]   EA = 0, S1PTW = 0
<1>[    0.664799]  [0:      swapper/0:    1] Data abort info:
<1>[    0.664814]  [0:      swapper/0:    1]   ISV = 0, ISS = 0x00000021
<1>[    0.664828]  [0:      swapper/0:    1]   CM = 0, WnR = 0
<1>[    0.664842]  [0:      swapper/0:    1] swapper pgtable: 4k pages, 39-bit VAs, pgd = ffffff800a739000
<1>[    0.664856]  [0:      swapper/0:    1] [ffffff800b2f2402] *pgd=000000097cdfe003, *pud=000000097cdfe003, *pmd=00000009742a9003, *pte=00e800000204b707
<0>[    0.664884]  [0:      swapper/0:    1] Internal error: Oops: 96000021 [#1] PREEMPT SMP
<4>[    0.664899]  [0:      swapper/0:    1] Modules linked in:
<0>[    0.664916]  [0:      swapper/0:    1] Process swapper/0 (pid: 1, stack limit = 0xffffff80081a8000)
<0>[    0.664936]  [0:      swapper/0:    1] debug-snapshot: core register saved(CPU:0)
<0>[    0.664950]  [0:      swapper/0:    1] L2ECTLR_EL1: 0000000000000007
<0>[    0.664959]  [0:      swapper/0:    1] L2ECTLR_EL1 valid_bit(30) is NOT set (0x0)
<0>[    0.664978]  [0:      swapper/0:    1] CPUMERRSR: 0000000008000001, L2MERRSR: 0000000010200c00
<0>[    0.664992]  [0:      swapper/0:    1] CPUMERRSR valid_bit(31) is NOT set (0x0)
<0>[    0.665006]  [0:      swapper/0:    1] L2MERRSR valid_bit(31) is NOT set (0x0)
<0>[    0.665020]  [0:      swapper/0:    1] debug-snapshot: context saved(CPU:0)
<6>[    0.665088]  [0:      swapper/0:    1] debug-snapshot: item - log_kevents is disabled
<6>[    0.665112]  [0:      swapper/0:    1] TIF_FOREIGN_FPSTATE: 0, FP/SIMD depth 0, cpu: 0
<4>[    0.665130]  [0:      swapper/0:    1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.113 - Fresh Core-user #1
<4>[    0.665144]  [0:      swapper/0:    1] Hardware name: Samsung A50 LTN OPEN rev04 board based on Exynos9610 (DT)
<4>[    0.665160]  [0:      swapper/0:    1] task: ffffffc8f4ce8000 task.stack: ffffff80081a8000
<4>[    0.665180]  [0:      swapper/0:    1] PC is at fvmap_init+0xac/0x2c8
<4>[    0.665195]  [0:      swapper/0:    1] LR is at fvmap_init+0x70/0x2c8
<4>[    0.665211]  [0:      swapper/0:    1] pc : [<ffffff80085e12b8>] lr : [<ffffff80085e127c>] pstate: 20400145
<4>[    0.665225]  [0:      swapper/0:    1] sp : ffffff80081abb80
<4>[    0.665238]  [0:      swapper/0:    1] x29: ffffff80081abbb0 x28: 0000000000000000
<4>[    0.665256]  [0:      swapper/0:    1] x27: ffffff8009efc000 x26: 0000000000000000
<4>[    0.665273]  [0:      swapper/0:    1] x25: ffffff800b2e5970 x24: ffffff8008fa7222
<4>[    0.665291]  [0:      swapper/0:    1] x23: ffffff800b2e5a60 x22: ffffff8009efb000
<4>[    0.665307]  [0:      swapper/0:    1] x21: ffffffc8f3480000 x20: ffffff800b2e0000
<4>[    0.665324]  [0:      swapper/0:    1] x19: ffffff800b2f2400 x18: 0000000000000000
<4>[    0.665341]  [0:      swapper/0:    1] x17: ffffff8009bff23c x16: 0000000000000000
<4>[    0.665358]  [0:      swapper/0:    1] x15: 00000000000000c6 x14: 0000000000000054
<4>[    0.665375]  [0:      swapper/0:    1] x13: 000000000000d7b8 x12: 0000000000000000
<4>[    0.665392]  [0:      swapper/0:    1] x11: 0000000000000000 x10: ffffffc8f3480000
<4>[    0.665409]  [0:      swapper/0:    1] x9 : ffffff800b2f2400 x8 : 0000000000000000
<4>[    0.665426]  [0:      swapper/0:    1] x7 : 5b20205d39303634 x6 : ffffffc0117d09b7
<4>[    0.665443]  [0:      swapper/0:    1] x5 : 0000000000000001 x4 : 000000000000000c
<4>[    0.665459]  [0:      swapper/0:    1] x3 : 0000000000000a30 x2 : ffffffffffffffce
<4>[    0.665476]  [0:      swapper/0:    1] x1 : 000000000b040000 x0 : 000000000000000a

Since we need these optimizations for performance reasons, the only way to resolve this is to solve the issue. Samsung already did something similar to cal-if before.

We only needed to make fvmap_header/header volatile and has been tested to work.

Signed-off-by: John Vincent <git@tensevntysevn.cf>
Change-Id: Ic419135d4a80cbe15f0fa71dc59cc6efa73d6141

2 years agodrivers: net: wireless: scsc: update slsi_lls_peer_info struct for Android S
Tim Zimmermann [Fri, 14 Jan 2022 19:05:14 +0000 (20:05 +0100)]
drivers: net: wireless: scsc: update slsi_lls_peer_info struct for Android S

* In https://android.googlesource.com/platform/hardware/libhardware_legacy/+/359fa1d1ba2907436817728f4e9565ed6e8a9b73
  there was a new field added to wifi_peer_info struct, our HAL expects these two
  to match - with the new change num_rate wasn't properly aligned anymore
  causing userspace to allocate a "random" amount of wifi_rate_stat structs

Change-Id: I791a5a26fbb5123d08a8280e3312946b7f89c45c

2 years agonet: wireless: scsc: do not write info files without wlbtd
Tim Zimmermann [Sun, 5 Dec 2021 09:25:57 +0000 (10:25 +0100)]
net: wireless: scsc: do not write info files without wlbtd

* Not needed on AOSP
* Causes kernel panics (Accessing user space memory with fs=KERNEL_DS)

Change-Id: If57cf16246039bd32fc07d91d2f081cca00c42d0

2 years agodrivers: misc: scsc: Fix build when CONFIG_SCSC_WLBTD is not set
Nolen Johnson [Wed, 13 Apr 2022 22:34:42 +0000 (18:34 -0400)]
drivers: misc: scsc: Fix build when CONFIG_SCSC_WLBTD is not set

* `MEMDUMP_FILE_FOR_RECOVERY` won't be declared otherwise.

Change-Id: I49040415e24a50e116a919ed876e7ddbfa500c1b

2 years agoARM64: configs: exynos9610: Disable wlbtd
Tim Zimmermann [Sun, 5 Dec 2021 09:08:50 +0000 (10:08 +0100)]
ARM64: configs: exynos9610: Disable wlbtd

* This allows us to drop wlbtd userspace service
* Not needed on AOSP anyway

Change-Id: If1b0846371f550f75df28d67bb3f40fda6dee58b

2 years agoARM64: configs: exynos9610: Enable CC_OPTIMIZE_FOR_PERFORMANCE
Tim Zimmermann [Mon, 22 Nov 2021 20:57:58 +0000 (21:57 +0100)]
ARM64: configs: exynos9610: Enable CC_OPTIMIZE_FOR_PERFORMANCE

Change-Id: I15bacce94af25d32a58a7dc1e6d9db2f2bf9c4e1