Martijn Coenen [Mon, 17 Oct 2016 13:17:31 +0000 (15:17 +0200)]
android: binder: deal with contexts in debugfs.
Properly print the context in debugfs entries.
Change-Id: If10c2129536d9f39bae542afd7318ca79af60e3a
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Fri, 30 Sep 2016 13:51:48 +0000 (15:51 +0200)]
android: binder: support multiple context managers.
Move the context manager state into a separate
struct context, and allow for each process to have
its own context associated with it.
Change-Id: Ifa934370241a2d447dd519eac3fd0682c6d00ab4
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Wed, 13 Jul 2016 10:06:49 +0000 (12:06 +0200)]
android: binder: split flat_binder_object.
flat_binder_object is used for both handling
binder objects and file descriptors, even though
the two are mostly independent. Since we'll
have more fixup objects in binder in the future,
instead of extending flat_binder_object again,
split out file descriptors to their own object
while retaining backwards compatibility to
existing user-space clients. All binder objects
just share a header.
Change-Id: If3c55f27a2aa8f21815383e0e807be47895e4786
Signed-off-by: Martijn Coenen <maco@google.com>
Riley Andrews [Wed, 21 Oct 2015 23:30:51 +0000 (16:30 -0700)]
android: binder: Disable preemption while holding the global binder lock.
(cherry pick from commit
f681f0264fd2c51aa12190ff9be04622a7c8ca3f)
Signed-off-by: Riley Andrews <riandrews@google.com>
Bug:
30141999
Change-Id: I66ed44990d5c347d197e61dc49a37b5228c748d0
Andrew Bresticker [Fri, 23 Oct 2015 22:13:42 +0000 (15:13 -0700)]
CHROMIUM: android: binder: Fix potential scheduling-while-atomic
(cherry picked from commit
166b45af97359159f9585a836c9849e725e31fd6)
Commit
f1e7f0a724f6 ("android: binder: Disable preemption while holding
the global binder lock.") re-enabled preemption around most of the sites
where calls to potentially sleeping functions were made, but missed
__alloc_fd(), which can sleep if the fdtable needs to be resized.
Re-enable preemption around __alloc_fd() as well as __fd_install() which
can now sleep in upstream kernels as of commit
8a81252b774b ("fs/file.c:
don't acquire files->file_lock in fd_install()").
BUG=chrome-os-partner:44012
TEST=Build and boot on Smaug.
Change-Id: I9819c4b95876f697e75b1b84810b6c520d9c33ec
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/308582
Reviewed-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Riley Andrews <riandrews@google.com>
Bug:
30141999
Riley Andrews [Wed, 21 Oct 2015 23:30:25 +0000 (16:30 -0700)]
android: binder: Use wake up hint for synchronous transactions.
(cherry pick from
572b57fc6f7fb6ffaa979d505ec2b0a9e9840cca)
Use wake_up_interruptible_sync() to hint to the scheduler binder
transactions are synchronous wakeups. Disable preemption while waking
to avoid ping-ponging on the binder lock.
Signed-off-by: Riley Andrews <riandrews@google.com>
Bug:
30141999
Change-Id: If570d94ef3fed09c328052922d5a9e83d7ba479a
Stricted [Thu, 11 Oct 2018 16:12:47 +0000 (18:12 +0200)]
Add security hooks to binder and implement the hooks for SELinux.
Add security hooks to the binder and implement the hooks for SELinux.
The security hooks enable security modules such as SELinux to implement
controls over binder IPC. The security hooks include support for
controlling what process can become the binder context manager
(binder_set_context_mgr), controlling the ability of a process
to invoke a binder transaction/IPC to another process (binder_transaction),
controlling the ability a process to transfer a binder reference to
another process (binder_transfer_binder), and controlling the ability
of a process to transfer an open file to another process (binder_transfer_file).
This support is used by SE Android, http://selinuxproject.org/page/SEAndroid.
Change-Id: I34266b66320b6a3df9ac01833d7f94daf742920e
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Nick Desaulniers [Fri, 7 Oct 2016 18:13:55 +0000 (11:13 -0700)]
binder: blacklist %p kptr_restrict
Bug:
31495231
Change-Id: Iebc150f6bc939b56e021424ee44fb30ce8d732fd
[d-cagle@codeaurora.org: Applied to correct file location]
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit:
0804d7840364fc1a93652632bd43a93c055c658e
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
Signed-off-by: Pradosh Das <prados@codeaurora.org>
Arve Hjønnevåg [Tue, 2 Aug 2016 22:40:39 +0000 (15:40 -0700)]
ANDROID: binder: Add strong ref checks
Prevent using a binder_ref with only weak references where a strong
reference is required.
BUG:
30445380
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit:
5e2a2bc89956ae1c739854403408059144b23c28
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
Change-Id: I66c15b066808f28bd27bfe50fd0e03ff45a09fca
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
Pradosh Das [Mon, 17 Oct 2016 18:21:11 +0000 (23:51 +0530)]
ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
Prevents leaking pointers between processes
BUG:
30768347
Change-Id: Id898076926f658a1b8b27a3ccb848756b36de4ca
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit:
11032d745836280574827bb1db5e64a94945180e
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
Signed-off-by: Pradosh Das <prados@codeaurora.org>
Pradosh Das [Sun, 16 Oct 2016 06:34:51 +0000 (12:04 +0530)]
binder: prevent kptr leak by using %pK format specifier
Works in conjunction with kptr_restrict.
Bug:
30143283
Change-Id: I2b3ce22f4e206e74614d51453a1d59b7080ab05a
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit:
b884cbf06200b18e660514a30293931a61126ef5
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
Signed-off-by: Pradosh Das <prados@codeaurora.org>
Riley Andrews [Thu, 28 May 2015 19:10:05 +0000 (12:10 -0700)]
android: drivers: workaround debugfs race in binder
If a /d/binder/proc/[pid] entry is kept open after linux has
torn down the associated process, binder_proc_show can deference
an invalid binder_proc that has been stashed in the debugfs
inode. Validate that the binder_proc ptr passed into binder_proc_show
has not been freed by looking for it within the global process list
whilst the global lock is held. If the ptr is not valid, print nothing.
Bug
19587483
Change-Id: Ice878c171db51ef9a4879c2f9299a2deb873d255
Signed-off-by: Riley Andrews <riandrews@android.com>
Git-commit:
67680c141957991b9350269a8eaf30baf1c85427
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Arve Hjønnevåg [Fri, 14 Feb 2014 04:25:06 +0000 (20:25 -0800)]
android: binder: More offset validation.
Make sure offsets don't point to overlapping flat_binder_object
structs.
Change-Id: I85c759b9c6395492474b177625dc6b0b289fd6b0
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Git-commit:
3ce49f5ed6c76a7772aeba7419d4976f225e1b78
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Greg Kroah-Hartman [Thu, 16 Oct 2014 13:26:51 +0000 (15:26 +0200)]
android: binder: remove binder.h
binder.h isn't needed to just include a uapi file and set a single
define, so move it into binder.c to save a few lines of code.
Change-Id: Idcd0aba576295bbe0ddf5d18c4b1d1e8efdc8c84
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
0983897d00a541f725375d00604dccb3e716e128
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Greg Kroah-Hartman [Thu, 16 Oct 2014 12:40:38 +0000 (14:40 +0200)]
staging: android: binder: move to the "real" part of the kernel
The Android binder code has been "stable" for many years now. No matter
what comes in the future, we are going to have to support this API, so
might as well move it to the "real" part of the kernel as there's no
real work that needs to be done to the existing code.
Change-Id: I36d5c6fc05aff26dd01a227201be18e86c9f9994
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
a8363b44d11a212b1d44d7823fa1795de6887185
Git-repo: https://android.googlesource.com/kernel/common.git
[imaund@codeaurora.org: Resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Danny Wood [Thu, 11 Oct 2018 08:27:43 +0000 (09:27 +0100)]
Remove remaining Samsung changes to binder
Change-Id: I885a7081dbd2d7e85511c46ca3572ecbe1ad3a35
Danny Wood [Thu, 11 Oct 2018 08:20:05 +0000 (09:20 +0100)]
Revert "Add security hooks to binder and implement the hooks for SELinux."
This reverts commit
d90fcbdb7c87acd5e71bdbf8f77c44366ea18f28.
Dmitry Voytik [Mon, 8 Sep 2014 14:16:34 +0000 (18:16 +0400)]
staging: binder: fix coding style issues
Fix coding style issues:
* put braces in all if-else branches;
* limit the length of changed lines to 80 columns.
checkpatch.pl warning count reduces by 3.
Change-Id: I1796588ef0f358780445203c5afa87361ab2bf73
Signed-off-by: Dmitry Voytik <voytikd@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
0a2be85ca49d19a251aa4ebc232df7b99dcafaa2
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
William Panlener [Thu, 4 Sep 2014 03:44:03 +0000 (22:44 -0500)]
staging: android: Break up a long line in binder_send_failed_reply
Kernel coding style. Breaking long lines and strings.
Change-Id: Ie1af1f3bcfef547ab69f873f2e86ee37c8c23caf
Signed-off-by: William Panlener <wpanlener@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
dd0f11419f8fe1a64082575fba2132fe8558c6fe
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Lucas Tanure [Mon, 14 Jul 2014 00:31:05 +0000 (21:31 -0300)]
staging: android: Clean up else statement from binder_send_failed_reply
Kernel coding style. Remove useless else statement after return.
Changes from v1 and v2: Fix warning for mixed declarations and code.
Declaration of "struct binder_transaction *next" made outside of while.
Changes from v3: Removed initialization to NULL for next variable.
Change-Id: I1ec8b512130595b90098126acf562e58eeca8458
Signed-off-by: Lucas Tanure <tanure@linux.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
0137e89dee95eff290c919c3fafb5b1e7d9fad6a
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Riley Andrews [Mon, 12 Jan 2015 22:01:03 +0000 (14:01 -0800)]
staging: android: binder.c: binder_ioctl() cleanup
binder_ioctl() is quite huge and checkpatch dirty - mostly because of
the amount of code for the BINDER_WRITE_READ and BINDER_SET_CONTEXT_MGR.
Moved that code into the new binder_ioctl_write_read() and
binder_ioctl_set_ctx_mgr()
Change-Id: I9c5cea46a570ca91768e5aa7b11c7178bdbb667d
Signed-off-by: Tair Rzayev <tair.rzayev@gmail.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
8802f02dfcf2cf5fb3569fd2b888543ca6a327a6
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Karthik Nayak [Sat, 21 Jun 2014 14:53:16 +0000 (20:23 +0530)]
Staging: Android: removed an unnecessary else statement
As per checkpatch warning, removed an unnecessary else statement
proceeding an if statement with a return.
Change-Id: I718d9ece6c8bac128c10ecaf904721410d701b60
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
a82ef7179a96b30c86258f4744c83cce9bfae441
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Tair Rzayev [Sat, 31 May 2014 19:43:34 +0000 (22:43 +0300)]
staging: android: binder.c: Use more appropriate functions for euid retrieval
Instead of getting the reference to whole credential structure, use
task_euid() and current_euid() to get it.
Change-Id: Id5b9d0305b5f90023415863e569988023aaae290
Signed-off-by: Tair Rzayev <tair.rzayev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
2d47ab7fed4155c334f881548612290833b1d930
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Christian Engelmayer [Wed, 7 May 2014 19:44:53 +0000 (21:44 +0200)]
staging: binder: fix usage of uninit scalar in binder_transaction()
Fix the error path when a cookie mismatch is detected. In that case the
function jumps to the exit label without setting the uninitialized, local
variable 'return_error'. Detected by Coverity - CID 201453.
Change-Id: I6c960b7d3ad0adb28fad106a9a0b8cb934013987
Signed-off-by: Christian Engelmayer <cengelma@gmx.at>
Acked-by: Arve <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
045788ea680a4e204aa832ba3985ee1f6a87abc4
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Jerry Snitselaar [Thu, 1 May 2014 06:58:55 +0000 (23:58 -0700)]
staging: binder: cleanup dereference of noderef expressions
Clean up sparse warnings for cred struct dereference.
Change-Id: I78059976c0488abfe9eb4e9f0b0f8ac10c7ef4f9
Signed-off-by: Jerry Snitselaar <dev@snitselaar.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
12b727a8f9ba257c8d5882a65661e5a81521e571
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Seunghun Lee [Wed, 30 Apr 2014 16:30:23 +0000 (01:30 +0900)]
staging: android: fix missing a blank line after declarations
This patch fixes "Missing a blank line after declarations" warnings.
Change-Id: Iede8a80f003eba36fd1f8d3ec8135d9d35c16ee9
Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
d0a7d1865c7ff9054a8eeb722e4f7ae0d80fcdb3
Git-repo: https://android.googlesource.com/kernel/common.git
[imaund@codeaurora.org: Resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Mathieu Maret [Tue, 15 Apr 2014 10:03:05 +0000 (12:03 +0200)]
staging: binder: add __user annotation in binder.c
Add __user to binder_version to correct sparse warning.
Reduce line size to fit to coding style.
Change-Id: I8694fb5a082721c69d1596b2853c5d4899f6536b
Signed-off-by: Mathieu Maret <mathieu.maret@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
9c5197b7a50ae6a323b0281826fb36ab6389d74e
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Arve Hjønnevåg [Fri, 21 Feb 2014 22:40:26 +0000 (14:40 -0800)]
staging: binder: Support concurrent 32 bit and 64 bit processes.
For 64bit systems we want to use the same binder interface for 32bit and
64bit processes. Thus the size and the layout of the structures passed
between the kernel and the userspace has to be the same for both 32 and
64bit processes.
This change replaces all the uses of void* and size_t with
binder_uintptr_t and binder_size_t. These are then typedefed to specific
sizes depending on the use of the interface, as follows:
* __u32 - on legacy 32bit only userspace
* __u64 - on mixed 32/64bit userspace where all processes use the same
interface.
This change also increments the BINDER_CURRENT_PROTOCOL_VERSION to 8 and
hooks the compat_ioctl entry for the mixed 32/64bit Android userspace.
This patch also provides a CONFIG_ANDROID_BINDER_IPC_32BIT option for
compatability, which if set which enables the old protocol, setting
BINDER_CURRENT_PROTOCOL_VERSION to 7, on 32 bit systems.
Please note that all 64bit kernels will use the 64bit Binder ABI.
Change-Id: If54f075787a6bb261012eb73295eb4f83a8c91c9
Cc: Colin Cross <ccross@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Serban Constantinescu <serban.constantinescu@arm.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Arve Hjønnevåg <arve@android.com>
[jstultz: Merged with upstream type changes. Various whitespace fixes
and longer Kconfig description for checkpatch. Included improved commit
message from Serban (with a few tweaks).]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
dda6600546202c636b2be74a4640e929dae7cba9
Git-repo: https://android.googlesource.com/kernel/common.git
[imaund@codeaurora.org: Resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Riley Andrews [Sat, 10 Jan 2015 03:10:36 +0000 (19:10 -0800)]
Revert "Staging: android: binder: Support concurrent 32 bit and 64 bit processes."
This reverts commit
2d595dc92ae19b49e4c1ebb1f8e3b461a2d06592.
Change-Id: I1e4e306fa38f851b3044abb8a7c929c298c14812
Git-commit:
b4804ef90f54068bfefd6a6f8c687e7844912c1d
Git-repo: https://android.googlesource.com/kernel/common.git
[imaund@codeaurora.org: Resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Uma Maheshwari Bhiram [Fri, 8 Nov 2013 23:19:38 +0000 (15:19 -0800)]
binder: Quiet Binder
Temporary change to avoid watchdog bark because of
excessive failed transaction logging
CRs-Fixed: 572081
Change-Id: Id664d65ab9e78627991f8b7d4f4e5e126908c214
Signed-off-by: Uma Maheshwari Bhiram <ubhira@codeaurora.org>
Vinayak Menon [Mon, 2 Jun 2014 12:47:59 +0000 (18:17 +0530)]
staging: binder: add vm_fault handler
An issue was observed when a userspace task exits.
The page which hits error here is the zero page.
In binder mmap, the whole of vma is not mapped.
On a task crash, when debuggerd reads the binder regions,
the unmapped areas fall to do_anonymous_page in handle_pte_fault,
due to the absence of a vm_fault handler. This results in
zero page being mapped. Later in zap_pte_range, vm_normal_page
returns zero page in the case of VM_MIXEDMAP and it results in the
error.
BUG: Bad page map in process mediaserver pte:
9dff379f pmd:
9bfbd831
page:
c0ed8e60 count:1 mapcount:-1 mapping: (null) index:0x0
page flags: 0x404(referenced|reserved)
addr:
40c3f000 vm_flags:
10220051 anon_vma: (null) mapping:
d9fe0764 index:fd
vma->vm_ops->fault: (null)
vma->vm_file->f_op->mmap: binder_mmap+0x0/0x274
CPU: 0 PID: 1463 Comm: mediaserver Tainted: G W 3.10.17+ #1
[<
c001549c>] (unwind_backtrace+0x0/0x11c) from [<
c001200c>] (show_stack+0x10/0x14)
[<
c001200c>] (show_stack+0x10/0x14) from [<
c0103d78>] (print_bad_pte+0x158/0x190)
[<
c0103d78>] (print_bad_pte+0x158/0x190) from [<
c01055f0>] (unmap_single_vma+0x2e4/0x598)
[<
c01055f0>] (unmap_single_vma+0x2e4/0x598) from [<
c010618c>] (unmap_vmas+0x34/0x50)
[<
c010618c>] (unmap_vmas+0x34/0x50) from [<
c010a9e4>] (exit_mmap+0xc8/0x1e8)
[<
c010a9e4>] (exit_mmap+0xc8/0x1e8) from [<
c00520f0>] (mmput+0x54/0xd0)
[<
c00520f0>] (mmput+0x54/0xd0) from [<
c005972c>] (do_exit+0x360/0x990)
[<
c005972c>] (do_exit+0x360/0x990) from [<
c0059ef0>] (do_group_exit+0x84/0xc0)
[<
c0059ef0>] (do_group_exit+0x84/0xc0) from [<
c0066de0>] (get_signal_to_deliver+0x4d4/0x548)
[<
c0066de0>] (get_signal_to_deliver+0x4d4/0x548) from [<
c0011500>] (do_signal+0xa8/0x3b8)
Add a vm_fault handler which returns VM_FAULT_SIGBUS, and prevents the
wrong fallback to do_anonymous_page.
CRs-Fixed: 673147
Change-Id: I43730a51b6c819538b46c5e4dc5c96c8a384098d
Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com>
Patch-mainline: linux-arm-kernel @ 06/02/14, 18:17
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Paresh Nakhe [Wed, 9 Apr 2014 09:19:32 +0000 (14:49 +0530)]
binder: NULL pointer reference
This fix handles a possible NULL pointer reference in
debug message.
CRs-fixed: 642883
Change-Id: Ide78f281ec0cff5cbd8231b85c305d13a892854e
Signed-off-by: Paresh Nakhe <pnakhe@codeaurora.org>
Bojan Prtvar [Mon, 2 Sep 2013 06:18:40 +0000 (08:18 +0200)]
Staging: android: Mark local functions in binder.c as static
This fixes the following sparse warnings
drivers/staging/android/binder.c:1703:5: warning: symbol 'binder_thread_write' was not declared. Should it be static?
drivers/staging/android/binder.c:2058:6: warning: symbol 'binder_stat_br' was not declared. Should it be static?
Change-Id: I930f10e54c19b0c6aca275f3ef51320bcfa3bb34
Signed-off-by: Bojan Prtvar <prtvar.b@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit:
fb07ebc3e82a98a3605112b71ea819c359549c4b
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
Signed-off-by: Ajay Dudani <adudani@codeaurora.org>
Danny Wood [Thu, 11 Oct 2018 07:33:51 +0000 (08:33 +0100)]
Revert "android: binder: More offset validation."
This reverts commit
2d0c27cf884f6ecf543b672a9b14a29440db65a3.
Danny Wood [Thu, 11 Oct 2018 07:33:28 +0000 (08:33 +0100)]
Revert "android: drivers: workaround debugfs race in binder"
This reverts commit
48357f46f151cf683b40320b35e92430833435c8.
Danny Wood [Thu, 11 Oct 2018 07:33:09 +0000 (08:33 +0100)]
Revert "binder: prevent kptr leak by using %pK format specifier"
This reverts commit
924e75bf89da402be916a3fda62a39fe25782e1d.
Danny Wood [Thu, 11 Oct 2018 07:32:44 +0000 (08:32 +0100)]
Revert "ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct"
This reverts commit
54a0185930cdc9681b6371c68e8df9fbe98766e0.
Danny Wood [Thu, 11 Oct 2018 07:32:15 +0000 (08:32 +0100)]
Revert "ANDROID: binder: Add strong ref checks"
This reverts commit
a00322a2dbda693130a90a039fac48578e474eda.
Change-Id: Ib69f278c3a6cef08afb4a9d694deb59888aaab41
Danny Wood [Thu, 11 Oct 2018 07:27:30 +0000 (08:27 +0100)]
Revert "binder: blacklist %p kptr_restrict"
This reverts commit
0f0cfd941fbe505504e756c13d6c790132d839aa.
Danny Wood [Thu, 11 Oct 2018 11:30:45 +0000 (12:30 +0100)]
Fix gud-exynos7580 driver when compiling in a separate output directory
Change-Id: Ib996842a7d9a4cdf1c9282ef0d78401b09d9533c
Stricted [Thu, 11 Oct 2018 16:07:17 +0000 (18:07 +0200)]
move d_rcu from overlapping d_child to overlapping d_alias
commit
946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ben Hutchings <ben@decadent.org.uk>
[hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
- Apply name changes in all the different places we use d_alias and d_child
- Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Used commit: https://github.com/LineageOS/android_kernel_oppo_msm8939/commit/
6637ecd306a94a03dd5b8e4e8d3f260d9877c5b0
Change-Id: Ie0e7d9f23094d8e451741cf3e82327e2512a792e
UpInTheAir [Sat, 3 Sep 2016 11:54:13 +0000 (11:54 +0000)]
drivers/char: add frandom [Eli Billauer]
Signed-off-by: Jesse Chan <jc@linux.com>
Original commit: https://github.com/jesec/android_kernel_samsung_universal8890/commit/
de479cea4721b2b66b6917d247dbb7ba32afbc93
Oleg Nesterov [Sat, 7 Nov 2015 00:30:06 +0000 (16:30 -0800)]
proc: actually make proc_fd_permission() thread-friendly
The commit
96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
fixed the access to /proc/self/fd from sub-threads, but introduced another
problem: a sub-thread can't access /proc/<tid>/fd/ or /proc/thread-self/fd
if generic_permission() fails.
Change proc_fd_permission() to check same_thread_group(pid_task(), current).
Change-Id: I14b0e0d52ba87ace783e02d31707221b06f44efa
Fixes:
96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
Reported-by: "Jin, Yihua" <yihua.jin@intel.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit:
54708d2858e79a2bdda10bf8a20c80eb96c20613
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Aleksa Sarai [Sun, 20 Aug 2017 16:25:50 +0000 (18:25 +0200)]
fs: exec: apply CLOEXEC before changing dumpable task flags
If you have a process that has set itself to be non-dumpable, and it
then undergoes exec(2), any CLOEXEC file descriptors it has open are
"exposed" during a race window between the dumpable flags of the process
being reset for exec(2) and CLOEXEC being applied to the file
descriptors. This can be exploited by a process by attempting to access
/proc/<pid>/fd/... during this window, without requiring CAP_SYS_PTRACE.
The race in question is after set_dumpable has been (for get_link,
though the trace is basically the same for readlink):
[vfs]
-> proc_pid_link_inode_operations.get_link
-> proc_pid_get_link
-> proc_fd_access_allowed
-> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
Which will return 0, during the race window and CLOEXEC file descriptors
will still be open during this window because do_close_on_exec has not
been called yet. As a result, the ordering of these calls should be
reversed to avoid this race window.
This is of particular concern to container runtimes, where joining a
PID namespace with file descriptors referring to the host filesystem
can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect
against access of CLOEXEC file descriptors -- file descriptors which may
reference filesystem objects the container shouldn't have access to).
Cc: dev@opencontainers.org
Cc: <stable@vger.kernel.org> # v3.2+
Reported-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
https://github.com/miraclestars/android_kernel_samsung_msm8996/commit/
2cb49253bd089b9d823f4ed60c525cb5a348f2e8
Ray Essick [Sun, 20 Aug 2017 16:55:54 +0000 (16:55 +0000)]
exec: reserve full virtual stack space at execve()
grows the main process stack at execve() time so that it can reach it's
maximum allowed size. No RAM is allocated until the stack actually uses this
space.
This prevents problems observed around address space randomization where
large heaps grew into and collided with the main process' stack. This
prevented the main stack from growing later. Early workarounds had the
particular application allocating its stack to be large before going off
to execute the work that grew the heap; that saved the stack's virtual
space but also allocated RAM ahead of the stack needing it.
Change-Id: I131a1afb6e50ddfa580894562c05de653d2e30e7
Signed-off-by: Ray Essick <are002@motorola.com>
Reviewed-on: http://gerrit.mot.com/718024
SLTApproved: Slta Waiver <sltawvr@motorola.com>
Tested-by: Jira Key <jirakey@motorola.com>
Reviewed-by: Christopher Fries <cfries@motorola.com>
Submit-Approved: Jira Key <jirakey@motorola.com>
alexax66 [Sun, 6 Aug 2017 15:30:11 +0000 (15:30 +0000)]
fs: fixup after last commits
Johannes Weiner [Fri, 31 Mar 2017 22:09:53 +0000 (22:09 +0000)]
mm + fs: store shadow entries in page cache
Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page. As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently. At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.
Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate. Reclaim will check
for this flag before installing shadow pages.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on commit: https://github.com/hizukiayaka/linux-kernel/commit/
91b0abe36a7b2b3b02d7500925a5f8455334f0e5
Conflicts:
fs/ecryptfs/super.c
Davidlohr Bueso [Sat, 16 Apr 2016 14:50:56 +0000 (14:50 +0000)]
mm: per-thread vma caching
This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches were needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate in finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.
We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.
The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:
1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve ~50% hit rate by just adding a few more slots to
the cache.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 50.61% | 19.90 |
| patched | 73.45% | 13.58 |
+----------------+----------+------------------+
2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 75.28% | 11.03 |
| patched | 88.09% | 9.31 |
+----------------+----------+------------------+
3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 70.66% | 17.14 |
| patched | 91.15% | 12.57 |
+----------------+----------+------------------+
4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while baseline is just
about non-existent. The amounts of cycles can fluctuate between
anywhere from ~60 to ~116 for the baseline scheme, but this approach
reduces it considerably. For instance, with 80 threads:
+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline | 1.06% | 91.54 |
| patched | 99.97% | 14.18 |
+----------------+----------+------------------+
[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on commit: https://github.com/halaszk/Perseus-halaszk-universal5433/commit/
610115c9a6cfe8fd3333e32fc266da1d10b2a170#diff-f38e70c39e1312b46fdae37b8cae6eb3R214
Conflicts:
mm/mmap.c
Hugh Dickins [Mon, 19 Jun 2017 11:03:24 +0000 (04:03 -0700)]
mm: larger stack guard gap, between vmas
commit
1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
[wt: backport to 3.18: adjust context ; no FOLL_POPULATE ;
s390 uses generic arch_get_unmapped_area()]
[wt: backport to 3.16: adjust context]
[wt: backport to 3.10: adjust context ; code logic in PARISC's
arch_get_unmapped_area() wasn't found ; code inserted into
expand_upwards() and expand_downwards() runs under anon_vma lock;
changes for gup.c:faultin_page go to memory.c:__get_user_pages();
included Hugh Dickins' fixes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Conflicts:
mm/memory.c
OGAWA Hirofumi [Fri, 10 Mar 2017 00:17:37 +0000 (16:17 -0800)]
fat: fix using uninitialized fields of fat_inode/fsinfo_inode
commit
c0d0e351285161a515396b7b1ee53ec9ffd97e3c upstream.
Recently fallocate patch was merged and it uses
MSDOS_I(inode)->mmu_private at fat_evict_inode(). However,
fat_inode/fsinfo_inode that was introduced in past didn't initialize
MSDOS_I(inode) properly.
With those combinations, it became the cause of accessing random entry
in FAT area.
Link: http://lkml.kernel.org/r/87pohrj4i8.fsf@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Tested-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Weston Andros Adamson [Thu, 23 Feb 2017 19:54:21 +0000 (14:54 -0500)]
NFSv4: fix getacl ERANGE for some ACL buffer sizes
commit
ed92d8c137b7794c2c2aa14479298b9885967607 upstream.
We're not taking into account that the space needed for the (variable
length) attr bitmap, with the result that we'd sometimes get a spurious
ERANGE when the ACL data got close to the end of a page.
Just add in an extra page to make sure.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Miklos Szeredi [Thu, 16 Feb 2017 16:49:02 +0000 (17:49 +0100)]
vfs: fix uninitialized flags in splice_to_pipe()
commit
5a81e6a171cdbd1fa8bc1fdd80c23d3d71816fac upstream.
Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
unused part of the pipe ring buffer. Previously splice_to_pipe() left
the flags value alone, which could result in incorrect behavior.
Uninitialized flags appears to have been there from the introduction of
the splice syscall.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Al Viro [Thu, 11 Sep 2014 22:55:50 +0000 (18:55 -0400)]
move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)
commit
6f18493e541c690169c3b1479d47d95f624161cf upstream.
and lock the right list there
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Daeho Jeong [Thu, 1 Dec 2016 16:49:12 +0000 (11:49 -0500)]
ext4: fix inode checksum calculation problem if i_extra_size is small
commit
05ac5aa18abd7db341e54df4ae2b4c98ea0e43b7 upstream.
We've fixed the race condition problem in calculating ext4 checksum
value in commit
b47820edd163 ("ext4: avoid modifying checksum fields
directly during checksum veficationon"). However, by this change,
when calculating the checksum value of inode whose i_extra_size is
less than 4, we couldn't calculate the checksum value in a proper way.
This problem was found and reported by Nix, Thank you.
Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Sun, 5 Feb 2017 06:26:48 +0000 (01:26 -0500)]
ext4: return EROFS if device is r/o and journal replay is needed
commit
4753d8a24d4588657bc0a4cd66d4e282dff15c8c upstream.
If the file system requires journal recovery, and the device is
read-ony, return EROFS to the mount system call. This allows xfstests
generic/050 to pass.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Conflicts:
fs/ext4/super.c
Theodore Ts'o [Sun, 5 Feb 2017 04:38:06 +0000 (23:38 -0500)]
ext4: preserve the needs_recovery flag when the journal is aborted
commit
97abd7d4b5d9c48ec15c425485f054e1c15e591b upstream.
If the journal is aborted, the needs_recovery feature flag should not
be removed. Otherwise, it's the journal might not get replayed and
this could lead to more data getting lost.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Jan Kara [Fri, 27 Jan 2017 19:34:30 +0000 (14:34 -0500)]
ext4: trim allocation requests to group size
commit
cd648b8a8fd5071d232242d5ee7ee3c0815776af upstream.
If filesystem groups are artifically small (using parameter -g to
mkfs.ext4), ext4_mb_normalize_request() can result in a request that is
larger than a block group. Trim the request size to not confuse
allocation code.
Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Wed, 15 Feb 2017 06:26:39 +0000 (01:26 -0500)]
ext4: fix fencepost in s_first_meta_bg validation
commit
2ba3e6e8afc9b6188b471f27cf2b5e3cf34e7af2 upstream.
It is OK for s_first_meta_bg to be equal to the number of block group
descriptor blocks. (It rarely happens, but it shouldn't cause any
problems.)
https://bugzilla.kernel.org/show_bug.cgi?id=194567
Fixes:
3a4b77cd47bb837b8557595ec7425f281f2ca1fe
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Sun, 5 Feb 2017 04:14:19 +0000 (23:14 -0500)]
jbd2: don't leak modified metadata buffers on an aborted journal
commit
e112666b4959b25a8552d63bc564e1059be703e8 upstream.
If the journal has been aborted, we shouldn't mark the underlying
buffer head as dirty, since that will cause the metadata block to get
modified. And if the journal has been aborted, we shouldn't allow
this since it will almost certainly lead to a corrupted file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Conflicts:
fs/jbd2/transaction.c
Eryu Guan [Thu, 1 Dec 2016 20:08:37 +0000 (15:08 -0500)]
ext4: validate s_first_meta_bg at mount time
commit
3a4b77cd47bb837b8557595ec7425f281f2ca1fe upstream.
Ralf Spenneberg reported that he hit a kernel crash when mounting a
modified ext4 image. And it turns out that kernel crashed when
calculating fs overhead (ext4_calculate_overhead()), this is because
the image has very large s_first_meta_bg (debug code shows it's
842150400), and ext4 overruns the memory in count_overhead() when
setting bitmap buffer, which is PAGE_SIZE.
ext4_calculate_overhead():
buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer
blks = count_overhead(sb, i, buf);
count_overhead():
for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j =
842150400
ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun
count++;
}
This can be reproduced easily for me by this script:
#!/bin/bash
rm -f fs.img
mkdir -p /mnt/ext4
fallocate -l 16M fs.img
mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img
debugfs -w -R "ssv first_meta_bg
842150400" fs.img
mount -o loop fs.img /mnt/ext4
Fix it by validating s_first_meta_bg first at mount time, and
refusing to mount if its value exceeds the largest possible meta_bg
number.
[js] use EXT4_HAS_INCOMPAT_FEATURE instead of new
ext4_has_feature_meta_bg
Reported-by: Ralf Spenneberg <ralf@os-t.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Fri, 18 Nov 2016 18:37:47 +0000 (13:37 -0500)]
ext4: add sanity checking to count_overhead()
commit
c48ae41bafe31e9a66d8be2ced4e42a6b57fa814 upstream.
The commit "ext4: sanity check the block and cluster size at mount
time" should prevent any problems, but in case the superblock is
modified while the file system is mounted, add an extra safety check
to make sure we won't overrun the allocated buffer.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Fri, 18 Nov 2016 18:24:26 +0000 (13:24 -0500)]
ext4: fix in-superblock mount options processing
commit
5aee0f8a3f42c94c5012f1673420aee96315925a upstream.
Fix a large number of problems with how we handle mount options in the
superblock. For one, if the string in the superblock is long enough
that it is not null terminated, we could run off the end of the string
and try to interpret superblocks fields as characters. It's unlikely
this will cause a security problem, but it could result in an invalid
parse. Also, parse_options is destructive to the string, so in some
cases if there is a comma-separated string, it would be modified in
the superblock. (Fortunately it only happens on file systems with a
1k block size.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Theodore Ts'o [Fri, 18 Nov 2016 18:28:30 +0000 (13:28 -0500)]
ext4: use more strict checks for inodes_per_block on mount
commit
cd6bb35bf7f6d7d922509bf50265383a0ceabe96 upstream.
Centralize the checks for inodes_per_block and be more strict to make
sure the inodes_per_block_group can't end up being zero.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Liu Bo [Wed, 3 Aug 2016 19:33:01 +0000 (12:33 -0700)]
Btrfs: fix memory leak in reading btree blocks
commit
2571e739677f1e4c0c63f5ed49adcc0857923625 upstream.
So we can read a btree block via readahead or intentional read,
and we can end up with a memory leak when something happens as
follows,
1) readahead starts to read block A but does not wait for read
completion,
2) btree_readpage_end_io_hook finds that block A is corrupted,
and it needs to clear all block A's pages' uptodate bit.
3) meanwhile an intentional read kicks in and checks block A's
pages' uptodate to decide which page needs to be read.
4) when some pages have the uptodate bit during 3)'s check so
3) doesn't count them for eb->io_pages, but they are later
cleared by 2) so we has to readpage on the page, we get
the wrong eb->io_pages which results in a memory leak of
this block.
This fixes the problem by firstly getting all pages's locking and
then checking pages' uptodate bit.
t1(readahead) t2(readahead endio) t3(the following read)
read_extent_buffer_pages end_bio_extent_readpage
for pg in eb: for page 0,1,2 in eb:
if pg is uptodate: btree_readpage_end_io_hook(pg)
num_reads++ if uptodate:
eb->io_pages = num_reads SetPageUptodate(pg) _______________
for pg in eb: for page 3 in eb: read_extent_buffer_pages
if pg is NOT uptodate: btree_readpage_end_io_hook(pg) for pg in eb:
__extent_read_full_page(pg) sanity check reports something wrong if pg is uptodate:
clear_extent_buffer_uptodate(eb) num_reads++
for pg in eb: eb->io_pages = num_reads
ClearPageUptodate(page) _______________
for pg in eb:
if pg is NOT uptodate:
__extent_read_full_page(pg)
So t3's eb->io_pages is not consistent with the number of pages it's reading,
and during endio(), atomic_dec_and_test(&eb->io_pages) will get a negative
number so that we're not able to free the eb.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Eric Biggers [Wed, 15 Mar 2017 18:52:02 +0000 (14:52 -0400)]
ext4: mark inode dirty after converting inline directory
commit
b9cf625d6ecde0d372e23ae022feead72b4228a6 upstream.
If ext4_convert_inline_data() was called on a directory with inline
data, the filesystem was left in an inconsistent state (as considered by
e2fsck) because the file size was not increased to cover the new block.
This happened because the inode was not marked dirty after i_disksize
was updated. Fix this by marking the inode dirty at the end of
ext4_finish_convert_inline_dir().
This bug was probably not noticed before because most users mark the
inode dirty afterwards for other reasons. But if userspace executed
FS_IOC_SET_ENCRYPTION_POLICY with invalid parameters, as exercised by
'kvm-xfstests -c adv generic/396', then the inode was never marked dirty
after updating i_disksize.
Fixes:
3c47d54170b6a678875566b1b8d6dcf57904e49b
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
J. Bruce Fields [Thu, 23 Feb 2017 19:53:39 +0000 (14:53 -0500)]
NFSv4: fix getacl head length estimation
commit
6682c14bbe505a8b912c57faf544f866777ee48d upstream.
Bitmap and attrlen follow immediately after the op reply header. This
was an oversight from commit
bf118a342f.
Consequences of this are just minor efficiency (extra calls to
xdr_shrink_bufhead).
Fixes:
bf118a342f10 "NFSv4: include bitmap in nfsv4 get acl data"
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Miklos Szeredi [Wed, 22 Feb 2017 19:08:25 +0000 (20:08 +0100)]
fuse: add missing FR_FORCE
commit
2e38bea99a80eab408adee27f873a188d57b76cb upstream.
fuse_file_put() was missing the "force" flag for the RELEASE request when
sending synchronously (fuseblk).
If this flag is not set, then a sync request may be interrupted before it
is dequeued by the userspace filesystem. In this case the OPEN won't be
balanced with a RELEASE.
[js] force is a variable, not a bit
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes:
5a18ec176c93 ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Ben Hutchings [Fri, 29 Aug 2014 22:18:58 +0000 (15:18 -0700)]
ocfs2: do not write error flag to user structure we cannot copy from/to
commit
2b462638e41ea62230297c21c4da9955937b7a3c upstream.
If we failed to copy from the structure, writing back the flags leaks 31
bits of kernel memory (the rest of the ir_flags field).
In any case, if we cannot copy from/to the structure, why should we
expect putting just the flags to work?
Also make sure ocfs2_info_handle_freeinode() returns the right error
code if the copy_to_user() fails.
Fixes:
ddee5cdb70e6 ('Ocfs2: Add new OCFS2_IOC_INFO ioctl for ocfs2 v8.')
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Richard Weinberger [Tue, 10 Jan 2017 10:49:40 +0000 (11:49 +0100)]
ubifs: Fix journal replay wrt. xattr nodes
commit
1cb51a15b576ee325d527726afff40947218fd5e upstream.
When replaying the journal it can happen that a journal entry points to
a garbage collected node.
This is the case when a power-cut occurred between a garbage collect run
and a commit. In such a case nodes have to be read using the failable
read functions to detect whether the found node matches what we expect.
One corner case was forgotten, when the journal contains an entry to
remove an inode all xattrs have to be removed too. UBIFS models xattr
like directory entries, so the TNC code iterates over
all xattrs of the inode and removes them too. This code re-uses the
functions for walking directories and calls ubifs_tnc_next_ent().
ubifs_tnc_next_ent() expects to be used only after the journal and
aborts when a node does not match the expected result. This behavior can
render an UBIFS volume unmountable after a power-cut when xattrs are
used.
Fix this issue by using failable read functions in ubifs_tnc_next_ent()
too when replaying the journal.
Fixes:
1e51764a3c2ac05a ("UBIFS: add new flash file system")
Reported-by: Rock Lee <rockdotlee@gmail.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
NeilBrown [Mon, 19 Dec 2016 00:19:31 +0000 (11:19 +1100)]
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
commit
cfd278c280f997cf2fe4662e0acab0fe465f637b upstream.
Various places assume that if nfs4_fl_prepare_ds() turns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.
This is not necessasrily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.
So add a test for ds_clp == NULL and return NULL in that case.
Fixes:
c23266d532b4 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Olga Kornievskaia <aglo@umich.edu>
Acked-by: Adamson, Andy <William.Adamson@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Rabin Vincent [Thu, 1 Dec 2016 08:18:28 +0000 (09:18 +0100)]
block: protect iterate_bdevs() against concurrent close
commit
af309226db916e2c6e08d3eba3fa5c34225200c4 upstream.
If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):
BUG: unable to handle kernel NULL pointer dereference at
0000000000000508
IP: [<
ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
PGD
9e62067 PUD
9ee8067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ #400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
task:
ffff880009f4d700 ti:
ffff880009f5c000 task.ti:
ffff880009f5c000
RIP: 0010:[<
ffffffff81314790>] [<
ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
RSP: 0018:
ffff880009f5fe68 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff88000ec17a38 RCX:
ffffffff81a4e940
RDX:
7fffffffffffffff RSI:
0000000000000000 RDI:
ffff88000ec176c0
RBP:
ffff880009f5fe68 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000001 R11:
0000000000000000 R12:
ffff88000ec17860
R13:
ffffffff811b25c0 R14:
ffff88000ec178e0 R15:
ffff88000ec17a38
FS:
00007faee505d700(0000) GS:
ffff88000fb00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
0000000000000508 CR3:
0000000009e8a000 CR4:
00000000000006e0
Stack:
ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
Call Trace:
[<
ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
[<
ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
[<
ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
[<
ffffffff811bc402>] iterate_bdevs+0xf2/0x130
[<
ffffffff811b2763>] sys_sync+0x63/0x90
[<
ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
RIP [<
ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
RSP <
ffff880009f5fe68>
CR2:
0000000000000508
---[ end trace
2487336ceb3de62d ]---
The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():
while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done
Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.
Fixes:
5c0d6b60a0ba ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Dan Carpenter [Sat, 10 Dec 2016 14:56:01 +0000 (09:56 -0500)]
ext4: return -ENOMEM instead of success
commit
578620f451f836389424833f1454eeeb2ffc9e9f upstream.
We should set the error code if kzalloc() fails.
Fixes:
67cf5b09a46f ("ext4: add the basic function for inline data support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Darrick J. Wong [Sat, 10 Dec 2016 14:55:01 +0000 (09:55 -0500)]
ext4: reject inodes with negative size
commit
7e6e1ef48fc02f3ac5d0edecbb0c6087cd758d58 upstream.
Don't load an inode with a negative size; this causes integer overflow
problems in the VFS.
[ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT]
js: use EIO for 3.12 instead of EFSCORRUPTED.
Fixes:
a48380f769df (ext4: rename i_dir_acl to i_size_high)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Chandan Rajendra [Tue, 15 Nov 2016 02:26:26 +0000 (21:26 -0500)]
ext4: fix stack memory corruption with 64k block size
commit
30a9d7afe70ed6bd9191d3000e2ef1a34fb58493 upstream.
The number of 'counters' elements needed in 'struct sg' is
super_block->s_blocksize_bits + 2. Presently we have 16 'counters'
elements in the array. This is insufficient for block sizes >= 32k. In
such cases the memcpy operation performed in ext4_mb_seq_groups_show()
would cause stack memory corruption.
Fixes:
c9de560ded61f
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Chandan Rajendra [Tue, 15 Nov 2016 02:04:37 +0000 (21:04 -0500)]
ext4: fix mballoc breakage with 64k block size
commit
69e43e8cc971a79dd1ee5d4343d8e63f82725123 upstream.
'border' variable is set to a value of 2 times the block size of the
underlying filesystem. With 64k block size, the resulting value won't
fit into a 16-bit variable. Hence this commit changes the data type of
'border' to 'unsigned int'.
Fixes:
c9de560ded61f
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
NeilBrown [Mon, 12 Dec 2016 15:21:51 +0000 (08:21 -0700)]
block_dev: don't test bdev->bd_contains when it is not stable
commit
bcc7f5b4bee8e327689a4d994022765855c807ff upstream.
bdev->bd_contains is not stable before calling __blkdev_get().
When __blkdev_get() is called on a parition with ->bd_openers == 0
it sets
bdev->bd_contains = bdev;
which is not correct for a partition.
After a call to __blkdev_get() succeeds, ->bd_openers will be > 0
and then ->bd_contains is stable.
When FMODE_EXCL is used, blkdev_get() calls
bd_start_claiming() -> bd_prepare_to_claim() -> bd_may_claim()
This call happens before __blkdev_get() is called, so ->bd_contains
is not stable. So bd_may_claim() cannot safely use ->bd_contains.
It currently tries to use it, and this can lead to a BUG_ON().
This happens when a whole device is already open with a bd_holder (in
use by dm in my particular example) and two threads race to open a
partition of that device for the first time, one opening with O_EXCL and
one without.
The thread that doesn't use O_EXCL gets through blkdev_get() to
__blkdev_get(), gains the ->bd_mutex, and sets bdev->bd_contains = bdev;
Immediately thereafter the other thread, using FMODE_EXCL, calls
bd_start_claiming() from blkdev_get(). This should fail because the
whole device has a holder, but because bdev->bd_contains == bdev
bd_may_claim() incorrectly reports success.
This thread continues and blocks on bd_mutex.
The first thread then sets bdev->bd_contains correctly and drops the mutex.
The thread using FMODE_EXCL then continues and when it calls bd_may_claim()
again in:
BUG_ON(!bd_may_claim(bdev, whole, holder));
The BUG_ON fires.
Fix this by removing the dependency on ->bd_contains in
bd_may_claim(). As bd_may_claim() has direct access to the whole
device, it can simply test if the target bdev is the whole device.
Fixes:
6b4517a7913a ("block: implement bd_claiming and claiming block")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Robbie Ko [Fri, 7 Oct 2016 09:30:47 +0000 (17:30 +0800)]
Btrfs: fix tree search logic when replaying directory entry deletes
commit
2a7bf53f577e49c43de4ffa7776056de26db65d9 upstream.
If a log tree has a layout like the following:
leaf N:
...
item 240 key (282 DIR_LOG_ITEM 0) itemoff 8189 itemsize 8
dir log end
1275809046
leaf N + 1:
item 0 key (282 DIR_LOG_ITEM
3936149215) itemoff 16275 itemsize 8
dir log end
18446744073709551615
...
When we pass the value
1275809046 + 1 as the parameter start_ret to the
function tree-log.c:find_dir_range() (done by replay_dir_deletes()), we
end up with path->slots[0] having the value 239 (points to the last item
of leaf N, item 240). Because the dir log item in that position has an
offset value smaller than *start_ret (
1275809046 + 1) we need to move on
to the next leaf, however the logic for that is wrong since it compares
the current slot to the number of items in the leaf, which is smaller
and therefore we don't lookup for the next leaf but instead we set the
slot to point to an item that does not exist, at slot 240, and we later
operate on that slot which has unexpected content or in the worst case
can result in an invalid memory access (accessing beyond the last page
of leaf N's extent buffer).
So fix the logic that checks when we need to lookup at the next leaf
by first incrementing the slot and only after to check if that slot
is beyond the last item of the current leaf.
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Fixes:
e02119d5a7b4 (Btrfs: Add a write ahead tree log to optimize synchronous operations)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
[Modified changelog for clarity and correctness]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Jan Kara [Sun, 24 Apr 2016 04:56:03 +0000 (00:56 -0400)]
ext4: fix data exposure after a crash
commit
06bd3c36a733ac27962fea7d6f47168841376824 upstream.
Huang has reported that in his powerfail testing he is seeing stale
block contents in some of recently allocated blocks although he mounts
ext4 in data=ordered mode. After some investigation I have found out
that indeed when delayed allocation is used, we don't add inode to
transaction's list of inodes needing flushing before commit. Originally
we were doing that but commit
f3b59291a69d removed the logic with a
flawed argument that it is not needed.
The problem is that although for delayed allocated blocks we write their
contents immediately after allocating them, there is no guarantee that
the IO scheduler or device doesn't reorder things and thus transaction
allocating blocks and attaching them to inode can reach stable storage
before actual block contents. Actually whenever we attach freshly
allocated blocks to inode using a written extent, we should add inode to
transaction's ordered inode list to make sure we properly wait for block
contents to be written before committing the transaction. So that is
what we do in this patch. This also handles other cases where stale data
exposure was possible - like filling hole via mmap in
data=ordered,nodelalloc mode.
The only exception to the above rule are extending direct IO writes where
blkdev_direct_IO() waits for IO to complete before increasing i_size and
thus stale data exposure is not possible. For now we don't complicate
the code with optimizing this special case since the overhead is pretty
low. In case this is observed to be a performance problem we can always
handle it using a special flag to ext4_map_blocks().
Fixes:
f3b59291a69d0b734be1fc8be489fef2dd846d3d
Reported-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
Tested-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Corey Edwards <ensabahnur16@gmail.com>
Alex Thorlton [Sat, 1 Apr 2017 15:40:11 +0000 (15:40 +0000)]
exec: kill the unnecessary mm->def_flags setting in load_elf_binary()
load_elf_binary() sets current->mm->def_flags = def_flags and def_flags
is always zero. Not only this looks strange, this is unnecessary
because mm_init() has already set ->def_flags = 0.
Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Used commit: https://github.com/hizukiayaka/linux-kernel/commit/
ab0e113f6bee71a3933755d2c9ae41fcee631800
Andy Shevchenko [Sat, 3 Sep 2016 22:43:16 +0000 (22:43 +0000)]
lib/uuid.c: move generate_random_uuid() to uuid.c
Let's gather the UUID related functions under one hood.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Chan <jc@linux.com>
Original commit: https://github.com/jesec/android_kernel_samsung_universal8890/commit/
518e3b0e09fc2ce274df5fa937ab40ac3c4c1d3d#diff-8a7983688893b4a00f280f89e6e6c559R250
Jin Qian [Thu, 2 Mar 2017 21:32:59 +0000 (13:32 -0800)]
ANDROID: sched: add a counter to track fsync
Change-Id: I6c138de5b2332eea70f57e098134d1d141247b3f
Signed-off-by: Jin Qian <jinqian@google.com>
Jan Kara [Fri, 19 Dec 2014 11:21:47 +0000 (12:21 +0100)]
BACKPORT: udf: Verify symlink size before loading it
UDF specification allows arbitrarily large symlinks. However we support
only symlinks at most one block large. Check the length of the symlink
so that we don't access memory beyond end of the symlink block.
Upstream commit:
a1d47b262952a45aae62bd49cfaf33dd76c11a2c
CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde <chlunde@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Thu, 18 Dec 2014 21:37:50 +0000 (22:37 +0100)]
BACKPORT: udf: Check path length when reading symlink
Symlink reading code does not check whether the resulting path fits into
the page provided by the generic code. This isn't as easy as just
checking the symlink size because of various encoding conversions we
perform on path. So we have to check whether there is still enough space
in the buffer on the fly.
Upstream commit:
0e5cc9a40ada6046e6bc3bdfcd0c0d7e4b706b14
CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Miklos Szeredi [Sun, 21 May 2017 15:34:52 +0000 (15:34 +0000)]
BACKPORT: fs: limit filesystem stacking depth
Add a simple read-only counter to super_block that indicates how deep this
is in the stack of filesystems. Previously ecryptfs was the only stackable
filesystem and it explicitly disallowed multiple layers of itself.
Overlayfs, however, can be stacked recursively and also may be stacked
on top of ecryptfs or vice versa.
To limit the kernel stack usage we must limit the depth of the
filesystem stack. Initially the limit is set to 2.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
(cherry picked from commit
69c433ed2ecd2d3264efd7afec4439524b319121)
Bug:
32761463
Change-Id: I69b2fba2112db2ece09a1bf61a44f8fc4db00820
Used commit: https://github.com/LineageOS/android_kernel_oppo_msm8939/commit/
e5be4ea6bbd1c010ff25b893ad68fae6f07cff84
alexax66 [Sun, 21 May 2017 15:27:29 +0000 (15:27 +0000)]
ecryptfs: Merge tag 'LA.BR.1.2.9_rb1.10' of https://source.codeaurora.org/quic/la/kernel/msm-3.10 into cm-14.1
alexax66 [Sun, 21 May 2017 15:22:40 +0000 (15:22 +0000)]
fs: remove Samusung's ecryptfs
Al Viro [Tue, 16 May 2017 17:20:09 +0000 (17:20 +0000)]
move d_rcu from overlapping d_child to overlapping d_alias
commit
946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ben Hutchings <ben@decadent.org.uk>
[hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
- Apply name changes in all the different places we use d_alias and d_child
- Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Used commit: https://github.com/LineageOS/android_kernel_oppo_msm8939/commit/
6637ecd306a94a03dd5b8e4e8d3f260d9877c5b0
Amit Pundir [Mon, 15 May 2017 21:27:35 +0000 (21:27 +0000)]
fs: ecryptfs: readdir: constify actor
actor is a constant in dir_context struct and
because of that we run into following build failure:
----------
fs/ecryptfs/file.c: In function ‘ecryptfs_readdir’:
fs/ecryptfs/file.c:130:2: error: assignment of read-only member ‘actor’
make[2]: *** [fs/ecryptfs/file.o] Error 1
make[1]: *** [fs/ecryptfs] Error 2
make: *** [fs] Error 2
----------
This fix is based on commit:
b2497fc([readdir] constify ->actor)
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
vlw [Sat, 14 Jan 2017 11:32:36 +0000 (14:32 +0300)]
fs: ecryptfs: constify actor fix
Change-Id: I2eb27ff4835fa8a34ce40ca25176abf1fbb732f6
Daniel Rosenberg [Thu, 5 Jan 2017 22:37:11 +0000 (14:37 -0800)]
ANDROID: mnt: remount should propagate to slaves of slaves
propagate_remount was not accounting for the slave mounts
of other slave mounts, leading to some namespaces not
recieving the remount information.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
33731928
Change-Id: Idc9e8c2ed126a4143229fc23f10a959c2d0a3854
Johannes Thumshirn [Thu, 8 Dec 2016 13:04:57 +0000 (14:04 +0100)]
splice: introduce FMODE_SPLICE_READ and FMODE_SPLICE_WRITE
Introduce FMODE_SPLICE_READ and FMODE_SPLICE_WRITE. These modes check
whether it is legal to read or write a file using splice. Both get
automatically set on regular files and are not checked when a 'struct
fileoperations' includes the splice_{read,write} methods.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Daniel Rosenberg [Fri, 22 Apr 2016 07:00:48 +0000 (00:00 -0700)]
fuse: Add support for d_canonical_path
Allows FUSE to report to inotify that it is acting
as a layered filesystem. The userspace component
returns a string representing the location of the
underlying file. If the string cannot be resolved
into a path, the top level path is returned instead.
bug:
23904372
Change-Id: Iabdca0bbedfbff59e9c820c58636a68ef9683d9f
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Daniel Rosenberg [Fri, 22 Apr 2016 07:00:14 +0000 (00:00 -0700)]
vfs: change d_canonical_path to take two paths
bug:
23904372
Change-Id: I4a686d64b6de37decf60019be1718e1d820193e6
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Daniel Rosenberg [Wed, 23 Mar 2016 19:09:25 +0000 (12:09 -0700)]
inotify: Fix erroneous update of bit count
Patch "vfs: add d_canonical_path for stacked filesystem support"
erroneously updated the ALL_INOTIFY_BITS count. This changes it back
Change-Id: Idb04edc736da276159d30f04c40cff9d6b1e070f
Daniel Rosenberg [Fri, 12 Feb 2016 00:44:15 +0000 (16:44 -0800)]
vfs: add d_canonical_path for stacked filesystem support
Inotify does not currently know when a filesystem
is acting as a wrapper around another fs. This means
that inotify watchers will miss any modifications to
the base file, as well as any made in a separate
stacked fs that points to the same file.
d_canonical_path solves this problem by allowing the fs
to map a dentry to a path in the lower fs. Inotify
can use it to find the appropriate place to watch to
be informed of all changes to a file.
Change-Id: I09563baffad1711a045e45c1bd0bd8713c2cc0b6
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Al Viro [Sun, 14 May 2017 11:48:51 +0000 (11:48 +0000)]
constify ->actor
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 14 May 2017 09:17:29 +0000 (09:17 +0000)]
introduce ->iterate(), ctx->pos, dir_emit()
New method - ->iterate(file, ctx). That's the replacement for ->readdir();
it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not*
update file->f_pos (or look at it, for that matter); iterate_dir() does the
update.
Note that dir_emit() takes the offset from ctx->pos (and eventually
filldir_t will lose that argument).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 14 May 2017 09:05:23 +0000 (09:05 +0000)]
introduce iterate_dir() and dir_context
iterate_dir(): new helper, replacing vfs_readdir().
struct dir_context: contains the readdir callback (and will get more stuff
in it), embedded into whatever data that callback wants to deal with;
eventually, we'll be passing it to ->readdir() replacement instead of
(data,filldir) pair.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mark Salyzyn [Mon, 23 Jan 2017 20:56:41 +0000 (12:56 -0800)]
ANDROID: fix acl leaks
Fixes regressions associated with commit
073931017b49d9458aa351605b43a7e34598caef
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug:
32458736
Change-Id: I6ee127dfdf3594d24ccd8560541ac554c5b05eb6
Cong Wang [Tue, 13 Dec 2016 18:33:34 +0000 (10:33 -0800)]
FROMLIST: 9p: fix a potential acl leak
(https://lkml.org/lkml/2016/12/13/579)
posix_acl_update_mode() could possibly clear 'acl', if so
we leak the memory pointed by 'acl'. Save this pointer
before calling posix_acl_update_mode() and release the memory
if 'acl' really gets cleared.
Reported-by: Mark Salyzyn <salyzyn@android.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kurz <groug@kaod.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Bug:
32458736
Change-Id: Ia78da401e6fd1bfd569653bd2cd0ebd3f9c737a0