git.stricted.de - GitHub/LineageOS/android_kernel_samsung

ANDROID: sdcardfs: Replace get/put with d_lock

dput cannot be called with a spin_lock. Instead,
we protect our accesses by holding the d_lock.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35643557
Change-Id: I22cf30856d75b5616cbb0c223724f5ab866b5114

ANDROID: sdcardfs: rate limit warning print

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35848445
Change-Id: Ida72ea0ece191b2ae4a8babae096b2451eb563f6

ANDROID: sdcardfs: Fix case insensitive lookup

The previous case insensitive lookup relied on the
entry being present in the dcache. This instead uses
iterate_dir to find the correct case.

Signed-off-by: Daniel Rosenberg <drosen@google.com
bug: 35633782
Change-Id: I556f7090773468c1943c89a5e2aa07f746ba49c5

Revert "sdcardfs: limit stacking depth"

This reverts commit 1d9ff6b31bd797409085ddeb606d8654979be7dc.

Revert "sdcardfs: Fix build for linux-3.10.y"

This reverts commit 86cb714606711a06109eb7c973dc585f993202c9.

sdcardfs: Fix build for linux-3.10.y

* Fix compilation after 6637ecd

sdcardfs: limit stacking depth

Limit filesystem stacking to prevent stack overflow.

Bug: 32761463
Change-Id: I8b1462b9c0d6c7f00cf110724ffb17e7f307c51e
Signed-off-by: Andrew Chant <achant@google.com>

ANDROID: sdcardfs: Don't bother deleting freelist

There is no point deleting entries from dlist, as
that is a temporary list on the stack from which
contains only entries that are being deleted.

Not all code paths set up dlist, so those that
don't were performing invalid accesses in
hash_del_rcu. As an additional means to prevent
any other issue, we null out the list entries when
we allocate from the cache.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35666680
Change-Id: Ibb1e28c08c3a600c29418d39ba1c0f3db3bf31e5

ANDROID: sdcardfs: Add missing path_put

"ANDROID: sdcardfs: Add GID Derivation to sdcardfs" introduced
an unbalanced pat_get, leading to storage space not being freed
after deleting a file until rebooting. This adds the missing path_put.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34691169
Change-Id: Ia7ef97ec2eca2c555cc06b235715635afc87940e

ANDROID: sdcardfs: Fix incorrect hash

This adds back the hash calculation removed as part of
the previous patch, as it is in fact necessary.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35307857
Change-Id: Ie607332bcf2c5d2efdf924e4060ef3f576bf25dc

ANDROID: sdcardfs: Switch strcasecmp for internal call

This moves our uses of strcasecmp over to an internal call so we can
easily change implementations later if we so desire. Additionally,
we leverage qstr's where appropriate to save time on comparisons.

Change-Id: I32fdc4fd0cd3b7b735dcfd82f60a2516fd8272a5
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: switch to full_name_hash and qstr

Use the kernel's string hash function instead of rolling
our own. Additionally, save a bit of calculation by using
the qstr struct in place of strings.

Change-Id: I0bbeb5ec2a9233f40135ad632e6f22c30ffa95c1
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: Add GID Derivation to sdcardfs

This changes sdcardfs to modify the user and group in the
underlying filesystem depending on its usage. Ownership is
set by Android user, and package, as well as if the file is
under obb or cache. Other files can be labeled by extension.
Those values are set via the configfs interace.

To add an entry,
mkdir -p [configfs root]/sdcardfs/extensions/[gid]/[ext]

Bug: 34262585
Change-Id: I4e030ce84f094a678376349b1a96923e5076a0f4
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: Remove redundant operation

We call get_derived_permission_new unconditionally, so we don't need
to call update_derived_permission_lock, which does the same thing.

Change-Id: I0748100828c6af806da807241a33bf42be614935
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: add support for user permission isolation

This allows you to hide the existence of a package from
a user by adding them to an exclude list. If a user
creates that package's folder and is on the exclude list,
they will not see that package's id.

Bug: 34542611
Change-Id: I9eb82e0bf2457d7eb81ee56153b9c7d2f6646323
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: Refactor configfs interface

This refactors the configfs code to be more easily extended.
It will allow additional files to be added easily.

Bug: 34542611
Bug: 34262585
Change-Id: I73c9b0ae5ca7eb27f4ebef3e6807f088b512d539
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: Allow non-owners to touch

This modifies the permission checks in setattr to
allow for non-owners to modify the timestamp of
files to things other than the current time.
This still requires write access, as enforced by
the permission call, but relaxes the requirement
that the caller must be the owner, allowing those
with group permissions to change it as well.

Bug: 11118565
Change-Id: Ied31f0cce2797675c7ef179eeb4e088185adcbad
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: implement vm_ops->page_mkwrite

This comes from the wrapfs patch
3dfec0ffe5e2 Wrapfs: implement vm_ops->page_mkwrite

Some file systems (e.g., ext4) require it. Reported by Ted Ts'o.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34133558
Change-Id: I1a389b2422c654a6d3046bb8ec3e20511aebfa8e

ANDROID: sdcardfs: support direct-IO (DIO) operations

This comes from the wrapfs patch
2e346c83b26e Wrapfs: support direct-IO (DIO) operations

Signed-off-by: Li Mengyang <li.mengyang@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34133558
Change-Id: I3fd779c510ab70d56b1d918f99c20421b524cdc4

sdcardfs: Flag files as non-mappable

WARNING: it's not full commit, it just update sdcardfs

Implement Samsung's FMODE_NONMAPPABLE flag from
sdcardfs version 2.1.4 as we hit a BUG on ext4:

[   49.655037]@0 Kernel BUG at ffffffc0001deeec [verbose debug info unavailable]
[   49.655045]@0 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   49.655052]@0 Modules linked in:
[   49.655061]@0 CPU: 0 PID: 283 Comm: kworker/u8:7 Tainted: G        W      3.18.20-perf-g3be2054-00086-ga8307fb #1
[   49.655070]@0 Hardware name: Qualcomm Technologies, Inc. MSM 8996 v3 + PMI8996 MTP (DT)
[   49.655077]@0 Workqueue: writeback bdi_writeback_workfn (flush-8:0)
[   49.655096]@0 task: ffffffc174ba8b00 ti: ffffffc174bb4000 task.ti: ffffffc174bb4000
[   49.655108]@0 PC is at mpage_prepare_extent_to_map+0x198/0x218
[   49.655116]@0 LR is at mpage_prepare_extent_to_map+0x110/0x218
[   49.655121]@0 pc : [<ffffffc0001deeec>] lr : [<ffffffc0001dee64>] pstate: 60000145
[   49.655126]@0 sp : ffffffc174bb7800
[   49.655130]@0 x29: ffffffc174bb7800 x28: ffffffc174bb7880
[   49.655140]@0 x27: 000000000000000d x26: ffffffc1245505e8
[   49.655149]@0 x25: 0000000000000000 x24: 0000000000003400
[   49.655160]@0 x23: ffffffffffffffff x22: 0000000000000000
[   49.655172]@0 x21: ffffffc174bb7888 x20: ffffffc174bb79e0
[   49.655182]@0 x19: ffffffbdc4ee7b80 x18: 0000007f92872000
[   49.655191]@0 x17: 0000007f959b6424 x16: ffffffc00016d1ac
[   49.655201]@0 x15: 0000007f9285d158 x14: ffffffc1734796e8
[   49.655210]@0 x13: ffffffbdc1ffa4c0 x12: ffffffbdc4ee7b80
[   49.655220]@0 x11: 0000000000000100 x10: 0000000000000000
[   49.655229]@0 x9 : 0000000000000000 x8 : ffffffc0b444e210
[   49.655237]@0 x7 : 0000000000000000 x6 : ffffffc0b444e1e0
[   49.655246]@0 x5 : 0000000000000000 x4 : 0000000000000001
[   49.655254]@0 x3 : 0000000000000000 x2 : 400000000002003d
[   49.655263]@0 x1 : ffffffbdc4ee7b80 x0 : 400000000002003d
[   49.655271]@0
[   49.656502]@0 Process kworker/u8:7 (pid: 283, stack limit = 0xffffffc174bb4058)
[   49.656509]@0 Call trace:
[   49.656514]@0 [<ffffffc0001deeec>] mpage_prepare_extent_to_map+0x198/0x218
[   49.656526]@0 [<ffffffc0001e28d0>] ext4_writepages+0x270/0xa58
[   49.656533]@0 [<ffffffc00012982c>] do_writepages+0x24/0x40
[   49.656541]@0 [<ffffffc000180160>] __writeback_single_inode+0x40/0x114
[   49.656549]@0 [<ffffffc000180e50>] writeback_sb_inodes+0x1dc/0x34c
[   49.656555]@0 [<ffffffc00018103c>] __writeback_inodes_wb+0x7c/0xc4
[   49.656560]@0 [<ffffffc000181224>] wb_writeback+0x110/0x1a8
[   49.656565]@0 [<ffffffc000181344>] wb_check_old_data_flush+0x88/0x98
[   49.656571]@0 [<ffffffc00018156c>] bdi_writeback_workfn+0xf4/0x1fc
[   49.656576]@0 [<ffffffc0000b14f8>] process_one_work+0x1e0/0x300
[   49.656585]@0 [<ffffffc0000b1e14>] worker_thread+0x318/0x438
[   49.656590]@0 [<ffffffc0000b5da0>] kthread+0xe0/0xec
[   49.656598]@0 Code: f9400260 f9400a63 1ad92063 37580040 (e7f001f2)
[   49.656604]@0 ---[ end trace cbed09f772fd630d ]---

Change-Id: I931da7cb3841db1f130dba298a7d256b6f02d1bc

ANDROID: sdcardfs: Fix locking issue with permision fix up

Don't use lookup_one_len so we can grab the spinlock that
protects d_subdirs.

Bug: 30954918
Change-Id: I0c6a393252db7beb467e0d563739a3a14e1b5115
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Use per mount permissions

This switches sdcardfs over to using permission2.
Instead of mounting several sdcardfs instances onto
the same underlaying directory, you bind mount a
single mount several times, and remount with the
options you want. These are stored in the private
mount data, allowing you to maintain the same tree,
but have different permissions for different mount
points.

Warning functions have been added for permission,
as it should never be called, and the correct
behavior is unclear.

Change-Id: I841b1d70ec60cf2b866fa48edeb74a0b0f8334f5
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Add gid and mask to private mount data

Adds support for mount2, remount2, and the functions
to allocate/clone/copy the private data

The next patch will switch over to actually using it.

Change-Id: I8a43da26021d33401f655f0b2784ead161c575e3
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: User new permission2 functions

Change-Id: Ic7e0fb8fdcebb31e657b079fe02ac834c4a50db9
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Move directory unlock before touch

This removes a deadlock under low memory conditions.
filp_open can call lookup_slow, which will attempt to
lock the parent.

Change-Id: I940643d0793f5051d1e79a56f4da2fa8ca3d8ff7
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: fix external storage exporting incorrect uid

Symptom: App cannot write into per-app folder
Root Cause: sdcardfs exports incorrect uid
Solution: fix uid
Project: All
Note:
Test done by RD: passed

sdcardfs: Added top to sdcardfs_inode_info

Adding packages to the package list and moving files
takes a large amount of locks, and is currently a
heavy operation. This adds a 'top' field to the
inode_info, which points to the inode for the top
most directory whose owner you would like to match.

On permission checks and get_attr, we look up the
owner based on the information at top. When we change
a package mapping, we need only modify the information
in the corresponding top inode_info's. When renaming,
we must ensure top is set correctly in all children.
This happens when an app specific folder gets moved
outside of the folder for that app.

Change-Id: Ib749c60b568e9a45a46f8ceed985c1338246ec6c
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Switch package list to RCU

Switched the package id hashmap to use RCU.

Change-Id: I9fdcab279009005bf28536247d11e13babab0b93
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Fix locking for permission fix up

Iterating over d_subdirs requires taking d_lock.
Removed several unneeded locks.

Change-Id: I5b1588e54c7e6ee19b756d6705171c7f829e2650
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Check for other cases on path lookup

This fixes a bug where the first lookup of a
file or folder created under a different view
would not be case insensitive. It will now
search through for a case insensitive match
if the initial lookup fails.

Bug:28024488
Change-Id: I4ff9ce297b9f2f9864b47540e740fd491c545229
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: override umask on mkdir and create

The mode on files created on the lower fs should
not be affected by the umask of the calling
task's fs_struct. Instead, we create a copy
and modify it as needed. This also lets us avoid
the string shenanigans around .nomedia files.

Bug: 27992761
Change-Id: Ia3a6e56c24c6e19b3b01c1827e46403bb71c2f4c
Signed-off-by: Daniel Rosenberg <drosen@google.com>

ANDROID: sdcardfs: fix itnull.cocci warnings

List_for_each_entry has the property that the first argument is always
bound to a real list element, never NULL, so testing dentry is not needed.

Generated by: scripts/coccinelle/iterators/itnull.cocci

Change-Id: I51033a2649eb39451862b35b6358fe5cfe25c5f5
Cc: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Guenter Roeck <groeck@chromium.org>

sdcardfs: Truncate packages_gid.list on overflow

packages_gid.list was improperly returning the wrong
count. Use scnprintf instead, and inform the user that
the list was truncated if it is.

Bug: 30013843
Change-Id: Ida2b2ef7cd86dd87300bfb4c2cdb6bfe2ee1650d
Signed-off-by: Daniel Rosenberg <drosen@google.com>

vfs: change d_canonical_path to take two paths

bug: 23904372
Change-Id: I4a686d64b6de37decf60019be1718e1d820193e6
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: remove unneeded __init and __exit

Change-Id: I2a2d45d52f891332174c3000e8681c5167c1564f

sdcardfs: Remove unused code

Change-Id: Ie97cba27ce44818ac56cfe40954f164ad44eccf6

sdcardfs: remove effectless config option

CONFIG_SDCARD_FS_CI_SEARCH only guards a define for
LOOKUP_CASE_INSENSITIVE, which is never used in the
kernel. Remove both, along with the option matching
that supports it.

Change-Id: I363a8f31de8ee7a7a934d75300cc9ba8176e2edf
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Add support for d_canonicalize

Change-Id: I5d6f0e71b8ca99aec4b0894412f1dfd1cfe12add
Signed-off-by: Daniel Rosenberg <drosen@google.com>

sdcardfs: Bring up to date with Android M permissions:

In M, the workings of sdcardfs were changed significantly.
This brings sdcardfs into line with the changes.

Change-Id: I10e91a84a884c838feef7aa26c0a2b21f02e052e

sdcardfs: Changed type-cast in packagelist management

Change-Id: Ic8842de2d7274b7a5438938d2febf5d8da867148

sdcardfs: port to 3.10

Change-Id: I832a14cee3fcbf47ee6e5da2943a90f9dea5b60a

Initial port of sdcardfs

Change-Id: I5b5772a2bbff9f3a7dda641644630a7b8afacec0

Conflicts:
include/linux/namei.h

fixed-by: vlw <vlwwwwww@gmail.com>

fs: remove Samsung's sdcardfs

fs: sdfat: Fix compilation for 32-bit targets

Change-Id: I9a9f3e253001bfbb3a209bd16d2741c95c99f46b

fs: sdfat: Update to version 2.0.6-lineage

* Samsung version G960USQU1ARBG with -lineage patches applied

Change-Id: I0432751926085aa249b377a418728854618929e5
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Add -lineage extraversion to SDFAT_VERSION

* It has diverged enough to add this to differentiate it

Change-Id: I5e43ee01c785acbc5292c6c115a4e083eeeb36a6
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Add config option to register sdFAT for VFAT

Change-Id: I72ba7a14b56175535884390e8601960b5d8ed1cf
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Add config option to register sdFAT for exFAT

Change-Id: Id57abf0a4bd0b433fecc622eecb383cd4ea29d17
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Allow disabling sdfat

Change-Id: If508804ba4d3536a98c70eb871771d26b628ad50
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Fix compilation without debugging

* And make WARNON debugging optional

Change-Id: Id59e908c8a60ded1238d3fd010f3d96cdde95f40
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: exfat: Allow disabling exfat

Change-Id: If8e8f3a0b7962617ac806c6fb64dbf463b906f59
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: sdfat: Fix compilation on Linux 3.4

Change-Id: I3a500f03f399abc9af9586e80419d75aca5b4320
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

fs: Add sdfat

* Samsung package version: G950FXXU1CRAP (yes, really)

Change-Id: Id866574b34d4434fd4955fac154c9684210abebd
Signed-off-by: Paul Keith <javelinanddart@gmail.com>

remove samsung sdfat

V4L/DVB (13571): v4l: Adding Digital Video Timings APIs

This adds the above APIs to the v4l2 core. This is based on version v1.2
of the RFC titled "V4L - Support for video timings at the input/output interface"
Following new ioctls are added:-

        - VIDIOC_ENUM_DV_PRESETS
        - VIDIOC_S_DV_PRESET
        - VIDIOC_G_DV_PRESET
        - VIDIOC_QUERY_DV_PRESET
        - VIDIOC_S_DV_TIMINGS
        - VIDIOC_G_DV_TIMINGS

Please refer to the RFC for the details. This code was tested using vpfe
capture driver on TI's DM365. Following is the test configuration used :-

Blu-Ray HD DVD source -> TVP7002 -> DM365 (VPFE) ->DDR

A draft version of the TVP7002 driver (currently being reviewed in the mailing
list) was used that supports V4L2_DV_1080I60 & V4L2_DV_720P60 presets.

A loopback video capture application was used for testing these APIs. This calls
following IOCTLS :-

-  verify the new v4l2_input capabilities flag added
-  Enumerate available presets using VIDIOC_ENUM_DV_PRESETS
-  Set one of the supported preset using VIDIOC_S_DV_PRESET
-  Get current preset using VIDIOC_G_DV_PRESET
-  Detect current preset using VIDIOC_QUERY_DV_PRESET
-  Using stub functions in tvp7002, verify VIDIOC_S_DV_TIMINGS
    and VIDIOC_G_DV_TIMINGS ioctls are received at the sub device.
-  Tested on 64bit platform by Hans Verkuil

Change-Id: Idf1492474fb5cbd2c0608f01437fcde6f5ed86fa
Signed-off-by: Muralidharan Karicheri <m-karicheri2@ti.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

drivers: gpu: arm: t72x: r7p0: fix "platform_name" assignment

silence firmware warnings

fs: Fix the compile w/o ROOT Restriction enabled

Change-Id: I2cb687617eb1228ad7f62da25449afeb11cdf701

sysrq: Emergency Remount R/O in reverse order

This change fixes a problem where reboot on Android panics the kernel
almost every time when file systems are mounted over loop devices.

Android reboot command does:
- sync
- echo u > /proc/sysrq-trigger
- syscall_reboot

The problem is with sysrq emergency remount R/O trying to remount-ro
in wrong order.
since /data is re-mounted ro before loop devices, loop device
remount-ro fails to flush the journal and panics the kernel:

  EXT4-fs (loop0): Remounting filesystem read-only
  EXT4-fs (loop0): previous I/O error to superblock detected
  loop: Write error at byte offset 0, length 4096.
  Buffer I/O error on device loop0, logical block 0
  lost page write due to I/O error on loop0
  Kernel panic - not syncing: EXT4-fs panic from previous error

The fix is quite simple. In do_emergency_remount(), use
list_for_each_entry_reverse() on sb list instead of list_for_each_entry().
It makes a lot of sense to umount the file systems in reverse order in
which they were added to sb list.

Change-Id: I4370e39b5873bd16ade5d5f9ddb2704beb02a2bb
Signed-off-by: Amir Goldstein <amir@cellrox.com>
Acked-by: Oren Laadan <orenl@cellrox.com>
Signed-off-by: Apavayan Sinha <info@apavayan.com>

arm64: vdso: Define sigtramp offset if needed

Something breaks vdso offset generation currently, this needs to be
fixed properly.

Change-Id: I3f56a78eb83e61e97c4ffa11a7e865c326084c0a
Signed-off-by: Apavayan Sinha <info@apavayan.com>

Kbuild: Add missing videodev2 exynos headers.

Signed-off-by: Apavayan Sinha <info@apavayan.com>

drivers: media: fix compiling

Signed-off-by: Apavayan Sinha <info@apavayan.com>

drivers: muic: fix the compile

Signed-off-by: Apavayan Sinha <info@apavayan.com>

security: Fix tima compile

Signed-off-by: Apavayan Sinha <info@apavayan.com>

import SM-A510F_EUR_NN_Opensource.zip

A510FXXU4CQH2

Linux 3.10.61

mm: memcg: handle non-error OOM situations more gracefully

commit 4942642080ea82d99ab5b653abb9a12b7ba31f4a upstream.

Commit 3812c8c8f395 ("mm: memcg: do not trap chargers with full
callstack on OOM") assumed that only a few places that can trigger a
memcg OOM situation do not return VM_FAULT_OOM, like optional page cache
readahead.  But there are many more and it's impractical to annotate
them all.

First of all, we don't want to invoke the OOM killer when the failed
allocation is gracefully handled, so defer the actual kill to the end of
the fault handling as well.  This simplifies the code quite a bit for
added bonus.

Second, since a failed allocation might not be the abrupt end of the
fault, the memcg OOM handler needs to be re-entrant until the fault
finishes for subsequent allocation attempts.  If an allocation is
attempted after the task already OOMed, allow it to bypass the limit so
that it can quickly finish the fault and invoke the OOM killer.

Reported-by: azurIt <azurit@pobox.sk>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: memcg: do not trap chargers with full callstack on OOM

commit 3812c8c8f3953921ef18544110dafc3505c1ac62 upstream.

The memcg OOM handling is incredibly fragile and can deadlock.  When a
task fails to charge memory, it invokes the OOM killer and loops right
there in the charge code until it succeeds.  Comparably, any other task
that enters the charge path at this point will go to a waitqueue right
then and there and sleep until the OOM situation is resolved.  The problem
is that these tasks may hold filesystem locks and the mmap_sem; locks that
the selected OOM victim may need to exit.

For example, in one reported case, the task invoking the OOM killer was
about to charge a page cache page during a write(), which holds the
i_mutex.  The OOM killer selected a task that was just entering truncate()
and trying to acquire the i_mutex:

OOM invoking task:
  mem_cgroup_handle_oom+0x241/0x3b0
  mem_cgroup_cache_charge+0xbe/0xe0
  add_to_page_cache_locked+0x4c/0x140
  add_to_page_cache_lru+0x22/0x50
  grab_cache_page_write_begin+0x8b/0xe0
  ext3_write_begin+0x88/0x270
  generic_file_buffered_write+0x116/0x290
  __generic_file_aio_write+0x27c/0x480
  generic_file_aio_write+0x76/0xf0           # takes ->i_mutex
  do_sync_write+0xea/0x130
  vfs_write+0xf3/0x1f0
  sys_write+0x51/0x90
  system_call_fastpath+0x18/0x1d

OOM kill victim:
  do_truncate+0x58/0xa0              # takes i_mutex
  do_last+0x250/0xa30
  path_openat+0xd7/0x440
  do_filp_open+0x49/0xa0
  do_sys_open+0x106/0x240
  sys_open+0x20/0x30
  system_call_fastpath+0x18/0x1d

The OOM handling task will retry the charge indefinitely while the OOM
killed task is not releasing any resources.

A similar scenario can happen when the kernel OOM killer for a memcg is
disabled and a userspace task is in charge of resolving OOM situations.
In this case, ALL tasks that enter the OOM path will be made to sleep on
the OOM waitqueue and wait for userspace to free resources or increase
the group's limit.  But a userspace OOM handler is prone to deadlock
itself on the locks held by the waiting tasks.  For example one of the
sleeping tasks may be stuck in a brk() call with the mmap_sem held for
writing but the userspace handler, in order to pick an optimal victim,
may need to read files from /proc/<pid>, which tries to acquire the same
mmap_sem for reading and deadlocks.

This patch changes the way tasks behave after detecting a memcg OOM and
makes sure nobody loops or sleeps with locks held:

1. When OOMing in a user fault, invoke the OOM killer and restart the
   fault instead of looping on the charge attempt.  This way, the OOM
   victim can not get stuck on locks the looping task may hold.

2. When OOMing in a user fault but somebody else is handling it
   (either the kernel OOM killer or a userspace handler), don't go to
   sleep in the charge context.  Instead, remember the OOMing memcg in
   the task struct and then fully unwind the page fault stack with
   -ENOMEM.  pagefault_out_of_memory() will then call back into the
   memcg code to check if the -ENOMEM came from the memcg, and then
   either put the task to sleep on the memcg's OOM waitqueue or just
   restart the fault.  The OOM victim can no longer get stuck on any
   lock a sleeping task may hold.

Debugged by Michal Hocko.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: azurIt <azurit@pobox.sk>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: memcg: rework and document OOM waiting and wakeup

commit fb2a6fc56be66c169f8b80e07ed999ba453a2db2 upstream.

The memcg OOM handler open-codes a sleeping lock for OOM serialization
(trylock, wait, repeat) because the required locking is so specific to
memcg hierarchies.  However, it would be nice if this construct would be
clearly recognizable and not be as obfuscated as it is right now.  Clean
up as follows:

1. Remove the return value of mem_cgroup_oom_unlock()

2. Rename mem_cgroup_oom_lock() to mem_cgroup_oom_trylock().

3. Pull the prepare_to_wait() out of the memcg_oom_lock scope.  This
   makes it more obvious that the task has to be on the waitqueue
   before attempting to OOM-trylock the hierarchy, to not miss any
   wakeups before going to sleep.  It just didn't matter until now
   because it was all lumped together into the global memcg_oom_lock
   spinlock section.

4. Pull the mem_cgroup_oom_notify() out of the memcg_oom_lock scope.
   It is proctected by the hierarchical OOM-lock.

5. The memcg_oom_lock spinlock is only required to propagate the OOM
   lock in any given hierarchy atomically.  Restrict its scope to
   mem_cgroup_oom_(trylock|unlock).

6. Do not wake up the waitqueue unconditionally at the end of the
   function.  Only the lockholder has to wake up the next in line
   after releasing the lock.

   Note that the lockholder kicks off the OOM-killer, which in turn
   leads to wakeups from the uncharges of the exiting task.  But a
   contender is not guaranteed to see them if it enters the OOM path
   after the OOM kills but before the lockholder releases the lock.
   Thus there has to be an explicit wakeup after releasing the lock.

7. Put the OOM task on the waitqueue before marking the hierarchy as
   under OOM as that is the point where we start to receive wakeups.
   No point in listening before being on the waitqueue.

8. Likewise, unmark the hierarchy before finishing the sleep, for
   symmetry.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: memcg: enable memcg OOM killer only for user faults

commit 519e52473ebe9db5cdef44670d5a97f1fd53d721 upstream.

System calls and kernel faults (uaccess, gup) can handle an out of memory
situation gracefully and just return -ENOMEM.

Enable the memcg OOM killer only for user faults, where it's really the
only option available.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86: finish user fault error path with fatal signal

commit 3a13c4d761b4b979ba8767f42345fed3274991b0 upstream.

The x86 fault handler bails in the middle of error handling when the
task has a fatal signal pending. For a subsequent patch this is a
problem in OOM situations because it relies on pagefault_out_of_memory()
being called even when the task has been killed, to perform proper
per-task OOM state unwinding.

Shortcutting the fault like this is a rather minor optimization that
saves a few instructions in rare cases. Just remove it for
user-triggered faults.

Use the opportunity to split the fault retry handling from actual fault
errors and add locking documentation that reads suprisingly similar to
ARM's.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arch: mm: pass userspace fault flag to generic fault handler

commit 759496ba6407c6994d6a5ce3a5e74937d7816208 upstream.

Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.

Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arch: mm: do not invoke OOM killer on kernel fault OOM

commit 871341023c771ad233620b7a1fb3d9c7031c4e5c upstream.

Kernel faults are expected to handle OOM conditions gracefully (gup,
uaccess etc.), so they should never invoke the OOM killer. Reserve this
for faults triggered in user context when it is the only option.

Most architectures already do this, fix up the remaining few.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arch: mm: remove obsolete init OOM protection

commit 94bce453c78996cc4373d5da6cfabe07fcc6d9f9 upstream.

The memcg code can trap tasks in the context of the failing allocation
until an OOM situation is resolved. They can hold all kinds of locks
(fs, mm) at this point, which makes it prone to deadlocking.

This series converts memcg OOM handling into a two step process that is
started in the charge context, but any waiting is done after the fault
stack is fully unwound.

Patches 1-4 prepare architecture handlers to support the new memcg
requirements, but in doing so they also remove old cruft and unify
out-of-memory behavior across architectures.

Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
faults, because they can gracefully unwind the stack with -ENOMEM. OOM
handling is restricted to user triggered faults that have no other
option.

Patch 6 reworks memcg's hierarchical OOM locking to make it a little
more obvious wth is going on in there: reduce locked regions, rename
locking functions, reorder and document.

Patch 7 implements the two-part OOM handling such that tasks are never
trapped with the full charge stack in an OOM situation.

This patch:

Back before smart OOM killing, when faulting tasks were killed directly on
allocation failures, the arch-specific fault handlers needed special
protection for the init process.

Now that all fault handlers call into the generic OOM killer (see commit
609838cfed97: "mm: invoke oom-killer from remaining unconverted page
fault handlers"), which already provides init protection, the
arch-specific leftovers can be removed.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: invoke oom-killer from remaining unconverted page fault handlers

commit 609838cfed972d49a65aac7923a9ff5cbe482e30 upstream.

A few remaining architectures directly kill the page faulting task in an
out of memory situation. This is usually not a good idea since that
task might not even use a significant amount of memory and so may not be
the optimal victim to resolve the situation.

Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
is a hook that architecture page fault handlers are supposed to call to
invoke the OOM killer and let it pick the right task to kill. Convert
the remaining architectures over to this hook.

To have the previous behavior of simply taking out the faulting task the
vm.oom_kill_allocating_task sysctl can be set to 1.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks

commit 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 upstream.

Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
special crafted ASCONF chunk, even up to pre 2.6.12 kernels:

skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
end:0x440 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
<IRQ>
[<ffffffff8144fb1c>] skb_put+0x5c/0x70
[<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
[<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
[<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
[<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
[<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
[<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
[<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
[<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
[<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
[<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
[<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
[<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
[<ffffffff81497078>] ip_local_deliver+0x98/0xa0
[<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
[<ffffffff81496ac5>] ip_rcv+0x275/0x350
[<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
[<ffffffff81460588>] netif_receive_skb+0x58/0x60

This can be triggered e.g., through a simple scripted nmap
connection scan injecting the chunk after the handshake, for
example, ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ------------------ ASCONF; UNKNOWN ------------------>

... where ASCONF chunk of length 280 contains 2 parameters ...

  1) Add IP address parameter (param length: 16)
  2) Add/del IP address parameter (param length: 255)

... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is even missing, too.
This is just an example and similarly-crafted ASCONF chunks
could be used just as well.

The ASCONF chunk passes through sctp_verify_asconf() as all
parameters passed sanity checks, and after walking, we ended
up successfully at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.

In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param() e.g., due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk, and thus we need to add all TLVs after the
failure to our ASCONF response to remote via helper function
sctp_add_asconf_response(), which basically invokes a
sctp_addto_chunk() adding the error parameters to the given
skb.

When walking to the next parameter this time, we proceed
with ...

  length = ntohs(asconf_param->param_hdr.length);
  asconf_param = (void *)asconf_param + length;

... instead of the WORD_ROUND()'ed length, thus resulting here
in an off-by-one that leads to reading the follow-up garbage
parameter length of 12336, and thus throwing an skb_over_panic
for the reply when trying to sctp_addto_chunk() next time,
which implicitly calls the skb_put() with that length.

Fix it by using sctp_walk_params() [ which is also used in
INIT parameter processing ] macro in the verification *and*
in ASCONF processing: it will make sure we don't spill over,
that we walk parameters WORD_ROUND()'ed. Moreover, we're being
more defensive and guard against unknown parameter types and
missized addresses.

Joint work with Vlad Yasevich.

Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: sctp: fix panic on duplicate ASCONF chunks

commit b69040d8e39f20d5215a03502a8e8b4c6ab78395 upstream.

When receiving a e.g. semi-good formed connection scan in the
form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---------------- ASCONF_a; ASCONF_b ----------------->

... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!

The problem is that good-formed ASCONF chunks that we reply with
ASCONF_ACK chunks are cached per serial. Thus, when we receive a
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process them again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk instead. So far, so good.

Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():

While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54b1
changed the precedence, so that as long as we get bundled, incoming
chunks we try possible bundling on outgoing queue as well. Before
this commit, we would just flush the output queue.

Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork, and thus flush, thus
crashing the kernel.

Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: sctp: fix remote memory pressure from excessive queueing

commit 26b87c7881006311828bb0ab271a551a62dcceb4 upstream.

This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
  [...]
  ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>

... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed
ASCONFs and have increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP does
only do verification on a chunk by chunk basis, as an SCTP
packet is nothing more than just a container of a stream of
chunks which it eats up one by one.

We could run into the case that we receive a packet with a
malformed tail, above marked as trailing JUNK. All previous
chunks are here goodformed, so the stack will eat up all
previous chunks up to this point. In case JUNK does not fit
into a chunk header and there are no more other chunks in
the input queue, or in case JUNK contains a garbage chunk
header, but the encoded chunk length would exceed the skb
tail, or we came here from an entirely different scenario
and the chunk has pdiscard=1 mark (without having had a flush
point), it will happen, that we will excessively queue up
the association's output queue (a correct final chunk may
then turn it into a response flood when flushing the
queue ;)): I ran a simple script with incremental ASCONF
serial numbers and could see the server side consuming
excessive amount of RAM [before/after: up to 2GB and more].

The issue at heart is that the chunk train basically ends
with !end_of_packet and !singleton markers and since commit
2e3216cd54b1 ("sctp: Follow security requirement of responding
with 1 packet") therefore preventing an output queue flush
point in sctp_do_sm() -> sctp_cmd_interpreter() on the input
chunk (chunk = event_arg) even though local_cork is set,
but its precedence has changed since then. In the normal
case, the last chunk with end_of_packet=1 would trigger the
queue flush to accommodate possible outgoing bundling.

In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So, above JUNK will
not enter the state machine and instead be released and exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point being missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.

One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: x86: Don't report guest userspace emulation error to userspace

commit a2b9e6c1a35afcc0973acb72e591c714e78885ff upstream.

Commit fc3a9157d314 ("KVM: X86: Don't report L2 emulation failures to
user-space") disabled the reporting of L2 (nested guest) emulation failures to
userspace due to race-condition between a vmexit and the instruction emulator.
The same rational applies also to userspace applications that are permitted by
the guest OS to access MMIO area or perform PIO.

This patch extends the current behavior - of injecting a #UD instead of
reporting it to userspace - also for guest userspace code.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

SCSI: hpsa: fix a race in cmd_free/scsi_done

commit 2cc5bfaf854463d9d1aa52091f60110fbf102a96 upstream.

When the driver calls scsi_done and after that frees it's internal
preallocated memory it can happen that a new job is enqueud before
the memory is freed. The allocation fails and the message
"cmd_alloc returned NULL" is shown.
Patch below fixes it by moving cmd->scsi_done after cmd_free.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Cc: Masoud Sharbiani <msharbiani@twitter.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx4_en: Fix BlueFlame race

commit 2d4b646613d6b12175b017aca18113945af1faf3 upstream.

Fix a race between BlueFlame flow and stamping in post send flow.
Example:
SW: Build WQE 0 on the TX buffer, except the ownership bit
SW: Set ownership for WQE 0 on the TX buffer
SW: Ring doorbell for WQE 0
SW: Build WQE 1 on the TX buffer, except the ownership bit
SW: Set ownership for WQE 1 on the TX buffer
HW: Read WQE 0 and then WQE 1, before doorbell was rung/BF was done for WQE 1
HW: Produce CQEs for WQE 0 and WQE 1
SW: Process the CQEs, and stamp WQE 0 and WQE 1 accordingly (on the TX buffer)
SW: Copy WQE 1 from the TX buffer to the BF register - ALREADY STAMPED!
HW: CQE error with index 0xFFFF - the BF WQE's control segment is STAMPED,
so the BF index is 0xFFFF. Error: Invalid Opcode.
As a result QP enters the error state and no traffic can be sent.

Solution:
When stamping - do not stamp last completed wqe.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: Correct BUG() assembly to ensure it is endian-agnostic

commit 63328070eff2f4fd730c86966a0dbc976147c39f upstream.

Currently BUG() uses .word or .hword to create the necessary illegal
instructions. However if we are building BE8 then these get swapped
by the linker into different illegal instructions in the text. This
means that the BUG() macro does not get trapped properly.

Change to using <asm/opcodes.h> to provide the necessary ARM instruction
building as we cannot rely on gcc/gas having the `.inst` instructions
which where added to try and resolve this issue (reported by Dave Martin
<Dave.Martin@arm.com>).

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge

commit 1996388e9f4e3444db8273bc08d25164d2967c21 upstream.

This was discussed back in February:

https://lkml.org/lkml/2014/2/18/956

But I never saw a patch come out of it.

On IvyBridge we share the SandyBridge cache event tables, but the
dTLB-load-miss event is not compatible. Patch it up after
the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: bus: fix possible boundaries violation

commit cfda2794b5afe7ce64ee9605c64bef0e56a48125 upstream.

function 'strncpy' will fill whole buffer 'id.name' of fixed size (32)
with string value and will not leave place for NULL-terminator.
Possible buffer boundaries violation in following string operations.
Replace strncpy with strlcpy.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf: Handle compat ioctl

commit b3f207855f57b9c8f43a547a801340bb5cbc59e5 upstream.

When running a 32-bit userspace on a 64-bit kernel (eg. i386
application on x86_64 kernel or 32-bit arm userspace on arm64
kernel) some of the perf ioctls must be treated with special
care, as they have a pointer size encoded in the command.

For example, PERF_EVENT_IOC_ID in 32-bit world will be encoded
as 0x80042407, but 64-bit kernel will expect 0x80082407. In
result the ioctl will fail returning -ENOTTY.

This patch solves the problem by adding code fixing up the
size as compat_ioctl file operation.

Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1402671812-9078-1-git-send-email-pawel.moll@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: Fix forgotten preempt_enable() when CPU has inclusive pcaches

commit 5596b0b245fb9d2cefb5023b11061050351c1398 upstream.

[    1.904000] BUG: scheduling while atomic: swapper/1/0x00000002
[    1.908000] Modules linked in:
[    1.916000] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc2-lemote-los.git-5318619-dirty #1
[    1.920000] Stack : 0000000031aac000 ffffffff810d0000 0000000000000052 ffffffff802730a4
          0000000000000000 0000000000000001 ffffffff810cdf90 ffffffff810d0000
          ffffffff8068b968 ffffffff806f5537 ffffffff810cdf90 980000009f0782e8
          0000000000000001 ffffffff80720000 ffffffff806b0000 980000009f078000
          980000009f290000 ffffffff805f312c 980000009f05b5d8 ffffffff80233518
          980000009f05b5e8 ffffffff80274b7c 980000009f078000 ffffffff8068b968
          0000000000000000 0000000000000000 0000000000000000 0000000000000000
          0000000000000000 980000009f05b520 0000000000000000 ffffffff805f2f6c
          0000000000000000 ffffffff80700000 ffffffff80700000 ffffffff806fc758
          ffffffff80700000 ffffffff8020be98 ffffffff806fceb0 ffffffff805f2f6c
          ...
[    2.028000] Call Trace:
[    2.032000] [<ffffffff8020be98>] show_stack+0x80/0x98
[    2.036000] [<ffffffff805f2f6c>] __schedule_bug+0x44/0x6c
[    2.040000] [<ffffffff805fac58>] __schedule+0x518/0x5b0
[    2.044000] [<ffffffff805f8a58>] schedule_timeout+0x128/0x1f0
[    2.048000] [<ffffffff80240314>] msleep+0x3c/0x60
[    2.052000] [<ffffffff80495400>] do_probe+0x238/0x3a8
[    2.056000] [<ffffffff804958b0>] ide_probe_port+0x340/0x7e8
[    2.060000] [<ffffffff80496028>] ide_host_register+0x2d0/0x7a8
[    2.064000] [<ffffffff8049c65c>] ide_pci_init_two+0x4e4/0x790
[    2.068000] [<ffffffff8049f9b8>] amd74xx_probe+0x148/0x2c8
[    2.072000] [<ffffffff803f571c>] pci_device_probe+0xc4/0x130
[    2.076000] [<ffffffff80478f60>] driver_probe_device+0x98/0x270
[    2.080000] [<ffffffff80479298>] __driver_attach+0xe0/0xe8
[    2.084000] [<ffffffff80476ab0>] bus_for_each_dev+0x78/0xe0
[    2.088000] [<ffffffff80478468>] bus_add_driver+0x230/0x310
[    2.092000] [<ffffffff80479b44>] driver_register+0x84/0x158
[    2.096000] [<ffffffff80200504>] do_one_initcall+0x104/0x160

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/5941/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dell-wmi: Fix access out of memory

commit a666b6ffbc9b6705a3ced704f52c3fe9ea8bf959 upstream.

Without this patch, dell-wmi is trying to access elements of dynamically
allocated array without checking the array size. This can lead to memory
corruption or a kernel panic. This patch adds the missing checks for
array size.

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: probes: fix instruction fetch order with <asm/opcodes.h>

commit 888be25402021a425da3e85e2d5a954d7509286e upstream.

If we are running BE8, the data and instruction endianness do not
match, so use <asm/opcodes.h> to correctly translate memory accesses
into ARM instructions.

Acked-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
[taras.kondratiuk@linaro.org: fixed Thumb instruction fetch order]
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
[wangnan: backport to 3.10 and 3.14:
- adjust context
- backport all changes on arch/arm/kernel/probes.c to
   arch/arm/kernel/kprobes-common.c since we don't have
   commit c18377c303787ded44b7decd7dee694db0f205e9.
- After the above adjustments, becomes same to Taras Kondratiuk's
   original patch:
     http://lists.linaro.org/pipermail/linaro-kernel/2014-January/010346.html
]
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

br: fix use of ->rx_handler_data in code executed on non-rx_handler path

commit 859828c0ea476b42f3a93d69d117aaba90994b6f upstream.

br_stp_rcv() is reached by non-rx_handler path. That means there is no
guarantee that dev is bridge port and therefore simple NULL check of
->rx_handler_data is not enough. There is need to check if dev is really
bridge port and since only rcu read lock is held here, do it by checking
->rx_handler pointer.

Note that synchronize_net() in netdev_rx_handler_unregister() ensures
this approach as valid.

Introduced originally by:
commit f350a0a87374418635689471606454abc7beaa3a
  "bridge: use rx_handler_data pointer to store net_bridge_port pointer"

Fixed but not in the best way by:
commit b5ed54e94d324f17c97852296d61a143f01b227a
  "bridge: fix RCU races with bridge port"

Reintroduced by:
commit 716ec052d2280d511e10e90ad54a86f5b5d4dcc2
  "bridge: fix NULL pointer deref of br_port_get_rcu"

Please apply to stable trees as well. Thanks.

RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770

Reported-by: Laine Stump <laine@redhat.com>
Debugged-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Andrew Collins <bsderandrew@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_nat: fix oops on netns removal

commit 945b2b2d259d1a4364a2799e80e8ff32f8c6ee6f upstream.

Quoting Samu Kallio:

Basically what's happening is, during netns cleanup,
nf_nat_net_exit gets called before ipv4_net_exit. As I understand
it, nf_nat_net_exit is supposed to kill any conntrack entries which
have NAT context (through nf_ct_iterate_cleanup), but for some
reason this doesn't happen (perhaps something else is still holding
refs to those entries?).

When ipv4_net_exit is called, conntrack entries (including those
with NAT context) are cleaned up, but the
nat_bysource hashtable is long gone - freed in nf_nat_net_exit. The
bug happens when attempting to free a conntrack entry whose NAT hash
'prev' field points to a slot in the freed hash table (head for that
bin).

We ignore conntracks with null nat bindings. But this is wrong,
as these are in bysource hash table as well.

Restore nat-cleaning for the netns-is-being-removed case.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=65191

Fixes: c2d421e1718 ('netfilter: nf_nat: fix race when unloading protocol modules')
Reported-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Debugged-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[samu.kallio@aberdeencloud.com: backport to 3.10-stable]
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: xt_bpf: add mising opaque struct sk_filter definition

commit e10038a8ec06ac819b7552bb67aaa6d2d6f850c1 upstream.

This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.

Fixes: e6f30c7 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_log: release skbuff on nlmsg put failure

commit b51d3fa364885a2c1e1668f88776c67c95291820 upstream.

The kernel should reserve enough room in the skb so that the DONE
message can always be appended. However, in case of e.g. new attribute
erronously not being size-accounted for, __nfulnl_send() will still
try to put next nlmsg into this full skbuf, causing the skb to be stuck
forever and blocking delivery of further messages.

Fix issue by releasing skb immediately after nlmsg_put error and
WARN() so we can track down the cause of such size mismatch.

[ fw@strlen.de: add tailroom/len info to WARN ]

Signed-off-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nfnetlink_log: fix maximum packet length logged to userspace

commit c1e7dc91eed0ed1a51c9b814d648db18bf8fc6e9 upstream.

don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work.
The nla length includes the size of the nla struct, so anything larger
results in u16 integer overflow.

This patch is similar to
9cefbbc9c8f9abe (netfilter: nfnetlink_queue: cleanup copy_range usage).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_log: account for size of NLMSG_DONE attribute

commit 9dfa1dfe4d5e5e66a991321ab08afe69759d797a upstream.

We currently neither account for the nlattr size, nor do we consider
the size of the trailing NLMSG_DONE when allocating nlmsg skb.

This can result in nflog to stop working, as __nfulnl_send() re-tries
sending forever if it failed to append NLMSG_DONE (which will never
work if buffer is not large enough).

Reported-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipc: always handle a new value of auto_msgmni

commit 1195d94e006b23c6292e78857e154872e33b6d7e upstream.

proc_dointvec_minmax() returns zero if a new value has been set.  So we
don't need to check all charecters have been handled.

Below you can find two examples.  In the new value has not been handled
properly.

$ strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n\0", 3)                    = 2
close(3)                                = 0
exit_group(0)
$ cat /sys/kernel/debug/tracing/trace

$strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n", 2)                      = 2
close(3)                                = 0

$ cat /sys/kernel/debug/tracing/trace
a.out-697   [000] ....  3280.998235: unregister_ipcns_notifier <-proc_ipcauto_dointvec_minmax

Fixes: 9eefe520c814 ("ipc: do not use a negative value to re-enable msgmni automatic recomputin")
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Joe Perches <joe@perches.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clocksource: Remove "weak" from clocksource_default_clock() declaration

commit 96a2adbc6f501996418da9f7afe39bf0e4d006a9 upstream.

kernel/time/jiffies.c provides a default clocksource_default_clock()
definition explicitly marked "weak". arch/s390 provides its own definition
intended to override the default, but the "weak" attribute on the
declaration applied to the s390 definition as well, so the linker chose one
based on link order (see 10629d711ed7 ("PCI: Remove __weak annotation from
pcibios_get_phb_of_node decl")).

Remove the "weak" attribute from the clocksource_default_clock()
declaration so we always prefer a non-weak definition over the weak one,
independent of link order.

Fixes: f1b82746c1e9 ("clocksource: Cleanup clocksource selection")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
CC: Daniel Lezcano <daniel.lezcano@linaro.org>
CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kgdb: Remove "weak" from kgdb_arch_pc() declaration

commit 107bcc6d566cb40184068d888637f9aefe6252dd upstream.

kernel/debug/debug_core.c provides a default kgdb_arch_pc() definition
explicitly marked "weak". Several architectures provide their own
definitions intended to override the default, but the "weak" attribute on
the declaration applied to the arch definitions as well, so the linker
chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak
annotation from pcibios_get_phb_of_node decl")).

Remove the "weak" attribute from the declaration so we always prefer a
non-weak definition over the weak one, independent of link order.

Fixes: 688b744d8bc8 ("kgdb: fix signedness mixmatches, add statics, add declaration to header")
Tested-by: Vineet Gupta <vgupta@synopsys.com> # for ARC build
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: ttusb-dec: buffer overflow in ioctl

commit f2e323ec96077642d397bb1c355def536d489d16 upstream.

We need to add a limit check here so we don't overflow the buffer.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return

commit 869f9dfa4d6d57b79e0afc3af14772c2a023eeb1 upstream.

Any attempt to call nfs_remove_bad_delegation() while a delegation is being
returned is currently a no-op. This means that we can end up looping
forever in nfs_end_delegation_return() if something causes the delegation
to be revoked.
This patch adds a mechanism whereby the state recovery code can communicate
to the delegation return code that the delegation is no longer valid and
that it should not be used when reclaiming state.
It also changes the return value for nfs4_handle_delegation_recall_error()
to ensure that nfs_end_delegation_return() does not reattempt the lock
reclaim before state recovery is done.

http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs: Fix use of uninitialized variable in nfs_getattr()

commit 16caf5b6101d03335b386e77e9e14136f989be87 upstream.

Variable 'err' needn't be initialized when nfs_getattr() uses it to
check whether it should call generic_fillattr() or not. That can result
in spurious error returns. Initialize 'err' properly.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>