GitHub/moto-9609/android_kernel_motorola_exynos9610.git
5 years agomm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
Mikhail Zaslonko [Fri, 1 Feb 2019 22:20:38 +0000 (14:20 -0800)]
mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone

[ Upstream commit 24feb47c5fa5b825efb0151f28906dfdad027e61 ]

If memory end is not aligned with the sparse memory section boundary,
the mapping of such a section is only partly initialized.  This may lead
to VM_BUG_ON due to uninitialized struct pages access from
test_pages_in_a_zone() function triggered by memory_hotplug sysfs
handlers.

Here are the the panic examples:
 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=2050M
 --------------------------
 page:000003d082008000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
   test_pages_in_a_zone+0xde/0x160
   show_valid_zones+0x5c/0x190
   dev_attr_show+0x34/0x70
   sysfs_kf_seq_show+0xc8/0x148
   seq_read+0x204/0x480
   __vfs_read+0x32/0x178
   vfs_read+0x82/0x138
   ksys_read+0x5a/0xb0
   system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
   test_pages_in_a_zone+0xde/0x160
 Kernel panic - not syncing: Fatal exception: panic_on_oops

Fix this by checking whether the pfn to check is within the zone.

[mhocko@suse.com: separated this change from http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com]
Link: http://lkml.kernel.org/r/20190128144506.15603-3-mhocko@kernel.org
[mhocko@suse.com: separated this change from
http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agomm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
Michal Hocko [Fri, 1 Feb 2019 22:20:34 +0000 (14:20 -0800)]
mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

[ Upstream commit efad4e475c312456edb3c789d0996d12ed744c13 ]

Patch series "mm, memory_hotplug: fix uninitialized pages fallouts", v2.

Mikhail Zaslonko has posted fixes for the two bugs quite some time ago
[1].  I have pushed back on those fixes because I believed that it is
much better to plug the problem at the initialization time rather than
play whack-a-mole all over the hotplug code and find all the places
which expect the full memory section to be initialized.

We have ended up with commit 2830bf6f05fb ("mm, memory_hotplug:
initialize struct pages for the full memory section") merged and cause a
regression [2][3].  The reason is that there might be memory layouts
when two NUMA nodes share the same memory section so the merged fix is
simply incorrect.

In order to plug this hole we really have to be zone range aware in
those handlers.  I have split up the original patch into two.  One is
unchanged (patch 2) and I took a different approach for `removable'
crash.

[1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
[3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

This patch (of 2):

Mikhail has reported the following VM_BUG_ON triggered when reading sysfs
removable state of a memory block:

 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
   is_mem_section_removable+0xb4/0x190
   show_mem_removable+0x9a/0xd8
   dev_attr_show+0x34/0x70
   sysfs_kf_seq_show+0xc8/0x148
   seq_read+0x204/0x480
   __vfs_read+0x32/0x178
   vfs_read+0x82/0x138
   ksys_read+0x5a/0xb0
   system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
   is_mem_section_removable+0xb4/0x190
 Kernel panic - not syncing: Fatal exception: panic_on_oops

The reason is that the memory block spans the zone boundary and we are
stumbling over an unitialized struct page.  Fix this by enforcing zone
range in is_mem_section_removable so that we never run away from a zone.

Link: http://lkml.kernel.org/r/20190128144506.15603-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Debugged-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agox86_64: increase stack size for KASAN_EXTRA
Qian Cai [Fri, 1 Feb 2019 22:20:20 +0000 (14:20 -0800)]
x86_64: increase stack size for KASAN_EXTRA

[ Upstream commit a8e911d13540487942d53137c156bd7707f66e5d ]

If the kernel is configured with KASAN_EXTRA, the stack size is
increasted significantly because this option sets "-fstack-reuse" to
"none" in GCC [1].  As a result, it triggers stack overrun quite often
with 32k stack size compiled using GCC 8.  For example, this reproducer

  https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c

triggers a "corrupted stack end detected inside scheduler" very reliably
with CONFIG_SCHED_STACK_END_CHECK enabled.

There are just too many functions that could have a large stack with
KASAN_EXTRA due to large local variables that have been called over and
over again without being able to reuse the stacks.  Some noticiable ones
are

  size
  7648 shrink_page_list
  3584 xfs_rmap_convert
  3312 migrate_page_move_mapping
  3312 dev_ethtool
  3200 migrate_misplaced_transhuge_page
  3168 copy_process

There are other 49 functions are over 2k in size while compiling kernel
with "-Wframe-larger-than=" even with a related minimal config on this
machine.  Hence, it is too much work to change Makefiles for each object
to compile without "-fsanitize-address-use-after-scope" individually.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23

Although there is a patch in GCC 9 to help the situation, GCC 9 probably
won't be released in a few months and then it probably take another
6-month to 1-year for all major distros to include it as a default.
Hence, the stack usage with KASAN_EXTRA can be revisited again in 2020
when GCC 9 is everywhere.  Until then, this patch will help users avoid
stack overrun.

This has already been fixed for arm64 for the same reason via
6e8830674ea ("arm64: kasan: Increase stack size for KASAN_EXTRA").

Link: http://lkml.kernel.org/r/20190109215209.2903-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agox86/kexec: Don't setup EFI info if EFI runtime is not enabled
Kairui Song [Fri, 18 Jan 2019 11:13:08 +0000 (19:13 +0800)]
x86/kexec: Don't setup EFI info if EFI runtime is not enabled

[ Upstream commit 2aa958c99c7fd3162b089a1a56a34a0cdb778de1 ]

Kexec-ing a kernel with "efi=noruntime" on the first kernel's command
line causes the following null pointer dereference:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  #PF error: [normal kernel read fault]
  Call Trace:
   efi_runtime_map_copy+0x28/0x30
   bzImage64_load+0x688/0x872
   arch_kexec_kernel_image_load+0x6d/0x70
   kimage_file_alloc_init+0x13e/0x220
   __x64_sys_kexec_file_load+0x144/0x290
   do_syscall_64+0x55/0x1a0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Just skip the EFI info setup if EFI runtime services are not enabled.

 [ bp: Massage commit message. ]

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: bhe@redhat.com
Cc: David Howells <dhowells@redhat.com>
Cc: erik.schmauss@intel.com
Cc: fanc.fnst@cn.fujitsu.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: kexec@lists.infradead.org
Cc: lenb@kernel.org
Cc: linux-acpi@vger.kernel.org
Cc: Philipp Rudo <prudo@linux.vnet.ibm.com>
Cc: rafael.j.wysocki@intel.com
Cc: robert.moore@intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yannik Sembritzki <yannik@sembritzki.me>
Link: https://lkml.kernel.org/r/20190118111310.29589-2-kasong@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoapparmor: Fix aa_label_build() error handling for failed merges
John Johansen [Thu, 24 Jan 2019 21:53:05 +0000 (13:53 -0800)]
apparmor: Fix aa_label_build() error handling for failed merges

[ Upstream commit d6d478aee003e19ef90321176552a8ad2929a47f ]

aa_label_merge() can return NULL for memory allocations failures
make sure to handle and set the correct error in this case.

Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoarm64: kprobe: Always blacklist the KVM world-switch code
James Morse [Thu, 24 Jan 2019 16:32:55 +0000 (16:32 +0000)]
arm64: kprobe: Always blacklist the KVM world-switch code

[ Upstream commit f2b3d8566d81deaca31f4e3163def0bea7746e11 ]

On systems with VHE the kernel and KVM's world-switch code run at the
same exception level. Code that is only used on a VHE system does not
need to be annotated as __hyp_text as it can reside anywhere in the
 kernel text.

__hyp_text was also used to prevent kprobes from patching breakpoint
instructions into this region, as this code runs at a different
exception level. While this is no longer true with VHE, KVM still
switches VBAR_EL1, meaning a kprobe's breakpoint executed in the
world-switch code will cause a hyp-panic.

Move the __hyp_text check in the kprobes blacklist so it applies on
VHE systems too, to cover the common code and guest enter/exit
assembly.

Fixes: 888b3c8720e0 ("arm64: Treat all entry code as non-kprobe-able")
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agox86/microcode/amd: Don't falsely trick the late loading mechanism
Thomas Lendacky [Thu, 31 Jan 2019 14:33:06 +0000 (14:33 +0000)]
x86/microcode/amd: Don't falsely trick the late loading mechanism

[ Upstream commit 912139cfbfa6a2bc1da052314d2c29338dae1f6a ]

The load_microcode_amd() function searches for microcode patches and
attempts to apply a microcode patch if it is of different level than the
currently installed level.

While the processor won't actually load a level that is less than
what is already installed, the logic wrongly returns UCODE_NEW thus
signaling to its caller reload_store() that a late loading should be
attempted.

If the file-system contains an older microcode revision than what is
currently running, such a late microcode reload can result in these
misleading messages:

  x86/CPU: CPU features have changed after loading microcode, but might not take effect.
  x86/CPU: Please consider either early loading through initrd/built-in or a potential BIOS update.

These messages were issued on a system where SME/SEV are not
enabled by the BIOS (MSR C001_0010[23] = 0b) because during boot,
early_detect_mem_encrypt() is called and cleared the SME and SEV
features in this case.

However, after the wrong late load attempt, get_cpu_cap() is called and
reloads the SME and SEV feature bits, resulting in the messages.

Update the microcode level check to not attempt microcode loading if the
current level is greater than(!) and not only equal to the current patch
level.

 [ bp: massage commit message. ]

Fixes: 2613f36ed965 ("x86/microcode: Attempt late loading only when new microcode is present")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/154894518427.9406.8246222496874202773.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocifs: fix computation for MAX_SMB2_HDR_SIZE
Ronnie Sahlberg [Tue, 29 Jan 2019 02:46:16 +0000 (12:46 +1000)]
cifs: fix computation for MAX_SMB2_HDR_SIZE

[ Upstream commit 58d15ed1203f4d858c339ea4d7dafa94bd2a56d3 ]

The size of the fixed part of the create response is 88 bytes not 56.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoplatform/x86: Fix unmet dependency warning for SAMSUNG_Q10
Sinan Kaya [Thu, 24 Jan 2019 19:31:01 +0000 (19:31 +0000)]
platform/x86: Fix unmet dependency warning for SAMSUNG_Q10

[ Upstream commit 0ee4b5f801b73b83a9fb3921d725f2162fd4a2e5 ]

Add BACKLIGHT_LCD_SUPPORT for SAMSUNG_Q10 to fix the
warning: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE.

SAMSUNG_Q10 selects BACKLIGHT_CLASS_DEVICE but BACKLIGHT_CLASS_DEVICE
depends on BACKLIGHT_LCD_SUPPORT.

Copy BACKLIGHT_LCD_SUPPORT dependency into SAMSUNG_Q10 to fix:

WARNING: unmet direct dependencies detected for BACKLIGHT_CLASS_DEVICE
  Depends on [n]: HAS_IOMEM [=y] && BACKLIGHT_LCD_SUPPORT [=n]
  Selected by [y]:
  - SAMSUNG_Q10 [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && ACPI [=y]

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoscsi: 53c700: pass correct "dev" to dma_alloc_attrs()
Dan Carpenter [Thu, 24 Jan 2019 10:33:27 +0000 (13:33 +0300)]
scsi: 53c700: pass correct "dev" to dma_alloc_attrs()

[ Upstream commit 8437fcf14deed67e5ad90b5e8abf62fb20f30881 ]

The "hostdata->dev" pointer is NULL here.  We set "hostdata->dev = dev;"
later in the function and we also use "hostdata->dev" when we call
dma_free_attrs() in NCR_700_release().

This bug predates git version control.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoscsi: libfc: free skb when receiving invalid flogi resp
Ming Lu [Thu, 24 Jan 2019 05:25:42 +0000 (13:25 +0800)]
scsi: libfc: free skb when receiving invalid flogi resp

[ Upstream commit 5d8fc4a9f0eec20b6c07895022a6bea3fb6dfb38 ]

The issue to be fixed in this commit is when libfc found it received a
invalid FLOGI response from FC switch, it would return without freeing the
fc frame, which is just the skb data. This would cause memory leak if FC
switch keeps sending invalid FLOGI responses.

This fix is just to make it execute `fc_frame_free(fp)` before returning
from function `fc_lport_flogi_resp`.

Signed-off-by: Ming Lu <ming.lu@citrix.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoqed: Fix stack out of bounds bug
Manish Chopra [Mon, 28 Jan 2019 18:05:08 +0000 (10:05 -0800)]
qed: Fix stack out of bounds bug

[ Upstream commit ffb057f98928aa099b08e419bbe5afc26ec9f448 ]

KASAN reported following bug in qed_init_qm_get_idx_from_flags
due to inappropriate casting of "pq_flags". Fix the type of "pq_flags".

[  196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[  196.624712] Read of size 8 at addr ffff809b00bc7360 by task kworker/0:9/1712
[  196.624714]
[  196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1
[  196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018
[  196.624733] Workqueue: events work_for_cpu_fn
[  196.624738] Call trace:
[  196.624742]  dump_backtrace+0x0/0x2f8
[  196.624745]  show_stack+0x24/0x30
[  196.624749]  dump_stack+0xe0/0x11c
[  196.624755]  print_address_description+0x68/0x260
[  196.624759]  kasan_report+0x178/0x340
[  196.624762]  __asan_report_load_n_noabort+0x38/0x48
[  196.624786]  qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[  196.624808]  qed_init_qm_info+0xec0/0x2200 [qed]
[  196.624830]  qed_resc_alloc+0x284/0x7e8 [qed]
[  196.624853]  qed_slowpath_start+0x6cc/0x1ae8 [qed]
[  196.624864]  __qede_probe.isra.10+0x1cc/0x12c0 [qede]
[  196.624874]  qede_probe+0x78/0xf0 [qede]
[  196.624879]  local_pci_probe+0xc4/0x180
[  196.624882]  work_for_cpu_fn+0x54/0x98
[  196.624885]  process_one_work+0x758/0x1900
[  196.624888]  worker_thread+0x4e0/0xd18
[  196.624892]  kthread+0x2c8/0x350
[  196.624897]  ret_from_fork+0x10/0x18
[  196.624899]
[  196.624902] Allocated by task 2:
[  196.624906]  kasan_kmalloc.part.1+0x40/0x108
[  196.624909]  kasan_kmalloc+0xb4/0xc8
[  196.624913]  kasan_slab_alloc+0x14/0x20
[  196.624916]  kmem_cache_alloc_node+0x1dc/0x480
[  196.624921]  copy_process.isra.1.part.2+0x1d8/0x4a98
[  196.624924]  _do_fork+0x150/0xfa0
[  196.624926]  kernel_thread+0x48/0x58
[  196.624930]  kthreadd+0x3a4/0x5a0
[  196.624932]  ret_from_fork+0x10/0x18
[  196.624934]
[  196.624937] Freed by task 0:
[  196.624938] (stack is not available)
[  196.624940]
[  196.624943] The buggy address belongs to the object at ffff809b00bc0000
[  196.624943]  which belongs to the cache thread_stack of size 32768
[  196.624946] The buggy address is located 29536 bytes inside of
[  196.624946]  32768-byte region [ffff809b00bc0000ffff809b00bc8000)
[  196.624948] The buggy address belongs to the page:
[  196.624952] page:ffff7fe026c02e00 count:1 mapcount:0 mapping:ffff809b4001c000 index:0x0 compound_mapcount: 0
[  196.624960] flags: 0xfffff8000008100(slab|head)
[  196.624967] raw: 0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000
[  196.624970] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[  196.624973] page dumped because: kasan: bad access detected
[  196.624974]
[  196.624976] Memory state around the buggy address:
[  196.624980]  ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624983]  ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624985] >ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
[  196.624988]                                                        ^
[  196.624990]  ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624993]  ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624995] ==================================================================

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoqed: Fix system crash in ll2 xmit
Manish Chopra [Mon, 28 Jan 2019 18:05:07 +0000 (10:05 -0800)]
qed: Fix system crash in ll2 xmit

[ Upstream commit 7c81626a3c37e4ac320b8ad785694ba498f24794 ]

Cache number of fragments in the skb locally as in case
of linear skb (with zero fragments), tx completion
(or freeing of skb) may happen before driver tries
to get number of frgaments from the skb which could
lead to stale access to an already freed skb.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoqed: Fix VF probe failure while FLR
Manish Chopra [Mon, 28 Jan 2019 18:05:06 +0000 (10:05 -0800)]
qed: Fix VF probe failure while FLR

[ Upstream commit 327852ec64205bb651be391a069784872098a3b2 ]

VFs may hit VF-PF channel timeout while probing, as in some
cases it was observed that VF FLR and VF "acquire" message
transaction (i.e first message from VF to PF in VF's probe flow)
could occur simultaneously which could lead VF to fail sending
"acquire" message to PF as VF is marked disabled from HW perspective
due to FLR, which will result into channel timeout and VF probe failure.

In such cases, try retrying VF "acquire" message so that in later
attempts it could be successful to pass message to PF after the VF
FLR is completed and can be probed successfully.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoqed: Fix LACP pdu drops for VFs
Manish Chopra [Mon, 28 Jan 2019 18:05:05 +0000 (10:05 -0800)]
qed: Fix LACP pdu drops for VFs

[ Upstream commit ff9296966e5e00b0d0d00477b2365a178f0f06a3 ]

VF is always configured to drop control frames
(with reserved mac addresses) but to work LACP
on the VFs, it would require LACP control frames
to be forwarded or transmitted successfully.

This patch fixes this in such a way that trusted VFs
(marked through ndo_set_vf_trust) would be allowed to
pass the control frames such as LACP pdus.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoqed: Fix bug in tx promiscuous mode settings
Manish Chopra [Mon, 28 Jan 2019 18:05:04 +0000 (10:05 -0800)]
qed: Fix bug in tx promiscuous mode settings

[ Upstream commit 9e71a15d8b5bbce25c637f7f8833cd3f45b65646 ]

When running tx switched traffic between VNICs
created via a bridge(to which VFs are added),
adapter drops the unicast packets in tx flow due to
VNIC's ucast mac being unknown to it. But VF interfaces
being in promiscuous mode should have caused adapter
to accept all the unknown ucast packets. Later, it
was found that driver doesn't really configure tx
promiscuous mode settings to accept all unknown unicast macs.

This patch fixes tx promiscuous mode settings to accept all
unknown/unmatched unicast macs and works out the scenario.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonfs: Fix NULL pointer dereference of dev_name
Yao Liu [Mon, 28 Jan 2019 11:44:14 +0000 (19:44 +0800)]
nfs: Fix NULL pointer dereference of dev_name

[ Upstream commit 80ff00172407e0aad4b10b94ef0816fc3e7813cb ]

There is a NULL pointer dereference of dev_name in nfs_parse_devname()

The oops looks something like:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  ...
  RIP: 0010:nfs_fs_mount+0x3b6/0xc20 [nfs]
  ...
  Call Trace:
   ? ida_alloc_range+0x34b/0x3d0
   ? nfs_clone_super+0x80/0x80 [nfs]
   ? nfs_free_parsed_mount_data+0x60/0x60 [nfs]
   mount_fs+0x52/0x170
   ? __init_waitqueue_head+0x3b/0x50
   vfs_kern_mount+0x6b/0x170
   do_mount+0x216/0xdc0
   ksys_mount+0x83/0xd0
   __x64_sys_mount+0x25/0x30
   do_syscall_64+0x65/0x220
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix this by adding a NULL check on dev_name

Signed-off-by: Yao Liu <yotta.liu@ucloud.cn>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoselftests: timers: use LDLIBS instead of LDFLAGS
Fathi Boudra [Wed, 16 Jan 2019 17:43:20 +0000 (11:43 -0600)]
selftests: timers: use LDLIBS instead of LDFLAGS

[ Upstream commit 7d4e591bc051d3382c45caaa2530969fb42ed23d ]

posix_timers fails to build due to undefined reference errors:

 aarch64-linaro-linux-gcc --sysroot=/build/tmp-rpb-glibc/sysroots/hikey
 -O2 -pipe -g -feliminate-unused-debug-types -O3 -Wl,-no-as-needed -Wall
 -DKTEST  -Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed -lrt -lpthread
 posix_timers.c
 -o /build/tmp-rpb-glibc/work/hikey-linaro-linux/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/timers/posix_timers
 /tmp/cc1FTZzT.o: In function `check_timer_create':
 /usr/src/debug/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/timers/posix_timers.c:157:
 undefined reference to `timer_create'
 /usr/src/debug/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/timers/posix_timers.c:170:
 undefined reference to `timer_settime'
 collect2: error: ld returned 1 exit status

It's GNU Make and linker specific.

The default Makefile rule looks like:

$(CC) $(CFLAGS) $(LDFLAGS) $@ $^ $(LDLIBS)

When linking is done by gcc itself, no issue, but when it needs to be passed
to proper ld, only LDLIBS follows and then ld cannot know what libs to link
with.

More detail:
https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html

LDFLAGS
Extra flags to give to compilers when they are supposed to invoke the linker,
‘ld’, such as -L. Libraries (-lfoo) should be added to the LDLIBS variable
instead.

LDLIBS
Library flags or names given to compilers when they are supposed to invoke the
linker, ‘ld’. LOADLIBES is a deprecated (but still supported) alternative to
LDLIBS. Non-library linker flags, such as -L, should go in the LDFLAGS
variable.

https://lkml.org/lkml/2010/2/10/362

tools/perf: libraries must come after objects

Link order matters, use LDLIBS instead of LDFLAGS to properly link against
libpthread.

Signed-off-by: Denys Dmytriyenko <denys@ti.com>
Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agogpio: vf610: Mask all GPIO interrupts
Andrew Lunn [Sun, 27 Jan 2019 21:58:00 +0000 (22:58 +0100)]
gpio: vf610: Mask all GPIO interrupts

[ Upstream commit 7ae710f9f8b2cf95297e7bbfe1c09789a7dc43d4 ]

On SoC reset all GPIO interrupts are disable. However, if kexec is
used to boot into a new kernel, the SoC does not experience a
reset. Hence GPIO interrupts can be left enabled from the previous
kernel. It is then possible for the interrupt to fire before an
interrupt handler is registered, resulting in the kernel complaining
of an "unexpected IRQ trap", the interrupt is never cleared, and so
fires again, resulting in an interrupt storm.

Disable all GPIO interrupts before registering the GPIO IRQ chip.

Fixes: 7f2691a19627 ("gpio: vf610: add gpiolib/IRQ chip driver for Vybrid")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonetfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present
Florian Westphal [Mon, 21 Jan 2019 20:54:36 +0000 (21:54 +0100)]
netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present

[ Upstream commit 2035f3ff8eaa29cfb5c8e2160b0f6e85eeb21a95 ]

Unlike ip(6)tables ebtables only counts user-defined chains.

The effect is that a 32bit ebtables binary on a 64bit kernel can do
'ebtables -N FOO' only after adding at least one rule, else the request
fails with -EINVAL.

This is a similar fix as done in
3f1e53abff84 ("netfilter: ebtables: don't attempt to allocate 0-sized compat array").

Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
Reported-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()
Alexey Khoroshilov [Sat, 26 Jan 2019 19:48:57 +0000 (22:48 +0300)]
net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()

[ Upstream commit c69c29a1a0a8f68cd87e98ba4a5a79fb8ef2a58c ]

If phy_power_on() fails in rk_gmac_powerup(), clocks are left enabled.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: hns: Fix wrong read accesses via Clause 45 MDIO protocol
Yonglong Liu [Sat, 26 Jan 2019 09:18:27 +0000 (17:18 +0800)]
net: hns: Fix wrong read accesses via Clause 45 MDIO protocol

[ Upstream commit cec8abba13e6a26729dfed41019720068eeeff2b ]

When reading phy registers via Clause 45 MDIO protocol, after write
address operation, the driver use another write address operation, so
can not read the right value of any phy registers. This patch fixes it.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: hns: Restart autoneg need return failed when autoneg off
Yonglong Liu [Sat, 26 Jan 2019 09:18:26 +0000 (17:18 +0800)]
net: hns: Restart autoneg need return failed when autoneg off

[ Upstream commit ed29ca8b9592562559c64d027fb5eb126e463e2c ]

The hns driver of earlier devices, when autoneg off, restart autoneg
will return -EINVAL, so make the hns driver for the latest devices
do the same.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: hns: Fix for missing of_node_put() after of_parse_phandle()
Yonglong Liu [Sat, 26 Jan 2019 09:18:25 +0000 (17:18 +0800)]
net: hns: Fix for missing of_node_put() after of_parse_phandle()

[ Upstream commit 263c6d75f9a544a3c2f8f6a26de4f4808d8f59cf ]

In hns enet driver, we use of_parse_handle() to get hold of the
device node related to "ae-handle" but we have missed to put
the node reference using of_node_put() after we are done using
the node. This patch fixes it.

Note:
Link: https://lkml.org/lkml/2018/12/22/217
Fixes: 48189d6aaf1e ("net: hns: enet specifies a reference to dsaf")
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: altera_tse: fix msgdma_tx_completion on non-zero fill_level case
Tomonori Sakita [Fri, 25 Jan 2019 02:02:22 +0000 (11:02 +0900)]
net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case

[ Upstream commit 6571ebce112a21ec9be68ef2f53b96fcd41fd81b ]

If fill_level was not zero and status was not BUSY,
result of "tx_prod - tx_cons - inuse" might be zero.
Subtracting 1 unconditionally results invalid negative return value
on this case.
Make sure not to return an negative value.

Signed-off-by: Tomonori Sakita <tomonori.sakita@sord.co.jp>
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Reviewed-by: Dalon L Westergreen <dalon.westergreen@linux.intel.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoxtensa: SMP: limit number of possible CPUs by NR_CPUS
Max Filippov [Sun, 27 Jan 2019 04:35:18 +0000 (20:35 -0800)]
xtensa: SMP: limit number of possible CPUs by NR_CPUS

[ Upstream commit 25384ce5f9530def39421597b1457d9462df6455 ]

This fixes the following warning at boot when the kernel is booted on a
board with more CPU cores than was configured in NR_CPUS:

  smp_init_cpus: Core Count = 8
  smp_init_cpus: Core Id = 0
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at include/linux/cpumask.h:121 smp_init_cpus+0x54/0x74
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc3-00015-g1459333f88a0 #124
  Call Trace:
    __warn$part$3+0x6a/0x7c
    warn_slowpath_null+0x35/0x3c
    smp_init_cpus+0x54/0x74
    setup_arch+0x1c0/0x1d0
    start_kernel+0x44/0x310
    _startup+0x107/0x107

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoxtensa: SMP: mark each possible CPU as present
Max Filippov [Sat, 19 Jan 2019 08:26:48 +0000 (00:26 -0800)]
xtensa: SMP: mark each possible CPU as present

[ Upstream commit 8b1c42cdd7181200dc1fff39dcb6ac1a3fac2c25 ]

Otherwise it is impossible to enable CPUs after booting with 'maxcpus'
parameter.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoxtensa: smp_lx200_defconfig: fix vectors clash
Max Filippov [Fri, 25 Jan 2019 01:16:11 +0000 (17:16 -0800)]
xtensa: smp_lx200_defconfig: fix vectors clash

[ Upstream commit 306b38305c0f86de7f17c5b091a95451dcc93d7d ]

Secondary CPU reset vector overlaps part of the double exception handler
code, resulting in weird crashes and hangups when running user code.
Move exception vectors one page up so that they don't clash with the
secondary CPU reset vector.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoxtensa: SMP: fix secondary CPU initialization
Max Filippov [Fri, 21 Dec 2018 16:26:20 +0000 (08:26 -0800)]
xtensa: SMP: fix secondary CPU initialization

[ Upstream commit 32a7726c4f4aadfabdb82440d84f88a5a2c8fe13 ]

- add missing memory barriers to the secondary CPU synchronization spin
  loops; add comment to the matching memory barrier in the boot_secondary
  and __cpu_die functions;
- use READ_ONCE/WRITE_ONCE to access cpu_start_id/cpu_start_ccount
  instead of reading/writing them directly;
- re-initialize cpu_running every time before starting secondary CPU to
  flush possible previous CPU startup results.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoselftests: cpu-hotplug: fix case where CPUs offline > CPUs present
Colin Ian King [Thu, 10 Jan 2019 12:38:02 +0000 (12:38 +0000)]
selftests: cpu-hotplug: fix case where CPUs offline > CPUs present

[ Upstream commit 2b531b6137834a55857a337ac17510d6436b6fbb ]

The cpu-hotplug test assumes that we can offline the maximum CPU as
described by /sys/devices/system/cpu/offline.  However, in the case
where the number of CPUs exceeds like kernel configuration then
the offline count can be greater than the present count and we end
up trying to test the offlining of a CPU that is not available to
offline.  Fix this by testing the maximum present CPU instead.

Also, the test currently offlines the CPU and does not online it,
so fix this by onlining the CPU after the test.

Fixes: d89dffa976bc ("fault-injection: add selftests for cpu and memory hotplug")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoxtensa: SMP: fix ccount_timer_shutdown
Max Filippov [Mon, 29 Jan 2018 17:09:41 +0000 (09:09 -0800)]
xtensa: SMP: fix ccount_timer_shutdown

[ Upstream commit 4fe8713b873fc881284722ce4ac47995de7cf62c ]

ccount_timer_shutdown is called from the atomic context in the
secondary_start_kernel, resulting in the following BUG:

BUG: sleeping function called from invalid context
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
Preemption disabled at:
  secondary_start_kernel+0xa1/0x130
Call Trace:
  ___might_sleep+0xe7/0xfc
  __might_sleep+0x41/0x44
  synchronize_irq+0x24/0x64
  disable_irq+0x11/0x14
  ccount_timer_shutdown+0x12/0x20
  clockevents_switch_state+0x82/0xb4
  clockevents_exchange_device+0x54/0x60
  tick_check_new_device+0x46/0x70
  clockevents_register_device+0x8c/0xc8
  clockevents_config_and_register+0x1d/0x2c
  local_timer_setup+0x75/0x7c
  secondary_start_kernel+0xb4/0x130
  should_never_return+0x32/0x35

Use disable_irq_nosync instead of disable_irq to avoid it.
This is safe because the ccount timer IRQ is per-CPU, and once IRQ is
masked the ISR will not be called.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoiommu/amd: Fix IOMMU page flush when detach device from a domain
Suravee Suthikulpanit [Thu, 24 Jan 2019 04:16:45 +0000 (04:16 +0000)]
iommu/amd: Fix IOMMU page flush when detach device from a domain

[ Upstream commit 9825bd94e3a2baae1f4874767ae3a7d4c049720e ]

When a VM is terminated, the VFIO driver detaches all pass-through
devices from VFIO domain by clearing domain id and page table root
pointer from each device table entry (DTE), and then invalidates
the DTE. Then, the VFIO driver unmap pages and invalidate IOMMU pages.

Currently, the IOMMU driver keeps track of which IOMMU and how many
devices are attached to the domain. When invalidate IOMMU pages,
the driver checks if the IOMMU is still attached to the domain before
issuing the invalidate page command.

However, since VFIO has already detached all devices from the domain,
the subsequent INVALIDATE_IOMMU_PAGES commands are being skipped as
there is no IOMMU attached to the domain. This results in data
corruption and could cause the PCI device to end up in indeterministic
state.

Fix this by invalidate IOMMU pages when detach a device, and
before decrementing the per-domain device reference counts.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Joerg Roedel <joro@8bytes.org>
Co-developed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: 6de8ad9b9ee0 ('x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoipvs: Fix signed integer overflow when setsockopt timeout
ZhangXiaoxu [Thu, 10 Jan 2019 08:39:06 +0000 (16:39 +0800)]
ipvs: Fix signed integer overflow when setsockopt timeout

[ Upstream commit 53ab60baa1ac4f20b080a22c13b77b6373922fd7 ]

There is a UBSAN bug report as below:
UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
signed integer overflow:
-2147483647 * 1000 cannot be represented in type 'int'

Reproduce program:
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

#define IPPROTO_IP 0
#define IPPROTO_RAW 255

#define IP_VS_BASE_CTL (64+1024+64)
#define IP_VS_SO_SET_TIMEOUT (IP_VS_BASE_CTL+10)

/* The argument to IP_VS_SO_GET_TIMEOUT */
struct ipvs_timeout_t {
int tcp_timeout;
int tcp_fin_timeout;
int udp_timeout;
};

int main() {
int ret = -1;
int sockfd = -1;
struct ipvs_timeout_t to;

sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
if (sockfd == -1) {
printf("socket init error\n");
return -1;
}

to.tcp_timeout = -2147483647;
to.tcp_fin_timeout = -2147483647;
to.udp_timeout = -2147483647;

ret = setsockopt(sockfd,
 IPPROTO_IP,
 IP_VS_SO_SET_TIMEOUT,
 (char *)(&to),
 sizeof(to));

printf("setsockopt return %d\n", ret);
return ret;
}

Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'.

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoiommu/amd: Unmap all mapped pages in error path of map_sg
Jerry Snitselaar [Sat, 19 Jan 2019 17:38:05 +0000 (10:38 -0700)]
iommu/amd: Unmap all mapped pages in error path of map_sg

[ Upstream commit f1724c0883bb0ce93b8dcb94b53dcca3b75ac9a7 ]

In the error path of map_sg there is an incorrect if condition
for breaking out of the loop that searches the scatterlist
for mapped pages to unmap. Instead of breaking out of the
loop once all the pages that were mapped have been unmapped,
it will break out of the loop after it has unmapped 1 page.
Fix the condition, so it breaks out of the loop only after
all the mapped pages have been unmapped.

Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg")
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoiommu/amd: Call free_iova_fast with pfn in map_sg
Jerry Snitselaar [Thu, 17 Jan 2019 19:29:02 +0000 (12:29 -0700)]
iommu/amd: Call free_iova_fast with pfn in map_sg

[ Upstream commit 51d8838d66d3249508940d8f59b07701f2129723 ]

In the error path of map_sg, free_iova_fast is being called with
address instead of the pfn. This results in a bad value getting into
the rcache, and can result in hitting a BUG_ON when
iova_magazine_free_pfns is called.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoIB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
Brian Welty [Thu, 17 Jan 2019 20:41:32 +0000 (12:41 -0800)]
IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM

[ Upstream commit 904bba211acc2112fdf866e5a2bc6cd9ecd0de1b ]

The work completion length for a receiving a UD send with immediate is
short by 4 bytes causing application using this opcode to fail.

The UD receive logic incorrectly subtracts 4 bytes for immediate
value. These bytes are already included in header length and are used to
calculate header/payload split, so the result is these 4 bytes are
subtracted twice, once when the header length subtracted from the overall
length and once again in the UD opcode specific path.

Remove the extra subtraction when handling the opcode.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoperf tools: Handle TOPOLOGY headers with no CPU
Stephane Eranian [Sat, 19 Jan 2019 08:12:39 +0000 (00:12 -0800)]
perf tools: Handle TOPOLOGY headers with no CPU

[ Upstream commit 1497e804d1a6e2bd9107ddf64b0310449f4673eb ]

This patch fixes an issue in cpumap.c when used with the TOPOLOGY
header. In some configurations, some NUMA nodes may have no CPU (empty
cpulist). Yet a cpumap map must be created otherwise perf abort with an
error. This patch handles this case by creating a dummy map.

  Before:

  $ perf record -o - -e cycles noploop 2 | perf script -i -
  0x6e8 [0x6c]: failed to process type: 80

  After:

  $ perf record -o - -e cycles noploop 2 | perf script -i -
  noploop for 2 seconds

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1547885559-1657-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoperf core: Fix perf_proc_update_handler() bug
Stephane Eranian [Fri, 11 Jan 2019 01:17:16 +0000 (17:17 -0800)]
perf core: Fix perf_proc_update_handler() bug

[ Upstream commit 1a51c5da5acc6c188c917ba572eebac5f8793432 ]

The perf_proc_update_handler() handles /proc/sys/kernel/perf_event_max_sample_rate
syctl variable.  When the PMU IRQ handler timing monitoring is disabled, i.e,
when /proc/sys/kernel/perf_cpu_time_max_percent is equal to 0 or 100,
then no modification to sysctl_perf_event_sample_rate is allowed to prevent
possible hang from wrong values.

The problem is that the test to prevent modification is made after the
sysctl variable is modified in perf_proc_update_handler().

You get an error:

  $ echo 10001 >/proc/sys/kernel/perf_event_max_sample_rate
  echo: write error: invalid argument

But the value is still modified causing all sorts of inconsistencies:

  $ cat /proc/sys/kernel/perf_event_max_sample_rate
  10001

This patch fixes the problem by moving the parsing of the value after
the test.

Committer testing:

  # echo 100 > /proc/sys/kernel/perf_cpu_time_max_percent
  # echo 10001 > /proc/sys/kernel/perf_event_max_sample_rate
  -bash: echo: write error: Invalid argument
  # cat /proc/sys/kernel/perf_event_max_sample_rate
  10001
  #

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1547169436-6266-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agovti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel
Su Yanjun [Mon, 7 Jan 2019 02:31:20 +0000 (21:31 -0500)]
vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel

[ Upstream commit dd9ee3444014e8f28c0eefc9fffc9ac9c5248c12 ]

Recently we run a network test over ipcomp virtual tunnel.We find that
if a ipv4 packet needs fragment, then the peer can't receive
it.

We deep into the code and find that when packet need fragment the smaller
fragment will be encapsulated by ipip not ipcomp. So when the ipip packet
goes into xfrm, it's skb->dev is not properly set. The ipv4 reassembly code
always set skb'dev to the last fragment's dev. After ipv4 defrag processing,
when the kernel rp_filter parameter is set, the skb will be drop by -EXDEV
error.

This patch adds compatible support for the ipip process in ipcomp virtual tunnel.

Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agomedia: uvcvideo: Fix 'type' check leading to overflow
Alistair Strachan [Wed, 19 Dec 2018 01:32:48 +0000 (20:32 -0500)]
media: uvcvideo: Fix 'type' check leading to overflow

commit 47bb117911b051bbc90764a8bff96543cbd2005f upstream.

When initially testing the Camera Terminal Descriptor wTerminalType
field (buffer[4]), no mask is used. Later in the function, the MSB is
overloaded to store the descriptor subtype, and so a mask of 0x7fff
is used to check the type.

If a descriptor is specially crafted to set this overloaded bit in the
original wTerminalType field, the initial type check will fail (falling
through, without adjusting the buffer size), but the later type checks
will pass, assuming the buffer has been made suitably large, causing an
overflow.

Avoid this problem by checking for the MSB in the wTerminalType field.
If the bit is set, assume the descriptor is bad, and abort parsing it.

Originally reported here:
https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8
A similar (non-compiling) patch was provided at that time.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Alistair Strachan <astrachan@google.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: core: reset host byte in DID_NEXUS_FAILURE case
Martin Wilck [Thu, 14 Feb 2019 21:57:41 +0000 (22:57 +0100)]
scsi: core: reset host byte in DID_NEXUS_FAILURE case

commit 4a067cf823d9d8e50d41cfb618011c0d4a969c72 upstream.

Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to
DID_OK for various cases including DID_NEXUS_FAILURE.  Commit
2a842acab109 ("block: introduce new block status code type") replaced this
function with scsi_result_to_blk_status() and removed the host-byte
resetting code for the DID_NEXUS_FAILURE case.  As the line
set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose
this was an editing mistake.

The fact that the host byte remains set after 4.13 is causing problems with
the sg_persist tool, which now returns success rather then exit status 24
when a RESERVATION CONFLICT error is encountered.

Fixes: 2a842acab109 "block: introduce new block status code type"
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoexec: Fix mem leak in kernel_read_file
YueHaibing [Tue, 19 Feb 2019 02:10:38 +0000 (10:10 +0800)]
exec: Fix mem leak in kernel_read_file

commit f612acfae86af7ecad754ae6a46019be9da05b8e upstream.

syzkaller report this:
BUG: memory leak
unreferenced object 0xffffc9000488d000 (size 9195520):
  comm "syz-executor.0", pid 2752, jiffies 4294787496 (age 18.757s)
  hex dump (first 32 bytes):
    ff ff ff ff ff ff ff ff a8 00 00 00 01 00 00 00  ................
    02 00 00 00 00 00 00 00 80 a1 7a c1 ff ff ff ff  ..........z.....
  backtrace:
    [<000000000863775c>] __vmalloc_node mm/vmalloc.c:1795 [inline]
    [<000000000863775c>] __vmalloc_node_flags mm/vmalloc.c:1809 [inline]
    [<000000000863775c>] vmalloc+0x8c/0xb0 mm/vmalloc.c:1831
    [<000000003f668111>] kernel_read_file+0x58f/0x7d0 fs/exec.c:924
    [<000000002385813f>] kernel_read_file_from_fd+0x49/0x80 fs/exec.c:993
    [<0000000011953ff1>] __do_sys_finit_module+0x13b/0x2a0 kernel/module.c:3895
    [<000000006f58491f>] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    [<00000000ee78baf4>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<00000000241f889b>] 0xffffffffffffffff

It should goto 'out_free' lable to free allocated buf while kernel_read
fails.

Fixes: 39d637af5aa7 ("vfs: forbid write access when reading a file into memory")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thibaut Sautereau <thibaut@sautereau.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoBluetooth: Fix locking in bt_accept_enqueue() for BH context
Matthias Kaehlcke [Thu, 3 Jan 2019 00:11:20 +0000 (16:11 -0800)]
Bluetooth: Fix locking in bt_accept_enqueue() for BH context

commit c4f5627f7eeecde1bb6b646d8c0907b96dc2b2a6 upstream.

With commit e16337622016 ("Bluetooth: Handle bt_accept_enqueue() socket
atomically") lock_sock[_nested]() is used to acquire the socket lock
before manipulating the socket. lock_sock[_nested]() may block, which
is problematic since bt_accept_enqueue() can be called in bottom half
context (e.g. from rfcomm_connect_ind()):

[<ffffff80080d81ec>] __might_sleep+0x4c/0x80
[<ffffff800876c7b0>] lock_sock_nested+0x24/0x58
[<ffffff8000d7c27c>] bt_accept_enqueue+0x48/0xd4 [bluetooth]
[<ffffff8000e67d8c>] rfcomm_connect_ind+0x190/0x218 [rfcomm]

Add a parameter to bt_accept_enqueue() to indicate whether the
function is called from BH context, and acquire the socket lock
with bh_lock_sock_nested() if that's the case.

Also adapt all callers of bt_accept_enqueue() to pass the new
parameter:

- l2cap_sock_new_connection_cb()
  - uses lock_sock() to lock the parent socket => process context

- rfcomm_connect_ind()
  - acquires the parent socket lock with bh_lock_sock() => BH
    context

- __sco_chan_add()
  - called from sco_chan_add(), which is called from sco_connect().
    parent is NULL, hence bt_accept_enqueue() isn't called in this
    code path and we can ignore it
  - also called from sco_conn_ready(). uses bh_lock_sock() to acquire
    the parent lock => BH context

Fixes: e16337622016 ("Bluetooth: Handle bt_accept_enqueue() socket atomically")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoxtensa: fix get_wchan
Max Filippov [Wed, 2 Jan 2019 09:08:32 +0000 (01:08 -0800)]
xtensa: fix get_wchan

commit d90b88fd3653f1fb66ecc6571b860d5a5749fa56 upstream.

Stack unwinding is implemented incorrectly in xtensa get_wchan: instead
of extracting a0 and a1 registers from the spill location under the
stack pointer it extracts a word pointed to by the stack pointer and
subtracts 4 or 3 from it.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agohugetlbfs: fix races and page leaks during migration
Mike Kravetz [Fri, 1 Mar 2019 00:22:02 +0000 (16:22 -0800)]
hugetlbfs: fix races and page leaks during migration

commit cb6acd01e2e43fd8bad11155752b7699c3d0fb76 upstream.

hugetlb pages should only be migrated if they are 'active'.  The
routines set/clear_page_huge_active() modify the active state of hugetlb
pages.

When a new hugetlb page is allocated at fault time, set_page_huge_active
is called before the page is locked.  Therefore, another thread could
race and migrate the page while it is being added to page table by the
fault code.  This race is somewhat hard to trigger, but can be seen by
strategically adding udelay to simulate worst case scheduling behavior.
Depending on 'how' the code races, various BUG()s could be triggered.

To address this issue, simply delay the set_page_huge_active call until
after the page is successfully added to the page table.

Hugetlb pages can also be leaked at migration time if the pages are
associated with a file in an explicitly mounted hugetlbfs filesystem.
For example, consider a two node system with 4GB worth of huge pages
available.  A program mmaps a 2G file in a hugetlbfs filesystem.  It
then migrates the pages associated with the file from one node to
another.  When the program exits, huge page counts are as follows:

  node0
  1024    free_hugepages
  1024    nr_hugepages

  node1
  0       free_hugepages
  1024    nr_hugepages

  Filesystem                         Size  Used Avail Use% Mounted on
  nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool

That is as expected.  2G of huge pages are taken from the free_hugepages
counts, and 2G is the size of the file in the explicitly mounted
filesystem.  If the file is then removed, the counts become:

  node0
  1024    free_hugepages
  1024    nr_hugepages

  node1
  1024    free_hugepages
  1024    nr_hugepages

  Filesystem                         Size  Used Avail Use% Mounted on
  nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool

Note that the filesystem still shows 2G of pages used, while there
actually are no huge pages in use.  The only way to 'fix' the filesystem
accounting is to unmount the filesystem

If a hugetlb page is associated with an explicitly mounted filesystem,
this information in contained in the page_private field.  At migration
time, this information is not preserved.  To fix, simply transfer
page_private from old to new page at migration time if necessary.

There is a related race with removing a huge page from a file and
migration.  When a huge page is removed from the pagecache, the
page_mapping() field is cleared, yet page_private remains set until the
page is actually freed by free_huge_page().  A page could be migrated
while in this state.  However, since page_mapping() is not set the
hugetlbfs specific routine to transfer page_private is not called and we
leak the page count in the filesystem.

To fix that, check for this condition before migrating a huge page.  If
the condition is detected, return EBUSY for the page.

Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com
Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com
Fixes: bcc54222309c ("mm: hugetlb: introduce page_huge_active")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
[mike.kravetz@oracle.com: v2]
Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@oracle.com
[mike.kravetz@oracle.com: update comment and changelog]
Link: http://lkml.kernel.org/r/420bcfd6-158b-38e4-98da-26d0cd85bd01@oracle.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMIPS: irq: Allocate accurate order pages for irq stack
Liu Xiang [Sat, 16 Feb 2019 09:12:24 +0000 (17:12 +0800)]
MIPS: irq: Allocate accurate order pages for irq stack

commit 72faa7a773ca59336f3c889e878de81445c5a85c upstream.

The irq_pages is the number of pages for irq stack, but not the
order which is needed by __get_free_pages().
We can use get_order() to calculate the accurate order.

Signed-off-by: Liu Xiang <liu.xiang6@zte.com.cn>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: fe8bd18ffea5 ("MIPS: Introduce irq_stack")
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoapplicom: Fix potential Spectre v1 vulnerabilities
Gustavo A. R. Silva [Wed, 9 Jan 2019 22:05:10 +0000 (16:05 -0600)]
applicom: Fix potential Spectre v1 vulnerabilities

commit d7ac3c6ef5d8ce14b6381d52eb7adafdd6c8bb3c upstream.

IndexCard is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/char/applicom.c:418 ac_write() warn: potential spectre issue 'apbs' [r]
drivers/char/applicom.c:728 ac_ioctl() warn: potential spectre issue 'apbs' [r] (local cap)

Fix this by sanitizing IndexCard before using it to index apbs.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/CPU/AMD: Set the CPB bit unconditionally on F17h
Jiaxun Yang [Tue, 20 Nov 2018 03:00:18 +0000 (11:00 +0800)]
x86/CPU/AMD: Set the CPB bit unconditionally on F17h

commit 0237199186e7a4aa5310741f0a6498a20c820fd7 upstream.

Some F17h models do not have CPB set in CPUID even though the CPU
supports it. Set the feature bit unconditionally on all F17h.

 [ bp: Rewrite commit message and patch. ]

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181120030018.5185-1-jiaxun.yang@flygoat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: mv88e6xxx: Fix statistics on mv88e6161
Andrew Lunn [Fri, 1 Mar 2019 22:43:39 +0000 (23:43 +0100)]
net: dsa: mv88e6xxx: Fix statistics on mv88e6161

[ Upstream commit a6da21bb0eae459a375d5bd48baed821d14301d0 ]

Despite what the datesheet says, the silicon implements the older way
of snapshoting the statistics. Change the op.

Reported-by: Chris.Healy@zii.aero
Tested-by: Chris.Healy@zii.aero
Fixes: 0ac64c394900 ("net: dsa: mv88e6xxx: mv88e6161 uses mv88e6320 stats snapshot")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: phy: Micrel KSZ8061: link failure after cable connect
Rajasingh Thavamani [Wed, 27 Feb 2019 12:13:19 +0000 (17:43 +0530)]
net: phy: Micrel KSZ8061: link failure after cable connect

[ Upstream commit 232ba3a51cc224b339c7114888ed7f0d4d95695e ]

With Micrel KSZ8061 PHY, the link may occasionally not come up after
Ethernet cable connect. The vendor's (Microchip, former Micrel) errata
sheet 80000688A.pdf descripes the problem and possible workarounds in
detail, see below.
The batch implements workaround 1, which permanently fixes the issue.

DESCRIPTION
Link-up may not occur properly when the Ethernet cable is initially
connected. This issue occurs more commonly when the cable is connected
slowly, but it may occur any time a cable is connected. This issue occurs
in the auto-negotiation circuit, and will not occur if auto-negotiation
is disabled (which requires that the two link partners be set to the
same speed and duplex).

END USER IMPLICATIONS
When this issue occurs, link is not established. Subsequent cable
plug/unplaug cycle will not correct the issue.

WORk AROUND
There are four approaches to work around this issue:
1. This issue can be prevented by setting bit 15 in MMD device address 1,
   register 2, prior to connecting the cable or prior to setting the
   Restart Auto-negotiation bit in register 0h. The MMD registers are
   accessed via the indirect access registers Dh and Eh, or via the Micrel
   EthUtil utility as shown here:
   . if using the EthUtil utility (usually with a Micrel KSZ8061
     Evaluation Board), type the following commands:
     > address 1
     > mmd 1
     > iw 2 b61a
   . Alternatively, write the following registers to write to the
     indirect MMD register:
     Write register Dh, data 0001h
     Write register Eh, data 0002h
     Write register Dh, data 4001h
     Write register Eh, data B61Ah
2. The issue can be avoided by disabling auto-negotiation in the KSZ8061,
   either by the strapping option, or by clearing bit 12 in register 0h.
   Care must be taken to ensure that the KSZ8061 and the link partner
   will link with the same speed and duplex. Note that the KSZ8061
   defaults to full-duplex when auto-negotiation is off, but other
   devices may default to half-duplex in the event of failed
   auto-negotiation.
3. The issue can be avoided by connecting the cable prior to powering-up
   or resetting the KSZ8061, and leaving it plugged in thereafter.
4. If the above measures are not taken and the problem occurs, link can
   be recovered by setting the Restart Auto-Negotiation bit in
   register 0h, or by resetting or power cycling the device. Reset may
   be either hardware reset or software reset (register 0h, bit 15).

PLAN
This errata will not be corrected in the future revision.

Fixes: 7ab59dc15e2f ("drivers/net/phy/micrel_phy: Add support for new PHYs")
Signed-off-by: Alexander Onnasch <alexander.onnasch@landisgyr.com>
Signed-off-by: Rajasingh Thavamani <T.Rajasingh@landisgyr.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotun: remove unnecessary memory barrier
Timur Celik [Mon, 25 Feb 2019 20:13:13 +0000 (21:13 +0100)]
tun: remove unnecessary memory barrier

[ Upstream commit ecef67cb10db7b83b3b71c61dbb29aa070ab0112 ]

Replace set_current_state with __set_current_state since no memory
barrier is needed at this point.

Signed-off-by: Timur Celik <mail@timurcelik.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotun: fix blocking read
Timur Celik [Sat, 23 Feb 2019 11:53:13 +0000 (12:53 +0100)]
tun: fix blocking read

[ Upstream commit 71828b2240692cec0e68b8d867bc00e1745e7fae ]

This patch moves setting of the current state into the loop. Otherwise
the task may end up in a busy wait loop if none of the break conditions
are met.

Signed-off-by: Timur Celik <mail@timurcelik.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agompls: Return error for RTA_GATEWAY attribute
David Ahern [Tue, 26 Feb 2019 17:00:04 +0000 (09:00 -0800)]
mpls: Return error for RTA_GATEWAY attribute

[ Upstream commit be48220edd48ca0d569782992840488a52373a24 ]

MPLS does not support nexthops with an MPLS address family.
Specifically, it does not handle RTA_GATEWAY attribute. Make it
clear by returning an error.

Fixes: 03c0566542f4c ("mpls: Netlink commands to add, remove, and dump routes")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoipv6: Return error for RTA_VIA attribute
David Ahern [Tue, 26 Feb 2019 17:00:03 +0000 (09:00 -0800)]
ipv6: Return error for RTA_VIA attribute

[ Upstream commit e3818541b49fb88650ba339d33cc53e4095da5b3 ]

IPv6 currently does not support nexthops outside of the AF_INET6 family.
Specifically, it does not handle RTA_VIA attribute. If it is passed
in a route add request, the actual route added only uses the device
which is clearly not what the user intended:

  $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
  $ ip ro ls
  ...
  2001:db8:2::/64 dev eth0 metric 1024 pref medium

Catch this and fail the route add:
  $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
  Error: IPv6 does not support RTA_VIA attribute.

Fixes: 03c0566542f4c ("mpls: Netlink commands to add, remove, and dump routes")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoipv4: Return error for RTA_VIA attribute
David Ahern [Tue, 26 Feb 2019 17:00:02 +0000 (09:00 -0800)]
ipv4: Return error for RTA_VIA attribute

[ Upstream commit b6e9e5df4ecf100f6a10ab2ade8e46d47a4b9779 ]

IPv4 currently does not support nexthops outside of the AF_INET family.
Specifically, it does not handle RTA_VIA attribute. If it is passed
in a route add request, the actual route added only uses the device
which is clearly not what the user intended:

  $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
  $ ip ro ls
  ...
  172.16.1.0/24 dev eth0

Catch this and fail the route add:
  $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
  Error: IPv4 does not support RTA_VIA attribute.

Fixes: 03c0566542f4c ("mpls: Netlink commands to add, remove, and dump routes")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: avoid use IPCB in cipso_v4_error
Nazarov Sergey [Mon, 25 Feb 2019 16:27:15 +0000 (19:27 +0300)]
net: avoid use IPCB in cipso_v4_error

[ Upstream commit 3da1ed7ac398f34fff1694017a07054d69c5f5c5 ]

Extract IP options in cipso_v4_error and use __icmp_send.

Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: Add __icmp_send helper.
Nazarov Sergey [Mon, 25 Feb 2019 16:24:15 +0000 (19:24 +0300)]
net: Add __icmp_send helper.

[ Upstream commit 9ef6b42ad6fd7929dd1b6092cb02014e382c6a91 ]

Add __icmp_send function having ip_options struct parameter

Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoxen-netback: fix occasional leak of grant ref mappings under memory pressure
Igor Druzhinin [Thu, 28 Feb 2019 12:48:03 +0000 (12:48 +0000)]
xen-netback: fix occasional leak of grant ref mappings under memory pressure

[ Upstream commit 99e87f56b48f490fb16b6e0f74691c1e664dea95 ]

Zero-copy callback flag is not yet set on frag list skb at the moment
xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
leaking grant ref mappings since xenvif_zerocopy_callback() is never
called for these fragments. Those eventually build up and cause Xen
to kill Dom0 as the slots get reused for new mappings:

"d0v0 Attempt to implicitly unmap a granted PTE c010000329fce005"

That behavior is observed under certain workloads where sudden spikes
of page cache writes coexist with active atomic skb allocations from
network traffic. Additionally, rework the logic to deal with frag_list
deallocation in a single place.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoxen-netback: don't populate the hash cache on XenBus disconnect
Igor Druzhinin [Thu, 28 Feb 2019 14:11:26 +0000 (14:11 +0000)]
xen-netback: don't populate the hash cache on XenBus disconnect

[ Upstream commit a2288d4e355992d369c50c45d017a85f6061ff71 ]

Occasionally, during the disconnection procedure on XenBus which
includes hash cache deinitialization there might be some packets
still in-flight on other processors. Handling of these packets includes
hashing and hash cache population that finally results in hash cache
data structure corruption.

In order to avoid this we prevent hashing of those packets if there
are no queues initialized. In that case RCU protection of queues guards
the hash cache as well.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: socket: set sock->sk to NULL after calling proto_ops::release()
Eric Biggers [Thu, 21 Feb 2019 22:13:56 +0000 (14:13 -0800)]
net: socket: set sock->sk to NULL after calling proto_ops::release()

[ Upstream commit ff7b11aa481f682e0e9711abfeb7d03f5cd612bf ]

Commit 9060cb719e61 ("net: crypto set sk to NULL when af_alg_release.")
fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
closed concurrently with fchownat().  However, it ignored that many
other proto_ops::release() methods don't set sock->sk to NULL and
therefore allow the same use-after-free:

    - base_sock_release
    - bnep_sock_release
    - cmtp_sock_release
    - data_sock_release
    - dn_release
    - hci_sock_release
    - hidp_sock_release
    - iucv_sock_release
    - l2cap_sock_release
    - llcp_sock_release
    - llc_ui_release
    - rawsock_release
    - rfcomm_sock_release
    - sco_sock_release
    - svc_release
    - vcc_release
    - x25_release

Rather than fixing all these and relying on every socket type to get
this right forever, just make __sock_release() set sock->sk to NULL
itself after calling proto_ops::release().

Reproducer that produces the KASAN splat when any of these socket types
are configured into the kernel:

    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    pthread_t t;
    volatile int fd;

    void *close_thread(void *arg)
    {
        for (;;) {
            usleep(rand() % 100);
            close(fd);
        }
    }

    int main()
    {
        pthread_create(&t, NULL, close_thread, NULL);
        for (;;) {
            fd = socket(rand() % 50, rand() % 11, 0);
            fchownat(fd, "", 1000, 1000, 0x1000);
            close(fd);
        }
    }

Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: sit: fix memory leak in sit_init_net()
Mao Wenan [Fri, 1 Mar 2019 15:06:40 +0000 (23:06 +0800)]
net: sit: fix memory leak in sit_init_net()

[ Upstream commit 07f12b26e21ab359261bf75cfcb424fdc7daeb6d ]

If register_netdev() is failed to register sitn->fb_tunnel_dev,
it will go to err_reg_dev and forget to free netdev(sitn->fb_tunnel_dev).

BUG: memory leak
unreferenced object 0xffff888378daad00 (size 512):
  comm "syz-executor.1", pid 4006, jiffies 4295121142 (age 16.115s)
  hex dump (first 32 bytes):
    00 e6 ed c0 83 88 ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
backtrace:
    [<00000000d6dcb63e>] kvmalloc include/linux/mm.h:577 [inline]
    [<00000000d6dcb63e>] kvzalloc include/linux/mm.h:585 [inline]
    [<00000000d6dcb63e>] netif_alloc_netdev_queues net/core/dev.c:8380 [inline]
    [<00000000d6dcb63e>] alloc_netdev_mqs+0x600/0xcc0 net/core/dev.c:8970
    [<00000000867e172f>] sit_init_net+0x295/0xa40 net/ipv6/sit.c:1848
    [<00000000871019fa>] ops_init+0xad/0x3e0 net/core/net_namespace.c:129
    [<00000000319507f6>] setup_net+0x2ba/0x690 net/core/net_namespace.c:314
    [<0000000087db4f96>] copy_net_ns+0x1dc/0x330 net/core/net_namespace.c:437
    [<0000000057efc651>] create_new_namespaces+0x382/0x730 kernel/nsproxy.c:107
    [<00000000676f83de>] copy_namespaces+0x2ed/0x3d0 kernel/nsproxy.c:165
    [<0000000030b74bac>] copy_process.part.27+0x231e/0x6db0 kernel/fork.c:1919
    [<00000000fff78746>] copy_process kernel/fork.c:1713 [inline]
    [<00000000fff78746>] _do_fork+0x1bc/0xe90 kernel/fork.c:2224
    [<000000001c2e0d1c>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
    [<00000000ec48bd44>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<0000000039acff8a>] 0xffffffffffffffff

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: phy: phylink: fix uninitialized variable in phylink_get_mac_state
Heiner Kallweit [Tue, 26 Feb 2019 18:29:22 +0000 (19:29 +0100)]
net: phy: phylink: fix uninitialized variable in phylink_get_mac_state

[ Upstream commit d25ed413d5e51644e18f66e34eec049f17a7abcb ]

When debugging an issue I found implausible values in state->pause.
Reason in that state->pause isn't initialized and later only single
bits are changed. Also the struct itself isn't initialized in
phylink_resolve(). So better initialize state->pause and other
not yet initialized fields.

v2:
- use right function name in subject
v3:
- initialize additional fields

Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails
YueHaibing [Fri, 22 Feb 2019 07:37:58 +0000 (15:37 +0800)]
net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails

[ Upstream commit 58bdd544e2933a21a51eecf17c3f5f94038261b5 ]

KASAN report this:

BUG: KASAN: null-ptr-deref in nfc_llcp_build_gb+0x37f/0x540 [nfc]
Read of size 3 at addr 0000000000000000 by task syz-executor.0/5401

CPU: 0 PID: 5401 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xfa/0x1ce lib/dump_stack.c:113
 kasan_report+0x171/0x18d mm/kasan/report.c:321
 memcpy+0x1f/0x50 mm/kasan/common.c:130
 nfc_llcp_build_gb+0x37f/0x540 [nfc]
 nfc_llcp_register_device+0x6eb/0xb50 [nfc]
 nfc_register_device+0x50/0x1d0 [nfc]
 nfcsim_device_new+0x394/0x67d [nfcsim]
 ? 0xffffffffc1080000
 nfcsim_init+0x6b/0x1000 [nfcsim]
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f9cb79dcc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
RBP: 00007f9cb79dcc70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9cb79dd6bc
R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004

nfc_llcp_build_tlv will return NULL on fails, caller should check it,
otherwise will trigger a NULL dereference.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: eda21f16a5ed ("NFC: Set MIU and RW values from CONNECT and CC LLCP frames")
Fixes: d646960f7986 ("NFC: Initial LLCP support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: netem: fix skb length BUG_ON in __skb_to_sgvec
Sheng Lan [Thu, 28 Feb 2019 10:47:58 +0000 (18:47 +0800)]
net: netem: fix skb length BUG_ON in __skb_to_sgvec

[ Upstream commit 5845f706388a4cde0f6b80f9e5d33527e942b7d9 ]

It can be reproduced by following steps:
1. virtio_net NIC is configured with gso/tso on
2. configure nginx as http server with an index file bigger than 1M bytes
3. use tc netem to produce duplicate packets and delay:
   tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
4. continually curl the nginx http server to get index file on client
5. BUG_ON is seen quickly

[10258690.371129] kernel BUG at net/core/skbuff.c:4028!
[10258690.371748] invalid opcode: 0000 [#1] SMP PTI
[10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W         5.0.0-rc6 #2
[10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202
[10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea
[10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002
[10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028
[10258690.372094] FS:  0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000
[10258690.372094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0
[10258690.372094] Call Trace:
[10258690.372094]  <IRQ>
[10258690.372094]  skb_to_sgvec+0x11/0x40
[10258690.372094]  start_xmit+0x38c/0x520 [virtio_net]
[10258690.372094]  dev_hard_start_xmit+0x9b/0x200
[10258690.372094]  sch_direct_xmit+0xff/0x260
[10258690.372094]  __qdisc_run+0x15e/0x4e0
[10258690.372094]  net_tx_action+0x137/0x210
[10258690.372094]  __do_softirq+0xd6/0x2a9
[10258690.372094]  irq_exit+0xde/0xf0
[10258690.372094]  smp_apic_timer_interrupt+0x74/0x140
[10258690.372094]  apic_timer_interrupt+0xf/0x20
[10258690.372094]  </IRQ>

In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's
linear data size and nonlinear data size, thus BUG_ON triggered.
Because the skb is cloned and a part of nonlinear data is split off.

Duplicate packet is cloned in netem_enqueue() and may be delayed
some time in qdisc. When qdisc len reached the limit and returns
NET_XMIT_DROP, the skb will be retransmit later in write queue.
the skb will be fragmented by tso_fragment(), the limit size
that depends on cwnd and mss decrease, the skb's nonlinear
data will be split off. The length of the skb cloned by netem
will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(),
the BUG_ON trigger.

To fix it, netem returns NET_XMIT_SUCCESS to upper stack
when it clones a duplicate packet.

Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
Signed-off-by: Sheng Lan <lansheng@huawei.com>
Reported-by: Qin Ji <jiqin.ji@huawei.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonetlabel: fix out-of-bounds memory accesses
Paul Moore [Tue, 26 Feb 2019 00:06:06 +0000 (19:06 -0500)]
netlabel: fix out-of-bounds memory accesses

[ Upstream commit 5578de4834fe0f2a34fedc7374be691443396d1f ]

There are two array out-of-bounds memory accesses, one in
cipso_v4_map_lvl_valid(), the other in netlbl_bitmap_walk().  Both
errors are embarassingly simple, and the fixes are straightforward.

As a FYI for anyone backporting this patch to kernels prior to v4.8,
you'll want to apply the netlbl_bitmap_walk() patch to
cipso_v4_bitmap_walk() as netlbl_bitmap_walk() doesn't exist before
Linux v4.8.

Reported-by: Jann Horn <jannh@google.com>
Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine")
Fixes: 3faa8f982f95 ("netlabel: Move bitmap manipulation functions to the NetLabel core.")
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: mv88e6xxx: Fix u64 statistics
Andrew Lunn [Thu, 28 Feb 2019 17:14:03 +0000 (18:14 +0100)]
net: dsa: mv88e6xxx: Fix u64 statistics

[ Upstream commit 6e46e2d821bb22b285ae8187959096b65d063b0d ]

The switch maintains u64 counters for the number of octets sent and
received. These are kept as two u32's which need to be combined.  Fix
the combing, which wrongly worked on u16's.

Fixes: 80c4627b2719 ("dsa: mv88x6xxx: Refactor getting a single statistic")
Reported-by: Chris Healy <Chris.Healy@zii.aero>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agohv_netvsc: Fix IP header checksum for coalesced packets
Haiyang Zhang [Fri, 22 Feb 2019 18:25:03 +0000 (18:25 +0000)]
hv_netvsc: Fix IP header checksum for coalesced packets

[ Upstream commit bf48648d650db1146b75b9bd358502431e86cf4f ]

Incoming packets may have IP header checksum verified by the host.
They may not have IP header checksum computed after coalescing.
This patch re-compute the checksum when necessary, otherwise the
packets may be dropped, because Linux network stack always checks it.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogeneve: correctly handle ipv6.disable module parameter
Jiri Benc [Thu, 28 Feb 2019 13:56:04 +0000 (14:56 +0100)]
geneve: correctly handle ipv6.disable module parameter

[ Upstream commit cf1c9ccba7308e48a68fa77f476287d9d614e4c7 ]

When IPv6 is compiled but disabled at runtime, geneve_sock_add returns
-EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
operation of bringing up the tunnel.

Ignore failure of IPv6 socket creation for metadata based tunnels caused by
IPv6 not being available.

This is the same fix as what commit d074bf960044 ("vxlan: correctly handle
ipv6.disable module parameter") is doing for vxlan.

Note there's also commit c0a47e44c098 ("geneve: should not call rt6_lookup()
when ipv6 was disabled") which fixes a similar issue but for regular
tunnels, while this patch is needed for metadata based tunnels.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobnxt_en: Drop oversize TX packets to prevent errors.
Michael Chan [Wed, 27 Feb 2019 08:58:53 +0000 (03:58 -0500)]
bnxt_en: Drop oversize TX packets to prevent errors.

[ Upstream commit 2b3c6885386020b1b9d92d45e8349637e27d1f66 ]

There have been reports of oversize UDP packets being sent to the
driver to be transmitted, causing error conditions.  The issue is
likely caused by the dst of the SKB switching between 'lo' with
64K MTU and the hardware device with a smaller MTU.  Patches are
being proposed by Mahesh Bandewar <maheshb@google.com> to fix the
issue.

In the meantime, add a quick length check in the driver to prevent
the error.  The driver uses the TX packet size as index to look up an
array to setup the TX BD.  The array is large enough to support all MTU
sizes supported by the driver.  The oversize TX packet causes the
driver to index beyond the array and put garbage values into the
TX BD.  Add a simple check to prevent this.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotipc: fix RDM/DGRAM connect() regression
Erik Hugne [Mon, 4 Mar 2019 22:26:10 +0000 (23:26 +0100)]
tipc: fix RDM/DGRAM connect() regression

[ Upstream commit 0e63208915a8d7590d0a6218dadb2a6a00ac705a ]

Fix regression bug introduced in
commit 365ad353c256 ("tipc: reduce risk of user starvation during link
congestion")

Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
sockaddr.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoteam: Free BPF filter when unregistering netdev
Ido Schimmel [Sun, 3 Mar 2019 07:35:51 +0000 (07:35 +0000)]
team: Free BPF filter when unregistering netdev

[ Upstream commit 692c31bd4054212312396b1d303bffab2c5b93a7 ]

When team is used in loadbalance mode a BPF filter can be used to
provide a hash which will determine the Tx port.

When the netdev is later unregistered the filter is not freed which
results in memory leaks [1].

Fix by freeing the program and the corresponding filter when
unregistering the netdev.

[1]
unreferenced object 0xffff8881dbc47cc8 (size 16):
  comm "teamd", pid 3068, jiffies 4294997779 (age 438.247s)
  hex dump (first 16 bytes):
    a3 00 6b 6b 6b 6b 6b 6b 88 a5 82 e1 81 88 ff ff  ..kkkkkk........
  backtrace:
    [<000000008a3b47e3>] team_nl_cmd_options_set+0x88f/0x11b0
    [<00000000c4f4f27e>] genl_family_rcv_msg+0x78f/0x1080
    [<00000000610ef838>] genl_rcv_msg+0xca/0x170
    [<00000000a281df93>] netlink_rcv_skb+0x132/0x380
    [<000000004d9448a2>] genl_rcv+0x29/0x40
    [<000000000321b2f4>] netlink_unicast+0x4c0/0x690
    [<000000008c25dffb>] netlink_sendmsg+0x929/0xe10
    [<00000000068298c5>] sock_sendmsg+0xc8/0x110
    [<0000000082a61ff0>] ___sys_sendmsg+0x77a/0x8f0
    [<00000000663ae29d>] __sys_sendmsg+0xf7/0x250
    [<0000000027c5f11a>] do_syscall_64+0x14d/0x610
    [<000000006cfbc8d3>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<00000000e23197e2>] 0xffffffffffffffff
unreferenced object 0xffff8881e182a588 (size 2048):
  comm "teamd", pid 3068, jiffies 4294997780 (age 438.247s)
  hex dump (first 32 bytes):
    20 00 00 00 02 00 00 00 30 00 00 00 28 f0 ff ff   .......0...(...
    07 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00  ........(.......
  backtrace:
    [<000000002daf01fb>] lb_bpf_func_set+0x45c/0x6d0
    [<000000008a3b47e3>] team_nl_cmd_options_set+0x88f/0x11b0
    [<00000000c4f4f27e>] genl_family_rcv_msg+0x78f/0x1080
    [<00000000610ef838>] genl_rcv_msg+0xca/0x170
    [<00000000a281df93>] netlink_rcv_skb+0x132/0x380
    [<000000004d9448a2>] genl_rcv+0x29/0x40
    [<000000000321b2f4>] netlink_unicast+0x4c0/0x690
    [<000000008c25dffb>] netlink_sendmsg+0x929/0xe10
    [<00000000068298c5>] sock_sendmsg+0xc8/0x110
    [<0000000082a61ff0>] ___sys_sendmsg+0x77a/0x8f0
    [<00000000663ae29d>] __sys_sendmsg+0xf7/0x250
    [<0000000027c5f11a>] do_syscall_64+0x14d/0x610
    [<000000006cfbc8d3>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<00000000e23197e2>] 0xffffffffffffffff

Fixes: 01d7f30a9f96 ("team: add loadbalance mode")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
Kai-Heng Feng [Mon, 4 Mar 2019 07:00:03 +0000 (15:00 +0800)]
sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79

[ Upstream commit b33b7cd6fd86478dd2890a9abeb6f036aa01fdf7 ]

Some sky2 chips fire IRQ after S3, before the driver is fully resumed:
[ 686.804877] do_IRQ: 1.37 No irq handler for vector

This is likely a platform bug that device isn't fully quiesced during
S3. Use MSI-X, maskable MSI or INTx can prevent this issue from
happening.

Since MSI-X and maskable MSI are not supported by this device, fallback
to use INTx on affected platforms.

BugLink: https://bugs.launchpad.net/bugs/1807259
BugLink: https://bugs.launchpad.net/bugs/1809843
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet-sysfs: Fix mem leak in netdev_register_kobject
YueHaibing [Sat, 2 Mar 2019 02:34:55 +0000 (10:34 +0800)]
net-sysfs: Fix mem leak in netdev_register_kobject

[ Upstream commit 895a5e96dbd6386c8e78e5b78e067dcc67b7f0ab ]

syzkaller report this:
BUG: memory leak
unreferenced object 0xffff88837a71a500 (size 256):
  comm "syz-executor.2", pid 9770, jiffies 4297825125 (age 17.843s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 20 c0 ef 86 ff ff ff ff  ........ .......
  backtrace:
    [<00000000db12624b>] netdev_register_kobject+0x124/0x2e0 net/core/net-sysfs.c:1751
    [<00000000dc49a994>] register_netdevice+0xcc1/0x1270 net/core/dev.c:8516
    [<00000000e5f3fea0>] tun_set_iff drivers/net/tun.c:2649 [inline]
    [<00000000e5f3fea0>] __tun_chr_ioctl+0x2218/0x3d20 drivers/net/tun.c:2883
    [<000000001b8ac127>] vfs_ioctl fs/ioctl.c:46 [inline]
    [<000000001b8ac127>] do_vfs_ioctl+0x1a5/0x10e0 fs/ioctl.c:690
    [<0000000079b269f8>] ksys_ioctl+0x89/0xa0 fs/ioctl.c:705
    [<00000000de649beb>] __do_sys_ioctl fs/ioctl.c:712 [inline]
    [<00000000de649beb>] __se_sys_ioctl fs/ioctl.c:710 [inline]
    [<00000000de649beb>] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:710
    [<000000007ebded1e>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
    [<00000000db315d36>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<00000000115be9bb>] 0xffffffffffffffff

It should call kset_unregister to free 'dev->queues_kset'
in error path of register_queue_kobjects, otherwise will cause a mem leak.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 1d24eb4815d1 ("xps: Transmit Packet Steering")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: mv88e6xxx: handle unknown duplex modes gracefully in mv88e6xxx_port_set_duplex
Heiner Kallweit [Fri, 1 Mar 2019 18:53:57 +0000 (19:53 +0100)]
net: dsa: mv88e6xxx: handle unknown duplex modes gracefully in mv88e6xxx_port_set_duplex

[ Upstream commit c6195a8bdfc62a7cecf7df685e64847a4b700275 ]

When testing another issue I faced the problem that
mv88e6xxx_port_setup_mac() failed due to DUPLEX_UNKNOWN being passed
as argument to mv88e6xxx_port_set_duplex(). We should handle this case
gracefully and return -EOPNOTSUPP, like e.g. mv88e6xxx_port_set_speed()
is doing it.

Fixes: 7f1ae07b51e8 ("net: dsa: mv88e6xxx: add port duplex setter")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoip6mr: Do not call __IP6_INC_STATS() from preemptible context
Ido Schimmel [Sun, 3 Mar 2019 07:34:57 +0000 (07:34 +0000)]
ip6mr: Do not call __IP6_INC_STATS() from preemptible context

[ Upstream commit 87c11f1ddbbad38ad8bad47af133a8208985fbdf ]

Similar to commit 44f49dd8b5a6 ("ipmr: fix possible race resulting from
improper usage of IP_INC_STATS_BH() in preemptible context."), we cannot
assume preemption is disabled when incrementing the counter and
accessing a per-CPU variable.

Preemption can be enabled when we add a route in process context that
corresponds to packets stored in the unresolved queue, which are then
forwarded using this route [1].

Fix this by using IP6_INC_STATS() which takes care of disabling
preemption on architectures where it is needed.

[1]
[  157.451447] BUG: using __this_cpu_add() in preemptible [00000000] code: smcrouted/2314
[  157.460409] caller is ip6mr_forward2+0x73e/0x10e0
[  157.460434] CPU: 3 PID: 2314 Comm: smcrouted Not tainted 5.0.0-rc7-custom-03635-g22f2712113f1 #1336
[  157.460449] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  157.460461] Call Trace:
[  157.460486]  dump_stack+0xf9/0x1be
[  157.460553]  check_preemption_disabled+0x1d6/0x200
[  157.460576]  ip6mr_forward2+0x73e/0x10e0
[  157.460705]  ip6_mr_forward+0x9a0/0x1510
[  157.460771]  ip6mr_mfc_add+0x16b3/0x1e00
[  157.461155]  ip6_mroute_setsockopt+0x3cb/0x13c0
[  157.461384]  do_ipv6_setsockopt.isra.8+0x348/0x4060
[  157.462013]  ipv6_setsockopt+0x90/0x110
[  157.462036]  rawv6_setsockopt+0x4a/0x120
[  157.462058]  __sys_setsockopt+0x16b/0x340
[  157.462198]  __x64_sys_setsockopt+0xbf/0x160
[  157.462220]  do_syscall_64+0x14d/0x610
[  157.462349]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 0912ea38de61 ("[IPV6] MROUTE: Add stats in multicast routing module method ip6_mr_forward().")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: android: ion: fix sys heap pool's gfp_flags
Qing Xia [Fri, 1 Feb 2019 06:59:46 +0000 (14:59 +0800)]
staging: android: ion: fix sys heap pool's gfp_flags

commit 9bcf065e28122588a6cbee08cf847826dacbb438 upstream.

In the first loop, gfp_flags will be modified to high_order_gfp_flags,
and there will be no chance to change back to low_order_gfp_flags.

Fixes: e7f63771b60e ("ION: Sys_heap: Add cached pool to spead up cached buffer alloc")
Signed-off-by: Qing Xia <saberlily.xia@hisilicon.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jing Xia <jing.xia@unisoc.com>
Reviewed-by: Yuming Han <yuming.han@unisoc.com>
Reviewed-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Reviewed-by: Orson Zhai <orson.zhai@unisoc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: wilc1000: fix to set correct value for 'vif_num'
Ajay Singh [Thu, 7 Feb 2019 11:28:58 +0000 (11:28 +0000)]
staging: wilc1000: fix to set correct value for 'vif_num'

commit dda037057a572f5c82ac2499eb4e6fb17600ba3e upstream.

Set correct value in '->vif_num' for the total number of interfaces and
set '->idx' value using 'i'.

Fixes: 735bb39ca3be ("staging: wilc1000: simplify vif[i]->ndev accesses")
Fixes: 0e490657c721 ("staging: wilc1000: Fix problem with wrong vif index")
Cc: <stable@vger.kernel.org>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: comedi: ni_660x: fix missing break in switch statement
Gustavo A. R. Silva [Tue, 12 Feb 2019 18:44:50 +0000 (12:44 -0600)]
staging: comedi: ni_660x: fix missing break in switch statement

commit 479826cc86118e0d87e5cefb3df5b748e0480924 upstream.

Add missing break statement in order to prevent the code from falling
through to the default case and return -EINVAL every time.

This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.

Fixes: aa94f2888825 ("staging: comedi: ni_660x: tidy up ni_660x_set_pfi_routing()")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485
Mans Rullgard [Thu, 14 Feb 2019 19:45:33 +0000 (19:45 +0000)]
USB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485

commit 8d7fa3d4ea3f0ca69554215e87411494e6346fdc upstream.

This adds the USB ID of the Hjelmslund Electronics USB485 Iso stick.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: serial: cp210x: add ID for Ingenico 3070
Ivan Mironov [Wed, 6 Feb 2019 16:14:13 +0000 (21:14 +0500)]
USB: serial: cp210x: add ID for Ingenico 3070

commit dd9d3d86b08d6a106830364879c42c78db85389c upstream.

Here is how this device appears in kernel log:

usb 3-1: new full-speed USB device number 18 using xhci_hcd
usb 3-1: New USB device found, idVendor=0b00, idProduct=3070
usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 3-1: Product: Ingenico 3070
usb 3-1: Manufacturer: Silicon Labs
usb 3-1: SerialNumber: 0001

Apparently this is a POS terminal with embedded USB-to-Serial converter.

Cc: stable@vger.kernel.org
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: serial: option: add Telit ME910 ECM composition
Daniele Palmas [Wed, 20 Feb 2019 10:43:17 +0000 (11:43 +0100)]
USB: serial: option: add Telit ME910 ECM composition

commit 6431866b6707d27151be381252d6eef13025cfce upstream.

This patch adds Telit ME910 family ECM composition 0x1102.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocpufreq: Use struct kobj_attribute instead of struct global_attr
Viresh Kumar [Fri, 25 Jan 2019 07:23:07 +0000 (12:53 +0530)]
cpufreq: Use struct kobj_attribute instead of struct global_attr

commit 625c85a62cb7d3c79f6e16de3cfa972033658250 upstream.

The cpufreq_global_kobject is created using kobject_create_and_add()
helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store
routines are set to kobj_attr_show() and kobj_attr_store().

These routines pass struct kobj_attribute as an argument to the
show/store callbacks. But all the cpufreq files created using the
cpufreq_global_kobject expect the argument to be of type struct
attribute. Things work fine currently as no one accesses the "attr"
argument. We may not see issues even if the argument is used, as struct
kobj_attribute has struct attribute as its first element and so they
will both get same address.

But this is logically incorrect and we should rather use struct
kobj_attribute instead of struct global_attr in the cpufreq core and
drivers and the show/store callbacks should take struct kobj_attribute
as argument instead.

This bug is caught using CFI CLANG builds in android kernel which
catches mismatch in function prototypes for such callbacks.

Reported-by: Donghee Han <dh.han@samsung.com>
Reported-by: Sangkyu Kim <skwith.kim@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoLinux 4.14.105
Greg Kroah-Hartman [Tue, 5 Mar 2019 16:58:03 +0000 (17:58 +0100)]
Linux 4.14.105

5 years agox86/uaccess: Don't leak the AC flag into __put_user() value evaluation
Andy Lutomirski [Sat, 23 Feb 2019 01:17:04 +0000 (17:17 -0800)]
x86/uaccess: Don't leak the AC flag into __put_user() value evaluation

commit 2a418cf3f5f1caf911af288e978d61c9844b0695 upstream.

When calling __put_user(foo(), ptr), the __put_user() macro would call
foo() in between __uaccess_begin() and __uaccess_end().  If that code
were buggy, then those bugs would be run without SMAP protection.

Fortunately, there seem to be few instances of the problem in the
kernel. Nevertheless, __put_user() should be fixed to avoid doing this.
Therefore, evaluate __put_user()'s argument before setting AC.

This issue was noticed when an objtool hack by Peter Zijlstra complained
about genregs_get() and I compared the assembly output to the C source.

 [ bp: Massage commit message and fixed up whitespace. ]

Fixes: 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190225125231.845656645@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMIPS: eBPF: Fix icache flush end address
Paul Burton [Fri, 1 Mar 2019 22:58:09 +0000 (22:58 +0000)]
MIPS: eBPF: Fix icache flush end address

commit d1a2930d8a992fb6ac2529449f81a0056e1b98d1 upstream.

The MIPS eBPF JIT calls flush_icache_range() in order to ensure the
icache observes the code that we just wrote. Unfortunately it gets the
end address calculation wrong due to some bad pointer arithmetic.

The struct jit_ctx target field is of type pointer to u32, and as such
adding one to it will increment the address being pointed to by 4 bytes.
Therefore in order to find the address of the end of the code we simply
need to add the number of 4 byte instructions emitted, but we mistakenly
add the number of instructions multiplied by 4. This results in the call
to flush_icache_range() operating on a memory region 4x larger than
intended, which is always wasteful and can cause crashes if we overrun
into an unmapped page.

Fix this by correcting the pointer arithmetic to remove the bogus
multiplication, and use braces to remove the need for a set of brackets
whilst also making it obvious that the target field is a pointer.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMIPS: fix truncation in __cmpxchg_small for short values
Michael Clark [Mon, 11 Feb 2019 04:38:29 +0000 (17:38 +1300)]
MIPS: fix truncation in __cmpxchg_small for short values

commit 94ee12b507db8b5876e31c9d6c9d84f556a4b49f upstream.

__cmpxchg_small erroneously uses u8 for load comparison which can
be either char or short. This patch changes the local variable to
u32 which is sufficiently sized, as the loaded value is already
masked and shifted appropriately. Using an integer size avoids
any unnecessary canonicalization from use of non native widths.

This patch is part of a series that adapts the MIPS small word
atomics code for xchg and cmpxchg on short and char to RISC-V.

Cc: RISC-V Patches <patches@groups.riscv.org>
Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
Cc: Linux MIPS <linux-mips@linux-mips.org>
Signed-off-by: Michael Clark <michaeljclark@mac.com>
[paul.burton@mips.com:
  - Fix varialble typo per Jonas Gorski.
  - Consolidate load variable with other declarations.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 3ba7f44d2b19 ("MIPS: cmpxchg: Implement 1 byte & 2 byte cmpxchg()")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm: enforce min addr even if capable() in expand_downwards()
Jann Horn [Wed, 27 Feb 2019 20:29:52 +0000 (21:29 +0100)]
mm: enforce min addr even if capable() in expand_downwards()

commit 0a1d52994d440e21def1c2174932410b4f2a98a1 upstream.

security_mmap_addr() does a capability check with current_cred(), but
we can reach this code from contexts like a VFS write handler where
current_cred() must not be used.

This can be abused on systems without SMAP to make NULL pointer
dereferences exploitable again.

Fixes: 8869477a49c3 ("security: protect from stack expansion into low vm addresses")
Cc: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: sdhci-esdhc-imx: correct the fix of ERR004536
BOUGH CHEN [Thu, 28 Feb 2019 10:15:42 +0000 (10:15 +0000)]
mmc: sdhci-esdhc-imx: correct the fix of ERR004536

commit e30be063d6dbcc0f18b1eb25fa709fdef89201fb upstream.

Commit 18094430d6b5 ("mmc: sdhci-esdhc-imx: add ADMA Length
Mismatch errata fix") involve the fix of ERR004536, but the
fix is incorrect. Double confirm with IC, need to clear the
bit 7 of register 0x6c rather than set this bit 7.
Here is the definition of bit 7 of 0x6c:
    0: enable the new IC fix for ERR004536
    1: do not use the IC fix, keep the same as before

Find this issue on i.MX845s-evk board when enable CMDQ, and
let system in heavy loading.

root@imx8mmevk:~# dd if=/dev/mmcblk2 of=/dev/null bs=1M &
root@imx8mmevk:~# memtester 1000M > /dev/zero &
root@imx8mmevk:~# [  139.897220] mmc2: cqhci: timeout for tag 16
[  139.901417] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
[  139.907862] mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
[  139.914311] mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
[  139.920753] mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
[  139.927193] mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
[  139.933634] mmc2: cqhci: TDL base:  0x7809c000 | TDL up32: 0x00000000
[  139.940073] mmc2: cqhci: Doorbell:  0x00030000 | TCN:      0x00000000
[  139.946518] mmc2: cqhci: Dev queue: 0x00010000 | Dev Pend: 0x00010000
[  139.952967] mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
[  139.959411] mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000000
[  139.965857] mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
[  139.972308] mmc2: cqhci: Resp idx:  0x0000002e | Resp arg: 0x00000900
[  139.978761] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[  139.985214] mmc2: sdhci: Sys addr:  0xb2c19000 | Version:  0x00000002
[  139.991669] mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000400
[  139.998127] mmc2: sdhci: Argument:  0x40110400 | Trn mode: 0x00000033
[  140.004618] mmc2: sdhci: Present:   0x01088a8f | Host ctl: 0x00000030
[  140.011113] mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
[  140.017583] mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
[  140.024039] mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
[  140.030497] mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
[  140.036972] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000502
[  140.043426] mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
[  140.049867] mmc2: sdhci: Cmd:       0x00002c1a | Max curr: 0x00ffffff
[  140.056314] mmc2: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0xffffffff
[  140.062755] mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d00f00
[  140.069195] mmc2: sdhci: Host ctl2: 0x00000008
[  140.073640] mmc2: sdhci: ADMA Err:  0x00000007 | ADMA Ptr: 0x7809c108
[  140.080079] mmc2: sdhci: ============================================
[  140.086662] mmc2: running CQE recovery

Fixes: 18094430d6b5 ("mmc: sdhci-esdhc-imx: add ADMA Length Mismatch errata fix")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: tmio: fix access width of Block Count Register
Takeshi Saito [Thu, 21 Feb 2019 19:38:05 +0000 (20:38 +0100)]
mmc: tmio: fix access width of Block Count Register

commit 5603731a15ef9ca317c122cc8c959f1dee1798b4 upstream.

In R-Car Gen2 or later, the maximum number of transfer blocks are
changed from 0xFFFF to 0xFFFFFFFF. Therefore, Block Count Register
should use iowrite32().

If another system (U-boot, Hypervisor OS, etc) uses bit[31:16], this
value will not be cleared. So, SD/MMC card initialization fails.

So, check for the bigger register and use apropriate write. Also, mark
the register as extended on Gen2.

Signed-off-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
[wsa: use max_blk_count in if(), add Gen2, update commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@kernel.org
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
[Ulf: Fixed build error]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: tmio_mmc_core: don't claim spurious interrupts
Sergei Shtylyov [Mon, 18 Feb 2019 17:45:40 +0000 (20:45 +0300)]
mmc: tmio_mmc_core: don't claim spurious interrupts

commit 5c27ff5db1491a947264d6d4e4cbe43ae6535bae upstream.

I have encountered an interrupt storm during the eMMC chip probing (and
the chip finally didn't get detected).  It turned out that U-Boot left
the DMAC interrupts enabled while the Linux driver  didn't use those.
The SDHI driver's interrupt handler somehow assumes that, even if an
SDIO interrupt didn't happen, it should return IRQ_HANDLED.  I think
that if none of the enabled interrupts happened and got handled, we
should return IRQ_NONE -- that way the kernel IRQ code recoginizes
a spurious interrupt and masks it off pretty quickly...

Fixes: 7729c7a232a9 ("mmc: tmio: Provide separate interrupt handlers")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: spi: Fix card detection during probe
Jonathan Neuschäfer [Sun, 10 Feb 2019 17:31:07 +0000 (18:31 +0100)]
mmc: spi: Fix card detection during probe

commit c9bd505dbd9d3dc80c496f88eafe70affdcf1ba6 upstream.

When using the mmc_spi driver with a card-detect pin, I noticed that the
card was not detected immediately after probe, but only after it was
unplugged and plugged back in (and the CD IRQ fired).

The call tree looks something like this:

mmc_spi_probe
  mmc_add_host
    mmc_start_host
      _mmc_detect_change
        mmc_schedule_delayed_work(&host->detect, 0)
          mmc_rescan
            host->bus_ops->detect(host)
              mmc_detect
                _mmc_detect_card_removed
                  host->ops->get_cd(host)
                    mmc_gpio_get_cd -> -ENOSYS (ctx->cd_gpio not set)
  mmc_gpiod_request_cd
    ctx->cd_gpio = desc

To fix this issue, call mmc_detect_change after the card-detect GPIO/IRQ
is registered.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agopowerpc: Always initialize input array when calling epapr_hypercall()
Seth Forshee [Thu, 28 Sep 2017 13:33:39 +0000 (09:33 -0400)]
powerpc: Always initialize input array when calling epapr_hypercall()

commit 186b8f1587c79c2fa04bfa392fdf084443e398c1 upstream.

Several callers to epapr_hypercall() pass an uninitialized stack
allocated array for the input arguments, presumably because they
have no input arguments. However this can produce errors like
this one

 arch/powerpc/include/asm/epapr_hcalls.h:470:42: error: 'in' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  unsigned long register r3 asm("r3") = in[0];
                                        ~~^~~

Fix callers to this function to always zero-initialize the input
arguments array to prevent this.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "A. Wilcox" <awilfox@adelielinux.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
Vitaly Kuznetsov [Mon, 7 Jan 2019 18:44:51 +0000 (19:44 +0100)]
KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1

[ Upstream commit 619ad846fc3452adaf71ca246c5aa711e2055398 ]

kvm-unit-tests' eventinj "NMI failing on IDT" test results in NMI being
delivered to the host (L1) when it's running nested. The problem seems to
be: svm_complete_interrupts() raises 'nmi_injected' flag but later we
decide to reflect EXIT_NPF to L1. The flag remains pending and we do NMI
injection upon entry so it got delivered to L1 instead of L2.

It seems that VMX code solves the same issue in prepare_vmcs12(), this was
introduced with code refactoring in commit 5f3d5799974b ("KVM: nVMX: Rework
event injection and recovery").

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agosvm: Fix AVIC incomplete IPI emulation
Suravee Suthikulpanit [Tue, 22 Jan 2019 10:25:13 +0000 (10:25 +0000)]
svm: Fix AVIC incomplete IPI emulation

[ Upstream commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57 ]

In case of incomplete IPI with invalid interrupt type, the current
SVM driver does not properly emulate the IPI, and fails to boot
FreeBSD guests with multiple vcpus when enabling AVIC.

Fix this by update APIC ICR high/low registers, which also
emulate sending the IPI.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocfg80211: extend range deviation for DMG
Chaitanya Tata [Fri, 18 Jan 2019 21:47:47 +0000 (03:17 +0530)]
cfg80211: extend range deviation for DMG

[ Upstream commit 93183bdbe73bbdd03e9566c8dc37c9d06b0d0db6 ]

Recently, DMG frequency bands have been extended till 71GHz, so extend
the range check till 20GHz (45-71GHZ), else some channels will be marked
as disabled.

Signed-off-by: Chaitanya Tata <Chaitanya.Tata@bluwireless.co.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agomac80211: Add attribute aligned(2) to struct 'action'
Mathieu Malaterre [Thu, 24 Jan 2019 18:19:57 +0000 (19:19 +0100)]
mac80211: Add attribute aligned(2) to struct 'action'

[ Upstream commit 7c53eb5d87bc21464da4268c3c0c47457b6d9c9b ]

During refactor in commit 9e478066eae4 ("mac80211: fix MU-MIMO
follow-MAC mode") a new struct 'action' was declared with packed
attribute as:

  struct {
          struct ieee80211_hdr_3addr hdr;
          u8 category;
          u8 action_code;
  } __packed action;

But since struct 'ieee80211_hdr_3addr' is declared with an aligned
keyword as:

  struct ieee80211_hdr {
   __le16 frame_control;
   __le16 duration_id;
   u8 addr1[ETH_ALEN];
   u8 addr2[ETH_ALEN];
   u8 addr3[ETH_ALEN];
   __le16 seq_ctrl;
   u8 addr4[ETH_ALEN];
  } __packed __aligned(2);

Solve the ambiguity of placing aligned structure in a packed one by
adding the aligned(2) attribute to struct 'action'.

This removes the following warning (W=1):

  net/mac80211/rx.c:234:2: warning: alignment 1 of 'struct <anonymous>' is less than 2 [-Wpacked-not-aligned]

Cc: Johannes Berg <johannes.berg@intel.com>
Suggested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agomac80211: don't initiate TDLS connection if station is not associated to AP
Balaji Pothunoori [Mon, 21 Jan 2019 07:00:43 +0000 (12:30 +0530)]
mac80211: don't initiate TDLS connection if station is not associated to AP

[ Upstream commit 7ed5285396c257fd4070b1e29e7b2341aae2a1ce ]

Following call trace is observed while adding TDLS peer entry in driver
during TDLS setup.

Call Trace:
[<c1301476>] dump_stack+0x47/0x61
[<c10537d2>] __warn+0xe2/0x100
[<fa22415f>] ? sta_apply_parameters+0x49f/0x550 [mac80211]
[<c1053895>] warn_slowpath_null+0x25/0x30
[<fa22415f>] sta_apply_parameters+0x49f/0x550 [mac80211]
[<fa20ad42>] ? sta_info_alloc+0x1c2/0x450 [mac80211]
[<fa224623>] ieee80211_add_station+0xe3/0x160 [mac80211]
[<c1876fe3>] nl80211_new_station+0x273/0x420
[<c170f6d9>] genl_rcv_msg+0x219/0x3c0
[<c170f4c0>] ? genl_rcv+0x30/0x30
[<c170ee7e>] netlink_rcv_skb+0x8e/0xb0
[<c170f4ac>] genl_rcv+0x1c/0x30
[<c170e8aa>] netlink_unicast+0x13a/0x1d0
[<c170ec18>] netlink_sendmsg+0x2d8/0x390
[<c16c5acd>] sock_sendmsg+0x2d/0x40
[<c16c6369>] ___sys_sendmsg+0x1d9/0x1e0

Fixing this by allowing TDLS setup request only when we have completed
association.

Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoibmveth: Do not process frames after calling napi_reschedule
Thomas Falcon [Thu, 24 Jan 2019 17:17:01 +0000 (11:17 -0600)]
ibmveth: Do not process frames after calling napi_reschedule

[ Upstream commit e95d22c69b2c130ccce257b84daf283fd82d611e ]

The IBM virtual ethernet driver's polling function continues
to process frames after rescheduling NAPI, resulting in a warning
if it exhausted its budget. Do not restart polling after calling
napi_reschedule. Instead let frames be processed in the following
instance.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
Maciej Żenczykowski [Thu, 24 Jan 2019 11:07:02 +0000 (03:07 -0800)]
net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP

[ Upstream commit 3b707c3008cad04604c1f50e39f456621821c414 ]

__bpf_redirect() and act_mirred checks this boolean
to determine whether to prefix an ethernet header.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonet: usb: asix: ax88772_bind return error when hw_reset fail
Zhang Run [Thu, 24 Jan 2019 05:48:49 +0000 (13:48 +0800)]
net: usb: asix: ax88772_bind return error when hw_reset fail

[ Upstream commit 6eea3527e68acc22483f4763c8682f223eb90029 ]

The ax88772_bind() should return error code immediately when the PHY
was not reset properly through ax88772a_hw_reset().
Otherwise, The asix_get_phyid() will block when get the PHY
Identifier from the PHYSID1 MII registers through asix_mdio_read()
due to the PHY isn't ready. Furthermore, it will produce a lot of
error message cause system crash.As follows:
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
 reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to send
 software reset: ffffffb9
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
 reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
 software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
 reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to write
 reg index 0x0000: -71
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to enable
 software MII access
asix 1-1:1.0 (unnamed net_device) (uninitialized): Failed to read
 reg index 0x0000: -71
...

Signed-off-by: Zhang Run <zhang.run@zte.com.cn>
Reviewed-by: Yang Wei <yang.wei9@zte.com.cn>
Tested-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>