Stefano Stabellini [Thu, 19 Jan 2017 18:39:09 +0000 (10:39 -0800)]
swiotlb-xen: update dev_addr after swapping pages
[ Upstream commit
f1225ee4c8fcf09afaa199b8b1f0450f38b8cd11 ]
In xen_swiotlb_map_page and xen_swiotlb_map_sg_attrs, if the original
page is not suitable, we swap it for another page from the swiotlb
pool.
In these cases, we don't update the previously calculated dma address
for the page before calling xen_dma_map_page. Thus, we end up calling
xen_dma_map_page passing the wrong dev_addr, resulting in
xen_dma_map_page mistakenly assuming that the page is foreign when it is
local.
Fix the bug by updating dev_addr appropriately.
This change has no effect on x86, because xen_dma_map_page is a stub
there.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Tested-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
G. Campana [Thu, 19 Jan 2017 21:37:46 +0000 (23:37 +0200)]
virtio_console: fix a crash in config_work_handler
[ Upstream commit
8379cadf71c3ee8173a1c6fc1ea7762a9638c047 ]
Using control_work instead of config_work as the 3rd argument to
container_of results in an invalid portdev pointer. Indeed, the work
structure is initialized as below:
INIT_WORK(&portdev->config_work, &config_work_handler);
It leads to a crash when portdev->vdev is dereferenced later. This
bug
is triggered when the guest uses a virtio-console without multiport
feature and receives a config_changed virtio interrupt.
Signed-off-by: G. Campana <gcampana@quarkslab.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Liu Bo [Thu, 1 Dec 2016 21:43:31 +0000 (13:43 -0800)]
Btrfs: fix truncate down when no_holes feature is enabled
[ Upstream commit
91298eec05cd8d4e828cf7ee5d4a6334f70cf69a ]
For such a file mapping,
[0-4k][hole][8k-12k]
In NO_HOLES mode, we don't have the [hole] extent any more.
Commit
c1aa45759e90 ("Btrfs: fix shrinking truncate when the no_holes feature is enabled")
fixed disk isize not being updated in NO_HOLES mode when data is not flushed.
However, even if data has been flushed, we can still have trouble
in updating disk isize since we updated disk isize to 'start' of
the last evicted extent.
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 19 Jan 2017 03:44:42 +0000 (19:44 -0800)]
gianfar: Do not reuse pages from emergency reserve
[ Upstream commit
69fed99baac186013840ced3524562841296034f ]
A driver using dev_alloc_page() must not reuse a page that had to
use emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of the RX
ring buffer would suffer from drops.
Fixes:
75354148ce69 ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gavin Shan [Thu, 5 Jan 2017 23:39:49 +0000 (10:39 +1100)]
powerpc/eeh: Enable IO path on permanent error
[ Upstream commit
387bbc974f6adf91aa635090f73434ed10edd915 ]
We give up recovery on permanent error, simply shutdown the affected
devices and remove them. If the devices can't be put into quiet state,
they spew more traffic that is likely to cause another unexpected EEH
error. This was observed on "p8dtu2u" machine:
0002:00:00.0 PCI bridge: IBM Device 03dc
0002:01:00.0 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.1 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.2 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.3 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
On P8 PowerNV platform, the IO path is frozen when shutdowning the
devices, meaning the memory registers are inaccessible. It is why
the devices can't be put into quiet state before removing them.
This fixes the issue by enabling IO path prior to putting the devices
into quiet state.
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Thu, 23 Jun 2016 21:25:33 +0000 (14:25 -0700)]
net: bgmac: Remove superflous netif_carrier_on()
commit
3894396e64994f31c3ef5c7e6f63dded0593e567 upstream.
bgmac_open() calls phy_start() to initialize the PHY state machine,
which will set the interface's carrier state accordingly, no need to
force that as this could be conflicting with the PHY state determined by
PHYLIB.
Fixes:
dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Thu, 23 Jun 2016 21:25:32 +0000 (14:25 -0700)]
net: bgmac: Start transmit queue in bgmac_open
commit
c3897f2a69e54dd113fc9abd2daf872e5b495798 upstream.
The driver does not start the transmit queue in bgmac_open(). If the
queue was stopped prior to closing then re-opening the interface, we
would never be able to wake-up again.
Fixes:
dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Thu, 23 Jun 2016 21:23:12 +0000 (14:23 -0700)]
net: bgmac: Fix SOF bit checking
commit
d2b13233879ca1268a1c027d4573109e5a777811 upstream.
We are checking for the Start of Frame bit in the ctl1 word, while this
bit is set in the ctl0 word instead. Read the ctl0 word and update the
check to verify that.
Fixes:
9cde94506eac ("bgmac: implement scatter/gather support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David S. Miller [Fri, 15 Jan 2016 21:07:13 +0000 (16:07 -0500)]
bgmac: Fix reversed test of build_skb() return value.
commit
750afbf8ee9c6a1c74a1fe5fc9852146b1d72687 upstream.
Fixes:
f1640c3ddeec ("bgmac: fix a missing check for build_skb")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafał Miłecki [Sun, 6 Dec 2015 10:31:38 +0000 (11:31 +0100)]
mtd: bcm47xxpart: don't fail because of bit-flips
commit
36bcc0c9c2bc8f56569cd735ba531a51358d7c2b upstream.
Bit-flip errors may occur on NAND flashes and are harmless. Handle them
gracefully as read content is still reliable and can be parsed.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wangweidong [Wed, 13 Jan 2016 03:06:41 +0000 (11:06 +0800)]
bgmac: fix a missing check for build_skb
commit
f1640c3ddeec12804bc9a21feee85fc15aca95f6 upstream.
when build_skb failed, it may occure a NULL pointer.
So add a 'NULL check' for it.
Signed-off-by: Weidong Wang <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafał Miłecki [Sat, 5 Dec 2015 01:09:43 +0000 (02:09 +0100)]
mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only
commit
2a36a5c30eab9cd1c9d2d08bd27cd763325d70c5 upstream.
We allowed using bcm47xxpart on BCM5301X arch with commit:
9e3afa5f5c7 ("mtd: bcm47xxpart: allow enabling on ARCH_BCM_5301X")
BCM5301X devices may contain some partitions in higher memory, e.g.
Netgear R8000 has board_data at 0x2600000. To detect them we should
use size limit on MIPS only.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Álvaro Fernández Rojas [Thu, 19 May 2016 20:07:35 +0000 (22:07 +0200)]
MIPS: ralink: fix MT7628 wled_an pinmux gpio
commit
07b50db6e685172a41b9978aebffb2438166d9b6 upstream.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Cc: john@phrozen.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13307/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Álvaro Fernández Rojas [Thu, 19 May 2016 20:07:34 +0000 (22:07 +0200)]
MIPS: ralink: fix MT7628 pinmux typos
commit
d7146829c9da24e285cb1b1f2156b5b3e2d40c07 upstream.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Cc: john@phrozen.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13306/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Crispin [Mon, 4 Jan 2016 19:23:58 +0000 (20:23 +0100)]
MIPS: ralink: Fix invalid assignment of SoC type
commit
0af3a40f09a2a85089037a0b5b51471fa48b229e upstream.
Commit
418d29c87061 ("MIPS: ralink: Unify SoC id handling") introduced
broken code. We obviously need to assign the value.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11993/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Crispin [Mon, 4 Jan 2016 19:23:57 +0000 (20:23 +0100)]
MIPS: ralink: fix USB frequency scaling
commit
fad2522272ed5ed451d2d7b1dc547ddf3781cc7e upstream.
Commit
418d29c87061 ("MIPS: ralink: Unify SoC id handling") was not fully
correct. The logic for the SoC check got inverted. We need to check if it
is not a MT76x8.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11992/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Crispin [Mon, 4 Jan 2016 19:23:56 +0000 (20:23 +0100)]
MIPS: ralink: MT7688 pinmux fixes
commit
e906a5f67e5a3337d696ec848e9c28fc68b39aa3 upstream.
A few fixes to the pinmux data, 2 new muxes and a minor whitespace
cleanup.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11991/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Sat, 24 Dec 2016 03:56:56 +0000 (19:56 -0800)]
net: korina: Fix NAPI versus resources freeing
commit
e6afb1ad88feddf2347ea779cfaf4d03d3cd40b6 upstream.
Commit
beb0babfb77e ("korina: disable napi on close and restart")
introduced calls to napi_disable() that were missing before,
unfortunately this leaves a small window during which NAPI has a chance
to run, yet we just freed resources since korina_free_ring() has been
called:
Fix this by disabling NAPI first then freeing resource, and make sure
that we also cancel the restart task before doing the resource freeing.
Fixes:
beb0babfb77e ("korina: disable napi on close and restart")
Reported-by: Alexandros C. Couloumbis <alex@ozo.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felix Fietkau [Mon, 16 May 2016 17:51:55 +0000 (19:51 +0200)]
MIPS: ath79: fix regression in PCI window initialization
commit
9184dc8ffa56844352b3b9860e562ec4ee41176f upstream.
ath79_ddr_pci_win_base has the type void __iomem *, so register offsets
need to be a multiple of 4.
Cc: Alban Bedel <albeu@free.fr>
Fixes:
24b0e3e84fbf ("MIPS: ath79: Improve the DDR controller interface")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Cc: sergei.shtylyov@cogentembedded.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13258/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:23 +0000 (22:09 +0100)]
net: mvneta: Fix for_each_present_cpu usage
commit
129219e4950a3fcf9323b3bbd8b224c7aa873985 upstream.
This patch convert the for_each_present in on_each_cpu, instead of
applying on the present cpus it will be applied only on the online cpus.
This fix a bug reported on
http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173.
Using the macro on_each_cpu (instead of a for_each_* loop) also ensures
that all the calls will be done all at once.
Fixes:
f86428854480 ("net: mvneta: Statically assign queues to CPUs")
Reported-by: Stefan Roese <stefan.roese@gmail.com>
Suggested-by: Jisheng Zhang <jszhang@marvell.com>
Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jon Mason [Fri, 3 Mar 2017 00:21:32 +0000 (19:21 -0500)]
ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
commit
0c2bf9f95983fe30aa2f6463cb761cd42c2d521a upstream.
GIC_PPI flags were misconfigured for the timers, resulting in errors
like:
[ 0.000000] GIC: PPI11 is secure or misconfigured
Changing them to being edge triggered corrects the issue
Suggested-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Fixes:
d27509f1 ("ARM: BCM5301X: add dts files for BCM4708 SoC")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
[AmitP: Resolved minor cherry-pick conflict]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Quinn Tran [Sat, 24 Dec 2016 02:06:13 +0000 (18:06 -0800)]
qla2xxx: Fix erroneous invalid handle message
[ Upstream commit
4f060736f29a960aba8e781a88837464756200a8 ]
Termination of Immediate Notify IOCB was using wrong
IOCB handle. IOCB completion code was unable to find
appropriate code path due to wrong handle.
Following message is seen in the logs.
"Error entry - invalid handle/queue (ffff)."
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed word order in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Thumshirn [Tue, 10 Jan 2017 11:05:54 +0000 (12:05 +0100)]
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
[ Upstream commit
8667f515952feefebb3c0f8d9a9266c91b101a46 ]
Set the elsiocb contexts to NULL after freeing as others depend on it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Damien Le Moal [Thu, 12 Jan 2017 06:25:10 +0000 (15:25 +0900)]
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
[ Upstream commit
26f2819772af891dee2843e1f8662c58e5129d5f ]
Zoned block devices force the use of READ/WRITE(16) commands by setting
sdkp->use_16_for_rw and clearing sdkp->use_10_for_rw. This result in
DPOFUA always being disabled for these drives as the assumed use of
the deprecated READ/WRITE(6) commands only looks at sdkp->use_10_for_rw.
Strenghten the test by also checking that sdkp->use_16_for_rw is false.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Vyukov [Tue, 17 Jan 2017 13:51:04 +0000 (14:51 +0100)]
KVM: x86: fix fixing of hypercalls
[ Upstream commit
ce2e852ecc9a42e4b8dabb46025cfef63209234a ]
emulator_fix_hypercall() replaces hypercall with vmcall instruction,
but it does not handle GP exception properly when writes the new instruction.
It can return X86EMUL_PROPAGATE_FAULT without setting exception information.
This leads to incorrect emulation and triggers
WARN_ON(ctxt->exception.vector > 0x1f) in x86_emulate_insn()
as discovered by syzkaller fuzzer:
WARNING: CPU: 2 PID: 18646 at arch/x86/kvm/emulate.c:5558
Call Trace:
warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
x86_emulate_insn+0x16a5/0x4090 arch/x86/kvm/emulate.c:5572
x86_emulate_instruction+0x403/0x1cc0 arch/x86/kvm/x86.c:5618
emulate_instruction arch/x86/include/asm/kvm_host.h:1127 [inline]
handle_exception+0x594/0xfd0 arch/x86/kvm/vmx.c:5762
vmx_handle_exit+0x2b7/0x38b0 arch/x86/kvm/vmx.c:8625
vcpu_enter_guest arch/x86/kvm/x86.c:6888 [inline]
vcpu_run arch/x86/kvm/x86.c:6947 [inline]
Set exception information when write in emulator_fix_hypercall() fails.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: kvm@vger.kernel.org
Cc: syzkaller@googlegroups.com
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mark Rutland [Fri, 16 Jun 2017 21:02:34 +0000 (14:02 -0700)]
mm: numa: avoid waiting on freed migrated pages
commit
3c226c637b69104f6b9f1c6ec5b08d7b741b3229 upstream.
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
// do_huge_pmd_numa_page // migrate_misplaced_transhuge_page()
// Holds 0 refs on page // Holds 2 refs on page
vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
page = pmd_page(*vmf->pmd);
spin_unlock(vmf->ptl);
ptl = pmd_lock(mm, pmd);
if (page_count(page) != 2)) {
/* roll back */
}
/* ... */
mlock_migrate_page(new_page, page);
/* ... */
spin_unlock(ptl);
put_page(page);
put_page(page); // page freed here
wait_on_page_locked(page);
goto out;
}
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
Fixes:
b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Pen [Tue, 9 Feb 2016 19:33:35 +0000 (12:33 -0700)]
block: fix module reference leak on put_disk() call for cgroups throttle
commit
39a169b62b415390398291080dafe63aec751e0a upstream.
get_disk(),get_gendisk() calls have non explicit side effect: they
increase the reference on the disk owner module.
The following is the correct sequence how to get a disk reference and
to put it:
disk = get_gendisk(...);
/* use disk */
owner = disk->fops->owner;
put_disk(disk);
module_put(owner);
fs/block_dev.c is aware of this required module_put() call, but f.e.
blkg_conf_finish(), which is located in block/blk-cgroup.c, does not put
a module reference. To see a leakage in action cgroups throttle config
can be used. In the following script I'm removing throttle for /dev/ram0
(actually this is NOP, because throttle was never set for this device):
# lsmod | grep brd
brd 5175 0
# i=100; while [ $i -gt 0 ]; do echo "1:0 0" > \
/sys/fs/cgroup/blkio/blkio.throttle.read_bps_device; i=$(($i - 1)); \
done
# lsmod | grep brd
brd 5175 100
Now brd module has 100 references.
The issue is fixed by calling module_put() just right away put_disk().
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Gi-Oh Kim <gi-oh.kim@profitbricks.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Wed, 20 Jan 2016 23:00:45 +0000 (15:00 -0800)]
sysctl: enable strict writes
commit
41662f5cc55335807d39404371cfcbb1909304c4 upstream.
SYSCTL_WRITES_WARN was added in commit
f4aacea2f5d1 ("sysctl: allow for
strict write position handling"), and released in v3.16 in August of
2014. Since then I can find only 1 instance of non-zero offset
writing[1], and it was fixed immediately in CRIU[2]. As such, it
appears safe to flip this to the strict state now.
[1] https://www.google.com/search?q="when%20file%20position%20was%20not%200"
[2] http://lists.openvz.org/pipermail/criu/2015-April/019819.html
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Baolin Wang [Thu, 8 Dec 2016 11:55:22 +0000 (19:55 +0800)]
usb: gadget: f_fs: Fix possibe deadlock
commit
b3ce3ce02d146841af012d08506b4071db8ffde3 upstream.
When system try to close /dev/usb-ffs/adb/ep0 on one core, at the same
time another core try to attach new UDC, which will cause deadlock as
below scenario. Thus we should release ffs lock before issuing
unregister_gadget_item().
[ 52.642225] c1 ======================================================
[ 52.642228] c1 [ INFO: possible circular locking dependency detected ]
[ 52.642236] c1 4.4.6+ #1 Tainted: G W O
[ 52.642241] c1 -------------------------------------------------------
[ 52.642245] c1 usb ffs open/2808 is trying to acquire lock:
[ 52.642270] c0 (udc_lock){+.+.+.}, at: [<
ffffffc00065aeec>]
usb_gadget_unregister_driver+0x3c/0xc8
[ 52.642272] c1 but task is already holding lock:
[ 52.642283] c0 (ffs_lock){+.+.+.}, at: [<
ffffffc00066b244>]
ffs_data_clear+0x30/0x140
[ 52.642285] c1 which lock already depends on the new lock.
[ 52.642287] c1
the existing dependency chain (in reverse order) is:
[ 52.642295] c0
-> #1 (ffs_lock){+.+.+.}:
[ 52.642307] c0 [<
ffffffc00012340c>] __lock_acquire+0x20f0/0x2238
[ 52.642314] c0 [<
ffffffc000123b54>] lock_acquire+0xe4/0x298
[ 52.642322] c0 [<
ffffffc000aaf6e8>] mutex_lock_nested+0x7c/0x3cc
[ 52.642328] c0 [<
ffffffc00066f7bc>] ffs_func_bind+0x504/0x6e8
[ 52.642334] c0 [<
ffffffc000654004>] usb_add_function+0x84/0x184
[ 52.642340] c0 [<
ffffffc000658ca4>] configfs_composite_bind+0x264/0x39c
[ 52.642346] c0 [<
ffffffc00065b348>] udc_bind_to_driver+0x58/0x11c
[ 52.642352] c0 [<
ffffffc00065b49c>] usb_udc_attach_driver+0x90/0xc8
[ 52.642358] c0 [<
ffffffc0006598e0>] gadget_dev_desc_UDC_store+0xd4/0x128
[ 52.642369] c0 [<
ffffffc0002c14e8>] configfs_write_file+0xd0/0x13c
[ 52.642376] c0 [<
ffffffc00023c054>] vfs_write+0xb8/0x214
[ 52.642381] c0 [<
ffffffc00023cad4>] SyS_write+0x54/0xb0
[ 52.642388] c0 [<
ffffffc000085ff0>] el0_svc_naked+0x24/0x28
[ 52.642395] c0
-> #0 (udc_lock){+.+.+.}:
[ 52.642401] c0 [<
ffffffc00011e3d0>] print_circular_bug+0x84/0x2e4
[ 52.642407] c0 [<
ffffffc000123454>] __lock_acquire+0x2138/0x2238
[ 52.642412] c0 [<
ffffffc000123b54>] lock_acquire+0xe4/0x298
[ 52.642420] c0 [<
ffffffc000aaf6e8>] mutex_lock_nested+0x7c/0x3cc
[ 52.642427] c0 [<
ffffffc00065aeec>] usb_gadget_unregister_driver+0x3c/0xc8
[ 52.642432] c0 [<
ffffffc00065995c>] unregister_gadget_item+0x28/0x44
[ 52.642439] c0 [<
ffffffc00066b34c>] ffs_data_clear+0x138/0x140
[ 52.642444] c0 [<
ffffffc00066b374>] ffs_data_reset+0x20/0x6c
[ 52.642450] c0 [<
ffffffc00066efd0>] ffs_data_closed+0xac/0x12c
[ 52.642454] c0 [<
ffffffc00066f070>] ffs_ep0_release+0x20/0x2c
[ 52.642460] c0 [<
ffffffc00023dbe4>] __fput+0xb0/0x1f4
[ 52.642466] c0 [<
ffffffc00023dd9c>] ____fput+0x20/0x2c
[ 52.642473] c0 [<
ffffffc0000ee944>] task_work_run+0xb4/0xe8
[ 52.642482] c0 [<
ffffffc0000cd45c>] do_exit+0x360/0xb9c
[ 52.642487] c0 [<
ffffffc0000cf228>] do_group_exit+0x4c/0xb0
[ 52.642494] c0 [<
ffffffc0000dd3c8>] get_signal+0x380/0x89c
[ 52.642501] c0 [<
ffffffc00008a8f0>] do_signal+0x154/0x518
[ 52.642507] c0 [<
ffffffc00008af00>] do_notify_resume+0x70/0x78
[ 52.642512] c0 [<
ffffffc000085ee8>] work_pending+0x1c/0x20
[ 52.642514] c1
other info that might help us debug this:
[ 52.642517] c1 Possible unsafe locking scenario:
[ 52.642518] c1 CPU0 CPU1
[ 52.642520] c1 ---- ----
[ 52.642525] c0 lock(ffs_lock);
[ 52.642529] c0 lock(udc_lock);
[ 52.642533] c0 lock(ffs_lock);
[ 52.642537] c0 lock(udc_lock);
[ 52.642539] c1
*** DEADLOCK ***
[ 52.642543] c1 1 lock held by usb ffs open/2808:
[ 52.642555] c0 #0: (ffs_lock){+.+.+.}, at: [<
ffffffc00066b244>]
ffs_data_clear+0x30/0x140
[ 52.642557] c1 stack backtrace:
[ 52.642563] c1 CPU: 1 PID: 2808 Comm: usb ffs open Tainted: G
[ 52.642565] c1 Hardware name: Spreadtrum SP9860g Board (DT)
[ 52.642568] c1 Call trace:
[ 52.642573] c1 [<
ffffffc00008b430>] dump_backtrace+0x0/0x170
[ 52.642577] c1 [<
ffffffc00008b5c0>] show_stack+0x20/0x28
[ 52.642583] c1 [<
ffffffc000422694>] dump_stack+0xa8/0xe0
[ 52.642587] c1 [<
ffffffc00011e548>] print_circular_bug+0x1fc/0x2e4
[ 52.642591] c1 [<
ffffffc000123454>] __lock_acquire+0x2138/0x2238
[ 52.642595] c1 [<
ffffffc000123b54>] lock_acquire+0xe4/0x298
[ 52.642599] c1 [<
ffffffc000aaf6e8>] mutex_lock_nested+0x7c/0x3cc
[ 52.642604] c1 [<
ffffffc00065aeec>] usb_gadget_unregister_driver+0x3c/0xc8
[ 52.642608] c1 [<
ffffffc00065995c>] unregister_gadget_item+0x28/0x44
[ 52.642613] c1 [<
ffffffc00066b34c>] ffs_data_clear+0x138/0x140
[ 52.642618] c1 [<
ffffffc00066b374>] ffs_data_reset+0x20/0x6c
[ 52.642621] c1 [<
ffffffc00066efd0>] ffs_data_closed+0xac/0x12c
[ 52.642625] c1 [<
ffffffc00066f070>] ffs_ep0_release+0x20/0x2c
[ 52.642629] c1 [<
ffffffc00023dbe4>] __fput+0xb0/0x1f4
[ 52.642633] c1 [<
ffffffc00023dd9c>] ____fput+0x20/0x2c
[ 52.642636] c1 [<
ffffffc0000ee944>] task_work_run+0xb4/0xe8
[ 52.642640] c1 [<
ffffffc0000cd45c>] do_exit+0x360/0xb9c
[ 52.642644] c1 [<
ffffffc0000cf228>] do_group_exit+0x4c/0xb0
[ 52.642647] c1 [<
ffffffc0000dd3c8>] get_signal+0x380/0x89c
[ 52.642651] c1 [<
ffffffc00008a8f0>] do_signal+0x154/0x518
[ 52.642656] c1 [<
ffffffc00008af00>] do_notify_resume+0x70/0x78
[ 52.642659] c1 [<
ffffffc000085ee8>] work_pending+0x1c/0x20
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Jerry Zhang <zhangjerry@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Deepak Rawat [Mon, 26 Jun 2017 12:39:08 +0000 (14:39 +0200)]
drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
commit
82fcee526ba8ca2c5d378bdf51b21b7eb058fe3a upstream.
The hash table created during vmw_cmdbuf_res_man_create was
never freed. This causes memory leak in context creation.
Added the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy.
Tested for memory leak by running piglit overnight and kernel
memory is not inflated which earlier was.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hui Wang [Wed, 28 Jun 2017 00:59:16 +0000 (08:59 +0800)]
ALSA: hda - set input_path bitmap to zero after moving it to new place
commit
a8f20fd25bdce81a8e41767c39f456d346b63427 upstream.
Recently we met a problem, the codec has valid adcs and input pins,
and they can form valid input paths, but the driver does not build
valid controls for them like "Mic boost", "Capture Volume" and
"Capture Switch".
Through debugging, I found the driver needs to shrink the invalid
adcs and input paths for this machine, so it will move the whole
column bitmap value to the previous column, after moving it, the
driver forgets to set the original column bitmap value to zero, as a
result, the driver will invalidate the path whose index value is the
original colume bitmap value. After executing this function, all
valid input paths are invalidated by a mistake, there are no any
valid input paths, so the driver won't build controls for them.
Fixes:
3a65bcdc577a ("ALSA: hda - Fix inconsistent input_paths after ADC reduction")
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Wed, 28 Jun 2017 10:02:02 +0000 (12:02 +0200)]
ALSA: hda - Fix endless loop of codec configure
commit
d94815f917da770d42c377786dc428f542e38f71 upstream.
azx_codec_configure() loops over the codecs found on the given
controller via a linked list. The code used to work in the past, but
in the current version, this may lead to an endless loop when a codec
binding returns an error.
The culprit is that the snd_hda_codec_configure() unregisters the
device upon error, and this eventually deletes the given codec object
from the bus. Since the list is initialized via list_del_init(), the
next object points to the same device itself. This behavior change
was introduced at splitting the HD-audio code code, and forgotten to
adapt it here.
For fixing this bug, just use a *_safe() version of list iteration.
Fixes:
d068ebc25e6e ("ALSA: hda - Move some codes up to hdac_bus struct")
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Burton [Fri, 3 Mar 2017 23:26:05 +0000 (15:26 -0800)]
MIPS: Fix IRQ tracing & lockdep when rescheduling
commit
d8550860d910c6b7b70f830f59003b33daaa52c9 upstream.
When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
from arch/mips/kernel/entry.S we disable interrupts. This is true
regardless of whether we reach work_resched from syscall_exit_work,
resume_userspace or by looping after calling schedule(). Although we
disable interrupts in these paths we don't call trace_hardirqs_off()
before calling into C code which may acquire locks, and we therefore
leave lockdep with an inconsistent view of whether interrupts are
disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
both enabled.
Without tracing this interrupt state lockdep will print warnings such
as the following once a task returns from a syscall via
syscall_exit_partial with TIF_NEED_RESCHED set:
[ 49.927678] ------------[ cut here ]------------
[ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
[ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
[ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted
4.10.0-00439-gc9fd5d362289-dirty #197
[ 49.963505] Stack :
0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
[ 49.974431]
0000000000000000 0000000000000000 0000000000000000 000000000000004a
[ 49.985300]
ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
[ 49.996194]
0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
[ 50.007063]
000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
[ 50.017945]
0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
[ 50.028827]
0000000000000000 0000000000000001 0000000000000000 0000000000000000
[ 50.039688]
0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
[ 50.050575]
00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
[ 50.061448]
0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
[ 50.072327] ...
[ 50.076087] Call Trace:
[ 50.079869] [<
ffffffff8010e1b0>] show_stack+0x80/0xa8
[ 50.086577] [<
ffffffff805509bc>] dump_stack+0x10c/0x190
[ 50.093498] [<
ffffffff8015dde0>] __warn+0xf0/0x108
[ 50.099889] [<
ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
[ 50.107241] [<
ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
[ 50.114961] [<
ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
[ 50.122291] [<
ffffffff809461b8>] __schedule+0x8c0/0x10f8
[ 50.129221] [<
ffffffff80946a60>] schedule+0x30/0x98
[ 50.135659] [<
ffffffff80106278>] work_resched+0x8/0x34
[ 50.142397] ---[ end trace
0cb4f6ef5b99fe21 ]---
[ 50.148405] possible reason: unannotated irqs-off.
[ 50.154600] irq event stamp: 400463
[ 50.159566] hardirqs last enabled at (400463): [<
ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
[ 50.171981] hardirqs last disabled at (400462): [<
ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
[ 50.183897] softirqs last enabled at (400450): [<
ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
[ 50.195015] softirqs last disabled at (400425): [<
ffffffff80165e78>] irq_exit+0x110/0x128
Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
schedule() following the work_resched label because:
1) Interrupts are disabled regardless of the path we take to reach
work_resched() & schedule().
2) Performing the tracing here avoids the need to do it in paths which
disable interrupts but don't call out to C code before hitting a
path which uses the RESTORE_SOME macro that will call
trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
syscall_trace_leave() for similar reasons, ensuring that lockdep has a
consistent view of state after we re-enable interrupts.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15385/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Burton [Thu, 2 Mar 2017 22:02:40 +0000 (14:02 -0800)]
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
commit
161c51ccb7a6faf45ffe09aa5cf1ad85ccdad503 upstream.
We allocate memory for a ready_count variable per-CPU, which is accessed
via a cached non-coherent TLB mapping to perform synchronisation between
threads within the core using LL/SC instructions. In order to ensure
that the variable is contained within its own data cache line we
allocate 2 lines worth of memory & align the resulting pointer to a line
boundary. This is however unnecessary, since kmalloc is guaranteed to
return memory which is at least cache-line aligned (see
ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
Besides cleaning up the code & avoiding needless work, this has the side
effect of avoiding an arithmetic error found by Bryan on 64 bit systems
due to the 32 bit size of the former dlinesz. This led the ready_count
variable to have its upper 32b cleared erroneously for MIPS64 kernels,
causing problems when ready_count was later used on MIPS64 via cpuidle.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes:
3179d37ee1ed ("MIPS: pm-cps: add PM state entry code for CPS systems")
Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15383/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Hogan [Thu, 29 Jun 2017 14:05:04 +0000 (15:05 +0100)]
MIPS: Avoid accidental raw backtrace
commit
854236363370995a609a10b03e35fd3dc5e9e4a1 upstream.
Since commit
81a76d7119f6 ("MIPS: Avoid using unwind_stack() with
usermode") show_backtrace() invokes the raw backtracer when
cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
where user and kernel address spaces overlap.
However this is used by show_stack() which creates its own pt_regs on
the stack and leaves cp0_status uninitialised in most of the code paths.
This results in the non deterministic use of the raw back tracer
depending on the previous stack content.
show_stack() deals exclusively with kernel mode stacks anyway, so
explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
we get a useful backtrace.
Fixes:
81a76d7119f6 ("MIPS: Avoid using unwind_stack() with usermode")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16656/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Rientjes [Fri, 7 Apr 2017 23:05:00 +0000 (16:05 -0700)]
mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
commit
460bcec84e11c75122ace5976214abbc596eb91b upstream.
We got need_resched() warnings in swap_cgroup_swapoff() because
swap_cgroup_ctrl[type].length is particularly large.
Reschedule when needed.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell Currey [Fri, 17 Feb 2017 03:33:01 +0000 (14:33 +1100)]
drm/ast: Handle configuration without P2A bridge
commit
71f677a91046599ece96ebab21df956ce909c456 upstream.
The ast driver configures a window to enable access into BMC
memory space in order to read some configuration registers.
If this window is disabled, which it can be from the BMC side,
the ast driver can't function.
Closing this window is a necessity for security if a machine's
host side and BMC side are controlled by different parties;
i.e. a cloud provider offering machines "bare metal".
A recent patch went in to try to check if that window is open
but it does so by trying to access the registers in question
and testing if the result is 0xffffffff.
This method will trigger a PCIe error when the window is closed
which on some systems will be fatal (it will trigger an EEH
for example on POWER which will take out the device).
This patch improves this in two ways:
- First, if the firmware has put properties in the device-tree
containing the relevant configuration information, we use these.
- Otherwise, a bit in one of the SCU scratch registers (which
are readable via the VGA register space and writeable by the BMC)
will indicate if the BMC has closed the window. This bit has been
defined by Y.C Chen from Aspeed.
If the window is closed and the configuration isn't available from
the device-tree, some sane defaults are used. Those defaults are
hopefully sufficient for standard video modes used on a server.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kinglong Mee [Mon, 6 Mar 2017 14:29:14 +0000 (22:29 +0800)]
NFSv4: fix a reference leak caused WARNING messages
commit
366a1569bff3fe14abfdf9285e31e05e091745f5 upstream.
Because nfs4_opendata_access() has close the state when access is denied,
so the state isn't leak.
Rather than revert the commit
a974deee47, I'd like clean the strange state close.
[ 1615.094218] ------------[ cut here ]------------
[ 1615.094607] WARNING: CPU: 0 PID: 23702 at lib/list_debug.c:31 __list_add_valid+0x8e/0xa0
[ 1615.094913] list_add double add: new=
ffff9d7901d9f608, prev=
ffff9d7901d9f608, next=
ffff9d7901ee8dd0.
[ 1615.095458] Modules linked in: nfsv4(E) nfs(E) nfsd(E) tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock f2fs snd_seq_midi snd_seq_midi_event fscrypto coretemp ppdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf vmw_balloon snd_ens1371 joydev gameport snd_ac97_codec ac97_bus snd_seq snd_pcm snd_rawmidi snd_timer snd_seq_device snd soundcore nfit parport_pc parport acpi_cpufreq tpm_tis tpm_tis_core tpm i2c_piix4 vmw_vmci shpchp auth_rpcgss nfs_acl lockd(E) grace sunrpc(E) xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel mptspi e1000 serio_raw scsi_transport_spi mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: nfs]
[ 1615.097663] CPU: 0 PID: 23702 Comm: fstest Tainted: G W E 4.11.0-rc1+ #517
[ 1615.098015] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1615.098807] Call Trace:
[ 1615.099183] dump_stack+0x63/0x86
[ 1615.099578] __warn+0xcb/0xf0
[ 1615.099967] warn_slowpath_fmt+0x5f/0x80
[ 1615.100370] __list_add_valid+0x8e/0xa0
[ 1615.100760] nfs4_put_state_owner+0x75/0xc0 [nfsv4]
[ 1615.101136] __nfs4_close+0x109/0x140 [nfsv4]
[ 1615.101524] nfs4_close_state+0x15/0x20 [nfsv4]
[ 1615.101949] nfs4_close_context+0x21/0x30 [nfsv4]
[ 1615.102691] __put_nfs_open_context+0xb8/0x110 [nfs]
[ 1615.103155] put_nfs_open_context+0x10/0x20 [nfs]
[ 1615.103586] nfs4_file_open+0x13b/0x260 [nfsv4]
[ 1615.103978] do_dentry_open+0x20a/0x2f0
[ 1615.104369] ? nfs4_copy_file_range+0x30/0x30 [nfsv4]
[ 1615.104739] vfs_open+0x4c/0x70
[ 1615.105106] ? may_open+0x5a/0x100
[ 1615.105469] path_openat+0x623/0x1420
[ 1615.105823] do_filp_open+0x91/0x100
[ 1615.106174] ? __alloc_fd+0x3f/0x170
[ 1615.106568] do_sys_open+0x130/0x220
[ 1615.106920] ? __put_cred+0x3d/0x50
[ 1615.107256] SyS_open+0x1e/0x20
[ 1615.107588] entry_SYSCALL_64_fastpath+0x1a/0xa9
[ 1615.107922] RIP: 0033:0x7fab599069b0
[ 1615.108247] RSP: 002b:
00007ffcf0600d78 EFLAGS:
00000246 ORIG_RAX:
0000000000000002
[ 1615.108575] RAX:
ffffffffffffffda RBX:
00007fab59bcfae0 RCX:
00007fab599069b0
[ 1615.108896] RDX:
0000000000000200 RSI:
0000000000000200 RDI:
00007ffcf060255e
[ 1615.109211] RBP:
0000000000040010 R08:
0000000000000000 R09:
0000000000000016
[ 1615.109515] R10:
00000000000006a1 R11:
0000000000000246 R12:
0000000000041000
[ 1615.109806] R13:
0000000000040010 R14:
0000000000001000 R15:
0000000000002710
[ 1615.110152] ---[ end trace
96ed63b1306bf2f3 ]---
Fixes:
a974deee47 ("NFSv4: Fix memory and state leak in...")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Leblond [Thu, 11 May 2017 16:56:38 +0000 (18:56 +0200)]
netfilter: synproxy: fix conntrackd interaction
commit
87e94dbc210a720a34be5c1174faee5c84be963e upstream.
This patch fixes the creation of connection tracking entry from
netlink when synproxy is used. It was missing the addition of
the synproxy extension.
This was causing kernel crashes when a conntrack entry created by
conntrackd was used after the switch of traffic from active node
to the passive node.
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 3 Apr 2017 17:55:11 +0000 (10:55 -0700)]
netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
commit
2638fd0f92d4397884fd991d8f4925cb3f081901 upstream.
Denys provided an awesome KASAN report pointing to an use
after free in xt_TCPMSS
I have provided three patches to fix this issue, either in xt_TCPMSS or
in xt_tcpudp.c. It seems xt_TCPMSS patch has the smallest possible
impact.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Serhey Popovych [Tue, 20 Jun 2017 11:35:23 +0000 (14:35 +0300)]
rtnetlink: add IFLA_GROUP to ifla_policy
[ Upstream commit
db833d40ad3263b2ee3b59a1ba168bb3cfed8137 ]
Network interface groups support added while ago, however
there is no IFLA_GROUP attribute description in policy
and netlink message size calculations until now.
Add IFLA_GROUP attribute to the policy.
Fixes:
cbda10fa97d7 ("net_device: add support for network device groups")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Serhey Popovych [Tue, 20 Jun 2017 10:29:25 +0000 (13:29 +0300)]
ipv6: Do not leak throw route references
[ Upstream commit
07f615574f8ac499875b21c1142f26308234a92c ]
While commit
73ba57bfae4a ("ipv6: fix backtracking for throw routes")
does good job on error propagation to the fib_rules_lookup()
in fib rules core framework that also corrects throw routes
handling, it does not solve route reference leakage problem
happened when we return -EAGAIN to the fib_rules_lookup()
and leave routing table entry referenced in arg->result.
If rule with matched throw route isn't last matched in the
list we overwrite arg->result losing reference on throw
route stored previously forever.
We also partially revert commit
ab997ad40839 ("ipv6: fix the
incorrect return value of throw route") since we never return
routing table entry with dst.error == -EAGAIN when
CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point
to check for RTF_REJECT flag since it is always set throw
route.
Fixes:
73ba57bfae4a ("ipv6: fix backtracking for throw routes")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bert Kenward [Fri, 16 Jun 2017 08:45:08 +0000 (09:45 +0100)]
sfc: provide dummy definitions of vswitch functions
efx_probe_all() calls efx->type->vswitching_probe during probe. For
SFC4000 (Falcon) NICs this function is not defined, leading to a BUG
with the top of the call stack similar to:
? efx_pci_probe_main+0x29a/0x830
efx_pci_probe+0x7d3/0xe70
vswitching_restore and vswitching_remove also need to be defined.
Fixed in mainline by:
commit
5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Fixes:
6d8aaaf6f798 ("sfc: create VEB vswitch and vport above default firmware setup")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gao Feng [Fri, 16 Jun 2017 07:00:02 +0000 (15:00 +0800)]
net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
[ Upstream commit
9745e362add89432d2c951272a99b0a5fe4348a9 ]
The register_vlan_device would invoke free_netdev directly, when
register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
if the dev was already registered. In this case, the netdev would be
freed in netdev_run_todo later.
So add one condition check now. Only when dev is not registered, then
free it directly.
The following is the part coredump when netdev_upper_dev_link failed
in register_vlan_dev. I removed the lines which are too long.
[ 411.237457] ------------[ cut here ]------------
[ 411.237458] kernel BUG at net/core/dev.c:7998!
[ 411.237484] invalid opcode: 0000 [#1] SMP
[ 411.237705] [last unloaded: 8021q]
[ 411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G E 4.12.0-rc5+ #6
[ 411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 411.237764] task:
ffff9cbeb6685580 task.stack:
ffffa7d2807d8000
[ 411.237782] RIP: 0010:free_netdev+0x116/0x120
[ 411.237794] RSP: 0018:
ffffa7d2807dbdb0 EFLAGS:
00010297
[ 411.237808] RAX:
0000000000000002 RBX:
ffff9cbeb6ba8fd8 RCX:
0000000000001878
[ 411.237826] RDX:
0000000000000001 RSI:
0000000000000282 RDI:
0000000000000000
[ 411.237844] RBP:
ffffa7d2807dbdc8 R08:
0002986100029841 R09:
0002982100029801
[ 411.237861] R10:
0004000100029980 R11:
0004000100029980 R12:
ffff9cbeb6ba9000
[ 411.238761] R13:
ffff9cbeb6ba9060 R14:
ffff9cbe60f1a000 R15:
ffff9cbeb6ba9000
[ 411.239518] FS:
00007fb690d81700(0000) GS:
ffff9cbebb640000(0000) knlGS:
0000000000000000
[ 411.239949] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 411.240454] CR2:
00007f7115624000 CR3:
0000000077cdf000 CR4:
00000000003406e0
[ 411.240936] Call Trace:
[ 411.241462] vlan_ioctl_handler+0x3f1/0x400 [8021q]
[ 411.241910] sock_ioctl+0x18b/0x2c0
[ 411.242394] do_vfs_ioctl+0xa1/0x5d0
[ 411.242853] ? sock_alloc_file+0xa6/0x130
[ 411.243465] SyS_ioctl+0x79/0x90
[ 411.243900] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 411.244425] RIP: 0033:0x7fb69089a357
[ 411.244863] RSP: 002b:
00007ffcd04e0fc8 EFLAGS:
00000202 ORIG_RAX:
0000000000000010
[ 411.245445] RAX:
ffffffffffffffda RBX:
00007ffcd04e2884 RCX:
00007fb69089a357
[ 411.245903] RDX:
00007ffcd04e0fd0 RSI:
0000000000008983 RDI:
0000000000000003
[ 411.246527] RBP:
00007ffcd04e0fd0 R08:
0000000000000000 R09:
1999999999999999
[ 411.246976] R10:
000000000000053f R11:
0000000000000202 R12:
0000000000000004
[ 411.247414] R13:
00007ffcd04e1128 R14:
00007ffcd04e2888 R15:
0000000000000001
[ 411.249129] RIP: free_netdev+0x116/0x120 RSP:
ffffa7d2807dbdb0
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wei Wang [Fri, 16 Jun 2017 17:46:37 +0000 (10:46 -0700)]
decnet: always not take dst->__refcnt when inserting dst into hash table
[ Upstream commit
76371d2e3ad1f84426a30ebcd8c3b9b98f4c724f ]
In the existing dn_route.c code, dn_route_output_slow() takes
dst->__refcnt before calling dn_insert_route() while dn_route_input_slow()
does not take dst->__refcnt before calling dn_insert_route().
This makes the whole routing code very buggy.
In dn_dst_check_expire(), dnrt_free() is called when rt expires. This
makes the routes inserted by dn_route_output_slow() not able to be
freed as the refcnt is not released.
In dn_dst_gc(), dnrt_drop() is called to release rt which could
potentially cause the dst->__refcnt to be dropped to -1.
In dn_run_flush(), dst_free() is called to release all the dst. Again,
it makes the dst inserted by dn_route_output_slow() not able to be
released and also, it does not wait on the rcu and could potentially
cause crash in the path where other users still refer to this dst.
This patch makes sure both input and output path do not take
dst->__refcnt before calling dn_insert_route() and also makes sure
dnrt_free()/dst_free() is called when removing dst from the hash table.
The only difference between those 2 calls is that dnrt_free() waits on
the rcu while dst_free() does not.
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eli Cohen [Thu, 8 Jun 2017 16:33:16 +0000 (11:33 -0500)]
net/mlx5: Wait for FW readiness before initializing command interface
[ Upstream commit
6c780a0267b8a1075f40b39851132eeaefefcff5 ]
Before attempting to initialize the command interface we must wait till
the fw_initializing bit is clear.
If we fail to meet this condition the hardware will drop our
configuration, specifically the descriptors page address. This scenario
can happen when the firmware is still executing an FLR flow and did not
finish yet so the driver needs to wait for that to finish.
Fixes:
e3297246c2c8 ('net/mlx5_core: Wait for FW readiness on startup')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Xin Long [Thu, 15 Jun 2017 08:33:58 +0000 (16:33 +0800)]
ipv6: fix calling in6_ifa_hold incorrectly for dad work
[ Upstream commit
f8a894b218138888542a5058d0e902378fd0d4ec ]
Now when starting the dad work in addrconf_mod_dad_work, if the dad work
is idle and queued, it needs to hold ifa.
The problem is there's one gap in [1], during which if the pending dad work
is removed elsewhere. It will miss to hold ifa, but the dad word is still
idea and queue.
if (!delayed_work_pending(&ifp->dad_work))
in6_ifa_hold(ifp);
<--------------[1]
mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);
An use-after-free issue can be caused by this.
Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
net6_ifa_finish_destroy was hit because of it.
As Hannes' suggestion, this patch is to fix it by holding ifa first in
addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
the dad_work is already in queue.
Note that this patch did not choose to fix it with:
if (!mod_delayed_work(delay))
in6_ifa_hold(ifp);
As with it, when delay == 0, dad_work would be scheduled immediately, all
addrconf_mod_dad_work(0) callings had to be moved under ifp->lock.
Reported-by: Wei Chen <weichen@redhat.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
WANG Cong [Tue, 20 Jun 2017 17:46:27 +0000 (10:46 -0700)]
igmp: add a missing spin_lock_init()
[ Upstream commit
b4846fc3c8559649277e3e4e6b5cec5348a8d208 ]
Andrey reported a lockdep warning on non-initialized
spinlock:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x292/0x395 lib/dump_stack.c:52
register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
? 0xffffffffa0000000
__lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
__raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
_raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
spin_lock_bh ./include/linux/spinlock.h:304
ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
We miss a spin_lock_init() in igmpv3_add_delrec(), probably
because previously we never use it on this code path. Since
we already unlink it from the global mc_tomb list, it is
probably safe not to acquire this spinlock here. It does not
harm to have it although, to avoid conditional locking.
Fixes:
c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
WANG Cong [Mon, 12 Jun 2017 16:52:26 +0000 (09:52 -0700)]
igmp: acquire pmc lock for ip_mc_clear_src()
[ Upstream commit
c38b7d327aafd1e3ad7ff53eefac990673b65667 ]
Andrey reported a use-after-free in add_grec():
for (psf = *psf_list; psf; psf = psf_next) {
...
psf_next = psf->sf_next;
where the struct ip_sf_list's were already freed by:
kfree+0xe8/0x2b0 mm/slub.c:3882
ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
sock_release+0x8d/0x1e0 net/socket.c:597
sock_close+0x16/0x20 net/socket.c:1072
This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.
The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.
Thanks to Eric and Long for discussion on this bug.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jia-Ju Bai [Sat, 10 Jun 2017 08:49:39 +0000 (16:49 +0800)]
net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
[ Upstream commit
f146e872eb12ebbe92d8e583b2637e0741440db3 ]
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
cfctrl_linkdown_req
cfpkt_create
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
cfpkt_split
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.
To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Krister Johansen [Thu, 8 Jun 2017 20:12:38 +0000 (13:12 -0700)]
Fix an intermittent pr_emerg warning about lo becoming free.
[ Upstream commit
f186ce61bb8235d80068c390dc2aad7ca427a4c2 ]
It looks like this:
Message from syslogd@flamingo at Apr 26 00:45:00 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
They seem to coincide with net namespace teardown.
The message is emitted by netdev_wait_allrefs().
Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.
Used bcc to check the blocking in netdev_run_todo. The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected. The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.
After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting. The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.
I constructed a new bcc script that watches various events in the
liftime of a dst cache entry. Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.
[ __dst_free] on DST:
ffff883ccabb7900 IF tap1008300eth0 invoked at
1282115677036183
__dst_free
rcu_nocb_kthread
kthread
ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mateusz Jurczyk [Thu, 8 Jun 2017 09:13:36 +0000 (11:13 +0200)]
af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
[ Upstream commit
defbcf2decc903a28d8398aa477b6881e711e3ea ]
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mintz, Yuval [Wed, 7 Jun 2017 18:00:33 +0000 (21:00 +0300)]
net: Zero ifla_vf_info in rtnl_fill_vfinfo()
[ Upstream commit
0eed9cf58446b28b233388b7f224cbca268b6986 ]
Some of the structure's fields are not initialized by the
rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
they'd leak memory to user.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mateusz Jurczyk [Wed, 7 Jun 2017 14:14:29 +0000 (16:14 +0200)]
decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
[ Upstream commit
dd0da17b209ed91f39872766634ca967c170ada1 ]
Verify that the length of the socket buffer is sufficient to cover the
nlmsghdr structure before accessing the nlh->nlmsg_len field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Potapenko [Tue, 6 Jun 2017 13:56:54 +0000 (15:56 +0200)]
net: don't call strlen on non-terminated string in dev_set_alias()
[ Upstream commit
c28294b941232931fbd714099798eb7aa7e865d7 ]
KMSAN reported a use of uninitialized memory in dev_set_alias(),
which was caused by calling strlcpy() (which in turn called strlen())
on the user-supplied non-terminated string.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Willem de Bruijn [Sun, 19 Feb 2017 00:00:45 +0000 (19:00 -0500)]
ipv6: release dst on error in ip6_dst_lookup_tail
commit
00ea1ceebe0d9f2dc1cc2b7bd575a00100c27869 upstream.
If ip6_dst_lookup_tail has acquired a dst and fails the IPv4-mapped
check, release the dst before returning an error.
Fixes:
ec5e3b0a1d41 ("ipv6: Inhibit IPv4-mapped src address on the wire.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Thu, 29 Jun 2017 10:49:08 +0000 (12:49 +0200)]
Linux 4.4.75
Guilherme G. Piccoli [Thu, 29 Dec 2016 00:13:15 +0000 (22:13 -0200)]
nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
commit
b5a10c5f7532b7473776da87e67f8301bbc32693 upstream.
Commit
54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter
readiness") introduced a quirk to adapters that cannot read the bit
NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
need a delay or else the action of reading the bit NVME_CSTS_RDY could
somehow corrupt adapter's registers state and it never recovers.
When this quirk was added, we checked ctrl->tagset in order to avoid
quirking in probe time, supposing we would never require such delay
during probe. Well, it was too optimistic; we in fact need this quirk
at probe time in some cases, like after a kexec.
In some experiments, after abnormal shutdown of machine (aka power cord
unplug), we booted into our bootloader in Power, which is a Linux kernel,
and kexec'ed into another distro. If this kexec is too quick, we end up
reaching the probe of NVMe adapter in that distro when adapter is in
bad state (not fully initialized on our bootloader). What happens next
is that nvme_wait_ready() is unable to complete, except if the quirk is
enabled.
So, this patch removes the original ctrl->tagset verification in order
to enable the quirk even on probe time.
Fixes:
54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter readiness")
Reported-by: Andrew Byrne <byrneadw@ie.ibm.com>
Reported-by: Jaime A. H. Gomez <jahgomez@mx1.ibm.com>
Reported-by: Zachary D. Myers <zdmyers@us.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Jeffrey Lien <Jeff.Lien@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[mauricfo: backport to v4.4.70 without nvme quirk handling & nvme_ctrl]
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Tested-by: Narasimhan Vaidyanathan <vnarasimhan@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Guilherme G. Piccoli [Tue, 14 Jun 2016 21:22:41 +0000 (18:22 -0300)]
nvme/quirk: Add a delay before checking for adapter readiness
commit
54adc01055b75ec8769c5a36574c7a0895c0c0b2 upstream.
When disabling the controller, the specification says the register
NVME_REG_CC should be written and then driver needs to wait the
adapter to be ready, which is checked by reading another register
bit (NVME_CSTS_RDY). There's a timeout validation in this checking,
so in case this timeout is reached the driver gives up and removes
the adapter from the system.
After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003)
(HGST adapter) end up being removed if we issue a reset_controller,
because driver keeps verifying the NVME_REG_CSTS until the timeout is
reached. This patch adds a necessary quirk for this adapter, by
introducing a delay before nvme_wait_ready(), so the reset procedure
is able to be completed. This quirk is needed because just increasing
the timeout is not enough in case of this adapter - the driver must
wait before start reading NVME_REG_CSTS register on this specific
device.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
[mauricfo: backport to v4.4.70 without nvme quirk handling & nvme_ctrl]
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Tested-by: Narasimhan Vaidyanathan <vnarasimhan@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Tue, 30 May 2017 15:21:51 +0000 (16:21 +0100)]
net: phy: fix marvell phy status reading
commit
898805e0cdf7fd860ec21bf661d3a0285a3defbd upstream.
The Marvell driver incorrectly provides phydev->lp_advertising as the
logical and of the link partner's advert and our advert. This is
incorrect - this field is supposed to store the link parter's unmodified
advertisment.
This allows ethtool to report the correct link partner auto-negotiation
status.
Fixes:
be937f1f89ca ("Marvell PHY m88e1111 driver fix")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yendapally Reddy Dhananjaya Reddy [Wed, 8 Feb 2017 22:14:26 +0000 (17:14 -0500)]
net: phy: Initialize mdio clock at probe function
commit
bb1a619735b4660f21bce3e728b937640024b4ad upstream.
USB PHYs need the MDIO clock divisor enabled earlier to work.
Initialize mdio clock divisor in probe function. The ext bus
bit available in the same register will be used by mdio mux
to enable external mdio.
Signed-off-by: Yendapally Reddy Dhananjaya Reddy <yendapally.reddy@broadcom.com>
Fixes:
ddc24ae1 ("net: phy: Broadcom iProc MDIO bus driver")
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
William Wu [Tue, 25 Apr 2017 09:45:48 +0000 (17:45 +0800)]
usb: gadget: f_fs: avoid out of bounds access on comp_desc
commit
b7f73850bb4fac1e2209a4dd5e636d39be92f42c upstream.
Companion descriptor is only used for SuperSpeed endpoints,
if the endpoints are HighSpeed or FullSpeed, the Companion
descriptor will not allocated, so we can only access it if
gadget is SuperSpeed.
I can reproduce this issue on Rockchip platform rk3368 SoC
which supports USB 2.0, and use functionfs for ADB. Kernel
build with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y report
the following BUG:
==================================================================
BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr
ffffffc0601f6509
Read of size 1 by task swapper/0/0
============================================================================
BUG kmalloc-256 (Not tainted): kasan: bad access detected
----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1
alloc_debug_processing+0x128/0x17c
___slab_alloc.constprop.58+0x50c/0x610
__slab_alloc.isra.55.constprop.57+0x24/0x34
__kmalloc+0xe0/0x250
ffs_func_bind+0x52c/0x99c
usb_add_function+0xd8/0x1d4
configfs_composite_bind+0x48c/0x570
udc_bind_to_driver+0x6c/0x170
usb_udc_attach_driver+0xa4/0xd0
gadget_dev_desc_UDC_store+0xcc/0x118
configfs_write_file+0x1a0/0x1f8
__vfs_write+0x64/0x174
vfs_write+0xe4/0x200
SyS_write+0x68/0xc8
el0_svc_naked+0x24/0x28
INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247
...
Call trace:
[<
ffffff900808aab4>] dump_backtrace+0x0/0x230
[<
ffffff900808acf8>] show_stack+0x14/0x1c
[<
ffffff90084ad420>] dump_stack+0xa0/0xc8
[<
ffffff90082157cc>] print_trailer+0x188/0x198
[<
ffffff9008215948>] object_err+0x3c/0x4c
[<
ffffff900821b5ac>] kasan_report+0x324/0x4dc
[<
ffffff900821aa38>] __asan_load1+0x24/0x50
[<
ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0
[<
ffffff90089d3760>] composite_setup+0xdcc/0x1ac8
[<
ffffff90089d7394>] android_setup+0x124/0x1a0
[<
ffffff90089acd18>] _setup+0x54/0x74
[<
ffffff90089b6b98>] handle_ep0+0x3288/0x4390
[<
ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4
[<
ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298
[<
ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20
[<
ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac
[<
ffffff9008116610>] handle_irq_event+0x60/0xa0
[<
ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4
[<
ffffff9008115568>] generic_handle_irq+0x30/0x40
[<
ffffff90081159b4>] __handle_domain_irq+0xac/0xdc
[<
ffffff9008080e9c>] gic_handle_irq+0x64/0xa4
...
Memory state around the buggy address:
ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc
>
ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
==================================================================
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Jerry Zhang <zhangjerry@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Ellerman [Thu, 22 Jun 2017 06:52:51 +0000 (16:52 +1000)]
powerpc/slb: Force a full SLB flush when we insert for a bad EA
[Note this patch is not upstream. The bug fix was fixed differently in
upstream prior to the bug being identified.]
The SLB miss handler calls slb_allocate_realmode() in order to create an
SLB entry for the faulting address. At the very start of that function
we check that the faulting Effective Address (EA) is less than
PGTABLE_RANGE (ignoring the region), ie. is it an address which could
possibly fit in the virtual address space.
For an EA which fails that test, we branch out of line (to label 8), but
we still go on to create an SLB entry for the address. The SLB entry we
create has a VSID of 0, which means it will never match anything in the
hash table and so can't actually translate to a physical address.
However that SLB entry will be inserted in the SLB, and so needs to be
managed properly like any other SLB entry. In particular we need to
insert the SLB entry in the SLB cache, so that it will be flushed when
the process is descheduled.
And that is where the bugs begin. The first bug is that slb_finish_load()
uses cr7 to decide if it should insert the SLB entry into the SLB cache.
When we come from the invalid EA case we don't set cr7, it just has some
junk value from userspace. So we may or may not insert the SLB entry in
the SLB cache. If we fail to insert it, we may then incorrectly leave it
in the SLB when the process is descheduled.
The second bug is that even if we do happen to add the entry to the SLB
cache, we do not have enough bits in the SLB cache to remember the full
ESID value for very large EAs.
For example if a process branches to 0x788c545a18000000, that results in
a 256MB SLB entry with an ESID of 0x788c545a1. But each entry in the SLB
cache is only 32-bits, meaning we truncate the ESID to 0x88c545a1. This
has the same effect as the first bug, we incorrectly leave the SLB entry
in the SLB when the process is descheduled.
When a process accesses an invalid EA it results in a SEGV signal being
sent to the process, which typically results in the process being
killed. Process death isn't instantaneous however, the process may catch
the SEGV signal and continue somehow, or the kernel may start writing a
core dump for the process, either of which means it's possible for the
process to be preempted while its processing the SEGV but before it's
been killed.
If that happens, when the process is scheduled back onto the CPU we will
allocate a new SLB entry for the NIP, which will insert a second entry
into the SLB for the bad EA. Because we never flushed the original
entry, due to either bug one or two, we now have two SLB entries that
match the same EA.
If another access is made to that EA, either by the process continuing
after catching the SEGV, or by a second process accessing the same bad
EA on the same CPU, we will trigger an SLB multi-hit machine check
exception. This has been observed happening in the wild.
The fix is when we hit the invalid EA case, we mark the SLB cache as
being full. This causes us to not insert the truncated ESID into the SLB
cache, and means when the process is switched out we will flush the
entire SLB. Note that this works both for the original fault and for a
subsequent call to slb_allocate_realmode() from switch_slb().
Because we mark the SLB cache as full, it doesn't really matter what
value is in cr7, but rather than leaving it as something random we set
it to indicate the address was a kernel address. That also skips the
attempt to insert it in the SLB cache which is a nice side effect.
Another way to fix the bug would be to make the entries in the SLB cache
wider, so that we don't truncate the ESID. However this would be a more
intrusive change as it alters the size and layout of the paca.
This bug was fixed in upstream by commit
f0f558b131db ("powerpc/mm:
Preserve CFAR value on SLB miss caused by access to bogus address"),
which changed the way we handle a bad EA entirely removing this bug in
the process.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joël Esponde [Wed, 23 Nov 2016 11:47:40 +0000 (12:47 +0100)]
mtd: spi-nor: fix spansion quad enable
commit
807c16253319ee6ccf8873ae64f070f7eb532cd5 upstream.
With the S25FL127S nor flash part, each writing to the configuration
register takes hundreds of ms. During that time, no more accesses to
the flash should be done (even reads).
This commit adds a wait loop after the register writing until the flash
finishes its work.
This issue could make rootfs mounting fail when the latter was done too
much closely to this quad enable bit setting step. And in this case, a
driver as UBIFS may try to recover the filesystem and may broke it
completely.
Signed-off-by: Joël Esponde <joel.esponde@honeywell.com>
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tobias Wolf [Wed, 23 Nov 2016 09:40:07 +0000 (10:40 +0100)]
of: Add check to of_scan_flat_dt() before accessing initial_boot_params
commit
3ec754410cb3e931a6c4920b1a150f21a94a2bf4 upstream.
An empty __dtb_start to __dtb_end section might result in
initial_boot_params being null for arch/mips/ralink. This showed that the
boot process hangs indefinitely in of_scan_flat_dt().
Signed-off-by: Tobias Wolf <dev-NTEO@vplace.de>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14605/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Howells [Wed, 14 Jun 2017 23:12:24 +0000 (00:12 +0100)]
rxrpc: Fix several cases where a padded len isn't checked in ticket decode
commit
5f2f97656ada8d811d3c1bef503ced266fcd53a0 upstream.
This fixes CVE-2017-7482.
When a kerberos 5 ticket is being decoded so that it can be loaded into an
rxrpc-type key, there are several places in which the length of a
variable-length field is checked to make sure that it's not going to
overrun the available data - but the data is padded to the nearest
four-byte boundary and the code doesn't check for this extra. This could
lead to the size-remaining variable wrapping and the data pointer going
over the end of the buffer.
Fix this by making the various variable-length data checks use the padded
length.
Reported-by: 石磊 <shilei-c@360.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Wed, 10 May 2017 16:18:26 +0000 (18:18 +0200)]
USB: usbip: fix nonconforming hub descriptor
commit
ec963b412a54aac8e527708ecad06a6988a86fb4 upstream.
Fix up the root-hub descriptor to accommodate the variable-length
DeviceRemovable and PortPwrCtrlMask fields, while marking all ports as
removable (and leaving the reserved bit zero unset).
Also add a build-time constraint on VHCI_HC_PORTS which must never be
greater than USB_MAXCHILDREN (but this was only enforced through a
KConfig constant).
This specifically fixes the descriptor layout whenever VHCI_HC_PORTS is
greater than seven (default is 8).
Fixes:
04679b3489e0 ("Staging: USB/IP: add client driver")
Cc: Takahiro Hirofuchi <hirofuchi@users.sourceforge.net>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
[ johan: backport to v4.4, which uses VHCI_NPORTS ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Thu, 15 Jun 2017 15:12:28 +0000 (11:12 -0400)]
drm/amdgpu: adjust default display clock
commit
52b482b0f4fd6d5267faf29fe91398e203f3c230 upstream.
Increase the default display clock on newer asics to
accomodate some high res modes with really high refresh
rates.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Thu, 15 Jun 2017 14:55:11 +0000 (10:55 -0400)]
drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
commit
05b4017b37f1fce4b7185f138126dd8decdb381f upstream.
We were using the wrong structure which lead to an overflow
on some boards.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Mon, 19 Jun 2017 19:59:58 +0000 (15:59 -0400)]
drm/radeon: add a quirk for Toshiba Satellite L20-183
commit
acfd6ee4fa7ebeee75511825fe02be3f7ac1d668 upstream.
Fixes resume from suspend.
bug: https://bugzilla.kernel.org/show_bug.cgi?id=196121
Reported-by: Przemek <soprwa@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Mon, 19 Jun 2017 16:52:47 +0000 (12:52 -0400)]
drm/radeon: add a PX quirk for another K53TK variant
commit
4eb59793cca00b0e629b6d55b5abb5acb82c5868 upstream.
Disable PX on these systems.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=101491
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicholas Bellinger [Thu, 8 Jun 2017 03:29:50 +0000 (20:29 -0700)]
iscsi-target: Reject immediate data underflow larger than SCSI transfer length
commit
abb85a9b512e8ca7ad04a5a8a6db9664fe644974 upstream.
When iscsi WRITE underflow occurs there are two different scenarios
that can happen.
Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH
underflow is detected, the iscsi immediate data payload is the
smaller SCSI CDB TRANSFER LENGTH.
That is, when a host fabric LLD is using a fixed size EDTL for
a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual
SCSI payload ends up being smaller than EDTL. In iscsi, this
means the received iscsi immediate data payload matches the
smaller SCSI CDB TRANSFER LENGTH, because there is no more
SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH.
However, it's possible for a malicous host to send a WRITE
underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH,
but incoming iscsi immediate data actually matches EDTL.
In the wild, we've never had a iscsi host environment actually
try to do this.
For this special case, it's wrong to truncate part of the
control CDB payload and continue to process the command during
underflow when immediate data payload received was larger than
SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the
bogus payload as a defensive action.
Note this potential bug was originally relaxed by the following
for allowing WRITE underflow in MSFT FCP host environments:
commit
c72c5250224d475614a00c1d7e54a67f77cd3410
Author: Roland Dreier <roland@purestorage.com>
Date: Wed Jul 22 15:08:18 2015 -0700
target: allow underflow/overflow for PR OUT etc. commands
Cc: Roland Dreier <roland@purestorage.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicholas Bellinger [Sat, 3 Jun 2017 03:00:17 +0000 (20:00 -0700)]
target: Fix kref->refcount underflow in transport_cmd_finish_abort
commit
73d4e580ccc5c3e05cea002f18111f66c9c07034 upstream.
This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED
when a fabric driver drops it's second reference from below the
target_core_tmr.c based callers of transport_cmd_finish_abort().
Recently with the conversion of kref to refcount_t, this bug was
manifesting itself as:
[705519.601034] refcount_t: underflow; use-after-free.
[705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs
[705539.719111] ------------[ cut here ]------------
[705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51
Since the original kref atomic_t based kref_put() didn't check for
underflow and only invoked the final callback when zero was reached,
this bug did not manifest in practice since all se_cmd memory is
using preallocated tags.
To address this, go ahead and propigate the existing return from
transport_put_cmd() up via transport_cmd_finish_abort(), and
change transport_cmd_finish_abort() + core_tmr_handle_tas_abort()
callers to only do their local target_put_sess_cmd() if necessary.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Stultz [Thu, 8 Jun 2017 23:44:20 +0000 (16:44 -0700)]
time: Fix clock->read(clock) race around clocksource changes
commit
ceea5e3771ed2378668455fa21861bead7504df5 upstream.
In tests, which excercise switching of clocksources, a NULL
pointer dereference can be observed on AMR64 platforms in the
clocksource read() function:
u64 clocksource_mmio_readl_down(struct clocksource *c)
{
return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
}
This is called from the core timekeeping code via:
cycle_now = tkr->read(tkr->clock);
tkr->read is the cached tkr->clock->read() function pointer.
When the clocksource is changed then tkr->clock and tkr->read
are updated sequentially. The code above results in a sequential
load operation of tkr->read and tkr->clock as well.
If the store to tkr->clock hits between the loads of tkr->read
and tkr->clock, then the old read() function is called with the
new clock pointer. As a consequence the read() function
dereferences a different data structure and the resulting 'reg'
pointer can point anywhere including NULL.
This problem was introduced when the timekeeping code was
switched over to use struct tk_read_base. Before that, it was
theoretically possible as well when the compiler decided to
reload clock in the code sequence:
now = tk->clock->read(tk->clock);
Add a helper function which avoids the issue by reading
tk_read_base->clock once into a local variable clk and then issue
the read function via clk->read(clk). This guarantees that the
read() function always gets the proper clocksource pointer handed
in.
Since there is now no use for the tkr.read pointer, this patch
also removes it, and to address stopping the fast timekeeper
during suspend/resume, it introduces a dummy clocksource to use
rather then just a dummy read function.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Drake [Tue, 20 Jun 2017 02:48:52 +0000 (19:48 -0700)]
Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list
commit
817ae460c784f32cd45e60b2b1b21378c3c6a847 upstream.
Without this quirk, the touchpad is not responsive on this product, with
the following message repeated in the logs:
psmouse serio1: bad data from KBC - timeout
Add it to the notimeout list alongside other similar Fujitsu laptops.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Naveen N. Rao [Thu, 1 Jun 2017 10:48:15 +0000 (16:18 +0530)]
powerpc/kprobes: Pause function_graph tracing during jprobes handling
commit
a9f8553e935f26cb5447f67e280946b0923cd2dc upstream.
This fixes a crash when function_graph and jprobes are used together.
This is essentially commit
237d28db036e ("ftrace/jprobes/x86: Fix
conflict between jprobes and function graph tracing"), but for powerpc.
Jprobes breaks function_graph tracing since the jprobe hook needs to use
jprobe_return(), which never returns back to the hook, but instead to
the original jprobe'd function. The solution is to momentarily pause
function_graph tracing before invoking the jprobe hook and re-enable it
when returning back to the original jprobe'd function.
Fixes:
6794c78243bf ("powerpc64: port of the function graph tracer")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 13 Jun 2017 09:31:16 +0000 (04:31 -0500)]
signal: Only reschedule timers on signals timers have sent
commit
57db7e4a2d92c2d3dfbca4ef8057849b2682436b upstream.
Thomas Gleixner wrote:
> The CRIU support added a 'feature' which allows a user space task to send
> arbitrary (kernel) signals to itself. The changelog says:
>
> The kernel prevents sending of siginfo with positive si_code, because
> these codes are reserved for kernel. I think we can allow a task to
> send such a siginfo to itself. This operation should not be dangerous.
>
> Quite contrary to that claim, it turns out that it is outright dangerous
> for signals with info->si_code == SI_TIMER. The following code sequence in
> a user space task allows to crash the kernel:
>
> id = timer_create(CLOCK_XXX, ..... signo = SIGX);
> timer_set(id, ....);
> info->si_signo = SIGX;
> info->si_code = SI_TIMER:
> info->_sifields._timer._tid = id;
> info->_sifields._timer._sys_private = 2;
> rt_[tg]sigqueueinfo(..., SIGX, info);
> sigemptyset(&sigset);
> sigaddset(&sigset, SIGX);
> rt_sigtimedwait(sigset, info);
>
> For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
> results in a kernel crash because sigwait() dequeues the signal and the
> dequeue code observes:
>
> info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
>
> which triggers the following callchain:
>
> do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
>
> arm_timer() executes a list_add() on the timer, which is already armed via
> the timer_set() syscall. That's a double list add which corrupts the posix
> cpu timer list. As a consequence the kernel crashes on the next operation
> touching the posix cpu timer list.
>
> Posix clocks which are internally implemented based on hrtimers are not
> affected by this because hrtimer_start() can handle already armed timers
> nicely, but it's a reliable way to trigger the WARN_ON() in
> hrtimer_forward(), which complains about calling that function on an
> already armed timer.
This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.
The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.
Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.
Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers. This makes it clear why the code cares about preallocated
timers.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference:
66dd34ad31e5 ("signal: allow to send any siginfo to itself")
Reference:
1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes:
db8b50ba75f2 ("[PATCH] POSIX clocks & timers")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sebastian Parschauer [Tue, 6 Jun 2017 11:53:13 +0000 (13:53 +0200)]
HID: Add quirk for Dell PIXART OEM mouse
commit
3db28271f0feae129262d30e41384a7c4c767987 upstream.
This mouse is also known under other IDs. It needs the quirk
ALWAYS_POLL or will disconnect in runlevel 1 or 3.
Signed-off-by: Sebastian Parschauer <sparschauer@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pavel Shilovsky [Tue, 6 Jun 2017 23:58:58 +0000 (16:58 -0700)]
CIFS: Improve readdir verbosity
commit
dcd87838c06f05ab7650b249ebf0d5b57ae63e1e upstream.
Downgrade the loglevel for SMB2 to prevent filling the log
with messages if e.g. readdir was interrupted. Also make SMB2
and SMB1 codepaths do the same logging during readdir.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Mackerras [Thu, 15 Jun 2017 06:10:27 +0000 (16:10 +1000)]
KVM: PPC: Book3S HV: Preserve userspace HTM state properly
commit
46a704f8409f79fd66567ad3f8a7304830a84293 upstream.
If userspace attempts to call the KVM_RUN ioctl when it has hardware
transactional memory (HTM) enabled, the values that it has put in the
HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by
guest values. To fix this, we detect this condition and save those
SPR values in the thread struct, and disable HTM for the task. If
userspace goes to access those SPRs or the HTM facility in future,
a TM-unavailable interrupt will occur and the handler will reload
those SPRs and re-enable HTM.
If userspace has started a transaction and suspended it, we would
currently lose the transactional state in the guest entry path and
would almost certainly get a "TM Bad Thing" interrupt, which would
cause the host to crash. To avoid this, we detect this case and
return from the KVM_RUN ioctl with an EINVAL error, with the KVM
exit reason set to KVM_EXIT_FAIL_ENTRY.
Fixes:
b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Matveychikov [Fri, 23 Jun 2017 22:08:49 +0000 (15:08 -0700)]
lib/cmdline.c: fix get_options() overflow while parsing ranges
commit
a91e0f680bcd9e10c253ae8b62462a38bd48f09f upstream.
When using get_options() it's possible to specify a range of numbers,
like 1-100500. The problem is that it doesn't track array size while
calling internally to get_range() which iterates over the range and
fills the memory with numbers.
Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Fri, 23 Jun 2017 22:08:43 +0000 (15:08 -0700)]
autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
commit
9fa4eb8e490a28de40964b1b0e583d8db4c7e57c upstream.
If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl,
autofs4_d_automount() will return
ERR_PTR(status)
with that status to follow_automount(), which will then dereference an
invalid pointer.
So treat a positive status the same as zero, and map to ENOENT.
See comment in systemd src/core/automount.c::automount_send_ready().
Link: http://lkml.kernel.org/r/871sqwczx5.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Fri, 23 Jun 2017 22:08:57 +0000 (15:08 -0700)]
fs/exec.c: account for argv/envp pointers
commit
98da7d08850fb8bdeb395d6368ed15753304aa0c upstream.
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.
For example, on 32-bit with a 8MB stack rlimit, an exec with
1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (
1677721 * 4 ==
6710884).
The result (
1677721 +
6710884 ==
8388605) would exhaust stack space
entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-
1000365).
[akpm@linux-foundation.org: additional commenting from Kees]
Fixes:
b6a2fea39318 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 26 Jun 2017 05:13:24 +0000 (07:13 +0200)]
Linux 4.4.74
Hugh Dickins [Tue, 20 Jun 2017 09:10:44 +0000 (02:10 -0700)]
mm: fix new crash in unmapped_area_topdown()
commit
f4cb767d76cf7ee72f97dd76f6cfa6c76a5edc89 upstream.
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown(). Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there. Fix that, and
the similar case in its alternative, unmapped_area().
Fixes:
1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Mon, 19 Jun 2017 15:34:05 +0000 (17:34 +0200)]
Allow stack to grow up to address space limit
commit
bd726c90b6b8ce87602208701b208a208e6d5600 upstream.
Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 19 Jun 2017 11:03:24 +0000 (04:03 -0700)]
mm: larger stack guard gap, between vmas
commit
1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
[gkh: minor build fixes for 4.4]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Gleixner [Tue, 30 May 2017 21:15:35 +0000 (23:15 +0200)]
alarmtimer: Rate limit periodic intervals
commit
ff86bf0c65f14346bf2440534f9ba5ac232c39a0 upstream.
The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very samll interval have a similar CPU hog
effect as the previously fixed overflow issue.
The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timer use:
timer expires -> queue signal -> deliver signal -> rearm timer
This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.
Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.
So for a quick fix limit the interval to one jiffie. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which have 1 second resolution. It could be therefor argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Burton [Fri, 2 Jun 2017 18:35:01 +0000 (11:35 -0700)]
MIPS: Fix bnezc/jialc return address calculation
commit
1a73d9310e093fc3adffba4d0a67b9fab2ee3f63 upstream.
The code handling the pop76 opcode (ie. bnezc & jialc instructions) in
__compute_return_epc_for_insn() needs to set the value of $31 in the
jialc case, which is encoded with rs = 0. However its check to
differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately
backwards, meaning that if we emulate a bnezc instruction we clobber $31
& if we emulate a jialc instruction it actually behaves like a jic
instruction.
Fix this by inverting the check of rs to match the way the instructions
are actually encoded.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes:
28d6f93d201d ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16178/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shuah Khan [Tue, 10 Jan 2017 23:05:28 +0000 (16:05 -0700)]
usb: dwc3: exynos fix axius clock error path to do cleanup
commit
8ae584d1951f241efd45499f8774fd7066f22823 upstream.
Axius clock error path returns without disabling clock and suspend clock.
Fix it to disable them before returning error.
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Gleixner [Tue, 30 May 2017 21:15:34 +0000 (23:15 +0200)]
alarmtimer: Prevent overflow of relative timers
commit
f4781e76f90df7aec400635d73ea4c35ee1d4765 upstream.
Andrey reported a alartimer related RCU stall while fuzzing the kernel with
syzkaller.
The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.
This results in a permanent firing alarmtimer which hogs the CPU.
Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Heiner Kallweit [Sat, 10 Jun 2017 22:38:36 +0000 (00:38 +0200)]
genirq: Release resources in __setup_irq() error path
commit
fa07ab72cbb0d843429e61bf179308aed6cbe0dd upstream.
In case __irq_set_trigger() fails the resources requested via
irq_request_resources() are not released.
Add the missing release call into the error handling path.
Fixes:
c1bacbae8192 ("genirq: Provide irq_request/release_resources chip callbacks")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/655538f5-cb20-a892-ff15-fbd2dd1fa4ec@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yu Zhao [Fri, 16 Jun 2017 21:02:31 +0000 (14:02 -0700)]
swap: cond_resched in swap_cgroup_prepare()
commit
ef70762948dde012146926720b70e79736336764 upstream.
I saw need_resched() warnings when swapping on large swapfile (TBs)
because continuously allocating many pages in swap_cgroup_prepare() took
too long.
We already cond_resched when freeing page in swap_cgroup_swapoff(). Do
the same for the page allocation.
Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Morse [Fri, 16 Jun 2017 21:02:29 +0000 (14:02 -0700)]
mm/memory-failure.c: use compound_head() flags for huge pages
commit
7258ae5c5a2ce2f5969e8b18b881be40ab55433d upstream.
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page,
this means if there are no relevant flags for this tail page, we use the
head pages flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes:
524fca1e7356 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages")
Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alan Stern [Tue, 13 Jun 2017 19:23:42 +0000 (15:23 -0400)]
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
commit
f16443a034c7aa359ddf6f0f9bc40d01ca31faea upstream.
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr
ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at
ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [
ffff88003a2bdae0,
ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:
ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw:
0100000000008100 0000000000000000 0000000000000000 0000000100170017
> raw:
ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>
ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >
ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
>
ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>
ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated. The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking. And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.
The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.
include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal. This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.
The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations. The
patch fixes it too.
Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs. It must
use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes
that bug as well.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Corentin Labbe [Fri, 9 Jun 2017 11:48:41 +0000 (14:48 +0300)]
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
commit
d2f48f05cd2a2a0a708fbfa45f1a00a87660d937 upstream.
When plugging an USB webcam I see the following message:
[106385.615559] xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
[106390.583860] handle_tx_event: 913 callbacks suppressed
With this patch applied, I get no more printing of this message.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Mon, 8 May 2017 22:55:17 +0000 (15:55 -0700)]
drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR()
commit
8128a31eaadbcdfa37774bbd28f3f00bac69996a upstream.
c2port_device_register() never returns NULL, it uses error pointers.
Link: http://lkml.kernel.org/r/20170412083321.GC3250@mwanda
Fixes:
65131cd52b9e ("c2port: add c2port support for Eurotech Duramar 2150")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Rodolfo Giometti <giometti@linux.it>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chris Brandt [Thu, 27 Apr 2017 19:12:49 +0000 (12:12 -0700)]
usb: r8a66597-hcd: decrease timeout
commit
dd14a3e9b92ac6f0918054f9e3477438760a4fa6 upstream.
The timeout for BULK packets was 300ms which is a long time if other
endpoints or devices are waiting for their turn. Changing it to 50ms
greatly increased the overall performance for multi-endpoint devices.
Fixes:
5d3043586db4 ("usb: r8a66597-hcd: host controller driver for R8A6659")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chris Brandt [Thu, 27 Apr 2017 19:12:02 +0000 (12:12 -0700)]
usb: r8a66597-hcd: select a different endpoint on timeout
commit
1f873d857b6c2fefb4dada952674aa01bcfb92bd upstream.
If multiple endpoints on a single device have pending IN URBs and one
endpoint times out due to NAKs (perfectly legal), select a different
endpoint URB to try.
The existing code only checked to see another device address has pending
URBs and ignores other IN endpoints on the current device address. This
leads to endpoints never getting serviced if one endpoint is using NAK as
a flow control method.
Fixes:
5d3043586db4 ("usb: r8a66597-hcd: host controller driver for R8A6659")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Wed, 10 May 2017 16:18:25 +0000 (18:18 +0200)]
USB: gadget: dummy_hcd: fix hub-descriptor removable fields
commit
d81182ce30dbd497a1e7047d7fda2af040347790 upstream.
Flag the first and only port as removable while also leaving the
remaining bits (including the reserved bit zero) unset in accordance
with the specifications:
"Within a byte, if no port exists for a given location, the bit
field representing the port characteristics shall be 0."
Also add a comment marking the legacy PortPwrCtrlMask field.
Fixes:
1cd8fd2887e1 ("usb: gadget: dummy_hcd: add SuperSpeed support")
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>