Linus Torvalds [Sat, 27 May 2017 16:06:43 +0000 (09:06 -0700)]
Merge branch 'ras-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:
- Export memory_error() so the NFIT module can utilize it
- Handle memory errors in NFIT correctly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
acpi, nfit: Fix the memory error check in nfit_handle_mce()
x86/MCE: Export memory_error()
Linus Torvalds [Sat, 27 May 2017 16:02:41 +0000 (09:02 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
- Synchronization of tools and kernel headers
- A series of fixes for perf report addressing various failures:
* Handle invalid maps properly
* Plug a memory leak
* Handle frames and callchain order correctly
- Fixes for handling inlines and children mode
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/include: Sync kernel ABI headers with tooling headers
perf tools: Put caller above callee in --children mode
perf report: Do not drop last inlined frame
perf report: Always honor callchain order for inlined nodes
perf script: Add --inline option for debugging
perf report: Fix off-by-one for non-activation frames
perf report: Fix memory leak in addr2line when called by addr2inlines
perf report: Don't crash on invalid maps in `-g srcline` mode
Linus Torvalds [Sat, 27 May 2017 15:59:37 +0000 (08:59 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Thomas Gleixner:
"A fix for a state leak which was introduced in the recent rework of
futex/rtmutex interaction"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Linus Torvalds [Sat, 27 May 2017 15:52:27 +0000 (08:52 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull kthread fix from Thomas Gleixner:
"A single fix which prevents a use after free when kthread fork fails"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread: Fix use-after-free if kthread fork fails
Linus Torvalds [Sat, 27 May 2017 15:30:30 +0000 (08:30 -0700)]
Merge tag 'trace-v4.12-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for
read/write as well as execute. To shrink the possible attack surface,
he added calls to set them to ro. Which also uncovered some other
issues with freeing module allocated memory that had its permissions
changed.
Kprobes had a similar issue which is fixed and a selftest was added to
trigger that issue again"
* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/ftrace: Make sure that ftrace trampolines are not RWX
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
selftests/ftrace: Add a testcase for many kprobe events
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
ftrace: Fix memory leak in ftrace_graph_release()
Thomas Gleixner [Thu, 25 May 2017 08:57:51 +0000 (10:57 +0200)]
x86/ftrace: Make sure that ftrace trampolines are not RWX
ftrace uses module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.
Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.
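The write-then-protect pattern described above can be illustrated in user space. This is only a sketch under stated assumptions: it uses mmap()/mprotect() as a stand-in for module_alloc() and set_memory_rw/ro(), the helper names sealed_page_create() and sealed_page_update() are hypothetical, and the executable bit is left out so the example runs on systems that restrict executable anonymous mappings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: allocate one page, fill it, then drop write access. */
static void *sealed_page_create(const void *data, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0 || len > (size_t)page)
		return NULL;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	memcpy(p, data, len);
	/* Analogue of set_memory_ro(): the page is no longer writable. */
	if (mprotect(p, (size_t)page, PROT_READ) != 0) {
		munmap(p, (size_t)page);
		return NULL;
	}
	return p;
}

/* Hypothetical helper: reopen the page for writing, modify it, seal it again. */
static int sealed_page_update(void *p, const void *data, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);

	assert(page > 0 && len <= (size_t)page);
	if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) != 0)
		return -1;
	memcpy(p, data, len);
	return mprotect(p, (size_t)page, PROT_READ);
}
```

The point of the sketch is that the writable window is limited to the update itself; between updates the page stays read-only.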
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Fri, 26 May 2017 14:14:11 +0000 (10:14 -0400)]
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:
kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f
Interrupts should not be enabled this early in the boot process. It is
also fine to leave interrupts disabled during this time, as there's only one
CPU running and on_each_cpu() runs only on the current CPU.
If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.
Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Fri, 26 May 2017 04:44:54 +0000 (13:44 +0900)]
selftests/ftrace: Add a testcase for many kprobe events
Add a testcase to test kprobes via ftrace interface
with many concurrent kprobe events.
This tries to add many kprobe events (up to 256) on
kernel functions. To avoid making ftrace-based
kprobes (kprobes on fentry), it skips first N bytes
(on x86 N=5, on ppc or arm N=4) of function entry.
After that, it enables all those events, disables them,
and removes them.
Since reclaiming of the unoptimization buffer is
delayed, it waits long enough after removing the
events.
Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 25 May 2017 10:38:17 +0000 (19:38 +0900)]
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
Fix kprobes to set (recover) RWX bits correctly on the trampoline
buffer before releasing it. Releasing a read-only page to
module_memfree() crashes the kernel.
Without this fix, if a kprobes user registers a bunch of kprobes
in a function body (since kprobes on function entry usually
use ftrace) and unregisters them, the kernel hits a BUG and crashes.
Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2f7 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Luis Henriques [Thu, 25 May 2017 15:20:38 +0000 (16:20 +0100)]
ftrace: Fix memory leak in ftrace_graph_release()
ftrace_hash is being kfree'ed in ftrace_graph_release(), however the
->buckets field is not. This results in a memory leak that is easily
captured by kmemleak:
unreferenced object 0xffff880038afe000 (size 8192):
comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff8113964d>] __kmalloc+0x12d/0x1a0
[<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80
[<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100
[<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0
[<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0
[<ffffffff81140df7>] vfs_open+0x47/0x60
[<ffffffff81150f95>] path_openat+0x2a5/0x1020
[<ffffffff81152d6a>] do_filp_open+0x8a/0xf0
[<ffffffff811411df>] do_sys_open+0x12f/0x200
[<ffffffff811412ce>] SyS_open+0x1e/0x20
[<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94
[<ffffffffffffffff>] 0xffffffffffffffff
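The class of leak reported here, where a container is freed but a separately allocated member is not, can be shown with a small user-space sketch. Everything in it is an assumption made for illustration: struct sketch_hash only loosely mirrors the header-plus-buckets shape of an ftrace hash, and live_allocs is a crude counter standing in for kmemleak.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in: a header that owns a separately allocated array. */
struct sketch_hash {
	size_t nbuckets;
	void **buckets;
};

/* Crude allocation counter, standing in for a leak detector. */
static int live_allocs;

static struct sketch_hash *sketch_hash_alloc(size_t nbuckets)
{
	struct sketch_hash *h;

	assert(nbuckets > 0);
	h = calloc(1, sizeof(*h));
	if (!h)
		return NULL;
	h->buckets = calloc(nbuckets, sizeof(*h->buckets));
	if (!h->buckets) {
		free(h);
		return NULL;
	}
	h->nbuckets = nbuckets;
	live_allocs += 2;
	return h;
}

static void sketch_hash_free(struct sketch_hash *h)
{
	if (!h)
		return;
	free(h->buckets);	/* omitting this line reproduces the leak */
	free(h);
	live_allocs -= 2;
}
```

Routing every release through one destructor, rather than freeing the header directly at each call site, is the usual way to keep member allocations from being forgotten.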
Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com
Cc: stable@vger.kernel.org
Fixes: b9b0c831bed2 ("ftrace: Convert graph filter to use hash tables")
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Linus Torvalds [Fri, 26 May 2017 23:45:13 +0000 (16:45 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input layer fixes from Dmitry Torokhov:
"Just a few fixups to a couple of drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - ignore signals when finishing updating firmware
Input: elan_i2c - clear INT before resetting controller
Input: atmel_mxt_ts - add T100 as a readable object
Input: edt-ft5x06 - increase allowed data range for threshold parameter
Linus Torvalds [Fri, 26 May 2017 21:02:30 +0000 (14:02 -0700)]
Merge tag 'led_fixes_for_4-12-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.
leds-pca955x driver uses only i2c_smbus API and thus it should pass
I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"
* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: pca955x: Correct I2C Functionality
Linus Torvalds [Fri, 26 May 2017 20:51:01 +0000 (13:51 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
Borkmann.
2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
Caratti.
3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
driver, from Jesper Brouer.
4) Defer slave->link updates until bonding is ready to do a full commit
to the new settings, from Nithin Sujir.
5) Properly reference count ipv4 FIB metrics to avoid use after free
situations, from Eric Dumazet and several others including Cong Wang
and Julian Anastasov.
6) Fix races in llc_ui_bind(), from Lin Zhang.
7) Fix regression of ESP UDP encapsulation for TCP packets, from
Steffen Klassert.
8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.
9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
Dawson.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add reference counting to metrics
net: ethernet: ax88796: don't call free_irq without request_irq first
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
sctp: fix ICMP processing if skb is non-linear
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
bonding: Don't update slave->link until ready to commit
test_bpf: Add a couple of tests for BPF_JSGE.
bpf: add various verifier test cases
bpf: fix wrong exposure of map_flags into fdinfo for lpm
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
bpf: properly reset caller saved regs after helper call and ld_abs/ind
bpf: fix incorrect pruning decision when alignment must be tracked
arp: fixed -Wuninitialized compiler warning
tcp: avoid fastopen API to be used on AF_UNSPEC
net: move somaxconn init from sysctl code
net: fix potential null pointer dereference
geneve: fix fill_info when using collect_metadata
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
vlan: Fix tcp checksum offloads in Q-in-Q vlans
...
Linus Torvalds [Fri, 26 May 2017 19:13:08 +0000 (12:13 -0700)]
Merge tag 'xfs-4.12-fixes-2' of git://git./fs/xfs/xfs-linux
Pull XFS fixes from Darrick Wong:
"A few miscellaneous bug fixes & cleanups:
- Fix indlen block reservation accounting bug when splitting delalloc
extent
- Fix warnings about unused variables that appeared in -rc1.
- Don't spew errors when bmapping a local format directory
- Fix an off-by-one error in a delalloc eof assertion
- Make fsmap only return inode information for CAP_SYS_ADMIN
- Fix a potential mount time deadlock recovering cow extents
- Fix unaligned memory access in _btree_visit_blocks
- Fix various SEEK_HOLE/SEEK_DATA bugs"
* tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
xfs: Fix missed holes in SEEK_HOLE implementation
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: avoid mount-time deadlock in CoW extent recovery
xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix warnings about unused stack variables
xfs: BMAPX shouldn't barf on inline-format directories
xfs: fix indlen accounting error on partial delalloc conversion
Eric Dumazet [Thu, 25 May 2017 21:27:35 +0000 (14:27 -0700)]
ipv4: add reference counting to metrics
Andrey Konovalov reported crashes in ipv4_mtu().
I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152:
1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &
2) At the same time run following loop :
while :
do
ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done
Cong Wang attempted to add back rt->fi in commit
82486aa6f1b9 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.
Instead, I suggested adding a refcount to the metrics themselves,
since they are a standalone object (in particular, no reference to other objects).
I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
needs to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.
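As a rough illustration of reference counting a standalone object, here is a minimal user-space sketch using C11 atomics. The struct and function names (sketch_metrics, metrics_get, metrics_put) are hypothetical and are not the kernel's data structures or helpers; the sketch only shows the get/put discipline in which the last put frees the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical standalone object carrying its own reference count. */
struct sketch_metrics {
	atomic_int refcnt;
	unsigned int mtu;
};

static struct sketch_metrics *metrics_alloc(unsigned int mtu)
{
	struct sketch_metrics *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcnt, 1);	/* the creator holds the first reference */
	m->mtu = mtu;
	return m;
}

static void metrics_get(struct sketch_metrics *m)
{
	atomic_fetch_add(&m->refcnt, 1);
}

/* Drops one reference; returns 1 if this call freed the object. */
static int metrics_put(struct sketch_metrics *m)
{
	int old = atomic_fetch_sub(&m->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(m);
		return 1;
	}
	return 0;
}
```

With this discipline, a route deletion that drops its reference cannot free metrics still in use by a reader that took its own reference first.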
Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.
Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Uwe Kleine-König [Thu, 25 May 2017 20:54:53 +0000 (22:54 +0200)]
net: ethernet: ax88796: don't call free_irq without request_irq first
The function ax_init_dev (which is called only from the driver's .probe
function) calls free_irq in the error path without having requested the
irq in the first place. So drop the free_irq call in the error path.
Fixes: 825a2ff1896e ("AX88796 network driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Dawson [Thu, 25 May 2017 20:35:18 +0000 (06:35 +1000)]
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
This fix addresses two problems in the way the DSCP field is formulated
on the encapsulating header of IPv6 tunnels.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661
1) The IPv6 tunneling code was manipulating the DSCP field of the
encapsulating packet using the 32b flowlabel. Since the flowlabel is
only the lower 20b it was incorrect to assume that the upper 12b
containing the DSCP and ECN fields would remain intact when formulating
the encapsulating header. This fix handles the 'inherit' and
'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.
2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
incorrect and resulted in the DSCP value always being set to 0.
Commit 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class
is non-0") caused the regression by masking out the flowlabel
which exposed the incorrect handling of the DSCP portion of the
flowlabel in ip6_tunnel and ip6_gre.
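The bit layout behind this bug can be worked through with a short sketch. It assumes the first 32-bit word of the IPv6 header in host byte order (4-bit version, 8-bit traffic class holding DSCP and ECN, 20-bit flow label); the macro and function names are hypothetical, and the kernel itself works on big-endian values.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FLOWLABEL_MASK 0x000FFFFFu	/* lower 20 bits */

/* Build the first header word: version 6, traffic class, flow label. */
static uint32_t sketch_flowinfo(uint8_t dsfield, uint32_t flowlabel)
{
	assert((flowlabel & ~SKETCH_FLOWLABEL_MASK) == 0);
	return (6u << 28) | ((uint32_t)dsfield << 20) | flowlabel;
}

/* Extract the 8-bit traffic class (DSCP in the top 6 bits, ECN in the low 2). */
static uint8_t sketch_dsfield(uint32_t word)
{
	return (uint8_t)((word >> 20) & 0xFF);
}
```

For example, a traffic class of 0xB8 (DSCP 46, ECN 0) with flow label 0x12345 gives the word 0x6B812345. Masking that word with the 20-bit flow label mask leaves 0x00012345, so any DSCP value carried in the upper bits is lost, which is why the DSCP has to be handled separately from the flow label.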
Fixes: 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class is non-0")
Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Thu, 25 May 2017 17:14:56 +0000 (19:14 +0200)]
sctp: fix ICMP processing if skb is non-linear
Sometimes ICMP replies to INIT chunks are ignored by the client, even if
the encapsulated SCTP headers match an open socket. This happens when the
ICMP packet is carried by a paged skb: use skb_header_pointer() to read
packet contents beyond the SCTP header, so that chunk header and initiate
tag are validated correctly.
v2:
- don't use skb_header_pointer() to read the transport header, since
icmp_socket_deliver() already puts these 8 bytes in the linear area.
- change commit message to make specific reference to INIT chunks.
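To show why reading past the linear area needs a helper, here is a simplified user-space sketch of the idea behind skb_header_pointer(): return a direct pointer when the requested range is contiguous, otherwise copy it into a caller-supplied buffer. The two-part struct sketch_buf and the function sketch_header_pointer() are hypothetical stand-ins, not the kernel's sk_buff or its API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet: a linear head followed by one paged fragment. */
struct sketch_buf {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/*
 * Return a pointer to len bytes at offset off. If the range lies entirely
 * in the linear head, point into it directly; otherwise gather the bytes
 * into scratch. Returns NULL if the range exceeds the packet.
 */
static const void *sketch_header_pointer(const struct sketch_buf *b,
					 size_t off, size_t len, void *scratch)
{
	size_t total = b->head_len + b->frag_len;
	size_t n = 0;

	if (off > total || len > total - off)
		return NULL;
	if (off + len <= b->head_len)
		return b->head + off;

	assert(scratch != NULL);
	if (off < b->head_len) {
		/* Copy the part that is still in the linear head. */
		n = b->head_len - off;
		memcpy(scratch, b->head + off, n);
		off = b->head_len;
	}
	memcpy((unsigned char *)scratch + n,
	       b->frag + (off - b->head_len), len - n);
	return scratch;
}
```

A parser that dereferences the head pointer directly would read the wrong bytes for the second case, which matches the symptom of headers failing validation only when the packet is paged.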
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
linzhang [Thu, 25 May 2017 06:07:18 +0000 (14:07 +0800)]
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind the same socket.
If more than one process/thread binds the same socket successfully, that
leads to two problems: the result is not what we expect, and the kernel
can become unstable or oops (in my simple test case, llc2.ko could not
be unloaded).
The current code tests the SOCK_ZAPPED bit to keep a process from
binding the same socket twice, but that cannot stop several
processes/threads from binding the same socket at the same time.
So, add lock_sock in llc_ui_bind, as other functions such as llc_ui_connect already do.
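The check-then-set race can be sketched in user space with a mutex standing in for lock_sock(). The struct sketch_sock, its zapped flag, and sketch_bind() are hypothetical names for illustration only; the point is that the test of the flag and the update it guards happen under one lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket: "zapped" stays true until the socket is bound. */
struct sketch_sock {
	pthread_mutex_t lock;
	bool zapped;
	int addr;
};

static void sketch_sock_init(struct sketch_sock *sk)
{
	pthread_mutex_init(&sk->lock, NULL);
	sk->zapped = true;
	sk->addr = 0;
}

/* Bind at most once; the flag test and the update share one critical section. */
static int sketch_bind(struct sketch_sock *sk, int addr)
{
	int rc = -EINVAL;

	assert(sk != NULL);
	pthread_mutex_lock(&sk->lock);
	if (sk->zapped) {
		sk->addr = addr;
		sk->zapped = false;
		rc = 0;
	}
	pthread_mutex_unlock(&sk->lock);
	return rc;
}
```

Without the lock, two threads could both observe zapped as true and both proceed to bind; with it, exactly one caller succeeds and the rest get an error.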
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 26 May 2017 18:05:22 +0000 (11:05 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes that should go into this series. This contains:
- A set of NVMe fixes, pulled from Christoph. This includes a set of
fixes for the fiber channel bits from James Smart, rdma queue depth
fix from Marta, controller removal fixes from Ming, and some more
APST quirk updates from Andy.
- A blk-mq debugfs fix from Bart, fixing a problem with the
untangling of the sysfs and debugfs blk-mq bits that was added in
this series.
- Error code fix in add_partition() from Dan.
- A small series of fixes for the new blk-throttle code from Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block: (21 commits)
blk-mq: Only register debugfs attributes for blk-mq queues
nvme: Quirk APST on Intel 600P/P3100 devices
nvme: only setup block integrity if supported by the driver
nvme: replace is_flags field in nvme_ctrl_ops with a flags field
nvme-pci: consistencly use ctrl->device for logging
partitions/msdos: FreeBSD UFS2 file systems are not recognized
block: fix an error code in add_partition()
blk-throttle: force user to configure all settings for io.low
blk-throttle: respect 0 bps/iops settings for io.low
blk-throttle: output some debug info in trace
blk-throttle: add hierarchy support for latency target and idle time
nvme_fc: remove extra controller reference taken on reconnect
nvme_fc: correct nvme status set on abort
nvme_fc: set logging level on resets/deletes
nvme_fc: revise comment on teardown
nvme_fc: Support ctrl_loss_tmo
nvme_fc: get rid of local reconnect_delay
blk-mq: remove blk_mq_abort_requeue_list()
nvme: avoid to use blk_mq_abort_requeue_list()
nvme: use blk_mq_start_hw_queues() in nvme_kill_queues()
...
Linus Torvalds [Fri, 26 May 2017 17:51:18 +0000 (10:51 -0700)]
Merge tag 'pci-v4.12-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix PCI_ENDPOINT build error (merged for v4.12)
- fix Switchtec driver (merged for v4.12)
- fix imx6 config read timeouts, fallout from changing to non-postable
reads
- add PM "needs_resume" flag for i915 suspend issue
* tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
Linus Torvalds [Fri, 26 May 2017 16:35:22 +0000 (09:35 -0700)]
Merge tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ
messenger patch from Zheng and Luis' fallocate fix"
* tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client:
ceph: check that the new inode size is within limits in ceph_fallocate()
libceph: cleanup old messages according to reconnect seq
libceph: NULL deref on crush_decode() error path
libceph: fix error handling in process_one_ticket()
libceph: validate blob_struct_v in process_one_ticket()
libceph: drop version variable from ceph_monmap_decode()
libceph: make ceph_msg_data_advance() return void
libceph: use kbasename() and kill ceph_file_part()
Linus Torvalds [Fri, 26 May 2017 16:05:35 +0000 (09:05 -0700)]
Merge tag 'mmc-v4.12-rc2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"This contains fixes to make the WiFi work again for the ARM64 Hikey
board.
Together with a couple of DTS updates for the Hikey board we have also
extended the mmc pwrseq_simple, to support a new power-off-delay-us DT
property, as that was required to enable a graceful power off sequence
for the WiFi chip"
* tag 'mmc-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
arm64: dts: hikey: Fix WiFi support
arm64: dts: hi6220: Move board data from the dwmmc nodes to hikey dts
arm64: dts: hikey: Add the SYS_5V and the VDD_3V3 regulators
arm64: dts: hi6220: Move the fixed_5v_hub regulator to the hikey dts
arm64: dts: hikey: Add clock for the pmic mfd
mfd: dts: hi655x: Add clock binding for the pmic
mmc: pwrseq_simple: Parse DTS for the power-off-delay-us property
mmc: dt: pwrseq-simple: Invent power-off-delay-us
Linus Torvalds [Fri, 26 May 2017 16:03:09 +0000 (09:03 -0700)]
Merge tag 'sound-4.12-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This contains a few HD-audio device-specific quirks and an endianness
fix for USB-audio, as well as the update of quirk model list document.
All fixes are small and trivial.
The document update could have been postponed, but it's a good thing
for users and has absolutely zero risk of breakage, so included here"
* tag 'sound-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
ALSA: hda - Update the list of quirk models
ALSA: hda - Provide dual-codecs model option for a few Realtek codecs
ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo
ALSA: hda - No loopback on ALC299 codec
ALSA: usb-audio: fix Amanero Combo384 quirk on big-endian hosts
Linus Torvalds [Fri, 26 May 2017 15:54:06 +0000 (08:54 -0700)]
Merge tag 'drm-fixes-for-v4.12-rc3' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Not a whole lot happening here, a set of amdgpu fixes and one core
deadlock fix, and some misc drivers fixes"
* tag 'drm-fixes-for-v4.12-rc3' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Christoph Hellwig [Sat, 20 May 2017 16:59:54 +0000 (18:59 +0200)]
PCI/msi: fix the pci_alloc_irq_vectors_affinity stub
We need to return an error for any call that asks for MSI / MSI-X
vectors only, so that non-trivial fallback logic can work properly.
Also validate dev->irq and use the "correct" errno value based on feedback
from Linus.
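The behaviour described for the stub can be sketched as follows. This is an illustration under assumptions, not the kernel's code: the flag constants, struct sketch_dev, and the choice of -ENOSPC as the error value are all hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical request flags; the kernel defines its own constants. */
#define SKETCH_IRQ_LEGACY (1u << 0)
#define SKETCH_IRQ_MSI    (1u << 1)
#define SKETCH_IRQ_MSIX   (1u << 2)

struct sketch_dev {
	unsigned int irq;	/* 0 means no legacy interrupt assigned */
};

/*
 * Stub for a build without MSI support: only a legacy interrupt can be
 * granted, and only if the caller accepts one and the device has one.
 * Requests limited to MSI/MSI-X fail so the caller's fallback can run.
 */
static int sketch_alloc_irq_vectors(const struct sketch_dev *dev,
				    unsigned int min_vecs,
				    unsigned int max_vecs,
				    unsigned int flags)
{
	assert(min_vecs <= max_vecs);
	if ((flags & SKETCH_IRQ_LEGACY) && min_vecs <= 1 && max_vecs >= 1 &&
	    dev->irq)
		return 1;
	return -ENOSPC;	/* assumed errno for this sketch */
}
```

A driver that first asks for MSI/MSI-X only and then retries with the legacy flag relies on the first call failing cleanly rather than reporting one vector.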
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: aff17164 ("PCI: Provide sensible IRQ vector alloc/free routines")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jens Axboe [Fri, 26 May 2017 15:11:19 +0000 (09:11 -0600)]
Merge branch 'nvme-4.12' of git://git.infradead.org/nvme into for-linus
Christoph writes:
"A couple of fixes for the next rc on the nvme front. Various FC fixes
from James, controller removal fixes from Ming (including a block layer
patch), an APST-related device quirk from Andy, an RDMA fix for small
queue depth devices from Marta, as well as fixes for the lack of
metadata support in non-PCIe drivers and the printk logging format from
me."
Bart Van Assche [Thu, 25 May 2017 23:38:06 +0000 (16:38 -0700)]
blk-mq: Only register debugfs attributes for blk-mq queues
The code in blk-mq-debugfs.c assumes that it is working on a blk-mq
queue and is not intended to work on a blk-sq queue. Hence only
register blk-mq debugfs attributes for blk-mq queues.
Fixes: commit 9c1051aacde8 ("blk-mq: untangle debugfs and sysfs")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Andy Lutomirski [Wed, 24 May 2017 22:06:31 +0000 (15:06 -0700)]
nvme: Quirk APST on Intel 600P/P3100 devices
They have known firmware bugs. A fix is apparently in the works --
once fixed firmware is available, someone from Intel (Hi, Keith!)
can adjust the quirk accordingly.
Cc: stable@vger.kernel.org # v4.11
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Sat, 20 May 2017 13:14:45 +0000 (15:14 +0200)]
nvme: only setup block integrity if supported by the driver
Currently only the PCIe driver supports metadata, so we should not claim
integrity support for the other drivers. This prevents nasty crashes
with targets that advertise metadata support on fabrics.
Also use the opportunity to factor out some code into a separate helper
that isn't even compiled if CONFIG_BLK_DEV_INTEGRITY is disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Christoph Hellwig [Sat, 20 May 2017 13:14:44 +0000 (15:14 +0200)]
nvme: replace is_flags field in nvme_ctrl_ops with a flags field
So that we can have more flags for transport-specific behavior.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Christoph Hellwig [Sat, 20 May 2017 13:14:43 +0000 (15:14 +0200)]
nvme-pci: consistencly use ctrl->device for logging
This is what most of the code already does and gives much more useful
prefixes than the device embedded in the pci_dev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Dave Airlie [Fri, 26 May 2017 01:51:55 +0000 (11:51 +1000)]
Merge branch 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A bunch of bug fixes:
- Fix display flickering on some chips at high refresh rates
- suspend/resume fix
- hotplug fix
- a couple of segfault fixes for certain cases
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
Dave Airlie [Fri, 26 May 2017 01:51:28 +0000 (11:51 +1000)]
Merge tag 'drm-misc-fixes-2017-05-25' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Core Changes:
- Don't drop vblank reference more than once in cases of ww retry (Daniel)
Driver Changes:
- radeon: Fix oops during radeon probe trying to reference wrong device (Lukas)
- qxl: Avoid sleeping while in atomic context on cursor update (Gabriel)
- gma500: Use VBT mode instead of pre-programmed mode for LVDS (Patrik)
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
* tag 'drm-misc-fixes-2017-05-25' of git://anongit.freedesktop.org/git/drm-misc:
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Nithin Sujir [Thu, 25 May 2017 02:45:17 +0000 (19:45 -0700)]
bonding: Don't update slave->link until ready to commit
In the loadbalance arp monitoring scheme, when a slave link change is
detected, the slave->link is immediately updated and slave_state_changed
is set. Later down the function, the rtnl_lock is acquired and the
changes are committed, updating the bond link state.
However, the acquisition of the rtnl_lock can fail. The next time the
monitor runs, since slave->link is already updated, it determines that
link is unchanged. This results in the bond link state permanently out
of sync with the slave link.
This patch modifies bond_loadbalance_arp_mon() to handle link changes
identically to bond_ab_arp_{inspect/commit}(). The new link state is
maintained in slave->new_link until we're ready to commit at which point
it's copied into slave->link.
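The propose-then-commit idea can be sketched in a few lines. The types and functions below (sketch_slave, sketch_inspect, sketch_commit) are hypothetical and only model the state handling; the boolean lock_acquired stands in for whether the rtnl lock was obtained.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_link { SKETCH_LINK_DOWN, SKETCH_LINK_UP, SKETCH_LINK_NOCHANGE };

/* Hypothetical slave: committed state plus a pending proposal. */
struct sketch_slave {
	enum sketch_link link;
	enum sketch_link new_link;
};

/* Record the observed state as a proposal; returns true if it differs. */
static bool sketch_inspect(struct sketch_slave *s, enum sketch_link observed)
{
	assert(observed != SKETCH_LINK_NOCHANGE);
	s->new_link = (observed != s->link) ? observed : SKETCH_LINK_NOCHANGE;
	return s->new_link != SKETCH_LINK_NOCHANGE;
}

/* Apply the proposal only once the lock is held; otherwise leave link alone. */
static bool sketch_commit(struct sketch_slave *s, bool lock_acquired)
{
	if (!lock_acquired)
		return false;	/* the next inspect still sees the change */
	if (s->new_link != SKETCH_LINK_NOCHANGE) {
		s->link = s->new_link;
		s->new_link = SKETCH_LINK_NOCHANGE;
	}
	return true;
}
```

Because the committed state is untouched when the lock cannot be taken, the next monitor pass detects the same change again instead of concluding that nothing changed.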
NOTE: miimon_{inspect/commit}() has a more complex state machine
requiring the use of the bond_{propose,commit}_link_state() functions
which maintain the intermediate state in slave->link_new_state. The arp
monitors don't require that.
Testing: This bug is very easy to reproduce with the following steps.
1. In a loop, toggle a slave link of a bond slave interface.
2. In a separate loop, do ifconfig up/down of an unrelated interface to
create contention for rtnl_lock.
Within a few iterations, the bond link goes out of sync with the slave
link.
Signed-off-by: Nithin Nayak Sujir <nsujir@tintri.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Daney [Wed, 24 May 2017 23:35:49 +0000 (16:35 -0700)]
test_bpf: Add a couple of tests for BPF_JSGE.
Some JITs can optimize comparisons with zero. Add a couple of
BPF_JSGE tests against immediate zero.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 May 2017 17:44:29 +0000 (13:44 -0400)]
Merge branch 'bpf-fixes'
Daniel Borkmann says:
====================
Various BPF fixes
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind, also a fix that adds bpf_clone_redirect to the
bpf_helper_changes_pkt_data helper and exposes correct map_flags
for lpm map into fdinfo. For details, please see individual
patches.
v1 -> v2:
- Reworked first patch so that env->strict_alignment is the
final indicator on whether we have to deal with strict
alignment rather than having CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
checks on various locations, so only checking env->strict_alignment
is sufficient after that. Thanks for spotting, Dave!
- Added patch 3 and 4.
- Rest as is.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 May 2017 23:05:09 +0000 (01:05 +0200)]
bpf: add various verifier test cases
This patch adds various verifier test cases:
1) A test case for the pruning issue when tracking alignment
is used.
2) Various PTR_TO_MAP_VALUE_OR_NULL tests to make sure pointer
arithmetic turns such register into UNKNOWN_VALUE type.
3) Test cases for the special treatment of LD_ABS/LD_IND to
make sure verifier doesn't break calling convention here.
The latter is needed since, e.g., the arm64 JIT uses r1 - r5 for
storing temporary data, so they really must be marked as
NOT_INIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 May 2017 23:05:08 +0000 (01:05 +0200)]
bpf: fix wrong exposure of map_flags into fdinfo for lpm
trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. The latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it always exposes 0 as flags.
Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure not to
forget to copy them over at a later point in time when we add actual
flags for them to use.
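A user-space sketch of the missing step follows. All names here
(struct map_sketch, trie_alloc_sketch(), NO_PREALLOC_SKETCH) are
hypothetical stand-ins for illustration and are not the kernel's
definitions.

```c
#include <stdint.h>
#include <stdlib.h>

#define NO_PREALLOC_SKETCH 1u /* stand-in for BPF_F_NO_PREALLOC */

/* Hypothetical map object; map_flags is what introspection reports. */
struct map_sketch {
    uint32_t map_flags;
};

/* Validate the requested flags and record them on the map, so that
 * later introspection reports what the caller actually asked for. */
struct map_sketch *trie_alloc_sketch(uint32_t attr_flags)
{
    struct map_sketch *m;

    if (!(attr_flags & NO_PREALLOC_SKETCH))
        return NULL; /* preallocation is not supported */

    m = calloc(1, sizeof(*m));
    if (m)
        m->map_flags = attr_flags; /* the copy that was missing */
    return m;
}
```

Without the final assignment the flag check still passes, but the map
would report 0 as its flags, which is the mismatch a loader trips over.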
Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 May 2017 23:05:07 +0000 (01:05 +0200)]
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
The bpf_clone_redirect() still needs to be listed in
bpf_helper_changes_pkt_data() since we call into
bpf_try_make_head_writable() from there, thus we need
to invalidate prior pkt regs as well.
Fixes: 36bbef52c7eb ("bpf: direct packet write and access for helpers for clsact progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 May 2017 23:05:06 +0000 (01:05 +0200)]
bpf: properly reset caller saved regs after helper call and ld_abs/ind
Currently, after performing helper calls, we clear all caller saved
registers, that is r0 - r5 and fill r0 depending on struct bpf_func_proto
specification. The way we reset these regs can affect pruning decisions
in later paths, since we only reset register's imm to 0 and type to
NOT_INIT. However, we leave out clearing of other variables such as id,
min_value, max_value, etc, which can later on lead to pruning mismatches
due to stale data.
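The effect of stale fields on a later state comparison can be sketched as
follows. This is a simplified model under stated assumptions: struct
reg_state only mirrors the fields named above, regs_equal() is a
hypothetical field-wise comparison, and zeroing is used as the reset value
for simplicity (the verifier's actual reset values may differ).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum reg_type { NOT_INIT = 0, UNKNOWN_VALUE, PTR_TO_MAP_VALUE_OR_NULL };

/* Simplified register state; field names follow the commit text. */
struct reg_state {
    enum reg_type type;
    int64_t imm;
    uint32_t id;
    int64_t min_value;
    uint64_t max_value;
};

/* Behaviour as described above: only type and imm are reset. */
void reset_partial(struct reg_state *r)
{
    r->type = NOT_INIT;
    r->imm = 0;
}

/* Full reset: no field survives the helper call. */
void reset_full(struct reg_state *r)
{
    memset(r, 0, sizeof(*r));
    r->type = NOT_INIT;
}

/* Field-wise comparison, as a pruning check might perform. */
bool regs_equal(const struct reg_state *a, const struct reg_state *b)
{
    return a->type == b->type && a->imm == b->imm && a->id == b->id &&
           a->min_value == b->min_value && a->max_value == b->max_value;
}
```

Two registers that are both NOT_INIT after a call compare equal only if
every field was cleared; with the partial reset, leftover id or bounds
make otherwise identical states look different.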
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 24 May 2017 23:05:05 +0000 (01:05 +0200)]
bpf: fix incorrect pruning decision when alignment must be tracked
Currently, when we enforce alignment tracking on direct packet access,
the verifier lets the following program pass despite doing a packet
write with unaligned access:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (bf) r0 = r2
4: (07) r0 += 14
5: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
6: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
7: (63) *(u32 *)(r0 -4) = r0
8: (b7) r0 = 0
9: (95) exit
from 6 to 8:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
8: (b7) r0 = 0
9: (95) exit
from 5 to 10:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2 R10=fp
10: (07) r0 += 1
11: (05) goto pc-6
6: safe <----- here, wrongly found safe
processed 15 insns
However, if we enforce a pruning mismatch by adding state into r8
which is then being mismatched in states_equal(), we find that for
the otherwise same program, the verifier detects a misaligned packet
access when actually walking that path:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (b7) r8 = 1
4: (bf) r0 = r2
5: (07) r0 += 14
6: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
9: (b7) r0 = 0
10: (95) exit
from 7 to 9:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
9: (b7) r0 = 0
10: (95) exit
from 6 to 11:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
11: (07) r0 += 1
12: (b7) r8 = 0
13: (05) goto pc-7 <----- mismatch due to r8
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=15,r=15) R1=ctx R2=pkt(id=0,off=0,r=15)
R3=pkt_end R7=inv,min_value=2
R8=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
misaligned packet access off 2+15+-4 size 4
The reason why we fail to see it in states_equal() is that the
third test in compare_ptrs_to_packet() ...
if (old->off <= cur->off &&
old->off >= old->range && cur->off >= cur->range)
return true;
... will let the above pass. The situation we run into is that
old->off <= cur->off (14 <= 15), meaning that prior walked paths
went with smaller offset, which was later used in the packet
access after successful packet range check and found to be safe
already.
For example: Given is R0=pkt(id=0,off=0,r=0). Adding offset 14
as in above program to it, results in R0=pkt(id=0,off=14,r=0)
before the packet range test. Now, testing this against R3=pkt_end
with 'if r0 > r3 goto out' will transform R0 into R0=pkt(id=0,off=14,r=14)
for the case when we're within bounds. A write into the packet
at offset *(u32 *)(r0 -4), that is, 2 + 14 -4, is valid and
aligned (2 is for NET_IP_ALIGN). After processing this with
all fall-through paths, we later on check paths from branches.
When the above skb->mark test is true, then we jump near the
end of the program, perform r0 += 1, and jump back to the
'if r0 > r3 goto out' test we've visited earlier already. This
time, R0 is of type R0=pkt(id=0,off=15,r=0), and we'll prune
that part because this time we'll have a larger safe packet
range, and we already found that with off=14 all further insn
were already safe, so it's safe as well with a larger off.
However, the problem is that the subsequent write into the packet
with 2 + 15 -4 is then unaligned, and not caught by the alignment
tracking. Note that min_align, aux_off, and aux_off_align were
all 0 in this example.
Since we cannot tell at this time what kind of packet access was
performed in the prior walk and what minimal requirements it has
(we might do so in the future, but that requires more complexity),
fix it to disable this pruning case for strict alignment for now,
and let the verifier check such paths instead. With that applied,
the test cases pass and reject the program due to misalignment.
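The arithmetic above can be checked with a small user-space sketch. The
third test is copied from the snippet quoted earlier; the strict_alignment
guard is only an assumption about the shape of the fix, and struct pkt_reg,
third_test_allows_prune() and access_aligned() are hypothetical names.

```c
#include <stdbool.h>

#define NET_IP_ALIGN_SKETCH 2 /* the "2" in the arithmetic above */

/* Only the two packet-pointer fields the test looks at. */
struct pkt_reg {
    int off;
    int range;
};

/* The third test from compare_ptrs_to_packet() as quoted above, with
 * a guard that refuses to prune when strict alignment is enforced. */
bool third_test_allows_prune(const struct pkt_reg *old,
                             const struct pkt_reg *cur,
                             bool strict_alignment)
{
    if (strict_alignment)
        return false; /* make the verifier walk the path instead */
    return old->off <= cur->off &&
           old->off >= old->range && cur->off >= cur->range;
}

/* Is an access of `size` bytes at reg_off + insn_off aligned? */
bool access_aligned(int reg_off, int insn_off, int size)
{
    return (NET_IP_ALIGN_SKETCH + reg_off + insn_off) % size == 0;
}
```

With old off=14 and cur off=15 (both r=0) the unguarded test permits
pruning, yet 2 + 14 - 4 = 12 is 4-byte aligned while 2 + 15 - 4 = 13 is
not, which is exactly the access the pruned path never examined.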
Fixes: d1174416747d ("bpf: Track alignment of register values in the verifier.")
Reference: http://patchwork.ozlabs.org/patch/761909/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ihar Hrachyshka [Wed, 24 May 2017 22:19:35 +0000 (15:19 -0700)]
arp: fixed -Wuninitialized compiler warning
Commit 7d472a59c0e5ec117220a05de6b370447fb6cb66 ("arp: always override
existing neigh entries with gratuitous ARP") introduced a compiler
warning:
net/ipv4/arp.c:880:35: warning: 'addr_type' may be used uninitialized in
this function [-Wmaybe-uninitialized]
While the code logic seems to be correct and doesn't allow the variable
to be used uninitialized, and the warning is not consistently
reproducible, it's still worth fixing it for other people not to waste
time looking at the warning in case it pops up in the build environment.
Yes, the compiler is probably at fault, but we will need to accommodate.
Fixes: 7d472a59c0e5 ("arp: always override existing neigh entries with gratuitous ARP")
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Wed, 24 May 2017 16:59:31 +0000 (09:59 -0700)]
tcp: avoid fastopen API to be used on AF_UNSPEC
The fastopen API should be used to perform fastopen operations on the TCP
socket. It does not make sense to use the fastopen API to perform a
disconnect by calling it with AF_UNSPEC. The fastopen data path is also
prone to race conditions and bugs when used with AF_UNSPEC.
One issue reported and analyzed by Vegard Nossum is as follows:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Thread A: Thread B:
------------------------------------------------------------------------
sendto()
- tcp_sendmsg()
- sk_stream_memory_free() = 0
- goto wait_for_sndbuf
- sk_stream_wait_memory()
- sk_wait_event() // sleep
| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC)
| - tcp_sendmsg()
| - tcp_sendmsg_fastopen()
| - __inet_stream_connect()
| - tcp_disconnect() //because of AF_UNSPEC
| - tcp_transmit_skb()// send RST
| - return 0; // no reconnect!
| - sk_stream_wait_connect()
| - sock_error()
| - xchg(&sk->sk_err, 0)
| - return -ECONNRESET
- ... // wake up, see sk->sk_err == 0
- skb_entail() on TCP_CLOSE socket
If the connection is reopened then we will send a brand new SYN packet
after thread A has already queued a buffer. At this point I think the
socket internal state (sequence numbers etc.) becomes messed up.
When the new connection is closed, the FIN-ACK is rejected because the
sequence number is outside the window. The other side tries to
retransmit, but __tcp_retransmit_skb() calls tcp_trim_head() on an empty
skb which corrupts the skb data length and hits a BUG() in
copy_and_csum_bits().
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Hence, this patch adds a check for AF_UNSPEC in the fastopen data path
and returns EOPNOTSUPP to the user if that case occurs.
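The kind of check described here can be sketched in user-space C.
fastopen_check_dest() is a hypothetical helper written for illustration;
where the real check sits in the kernel's fastopen path is not shown in
this message.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical pre-check for a fastopen send path: reject a
 * destination whose family is AF_UNSPEC instead of letting it reach
 * the connect logic, where it would be treated as a disconnect. */
int fastopen_check_dest(const struct sockaddr *dest, size_t len)
{
    if (dest == NULL || len < sizeof(dest->sa_family))
        return 0; /* no usable address: nothing to reject here */
    if (dest->sa_family == AF_UNSPEC)
        return -EOPNOTSUPP;
    return 0;
}
```

Rejecting the request up front means the disconnect side effect in the
race diagram above is never reached through the fastopen path.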
Fixes: cf60af03ca4e7 ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Kapl [Wed, 24 May 2017 08:22:22 +0000 (10:22 +0200)]
net: move somaxconn init from sysctl code
The default value for somaxconn is set in sysctl_core_net_init(), but this
function is not called when the kernel is configured without CONFIG_SYSCTL.
This results in the kernel not being able to accept TCP connections,
because the backlog has zero size. Usually, the user ends up with:
"TCP: request_sock_TCP: Possible SYN flooding on port 7. Dropping request. Check SNMP counters."
If SYN cookies are not enabled the connection is rejected.
Before ef547f2ac16 (tcp: remove max_qlen_log), the effects were less
severe, because the backlog was always at least eight slots long.
Signed-off-by: Roman Kapl <roman.kapl@sysgo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KT Liao [Tue, 23 May 2017 20:41:47 +0000 (13:41 -0700)]
Input: elan_i2c - ignore signals when finishing updating firmware
Use wait_for_completion_timeout() instead of
wait_for_completion_interruptible_timeout() to avoid stray signals ruining
firmware update. Our timeout is only 300 msec so we are fine simply letting
it expire in case device misbehaves.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
KT Liao [Thu, 25 May 2017 17:06:21 +0000 (10:06 -0700)]
Input: elan_i2c - clear INT before resetting controller
Some old touchpad FWs need to have interrupt cleared before issuing reset
command after updating firmware. We clear interrupt by attempting to read
full report from the controller, and discarding any data read.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Gustavo A. R. Silva [Tue, 23 May 2017 23:18:37 +0000 (18:18 -0500)]
net: fix potential null pointer dereference
Add null check to avoid a potential null pointer dereference.
Addresses-Coverity-ID: 1408831
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rex Zhu [Mon, 22 May 2017 05:11:41 +0000 (13:11 +0800)]
drm/amdgpu: fix null point error when rmmod amdgpu.
This bug happens when amdgpu fails to load.
[ 75.740951] BUG: unable to handle kernel paging request at 00000000000031c0
[ 75.748167] IP: [<ffffffffa064a0e0>] amdgpu_fbdev_restore_mode+0x20/0x60 [amdgpu]
[ 75.755774] PGD 0
[ 75.759185] Oops: 0000 [#1] SMP
[ 75.762408] Modules linked in: amdgpu(OE-) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) intel_rapl(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) snd_hwdep(E) snd_pcm(E) snd_seq_midi(E) coretemp(E) kvm_intel(E) snd_seq_midi_event(E) snd_rawmidi(E) kvm(E) snd_seq(E) joydev(E) snd_seq_device(E) snd_timer(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) mei_me(E) ghash_clmulni_intel(E) snd(E) aesni_intel(E) mei(E) soundcore(E) aes_x86_64(E) shpchp(E) serio_raw(E) lrw(E) acpi_pad(E) gf128mul(E) glue_helper(E) ablk_helper(E) mac_hid(E)
[ 75.835574] cryptd(E) parport_pc(E) ppdev(E) lp(E) nfsd(E) parport(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) autofs4(E) hid_generic(E) usbhid(E) mxm_wmi(E) psmouse(E) e1000e(E) ptp(E) pps_core(E) ahci(E) libahci(E) wmi(E) video(E) i2c_hid(E) hid(E)
[ 75.858489] CPU: 5 PID: 1603 Comm: rmmod Tainted: G OE 4.9.0-custom #2
[ 75.866183] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 0901 08/31/2015
[ 75.875050] task: ffff88045d1bbb80 task.stack: ffffc90002de4000
[ 75.881094] RIP: 0010:[<ffffffffa064a0e0>] [<ffffffffa064a0e0>] amdgpu_fbdev_restore_mode+0x20/0x60 [amdgpu]
[ 75.891238] RSP: 0018:ffffc90002de7d48 EFLAGS: 00010286
[ 75.896648] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[ 75.903933] RDX: 0000000000000000 RSI: ffff88045d1bbb80 RDI: 0000000000000286
[ 75.911183] RBP: ffffc90002de7d50 R08: 0000000000000502 R09: 0000000000000004
[ 75.918449] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880464bf0000
[ 75.925675] R13: ffffffffa0853000 R14: 0000000000000000 R15: 0000564e44f88210
[ 75.932980] FS: 00007f13d5400700(0000) GS:ffff880476540000(0000) knlGS:0000000000000000
[ 75.941238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 75.947088] CR2: 00000000000031c0 CR3: 000000045fd0b000 CR4: 00000000003406e0
[ 75.954332] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 75.961566] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 75.968834] Stack:
[ 75.970881] ffff880464bf0000 ffffc90002de7d60 ffffffffa0636592 ffffc90002de7d80
[ 75.978454] ffffffffa059015f ffff880464bf0000 ffff880464bf0000 ffffc90002de7da8
[ 75.986076] ffffffffa0595216 ffff880464bf0000 ffff880460f4d000 ffffffffa0853000
[ 75.993692] Call Trace:
[ 75.996177] [<ffffffffa0636592>] amdgpu_driver_lastclose_kms+0x12/0x20 [amdgpu]
[ 76.003700] [<ffffffffa059015f>] drm_lastclose+0x2f/0xd0 [drm]
[ 76.009777] [<ffffffffa0595216>] drm_dev_unregister+0x16/0xd0 [drm]
[ 76.016255] [<ffffffffa0595944>] drm_put_dev+0x34/0x70 [drm]
[ 76.022139] [<ffffffffa062f365>] amdgpu_pci_remove+0x15/0x20 [amdgpu]
[ 76.028800] [<ffffffff81416499>] pci_device_remove+0x39/0xc0
[ 76.034661] [<ffffffff81531caa>] __device_release_driver+0x9a/0x140
[ 76.041121] [<ffffffff81531e58>] driver_detach+0xb8/0xc0
[ 76.046575] [<ffffffff81530c95>] bus_remove_driver+0x55/0xd0
[ 76.052401] [<ffffffff815325fc>] driver_unregister+0x2c/0x50
[ 76.058244] [<ffffffff81416289>] pci_unregister_driver+0x29/0x90
[ 76.064466] [<ffffffffa0596c5e>] drm_pci_exit+0x9e/0xb0 [drm]
[ 76.070507] [<ffffffffa0796d71>] amdgpu_exit+0x1c/0x32 [amdgpu]
[ 76.076609] [<ffffffff81104810>] SyS_delete_module+0x1a0/0x200
[ 76.082627] [<ffffffff810e2b1a>] ? rcu_eqs_enter.isra.36+0x4a/0x50
[ 76.089001] [<ffffffff8100392e>] do_syscall_64+0x6e/0x180
[ 76.094583] [<ffffffff817e1d2f>] entry_SYSCALL64_slow_path+0x25/0x25
[ 76.101114] Code: 94 c0 c3 31 c0 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 31 c0 48 89 e5 53 48 89 fb 48 c7 c7 1d 21 84 a0 e8 ab 77 b3 e0 e8 fc 8b d7 e0 <48> 8b bb c0 31 00 00 48 85 ff 74 09 e8 ff eb fc ff 85 c0 75 03
[ 76.121432] RIP [<ffffffffa064a0e0>] amdgpu_fbdev_restore_mode+0x20/0x60 [amdgpu]
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Garver [Tue, 23 May 2017 22:37:27 +0000 (18:37 -0400)]
geneve: fix fill_info when using collect_metadata
Since 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") fill_info
does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is
because it uses ip_tunnel_info_af() with the device level info, which is
not valid for COLLECT_METADATA.
Fix by checking for the presence of the actual sockets.
Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kara [Thu, 18 May 2017 23:36:24 +0000 (16:36 -0700)]
xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
Currently several places in xfs_find_get_desired_pgoff() handle the case
of a missing page. Make them all handled in one place after the loop has
terminated.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Jan Kara [Thu, 18 May 2017 23:36:23 +0000 (16:36 -0700)]
xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
There is an off-by-one error in loop termination conditions in
xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.
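The off-by-one can be illustrated with page-index arithmetic. This is a
sketch under assumptions: the message does not show how the fix computes
the index, 4 KiB pages are assumed, and both function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assume 4 KiB pages */

/* Index of the page containing byte `endoff` itself. When endoff is
 * page aligned, that page lies entirely beyond [start, endoff). */
uint64_t index_of_endoff(uint64_t endoff)
{
    return endoff >> PAGE_SHIFT_SKETCH;
}

/* Index of the last page that overlaps [start, endoff). */
uint64_t last_index_in_range(uint64_t endoff)
{
    assert(endoff > 0); /* an empty range has no last page */
    return (endoff - 1) >> PAGE_SHIFT_SKETCH;
}
```

For endoff = 8192 the first form yields page 2 although the range ends
with page 1; for an unaligned endoff such as 8193 both forms agree.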
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Jan Kara [Thu, 18 May 2017 23:36:22 +0000 (16:36 -0700)]
xfs: Fix missed holes in SEEK_HOLE implementation
XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
can be seen by the following command:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
Whence Result
HOLE 139264
Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
implementation. The bug is in xfs_find_get_desired_pgoff() which does
not properly detect the case when pages are not contiguous.
Fix the problem by properly detecting when found page has larger offset
than expected.
CC: stable@vger.kernel.org
Fixes: d126d43f631f996daeee5006714fed914be32368
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Eryu Guan [Tue, 23 May 2017 15:30:46 +0000 (08:30 -0700)]
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs_find_get_desired_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size XFS on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/xfs/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is uncovered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k, because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
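The inclusive-range arithmetic can be sketched as below. The batch limit
PAGEVEC_SIZE_SKETCH and the helper names are assumptions made for
illustration; the exact expression used in the fix is not quoted in this
message.

```c
#define PAGEVEC_SIZE_SKETCH 14UL /* assumed per-lookup batch limit */

unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* Off by one: asks for zero pages when end == index. */
unsigned long want_buggy(unsigned long index, unsigned long end)
{
    return min_ul(end - index, PAGEVEC_SIZE_SKETCH);
}

/* [index, end] is inclusive, so at least one page is always wanted. */
unsigned long want_fixed(unsigned long index, unsigned long end)
{
    return min_ul(end - index, PAGEVEC_SIZE_SKETCH - 1) + 1;
}
```

With a single-page file (index == end) the first form requests no pages
at all, which matches the missed data in the xfs_io run above.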
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Eric Sandeen [Tue, 23 May 2017 02:54:10 +0000 (19:54 -0700)]
xfs: fix unaligned access in xfs_btree_visit_blocks
This structure copy was throwing unaligned access warnings on sparc64:
Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs]
xfs_btree_copy_ptrs does a memcpy, which avoids it.
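Why a memcpy avoids the warning can be shown with a generic user-space
sketch. load_u64_unaligned() is a hypothetical helper and is unrelated to
the XFS structures involved.

```c
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from a position that may not be 8-byte aligned.
 * Dereferencing a cast pointer here could trap or warn on CPUs with
 * strict alignment rules; memcpy carries no alignment requirement. */
uint64_t load_u64_unaligned(const unsigned char *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

Compilers typically turn such a fixed-size memcpy into plain loads where
the target allows it, so the safe form usually costs nothing extra.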
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Linus Torvalds [Thu, 25 May 2017 03:29:53 +0000 (20:29 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is quite a big update because it includes a rework of the lpfc
driver to separate the NVMe part from the FC part.
The reason for doing this is because two separate trees (the nvme and
scsi trees respectively) want to update the individual components and
this separation will prevent a really nasty cross tree entanglement by
the time we reach the next merge window.
The rest of the fixes are the usual minor sort with no significant
security implications"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (25 commits)
scsi: zero per-cmd private driver data for each MQ I/O
scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
scsi: lpfc: update version to 11.2.0.14
scsi: lpfc: Add MDS Diagnostic support.
scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
scsi: lpfc: Adding additional stats counters for nvme.
scsi: lpfc: Fix system crash when port is reset.
scsi: lpfc: Fix used-RPI accounting problem.
...
Dan Carpenter [Tue, 23 May 2017 05:13:45 +0000 (13:13 +0800)]
drm/amd/powerplay: fix a signedness bugs
Smatch complains about a signedness bug here:
vega10_hwmgr.c:4202 vega10_force_clock_level()
warn: always true condition '(i >= 0) => (0-u32max >= 0)'
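The pattern Smatch flags can be reproduced with a small countdown loop.
This is a generic sketch: highest_set_bit() is a hypothetical function,
and it is an assumption that the driver loop has a similar shape.

```c
#include <stdint.h>

/* Return the index of the highest set bit in mask, or -1 if none.
 * If `i` were declared uint32_t, `i >= 0` would always be true and
 * the countdown would wrap around instead of terminating; a signed
 * index lets the loop end once i drops below zero. */
int highest_set_bit(uint32_t mask)
{
    int i;

    for (i = 31; i >= 0; i--) {
        if (mask & (1u << i))
            return i;
    }
    return -1;
}
```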
Fixes: 7b52db39a4c2 ("drm/amd/powerplay: fix bug sclk/mclk level can't be set on vega10.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chunming Zhou [Thu, 11 May 2017 10:22:17 +0000 (18:22 +0800)]
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
[ 338.384770] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 338.384817] IP: [< (null)>] (null)
[ 338.385505] RIP: 0010:[<0000000000000000>] [< (null)>] (null)
[ 338.385950] Call Trace:
[ 338.385993] [<ffffffffa05d2313>] ? amdgpu_vm_flush+0x283/0x400 [amdgpu]
[ 338.386025] [<ffffffff811818d3>] ? printk+0x4d/0x4f
[ 338.386074] [<ffffffffa05d4906>] amdgpu_ib_schedule+0x4a6/0x4d0 [amdgpu]
[ 338.386140] [<ffffffffa0673e54>] amdgpu_job_run+0x64/0x180 [amdgpu]
[ 338.386203] [<ffffffffa0672e09>] amd_sched_main+0x2e9/0x4a0 [amdgpu]
[ 338.386232] [<ffffffff810bfce0>] ? prepare_to_wait_event+0x110/0x110
[ 338.386295] [<ffffffffa0672b20>] ? amd_sched_select_entity+0xe0/0xe0 [amdgpu]
[ 338.386327] [<ffffffff8109b423>] kthread+0xd3/0xf0
[ 338.386349] [<ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 338.386376] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 338.386401] Code: Bad RIP value.
[ 338.386420] RIP [< (null)>] (null)
[ 338.386443] RSP <ffffc90001bd7d40>
[ 338.386458] CR2: 0000000000000000
[ 338.398508] ---[ end trace 4c66fcdc74b9a0a2 ]---
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lyude [Thu, 11 May 2017 23:31:12 +0000 (19:31 -0400)]
drm/radeon: Unbreak HPD handling for r600+
We end up reading the interrupt register for HPD5, and then writing it
to HPD6 which on systems without anything using HPD5 results in
permanently disabling hotplug on one of the display outputs after the
first time we acknowledge a hotplug interrupt from the GPU.
This code is really bad. But for now, let's just fix this. I will
hopefully have a large patch series to refactor all of this soon.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 11 May 2017 17:57:41 +0000 (13:57 -0400)]
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
Even if the vblank period would allow it, it still seems to
be problematic on some cards.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868
Cc: stable@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 11 May 2017 17:46:12 +0000 (13:46 -0400)]
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
Check to make sure the vblank period is long enough to support
mclk switching.
v2: drop needless initial assignment (Nils)
bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868
Cc: stable@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 11 May 2017 17:14:14 +0000 (13:14 -0400)]
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
Even if the vblank period would allow it, it still seems to
be problematic on some cards.
v2: fix logic inversion (Nils)
bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868
Cc: stable@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 11 May 2017 17:10:02 +0000 (13:10 -0400)]
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
Even if the vblank period would allow it, it still seems to
be problematic on some cards.
v2: fix logic inversion (Nils)
bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868
Cc: stable@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David S. Miller [Wed, 24 May 2017 20:27:22 +0000 (16:27 -0400)]
Merge branch 'q-in-q-checksums'
Daniel Borkmann says:
====================
BPF pruning follow-up
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind. For details, please see individual patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 23 May 2017 17:38:43 +0000 (13:38 -0400)]
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
Since virtio does not provide its own ndo_features_check handler,
TSO, and now checksum offload, are disabled for stacked vlans.
Re-enable the support and let the host take care of it. This
restores/improves Guest-to-Guest performance over Q-in-Q vlans.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 23 May 2017 17:38:42 +0000 (13:38 -0400)]
be2net: Fix offload features for Q-in-Q packets
At least some of the be2net cards do not seem to be capable
of performing checksum offload computations on Q-in-Q packets.
In these cases, the received checksum on the remote is invalid
and TCP syn packets are dropped.
This patch adds a call to check disabled acceleration features
on Q-in-Q tagged traffic.
CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 23 May 2017 17:38:41 +0000 (13:38 -0400)]
vlan: Fix tcp checksum offloads in Q-in-Q vlans
It appears that TCP checksum offloading has been broken for
Q-in-Q vlans. The behavior was exacerbated by the series
commit afb0bc972b52 ("Merge branch 'stacked_vlan_tso'")
that enabled acceleration features on stacked vlans.
However, even without that series, it is possible to trigger
this issue. It just requires a lot more specialized configuration.
The root cause is the interaction between how
netdev_intersect_features() works, the features actually set on
the vlan devices and HW having the ability to run checksum with
longer headers.
The issue starts when netdev_intersect_features() replaces
NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM,
if the HW advertises IP|IPV6 specific checksums. This happens
for tagged and multi-tagged packets. However, HW that enables
IP|IPV6 checksum offloading doesn't guarantee that packets with
arbitrarily long headers can be checksummed.
This patch disables IP|IPV6 checksums on the packet for multi-tagged
packets.
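The masking described above can be sketched in user space. This is only an illustration: the flag values, the helper name qinq_features() and its tag-count parameter are made up here, and the real patch may clear or keep a different set of bits.

```c
#include <stdint.h>

/* Illustrative flag bits only. The real NETIF_F_* values live in
 * include/linux/netdev_features.h and differ from these. */
#define F_SG        (1u << 0)
#define F_IP_CSUM   (1u << 1)
#define F_HW_CSUM   (1u << 2)
#define F_IPV6_CSUM (1u << 3)

/* Hypothetical helper: features usable for a packet with vlan_tags tags. */
static uint32_t qinq_features(uint32_t features, int vlan_tags)
{
    if (vlan_tags > 1) {
        /* Protocol-specific checksum engines are not guaranteed to
         * cope with the longer header, so drop them. The generic
         * HW_CSUM capability is left untouched. */
        features &= ~(uint32_t)(F_IP_CSUM | F_IPV6_CSUM);
    }
    return features;
}
```

With this model a single-tagged packet keeps whatever the device advertises, while a double-tagged packet only keeps checksum offload if the device offers the generic variant.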
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
CC: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian König [Wed, 10 May 2017 18:06:58 +0000 (20:06 +0200)]
drm/amdgpu: fix fundamental suspend/resume issue
Reinitializing the VM manager during suspend/resume is a very very bad
idea since all the VMs are still active and kicking.
This can lead to random VM faults after resume when new processes
get the same client ID assigned.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Andrew Lunn [Tue, 23 May 2017 15:49:13 +0000 (17:49 +0200)]
net: phy: marvell: Limit errata to 88m1101
The 88m1101 has an errata when configuring autoneg. However, it was
being applied to many other Marvell PHYs as well. Limit its scope to
just the 88m1101.
Fixes: 76884679c644 ("phylib: Add support for Marvell 88e1111S and 88e1145")
Reported-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Harini Katakam <harinik@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Tue, 23 May 2017 15:19:49 +0000 (08:19 -0700)]
net/phy: fix mdio-octeon dependency and build
Fix build errors by making this driver depend on OF_MDIO, like
several other similar drivers do.
drivers/built-in.o: In function `octeon_mdiobus_remove':
mdio-octeon.c:(.text+0x196ee0): undefined reference to `mdiobus_unregister'
mdio-octeon.c:(.text+0x196ee8): undefined reference to `mdiobus_free'
drivers/built-in.o: In function `octeon_mdiobus_probe':
mdio-octeon.c:(.text+0x196f1d): undefined reference to `devm_mdiobus_alloc_size'
mdio-octeon.c:(.text+0x196ffe): undefined reference to `of_mdiobus_register'
mdio-octeon.c:(.text+0x197010): undefined reference to `mdiobus_free'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 May 2017 19:43:57 +0000 (15:43 -0400)]
Merge tag 'mlx5-fixes-2017-05-23' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-fixes-2017-05-23
Some TC offloads fixes from Or Gerlitz.
From Erez, mlx5 IPoIB RX fix to improve GRO.
From Mohamad, Command interface fix to improve mitigation against FW
commands timeouts.
From Tariq, Driver load Tolerance against affinity settings failures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 May 2017 19:31:39 +0000 (15:31 -0400)]
Merge tag 'mac80211-for-davem-2017-05-23' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just two fixes this time:
* fix the scheduled scan "BUG: scheduling while atomic"
* check mesh address extension flags more strictly
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Potapenko [Tue, 23 May 2017 11:20:28 +0000 (13:20 +0200)]
net: rtnetlink: bail out from rtnl_fdb_dump() on parse error
rtnl_fdb_dump() failed to check the result of nlmsg_parse(), which led
to contents of |ifm| being uninitialized because nlh->nlmsg_len was too
small to accommodate |ifm|. The uninitialized data may affect some
branches and result in unwanted effects, although kernel data doesn't
seem to leak to the userspace directly.
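The kind of length check whose result must not be ignored can be sketched in user space as follows. The two structs and parse_payload() are hypothetical stand-ins, not the kernel's nlmsg_parse() or its data types; the point is only that the payload is read after the declared length has been verified.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the netlink header and the payload
 * that follows it (16 bytes each). */
struct msg_hdr {
    uint32_t len;
    uint16_t type;
    uint16_t flags;
    uint32_t seq;
    uint32_t pid;
};

struct if_payload {
    uint8_t  family;
    uint8_t  pad;
    uint16_t type;
    int32_t  index;
    uint32_t flags;
    uint32_t change;
};

/* Copy the payload out only if the declared length covers it.
 * Returns 0 on success, -1 when the message is too short. */
static int parse_payload(const unsigned char *buf, size_t buflen,
                         struct if_payload *out)
{
    struct msg_hdr hdr;

    if (buflen < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));
    if (hdr.len > buflen || hdr.len < sizeof(hdr) + sizeof(*out))
        return -1; /* bail out instead of using uninitialized data */
    memcpy(out, buf + sizeof(hdr), sizeof(*out));
    return 0;
}
```

A caller that ignores the -1 return would go on to use an `out` structure that was never filled in, which is the class of bug the report below describes.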
The bug has been detected with KMSAN and syzkaller.
For the record, here is the KMSAN report:
==================================================================
BUG: KMSAN: use of unitialized memory in rtnl_fdb_dump+0x5dc/0x1000
CPU: 0 PID: 1039 Comm: probe Not tainted 4.11.0-rc5+ #2727
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
__kmsan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:491
rtnl_fdb_dump+0x5dc/0x1000 net/core/rtnetlink.c:3230
netlink_dump+0x84f/0x1190 net/netlink/af_netlink.c:2168
__netlink_dump_start+0xc97/0xe50 net/netlink/af_netlink.c:2258
netlink_dump_start ./include/linux/netlink.h:165
rtnetlink_rcv_msg+0xae9/0xb40 net/core/rtnetlink.c:4094
netlink_rcv_skb+0x339/0x5a0 net/netlink/af_netlink.c:2339
rtnetlink_rcv+0x83/0xa0 net/core/rtnetlink.c:4110
netlink_unicast_kernel net/netlink/af_netlink.c:1272
netlink_unicast+0x13b7/0x1480 net/netlink/af_netlink.c:1298
netlink_sendmsg+0x10b8/0x10f0 net/netlink/af_netlink.c:1844
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997
__sys_sendmsg net/socket.c:2031
SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042
SyS_sendmsg+0x87/0xb0 net/socket.c:2038
do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285
entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
RIP: 0033:0x401300
RSP: 002b:00007ffc3b0e6d58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401300
RDX: 0000000000000000 RSI: 00007ffc3b0e6d80 RDI: 0000000000000003
RBP: 00007ffc3b0e6e00 R08: 000000000000000b R09: 0000000000000004
R10: 000000000000000d R11: 0000000000000246 R12: 0000000000000000
R13: 00000000004065a0 R14: 0000000000406630 R15: 0000000000000000
origin: 000000008fe00056
save_stack_trace+0x59/0x60 arch/x86/kernel/stacktrace.c:59
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:352
kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:247
kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:260
slab_alloc_node mm/slub.c:2743
__kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4349
__kmalloc_reserve net/core/skbuff.c:138
__alloc_skb+0x2cd/0x740 net/core/skbuff.c:231
alloc_skb ./include/linux/skbuff.h:933
netlink_alloc_large_skb net/netlink/af_netlink.c:1144
netlink_sendmsg+0x934/0x10f0 net/netlink/af_netlink.c:1819
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997
__sys_sendmsg net/socket.c:2031
SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042
SyS_sendmsg+0x87/0xb0 net/socket.c:2038
do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285
return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
==================================================================
and the reproducer:
==================================================================
#include <string.h>
#include <sys/socket.h>
#include <net/if_arp.h>
#include <linux/netlink.h>
#include <stdint.h>
int main()
{
    int sock = socket(PF_NETLINK, SOCK_DGRAM | SOCK_NONBLOCK, 0);
    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    char nlmsg_buf[32];
    memset(nlmsg_buf, 0, sizeof(nlmsg_buf));
    struct nlmsghdr *nlmsg = (struct nlmsghdr *)nlmsg_buf;
    nlmsg->nlmsg_len = 0x11; // 17 bytes: 16-byte header plus 1 payload byte
    nlmsg->nlmsg_type = 0x1e; // RTM_GETNEIGH = RTM_BASE + 0x0e
    // type = 0x0e = 1110b
    // kind = 2
    nlmsg->nlmsg_flags = 0x101; // NLM_F_ROOT | NLM_F_REQUEST
    nlmsg->nlmsg_seq = 0;
    nlmsg->nlmsg_pid = 0;
    nlmsg_buf[16] = (char)7; // family byte: AF_BRIDGE
    struct iovec iov;
    iov.iov_base = nlmsg_buf;
    iov.iov_len = 17;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    sendmsg(sock, &msg, 0);
    return 0;
}
==================================================================
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Tue, 23 May 2017 09:48:08 +0000 (11:48 +0200)]
net: fec: add post PHY reset delay DT property
Some PHYs require a short wait after the reset GPIO has been
toggled. This adds support for the DT property `phy-reset-post-delay`
which gives the delay in milliseconds to wait after reset.
If the DT property is not given, no delay is observed. Post-reset delays
greater than 1000ms are invalid.
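The rules above (absent property means no delay, values above 1000ms are rejected) can be sketched as a small validation helper. The name resolve_post_delay() and its `present` parameter are assumptions made for illustration; the driver itself reads the value from the device tree.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_POST_DELAY_MS 1000u /* upper bound stated in the commit message */

/* Hypothetical helper. `present` says whether the DT property exists,
 * `value_ms` is its value. On success *delay_ms holds the delay to use. */
static int resolve_post_delay(int present, uint32_t value_ms,
                              uint32_t *delay_ms)
{
    if (!present) {
        *delay_ms = 0; /* property missing: no delay */
        return 0;
    }
    if (value_ms > MAX_POST_DELAY_MS)
        return -EINVAL; /* out-of-range delays are invalid */
    *delay_ms = value_ms;
    return 0;
}
```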
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 May 2017 19:21:05 +0000 (15:21 -0400)]
Merge branch 'sctp-dupcookie-fixes'
Xin Long says:
====================
sctp: a bunch of fixes for processing dupcookie
After introducing transport hashtable and per stream info into sctp,
some regressions were caused when processing dupcookie. This patchset
fixes them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 23 May 2017 05:28:55 +0000 (13:28 +0800)]
sctp: set new_asoc temp when processing dupcookie
After sctp changed to use transport hashtable, a transport would be
added into global hashtable when adding the peer to an asoc, then
the asoc can be got by searching the transport in the hashtable.
The problem is when processing dupcookie in sctp_sf_do_5_2_4_dupcook,
a new asoc would be created. A peer with the same addr and port as
the one in the old asoc might be added into the new asoc, but fail
to be added into the hashtable, as they also belong to the same sk.
As a result, sctp's dupcookie processing cannot really work.
Since the new asoc will be freed after copying its information to
the old asoc, it's more like a temp asoc. So this patch fixes
it by setting it as a temp asoc to avoid adding any of its transports
into the hashtable and also to avoid allocating an assoc_id.
An extra thing it has to do is to also alloc stream info for any
temp asoc, as sctp dupcookie processing needs it to update the old asoc.
But I don't think it would hurt anything, as a temp asoc would
always be freed after finishing processing the cookie echo packet.
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 23 May 2017 05:28:54 +0000 (13:28 +0800)]
sctp: fix stream update when processing dupcookie
Since commit 3dbcc105d556 ("sctp: alloc stream info when initializing
asoc"), stream and stream.out info are always alloced when creating
an asoc.
So it's not correct to check !asoc->stream before updating stream
info when processing dupcookie, but would be better to check asoc
state instead.
Fixes: 3dbcc105d556 ("sctp: alloc stream info when initializing asoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Luis Henriques [Fri, 5 May 2017 17:28:44 +0000 (18:28 +0100)]
ceph: check that the new inode size is within limits in ceph_fallocate()
Currently the ceph client doesn't respect the rlimit in fallocate. This
means that a user can allocate a file with size > RLIMIT_FSIZE. This
patch adds the call to inode_newsize_ok() to verify filesystem limits and
ulimits. This should make ceph successfully run xfstest generic/228.
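A simplified user-space model of such a size check is sketched below. It is not the kernel's inode_newsize_ok(); the function name, the NO_LIMIT constant and the parameters are assumptions chosen for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define NO_LIMIT UINT64_MAX /* stand-in for an unlimited RLIMIT_FSIZE */

/* Hypothetical model: may a file of cur_size bytes grow to new_size,
 * given the per-process file size limit and the filesystem maximum? */
static int newsize_ok(uint64_t cur_size, uint64_t new_size,
                      uint64_t fsize_limit, uint64_t fs_max)
{
    if (new_size <= cur_size)
        return 0; /* not growing: nothing to check in this sketch */
    if (fsize_limit != NO_LIMIT && new_size > fsize_limit)
        return -EFBIG; /* would exceed the user's limit */
    if (new_size > fs_max)
        return -EFBIG; /* would exceed what the filesystem supports */
    return 0;
}
```

For an fallocate-style request the candidate size would presumably be offset plus length, checked before any allocation takes place.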
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Yan, Zheng [Fri, 5 May 2017 10:47:37 +0000 (18:47 +0800)]
libceph: cleanup old messages according to reconnect seq
When reopening a connection, use 'reconnect seq' to clean up
messages that have already been received by the peer.
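The idea can be modelled on a plain array of sequence numbers. This is a sketch under the assumption that every queued message carries a monotonically increasing sequence number; the helper name discard_acked() is hypothetical and the real messenger works on lists of message structures rather than an array.

```c
#include <stddef.h>
#include <stdint.h>

/* Drop every queued message whose sequence number the peer has already
 * seen (seq <= acked_seq), keeping the rest in order.
 * Returns the number of messages left in the queue. */
static size_t discard_acked(uint64_t *seqs, size_t n, uint64_t acked_seq)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (seqs[i] > acked_seq)
            seqs[kept++] = seqs[i];
    }
    return kept;
}
```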
Link: http://tracker.ceph.com/issues/18690
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Linus Torvalds [Wed, 24 May 2017 15:28:59 +0000 (08:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull ptrace fix from Eric Biederman:
"This fixes a brown paper bag bug. When I fixed the ptrace interaction
with user namespaces I added a new field ptracer_cred in struct
task_struct and I failed to properly initialize it on fork.
This dangling pointer wound up breaking setuid applications run
from the enlightenment window manager.
As this is the worst sort of bug, a regression breaking user space for
no good reason, let's get this fixed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: Properly initialize ptracer_cred on fork
Linus Torvalds [Wed, 24 May 2017 15:21:56 +0000 (08:21 -0700)]
Merge tag 'mmc-v4.12-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"A couple of MMC host fixes intended for v4.12 rc3:
- sdhci-xenon: Don't free data for phy allocated by devm*
- sdhci-iproc: Suppress spurious interrupts
- cavium: Fix probing race with regulator
- cavium: Prevent crash with incomplete DT
- cavium-octeon: Use proper GPIO name for power control
- cavium-octeon: Fix interrupt enable code"
* tag 'mmc-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
mmc: cavium: Fix probing race with regulator
of/platform: Make of_platform_device_destroy globally visible
mmc: cavium: Prevent crash with incomplete DT
mmc: cavium-octeon: Use proper GPIO name for power control
mmc: cavium-octeon: Fix interrupt enable code
mmc: sdhci-xenon: kill xenon_clean_phy()
Ingo Molnar [Wed, 24 May 2017 06:57:21 +0000 (08:57 +0200)]
tools/include: Sync kernel ABI headers with tooling headers
Sync (copy) the following v4.12 kernel headers to the tooling headers:
arch/x86/include/asm/disabled-features.h:
arch/x86/include/uapi/asm/kvm.h:
arch/powerpc/include/uapi/asm/kvm.h:
arch/s390/include/uapi/asm/kvm.h:
arch/arm/include/uapi/asm/kvm.h:
arch/arm64/include/uapi/asm/kvm.h:
- 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
fortunately none of the (in-kernel) tooling relied on it
- new KVM_DEV calls added
arch/x86/include/asm/required-features.h:
- 5-level paging hardware ABI detail added
arch/x86/include/asm/cpufeatures.h:
- new CPU feature added
arch/x86/include/uapi/asm/vmx.h:
- new VMX exit conditions
None of the changes requires fixes in the tooling source code.
This addresses the following warnings:
Warning: include/uapi/linux/stat.h differs from kernel
Warning: arch/x86/include/asm/disabled-features.h differs from kernel
Warning: arch/x86/include/asm/required-features.h differs from kernel
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Wed, 24 May 2017 06:21:29 +0000 (15:21 +0900)]
perf tools: Put caller above callee in --children mode
The __hpp__sort_acc() function sorts entries using callchain depth in order to
put callers above in children mode. But it assumed the callchain order
was callee-first. Now the default (for children) is caller-first, so the
order of entries is reversed.
For example, consider following case:
$ perf report --no-children
..l
# Overhead Command Shared Object Symbol
# ........ ....... ................... ..........................
#
99.44% a.out a.out [.] main
|
---main
__libc_start_main
_start
Then children mode should show '_start' above '__libc_start_main' since
it's the caller (parent) of the __libc_start_main. But it's reversed:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.61% 0.00% a.out a.out [.] _start
99.54% 99.44% a.out a.out [.] main
This patch fixes it.
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out a.out [.] _start
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.54% 99.44% a.out a.out [.] main
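The ordering rule can be sketched as a qsort() comparator whose sign depends on the configured callchain order. Everything here is illustrative: the struct, the g_order variable and the depth values are assumptions, not perf's actual data structures.

```c
#include <stdlib.h>

/* Hypothetical mirror of the callchain order setting. */
enum chain_order { ORDER_CALLEE, ORDER_CALLER };

struct entry {
    const char *sym;
    int depth; /* callchain depth recorded for the entry (made-up values) */
};

static enum chain_order g_order = ORDER_CALLER;

/* Tie-breaker for entries with equal overhead. */
static int cmp_by_depth(const void *pa, const void *pb)
{
    const struct entry *a = pa;
    const struct entry *b = pb;
    int ret = b->depth - a->depth; /* callee-first: deeper chains first */

    if (g_order == ORDER_CALLER)
        ret = -ret; /* caller-first: flip so that callers stay on top */
    return ret;
}
```

Without the sign flip, switching the default order silently inverts the output, which matches the symptom shown in the tables above.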
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Wed, 24 May 2017 06:21:28 +0000 (15:21 +0900)]
perf report: Do not drop last inlined frame
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Wed, 24 May 2017 06:21:27 +0000 (15:21 +0900)]
perf report: Always honor callchain order for inlined nodes
So far, the inlined nodes were only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.
Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
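A minimal sketch of order-aware insertion follows, assuming the lookup yields inlined frames innermost first. The list type and inline_list_add() are hypothetical simplifications, not perf's inline_list__append().

```c
#include <stddef.h>

/* Hypothetical mirror of the callchain order setting. */
enum chain_order { ORDER_CALLEE, ORDER_CALLER };

struct inline_node {
    const char *name;
    struct inline_node *next;
};

struct inline_list {
    struct inline_node *head;
    struct inline_node *tail;
};

/* Insert a node so that the list is already in display order. */
static void inline_list_add(struct inline_list *l, struct inline_node *n,
                            enum chain_order order)
{
    n->next = NULL;
    if (!l->head) {
        l->head = l->tail = n;
        return;
    }
    if (order == ORDER_CALLEE) {
        /* keep arrival order: innermost frame stays first */
        l->tail->next = n;
        l->tail = n;
    } else {
        /* caller-first: the newest (outer) frame goes to the front */
        n->next = l->head;
        l->head = n;
    }
}
```

Because every node is placed correctly on insertion, no caller needs a separate reverse pass, regardless of which backend produced the frames.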
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Namhyung Kim [Wed, 24 May 2017 06:21:26 +0000 (15:21 +0900)]
perf script: Add --inline option for debugging
The --inline option is to show inlined functions in callchains.
For example:
$ perf script
a.out 5644 11611.467597: 309961 cycles:u:
790 main (/home/namhyung/tmp/perf/a.out)
20511 __libc_start_main (/usr/lib/libc-2.25.so)
8ba _start (/home/namhyung/tmp/perf/a.out)
...
$ perf script --inline
a.out 5644 11611.467597: 309961 cycles:u:
790 main (/home/namhyung/tmp/perf/a.out)
std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
main
20511 __libc_start_main (/usr/lib/libc-2.25.so)
8ba _start (/home/namhyung/tmp/perf/a.out)
...
Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Wed, 24 May 2017 06:21:25 +0000 (15:21 +0900)]
perf report: Fix off-by-one for non-activation frames
As the documentation for dwfl_frame_pc says, frames that
are not activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));
    return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
    volatile long cnt = 0;
    for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
    bar();
}
void sig_handler(int sig)
{
    foo();
}
int main(void)
{
    signal(SIGUSR1, sig_handler);
    raise(SIGUSR1);
    foo();
    return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
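The adjustment itself is tiny and can be sketched independently of either unwinder. The helper name lookup_ip() is hypothetical; the `is_activation` flag stands for what dwfl_frame_pc() reports, or for "first frame or signal frame" in the libunwind case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address to use for symbol and source line lookup of one frame.
 * For a normal caller frame the saved pc is a return address, i.e. it
 * points just past the call instruction, so step back by one to land
 * inside the call. Activation frames hold the exact pc already. */
static uint64_t lookup_ip(uint64_t pc, bool is_activation)
{
    return is_activation ? pc : pc - 1;
}
```

Subtracting one byte does not yield a valid instruction address, but it is enough to fall inside the address range of the calling line, which is all the source line lookup needs.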
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Wed, 24 May 2017 06:21:24 +0000 (15:21 +0900)]
perf report: Fix memory leak in addr2line when called by addr2inlines
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
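The ownership pattern can be sketched like this: a lookup only allocates a copy of the file name when the caller supplied somewhere to store it. The names lookup_srcline() and dup_string() are hypothetical and the lookup result is passed in directly to keep the example self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Stands in for strdup(): returns a heap copy or NULL. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Report file and line, but only for the outputs the caller asked for.
 * Passing file == NULL means "not interested", so nothing is allocated
 * and nothing can leak. Returns 0 on success, -1 on allocation failure. */
static int lookup_srcline(const char *found_file, unsigned int found_line,
                          char **file, unsigned int *line)
{
    if (file) {
        *file = dup_string(found_file); /* caller must free() this */
        if (!*file)
            return -1;
    }
    if (line)
        *line = found_line;
    return 0;
}
```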
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Wed, 24 May 2017 06:21:23 +0000 (15:21 +0900)]
perf report: Don't crash on invalid maps in `-g srcline` mode
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Patrik Jakobsson [Tue, 18 Apr 2017 11:43:32 +0000 (13:43 +0200)]
drm/gma500/psb: Actually use VBT mode when it is found
With LVDS we were incorrectly picking the pre-programmed mode instead of
the preferred mode provided by VBT. Make sure we pick the VBT mode if
one is provided. It is likely that the mode read-out code is still wrong
but this patch fixes the immediate problem on most machines.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562
Cc: <stable@vger.kernel.org>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.com
Imre Deak [Tue, 23 May 2017 19:18:17 +0000 (14:18 -0500)]
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence. Add a flag that, when set, resumes the device before
calling the driver's system suspend handlers, which effectively disables
the optimization.
Needed by a future patch fixing suspend/resume on i915.
Suggested by Rafael.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
Dan Carpenter [Tue, 23 May 2017 14:25:10 +0000 (17:25 +0300)]
libceph: NULL deref on crush_decode() error path
If there is not enough space then ceph_decode_32_safe() does a goto bad.
We need to return an error code in that situation. The current code
returns ERR_PTR(0) which is NULL. The callers are not expecting that
and it results in a NULL dereference.
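To illustrate why ERR_PTR(0) is a problem, here is a minimal userspace
sketch. The helpers below are simplified stand-ins for the kernel's err.h
macros, and decode_sketch() is a hypothetical function, not the actual
crush_decode() code. It only shows that an error path returning ERR_PTR(0)
yields a NULL pointer that passes an IS_ERR() check.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical decoder. With 'buggy' set, err is still 0 when the
 * bounds check jumps to the error label, so the caller gets NULL.
 */
static void *decode_sketch(size_t avail, size_t need, int buggy)
{
	static int decoded_map;	/* stands in for the decoded object */
	int err = buggy ? 0 : -EINVAL;

	if (avail < need)
		goto bad;
	return &decoded_map;
bad:
	return ERR_PTR(err);
}
```

A caller that only checks IS_ERR() treats the NULL from the buggy variant
as success and dereferences it; with a real error code the check catches it.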
Fixes: f24e9980eb86 ("ceph: OSD client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov [Fri, 19 May 2017 12:24:36 +0000 (14:24 +0200)]
libceph: fix error handling in process_one_ticket()
Don't leak key internals after new_session_key is populated.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Ilya Dryomov [Fri, 19 May 2017 10:21:56 +0000 (12:21 +0200)]
libceph: validate blob_struct_v in process_one_ticket()
None of these are validated in userspace, but since we do validate
reply_struct_v in ceph_x_proc_ticket_reply(), tkt_struct_v (first) and
CephXServiceTicket struct_v (second) in process_one_ticket(), validate
CephXTicketBlob struct_v as well.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Ilya Dryomov [Fri, 19 May 2017 09:59:22 +0000 (11:59 +0200)]
libceph: drop version variable from ceph_monmap_decode()
It's set but not used: CEPH_FEATURE_MONNAMES feature bit isn't
advertised, which guarantees a v1 MonMap.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Ilya Dryomov [Fri, 19 May 2017 09:38:17 +0000 (11:38 +0200)]
libceph: make ceph_msg_data_advance() return void
Both callers ignore the returned bool.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Ilya Dryomov [Fri, 19 May 2017 09:33:16 +0000 (11:33 +0200)]
libceph: use kbasename() and kill ceph_file_part()
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Linus Torvalds [Tue, 23 May 2017 16:57:39 +0000 (09:57 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Fix the i2c-designware regression of rc2.
Also, a DMA buffer fix for the tiny-usb driver where the USB core now
loudly complains about the non DMA-capable buffer"
[ I had cherry-picked the designware fix separately because it hit my
laptop, but here is the proper sync with the i2c tree - Linus ]
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix bogus sda_hold_time due to uninitialized vars
i2c: i2c-tiny-usb: fix buffer not being DMA capable
Richard [Sun, 21 May 2017 19:27:00 +0000 (12:27 -0700)]
partitions/msdos: FreeBSD UFS2 file systems are not recognized
The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
and NetBSD partitions and does a reasonable job picking out OpenBSD
and NetBSD UFS subpartitions.
But for FreeBSD the subpartitions are always "bad".
Kernel: <bsd:bad subpartition - ignored
Though all 3 of these BSD systems use UFS as a file system, only
FreeBSD uses relative start addresses in the subpartition
declarations.
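As an illustration only (this is not the patch itself), the difference
between relative and absolute start addresses can be sketched as below.
The function names, the flavour strings and the bounds check are
assumptions made for this example, not the code in
block/partitions/msdos.c.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: turn a subpartition's declared offset into an
 * absolute sector. Assumes the "bsd" flavour means FreeBSD, whose
 * offsets are relative to the start of the enclosing slice.
 */
static uint64_t subpart_abs_start(const char *flavour,
				  uint64_t slice_start, uint64_t p_offset)
{
	if (strcmp(flavour, "bsd") == 0)
		return slice_start + p_offset;	/* relative offset */
	return p_offset;			/* already absolute */
}

/* Assumed sanity check: the subpartition must lie inside its slice. */
static int subpart_in_slice(uint64_t start, uint64_t size,
			    uint64_t slice_start, uint64_t slice_size)
{
	return start >= slice_start &&
	       start + size <= slice_start + slice_size;
}
```

Under these assumptions, a FreeBSD subpartition declared at offset 0 in a
slice starting at sector 63 fails the check if the offset is read as
absolute, and passes once the slice start is added.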
This patch fixes this for FreeBSD partitions and leaves
the code for OpenBSD and NetBSD intact.
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Jesper Dangaard Brouer [Mon, 22 May 2017 18:13:07 +0000 (20:13 +0200)]
mlx5: fix bug reading rss_hash_type from CQE
The masks for extracting parts of the Completion Queue Entry (CQE)
field rss_hash_type were swapped, namely CQE_RSS_HTYPE_IP and
CQE_RSS_HTYPE_L4.
The bug resulted in setting skb->l4_hash even though the
rss_hash_type indicated that the hash was NOT computed over the
L4 (UDP or TCP) part of the packet.
Added comments from the datasheet to make it clearer what
these masks are selecting.
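A minimal sketch of the effect of swapped masks follows. The SKETCH_*
constants, their bit positions (IP type in bits [3:2], L4 type in bits
[7:6]) and both functions are assumptions for illustration; they are not
taken from the driver or the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of an 8-bit rss_hash_type field (illustration only). */
enum {
	SKETCH_RSS_HTYPE_IP = 0x3 << 2,	/* bits [3:2]: IP hash type */
	SKETCH_RSS_HTYPE_L4 = 0x3 << 6,	/* bits [7:6]: L4 hash type */
};

/* Intended check: was the hash computed over the L4 header? */
static bool hash_covers_l4(uint8_t rss_hash_type)
{
	return (rss_hash_type & SKETCH_RSS_HTYPE_L4) != 0;
}

/* Same check with the masks swapped, as in the bug described above. */
static bool hash_covers_l4_swapped(uint8_t rss_hash_type)
{
	return (rss_hash_type & SKETCH_RSS_HTYPE_IP) != 0;
}
```

With only an IP bit set (for example 0x04 in this layout), the swapped
check wrongly reports an L4 hash while the intended check does not.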
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>