Pablo Neira Ayuso [Tue, 24 May 2016 09:23:51 +0000 (11:23 +0200)]
netfilter: nf_ct_helper: bail out on duplicated helpers
Don't allow registration of helpers using the same tuple:
{ l3proto, l4proto, src-port }
We lookup for the helper from the packet path using this tuple through
__nf_ct_helper_find(). Therefore, we have to avoid having two helpers
with the same tuple to ensure predictible behaviour.
Don't compare the helper string names anymore since it is valid to
register two helpers with the same name, but using different tuples.
This is also implicitly fixing up duplicated helper registration via
ports= modparam since the name comparison was defeating the tuple
duplication validation.
Reported-by: Feng Gao <gfree.wind@gmail.com>
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Turnbull [Fri, 27 May 2016 17:34:04 +0000 (13:34 -0400)]
netfilter: nf_tables: validate NFTA_SET_TABLE parameter
If the NFTA_SET_TABLE parameter is missing and the NLM_F_DUMP flag is
not set, then a NULL pointer dereference is triggered in
nf_tables_set_lookup because ctx.table is NULL.
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Paolo Abeni [Thu, 26 May 2016 17:08:10 +0000 (19:08 +0200)]
netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags
With the commit
48e8aa6e3137 ("ipv6: Set FLOWI_FLAG_KNOWN_NH at
flowi6_flags") ip6_pol_route() callers were asked to to set the
FLOWI_FLAG_KNOWN_NH properly and xt_TEE was updated accordingly,
but with the later refactor in commit
bbde9fc1824a ("netfilter:
factor out packet duplication for IPv4/IPv6") the flowi6_flags
update was lost.
This commit re-add it just before the routing decision.
Fixes:
bbde9fc1824a ("netfilter: factor out packet duplication for IPv4/IPv6")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Taehee Yoo [Sat, 14 May 2016 13:19:53 +0000 (22:19 +0900)]
netfilter: nf_ct_helper: Fix helper unregister count.
helpers should unregister the only registered ports.
but, helper cannot have correct registered ports value when
failed to register.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric W. Biederman [Sat, 14 May 2016 02:18:52 +0000 (21:18 -0500)]
netfilter: nf_queue: Make the queue_handler pernet
Florian Weber reported:
> Under full load (unshare() in loop -> OOM conditions) we can
> get kernel panic:
>
> BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
> IP: [<
ffffffff81476c85>] nfqnl_nf_hook_drop+0x35/0x70
> [..]
> task:
ffff88012dfa3840 ti:
ffff88012dffc000 task.ti:
ffff88012dffc000
> RIP: 0010:[<
ffffffff81476c85>] [<
ffffffff81476c85>] nfqnl_nf_hook_drop+0x35/0x70
> RSP: 0000:
ffff88012dfffd80 EFLAGS:
00010206
> RAX:
0000000000000008 RBX:
ffffffff81add0c0 RCX:
ffff88013fd80000
> [..]
> Call Trace:
> [<
ffffffff81474d98>] nf_queue_nf_hook_drop+0x18/0x20
> [<
ffffffff814738eb>] nf_unregister_net_hook+0xdb/0x150
> [<
ffffffff8147398f>] netfilter_net_exit+0x2f/0x60
> [<
ffffffff8141b088>] ops_exit_list.isra.4+0x38/0x60
> [<
ffffffff8141b652>] setup_net+0xc2/0x120
> [<
ffffffff8141bd09>] copy_net_ns+0x79/0x120
> [<
ffffffff8106965b>] create_new_namespaces+0x11b/0x1e0
> [<
ffffffff810698a7>] unshare_nsproxy_namespaces+0x57/0xa0
> [<
ffffffff8104baa2>] SyS_unshare+0x1b2/0x340
> [<
ffffffff81608276>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> Code: 65 00 48 89 e5 41 56 41 55 41 54 53 83 e8 01 48 8b 97 70 12 00 00 48 98 49 89 f4 4c 8b 74 c2 18 4d 8d 6e 08 49 81 c6 88 00 00 00 <49> 8b 5d 00 48 85 db 74 1a 48 89 df 4c 89 e2 48 c7 c6 90 68 47
>
The simple fix for this requires a new pernet variable for struct
nf_queue that indicates when it is safe to use the dynamically
allocated nf_queue state.
As we need a variable anyway make nf_register_queue_handler and
nf_unregister_queue_handler pernet. This allows the existing logic of
when it is safe to use the state from the nfnetlink_queue module to be
reused with no changes except for making it per net.
The syncrhonize_rcu from nf_unregister_queue_handler is moved to a new
function nfnl_queue_net_exit_batch so that the worst case of having a
syncrhonize_rcu in the pernet exit path is not experienced in batch
mode.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Fri, 13 May 2016 20:48:51 +0000 (22:48 +0200)]
netfilter: conntrack: remove leftover binary sysctl define
Users got removed in
f8572d8f2a2ba ("sysctl net: Remove unused binary
sysctl code").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 12 May 2016 12:43:54 +0000 (14:43 +0200)]
netfilter: nfnetlink_queue: fix timestamp attribute
Since 4.4 we erronously use timestamp of the netlink skb (which is zero).
Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1066
Fixes:
b28b1e826f818c30ea7 ("netfilter: nfnetlink_queue: use y2038 safe timestamp")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Thu, 12 May 2016 03:46:09 +0000 (23:46 -0400)]
Merge branch 'bnxt_en-fixes'
Michael Chan says:
====================
bnxt_en: Add workaround to detect bad opaque in rx completion.
2-part workaround for this hardware bug.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 10 May 2016 23:18:00 +0000 (19:18 -0400)]
bnxt_en: Add workaround to detect bad opaque in rx completion (part 2)
Add detection and recovery code when the hardware returned opaque value
does not match the expected consumer index. Once the issue is detected,
we skip the processing of all RX and LRO/GRO packets. These completion
entries are discarded without sending the SKB to the stack and without
producing new buffers. The function will be reset from a workqueue.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 10 May 2016 23:17:59 +0000 (19:17 -0400)]
bnxt_en: Add workaround to detect bad opaque in rx completion (part 1)
There is a rare hardware bug that can cause a bad opaque value in the RX
or TPA completion. When this happens, the hardware may have used the
same buffer twice for 2 rx packets. In addition, the driver will also
crash later using the bad opaque as the index into the ring.
The rx opaque value is predictable and is always monotonically increasing.
The workaround is to keep track of the expected next opaque value and
compare it with the one returned by hardware during RX and TPA start
completions. If they miscompare, we will not process any more RX and
TPA completions and exit NAPI. We will then schedule a workqueue to
reset the function.
This patch adds the logic to keep track of the next rx consumer index.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 10 May 2016 19:20:04 +0000 (22:20 +0300)]
qlcnic: potential NULL dereference in qlcnic_83xx_get_minidump_template()
If qlcnic_fw_cmd_get_minidump_temp() fails then "fw_dump->tmpl_hdr" is
NULL or possibly freed. It can lead to an oops later.
Fixes:
d01a6d3c8ae1 ('qlcnic: Add support to enable capability to extend minidump for iSCSI')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 11 May 2016 20:17:12 +0000 (13:17 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a couple of small fixes: one is a potential uninitialised
error variable in the alua code, potentially causing spurious failures
and the other is a problem caused by the conversion of SCSI to
hostwide tags which resulted in the qla1280 driver always failing in
host initialisation"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
qla1280: Don't allocate 512kb of host tags
scsi_dh_alua: uninitialized variable in alua_rtpg()
Linus Torvalds [Wed, 11 May 2016 19:52:05 +0000 (12:52 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"Hopefully the last round of fixes this release, fingers crossed :)
1) Initialize static nf_conntrack_locks_all_lock properly, from
Florian Westphal.
2) Need to cancel pending work when destroying IDLETIMER entries,
from Liping Zhang.
3) Fix TX param usage when sending TSO over iwlwifi devices, from
Emmanuel Grumbach.
4) NFACCT quota params not validated properly, from Phil Turnbull.
5) Resolve more glibc vs. kernel header conflicts, from Mikko
Tapeli.
6) Missing IRQ free in ravb_close(), from Geert Uytterhoeven.
7) Fix infoleak in x25, from Kangjie Lu.
8) Similarly in thunderx driver, from Heinrich Schuchardt.
9) tc_ife.h uapi header not exported properly, from Jamal Hadi Salim.
10) Don't reenable PHY interreupts if device is in polling mode, from
Shaohui Xie.
11) Packet scheduler actions late binding was not being handled
properly at all, from Jamal Hadi Salim.
12) Fix binding of conntrack entries to helpers in openvswitch, from
Joe Stringer"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
gre: do not keep the GRE header around in collect medata mode
openvswitch: Fix cached ct with helper.
net sched: ife action fix late binding
net sched: skbedit action fix late binding
net sched: simple action fix late binding
net sched: mirred action fix late binding
net sched: ipt action fix late binding
net sched: vlan action fix late binding
net: phylib: fix interrupts re-enablement in phy_start
tcp: refresh skb timestamp at retransmit time
net: nps_enet: bug fix - handle lost tx interrupts
net: nps_enet: Tx handler synchronization
export tc ife uapi header
net: thunderx: avoid exposing kernel stack
net: fix a kernel infoleak in x25 module
ravb: Add missing free_irq() call to ravb_close()
uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h
netfilter: nfnetlink_acct: validate NFACCT_QUOTA parameter
iwlwifi: mvm: don't override the rate with the AMSDU len
netfilter: IDLETIMER: fix race condition when destroy the target
...
Jiri Benc [Wed, 11 May 2016 13:53:57 +0000 (15:53 +0200)]
gre: do not keep the GRE header around in collect medata mode
For ipgre interface in collect metadata mode, it doesn't make sense for the
interface to be of ARPHRD_IPGRE type. The outer header of received packets
is not needed, as all the information from it is present in metadata_dst. We
already don't set ipgre_header_ops for collect metadata interfaces, which is
the only consumer of mac_header pointing to the outer IP header.
Just set the interface type to ARPHRD_NONE in collect metadata mode for
ipgre (not gretap, that still correctly stays ARPHRD_ETHER) and reset
mac_header.
Fixes:
a64b04d86d14 ("gre: do not assign header_ops in collect metadata mode")
Fixes:
2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Stringer [Wed, 11 May 2016 17:29:26 +0000 (10:29 -0700)]
openvswitch: Fix cached ct with helper.
When using conntrack helpers from OVS, a common configuration is to
perform a lookup without specifying a helper, then go through a
firewalling policy, only to decide to attach a helper afterwards.
In this case, the initial lookup will cause a ct entry to be attached to
the skb, then the later commit with helper should attach the helper and
confirm the connection. However, the helper attachment has been missing.
If the user has enabled automatic helper attachment, then this issue
will be masked as it will be applied in init_conntrack(). It is also
masked if the action is executed from ovs_packet_cmd_execute() as that
will construct a fresh skb.
This patch fixes the issue by making an explicit call to try to assign
the helper if there is a discrepancy between the action's helper and the
current skb->nfct.
Fixes:
cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Tue, 10 May 2016 21:07:02 +0000 (23:07 +0200)]
x86/extable: ensure entries are swapped completely when sorting
The x86 exception table sorting was changed in commit
29934b0fb8ff
("x86/extable: use generic search and sort routines") to use the arch
independent code in lib/extable.c. However, the patch was mangled
somehow on its way into the kernel from the last version posted at [1].
The committed version kind of attempted to incorporate the changes of
commit
548acf19234d ("x86/mm: Expand the exception table logic to allow
new handling options") as in _completely_ _ignoring_ the x86 specific
'handler' member of struct exception_table_entry. This effectively
broke the sorting as entries will only partly be swapped now.
Fortunately, the x86 Kconfig selects BUILDTIME_EXTABLE_SORT, so the
exception table doesn't need to be sorted at runtime. However, in case
that ever changes, we better not break the exception table sorting just
because of that.
[ Ard Biesheuvel points out that BUILDTIME_EXTABLE_SORT applies to the
core image only, but we still rely on the sorting routines for modules
in that case - Linus ]
Fix this by providing a swap_ex_entry_fixup() macro that takes care of
the 'handler' member.
[1] https://lkml.org/lkml/2016/1/27/232
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Fixes:
29934b0fb8f ("x86/extable: use generic search and sort routines")
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 11 May 2016 17:21:16 +0000 (10:21 -0700)]
Merge tag 'spi-fix-v4.6-rc7' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A bunch of small driver specific fixes that have come up, none of them
remarkable in themselves. One fixes a regression introduced in the
merge window and another two are targetted at stable"
* tag 'spi-fix-v4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
spi: spi-ti-qspi: Handle truncated frames properly
spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
spi: omap2-mcspi: Undo broken fix for dma transfer of vmalloced buffer
spi: spi-fsl-dspi: Fix cs_change handling in message transfer
Linus Torvalds [Wed, 11 May 2016 17:11:44 +0000 (10:11 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Two small x86 patches, improving "make kvmconfig" and fixing an
objtool warning for CONFIG_PROFILE_ALL_BRANCHES"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvmconfig: add more virtio drivers
x86/kvm: Add stack frame dependency to fastop() inline asm
David S. Miller [Wed, 11 May 2016 03:50:16 +0000 (23:50 -0400)]
Merge branch 'net-sched-fixes'
Jamal Hadi Salim says:
====================
Some actions were broken in allowing for late binding of actions.
Late binding workflow is as follows:
a) create an action and provide all necessary parameters for it
Optionally provide an index or let the kernel give you one.
Example:
sudo tc actions add action police rate 1kbit burst 90k drop index 1
b) later on bind to the pre-created action from a filter definition
by merely specifying the index.
Example:
sudo tc filter add dev lo parent ffff: protocol ip prio 8 \
u32 match ip src 127.0.0.8/32 flowid 1:8 action police index 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:31 +0000 (16:49 -0400)]
net sched: ife action fix late binding
The process below was broken and is fixed with this patch.
//add an ife action and give it an instance id of 1
sudo tc actions add action ife encode \
type 0xDEAD allow mark dst 02:15:15:15:15:15 index 1
//create a filter which binds to ife action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:11 action ife index 1
Message before fix was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:30 +0000 (16:49 -0400)]
net sched: skbedit action fix late binding
The process below was broken and is fixed with this patch.
//add a skbedit action and give it an instance id of 1
sudo tc actions add action skbedit mark 10 index 1
//create a filter which binds to skbedit action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:10 action skbedit index 1
Message before fix was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:29 +0000 (16:49 -0400)]
net sched: simple action fix late binding
The process below was broken and is fixed with this patch.
//add a simple action and give it an instance id of 1
sudo tc actions add action simple sdata "foobar" index 1
//create a filter which binds to simple action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:10 action simple index 1
Message before fix was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:28 +0000 (16:49 -0400)]
net sched: mirred action fix late binding
The process below was broken and is fixed with this patch.
//add an mirred action and give it an instance id of 1
sudo tc actions add action mirred egress mirror dev $MDEV index 1
//create a filter which binds to mirred action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:10 action mirred index 1
Message before bug fix was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:27 +0000 (16:49 -0400)]
net sched: ipt action fix late binding
This was broken and is fixed with this patch.
//add an ipt action and give it an instance id of 1
sudo tc actions add action ipt -j mark --set-mark 2 index 1
//create a filter which binds to ipt action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:10 action ipt index 1
Message before bug fix was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Tue, 10 May 2016 20:49:26 +0000 (16:49 -0400)]
net sched: vlan action fix late binding
Late vlan action binding was broken and is fixed with this patch.
//add a vlan action to pop and give it an instance id of 1
sudo tc actions add action vlan pop index 1
//create filter which binds to vlan action id 1
sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32 \
match ip dst 17.0.0.1/32 flowid 1:1 action vlan index 1
current message(before bug fix) was:
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shaohui Xie [Tue, 10 May 2016 09:42:26 +0000 (17:42 +0800)]
net: phylib: fix interrupts re-enablement in phy_start
If phy was suspended and is starting, current driver always enable
phy's interrupts, if phy works in polling, phy can raise unexpected
interrupt which will not be handled, the interrupt will block system
enter suspend again. So interrupts should only be re-enabled if phy
works in interrupt.
Signed-off-by: Shaohui Xie <Shaohui.Xie@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 10 May 2016 03:55:16 +0000 (20:55 -0700)]
tcp: refresh skb timestamp at retransmit time
In the very unlikely case __tcp_retransmit_skb() can not use the cloning
done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
the copy and transmit, otherwise TCP TS val will be an exact copy of
original transmit.
Fixes:
7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 May 2016 19:04:57 +0000 (15:04 -0400)]
Merge branch 'nps_enet-fixes'
Elad Kanfi says:
====================
nps_enet: Net driver bugs fix
v3:
tx_packet_sent flag is not necessary, use socket buffer pointer
instead.
Use wmb() instead of smp_wmb().
v2:
Remove code style commit for now.
Code style commit will be added after the bugs fix will be approved.
Summary:
1. Bug description: TX done interrupts that arrives while interrupts
are masked, during NAPI poll, will not trigger an interrupt handling.
Since TX interrupt is of level edge we will lose the TX done interrupt.
As a result all pending tx frames will get no service.
Solution: Check if there is a pending tx request after unmasking the
interrupt and if answer is yes then re-add ourselves to
the NAPI poll list.
2. Bug description: CPU-A before sending a frame will set a variable
to true. CPU-B that executes the tx done interrupt service routine
might read a non valid value of that variable.
Solution: Use the socket buffer pointer instead of the variable,
and add a write memory barrier at the tx sending function after
the pointer is set.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Elad Kanfi [Mon, 9 May 2016 17:13:20 +0000 (20:13 +0300)]
net: nps_enet: bug fix - handle lost tx interrupts
The tx interrupt is of edge type, and in case such interrupt is triggered
while it is masked it will not be handled even after tx interrupts are
re-enabled in the end of NAPI poll.
This will cause tx network to stop in the following scenario:
* Rx is being handled, hence interrupts are masked.
* Tx interrupt is triggered after checking if there is some tx to handle
and before re-enabling the interrupts.
In this situation only rx transaction will release tx requests.
In order to handle the tx that was missed( if there was one ),
a NAPI reschdule was added after enabling the interrupts.
Signed-off-by: Elad Kanfi <eladkan@mellanox.com>
Acked-by: Noam Camus <noamca@mellanox.com>
Acked-by: Gilad Ben-Yossef <giladby@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Elad Kanfi [Mon, 9 May 2016 17:13:19 +0000 (20:13 +0300)]
net: nps_enet: Tx handler synchronization
Below is a description of a possible problematic
sequence. CPU-A is sending a frame and CPU-B handles
the interrupt that indicates the frame was sent. CPU-B
reads an invalid value of tx_packet_sent.
CPU-A CPU-B
----- -----
nps_enet_send_frame
.
.
tx_skb = skb
tx_packet_sent = true
order HW to start tx
.
.
HW complete tx
------> get tx complete interrupt
.
.
if(tx_packet_sent == true)
handle tx_skb
end memory transaction
(tx_packet_sent actually
written)
Furthermore there is a dependency between tx_skb and tx_packet_sent.
There is no assurance that tx_skb contains a valid pointer at CPU B
when it sees tx_packet_sent == true.
Solution:
Initialize tx_skb to NULL and use it to indicate that packet was sent,
in this way tx_packet_sent can be removed.
Add a write memory barrier after setting tx_skb in order to make sure
that it is valid before HW is informed and IRQ is fired.
Fixed sequence will be:
CPU-A CPU-B
----- -----
tx_skb = skb
wmb()
.
.
order HW to start tx
.
.
HW complete tx
------> get tx complete interrupt
.
.
if(tx_skb != NULL)
handle tx_skb
tx_skb = NULL
Signed-off-by: Elad Kanfi <eladkan@mellanox.com>
Acked-by: Noam Camus <noamca@mellanox.com>
Acked-by: Gilad Ben-Yossef <giladby@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 May 2016 19:04:40 +0000 (12:04 -0700)]
Merge tag 'pci-v4.6-fixes-3' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Since v4.5, we've WARNed during resume if a PCI device, including a
Thunderbolt device, was added while we were suspended. A change we
merged for v4.6-rc1 turned that warning into a system hang. These
enumeration patches from Lukas Wunner fix this issue:
- Fix BUG on device attach failure
- Do not treat EPROBE_DEFER as device attach failure"
* tag 'pci-v4.6-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Do not treat EPROBE_DEFER as device attach failure
PCI: Fix BUG on device attach failure
Linus Torvalds [Tue, 10 May 2016 18:41:05 +0000 (11:41 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two topology corner case fixes, and a MAINTAINERS file update for
mmiotrace maintenance"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n
MAINTAINERS: Add mmiotrace entry
x86/topology: Handle CPUID bogosity gracefully
Linus Torvalds [Tue, 10 May 2016 18:32:01 +0000 (11:32 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A UP kernel cpufreq fix and a rt/dl scheduler corner case fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt, sched/dl: Don't push if task's scheduling class was changed
sched/fair: Fix !CONFIG_SMP kernel cpufreq governor breakage
Andrey Utkin [Mon, 9 May 2016 15:04:19 +0000 (18:04 +0300)]
kvmconfig: add more virtio drivers
"make defconfig kvmconfig" is supposed to end up with usable kernel for
KVM guest. In practice, it won't work for e.g. Hetzner VPS (KVM-based)
unless you add these options.
Signed-off-by: Andrey Utkin <andrey_utkin@fastmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Josh Poimboeuf [Wed, 9 Mar 2016 18:59:50 +0000 (12:59 -0600)]
x86/kvm: Add stack frame dependency to fastop() inline asm
The kbuild test robot reported this objtool warning [1]:
arch/x86/kvm/emulate.o: warning: objtool: fastop()+0x69: call without frame pointer save/setup
The issue seems to be caused by CONFIG_PROFILE_ALL_BRANCHES. With that
option, for some reason gcc decides not to create a stack frame in
fastop() before doing the inline asm call, which can result in a bad
stack trace.
Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
listing the stack pointer as an output operand for the inline asm
statement.
This change has no effect for !CONFIG_PROFILE_ALL_BRANCHES.
[1] https://lists.01.org/pipermail/kbuild-all/2016-March/018249.html
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xunlei Pang [Mon, 9 May 2016 04:11:31 +0000 (12:11 +0800)]
sched/rt, sched/dl: Don't push if task's scheduling class was changed
We got this warning:
WARNING: CPU: 1 PID: 2468 at kernel/sched/core.c:1161 set_task_cpu+0x1af/0x1c0
[...]
Call Trace:
dump_stack+0x63/0x87
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
set_task_cpu+0x1af/0x1c0
push_dl_task.part.34+0xea/0x180
push_dl_tasks+0x17/0x30
__balance_callback+0x45/0x5c
__sched_setscheduler+0x906/0xb90
SyS_sched_setattr+0x150/0x190
do_syscall_64+0x62/0x110
entry_SYSCALL64_slow_path+0x25/0x25
This corresponds to:
WARN_ON_ONCE(p->state == TASK_RUNNING &&
p->sched_class == &fair_sched_class &&
(p->on_rq && !task_on_rq_migrating(p)))
It happens because in find_lock_later_rq(), the task whose scheduling
class was changed to fair class is still pushed away as if it were
a deadline task ...
So, check in find_lock_later_rq() after double_lock_balance(), if the
scheduling class of the deadline task was changed, break and retry.
Apply the same logic to RT tasks.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Link: http://lkml.kernel.org/r/1462767091-1215-1-git-send-email-xlpang@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 10 May 2016 07:20:33 +0000 (09:20 +0200)]
x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n
Josef reported that the uncore driver trips over with CONFIG_SMP=n because
x86_max_cores is 16 instead of 12.
The reason is, that for SMP=n the extended topology detection is a NOOP and
the cache leaf is used to determine the number of cores. That's wrong in two
aspects:
1) The cache leaf enumerates the maximum addressable number of cores in the
package, which is obviously not correct
2) UP has no business with topology bits at all.
Make intel_num_cpu_cores() return 1 for CONFIG_SMP=n
Reported-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team <Kernel-team@fb.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/761b4a2a-0332-7954-f030-c6639f949612@fb.com
Jamal Hadi Salim [Mon, 9 May 2016 12:05:06 +0000 (08:05 -0400)]
export tc ife uapi header
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 May 2016 05:02:51 +0000 (01:02 -0400)]
Merge tag 'wireless-drivers-for-davem-2016-05-09' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.6
iwlwifi
* fix P2P rates (and possibly other issues)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 10 May 2016 04:50:20 +0000 (00:50 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contain Netfilter simple fixes for your net tree,
two one-liner and one two-liner:
1) Oneliner to fix missing spinlock definition that triggers
'BUG: spinlock bad magic on CPU#' when spinlock debugging is enabled,
from Florian Westphal.
2) Fix missing workqueue cancelation on IDLETIMER removal,
from Liping Zhang.
3) Fix insufficient validation of netlink of NFACCT_QUOTA in
nfnetlink_acct, from Phil Turnbull.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
xypron.glpk@gmx.de [Sun, 8 May 2016 22:46:18 +0000 (00:46 +0200)]
net: thunderx: avoid exposing kernel stack
Reserved fields should be set to zero to avoid exposing
bits from the kernel stack.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kangjie Lu [Sun, 8 May 2016 16:10:14 +0000 (12:10 -0400)]
net: fix a kernel infoleak in x25 module
Stack object "dte_facilities" is allocated in x25_rx_call_request(),
which is supposed to be initialized in x25_negotiate_facilities.
However, 5 fields (8 bytes in total) are not initialized. This
object is then copied to userland via copy_to_user, thus infoleak
occurs.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Sat, 7 May 2016 11:17:11 +0000 (13:17 +0200)]
ravb: Add missing free_irq() call to ravb_close()
When reopening the network device on ra7795/salvator-x, e.g. after a
DHCP timeout:
IP-Config: Reopening network devices...
genirq: Flags mismatch irq 139.
00000000 (eth0:ch24:emac) vs.
00000000 (eth0:ch24:emac)
ravb
e6800000.ethernet eth0: cannot request IRQ eth0:ch24:emac
IP-Config: Failed to open eth0
IP-Config: No network devices available
The "mismatch" is due to requesting an IRQ that is already in use,
while IRQF_PROBE_SHARED wasn't set.
However, the real cause is that ravb_close() doesn't release the R-Car
Gen3-specific secondary IRQ.
Add the missing free_irq() call to fix this.
Fixes:
22d4df8ff3a3cc72 ("ravb: Add support for r8a7795 SoC")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mikko Rapeli [Sun, 24 Apr 2016 15:45:00 +0000 (17:45 +0200)]
uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h
glibc's net/if.h contains copies of definitions from linux/if.h and these
conflict and cause build failures if both files are included by application
source code. Changes in uapi headers, which fixed header file dependencies to
include linux/if.h when it was needed, e.g. commit
1ffad83d, made the
net/if.h and linux/if.h incompatibilities visible as build failures for
userspace applications like iproute2 and xtables-addons.
This patch fixes compile errors when glibc net/if.h is included before
linux/if.h:
./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’
./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’
./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’
./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’
./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’
./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’
./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’
./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’
./linux/if.h:252:8: error: redefinition of ‘struct ifconf’
./linux/if.h:203:8: error: redefinition of ‘struct ifreq’
./linux/if.h:169:8: error: redefinition of ‘struct ifmap’
./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’
./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’
./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’
./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’
./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’
./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’
./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’
./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’
The cases where linux/if.h is included before net/if.h need a similar fix in
the glibc side, or the order of include files can be changed userspace
code as a workaround.
This change was tested in x86 userspace on Debian unstable with
scripts/headers_compile_test.sh:
$ make headers_install && \
cd usr/include && ../../scripts/headers_compile_test.sh -l -k
...
cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu -o /dev/null ./linux/if.h_libc_before_kernel.h
PASSED libc before kernel test: ./linux/if.h
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: Stephen Hemminger <shemming@brocade.com>
Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 May 2016 01:24:04 +0000 (18:24 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm build fix from Dan Williams:
"A build fix for the usage of HPAGE_SIZE in the last libnvdimm pull
request.
I have taken note that the kbuild robot build success test does not
include results for alpha_allmodconfig. Thanks to Guenter for the
report. It's tagged for -stable since the original fix will land
there and cause build problems"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, pfn: fix ARCH=alpha allmodconfig build failure
Andy Lutomirski [Mon, 9 May 2016 22:48:51 +0000 (15:48 -0700)]
perf/core: Change the default paranoia level to 2
Allowing unprivileged kernel profiling lets any user dump follow kernel
control flow and dump kernel registers. This most likely allows trivial
kASLR bypassing, and it may allow other mischief as well. (Off the top
of my head, the PERF_SAMPLE_REGS_INTR output during /dev/urandom reads
could be quite interesting.)
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 10 May 2016 00:54:59 +0000 (17:54 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"2 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
zsmalloc: fix zs_can_compact() integer overflow
Revert "proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan""
Sergey Senozhatsky [Mon, 9 May 2016 23:28:49 +0000 (16:28 -0700)]
zsmalloc: fix zs_can_compact() integer overflow
zs_can_compact() has two race conditions in its core calculation:
unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
zs_stat_get(class, OBJ_USED);
1) classes are not locked, so the numbers of allocated and used
objects can change by the concurrent ops happening on other CPUs
2) shrinker invokes it from preemptible context
Depending on the circumstances, thus, OBJ_ALLOCATED can become
less than OBJ_USED, which can result in either very high or
negative `total_scan' value calculated later in do_shrink_slab().
do_shrink_slab() has some logic to prevent those cases:
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
However, due to the way `total_scan' is calculated, not every
shrinker->count_objects() overflow can be spotted and handled.
To demonstrate the latter, I added some debugging code to do_shrink_slab()
(x86_64) and the results were:
vmscan: OVERFLOW: shrinker->count_objects() == -1 [
18446744073709551615]
vmscan: but total_scan > 0:
92679974445502
vmscan: resulting total_scan:
92679974445502
[..]
vmscan: OVERFLOW: shrinker->count_objects() == -1 [
18446744073709551615]
vmscan: but total_scan > 0:
22634041808232578
vmscan: resulting total_scan:
22634041808232578
Even though shrinker->count_objects() has returned an overflowed value,
the resulting `total_scan' is positive, and, what is more worrisome, it
is insanely huge. This value is getting used later on in
shrinker->scan_objects() loop:
while (total_scan >= batch_size ||
total_scan >= freeable) {
unsigned long ret;
unsigned long nr_to_scan = min(batch_size, total_scan);
shrinkctl->nr_to_scan = nr_to_scan;
ret = shrinker->scan_objects(shrinker, shrinkctl);
if (ret == SHRINK_STOP)
break;
freed += ret;
count_vm_events(SLABS_SCANNED, nr_to_scan);
total_scan -= nr_to_scan;
cond_resched();
}
`total_scan >= batch_size' is true for a very-very long time and
'total_scan >= freeable' is also true for quite some time, because
`freeable < 0' and `total_scan' is large enough, for example,
22634041808232578. The only break condition, in the given scheme of
things, is shrinker->scan_objects() == SHRINK_STOP test, which is a
bit too weak to rely on, especially in heavy zsmalloc-usage scenarios.
To fix the issue, take a pool stat snapshot and use it instead of
racy zs_stat_get() calls.
Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Humble [Mon, 9 May 2016 23:28:46 +0000 (16:28 -0700)]
Revert "proc/base: make prompt shell start from new line after executing "cat /proc/$pid/wchan""
This reverts the 4.6-rc1 commit
7e2bc81da333 ("proc/base: make prompt
shell start from new line after executing "cat /proc/$pid/wchan")
because it breaks /proc/$PID/whcan formatting in ps and top.
Revert also because the patch is inconsistent - it adds a newline at the
end of only the '0' wchan, and does not add a newline when
/proc/$PID/wchan contains a symbol name.
eg.
$ ps -eo pid,stat,wchan,comm
PID STAT WCHAN COMMAND
...
1189 S - dbus-launch
1190 Ssl 0
dbus-daemon
1198 Sl 0
lightdm
1299 Ss ep_pol systemd
1301 S - (sd-pam)
1304 Ss wait sh
Signed-off-by: Robin Humble <plaguedbypenguins@gmail.com>
Cc: Minfei Huang <mnfhuang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 9 May 2016 19:24:19 +0000 (12:24 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- bug in ahash SG list walking that may lead to crashes
- resource leak in qat
- missing RSA dependency that causes it to fail"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: rsa - select crypto mgr dependency
crypto: hash - Fix page length clamping in hash walk
crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq
crypto: qat - fix invalid pf2vf_resp_wq logic
Linus Torvalds [Mon, 9 May 2016 19:11:37 +0000 (12:11 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Check klogctl failure correctly, from Colin Ian King.
2) Prevent OOM when under memory pressure in flowcache, from Steffen
Klassert.
3) Fix info leak in llc and rtnetlink ifmap code, from Kangjie Lu.
4) Memory barrier and multicast handling fixes in bnxt_en, from Michael
Chan.
5) Endianness bug in mlx5, from Daniel Jurgens.
6) Fix disconnect handling in VSOCK, from Ian Campbell.
7) Fix locking of netdev list walking in get_bridge_ifindices(), from
Nikolay Aleksandrov.
8) Bridge multicast MLD parser can look at wrong packet offsets, fix
from Linus Lüssing.
9) Fix chip hang in qede driver, from Sudarsana Reddy Kalluru.
10) Fix missing setting of encapsulation before inner handling completes
in udp_offload code, from Jarno Rajahalme.
11) Missing rollbacks during LAG join and flood configuration failures
in mlxsw driver, from Ido Schimmel.
12) Fix error code checks in netxen driver, from Dan Carpenter.
13) Fix key size in new macsec driver, from Sabrina Dubroca.
14) Fix mlx5/VXLAN dependencies, from Arnd Bergmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
net/mlx5e: make VXLAN support conditional
Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
macsec: key identifier is 128 bits, not 64
Documentation/networking: more accurate LCO explanation
macvtap: segmented packet is consumed
tools: bpf_jit_disasm: check for klogctl failure
qede: uninitialized variable in qede_start_xmit()
netxen: netxen_rom_fast_read() doesn't return -1
netxen: reversed condition in netxen_nic_set_link_parameters()
netxen: fix error handling in netxen_get_flash_block()
mlxsw: spectrum: Add missing rollback in flood configuration
mlxsw: spectrum: Fix rollback order in LAG join failure
udp_offload: Set encapsulation before inner completes.
udp_tunnel: Remove redundant udp_tunnel_gro_complete().
qede: prevent chip hang when increasing channels
net: ipv6: tcp reset, icmp need to consider L3 domain
bridge: fix igmp / mld query parsing
net: bridge: fix old ioctl unlocked net device walk
VSOCK: do not disconnect socket when peer has shutdown SEND only
net/mlx4_en: Fix endianness bug in IPV6 csum calculation
...
Josh Poimboeuf [Fri, 6 May 2016 14:22:25 +0000 (09:22 -0500)]
compiler-gcc: require gcc 4.8 for powerpc __builtin_bswap16()
gcc support for __builtin_bswap16() was supposedly added for powerpc in
gcc 4.6, and was then later added for other architectures in gcc 4.8.
However, Stephen Rothwell reported that attempting to use it on powerpc
in gcc 4.6 fails with:
lib/vsprintf.c:160:2: error: initializer element is not constant
lib/vsprintf.c:160:2: error: (near initialization for 'decpair[0]')
lib/vsprintf.c:160:2: error: initializer element is not constant
lib/vsprintf.c:160:2: error: (near initialization for 'decpair[1]')
...
I'm not entirely sure what those errors mean, but I don't see them on
gcc 4.8. So let's consider gcc 4.8 to be the official starting point
for __builtin_bswap16().
Arnd Bergmann adds:
"I found the commit in gcc-4.8 that replaced the powerpc-specific
implementation of __builtin_bswap16 with an architecture-independent
one. Apparently the powerpc version (gcc-4.6 and 4.7) just mapped to
the lhbrx/sthbrx instructions, so it ended up not being a constant,
though the intent of the patch was mainly to add support for the
builtin to x86:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52624
has the patch that went into gcc-4.8 and more information."
Fixes:
7322dd755e7d ("byteswap: try to avoid __builtin_constant_p gcc bug")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Mon, 9 May 2016 04:21:13 +0000 (00:21 -0400)]
Merge branch 'mlx5-build-fix'
Saeed Mahameed says:
====================
net/mlx5e: Kconfig fixes for VxLAN
Reposting to net the build errors fixes posted by Arnd last week.
Originally Arnd posted those fixes to net-next, while the issue
is also seen in net. For net-next a different approach is required
for fixing the issue as VXLAN and Device Drivers are no longer
dependent, but there is no harm for those fixes to get into net-next.
Optionally, once net is merged into net-next we can
Revert "net/mlx5e: make VXLAN support conditional" as the
CONFIG_MLX5_CORE_EN_VXLAN will no longer be required.
Applied on top:
288928658583 ('mlxsw: spectrum: Add missing rollback in flood configuration')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sun, 8 May 2016 11:55:25 +0000 (14:55 +0300)]
net/mlx5e: make VXLAN support conditional
VXLAN can be disabled at compile-time or it can be a loadable
module while mlx5 is built-in, which leads to a link error:
drivers/net/built-in.o: In function `mlx5e_create_netdev':
ntb_netdev.c:(.text+0x106de4): undefined reference to `vxlan_get_rx_port'
This avoids the link error and makes the vxlan code optional,
like the other ethernet drivers do as well.
Link: https://patchwork.ozlabs.org/patch/589296/
Fixes:
b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sun, 8 May 2016 11:55:24 +0000 (14:55 +0300)]
Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
This reverts commit
69976fb1045850a742deb9790ea49cbc6f497531.
We cannot select VXLAN when IPv4 support is disabled, that just gives
us additional build errors, including:
warning: (MLX5_CORE_EN) selects VXLAN which has unmet direct dependencies (NETDEVICES && NET_CORE && INET)
In file included from ../drivers/net/vxlan.c:36:0:
include/net/udp_tunnel.h: In function 'udp_tunnel_handle_offloads':
include/net/udp_tunnel.h:112:9: error: implicit declaration of function 'iptunnel_handle_offloads' [-Werror=implicit-function-declaration]
return iptunnel_handle_offloads(skb, type);
^~~~~~~~~~~~~~~~~~~~~~~~
I'm sending a proper fix for the original bug in a separate patch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Sat, 7 May 2016 18:19:29 +0000 (20:19 +0200)]
macsec: key identifier is 128 bits, not 64
The MACsec standard mentions a key identifier for each key, but
doesn't specify anything about it, so I arbitrarily chose 64 bits.
IEEE 802.1X-2010 specifies MKA (MACsec Key Agreement), and defines the
key identifier to be 128 bits (96 bits "member identifier" + 32 bits
"key number").
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shmulik Ladkani [Fri, 6 May 2016 17:27:43 +0000 (20:27 +0300)]
Documentation/networking: more accurate LCO explanation
In few places the term "ones-complement sum" was used but the actual
meaning is "the complement of the ones-complement sum".
Also, avoid enclosing long statements with underscore, to ease
readability.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 6 May 2016 12:58:21 +0000 (05:58 -0700)]
macvtap: segmented packet is consumed
If GSO packet is segmented and its segments are properly queued,
we call consume_skb() instead of kfree_skb() to be drop monitor
friendly.
Fixes:
3e4f8b7873709 ("macvtap: Perform GSO on forwarding path.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 5 May 2016 22:39:33 +0000 (23:39 +0100)]
tools: bpf_jit_disasm: check for klogctl failure
klogctl can fail and return -ve len, so check for this and
return NULL to avoid passing a (size_t)-1 to malloc.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:21:30 +0000 (16:21 +0300)]
qede: uninitialized variable in qede_start_xmit()
"data_split" was never set to false. It's just uninitialized.
Fixes:
2950219d87b0 ('qede: Add basic network device support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 8 May 2016 21:38:32 +0000 (14:38 -0700)]
Linux 4.6-rc7
Ingo Molnar [Sun, 8 May 2016 08:27:33 +0000 (10:27 +0200)]
MAINTAINERS: Add mmiotrace entry
The Nouveau maintainers would like to follow and review mmiotrace
changes as well, so create a separate entry for that code. The high
level bits are living in the tracing code, the low level bits in the
x86 code.
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Pekka Paalanen <ppaalanen@gmail.com>
Acked-by: karol herbst <karolherbst@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dan Carpenter [Thu, 5 May 2016 13:20:20 +0000 (16:20 +0300)]
netxen: netxen_rom_fast_read() doesn't return -1
The error handling is broken here. netxen_rom_fast_read() returns zero
on success and -EIO on error. It never returns -1.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:19:44 +0000 (16:19 +0300)]
netxen: reversed condition in netxen_nic_set_link_parameters()
My static checker complains that we are using "autoneg" without
initializing it. The problem is the ->phy_read() condition is reversed
so we only set this on error instead of success.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 5 May 2016 13:18:46 +0000 (16:18 +0300)]
netxen: fix error handling in netxen_get_flash_block()
My static checker complained that "v" can be used unintialized if
netxen_rom_fast_read() returns -EIO. That function never actually
returns -1.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 7 May 2016 17:53:32 +0000 (10:53 -0700)]
Merge tag 'char-misc-4.6-rc7' of git://git./linux/kernel/git/gregkh/char-misc
Pull misc driver fixes from Gfreg KH:
"Here are three small fixes for some driver problems that were
reported. Full details in the shortlog below.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nvmem: mxs-ocotp: fix buffer overflow in read
Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
misc: mic: Fix for double fetch security bug in VOP driver
Linus Torvalds [Sat, 7 May 2016 17:50:48 +0000 (10:50 -0700)]
Merge tag 'staging-4.6-rc7' of git://git./linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Grek KH:
"It's really just IIO drivers here, some small fixes that resolve some
'crash on boot' errors that have shown up in the -rc series, and other
bugfixes that are required.
All have been in linux-next with no reported problems"
* tag 'staging-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: imu: mpu6050: Fix name/chip_id when using ACPI
iio: imu: mpu6050: fix possible NULL dereferences
iio:adc:at91-sama5d2: Repair crash on module removal
iio: ak8975: fix maybe-uninitialized warning
iio: ak8975: Fix NULL pointer exception on early interrupt
Linus Torvalds [Sat, 7 May 2016 17:47:03 +0000 (10:47 -0700)]
Merge tag 'usb-4.6-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some last-remaining fixes for USB drivers to resolve issues
that have shown up in testing. And two new device ids as well.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "USB / PM: Allow USB devices to remain runtime-suspended when sleeping"
usb: musb: jz4740: fix error check of usb_get_phy()
Revert "usb: musb: musb_host: Enable HCD_BH flag to handle urb return in bottom half"
usb: musb: gadget: nuke endpoint before setting its descriptor to NULL
USB: serial: cp210x: add Straizona Focusers device ids
USB: serial: cp210x: add ID for Link ECU
Linus Torvalds [Sat, 7 May 2016 15:27:35 +0000 (08:27 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"These are a number of updates to fix a few problems found in the ARM
nommu code over the last couple of years, caused mostly by changes on
the mmu side"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8573/1: domain: move {set,get}_domain under config guard
ARM: 8572/1: nommu: change memory reserve for the vectors
ARM: 8571/1: nommu: fix PMSAv7 setup
Linus Torvalds [Sat, 7 May 2016 15:17:45 +0000 (08:17 -0700)]
Merge tag 'media/v4.6-5' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- deadlock fixes on driver probe at exynos4-is and s43-camif drivers
- a build breakage if media controller is enabled and USB or PCI is
built as module.
* tag 'media/v4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media-device: fix builds when USB or PCI is compiled as module
[media] media: s3c-camif: fix deadlock on driver probe()
[media] media: exynos4-is: fix deadlock on driver probe
Linus Torvalds [Sat, 7 May 2016 15:13:42 +0000 (08:13 -0700)]
Merge branch 'for-4.6-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"An ahci driver addition and updates to ahci port enable handling for
some platform devices"
* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: add AMD Seattle platform driver
ARM: dts: apq8064: add ahci ports-implemented mask
ata: ahci-platform: Add ports-implemented DT bindings.
libahci: save port map for forced port map
Linus Torvalds [Sat, 7 May 2016 15:10:08 +0000 (08:10 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fix from Doug Ledford:
"Fix for max sector calculation in iSER"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/iser: Fix max_sectors calculation
Thomas Gleixner [Fri, 6 May 2016 18:48:16 +0000 (20:48 +0200)]
x86/topology: Handle CPUID bogosity gracefully
Joseph reported that a XEN guest dies with a division by 0 in the package
topology setup code. This happens if cpu_info.x86_max_cores is zero.
Handle that case and emit a warning. This does not fix the underlying XEN bug,
but makes the code more robust.
Reported-and-tested-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1605062046270.3540@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Rafael J. Wysocki [Fri, 6 May 2016 12:58:43 +0000 (14:58 +0200)]
sched/fair: Fix !CONFIG_SMP kernel cpufreq governor breakage
The following commit:
34e2c555f3e1 ("cpufreq: Add mechanism for registering utilization update callbacks")
overlooked the fact that update_load_avg(), where CFS invokes cpufreq
utilization update callbacks, becomes an empty stub on UP kernels.
In consequence, if !CONFIG_SMP, cpufreq governors are never invoked
from CFS and they do not have a chance to evaluate CPU performace
levels and update them often enough.
Needless to say, things don't work as expected then.
Fix the problem by making the !CONFIG_SMP stub of update_load_avg()
invoke cpufreq update callbacks too.
Reported-by: Steve Muckle <steve.muckle@linaro.org>
Tested-by: Steve Muckle <steve.muckle@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Steve Muckle <steve.muckle@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux PM list <linux-pm@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Fixes:
34e2c555f3e1 (cpufreq: Add mechanism for registering utilization update callbacks)
Link: http://lkml.kernel.org/r/6282396.VVEdgVYxO3@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ido Schimmel [Fri, 6 May 2016 20:18:40 +0000 (22:18 +0200)]
mlxsw: spectrum: Add missing rollback in flood configuration
When we fail to set the flooding configuration for the broadcast and
unregistered multicast traffic, we should revert the flooding
configuration of the unknown unicast traffic.
Fixes:
0293038e0c36 ("mlxsw: spectrum: Add support for flood control")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 6 May 2016 20:18:39 +0000 (22:18 +0200)]
mlxsw: spectrum: Fix rollback order in LAG join failure
Make the leave procedure in the error path symmetric to the join
procedure and first remove the port from the collector before
potentially destroying the LAG.
Fixes:
0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Tue, 3 May 2016 23:10:21 +0000 (16:10 -0700)]
udp_offload: Set encapsulation before inner completes.
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them. Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done. This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.
This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Tue, 3 May 2016 23:10:20 +0000 (16:10 -0700)]
udp_tunnel: Remove redundant udp_tunnel_gro_complete().
The setting of the UDP tunnel GSO type is already performed by
udp[46]_gro_complete().
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 6 May 2016 20:08:35 +0000 (13:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull writeback fix from Jens Axboe:
"Just a single fix for domain aware writeback, fixing a regression that
can cause balance_dirty_pages() to keep looping while not getting any
work done"
* 'for-linus' of git://git.kernel.dk/linux-block:
writeback: Fix performance regression in wb_over_bg_thresh()
Linus Torvalds [Fri, 6 May 2016 19:59:27 +0000 (12:59 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"This contains two fixes: a boot fix for older SGI/UV systems, and an
APIC calibration fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init
Sudarsana Reddy Kalluru [Thu, 5 May 2016 04:35:16 +0000 (00:35 -0400)]
qede: prevent chip hang when increasing channels
qede requires qed to provide enough resources to accommodate 16 combined
channels, but that upper-bound isn't actually being enforced by it.
Instead, qed inform back to qede how many channels can be opened based on
available resources - but that calculation doesn't really take into account
the resources requested by qede; Instead it considers other FW/HW available
resources.
As a result, if a user would increase the number of channels to more than
16 [e.g., using ethtool] the chip would hang.
This change increments the resources requested by qede to 64 combined
channels instead of 16; This value is an upper bound on the possible
available channels [due to other FW/HW resources].
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 5 May 2016 04:26:08 +0000 (21:26 -0700)]
net: ipv6: tcp reset, icmp need to consider L3 domain
Responses for packets to unused ports are getting lost with L3 domains.
IPv4 has ip_send_unicast_reply for sending TCP responses which accounts
for L3 domains; update the IPv6 counterpart tcp_v6_send_response.
For icmp the L3 master check needs to be moved up in icmp6_send
to properly respond to UDP packets to a port with no listener.
Fixes:
ca254490c8df ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 6 May 2016 18:58:45 +0000 (11:58 -0700)]
Merge tag 'pm+acpi-4.6-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for problems introduced or discovered recently (intel_pstate,
sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
generic device properties framework) and one fix for a hotplug-related
deadlock in ACPICA that's been there forever, but is nasty enough.
Specifics:
- Fix for a recent regression in the intel_pstate driver causing it
to fail to restore the HWP (HW-managed P-states) configuration of
the boot CPU after suspend-to-RAM (Rafael Wysocki).
- Fix for two recent regressions in the intel_pstate driver, one that
can trigger a divide by zero if the driver is accessed via sysfs
before it manages to take the first sample and one causing it to
fail to update a structure field used in a trace point, so the
information coming from it is less useful (Rafael Wysocki).
- Fix for a problem in the sti-cpufreq driver introduced during the
4.5 cycle that causes it to break CPU PM in multi-platform kernels
by registering cpufreq-dt (which subsequently doesn't work)
unconditionally and preventing the driver that would actually work
from registering (Sudeep Holla).
- Stable-candidate fix for an ARM64 cpuidle issue causing idle state
usage counters to be incorrectly updated for idle states that were
not entered due to errors (James Morse).
- Fix for a recently introduced issue in the OPP (Operating
Performance Points) framework causing it to print bogus error
messages for missing optional regulators (Viresh Kumar).
- Fix for a recently introduced issue in the generic device
properties framework that may cause it to attempt to dereferece and
invalid pointer in some cases (Heikki Krogerus).
- Fix for a deadlock in the ACPICA core that may be triggered by
device (eg Thunderbolt) hotplug (Prarit Bhargava)"
* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / OPP: Remove useless check
ACPICA: Dispatcher: Update thread ID for recursive method calls
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
device property: Avoid potential dereferences of invalid pointers
Linus Torvalds [Fri, 6 May 2016 18:53:27 +0000 (11:53 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"This contains a single fix that fixes a nohz tick stopping bug when
mixed-poliocy SCHED_FIFO and SCHED_RR tasks are present on a runqueue"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
Linus Torvalds [Fri, 6 May 2016 18:40:24 +0000 (11:40 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This tree contains two fixes: new Intel CPU model numbers and an
AMD/iommu uncore PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
perf/x86: Add model numbers for Kabylake CPUs
Linus Torvalds [Fri, 6 May 2016 18:33:02 +0000 (11:33 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"This tree contains three fixes: a console spam fix, a file pattern fix
and a sysfb_efi fix for a bug that triggered on older ThinkPads"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sysfb_efi: Fix valid BAR address range check
x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
MAINTAINERS: Remove asterisk from EFI directory names
Linus Torvalds [Fri, 6 May 2016 18:27:05 +0000 (11:27 -0700)]
Merge branch 'parisc-4.6-5' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
"Patch from Dmitry V Levin to fix a kernel crash when a straced process
calls the (invalid) syscall which is equal to value of __NR_Linux_syscalls"
* 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Linus Torvalds [Fri, 6 May 2016 18:14:38 +0000 (11:14 -0700)]
Merge tag 'arc-4.6-rc7-fixes' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Late in the cycle, but this has fixes for couple of issues: a PAE40
boot crash and Arnd spotting lack of barriers in BE io-accessors.
The 3rd patch for enabling highmem in low physical mem ;-) honestly is
more than a "fix" but its been in works for some time, seems to be
stable in testing and enables 2 of our customers to go forward with
4.6 kernel.
- Fix for PTE truncation in PAE40 builds
- Fix for big endian IO accessors lacking IO barrier
- Allow HIGHMEM to work with low physical addresses"
* tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: support HIGHMEM even without PAE40
ARC: Fix PAE40 boot failures due to PTE truncation
ARC: Add missing io barriers to io{read,write}{16,32}be()
Linus Torvalds [Fri, 6 May 2016 18:05:07 +0000 (11:05 -0700)]
Merge tag 'powerpc-4.6-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Fix bad inline asm constraint in create_zero_mask() from Anton
Blanchard"
* tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Fix bad inline asm constraint in create_zero_mask()
Linus Torvalds [Fri, 6 May 2016 17:59:53 +0000 (10:59 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Fixes for i915, amdgpu/radeon and imx.
The IMX fix is for an autoloading regression found in Fedora. The
radeon fixes, are the same fix to amdgpu/radeon to avoid a hardware
lockup in some circumstances with a bad mode, and a double free bug I
took a few hours chasing down the other morning.
The i915 fixes are across the board, all stable material, and fixing
some hangs and suspend/resume issues, along with a live status
regressions"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
drm/amdgpu: make sure vertical front porch is at least 1
drm/radeon: make sure vertical front porch is at least 1
drm/amdgpu: set metadata pointer to NULL after freeing.
drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
drm/i915: Fake HDMI live status
drm/i915: Fix eDP low vswing for Broadwell
drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
drm/i915: Fix system resume if PCI device remained enabled
drm/i915: Avoid stalling on pending flips for legacy cursor updates
Mark Brown [Fri, 6 May 2016 17:20:37 +0000 (18:20 +0100)]
Merge remote-tracking branches 'spi/fix/fsl-dspi', 'spi/fix/omap2-mcspi', 'spi/fix/pxa2xx' and 'spi/fix/ti-qspi' into spi-linus
Dan Williams [Fri, 6 May 2016 17:20:10 +0000 (10:20 -0700)]
libnvdimm, pfn: fix ARCH=alpha allmodconfig build failure
I had relied on the kbuild robot for cross build coverage, however it
only builds alpha_defconfig. Switch from HPAGE_SIZE to PMD_SIZE, which
is more widely defined.
Fixes:
658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing")
Cc: <stable@vger.kernel.org>
Reported-by: Guenter Roeck <guenter@roeck-us.net>
Tested-by: Guenter Roeck <guenter@roeck-us.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Lüssing [Wed, 4 May 2016 15:25:02 +0000 (17:25 +0200)]
bridge: fix igmp / mld query parsing
With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.
The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.
If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.
Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.
Fixes:
9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry V. Levin [Wed, 27 Apr 2016 01:56:11 +0000 (04:56 +0300)]
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Do not load one entry beyond the end of the syscall table when the
syscall number of a traced process equals to __NR_Linux_syscalls.
Similar bug with regular processes was fixed by commit
3bb457af4fa8
("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").
This bug was found by strace test suite.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Kalle Valo [Fri, 6 May 2016 11:27:48 +0000 (14:27 +0300)]
Merge tag 'iwlwifi-for-kalle-2016-05-04' of https://git./linux/kernel/git/iwlwifi/iwlwifi-fixes
* fix P2P rates (and possibly other issues)
Rafael J. Wysocki [Fri, 6 May 2016 11:16:22 +0000 (13:16 +0200)]
Merge branches 'pm-opp-fixes', 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes'
* pm-opp-fixes:
PM / OPP: Remove useless check
* pm-cpufreq-fixes:
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
* pm-cpuidle-fixes:
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
Rafael J. Wysocki [Fri, 6 May 2016 11:15:52 +0000 (13:15 +0200)]
Merge branches 'acpica-fixes' and 'device-properties-fixes'
* acpica-fixes:
ACPICA: Dispatcher: Update thread ID for recursive method calls
* device-properties-fixes:
device property: Avoid potential dereferences of invalid pointers
Chen Yu [Fri, 6 May 2016 03:33:39 +0000 (11:33 +0800)]
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
(35.5), the ratio bits are bit 8-15.
Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
TSC calibration and the Local APIC timer frequency to be incorrect.
Fix this problem by masking 0xff instead.
[ tglx: Massaged changelog ]
Fixes:
7da7c1561366 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Bin Gao <bin.gao@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Fri, 6 May 2016 03:48:35 +0000 (20:48 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
byteswap: try to avoid __builtin_constant_p gcc bug
lib/stackdepot: avoid to return 0 handle
mm: fix kcompactd hang during memory offlining
modpost: fix module autoloading for OF devices with generic compatible property
proc: prevent accessing /proc/<PID>/environ until it's ready
mm/zswap: provide unique zpool name
mm: thp: kvm: fix memory corruption in KVM with THP enabled
MAINTAINERS: fix Rajendra Nayak's address
mm, cma: prevent nr_isolated_* counters from going negative
mm: update min_free_kbytes from khugepaged after core initialization
huge pagecache: mmap_sem is unlocked when truncation splits pmd
rapidio/mport_cdev: fix uapi type definitions
mm: memcontrol: let v2 cgroups follow changes in system swappiness
mm: thp: correct split_huge_pages file permission
Nikolay Aleksandrov [Wed, 4 May 2016 14:18:45 +0000 (16:18 +0200)]
net: bridge: fix old ioctl unlocked net device walk
get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O
4.6.0-rc4+ #157
[ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[ 957.423009]
0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[ 957.423009]
ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[ 957.423009]
00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[ 957.423009] Call Trace:
[ 957.423009] [<
ffffffff8138dec5>] dump_stack+0x85/0xc0
[ 957.423009] [<
ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[ 957.423009] [<
ffffffff81515beb>] sock_ioctl+0x22b/0x290
[ 957.423009] [<
ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[ 957.423009] [<
ffffffff8126c159>] SyS_ioctl+0x79/0x90
[ 957.423009] [<
ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>