Daniel Borkmann [Wed, 22 Jul 2015 14:31:49 +0000 (16:31 +0200)]
net: sctp: stop spamming klog with rfc6458, 5.3.2. deprecation warnings
Back then when we added support for SCTP_SNDINFO/SCTP_RCVINFO from
RFC6458 5.3.4/5.3.5, we decided to add a deprecation warning for the
(as per RFC deprecated) SCTP_SNDRCV via commit
bbbea41d5e53 ("net:
sctp: deprecate rfc6458, 5.3.2. SCTP_SNDRCV support"), see [1].
Imho, it was not a good idea, and we should just revert that message
for a couple of reasons:
1) It's uapi and therefore set in stone forever.
2) To be able to run on older and newer kernels, an SCTP application
would need to probe for both, SCTP_SNDRCV, but also SCTP_SNDINFO/
SCTP_RCVINFO support, so that on older kernels, it can make use
of SCTP_SNDRCV, and on newer kernels SCTP_SNDINFO/SCTP_RCVINFO.
In my (limited) experience, a lot of SCTP appliances are migrating
to newer kernels only ve(ee)ry slowly.
3) Some people don't have the chance to change their applications,
f.e. due to proprietary legacy stuff. So, they'll hit this warning
in fast path and are stuck with older kernels.
But i.e. due to point 1) I really fail to see the benefit of a warning.
So just revert that for now, the issue was reported up Jamal.
[1] http://thread.gmane.org/gmane.linux.network/321960/
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Michael Tuexen <tuexen@fh-muenster.de>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 26 Jul 2015 23:29:26 +0000 (16:29 -0700)]
Merge branch 'mlx4-fixes'
Or Gerlitz says:
====================
mlx4 driver fixes, July 22nd 2015
Just few mlx4 fixes.. the fix related to propagating port state
changes to VF should go to -stable of >= 3.11, all the rest just
to 4.2-rc
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Shamay [Wed, 22 Jul 2015 13:53:49 +0000 (16:53 +0300)]
net/mlx4_en: Remove BUG_ON assert when checking if ring is full
In mlx4_en_is_ring_empty we check if ring surpassed its size.
Since the prod and cons indicators are u32, there might be a state where
prod wrapped around and cons, making this assert false, although no
actual bug exists (other code segment can cope with this state).
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 22 Jul 2015 13:53:48 +0000 (16:53 +0300)]
net/mlx4_core: Relieve cpu load average on the port sending flow
When a port is not attached, the FW requires a longer than usual time to
execute the SENSE_PORT command. In the command flow, the
wait_for_completion_timeout call used in mlx4_cmd_wait puts the kernel
thread into the uninterruptible state during this time. This, in turn,
due to the computation method, causes the CPU load average to increase.
Fix this by using wait_for_completion_interruptible_timeout() for the
SENSE_PORT command, which puts the thread in the interruptible state.
In this state, the thread does not contribute to the CPU load average.
Treat the interrupted case as if the SENSE_PORT command returned
port_type = NONE.
Fix suggested by Gideon Naim <gideonn@mellanox.com> and
Bart Van Assche <bart.vanassche@sandisk.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Wed, 22 Jul 2015 13:53:47 +0000 (16:53 +0300)]
net/mlx4_core: Fix wrong index in propagating port change event to VFs
The port-change event processing in procedure mlx4_eq_int() uses "slave"
as the vf_oper array index. Since the value of "slave" is the PF function
index, the result is that the PF link state is used for deciding to
propagate the event for all the VFs. The VF link state should be used,
so the VF function index should be used here.
Fixes:
948e306d7d64 ('net/mlx4: Add VF link state support')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Wed, 22 Jul 2015 13:53:46 +0000 (16:53 +0300)]
net/mlx4_core: Use sink counter for the VF default as fallback
Some old PF drivers don't let VFs allocate counters, in that case, use
the sink counter so the VF can load and operate properly.
Fixes:
6de5f7f6a1fa ('net/mlx4_core: Allocate default counter per port')
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Wed, 22 Jul 2015 11:03:40 +0000 (13:03 +0200)]
bridge: netlink: fix slave_changelink/br_setport race conditions
Since slave_changelink support was added there have been a few race
conditions when using br_setport() since some of the port functions it
uses require the bridge lock. It is very easy to trigger a lockup due to
some internal spin_lock() usage without bh disabled, also it's possible to
get the bridge into an inconsistent state.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes:
3ac636b8591c ("bridge: implement rtnl_link_ops->slave_changelink")
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Jul 2015 07:18:10 +0000 (00:18 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains ten Netfilter/IPVS fixes, they are:
1) Address refcount leak when creating an expectation from the ctnetlink
interface.
2) Fix bug splat in the IDLETIMER target related to sysfs, from Dmitry
Torokhov.
3) Resolve panic for unreachable route in IPVS with locally generated
traffic in the output path, from Alex Gartrell.
4) Fix wrong source address in rare cases for tunneled traffic in IPVS,
from Julian Anastasov.
5) Fix crash if scheduler is changed via ipvsadm -E, again from Julian.
6) Make sure skb->sk is unset for forwarded traffic through IPVS, again from
Alex Gartrell.
7) Fix crash with IPVS sync protocol v0 and FTP, from Julian.
8) Reset sender cpu for forwarded traffic in IPVS, also from Julian.
9) Allocate template conntracks through kmalloc() to resolve netns dependency
problems with the conntrack kmem_cache.
10) Fix zones with expectations that clash using the same tuple, from Joe
Stringer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Wed, 22 Jul 2015 09:23:20 +0000 (12:23 +0300)]
cgroup: net_cls: fix false-positive "suspicious RCU usage"
In dev_queue_xmit() net_cls protected with rcu-bh.
[ 270.730026] ===============================
[ 270.730029] [ INFO: suspicious RCU usage. ]
[ 270.730033] 4.2.0-rc3+ #2 Not tainted
[ 270.730036] -------------------------------
[ 270.730040] include/linux/cgroup.h:353 suspicious rcu_dereference_check() usage!
[ 270.730041] other info that might help us debug this:
[ 270.730043] rcu_scheduler_active = 1, debug_locks = 1
[ 270.730045] 2 locks held by dhclient/748:
[ 270.730046] #0: (rcu_read_lock_bh){......}, at: [<
ffffffff81682b70>] __dev_queue_xmit+0x50/0x960
[ 270.730085] #1: (&qdisc_tx_lock){+.....}, at: [<
ffffffff81682d60>] __dev_queue_xmit+0x240/0x960
[ 270.730090] stack backtrace:
[ 270.730096] CPU: 0 PID: 748 Comm: dhclient Not tainted 4.2.0-rc3+ #2
[ 270.730098] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 270.730100]
0000000000000001 ffff8800bafeba58 ffffffff817ad487 0000000000000007
[ 270.730103]
ffff880232a0a780 ffff8800bafeba88 ffffffff810ca4f2 ffff88022fb23e00
[ 270.730105]
ffff880232a0a780 ffff8800bafebb68 ffff8800bafebb68 ffff8800bafebaa8
[ 270.730108] Call Trace:
[ 270.730121] [<
ffffffff817ad487>] dump_stack+0x4c/0x65
[ 270.730148] [<
ffffffff810ca4f2>] lockdep_rcu_suspicious+0xe2/0x120
[ 270.730153] [<
ffffffff816a62d2>] task_cls_state+0x92/0xa0
[ 270.730158] [<
ffffffffa00b534f>] cls_cgroup_classify+0x4f/0x120 [cls_cgroup]
[ 270.730164] [<
ffffffff816aac74>] tc_classify_compat+0x74/0xc0
[ 270.730166] [<
ffffffff816ab573>] tc_classify+0x33/0x90
[ 270.730170] [<
ffffffffa00bcb0a>] htb_enqueue+0xaa/0x4a0 [sch_htb]
[ 270.730172] [<
ffffffff81682e26>] __dev_queue_xmit+0x306/0x960
[ 270.730174] [<
ffffffff81682b70>] ? __dev_queue_xmit+0x50/0x960
[ 270.730176] [<
ffffffff816834a3>] dev_queue_xmit_sk+0x13/0x20
[ 270.730185] [<
ffffffff81787770>] dev_queue_xmit+0x10/0x20
[ 270.730187] [<
ffffffff8178b91c>] packet_snd.isra.62+0x54c/0x760
[ 270.730190] [<
ffffffff8178be25>] packet_sendmsg+0x2f5/0x3f0
[ 270.730203] [<
ffffffff81665245>] ? sock_def_readable+0x5/0x190
[ 270.730210] [<
ffffffff817b64bb>] ? _raw_spin_unlock+0x2b/0x40
[ 270.730216] [<
ffffffff8173bcbc>] ? unix_dgram_sendmsg+0x5cc/0x640
[ 270.730219] [<
ffffffff8165f367>] sock_sendmsg+0x47/0x50
[ 270.730221] [<
ffffffff8165f42f>] sock_write_iter+0x7f/0xd0
[ 270.730232] [<
ffffffff811fd4c7>] __vfs_write+0xa7/0xf0
[ 270.730234] [<
ffffffff811fe5b8>] vfs_write+0xb8/0x190
[ 270.730236] [<
ffffffff811fe8c2>] SyS_write+0x52/0xb0
[ 270.730239] [<
ffffffff817b6bae>] entry_SYSCALL_64_fastpath+0x12/0x76
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Tue, 21 Jul 2015 23:52:43 +0000 (16:52 -0700)]
sch_choke: drop all packets in queue during reset
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Tue, 21 Jul 2015 23:31:53 +0000 (16:31 -0700)]
sch_plug: purge buffered packets during reset
Otherwise the skbuff related structures are not correctly
refcount'ed.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Jul 2015 05:46:23 +0000 (22:46 -0700)]
Merge branch 'fib_select_default-fixes'
Julian Anastasov says:
====================
ipv4: fib_select_default changes
This patchset contains 2 changes for the alternative routes,
one to add tb_id/fa_slen check needed after the recent
fib_trie optimizations for fib aliases and the second
change attempts to support alternative routes with TOS
requirement.
Sorry that I don't have access to the original
report from Hagen Paul Pfeifer. I hope he will see this
change.
The second change adds fa_default field to the
fib aliases (which can be many) and if the feature to
filter the alternative routes by TOS is not worth it,
this second patch can be scrapped.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Wed, 22 Jul 2015 07:43:23 +0000 (10:43 +0300)]
ipv4: consider TOS in fib_select_default
fib_select_default considers alternative routes only when
res->fi is for the first alias in res->fa_head. In the
common case this can happen only when the initial lookup
matches the first alias with highest TOS value. This
prevents the alternative routes to require specific TOS.
This patch solves the problem as follows:
- routes that require specific TOS should be returned by
fib_select_default only when TOS matches, as already done
in fib_table_lookup. This rule implies that depending on the
TOS we can have many different lists of alternative gateways
and we have to keep the last used gateway (fa_default) in first
alias for the TOS instead of using single tb_default value.
- as the aliases are ordered by many keys (TOS desc,
fib_priority asc), we restrict the possible results to
routes with matching TOS and lowest metric (fib_priority)
and routes that match any TOS, again with lowest metric.
For example, packet with TOS 8 can not use gw3 (not lowest
metric), gw4 (different TOS) and gw6 (not lowest metric),
all other gateways can be used:
tos 8 via gw1 metric 2 <--- res->fa_head and res->fi
tos 8 via gw2 metric 2
tos 8 via gw3 metric 3
tos 4 via gw4
tos 0 via gw5
tos 0 via gw6 metric 1
Reported-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Wed, 22 Jul 2015 07:43:22 +0000 (10:43 +0300)]
ipv4: fib_select_default should match the prefix
fib_trie starting from 4.1 can link fib aliases from
different prefixes in same list. Make sure the alternative
gateways are in same table and for same prefix (0) by
checking tb_id and fa_slen.
Fixes:
79e5ad2ceb00 ("fib_trie: Remove leaf_info")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 22 Jul 2015 21:45:25 +0000 (14:45 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Don't use shared bluetooth antenna in iwlwifi driver for management
frames, from Emmanuel Grumbach.
2) Fix device ID check in ath9k driver, from Felix Fietkau.
3) Off by one in xen-netback BUG checks, from Dan Carpenter.
4) Fix IFLA_VF_PORT netlink attribute validation, from Daniel Borkmann.
5) Fix races in setting peeked bit flag in SKBs during datagram
receive. If it's shared we have to clone it otherwise the value can
easily be corrupted. Fix from Herbert Xu.
6) Revert fec clock handling change, causes regressions. From Fabio
Estevam.
7) Fix use after free in fq_codel and sfq packet schedulers, from WANG
Cong.
8) ipvlan bug fixes (memory leaks, missing rcu_dereference_bh, etc.)
from WANG Cong and Konstantin Khlebnikov.
9) Memory leak in act_bpf packet action, from Alexei Starovoitov.
10) ARM bpf JIT bug fixes from Nicolas Schichan.
11) Fix backwards compat of ANY_LAYOUT in virtio_net driver, from
Michael S Tsirkin.
12) Destruction of bond with different ARP header types not handled
correctly, fix from Nikolay Aleksandrov.
13) Revert GRO receive support in ipv6 SIT tunnel driver, causes
regressions because the GRO packets created cannot be processed
properly on the GSO side if we forward the frame. From Herbert Xu.
14) TCCR update race and other fixes to ravb driver from Sergei
Shtylyov.
15) Fix SKB leaks in caif_queue_rcv_skb(), from Eric Dumazet.
16) Fix panics on packet scheduler filter replace, from Daniel Borkmann.
17) Make sure AF_PACKET sees properly IP headers in defragmented frames
(via PACKET_FANOUT_FLAG_DEFRAG option), from Edward Hyunkoo Jee.
18) AF_NETLINK cannot hold mutex in RCU callback, fix from Florian
Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (84 commits)
ravb: fix ring memory allocation
net: phy: dp83867: Fix warning check for setting the internal delay
openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes
netlink: don't hold mutex in rcu callback when releasing mmapd ring
ARM: net: fix vlan access instructions in ARM JIT.
ARM: net: handle negative offsets in BPF JIT.
ARM: net: fix condition for load_order > 0 when translating load instructions.
tcp: suppress a division by zero warning
drivers: net: cpsw: remove tx event processing in rx napi poll
inet: frags: fix defragmented packet's IP header for af_packet
net: mvneta: fix refilling for Rx DMA buffers
stmmac: fix setting of driver data in stmmac_dvr_probe
sched: cls_flow: fix panic on filter replace
sched: cls_flower: fix panic on filter replace
sched: cls_bpf: fix panic on filter replace
net/mdio: fix mdio_bus_match for c45 PHY
net: ratelimit warnings about dst entry refcount underflow or overflow
caif: fix leaks and race in caif_queue_rcv_skb()
qmi_wwan: add the second QMI/network interface for Sierra Wireless MC7305/MC7355
ravb: fix race updating TCCR
...
Linus Torvalds [Wed, 22 Jul 2015 15:52:42 +0000 (08:52 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull ARM64 fixes from Catalin Marinas:
- arm64 build fix following the move of the thread_struct to the end of
task_struct and the asm offsets becoming too large for the AArch64
ISA
- preparatory patch for moving irq_data struct members (applied now to
reduce dependency for the next merging window)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
ARM64/irq: Use access helper irq_data_get_affinity_mask()
arm64: switch_to: calculate cpu context pointer using separate register
Joe Stringer [Wed, 22 Jul 2015 04:37:31 +0000 (21:37 -0700)]
netfilter: nf_conntrack: Support expectations in different zones
When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.
Fixes:
5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jiang Liu [Mon, 13 Jul 2015 20:30:04 +0000 (20:30 +0000)]
ARM64/irq: Use access helper irq_data_get_affinity_mask()
This is a preparatory patch for moving irq_data struct members.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Will Deacon [Mon, 20 Jul 2015 14:14:53 +0000 (15:14 +0100)]
arm64: switch_to: calculate cpu context pointer using separate register
Commit
0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
moved the thread_struct to the bottom of task_struct. As a result, the
offset is now too large to be used in an immediate add on arm64 with
some kernel configs:
arch/arm64/kernel/entry.S: Assembler messages:
arch/arm64/kernel/entry.S:588: Error: immediate out of range
arch/arm64/kernel/entry.S:597: Error: immediate out of range
This patch calculates the offset using an additional register instead of
an immediate offset.
Fixes:
0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Sergei Shtylyov [Tue, 21 Jul 2015 22:31:59 +0000 (01:31 +0300)]
ravb: fix ring memory allocation
The driver is written as if it can adapt to a low memory situation allocating
less RX skbs and TX aligned buffers than the respective RX/TX ring sizes. In
reality though the driver would malfunction in this case. Stop being overly
smart and just fail in such situation -- this is achieved by moving the memory
allocation from ravb_ring_format() to ravb_ring_init().
We leave dma_map_single() calls in place but make their failure non-fatal
by marking the corresponding RX descriptors with zero data size which should
prevent DMA to an invalid addresses.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Murphy [Tue, 21 Jul 2015 17:06:45 +0000 (12:06 -0500)]
net: phy: dp83867: Fix warning check for setting the internal delay
Fix warning: logical ‘or’ of collectively exhaustive tests is always true
Change the internal delay check from an 'or' condition to an 'and'
condition.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris J Arges [Tue, 21 Jul 2015 17:36:33 +0000 (12:36 -0500)]
openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes
Some architectures like POWER can have a NUMA node_possible_map that
contains sparse entries. This causes memory corruption with openvswitch
since it allocates flow_cache with a multiple of num_possible_nodes() and
assumes the node variable returned by for_each_node will index into
flow->stats[node].
Use nr_node_ids to allocate a maximal sparse array instead of
num_possible_nodes().
The crash was noticed after
3af229f2 was applied as it changed the
node_possible_map to match node_online_map on boot.
Fixes:
3af229f2071f5b5cb31664be6109561fbe19c861
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Tue, 21 Jul 2015 14:33:50 +0000 (16:33 +0200)]
netlink: don't hold mutex in rcu callback when releasing mmapd ring
Kirill A. Shutemov says:
This simple test-case trigers few locking asserts in kernel:
int main(int argc, char **argv)
{
unsigned int block_size = 16 * 4096;
struct nl_mmap_req req = {
.nm_block_size = block_size,
.nm_block_nr = 64,
.nm_frame_size = 16384,
.nm_frame_nr = 64 * block_size / 16384,
};
unsigned int ring_size;
int fd;
fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
exit(1);
if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
exit(1);
ring_size = req.nm_block_nr * req.nm_block_size;
mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
return 0;
}
+++ exited with 0 +++
BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
3 locks held by init/1:
#0: (reboot_mutex){+.+...}, at: [<
ffffffff81080959>] SyS_reboot+0xa9/0x220
#1: ((reboot_notifier_list).rwsem){.+.+..}, at: [<
ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70
#2: (rcu_callback){......}, at: [<
ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0
Preemption disabled at:[<
ffffffff8145365f>] __delay+0xf/0x20
CPU: 1 PID: 1 Comm: init Not tainted
4.1.0-00009-gbddf4c4818e0 #253
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
Call Trace:
<IRQ> [<
ffffffff81929ceb>] dump_stack+0x4f/0x7b
[<
ffffffff81085a9d>] ___might_sleep+0x16d/0x270
[<
ffffffff81085bed>] __might_sleep+0x4d/0x90
[<
ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430
[<
ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
[<
ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20
[<
ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350
[<
ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70
[<
ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150
[<
ffffffff817e484d>] __sk_free+0x1d/0x160
[<
ffffffff817e49a9>] sk_free+0x19/0x20
[..]
Cong Wang says:
We can't hold mutex lock in a rcu callback, [..]
Thomas Graf says:
The socket should be dead at this point. It might be simpler to
add a netlink_release_ring() function which doesn't require
locking at all.
Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Diagnosed-by: Cong Wang <cwang@twopensource.com>
Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 22 Jul 2015 05:19:55 +0000 (22:19 -0700)]
Merge branch 'arm-bpf-fixes'
Nicolas Schichan says:
====================
BPF JIT fixes for ARM
These patches are fixing bugs in the ARM JIT and should probably find
their way to a stable kernel. All 60 test_bpf tests in Linux 4.1 release
are now passing OK (was 54 out of 60 before).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Schichan [Tue, 21 Jul 2015 12:14:14 +0000 (14:14 +0200)]
ARM: net: fix vlan access instructions in ARM JIT.
This makes BPF_ANC | SKF_AD_VLAN_TAG and BPF_ANC | SKF_AD_VLAN_TAG_PRESENT
have the same behaviour as the in kernel VM and makes the test_bpf LD_VLAN_TAG
and LD_VLAN_TAG_PRESENT tests pass.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Schichan [Tue, 21 Jul 2015 12:14:13 +0000 (14:14 +0200)]
ARM: net: handle negative offsets in BPF JIT.
Previously, the JIT would reject negative offsets known during code
generation and mishandle negative offsets provided at runtime.
Fix that by calling bpf_internal_load_pointer_neg_helper()
appropriately in the jit_get_skb_{b,h,w} slow path helpers and by forcing
the execution flow to the slow path helpers when the offset is
negative.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Schichan [Tue, 21 Jul 2015 12:14:12 +0000 (14:14 +0200)]
ARM: net: fix condition for load_order > 0 when translating load instructions.
To check whether the load should take the fast path or not, the code
would check that (r_skb_hlen - load_order) is greater than the offset
of the access using an "Unsigned higher or same" condition. For
halfword accesses and an skb length of 1 at offset 0, that test is
valid, as we end up comparing 0xffffffff(-1) and 0, so the fast path
is taken and the filter allows the load to wrongly succeed. A similar
issue exists for word loads at offset 0 and an skb length of less than
4.
Fix that by using the condition "Signed greater than or equal"
condition for the fast path code for load orders greater than 0.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 22 Jul 2015 05:02:00 +0000 (07:02 +0200)]
tcp: suppress a division by zero warning
Andrew Morton reported following warning on one ARM build
with gcc-4.4 :
net/ipv4/inet_hashtables.c: In function 'inet_ehash_locks_alloc':
net/ipv4/inet_hashtables.c:617: warning: division by zero
Even guarded with a test on sizeof(spinlock_t), compiler does not
like current construct on a !CONFIG_SMP build.
Remove the warning by using a temporary variable.
Fixes:
095dc8e0c368 ("tcp: fix/cleanup inet_ehash_locks_alloc()")
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 21 Jul 2015 23:06:53 +0000 (16:06 -0700)]
Revert "fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()"
This reverts commit
a2673b6e040663bf16a552f8619e6bde9f4b9acf.
Kinglong Mee reports a memory leak with that patch, and Jan Kara confirms:
"Thanks for report! You are right that my patch introduces a race
between fsnotify kthread and fsnotify_destroy_group() which can result
in leaking inotify event on group destruction.
I haven't yet decided whether the right fix is not to queue events for
dying notification group (as that is pointless anyway) or whether we
should just fix the original problem differently... Whenever I look
at fsnotify code mark handling I get lost in the maze of locks, lists,
and subtle differences between how different notification systems
handle notification marks :( I'll think about it over night"
and after thinking about it, Jan says:
"OK, I have looked into the code some more and I found another
relatively simple way of fixing the original oops. It will be IMHO
better than trying to fixup this issue which has more potential for
breakage. I'll ask Linus to revert the fsnotify fix he already merged
and send a new fix"
Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Requested-by: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Tue, 21 Jul 2015 23:06:39 +0000 (16:06 -0700)]
Merge tag 'wireless-drivers-for-davem-2015-07-20' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
ath9k:
* fix device ID check for AR956x
iwlwifi:
* bug fixes specific for 8000 series
* fix a crash in time events
* fix a crash in PCIe transport
* fix BT Coex code that prevented association on certain
devices (3160).
* revert the new RBD allocation model because it introduced
a bug when running on weak VM setups.
* new device IDs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 21 Jul 2015 22:27:27 +0000 (15:27 -0700)]
Merge tag 'pinctrl-v4.2-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here are some overly ripe pin control fixes for the v4.2 series.
They got delayed because of various crap commits and having to clean
and rinse the patch stack a few times. Now they are however looking
good.
- some dead defines dropped from the Samsung driver, was targeted for
-rc2 but got delayed
- drop the strict mode from abx500, this was too strict
- fix the R-Car sparse IRQs code to work as intended
- fix the IRQ code for the pinctrl-single GPIO backend to not enforce
threaded IRQs
- clear the latched events/IRQs for the Broadcom BCM2835 driver
- fix up debugfs for the Freescale imx1 driver
- fix a typo bug in the Schmitt Trigger setup in the LPC18xx driver"
* tag 'pinctrl-v4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: lpc18xx: fix schmitt trigger setup
Subject: pinctrl: imx1-core: Fix debug output in .pin_config_set callback
pinctrl: bcm2835: Clear the event latch register when disabling interrupts
pinctrl: single: ensure pcs irq will not be forced threaded
sh-pfc: fix sparse GPIOs for R-Car SoCs
pinctrl: abx500: remove strict mode
pinctrl: samsung: Remove old unused defines
Linus Torvalds [Tue, 21 Jul 2015 22:18:06 +0000 (15:18 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs
Pull UDF fix from Jan Kara:
"A fix for UDF corruption when certain disk-format feature is enabled"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Don't corrupt unalloc spacetable when writing it
Linus Torvalds [Tue, 21 Jul 2015 21:42:40 +0000 (14:42 -0700)]
Merge tag 'trace-v4.2-rc2-fix2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing sample code fix from Steven Rostedt:
"He Kuang noticed that the sample code using the trace_event helper
function __get_dynamic_array_len() is broken.
This only changes the sample code, and I'm pushing this now instead of
later because I don't want others using the broken code as an example
when using it for real"
* tag 'trace-v4.2-rc2-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix sample output of dynamic arrays
Mugunthan V N [Tue, 21 Jul 2015 10:30:42 +0000 (16:00 +0530)]
drivers: net: cpsw: remove tx event processing in rx napi poll
With commit
c03abd84634d ("net: ethernet: cpsw: don't requests IRQs
we don't use") common isr and napi are separated into separate tx isr
and rx isr/napi, but still in rx napi tx events are handled. So removing
the tx event handling in rx napi.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Hyunkoo Jee [Tue, 21 Jul 2015 07:43:59 +0000 (09:43 +0200)]
inet: frags: fix defragmented packet's IP header for af_packet
When ip_frag_queue() computes positions, it assumes that the passed
sk_buff does not contain L2 headers.
However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
functions can be called on outgoing packets that contain L2 headers.
Also, IPv4 checksum is not corrected after reassembly.
Fixes:
7736d33f4262 ("packet: Add pre-defragmentation support for ipv4 fanouts.")
Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Guinot [Sun, 19 Jul 2015 11:00:53 +0000 (13:00 +0200)]
net: mvneta: fix refilling for Rx DMA buffers
With the actual code, if a memory allocation error happens while
refilling a Rx descriptor, then the original Rx buffer is both passed
to the networking stack (in a SKB) and let in the Rx ring. This leads
to various kernel oops and crashes.
As a fix, this patch moves Rx descriptor refilling ahead of building
SKB with the associated Rx buffer. In case of a memory allocation
failure, data is dropped and the original DMA buffer is put back into
the Rx ring.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes:
c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Tested-by: Yoann Sculo <yoann@sculo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joachim Eastwood [Fri, 17 Jul 2015 21:48:17 +0000 (23:48 +0200)]
stmmac: fix setting of driver data in stmmac_dvr_probe
Commit
803f8fc46274b ("stmmac: move driver data setting into
stmmac_dvr_probe") mistakenly set priv and not priv->dev as
driver data. This meant that the remove, resume and suspend
callbacks that fetched and tried to use this data would most
likely explode. Fix the issue by using the correct variable.
Fixes:
803f8fc46274b ("stmmac: move driver data setting into stmmac_dvr_probe")
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 21 Jul 2015 07:25:03 +0000 (00:25 -0700)]
Merge branch 'sch_panic'
Daniel Borkmann says:
====================
Couple of classifier fixes
This fixes a couple of panics in the form of (analogous for
cls_flow{,er}):
[ 912.759276] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 912.759373] IP: [<
ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[ 912.759441] PGD
8783c067 PUD
5f684067 PMD 0
[ 912.759491] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 912.759543] Modules linked in: cls_bpf(E) act_gact [...]
[ 912.772734] CPU: 3 PID: 10489 Comm: tc Tainted: G W E 4.2.0-rc2+ #73
[ 912.775004] Hardware name: Apple Inc. MacBookAir5,1/Mac-
66F35F19FE2A0D05, BIOS MBA51.88Z.00EF.B02.
1211271028 11/27/2012
[ 912.777327] task:
ffff88025eaa8000 ti:
ffff88005f734000 task.ti:
ffff88005f734000
[ 912.779662] RIP: 0010:[<
ffffffffa09d4d6d>] [<
ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[ 912.781991] RSP: 0018:
ffff88005f7379c8 EFLAGS:
00010286
[ 912.784183] RAX:
ffff880201d64e48 RBX:
0000000000000000 RCX:
ffff880201d64e40
[ 912.786402] RDX:
0000000000000000 RSI:
ffffffffa09d51c0 RDI:
ffffffffa09d51a6
[ 912.788625] RBP:
ffff88005f737a68 R08:
0000000000000000 R09:
0000000000000000
[ 912.790854] R10:
0000000000000001 R11:
0000000000000001 R12:
ffff880078ab5a80
[ 912.793082] R13:
ffff880232b31570 R14:
ffff88005f737ae0 R15:
ffff8801e215d1d0
[ 912.795181] FS:
00007f3c0c80d740(0000) GS:
ffff880265400000(0000) knlGS:
0000000000000000
[ 912.797281] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 912.799402] CR2:
0000000000000000 CR3:
000000005460f000 CR4:
00000000001407e0
[ 912.799403] Stack:
[ 912.799407]
ffffffff00000000 ffff88023ea18000 000000005f737a08 0000000000000000
[ 912.799415]
ffffffff81f06140 ffff880201d64e40 0000000000000000 ffff88023ea1804c
[ 912.799418]
0000000000000000 ffff88023ea18044 ffff88023ea18030 ffff88023ea18038
[ 912.799418] Call Trace:
[ 912.799437] [<
ffffffff816d5685>] tc_ctl_tfilter+0x335/0x910
[ 912.799443] [<
ffffffff813622a8>] ? security_capable+0x48/0x60
[ 912.799448] [<
ffffffff816b90e5>] rtnetlink_rcv_msg+0x95/0x240
[ 912.799454] [<
ffffffff810f612d>] ? trace_hardirqs_on+0xd/0x10
[ 912.799456] [<
ffffffff816b902f>] ? rtnetlink_rcv+0x1f/0x40
[ 912.799459] [<
ffffffff816b902f>] ? rtnetlink_rcv+0x1f/0x40
[ 912.799461] [<
ffffffff816b9050>] ? rtnetlink_rcv+0x40/0x40
[ 912.799464] [<
ffffffff816df38f>] netlink_rcv_skb+0xaf/0xc0
[ 912.799467] [<
ffffffff816b903e>] rtnetlink_rcv+0x2e/0x40
[ 912.799469] [<
ffffffff816deaef>] netlink_unicast+0xef/0x1b0
[ 912.799471] [<
ffffffff816defa0>] netlink_sendmsg+0x3f0/0x620
[ 912.799476] [<
ffffffff81687028>] sock_sendmsg+0x38/0x50
[ 912.799479] [<
ffffffff81687938>] ___sys_sendmsg+0x288/0x290
[ 912.799482] [<
ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
[ 912.799488] [<
ffffffff810265db>] ? native_sched_clock+0x2b/0x90
[ 912.799493] [<
ffffffff8116135f>] ? __audit_syscall_entry+0xaf/0x100
[ 912.799497] [<
ffffffff8116135f>] ? __audit_syscall_entry+0xaf/0x100
[ 912.799501] [<
ffffffff8112aa19>] ? current_kernel_time+0x69/0xd0
[ 912.799505] [<
ffffffff81266f16>] ? __fget_light+0x66/0x90
[ 912.799508] [<
ffffffff81688812>] __sys_sendmsg+0x42/0x80
[ 912.799510] [<
ffffffff81688862>] SyS_sendmsg+0x12/0x20
[ 912.799515] [<
ffffffff817f9a6e>] entry_SYSCALL_64_fastpath+0x12/0x76
[ 912.799540] Code: 4d 88 49 8b 57 08 48 89 51 08 49 8b 57 10 48 89 c8 48 83 c0 08 48
89 51 10 48 8b 51 10 48 c7 c6 c0 51 9d a0 48 c7 c7 a6 51 9d a0 <48>
89 02 48 8b 51 08 48 89 42 08 48 b8 00 02 20 00 00 00 ad de
[ 912.799544] RIP [<
ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[ 912.799544] RSP <
ffff88005f7379c8>
[ 912.799545] CR2:
0000000000000000
[ 912.807380] ---[ end trace
a6440067cfdc7c29 ]---
I've split them into 3 patches, so they can be backported easier
when needed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 17 Jul 2015 20:38:45 +0000 (22:38 +0200)]
sched: cls_flow: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_flow:
tc filter add dev foo parent 1: handle 0x1 flow hash keys dst action ok
tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
flow hash keys mark action drop
To be more precise, actually two different panics are fixed, the first
occurs because tcf_exts_init() is not called on the newly allocated
filter when we do a replace. And the second panic uncovered after that
happens since the arguments of list_replace_rcu() are swapped, the old
element needs to be the first argument and the new element the second.
Fixes:
70da9f0bf999 ("net: sched: cls_flow use RCU")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 17 Jul 2015 20:38:44 +0000 (22:38 +0200)]
sched: cls_flower: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_flower:
tc filter add dev foo parent 1: flower eth_type ipv4 action ok flowid 1:1
tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
flower eth_type ipv6 action ok flowid 1:1
The problem is that commit
77b9900ef53a ("tc: introduce Flower classifier")
accidentally swapped the arguments of list_replace_rcu(), the old
element needs to be the first argument and the new element the second.
Fixes:
77b9900ef53a ("tc: introduce Flower classifier")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 17 Jul 2015 20:38:43 +0000 (22:38 +0200)]
sched: cls_bpf: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_bpf:
FOO="1,6 0 0
4294967295,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
bpf bytecode "$FOO" flowid 1:1 action drop
The problem is that commit
1f947bf151e9 ("net: sched: rcu'ify cls_bpf")
accidentally swapped the arguments of list_replace_rcu(), the old
element needs to be the first argument and the new element the second.
Fixes:
1f947bf151e9 ("net: sched: rcu'ify cls_bpf")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 21 Jul 2015 07:17:53 +0000 (00:17 -0700)]
Merge tag 'mac80211-for-davem-2015-07-17' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Some fixes for the current cycle:
1. Arik introduced an rtnl-locked regulatory API to be able
to differentiate between place do/don't have the RTNL;
this fixes missing locking in some of the code paths
2. Two small mesh bugfixes from Bob, one to avoid treating
a certain malformed over-the-air frame and one to avoid
sending a garbage field over the air.
3. A fix for powersave during WoWLAN suspend from Krishna Chaitanya.
4. A fix for a powersave vs. aggregation teardown race, from Michal.
5. Thomas reduced the loglevel of CRDA messages to avoid spamming
the kernel log with mostly irrelevant information.
6. Tom fixed a dangling debugfs directory pointer that could cause
crashes if subsequent addition of the same interface to debugfs
failed for some reason.
7. A fix from myself for a list corruption issue in mac80211 during
combined interface shutdown/removal - shut down interfaces first
and only then remove them to avoid that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shaohui Xie [Fri, 17 Jul 2015 10:07:19 +0000 (18:07 +0800)]
net/mdio: fix mdio_bus_match for c45 PHY
We store c45 PHY's id information in c45_ids, so it should be used to
check the matching between PHY driver and PHY device for c45 PHY.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Fri, 17 Jul 2015 11:01:11 +0000 (14:01 +0300)]
net: ratelimit warnings about dst entry refcount underflow or overflow
Kernel generates a lot of warnings when dst entry reference counter
overflows and becomes negative. That bug was seen several times at
machines with outdated 3.10.y kernels. Most like it's already fixed
in upstream. Anyway that flood completely kills machine and makes
further debugging impossible.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 Jul 2015 08:19:23 +0000 (10:19 +0200)]
caif: fix leaks and race in caif_queue_rcv_skb()
1) If sk_filter() is applied, skb was leaked (not freed)
2) Testing SOCK_DEAD twice is racy :
packet could be freed while already queued.
3) Remove obsolete comment about caching skb->len
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reinhard Speyerer [Thu, 16 Jul 2015 21:28:14 +0000 (23:28 +0200)]
qmi_wwan: add the second QMI/network interface for Sierra Wireless MC7305/MC7355
Sierra Wireless MC7305/MC7355 with USB ID 1199:9041 also provide a
second QMI/network interface like the MC73xx with USB ID 1199:68c0 on
USB interface #10 when used in the appropriate USB configuration.
Add the corresponding QMI_FIXED_INTF entry to the qmi_wwan driver.
Please note that the second QMI/network interface is not working for
early MC73xx firmware versions like 01.08.x as the device does not
respond to QMI messages on the second /dev/cdc-wdm port.
Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Thu, 16 Jul 2015 21:28:38 +0000 (00:28 +0300)]
ravb: fix race updating TCCR
The TCCR.TSRQn bit may get clearead after TCCR gets read, so that TCCR write
would get skipped. We don't need to check this bit before setting.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karicheri, Muralidharan [Thu, 16 Jul 2015 19:32:14 +0000 (15:32 -0400)]
net: netcp: fix improper initialization in netcp_ndo_open()
The keystone qmss will raise interrupt when packet arrive at the
receive queue. Only control available to avoid interrupt from happening
is to keep the free descriptor queue (FDQ) empty in the receive side.
So the filling of descriptors into the FDQ has to happen after
request_irq() call is made as part of knav_queue_enable_notify(). So
move the function netcp_rxpool_refill() after this call.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dingtianhong [Thu, 16 Jul 2015 08:30:02 +0000 (16:30 +0800)]
bonding: correct the MAC address for "follow" fail_over_mac policy
The "follow" fail_over_mac policy is useful for multiport devices that
either become confused or incur a performance penalty when multiple
ports are programmed with the same MAC address, but the same MAC
address still may happened by this steps for this policy:
1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
bond0 has the same mac address with eth0, it is MAC1.
2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
eth1 is backup, eth1 has MAC2.
3) ifconfig eth0 down
eth1 became active slave, bond will swap MAC for eth0 and eth1,
so eth1 has MAC1, and eth0 has MAC2.
4) ifconfig eth1 down
there is no active slave, and eth1 still has MAC1, eth2 has MAC2.
5) ifconfig eth0 up
the eth0 became active slave again, the bond set eth0 to MAC1.
Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
MAC address, it will break this policy for ACTIVE_BACKUP mode.
This patch will fix this problem by finding the old active slave and
swap them MAC address before change active slave.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 21 Jul 2015 03:25:59 +0000 (20:25 -0700)]
Merge tag 'linux-can-fixes-for-4.2-
20150716' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2015-07-16
this is a pull request of 2 patches by Stefan Agner. He fixes the resume
operation in the mcp251x driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 20 Jul 2015 09:55:38 +0000 (17:55 +0800)]
Revert "sit: Add gro callbacks to sit_offload"
This patch reverts
19424e052fb44da2f00d1a868cbb51f3e9f4bbb5 ("sit:
Add gro callbacks to sit_offload") because it generates packets
that cannot be handled even by our own GSO.
Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 15 Jul 2015 23:09:32 +0000 (16:09 -0700)]
net: dsa: bcm_sf2: do not use indirect reads and writes for 7445E0
7445E0 contains an ECO which disconnected the internal SF2 pseudo-PHY which was
known to conflict with the external pseudo-PHY of BCM53125 switches. This
motivated the need to utilize the internal SF2 MDIO controller via indirect
register reads/writes to control external Broadcom switches due to this address
conflict (both responded at address 30d).
For 7445E0, the internal pseudo-PHY of the SF2 switch got disconnected, and as
a consequence this prevents the internal SF2 MDIO bus controller from reading
data (reads back everything as 0) since the MDI line is tied low.
Fix this by making the indirect register reads and writes conditional to
7445D0, on 7445E0 we can utilize the SWITCH_MDIO controller (backed by
mdio-unimac and not the DSA created slave MII bus).
We utilize of_machine_is_compatible() here since this is the only way for use
to differentiate between these two chips in a way that does not violate layers
or becomes (too) vendor-specific.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Wed, 15 Jul 2015 20:57:01 +0000 (22:57 +0200)]
bonding: correctly handle bonding type change on enslave failure
If the bond is enslaving a device with different type it will be setup
by it, but if after being setup the enslave fails the bond doesn't
switch back its type and also keeps pointers to foreign structures that can
be long gone. Thus revert back any type changes if the enslave failed and
the bond had to change its type.
Example:
Before patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default
link/loopback 16:54:78:34:bd:41 brd 00:00:00:00:00:00
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff
(notice the MASTER flag is gone)
After patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
link/ether 6e:66:94:f6:07:fc brd ff:ff:ff:ff:ff:ff
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes:
e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Wed, 15 Jul 2015 19:52:51 +0000 (21:52 +0200)]
bonding: fix destruction of bond with devices different from arphrd_ether
When the bonding is being unloaded and the netdevice notifier is
unregistered it executes NETDEV_UNREGISTER for each device which should
remove the bond's proc entry but if the device enslaved is not of
ARPHRD_ETHER type and is in front of the bonding, it may execute
bond_release_and_destroy() first which would release the last slave and
destroy the bond device leaving the proc entry and thus we will get the
following error (with dynamic debug on for bond_netdev_event to see the
events order):
[ 908.963051] eql: event: 9
[ 908.963052] eql: IFF_SLAVE
[ 908.963054] eql: event: 2
[ 908.963056] eql: IFF_SLAVE
[ 908.963058] eql: event: 6
[ 908.963059] eql: IFF_SLAVE
[ 908.963110] bond0: Releasing active interface eql
[ 908.976168] bond0: Destroying bond bond0
[ 908.976266] bond0 (unregistering): Released all slaves
[ 908.984097] ------------[ cut here ]------------
[ 908.984107] WARNING: CPU: 0 PID: 1787 at fs/proc/generic.c:575
remove_proc_entry+0x112/0x160()
[ 908.984110] remove_proc_entry: removing non-empty directory
'net/bonding', leaking at least 'bond0'
[ 908.984111] Modules linked in: bonding(-) eql(O) 9p nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev qxl drm_kms_helper
snd_hda_codec_generic aesni_intel ttm aes_x86_64 glue_helper pcspkr lrw
gf128mul ablk_helper cryptd snd_hda_intel virtio_console snd_hda_codec
psmouse serio_raw snd_hwdep snd_hda_core 9pnet_virtio 9pnet evdev joydev
drm virtio_balloon snd_pcm snd_timer snd soundcore i2c_piix4 i2c_core
pvpanic acpi_cpufreq parport_pc parport processor thermal_sys button
autofs4 ext4 crc16 mbcache jbd2 hid_generic usbhid hid sg sr_mod cdrom
ata_generic virtio_blk virtio_net floppy ata_piix e1000 libata ehci_pci
virtio_pci scsi_mod uhci_hcd ehci_hcd virtio_ring virtio usbcore
usb_common [last unloaded: bonding]
[ 908.984168] CPU: 0 PID: 1787 Comm: rmmod Tainted: G W O
4.2.0-rc2+ #8
[ 908.984170] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 908.984172]
0000000000000000 ffffffff81732d41 ffffffff81525b34
ffff8800358dfda8
[ 908.984175]
ffffffff8106c521 ffff88003595af78 ffff88003595af40
ffff88003e3a4280
[ 908.984178]
ffffffffa058d040 0000000000000000 ffffffff8106c59a
ffffffff8172ebd0
[ 908.984181] Call Trace:
[ 908.984188] [<
ffffffff81525b34>] ? dump_stack+0x40/0x50
[ 908.984193] [<
ffffffff8106c521>] ? warn_slowpath_common+0x81/0xb0
[ 908.984196] [<
ffffffff8106c59a>] ? warn_slowpath_fmt+0x4a/0x50
[ 908.984199] [<
ffffffff81218352>] ? remove_proc_entry+0x112/0x160
[ 908.984205] [<
ffffffffa05850e6>] ? bond_destroy_proc_dir+0x26/0x30
[bonding]
[ 908.984208] [<
ffffffffa057540e>] ? bond_net_exit+0x8e/0xa0 [bonding]
[ 908.984217] [<
ffffffff8142f407>] ? ops_exit_list.isra.4+0x37/0x70
[ 908.984225] [<
ffffffff8142f52d>] ?
unregister_pernet_operations+0x8d/0xd0
[ 908.984228] [<
ffffffff8142f58d>] ?
unregister_pernet_subsys+0x1d/0x30
[ 908.984232] [<
ffffffffa0585269>] ? bonding_exit+0x23/0xdba [bonding]
[ 908.984236] [<
ffffffff810e28ba>] ? SyS_delete_module+0x18a/0x250
[ 908.984241] [<
ffffffff81086f99>] ? task_work_run+0x89/0xc0
[ 908.984244] [<
ffffffff8152b732>] ?
entry_SYSCALL_64_fastpath+0x16/0x75
[ 908.984247] ---[ end trace
7c006ed4abbef24b ]---
Thus remove the proc entry manually if bond_release_and_destroy() is
used. Because of the checks in bond_remove_proc_entry() it's not a
problem for a bond device to change namespaces (the bug fixed by the
Fixes commit) but since commit
f9399814927ad ("bonding: Don't allow bond devices to change network
namespaces.") that can't happen anyway.
Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes:
a64d49c3dd50 ("bonding: Manage /proc/net/bonding/ entries from
the netdev events")
Tested-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Wed, 15 Jul 2015 14:07:07 +0000 (10:07 -0400)]
net: dsa: mv88e6xxx: fix fid_mask when leaving bridge
The mv88e6xxx_priv_state structure contains an fid_mask, where 1 means
the FID is free to use, 0 means the FID is in use.
This patch fixes the bit clear in mv88e6xxx_leave_bridge() when
assigning a new FID to a port.
Example scenario: I have 7 ports, port 5 is CPU, port 6 is unused (no
PHY). After setting the ports 0, 1 and 2 in bridge br0, and ports 3 and
4 in bridge br1, I have the following fid_mask:
0b111110010110 (0xf96).
Indeed, br0 uses FID 0, and br1 uses FID 3.
After setting nomaster for port 0, I get the wrong fid_mask: 0b10 (0x2).
With this patch we correctly get
0b111110010100 (0xf94), meaning port 0
uses FID 1, br0 uses FID 0, and br1 uses FID 3.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Wed, 15 Jul 2015 12:26:19 +0000 (15:26 +0300)]
virtio_net: don't require ANY_LAYOUT with VERSION_1
ANY_LAYOUT is a compatibility feature. It's implied
for VERSION_1 devices, and non-transitional devices
might not offer it. Change code to behave accordingly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 20 Jul 2015 19:18:40 +0000 (12:18 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fix from Martin Schwidefsky:
"Fast path fix for the thread_struct breakage"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: adapt entry.S to the move of thread_struct
Linus Torvalds [Mon, 20 Jul 2015 19:17:55 +0000 (12:17 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 update from Hans-Christian Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
AVR32/time: Migrate to new 'set-state' interface
Pablo Neira Ayuso [Mon, 20 Jul 2015 13:00:28 +0000 (15:00 +0200)]
Merge tag 'ipvs-fixes-for-v4.2' of https://git./linux/kernel/git/horms/ipvs
Simon Horman says:
====================
IPVS Fixes for v4.2
please consider this fix for v4.2.
For reasons that are not clear to me it is a bumper crop.
It seems to me that they are all relevant to stable.
Please let me know if you need my help to get the fixes into stable.
* ipvs: fix ipv6 route unreach panic
This problem appears to be present since IPv6 support was added to
IPVS in v2.6.28.
* ipvs: skb_orphan in case of forwarding
This appears to resolve a problem resulting from a side effect of
41063e9dd119 ("ipv4: Early TCP socket demux.") which was included in v3.6.
* ipvs: do not use random local source address for tunnels
This appears to resolve a problem introduced by
026ace060dfe ("ipvs: optimize dst usage for real server") in v3.10.
* ipvs: fix crash if scheduler is changed
This appears to resolve a problem introduced by
ceec4c381681 ("ipvs: convert services to rcu") in v3.10.
Julian has provided backports of the fix:
* [PATCHv2 3.10.81] ipvs: fix crash if scheduler is changed
http://www.spinics.net/lists/lvs-devel/msg04008.html
* [PATCHv2 3.12.44,3.14.45,3.18.16,4.0.6] ipvs: fix crash if scheduler is changed
http://www.spinics.net/lists/lvs-devel/msg04007.html
Please let me know how you would like to handle guiding these
backports into stable.
* ipvs: fix crash with sync protocol v0 and FTP
This appears to resolve a problem introduced by
749c42b620a9 ("ipvs: reduce sync rate with time thresholds") in v3.5
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 13 Jul 2015 13:11:48 +0000 (15:11 +0200)]
netfilter: fix netns dependencies with conntrack templates
Quoting Daniel Borkmann:
"When adding connection tracking template rules to a netns, f.e. to
configure netfilter zones, the kernel will endlessly busy-loop as soon
as we try to delete the given netns in case there's at least one
template present, which is problematic i.e. if there is such bravery that
the priviledged user inside the netns is assumed untrusted.
Minimal example:
ip netns add foo
ip netns exec foo iptables -t raw -A PREROUTING -d 1.2.3.4 -j CT --zone 1
ip netns del foo
What happens is that when nf_ct_iterate_cleanup() is being called from
nf_conntrack_cleanup_net_list() for a provided netns, we always end up
with a net->ct.count > 0 and thus jump back to i_see_dead_people. We
don't get a soft-lockup as we still have a schedule() point, but the
serving CPU spins on 100% from that point onwards.
Since templates are normally allocated with nf_conntrack_alloc(), we
also bump net->ct.count. The issue why they are not yet nf_ct_put() is
because the per netns .exit() handler from x_tables (which would eventually
invoke xt_CT's xt_ct_tg_destroy() that drops reference on info->ct) is
called in the dependency chain at a *later* point in time than the per
netns .exit() handler for the connection tracker.
This is clearly a chicken'n'egg problem: after the connection tracker
.exit() handler, we've teared down all the connection tracking
infrastructure already, so rightfully, xt_ct_tg_destroy() cannot be
invoked at a later point in time during the netns cleanup, as that would
lead to a use-after-free. At the same time, we cannot make x_tables depend
on the connection tracker module, so that the xt_ct_tg_destroy() would
be invoked earlier in the cleanup chain."
Daniel confirms this has to do with the order in which modules are loaded or
having compiled nf_conntrack as modules while x_tables built-in. So we have no
guarantees regarding the order in which netns callbacks are executed.
Fix this by allocating the templates through kmalloc() from the respective
SYNPROXY and CT targets, so they don't depend on the conntrack kmem cache.
Then, release then via nf_ct_tmpl_free() from destroy_conntrack(). This branch
is marked as unlikely since conntrack templates are rarely allocated and only
from the configuration plane path.
Note that templates are not kept in any list to avoid further dependencies with
nf_conntrack anymore, thus, the tmpl larval list is removed.
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Martin Schwidefsky [Mon, 20 Jul 2015 08:01:46 +0000 (10:01 +0200)]
s390: adapt entry.S to the move of thread_struct
git commit
0c8c0f03e3a292e031596484275c14cf39c0ab7a
"x86/fpu, sched: Dynamically allocate 'struct fpu'"
moved the thread_struct to the end of the task_struct.
This causes some of the offsets used in entry.S to overflow their
instruction operand field. To fix this use aghi to create a
dedicated pointer for the thread_struct.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Viresh Kumar [Thu, 16 Jul 2015 11:26:15 +0000 (16:56 +0530)]
AVR32/time: Migrate to new 'set-state' interface
Migrate avr32 driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.
This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.
We want to call cpu_idle_poll_ctrl() in shutdown only if we were in
oneshot or resume state earlier. Create another variable to save this
information and check that in shutdown callback.
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Joachim Eastwood [Tue, 14 Jul 2015 22:25:26 +0000 (00:25 +0200)]
pinctrl: lpc18xx: fix schmitt trigger setup
The param_val variable is what determines if schmitt
trigger is enabled on a pin or not. A typo here mean
that schmitt trigger was always enabled for standard
and i2c pins.
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Uwe Kleine-König [Fri, 17 Jul 2015 07:38:43 +0000 (09:38 +0200)]
Subject: pinctrl: imx1-core: Fix debug output in .pin_config_set callback
imx1_pinconf_set assumes that the array of pins in struct
imx1_pinctrl_soc_info can be indexed by pin id to get the
pinctrl_pin_desc for a pin. This used to be correct up to commit
607af165c047 which removed some entries from the array and so made it
wrong to access the array by pin id.
The result of this bug is a wrong pin name in the output for small pin
ids and an oops for the bigger ones.
This patch is the result of a discussion that includes patches by Markus
Pargmann and Chris Ruehl.
Fixes:
607af165c047 ("pinctrl: i.MX27: Remove nonexistent pad definitions")
Cc: stable@vger.kernel.org
Reported-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Jonathan Bell [Tue, 30 Jun 2015 11:35:39 +0000 (12:35 +0100)]
pinctrl: bcm2835: Clear the event latch register when disabling interrupts
It's possible to hit a race condition if interrupts are generated on a GPIO
pin when the IRQ line in question is being disabled.
If the interrupt is freed, bcm2835_gpio_irq_disable() is called which
disables the event generation sources (edge, level). If an event occurred
between the last disabling of hard IRQs and the write to the event
source registers, a bit would be set in the GPIO event detect register
(GPEDSn) which goes unacknowledged by bcm2835_gpio_irq_handler()
so Linux complains loudly.
There is no per-GPIO mask register, so when disabling GPIO interrupts
write 1 to the relevant bit in GPEDSn to clear out any stale events.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Grygorii Strashko [Mon, 6 Jul 2015 15:13:37 +0000 (18:13 +0300)]
pinctrl: single: ensure pcs irq will not be forced threaded
The PSC IRQ is requested using request_irq() API and as result it can
be forced to be threaded IRQ in RT-Kernel if PCS_QUIRK_HAS_SHARED_IRQ
is enabled for pinctrl domain.
As result, following 'possible irq lock inversion dependency' report
can be seen:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.14.43-rt42-00360-g96ff499-dirty #24 Not tainted
---------------------------------------------------------
irq/369-pinctrl/927 just changed the state of lock:
(&pcs->lock){+.....}, at: [<
c0375b54>] pcs_irq_handle+0x48/0x9c
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&irq_desc_lock_class){-.....}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&pcs->lock);
local_irq_disable();
lock(&irq_desc_lock_class);
lock(&pcs->lock);
<Interrupt>
lock(&irq_desc_lock_class);
*** DEADLOCK ***
no locks held by irq/369-pinctrl/927.
the shortest dependencies between 2nd lock and 1st lock:
-> (&irq_desc_lock_class){-.....} ops: 58724 {
IN-HARDIRQ-W at:
[<
c0090040>] lock_acquire+0x9c/0x158
[<
c07065c8>] _raw_spin_lock+0x48/0x58
[<
c009edac>] handle_fasteoi_irq+0x24/0x15c
[<
c009abb0>] generic_handle_irq+0x3c/0x4c
[<
c000f83c>] handle_IRQ+0x50/0xa0
[<
c0008674>] gic_handle_irq+0x3c/0x6c
[<
c0707a04>] __irq_svc+0x44/0x8c
[<
c000fc44>] arch_cpu_idle+0x40/0x4c
[<
c009aadc>] cpu_startup_entry+0x270/0x2e0
[<
c06fcbf8>] rest_init+0xd4/0xe4
[<
c0a44bfc>] start_kernel+0x3d0/0x3dc
[<
80008084>] 0x80008084
INITIAL USE at:
[<
c0090040>] lock_acquire+0x9c/0x158
[<
c070674c>] _raw_spin_lock_irqsave+0x54/0x68
[<
c009aff8>] __irq_get_desc_lock+0x64/0xa4
[<
c009e38c>] irq_set_chip+0x30/0x78
[<
c009ec30>] irq_set_chip_and_handler_name+0x24/0x3c
[<
c036ca10>] gic_irq_domain_map+0x48/0xb4
[<
c00a0a80>] irq_domain_associate+0x84/0x1d4
[<
c00a1154>] irq_create_mapping+0x80/0x11c
[<
c00a1270>] irq_create_of_mapping+0x80/0x120
[<
c05cdaa8>] irq_of_parse_and_map+0x34/0x3c
[<
c0a4ea24>] omap_dm_timer_init_one+0x90/0x30c
[<
c0a4eef0>] omap5_realtime_timer_init+0x8c/0x48c
[<
c0a486b0>] time_init+0x28/0x38
[<
c0a44a6c>] start_kernel+0x240/0x3dc
[<
80008084>] 0x80008084
}
... key at: [<
c1049ce0>] irq_desc_lock_class+0x0/0x8
... acquired at:
[<
c07065c8>] _raw_spin_lock+0x48/0x58
[<
c0375a90>] pcs_irq_unmask+0x58/0xa0
[<
c009ea48>] irq_enable+0x38/0x48
[<
c009ead0>] irq_startup+0x78/0x7c
[<
c009d440>] __setup_irq+0x4a8/0x4f4
[<
c009d5dc>] request_threaded_irq+0xb8/0x138
[<
c0415a5c>] omap_8250_startup+0x4c/0x148
[<
c041276c>] serial8250_startup+0x24/0x30
[<
c040d0ec>] uart_startup.part.9+0x5c/0x1b4
[<
c040dbcc>] uart_open+0xf4/0x16c
[<
c03f0540>] tty_open+0x170/0x61c
[<
c0157028>] chrdev_open+0xbc/0x1b4
[<
c0150494>] do_dentry_open+0x1e8/0x2bc
[<
c0150a84>] finish_open+0x44/0x5c
[<
c0160d50>] do_last.isra.47+0x710/0xca0
[<
c01613a4>] path_openat+0xc4/0x640
[<
c0162904>] do_filp_open+0x3c/0x98
[<
c0151bdc>] do_sys_open+0x114/0x1d8
[<
c0151cc8>] SyS_open+0x28/0x2c
[<
c0a44d70>] kernel_init_freeable+0x168/0x1e4
[<
c06fcc24>] kernel_init+0x1c/0xf8
[<
c000eee8>] ret_from_fork+0x14/0x20
-> (&pcs->lock){+.....} ops: 65 {
HARDIRQ-ON-W at:
[<
c0090040>] lock_acquire+0x9c/0x158
[<
c07065c8>] _raw_spin_lock+0x48/0x58
[<
c0375b54>] pcs_irq_handle+0x48/0x9c
[<
c0375c5c>] pcs_irq_handler+0x1c/0x28
[<
c009c458>] irq_forced_thread_fn+0x30/0x74
[<
c009c784>] irq_thread+0x158/0x1c4
[<
c0063fc4>] kthread+0xd4/0xe8
[<
c000eee8>] ret_from_fork+0x14/0x20
INITIAL USE at:
[<
c0090040>] lock_acquire+0x9c/0x158
[<
c070674c>] _raw_spin_lock_irqsave+0x54/0x68
[<
c0375344>] pcs_enable+0x7c/0xe8
[<
c0372a44>] pinmux_enable_setting+0x178/0x220
[<
c036fecc>] pinctrl_select_state+0x110/0x194
[<
c04732dc>] pinctrl_bind_pins+0x7c/0x108
[<
c045853c>] driver_probe_device+0x70/0x254
[<
c0458810>] __driver_attach+0x9c/0xa0
[<
c045674c>] bus_for_each_dev+0x78/0xac
[<
c0458030>] driver_attach+0x2c/0x30
[<
c0457c78>] bus_add_driver+0x15c/0x204
[<
c0458ee0>] driver_register+0x88/0x108
[<
c045a168>] __platform_driver_register+0x64/0x6c
[<
c0a8170c>] omap_hsmmc_driver_init+0x1c/0x20
[<
c0008a94>] do_one_initcall+0x110/0x170
[<
c0a44d48>] kernel_init_freeable+0x140/0x1e4
[<
c06fcc24>] kernel_init+0x1c/0xf8
[<
c000eee8>] ret_from_fork+0x14/0x20
}
... key at: [<
c1088a8c>] __key.18572+0x0/0x8
... acquired at:
[<
c008cdd4>] mark_lock+0x388/0x76c
[<
c008df40>] __lock_acquire+0x6d0/0x1f98
[<
c0090040>] lock_acquire+0x9c/0x158
[<
c07065c8>] _raw_spin_lock+0x48/0x58
[<
c0375b54>] pcs_irq_handle+0x48/0x9c
[<
c0375c5c>] pcs_irq_handler+0x1c/0x28
[<
c009c458>] irq_forced_thread_fn+0x30/0x74
[<
c009c784>] irq_thread+0x158/0x1c4
[<
c0063fc4>] kthread+0xd4/0xe8
[<
c000eee8>] ret_from_fork+0x14/0x20
stack backtrace:
CPU: 1 PID: 927 Comm: irq/369-pinctrl Not tainted
3.14.43-rt42-00360-g96ff499-dirty #24
[<
c00177e0>] (unwind_backtrace) from [<
c00130b0>] (show_stack+0x20/0x24)
[<
c00130b0>] (show_stack) from [<
c0702958>] (dump_stack+0x84/0xd0)
[<
c0702958>] (dump_stack) from [<
c008bcfc>] (print_irq_inversion_bug+0x1d0/0x21c)
[<
c008bcfc>] (print_irq_inversion_bug) from [<
c008bf18>] (check_usage_backwards+0xb4/0x11c)
[<
c008bf18>] (check_usage_backwards) from [<
c008cdd4>] (mark_lock+0x388/0x76c)
[<
c008cdd4>] (mark_lock) from [<
c008df40>] (__lock_acquire+0x6d0/0x1f98)
[<
c008df40>] (__lock_acquire) from [<
c0090040>] (lock_acquire+0x9c/0x158)
[<
c0090040>] (lock_acquire) from [<
c07065c8>] (_raw_spin_lock+0x48/0x58)
[<
c07065c8>] (_raw_spin_lock) from [<
c0375b54>] (pcs_irq_handle+0x48/0x9c)
[<
c0375b54>] (pcs_irq_handle) from [<
c0375c5c>] (pcs_irq_handler+0x1c/0x28)
[<
c0375c5c>] (pcs_irq_handler) from [<
c009c458>] (irq_forced_thread_fn+0x30/0x74)
[<
c009c458>] (irq_forced_thread_fn) from [<
c009c784>] (irq_thread+0x158/0x1c4)
[<
c009c784>] (irq_thread) from [<
c0063fc4>] (kthread+0xd4/0xe8)
[<
c0063fc4>] (kthread) from [<
c000eee8>] (ret_from_fork+0x14/0x20)
To fix it use IRQF_NO_THREAD to ensure that pcs irq will not be forced threaded.
Cc: Tony Lindgren <tony@atomide.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Sergei Shtylyov [Thu, 25 Jun 2015 22:40:56 +0000 (01:40 +0300)]
sh-pfc: fix sparse GPIOs for R-Car SoCs
The PFC driver causes the kernel to hang on the R-Car gen2 SoC based boards
when the CPU_ALL_PORT() macro is fixed to reflect the reality, i.e. when the
GPIO space becomes actually sparse. This happens because the _GP_GPIO() macro
includes an indexed initializer which causes the "holes" (array entries filled
with all 0s) between the groups of the existing GPIOs; and the driver can't
cope with that. There seems to be no reason to use the indexed initializer,
so we can remove the index specifier and so avoid the "holes".
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Wed, 8 Jul 2015 13:12:06 +0000 (15:12 +0200)]
pinctrl: abx500: remove strict mode
Commit
a21763a0b1e5a5ab8310f581886d04beadc16616
"pinctrl: nomadik: activate strict mux mode"
put all Nomadik pin controllers to strict mode. This was
not good on the Snowball platform: the muxing of GPIOs to
different pins is done with hogs in the DTS file, and then
these GPIOs are used by offset, relying on hogs to mux the
pins. Since that means the pin controller "owns" the pins
and at the same time we have a GPIO user, this pin controller
is by definition not strict.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Torvalds [Sun, 19 Jul 2015 21:45:02 +0000 (14:45 -0700)]
Linux 4.2-rc3
Linus Torvalds [Sun, 19 Jul 2015 21:18:00 +0000 (14:18 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fairly simple fixes: one is a change that causes us to have a very
low queue depth leading to performance issues and the other is a null
deref occasionally in tapes thanks to use after put"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fix host max depth checking for the 'queue_depth' sysfs interface
st: null pointer dereference panic caused by use after kref_put by st_open
Linus Torvalds [Sun, 19 Jul 2015 21:12:22 +0000 (14:12 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2.
Things are looking quite decent at this stage but the recent work on
the FPU support took its toll:
- fix an incorrect overly restrictive ifdef
- select O32 64-bit FP support for O32 binary compatibility
- remove workarounds for Sibyte SB1250 Pass1 parts. There are rare
fixing the workarounds is not worth the effort.
- patch up an outdated and now incorrect comment"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: fpu.h: Allow 64-bit FPU on a 64-bit MIPS R6 CPU
MIPS: SB1: Remove support for Pass 1 parts.
MIPS: Require O32 FP64 support for MIPS64 with O32 compat
MIPS: asm-offset.c: Patch up various comments refering to the old filename.
Linus Torvalds [Sun, 19 Jul 2015 20:46:24 +0000 (13:46 -0700)]
Merge branch 'parisc-4.2-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
"A memory leak fix from Christophe Jaillet which was introduced with
kernel 4.0 and which leads to kernel crashes on parisc after 1-3 days"
* 'parisc-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: mm: Fix a memory leak related to pmd not attached to the pgd
Linus Torvalds [Sun, 19 Jul 2015 20:37:44 +0000 (13:37 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"By far most of the fixes here are updates to DTS files to deal with
some mostly minor bugs.
There's also a fix to deal with non-PM kernel configs on i.MX, a
regression fix for ethernet on PXA platforms and a dependency fix for
OMAP"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: keystone: dts: rename pcie nodes to help override status
ARM: keystone: dts: fix dt bindings for PCIe
ARM: pxa: fix dm9000 platform data regression
ARM: dts: Correct audio input route & set mic bias for am335x-pepper
ARM: OMAP2+: Add HAVE_ARM_SCU for AM43XX
MAINTAINERS: digicolor: add dts files
ARM: ux500: fix MMC/SD card regression
ARM: ux500: define serial port aliases
ARM: dts: OMAP5: Add #iommu-cells property to IOMMUs
ARM: dts: OMAP4: Add #iommu-cells property to IOMMUs
ARM: dts: Fix frequency scaling on Gumstix Pepper
ARM: dts: configure regulators for Gumstix Pepper
ARM: dts: omap3: overo: Update LCD panel names
ARM: dts: cros-ec-keyboard: Add support for some Japanese keys
ARM: imx6: gpc: always enable PU domain if CONFIG_PM is not set
ARM: dts: imx53-qsb: fix TVE entry
ARM: dts: mx23: fix iio-hwmon support
ARM: dts: imx27: Adjust the GPT compatible string
ARM: socfpga: dts: Fix entries order
ARM: socfpga: dts: Fix adxl34x formating and compatible string
Markos Chandras [Thu, 16 Jul 2015 14:30:04 +0000 (15:30 +0100)]
MIPS: fpu.h: Allow 64-bit FPU on a 64-bit MIPS R6 CPU
Commit
6134d94923d0 ("MIPS: asm: fpu: Allow 64-bit FPU on MIPS32 R6")
added support for 64-bit FPU on a 32-bit MIPS R6 processor but it missed
the 64-bit CPU case leading to FPU failures when requesting FR=1 mode
(which is always the case for MIPS R6 userland) when running a 32-bit
kernel on a 64-bit CPU. We also fix the MIPS R2 case.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes:
6134d94923d0 ("MIPS: asm: fpu: Allow 64-bit FPU on MIPS32 R6")
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Cc: <stable@vger.kernel.org> # 4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10734/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Christophe Jaillet [Mon, 13 Jul 2015 09:32:43 +0000 (11:32 +0200)]
parisc: mm: Fix a memory leak related to pmd not attached to the pgd
Commit
0e0da48dee8d ("parisc: mm: don't count preallocated pmds")
introduced a memory leak.
After this commit, the 'return' statement in pmd_free is executed in all
cases. Even for pmd that are not attached to the pgd. So 'free_pages'
can never be called anymore, leading to a memory leak.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Olof Johansson [Sun, 19 Jul 2015 04:06:10 +0000 (21:06 -0700)]
Merge tag 'pxa-fixes-v4.2-rc2' of https://github.com/rjarzmik/linux into fixesD
Merge "pxa fixes for v4.2" from Robert Jarzmik:
ARM: pxa: fixes for v4.2-rc2
This single fix reenables ethernet cards for several pxa boards,
broken by regulator addition to dm9000 driver.
* tag 'pxa-fixes-v4.2-rc2' of https://github.com/rjarzmik/linux:
ARM: pxa: fix dm9000 platform data regression
Linus Torvalds [Sat, 18 Jul 2015 18:03:48 +0000 (11:03 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A small set of ARM fixes for -rc3, most of them not far off
one-liners, with the exception of fixing the V7 cache invalidation for
incoming SMP processors which was causing problems for SoCFPGA
devices"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: fix __virt_to_idmap build error on !MMU
ARM: invalidate L1 before enabling coherency
ARM: 8404/1: dma-mapping: fix off-by-one error in bitmap size check
ARM: 8402/1: perf: Don't use of_node after putting it
ARM: 8400/1: use virt_to_idmap to get phys_reset address
Linus Torvalds [Sat, 18 Jul 2015 17:49:57 +0000 (10:49 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two families of fixes:
- Fix an FPU context related boot crash on newer x86 hardware with
larger context sizes than what most people test. To fix this
without ugly kludges or extensive reverts we had to touch core task
allocator, to allow x86 to determine the task size dynamically, at
boot time.
I've tested it on a number of x86 platforms, and I cross-built it
to a handful of architectures:
(warns) (warns)
testing x86-64: -git: pass ( 0), -tip: pass ( 0)
testing x86-32: -git: pass ( 0), -tip: pass ( 0)
testing arm: -git: pass ( 1359), -tip: pass ( 1359)
testing cris: -git: pass ( 1031), -tip: pass ( 1031)
testing m32r: -git: pass ( 1135), -tip: pass ( 1135)
testing m68k: -git: pass ( 1471), -tip: pass ( 1471)
testing mips: -git: pass ( 1162), -tip: pass ( 1162)
testing mn10300: -git: pass ( 1058), -tip: pass ( 1058)
testing parisc: -git: pass ( 1846), -tip: pass ( 1846)
testing sparc: -git: pass ( 1185), -tip: pass ( 1185)
... so I hope the cross-arch impact 'none', as intended.
(by Dave Hansen)
- Fix various NMI handling related bugs unearthed by the big asm code
rewrite and generally make the NMI code more robust and more
maintainable while at it. These changes are a bit late in the
cycle, I hope they are still acceptable.
(by Andy Lutomirski)"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86
x86/fpu, sched: Dynamically allocate 'struct fpu'
x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing code
x86/nmi/64: Make the "NMI executing" variable more consistent
x86/nmi/64: Minor asm simplification
x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection
x86/nmi/64: Reorder nested NMI checks
x86/nmi/64: Improve nested NMI comments
x86/nmi/64: Switch stacks on userspace NMI entry
x86/nmi/64: Remove asm code that saves CR2
x86/nmi: Enable nested do_nmi() handling for 64-bit kernels
Linus Torvalds [Sat, 18 Jul 2015 17:49:11 +0000 (10:49 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix for a misplaced export that can cause build failures in certain
(rare) Kconfig situations"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: Move the export of tick_broadcast_oneshot_control to the proper place
Linus Torvalds [Sat, 18 Jul 2015 17:47:44 +0000 (10:47 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"A oneliner rq throttling fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Test list head instead of list entry in throttle_cfs_rq()
Linus Torvalds [Sat, 18 Jul 2015 17:44:21 +0000 (10:44 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, plus a static key fix fixing /sys/devices/cpu/rdpmc"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Really allow to specify custom CC, AR or LD
perf auxtrace: Fix misplaced check for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
perf hists browser: Take the --comm, --dsos, etc filters into account
perf symbols: Store if there is a filter in place
x86, perf: Fix static_key bug in load_mm_cr4()
tools: Copy lib/hweight.c from the kernel sources
perf tools: Fix the detached tarball wrt rbtree copy
perf thread_map: Fix the sizeof() calculation for map entries
tools lib: Improve clean target
perf stat: Fix shadow declaration of close
perf tools: Fix lockup using 32-bit compat vdso
Linus Torvalds [Sat, 18 Jul 2015 17:27:12 +0000 (10:27 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Misc irq fixes:
- two driver fixes
- a Xen regression fix
- a nested irq thread crash fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gicv3-its: Fix mapping of LPIs to collections
genirq: Prevent resend to interrupts marked IRQ_NESTED_THREAD
genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now
gpio/davinci: Fix race in installing chained irq handler
Linus Torvalds [Sat, 18 Jul 2015 17:01:04 +0000 (10:01 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"25 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits)
lib/decompress: set the compressor name to NULL on error
mm/cma_debug: correct size input to bitmap function
mm/cma_debug: fix debugging alloc/free interface
mm/page_owner: set correct gfp_mask on page_owner
mm/page_owner: fix possible access violation
fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
/proc/$PID/cmdline: fixup empty ARGV case
dma-debug: skip debug_dma_assert_idle() when disabled
hexdump: fix for non-aligned buffers
checkpatch: fix long line messages about patch context
mm: clean up per architecture MM hook header files
MAINTAINERS: uclinux-h8-devel is moderated for non-subscribers
mailmap: update Sudeep Holla's email id
Update Viresh Kumar's email address
mm, meminit: suppress unused memory variable warning
configfs: fix kernel infoleak through user-controlled format string
include, lib: add __printf attributes to several function prototypes
s390/hugetlb: add hugepages_supported define
mm: hugetlb: allow hugepages_supported to be architecture specific
revert "s390/mm: make hugepages_supported a boot time decision"
...
Linus Torvalds [Sat, 18 Jul 2015 04:46:57 +0000 (21:46 -0700)]
Merge branch 'for-linus-4.2' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are all from Filipe, and cover a few problems we've had reported
on the list recently (along with ones he found on his own)"
* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix file corruption after cloning inline extents
Btrfs: fix order by which delayed references are run
Btrfs: fix list transaction->pending_ordered corruption
Btrfs: fix memory leak in the extent_same ioctl
Btrfs: fix shrinking truncate when the no_holes feature is enabled
Linus Torvalds [Sat, 18 Jul 2015 04:24:31 +0000 (21:24 -0700)]
Merge tag 'rtc-v4.2-2' of git://git./linux/kernel/git/abelloni/linux
Pull rtc fixes from Alexandre Belloni:
"A few fixes for the RTC susbsystem for 4.2.
The mt6397 driver was introduce in 4.2 so it is worth fixing before
the final release. I though the compilation warning for armada38x was
fixed by akpm in commit
f98b733e93e0 ("rtc-armada38x.c: remove unused
local `flags'") but he actually missed some occurrences of the
variables. Since I received 4 patches for that, I think we can
include it now.
Summary:
- fix mt6397 wakealarm creation
- remove a compilation warning for armada38x that was forgotten"
* tag 'rtc-v4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: armada38x: Remove unused variable from armada38x_rtc_set_time()
rtc: mt6397: enable wakeup before registering rtc device
Linus Torvalds [Sat, 18 Jul 2015 03:53:57 +0000 (20:53 -0700)]
Merge tag 'dm-4.2-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- revert a request-based DM core change that caused IO latency to
increase and adversely impact both throughput and system load
- fix for a use after free bug in DM core's device cleanup
- a couple DM btree removal fixes (used by dm-thinp)
- a DM thinp fix for order-5 allocation failure
- a DM thinp fix to not degrade to read-only metadata mode when in
out-of-data-space mode for longer than the 'no_space_timeout'
- fix a long-standing oversight in both dm-thinp and dm-cache by now
exporting 'needs_check' in status if it was set in metadata
- fix an embarrassing dm-cache busy-loop that caused worker threads to
eat cpu even if no IO was actively being issued to the cache device
* tag 'dm-4.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: avoid calls to prealloc_free_structs() if possible
dm cache: avoid preallocation if no work in writeback_some_dirty_blocks()
dm cache: do not wake_worker() in free_migration()
dm cache: display 'needs_check' in status if it is set
dm thin: display 'needs_check' in status if it is set
dm thin: stay in out-of-data-space mode once no_space_timeout expires
dm: fix use after free crash due to incorrect cleanup sequence
Revert "dm: only run the queue on completion if congested or no requests pending"
dm btree: silence lockdep lock inversion in dm_btree_del()
dm thin: allocate the cell_sort_array dynamically
dm btree remove: fix bug in redistribute3
Ingo Molnar [Fri, 17 Jul 2015 10:28:12 +0000 (12:28 +0200)]
x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86
Don't burden architectures without dynamic task_struct sizing
with the overhead of dynamic sizing.
Also optimize the x86 code a bit by caching task_struct_size.
Acked-and-Tested-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1437128892-9831-3-git-send-email-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dave Hansen [Fri, 17 Jul 2015 10:28:11 +0000 (12:28 +0200)]
x86/fpu, sched: Dynamically allocate 'struct fpu'
The FPU rewrite removed the dynamic allocations of 'struct fpu'.
But, this potentially wastes massive amounts of memory (2k per
task on systems that do not have AVX-512 for instance).
Instead of having a separate slab, this patch just appends the
space that we need to the 'task_struct' which we dynamically
allocate already. This saves from doing an extra slab
allocation at fork().
The only real downside here is that we have to stick everything
and the end of the task_struct. But, I think the
BUILD_BUG_ON()s I stuck in there should keep that from being too
fragile.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1437128892-9831-2-git-send-email-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Aneesh Kumar K.V [Fri, 17 Jul 2015 23:24:26 +0000 (16:24 -0700)]
lib/decompress: set the compressor name to NULL on error
Without this we end up using the previous name of the compressor in the
loop in unpack_rootfs. For example we get errors like "compression
method gzip not configured" even when we have CONFIG_DECOMPRESS_GZIP
enabled.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 17 Jul 2015 23:24:23 +0000 (16:24 -0700)]
mm/cma_debug: correct size input to bitmap function
In CMA, 1 bit in bitmap means 1 << order_per_bits pages so size of
bitmap is cma->count >> order_per_bits rather than just cma->count.
This patch fixes it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Stefan Strogin <stefan.strogin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 17 Jul 2015 23:24:20 +0000 (16:24 -0700)]
mm/cma_debug: fix debugging alloc/free interface
CMA has alloc/free interface for debugging. It is intended that
alloc/free occurs in specific CMA region, but, currently, alloc/free
interface is on root dir due to the bug so we can't select CMA region
where alloc/free happens.
This patch fixes this problem by making alloc/free interface per CMA
region.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Stefan Strogin <stefan.strogin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 17 Jul 2015 23:24:18 +0000 (16:24 -0700)]
mm/page_owner: set correct gfp_mask on page_owner
Currently, we set wrong gfp_mask to page_owner info in case of isolated
freepage by compaction and split page. It causes incorrect mixed
pageblock report that we can get from '/proc/pagetypeinfo'. This metric
is really useful to measure fragmentation effect so should be accurate.
This patch fixes it by setting correct information.
Without this patch, after kernel build workload is finished, number of
mixed pageblock is 112 among roughly 210 movable pageblocks.
But, with this fix, output shows that mixed pageblock is just 57.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 17 Jul 2015 23:24:15 +0000 (16:24 -0700)]
mm/page_owner: fix possible access violation
When I tested my new patches, I found that page pointer which is used
for setting page_owner information is changed. This is because page
pointer is used to set new migratetype in loop. After this work, page
pointer could be out of bound. If this wrong pointer is used for
page_owner, access violation happens. Below is error message that I
got.
BUG: unable to handle kernel paging request at
0000000000b00018
IP: [<
ffffffff81025f30>] save_stack_address+0x30/0x40
PGD
1af2d067 PUD
166e0067 PMD 0
Oops: 0002 [#1] SMP
...snip...
Call Trace:
print_context_stack+0xcf/0x100
dump_trace+0x15f/0x320
save_stack_trace+0x2f/0x50
__set_page_owner+0x46/0x70
__isolate_free_page+0x1f7/0x210
split_free_page+0x21/0xb0
isolate_freepages_block+0x1e2/0x410
compaction_alloc+0x22d/0x2d0
migrate_pages+0x289/0x8b0
compact_zone+0x409/0x880
compact_zone_order+0x6d/0x90
try_to_compact_pages+0x110/0x210
__alloc_pages_direct_compact+0x3d/0xe6
__alloc_pages_nodemask+0x6cd/0x9a0
alloc_pages_current+0x91/0x100
runtest_store+0x296/0xa50
simple_attr_write+0xbd/0xe0
__vfs_write+0x28/0xf0
vfs_write+0xa9/0x1b0
SyS_write+0x46/0xb0
system_call_fastpath+0x16/0x75
This patch fixes this error by moving up set_page_owner().
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Fri, 17 Jul 2015 23:24:12 +0000 (16:24 -0700)]
fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so when fsnotify_destroy_mark_locked() drops
mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and we dereference free
memory in the loop there.
Fix the problem by keeping mark_mutex held in
fsnotify_destroy_mark_locked(). The reason why we drop that mutex is that
we need to call a ->freeing_mark() callback which may acquire mark_mutex
again. To avoid this and similar lock inversion issues, we move the call
to ->freeing_mark() callback to the kthread destroying the mark.
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Ashish Sangwan <a.sangwan@samsung.com>
Suggested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Fri, 17 Jul 2015 23:24:09 +0000 (16:24 -0700)]
/proc/$PID/cmdline: fixup empty ARGV case
/proc/*/cmdline code checks if it should look at ENVP area by checking
last byte of ARGV area:
rv = access_remote_vm(mm, arg_end - 1, &c, 1, 0);
if (rv <= 0)
goto out_free_page;
If ARGV is somehow made empty (by doing execve(..., NULL, ...) or
manually setting ->arg_start and ->arg_end to equal values), the decision
will be based on byte which doesn't even belong to ARGV/ENVP.
So, quickly check if ARGV area is empty and report 0 to match previous
behaviour.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Haggai Eran [Fri, 17 Jul 2015 23:24:06 +0000 (16:24 -0700)]
dma-debug: skip debug_dma_assert_idle() when disabled
If dma-debug is disabled due to a memory error, DMA unmaps do not affect
the dma_active_cacheline radix tree anymore, and debug_dma_assert_idle()
can print false warnings.
Disable debug_dma_assert_idle() when dma_debug_disabled() is true.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Fixes:
0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Horia Geanta <horia.geanta@freescale.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Horacio Mijail Anton Quiles [Fri, 17 Jul 2015 23:24:04 +0000 (16:24 -0700)]
hexdump: fix for non-aligned buffers
A hexdump with a buf not aligned to the groupsize causes
non-naturally-aligned memory accesses. This was causing a kernel panic
on the processor BlackFin BF527, when such an unaligned buffer was fed
by the function ubifs_scanned_corruption in fs/ubifs/scan.c .
To fix this, change accesses to the contents of the buffer so they go
through get_unaligned(). This change should be harmless to unaligned-
access-capable architectures, and any performance hit should be anyway
dwarfed by the snprintf() processing time.
Signed-off-by: Horacio Mijail Antón Quiles <hmijail@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Joe Perches <joe@perches.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Fri, 17 Jul 2015 23:24:01 +0000 (16:24 -0700)]
checkpatch: fix long line messages about patch context
Changes in ("checkpatch: categorize some long line length checks")
now erroneously reports long line defects in patch context.
Fix it.
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Laurent Dufour [Fri, 17 Jul 2015 23:23:58 +0000 (16:23 -0700)]
mm: clean up per architecture MM hook header files
Commit
2ae416b142b6 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.
As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.
The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Fri, 17 Jul 2015 23:23:55 +0000 (16:23 -0700)]
MAINTAINERS: uclinux-h8-devel is moderated for non-subscribers
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>