Thomas Richter [Wed, 22 Oct 2014 10:18:06 +0000 (12:18 +0200)]
lcs: replace sscanf by kstrto function
Since a single integer value is read from the supplied buffer
use the kstrto functions instead of sscanf.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Richter [Wed, 22 Oct 2014 10:18:05 +0000 (12:18 +0200)]
qeth: s390 ethernet device driver dependency
Compile the s390 10GB ethernet device driver only when
ETHERNET has been defined in the kernel configuration file.
Right now the qeth device driver is always built regardless
of which network connectivity is active.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Richter [Wed, 22 Oct 2014 10:18:04 +0000 (12:18 +0200)]
qeth: make local functions static in qeth_l3 module
This patch makes 4 local functions static and removes
the prototypes from the header file.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Richter [Wed, 22 Oct 2014 10:18:03 +0000 (12:18 +0200)]
qeth: fix some trace formating issues
This patch fixes trace formatting issues using the
QETH_CARD_TEXT_ macro. The total size of each trace entry
is 8 bytes. Some of the sprintf formats exceed these 8
bytes (for example using abcd:%d and the converted value
needs more than 3 bytes). The solution is to shorten the
text prepending the value or use a different format (%x).
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Richter [Wed, 22 Oct 2014 10:18:02 +0000 (12:18 +0200)]
qeth: qeth_core_main make local functions static
This patch makes some global functions static and removes
the prototypes from the header file.
Also function qeth_query_card_info is not exported anymore,
there is no external user for it, this function should never
have been exported in the first place.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel [Wed, 22 Oct 2014 10:17:06 +0000 (11:17 +0100)]
xen-netfront: always keep the Rx ring full of requests
A full Rx ring only requires 1 MiB of memory. This is not enough
memory that it is useful to dynamically scale the number of Rx
requests in the ring based on traffic rates, because:
a) Even the full 1 MiB is a tiny fraction of a typically modern Linux
VM (for example, the AWS micro instance still has 1 GiB of memory).
b) Netfront would have used up to 1 MiB already even with moderate
data rates (there was no adjustment of target based on memory
pressure).
c) Small VMs are going to typically have one VCPU and hence only one
queue.
Keeping the ring full of Rx requests handles bursty traffic better
than trying to converge on an optimal number of requests to keep
filled.
On a 4 core host, an iperf -P 64 -t 60 run from dom0 to a 4 VCPU guest
improved from 5.1 Gbit/s to 5.6 Gbit/s. Gains with more bursty
traffic are expected to be higher.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Oct 2014 20:20:20 +0000 (16:20 -0400)]
Merge branch 'sunvnet-napi'
Sowmini Varadhan says:
====================
sunvnet: NAPIfy sunvnet
This patchset converts the sunvnet driver to use the NAPI framework.
Changes since v4 to Patch1:
vnet_event accumulates LDC_EVENT_* bits into rx_event.
vnet_event_napi() unrolls send_events() logic to process all rx_event bits.
Changes since v5:
Patch 1: use net_device.h definition for NAPI_POLL_WEIGHT.
Drop sparclinux changes (patch3) per David Miller feedback
Patch 1 in the series addresses the packet-receive path- all
the vnet_event() processing is moved into NAPI context.
This patch is dependant on the sparc-next commit:
"sparc64: Add vio_set_intr() to enable/disable Rx interrupts"
(sparc commit id
ca605b7dd740c8909408d67911d8ddd272c2b320)
Patch 2 uses RCU to fix race conditions between vnet_port_remove and
paths that access/modify port-related state, such as vnet_start_xmit.
Patch 3 leverages from the NAPIfied Rx path,
dropping superfluous usage of the irqsave/irqrestores on the vio.lock
where possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Sat, 25 Oct 2014 19:12:31 +0000 (15:12 -0400)]
sunvnet: Remove irqsave/irqrestore on vio.lock
After the NAPIfication of sunvnet, we no longer need to
synchronize by doing irqsave/restore on vio.lock in the
I/O fastpath.
NAPI ->poll() is non-reentrant, so all RX processing occurs
strictly in a serialized environment. TX reclaim is done in NAPI
context, so the netif_tx_lock can be used to serialize
critical sections between Tx and Rx paths.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Sat, 25 Oct 2014 19:12:20 +0000 (15:12 -0400)]
sunvnet: Use RCU to synchronize port usage with vnet_port_remove()
A vnet_port_remove could be triggered as a result of an ldm-unbind
operation by the peer, module unload, or other changes to the
inter-vnet-link configuration. When this is concurrent with
vnet_start_xmit(), there are several race sequences possible,
such as
thread 1 thread 2
vnet_start_xmit
-> tx_port_find
spin_lock_irqsave(&vp->lock..)
ret = __tx_port_find(..)
spin_lock_irqrestore(&vp->lock..)
vio_remove -> ..
->vnet_port_remove
spin_lock_irqsave(&vp->lock..)
cleanup
spin_lock_irqrestore(&vp->lock..)
kfree(port)
/* attempt to use ret will bomb */
This patch adds RCU locking for port access so that vnet_port_remove
will correctly clean up port-related state.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Sat, 25 Oct 2014 19:12:12 +0000 (15:12 -0400)]
sunvnet: NAPIfy sunvnet
Move Rx packet procssing to the NAPI poll callback.
Disable VIO interrupt and unconditioanlly go into NAPI
context from vnet_event.
Note that we want to minimize the number of LDC
STOP/START messages sent. Specifically, do not send a STOP
message if vnet_walk_rx does not read all the available descriptors
because of the NAPI budget limitation. Instead, note the end index
as part of port state, and resume from this index when the
next poll callback is triggered.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
Acked-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 24 Oct 2014 20:41:02 +0000 (16:41 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-10-23
This series contains updates to i40e and i40evf.
Jesse modifies the i40e driver to only notify the firmware on link up/down
and qualified module events. Also simplified the job of managing link
state by using the admin queue receive event for link events as a signal
to tell the driver to update link state.
Jeff (me) cleans up the inconsistent use of tabs for indentation in the admin
queue command header file.
Neerav converts the use of udelay() to usleep_range().
Anjali fixes a bug where receive would stop after some stress by adding
a sleep and restart as well as moving the setting of flow control because
it should be done at a PF level and not a VSI level.
Mitch adds code to handle link events when updating the PF switch, which
allows link information to be properly provided to VFS in all cases.
Catherine adds driver support for 10GBaseT and bumps driver version.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Wed, 22 Oct 2014 19:06:26 +0000 (21:06 +0200)]
net: llc: include linux/errno.h instead of asm/errno.h
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Wed, 22 Oct 2014 19:01:41 +0000 (21:01 +0200)]
lapb: move EXPORT_SYMBOL after functions.
See Documentation/CodingStyle Chapter 6
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 24 Oct 2014 19:49:25 +0000 (15:49 -0400)]
Merge branch 'berlin_ethernet'
Sebastian Hesselbarth says:
====================
Marvell PXA168 libphy handling and Berlin Ethernet
This patch series deals with a removing a IP feature that can be found
on all currently supported Marvell Ethernet IP (pxa168_eth, mv643xx_eth,
mvneta). The MAC IP allows to automatically perform PHY auto-negotiation
without software interaction.
However, this feature (a) fundamentally clashes with the way libphy works
and (b) is unable to deal with quirky PHYs that require special treatment.
In this series, pxa168_eth driver is rewritten to completely disable that
feature and properly deal with libphy provided PHYs.
As usual, a branch on top of v3.18-rc1 can be found at
git://git.infradead.org/users/hesselba/linux-berlin.git devel/bg2-bg2cd-eth-v2
Patches 1-5 should go through David's net tree, I'll pick up the DT patches
6-9.
There have been some changes,
compared to the RFT
- added phy-connection-type property to BG2Q PHY DT node
- bail out from pxa168_eth_adjust_link when there is no change in
PHY parameters. Also, add a call to phy_print_status.
compared to v1
- move phy-connection-type to ethernet node instead of PHY node
Patch 1 adds support for Marvell
88E3016 FastEthernet PHY that is also
integrated in Marvell Berlin BG2/BG2CD SoCs.
Patch 2 allows to pass phy_interface_t on pxa168_eth platform_data that
is only used by mach-mmp/gplug. From the board setup, I guessed gplug's
PHY is connected via RMII. The patch still isn't even compile tested.
Patches 3-5 prepare proper libphy handling and finally remove all in-driver
PHY mangling related to the feature explained above.
Patches 6-9 add corresponding ethernet DT nodes to BG2, BG2CD, add a
phy-connection-type property to BG2Q and enable ethernet on BG2-based Sony
NSZ-GS7. I have tested all this on GS7 successfully with ip=dhcp on 100M FD.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Hesselbarth [Wed, 22 Oct 2014 18:26:48 +0000 (20:26 +0200)]
net: pxa168_eth: Remove in-driver PHY mangling
With properly using libphy PHYs now, remove the in-driver PHY
mangling.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Hesselbarth [Wed, 22 Oct 2014 18:26:47 +0000 (20:26 +0200)]
net: pxa168_eth: Remove HW auto-negotiaion
Marvell Ethernet IP supports PHY negotiation driven by HW. This
fundamentally clashes with libphy (software) driven negotiation and
also cannot cope with quirky PHYs. Therefore, always disable any HW
negotiation features and properly use libphy's phy_device.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Hesselbarth [Wed, 22 Oct 2014 18:26:46 +0000 (20:26 +0200)]
net: pxa168_eth: Prepare proper libphy handling
Current libphy handling in pxa168_eth lacks proper phy_connect. Prepare
to fix this by first moving phy properties from platform_data to private
driver data.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Hesselbarth [Wed, 22 Oct 2014 18:26:45 +0000 (20:26 +0200)]
net: pxa168_eth: Provide phy_interface mode on platform_data
The PXA168 Ethernet IP support MII and RMII connection to its PHY.
Currently, pxa168 platform_data does not provide a way to pass that
and there is one user of pxa168 platform_data (mach-mmp/gplug).
Given the pinctrl settings of gplug it uses RMII, so add and pass
a corresponding phy_interface_t.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sebastian Hesselbarth [Wed, 22 Oct 2014 18:26:44 +0000 (20:26 +0200)]
phy: marvell: Add support for
88E3016 FastEthernet PHY
Marvell
88E3016 is a FastEthernet PHY that also can be found in Marvell
Berlin SoCs as integrated PHY.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Tue, 21 Oct 2014 17:53:57 +0000 (19:53 +0200)]
natsemi/macsonic: Remove superfluous interrupt disable/restore
As of commit
e4dc601bf99ccd1c ("m68k: Disable/restore interrupts in
hwreg_present()/hwreg_write()"), this is no longer needed.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Tue, 21 Oct 2014 17:53:11 +0000 (19:53 +0200)]
cirrus/mac89x0: Remove superfluous interrupt disable/restore
As of commit
e4dc601bf99ccd1c ("m68k: Disable/restore interrupts in
hwreg_present()/hwreg_write()"), this is no longer needed.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Tue, 21 Oct 2014 14:51:43 +0000 (16:51 +0200)]
net: typhoon: Remove redundant casts
Both image_data and typhoon_fw->data are const u8*, so the cast to u8*
is unnecessary and confusing.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sébastien Barré [Tue, 21 Oct 2014 13:26:15 +0000 (15:26 +0200)]
Removed unused function sctp_addr_is_valid()
sctp_addr_is_valid() only appeared in its definition.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 24 Oct 2014 04:14:52 +0000 (00:14 -0400)]
Merge branch 'ipv6_route'
Martin KaFai Lau says:
====================
ipv6: Reduce the number of fib6_lookup() calls from ip6_pol_route()
This patch set is trying to reduce the number of fib6_lookup()
calls from ip6_pol_route().
I have adapted davem's udpflooda and kbench_mod test
(https://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git) to
support IPv6 and here is the result:
Before:
[root]# for i in $(seq 1 3); do time ./udpflood -l
20000000 -c 250 2401:face:face:face::2; done
real 0m34.190s
user 0m3.047s
sys 0m31.108s
real 0m34.635s
user 0m3.125s
sys 0m31.475s
real 0m34.517s
user 0m3.034s
sys 0m31.449s
[root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2
[ 660.160976] ip6_route_kbench: ip6_route_output tdiff: 933
[ 660.207261] ip6_route_kbench: ip6_route_output tdiff: 988
[ 660.253492] ip6_route_kbench: ip6_route_output tdiff: 896
[ 660.298862] ip6_route_kbench: ip6_route_output tdiff: 898
After:
[root]# for i in $(seq 1 3); do time ./udpflood -l
20000000 -c 250 2401:face:face:face::2; done
real 0m32.695s
user 0m2.925s
sys 0m29.737s
real 0m32.636s
user 0m3.007s
sys 0m29.596s
real 0m32.797s
user 0m2.866s
sys 0m29.898s
[root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2
[ 881.220793] ip6_route_kbench: ip6_route_output tdiff: 684
[ 881.253477] ip6_route_kbench: ip6_route_output tdiff: 640
[ 881.286867] ip6_route_kbench: ip6_route_output tdiff: 630
[ 881.320749] ip6_route_kbench: ip6_route_output tdiff: 653
/****************************** udpflood.c ******************************/
/* It is an adaptation of the Eric Dumazet's and David Miller's
* udpflood tool, by adding IPv6 support.
*/
typedef uint32_t u32;
static int debug =3D 0;
/* Allow -fstrict-aliasing */
typedef union sa_u {
struct sockaddr_storage a46;
struct sockaddr_in a4;
struct sockaddr_in6 a6;
} sa_u;
static int usage(void)
{
printf("usage: udpflood [ -l count ] [ -m message_size ] [ -c num_ip_addrs=
] IP_ADDRESS\n");
return -1;
}
static u32 get_last32h(const sa_u *sa)
{
if (sa->a46.ss_family =3D=3D PF_INET)
return ntohl(sa->a4.sin_addr.s_addr);
else
return ntohl(sa->a6.sin6_addr.s6_addr32[3]);
}
static void set_last32h(sa_u *sa, u32 last32h)
{
if (sa->a46.ss_family =3D=3D PF_INET)
sa->a4.sin_addr.s_addr =3D htonl(last32h);
else
sa->a6.sin6_addr.s6_addr32[3] =3D htonl(last32h);
}
static void print_saddr(const sa_u *sa, const char *msg)
{
char buf[64];
if (!debug)
return;
switch (sa->a46.ss_family) {
case PF_INET:
inet_ntop(PF_INET, &(sa->a4.sin_addr.s_addr), buf,
sizeof(buf));
break;
case PF_INET6:
inet_ntop(PF_INET6, &(sa->a6.sin6_addr), buf, sizeof(buf));
break;
}
printf("%s: %s\n", msg, buf);
}
static int send_packets(const sa_u *sa, size_t num_addrs, int count, int ms=
g_sz)
{
char *msg =3D malloc(msg_sz);
sa_u saddr;
u32 start_addr32h, end_addr32h, cur_addr32h;
int fd, i, err;
if (!msg)
return -ENOMEM;
memset(msg, 0, msg_sz);
memcpy(&saddr, sa, sizeof(saddr));
cur_addr32h =3D start_addr32h =3D get_last32h(&saddr);
end_addr32h =3D start_addr32h + num_addrs;
fd =3D socket(saddr.a46.ss_family, SOCK_DGRAM, 0);
if (fd < 0) {
perror("socket");
err =3D fd;
goto out_nofd;
}
/* connect to avoid the kernel spending time in figuring
* out the source address (i.e pin the src address)
*/
err =3D connect(fd, (struct sockaddr *) &saddr, sizeof(saddr));
if (err < 0) {
perror("connect");
goto out;
}
print_saddr(&saddr, "start_addr");
for (i =3D 0; i < count; i++) {
print_saddr(&saddr, "sendto");
err =3D sendto(fd, msg, msg_sz, 0, (struct sockaddr *)&saddr,
sizeof(saddr));
if (err < 0) {
perror("sendto");
goto out;
}
if (++cur_addr32h >=3D end_addr32h)
cur_addr32h =3D start_addr32h;
set_last32h(&saddr, cur_addr32h);
}
err =3D 0;
out:
close(fd);
out_nofd:
free(msg);
return err;
}
int main(int argc, char **argv, char **envp)
{
int port, msg_sz, count, num_addrs, ret;
sa_u start_addr;
port =3D 6000;
msg_sz =3D 32;
count =3D
10000000;
num_addrs =3D 1;
while ((ret =3D getopt(argc, argv, "dl:s:p:c:")) >=3D 0) {
switch (ret) {
case 'l':
sscanf(optarg, "%d", &count);
break;
case 's':
sscanf(optarg, "%d", &msg_sz);
break;
case 'p':
sscanf(optarg, "%d", &port);
break;
case 'c':
sscanf(optarg, "%d", &num_addrs);
break;
case 'd':
debug =3D 1;
break;
case '?':
return usage();
}
}
if (num_addrs < 1)
return usage();
if (!argv[optind])
return usage();
start_addr.a4.sin_port =3D htons(port);
if (inet_pton(PF_INET, argv[optind], &start_addr.a4.sin_addr))
start_addr.a46.ss_family =3D PF_INET;
else if (inet_pton(PF_INET6, argv[optind], &start_addr.a6.sin6_addr.s6_add=
r))
start_addr.a46.ss_family =3D PF_INET6;
else
return usage();
return send_packets(&start_addr, num_addrs, count, msg_sz);
}
/****************** ip6_route_kbench_mod.c ******************/
/* We can't just use "get_cycles()" as on some platforms, such
* as sparc64, that gives system cycles rather than cpu clock
* cycles.
*/
static inline unsigned long long get_tick(void)
{
unsigned long long t;
__asm__ __volatile__("rd %%tick, %0" : "=r" (t));
return t;
}
static inline unsigned long long get_tick(void)
{
unsigned long long t;
rdtscll(t);
return t;
}
static inline unsigned long long get_tick(void)
{
return get_cycles();
}
static int flow_oif = DEFAULT_OIF;
static int flow_iif = DEFAULT_IIF;
static u32 flow_mark = DEFAULT_MARK;
static struct in6_addr flow_dst_ip_addr;
static struct in6_addr flow_src_ip_addr;
static int flow_tos = DEFAULT_TOS;
static char dst_string[64];
static char src_string[64];
module_param_string(dst, dst_string, sizeof(dst_string), 0);
module_param_string(src, src_string, sizeof(src_string), 0);
static int __init flow_setup(void)
{
if (dst_string[0] &&
!in6_pton(dst_string, -1, &flow_dst_ip_addr.s6_addr[0], -1, NULL)) {
pr_info("cannot parse \"%s\"\n", dst_string);
return -1;
}
if (src_string[0] &&
!in6_pton(src_string, -1, &flow_src_ip_addr.s6_addr[0], -1, NULL)) {
pr_info("cannot parse \"%s\"\n", dst_string);
return -1;
}
return 0;
}
module_param_named(oif, flow_oif, int, 0);
module_param_named(iif, flow_iif, int, 0);
module_param_named(mark, flow_mark, uint, 0);
module_param_named(tos, flow_tos, int, 0);
static int warmup_count = DEFAULT_WARMUP_COUNT;
module_param_named(count, warmup_count, int, 0);
static void flow_init(struct flowi6 *fl6)
{
memset(fl6, 0, sizeof(*fl6));
fl6->flowi6_proto = IPPROTO_ICMPV6;
fl6->flowi6_oif = flow_oif;
fl6->flowi6_iif = flow_iif;
fl6->flowi6_mark = flow_mark;
fl6->flowi6_tos = flow_tos;
fl6->daddr = flow_dst_ip_addr;
fl6->saddr = flow_src_ip_addr;
}
static struct sk_buff * fake_skb_get(void)
{
struct ipv6hdr *hdr;
struct sk_buff *skb;
skb = alloc_skb(4096, GFP_KERNEL);
if (!skb) {
pr_info("Cannot alloc SKB for test\n");
return NULL;
}
skb->dev = __dev_get_by_index(&init_net, flow_iif);
if (skb->dev == NULL) {
pr_info("Input device (%d) does not exist\n", flow_iif);
goto err;
}
skb_reset_mac_header(skb);
skb_reset_network_header(skb);
skb_reserve(skb, MAX_HEADER + sizeof(struct ipv6hdr));
hdr = ipv6_hdr(skb);
hdr->priority = 0;
hdr->version = 6;
memset(hdr->flow_lbl, 0, sizeof(hdr->flow_lbl));
hdr->payload_len = htons(sizeof(struct icmp6hdr));
hdr->nexthdr = IPPROTO_ICMPV6;
hdr->saddr = flow_src_ip_addr;
hdr->daddr = flow_dst_ip_addr;
skb->protocol = htons(ETH_P_IPV6);
skb->mark = flow_mark;
return skb;
err:
kfree_skb(skb);
return NULL;
}
static void do_full_output_lookup_bench(void)
{
unsigned long long t1, t2, tdiff;
struct rt6_info *rt;
struct flowi6 fl6;
int i;
rt = NULL;
for (i = 0; i < warmup_count; i++) {
flow_init(&fl6);
rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl6);
if (IS_ERR(rt))
break;
ip6_rt_put(rt);
}
if (IS_ERR(rt)) {
pr_info("ip_route_output_key: err=%ld\n", PTR_ERR(rt));
return;
}
flow_init(&fl6);
t1 = get_tick();
rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl6);
t2 = get_tick();
if (!IS_ERR(rt))
ip6_rt_put(rt);
tdiff = t2 - t1;
pr_info("ip6_route_output tdiff: %llu\n", tdiff);
}
static void do_full_input_lookup_bench(void)
{
unsigned long long t1, t2, tdiff;
struct sk_buff *skb;
struct rt6_info *rt;
int err, i;
skb = fake_skb_get();
if (skb == NULL)
goto out_free;
err = 0;
local_bh_disable();
for (i = 0; i < warmup_count; i++) {
ip6_route_input(skb);
rt = (struct rt6_info *)skb_dst(skb);
err = (!rt || rt == init_net.ipv6.ip6_null_entry);
skb_dst_drop(skb);
if (err)
break;
}
local_bh_enable();
if (err) {
pr_info("Input route lookup fails\n");
goto out_free;
}
local_bh_disable();
t1 = get_tick();
ip6_route_input(skb);
t2 = get_tick();
local_bh_enable();
rt = (struct rt6_info *)skb_dst(skb);
err = (!rt || rt == init_net.ipv6.ip6_null_entry);
skb_dst_drop(skb);
if (err) {
pr_info("Input route lookup fails\n");
goto out_free;
}
tdiff = t2 - t1;
pr_info("ip6_route_input tdiff: %llu\n", tdiff);
out_free:
kfree_skb(skb);
}
static void do_full_lookup_bench(void)
{
if (!flow_iif)
do_full_output_lookup_bench();
else
do_full_input_lookup_bench();
}
static void do_bench(void)
{
do_full_lookup_bench();
do_full_lookup_bench();
do_full_lookup_bench();
do_full_lookup_bench();
}
static int __init kbench_init(void)
{
if (flow_setup())
return -EINVAL;
pr_info("flow [IIF(%d),OIF(%d),MARK(0x%08x),D("IP6_FMT"),"
"S("IP6_FMT"),TOS(0x%02x)]\n",
flow_iif, flow_oif, flow_mark,
IP6_PRT(flow_dst_ip_addr),
IP6_PRT(flow_src_ip_addr),
flow_tos);
if (!cpu_has_tsc) {
pr_err("X86 TSC is required, but is unavailable.\n");
return -EINVAL;
}
pr_info("sizeof(struct rt6_info)==%zu\n", sizeof(struct rt6_info));
do_bench();
return -ENODEV;
}
static void __exit kbench_exit(void)
{
}
module_init(kbench_init);
module_exit(kbench_exit);
MODULE_LICENSE("GPL");
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin KaFai Lau [Mon, 20 Oct 2014 20:42:45 +0000 (13:42 -0700)]
ipv6: Avoid redoing fib6_lookup() with reachable = 0 by saving fn
This patch save the fn before doing rt6_backtrack.
Hence, without redo-ing the fib6_lookup(), saved_fn can be used
to redo rt6_select() with RT6_LOOKUP_F_REACHABLE off.
Some minor changes I think make sense to review as a single patch:
* Remove the 'out:' goto label.
* Remove the 'reachable' variable. Only use the 'strict' variable instead.
After this patch, "failing ip6_ins_rt()" should be the only case that
requires a redo of fib6_lookup().
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin KaFai Lau [Mon, 20 Oct 2014 20:42:44 +0000 (13:42 -0700)]
ipv6: Avoid redoing fib6_lookup() for RTF_CACHE hit case
When there is a RTF_CACHE hit, no need to redo fib6_lookup()
with reachable=0.
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin KaFai Lau [Mon, 20 Oct 2014 20:42:43 +0000 (13:42 -0700)]
ipv6: Remove BACKTRACK macro
It is the prep work to reduce the number of calls to fib6_lookup().
The BACKTRACK macro could be hard-to-read and error-prone due to
its side effects (mainly goto).
This patch is to:
1. Replace BACKTRACK macro with a function (fib6_backtrack) with the following
return values:
* If it is backtrack-able, returns next fn for retry.
* If it reaches the root, returns NULL.
2. The caller needs to decide if a backtrack is needed (by testing
rt == net->ipv6.ip6_null_entry).
3. Rename the goto labels in ip6_pol_route() to make the next few
patches easier to read.
Cc: David Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kenjiro Nakayama [Mon, 20 Oct 2014 09:15:50 +0000 (18:15 +0900)]
net: Remove trailing whitespace in tcp.h icmp.c syncookies.c
Remove trailing whitespace in tcp.h icmp.c syncookies.c
Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Sullivan [Sat, 13 Sep 2014 07:40:48 +0000 (07:40 +0000)]
i40e: Bump version
Bump i40e version to 1.0.21.
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-By: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Fri, 17 Oct 2014 03:14:39 +0000 (03:14 +0000)]
i40e: Moving variable declaration out of the loops
Move the three variables out of the loop, so it only declares once.
Change-ID: I436913777c7da3c16dc0031b59e3ffa61de74718
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 13 Sep 2014 07:40:47 +0000 (07:40 +0000)]
i40e: Add 10GBaseT support
Add driver support for 10GBaseT device.
Change-ID: I4be6ed847ac0bddd220b9878a95c523b32038174
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Sat, 13 Sep 2014 07:40:46 +0000 (07:40 +0000)]
i40e: process link events when setting up switch
Add code to handle link events when updating the PF switch. This
allows link information to be properly provided to VFs in all cases.
Change-ID: If314c95f3d39259ef4c40a4a3b823381e28fb24f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Sat, 13 Sep 2014 07:40:45 +0000 (07:40 +0000)]
i40e: Fix a bug where Rx would stop after some time
Move the setting of flow control because this should be done at a pf level not
a vsi level. Also add a sleep and restart an to fix a bug where Rx would stop
after some stress.
Change-ID: I9a93d8c2ff27c39339eb00bc4ec1225e43900be0
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Neerav Parikh [Sat, 13 Sep 2014 07:40:44 +0000 (07:40 +0000)]
i40e/i40evf: Use usleep_range() instead of udelay()
As per the Documentation/timers/timers-howto.txt it is preferred to use
usleep_range() instead of udelay() if the delay value is > 10us in
non-atomic contexts.
So, replacing all the instances of udelay() with 10 or greater than 10
micro seconds delay in the driver and using usleep_range() instead.
Change-ID: Iaa2ab499a4c26f6005e5d86cc421407ef9de16c7
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher [Sat, 13 Sep 2014 07:40:43 +0000 (07:40 +0000)]
i40e/i40evf: Fix whitespace indentation
This is one small step in making the indentation more consistent. If
we truly want to align values, then use tabs rather than spaces.
Change-ID: I12368bc77a52f296d1843fdcb67201a7d7cd4749
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Jesse Brandeburg [Sat, 13 Sep 2014 07:40:42 +0000 (07:40 +0000)]
i40e: enable LSE poke and simplify link state
The driver can do a simpler job of managing link state by simply
using the admin queue receive event for link events as a doorbell
that tells the driver to update link state.
Additionally, add a workaround will help make sure the link state in the
hardware is consistent with the link state the driver is reporting
by refreshing the link state every service task interval.
Change-ID: Ib95b5b7b8cc016e97d8009f6363c9f9eed301444
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Sat, 13 Sep 2014 07:40:41 +0000 (07:40 +0000)]
i40e: mask phy events
Tell the firmware what kind of link related events the driver is
interested in. In this case, just link up/down and qualified module
events are the ones the driver really cares about.
Change-ID: If132c812c340c8e1927c2caf6d55185296b66201
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Haiyang Zhang [Wed, 22 Oct 2014 20:47:18 +0000 (13:47 -0700)]
hyperv: Fix the total_data_buflen in send path
total_data_buflen is used by netvsc_send() to decide if a packet can be put
into send buffer. It should also include the size of RNDIS message before the
Ethernet frame. Otherwise, a messge with total size bigger than send_section_size
may be copied into the send buffer, and cause data corruption.
[Request to include this patch to the Stable branches]
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 22 Oct 2014 21:50:39 +0000 (17:50 -0400)]
Merge branch 'amd-xgbe'
Tom Lendacky says:
====================
amd-xgbe: AMD XGBE driver fixes 2014-10-22
The following series of patches includes fixes to the driver.
- Properly handle feature changes via ethtool by using correctly sized
variables
- Perform proper napi packet counting and budget checking
This patch series is based on net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lendacky, Thomas [Wed, 22 Oct 2014 16:26:17 +0000 (11:26 -0500)]
amd-xgbe: Fix napi Rx budget accounting
Currently the amd-xgbe driver increments the packets processed counter
each time a descriptor is processed. Since a packet can be represented
by more than one descriptor incrementing the counter in this way is not
appropriate. Also, since multiple descriptors cause the budget check
to be short circuited, sometimes the returned value from the poll
function would be larger than the budget value resulting in a WARN_ONCE
being triggered.
Update the polling logic to properly account for the number of packets
processed and exit when the budget value is reached.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lendacky, Thomas [Wed, 22 Oct 2014 16:26:11 +0000 (11:26 -0500)]
amd-xgbe: Properly handle feature changes via ethtool
The ndo_set_features callback function was improperly using an unsigned
int to save the current feature value for features such as NETIF_F_RXCSUM.
Since that feature is in the upper 32 bits of a 64 bit variable the
result was always 0 making it not possible to actually turn off the
hardware RX checksum support. Change the unsigned int type to the
netdev_features_t type in order to properly capture the current value
and perform the proper operation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philipp Zabel [Wed, 22 Oct 2014 14:34:35 +0000 (16:34 +0200)]
net: fec: ptp: fix NULL pointer dereference if ptp_clock is not set
Since commit
278d24047891 (net: fec: ptp: Enable PPS output based on ptp clock)
fec_enet_interrupt calls fec_ptp_check_pps_event unconditionally, which calls
into ptp_clock_event. If fep->ptp_clock is NULL, ptp_clock_event tries to
dereference the NULL pointer.
Since on i.MX53 fep->bufdesc_ex is not set, fec_ptp_init is never called,
and fep->ptp_clock is NULL, which reliably causes a kernel panic.
This patch adds a check for fep->ptp_clock == NULL in fec_enet_interrupt.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Wed, 22 Oct 2014 16:12:01 +0000 (21:42 +0530)]
net: fix saving TX flow hash in sock for outgoing connections
The commit "net: Save TX flow hash in sock and set in skbuf on xmit"
introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate
and record flow hash(sk_txhash) in the socket structure. sk_txhash is used
to set skb->hash which is used to spread flows across multiple TXQs.
But, the above routines are invoked before the source port of the connection
is created. Because of this all outgoing connections that just differ in the
source port get hashed into the same TXQ.
This patch fixes this problem for IPv4/6 by invoking the the above routines
after the source port is available for the socket.
Fixes:
b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit")
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Wed, 22 Oct 2014 09:09:53 +0000 (17:09 +0800)]
xfrm6: fix a potential use after free in xfrm6_policy.c
pskb_may_pull() maybe change skb->data and make nh and exthdr pointer
oboslete, so recompute the nd and exthdr
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LEROY Christophe [Wed, 22 Oct 2014 07:05:47 +0000 (09:05 +0200)]
net: fs_enet: set back promiscuity mode after restart
After interface restart (eg: after link disconnection/reconnection), the bridge
function doesn't work anymore. This is due to the promiscuous mode being cleared
by the restart.
The mac-fcc already includes code to set the promiscuous mode back during the restart.
This patch adds the same handling to mac-fec and mac-scc.
Tested with bridge function on MPC885 with FEC.
Reported-by: Germain Montoies <germain.montoies@c-s.fr>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karl Beldan [Tue, 21 Oct 2014 14:06:05 +0000 (16:06 +0200)]
net: tso: fix unaligned access to crafted TCP header in helper API
The crafted header start address is from a driver supplied buffer, which
one can reasonably expect to be aligned on a 4-bytes boundary.
However ATM the TSO helper API is only used by ethernet drivers and
the tcp header will then be aligned to a 2-bytes only boundary from the
header start address.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Cooper [Tue, 21 Oct 2014 13:50:29 +0000 (14:50 +0100)]
sfc: remove incorrect EFX_BUG_ON_PARANOID check
write_count and insert_count can wrap around, making > check invalid.
Fixes:
70b33fb0ddec827cbbd14cdc664fc27b2ef4a6b6 ("sfc: add support for
skb->xmit_more").
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Tue, 21 Oct 2014 09:23:30 +0000 (11:23 +0200)]
net: sched: initialize bstats syncp
Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp.
Fixes:
22e0f8b9322c "net: sched: make bstats per cpu and estimator RCU safe"
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Mon, 20 Oct 2014 21:54:57 +0000 (14:54 -0700)]
bpf: fix bug in eBPF verifier
while comparing for verifier state equivalency the comparison
was missing a check for uninitialized register.
Make sure it does so and add a testcase.
Fixes:
f1bca824dabb ("bpf: add search pruning optimization to verifier")
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 21 Oct 2014 20:05:38 +0000 (22:05 +0200)]
netlink: Re-add locking to netlink_lookup() and seq walker
The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Mon, 20 Oct 2014 06:46:35 +0000 (14:46 +0800)]
tipc: fix lockdep warning when intra-node messages are delivered
When running tipcTC&tipcTS test suite, below lockdep unsafe locking
scenario is reported:
[ 1109.997854]
[ 1109.997988] =================================
[ 1109.998290] [ INFO: inconsistent lock state ]
[ 1109.998575] 3.17.0-rc1+ #113 Not tainted
[ 1109.998762] ---------------------------------
[ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
[ 1109.998762] [<
ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
[ 1109.998762] [<
ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<
ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
[ 1109.998762] [<
ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
[ 1109.998762] [<
ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
[ 1109.998762] [<
ffffffff817676ee>] SYSC_connect+0xae/0xc0
[ 1109.998762] [<
ffffffff81767b7e>] SyS_connect+0xe/0x10
[ 1109.998762] [<
ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
[ 1109.998762] [<
ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
[ 1109.998762] irq event stamp: 241060
[ 1109.998762] hardirqs last enabled at (241060): [<
ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
[ 1109.998762] hardirqs last disabled at (241059): [<
ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
[ 1109.998762] softirqs last enabled at (241020): [<
ffffffff81059a52>] _local_bh_enable+0x22/0x50
[ 1109.998762] softirqs last disabled at (241021): [<
ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]
[ 1109.998762] other info that might help us debug this:
[ 1109.998762] Possible unsafe locking scenario:
[ 1109.998762]
[ 1109.998762] CPU0
[ 1109.998762] ----
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762] <Interrupt>
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762]
[ 1109.998762] *** DEADLOCK ***
[ 1109.998762]
[ 1109.998762] 2 locks held by swapper/7/0:
[ 1109.998762] #0: (rcu_read_lock){......}, at: [<
ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] #1: (rcu_read_lock){......}, at: [<
ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]
[ 1109.998762] stack backtrace:
[ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
[ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1109.998762]
ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
[ 1109.998762]
ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
[ 1109.998762]
ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
[ 1109.998762] Call Trace:
[ 1109.998762] <IRQ> [<
ffffffff81a209eb>] dump_stack+0x4e/0x68
[ 1109.998762] [<
ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
[ 1109.998762] [<
ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
[ 1109.998762] [<
ffffffff810a406c>] mark_lock+0x28c/0x2f0
[ 1109.998762] [<
ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
[ 1109.998762] [<
ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
[ 1109.998762] [<
ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
[ 1109.998762] [<
ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
[ 1109.998762] [<
ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<
ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
[ 1109.998762] [<
ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<
ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<
ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<
ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
[ 1109.998762] [<
ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<
ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
[ 1109.998762] [<
ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
[ 1109.998762] [<
ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762] [<
ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
[ 1109.998762] [<
ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] [<
ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
[ 1109.998762] [<
ffffffff817838f6>] __netif_receive_skb+0x26/0x70
[ 1109.998762] [<
ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
[ 1109.998762] [<
ffffffff81785518>] napi_gro_receive+0xd8/0x240
[ 1109.998762] [<
ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
[ 1109.998762] [<
ffffffff815c1a46>] e1000_clean+0x266/0x9c0
[ 1109.998762] [<
ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<
ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<
ffffffff817842b1>] net_rx_action+0x141/0x310
[ 1109.998762] [<
ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
[ 1109.998762] [<
ffffffff81059fa6>] __do_softirq+0x116/0x4d0
[ 1109.998762] [<
ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762] [<
ffffffff81a30d07>] do_IRQ+0x67/0x110
[ 1109.998762] [<
ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
[ 1109.998762] <EOI> [<
ffffffff8100d2b7>] ? default_idle+0x37/0x250
[ 1109.998762] [<
ffffffff8100d2b5>] ? default_idle+0x35/0x250
[ 1109.998762] [<
ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
[ 1109.998762] [<
ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
[ 1109.998762] [<
ffffffff81034c78>] start_secondary+0x188/0x1f0
When intra-node messages are delivered from one process to another
process, tipc_link_xmit() doesn't disable BH before it directly calls
tipc_sk_rcv() on process context to forward messages to destination
socket. Meanwhile, if messages delivered by remote node arrive at the
node and their destinations are also the same socket, tipc_sk_rcv()
running on process context might be preempted by tipc_sk_rcv() running
BH context. As a result, the latter cannot obtain the socket lock as
the lock was obtained by the former, however, the former has no chance
to be run as the latter is owning the CPU now, so headlock happens. To
avoid it, BH should be always disabled in tipc_sk_rcv().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Mon, 20 Oct 2014 06:44:25 +0000 (14:44 +0800)]
tipc: fix a potential deadlock
Locking dependency detected below possible unsafe locking scenario:
CPU0 CPU1
T0: tipc_named_rcv() tipc_rcv()
T1: [grab nametble write lock]* [grab node lock]*
T2: tipc_update_nametbl() tipc_node_link_up()
T3: tipc_nodesub_subscribe() tipc_nametbl_publish()
T4: [grab node lock]* [grab nametble write lock]*
The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 21 Oct 2014 19:24:30 +0000 (15:24 -0400)]
Merge branch 'enic'
Govindarajulu Varadarajan says:
====================
enic: Bug fixes
This series fixes the following problem.
Please apply this to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Sun, 19 Oct 2014 08:50:28 +0000 (14:20 +0530)]
enic: Do not call napi_disable when preemption is disabled.
In enic_stop, we disable preemption using local_bh_disable(). We disable
preemption to wait for busy_poll to finish.
napi_disable should not be called here as it might sleep.
Moving napi_disable() call out side of local_bh_disable.
BUG: sleeping function called from invalid context at include/linux/netdevice.h:477
in_atomic(): 1, irqs_disabled(): 0, pid: 443, name: ifconfig
INFO: lockdep is turned off.
Preemption disabled at:[<
ffffffffa029c5c4>] enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
CPU: 31 PID: 443 Comm: ifconfig Not tainted
3.17.0-netnext-05504-g59f35b8 #268
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
ffff8800dac10000 ffff88020b8dfcb8 ffffffff8148a57c 0000000000000000
ffff88020b8dfcd0 ffffffff8107e253 ffff8800dac12a40 ffff88020b8dfd10
ffffffffa029305b ffff88020b8dfd48 ffff8800dac10000 ffff88020b8dfd48
Call Trace:
[<
ffffffff8148a57c>] dump_stack+0x4e/0x7a
[<
ffffffff8107e253>] __might_sleep+0x123/0x1a0
[<
ffffffffa029305b>] enic_stop+0xdb/0x4d0 [enic]
[<
ffffffff8138ed7d>] __dev_close_many+0x9d/0xf0
[<
ffffffff8138ef81>] __dev_close+0x31/0x50
[<
ffffffff813974a8>] __dev_change_flags+0x98/0x160
[<
ffffffff81397594>] dev_change_flags+0x24/0x60
[<
ffffffff814085fd>] devinet_ioctl+0x63d/0x710
[<
ffffffff81139c16>] ? might_fault+0x56/0xc0
[<
ffffffff81409ef5>] inet_ioctl+0x65/0x90
[<
ffffffff813768e0>] sock_do_ioctl+0x20/0x50
[<
ffffffff81376ebb>] sock_ioctl+0x20b/0x2e0
[<
ffffffff81197250>] do_vfs_ioctl+0x2e0/0x500
[<
ffffffff81492619>] ? sysret_check+0x22/0x5d
[<
ffffffff81285f23>] ? __this_cpu_preempt_check+0x13/0x20
[<
ffffffff8109fe19>] ? trace_hardirqs_on_caller+0x119/0x270
[<
ffffffff811974ac>] SyS_ioctl+0x3c/0x80
[<
ffffffff814925ed>] system_call_fastpath+0x1a/0x1f
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Sun, 19 Oct 2014 08:50:27 +0000 (14:20 +0530)]
enic: fix possible deadlock in enic_stop/ enic_rfs_flw_tbl_free
The following warning is shown when spinlock debug is enabled.
This occurs when enic_flow_may_expire timer function is running and
enic_stop is called on same CPU.
Fix this by using spink_lock_bh().
=================================
[ INFO: inconsistent lock state ]
3.17.0-netnext-05504-g59f35b8 #268 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
ifconfig/443 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&enic->rfs_h.lock)->rlock){+.?...}, at:
enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
{IN-SOFTIRQ-W} state was registered at:
[<
ffffffff810a25af>] __lock_acquire+0x83f/0x21c0
[<
ffffffff810a45f2>] lock_acquire+0xa2/0xd0
[<
ffffffff814913fc>] _raw_spin_lock+0x3c/0x80
[<
ffffffffa029c3d5>] enic_flow_may_expire+0x25/0x130[enic]
[<
ffffffff810bcd07>] call_timer_fn+0x77/0x100
[<
ffffffff810bd8e3>] run_timer_softirq+0x1e3/0x270
[<
ffffffff8105f9ae>] __do_softirq+0x14e/0x280
[<
ffffffff8105fdae>] irq_exit+0x8e/0xb0
[<
ffffffff8103da0f>] smp_apic_timer_interrupt+0x3f/0x50
[<
ffffffff81493742>] apic_timer_interrupt+0x72/0x80
[<
ffffffff81018143>] default_idle+0x13/0x20
[<
ffffffff81018a6a>] arch_cpu_idle+0xa/0x10
[<
ffffffff81097676>] cpu_startup_entry+0x2c6/0x330
[<
ffffffff8103b7ad>] start_secondary+0x21d/0x290
irq event stamp: 2997
hardirqs last enabled at (2997): [<
ffffffff81491865>] _raw_spin_unlock_irqrestore+0x65/0x90
hardirqs last disabled at (2996): [<
ffffffff814915e6>] _raw_spin_lock_irqsave+0x26/0x90
softirqs last enabled at (2968): [<
ffffffff813b57a3>] dev_deactivate_many+0x213/0x260
softirqs last disabled at (2966): [<
ffffffff813b5783>] dev_deactivate_many+0x1f3/0x260
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&enic->rfs_h.lock)->rlock);
<Interrupt>
lock(&(&enic->rfs_h.lock)->rlock);
*** DEADLOCK ***
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Oct 2014 16:38:19 +0000 (12:38 -0400)]
Merge branch 'gso_encap_fixes'
Florian Westphal says:
====================
net: minor gso encapsulation fixes
The following series fixes a minor bug in the gso segmentation handlers
when encapsulation offload is used.
Theoretically this could cause kernel panic when the stack tries
to software-segment such a GRE offload packet, but it looks like there
is only one affected call site (tbf scheduler) and it handles NULL
return value.
I've included a followup patch to add IS_ERR_OR_NULL checks where needed.
While looking into this, I also found that size computation of the individual
segments is incorrect if skb->encapsulation is set.
Please see individual patches for delta vs. v1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:18 +0000 (13:49 +0200)]
net: core: handle encapsulation offloads when computing segment lengths
if ->encapsulation is set we have to use inner_tcp_hdrlen and add the
size of the inner network headers too.
This is 'mostly harmless'; tbf might send skb that is slightly over
quota or drop skb even if it would have fit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:17 +0000 (13:49 +0200)]
net: make skb_gso_segment error handling more robust
skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.
However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
and would oops when NULL is returned.
Note that these call sites should never actually see such a NULL return
value; all callers mask out the GSO bits in the feature argument.
However, there have been issues with some protocol handlers erronously not
respecting the specified feature mask in some cases.
It is preferable to get 'have to turn off hw offloading, else slow' reports
rather than 'kernel crashes'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:16 +0000 (13:49 +0200)]
net: gso: use feature flag argument in all protocol gso handlers
skb_gso_segment() has a 'features' argument representing offload features
available to the output path.
A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
those instead of the provided ones when handing encapsulation/tunnels.
Depending on dev->hw_enc_features of the output device skb_gso_segment() can
then return NULL even when the caller has disabled all GSO feature bits,
as segmentation of inner header thinks device will take care of segmentation.
This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
that did not fit the remaining token quota as the segmentation does not work
when device supports corresponding hw offload capabilities.
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Oct 2014 15:57:47 +0000 (11:57 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
netfilter fixes for net
The following patchset contains netfilter fixes for your net tree,
they are:
1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules.
2) Restrict nat and masq expressions to the nat chain type. Otherwise,
users may crash their kernel if they attach a nat/masq rule to a non
nat chain.
3) Fix hook validation in nft_compat when non-base chains are used.
Basically, initialize hook_mask to zero.
4) Make sure you use match/targets in nft_compat from the right chain
type. The existing validation relies on the table name which can be
avoided by
5) Better netlink attribute validation in nft_nat. This expression has
to reject the configuration when no address and proto configurations
are specified.
6) Interpret NFTA_NAT_REG_*_MAX if only if NFTA_NAT_REG_*_MIN is set.
Yet another sanity check to reject incorrect configurations from
userspace.
7) Conditional NAT attribute dumping depending on the existing
configuration.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Morgan [Sun, 19 Oct 2014 12:05:13 +0000 (08:05 -0400)]
ax88179_178a: fix bonding failure
The following patch fixes a bug which causes the ax88179_178a driver to be
incapable of being added to a bond.
When I brought up the issue with the bonding maintainers, they indicated
that the real problem was with the NIC driver which must return zero for
success (of setting the MAC address). I see that several other NIC drivers
follow that pattern by either simply always returing zero, or by passing
through a negative (error) result while rewriting any positive return code
to zero. With that same philisophy applied to the ax88179_178a driver, it
allows it to work correctly with the bonding driver.
I believe this is suitable for queuing in -stable, as it's a small, simple,
and obvious fix that corrects a defect with no other known workaround.
This patch is against vanilla 3.17(.0).
Signed-off-by: Ian Morgan <imorgan@primordial.ca>
drivers/net/usb/ax88179_178a.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 19 Oct 2014 19:58:22 +0000 (12:58 -0700)]
Merge tag 'ntb-3.18' of git://github.com/jonmason/ntb
Pull ntb (non-transparent bridge) updates from Jon Mason:
"Add support for Haswell NTB split BARs, a debugfs entry for basic
debugging info, and some code clean-ups"
* tag 'ntb-3.18' of git://github.com/jonmason/ntb:
ntb: Adding split BAR support for Haswell platforms
ntb: use errata flag set via DID to implement workaround
ntb: conslidate reading of PPD to move platform detection earlier
ntb: move platform detection to separate function
NTB: debugfs device entry
Linus Torvalds [Sun, 19 Oct 2014 19:50:44 +0000 (12:50 -0700)]
Merge branch 'i2c/for-next' of git://git./linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"Highlights from the I2C subsystem for 3.18:
- new drivers for Axxia AM55xx, and Hisilicon hix5hd2 SoC.
- designware driver gained AMD support, exynos gained exynos7 support
The rest is usual driver stuff. Hopefully no lowlights this time"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i801: Add Device IDs for Intel Sunrise Point PCH
i2c: hix5hd2: add i2c controller driver
i2c-imx: Disable the clock on probe failure
i2c: designware: Add support for AMD I2C controller
i2c: designware: Rework probe() to get clock a bit later
i2c: designware: Default to fast mode in case of ACPI
i2c: axxia: Add I2C driver for AXM55xx
i2c: exynos: add support for HSI2C module on Exynos7
i2c: mxs: detect No Slave Ack on SELECT in PIO mode
i2c: cros_ec: Remove EC_I2C_FLAG_10BIT
i2c: cros-ec-tunnel: Add of match table
i2c: rcar: remove sign-compare flaw
i2c: ismt: Use minimum descriptor size
i2c: imx: Add arbitration lost check
i2c: rk3x: Remove unlikely() annotations
i2c: rcar: check for no IRQ in rcar_i2c_irq()
i2c: rcar: make rcar_i2c_prepare_msg() *void*
i2c: rcar: simplify check for last message
i2c: designware: add support of platform data to set I2C mode
i2c: designware: add support of I2C standard mode
Linus Torvalds [Sun, 19 Oct 2014 19:45:36 +0000 (12:45 -0700)]
Merge tag 'sound-fix-3.18-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are a collection of small fixes after 3.18 merge.
The urgent one is the fix for kernel panics with linked PCM substream
triggered by the recent nonatomic PCM ops support. Other two fixes
(emu10k1 and bebob) are stable fixes, and one easy PCI ID addition for
a new Intel HD-audio controller"
* tag 'sound-fix-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda_intel: Add Device IDs for Intel Sunrise Point PCH
ALSA: emu10k1: Fix deadlock in synth voice lookup
ALSA: pcm: Fix referred substream in snd_pcm_action_group() unlock loop
ALSA: bebob: Fix failure to detect source of clock for Terratec Phase 88
Linus Torvalds [Sun, 19 Oct 2014 19:40:24 +0000 (12:40 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull second round of input updates from Dmitry Torokhov:
"Mostly simple bug fixes, although we do have one brand new driver for
Microchip AR1021 i2c touchscreen.
Also there is the change to stop trying to use i8042 active
multiplexing by default (it is still possible to activate it via
i8042.nomux=0 on boxes that implement it)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xpad - add Thrustmaster as Xbox 360 controller vendor
Input: xpad - add USB ID for Thrustmaster Ferrari 458 Racing Wheel
Input: max77693-haptic - fix state check in imax77693_haptic_disable()
Input: xen-kbdfront - free grant table entry in xenkbd_disconnect_backend
Input: alps - fix v4 button press recognition
Input: i8042 - disable active multiplexing by default
Input: i8042 - add noloop quirk for Asus X750LN
Input: synaptics - gate forcepad support by DMI check
Input: Add Microchip AR1021 i2c touchscreen
Input: cros_ec_keyb - add of match table
Input: serio - avoid negative serio device numbers
Input: avoid negative input device numbers
Input: automatically set EV_ABS bit in input_set_abs_params
Input: adp5588-keys - cancel workqueue in failure path
Input: opencores-kbd - switch to using managed resources
Input: evdev - fix EVIOCG{type} ioctl
Linus Torvalds [Sun, 19 Oct 2014 19:29:23 +0000 (12:29 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband/RDMA updates from Roland Dreier:
- large set of iSER initiator improvements
- hardware driver fixes for cxgb4, mlx5 and ocrdma
- small fixes to core midlayer
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (47 commits)
RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line
RDMA/cxgb4: Add missing neigh_release in find_route
RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss
RDMA/cxgb4: Make c4iw_wr_log_size_order static
IB/core: Fix XRC race condition in ib_uverbs_open_qp
IB/core: Clear AH attr variable to prevent garbage data
RDMA/ocrdma: Save the bit environment, spare unncessary parenthesis
RDMA/ocrdma: The kernel has a perfectly good BIT() macro - use it
RDMA/ocrdma: Don't memset() buffers we just allocated with kzalloc()
RDMA/ocrdma: Remove a unused-label warning
RDMA/ocrdma: Convert kernel VA to PA for mmap in user
RDMA/ocrdma: Get vlan tag from ib_qp_attrs
RDMA/ocrdma: Add default GID at index 0
IB/mlx5, iser, isert: Add Signature API additions
Target/iser: Centralize ib_sig_domain setting
IB/iser: Centralize ib_sig_domain settings
IB/mlx5: Use extended internal signature layout
IB/iser: Set IP_CSUM as default guard type
IB/iser: Remove redundant assignment
IB/mlx5: Use enumerations for PI copy mask
...
Linus Torvalds [Sun, 19 Oct 2014 18:55:41 +0000 (11:55 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
"A second (and last) round of late coming fixes and changes, almost all
of them in perf tooling:
User visible tooling changes:
- Add period data column and make it default in 'perf script' (Jiri
Olsa)
- Add a visual cue for toggle zeroing of samples in 'perf top'
(Taeung Song)
- Improve callchains when using libunwind (Namhyung Kim)
Tooling fixes and infrastructure changes:
- Fix for double free in 'perf stat' when using some specific invalid
command line combo (Yasser Shalabi)
- Fix off-by-one bugs in map->end handling (Stephane Eranian)
- Fix off-by-one bug in maps__find(), also related to map->end
handling (Namhyung Kim)
- Make struct symbol->end be the first addr after the symbol range,
to make it match the convention used for struct map->end. (Arnaldo
Carvalho de Melo)
- Fix perf_evlist__add_pollfd() error handling in 'perf kvm stat
live' (Jiri Olsa)
- Fix python test build by moving callchain_param to an object linked
into the python binding (Jiri Olsa)
- Document sysfs events/ interfaces (Cody P Schafer)
- Fix typos in perf/Documentation (Masanari Iida)
- Add missing 'struct option' forward declaration (Arnaldo Carvalho
de Melo)
- Add option to copy events when queuing for sorting across cpu
buffers and enable it for 'perf kvm stat live', to avoid having
events left in the queue pointing to the ring buffer be rewritten
in high volume sessions. (Alexander Yarygin, improving work done
by David Ahern):
- Do not include a struct hists per perf_evsel, untangling the
histogram code from perf_evsel, to pave the way for exporting a
minimalistic tools/lib/api/perf/ library usable by tools/perf and
initially by the rasd daemon being developed by Borislav Petkov,
Robert Richter and Jean Pihet. (Arnaldo Carvalho de Melo)
- Make perf_evlist__open(evlist, NULL, NULL), i.e. without cpu and
thread maps mean syswide monitoring, reducing the boilerplate for
tools that only want system wide mode. (Arnaldo Carvalho de Melo)
- Move exit stuff from perf_evsel__delete to perf_evsel__exit, delete
should be just a front end for exit + free (Arnaldo Carvalho de
Melo)
- Add support to new style format of kernel PMU event. (Kan Liang)
and other misc fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
perf script: Add period as a default output column
perf script: Add period data column
perf evsel: No need to drag util/cgroup.h
perf evlist: Add missing 'struct option' forward declaration
perf evsel: Move exit stuff from __delete to __exit
kprobes/x86: Remove stale ARCH_SUPPORTS_KPROBES_ON_FTRACE define
perf kvm stat live: Enable events copying
perf session: Add option to copy events when queueing
perf Documentation: Fix typos in perf/Documentation
perf trace: Use thread_{,_set}_priv helpers
perf kvm: Use thread_{,_set}_priv helpers
perf callchain: Create an address space per thread
perf report: Set callchain_param.record_mode for future use
perf evlist: Fix for double free in tools/perf stat
perf test: Add test case for pmu event new style format
perf tools: Add support to new style format of kernel PMU event
perf tools: Parse the pmu event prefix and suffix
Revert "perf tools: Default to cpu// for events v5"
perf Documentation: Remove Ruplicated docs for powerpc cpu specific events
perf Documentation: sysfs events/ interfaces
...
Linus Torvalds [Sun, 19 Oct 2014 18:46:09 +0000 (11:46 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Here we have two bug fixes:
1) The current thread's fault_code is not setup properly upon entry to
do_sparc64_fault() in some paths, leading to spurious SIGBUS.
2) Don't use a zero length array at the end of thread_info on sparc64,
otherwise end_of_stack() isn't right"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Do not define thread fpregs save area as zero-length array.
sparc64: Fix corrupted thread fault code.
Linus Torvalds [Sun, 19 Oct 2014 18:41:57 +0000 (11:41 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"A quick batch of bug fixes:
1) Fix build with IPV6 disabled, from Eric Dumazet.
2) Several more cases of caching SKB data pointers across calls to
pskb_may_pull(), thus referencing potentially free'd memory. From
Li RongQing.
3) DSA phy code tests operation presence improperly, instead of going:
if (x->ops->foo)
r = x->ops->foo(args);
it was going:
if (x->ops->foo(args))
r = x->ops->foo(args);
Fix from Andew Lunn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
Net: DSA: Fix checking for get_phy_flags function
ipv6: fix a potential use after free in sit.c
ipv6: fix a potential use after free in ip6_offload.c
ipv4: fix a potential use after free in gre_offload.c
tcp: fix build error if IPv6 is not enabled
Andrew Lunn [Sun, 19 Oct 2014 14:41:47 +0000 (16:41 +0200)]
Net: DSA: Fix checking for get_phy_flags function
The check for the presence or not of the optional switch function
get_phy_flags() called the function, rather than checked to see if it
is a NULL pointer. This causes a derefernce of a NULL pointer on all
switch chips except the sf2, the only switch to implement this call.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes:
6819563e646a ("net: dsa: allow switch drivers to specify phy_device::dev_flags")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 19 Oct 2014 03:12:33 +0000 (23:12 -0400)]
sparc64: Do not define thread fpregs save area as zero-length array.
This breaks the stack end corruption detection facility.
What that facility does it write a magic value to "end_of_stack()"
and checking to see if it gets overwritten.
"end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is
the beginning of the FPU register save area.
So once the user uses the FPU, the magic value is overwritten and the
debug checks trigger.
Fix this by making the size explicit.
Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we
are limited to 7 levels of FPU state saves. So each FPU register set
is 256 bytes, allocate 256 * 7 for the fpregs area.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 19 Oct 2014 03:03:09 +0000 (23:03 -0400)]
sparc64: Fix corrupted thread fault code.
Every path that ends up at do_sparc64_fault() must install a valid
FAULT_CODE_* bitmask in the per-thread fault code byte.
Two paths leading to the label winfix_trampoline (which expects the
FAULT_CODE_* mask in register %g4) were not doing so:
1) For pre-hypervisor TLB protection violation traps, if we took
the 'winfix_trampoline' path we wouldn't have %g4 initialized
with the FAULT_CODE_* value yet. Resulting in using the
TLB_TAG_ACCESS register address value instead.
2) In the TSB miss path, when we notice that we are going to use a
hugepage mapping, but we haven't allocated the hugepage TSB yet, we
still have to take the window fixup case into consideration and
in that particular path we leave %g4 not setup properly.
Errors on this sort were largely invisible previously, but after
commit
4ccb9272892c33ef1c19a783cfa87103b30c2784 ("sparc64: sun4v TLB
error power off events") we now have a fault_code mask bit
(FAULT_CODE_BAD_RA) that triggers due to this bug.
FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS
(see #1 above) and thus we get seemingly random bus errors triggered
for user processes.
Fixes:
4ccb9272892c ("sparc64: sun4v TLB error power off events")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 19 Oct 2014 01:11:04 +0000 (18:11 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine updates from Vinod Koul:
"For dmaengine contributions we have:
- designware cleanup by Andy
- my series moving device_control users to dmanegine_xxx APIs for
later removal of device_control API
- minor fixes spread over drivers mainly mv_xor, pl330, mmp, imx-sdma
etc"
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (60 commits)
serial: atmel: add missing dmaengine header
dmaengine: remove FSLDMA_EXTERNAL_START
dmaengine: freescale: remove FSLDMA_EXTERNAL_START control method
carma-fpga: move to fsl_dma_external_start()
carma-fpga: use dmaengine_xxx() API
dmaengine: freescale: add and export fsl_dma_external_start()
dmaengine: add dmaengine_prep_dma_sg() helper
video: mx3fb: use dmaengine_terminate_all() API
serial: sh-sci: use dmaengine_terminate_all() API
net: ks8842: use dmaengine_terminate_all() API
mtd: sh_flctl: use dmaengine_terminate_all() API
mtd: fsmc_nand: use dmaengine_terminate_all() API
V4L2: mx3_camer: use dmaengine_pause() API
dmaengine: coh901318: use dmaengine_terminate_all() API
pata_arasan_cf: use dmaengine_terminate_all() API
dmaengine: edma: check for echan->edesc => NULL in edma_dma_pause()
dmaengine: dw: export probe()/remove() and Co to users
dmaengine: dw: enable and disable controller when needed
dmaengine: dw: always export dw_dma_{en,dis}able
dmaengine: dw: introduce dw_dma_on() helper
...
Linus Torvalds [Sun, 19 Oct 2014 01:03:02 +0000 (18:03 -0700)]
Merge tag 'fbdev-3.18' of git://git./linux/kernel/git/tomba/linux
Pull fbdev updates from Tomi Valkeinen:
- new 6x10 font
- various small fixes and cleanups
* tag 'fbdev-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (30 commits)
fonts: Add 6x10 font
videomode: provide dummy inline functions for !CONFIG_OF
video/atmel_lcdfb: Introduce regulator support
fbdev: sh_mobile_hdmi: Re-init regs before irq re-enable on resume
framebuffer: fix screen corruption when copying
framebuffer: fix border color
arm, fbdev, omap2, LLVMLinux: Remove nested function from omapfb
arm, fbdev, omap2, LLVMLinux: Remove nested function from omap2 dss
video: fbdev: valkyriefb.c: use container_of to resolve fb_info_valkyrie from fb_info
video: fbdev: pxafb.c: use container_of to resolve pxafb_info/layer from fb_info
video: fbdev: cyber2000fb.c: use container_of to resolve cfb_info from fb_info
video: fbdev: controlfb.c: use container_of to resolve fb_info_control from fb_info
video: fbdev: sa1100fb.c: use container_of to resolve sa1100fb_info from fb_info
video: fbdev: stifb.c: use container_of to resolve stifb_info from fb_info
video: fbdev: sis: sis_main.c: Cleaning up missing null-terminate in conjunction with strncpy
video: valkyriefb: Fix unused variable warning in set_valkyrie_clock()
video: fbdev: use %*ph specifier to dump small buffers
video: mx3fb: always enable BACKLIGHT_LCD_SUPPORT
video: fbdev: au1200fb: delete double assignment
video: fbdev: sis: delete double assignment
...
Linus Torvalds [Sat, 18 Oct 2014 21:32:31 +0000 (14:32 -0700)]
Merge tag 'kvm-arm-for-3.18-take-2' of git://git./linux/kernel/git/kvmarm/kvmarm
Pull second batch of changes for KVM/{arm,arm64} from Marc Zyngier:
"The most obvious thing is the sizeable MMU changes to support 48bit
VAs on arm64.
Summary:
- support for 48bit IPA and VA (EL2)
- a number of fixes for devices mapped into guests
- yet another VGIC fix for BE
- a fix for CPU hotplug
- a few compile fixes (disabled VGIC, strict mm checks)"
[ I'm pulling directly from Marc at the request of Paolo Bonzini, whose
backpack was stolen at Düsseldorf airport and will do new keys and
rebuild his web of trust. - Linus ]
* tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs
arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort
arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE
arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
arm/arm64: KVM: map MMIO regions at creation time
arm64: kvm: define PAGE_S2_DEVICE as read-only by default
ARM: kvm: define PAGE_S2_DEVICE as read-only by default
arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap
arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()
arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages
ARM: KVM: fix vgic-disabled build
arm: kvm: fix CPU hotplug
Linus Torvalds [Sat, 18 Oct 2014 21:24:36 +0000 (14:24 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
"This is the MIPS pull request for the next kernel:
- Zubair's patch series adds CMA support for MIPS. Doing so it also
touches ARM64 and x86.
- remove the last instance of IRQF_DISABLED from arch/mips
- updates to two of the MIPS defconfig files.
- cleanup of how cache coherency bits are handled on MIPS and
implement support for write-combining.
- platform upgrades for Alchemy
- move MIPS DTS files to arch/mips/boot/dts/"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (24 commits)
MIPS: ralink: remove deprecated IRQF_DISABLED
MIPS: pgtable.h: Implement the pgprot_writecombine function for MIPS
MIPS: cpu-probe: Set the write-combine CCA value on per core basis
MIPS: pgtable-bits: Define the CCA bit for WC writes on Ingenic cores
MIPS: pgtable-bits: Move the CCA bits out of the core's ifdef blocks
MIPS: DMA: Add cma support
x86: use generic dma-contiguous.h
arm64: use generic dma-contiguous.h
asm-generic: Add dma-contiguous.h
MIPS: BPF: Add new emit_long_instr macro
MIPS: ralink: Move device-trees to arch/mips/boot/dts/
MIPS: Netlogic: Move device-trees to arch/mips/boot/dts/
MIPS: sead3: Move device-trees to arch/mips/boot/dts/
MIPS: Lantiq: Move device-trees to arch/mips/boot/dts/
MIPS: Octeon: Move device-trees to arch/mips/boot/dts/
MIPS: Add support for building device-tree binaries
MIPS: Create common infrastructure for building built-in device-trees
MIPS: SEAD3: Enable DEVTMPFS
MIPS: SEAD3: Regenerate defconfigs
MIPS: Alchemy: DB1300: Add touch penirq support
...
Linus Torvalds [Sat, 18 Oct 2014 21:22:32 +0000 (14:22 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mpe/linux
Pull powerpc fix from Michael Ellerman:
"There was a bit of a misunderstanding between us and the ARM guys in
the device tree PCI code, which is breaking virtio on powerpc.
This is the minimal fix until we can sort it out properly"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/pci: Fix IO space breakage after of_pci_range_to_resource() change
Linus Torvalds [Sat, 18 Oct 2014 20:39:19 +0000 (13:39 -0700)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs/smb3 updates from Steve French:
"Improved SMB3 support (symlink and device emulation, and remapping by
default the 7 reserved posix characters) and a workaround for cifs
mounts to Mac (working around a commonly encountered Mac server bug)"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] Remove obsolete comment
Check minimum response length on query_network_interface
Workaround Mac server problem
Remap reserved posix characters by default (part 3/3)
Allow conversion of characters in Mac remap range (part 2)
Allow conversion of characters in Mac remap range. Part 1
mfsymlinks support for SMB2.1/SMB3. Part 2 query symlink
Add mfsymlinks support for SMB2.1/SMB3. Part 1 create symlink
Allow mknod and mkfifo on SMB2/SMB3 mounts
add defines for two new file attributes
Linus Torvalds [Sat, 18 Oct 2014 20:37:19 +0000 (13:37 -0700)]
Merge tag 'dlm-3.18' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm fix from David Teigland:
"This includes a single commit fixing a missing endian conversion"
* tag 'dlm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix missing endian conversion of rcom_status flags
Linus Torvalds [Sat, 18 Oct 2014 20:32:17 +0000 (13:32 -0700)]
Merge branch 'for-linus-update' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs data corruption fix from Chris Mason:
"I'm testing a pull with more fixes, but wanted to get this one out so
Greg can pick it up.
The corruption isn't easy to hit, you have to do a readonly snapshot
and have orphans in the snapshot. But my review and testing missed
the bug. Filipe has added a better xfstest to cover it"
* 'for-linus-update' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Revert "Btrfs: race free update of commit root for ro snapshots"
Linus Torvalds [Sat, 18 Oct 2014 20:25:03 +0000 (13:25 -0700)]
Merge tag 'please-pull-pstore' of git://git./linux/kernel/git/aegl/linux
Pull pstore fix from Tony Luck:
"Ensure unique filenames in pstore"
* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore: Fix duplicate {console,ftrace}-efi entries
Linus Torvalds [Sat, 18 Oct 2014 19:54:46 +0000 (12:54 -0700)]
Merge git://git./linux/kernel/git/aia21/ntfs
Pull NTFS update from Anton Altaparmakov:
"Here is a small NTFS update notably implementing FIBMAP ioctl for NTFS
by adding the bmap address space operation. People seem to still want
FIBMAP"
* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
NTFS: Bump version to 2.1.31.
NTFS: Add bmap address space operation needed for FIBMAP ioctl.
NTFS: Remove changelog from Documentation/filesystems/ntfs.txt.
NTFS: Split ntfs_aops into ntfs_normal_aops and ntfs_compressed_aops in preparation for them diverging.
Linus Torvalds [Sat, 18 Oct 2014 19:52:08 +0000 (12:52 -0700)]
Merge tag 'nfs-for-3.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- fix an uninitialised pointer Oops in the writeback error path
- fix a bogus warning (and early exit from the loop) in nfs_generic_pgio()
Features:
- Add NFSv4.2 SEEK feature and client support for lseek(SEEK_HOLE/SEEK_DATA)
Other fixes:
- pnfs: replace broken pnfs_put_lseg_async
- Remove dead prototype for nfs4_insert_deviceid_node"
* tag 'nfs-for-3.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix a bogus warning in nfs_generic_pgio
NFS: Fix an uninitialised pointer Oops in the writeback error path
NFSv4.1/pnfs: replace broken pnfs_put_lseg_async
NFSv4: Remove dead prototype for nfs4_insert_deviceid_node()
NFS: Implement SEEK
Linus Torvalds [Sat, 18 Oct 2014 19:25:30 +0000 (12:25 -0700)]
Merge tag 'dm-3.18' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"I rebased the DM tree ontop of linux-block.git's 'for-3.18/core' at
the beginning of October because DM core now depends on the newly
introduced bioset_create_nobvec() interface.
Summary:
- fix DM's long-standing excessive use of memory by leveraging the
new bioset_create_nobvec() interface when creating the DM's bioset
- fix a few bugs in dm-bufio and dm-log-userspace
- add DM core support for a DM multipath use-case that requires
loading DM tables that contain devices that have failed (by
allowing active and inactive DM tables to share dm_devs)
- add discard support to the DM raid target; like MD raid456 the user
must opt-in to raid456 discard support be specifying the
devices_handle_discard_safely=Y module param"
* tag 'dm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm log userspace: fix memory leak in dm_ulog_tfr_init failure path
dm bufio: when done scanning return from __scan immediately
dm bufio: update last_accessed when relinking a buffer
dm raid: add discard support for RAID levels 4, 5 and 6
dm raid: add discard support for RAID levels 1 and 10
dm: allow active and inactive tables to share dm_devs
dm mpath: stop queueing IO when no valid paths exist
dm: use bioset_create_nobvec()
dm: remove nr_iovecs parameter from alloc_tio()
Linus Torvalds [Sat, 18 Oct 2014 19:12:45 +0000 (12:12 -0700)]
Merge branch 'for-3.18/drivers' of git://git.kernel.dk/linux-block
Pull block layer driver update from Jens Axboe:
"This is the block driver pull request for 3.18. Not a lot in there
this round, and nothing earth shattering.
- A round of drbd fixes from the linbit team, and an improvement in
asender performance.
- Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and
hd from Michael Opdenacker.
- Disable entropy collection from flash devices by default, from Mike
Snitzer.
- A small collection of xen blkfront/back fixes from Roger Pau Monné
and Vitaly Kuznetsov"
* 'for-3.18/drivers' of git://git.kernel.dk/linux-block:
block: disable entropy contributions for nonrot devices
xen, blkfront: factor out flush-related checks from do_blkif_request()
xen-blkback: fix leak on grant map error path
xen/blkback: unmap all persistent grants when frontend gets disconnected
rsxx: Remove deprecated IRQF_DISABLED
block: hd: remove deprecated IRQF_DISABLED
drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks
drbd: compute the end before rb_insert_augmented()
drbd: Add missing newline in resync progress display in /proc/drbd
drbd: reduce lock contention in drbd_worker
drbd: Improve asender performance
drbd: Get rid of the WORK_PENDING macro
drbd: Get rid of the __no_warn and __cond_lock macros
drbd: Avoid inconsistent locking warning
drbd: Remove superfluous newline from "resync_extents" debugfs entry.
drbd: Use consistent names for all the bi_end_io callbacks
drbd: Use better variable names
Linus Torvalds [Sat, 18 Oct 2014 18:53:51 +0000 (11:53 -0700)]
Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block
Pull core block layer changes from Jens Axboe:
"This is the core block IO pull request for 3.18. Apart from the new
and improved flush machinery for blk-mq, this is all mostly bug fixes
and cleanups.
- blk-mq timeout updates and fixes from Christoph.
- Removal of REQ_END, also from Christoph. We pass it through the
->queue_rq() hook for blk-mq instead, freeing up one of the request
bits. The space was overly tight on 32-bit, so Martin also killed
REQ_KERNEL since it's no longer used.
- blk integrity updates and fixes from Martin and Gu Zheng.
- Update to the flush machinery for blk-mq from Ming Lei. Now we
have a per hardware context flush request, which both cleans up the
code should scale better for flush intensive workloads on blk-mq.
- Improve the error printing, from Rob Elliott.
- Backing device improvements and cleanups from Tejun.
- Fixup of a misplaced rq_complete() tracepoint from Hannes.
- Make blk_get_request() return error pointers, fixing up issues
where we NULL deref when a device goes bad or missing. From Joe
Lawrence.
- Prep work for drastically reducing the memory consumption of dm
devices from Junichi Nomura. This allows creating clone bio sets
without preallocating a lot of memory.
- Fix a blk-mq hang on certain combinations of queue depths and
hardware queues from me.
- Limit memory consumption for blk-mq devices for crash dump
scenarios and drivers that use crazy high depths (certain SCSI
shared tag setups). We now just use a single queue and limited
depth for that"
* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
block: Remove REQ_KERNEL
blk-mq: allocate cpumask on the home node
bio-integrity: remove the needless fail handle of bip_slab creating
block: include func name in __get_request prints
block: make blk_update_request print prefix match ratelimited prefix
blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
block: fix alignment_offset math that assumes io_min is a power-of-2
blk-mq: Make bt_clear_tag() easier to read
blk-mq: fix potential hang if rolling wakeup depth is too high
block: add bioset_create_nobvec()
block: use bio_clone_fast() in blk_rq_prep_clone()
block: misplaced rq_complete tracepoint
sd: Honor block layer integrity handling flags
block: Replace strnicmp with strncasecmp
block: Add T10 Protection Information functions
block: Don't merge requests if integrity flags differ
block: Integrity checksum flag
block: Relocate bio integrity flags
block: Add a disk flag to block integrity profile
block: Add prefix to block integrity profile flags
...
Linus Torvalds [Sat, 18 Oct 2014 18:48:03 +0000 (11:48 -0700)]
Merge tag 'for-linus-
20141015' of git://git.infradead.org/linux-mtd
Pull MTD update from Brian Norris:
"Sorry for delaying this a bit later than usual. There's one mild
regression from 3.16 that was noticed during the 3.17 cycle, and I
meant to send a fix for it along with this pull request. I'll
probably try to queue it up for a later pull request once I've had a
better look at it, hopefully by -rc2 at the latest.
Summary for this pull:
NAND
- Cleanup for Denali driver
- Atmel: add support for new page sizes
- Atmel: fix up 'raw' mode support
- Atmel: miscellaneous cleanups
- New timing mode helpers for non-ONFI NAND
- OMAP: allow driver to be (properly) built as a module
- bcm47xx: RESET support and other cleanups
SPI NOR
- Miscellaneous cleanups, to prepare framework for wider use (some
further work still pending)
- Compile-time configuration to select 4K vs. 64K support for flash
that support both (necessary for using UBIFS on some SPI NOR)
A few scattered code quality fixes, detected by Coverity
See the changesets for more"
* tag 'for-linus-
20141015' of git://git.infradead.org/linux-mtd: (59 commits)
mtd: nand: omap: Correct CONFIG_MTD_NAND_OMAP_BCH help message
mtd: nand: Force omap_elm to be built as a module if omap2_nand is a module
mtd: move support for struct flash_platform_data into m25p80
mtd: spi-nor: add Kconfig option to disable 4K sectors
mtd: nand: Move ELM driver and rename as omap_elm
nand: omap2: Replace pr_err with dev_err
nand: omap2: Remove horrible ifdefs to fix module probe
mtd: nand: add Hynix's H27UCG8T2ATR-BC to nand_ids table
mtd: nand: support ONFI timing mode retrieval for non-ONFI NANDs
mtd: physmap_of: Add non-obsolete map_rom probe
mtd: physmap_of: Fix ROM support via OF
MAINTAINERS: add l2-mtd.git, 'next' tree for MTD
mtd: denali: fix indents and other trivial things
mtd: denali: remove unnecessary parentheses
mtd: denali: remove another set-but-unused variable
mtd: denali: fix include guard and license block of denali.h
mtd: nand: don't break long print messages
mtd: bcm47xxnflash: replace some magic numbers
mtd: bcm47xxnflash: NAND_CMD_RESET support
mtd: bcm47xxnflash: add cmd_ctrl handler
...
Linus Torvalds [Sat, 18 Oct 2014 18:39:52 +0000 (11:39 -0700)]
Merge tag 'md/3.18' of git://neil.brown.name/md
Pull md updates from Neil Brown:
- a few minor bug fixes
- quite a lot of code tidy-up and simplification
- remove PRINT_RAID_DEBUG ioctl. I'm fairly sure it is unused, and it
isn't particularly useful.
* tag 'md/3.18' of git://neil.brown.name/md: (21 commits)
lib/raid6: Add log level to printks
md: move EXPORT_SYMBOL to after function in md.c
md: discard PRINT_RAID_DEBUG ioctl
md: remove MD_BUG()
md: clean up 'exit' labels in md_ioctl().
md: remove unnecessary test for MD_MAJOR in md_ioctl()
md: don't allow "-sync" to be set for device in an active array.
md: remove unwanted white space from md.c
md: don't start resync thread directly from md thread.
md: Just use RCU when checking for overlap between arrays.
md: avoid potential long delay under pers_lock
md: simplify export_array()
md: discard find_rdev_nr in favour of find_rdev_nr_rcu
md: use wait_event() to simplify md_super_wait()
md: be more relaxed about stopping an array which isn't started.
md/raid1: process_checks doesn't use its return value.
md/raid5: fix init_stripe() inconsistencies
md/raid10: another memory leak due to reshape.
md: use set_bit/clear_bit instead of shift/mask for bi_flags changes.
md/raid1: minor typos and reformatting.
...
Linus Torvalds [Sat, 18 Oct 2014 17:26:10 +0000 (10:26 -0700)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security
Pull selinux fix from James Morris:
"Fix for a list corruption bug in the SELinux code"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix inode security list corruption
Linus Torvalds [Sat, 18 Oct 2014 17:25:09 +0000 (10:25 -0700)]
Merge tag 'virtio-next-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
"One cc: stable commit, the rest are a series of minor cleanups which
have been sitting in MST's tree during my vacation. I changed a
function name and made one trivial change, then they spent two days in
linux-next"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
virtio-rng: refactor probe error handling
virtio_scsi: drop scan callback
virtio_balloon: enable VQs early on restore
virtio_scsi: fix race on device removal
virito_scsi: use freezable WQ for events
virtio_net: enable VQs early on restore
virtio_console: enable VQs early on restore
virtio_scsi: enable VQs early on restore
virtio_blk: enable VQs early on restore
virtio_scsi: move kick event out from virtscsi_init
virtio_net: fix use after free on allocation failure
9p/trans_virtio: enable VQs early
virtio_console: enable VQs early
virtio_blk: enable VQs early
virtio_net: enable VQs early
virtio: add API to enable VQs early
virtio_net: minor cleanup
virtio-net: drop config_mutex
virtio_net: drop config_enable
virtio-blk: drop config_mutex
...
Linus Torvalds [Sat, 18 Oct 2014 17:24:26 +0000 (10:24 -0700)]
Merge tag 'modules-next-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull module fix from Rusty Russell:
"A single panic fix for a rare race, stable CC'd"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modules, lock around setting of MODULE_STATE_UNFORMED
Jonathan Corbet [Fri, 17 Oct 2014 12:59:26 +0000 (08:59 -0400)]
MAINTAINERS: Become the docs maintainer
It seems it's my turn to be the documentation maintainer for a bit. My
plan is to work to ensure that docs patches don't fall through the cracks;
I assume most changes will continue to flow through subsystem-specific
trees.
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Wed, 8 Oct 2014 16:02:13 +0000 (09:02 -0700)]
x86,kvm,vmx: Preserve CR4 across VM entry
CR4 isn't constant; at least the TSD and PCE bits can vary.
TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.
This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li RongQing [Sat, 18 Oct 2014 09:33:38 +0000 (17:33 +0800)]
ipv6: fix a potential use after free in sit.c
pskb_may_pull() maybe change skb->data and make iph pointer oboslete,
fix it by geting ip header length directly.
Fixes:
ca15a078 (sit: generate icmpv6 error when receiving icmpv4 error)
Cc: Oussama Ghorbel <ghorbel@pivasoftware.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Sat, 18 Oct 2014 09:27:42 +0000 (17:27 +0800)]
ipv6: fix a potential use after free in ip6_offload.c
pskb_may_pull() maybe change skb->data and make opth pointer oboslete,
so set the opth again
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Sat, 18 Oct 2014 09:26:04 +0000 (17:26 +0800)]
ipv4: fix a potential use after free in gre_offload.c
pskb_may_pull() may change skb->data and make greh pointer oboslete;
so need to reassign greh;
but since first calling pskb_may_pull already ensured that skb->data
has enough space for greh, so move the reference of greh before second
calling pskb_may_pull(), to avoid reassign greh.
Fixes:
7a7ffbabf9("ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC")
Cc: Wei-Chun Chao <weichunc@plumgrid.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 18 Oct 2014 15:34:37 +0000 (08:34 -0700)]
tcp: fix build error if IPv6 is not enabled
$ make M=net/ipv4
CC net/ipv4/route.o
In file included from net/ipv4/route.c:102:0:
include/net/tcp.h: In function ‘tcp_v6_iif’:
include/net/tcp.h:738:32: error: ‘union <anonymous>’ has no member named ‘h6’
return TCP_SKB_CB(skb)->header.h6.iif;
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes:
870c3151382c ("ipv6: introduce tcp_v6_iif()")
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 18 Oct 2014 16:31:37 +0000 (09:31 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Include fixes for netrom and dsa (Fabian Frederick and Florian
Fainelli)
2) Fix FIXED_PHY support in stmmac, from Giuseppe CAVALLARO.
3) Several SKB use after free fixes (vxlan, openvswitch, vxlan,
ip_tunnel, fou), from Li ROngQing.
4) fec driver PTP support fixes from Luwei Zhou and Nimrod Andy.
5) Use after free in virtio_net, from Michael S Tsirkin.
6) Fix flow mask handling for megaflows in openvswitch, from Pravin B
Shelar.
7) ISDN gigaset and capi bug fixes from Tilman Schmidt.
8) Fix route leak in ip_send_unicast_reply(), from Vasily Averin.
9) Fix two eBPF JIT bugs on x86, from Alexei Starovoitov.
10) TCP_SKB_CB() reorganization caused a few regressions, fixed by Cong
Wang and Eric Dumazet.
11) Don't overwrite end of SKB when parsing malformed sctp ASCONF
chunks, from Daniel Borkmann.
12) Don't call sock_kfree_s() with NULL pointers, this function also has
the side effect of adjusting the socket memory usage. From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
r8152: return -EBUSY for runtime suspend
ipv4: fix a potential use after free in fou.c
ipv4: fix a potential use after free in ip_tunnel_core.c
hyperv: Add handling of IP header with option field in netvsc_set_hash()
openvswitch: Create right mask with disabled megaflows
vxlan: fix a free after use
openvswitch: fix a use after free
ipv4: dst_entry leak in ip_send_unicast_reply()
ipv4: clean up cookie_v4_check()
ipv4: share tcp_v4_save_options() with cookie_v4_check()
ipv4: call __ip_options_echo() in cookie_v4_check()
atm: simplify lanai.c by using module_pci_driver
...
Linus Torvalds [Sat, 18 Oct 2014 16:30:41 +0000 (09:30 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull Sparc bugfix from David Miller:
"Sparc64 AES ctr mode bug fix"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix FPU register corruption with AES crypto offload.
Linus Torvalds [Sat, 18 Oct 2014 16:29:59 +0000 (09:29 -0700)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE cleanup from David Miller:
"One IDE driver cleanup"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
Drivers: ide: Remove typedef atiixp_ide_timing