GitHub/LineageOS/android_kernel_motorola_exynos9610.git
12 years agoIPv6 routing, NLM_F_* flag support: warn if new route is created without NLM_F_CREATE
Matti Vaittinen [Mon, 14 Nov 2011 00:14:49 +0000 (00:14 +0000)]
IPv6 routing, NLM_F_* flag support: warn if new route is created without NLM_F_CREATE

The support for NLM_F_* flags at IPv6 routing requests.

Warn if NLM_F_CREATE flag is not defined for RTM_NEWROUTE request,
creating new table. Later NLM_F_CREATE may be required for
new route creation.

Patch created against linux-3.2-rc1

Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/can/mscan: Fix buggy listen only mode setting
Wolfgang Grandegger [Mon, 14 Nov 2011 19:30:05 +0000 (14:30 -0500)]
net/can/mscan: Fix buggy listen only mode setting

This patch fixes an issue introduced recently with commit
452448f9283e1939408b397e87974a418825b0a8.

CC: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoSweep the last of the active .get_drvinfo floors under ethernet/
Rick Jones [Mon, 14 Nov 2011 08:13:25 +0000 (08:13 +0000)]
Sweep the last of the active .get_drvinfo floors under ethernet/

This round of floor sweeping converts strncpy calls in various .get_drvinfo
routines to the preferred strlcpy.  It also does a modicum of other
cleaning in those routines.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: uses build_skb() in receive path
Eric Dumazet [Mon, 14 Nov 2011 06:05:34 +0000 (06:05 +0000)]
bnx2x: uses build_skb() in receive path

bnx2x uses following formula to compute its rx_buf_sz :

dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2

Then core network adds NET_SKB_PAD and SKB_DATA_ALIGN(sizeof(struct
skb_shared_info))

Final allocated size for skb head on x86_64 (L1_CACHE_BYTES = 64,
MTU=1500) : 2112 bytes : SLUB/SLAB round this to 4096 bytes.

Since skb truesize is then bigger than SK_MEM_QUANTUM, we have lot of
false sharing because of mem_reclaim in UDP stack.

One possible way to half truesize is to reduce the need by 64 bytes
(2112 -> 2048 bytes)

Instead of allocating a full cache line at the end of packet for
alignment, we can use the fact that skb_shared_info sits at the end of
skb->head, and we can use this room, if we convert bnx2x to new
build_skb() infrastructure.

skb_shared_info will be initialized after hardware finished its
transfert, so we can eventually overwrite the final padding.

Using build_skb() also reduces cache line misses in the driver, since we
use cache hot skb instead of cold ones. Number of in-flight sk_buff
structures is lower, they are recycled while still hot.

Performance results :

(820.000 pps on a rx UDP monothread benchmark, instead of 720.000 pps)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Tom Herbert <therbert@google.com>
CC: Jamal Hadi Salim <hadi@mojatatu.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Thomas Graf <tgraf@infradead.org>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: introduce build_skb()
Eric Dumazet [Mon, 14 Nov 2011 06:03:34 +0000 (06:03 +0000)]
net: introduce build_skb()

One of the thing we discussed during netdev 2011 conference was the idea
to change some network drivers to allocate/populate their skb at RX
completion time, right before feeding the skb to network stack.

In old days, we allocated skbs when populating the RX ring.

This means bringing into cpu cache sk_buff and skb_shared_info cache
lines (since we clear/initialize them), then 'queue' skb->data to NIC.

By the time NIC fills a frame in skb->data buffer and host can process
it, cpu probably threw away the cache lines from its caches, because lot
of things happened between the allocation and final use.

So the deal would be to allocate only the data buffer for the NIC to
populate its RX ring buffer. And use build_skb() at RX completion to
attach a data buffer (now filled with an ethernet frame) to a new skb,
initialize the skb_shared_info portion, and give the hot skb to network
stack.

build_skb() is the function to allocate an skb, caller providing the
data buffer that should be attached to it. Drivers are expected to call
skb_reserve() right after build_skb() to adjust skb->data to the
Ethernet frame (usually skipping NET_SKB_PAD and NET_IP_ALIGN, but some
drivers might add a hardware provided alignment)

Data provided to build_skb() MUST have been allocated by a prior
kmalloc() call, with enough room to add SKB_DATA_ALIGN(sizeof(struct
skb_shared_info)) bytes at the end of the data without corrupting
incoming frame.

data = kmalloc(NET_SKB_PAD + NET_IP_ALIGN + 1536 +
               SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
       GFP_ATOMIC);
...
skb = build_skb(data);
if (!skb) {
recycle_data(data);
} else {
skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
...
}

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Tom Herbert <therbert@google.com>
CC: Jamal Hadi Salim <hadi@mojatatu.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Thomas Graf <tgraf@infradead.org>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: fsl_pq_mdio: fix non tbi phy access
Baruch Siach [Mon, 14 Nov 2011 06:21:30 +0000 (08:21 +0200)]
net: fsl_pq_mdio: fix non tbi phy access

Since 952c5ca1 (fsl_pq_mdio: Clean up tbi address configuration) .probe returns
-EBUSY when the "tbi-phy" node is missing. Fix this.

Cc: Andy Fleming <afleming@freescale.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/can/mscan: add listen only mode
Marc Kleine-Budde [Wed, 9 Nov 2011 10:50:49 +0000 (10:50 +0000)]
net/can/mscan: add listen only mode

This patch adds listen only mode to the mscan controller.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoneigh: new unresolved queue limits
Eric Dumazet [Wed, 9 Nov 2011 12:07:14 +0000 (12:07 +0000)]
neigh: new unresolved queue limits

Le mercredi 09 novembre 2011 à 16:21 -0500, David Miller a écrit :
> From: David Miller <davem@davemloft.net>
> Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST)
>
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 09 Nov 2011 12:14:09 +0100
> >
> >> unres_qlen is the number of frames we are able to queue per unresolved
> >> neighbour. Its default value (3) was never changed and is responsible
> >> for strange drops, especially if IP fragments are used, or multiple
> >> sessions start in parallel. Even a single tcp flow can hit this limit.
> >  ...
> >
> > Ok, I've applied this, let's see what happens :-)
>
> Early answer, build fails.
>
> Please test build this patch with DECNET enabled and resubmit.  The
> decnet neigh layer still refers to the removed ->queue_len member.
>
> Thanks.

Ouch, this was fixed on one machine yesterday, but not the other one I
used this morning, sorry.

[PATCH V5 net-next] neigh: new unresolved queue limits

unres_qlen is the number of frames we are able to queue per unresolved
neighbour. Its default value (3) was never changed and is responsible
for strange drops, especially if IP fragments are used, or multiple
sessions start in parallel. Even a single tcp flow can hit this limit.

$ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108
PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data.
8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: add NTF_USE support
stephen hemminger [Wed, 9 Nov 2011 18:30:08 +0000 (18:30 +0000)]
bridge: add NTF_USE support

More changes to the recent code to support control of forwarding
database via netlink.
   * Support NTF_USE like neighbour table
   * Validate state bits from application
   * Only send notifications (and change bits) if new entry is
     different.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoSweep additional floors of strcpy in .get_drvinfo routines
Rick Jones [Wed, 9 Nov 2011 09:58:07 +0000 (09:58 +0000)]
Sweep additional floors of strcpy in .get_drvinfo routines

Perform another round of floor sweeping, converting the .get_drvinfo
routines of additional drivers from strcpy to strlcpy along with
some conversion of sprintf to snprintf.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agofsl_pq_mdio: Clean up tbi address configuration
Andy Fleming [Fri, 11 Nov 2011 05:10:39 +0000 (05:10 +0000)]
fsl_pq_mdio: Clean up tbi address configuration

The code for setting the address of the internal TBI PHY was
convoluted enough without a maze of ifdefs. Clean it up a bit
so we allow the logic to fail down to -ENODEV at the end of
the if/else ladder, rather than using ifdefs to repeat the same
failure code over and over.

Also, remove the support for the auto-configuration. I'm not aware of
anyone using it, and it ends up using the bus mutex before it's been
initialized.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet-forcedeth: Add internal loopback support for forcedeth NICs.
Sanjay Hortikar [Fri, 11 Nov 2011 16:11:21 +0000 (16:11 +0000)]
net-forcedeth: Add internal loopback support for forcedeth NICs.

Support enabling/disabling/querying internal loopback mode for
forcedeth NICs using ethtool.

Signed-off-by: Sanjay Hortikar <horti@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: update documentation
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:41:11 +0000 (07:41 +0000)]
6LoWPAN: update documentation

This patch adds chapter to documentation which describes how to use
6lowpan technology.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: UDP header decompression
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:40:53 +0000 (07:40 +0000)]
6LoWPAN: UDP header decompression

This patch provides possibility to decompress UDP headers.
Derived from Contiki OS.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: UDP header compression
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:40:14 +0000 (07:40 +0000)]
6LoWPAN: UDP header compression

This patch adds support for UDP header compression.
Derived from Contiki OS.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: set proper netdev flags
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:39:37 +0000 (07:39 +0000)]
6LoWPAN: set proper netdev flags

This patch fixes settings for device initialization which makes possible to
use NDISC and TCP.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: disable debugging by default
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:39:15 +0000 (07:39 +0000)]
6LoWPAN: disable debugging by default

This patch disables debug output enabled by default.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago6LoWPAN: add fragmentation support
alex.bluesman.smirnov@gmail.com [Thu, 10 Nov 2011 07:38:38 +0000 (07:38 +0000)]
6LoWPAN: add fragmentation support

This patch adds support for frame fragmentation.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: reduce percpu needs for icmpv6msg mibs
Eric Dumazet [Sun, 13 Nov 2011 01:24:04 +0000 (01:24 +0000)]
ipv6: reduce percpu needs for icmpv6msg mibs

Reading /proc/net/snmp6 on a machine with a lot of cpus is very
expensive (can be ~88000 us).

This is because ICMPV6MSG MIB uses 4096 bytes per cpu, and folding
values for all possible cpus can read 16 Mbytes of memory (32MBytes on
non x86 arches)

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: introduce ethernet teaming device
Jiri Pirko [Fri, 11 Nov 2011 22:16:48 +0000 (22:16 +0000)]
net: introduce ethernet teaming device

This patch introduces new network device called team. It supposes to be
very fast, simple, userspace-driven alternative to existing bonding
driver.

Userspace library called libteam with couple of demo apps is available
here:
https://github.com/jpirko/libteam
Note it's still in its dipers atm.

team<->libteam use generic netlink for communication. That and rtnl
suppose to be the only way to configure team device, no sysfs etc.

Python binding of libteam was recently introduced.
Daemon providing arpmon/miimon active-backup functionality will be
introduced shortly. All what's necessary is already implemented in
kernel team driver.

v7->v8:
- check ndo_ndo_vlan_rx_[add/kill]_vid functions before calling
  them.
- use dev_kfree_skb_any() instead of dev_kfree_skb()

v6->v7:
- transmit and receive functions are not checked in hot paths.
  That also resolves memory leak on transmit when no port is
  present

v5->v6:
- changed couple of _rcu calls to non _rcu ones in non-readers

v4->v5:
- team_change_mtu() uses team->lock while travesing though port
  list
- mac address changes are moved completely to jurisdiction of
  userspace daemon. This way the daemon can do FOM1, FOM2 and
  possibly other weird things with mac addresses.
  Only round-robin mode sets up all ports to bond's address then
  enslaved.
- Extended Kconfig text

v3->v4:
- remove redundant synchronize_rcu from __team_change_mode()
- revert "set and clear of mode_ops happens per pointer, not per
  byte"
- extend comment of function __team_change_mode()

v2->v3:
- team_change_mtu() uses rcu version of list traversal to unwind
- set and clear of mode_ops happens per pointer, not per byte
- port hashlist changed to be embedded into team structure
- error branch in team_port_enter() does cleanup now
- fixed rtln->rtnl

v1->v2:
- modes are made as modules. Makes team more modular and
  extendable.
- several commenters' nitpicks found on v1 were fixed
- several other bugs were fixed.
- note I ignored Eric's comment about roundrobin port selector
  as Eric's way may be easily implemented as another mode (mode
  "random") in future.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: update driver version to 1.70.35-0
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:32 +0000 (04:34 +0000)]
bnx2x: update driver version to 1.70.35-0

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Remove on-stack napi struct variable
Ariel Elior [Sun, 13 Nov 2011 04:34:31 +0000 (04:34 +0000)]
bnx2x: Remove on-stack napi struct variable

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: prevent race in statistics flow
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:30 +0000 (04:34 +0000)]
bnx2x: prevent race in statistics flow

The race may cause access of registers while MAC hw block is
in reset state. As a result syslog will show error messages.
We can prevent this by using state from local variable.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: add fan failure event handling
Ariel Elior [Sun, 13 Nov 2011 04:34:29 +0000 (04:34 +0000)]
bnx2x: add fan failure event handling

Shut down the device in case of fan failure to prevent HW damage.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: remove unused #define
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:28 +0000 (04:34 +0000)]
bnx2x: remove unused #define

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: simplify definition of RX_SGE_MASK_LEN and use it.
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:27 +0000 (04:34 +0000)]
bnx2x: simplify definition of RX_SGE_MASK_LEN and use it.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: DCBX: use #define instead of magic
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:26 +0000 (04:34 +0000)]
bnx2x: DCBX: use #define instead of magic

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: propagate DCBX negotiation
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:25 +0000 (04:34 +0000)]
bnx2x: propagate DCBX negotiation

We need propagate the DCBX results from PMF to other functions
on the same port, in order to properly update netdev structure
and allow following new ETS and PFC configurations.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: separate FCoE and iSCSI license initialization.
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:24 +0000 (04:34 +0000)]
bnx2x: separate FCoE and iSCSI license initialization.

FCoE license info must be initialized at probe(), but
iSCSI at open().

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: remove unused variable
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:23 +0000 (04:34 +0000)]
bnx2x: remove unused variable

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: use rx_queue index for skb_record_rx_queue()
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:22 +0000 (04:34 +0000)]
bnx2x: use rx_queue index for skb_record_rx_queue()

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: allow FCoE and DCB for 578xx
Dmitry Kravkov [Sun, 13 Nov 2011 04:34:21 +0000 (04:34 +0000)]
bnx2x: allow FCoE and DCB for 578xx

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: stop issuing FW cmds if any cmd times out
Sathya Perla [Thu, 10 Nov 2011 19:18:00 +0000 (19:18 +0000)]
be2net: stop issuing FW cmds if any cmd times out

A FW cmd timeout (with a sufficiently large timeout value in the
order of tens of seconds) indicates an unresponsive FW. In this state
issuing further cmds and waiting for a completion will only stall the process.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: don't log more than one error on detecting EEH/UE errors
Sathya Perla [Thu, 10 Nov 2011 19:17:59 +0000 (19:17 +0000)]
be2net: don't log more than one error on detecting EEH/UE errors

Currently we're spamming error messages each time a FW cmd call is made
while in EEH/UE error state. One log msg on error detection is enough.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: stop checking the UE registers after an EEH error
Sathya Perla [Thu, 10 Nov 2011 19:17:58 +0000 (19:17 +0000)]
be2net: stop checking the UE registers after an EEH error

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: init (vf)_if_handle/vf_pmac_id to handle failure scenarios
Sathya Perla [Thu, 10 Nov 2011 19:17:57 +0000 (19:17 +0000)]
be2net: init (vf)_if_handle/vf_pmac_id to handle failure scenarios

Initialize if_handle, vf_if_handle and vf_pmac_id with "-1" so that in
failure cases when be_clear() is called, we can skip over
if_destroy/pmac_del cmds if they have not been created.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: PKTINFO doesnt need dst reference
Eric Dumazet [Wed, 9 Nov 2011 07:24:35 +0000 (07:24 +0000)]
ipv4: PKTINFO doesnt need dst reference

Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :

> At least, in recent kernels we dont change dst->refcnt in forwarding
> patch (usinf NOREF skb->dst)
>
> One particular point is the atomic_inc(dst->refcnt) we have to perform
> when queuing an UDP packet if socket asked PKTINFO stuff (for example a
> typical DNS server has to setup this option)
>
> I have one patch somewhere that stores the information in skb->cb[] and
> avoid the atomic_{inc|dec}(dst->refcnt).
>

OK I found it, I did some extra tests and believe its ready.

[PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference

When a socket uses IP_PKTINFO notifications, we currently force a dst
reference for each received skb. Reader has to access dst to get needed
information (rt_iif & rt_spec_dst) and must release dst reference.

We also forced a dst reference if skb was put in socket backlog, even
without IP_PKTINFO handling. This happens under stress/load.

We can instead store the needed information in skb->cb[], so that only
softirq handler really access dst, improving cache hit ratios.

This removes two atomic operations per packet, and false sharing as
well.

On a benchmark using a mono threaded receiver (doing only recvmsg()
calls), I can reach 720.000 pps instead of 570.000 pps.

IP_PKTINFO is typically used by DNS servers, and any multihomed aware
UDP application.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: reduce percpu needs for icmpmsg mibs
Eric Dumazet [Tue, 8 Nov 2011 13:04:43 +0000 (13:04 +0000)]
ipv4: reduce percpu needs for icmpmsg mibs

Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
(can be ~88000 us).

This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
for all possible cpus can read 16 Mbytes of memory.

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: rename sk_clone to sk_clone_lock
Eric Dumazet [Tue, 8 Nov 2011 22:07:07 +0000 (17:07 -0500)]
net: rename sk_clone to sk_clone_lock

Make clear that sk_clone() and inet_csk_clone() return a locked socket.

Add _lock() prefix and kerneldoc.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_choke: use skb_header_pointer()
Eric Dumazet [Tue, 8 Nov 2011 10:45:04 +0000 (10:45 +0000)]
sch_choke: use skb_header_pointer()

Remove the assumption that skb_get_rxhash() makes IP header and ports
linear, and use skb_header_pointer() instead in choke_match_flow()

This permits __skb_get_rxhash() to use skb_header_pointer() eventually.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoll_temac: Add support for phy_mii_ioctl
Ricardo Ribalda [Mon, 7 Nov 2011 23:47:45 +0000 (23:47 +0000)]
ll_temac: Add support for phy_mii_ioctl

This patch enables the ioctl support for the driver. So userspace
programs like mii-tool can work.

Resend in merge window

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: make ipv6 PKTINFO honour freebind
Maciej Żenczykowski [Mon, 7 Nov 2011 14:57:22 +0000 (14:57 +0000)]
net: make ipv6 PKTINFO honour freebind

This just makes it possible to spoof source IPv6 address on a socket
without having to create and bind a new socket for every source IP
we wish to spoof.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: make ipv6 bind honour freebind
Maciej Żenczykowski [Mon, 7 Nov 2011 14:57:21 +0000 (14:57 +0000)]
net: make ipv6 bind honour freebind

This makes native ipv6 bind follow the precedent set by:
  - native ipv4 bind behaviour
  - dual stack ipv4-mapped ipv6 bind behaviour.

This does allow an unpriviledged process to spoof its source IPv6
address, just like it currently can spoof its source IPv4 address
(for example when using UDP).

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosweep the floors and convert some .get_drvinfo routines to strlcpy
Rick Jones [Mon, 7 Nov 2011 13:29:27 +0000 (13:29 +0000)]
sweep the floors and convert some .get_drvinfo routines to strlcpy

Per the mention made by Ben Hutchings that strlcpy is now the preferred
string copy routine for a .get_drvinfo routine, do a bit of floor
sweeping and convert some of the as-yet unconverted ethernet drivers to
it.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: fasthandoff with ASCONF at server-node
Michio Honda [Fri, 17 Jun 2011 02:22:35 +0000 (11:22 +0900)]
sctp: fasthandoff with ASCONF at server-node

Retransmit chunks to newly confirmed destination when ASCONF and
HEARTBEAT negotiation has success with a single-homed peer.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosctp: fasthandoff with ASCONF at mobile-node
Michio Honda [Fri, 17 Jun 2011 02:03:23 +0000 (11:03 +0900)]
sctp: fasthandoff with ASCONF at mobile-node

Fast retransmission after changing the last address
with ASCONF negotiation

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: better pcpu data alignment
Eric Dumazet [Fri, 4 Nov 2011 23:19:28 +0000 (23:19 +0000)]
net: better pcpu data alignment

Tunnels can force an alignment of their percpu data to reduce number of
cache lines used in fast path, or read in .ndo_get_stats()

percpu_alloc() is a very fine grained allocator, so any small hole will
be used anyway.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv4: Fix inetpeer expire time information
Steffen Klassert [Tue, 11 Oct 2011 01:12:02 +0000 (01:12 +0000)]
ipv4: Fix inetpeer expire time information

As we update the learned pmtu informations on demand, we might
report a nagative expiration time value to userspace if the
pmtu informations are already expired and we have not send a
packet to that inetpeer after expiration. With this patch we
send a expire time of null to userspace after expiration
until the next packet is send to that inetpeer.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: min_pmtu default is 552
Eric Dumazet [Tue, 8 Nov 2011 19:21:44 +0000 (14:21 -0500)]
net: min_pmtu default is 552

Small fix in Documentation, since min_pmtu is 512 + 20 + 20 = 552

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: Fix comments for Nagle algorithm
Feng King [Sat, 5 Nov 2011 04:23:23 +0000 (04:23 +0000)]
tcp: Fix comments for Nagle algorithm

TCP_NODELAY is weaker than TCP_CORK, when TCP_CORK was set, small
segments will always pass Nagle test regardless of TCP_NODELAY option.

Signed-off-by: Feng King <kinwin2008@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosunhme: Allow usage on SBI based SBus systems
oftedal [Mon, 7 Nov 2011 11:47:53 +0000 (11:47 +0000)]
sunhme: Allow usage on SBI based SBus systems

To prevent the SBus driver for Sun Happy Meal cards from being loaded for
PCI cards utilizing the same chipset, a filter was added to the probe
function in commit 0b492fce3d72d982a7981905f85484a1e1ba7fde.

The filter was implemented by checking the name of the parent node in
the OF tree. This patch extends this filter, so that the driver will
load on SBus systems that are based upon SBI SBus Bridges.

Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agol2tp: fix l2tp_udp_recv_core()
Eric Dumazet [Tue, 8 Nov 2011 18:59:44 +0000 (13:59 -0500)]
l2tp: fix l2tp_udp_recv_core()

pskb_may_pull() can change skb->data, so we have to load ptr/optr at the
right place.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoFix incorrect usage of NET_IP_ALIGN
Nico Erfurth [Tue, 8 Nov 2011 07:30:40 +0000 (07:30 +0000)]
Fix incorrect usage of NET_IP_ALIGN

The driver used NET_IP_ALIGN to remove some additional padding inside of
the rx_fixup function. On many architectures NET_IP_ALIGN defaults to 2
which removed the correct amount of bytes.

On MCORE2-machines commit ea812ca1b06113597adcd8e70c0f84a413d97544
introduces a change which sets NET_IP_ALIGN to 0 by default. Which
triggered the bug on these machines.

This fix introduces a new RXW_PADDING define and uses this instead of
NET_IP_ALIGN. The name was taken from the original SMSC7500 driver which
is provided by SMSC.

Signed-off-by: Nico Erfurth <ne@erfurth.eu>
Tested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv6: drop packets when source address is multicast
Brian Haley [Tue, 8 Nov 2011 04:41:42 +0000 (04:41 +0000)]
ipv6: drop packets when source address is multicast

RFC 4291 Section 2.7 says Multicast addresses must not be used as source
addresses in IPv6 packets - drop them on input so we don't process the
packet further.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Reported-and-Tested-by: Kumar Sanghvi <divinekumar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor8169: increase the delay parameter of pm_schedule_suspend
hayeswang [Mon, 7 Nov 2011 20:44:37 +0000 (20:44 +0000)]
r8169: increase the delay parameter of pm_schedule_suspend

The link down would occur when reseting PHY. And it would take about 2 ~ 5 seconds
from link down to link up. If the delay of pm_schedule_suspend is not long enough,
the device would enter runtime_suspend before link up. After link up, the device
would wake up and reset PHY again. Then, you would find the driver keep in a loop
of runtime_suspend and rumtime_resume.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoLinux 3.2-rc1
Linus Torvalds [Tue, 8 Nov 2011 00:16:02 +0000 (16:16 -0800)]
Linux 3.2-rc1

.. with new name.  Because nothing says "really solid kernel release"
like naming it after an extinct animal that just happened to be in the
news lately.

13 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux...
Linus Torvalds [Tue, 8 Nov 2011 00:14:26 +0000 (16:14 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/tmlind/linux-omap

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (31 commits)
  ARM: OMAP: Fix export.h or module.h includes
  ARM: OMAP: omap_device: Include linux/export.h
  ARM: OMAP2: Fix H4 matrix keyboard warning
  ARM: OMAP1: Remove unused omap-alsa.h
  ARM: OMAP1: Fix warnings about enabling 32 KiHz timer
  ARM: OMAP2+: timer: Remove omap_device_pm_latency
  ARM: OMAP2+: clock data: Remove redundant timer clkdev
  ARM: OMAP: Devkit8000: Remove double omap_mux_init_gpio
  ARM: OMAP: usb: musb: OMAP: Delete unused function
  MAINTAINERS: Update linux-omap git repository
  ARM: OMAP: change get_context_loss_count ret value to int
  ARM: OMAP4: hsmmc: configure SDMMC1_DR0 properly
  ARM: OMAP4: hsmmc: Fix Pbias configuration on regulator OFF
  ARM: OMAP3: hwmod: fix variant registration and remove SmartReflex from common list
  ARM: OMAP: I2C: Fix omap_register_i2c_bus() return value on success
  ARM: OMAP: dmtimer: Include linux/module.h
  ARM: OMAP2+: l3-noc: Include linux/module.h
  ARM: OMAP2+: devices: Fixes for McPDM
  ARM: OMAP: Fix errors and warnings when building for one board
  ARM: OMAP3: PM: restrict erratum i443 handling to OMAP3430 only
  ...

13 years agoVFS: we need to set LOOKUP_JUMPED on mountpoint crossing
Al Viro [Mon, 7 Nov 2011 21:21:26 +0000 (21:21 +0000)]
VFS: we need to set LOOKUP_JUMPED on mountpoint crossing

Mountpoint crossing is similar to following procfs symlinks - we do
not get ->d_revalidate() called for dentry we have arrived at, with
unpleasant consequences for NFS4.

Simple way to reproduce the problem in mainline:

    cat >/tmp/a.c <<'EOF'
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>
    main()
    {
            struct flock fl = {.l_type = F_RDLCK, .l_whence = SEEK_SET, .l_len = 1};
            if (fcntl(0, F_SETLK, &fl))
                    perror("setlk");
    }
    EOF
    cc /tmp/a.c -o /tmp/test

then on nfs4:

    mount --bind file1 file2
    /tmp/test < file1 # ok
    /tmp/test < file2 # spews "setlk: No locks available"...

What happens is the missing call of ->d_revalidate() after mountpoint
crossing and that's where NFS4 would issue OPEN request to server.

The fix is simple - treat mountpoint crossing the same way we deal with
following procfs-style symlinks.  I.e.  set LOOKUP_JUMPED...

Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 7 Nov 2011 20:38:11 +0000 (12:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf top: Fix live annotation in the --stdio interface
  perf top tui: Don't recalc column widths considering just the first page
  perf report: Add progress bar when processing time ordered events
  perf hists browser: Warn about lost events
  perf tools: Fix a typo of command name as trace-cmd
  perf hists: Fix recalculation of total_period when sorting entries
  perf header: Fix build on old systems
  perf ui browser: Handle K_RESIZE in dialog windows
  perf ui browser: No need to switch char sets that often
  perf hists browser: Use K_TIMER
  perf ui: Rename ui__warning_paranoid to ui__error_paranoid
  perf ui: Reimplement the popup windows using libslang
  perf ui: Reimplement ui__popup_menu using ui__browser
  perf ui: Reimplement ui_helpline using libslang
  perf ui: Improve handling sigwinch a bit
  perf ui progress: Reimplement using slang
  perf evlist: Fix grouping of multiple events

13 years agoMerge branch 'fixes-modulesplit' into fixes
Tony Lindgren [Mon, 7 Nov 2011 20:27:23 +0000 (12:27 -0800)]
Merge branch 'fixes-modulesplit' into fixes

13 years agoARM: OMAP: Fix export.h or module.h includes
Tony Lindgren [Mon, 7 Nov 2011 20:27:10 +0000 (12:27 -0800)]
ARM: OMAP: Fix export.h or module.h includes

Commit 32aaeffbd4a7457bf2f7448b33b5946ff2a960eb (Merge branch
'modsplit-Oct31_2011'...) caused some build errors. Fix these
and make sure we always have export.h or module.h included
for MODULE_ and EXPORT_SYMBOL users:

$ grep -rl ^MODULE_ arch/arm/*omap*/*.c | xargs \
  grep -L linux/module.h
  arch/arm/mach-omap2/dsp.c
  arch/arm/mach-omap2/mailbox.c
  arch/arm/mach-omap2/omap-iommu.c
  arch/arm/mach-omap2/smartreflex.c

Also check we either have export.h or module.h included
for the files exporting symbols:

$ grep -rl EXPORT_SYMBOL arch/arm/*omap*/*.c | xargs \
  grep -L linux/export.h | xargs grep -L linux/module.h

Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Tony Lindgren <tony@atomide.com>
13 years agoARM: OMAP: omap_device: Include linux/export.h
Axel Lin [Mon, 7 Nov 2011 20:27:10 +0000 (12:27 -0800)]
ARM: OMAP: omap_device: Include linux/export.h

Include linux/export.h to fix below build warning:

  CC      arch/arm/plat-omap/omap_device.o
arch/arm/plat-omap/omap_device.c:1055: warning: data definition has no type or storage class
arch/arm/plat-omap/omap_device.c:1055: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/arm/plat-omap/omap_device.c:1055: warning: parameter names (without types) in function declaration

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 7 Nov 2011 18:55:33 +0000 (10:55 -0800)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  forcedeth: fix a few sparse warnings (variable shadowing)
  forcedeth: Improve stats counters
  forcedeth: remove unneeded stats updates
  forcedeth: Acknowledge only interrupts that are being processed
  forcedeth: fix race when unloading module
  MAINTAINERS/rds: update maintainer
  wanrouter: Remove kernel_lock annotations
  usbnet: fix oops in usbnet_start_xmit
  ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
  etherh: Add MAINTAINERS entry for etherh
  bonding: comparing a u8 with -1 is always false
  sky2: fix regression on Yukon Optima
  netlink: clarify attribute length check documentation
  netlink: validate NLA_MSECS length
  i825xx:xscale:8390:freescale: Fix Kconfig dependancies
  macvlan: receive multicast with local address
  tg3: Update version to 3.121
  tg3: Eliminate timer race with reset_task
  tg3: Schedule at most one tg3_reset_task run
  tg3: Obtain PCI function number from device
  ...

13 years agovfs: d_invalidate() should leave mountpoints alone
Al Viro [Mon, 7 Nov 2011 16:39:57 +0000 (16:39 +0000)]
vfs: d_invalidate() should leave mountpoints alone

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoforcedeth: fix a few sparse warnings (variable shadowing)
david decotigny [Sat, 5 Nov 2011 14:38:24 +0000 (14:38 +0000)]
forcedeth: fix a few sparse warnings (variable shadowing)

This fixes the following sparse warnings:
drivers/net/ethernet/nvidia/forcedeth.c:2113:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2102:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2155:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2102:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2227:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2215:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2271:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2215:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2986:20: warning: symbol 'addr' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2963:6: originally declared here

Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoforcedeth: Improve stats counters
Mandeep Baines [Sat, 5 Nov 2011 14:38:23 +0000 (14:38 +0000)]
forcedeth: Improve stats counters

Rx byte count was off; instead use the hardware's count.  Tx packet
count was counting pre-TSO packets; instead count on-the-wire packets.
Report hardware dropped frame count as rx_fifo_errors.

- The count of transmitted packets reported by the forcedeth driver
  reports pre-TSO (TCP Segmentation Offload) packet counts and not the
  count of the number of packets sent on the wire. This change fixes
  the forcedeth driver to report the correct count. Fixed the code by
  copying the count stored in the NIC H/W to the value reported by the
  driver.

- Count rx_drop_frame errors as rx_fifo_errors:
  We see a lot of rx_drop_frame errors if we disable the rx bottom-halves
  for too long.  Normally, rx_fifo_errors would be counted in this case.
  The rx_drop_frame error count is private to forcedeth and is not
  reported by ifconfig or sysfs.  The rx_fifo_errors count is currently
  unused in the forcedeth driver.  It is reported by ifconfig as overruns.
  This change reports rx_drop_frame errors as rx_fifo_errors.

Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoforcedeth: remove unneeded stats updates
david decotigny [Sat, 5 Nov 2011 14:38:22 +0000 (14:38 +0000)]
forcedeth: remove unneeded stats updates

Function ndo_get_stats() updates most of the stats from hardware
registers, making the manual updates un-needed. This change removes
these manual updates. Main exception is rx_missed_errors which needs
manual update.

Another exception is rx_packets, still updated manually in this commit
to make sure this patch doesn't change behavior of driver. This will
be addressed by a future patch.

Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoforcedeth: Acknowledge only interrupts that are being processed
Mike Ditto [Sat, 5 Nov 2011 14:38:21 +0000 (14:38 +0000)]
forcedeth: Acknowledge only interrupts that are being processed

This is to avoid a race, accidentally acknowledging an interrupt that
we didn't notice and won't immediately process.  This is based solely
on code inspection; it is not known if there was an actual bug here.

Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoforcedeth: fix race when unloading module
david decotigny [Sat, 5 Nov 2011 14:38:20 +0000 (14:38 +0000)]
forcedeth: fix race when unloading module

When forcedeth module is unloaded, there exists a path that can lead
to mod_timer() after del_timer_sync(), causing an oops. This patch
short-circuits this unneeded path, which originates in
nv_get_ethtool_stats().

Tested:
  x86_64 16-way + 3 ethtool -S infinite loops + 100Mbps incoming traffic
  + rmmod/modprobe/ifconfig in a loop

Initial-Author: Salman Qazi <sqazi@google.com>
Discussion: http://patchwork.ozlabs.org/patch/123548/

Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodevice-mapper: using EXPORT_SYBOL in dm-space-map-checker.c needs export.h
Stephen Rothwell [Tue, 1 Nov 2011 09:27:43 +0000 (20:27 +1100)]
device-mapper: using EXPORT_SYBOL in dm-space-map-checker.c needs export.h

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agodevice-mapper: dm-bufio.c needs to include module.h
Stephen Rothwell [Tue, 1 Nov 2011 07:30:49 +0000 (18:30 +1100)]
device-mapper: dm-bufio.c needs to include module.h

since it uses the module facilities.

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agodrivers/md: change module.h -> export.h in persistent-data/dm-*
Paul Gortmaker [Wed, 28 Sep 2011 22:29:32 +0000 (18:29 -0400)]
drivers/md: change module.h -> export.h in persistent-data/dm-*

For the files which are not themselves modular, we can change
them to include only the smaller export.h since all they are
doing is looking for EXPORT_SYMBOL.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoarm: Add export.h to recently added files for EXPORT_SYMBOL
Paul Gortmaker [Sun, 9 Oct 2011 03:24:48 +0000 (23:24 -0400)]
arm: Add export.h to recently added files for EXPORT_SYMBOL

These files didn't exist at the time of the module.h split, and
so were not fixed by the commits on that baseline.  Since they use
the EXPORT_SYMBOL and/or THIS_MODULE macros, they will need the
new export.h file included that provides them.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMAINTAINERS/rds: update maintainer
Or Gerlitz [Mon, 7 Nov 2011 18:28:20 +0000 (13:28 -0500)]
MAINTAINERS/rds: update maintainer

update for the actual maintainer

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agowanrouter: Remove kernel_lock annotations
Richard Weinberger [Mon, 7 Nov 2011 18:27:30 +0000 (13:27 -0500)]
wanrouter: Remove kernel_lock annotations

The BKL is gone, these annotations are useless.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agousbnet: fix oops in usbnet_start_xmit
Konstantin Khlebnikov [Mon, 7 Nov 2011 05:54:58 +0000 (05:54 +0000)]
usbnet: fix oops in usbnet_start_xmit

This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
SKB can be NULL at this point, at least for cdc-ncm.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
Rose, Gregory V [Mon, 7 Nov 2011 07:44:17 +0000 (07:44 +0000)]
ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined

Fix compiler errors and warnings with CONFIG_PCI_IOV defined and not
defined.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Linus Torvalds [Mon, 7 Nov 2011 18:13:52 +0000 (10:13 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  cpuidle: Single/Global registration of idle states
  cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
  cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
  cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
  ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
  ACPI: Export FADT pm_profile integer value to userspace
  thermal: Prevent polling from happening during system suspend
  ACPI: Drop ACPI_NO_HARDWARE_INIT
  ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
  PNPACPI: Simplify disabled resource registration
  ACPI: Fix possible recursive locking in hwregs.c
  ACPI: use kstrdup()
  mrst pmu: update comment
  tools/power turbostat: less verbose debugging

13 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Mon, 7 Nov 2011 18:01:56 +0000 (10:01 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (40 commits)
  vmwgfx: Snoop DMA transfers with non-covering sizes
  vmwgfx: Move the prefered mode first in the list
  vmwgfx: Unreference surface on cursor error path
  vmwgfx: Free prefered mode on error path
  vmwgfx: Use pointer return error codes
  vmwgfx: Fix hw cursor position
  vmwgfx: Infrastructure for explicit placement
  vmwgfx: Make the preferred autofit mode have a 60Hz vrefresh
  vmwgfx: Remove screen object active list
  vmwgfx: Screen object cleanups
  drm/radeon/kms: consolidate GART code, fix segfault after GPU lockup V2
  drm/radeon/kms: don't poll forever if MC GDDR link training fails
  drm/radeon/kms: fix DP setup on TRAVIS bridges
  drm/radeon/kms: set HPD polarity in hpd_init()
  drm/radeon/kms: add MSI module parameter
  drm/radeon/kms: Add MSI quirk for Dell RS690
  drm/radeon/kms: Add MSI quirk for HP RS690
  drm/radeon/kms: split MSI check into a separate function
  vmwgfx: Reinstate the update_layout ioctl
  drm/radeon/kms: always do extended edid probe
  ...

13 years agoMerge branch 'urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 7 Nov 2011 17:59:02 +0000 (09:59 -0800)]
Merge branch 'urgent-for-linus' of git://git./linux/kernel/git/wfg/linux

* 'urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: fix uninitialized task_ratelimit

13 years agoMerge git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Mon, 7 Nov 2011 17:56:22 +0000 (09:56 -0800)]
Merge git://git.samba.org/sfrench/cifs-2.6

* git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Cleanup byte-range locking code style
  CIFS: Simplify setlk error handling for mandatory locking

13 years agoMerge git://git.infradead.org/mtd-2.6
Linus Torvalds [Mon, 7 Nov 2011 17:11:16 +0000 (09:11 -0800)]
Merge git://git.infradead.org/mtd-2.6

* git://git.infradead.org/mtd-2.6: (226 commits)
  mtd: tests: annotate as DANGEROUS in Kconfig
  mtd: tests: don't use mtd0 as a default
  mtd: clean up usage of MTD_DOCPROBE_ADDRESS
  jffs2: add compr=lzo and compr=zlib options
  jffs2: implement mount option parsing and compression overriding
  mtd: nand: initialize ops.mode
  mtd: provide an alias for the redboot module name
  mtd: m25p80: don't probe device which has status of 'disabled'
  mtd: nand_h1900 never worked
  mtd: Add DiskOnChip G3 support
  mtd: m25p80: add EON flash EN25Q32B into spi flash id table
  mtd: mark block device queue as non-rotational
  mtd: r852: make r852_pm_ops static
  mtd: m25p80: add support for at25df321a spi data flash
  mtd: mxc_nand: preset_v1_v2: unlock all NAND flash blocks
  mtd: nand: switch `check_pattern()' to standard `memcmp()'
  mtd: nand: invalidate cache on unaligned reads
  mtd: nand: do not scan bad blocks with NAND_BBT_NO_OOB set
  mtd: nand: wait to set BBT version
  mtd: nand: scrub BBT on ECC errors
  ...

Fix up trivial conflicts:
 - arch/arm/mach-at91/board-usb-a9260.c
Merged into board-usb-a926x.c
 - drivers/mtd/maps/lantiq-flash.c
add_mtd_partitions -> mtd_device_register vs changed to use
mtd_device_parse_register.

13 years agoMerge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
Linus Torvalds [Mon, 7 Nov 2011 16:52:19 +0000 (08:52 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6

* 'linux-next' of git://git.infradead.org/ubifs-2.6:
  UBIFS: fix the dark space calculation
  UBIFS: introduce a helper to dump scanning info

13 years agovmwgfx: Snoop DMA transfers with non-covering sizes
Jakob Bornecrantz [Thu, 3 Nov 2011 20:03:08 +0000 (21:03 +0100)]
vmwgfx: Snoop DMA transfers with non-covering sizes

Enough to get cursors working under Wayland.

Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Move the prefered mode first in the list
Jakob Bornecrantz [Thu, 3 Nov 2011 20:03:07 +0000 (21:03 +0100)]
vmwgfx: Move the prefered mode first in the list

Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Unreference surface on cursor error path
Jakob Bornecrantz [Thu, 3 Nov 2011 20:03:06 +0000 (21:03 +0100)]
vmwgfx: Unreference surface on cursor error path

Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Free prefered mode on error path
Jakob Bornecrantz [Thu, 3 Nov 2011 20:03:05 +0000 (21:03 +0100)]
vmwgfx: Free prefered mode on error path

Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Use pointer return error codes
Jakob Bornecrantz [Thu, 3 Nov 2011 20:03:04 +0000 (21:03 +0100)]
vmwgfx: Use pointer return error codes

Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Fix hw cursor position
Thomas Hellstrom [Wed, 2 Nov 2011 08:43:12 +0000 (09:43 +0100)]
vmwgfx: Fix hw cursor position

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Infrastructure for explicit placement
Thomas Hellstrom [Wed, 2 Nov 2011 08:43:11 +0000 (09:43 +0100)]
vmwgfx: Infrastructure for explicit placement

Make it possible to use explicit placement
(although not hooked up with a user-space interface yet)
and relax the single framebuffer limit to only apply to implicit placement.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Make the preferred autofit mode have a 60Hz vrefresh
Thomas Hellstrom [Wed, 2 Nov 2011 08:43:10 +0000 (09:43 +0100)]
vmwgfx: Make the preferred autofit mode have a 60Hz vrefresh

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Remove screen object active list
Thomas Hellstrom [Wed, 2 Nov 2011 08:43:09 +0000 (09:43 +0100)]
vmwgfx: Remove screen object active list

It isn't used for anything. Replace with an active bool.

Also make a couple of functions return void instead of int
since their return value wasn't checked anyway.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakbo Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agovmwgfx: Screen object cleanups
Thomas Hellstrom [Wed, 2 Nov 2011 08:43:08 +0000 (09:43 +0100)]
vmwgfx: Screen object cleanups

Remove unused member.
No need to pin / unpin fb.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agowriteback: fix uninitialized task_ratelimit
Wu Fengguang [Mon, 7 Nov 2011 11:19:28 +0000 (19:19 +0800)]
writeback: fix uninitialized task_ratelimit

In balance_dirty_pages() task_ratelimit may be not initialized
(initialization skiped by goto pause), and then used when calling
tracing hook.

Fix it by moving the task_ratelimit assignment before goto pause.

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
13 years agoRevert "hvc_console: display printk messages on console."
Linus Torvalds [Mon, 7 Nov 2011 06:22:16 +0000 (22:22 -0800)]
Revert "hvc_console: display printk messages on console."

This reverts commit 361162459f62dc0826b82c9690a741a940f457f0.

It causes an infinite loop when booting Linux under Xen, as so:

  [    2.382984] console [hvc0] enabled
  [    2.382984] console [hvc0] enabled
  [    2.382984] console [hvc0] enabled
  ...

as reported by Konrad Rzeszutek Wilk.  And Rusty reports the same for
lguest.  He goes on to say:

   "This is not a concurrency problem: the issue seems to be that
    calling register_console() twice on the same struct console is a bad
    idea."

and Greg says he'll fix it up properly at some point later. Revert for now.

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Rusty Russell <rusty@ozlabs.org>
Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Miche Baker-Harvey <miche@google.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agopowerpc: fix building hvc_opal.c
Michael Neuling [Mon, 7 Nov 2011 06:12:28 +0000 (17:12 +1100)]
powerpc: fix building hvc_opal.c

Fix building following build error:

  drivers/tty/hvc/hvc_opal.c:244:12: error: 'THIS_MODULE' undeclared here (not in a function)

Signed-off-by: Michael Neuling <mikey@neuling.org>
[ New file from powerpc tree not following the new rules from the
  module.h split, both of which were merged today.  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 7 Nov 2011 04:20:46 +0000 (20:20 -0800)]
Merge branch 'upstream/jump-label-noearly' of git://git./linux/kernel/git/jeremy/xen

* 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  jump-label: initialize jump-label subsystem much earlier
  x86/jump_label: add arch_jump_label_transform_static()
  s390/jump-label: add arch_jump_label_transform_static()
  jump_label: add arch_jump_label_transform_static() to optimise non-live code updates
  sparc/jump_label: drop arch_jump_label_text_poke_early()
  x86/jump_label: drop arch_jump_label_text_poke_early()
  jump_label: if a key has already been initialized, don't nop it out
  stop_machine: make stop_machine safe and efficient to call early
  jump_label: use proper atomic_t initializer

Conflicts:
 - arch/x86/kernel/jump_label.c
Added __init_or_module to arch_jump_label_text_poke_early vs
removal of that function entirely
 - kernel/stop_machine.c
same patch ("stop_machine: make stop_machine safe and efficient
to call early") merged twice, with whitespace fix in one version

13 years agoMerge branch 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 7 Nov 2011 04:15:05 +0000 (20:15 -0800)]
Merge branch 'upstream/xen-settime' of git://git./linux/kernel/git/jeremy/xen

* 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen/dom0: set wallclock time in Xen
  xen: add dom0_op hypercall
  xen/acpi: Domain0 acpi parser related platform hypercall

13 years agoMerge branch 'stable/cleanups-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 7 Nov 2011 04:13:34 +0000 (20:13 -0800)]
Merge branch 'stable/cleanups-3.2' of git://git./linux/kernel/git/konrad/xen

* 'stable/cleanups-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: use static initializers in xen-balloon.c
  Xen: fix braces and tabs coding style issue in xenbus_probe.c
  Xen: fix braces coding style issue in xenbus_probe.h
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/pci.c
  Xen: fix braces coding style issue in gntdev.c and grant-table.c
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/events.c
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/balloon.c

Fix up trivial whitespace-conflicts in
 drivers/xen/{balloon.c,pci.c,xenbus/xenbus_probe.c}

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Mon, 7 Nov 2011 04:03:41 +0000 (20:03 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (114 commits)
  Btrfs: check for a null fs root when writing to the backup root log
  Btrfs: fix race during transaction joins
  Btrfs: fix a potential btrfs_bio leak on scrub fixups
  Btrfs: rename btrfs_bio multi -> bbio for consistency
  Btrfs: stop leaking btrfs_bios on readahead
  Btrfs: stop the readahead threads on failed mount
  Btrfs: fix extent_buffer leak in the metadata IO error handling
  Btrfs: fix the new inspection ioctls for 32 bit compat
  Btrfs: fix delayed insertion reservation
  Btrfs: ClearPageError during writepage and clean_tree_block
  Btrfs: be smarter about committing the transaction in reserve_metadata_bytes
  Btrfs: make a delayed_block_rsv for the delayed item insertion
  Btrfs: add a log of past tree roots
  btrfs: separate superblock items out of fs_info
  Btrfs: use the global reserve when truncating the free space cache inode
  Btrfs: release metadata from global reserve if we have to fallback for unlink
  Btrfs: make sure to flush queued bios if write_cache_pages waits
  Btrfs: fix extent pinning bugs in the tree log
  Btrfs: make sure btrfs_remove_free_space doesn't leak EAGAIN
  Btrfs: don't wait as long for more batches during SSD log commit
  ...