GitHub/mt8127/android_kernel_alcatel_ttab.git
12 years agosfc: Make all MAC statistics consistently 64 bits wide
Ben Hutchings [Wed, 12 Oct 2011 16:20:25 +0000 (17:20 +0100)]
sfc: Make all MAC statistics consistently 64 bits wide

Currently we use type u64 for byte counts, which can very quickly
exceed 2^32, and unsigned long for packet counts, which do not.  But
it can still take only 20-something minutes to send or receive 2^32
packets, and not all tools properly handle overflow even if they
sample more often than this.

The MAC statistics are all updated synchronously, so it costs very
little to make them all 64-bit regardless of native word size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Rename implementation of ndo_set_rx_mode
Ben Hutchings [Mon, 9 Jan 2012 19:54:44 +0000 (19:54 +0000)]
sfc: Rename implementation of ndo_set_rx_mode

Rename efx_set_multicast_list() to efx_set_rx_mode(), in line
with the operation name net_device_ops::ndo_set_rx_mode.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove redundant 'rc' variable, always set to 0
Ben Hutchings [Mon, 9 Jan 2012 19:54:16 +0000 (19:54 +0000)]
sfc: Remove redundant 'rc' variable, always set to 0

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Minor formatting fixes
Ben Hutchings [Mon, 9 Jan 2012 19:53:41 +0000 (19:53 +0000)]
sfc: Minor formatting fixes

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Use existing local variables instead of repeated indirect lookups
Ben Hutchings [Mon, 9 Jan 2012 19:51:22 +0000 (19:51 +0000)]
sfc: Use existing local variables instead of repeated indirect lookups

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove remnants of on-load self-test
Ben Hutchings [Mon, 9 Jan 2012 19:47:08 +0000 (19:47 +0000)]
sfc: Remove remnants of on-load self-test

The out-of-tree version of the sfc driver used to run a self-test on
each device before registering it.  Although this was never included
in-tree, some functions have checks for this special case which is not
really possible.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove obsolete function efx_dev_name()
Ben Hutchings [Mon, 9 Jan 2012 19:41:48 +0000 (19:41 +0000)]
sfc: Remove obsolete function efx_dev_name()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Update the description of SFC_MTD
Ben Hutchings [Fri, 6 Jan 2012 22:47:17 +0000 (22:47 +0000)]
sfc: Update the description of SFC_MTD

SFC4000 boards also have an EEPROM exposed as MTD.
The boot configuration is accessed through MTD.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Add hwmon driver for boards using SFC9000-family controllers
Ben Hutchings [Fri, 6 Jan 2012 20:25:39 +0000 (20:25 +0000)]
sfc: Add hwmon driver for boards using SFC9000-family controllers

The SFC9000-family controllers have firmware to manage all board
peripherals including temperature, heat sink continuity and voltage
sensors.  The firmware reports sensor alarms, which we log, and
will shut down the board if necessary.

Some users may want to monitor their boards more closely, so add an
hwmon driver that exposes all sensors reported by the firmware.  Move
efx_mcdi_sensor_event() into the new file so it can share the array of
sensor labels with the hwmon driver.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Clean up test interrupt handling
Ben Hutchings [Thu, 5 Jan 2012 20:14:10 +0000 (20:14 +0000)]
sfc: Clean up test interrupt handling

Interrupts are normally generated by the event queues, moderated by
timers.  However, they may also be triggered by detection of a 'fatal'
error condition (e.g. memory parity error) or by the host writing to
certain CSR fields as part of a self-test.

The IRQ level/index used for these on Falcon rev B0 and Siena is set
by the KER_INT_LEVE_SEL field and cached by the driver in
efx_nic::fatal_irq_level.  Since this value is also relevant to
self-tests rename the field to just 'irq_level'.

Avoid unnecessary cache traffic by using a per-channel 'last_irq_cpu'
field and only writing to the per-controller field when the interrupt
matches efx_nic::irq_level.  Remove the volatile qualifier and use
ACCESS_ONCE in the places we read these fields.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agoPartly revert "sfc: Handle serious errors in exactly one interrupt handler"
Ben Hutchings [Fri, 6 Jan 2012 01:08:24 +0000 (01:08 +0000)]
Partly revert "sfc: Handle serious errors in exactly one interrupt handler"

This reverts commit 6369545945b90daa1a73fca174da9194c398417c in
drivers/net/ethernet/sfc/falcon.c.

Unlike the INT_ISR0 register on later controller revisions, the
NET_IVEC_INT_Q bits written to memory are only ever set for
interrupting event queues, not for any other interrupt sources.

By definition there can only be one legacy interrupt handler per
function, so there is no need to worry about detecting a fatal
interrupt more than once.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove dependence on NAPI polling in efx_test_eventq_irq()
Ben Hutchings [Fri, 4 Nov 2011 23:06:04 +0000 (23:06 +0000)]
sfc: Remove dependence on NAPI polling in efx_test_eventq_irq()

We cannot safely assume that the NAPI handler will complete within the
20 ms that we allow for the event self-test.  The handler may be
deferred for longer than this, particularly on realtime kernels.

Instead, check whether either an event has been handled or (as in the
old failure path) whether an interrupt has been received and an event
has been delivered but not yet handled.  Use napi_disable() to
synchronize with the NAPI handler before checking, since it will
clear events before updating eventq_read_ptr.

Remove the test result chan.N.eventq.poll, since it is not an error
if the NAPI handler does not run during the test.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Correct interrupt timer quantum for Siena (normal and turbo mode)
Ben Hutchings [Thu, 8 Dec 2011 19:51:47 +0000 (19:51 +0000)]
sfc: Correct interrupt timer quantum for Siena (normal and turbo mode)

We currently assume that the timer quantum for Siena is 5 us, the same
as for Falcon.  This is not correct; timer ticks are generated on a
rota which takes a minimum of 768 cycles (each event delivery or other
timer change will delay it by 3 cycles).  The timer quantum should be
6.144 or 3.072 us depending on whether turbo mode is active.

Replace EFX_IRQ_MOD_RESOLUTION with a timer_quantum_ns field in struct
efx_nic, initialised by the efx_nic_type::probe function.

While we're at it, replace EFX_IRQ_MOD_MAX with a timer_period_max
field in struct efx_nic_type.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Support extraction of CAPABILITIES from GET_BOARD_CFG response.
Matthew Slattery [Wed, 14 Jul 2010 14:36:19 +0000 (15:36 +0100)]
sfc: Support extraction of CAPABILITIES from GET_BOARD_CFG response.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG
Ben Hutchings [Fri, 4 Nov 2011 22:29:14 +0000 (22:29 +0000)]
sfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG

The netif_dbg() macro is defined in <linux/netdevice.h>.  If the DEBUG
macro is defined, it logs a message at 'debug' level, otherwise it
does nothing.

In net_driver.h we define DEBUG if EFX_ENABLE_DEBUG is defined, but
this is too late for those source files that already got a
definition of netif_dbg() by including <linux/netdevice.h>

Get rid of EFX_ENABLE_DEBUG, and only define and test DEBUG.

In mtd.c, we do not use DEBUG as a condition flag but are forced to
use the DEBUG macro-function from <linux/mtd/mtd.h>.  Undefine DEBUG
before including it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove efx_nic_type::push_multicast_hash operation
Ben Hutchings [Tue, 13 Sep 2011 18:47:48 +0000 (19:47 +0100)]
sfc: Remove efx_nic_type::push_multicast_hash operation

Both implementations of efx_nic_type::reconfigure_mac operation
push the multicast hash filter to the hardware.  It is therefore
redundant to call efx_nic_type::push_multicast_hash as well.

efx_mcdi_mac_reconfigure() also uses this operation, but the
implementation for Siena just uses MCDI anyway.  Merge that into
efx_mcdi_mac_reconfigure().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()
Ben Hutchings [Thu, 8 Sep 2011 01:09:42 +0000 (02:09 +0100)]
sfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()

The latter is only called by the former, which is a very short
wrapper.  Further, gcc 4.5 may currently wrongly warn that the
'faults' variable may be used uninitialised.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Merge efx_mac_operations into efx_nic_type
Ben Hutchings [Fri, 2 Sep 2011 23:15:00 +0000 (00:15 +0100)]
sfc: Merge efx_mac_operations into efx_nic_type

No NICs need to switch efx_mac_operations at run-time, and the MAC
operations are fairly closely bound to NIC types.

Move efx_mac_operations::reconfigure to efx_nic_type::reconfigure_mac
and efx_mac_operations::check_fault fo efx_nic_type::check_mac_fault.
Change callers to call through efx->type or directly if the NIC type
is known.

Remove efx_mac_operations::update_stats.  The implementations for
Falcon used to fetch MAC statistics synchronously and this was used by
efx_register_netdev() to clear statistics after running self-tests.
However, it now only converts statistics that have already been
fetched (and that only for Falcon), and the call from
efx_register_netdev() has no effect.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats
Ben Hutchings [Fri, 2 Sep 2011 22:23:00 +0000 (23:23 +0100)]
sfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats

efx_nic::stats_lock is used to serialise stats updates, but each
reader was dropping it before it finished reading efx_nic::mac_stats.

If there were concurrent stats reads using procfs, or one using procfs
and one using ethtool, an update could race with a read.  On a 32-bit
system, the reader could see word-tearing of 64-bit stats (32 bits of
the old value and 32 bits of the new).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Use new names for MC shared memory layout constants
Ben Hutchings [Tue, 20 Dec 2011 23:52:02 +0000 (23:52 +0000)]
sfc: Use new names for MC shared memory layout constants

These are defined alongside the firmware protocol in mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Make handling of MC reboot more reliable
Ben Hutchings [Tue, 20 Dec 2011 23:39:31 +0000 (23:39 +0000)]
sfc: Make handling of MC reboot more reliable

When the MC reboots, either as part of a firmware upgrade or due to a
bug, it attempts to complete (with an error) any requests that were
outstanding before the reboot.  Since there is an inherent race
condition in checking this, it will also write to a status word in
shared memory.

If we look at each of these separately, we may detect each reboot
twice, resulting in a spurious command failure after a firmware
upgrade or frustrating recovery from a firmware bug.  Instead, if a
request completion indicates a reboot, we must poll and clear the
status word.

This bug was previously masked by use of an incorrect address for the
status word.  Fix that, using the definition now included in
mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove fallback for invalid permanent MAC address
Ben Hutchings [Tue, 20 Dec 2011 01:22:51 +0000 (01:22 +0000)]
sfc: Remove fallback for invalid permanent MAC address

By the time we look at the MAC address in efx_probe_port(), either the
driver or the firmware has already validated the board configuration.
The possibility of having an invalid MAC address just isn't worth
considering.  It certainly isn't worth having a compile-time option
for this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Set default parallelism to per-core by default
Ben Hutchings [Tue, 20 Dec 2011 01:08:05 +0000 (01:08 +0000)]
sfc: Set default parallelism to per-core by default

The previous default of per-package can be more CPU-efficient, but
users generally seem to prefer per-core.  It should also allow
accelerated RFS to direct packets more precisely, if IRQ affinity
is properly spread out.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Rename efx_wanted_channels() to efx_wanted_parallelism()
Ben Hutchings [Tue, 20 Dec 2011 01:15:30 +0000 (01:15 +0000)]
sfc: Rename efx_wanted_channels() to efx_wanted_parallelism()

This function returns the degree of parallelism wanted, which is not
necessarily the total number of channels we want to create.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Update MCDI (firmware interface) definitions
Ben Hutchings [Tue, 20 Dec 2011 00:44:06 +0000 (00:44 +0000)]
sfc: Update MCDI (firmware interface) definitions

Some commands and constants have been renamed; adjust the code
accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Remove unnecessary inclusion of <asm/io.h>, prompted by checkpatch
Ben Hutchings [Thu, 5 Jan 2012 19:09:24 +0000 (19:09 +0000)]
sfc: Remove unnecessary inclusion of <asm/io.h>, prompted by checkpatch

Fix the warning:

WARNING: Use #include <linux/io.h> instead of <asm/io.h>

There is no need for selftest.c to include the file at all.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Const-qualify static data as appropriate, partly prompted by checkpatch
Ben Hutchings [Thu, 5 Jan 2012 19:05:20 +0000 (19:05 +0000)]
sfc: Const-qualify static data as appropriate, partly prompted by checkpatch

Fix the following warnings:

WARNING: struct dev_pm_ops should normally be const
WARNING: static const char * array should probably be static const char * const

Similarly const-qualify struct i2c_board_info, struct i2c_algo_bit_data,
struct efx_ethtool_stat, struct efx_mtd_ops and struct siena_nvram_type_info.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Remove parentheses around return expressions, reported by checkpatch
Ben Hutchings [Thu, 5 Jan 2012 18:54:04 +0000 (18:54 +0000)]
sfc: Remove parentheses around return expressions, reported by checkpatch

Fix the following error:

ERROR: return is not a function, parentheses are not required

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Avoid assignment in an if-statement, reported by checkpatch
Ben Hutchings [Thu, 5 Jan 2012 18:50:29 +0000 (18:50 +0000)]
sfc: Avoid assignment in an if-statement, reported by checkpatch

Fix the following error:

ERROR: do not use assignment in if condition

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agosfc: Fix some formatting errors reported by checkpatch
Ben Hutchings [Thu, 5 Jan 2012 17:19:45 +0000 (17:19 +0000)]
sfc: Fix some formatting errors reported by checkpatch

Fix the following errors and warnings:

ERROR: trailing whitespace
ERROR: spaces required around that '=' (ctx:VxV)
WARNING: please, no space before tabs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
13 years agoipv6/addrconf: speedup /proc/net/if_inet6 filling
Mihai Maruseac [Tue, 3 Jan 2012 23:31:35 +0000 (23:31 +0000)]
ipv6/addrconf: speedup /proc/net/if_inet6 filling

This ensures a linear behaviour when filling /proc/net/if_inet6 thus making
ifconfig run really fast on IPv6 only addresses. In fact, with this patch and
the IPv4 one sent a while ago, ifconfig will run in linear time regardless of
address type.

IPv4 related patch: f04565ddf52e401880f8ba51de0dff8ba51c99fd
 dev: use name hash for dev_seq_ops
 ...

Some statistics (running ifconfig > /dev/null on a different setup):

iface count / IPv6 no-patch time / IPv6 patched time / IPv4 time
----------------------------------------------------------------
      6250  |       0.23 s       |      0.13 s       |  0.11 s
     12500  |       0.62 s       |      0.28 s       |  0.22 s
     25000  |       2.91 s       |      0.57 s       |  0.46 s
     50000  |      11.37 s       |      1.21 s       |  0.94 s
    128000  |      86.78 s       |      3.05 s       |  2.54 s

Signed-off-by: Mihai Maruseac <mmaruseac@ixiacom.com>
Cc: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: place comments before code
Florian Fainelli [Wed, 4 Jan 2012 08:59:38 +0000 (08:59 +0000)]
r6040: place comments before code

checkpatch.pl complained about the line exceding 80 columns, and the
comment was actually on the same line as the code, fix that.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: use __aligned(size)
Florian Fainelli [Wed, 4 Jan 2012 08:59:37 +0000 (08:59 +0000)]
r6040: use __aligned(size)

instead of __attribute__((__aligned(size)__))

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: use definitions for MAC_SM register read/writes
Florian Fainelli [Wed, 4 Jan 2012 08:59:36 +0000 (08:59 +0000)]
r6040: use definitions for MAC_SM register read/writes

Bit 1 is the reset bit of the MAC status machine register, define and
use it.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: use MAC_RST bit definition with MCR1 read/writes
Florian Fainelli [Wed, 4 Jan 2012 08:59:35 +0000 (08:59 +0000)]
r6040: use MAC_RST bit definition with MCR1 read/writes

MAC_RST bit is already defined, use it instead of 0x1 where applicable.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: define more MCR0 register bits
Florian Fainelli [Wed, 4 Jan 2012 08:59:34 +0000 (08:59 +0000)]
r6040: define more MCR0 register bits

Define more MCR0-register bits and use them in place of the bits values.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: remove unused variables and definitions
Florian Fainelli [Wed, 4 Jan 2012 08:59:33 +0000 (08:59 +0000)]
r6040: remove unused variables and definitions

Since the conversion to phylib (3831861b: r6040: implement phylib) some
PHY-related variables and definitions are now useless, remove them.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agor6040: use an unique MDIO bus name
Florian Fainelli [Wed, 4 Jan 2012 08:50:40 +0000 (08:50 +0000)]
r6040: use an unique MDIO bus name

We should use an unique MDIO bus name which does not clash with anything
else in the system like the Fixed MDIO bus. The bus is now named:
r6040-<card number> which is unique in the system.

Reported-by: Vladimir Kolpakov <vova.kolpakov@gmail.com>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoipv6: Check RA for sllao when configuring optimistic ipv6 address (v2)
Neil Horman [Wed, 4 Jan 2012 10:49:15 +0000 (10:49 +0000)]
ipv6: Check RA for sllao when configuring optimistic ipv6 address (v2)

Recently Dave noticed that a test we did in ipv6_add_addr to see if we next hop
route for the interface we're adding an addres to was wrong (see commit
7ffbcecbeed91e5874e9a1cfc4c0cbb07dac3069).  for one, it never triggers, and two,
it was completely wrong to begin with.  This test was meant to cover this
section of RFC 4429:

3.3 Modifications to RFC 2462 Stateless Address Autoconfiguration

   * (modifies section 5.5) A host MAY choose to configure a new address
        as an Optimistic Address.  A host that does not know the SLLAO
        of its router SHOULD NOT configure a new address as Optimistic.
        A router SHOULD NOT configure an Optimistic Address.

This patch should bring us into proper compliance with the above clause.  Since
we only add a SLAAC address after we've received a RA which may or may not
contain a source link layer address option, we can pass a pointer to that option
to addrconf_prefix_rcv (which may be null if the option is not present), and
only set the optimistic flag if the option was found in the RA.

Change notes:
(v2) modified the new parameter to addrconf_prefix_rcv to be a bool rather than
a pointer to make its use more clear as per request from davem.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet_sched: sfq: always randomize hash perturbation
Eric Dumazet [Wed, 4 Jan 2012 06:23:01 +0000 (06:23 +0000)]
net_sched: sfq: always randomize hash perturbation

SFQ q->perturbation is used in sfq_hash() as an input to Jenkins hash.

We currently randomize this 32bit value only if a perturbation timer is
setup.

Its much better to always initialize it to defeat attackers, or else
they can predict very well what kind of packets they have to forge to
hit a particular flow.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet_sched: sfq: fix mem alloc error recovery
Eric Dumazet [Wed, 4 Jan 2012 06:22:24 +0000 (06:22 +0000)]
net_sched: sfq: fix mem alloc error recovery

Since commit 817fb15dfd98 (net_sched: sfq: allow divisor to be a
parameter), we can leave perturbation timer armed if a memory allocation
error aborts sfq_init().

Memory containing active struct timer_list is freed and kernel can
crash.

Call sfq_destroy() from sfq_init() to properly dismantle qdisc.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoethtool: Remove ethtool_ops::set_rx_ntuple operation
Ben Hutchings [Tue, 3 Jan 2012 12:07:59 +0000 (12:07 +0000)]
ethtool: Remove ethtool_ops::set_rx_ntuple operation

All implementations have been converted to implement set_rxnfc
instead.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: Remove now-unused filter function
Ben Hutchings [Tue, 3 Jan 2012 12:05:58 +0000 (12:05 +0000)]
sfc: Remove now-unused filter function

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: Implement ethtool RX NFC rules API instead of n-tuple API
Ben Hutchings [Tue, 3 Jan 2012 12:05:47 +0000 (12:05 +0000)]
sfc: Implement ethtool RX NFC rules API instead of n-tuple API

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: Add support for retrieving and removing filters by ID
Ben Hutchings [Tue, 3 Jan 2012 12:05:39 +0000 (12:05 +0000)]
sfc: Add support for retrieving and removing filters by ID

These new functions will support an implementation of the ethtool
RX NFC rules API.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: Use consistent types for filter IDs, indices and search depths
Ben Hutchings [Tue, 3 Jan 2012 12:05:27 +0000 (12:05 +0000)]
sfc: Use consistent types for filter IDs, indices and search depths

Filter IDs are u32 (but never very large) so an ID/error return
value should have type s32.

Filter indices and search depths are never negative, so should
have type unsigned int.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosfc: Change filter ID generation to satisfy priority semantics of RX NFC
Ben Hutchings [Tue, 3 Jan 2012 12:05:15 +0000 (12:05 +0000)]
sfc: Change filter ID generation to satisfy priority semantics of RX NFC

Also add note that the efx_filter_spec::priority field has nothing
to do with priority between multiple matching filters.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoethtool: Allow drivers to select RX NFC rule locations
Ben Hutchings [Tue, 3 Jan 2012 12:04:51 +0000 (12:04 +0000)]
ethtool: Allow drivers to select RX NFC rule locations

Define special location values for RX NFC that request the driver to
select the actual rule location.  This allows for implementation on
devices that use hash-based filter lookup, whereas currently the API is
more suited to devices with TCAM lookup or linear search.

In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
the structure back to user-space after insertion so that the actual
location is returned.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agogianfar: Reject out-of-range RX NFC locations
Ben Hutchings [Tue, 3 Jan 2012 11:59:30 +0000 (11:59 +0000)]
gianfar: Reject out-of-range RX NFC locations

Currently the driver only uses location values to maintain an ordered
list of filters.  Make it reject location values >= MAX_FILER_IDX
passed to the ETHTOOL_SRXCLSRLINS command, consistent with the range
it reports for the ETHTOOL_GRXCLSRLALL command.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Sebastian Pöhn <sebastian.poehn@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/smsc911x: Check if PHY is in operational mode before software reset
Javier Martinez Canillas [Tue, 3 Jan 2012 13:36:19 +0000 (13:36 +0000)]
net/smsc911x: Check if PHY is in operational mode before software reset

SMSC LAN generation 4 chips integrate an IEEE 802.3 ethernet physical layer.
The PHY driver for this integrated chip enable an energy detect power-down mode.
When the PHY is in a power-down mode, it prevents the MAC portion chip to be
software reseted.

That means that if we compile the kernel with the configuration option SMSC_PHY
enabled and try to bring the network interface up without an cable plug-ed the
PHY will be in a low power mode and the software reset will fail returning -EIO
to user-space:

root@igep00x0:~# ifconfig eth0 up
ifconfig: SIOCSIFFLAGS: Input/output error

This patch disable the energy detect power-down mode before trying to software
reset the LAN chip and re-enables after it was reseted successfully.

Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: phy: smsc: Move SMSC PHY constants to <linux/smscphy.h>
Javier Martinez Canillas [Wed, 4 Jan 2012 01:23:18 +0000 (20:23 -0500)]
net: phy: smsc: Move SMSC PHY constants to <linux/smscphy.h>

SMSC generation 4 LAN chips integrate an IEEE 802.3 ethernet physical layer.
The ethernet driver for this family of devices needs to access the SMSC PHY
registers and bit-fields.

So, this patch moves these constants to a place where it can be used for both
the PHY and LAN drivers.

Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Tue, 3 Jan 2012 20:16:34 +0000 (15:16 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next into for-davem

Conflicts:
drivers/net/wireless/b43/dma.c
drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c

13 years agoskge: fix warning when CONFIG_PM is defined but not CONFIG_PM_SLEEP
Daniel Halperin [Tue, 3 Jan 2012 18:53:16 +0000 (13:53 -0500)]
skge: fix warning when CONFIG_PM is defined but not CONFIG_PM_SLEEP

drivers/net/ethernet/marvell/skge.c:4046: warning: â€˜skge_suspend’ defined but not used
drivers/net/ethernet/marvell/skge.c:4071: warning: â€˜skge_resume’ defined but not used

Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/davinci: do not use all descriptors for tx packets
Sascha Hauer [Tue, 3 Jan 2012 05:27:47 +0000 (05:27 +0000)]
net/davinci: do not use all descriptors for tx packets

The driver uses a shared pool for both rx and tx descriptors.
During open it queues fixed number of 128 descriptors for receive
packets. For each received packet it tries to queue another
descriptor. If this fails the descriptor is lost for rx.
The driver has no limitation on tx descriptors to use, so it
can happen during a nmap / ping -f attack that the driver
allocates all descriptors for tx and looses all rx descriptors.
The driver stops working then.
To fix this limit the number of tx descriptors used to half of
the descriptors available, the rx path uses the other half.

Tested on a custom board using nmap / ping -f to the board from
two different hosts.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet_sched: qdisc_alloc_handle() can be too slow
Eric Dumazet [Tue, 3 Jan 2012 00:00:11 +0000 (00:00 +0000)]
net_sched: qdisc_alloc_handle() can be too slow

When trying to allocate ~32768 qdiscs using autohandle mechanism, we can
fill the space managed by kernel (handles in [8000-FFFF]:0000 range)

But O(N^2) qdisc_alloc_handle() loops 0x10000 times instead of 0x8000

time tc add qdisc add dev eth0 parent 10:7fff pfifo limit 10
RTNETLINK answers: Cannot allocate memory
real    1m54.826s
user    0m0.000s
sys     0m0.004s

INFO: rcu_sched_state detected stall on CPU 0 (t=60000 jiffies)

Half number of loops, and add a cond_resched() call.
We hold rtnl at this point.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_qfq: accurate wsum handling
Eric Dumazet [Mon, 2 Jan 2012 11:47:50 +0000 (11:47 +0000)]
sch_qfq: accurate wsum handling

We can underestimate q->wsum in case of "tc class replace ... qfq"
and/or qdisc_create_dflt() error.

wsum is not really used in fast path, only at qfq qdisc/class setup,
to catch user error.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosch_sfq: dont put new flow at the end of flows
Eric Dumazet [Sun, 1 Jan 2012 18:33:31 +0000 (18:33 +0000)]
sch_sfq: dont put new flow at the end of flows

SFQ enqueue algo puts a new flow _behind_ all pre-existing flows in the
circular list. In fact this is probably an old SFQ implementation bug.

100 Mbits = ~8333 full frames per second, or ~8 frames per ms.

With 50 flows, it means your "new flow" will have to wait 50 packets
being sent before its own packet. Thats the ~6ms.

We certainly can change SFQ to give a priority advantage to new flows,
so that next dequeued packet is taken from a new flow, not an old one.

Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agomlx4_core: Elaborating limitation on VF port options
Yevgeny Petrilin [Mon, 2 Jan 2012 04:07:43 +0000 (04:07 +0000)]
mlx4_core: Elaborating limitation on VF port options

Showing which capabilities are not passed to VF
when executing QUERY_PORT

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agomlx4_core: fix mtt range deallocation
Marcel Apfelbaum [Mon, 2 Jan 2012 04:07:39 +0000 (04:07 +0000)]
mlx4_core: fix mtt range deallocation

The mtt range was allocated in mtt units but deallocated
in segments. Among the rest, this caused crash during hotplug removal

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Marcel Apfelbaum <marcela@mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobonding: fix error handling if slave is busy (v2)
stephen hemminger [Sat, 31 Dec 2011 13:26:46 +0000 (13:26 +0000)]
bonding: fix error handling if slave is busy (v2)

If slave device already has a receive handler registered, then the
error unwind of bonding device enslave function is broken.

The following will leave a pointer to freed memory in the slave
device list, causing a later kernel panic.
# modprobe dummy
# ip li add dummy0-1 link dummy0 type macvlan
# modprobe bonding
# echo +dummy0 >/sys/class/net/bond0/bonding/slaves

The fix is to detach the slave (which removes it from the list)
in the unwind path.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago8139cp: properly config rx mode after resuming
Jason Wang [Fri, 30 Dec 2011 23:44:42 +0000 (23:44 +0000)]
8139cp: properly config rx mode after resuming

Rx mode should be reset after resming, so unconditionally updating rx
mode rather than conditionally updating based on the value we
remembered, otherwise unexpected value may be used by the nic after
resuming.

btw. I find and test this when debugging guest hibernation in qemu, as
I did not have a 8139cp card in hand, this patch is untested in a
physical 8139cp card, plase review it carefully.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years ago8139cp/8139too: do not read into reserved registers
Jason Wang [Fri, 30 Dec 2011 23:44:33 +0000 (23:44 +0000)]
8139cp/8139too: do not read into reserved registers

delay_eeprom() use long read for Cfg9346 register(offset 0x50) which may read
into the area of reserved register(offset 0x53). Use byte read instead.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoixgbe: add support for new 82599 device.
Don Skidmore [Sat, 10 Dec 2011 06:49:43 +0000 (06:49 +0000)]
ixgbe: add support for new 82599 device.

This device uses an already existing DevID but since it supports
WoL we need to add the Sub DevID.  It's support of WoL is limited
to the first port.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: add support for new 82599 device id
Emil Tantilov [Fri, 4 Nov 2011 06:43:29 +0000 (06:43 +0000)]
ixgbe: add support for new 82599 device id

Support for new 82599 based quad port adapter.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: add write flush in ixgbe_clock_out_i2c_byte()
Emil Tantilov [Fri, 4 Nov 2011 06:43:23 +0000 (06:43 +0000)]
ixgbe: add write flush in ixgbe_clock_out_i2c_byte()

I2C access is timing critical. Always do a write flush after writing
to the I2CCTL register.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: fix typo's
Stephen Hemminger [Thu, 22 Dec 2011 16:34:52 +0000 (16:34 +0000)]
ixgbe: fix typo's

Saw typo in one message, so decided to run spell checker.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: fix incorrect PHY register reads
Emil Tantilov [Sat, 10 Dec 2011 08:21:47 +0000 (08:21 +0000)]
ixgbe: fix incorrect PHY register reads

Fix some register reads that had the opcode and register parameters swapped.
Also use define instead of a magic (0x3) number.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoigb: Add flow control advertising to ethtool setting.
Carolyn Wyborny [Fri, 2 Dec 2011 00:03:15 +0000 (00:03 +0000)]
igb: Add flow control advertising to ethtool setting.

Added pause flag for bi-directional flow control advertising to ethtool
settings.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbevf: Fix register defines to correctly handle complex expressions
Alexander Duyck [Thu, 8 Dec 2011 06:36:28 +0000 (06:36 +0000)]
ixgbevf: Fix register defines to correctly handle complex expressions

This patch is meant to address possible issues with the IXGBEVF register
defines generating incorrect values when given a complex expression for the
register offset.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Mon, 2 Jan 2012 23:56:49 +0000 (18:56 -0500)]
Merge git://git./linux/kernel/git/davem/net

13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/blueto...
John W. Linville [Mon, 2 Jan 2012 21:43:54 +0000 (16:43 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/padovan/bluetooth-next

13 years agoMerge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Mon, 2 Jan 2012 20:34:03 +0000 (12:34 -0800)]
Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6

* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  dt/device: Fix auxdata matching to handle entries without a name override

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 2 Jan 2012 03:36:08 +0000 (19:36 -0800)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: ctnetlink: fix timeout calculation
  ipvs: try also real server with port 0 in backup server
  skge: restore rx multicast filter on resume and after config changes
  mlx4_en: nullify cq->vector field when closing completion queue

13 years agonetfilter: nfnetlink_acct: fix nfnl_acct_get operation
Pablo Neira Ayuso [Fri, 30 Dec 2011 19:32:26 +0000 (20:32 +0100)]
netfilter: nfnetlink_acct: fix nfnl_acct_get operation

The get operation was not sending the message that was built to
user-space. This patch also includes the appropriate handling for
the return value of netlink_unicast().

Moreover, fix error codes on error (for example, for non-existing
entry was uncorrect).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agoMerge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Sat, 31 Dec 2011 19:55:06 +0000 (11:55 -0800)]
Merge branch 'fix/asoc' of git://git./linux/kernel/git/tiwai/sound

* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: wm8776: add missing break in sample size switch

13 years agogspca: Fix falling back to lower isoc alt settings
Mauro Carvalho Chehab [Sat, 31 Dec 2011 13:32:03 +0000 (11:32 -0200)]
gspca: Fix falling back to lower isoc alt settings

The current gspca core code has a regression where it no longer properly
falls back to lower alt settings when there is not enough bandwidth.

This causes many iso based usb-1 cameras to not work when plugged into a
usb2 hub or a sandybridge chipset motherboard!

This patch fixes this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agofutex: Fix uninterruptible loop due to gate_area
Hugh Dickins [Sat, 31 Dec 2011 19:44:01 +0000 (11:44 -0800)]
futex: Fix uninterruptible loop due to gate_area

It was found (by Sasha) that if you use a futex located in the gate
area we get stuck in an uninterruptible infinite loop, much like the
ZERO_PAGE issue.

While looking at this problem, PeterZ realized you'll get into similar
trouble when hitting any install_special_pages() mapping.  And are there
still drivers setting up their own special mmaps without page->mapping,
and without special VM or pte flags to make get_user_pages fail?

In most cases, if page->mapping is NULL, we do not need to retry at all:
Linus points out that even /proc/sys/vm/drop_caches poses no problem,
because it ends up using remove_mapping(), which takes care not to
interfere when the page reference count is raised.

But there is still one case which does need a retry: if memory pressure
called shmem_writepage in between get_user_pages_fast dropping page
table lock and our acquiring page lock, then the page gets switched from
filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
Fault it back in to get the page->mapping needed for key->shared.inode.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agonetfilter: ctnetlink: fix timeout calculation
Xi Wang [Fri, 30 Dec 2011 15:40:17 +0000 (10:40 -0500)]
netfilter: ctnetlink: fix timeout calculation

The sanity check (timeout < 0) never works; the dividend is unsigned
and so is the division, which should have been a signed division.

long timeout = (ct->timeout.expires - jiffies) / HZ;
if (timeout < 0)
timeout = 0;

This patch converts the time values to signed for the division.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agoipvs: try also real server with port 0 in backup server
Julian Anastasov [Fri, 30 Dec 2011 05:19:02 +0000 (14:19 +0900)]
ipvs: try also real server with port 0 in backup server

We should not forget to try for real server with port 0
in the backup server when processing the sync message. We should
do it in all cases because the backup server can use different
forwarding method.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13 years agoskge: restore rx multicast filter on resume and after config changes
Florian Zumbiehl [Fri, 30 Dec 2011 17:30:09 +0000 (17:30 +0000)]
skge: restore rx multicast filter on resume and after config changes

Restore skge hardware registers for multicast filtering to their
appropriate values after system resume and after hardware restarts
that are done when changing certain settings.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: query link status in be_open()
Ajit Khaparde [Fri, 30 Dec 2011 12:15:40 +0000 (12:15 +0000)]
be2net: query link status in be_open()

be2net gets an async link status notification from the FW when it creates
an MCC queue. There are some cases in which this gratuitous notification
is not received from FW. To cover this explicitly query the link status
in be_open().

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: fix range check for set_qos for a VF
Ajit Khaparde [Fri, 30 Dec 2011 12:15:30 +0000 (12:15 +0000)]
be2net: fix range check for set_qos for a VF

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: fix be_vlan_add/rem_vid
Ajit Khaparde [Fri, 30 Dec 2011 12:15:12 +0000 (12:15 +0000)]
be2net: fix be_vlan_add/rem_vid

1) fix be_vlan_add/rem_vid to return proper status
2) perform appropriate housekeeping if firmware command succeeds.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agomlx4_en: nullify cq->vector field when closing completion queue
Yevgeny Petrilin [Thu, 29 Dec 2011 05:49:58 +0000 (05:49 +0000)]
mlx4_en: nullify cq->vector field when closing completion queue

Caused loss of connectivity when changing ring size.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetem: fix classful handling
Eric Dumazet [Wed, 28 Dec 2011 23:12:02 +0000 (23:12 +0000)]
netem: fix classful handling

Commit 10f6dfcfde (Revert "sch_netem: Remove classful functionality")
reintroduced classful functionality to netem, but broke basic netem
behavior :

netem uses an t(ime)fifo queue, and store timestamps in skb->cb[]

If qdisc is changed, time constraints are not respected and other qdisc
can destroy skb->cb[] and block netem at dequeue time.

Fix this by always using internal tfifo, and optionally attach a child
qdisc to netem (or a tree of qdiscs)

Example of use :

DEV=eth3
tc qdisc del dev $DEV root
tc qdisc add dev $DEV root handle 30: est 1sec 8sec netem delay 20ms 10ms
tc qdisc add dev $DEV handle 40:0 parent 30:0 tbf \
burst 20480 limit 20480 mtu 1514 rate 32000bps

qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms  10.0ms
 Sent 190792 bytes 413 pkt (dropped 0, overlimits 0 requeues 0)
 rate 18416bit 3pps backlog 0b 0p requeues 0
qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us
 Sent 190792 bytes 413 pkt (dropped 6, overlimits 10 requeues 0)
 backlog 0b 5p requeues 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoIPv6: Avoid taking write lock for /proc/net/ipv6_route
Josh Hunt [Wed, 28 Dec 2011 13:23:07 +0000 (13:23 +0000)]
IPv6: Avoid taking write lock for /proc/net/ipv6_route

During some debugging I needed to look into how /proc/net/ipv6_route
operated and in my digging I found its calling fib6_clean_all() which uses
"write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
found this on 2.6.32, but reading the code I believe the same basic idea
exists currently. Looking at the rtnetlink code they are only calling
"read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
reading from proc isn't the recommended way of fetching the ipv6 route
table; taking a write lock seems unnecessary and would probably cause
network performance issues.

To verify this I loaded up the ipv6 route table and then ran iperf in 3
cases:
  * doing nothing
  * reading ipv6 route table via proc
    (while :; do cat /proc/net/ipv6_route > /dev/null; done)
  * reading ipv6 route table via rtnetlink
    (while :; do ip -6 route show table all > /dev/null; done)

* Load the ipv6 route table up with:
  * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done

* iperf commands:
  * client: iperf -i 1 -V -c <ipv6 addr>
  * server: iperf -V -s

* iperf results - 3 runs each (in Mbits/sec)
  * nothing: client: 927,927,927 server: 927,927,927
  * proc: client: 179,97,96,113 server: 142,112,133
  * iproute: client: 928,927,928 server: 927,927,927

lock_stat shows taking the write lock is causing the slowdown. Using this
info I decided to write a version of fib6_clean_all() which replaces
write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
this new function I see the same results as with my rtnetlink iperf test.

Signed-off-by: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agounix_diag: Fixup RQLEN extension report
Pavel Emelyanov [Fri, 30 Dec 2011 00:54:39 +0000 (00:54 +0000)]
unix_diag: Fixup RQLEN extension report

While it's not too late fix the recently added RQLEN diag extension
to report rqlen and wqlen in the same way as TCP does.

I.e. for listening sockets the ack backlog length (which is the input
queue length for socket) in rqlen and the max ack backlog length in
wqlen, and what the CINQ/OUTQ ioctls do for established.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoaf_unix: Move CINQ/COUTQ code to helpers
Pavel Emelyanov [Fri, 30 Dec 2011 00:54:11 +0000 (00:54 +0000)]
af_unix: Move CINQ/COUTQ code to helpers

Currently tcp diag reports rqlen and wqlen values similar to how
the CINQ/COUTQ iotcls do. To make unix diag report these values
in the same way move the respective code into helpers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur...
Linus Torvalds [Fri, 30 Dec 2011 21:45:34 +0000 (13:45 -0800)]
Merge branch 'fixes' of ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7237/1: PL330: Fix driver freeze
  ARM: 7197/1: errata: Remove SMP dependency for erratum 751472
  ARM: 7196/1: errata: Remove SMP dependency for erratum 720789
  ARM: 7220/1: mmc: mmci: Fixup error handling for dma
  ARM: 7214/1: mmc: mmci: Fixup handling of MCI_STARTBITERR

13 years agounix_diag: Add the MEMINFO extension
Pavel Emelyanov [Fri, 30 Dec 2011 09:27:43 +0000 (09:27 +0000)]
unix_diag: Add the MEMINFO extension

[ Fix indentation of sock_diag*() calls. -DaveM ]

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Fri, 30 Dec 2011 21:43:45 +0000 (13:43 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/arm/arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: plat-orion: make gpiochip label unique
  enable uncompress log on cpuimx35sd
  cpuimx35: fix touchscreen support
  cpuimx35sd: fix Kconfig
  clock-imx35: fix reboot in internal boot mode
  dma: MX3_IPU fix depends
  imx_v4_v5_defconfig: update default configuration
  cpuimx25sd: fix Kconfig
  arm/imx: fix cpufreq section mismatch
  ARM:imx:fix pwm period value
  ARM: OMAP: hwmod data: fix iva and mailbox hwmods for OMAP 3

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 30 Dec 2011 21:42:41 +0000 (13:42 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: sentelic - fix retrieving number of buttons
  Input: sentelic - release mutex upon register write failure

13 years agoinet_diag: Add the SKMEMINFO extension
Pavel Emelyanov [Fri, 30 Dec 2011 00:53:32 +0000 (00:53 +0000)]
inet_diag: Add the SKMEMINFO extension

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosock_diag: Introduce the meminfo nla core (v2)
Pavel Emelyanov [Fri, 30 Dec 2011 00:53:13 +0000 (00:53 +0000)]
sock_diag: Introduce the meminfo nla core (v2)

Add a routine that dumps memory-related values of a socket.
It's made as an array to make it possible to add more stuff
here later without breaking compatibility.

Since v1: The SK_MEMINFO_ constants are in userspace
visible part of sock_diag.h, the rest is under __KERNEL__.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agounix_diag: Include unix_diag.h into header-y target
Pavel Emelyanov [Fri, 30 Dec 2011 00:52:51 +0000 (00:52 +0000)]
unix_diag: Include unix_diag.h into header-y target

The headers check complains it should include the linux/types.h
withing, thus add this one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agosock_diag: Arrange sock_diag.h such that it is exportable to userspace
Pavel Emelyanov [Fri, 30 Dec 2011 00:52:21 +0000 (00:52 +0000)]
sock_diag: Arrange sock_diag.h such that it is exportable to userspace

Properly toss existing components around the ifdef __KERNEL__
and include the header into the header-y target.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Fri, 30 Dec 2011 21:34:22 +0000 (13:34 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: disable use of dcache for readdir etc.

13 years agoMerge branch 'v3.2-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 30 Dec 2011 21:34:00 +0000 (13:34 -0800)]
Merge branch 'v3.2-samsung-fixes-4' of git://git./linux/kernel/git/kgene/linux-samsung

* 'v3.2-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Remove duplicated SROMC static memory mapping
  ARM: SAMSUNG: Fix build error when selecting CPU_FREQ_S3C24XX_DEBUGFS on S3C2440

13 years agoRevert "clockevents: Set noop handler in clockevents_exchange_device()"
Linus Torvalds [Fri, 30 Dec 2011 21:24:40 +0000 (13:24 -0800)]
Revert "clockevents: Set noop handler in clockevents_exchange_device()"

This reverts commit de28f25e8244c7353abed8de0c7792f5f883588c.

It results in resume problems for various people. See for example

  http://thread.gmane.org/gmane.linux.kernel/1233033
  http://thread.gmane.org/gmane.linux.kernel/1233389
  http://thread.gmane.org/gmane.linux.kernel/1233159
  http://thread.gmane.org/gmane.linux.kernel/1227868/focus=1230877

and the fedora and ubuntu bug reports

  https://bugzilla.redhat.com/show_bug.cgi?id=767248
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/904569

which got bisected down to the stable version of this commit.

Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Phil Miller <mille121@illinois.edu>
Reported-by: Philip Langdale <philipl@overt.org>
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <gregkh@suse.de>
Cc: stable@kernel.org # for stable kernels that applied the original
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Fri, 30 Dec 2011 20:13:03 +0000 (12:13 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: iTCO_wdt.c - problems with newer hardware due to SMI clearing (part 2)
  watchdog: hpwdt: Changes to handle NX secure bit in 32bit path
  watchdog: sp805: Fix section mismatch in ID table.
  watchdog: move coh901327 state holders