GitHub/mt8127/android_kernel_alcatel_ttab.git
12 years agotcp: md5: RST: getting md5 key from listener
Shawn Lu [Tue, 31 Jan 2012 22:35:48 +0000 (22:35 +0000)]
tcp: md5: RST: getting md5 key from listener

The TCP RST mechanism is broken with TCP md5 (RFC2385). When a
connection is gone, its md5 key is lost, and an RST sent
without an md5 hash is bound to be ignored by the peer. This can
be a problem, since RST helps protocols like BGP recover
quickly from a peer crash.

In most cases, users of TCP md5, such as BGP and LDP,
have a listener on both sides to accept connections from the peer.
The md5 keys for peers are saved in the listening socket.

There are two cases for finding the md5 key when the connection is
lost:
1. Passive receive RST: The message is sent to a well-known port,
so TCP associates it with the listener. The md5 key is obtained
from the listener.

2. Active receive RST (no sock): The message is sent to the active
side, and there is no socket associated with the message. In this
case, find the listener from the source port, then find the md5 key
from the listener.

We are not losing security here:
the packet is checked against the md5 hash. No RST is generated
if the md5 hash doesn't match or no md5 key can be found.

Signed-off-by: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoxfrm6: remove unneeded NULL check in __xfrm6_output()
Dan Carpenter [Tue, 31 Jan 2012 21:45:26 +0000 (21:45 +0000)]
xfrm6: remove unneeded NULL check in __xfrm6_output()

We don't check for NULL consistently in __xfrm6_output().  If "x" were
NULL here it would lead to an OOPs later.  I asked Steffen Klassert
about this and he suggested that we remove the NULL check.

On 10/29/11, Steffen Klassert <steffen.klassert@secunet.com> wrote:
>> net/ipv6/xfrm6_output.c
>>    148
>>    149 if ((x && x->props.mode == XFRM_MODE_TUNNEL) &&
>>                           ^
>
> x can't be null here. It would be a bug if __xfrm6_output() is called
> without a xfrm_state attached to the skb. I think we can just remove
> this null check.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: md5: protects md5sig_info with RCU
Eric Dumazet [Tue, 31 Jan 2012 18:45:40 +0000 (18:45 +0000)]
tcp: md5: protects md5sig_info with RCU

This patch makes sure we use appropriate memory barriers before
publishing tp->md5sig_info, allowing tcp_md5_do_lookup() to be used from
tcp_v4_send_reset() without holding the socket lock (upcoming patch from
Shawn Lu).

Note we also need to respect an RCU grace period before freeing it, since
we can free the socket without this grace period thanks to
SLAB_DESTROY_BY_RCU.
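
The publish-then-read ordering described above can be sketched in user space with C11 atomics. This is only an analogue, not the kernel code: struct sock_model, struct md5sig_info's single field, and both function names are hypothetical, and the release/acquire pair stands in for the barriers the patch adds. The grace period before freeing is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-socket MD5 key table. */
struct md5sig_info {
    int nkeys;
};

/* Hypothetical stand-in for the socket; only the published pointer matters. */
struct sock_model {
    _Atomic(struct md5sig_info *) md5sig_info;
};

/* Writer: fully initialise the object, then publish it with release
 * ordering so a reader never sees a half-built table. */
int publish_md5sig_info(struct sock_model *sk, int nkeys)
{
    struct md5sig_info *info;

    assert(sk != NULL);
    info = malloc(sizeof(*info));
    if (!info)
        return -1;
    info->nkeys = nkeys;
    atomic_store_explicit(&sk->md5sig_info, info, memory_order_release);
    return 0;
}

/* Lockless reader: the acquire load pairs with the release store above. */
int lookup_nkeys(struct sock_model *sk)
{
    struct md5sig_info *info;

    assert(sk != NULL);
    info = atomic_load_explicit(&sk->md5sig_info, memory_order_acquire);
    return info ? info->nkeys : 0;
}
```

A reader that runs before anything is published sees NULL and reports zero keys; after publication it sees the fully initialised table.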

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: Remove alloc_etherdev error messages
Joe Perches [Sun, 29 Jan 2012 13:47:52 +0000 (13:47 +0000)]
drivers/net: Remove alloc_etherdev error messages

alloc_etherdev has a generic OOM/unable to alloc message.
Remove the duplicative messages after alloc_etherdev calls.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: Remove unnecessary k.alloc/v.alloc OOM messages
Joe Perches [Sun, 29 Jan 2012 12:56:23 +0000 (12:56 +0000)]
drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages

alloc failures use dump_stack so emitting an additional
out-of-memory message is an unnecessary duplication.

Remove the allocation failure messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: md5: use sock_kmalloc() to limit md5 keys
Eric Dumazet [Tue, 31 Jan 2012 10:56:48 +0000 (10:56 +0000)]
tcp: md5: use sock_kmalloc() to limit md5 keys

There is no limit on number of MD5 keys an application can attach to a
tcp socket.

This patch adds a per tcp socket limit based
on /proc/sys/net/core/optmem_max

With current default optmem_max values, this allows about 150 keys on
64bit arches, and 88 keys on 32bit arches.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: md5: rcu conversion
Eric Dumazet [Tue, 31 Jan 2012 05:18:33 +0000 (05:18 +0000)]
tcp: md5: rcu conversion

In order to be able to support proper RST messages for TCP MD5 flows, we
need to allow access to MD5 keys without locking listener socket.

This conversion is a nice cleanup, and shrinks size of timewait sockets
by 80 bytes.

IPv6 code reuses generic code found in IPv4 instead of duplicating it.

Control path uses GFP_KERNEL allocations instead of GFP_ATOMIC.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Shawn Lu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: md5: remove obsolete md5_add() method
Eric Dumazet [Tue, 31 Jan 2012 01:04:42 +0000 (01:04 +0000)]
tcp: md5: remove obsolete md5_add() method

We no longer use md5_add() method from struct tcp_sock_af_ops

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: spinlock redux.
Francois Romieu [Tue, 31 Jan 2012 10:20:34 +0000 (11:20 +0100)]
r8169: spinlock redux.

rtl8169_get_regs operates under RTNL and rtl task mutex whereas
rtl_set_rx_mode is either called under RTNL or rtl task mutex protection.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: avoid a useless work scheduling.
Francois Romieu [Tue, 31 Jan 2012 10:09:21 +0000 (11:09 +0100)]
r8169: avoid a useless work scheduling.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: move task enable boolean to bitfield.
Francois Romieu [Tue, 31 Jan 2012 09:56:44 +0000 (10:56 +0100)]
r8169: move task enable boolean to bitfield.

Simpler, more consistent, with negligible cost in non-critical paths.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: bh locking redux and task scheduling.
Francois Romieu [Tue, 31 Jan 2012 09:47:34 +0000 (10:47 +0100)]
r8169: bh locking redux and task scheduling.

- atomic bit operations are globally visible
- pending status is always cleared before execution
- scheduled works are either idempotent or only required to happen once
  after a series of originating events, say link events for instance
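
The three points above can be illustrated with a small user-space model. It is a sketch under assumptions: the task names, struct work_model, and both functions are hypothetical, and C11 atomics stand in for the kernel's atomic bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical task indices; the driver's real names are not given here. */
enum task_flag { TASK_RESET = 0, TASK_LINK_CHANGE = 1, TASK_MAX = 2 };

struct work_model {
    atomic_uint pending;          /* one bit per task */
    unsigned int runs[TASK_MAX];  /* how often each handler ran */
};

/* Event side: mark a task pending. Returns true only if the bit was
 * newly set, so a burst of events collapses into one pending run. */
bool schedule_task(struct work_model *w, enum task_flag t)
{
    unsigned int bit;

    assert(t < TASK_MAX);
    bit = 1u << t;
    return !(atomic_fetch_or(&w->pending, bit) & bit);
}

/* Worker side: clear each pending bit before running its handler, so an
 * event that arrives during execution re-arms the task. */
void run_pending(struct work_model *w)
{
    for (int t = 0; t < TASK_MAX; t++) {
        unsigned int bit = 1u << t;

        if (atomic_fetch_and(&w->pending, ~bit) & bit)
            w->runs[t]++;  /* stand-in for the real handler */
    }
}
```

Scheduling the same task several times before the worker runs results in a single handler invocation, which is safe only because the work is idempotent or needs to happen just once per series of events.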

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: fix early queue wake-up.
Francois Romieu [Mon, 30 Jan 2012 23:00:19 +0000 (00:00 +0100)]
r8169: fix early queue wake-up.

With infinite gratitude to Eric Dumazet for allowing me to identify
the error.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc...
David S. Miller [Mon, 30 Jan 2012 20:47:57 +0000 (15:47 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/bwh/sfc-next

12 years agonet: Deinline __nlmsg_put and genlmsg_put. -7k code on i386 defconfig.
Denys Vlasenko [Mon, 30 Jan 2012 20:22:06 +0000 (15:22 -0500)]
net: Deinline __nlmsg_put and genlmsg_put. -7k code on i386 defconfig.

   text    data     bss     dec     hex filename
8455963  532732 1810804 10799499 a4c98b vmlinux.o.before
8448899  532732 1810804 10792435 a4adf3 vmlinux.o
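
As a quick check of the "-7k" figure in the subject, the table can be recomputed with a few lines of C. The struct and function names below are hypothetical helpers, not part of any tool.

```c
#include <assert.h>

/* One row of size(1) output: text, data and bss in bytes. */
struct size_row {
    long text;
    long data;
    long bss;
};

/* The "dec" column is the sum of the three sections. */
long size_total(struct size_row r)
{
    assert(r.text >= 0 && r.data >= 0 && r.bss >= 0);
    return r.text + r.data + r.bss;
}

/* Bytes of text saved going from one build to the next. */
long text_saved(struct size_row before, struct size_row after)
{
    return before.text - after.text;
}
```

With the two rows above, the text section shrinks by 7064 bytes while data and bss are unchanged.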

This change also removes commented-out copy of __nlmsg_put
which was last touched in 2005 with "Enable once all users
have been converted" comment on top.

Changes in v2: rediffed against net-next.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: fix RFC5722 comment
Eric Dumazet [Mon, 30 Jan 2012 04:29:24 +0000 (04:29 +0000)]
ipv6: fix RFC5722 comment

RFC5722 Section 4 was amended by Errata 3089

Our implementation did the right thing anyway...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Allow ipv6 proxies and arp proxies be shown with iproute2
Tony Zelenoff [Thu, 26 Jan 2012 22:28:58 +0000 (22:28 +0000)]
net: Allow ipv6 proxies and arp proxies be shown with iproute2

Add ability to return neighbour proxies list to caller if
it sent full ndmsg structure and has NTF_PROXY flag set.

Before this patch (and before iproute2 patches):
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
$

After it and with applied iproute2 patches:
$ ip neigh add proxy 2001::1 dev eth0
$ ip -6 neigh show
2001::1 dev eth0  proxy
$

Compatibility with old versions of iproute2 is not broken:
the kernel checks the incoming structure size and works
properly if the old structure is received.

[v2]
* changed comments style.
* removed useless line with continue and curly bracket.
* changed incoming message size check from equal to more or
  equal.

CC: davem@davemloft.net
CC: kuznet@ms2.inr.ac.ru
CC: netdev@vger.kernel.org
CC: xemul@parallels.com
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: strip unused module code from sun3_82586.c
Paul Gortmaker [Fri, 27 Jan 2012 13:55:46 +0000 (13:55 +0000)]
drivers/net: strip unused module code from sun3_82586.c

This code is clearly unused, since it has a #error right
in it.  Given the vintage of sun3 hardware, it is probably
safe to assume that there is little interest in adding new
functionality to the driver now, so just delete the unused
block of code.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sam Creasey <sammy@sammy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodrivers/net: fix up stale paths from driver reorg
Paul Gortmaker [Fri, 27 Jan 2012 13:36:01 +0000 (13:36 +0000)]
drivers/net: fix up stale paths from driver reorg

The reorganization of the driver layout in drivers/net
left behind some stale paths in comments and in Kconfig
help text.  Bring them up to date.  No actual change to
any code takes place here.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux
David S. Miller [Mon, 30 Jan 2012 17:39:17 +0000 (12:39 -0500)]
Merge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux

12 years agosfc: Use a more sensible cast in efx_rx_buf_offset()
Ben Hutchings [Mon, 30 Jan 2012 16:55:05 +0000 (16:55 +0000)]
sfc: Use a more sensible cast in efx_rx_buf_offset()

This function returns the page offset of the buffer, which can be
calculated based on either its DMA address or its virtual address.  It
used to use the virtual address and we would cast that to unsigned
long, as anything smaller would result in a compiler warning.  Now
that it's using the DMA address we should use unsigned int, matching
the return type.  It is also unnecessary to use __force.
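
A minimal sketch of the calculation being described, assuming a power-of-two page size passed in by the caller. The typedef and function name are hypothetical and this is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_model_t;

/* Offset of a buffer within its page, computed from the DMA address.
 * Only the low bits matter, so narrowing to unsigned int (the return
 * type) is enough and no wider cast is needed. */
unsigned int buf_page_offset(dma_addr_model_t dma_addr, unsigned int page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (unsigned int)dma_addr & (page_size - 1);
}
```

For example, with 4096-byte pages an address ending in 0xabc yields an offset of 0xabc regardless of the upper address bits.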

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: MTD: Leave the DEBUG macro alone
Ben Hutchings [Mon, 30 Jan 2012 16:53:37 +0000 (16:53 +0000)]
sfc: MTD: Leave the DEBUG macro alone

<linux/mtd/mtd.h> no longer defines DEBUG so we do not need to
un-define it here.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc...
David S. Miller [Sun, 29 Jan 2012 21:11:26 +0000 (16:11 -0500)]
Merge branch 'for-davem' of git://git./linux/kernel/git/bwh/sfc-next

12 years agoipv6: Eliminate dst_get_neighbour_noref() usage in ip6_forward().
David S. Miller [Fri, 27 Jan 2012 23:32:19 +0000 (15:32 -0800)]
ipv6: Eliminate dst_get_neighbour_noref() usage in ip6_forward().

It's only used to get at neigh->primary_key, which in this context is
always going to be the same as rt->rt6i_gateway.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Remove neigh argument from ndisc_send_redirect()
David S. Miller [Fri, 27 Jan 2012 23:30:48 +0000 (15:30 -0800)]
ipv6: Remove neigh argument from ndisc_send_redirect()

Instead, compute it as-needed inside of that function using
dst_neigh_lookup().

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: fib: Convert fib6_age() to dst_neigh_lookup().
David S. Miller [Fri, 27 Jan 2012 23:14:01 +0000 (15:14 -0800)]
ipv6: fib: Convert fib6_age() to dst_neigh_lookup().

In this specific situation we know we are dealing with a gatewayed route
and therefore rt6i_gateway is not going to be in6addr_any even in future
interpretations.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: ndisc: Convert to dst_neigh_lookup()
David S. Miller [Fri, 27 Jan 2012 23:07:56 +0000 (15:07 -0800)]
ipv6: ndisc: Convert to dst_neigh_lookup()

Now all code paths grab a local reference to the neigh, so if neigh
is not NULL we unconditionally release it at the end.  The old logic
would only release if we didn't have a non-NULL 'rt'.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4: ip_gre: Convert to dst_neigh_lookup()
David S. Miller [Fri, 27 Jan 2012 23:01:08 +0000 (15:01 -0800)]
ipv4: ip_gre: Convert to dst_neigh_lookup()

The conversion is very similar to that made to ipv6's SIT code.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: remove work from irq handler.
Francois Romieu [Thu, 26 Jan 2012 13:18:23 +0000 (14:18 +0100)]
r8169: remove work from irq handler.

The irq handler was a mess.

See 7ab87ff4c770eed71e3777936299292739fcd0fe ("via-rhine: move work from
irq handler to softirq and beyond") for similar changes. One can notice:
- all non-napi tasks are explicitly scheduled through a single work queue.
- hiding software tx queue start behind the rtl_hw_start method is mildly
  natural. Move it in the caller where needed.
- as can be seen from the heavy use of bh disabling locks, the driver is
  not safe for irq context messages with netconsole. It is still quite
  usable for general messaging though. Tested ok with concurrent registers
  dump (ethtool -d) + background traffic + "echo t > /proc/sysrq-trigger".

Tested with old PCI chipset, PCIe 8168 and 810x:
- XID 0c900800 RTL8168evl/8111evl
- XID 18000000 RTL8168b/8111b
- XID 98000000 RTL8169sc/8110sc
- XID 083000c0 RTL8168d/8111d
- XID 081000c0 RTL8168d/8111d
- XID 00b00000 RTL8105e
- XID 04a00000 RTL8102e

As a side note, the comments in f11a377b3f4e897d11f0e8d1fc688667e2f19708
("r8169: avoid losing MSI interrupts") do not seem completely clear: if
I hack the driver further to stop acking the irq link event bit, MSI
interrupts keep being delivered (RTL8168b/8111b, XID 18000000).

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: missing barriers.
Francois Romieu [Fri, 27 Jan 2012 14:05:38 +0000 (15:05 +0100)]
r8169: missing barriers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: irq mask helpers.
Francois Romieu [Thu, 26 Jan 2012 11:59:08 +0000 (12:59 +0100)]
r8169: irq mask helpers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: factor out IntrMask writes.
Francois Romieu [Thu, 26 Jan 2012 11:50:01 +0000 (12:50 +0100)]
r8169: factor out IntrMask writes.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: stop delaying workqueue.
Francois Romieu [Thu, 26 Jan 2012 10:23:32 +0000 (11:23 +0100)]
r8169: stop delaying workqueue.

Though motivated by the move of the driver to a single work queue of
sequential events and removal of hard irq processing, it looks safe as
a standalone change.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: remove rtl8169_reinit_task.
Francois Romieu [Thu, 26 Jan 2012 08:59:50 +0000 (09:59 +0100)]
r8169: remove rtl8169_reinit_task.

I see no good reason to keep both rtl8169_reinit_task and rtl8169_reset_task:
- rtl8169_reinit_task adds a software failure point which does not relate
  to any hardware state
- they handle hardware the same. Remember that rtl8169_reinit_task was
  introduced in the 8169 only era to handle PCI errors way before the 8168
  asked for pll and firmware ops and compare :

      rtl8169_reinit_task     |    rtl8169_reset_task
  ----------------------------+--------------------------
  rtl8169_wait_for_quiescence | rtl8169_hw_reset
  rtl8169_update_counters     | rtl8169_wait_for_quiescence
  rtl8169_hw_reset            | rtl_hw_start
  rtl8169_rx_missed           | rtl8169_check_link_status
  rtl_pll_power_down          |
  rtl_request_firmware        |
  rtl8169_init_phy            |
  rtl_pll_power_up            |
  rtl_hw_start                |
  rtl8169_check_link_status   |

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
12 years agor8169: remove hardcoded PCIe registers accesses.
Francois Romieu [Thu, 22 Dec 2011 17:59:37 +0000 (18:59 +0100)]
r8169: remove hardcoded PCIe registers accesses.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
12 years agoe1000e: update copyright year
Bruce Allan [Sun, 1 Jan 2012 16:00:03 +0000 (16:00 +0000)]
e1000e: update copyright year

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: split lib.c into three more-appropriate files
Bruce Allan [Wed, 21 Dec 2011 09:47:10 +0000 (09:47 +0000)]
e1000e: split lib.c into three more-appropriate files

The generic lib.c file contains code relative to the various MACs, NVM and
Manageability supported by the driver.  This patch splits the file into
three which are specific to those areas similar to how the PHY-specific
code is in phy.c and code specific to the 80003es2lan, 8257x, and ichX
MAC families are in their own files.  The generic code that is applicable
to all MAC/PHY parts supported by the driver remains in netdev.c, param.c
and ethtool.c files.  No change in functionality, just moving code
around for ease of maintenance, with some whitespace and other checkpatch
cleanups.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: call er16flash() instead of __er16flash()
Bruce Allan [Sat, 17 Dec 2011 08:32:57 +0000 (08:32 +0000)]
e1000e: call er16flash() instead of __er16flash()

__er16flash() is not meant to be called directly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: increase version number
Bruce Allan [Fri, 16 Dec 2011 00:47:04 +0000 (00:47 +0000)]
e1000e: increase version number

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: convert final strncpy() to strlcpy()
Bruce Allan [Fri, 16 Dec 2011 00:46:59 +0000 (00:46 +0000)]
e1000e: convert final strncpy() to strlcpy()

Convert the last instances of strncpy() to the preferred strlcpy().
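
The reason strlcpy() is preferred can be shown with a small user-space equivalent. The function below is a hypothetical local model written for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local model of strlcpy(): copies at most size - 1 bytes,
 * always NUL-terminates when size is non-zero, and returns strlen(src)
 * so the caller can detect truncation. */
size_t model_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size != 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

By contrast, strncpy() leaves the destination unterminated when the source does not fit, which is the hazard the conversion removes.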

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: concatenate long debug strings which span multiple lines
Bruce Allan [Fri, 16 Dec 2011 00:46:54 +0000 (00:46 +0000)]
e1000e: concatenate long debug strings which span multiple lines

To ease searching for debug message strings, concatenate strings that span
multiple lines even if the resulting line exceeds 80 columns; these will
not cause checkpatch warnings.

Also, add '\n' and remove unnecessary '\r' from a few debug strings.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: conditionally restart autoneg on 82577/8/9 when setting LPLU state
Bruce Allan [Fri, 16 Dec 2011 00:46:49 +0000 (00:46 +0000)]
e1000e: conditionally restart autoneg on 82577/8/9 when setting LPLU state

When setting the Low Power Link Up (LPLU, a.k.a. reverse auto-negotiation)
on 82577/82578/82579, do not restart auto-negotiation if reset of the Phy is
blocked by the Manageability Engine.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: increase Rx PBA to prevent dropping received packets on 82566/82567
Bruce Allan [Fri, 16 Dec 2011 00:46:43 +0000 (00:46 +0000)]
e1000e: increase Rx PBA to prevent dropping received packets on 82566/82567

During bi-directional stress on some 82566/82567 devices, some received
packets were dropped.  Increasing the Receive Packet Buffer Allocation
resolves this.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: ICHx/PCHx LOMs should use LPLU setting in NVM when going to Sx
Bruce Allan [Fri, 16 Dec 2011 00:46:38 +0000 (00:46 +0000)]
e1000e: ICHx/PCHx LOMs should use LPLU setting in NVM when going to Sx

When going to Sx with an ICHx/PCH device, the default Low Power Link Up
(LPLU, a.k.a. reverse auto-negotiation) behavior should be whatever is set
in the NVM.  However, the function e1000_suspend_workarounds_ich8lan()
called when going to Sx always enabled LPLU in all power states.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: update workaround for 82579 intermittently disabled during S0->Sx
Bruce Allan [Fri, 16 Dec 2011 00:46:33 +0000 (00:46 +0000)]
e1000e: update workaround for 82579 intermittently disabled during S0->Sx

The workaround which toggles the LANPHYPC (LAN PHY Power Control) value bit
to force the MAC-Phy interconnect into PCIe mode from SMBus mode during
driver load and resume should always be done except if PHY resets are
blocked by the Manageability Engine (ME).  Previously, the toggle was done
only if PHY resets are blocked and the ME was disabled.

The rest of the patch is just indentation changes as a consequence of the
updated workaround.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: disable Early Receive DMA on ICH LOMs
Bruce Allan [Fri, 16 Dec 2011 00:46:27 +0000 (00:46 +0000)]
e1000e: disable Early Receive DMA on ICH LOMs

Internal stress testing with jumbo frames shows the reliability of ICH9 and
ICH10D devices is improved in certain corner cases by disabling the Early
Receive feature. To reduce the performance impact caused by disabling this
feature, the packet buffer sizes and relevant flow control settings are
modified accordingly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agosfc: Replace efx_rx_buffer::is_page and other booleans with a flags field
Ben Hutchings [Fri, 26 Aug 2011 17:05:11 +0000 (18:05 +0100)]
sfc: Replace efx_rx_buffer::is_page and other booleans with a flags field

Replace checksummed and discard booleans from efx_handle_rx_event()
with a bitmask, added to the flags field.
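
A sketch of the pattern, assuming three flags. The flag names, struct rx_buffer_model, and the helpers are hypothetical and chosen only to mirror the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits replacing the separate booleans. */
#define RX_BUF_PAGE     0x0001u  /* buffer is page-based */
#define RX_PKT_CSUMMED  0x0002u  /* checksum already verified */
#define RX_PKT_DISCARD  0x0004u  /* packet must be dropped */
#define RX_FLAGS_ALL    (RX_BUF_PAGE | RX_PKT_CSUMMED | RX_PKT_DISCARD)

/* Hypothetical stand-in for the RX buffer descriptor. */
struct rx_buffer_model {
    uint16_t flags;
};

/* Record per-packet state in the buffer's flags field. */
void rx_set_flags(struct rx_buffer_model *buf, uint16_t flags)
{
    assert((flags & ~RX_FLAGS_ALL) == 0);
    buf->flags |= flags;
}

/* A packet is delivered unless it has been marked for discard. */
bool rx_should_deliver(const struct rx_buffer_model *buf)
{
    return !(buf->flags & RX_PKT_DISCARD);
}
```

Carrying the state in one field means later stages of the RX path need only the buffer, not extra boolean arguments.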

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Move the end of the non-GRO RX path into its own function
Ben Hutchings [Mon, 23 Jan 2012 22:41:30 +0000 (22:41 +0000)]
sfc: Move the end of the non-GRO RX path into its own function

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Make all MAC statistics consistently 64 bits wide
Ben Hutchings [Wed, 12 Oct 2011 16:20:25 +0000 (17:20 +0100)]
sfc: Make all MAC statistics consistently 64 bits wide

Currently we use type u64 for byte counts, which can very quickly
exceed 2^32, and unsigned long for packet counts, which do not.  But
it can still take only 20-something minutes to send or receive 2^32
packets, and not all tools properly handle overflow even if they
sample more often than this.
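
The wrap time is simple arithmetic. The helper below is a hypothetical illustration; the rate of 3 million packets per second used in the note after it is an assumption picked to land in the "20-something minutes" range, not a figure from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Whole seconds until a 32-bit packet counter wraps at a steady rate. */
uint64_t seconds_until_u32_wrap(uint64_t packets_per_second)
{
    assert(packets_per_second > 0);
    return (UINT64_C(1) << 32) / packets_per_second;
}
```

At an assumed 3,000,000 packets per second the counter wraps after 1431 seconds, a little under 24 minutes.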

The MAC statistics are all updated synchronously, so it costs very
little to make them all 64-bit regardless of native word size.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Rename implementation of ndo_set_rx_mode
Ben Hutchings [Mon, 9 Jan 2012 19:54:44 +0000 (19:54 +0000)]
sfc: Rename implementation of ndo_set_rx_mode

Rename efx_set_multicast_list() to efx_set_rx_mode(), in line
with the operation name net_device_ops::ndo_set_rx_mode.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove redundant 'rc' variable, always set to 0
Ben Hutchings [Mon, 9 Jan 2012 19:54:16 +0000 (19:54 +0000)]
sfc: Remove redundant 'rc' variable, always set to 0

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Minor formatting fixes
Ben Hutchings [Mon, 9 Jan 2012 19:53:41 +0000 (19:53 +0000)]
sfc: Minor formatting fixes

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Use existing local variables instead of repeated indirect lookups
Ben Hutchings [Mon, 9 Jan 2012 19:51:22 +0000 (19:51 +0000)]
sfc: Use existing local variables instead of repeated indirect lookups

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove remnants of on-load self-test
Ben Hutchings [Mon, 9 Jan 2012 19:47:08 +0000 (19:47 +0000)]
sfc: Remove remnants of on-load self-test

The out-of-tree version of the sfc driver used to run a self-test on
each device before registering it.  Although this was never included
in-tree, some functions have checks for this special case which is not
really possible.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove obsolete function efx_dev_name()
Ben Hutchings [Mon, 9 Jan 2012 19:41:48 +0000 (19:41 +0000)]
sfc: Remove obsolete function efx_dev_name()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Update the description of SFC_MTD
Ben Hutchings [Fri, 6 Jan 2012 22:47:17 +0000 (22:47 +0000)]
sfc: Update the description of SFC_MTD

SFC4000 boards also have an EEPROM exposed as MTD.
The boot configuration is accessed through MTD.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Add hwmon driver for boards using SFC9000-family controllers
Ben Hutchings [Fri, 6 Jan 2012 20:25:39 +0000 (20:25 +0000)]
sfc: Add hwmon driver for boards using SFC9000-family controllers

The SFC9000-family controllers have firmware to manage all board
peripherals including temperature, heat sink continuity and voltage
sensors.  The firmware reports sensor alarms, which we log, and
will shut down the board if necessary.

Some users may want to monitor their boards more closely, so add an
hwmon driver that exposes all sensors reported by the firmware.  Move
efx_mcdi_sensor_event() into the new file so it can share the array of
sensor labels with the hwmon driver.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Clean up test interrupt handling
Ben Hutchings [Thu, 5 Jan 2012 20:14:10 +0000 (20:14 +0000)]
sfc: Clean up test interrupt handling

Interrupts are normally generated by the event queues, moderated by
timers.  However, they may also be triggered by detection of a 'fatal'
error condition (e.g. memory parity error) or by the host writing to
certain CSR fields as part of a self-test.

The IRQ level/index used for these on Falcon rev B0 and Siena is set
by the KER_INT_LEVE_SEL field and cached by the driver in
efx_nic::fatal_irq_level.  Since this value is also relevant to
self-tests rename the field to just 'irq_level'.

Avoid unnecessary cache traffic by using a per-channel 'last_irq_cpu'
field and only writing to the per-controller field when the interrupt
matches efx_nic::irq_level.  Remove the volatile qualifier and use
ACCESS_ONCE in the places we read these fields.
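
For readers unfamiliar with ACCESS_ONCE, a user-space sketch follows. The macro mirrors the classic kernel definition and relies on the GNU __typeof__ extension; struct nic_model and read_last_irq_cpu() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Force exactly one access to x by going through a volatile-qualified
 * lvalue, without declaring the field itself volatile. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in holding the field named in the text. */
struct nic_model {
    int last_irq_cpu;
};

/* Read the field once; the compiler may not re-read or cache it. */
int read_last_irq_cpu(struct nic_model *nic)
{
    assert(nic != NULL);
    return ACCESS_ONCE(nic->last_irq_cpu);
}
```

This keeps ordinary accesses to the field cheap while making the few racy reads explicit at the call site.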

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agoPartly revert "sfc: Handle serious errors in exactly one interrupt handler"
Ben Hutchings [Fri, 6 Jan 2012 01:08:24 +0000 (01:08 +0000)]
Partly revert "sfc: Handle serious errors in exactly one interrupt handler"

This reverts commit 6369545945b90daa1a73fca174da9194c398417c in
drivers/net/ethernet/sfc/falcon.c.

Unlike the INT_ISR0 register on later controller revisions, the
NET_IVEC_INT_Q bits written to memory are only ever set for
interrupting event queues, not for any other interrupt sources.

By definition there can only be one legacy interrupt handler per
function, so there is no need to worry about detecting a fatal
interrupt more than once.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove dependence on NAPI polling in efx_test_eventq_irq()
Ben Hutchings [Fri, 4 Nov 2011 23:06:04 +0000 (23:06 +0000)]
sfc: Remove dependence on NAPI polling in efx_test_eventq_irq()

We cannot safely assume that the NAPI handler will complete within the
20 ms that we allow for the event self-test.  The handler may be
deferred for longer than this, particularly on realtime kernels.

Instead, check whether either an event has been handled or (as in the
old failure path) whether an interrupt has been received and an event
has been delivered but not yet handled.  Use napi_disable() to
synchronize with the NAPI handler before checking, since it will
clear events before updating eventq_read_ptr.

Remove the test result chan.N.eventq.poll, since it is not an error
if the NAPI handler does not run during the test.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Correct interrupt timer quantum for Siena (normal and turbo mode)
Ben Hutchings [Thu, 8 Dec 2011 19:51:47 +0000 (19:51 +0000)]
sfc: Correct interrupt timer quantum for Siena (normal and turbo mode)

We currently assume that the timer quantum for Siena is 5 us, the same
as for Falcon.  This is not correct; timer ticks are generated on a
rota which takes a minimum of 768 cycles (each event delivery or other
timer change will delay it by 3 cycles).  The timer quantum should be
6.144 or 3.072 us depending on whether turbo mode is active.
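
The two quanta follow from the 768-cycle rota. The clock rates of 125 MHz and 250 MHz used in the note after the sketch are not stated in the text; they are inferred from the quoted figures and should be treated as assumptions. The function name is hypothetical.

```c
#include <assert.h>

/* Timer quantum in nanoseconds for a rota of `cycles` clock cycles at
 * `clock_mhz` MHz: cycles / (clock_mhz * 1e6) seconds, scaled to ns. */
unsigned int timer_quantum_ns(unsigned int cycles, unsigned int clock_mhz)
{
    assert(clock_mhz > 0);
    return cycles * 1000u / clock_mhz;
}
```

Under those assumed clock rates, 768 cycles give 6144 ns in normal mode and 3072 ns in turbo mode, matching the 6.144 and 3.072 us quoted above.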

Replace EFX_IRQ_MOD_RESOLUTION with a timer_quantum_ns field in struct
efx_nic, initialised by the efx_nic_type::probe function.

While we're at it, replace EFX_IRQ_MOD_MAX with a timer_period_max
field in struct efx_nic_type.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Support extraction of CAPABILITIES from GET_BOARD_CFG response.
Matthew Slattery [Wed, 14 Jul 2010 14:36:19 +0000 (15:36 +0100)]
sfc: Support extraction of CAPABILITIES from GET_BOARD_CFG response.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG
Ben Hutchings [Fri, 4 Nov 2011 22:29:14 +0000 (22:29 +0000)]
sfc: Consistently test DEBUG macro, not EFX_ENABLE_DEBUG

The netif_dbg() macro is defined in <linux/netdevice.h>.  If the DEBUG
macro is defined, it logs a message at 'debug' level, otherwise it
does nothing.

In net_driver.h we define DEBUG if EFX_ENABLE_DEBUG is defined, but
this is too late for those source files that already got a
definition of netif_dbg() by including <linux/netdevice.h>.

Get rid of EFX_ENABLE_DEBUG, and only define and test DEBUG.

In mtd.c, we do not use DEBUG as a condition flag but are forced to
use the DEBUG macro-function from <linux/mtd/mtd.h>.  Undefine DEBUG
before including it.
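
The ordering problem can be reproduced outside the kernel.  In this
sketch demo_dbg() stands in for netif_dbg() and the late #define stands in
for net_driver.h; both are illustrative assumptions, not driver code:

```c
#include <assert.h>

/* Start from a known state so the demonstration is deterministic. */
#undef DEBUG

/* Stand-in for the header: which body the macro gets is decided here,
 * at the point where the #ifdef is evaluated. */
#ifdef DEBUG
#define demo_dbg(counter) (++(counter))
#else
#define demo_dbg(counter) ((void)0)
#endif

/* Stand-in for a later header that defines DEBUG: too late for demo_dbg(). */
#define DEBUG 1

/* Returns the number of messages "logged" by one demo_dbg() call. */
static int demo_late_define_logs(void)
{
    int messages = 0;

    demo_dbg(messages);
    assert(messages == 0);   /* the late #define did not enable logging */
    return messages;
}
```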

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove efx_nic_type::push_multicast_hash operation
Ben Hutchings [Tue, 13 Sep 2011 18:47:48 +0000 (19:47 +0100)]
sfc: Remove efx_nic_type::push_multicast_hash operation

Both implementations of efx_nic_type::reconfigure_mac operation
push the multicast hash filter to the hardware.  It is therefore
redundant to call efx_nic_type::push_multicast_hash as well.

efx_mcdi_mac_reconfigure() also uses this operation, but the
implementation for Siena just uses MCDI anyway.  Merge that into
efx_mcdi_mac_reconfigure().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()
Ben Hutchings [Thu, 8 Sep 2011 01:09:42 +0000 (02:09 +0100)]
sfc: Merge efx_mcdi_mac_check_fault() and efx_mcdi_get_mac_faults()

The latter is only called by the former, which is a very short
wrapper.  Further, gcc 4.5 may currently wrongly warn that the
'faults' variable may be used uninitialised.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Merge efx_mac_operations into efx_nic_type
Ben Hutchings [Fri, 2 Sep 2011 23:15:00 +0000 (00:15 +0100)]
sfc: Merge efx_mac_operations into efx_nic_type

No NICs need to switch efx_mac_operations at run-time, and the MAC
operations are fairly closely bound to NIC types.

Move efx_mac_operations::reconfigure to efx_nic_type::reconfigure_mac
and efx_mac_operations::check_fault to efx_nic_type::check_mac_fault.
Change callers to call through efx->type or directly if the NIC type
is known.

Remove efx_mac_operations::update_stats.  The implementations for
Falcon used to fetch MAC statistics synchronously and this was used by
efx_register_netdev() to clear statistics after running self-tests.
However, it now only converts statistics that have already been
fetched (and that only for Falcon), and the call from
efx_register_netdev() has no effect.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats
Ben Hutchings [Fri, 2 Sep 2011 22:23:00 +0000 (23:23 +0100)]
sfc: Hold efx_nic::stats_lock while reading efx_nic::mac_stats

efx_nic::stats_lock is used to serialise stats updates, but each
reader was dropping it before it finished reading efx_nic::mac_stats.

If there were concurrent stats reads using procfs, or one using procfs
and one using ethtool, an update could race with a read.  On a 32-bit
system, the reader could see word-tearing of 64-bit stats (32 bits of
the old value and 32 bits of the new).
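
The word-tearing case can be modelled without threads.  The helper below
is a hypothetical illustration: it returns what a reader would see if it
loaded the low word after an update and the high word before it.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
              "a 64-bit stat is modelled as two 32-bit words");

/* Combine the new low word with the stale high word. */
static uint64_t demo_torn_read(uint64_t old_val, uint64_t new_val)
{
    uint32_t lo = (uint32_t)new_val;           /* already updated */
    uint32_t hi = (uint32_t)(old_val >> 32);   /* not yet updated */

    return ((uint64_t)hi << 32) | lo;
}
```

A counter stepping from 0xFFFFFFFF to 0x100000000 would be read as 0 in
this model, which is neither the old nor the new value.  Holding the lock
for the whole read rules this out.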

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Use new names for MC shared memory layout constants
Ben Hutchings [Tue, 20 Dec 2011 23:52:02 +0000 (23:52 +0000)]
sfc: Use new names for MC shared memory layout constants

These are defined alongside the firmware protocol in mcdi_pcol.h.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Make handling of MC reboot more reliable
Ben Hutchings [Tue, 20 Dec 2011 23:39:31 +0000 (23:39 +0000)]
sfc: Make handling of MC reboot more reliable

When the MC reboots, either as part of a firmware upgrade or due to a
bug, it attempts to complete (with an error) any requests that were
outstanding before the reboot.  Since there is an inherent race
condition in checking this, it will also write to a status word in
shared memory.

If we look at each of these separately, we may detect each reboot
twice, resulting in a spurious command failure after a firmware
upgrade or frustrating recovery from a firmware bug.  Instead, if a
request completion indicates a reboot, we must poll and clear the
status word.

This bug was previously masked by use of an incorrect address for the
status word.  Fix that, using the definition now included in
mcdi_pcol.h.
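
A small model of the double-detection problem and its fix.  The structure
and function names are hypothetical and only mirror the logic described
above: a completion that reports a reboot also consumes the status word,
so a later poll does not count the same reboot again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_mc {
    uint32_t reboot_status;     /* stand-in for the shared-memory status word */
    unsigned reboots_handled;   /* how many reboots the driver reacted to */
};

/* Read and clear the status word; true if it flagged a reboot. */
static bool demo_poll_reboot(struct demo_mc *mc)
{
    bool flagged;

    assert(mc);
    flagged = mc->reboot_status != 0;
    mc->reboot_status = 0;
    return flagged;
}

/* A request completed; completion_rebooted says its error indicates a reboot. */
static void demo_request_done(struct demo_mc *mc, bool completion_rebooted)
{
    if (completion_rebooted) {
        (void)demo_poll_reboot(mc);   /* swallow the duplicate notification */
        mc->reboots_handled++;
    }
}

/* Periodic check that only looks at the status word. */
static void demo_periodic_check(struct demo_mc *mc)
{
    if (demo_poll_reboot(mc))
        mc->reboots_handled++;
}
```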

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agosfc: Remove fallback for invalid permanent MAC address
Ben Hutchings [Tue, 20 Dec 2011 01:22:51 +0000 (01:22 +0000)]
sfc: Remove fallback for invalid permanent MAC address

By the time we look at the MAC address in efx_probe_port(), either the
driver or the firmware has already validated the board configuration.
The possibility of having an invalid MAC address just isn't worth
considering.  It certainly isn't worth having a compile-time option
for this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agoipv6: Use ipv6_addr_any()
David S. Miller [Thu, 26 Jan 2012 21:29:16 +0000 (16:29 -0500)]
ipv6: Use ipv6_addr_any()

Suggested by YOSHIFUJI Hideaki.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoe1000e: Need to include vmalloc.h
David S. Miller [Thu, 26 Jan 2012 21:25:55 +0000 (16:25 -0500)]
e1000e: Need to include vmalloc.h

Otherwise (on sparc64):

drivers/net/ethernet/intel/e1000e/ethtool.c:657:3: error: implicit declaration of function 'vmalloc' [-Werror=implicit-function-declaration]

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: sit: Convert to dst_neigh_lookup()
David S. Miller [Thu, 26 Jan 2012 20:23:21 +0000 (15:23 -0500)]
ipv6: sit: Convert to dst_neigh_lookup()

The only semantic difference is that we now hold a reference to the
neighbour and thus have to release it.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv4/ipv6: Prepare for new route gateway semantics.
David S. Miller [Thu, 26 Jan 2012 20:22:32 +0000 (15:22 -0500)]
ipv4/ipv6: Prepare for new route gateway semantics.

In the future the ipv4/ipv6 route gateway will take on two types
of values:

1) INADDR_ANY/IN6ADDR_ANY, for local network routes, and in this case
   the neighbour must be obtained using the destination address in
   ipv4/ipv6 header as the lookup key.

2) Everything else, the actual nexthop route address.

So if the gateway is not inaddr-any we use it, otherwise we must use
the packet's destination address.
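
The selection rule, sketched for IPv4 with plain 32-bit addresses.  The
helper name and the DEMO_INADDR_ANY constant are illustrative; the kernel
code works on its own route and header structures.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INADDR_ANY 0u   /* the all-zeroes IPv4 address */

/* Pick the neighbour lookup key: the gateway if one is set, otherwise
 * the destination address from the packet header. */
static uint32_t demo_neigh_key(uint32_t gateway, uint32_t daddr)
{
    return gateway != DEMO_INADDR_ANY ? gateway : daddr;
}
```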

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: add LINUX_MIB_TCPRETRANSFAIL counter
Eric Dumazet [Wed, 25 Jan 2012 04:44:20 +0000 (04:44 +0000)]
tcp: add LINUX_MIB_TCPRETRANSFAIL counter

It might be useful to get a counter of failed tcp_retransmit_skb()
calls.

Reported-by: Satoru Moriya <satoru.moriya@hds.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobe2net: allocate more headroom in incoming skbs
Eric Dumazet [Wed, 25 Jan 2012 03:56:30 +0000 (03:56 +0000)]
be2net: allocate more headroom in incoming skbs

Allocation of 64 bytes in skb headroom is not enough if we have to pull
ethernet + ipv6 + tcp headers, and/or extra tunneling header.

It's currently not noticed because netdev_alloc_skb_ip_align(64) gives us
more room, thanks to power-of-two kmalloc() roundups.

Make sure we ask for 128 bytes so that side effects of upcoming patches
from Ian Campbell don't decrease benet rx performance, because of extra
skb head reallocations.
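
The arithmetic behind the 64-byte limit, using the standard header sizes
(Ethernet 14 bytes, fixed IPv6 header 40 bytes, TCP 20 to 60 bytes).  The
helper is a hypothetical illustration and ignores VLAN tags, IPv6
extension headers and tunnel headers, which only make matters worse.

```c
#include <assert.h>

enum {
    DEMO_ETH_HLEN     = 14,
    DEMO_IPV6_HLEN    = 40,
    DEMO_TCP_HLEN_MIN = 20,
    DEMO_TCP_HLEN_MAX = 60
};

static_assert(DEMO_ETH_HLEN + DEMO_IPV6_HLEN + DEMO_TCP_HLEN_MIN > 64,
              "64 bytes cannot hold even a minimal header stack");

/* Non-zero if ethernet + ipv6 + tcp headers fit in room_len bytes. */
static int demo_headers_fit(unsigned room_len, unsigned tcp_hlen)
{
    return DEMO_ETH_HLEN + DEMO_IPV6_HLEN + tcp_hlen <= room_len;
}
```

Even with a full 60-byte TCP header the total is 114 bytes, which fits in
128.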

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Vasundhara Volam <vasundhara.volam@emulex.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Update version to 1.72.0 and copyrights
Ariel Elior [Thu, 26 Jan 2012 06:01:54 +0000 (06:01 +0000)]
bnx2x: Update version to 1.72.0 and copyrights

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Recoverable and unrecoverable error statistics
Ariel Elior [Thu, 26 Jan 2012 06:01:53 +0000 (06:01 +0000)]
bnx2x: Recoverable and unrecoverable error statistics

Add statistics for tracking parity errors from which we successfully
recovered and those which were deemed unrecoverable.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Recovery flow bug fixes
Ariel Elior [Thu, 26 Jan 2012 06:01:52 +0000 (06:01 +0000)]
bnx2x: Recovery flow bug fixes

1. Sample mcp pulse and mcp sequence in nic load instead of in init_one
as they may change by the time we want to use them.

2. Allow cnic to access the device during nic load (by adding a new "LOADING"
state to the recovery flow). This prevents the unnecessary cnic timeout that
occurred when cnic attempted to access the device while the nic was loading
but was blocked because of the Recovery state.

3. Issue 'fake' driver load command to mcp when last driver unloads to prevent
mcp from taking ownership. When recovery is complete unload fake driver to
allow mcp to initialize the hardware before first driver loads.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Track active PFs with bitmap
Ariel Elior [Thu, 26 Jan 2012 06:01:51 +0000 (06:01 +0000)]
bnx2x: Track active PFs with bitmap

The recovery register (to which a hardware lock was added in the previous
patch) is used, amongst other things, to track the active PFs. The old
implementation, which used a per-path counter, is not viable in a virtualized
environment, where a PF may increment the counter and then have the kernel
crash around it, preventing the counter from ever reaching zero.
In the new implementation this scenario results in the PF timing out against
the mcp, which then clears the PF's bit in the bitmask, allowing the
recovery process to proceed.
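
Why a bitmap helps where a counter does not, as a sketch with
hypothetical helpers: a third party (here, the mcp) can clear the bit of
one specific PF, and clearing is idempotent, whereas a decrement on
behalf of a dead PF risks a double count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_PFS 8u   /* illustrative limit, not a hardware value */

/* Mark a PF as active. */
static uint32_t demo_pf_set(uint32_t map, unsigned pf)
{
    assert(pf < DEMO_MAX_PFS);
    return map | (1u << pf);
}

/* Mark a PF as gone; safe to repeat for the same PF. */
static uint32_t demo_pf_clear(uint32_t map, unsigned pf)
{
    assert(pf < DEMO_MAX_PFS);
    return map & ~(1u << pf);
}

/* Recovery may proceed once no PF is marked active. */
static bool demo_recovery_may_proceed(uint32_t map)
{
    return map == 0;
}
```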

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Lock PF-common resources
Ariel Elior [Thu, 26 Jan 2012 06:01:50 +0000 (06:01 +0000)]
bnx2x: Lock PF-common resources

Use hardware locks to protect resources common to several Physical Functions. In
a virtualized environment the RTNL lock only protects a PF's driver against
the other PFs sharing its VM with regard to device resources. Other PFs may reside
in other VMs under other OSs, and are not subject to the lock. Such resources
which were previously protected implicitly by the RTNL lock must now be
protected explicitly with dedicated HW locks.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Loaded Firmware Version Validation
Ariel Elior [Thu, 26 Jan 2012 06:01:49 +0000 (06:01 +0000)]
bnx2x: Loaded Firmware Version Validation

In a virtualized environment it is possible for a loading driver to discover
that firmware is already loaded on the device, and that this FW does not match
its own. This can happen, for example, if different Physical Functions are
assigned to different VMs in which different driver versions are loaded. The
code in this patch ensures that only drivers with matching FW are loaded over
the device, and that in the case described above, where the firmware version
doesn't match, the driver load is aborted.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Function Level Reset Final Cleanup
Ariel Elior [Thu, 26 Jan 2012 06:01:48 +0000 (06:01 +0000)]
bnx2x: Function Level Reset Final Cleanup

1. Fix bug where return value is ignored
2. Improve printouts
3. Fix typos

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Obtain Bus Device Function from register
Ariel Elior [Thu, 26 Jan 2012 06:01:47 +0000 (06:01 +0000)]
bnx2x: Obtain Bus Device Function from register

The BDF was obtained from the kernel, but in a virtualized environment
(e.g. physical device assignment in KVM) the function number may
not be the real one, so the info must be obtained from the device.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Removing indirect register access
Ariel Elior [Thu, 26 Jan 2012 06:01:46 +0000 (06:01 +0000)]
bnx2x: Removing indirect register access

In virtualized environments indirect access to the device may not be supported
(depending on the Hypervisor type). Indirect device access was used since in
some hardware contexts (i.e. certain chipsets and BIOSes) every access the driver
makes across the PCI bus is followed by a BIOS-initiated Zero Length Read to the
same address. When accessing widebus registers this zero length read corrupts
the serialization of the read/write sequence, resulting in errors. To avoid
this problem widebus registers are always accessed via the DMAE or the indirect
interface. However, the 57712x and 578xx devices intercept the zero length read
and so using the indirect interface with these devices is not necessary. Since
PDA is only supported for 57712x and 578xx the indirect access to device was
restricted to 57710 and 57711x.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: Support Queue Per Cos in 5771xx devices
Ariel Elior [Thu, 26 Jan 2012 06:01:45 +0000 (06:01 +0000)]
bnx2x: Support Queue Per Cos in 5771xx devices

Enable the use of up to three hardware queues for transmission. The queues
are always dequeued round robin (i.e. strict priority, PFC and ETS are not
supported). This does allow the allocation of a separate HW queue for low
volume, high priority traffic which will be serviced more promptly.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoe1000e: 82574/82583 Tx hang workaround
Bruce Allan [Fri, 16 Dec 2011 00:46:22 +0000 (00:46 +0000)]
e1000e: 82574/82583 Tx hang workaround

On 82574/82583, there is a hardware bug which might cause a Tx hang when
the internal buffer is full.  Setting this bit enables a hardware fix to
work around the issue.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: use hardware default values for Transmit Control register
Bruce Allan [Fri, 16 Dec 2011 00:46:17 +0000 (00:46 +0000)]
e1000e: use hardware default values for Transmit Control register

This code snippet is simply writing default values to the register which is
unnecessary since the values are programmed into the register by default.
There is a special case for 80003es2lan needing the Retransmit on Late
Collision bit set but that is also done in e1000_init_hw_80003es2lan().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: use default settings for Tx Inter Packet Gap timer
Bruce Allan [Fri, 16 Dec 2011 00:46:12 +0000 (00:46 +0000)]
e1000e: use default settings for Tx Inter Packet Gap timer

Use the default hardware values for TIPG except for 80003es2lan(*).  The
code that is removed in this patch is either unnecessarily writing the TIPG
register with the hardware default values for some devices (82571/2/3/4) or
writing the wrong value for others (ICH/PCH LOMs).  The only change in
functionality is setting the correct default TIPG for the latter devices.

(*) The correct value for 80003es2lan is already set properly in
e1000_init_hw_80003es2lan() and e1000_cfg_kmrn_{10_100|1000}_80003es2lan(),
and the unused flag FLAG_TIPG_MEDIUM_FOR_80003ESLAN is removed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: 82579: workaround for link drop issue
Bruce Allan [Fri, 16 Dec 2011 00:46:06 +0000 (00:46 +0000)]
e1000e: 82579: workaround for link drop issue

When connected to certain switches, the 82579 PHY might drop link
unexpectedly.  Work around the issue by setting the Mean Square Error
higher than the hardware default.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: always set transmit descriptor control registers the same
Bruce Allan [Fri, 16 Dec 2011 00:46:01 +0000 (00:46 +0000)]
e1000e: always set transmit descriptor control registers the same

The hardware erratum workaround where the TXDCTL register must be the same
setting for both queues should always be done.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: default IntMode based on kernel config & available hardware support
Bruce Allan [Fri, 16 Dec 2011 00:45:56 +0000 (00:45 +0000)]
e1000e: default IntMode based on kernel config & available hardware support

Based on a patch from Prabhakar Kushwaha <prabhakar@freescale.com>, set
appropriate default interrupt mode dependent on whether CONFIG_PCI_MSI
is enabled in the kernel configuration and if the hardware supports
MSI-X.  Set the module parameter log message accordingly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Cc: Jin Qing <b24347@freescale.com>
Cc: Prabhakar Kushwaha <prabhakar@freescale.com>
Cc: Jin Qing <b24347@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: re-factor ethtool get/set ring parameter
Bruce Allan [Fri, 16 Dec 2011 00:45:51 +0000 (00:45 +0000)]
e1000e: re-factor ethtool get/set ring parameter

Make it more like how igb does it, with some additional error checking.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: pass pointer to ring struct instead of adapter struct
Bruce Allan [Fri, 16 Dec 2011 00:45:45 +0000 (00:45 +0000)]
e1000e: pass pointer to ring struct instead of adapter struct

For ring-specific functions, pass a pointer to the ring struct instead of a
pointer to the adapter struct.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: convert head, tail and itr_register offsets to __iomem pointers
Bruce Allan [Fri, 16 Dec 2011 00:45:40 +0000 (00:45 +0000)]
e1000e: convert head, tail and itr_register offsets to __iomem pointers

The Tx/Rx head and tail registers and itr_register are always at known
addresses based on the __iomem address at which the PCI region (from BAR 0)
is mapped and known offsets within the region for each of these registers.
Store and use the full address rather than just the region offset to reduce
unnecessary address calculations.  Also, change current u8 __iomem pointers
to void __iomem pointers.
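
The idea in user-space form: resolve base plus offset once at setup and
keep the resulting pointer.  Here an ordinary array stands in for the
mapped BAR 0 region and a plain store stands in for the kernel's MMIO
accessors; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring: keeps the full register address, not an offset. */
struct demo_ring {
    volatile uint32_t *tail;
};

/* Resolve base + offset once, at setup time. */
static void demo_ring_init(struct demo_ring *ring, uint32_t *bar0,
                           size_t tail_offset)
{
    assert(tail_offset % sizeof(uint32_t) == 0);   /* 32-bit registers */
    ring->tail = (volatile uint32_t *)((uint8_t *)bar0 + tail_offset);
}

/* Hot path: a store through the saved pointer, no address arithmetic. */
static void demo_ring_write_tail(struct demo_ring *ring, uint32_t value)
{
    *ring->tail = value;
}
```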

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: re-enable alternate MAC address for all devices which support it
Bruce Allan [Fri, 16 Dec 2011 00:45:35 +0000 (00:45 +0000)]
e1000e: re-enable alternate MAC address for all devices which support it

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: add Receive Packet Steering (RPS) support
Bruce Allan [Wed, 11 Jan 2012 01:26:50 +0000 (01:26 +0000)]
e1000e: add Receive Packet Steering (RPS) support

Enable RPS by default.  Disallow jumbo frames when both receive checksum
and receive hashing are enabled because the hardware cannot do both IP
payload checksum (enabled when receive checksum is enabled when using
packet split which is used for jumbo frames) and provide RSS hash at the
same time.
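
The restriction reduces to a simple predicate.  This is a sketch with
hypothetical names; it treats any MTU above the standard 1500-byte
Ethernet payload as a jumbo frame, which is an assumption about where the
driver draws the line.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_ETH_DATA_LEN 1500u   /* standard Ethernet payload size */

/* Jumbo frames are refused only when receive checksum and receive
 * hashing are both enabled. */
static bool demo_mtu_allowed(unsigned mtu, bool rx_csum, bool rx_hash)
{
    bool jumbo = mtu > DEMO_ETH_DATA_LEN;

    return !(jumbo && rx_csum && rx_hash);
}
```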

v2: added ethtool command to query flow hashing behavior per Ben Hutchings
    and changed the type of rsskey to cleanup the setting of the register
    array and avoid unnecessary casts (as pointed out by Joe Perches).
    The long error messages are not changed since there is nothing in
    the kernel ./Documentation that suggests the preferred method for
    dealing with long messages other than to never break strings; leaving
    them as-is for now.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: cleanup Rx checksum offload code
Bruce Allan [Thu, 5 Jan 2012 00:34:05 +0000 (00:34 +0000)]
e1000e: cleanup Rx checksum offload code

1) cleanup whitespace in e1000_rx_checksum() function header comment
2) do not check hardware checksum when Rx checksum is disabled
3) reduce duplicated calls to le16_to_cpu() by just using it within
   e1000_rx_checksum() instead of in each call to the function

v2: use swab16 instead of le16_to_cpu & htons and corrected type for the
passed-in csum
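
On the v2 note: a little-endian-to-CPU conversion followed by a
CPU-to-network (big-endian) conversion always amounts to one 16-bit byte
swap, because on either host endianness exactly one of the two steps
swaps.  A stand-alone equivalent of such a swap, with a hypothetical name
(the kernel provides its own swab16()):

```c
#include <assert.h>
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. */
static uint16_t demo_swab16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```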

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoinfiniband: nes: Convert nes_addr_resolve_neigh() over to dst_neigh_lookup().
David Miller [Tue, 24 Jan 2012 13:16:03 +0000 (13:16 +0000)]
infiniband: nes: Convert nes_addr_resolve_neigh() over to dst_neigh_lookup().

Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoinfiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().
David Miller [Tue, 24 Jan 2012 13:15:57 +0000 (13:15 +0000)]
infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().

Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>