Thomas Klein [Tue, 20 Apr 2010 23:11:31 +0000 (23:11 +0000)]
ehea: fix possible DLPAR/mem deadlock
Force serialization of userspace-triggered DLPAR/mem operations
Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Klein [Tue, 20 Apr 2010 23:10:55 +0000 (23:10 +0000)]
ehea: error handling improvement
Reset a port's resources only if they're actually in an error state
Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Röjfors [Wed, 21 Apr 2010 23:33:29 +0000 (16:33 -0700)]
ks8842: Add platform data for setting mac address
This patch adds platform data to the ks8842 driver.
Via the platform data a MAC address, to be used by the controller,
can be passed.
To ensure this MAC address is used, the MAC address is written
after each hardware reset.
Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikanth Karthikesan [Thu, 15 Apr 2010 02:21:23 +0000 (02:21 +0000)]
net: small cleanup of lib8390
Remove the always true #if 1. Also the unecessary re-test of ei_local->irqlock
and the unreachable printk format string.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 14 Apr 2010 09:55:35 +0000 (09:55 +0000)]
fasync: RCU and fine grained locking
kill_fasync() uses a central rwlock, candidate for RCU conversion, to
avoid cache line ping pongs on SMP.
fasync_remove_entry() and fasync_add_entry() can disable IRQS on a short
section instead during whole list scan.
Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and
fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync()
doesnt need its own implementation and can use fasync_helper(), to
reduce code size and complexity.
We can remove __kill_fasync() direct use in net/socket.c, and rename it
to kill_fasync_rcu().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 21:59:20 +0000 (14:59 -0700)]
tcp: Mark v6 response packets as CHECKSUM_PARTIAL
Otherwise we only get the checksum right for data-less TCP responses.
Noticed by Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 08:57:01 +0000 (01:57 -0700)]
tcp: Fix ipv6 checksumming on response packets for real.
Commit
6651ffc8e8bdd5fb4b7d1867c6cfebb4f309512c
("ipv6: Fix tcp_v6_send_response transport header setting.")
fixed one half of why ipv6 tcp response checksums were
invalid, but it's not the whole story.
If we're going to use CHECKSUM_PARTIAL for these things (which we are
since commit
2e8e18ef52e7dd1af0a3bd1f7d990a1d0b249586 "tcp: Set
CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"), we can't be setting
buff->csum as we always have been here in tcp_v6_send_response. We
need to leave it at zero.
Kill that line and checksums are good again.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 08:14:25 +0000 (01:14 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/wireless/iwlwifi/iwl-6000.c
net/core/dev.c
David Howells [Tue, 20 Apr 2010 00:25:58 +0000 (00:25 +0000)]
net: Fix an RCU warning in dev_pick_tx()
Fix the following RCU warning in dev_pick_tx():
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
net/core/dev.c:1993 invoked rcu_dereference_check() without protection!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
2 locks held by swapper/0:
#0: (&idev->mc_ifc_timer){+.-...}, at: [<
ffffffff81039e65>] run_timer_softirq+0x17b/0x278
#1: (rcu_read_lock_bh){.+....}, at: [<
ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc
stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4
Call Trace:
<IRQ> [<
ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2
[<
ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc
[<
ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc
[<
ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
[<
ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1
[<
ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c
[<
ffffffff8134f673>] ip6_output_finish+0x7c/0xb4
[<
ffffffff81350c34>] ip6_output2+0x256/0x261
[<
ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf
[<
ffffffff813517fb>] ip6_output+0xbbc/0xbcb
[<
ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d
[<
ffffffff81368acb>] mld_sendpack+0x273/0x39d
[<
ffffffff81368858>] ? mld_sendpack+0x0/0x39d
[<
ffffffff81052099>] ? mark_held_locks+0x52/0x70
[<
ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288
[<
ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278
[<
ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278
[<
ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288
[<
ffffffff81035531>] ? __do_softirq+0x69/0x140
[<
ffffffff8103556a>] __do_softirq+0xa2/0x140
[<
ffffffff81002e0c>] call_softirq+0x1c/0x28
[<
ffffffff81004b54>] do_softirq+0x38/0x80
[<
ffffffff81034f06>] irq_exit+0x45/0x47
[<
ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96
[<
ffffffff810028d3>] apic_timer_interrupt+0x13/0x20
<EOI> [<
ffffffff810488dd>] ? __atomic_notifier_call_chain+0x0/0x86
[<
ffffffff810096bf>] ? mwait_idle+0x6e/0x78
[<
ffffffff810096b6>] ? mwait_idle+0x65/0x78
[<
ffffffff810011cb>] cpu_idle+0x4d/0x83
[<
ffffffff81380b05>] rest_init+0xb9/0xc0
[<
ffffffff81380a4c>] ? rest_init+0x0/0xc0
[<
ffffffff8168dcf0>] start_kernel+0x392/0x39d
[<
ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7
[<
ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb
An rcu_dereference() should be an rcu_dereference_bh().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 07:50:39 +0000 (00:50 -0700)]
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Herbert Xu [Wed, 21 Apr 2010 07:47:15 +0000 (00:47 -0700)]
ipv6: Fix tcp_v6_send_response transport header setting.
My recent patch to remove the open-coded checksum sequence in
tcp_v6_send_response broke it as we did not set the transport
header pointer on the new packet.
Actually, there is code there trying to set the transport
header properly, but it sets it for the wrong skb ('skb'
instead of 'buff').
This bug was introduced by commit
a8fdf2b331b38d61fb5f11f3aec4a4f9fb2dedcb ("ipv6: Fix
tcp_v6_send_response(): it didn't set skb transport header")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Wed, 21 Apr 2010 05:39:53 +0000 (22:39 -0700)]
net: Remove two unnecessary exports (skbuff).
There is no need to export skb_under_panic() and skb_over_panic() in
skbuff.c, since these methods are used only in skbuff.c ; this patch
removes these two exports. It also marks these functions as 'static'
and removeS the extern declarations of them from
include/linux/skbuff.h
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 21 Apr 2010 02:06:52 +0000 (19:06 -0700)]
net: Fix various endianness glitches
Sparse can help us find endianness bugs, but we need to make some
cleanups to be able to more easily spot real bugs.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 20 Apr 2010 03:20:05 +0000 (03:20 +0000)]
bridge: add a missing ntohs()
grec_nsrcs is in network order, we should convert to host horder in
br_multicast_igmp3_report()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 01:49:45 +0000 (18:49 -0700)]
tg3: Enable GRO by default.
This was merely an oversight when I added the *_gro_receive()
calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 01:44:52 +0000 (18:44 -0700)]
niu: Enable GRO by default.
This was merely an oversight when I added the napi_gro_receive()
calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2010 00:57:56 +0000 (17:57 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6
Eric Dumazet [Tue, 20 Apr 2010 13:03:51 +0000 (13:03 +0000)]
net: sk_sleep() helper
Define a new function to return the waitqueue of a "struct sock".
static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
return sk->sk_sleep;
}
Change all read occurrences of sk_sleep by a call to this function.
Needed for a future RCU conversion. sk_sleep wont be a field directly
available.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 20 Apr 2010 16:39:40 +0000 (09:39 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
quota: Convert __DQUOT_PARANOIA symbol to standard config option
Jan Kara [Mon, 19 Apr 2010 14:47:20 +0000 (16:47 +0200)]
quota: Convert __DQUOT_PARANOIA symbol to standard config option
Make __DQUOT_PARANOIA define from the old days a standard config option
and turn it off by default.
This gets rid of a quota warning about writes before quota is turned on
for systems with ext4 root filesystem. Currently there's no way to legally
solve this because /etc/mtab has to be written before quota is turned on
on most systems.
Signed-off-by: Jan Kara <jack@suse.cz>
Linus Torvalds [Tue, 20 Apr 2010 16:21:19 +0000 (09:21 -0700)]
Merge branch 'urgent' of git://git./linux/kernel/git/brodo/pcmcia-2.6
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: fix error handling in cm4000_cs.c
drivers/pcmcia: Add missing local_irq_restore
serial_cs: MD55x support (PCMCIA GPRS/EDGE modem) (kernel 2.6.33)
pcmcia: avoid late calls to pccard_validate_cis
pcmcia: fix ioport size calculation in rsrc_nonstatic
pcmcia: re-start on MFC override
pcmcia: fix io_probe due to parent (PCI) resources
pcmcia: use previously assigned IRQ for all card functions
Linus Torvalds [Tue, 20 Apr 2010 16:20:55 +0000 (09:20 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix hardirq tracing in trap return path.
sparc64: Use correct pt_regs in decode_access_size() error paths.
sparc64: Fix PREEMPT_ACTIVE value.
sparc64: Run NMIs on the hardirq stack.
sparc64: Allocate sufficient stack space in ftrace stubs.
sparc: Fix forgotten kmemleak headers inclusion
Linus Torvalds [Tue, 20 Apr 2010 16:20:23 +0000 (09:20 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Fix unsafe frame rewinding with hot regs fetching
Linus Torvalds [Tue, 20 Apr 2010 16:20:11 +0000 (09:20 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: delay vblank cleanup until after driver unload
Christoph Hellwig [Tue, 20 Apr 2010 03:31:02 +0000 (05:31 +0200)]
x86: correctly wire up the newuname system call
Before commit
e28cbf22933d0c0ccaf3c4c27a1a263b41f73859 ("improve
sys_newuname() for compat architectures") 64-bit x86 had a private
implementation of sys_uname which was just called sys_uname, which other
architectures used for the old uname.
Due to some merge issues with the uname refactoring patches we ended up
calling the old uname version for both the old and new system call
slots, which lead to the domainname filed never be set which caused
failures with libnss_nis.
Reported-and-tested-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Pirko [Tue, 20 Apr 2010 08:45:37 +0000 (01:45 -0700)]
net: emphasize rtnl lock required in call_netdevice_notifiers
Since netdev_chain is guarded by rtnl_lock, ASSERT_RTNL should be
present here to make sure that all callers of call_netdevice_notifiers
does the locking properly.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 19 Apr 2010 21:56:38 +0000 (21:56 +0000)]
rps: consistent rxhash
In case we compute a software skb->rxhash, we can generate a consistent
hash : Its value will be the same in both flow directions.
This helps some workloads, like conntracking, since the same state needs
to be accessed in both directions.
tbench + RFS + this patch gives better results than tbench with default
kernel configuration (no RPS, no RFS)
Also fixed some sparse warnings.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 19 Apr 2010 21:17:14 +0000 (21:17 +0000)]
rps: cleanups
struct softnet_data holds many queues, so consistent use "sd" name
instead of "queue" is better.
Adds a rps_ipi_queued() helper to cleanup enqueue_to_backlog()
Adds a _and_irq_disable suffix to net_rps_action() name, as David
suggested.
incr_input_queue_head() becomes input_queue_head_incr()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Apr 2010 07:48:37 +0000 (00:48 -0700)]
sparc64: Fix hardirq tracing in trap return path.
We can overflow the hardirq stack if we set the %pil here
so early, just let the normal control flow do it.
This is fine as we are allowed to do the actual IRQ enable
at any point after we call trace_hardirqs_on.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Barnes [Fri, 26 Mar 2010 18:07:16 +0000 (11:07 -0700)]
drm: delay vblank cleanup until after driver unload
Drivers may use vblank calls now (e.g. drm_vblank_off) in their unload
paths, so don't clean up the vblank related structures until after
driver unload.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Mon, 19 Apr 2010 23:29:56 +0000 (16:29 -0700)]
Linux 2.6.34-rc5
Rik van Riel [Wed, 14 Apr 2010 21:59:28 +0000 (17:59 -0400)]
rmap: add exclusively owned pages to the newest anon_vma
The recent anon_vma fixes cause many anonymous pages to end up
in the parent process anon_vma, even when the page is exclusively
owned by the current process.
Adding exclusively owned anonymous pages to the top anon_vma
reduces rmap scanning overhead, especially in workloads with
forking servers.
This patch adds a parameter to __page_set_anon_rmap that can
be used to indicate whether or not the added page is exclusively
owned by the current process.
Pages added through page_add_new_anon_rmap are exclusively
owned by the current process, and can be added to the top
anon_vma.
Pages added through page_add_anon_rmap can be either shared
or exclusively owned, so we do the conservative thing and
add it to the oldest anon_vma.
A next step would be to add the exclusive parameter to
page_add_anon_rmap, to be used from functions where we do
know for sure whether a page is exclusively owned.
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Lightly-tested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
[ Edited to look nicer - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Mon, 19 Apr 2010 21:40:57 +0000 (14:40 -0700)]
rps: static functions
store_rps_map() & store_rps_dev_flow_table_cnt() are static.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 19 Apr 2010 21:20:32 +0000 (14:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ecryptfs/ecryptfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
eCryptfs: Turn lower lookup error messages into debug messages
eCryptfs: Copy lower directory inode times and size on link
ecryptfs: fix use with tmpfs by removing d_drop from ecryptfs_destroy_inode
ecryptfs: fix error code for missing xattrs in lower fs
eCryptfs: Decrypt symlink target for stat size
eCryptfs: Strip metadata in xattr flag in encrypted view
eCryptfs: Clear buffer before reading in metadata xattr
eCryptfs: Rename ecryptfs_crypt_stat.num_header_bytes_at_front
eCryptfs: Fix metadata in xattr feature regression
Alexander Kuznetsov [Mon, 19 Apr 2010 21:17:43 +0000 (14:17 -0700)]
8139too: Fix a typo in the function name.
Signed-off-by: Alexander Kuznetsov <alr.kuznetsov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Apr 2010 20:46:48 +0000 (13:46 -0700)]
sparc64: Use correct pt_regs in decode_access_size() error paths.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reinette Chatre [Mon, 19 Apr 2010 17:46:31 +0000 (10:46 -0700)]
mac80211: pass HT changes to driver when off channel
Since "mac80211: make off-channel work generic" drivers have not been
notified of configuration changes after association or authentication. This
caused more dependence on current state to ensure driver will be notified
when configuration changes occur. One such problem arises if off-channel is
in progress when HT information changes. Since HT is only enabled on the
"oper_channel" the driver will never be notified of this change. Usually
the driver is notified soon after of a BSS information change
(BSS_CHANGED_HT) ... but since the driver did not get a notification that
this is a HT channel the new BSS information does not make sense.
Fix this by also changing the off-channel information when HT is enabled
and thus cause driver to be notified correctly.
This fixes a problem in 4965 when associated with 5GHz 40MHz channel.
Without this patch the system can associate but is unable to transfer any
data, not even ping.
See http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2158
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Mon, 19 Apr 2010 08:48:38 +0000 (10:48 +0200)]
mac80211: remove bogus TX agg state assignment
When the addba timer expires but has no work to do,
it should not affect the state machine. If it does,
TX will not see the successfully established and we
can also crash trying to re-establish the session.
Cc: stable@kernel.org [2.6.32, 2.6.33]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Eric Dumazet [Mon, 19 Apr 2010 05:07:33 +0000 (05:07 +0000)]
rps: shortcut net_rps_action()
net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if
RPS is not active.
Tom Herbert used two bitmasks to hold information needed to send IPI,
but a single LIFO list seems more appropriate.
Move all RPS logic into net_rps_action() to cleanup net_rx_action() code
(remove two ifdefs)
Move rps_remote_softirq_cpus into softnet_data to share its first cache
line, filling an existing hole.
In a future patch, we could call net_rps_action() from process_backlog()
to make sure we send IPI before handling this cpu backlog.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:15:17 +0000 (01:15 +0000)]
bnx2x: Date and version
Set version to 1.52.53-1.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:15:08 +0000 (01:15 +0000)]
bnx2x: Don't report link down if has been already down
Author: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:49 +0000 (01:14 +0000)]
bnx2x: Rework power state handling code
Move "don't shut down the power" logic into bnx2x_set_power_state() to make the code cleaner.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:37 +0000 (01:14 +0000)]
bnx2x: use mask in test_registers() to avoid parity error
Properly mask the value to be written to the register (according to the register size) during the self-test.
Otherwise immediate parity error would be generated.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:18 +0000 (01:14 +0000)]
bnx2x: Fixed MSI-X enabling flow
Try to enable less MSI-X vectors if initial request has failed.
Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:07 +0000 (01:14 +0000)]
bnx2x: Added new statistics
Added total_mcast/bcast_pkts_transmitted statistics.
Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:57 +0000 (01:13 +0000)]
bnx2x: White spaces
White spaces, code readability and prints.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:49 +0000 (01:13 +0000)]
bnx2x: Protect code with NOMCP
Don't run code that can't be run if MCP is not present.
This will prevent NULL pointer dereferencing.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:33 +0000 (01:13 +0000)]
bnx2x: Increase DMAE max write size for 57711
Increase DMAE max write size for 57711 to the maximum allowed value.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:23 +0000 (01:13 +0000)]
bnx2x: Use VPD-R V0 entry to display firmware revision
Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:12 +0000 (01:13 +0000)]
bnx2x: Parity errors handling for 57710 and 57711
This patch introduces the parity errors handling code for 57710 and 57711 chips.
HW is configured to stop all DMA transactions to the host and sending packets to the network
once parity error is detected, which is meant to prevent silent data corruption.
At the same time HW generates the attention interrupt to every function of the device where parity
has been detected so that driver can start the recovery flow.
The recovery is actually resetting the chip and restarting the driver on all active functions
of the chip where the parity error has been reported.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tyler Hicks [Thu, 25 Mar 2010 16:16:56 +0000 (11:16 -0500)]
eCryptfs: Turn lower lookup error messages into debug messages
Vaugue warnings about ENAMETOOLONG errors when looking up an encrypted
file name have caused many users to become concerned about their data.
Since this is a rather harmless condition, I'm moving this warning to
only be printed when the ecryptfs_verbosity module param is 1.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Tyler Hicks [Tue, 23 Mar 2010 23:09:02 +0000 (18:09 -0500)]
eCryptfs: Copy lower directory inode times and size on link
The timestamps and size of a lower inode involved in a link() call was
being copied to the upper parent inode. Instead, we should be
copying lower parent inode's timestamps and size to the upper parent
inode. I discovered this bug using the POSIX test suite at Tuxera.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Jeff Mahoney [Fri, 19 Mar 2010 19:35:46 +0000 (15:35 -0400)]
ecryptfs: fix use with tmpfs by removing d_drop from ecryptfs_destroy_inode
Since tmpfs has no persistent storage, it pins all its dentries in memory
so they have d_count=1 when other file systems would have d_count=0.
->lookup is only used to create new dentries. If the caller doesn't
instantiate it, it's freed immediately at dput(). ->readdir reads
directly from the dcache and depends on the dentries being hashed.
When an ecryptfs mount is mounted, it associates the lower file and dentry
with the ecryptfs files as they're accessed. When it's umounted and
destroys all the in-memory ecryptfs inodes, it fput's the lower_files and
d_drop's the lower_dentries. Commit
4981e081 added this and a d_delete in
2008 and several months later commit
caeeeecf removed the d_delete. I
believe the d_drop() needs to be removed as well.
The d_drop effectively hides any file that has been accessed via ecryptfs
from the underlying tmpfs since it depends on it being hashed for it to
be accessible. I've removed the d_drop on my development node and see no
ill effects with basic testing on both tmpfs and persistent storage.
As a side effect, after ecryptfs d_drops the dentries on tmpfs, tmpfs
BUGs on umount. This is due to the dentries being unhashed.
tmpfs->kill_sb is kill_litter_super which calls d_genocide to drop
the reference pinning the dentry. It skips unhashed and negative dentries,
but shrink_dcache_for_umount_subtree doesn't. Since those dentries
still have an elevated d_count, we get a BUG().
This patch removes the d_drop call and fixes both issues.
This issue was reported at:
https://bugzilla.novell.com/show_bug.cgi?id=567887
Reported-by: Árpád Bíró <biroa@demasz.hu>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Christian Pulvermacher [Tue, 23 Mar 2010 16:51:38 +0000 (11:51 -0500)]
ecryptfs: fix error code for missing xattrs in lower fs
If the lower file system driver has extended attributes disabled,
ecryptfs' own access functions return -ENOSYS instead of -EOPNOTSUPP.
This breaks execution of programs in the ecryptfs mount, since the
kernel expects the latter error when checking for security
capabilities in xattrs.
Signed-off-by: Christian Pulvermacher <pulvermacher@gmx.de>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Tyler Hicks [Mon, 22 Mar 2010 05:41:35 +0000 (00:41 -0500)]
eCryptfs: Decrypt symlink target for stat size
Create a getattr handler for eCryptfs symlinks that is capable of
reading the lower target and decrypting its path. Prior to this patch,
a stat's st_size field would represent the strlen of the encrypted path,
while readlink() would return the strlen of the decrypted path. This
could lead to confusion in some userspace applications, since the two
values should be equal.
https://bugs.launchpad.net/bugs/524919
Reported-by: Loïc Minier <loic.minier@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Linus Torvalds [Mon, 19 Apr 2010 18:53:17 +0000 (11:53 -0700)]
Fix ISDN/Gigaset build failure
Commit
b91ecb00 ("gigaset: include cleanup cleanup") removed an implicit
sched.h inclusion that came in via slab.h, and caused various compile
problems as a result.
This should fix it.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 19 Apr 2010 15:35:47 +0000 (08:35 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Make RCU lockdep check the lockdep_recursion variable
rcu: Update docs for rcu_access_pointer and rcu_dereference_protected
rcu: Better explain the condition parameter of rcu_dereference_check()
rcu: Add rcu_access_pointer and rcu_dereference_protected
Linus Torvalds [Mon, 19 Apr 2010 14:27:45 +0000 (07:27 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
gigaset: include cleanup cleanup
packet : remove init_net restriction
WAN: flush tx_queue in hdlc_ppp to prevent panic on rmmod hw_driver.
ip: Fix ip_dev_loopback_xmit()
net: dev_pick_tx() fix
fib: suppress lockdep-RCU false positive in FIB trie.
tun: orphan an skb on tx
forcedeth: fix tx limit2 flag check
iwlwifi: work around bogus active chains detection
Linus Torvalds [Mon, 19 Apr 2010 14:27:06 +0000 (07:27 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: add FireMV 2400 PCI ID.
drm/radeon/kms: allow R500 regs VAP_ALT_NUM_VERTICES and VAP_INDEX_OFFSET
drivers/gpu/radeon: Add MSPOS regs to safe list.
drm/radeon/kms: disable the tv encoder when tv/cv is not in use
drm/radeon/kms: adjust pll settings for tv
drm/radeon/kms: fix tv dac conflict resolver
drm/radeon/kms/evergreen: don't enable hdmi audio stuff
drm/radeon/kms/atom: fix dual-link DVI on DCE3.2/4.0
drm/radeon/kms: fix rs600 tlb flush
drm/radeon/kms: print GPU family and device id when loading
drm/radeon/kms: fix calculation of mipmapped 3D texture sizes
drm/radeon/kms: only change mode when coherent value changes.
drm/radeon/kms: more atom parser fixes (v2)
Linus Torvalds [Mon, 19 Apr 2010 14:26:21 +0000 (07:26 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 5974/1: arm/mach-at91 Makefile: remove two blanks.
ARM: 6052/1: kdump: make kexec work in interrupt context
ARM: 6051/1: VFP: preserve the HW context when calling signal handlers
ARM: 6050/1: VFP: fix the SMP versions of vfp_{sync,flush}_hwstate
ARM: 6007/1: fix highmem with VIPT cache and DMA
ARM: 5975/1: AT91 slow-clock suspend: don't wait when turning PLLs off
Dan Carpenter [Sun, 18 Apr 2010 19:07:33 +0000 (22:07 +0300)]
pcmcia: fix error handling in cm4000_cs.c
In the original code we used -ENODEV as the number of bytes to
copy_to_user() and we didn't release the locks.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Dave Airlie [Mon, 19 Apr 2010 07:54:31 +0000 (17:54 +1000)]
drm/radeon/kms: add FireMV 2400 PCI ID.
This is an M24/X600 chip.
From RH# 581927
cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
David S. Miller [Mon, 19 Apr 2010 08:30:51 +0000 (01:30 -0700)]
sparc64: Fix PREEMPT_ACTIVE value.
It currently overlaps the NMI bit.
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul E. McKenney [Thu, 15 Apr 2010 19:50:39 +0000 (12:50 -0700)]
rcu: Make RCU lockdep check the lockdep_recursion variable
The lockdep facility temporarily disables lockdep checking by
incrementing the current->lockdep_recursion variable. Such
disabling happens in NMIs and in other situations where lockdep
might expect to recurse on itself.
This patch therefore checks current->lockdep_recursion, disabling RCU
lockdep splats when this variable is non-zero. In addition, this patch
removes the "likely()", as suggested by Lai Jiangshan.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: David Miller <davem@davemloft.net>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: eric.dumazet@gmail.com
LKML-Reference: <
20100415195039.GA22623@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Marek Olšák [Sun, 21 Feb 2010 20:24:15 +0000 (21:24 +0100)]
drm/radeon/kms: allow R500 regs VAP_ALT_NUM_VERTICES and VAP_INDEX_OFFSET
[airlied: fix V_A_N_V to not be safe and fix check to make sure only r500
- bump userspace version]
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Corbin Simpson [Sun, 11 Apr 2010 19:34:00 +0000 (12:34 -0700)]
drivers/gpu/radeon: Add MSPOS regs to safe list.
Permits MSAA and D3D-style rasterization.
[airlied: add rs600]
Signed-off-by: Corbin Simpson <MostAwesomeDude@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 13 Apr 2010 15:21:59 +0000 (11:21 -0400)]
drm/radeon/kms: disable the tv encoder when tv/cv is not in use
Switching between TV and VGA caused VGA to break on some systems
since the TV encoder was left enabled when VGA was used.
fixes fdo bug 25520.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Fri, 9 Apr 2010 19:31:56 +0000 (15:31 -0400)]
drm/radeon/kms: adjust pll settings for tv
May fix fdo bug 26582.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Thu, 15 Apr 2010 17:31:12 +0000 (13:31 -0400)]
drm/radeon/kms: fix tv dac conflict resolver
On systems with the tv dac shared between DVI and TV,
we can only use the dac for one of the connectors.
However, when using a digital monitor on the DVI port,
you can use the dac for the TV connector just fine.
Check the use_digital status when resolving the conflict.
Fixes fdo bug 27649, possibly others.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Fri, 16 Apr 2010 15:35:30 +0000 (11:35 -0400)]
drm/radeon/kms/evergreen: don't enable hdmi audio stuff
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Thu, 15 Apr 2010 20:54:38 +0000 (16:54 -0400)]
drm/radeon/kms/atom: fix dual-link DVI on DCE3.2/4.0
Got broken during the evergreen merge.
Fixes fdo bug 27001.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Fri, 16 Apr 2010 16:46:35 +0000 (18:46 +0200)]
drm/radeon/kms: fix rs600 tlb flush
Typo in in flush leaded to no flush of the RS600 tlb which
ultimately leaded to massive system ram corruption, with
this patch everythings seems to work properly.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Mon, 12 Apr 2010 20:21:53 +0000 (20:21 +0000)]
drm/radeon/kms: print GPU family and device id when loading
This will help figuring out GPU when looking at bugs log.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Dumazet [Fri, 16 Apr 2010 12:18:22 +0000 (12:18 +0000)]
net: Introduce skb_orphan_try()
Transmitted skb might be attached to a socket and a destructor, for
memory accounting purposes.
Traditionally, this destructor is called at tx completion time, when skb
is freed.
When tx completion is performed by another cpu than the sender, this
forces some cache lines to change ownership. XPS was an attempt to give
tx completion to initial cpu.
David idea is to call destructor right before giving skb to device (call
to ndo_start_xmit()). Because device queues are usually small, orphaning
skb before tx completion is not a big deal. Some drivers already do
this, we could do it in upper level.
There is one known exception to this early orphaning, called tx
timestamping. It needs to keep a reference to socket until device can
give a hardware or software timestamp.
This patch adds a skb_orphan_try() helper, to centralize all exceptions
to early orphaning in one spot, and use it in dev_hard_start_xmit().
"tbench 16" results on a Nehalem machine (2 X5570 @ 2.93GHz)
before: Throughput 4428.9 MB/sec 16 procs
after: Throughput 4448.14 MB/sec 16 procs
UDP should get even better results, its destructor being more complex,
since SOCK_USE_WRITE_QUEUE is not set (four atomic ops instead of one)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 17 Apr 2010 04:17:02 +0000 (04:17 +0000)]
net: remove time limit in process_backlog()
- There is no point to enforce a time limit in process_backlog(), since
other napi instances dont follow same rule. We can exit after only one
packet processed...
The normal quota of 64 packets per napi instance should be the norm, and
net_rx_action() already has its own time limit.
Note : /proc/net/core/dev_weight can be used to tune this 64 default
value.
- Use DEFINE_PER_CPU_ALIGNED for softnet_data definition.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Fri, 16 Apr 2010 12:08:58 +0000 (12:08 +0000)]
gigaset: include cleanup cleanup
Commit
5a0e3ad causes slab.h to be included twice in many of the
Gigaset driver's source files, first via the common include file
gigaset.h and then a second time directly. Drop the spares, and
use the opportunity to clean up a few more similar cases.
Impact: cleanup, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 17 Apr 2010 21:28:50 +0000 (14:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/anholt/drm-intel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Ignore LVDS EDID when it is unavailabe or invalid
drm/i915: Add no_lvds entry for the Clientron U800
drm/i915: Rename many remaining uses of "output" to encoder or connector.
drm/i915: Rename intel_output to intel_encoder.
agp/intel: intel_845_driver is an agp driver!
drm/i915: introduce to_intel_bo helper
drm/i915: Disable FBC on 915GM and 945GM.
Linus Torvalds [Sat, 17 Apr 2010 17:58:38 +0000 (10:58 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: EC: Limit burst to 64 bits
Linus Torvalds [Sat, 17 Apr 2010 17:57:56 +0000 (10:57 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: don't warn on EAGAIN in inode reclaim
xfs: ensure that sync updates the log tail correctly
Julia Lawall [Mon, 29 Mar 2010 15:35:24 +0000 (17:35 +0200)]
drivers/pcmcia: Add missing local_irq_restore
Use local_irq_restore in this error-handling case just like in the one just
below.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* local_irq_save (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Timur Maximov [Wed, 14 Apr 2010 15:06:57 +0000 (19:06 +0400)]
serial_cs: MD55x support (PCMCIA GPRS/EDGE modem) (kernel 2.6.33)
Many PCMCIA GPRS modems like: Onda Edge N100E, Novaway PC98 (OEM SPC98Z),
Rovermate Edgus Adaptmate-039 and others have same construction and
identification:
lspcmcia -vvv
Product Name: Generic Modem: MD55x 1.00 Serial number: xxxxx-xxx
Identification: manf_id: 0x015d card_id: 0x4c45
function: 2 (serial)
prod_id(1): "Generic" (0xc49e4731)
prod_id(2): "Modem: MD55x" (0x8913b110)
prod_id(3): "1.00" (0x83dbf271)
prod_id(4): "Serial number: xxxxx-xxx" (0x73ee9514)
Serial connection to GSM module based on Elan VPU16551 PCMCIA UART with
datasheet recommeded 14.7456MHz crystal oscillator.
By default serial_cs set UART clock ==
1843200 Hz
For correct work need set clock
14745600 Hz.
This quirk already present in driver, only need add device in quirk list.
Signed-off-by: Timur Maximov <xcom.org@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Dominik Brodowski [Sat, 17 Apr 2010 15:37:33 +0000 (17:37 +0200)]
pcmcia: avoid late calls to pccard_validate_cis
pccard_validate_cis() nowadays destroys the CIS cache. Therefore,
calling it after card setup should be avoided. We can't control
the deprecated PCMCIA ioctl (which is only used on ARM nowadays),
but we can avoid -- and report -- any other calls.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Eric Dumazet [Sat, 17 Apr 2010 07:54:36 +0000 (00:54 -0700)]
rps: rps_sock_flow_table is mostly read
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Fri, 16 Apr 2010 23:01:27 +0000 (16:01 -0700)]
rfs: Receive Flow Steering
This patch implements receive flow steering (RFS). RFS steers
received packets for layer 3 and 4 processing to the CPU where
the application for the corresponding flow is running. RFS is an
extension of Receive Packet Steering (RPS).
The basic idea of RFS is that when an application calls recvmsg
(or sendmsg) the application's running CPU is stored in a hash
table that is indexed by the connection's rxhash which is stored in
the socket structure. The rxhash is passed in skb's received on
the connection from netif_receive_skb. For each received packet,
the associated rxhash is used to look up the CPU in the hash table,
if a valid CPU is set then the packet is steered to that CPU using
the RPS mechanisms.
The convolution of the simple approach is that it would potentially
allow OOO packets. If threads are thrashing around CPUs or multiple
threads are trying to read from the same sockets, a quickly changing
CPU value in the hash table could cause rampant OOO packets--
we consider this a non-starter.
To avoid OOO packets, this solution implements two types of hash
tables: rps_sock_flow_table and rps_dev_flow_table.
rps_sock_table is a global hash table. Each entry is just a CPU
number and it is populated in recvmsg and sendmsg as described above.
This table contains the "desired" CPUs for flows.
rps_dev_flow_table is specific to each device queue. Each entry
contains a CPU and a tail queue counter. The CPU is the "current"
CPU for a matching flow. The tail queue counter holds the value
of a tail queue counter for the associated CPU's backlog queue at
the time of last enqueue for a flow matching the entry.
Each backlog queue has a queue head counter which is incremented
on dequeue, and so a queue tail counter is computed as queue head
count + queue length. When a packet is enqueued on a backlog queue,
the current value of the queue tail counter is saved in the hash
entry of the rps_dev_flow_table.
And now the trick: when selecting the CPU for RPS (get_rps_cpu)
the rps_sock_flow table and the rps_dev_flow table for the RX queue
are consulted. When the desired CPU for the flow (found in the
rps_sock_flow table) does not match the current CPU (found in the
rps_dev_flow table), the current CPU is changed to the desired CPU
if one of the following is true:
- The current CPU is unset (equal to RPS_NO_CPU)
- Current CPU is offline
- The current CPU's queue head counter >= queue tail counter in the
rps_dev_flow table. This checks if the queue tail has advanced
beyond the last packet that was enqueued using this table entry.
This guarantees that all packets queued using this entry have been
dequeued, thus preserving in order delivery.
Making each queue have its own rps_dev_flow table has two advantages:
1) the tail queue counters will be written on each receive, so
keeping the table local to interrupting CPU s good for locality. 2)
this allows lockless access to the table-- the CPU number and queue
tail counter need to be accessed together under mutual exclusion
from netif_receive_skb, we assume that this is only called from
device napi_poll which is non-reentrant.
This patch implements RFS for TCP and connected UDP sockets.
It should be usable for other flow oriented protocols.
There are two configuration parameters for RFS. The
"rps_flow_entries" kernel init parameter sets the number of
entries in the rps_sock_flow_table, the per rxqueue sysfs entry
"rps_flow_cnt" contains the number of entries in the rps_dev_flow
table for the rxqueue. Both are rounded to power of two.
The obvious benefit of RFS (over just RPS) is that it achieves
CPU locality between the receive processing for a flow and the
applications processing; this can result in increased performance
(higher pps, lower latency).
The benefits of RFS are dependent on cache hierarchy, application
load, and other factors. On simple benchmarks, we don't necessarily
see improvement and sometimes see degradation. However, for more
complex benchmarks and for applications where cache pressure is
much higher this technique seems to perform very well.
Below are some benchmark results which show the potential benfit of
this patch. The netperf test has 500 instances of netperf TCP_RR
test with 1 byte req. and resp. The RPC test is an request/response
test similar in structure to netperf RR test ith 100 threads on
each host, but does more work in userspace that netperf.
e1000e on 8 core Intel
No RFS or RPS 104K tps at 30% CPU
No RFS (best RPS config): 290K tps at 63% CPU
RFS 303K tps at 61% CPU
RPC test tps CPU% 50/90/99% usec latency Latency StdDev
No RFS/RPS 103K 48% 757/900/3185 4472.35
RPS only: 174K 73% 415/993/2468 491.66
RFS 223K 73% 379/651/1382 315.61
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Lezcano [Wed, 14 Apr 2010 23:11:14 +0000 (23:11 +0000)]
packet : remove init_net restriction
The af_packet protocol is used by Perl to do ioctls as reported by
Stephane Riviere:
"Net::RawIP relies on SIOCGIFADDR et SIOCGIFHWADDR to get the IP and MAC
addresses of the network interface."
But in a new network namespace these ioctl fail because it is disabled for
a namespace different from the init_net_ns.
These two lines should not be there as af_inet and af_packet are
namespace aware since a long time now. I suppose we forget to remove these
lines because we sent the af_packet first, before af_inet was supported.
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Reported-by: Stephane Riviere <stephane.riviere@regis-dgac.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Halasa [Wed, 14 Apr 2010 14:09:52 +0000 (14:09 +0000)]
WAN: flush tx_queue in hdlc_ppp to prevent panic on rmmod hw_driver.
tx_queue is used as a temporary queue when not allowed to queue skb
directly to the hw device driver (which may sleep). Most paths flush
it before returning, but ppp_start() currently cannot. Make sure we
don't leave skbs pointing to a non-existent device.
Thanks to Michael Barkowski for reporting this problem.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shanyu Zhao [Thu, 8 Apr 2010 01:37:52 +0000 (18:37 -0700)]
iwlwifi: correct 6000 EEPROM regulatory address
For 6000 series, the 2.4G HT40 band regulatory settings address in EEPROM
was off by 2.
Before the fix, you'll see this in dmesg:
[79535.788877] ieee80211 phy8: U iwl_mod_ht40_chan_info HT40 Ch. 7 [2.4GHz]
WIDE (0x61 0dBm): Ad-Hoc not supported
[79535.788880] ieee80211 phy8: U iwl_mod_ht40_chan_info HT40 Ch. 11 [2.4GHz]
WIDE (0x61 0dBm): Ad-Hoc not supported
And after the fix:
[91132.688706] ieee80211 phy14: U iwl_mod_ht40_chan_info HT40 Ch. 7 [2.4GHz]
IBSS ACTIVE WIDE (0x6f 0dBm): Ad-Hoc supported
[91132.688709] ieee80211 phy14: U iwl_mod_ht40_chan_info HT40 Ch. 11 [2.4GHz]
IBSS ACTIVE WIDE (0x6f 0dBm): Ad-Hoc supported
Signed-off-by: Shanyu Zhao <shanyu.zhao@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Johannes Berg [Wed, 7 Apr 2010 07:21:36 +0000 (00:21 -0700)]
iwlwifi: fix scan races
When an internal scan is started, nothing protects the
is_internal_short_scan variable which can cause crashes,
cf. https://bugzilla.kernel.org/show_bug.cgi?id=15667.
Fix this by making the short scan request use the mutex
for locking, which requires making the request go to a
work struct so that it can sleep.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Len Brown [Fri, 16 Apr 2010 20:08:07 +0000 (16:08 -0400)]
Merge branch 'bugzilla-15749' into release
Alexey Starikovskiy [Fri, 16 Apr 2010 19:36:40 +0000 (15:36 -0400)]
ACPI: EC: Limit burst to 64 bits
access_bit_width field is u8 in ACPICA, thus 256 value written to it
becomes 0, causing divide by zero later.
Proper fix would be to remove access_bit_width at all, just because
we already have access_byte_width, which is access_bit_width / 8.
Limit access width to 64 bit for now.
https://bugzilla.kernel.org/show_bug.cgi?id=15749
fixes regression caused by the fix for:
https://bugzilla.kernel.org/show_bug.cgi?id=14667
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Dave Chinner [Tue, 13 Apr 2010 05:06:45 +0000 (15:06 +1000)]
xfs: don't warn on EAGAIN in inode reclaim
Any inode reclaim flush that returns EAGAIN will result in the inode
reclaim being attempted again later. There is no need to issue a
warning into the logs about this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Dave Chinner [Tue, 13 Apr 2010 05:06:44 +0000 (15:06 +1000)]
xfs: ensure that sync updates the log tail correctly
Updates to the VFS layer removed an extra ->sync_fs call into the
filesystem during the sync process (from the quota code).
Unfortunately the sync code was unknowingly relying on this call to
make sure metadata buffers were flushed via a xfs_buftarg_flush()
call to move the tail of the log forward in memory before the final
transactions of the sync process were issued.
As a result, the old code would write a very recent log tail value
to the log by the end of the sync process, and so a subsequent crash
would leave nothing for log recovery to do. Hence in qa test 182,
log recovery only replayed a small handle for inode fsync
transactions in this case.
However, with the removal of the extra ->sync_fs call, the log tail
was now not moved forward with the inode fsync transactions near the
end of the sync procese the first (and only) buftarg flush occurred
after these transactions went to disk. The result is that log
recovery now sees a large number of transactions for metadata that
is already on disk.
This usually isn't a problem, but when the transactions include
inode chunk allocation, the inode create transactions and all
subsequent changes are replayed as we cannt rely on what is on disk
is valid. As a result, if the inode was written and contains
unlogged changes, the unlogged changes are lost, thereby violating
sync semantics.
The fix is to always issue a transaction after the buftarg flush
occurs is the log iѕ not idle or covered. This results in a dummy
transaction being written that contains the up-to-date log tail
value, which will be very recent. Indeed, it will be at least as
recent as the old code would have left on disk, so log recovery
will behave exactly as it used to in this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Linus Torvalds [Fri, 16 Apr 2010 14:26:31 +0000 (07:26 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] max63xx driver depends on ioremap()
[WATCHDOG] max63xx: be careful when disabling the watchdog
[WATCHDOG] fixed book E watchdog period register mask.
[WATCHDOG] omap4: Fix WDT Kconfig
Linus Torvalds [Fri, 16 Apr 2010 14:25:48 +0000 (07:25 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: imx-ssi: do not call hrtimer_disable in trigger function
ALSA: hda - Add position_fix quirk for Biostar mobo
ALSA: hda - add a quirk for Clevo M570U laptop
ASoC: imx-ssi: increase minimum periods to 4
ALSA: hda - Avoid invalid "Independent HP" control for VIA codecs
ALSA: hda - Fix control element allocations in VIA codec parser
ALSA: aaci - Fix alignment faults on ARM Cortex introduced by commit
29a4f2d3
ALSA: hda - Add fix-up for Sony VAIO with ALC269
ALSA: hda - Enhance fix-up table for Realtek codecs
ALSA: usb - Fix Oops after usb-midi disconnection
ALSA: hda - Fix initial capture source connections of ALC880/260
ALSA: hda - Fix setup for ALC269vb amic and dmic models
ALSA: hda - Fix auto-parser of ALC269vb for HP pin NID 0x21
ASoC: imx-ssi: Use a hrtimer in FIQ mode
ASoC: imx-pcm-dma-mx2: restart DMA after an error
ASoC: imx-ssi: honor IMX_SSI_DMA flag
ASoC: wm2000: remove unused #include <linux/version.h>
ALSA: hda: Add support for Medion WIM2160
Geert Uytterhoeven [Wed, 7 Apr 2010 17:57:02 +0000 (19:57 +0200)]
[WATCHDOG] max63xx driver depends on ioremap()
Correct fix for the "ioremap() causes build failure on S390" should have been
a dependancy on HAS_IOMEM. So we add this dependancy also (and leave the driver
in the ARM section for now).
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Marc Zyngier [Fri, 9 Apr 2010 16:43:33 +0000 (17:43 +0100)]
[WATCHDOG] max63xx: be careful when disabling the watchdog
When shutting down the watchdog timer, special care must be taken
not to overwrite other bits in the register, as it may be shared
with other peripherals.
For example, on the Arcom Vulcan, the register is shared between
the watchdog and the PCI reset line...
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Luuk Paulussen [Thu, 15 Apr 2010 03:59:10 +0000 (15:59 +1200)]
[WATCHDOG] fixed book E watchdog period register mask.
A previous fix changed the WDTP function to use the period directly,
rather than subtracting from 63. However the mask generation was
not changed, so the mask was coming out as 0. This patch fixes it.
Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Santosh Shilimkar [Wed, 7 Apr 2010 07:47:22 +0000 (13:17 +0530)]
[WATCHDOG] omap4: Fix WDT Kconfig
This patch allows Watchdog timer to be selected for OMAP4 by fixing
Kconfig entry
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Takashi Iwai [Fri, 16 Apr 2010 08:03:48 +0000 (10:03 +0200)]
Merge branch 'fix/hda' into for-linus
Takashi Iwai [Fri, 16 Apr 2010 08:03:42 +0000 (10:03 +0200)]
Merge branch 'fix/misc' into for-linus