GitHub/LineageOS/android_kernel_motorola_exynos9610.git
14 years agoIPVS: Handle Scheduling errors.
Hans Schillstrom [Fri, 19 Nov 2010 13:25:10 +0000 (14:25 +0100)]
IPVS: Handle Scheduling errors.

If ip_vs_conn_fill_param_persist() returns an error to ip_vs_sched_persist(),
the error must propagate to ip_vs_schedule() as *ignored = -1.
Errors from ip_vs_conn_new() in ip_vs_sched_persist() and ip_vs_schedule()
should also set *ignored = -1.

This patch just relies on the fact that ignored is 1 before calling
ip_vs_sched_persist().

Sent from Julian:
  "The new case when ip_vs_conn_fill_param_persist fails
   should set *ignored = -1, so that we can use NF_DROP,
   see below. *ignored = -1 should be also used for ip_vs_conn_new
   failure in ip_vs_sched_persist() and ip_vs_schedule().
   The new negative value should be handled in tcp,udp,sctp"

"To summarize:

- *ignored = 1:
      protocol tried to schedule (eg. on SYN), found svc but the
      svc/scheduler decides that this packet should be accepted with
      NF_ACCEPT because it must not be scheduled.

- *ignored = 0:
      scheduler can not find destination, so try bypass or
      return ICMP and then NF_DROP (ip_vs_leave).

- *ignored = -1:
      scheduler tried to schedule but fatal error occurred, eg.
      ip_vs_conn_new failure (ENOMEM) or ip_vs_sip_fill_param
      failure such as missing Call-ID, ENOMEM on skb_linearize
      or pe_data. In this case we should return NF_DROP without
      any attempts to send ICMP with ip_vs_leave."
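
The three cases summarized above can be sketched as a small mapping from the value of *ignored to the action taken. This is only an illustration: the enum and the function name are hypothetical and do not appear in the IPVS source.

```c
/* Hypothetical names: an illustration, not the IPVS source. */
enum sched_verdict {
    VERDICT_ACCEPT,   /* pass the packet on untouched (NF_ACCEPT) */
    VERDICT_LEAVE,    /* no destination: try bypass or ICMP, then drop */
    VERDICT_DROP      /* fatal error: drop without sending ICMP */
};

/* Map the tri-state value of *ignored onto the action described above. */
static enum sched_verdict verdict_for_ignored(int ignored)
{
    if (ignored > 0)
        return VERDICT_ACCEPT;
    if (ignored < 0)
        return VERDICT_DROP;
    return VERDICT_LEAVE;
}
```

Keeping the fatal case negative lets callers tell "nothing to schedule" (0) apart from "scheduling failed" (-1) with a single sign test.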

Nearly all of the ideas and input for this patch come from
Julian Anastasov.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: skb defrag in L7 helpers
Hans Schillstrom [Fri, 19 Nov 2010 13:25:09 +0000 (14:25 +0100)]
IPVS: skb defrag in L7 helpers

L7 helpers such as SIP need skb defragmentation,
since L7 data can be fragmented.

This patch requires the "IPVS: Split ports[2] into src_port and dst_port" patch.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Split ports[2] into src_port and dst_port
Hans Schillstrom [Fri, 19 Nov 2010 13:25:08 +0000 (14:25 +0100)]
IPVS: Split ports[2] into src_port and dst_port

Avoid passing a pointer that becomes invalid after a skb_linearize() call.
This patch prepares for the next patch, which adds the skb_linearize() call.

In the ip_vs_sched_persist() parameters, the ports pointer is replaced by
separate src_port and dst_port values.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Backup, Prepare for transferring firewall marks (fwmark) to the backup daemon.
Hans Schillstrom [Fri, 19 Nov 2010 13:25:07 +0000 (14:25 +0100)]
IPVS: Backup, Prepare for transferring firewall marks (fwmark) to the backup daemon.

One struct gets a new fwmark field:
 * ip_vs_conn

ip_vs_conn_new() and ip_vs_find_dest() get an extra
parameter, fwmark. This patch contains the resulting changes.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoMerge branch 'for-patrick' of git://git.kernel.org/pub/scm/linux/kernel/git/horms...
Patrick McHardy [Tue, 16 Nov 2010 09:21:27 +0000 (10:21 +0100)]
Merge branch 'for-patrick' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6

14 years agonetfilter: nf_conntrack: one less atomic op in nf_ct_expect_insert()
Eric Dumazet [Tue, 16 Nov 2010 09:19:18 +0000 (10:19 +0100)]
netfilter: nf_conntrack: one less atomic op in nf_ct_expect_insert()

Instead of doing atomic_inc(&exp->use) twice,
call atomic_add(2, &exp->use);
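
A user-space sketch of the same idea using C11 atomics. The struct and function names are hypothetical, and atomic_fetch_add() stands in for the kernel's atomic_inc()/atomic_add().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an object with a reference counter. */
struct expect {
    atomic_int use;
};

/* Before: two separate atomic read-modify-write operations. */
static void take_two_refs_twice(struct expect *exp)
{
    atomic_fetch_add(&exp->use, 1);
    atomic_fetch_add(&exp->use, 1);
}

/* After: one atomic operation with the same net effect on the counter. */
static void take_two_refs_once(struct expect *exp)
{
    atomic_fetch_add(&exp->use, 2);
}
```

Both variants leave the counter two higher; the second simply touches the shared cache line once instead of twice.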

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agoipvs: allow transmit of GRO aggregated skbs
Simon Horman [Tue, 9 Nov 2010 01:08:49 +0000 (10:08 +0900)]
ipvs: allow transmit of GRO aggregated skbs

Attempt at allowing LVS to transmit skbs of greater than MTU length that
have been aggregated by GRO and can thus be deaggregated by GSO.

Cc: Julian Anastasov <ja@ssi.bg>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoipvs: remove shadow rt variable
Eric Dumazet [Mon, 15 Nov 2010 18:46:33 +0000 (19:46 +0100)]
ipvs: remove shadow rt variable

Remove a sparse warning about rt variable.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoipvs: add static and read_mostly attributes
Eric Dumazet [Mon, 15 Nov 2010 17:38:52 +0000 (18:38 +0100)]
ipvs: add static and read_mostly attributes

ip_vs_conn_tab_bits & ip_vs_conn_tab_mask are static to
ipvs/ip_vs_conn.c

ip_vs_conn_tab_size, ip_vs_conn_tab_mask, ip_vs_conn_tab [the pointer],
ip_vs_conn_rnd are mostly read.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: buffer argument to ip_vs_process_message() should not be const
Simon Horman [Tue, 9 Nov 2010 00:33:28 +0000 (09:33 +0900)]
IPVS: buffer argument to ip_vs_process_message() should not be const

It is assigned to a non-const variable and its contents are modified.

Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Remove useless { } block from ip_vs_process_message()
Simon Horman [Tue, 9 Nov 2010 00:33:25 +0000 (09:33 +0900)]
IPVS: Remove useless { } block from ip_vs_process_message()

Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Make the cp argument to ip_vs_sync_conn() static
Simon Horman [Tue, 9 Nov 2010 00:33:15 +0000 (09:33 +0900)]
IPVS: Make the cp argument to ip_vs_sync_conn() static

Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Only match pe_data created by the same pe
Simon Horman [Mon, 8 Nov 2010 11:06:30 +0000 (20:06 +0900)]
IPVS: Only match pe_data created by the same pe

Only match persistence engine data if it was
created by the same persistence engine.
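
A minimal sketch of such a match, assuming the persistence data is an opaque byte string tagged with a pointer to the engine that produced it. All types and names here are hypothetical; the real IPVS structures differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only. */
struct pe {
    const char *name;
};

struct pe_key {
    const struct pe *pe;   /* engine that produced the data */
    const char *data;      /* opaque persistence data */
    size_t len;
};

/* Two keys match only if the same engine created both and the bytes agree. */
static int pe_key_match(const struct pe_key *a, const struct pe_key *b)
{
    if (a->pe != b->pe)
        return 0;
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Checking the engine pointer first means identical bytes produced by two different engines are never treated as the same persistence key.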

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agoIPVS: Add persistence engine to connection entry
Simon Horman [Mon, 8 Nov 2010 11:05:57 +0000 (20:05 +0900)]
IPVS: Add persistence engine to connection entry

The dest of a connection may not exist if the connection was created as the
result of connection synchronisation. But for template connection entries
with persistence engine data created through connection synchronisation to
be valid, access to the persistence engine pointer is required. So add the
persistence engine to the connection itself.

Signed-off-by: Simon Horman <horms@verge.net.au>
14 years agonetfilter: rcu sparse cleanups
Eric Dumazet [Mon, 15 Nov 2010 18:45:13 +0000 (19:45 +0100)]
netfilter: rcu sparse cleanups

Use RCU helpers to reduce the number of sparse warnings
(CONFIG_SPARSE_RCU_POINTER=y), and add lockdep checks.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_nat_amanda: rename a variable
Eric Dumazet [Mon, 15 Nov 2010 17:45:12 +0000 (18:45 +0100)]
netfilter: nf_nat_amanda: rename a variable

Avoid a sparse warning about 'ret' variable shadowing

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: add __rcu annotations
Eric Dumazet [Mon, 15 Nov 2010 17:43:59 +0000 (18:43 +0100)]
netfilter: add __rcu annotations

Use helpers to reduce the number of sparse warnings
(CONFIG_SPARSE_RCU_POINTER=y)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_ct_frag6_sysctl_table is static
Eric Dumazet [Mon, 15 Nov 2010 17:18:29 +0000 (18:18 +0100)]
netfilter: nf_ct_frag6_sysctl_table is static

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: add __rcu annotations
Eric Dumazet [Mon, 15 Nov 2010 17:17:21 +0000 (18:17 +0100)]
netfilter: add __rcu annotations

Add some __rcu annotations and use helpers to reduce the number of sparse
warnings (CONFIG_SPARSE_RCU_POINTER=y)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: xt_CLASSIFY: add ARP support, allow CLASSIFY target on any table
Frédéric Leroy [Mon, 15 Nov 2010 12:57:56 +0000 (13:57 +0100)]
netfilter: xt_CLASSIFY: add ARP support, allow CLASSIFY target on any table

Signed-off-by: Frédéric Leroy <fredo@starox.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_nat: define nat_pptp_info as needed
Changli Gao [Mon, 15 Nov 2010 11:27:27 +0000 (12:27 +0100)]
netfilter: nf_nat: define nat_pptp_info as needed

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: ct_extend: define NF_CT_EXT_* as needed
Changli Gao [Mon, 15 Nov 2010 11:23:24 +0000 (12:23 +0100)]
netfilter: ct_extend: define NF_CT_EXT_* as needed

Fewer IDs make nf_ct_ext smaller.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_nat: don't use atomic bit operation
Changli Gao [Mon, 15 Nov 2010 10:59:03 +0000 (11:59 +0100)]
netfilter: nf_nat: don't use atomic bit operation

As we own the conntrack and others can't see it until we confirm it,
we don't need to use atomic bit operations on ct->status.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_conntrack: define ct_*_info as needed
Changli Gao [Mon, 15 Nov 2010 10:51:06 +0000 (11:51 +0100)]
netfilter: nf_conntrack: define ct_*_info as needed

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: ct_extend: fix the wrong alloc_size
Changli Gao [Mon, 15 Nov 2010 10:47:52 +0000 (11:47 +0100)]
netfilter: ct_extend: fix the wrong alloc_size

In update_alloc_size(), sizeof(struct nf_ct_ext) is wrongly added
twice.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: xt_LOG: do print MAC header on FORWARD
Jan Engelhardt [Mon, 15 Nov 2010 10:23:06 +0000 (11:23 +0100)]
netfilter: xt_LOG: do print MAC header on FORWARD

I am observing consistent behavior even with bridges, so let's unlock
this. xt_mac is already usable in FORWARD, too. Section 9 of
http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html#section9 says
the MAC source address is changed, but my observation does not match
that claim -- the MAC header is retained.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
[Patrick; code inspection seems to confirm this]
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: xt_NFQUEUE: remove modulo operations
Changli Gao [Fri, 12 Nov 2010 16:34:17 +0000 (17:34 +0100)]
netfilter: xt_NFQUEUE: remove modulo operations

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agonetfilter: nf_conntrack: don't always initialize ct->proto
Changli Gao [Fri, 12 Nov 2010 16:33:17 +0000 (17:33 +0100)]
netfilter: nf_conntrack: don't always initialize ct->proto

ct->proto is big (60 bytes) because of struct ip_ct_tcp, and the other
protocols don't need all of it initialized. This patch moves proto to the
end of struct nf_conn and pushes the initialization down to the individual
protocols.
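
The layout trick can be sketched in user space as follows. The struct and function names are hypothetical and the field list is invented for illustration; only the idea (zero everything before the trailing per-protocol area, and let each protocol clear that area if it needs to) is taken from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the large per-protocol area sits at the end. */
struct conn {
    unsigned long status;
    unsigned int mark;
    unsigned char proto[60];   /* only some protocols need this zeroed */
};

/* Generic init: clear everything before proto, leave proto alone. */
static void conn_init_common(struct conn *ct)
{
    memset(ct, 0, offsetof(struct conn, proto));
}

/* A protocol that uses the area clears it itself. */
static void conn_init_proto(struct conn *ct)
{
    memset(ct->proto, 0, sizeof(ct->proto));
}
```

Placing proto last is what makes the single offsetof()-bounded memset possible; with proto in the middle, the generic init would need two separate clears.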

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
14 years agoipv4: Make rt->fl.iif tests less obscure.
David S. Miller [Fri, 12 Nov 2010 01:07:48 +0000 (17:07 -0800)]
ipv4: Make rt->fl.iif tests less obscure.

When we test rt->fl.iif against zero, we're seeing if it's
an output or an input route.

Make that explicit with some helper functions.
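
A sketch of what such helpers might look like. The struct definitions are trimmed-down stand-ins and the helper names are hypothetical; only the "iif is non-zero for input routes" rule comes from the text above.

```c
/* Hypothetical, trimmed-down stand-ins for the routing structures. */
struct flow {
    int iif;               /* input interface index, 0 if none */
};

struct route {
    struct flow fl;
};

/* A non-zero input interface marks an input route. */
static int route_is_input(const struct route *rt)
{
    return rt->fl.iif != 0;
}

/* A zero input interface marks an output route. */
static int route_is_output(const struct route *rt)
{
    return rt->fl.iif == 0;
}
```

Call sites then read as `if (route_is_output(rt))` instead of `if (rt->fl.iif == 0)`, which states the intent rather than the encoding.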

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6
David S. Miller [Thu, 11 Nov 2010 18:43:30 +0000 (10:43 -0800)]
Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6

14 years agonet: get rid of rtable->idev
Eric Dumazet [Thu, 11 Nov 2010 07:14:07 +0000 (07:14 +0000)]
net: get rid of rtable->idev

It seems the idev field in struct rtable has no special purpose other than
adding extra atomic ops.

We hold refcounts on the device itself (using percpu data, so pretty
cheap in current kernel).

The InfiniBand case is solved by using dst.dev instead of idev->dev.

Removal of this field means routing without route cache is now using
shared data, percpu data, and only potential contention is a pair of
atomic ops on struct neighbour per forwarded packet.

About 5% speedup on routing test.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoneigh: reorder struct neighbour
Eric Dumazet [Thu, 11 Nov 2010 06:57:19 +0000 (06:57 +0000)]
neigh: reorder struct neighbour

It is important to move nud_state outside of the often modified cache
line (because of refcnt), to reduce false sharing in neigh_event_send()

This is a followup of commit 0ed8ddf4045f (neigh: Protect neigh->ha[]
with a seqlock)
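
The effect of such a reordering can be illustrated with explicit alignment in C11. This is a sketch under assumptions: the 64-byte cache line size, the struct name, and the field list are all hypothetical, and the real struct neighbour is laid out differently.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size */

/* Hypothetical layout, not the real struct neighbour. */
struct neigh_sketch {
    /* Written on every reference get/put. */
    alignas(CACHE_LINE) int refcnt;
    /* Mostly read on the fast path; starts on its own cache line. */
    alignas(CACHE_LINE) unsigned char nud_state;
    unsigned char ha[32];
};
```

With the two fields on different cache lines, a CPU that only reads nud_state no longer has its cached copy invalidated each time another CPU changes refcnt.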

This gives a 7% speedup on routing test with IP route cache disabled.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: update driver version
Jon Mason [Thu, 11 Nov 2010 04:26:04 +0000 (04:26 +0000)]
vxge: update driver version

Update vxge driver version

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: sparse and other clean-ups
Jon Mason [Thu, 11 Nov 2010 04:26:03 +0000 (04:26 +0000)]
vxge: sparse and other clean-ups

Correct issues found by running sparse on the vxge driver, as well as
other miscellaneous cleanups.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: update Kconfig
Jon Mason [Thu, 11 Nov 2010 04:26:02 +0000 (04:26 +0000)]
vxge: update Kconfig

Update Kconfig to reflect Exar's purchase of Neterion (formerly S2IO).

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: correct multi-function detection
Jon Mason [Thu, 11 Nov 2010 04:26:01 +0000 (04:26 +0000)]
vxge: correct multi-function detection

The values used to determine if the adapter is running in single or
multi-function mode were previously modified to the values necessary
when making the VXGE_HW_FW_API_GET_FUNC_MODE firmware call.  However,
the firmware call was not modified.  This had the driver printing out on
probe that the adapter was in multi-function mode when in single
function mode and vice versa.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: Titan1A detection
Jon Mason [Thu, 11 Nov 2010 04:26:00 +0000 (04:26 +0000)]
vxge: Titan1A detection

Detect if the adapter is Titan or Titan1A, and tune the driver for this
hardware.  Also, remove unnecessary function __vxge_hw_device_id_get.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: Handle errors in vxge_hw_vpath_fw_api
Jon Mason [Thu, 11 Nov 2010 04:25:59 +0000 (04:25 +0000)]
vxge: Handle errors in vxge_hw_vpath_fw_api

Propagate the return code of the call to vxge_hw_vpath_fw_api and
__vxge_hw_vpath_pci_func_mode_get.  This enables the proper handling of
error conditions when querying the function mode of the device during
probe.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: add receive hardware timestamping
Jon Mason [Thu, 11 Nov 2010 04:25:58 +0000 (04:25 +0000)]
vxge: add receive hardware timestamping

Add support for enabling/disabling hardware timestamping on receive
packets via ioctl call.  When enabled, the hardware timestamp replaces
the FCS in the payload.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: add support for ethtool firmware flashing
Jon Mason [Thu, 11 Nov 2010 04:25:57 +0000 (04:25 +0000)]
vxge: add support for ethtool firmware flashing

Add the ability in the vxge driver to flash firmware via ethtool.

Updated to include comments from Ben Hutchings.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: serialize access to steering control register
Jon Mason [Thu, 11 Nov 2010 04:25:56 +0000 (04:25 +0000)]
vxge: serialize access to steering control register

It is possible for multiple callers to access the firmware interface for
the same vpath simultaneously, resulting in uncertain output.  Add locks
to serialize access.  Also, make functions only accessed locally static,
thus requiring some movement of code blocks.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: cleanup debug printing and asserts
Jon Mason [Thu, 11 Nov 2010 04:25:55 +0000 (04:25 +0000)]
vxge: cleanup debug printing and asserts

Remove all of the unnecessary debug printk indirection and temporary
variables for vxge_debug_ll and vxge_assert.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: Wait for Rx to become idle before resetting or closing
Jon Mason [Thu, 11 Nov 2010 04:25:54 +0000 (04:25 +0000)]
vxge: Wait for Rx to become idle before resetting or closing

Wait for the receive traffic to become idle before attempting to close
or reset the adapter.  To enable the processing of packets while Receive
Idle, move the clearing of __VXGE_STATE_CARD_UP bit in vxge_close to
after it.  Also, modify the return value of the ISR when the adapter is
down to IRQ_HANDLED.  Otherwise there are unhandled interrupts for the
device.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agovxge: enable rxhash
Jon Mason [Thu, 11 Nov 2010 04:25:53 +0000 (04:25 +0000)]
vxge: enable rxhash

Enable RSS hashing and add the ability to pass the adapter-calculated rx
hash up the network stack (if the feature is available).  Add the ability to
enable/disable the feature via ethtool, which requires that the adapter is
not running at the time.  Also includes other miscellaneous cleanups and
fixes required to get RSS working.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodccp ccid-2: Implementation of circular Ack Vector buffer with overflow handling
Gerrit Renker [Wed, 10 Nov 2010 20:21:35 +0000 (21:21 +0100)]
dccp ccid-2: Implementation of circular Ack Vector buffer with overflow handling

This completes the implementation of a circular buffer for Ack Vectors, by
extending the current (linear array-based) implementation.  The changes are:

 (a) An `overflow' flag to deal with the case of overflow. As before, dynamic
     growth of the buffer will not be supported; but code will be added to deal
     robustly with overflowing Ack Vector buffers.

 (b) A `tail_seqno' field. When naively implementing the algorithm of Appendix A
     in RFC 4340, problems arise whenever subsequent Ack Vector records overlap,
     which can bring the entire run length calculation completely out of synch.
     (This is documented on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/\
                                             ack_vectors/tracking_tail_ackno/ .)
 (c) The buffer length is now computed dynamically (i.e. current fill level),
     as the span from head to tail.
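
Points (a) and (c) can be sketched as a fill-level computation on a fixed-size ring. The capacity, the field names, and the direction in which head and tail move are assumptions made for this example and are not taken from the DCCP code.

```c
#define RING_CAP 8u   /* hypothetical capacity */

/* Hypothetical ring state, for illustration only. */
struct ring {
    unsigned int head;   /* index where the span starts */
    unsigned int tail;   /* index where the span ends */
    int overflow;        /* set once the ring has filled completely */
};

/* Current fill level: the span from head to tail, or the full capacity
 * once the overflow flag is set (head == tail is otherwise ambiguous). */
static unsigned int ring_len(const struct ring *r)
{
    if (r->overflow)
        return RING_CAP;
    return (r->tail + RING_CAP - r->head) % RING_CAP;
}
```

The overflow flag is what distinguishes a full ring from an empty one, since both have head equal to tail.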

As a result, dccp_ackvec_pending() is now simpler - the #ifdef is no longer
necessary since buf_empty is always true when IP_DCCP_ACKVEC is not configured.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years agodccp ccid-2: Separate internals of Ack Vectors from option-parsing code
Gerrit Renker [Wed, 10 Nov 2010 20:21:02 +0000 (21:21 +0100)]
dccp ccid-2: Separate internals of Ack Vectors from option-parsing code

This patch
 * separates Ack Vector housekeeping code from option-insertion code;
 * shifts option-specific code from ackvec.c into options.c;
 * introduces a dedicated routine to take care of the Ack Vector records;
 * simplifies the dccp_ackvec_insert_avr() routine: the BUG_ON was redundant,
   since the list is automatically arranged in descending order of ack_seqno.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years agodccp ccid-2: Ack Vector interface clean-up
Gerrit Renker [Wed, 10 Nov 2010 20:20:07 +0000 (21:20 +0100)]
dccp ccid-2: Ack Vector interface clean-up

This patch brings the Ack Vector interface up to date. Its main purpose is
to lay the basis for the subsequent patches of this set, which will use the
new data structure fields and routines.

There are no real algorithmic changes, rather an adaptation:

 (1) Replaced the static Ack Vector size (2) with a #define so that it can
     be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems
     to be sufficient for the moment) and added a solution so that computing
     the ECN nonce will continue to work - even with larger Ack Vectors.

 (2) Replaced the #defines for Ack Vector states with a complete enum.

 (3) Replaced #defines to compute Ack Vector length and state with general
     purpose routines (inlines), and updated code to use these.

 (4) Added a `tail' field (conversion to circular buffer in subsequent patch).

 (5) Updated the (outdated) documentation for Ack Vector struct.

 (6) All sequence number containers now trimmed to 48 bits.

 (7) Removal of unused bits:
     * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already
       redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record);
     * removed Elapsed Time for Ack Vectors (it was nowhere used);
     * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since
       the code needs to be able to remember the old run length;
     * reduced the de-/allocation routines (redundant / duplicate tests).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years agoqlge: Version change to v1.00.00.27
Ron Mercer [Wed, 10 Nov 2010 09:29:46 +0000 (09:29 +0000)]
qlge: Version change to v1.00.00.27

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqlge: Add firmware info to ethtool get regs.
Ron Mercer [Wed, 10 Nov 2010 09:29:45 +0000 (09:29 +0000)]
qlge: Add firmware info to ethtool get regs.

By default we add firmware information to ethtool get regs.
Optionally firmware info can instead be sent to log.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/ipv4/tcp.c: Update WARN uses
Joe Perches [Sat, 30 Oct 2010 11:08:53 +0000 (11:08 +0000)]
net/ipv4/tcp.c: Update WARN uses

Coalesce long formats.
Align arguments.
Remove KERN_<level>.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet/core/dev.c: Update WARN uses
Joe Perches [Sat, 30 Oct 2010 11:08:52 +0000 (11:08 +0000)]
net/core/dev.c: Update WARN uses

Coalesce long formats.
Add missing newlines.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrivers/net/usb: Update WARN uses
Joe Perches [Sat, 30 Oct 2010 11:08:34 +0000 (11:08 +0000)]
drivers/net/usb: Update WARN uses

Add missing newlines.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrivers/net/can: Update WARN uses
Joe Perches [Sat, 30 Oct 2010 11:08:33 +0000 (11:08 +0000)]
drivers/net/can: Update WARN uses

Add missing newlines.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrivers/net: normalize TX_TIMEOUT
Eric Dumazet [Wed, 3 Nov 2010 22:49:35 +0000 (22:49 +0000)]
drivers/net: normalize TX_TIMEOUT

Some network drivers use old TX_TIMEOUT definitions, assuming HZ=100 of
old kernels.

Convert these definitions to include HZ, since HZ can be 1000 these
days.
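
The difference can be shown with two definitions and a tick-to-millisecond conversion. The macro names and the 5 second value are hypothetical, and SKETCH_HZ stands in for the kernel's HZ.

```c
#define SKETCH_HZ 1000               /* assumed tick rate for this sketch */

/* Old style: a bare tick count that only means 5 s when HZ is 100. */
#define TX_TIMEOUT_OLD 500

/* HZ-based: 5 seconds at any tick rate. */
#define TX_TIMEOUT_NEW (5 * SKETCH_HZ)

/* Convert a tick count to milliseconds at the configured tick rate. */
static long ticks_to_ms(long ticks)
{
    return ticks * 1000 / SKETCH_HZ;
}
```

At a tick rate of 1000, the bare constant shrinks to half a second, while the HZ-based definition still expresses five seconds.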

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaf_unix: optimize unix_dgram_poll()
Eric Dumazet [Sun, 31 Oct 2010 05:38:25 +0000 (05:38 +0000)]
af_unix: optimize unix_dgram_poll()

unix_dgram_poll() is pretty expensive to check POLLOUT status, because
it has to lock the socket to get its peer, take a reference on the peer
to check its receive queue status, and queue another poll_wait on
peer_wait. This all can be avoided if the process calling
unix_dgram_poll() is not interested in POLLOUT status. It makes
unix_dgram_recvmsg() faster by not queueing irrelevant pollers in
peer_wait.

On a test program provided by Alban Crequy:

Before:

real    0m0.211s
user    0m0.000s
sys     0m0.208s

After:

real    0m0.044s
user    0m0.000s
sys     0m0.040s

Suggested-by: Davide Libenzi <davidel@xmailserver.org>
Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaf_unix: fix unix_dgram_poll() behavior for EPOLLOUT event
Eric Dumazet [Sun, 31 Oct 2010 05:36:23 +0000 (05:36 +0000)]
af_unix: fix unix_dgram_poll() behavior for EPOLLOUT event

Alban Crequy reported a problem with connected dgram af_unix sockets and
provided a test program. epoll() would fail to deliver an EPOLLOUT event
when a thread dequeues a packet from the other peer, making its receive
queue no longer full.

This is because unix_dgram_poll() fails to call sock_poll_wait(file,
&unix_sk(other)->peer_wait, wait);
if the socket is not writeable at the time epoll_ctl(ADD) is called.

We must call sock_poll_wait(), regardless of 'writable' status, so that
epoll can be notified later of state changes.

Misc: avoids testing twice (sk->sk_shutdown & RCV_SHUTDOWN)

Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaf_unix: use keyed wakeups
Eric Dumazet [Fri, 29 Oct 2010 20:44:44 +0000 (20:44 +0000)]
af_unix: use keyed wakeups

Instead of waking up all sleepers, use wake_up_interruptible_sync_poll() to
wake up only those interested in writing to the socket.
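
The idea behind a keyed wakeup can be sketched in plain C: each sleeper records the events it cares about, and the waker passes a key so that only matching sleepers are woken. The types, event bits, and function name are hypothetical and are not the kernel's wait-queue API.

```c
#include <stddef.h>

#define EV_READ  0x1u   /* hypothetical event bits */
#define EV_WRITE 0x4u

struct waiter {
    unsigned int interest;   /* events this sleeper asked for */
    int woken;
};

/* Wake only the waiters whose interest overlaps the key; return how many. */
static size_t wake_keyed(struct waiter *w, size_t n, unsigned int key)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (w[i].interest & key) {
            w[i].woken = 1;
            count++;
        }
    }
    return count;
}
```

A sleeper waiting only for readability is left alone when the key says "writable", which avoids the needless wakeups described above.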

This patch is a specialization of commit 37e5540b3c9d (epoll keyed
wakeups: make sockets use keyed wakeups).

On a test program provided by Alban Crequy:

Before:
real    0m3.101s
user    0m0.000s
sys     0m6.104s

After:

real 0m0.211s
user 0m0.000s
sys 0m0.208s

Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodecnet: RCU conversion and get rid of dev_base_lock
Eric Dumazet [Fri, 29 Oct 2010 03:09:24 +0000 (03:09 +0000)]
decnet: RCU conversion and get rid of dev_base_lock

While tracking dev_base_lock users, I found decnet used it in
dnet_select_source(), but for a wrong purpose:

Writers only hold RTNL, not dev_base_lock, so readers must use RCU if
they cannot use RTNL.

Add an rcu_head to struct dn_ifaddr and handle proper RCU management.

Add an __rcu annotation in dn_route as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobonding: remove dev_base_lock use
Eric Dumazet [Fri, 29 Oct 2010 01:52:46 +0000 (01:52 +0000)]
bonding: remove dev_base_lock use

bond_info_seq_start() uses a read_lock(&dev_base_lock) to make sure
the device doesn't disappear. The same goal can be achieved using RCU.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaoe: remove dev_base_lock use from aoecmd_cfg_pkts()
Eric Dumazet [Fri, 29 Oct 2010 01:15:29 +0000 (01:15 +0000)]
aoe: remove dev_base_lock use from aoecmd_cfg_pkts()

dev_base_lock is the legacy way to lock the device list, and is planned
to disappear. (writers hold RTNL, readers hold RCU lock)

Convert aoecmd_cfg_pkts() to RCU locking.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoks8851: suspend resume support
Arce, Abraham [Thu, 28 Oct 2010 18:57:20 +0000 (18:57 +0000)]
ks8851: suspend resume support

Add suspend/resume support using default open/stop interface methods
to do hardware dependent operations.

On suspend, the same low power state (soft power mode) is kept and the
following blocks are disabled:

 - Internal PLL Clock
 - Tx/Rx PHY
 - MAC
 - SPI Interface

Signed-off-by: Abraham Arce <x0066660@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Mon, 8 Nov 2010 20:38:28 +0000 (12:38 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

14 years agords: Fix rds message leak in rds_message_map_pages
Pavel Emelyanov [Mon, 8 Nov 2010 06:20:50 +0000 (06:20 +0000)]
rds: Fix rds message leak in rds_message_map_pages

The sgs allocation error path leaks the allocated message.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: fix race condition during device startup
Frank Blaschka [Mon, 8 Nov 2010 03:03:49 +0000 (03:03 +0000)]
qeth: fix race condition during device startup

QDIO runs independently of the netdevice state. We are not
allowed to schedule NAPI if the netdevice is not open.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoqeth: remove dev_queue_xmit invocation
Ursula Braun [Mon, 8 Nov 2010 03:03:48 +0000 (03:03 +0000)]
qeth: remove dev_queue_xmit invocation

For a certain Hipersockets specific error code in the xmit path, the
qeth driver tries to invoke dev_queue_xmit again.
Commit 79640a4ca6955e3ebdb7038508fa7a0cd7fa5527 introduces a busylock
causing locking problems in case of re-invoked dev_queue_xmit by qeth.
This patch removes the attempts to retry packet sending with
dev_queue_xmit from the qeth driver.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopktgen: correct uninitialized queue_map
Junchang Wang [Sun, 7 Nov 2010 23:19:43 +0000 (23:19 +0000)]
pktgen: correct uninitialized queue_map

This fixes a bug reported by backyes: the first time pktgen uses
queue_map, it has not yet been initialized by
set_cur_queue_map(pkt_dev).

Signed-off-by: Junchang Wang <junchangwang@gmail.com>
Signed-off-by: Backyes <backyes@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: Detect and ignore netif_stop_queue() calls before register_netdev()
Guillaume Chazarain [Sat, 6 Nov 2010 06:39:32 +0000 (06:39 +0000)]
net: Detect and ignore netif_stop_queue() calls before register_netdev()

After commit e6484930d7c73d324bccda7d43d131088da697b9 ("net: allocate tx
queues in register_netdevice"), these calls make net drivers oops at load
time, so let's spare people from git-bisect'ing known problems.
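
A minimal userspace sketch of the idea, assuming the tx queue pointer is
NULL until the device is registered. The types and the stop_queue() and
register_dev() helpers are simplified, hypothetical stand-ins, not the
kernel's struct net_device or its API.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct tx_queue { int stopped; };
struct net_dev  { struct tx_queue *tx; /* NULL until registration */ };

/* Returns 0 on success, -1 if the call was detected and ignored. */
static int stop_queue(struct net_dev *dev)
{
    if (dev->tx == NULL) {
        fprintf(stderr,
                "stop_queue() cannot be called before registration\n");
        return -1;
    }
    dev->tx->stopped = 1;
    return 0;
}

/* Registration is the point where the queue becomes available. */
static void register_dev(struct net_dev *dev, struct tx_queue *q)
{
    q->stopped = 0;
    dev->tx = q;
}
```

A too-early call is reported and skipped instead of dereferencing a
queue that does not exist yet.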

Signed-off-by: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoskge: Remove tx queue stopping in skge_devinit()
Guillaume Chazarain [Sat, 6 Nov 2010 06:39:31 +0000 (06:39 +0000)]
skge: Remove tx queue stopping in skge_devinit()

After commit e6484930d7c73d324bccda7d43d131088da697b9 ("net: allocate tx
queues in register_netdevice"), stopping the tx queue there causes an
Oops at skge_probe() time.

Signed-off-by: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoipv6: fix overlap check for fragments
Shan Wei [Fri, 5 Nov 2010 01:56:34 +0000 (01:56 +0000)]
ipv6: fix overlap check for fragments

The type of FRAG6_CB(prev)->offset is int, skb->len is *unsigned* int,
and offset is int.

Without this patch, the expression is converted to unsigned int, so the
check gives the wrong result when
(FRAG6_CB(prev)->offset + prev->len) is less than offset.
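
The effect of the conversion can be reproduced in plain C. The two
helpers below are illustrative only (their names and exact expressions
are assumptions, not the code in the patch); they use the same types as
described above: int offsets and an unsigned int length.

```c
#include <stdbool.h>

/*
 * int + unsigned int yields unsigned int, so the subtraction wraps
 * around instead of going negative and the result is "> 0" whenever
 * the two sides differ.
 */
static bool overlaps_unsigned(int prev_offset, unsigned int prev_len,
                              int offset)
{
    return (prev_offset + prev_len) - offset > 0;
}

/* Convert the end position back to int so the comparison is signed. */
static bool overlaps_signed(int prev_offset, unsigned int prev_len,
                            int offset)
{
    return (int)(prev_offset + prev_len) > offset;
}
```

For a previous fragment covering bytes 0..100 and a new fragment at
offset 200, the unsigned form reports an overlap while the signed form
does not.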

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoclassifier: report statistics for basic classifier
stephen hemminger [Thu, 4 Nov 2010 11:47:04 +0000 (11:47 +0000)]
classifier: report statistics for basic classifier

The basic classifier keeps statistics but does not report them to user space.
This showed up when using basic classifier (with police) as a default catch
all on ingress; no statistics were reported.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosolos: Refuse to upgrade firmware with older FPGA. It doesn't work.
David Woodhouse [Mon, 1 Nov 2010 10:35:28 +0000 (10:35 +0000)]
solos: Refuse to upgrade firmware with older FPGA. It doesn't work.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agosolos: Add 'Firmware' attribute for Traverse overall firmware version
David Woodhouse [Mon, 1 Nov 2010 10:34:29 +0000 (10:34 +0000)]
solos: Add 'Firmware' attribute for Traverse overall firmware version

The existing 'FirmwareVersion' attribute only covers the DSP firmware as
provided by Conexant; not the overall version of the device firmware. We
do want to be able to see the full version number too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds [Mon, 8 Nov 2010 19:54:53 +0000 (11:54 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Add new ext4 inode tracepoints
  ext4: Don't call sb_issue_discard() in ext4_free_blocks()
  ext4: do not try to grab the s_umount semaphore in ext4_quota_off
  ext4: fix potential race when freeing ext4_io_page structures
  ext4: handle writeback of inodes which are being freed
  ext4: initialize the percpu counters before replaying the journal
  ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
  ext4: fix lazyinit hang after removing request

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Mon, 8 Nov 2010 18:55:29 +0000 (10:55 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
  TTY: create drivers/tty/vt and move the vt code there
  TTY: create drivers/tty and move the tty core files there

14 years agoMerge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Mon, 8 Nov 2010 18:54:49 +0000 (10:54 -0800)]
Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6

* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6:
  Staging: ath6kl: remove empty files that mess with 'distclean'
  staging: ath6kl: Fixing the driver to use modified mmc_host structure
  Staging: solo6x10: fix build problem

14 years agoMerge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 8 Nov 2010 18:54:23 +0000 (10:54 -0800)]
Merge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  mmc: sh_mmcif: Convert extern inline to static inline.
  ARM: mach-shmobile: Allow GPIO chips to register IRQ mappings.
  ARM: mach-shmobile: fix sh7372 after a recent clock framework rework
  ARM: mach-shmobile: include drivers/sh/Kconfig
  ARM: mach-shmobile: ap4evb: Add HDMI sound support
  ARM: mach-shmobile: clock-sh7372: Add FSIDIV clock support
  ARM: shmobile: remove sh_timer_config clk member

14 years agoMerge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 8 Nov 2010 18:53:21 +0000 (10:53 -0800)]
Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: clkfwk: Fix up checkpatch warnings.
  sh: make some needlessly global sh7724 clocks static
  sh: add clk_round_parent() to optimize parent clock rate
  sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
  sh: nommu: Support building without an uncached mapping.
  sh: nommu: use 32-bit phys mode.
  sh: mach-se: Fix up SE7206 no ioport build.
  sh: intc: Update for single IRQ reservation helper.
  sh: clkfwk: Fix up rate rounding error handling.
  sh: mach-se: Rip out superfluous 7751 PIO routines.
  sh: mach-se: Rip out superfluous 770x PIO routines.
  sh: mach-edosk7705: Kill off machtype, consolidate board def.
  sh: mach-edosk7705: update for this century, kill off PIO trapping.
  sh: mach-se: Rip out superfluous 7206 PIO routines.
  sh: mach-systemh: Kill off dead board.
  sh: mach-snapgear: Kill off machtype, consolidate board def.
  sh: mach-snapgear: Rip out superfluous PIO routines.
  sh: mach-microdev: SuperIO-relative ioport mapping.

14 years agoext4: Add new ext4 inode tracepoints
Theodore Ts'o [Mon, 8 Nov 2010 18:51:33 +0000 (13:51 -0500)]
ext4: Add new ext4 inode tracepoints

Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and
ext4_begin_ordered_truncate()

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
14 years agoext4: Don't call sb_issue_discard() in ext4_free_blocks()
Theodore Ts'o [Mon, 8 Nov 2010 18:49:33 +0000 (13:49 -0500)]
ext4: Don't call sb_issue_discard() in ext4_free_blocks()

Commit 5c521830cf (ext4: Support discard requests when running in
no-journal mode) attempts to add sb_issue_discard() for data blocks
(in data=writeback mode) and in no-journal mode.  Unfortunately, this
no longer works, because in commit dd3932eddf (block: remove
BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
interface, and there are times when we call ext4_free_blocks() when we
are holding a spinlock, or are otherwise in an atomic context.

For now, I've removed the call to sb_issue_discard() to prevent a
deadlock or (if spinlock debugging is enabled) failures like this:

BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
Call Trace:
[<ffffffff810397ce>] __schedule_bug+0x5e/0x70
[<ffffffff81403110>] schedule+0x950/0xa70
[<ffffffff81060bad>] ? insert_work+0x7d/0x90
[<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30
[<ffffffff81061127>] ? queue_work+0x37/0x60
[<ffffffff8140377d>] schedule_timeout+0x21d/0x360
[<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540
[<ffffffff81402680>] wait_for_common+0xc0/0x150
[<ffffffff81041490>] ? default_wake_function+0x0/0x10
[<ffffffff812034bc>] ? submit_bio+0x7c/0x100
[<ffffffff810680a0>] ? wake_bit_function+0x0/0x40
[<ffffffff814027b8>] wait_for_completion+0x18/0x20
[<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210
[<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60
[<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120
[<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70
[<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30
[<ffffffff81191618>] ext4_truncate+0x178/0x5d0
[<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280
[<ffffffff810d8976>] vmtruncate+0x56/0x70
[<ffffffff811925cb>] ext4_setattr+0x14b/0x460
[<ffffffff811319e4>] notify_change+0x194/0x380
[<ffffffff81117f80>] do_truncate+0x60/0x90
[<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20
[<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20
[<ffffffff81127539>] do_last+0x5d9/0x770
[<ffffffff811278bd>] do_filp_open+0x1ed/0x680
[<ffffffff8140644f>] ? page_fault+0x1f/0x30
[<ffffffff81132bfc>] ? alloc_fd+0xec/0x140
[<ffffffff81118db1>] do_sys_open+0x61/0x120
[<ffffffff81118e8b>] sys_open+0x1b/0x20
[<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b

https://bugzilla.kernel.org/show_bug.cgi?id=22302

Reported-by: Mathias Burén <mathias.buren@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: jiayingz@google.com
14 years agoext4: do not try to grab the s_umount semaphore in ext4_quota_off
Dmitry Monakhov [Mon, 8 Nov 2010 18:47:33 +0000 (13:47 -0500)]
ext4: do not try to grab the s_umount semaphore in ext4_quota_off

It's not needed to sync the filesystem, and it fixes a lock_dep complaint.

Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
14 years agoext4: fix potential race when freeing ext4_io_page structures
Theodore Ts'o [Mon, 8 Nov 2010 18:45:33 +0000 (13:45 -0500)]
ext4: fix potential race when freeing ext4_io_page structures

Use an atomic_t and make sure we don't free the structure while we
might still be submitting I/O for that page.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
14 years agoext4: handle writeback of inodes which are being freed
Theodore Ts'o [Mon, 8 Nov 2010 18:43:33 +0000 (13:43 -0500)]
ext4: handle writeback of inodes which are being freed

The following BUG can occur when an inode is being freed while it
still has dirty pages outstanding, and it gets deleted (in this case
because it was the target of a rename).  In ordered mode, we need to
make sure the data pages are written just in case we crash before the
rename (or unlink) is committed.  If the inode is being freed then
when we try to igrab the inode, we end up tripping the BUG_ON at
fs/ext4/page-io.c:146.

To solve this problem, we need to keep track of the number of io
callbacks which are pending, and avoid destroying the inode until they
have all been completed.  That way we don't have to bump the inode
count to keep the inode from being destroyed; an approach which
doesn't work because the count could have already been dropped down to
zero before the inode writeback has started (at which point we're not
allowed to bump the count back up to 1, since it's already started
getting freed).
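
The counting scheme can be sketched in userspace C11 as follows. This is
only an illustration under assumed names (struct inode_io and the helper
functions are hypothetical); the kernel code would use its own atomic
types and would sleep until the count reaches zero rather than poll.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-inode bookkeeping for outstanding I/O callbacks. */
struct inode_io {
    atomic_int pending;
};

static void inode_io_init(struct inode_io *io)
{
    atomic_init(&io->pending, 0);
}

/* Called when an I/O completion callback is queued for the inode. */
static void io_callback_queued(struct inode_io *io)
{
    atomic_fetch_add(&io->pending, 1);
}

/*
 * Called when a callback finishes. Returns true if it was the last
 * pending one, which is the point where a waiter would be woken.
 */
static bool io_callback_done(struct inode_io *io)
{
    return atomic_fetch_sub(&io->pending, 1) == 1;
}

/* Destruction may only proceed once nothing is pending. */
static bool inode_may_destroy(struct inode_io *io)
{
    return atomic_load(&io->pending) == 0;
}
```

Because the counter is separate from the inode reference count, it can
still be raised after the reference count has already dropped to zero.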

Thanks to Dave Chinner for suggesting this approach, which is also
used by XFS.

  kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
  Call Trace:
   [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307
   [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b
   [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2
   [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5
   [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac
   [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d
   [<ffffffff81087910>] do_writepages+0x1c/0x25
   [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d
   [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10
   [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
   [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c
   [<ffffffff810c14a3>] evict+0x22/0x92
   [<ffffffff810c1a3d>] iput+0x212/0x249
   [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9
   [<ffffffff810bdf6b>] d_kill+0x3d/0x5d
   [<ffffffff810be613>] dput+0x13a/0x147
   [<ffffffff810b990d>] sys_renameat+0x1b5/0x258
   [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c
   [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea
   [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38
   [<ffffffff810b99c6>] sys_rename+0x16/0x18
   [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b

Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
14 years agonet dst: need linux/cache.h for ____cacheline_aligned_in_smp.
Paul Mundt [Mon, 8 Nov 2010 03:58:05 +0000 (19:58 -0800)]
net dst: need linux/cache.h for ____cacheline_aligned_in_smp.

Presently the b43legacy build fails on an sh randconfig:

In file included from include/net/dst.h:12,
                 from drivers/net/wireless/b43legacy/xmit.c:32:
include/net/dst_ops.h:28: error: expected ':', ',', ';', '}' or '__attribute__' before '____cacheline_aligned_in_smp'
include/net/dst_ops.h: In function 'dst_entries_get_fast':
include/net/dst_ops.h:33: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_get_slow':
include/net/dst_ops.h:41: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_add':
include/net/dst_ops.h:49: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_init':
include/net/dst_ops.h:55: error: 'struct dst_ops' has no member named 'pcpuc_entries'
include/net/dst_ops.h: In function 'dst_entries_destroy':
include/net/dst_ops.h:60: error: 'struct dst_ops' has no member named 'pcpuc_entries'
make[5]: *** [drivers/net/wireless/b43legacy/xmit.o] Error 1
make[5]: *** Waiting for unfinished jobs....

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'rmobile/core' into rmobile-fixes-for-linus
Paul Mundt [Mon, 8 Nov 2010 00:51:41 +0000 (09:51 +0900)]
Merge branch 'rmobile/core' into rmobile-fixes-for-linus

14 years agoMerge branches 'sh/pio-death', 'sh/nommu', 'sh/clkfwk', 'sh/core' and 'sh/intc-extens...
Paul Mundt [Mon, 8 Nov 2010 00:42:43 +0000 (09:42 +0900)]
Merge branches 'sh/pio-death', 'sh/nommu', 'sh/clkfwk', 'sh/core' and 'sh/intc-extension' into sh-fixes-for-linus

14 years agosh: clkfwk: Fix up checkpatch warnings.
Paul Mundt [Mon, 8 Nov 2010 00:40:23 +0000 (09:40 +0900)]
sh: clkfwk: Fix up checkpatch warnings.

The clk_round_parent() change introduced various checkpatch warnings,
tidy them up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agosh: make some needlessly global sh7724 clocks static
Guennadi Liakhovetski [Thu, 4 Nov 2010 14:14:29 +0000 (14:14 +0000)]
sh: make some needlessly global sh7724 clocks static

These clocks are currently only used inside one .c file and are not
declared in any headers, therefore having them global is useless.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agosh: add clk_round_parent() to optimize parent clock rate
Guennadi Liakhovetski [Tue, 2 Nov 2010 11:27:24 +0000 (11:27 +0000)]
sh: add clk_round_parent() to optimize parent clock rate

Sometimes it is possible and reasonable to adjust the parent clock rate to
improve precision of the child clock, e.g., if the child clock has no siblings.
clk_round_parent() is a new addition to the SH clock-framework API, that
implements such an optimization for child clocks with divisors, taking all
integer values in a range.
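
To make the idea concrete, here is a small standalone sketch of such a
search. It is not the clk_round_parent() implementation: the function
name, the struct, and the assumption that candidate parent rates are
given as a plain array are all made up for this example.

```c
#include <stddef.h>

/* Result of the search; all names here are illustrative. */
struct clk_fit {
    unsigned long parent_rate;
    unsigned int div;
    unsigned long rate;
    unsigned long error;
};

/*
 * Try every candidate parent rate with every integer divisor in
 * [div_min, div_max] and keep the combination closest to target.
 * div_max is assumed to be well below UINT_MAX.
 */
static struct clk_fit fit_parent_and_div(unsigned long target,
                                         const unsigned long *parent_rates,
                                         size_t n_parents,
                                         unsigned int div_min,
                                         unsigned int div_max)
{
    struct clk_fit best = { 0, 0, 0, (unsigned long)-1 };
    size_t i;
    unsigned int div;

    if (div_min == 0)
        div_min = 1;    /* avoid dividing by zero */

    for (i = 0; i < n_parents; i++) {
        for (div = div_min; div <= div_max; div++) {
            unsigned long rate = parent_rates[i] / div;
            unsigned long err = rate > target ? rate - target
                                              : target - rate;

            if (err < best.error) {
                best.parent_rate = parent_rates[i];
                best.div = div;
                best.rate = rate;
                best.error = err;
            }
        }
    }
    return best;
}
```

For a 44100 Hz target with candidate parents of 24000000 Hz and
22579200 Hz and divisors 1..1024, only the second parent divides
exactly (by 512), so switching the parent removes the rounding error.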

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agoNET: pktgen - fix compile warning
Dmitry Torokhov [Sat, 6 Nov 2010 20:11:38 +0000 (20:11 +0000)]
NET: pktgen - fix compile warning

This should fix the following warning:

net/core/pktgen.c: In function ‘pktgen_if_write’:
net/core/pktgen.c:890: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Reviewed-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoStaging: ath6kl: remove empty files that mess with 'distclean'
Greg Kroah-Hartman [Sat, 6 Nov 2010 18:27:04 +0000 (11:27 -0700)]
Staging: ath6kl: remove empty files that mess with 'distclean'

These two .h files would get removed from the tree when doing
make distclean

It turns out they are not needed at all, so just delete them which fixes
people's git trees when doing development.

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofloppy: fix another use-after-free
Vivek Goyal [Sat, 6 Nov 2010 12:16:05 +0000 (08:16 -0400)]
floppy: fix another use-after-free

While scanning the floppy code due to c093ee4f07f4 ("floppy: fix
use-after-free in module load failure path"), I found one more instance
of trying to access disk->queue pointer after doing put_disk() on
gendisk.  For some reason, floppy module still loads/unloads fine.  The
object is probably still around with right pointer values.

 o There seems to be one more instance of trying to cleanup the request
   queue after we have called put_disk() on associated gendisk.

 o This fix is more out of code inspection.  Even without this fix for
   some reason I am able to load/unload floppy module without any
   issues.

 o Floppy module loads/unloads fine after the fix.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoTTY: move .gitignore from drivers/char/ to drivers/tty/vt/
Greg Kroah-Hartman [Sat, 6 Nov 2010 05:18:23 +0000 (22:18 -0700)]
TTY: move .gitignore from drivers/char/ to drivers/tty/vt/

The autogenerated files (consolemap_deftbl.c and defkeymap.c) need to
be ignored by git, so move the .gitignore file that was doing it to the
proper location now that the files have moved as well.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoipw2x00: remove the right /proc/net entry
Linus Torvalds [Sat, 6 Nov 2010 01:57:04 +0000 (18:57 -0700)]
ipw2x00: remove the right /proc/net entry

Commit 27ae60f8f7aa ("ipw2x00: replace "ieee80211" with "libipw" where
appropriate") changed DRV_NAME to be "libipw", but didn't properly fix
up the places where it was used to specify the name for the /proc/net/
directory.

For backwards compatibility reasons, that directory name remained
"ieee80211", but due to the DRV_NAME change, the error case printouts
and the cleanup functions now used "libipw" instead.  Which made it all
fail badly.

For example, on module unload as reported by Randy:

  WARNING: at fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
  name 'libipw'

because it's trying to unregister a /proc directory that obviously
doesn't even exist.

Clean it all up to use DRV_PROCNAME for the actual /proc directory name.

Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Pavel Roskin <proski@gnu.org>
Cc: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 6 Nov 2010 00:49:22 +0000 (17:49 -0700)]
Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm

* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: BookE: Load the lower half of MSR
  KVM: PPC: BookE: fix sleep with interrupts disabled
  KVM: PPC: e500: Call kvm_vcpu_uninit() before kvmppc_e500_tlb_uninit().
  PPC: KVM: Book E doesn't have __end_interrupts.
  KVM: x86: Issue smp_call_function_many with preemption disabled
  KVM: x86: fix information leak to userland
  KVM: PPC: fix information leak to userland
  KVM: MMU: fix rmap_remove on non present sptes
  KVM: Write protect memory after slot swap

14 years agofloppy: fix use-after-free in module load failure path
Linus Torvalds [Sat, 6 Nov 2010 00:45:59 +0000 (17:45 -0700)]
floppy: fix use-after-free in module load failure path

Commit 488211844e0c ("floppy: switch to one queue per drive instead of
sharing a queue") introduced a use-after-free.  We do "put_disk()" on
the disk device _before_ we then clean up the queue associated with that
disk.

Move the put_disk() down to avoid dereferencing a free'd data structure.
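
The ordering issue can be shown with a simplified userspace model. The
types and helpers below are hypothetical stand-ins (not struct gendisk
or the block layer API); the only point is that the queue is reached
through the disk, so it has to be cleaned up before the last reference
to the disk is dropped.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the request queue and the disk. */
struct queue { int cleaned; };
struct disk  { int refs; struct queue *queue; };

static struct disk *disk_alloc(struct queue *q)
{
    struct disk *d = malloc(sizeof(*d));

    if (d) {
        d->refs = 1;
        d->queue = q;
    }
    return d;
}

static void queue_cleanup(struct queue *q)
{
    q->cleaned = 1;
}

/* Drops one reference and frees the disk on the last one. */
static void disk_put(struct disk *d)
{
    if (--d->refs == 0)
        free(d);
}

static void disk_teardown(struct disk *d)
{
    queue_cleanup(d->queue);    /* d is still valid here */
    disk_put(d);                /* d must not be touched after this */
}
```

Swapping the two calls in disk_teardown() would read d->queue from
memory that has already been freed.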

Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agowatchdog: Fix section mismatch and potential undefined behavior.
David Daney [Fri, 5 Nov 2010 23:17:39 +0000 (16:17 -0700)]
watchdog: Fix section mismatch and potential undefined behavior.

Commit d9ca07a05ce1 ("watchdog: Avoid kernel crash when disabling
watchdog") introduces a section mismatch.

Now that we reference no_watchdog from non-__init code it can no longer
be __initdata.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 5 Nov 2010 22:25:48 +0000 (15:25 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
  inet_diag: Make sure we actually run the same bytecode we audited.
  netlink: Make nlmsg_find_attr take a const nlmsghdr*.
  fib: fib_result_assign() should not change fib refcounts
  netfilter: ip6_tables: fix information leak to userspace
  cls_cgroup: Fix crash on module unload
  memory corruption in X.25 facilities parsing
  net dst: fix percpu_counter list corruption and poison overwritten
  rds: Remove kfreed tcp conn from list
  rds: Lost locking in loop connection freeing
  de2104x: fix panic on load
  atl1 : fix panic on load
  netxen: remove unused firmware exports
  caif: Remove noisy printout when disconnecting caif socket
  caif: SPI-driver bugfix - incorrect padding.
  caif: Bugfix for socket priority, bindtodev and dbg channel.
  smsc911x: Set Ethernet EEPROM size to supported device's size
  ipv4: netfilter: ip_tables: fix information leak to userland
  ipv4: netfilter: arp_tables: fix information leak to userland
  cxgb4vf: remove call to stop TX queues at load time.
  cxgb4: remove call to stop TX queues at load time.
  ...

14 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1...
Linus Torvalds [Fri, 5 Nov 2010 21:17:22 +0000 (14:17 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: ohci: fix race when reading count in AR descriptor
  firewire: ohci: avoid reallocation of AR buffers
  firewire: ohci: fix race in AR split packet handling
  firewire: ohci: fix buffer overflow in AR split packet handling

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Fri, 5 Nov 2010 21:17:01 +0000 (14:17 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer
  cifs: dereferencing first then checking
  cifs: trivial comment fix: tlink_tree is now a rbtree
  [CIFS] Cleanup unused variable build warning
  cifs: convert tlink_tree to a rbtree
  cifs: store pointer to master tlink in superblock (try #2)
  cifs: trivial doc fix: note setlease implemented
  CIFS: Add cifs_set_oplock_level
  FS: cifs, remove unneeded NULL tests

14 years agoposix-cpu-timers: workaround to suppress the problems with mt exec
Oleg Nesterov [Fri, 5 Nov 2010 15:53:42 +0000 (16:53 +0100)]
posix-cpu-timers: workaround to suppress the problems with mt exec

posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.

But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.

This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.

It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.

Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not a fix, just a
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.

In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>