GitHub/MotorolaMobilityLLC/kernel-slsi.git
14 years agoMerge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked...
Roland Dreier [Sun, 16 May 2010 03:06:01 +0000 (20:06 -0700)]
Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked-atomics', 'misc', 'mthca' and 'nes' into for-next

14 years agoIB/core: Use kmemdup() instead of kmalloc()+memcpy()
Julia Lawall [Sat, 15 May 2010 21:22:38 +0000 (23:22 +0200)]
IB/core: Use kmemdup() instead of kmalloc()+memcpy()

Use kmemdup when some other buffer is immediately copied into the
allocated region.

A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
statement S;
@@

-  to = \(kmalloc\|kzalloc\)(size,flag);
+  to = kmemdup(from,size,flag);
   if (to==NULL || ...) S
-  memcpy(to, from, size);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/iser: Fix error flow in iser_create_ib_conn_res()
Dan Carpenter [Thu, 6 May 2010 13:22:21 +0000 (16:22 +0300)]
IB/iser: Fix error flow in iser_create_ib_conn_res()

We shouldn't free things here because we free them later.
The call tree looks like this:
iser_connect() ==> initiating the connection establishment
and later
iser_cma_handler() => iser_route_handler() => iser_create_ib_conn_res()
if we fail here, eventually iser_conn_release() is called, resulting
in a double free.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/iser: Enhance disconnection logic for multi-pathing
Or Gerlitz [Wed, 5 May 2010 14:31:44 +0000 (17:31 +0300)]
IB/iser: Enhance disconnection logic for multi-pathing

The iser connection teardown flow isn't over until the underlying
Connection Manager (e.g the IB CM) delivers a disconnected or timeout
event through the RDMA-CM.  When the remote (target) side isn't
reachable, e.g when some HW e.g port/hca/switch isn't functioning or
taken down administratively, the CM timeout flow is used and the event
may be generated only after relatively long time -- on the order of
tens of seconds.

The current iser code exposes this possibly long delay to higher
layers, specifically to the iscsid daemon and iscsi kernel stack. As a
result, the iscsi stack doesn't respond well: this low-level CM delay
is added to the fail-over time under HA schemes such as the one
provided by DM multipath through the multipathd(8) service.

This patch enhances the reference counting scheme on iser's IB
connections so that the disconnect flow initiated by iscsid from user
space (ep_disconnect) doesn't wait for the CM to deliver the
disconnect/timeout event.  (The connection teardown isn't done from
iser's view point until the event is delivered)

The iser ib (rdma) connection object is destroyed when its reference
count reaches zero.  When this happens on the RDMA-CM callback
context, extra care is taken so that the RDMA-CM does the actual
destroying of the associated ID, since doing it in the callback is
prohibited.

The reference count of iser ib connection normally reaches three,
where the <ref, deref> relations are

 1. conn <init, terminate>
 2. conn <bind, stop/destroy>
 3. cma id <create, disconnect/error/timeout callbacks>

With this patch, multipath fail-over time is about 30 seconds, while
without this patch, multipath fail-over time is about 130 seconds.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/iser: Remove buggy back-pointer setting
Or Gerlitz [Wed, 5 May 2010 14:30:34 +0000 (17:30 +0300)]
IB/iser: Remove buggy back-pointer setting

The iscsi connection object life cycle includes binding and unbinding
(conn_stop) to/from the iscsi transport connection object.  Since
iscsi connection objects are recycled, at the time the transport
connection (e.g iser's IB connection) is released, it is not valid to
touch the iscsi connection tied to the transport back-pointer since it
may already point to a different transport connection.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/iser: Add asynchronous event handler
Or Gerlitz [Wed, 5 May 2010 14:30:10 +0000 (17:30 +0300)]
IB/iser: Add asynchronous event handler

Add handler to handle events such as port up and down.  This is useful
when testing high-availability schemes such as multi-pathing.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoMAINTAINERS: Add cxgb4 and iw_cxgb4 entries
Roland Dreier [Wed, 5 May 2010 21:45:40 +0000 (14:45 -0700)]
MAINTAINERS: Add cxgb4 and iw_cxgb4 entries

Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cxgb3: Shrink .text with compile-time init of handlers arrays
Roland Dreier [Wed, 28 Apr 2010 21:57:40 +0000 (14:57 -0700)]
RDMA/cxgb3: Shrink .text with compile-time init of handlers arrays

Using compile-time designated initializers for the handler arrays
instead of open-coding the initialization in iwch_cm_init() is (IMHO)
cleaner, and leads to substantially smaller code: on my x86-64 build,
bloat-o-meter shows:

add/remove: 0/1 grow/shrink: 4/3 up/down: 4/-1682 (-1678)
function                                     old     new   delta
tx_ack                                       167     168      +1
state_set                                     55      56      +1
start_ep_timer                                99     100      +1
pass_establish                               177     178      +1
act_open_req_arp_failure                      39      38      -1
sched                                         84      82      -2
iwch_cm_init                                 442      91    -351
work_handlers                               1328       -   -1328

Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIPoIB: Allow disabling/enabling TSO on the fly through ethtool
Or Gerlitz [Thu, 4 Mar 2010 13:16:52 +0000 (13:16 +0000)]
IPoIB: Allow disabling/enabling TSO on the fly through ethtool

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/mlx4: Add support for masked atomic operations
Vladimir Sokolovsky [Wed, 14 Apr 2010 14:23:39 +0000 (17:23 +0300)]
IB/mlx4: Add support for masked atomic operations

Add support for masked atomic operations (masked compare and swap,
masked fetch and add).

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/core: Add support for masked atomic operations
Vladimir Sokolovsky [Wed, 14 Apr 2010 14:23:01 +0000 (17:23 +0300)]
IB/core: Add support for masked atomic operations

 - Add new IB_WR_MASKED_ATOMIC_CMP_AND_SWP and IB_WR_MASKED_ATOMIC_FETCH_AND_ADD
   send opcodes that can be used to post "masked atomic compare and
   swap" and "masked atomic fetch and add" work request respectively.
 - Add masked_atomic_cap capability.
 - Add mask fields to atomic struct of ib_send_wr
 - Add new opcodes to ib_wc_opcode

The new operations are described more precisely below:

* Masked Compare and Swap (MskCmpSwap)

The MskCmpSwap atomic operation is an extension to the CmpSwap
operation defined in the IB spec.  MskCmpSwap allows the user to
select a portion of the 64 bit target data for the “compare” check as
well as to restrict the swap to a (possibly different) portion.  The
pseudo code below describes the operation:

| atomic_response = *va
| if (!((compare_add ^ *va) & compare_add_mask)) then
|     *va = (*va & ~(swap_mask)) | (swap & swap_mask)
|
| return atomic_response

The additional operands are carried in the Extended Transport Header.
Atomic response generation and packet format for MskCmpSwap is as for
standard IB Atomic operations.

* Masked Fetch and Add (MFetchAdd)

The MFetchAdd Atomic operation extends the functionality of the
standard IB FetchAdd by allowing the user to split the target into
multiple fields of selectable length. The atomic add is done
independently on each one of this fields. A bit set in the
field_boundary parameter specifies the field boundaries. The pseudo
code below describes the operation:

| bit_adder(ci, b1, b2, *co)
| {
| value = ci + b1 + b2
| *co = !!(value & 2)
|
| return value & 1
| }
|
| #define MASK_IS_SET(mask, attr)      (!!((mask)&(attr)))
| bit_position = 1
| carry = 0
| atomic_response = 0
|
| for i = 0 to 63
| {
|         if ( i != 0 )
|                 bit_position =  bit_position << 1
|
|         bit_add_res = bit_adder(carry, MASK_IS_SET(*va, bit_position),
|                                 MASK_IS_SET(compare_add, bit_position), &new_carry)
|         if (bit_add_res)
|                 atomic_response |= bit_position
|
|         carry = ((new_carry) && (!MASK_IS_SET(compare_add_mask, bit_position)))
| }
|
| return atomic_response

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cma: Randomize local port allocation
Tetsuo Handa [Thu, 15 Apr 2010 02:29:04 +0000 (11:29 +0900)]
RDMA/cma: Randomize local port allocation

Randomize local port allocation in the way sctp_get_port_local() does.
Update rover at the end of loop since we're likely to pick a valid port
on the first try.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/nes: Make unnecessarily global functions static
Roland Dreier [Wed, 21 Apr 2010 22:58:28 +0000 (15:58 -0700)]
RDMA/nes: Make unnecessarily global functions static

This allows the compiler to do a bit better; on my x86-64 build:

add/remove: 0/2 grow/shrink: 1/0 up/down: 2288/-2365 (-77)
function                                     old     new   delta
nes_init_phy                                 273    2561   +2288
nes_init_1g_phy                              469       -    -469
nes_init_2025_phy                           1896       -   -1896

Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/nes: Make nesadapter->phy_lock usage consistent
Chien Tung [Tue, 9 Mar 2010 21:50:40 +0000 (21:50 +0000)]
RDMA/nes: Make nesadapter->phy_lock usage consistent

nes_{read,write}_1G_phy_reg() are using phy_lock while
nes_{read,write}_10G_phy_reg() leave that to the caller.

Remove phy_lock from 1G routines and leave the locking to the caller.
Add additional phy_lock calls around 1G read/write.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cxgb4: Add driver for Chelsio T4 RNIC
Steve Wise [Wed, 21 Apr 2010 22:30:06 +0000 (15:30 -0700)]
RDMA/cxgb4: Add driver for Chelsio T4 RNIC

Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB/mthca: Use the dma state API instead of pci equivalents
FUJITA Tomonori [Fri, 2 Apr 2010 04:29:39 +0000 (04:29 +0000)]
IB/mthca: Use the dma state API instead of pci equivalents

The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/amso1100: Use the dma state API instead of pci equivalents
FUJITA Tomonori [Wed, 21 Apr 2010 22:23:10 +0000 (15:23 -0700)]
RDMA/amso1100: Use the dma state API instead of pci equivalents

The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cxgb3: Don't free skbs on NET_XMIT_* indications from LLD
Steve Wise [Mon, 5 Apr 2010 19:59:57 +0000 (19:59 +0000)]
RDMA/cxgb3: Don't free skbs on NET_XMIT_* indications from LLD

The low level cxgb3 driver can return NET_XMIT_CN and friends.
The iw_cxgb3 driver should _not_ treat these as errors.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cxgb3: Use the dma state API instead of pci equivalents
FUJITA Tomonori [Fri, 2 Apr 2010 04:29:38 +0000 (04:29 +0000)]
RDMA/cxgb3: Use the dma state API instead of pci equivalents

The DMA API is preferred; no functional change.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoIB: Explicitly rule out llseek to avoid BKL in default_llseek()
Roland Dreier [Sat, 10 Apr 2010 00:13:50 +0000 (17:13 -0700)]
IB: Explicitly rule out llseek to avoid BKL in default_llseek()

Several RDMA user-access drivers have file_operations structures with
no .llseek method set.  None of the drivers actually do anything with
f_pos, so this means llseek is essentially a NOP, instead of returning
an error as leaving other file_operations methods unimplemented would
do.  This is mostly harmless, except that a NULL .llseek means that
default_llseek() is used, and this function grabs the BKL, which we
would like to avoid.

Since llseek does nothing useful on these files, we would like it to
return an error to userspace instead of silently grabbing the BKL and
succeeding.  For nearly all of the file types, we take the
belt-and-suspenders approach of setting the .llseek method to
no_llseek and also calling nonseekable_open(); the exception is the
uverbs_event files, which are created with anon_inode_getfile(), which
already sets f_mode the same way as nonseekable_open() would.

This work is motivated by Arnd Bergmann's bkl-removal tree.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Fri, 9 Apr 2010 18:53:06 +0000 (11:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Check correct variable for allocation failure
  RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
  RDMA/cm: Set num_paths when manually assigning path records
  IB/cm: Fix device_create() return value check

14 years agoMerge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Fri, 9 Apr 2010 18:52:48 +0000 (11:52 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6

* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] Update default configuration.
  [S390] nss: add missing .previous statement to asm function
  [S390] increase default size of vmalloc area
  [S390] s390: disable change bit override
  [S390] fix io_return critical section cleanup
  [S390] sclp_async: potential buffer overflow
  [S390] arch/s390/kernel: Add missing unlock

14 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Fri, 9 Apr 2010 18:50:29 +0000 (11:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
  cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
  loop: Update mtime when writing using aops
  block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
  backing-dev: Handle class_create() failure
  Block: Fix block/elevator.c elevator_get() off-by-one error
  drbd: lc_element_by_index() never returns NULL
  cciss: unlock on error path
  cfq-iosched: Do not merge queues of BE and IDLE classes
  cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
  i2o: Remove the dangerous kobj_to_i2o_device macro
  block: remove 16 bytes of padding from struct request on 64bits
  cfq-iosched: fix a kbuild regression
  block: make CONFIG_BLK_CGROUP visible
  Remove GENHD_FL_DRIVERFS
  block: Export max number of segments and max segment size in sysfs
  block: Finalize conversion of block limits functions
  block: Fix overrun in lcm() and move it to lib
  vfs: improve writeback_inodes_wb()
  paride: fix off-by-one test
  drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
  ...

14 years agoMerge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Fri, 9 Apr 2010 18:50:01 +0000 (11:50 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (29 commits)
  drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
  drm/nv50: implement gpio set/get routines
  drm/nv50: parse/use some more de-magiced parts of gpio table entries
  drm/nouveau: store raw gpio table entry in bios gpio structs
  drm/nv40: Init some tiling-related PGRAPH state.
  drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
  drm/nv50: another dodgy DP hack
  drm/nv50: punt hotplug irq handling out to workqueue
  drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
  drm/nv50: Allow using the NVA3 new compute class.
  drm/nv50: cleanup properly if PDISPLAY init fails
  drm/nouveau: fixup the init failure paths some more
  drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
  drm/nv40: add LVDS table quirk for Dell Latitude D620
  drm/nv40: rework lvds table parsing
  drm/nouveau: detect vram amount once, and save the value
  drm/nouveau: remove some unused members from drm_nouveau_private
  drm/nouveau: Make use of TTM busy_placements.
  drm/nv50: add more 0x100c80 flushy magic
  drm/nv50: fix fbcon when framebuffer above 4GiB mark
  ...

14 years agoradix_tree_tag_get() is not as safe as the docs make out [ver #2]
David Howells [Tue, 6 Apr 2010 21:36:20 +0000 (22:36 +0100)]
radix_tree_tag_get() is not as safe as the docs make out [ver #2]

radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
or radix_tree_tag_clear().  The problem is that the double tag_get() in
radix_tree_tag_get():

if (!tag_get(node, tag, offset))
saw_unset_tag = 1;
if (height == 1) {
int ret = tag_get(node, tag, offset);

may see the value change due to the action of set/clear.  RCU is no protection
against this as no pointers are being changed, no nodes are being replaced
according to a COW protocol - set/clear alter the node directly.

The documentation in linux/radix-tree.h, however, says that
radix_tree_tag_get() is an exception to the rule that "any function modifying
the tree or tags (...) must exclude other modifications, and exclude any
functions reading the tree".

The problem is that the next statement in radix_tree_tag_get() checks that the
tag doesn't vary over time:

BUG_ON(ret && saw_unset_tag);

This has been seen happening in FS-Cache:

https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html

To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various
comments that the value of the tag may change whilst the RCU read lock is held,
and thus that the return value of radix_tree_tag_get() may not be relied upon
unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
running concurrently with it.

Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoslub: Fix kmem_ptr_validate() for non-kernel pointers
Pekka Enberg [Wed, 7 Apr 2010 16:23:41 +0000 (19:23 +0300)]
slub: Fix kmem_ptr_validate() for non-kernel pointers

As suggested by Linus, fix up kmem_ptr_validate() to handle non-kernel pointers
more graciously. The patch changes kmem_ptr_validate() to use the newly
introduced kern_ptr_validate() helper to check that a pointer is a valid kernel
pointer before we attempt to convert it into a 'struct page'.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoslab: Generify kernel pointer validation
Pekka Enberg [Wed, 7 Apr 2010 16:23:40 +0000 (19:23 +0300)]
slab: Generify kernel pointer validation

As suggested by Linus, introduce a kern_ptr_validate() helper that does some
sanity checks to make sure a pointer is a valid kernel pointer.  This is a
preparational step for fixing SLUB kmem_ptr_validate().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoRevert "memory-hotplug: add 0x prefix to HEX block_size_bytes"
Linus Torvalds [Fri, 9 Apr 2010 17:05:33 +0000 (10:05 -0700)]
Revert "memory-hotplug: add 0x prefix to HEX block_size_bytes"

This reverts commit ba168fc37dea145deeb8fa9e7e71c748d2e00d74.

It changes user-visible sysfs interfaces, and breaks some existing user
space applications which apparently rely on the fact that the output
does not contain the "0x" prefix.

Requested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linus
Roland Dreier [Fri, 9 Apr 2010 16:14:21 +0000 (09:14 -0700)]
Merge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linus

14 years ago[S390] Update default configuration.
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:04 +0000 (13:43 +0200)]
[S390] Update default configuration.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] nss: add missing .previous statement to asm function
Heiko Carstens [Fri, 9 Apr 2010 11:43:03 +0000 (13:43 +0200)]
[S390] nss: add missing .previous statement to asm function

The savesys_ipl_nss asm function is put into the .init.text section
however it is missing a ".previous" section which would restore the
previous section.
Luckily all functions in early.c are init functions so it doesn't
matter currently.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] increase default size of vmalloc area
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:02 +0000 (13:43 +0200)]
[S390] increase default size of vmalloc area

The default size of the vmalloc area is currently 1 GB. The memory resource
controller uses about 10 MB of vmalloc space per gigabyte of memory. That
turns a system with more than ~100 GB memory unbootable with the default
vmalloc size. It costs us nothing to increase the default size to some
more adequate value, e.g. 128 GB.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] s390: disable change bit override
Christian Borntraeger [Fri, 9 Apr 2010 11:43:01 +0000 (13:43 +0200)]
[S390] s390: disable change bit override

commit 6a985c6194017de2c062916ad1cd00dee0302c40
([S390] s390: use change recording override for kernel mapping)
deactivated the change bit recording for the kernel mapping to
improve the performance. This works most of the time, but there
are cases (e.g. kernel runs in home space, futex atomic compare xcmg)
where we modify user memory with the kernel mapping instead of the
user mapping.
Instead of fixing these cases, this patch just deactivates change bit
override to avoid future problems with other kernel code that might
use the kernel mapping for user memory.

CC: stable@kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] fix io_return critical section cleanup
Martin Schwidefsky [Fri, 9 Apr 2010 11:43:00 +0000 (13:43 +0200)]
[S390] fix io_return critical section cleanup

If a machine check interrupts the io interrupt handler on one of the
instructions between io_return and io_leave the critical section
cleanup code will move the return psw to io_work_loop. By doing that
the switch from the asynchronous interrupt stack to the process stack
is skipped. If e.g. TIF_NEED_RESCHED is set things break because
the scheduler is called with the asynchronous interrupts stack.
Moving the psw back to io_return instead fixes the problem.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] sclp_async: potential buffer overflow
Dan Carpenter [Fri, 9 Apr 2010 11:42:59 +0000 (13:42 +0200)]
[S390] sclp_async: potential buffer overflow

"len" hasn't been properly range checked so we shouldn't use it as an
array offset.  This can only be written to by root but it would still be
annoying to accidentally write more than 3 characters and corrupt your
memory.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years ago[S390] arch/s390/kernel: Add missing unlock
Julia Lawall [Fri, 9 Apr 2010 11:42:58 +0000 (13:42 +0200)]
[S390] arch/s390/kernel: Add missing unlock

In the default case the lock is not unlocked.  The return is
converted to a goto, to share the unlock at the end of the function.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression E1;
identifier f;
@@

f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years agocfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
Divyesh Shah [Fri, 9 Apr 2010 07:29:57 +0000 (09:29 +0200)]
cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch

When CFQ dispatches requests forcefully due to a barrier or changing iosched,
it runs through all cfqq's dispatching requests and then expires each queue.
However, it does not activate a cfqq before flushing its IOs resulting in
using stale values for computing slice_used.
This patch fixes it by calling activate queue before flushing reuqests from
each queue.

This is useful mostly for barrier requests because when the iosched is changing
it really doesnt matter if we have incorrect accounting since we're going to
break down all structures anyway.

We also now expire the current timeslice before moving on with the dispatch
to accurately account slice used for that cfqq.

Signed-off-by: Divyesh Shah<dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
14 years agoMerge remote branch 'nouveau/for-airlied' of ../drm-nouveau-next into drm-linus
Dave Airlie [Fri, 9 Apr 2010 04:27:51 +0000 (14:27 +1000)]
Merge remote branch 'nouveau/for-airlied' of ../drm-nouveau-next into drm-linus

* 'nouveau/for-airlied' of ../drm-nouveau-next: (21 commits)
  drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
  drm/nv50: implement gpio set/get routines
  drm/nv50: parse/use some more de-magiced parts of gpio table entries
  drm/nouveau: store raw gpio table entry in bios gpio structs
  drm/nv40: Init some tiling-related PGRAPH state.
  drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
  drm/nv50: another dodgy DP hack
  drm/nv50: punt hotplug irq handling out to workqueue
  drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
  drm/nv50: Allow using the NVA3 new compute class.
  drm/nv50: cleanup properly if PDISPLAY init fails
  drm/nouveau: fixup the init failure paths some more
  drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
  drm/nv40: add LVDS table quirk for Dell Latitude D620
  drm/nv40: rework lvds table parsing
  drm/nouveau: detect vram amount once, and save the value
  drm/nouveau: remove some unused members from drm_nouveau_private
  drm/nouveau: Make use of TTM busy_placements.
  drm/nv50: add more 0x100c80 flushy magic
  drm/nv50: fix fbcon when framebuffer above 4GiB mark
  ...

14 years agodrm/nouveau: bail out of auxch transaction if we repeatedly recieve defers
Ben Skeggs [Mon, 15 Mar 2010 22:45:07 +0000 (08:45 +1000)]
drm/nouveau: bail out of auxch transaction if we repeatedly recieve defers

There's one known case where we never stop recieving DEFER, and loop here
forever.  Lets not do that..

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: implement gpio set/get routines
Ben Skeggs [Wed, 7 Apr 2010 02:57:35 +0000 (12:57 +1000)]
drm/nv50: implement gpio set/get routines

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: parse/use some more de-magiced parts of gpio table entries
Ben Skeggs [Wed, 7 Apr 2010 02:05:32 +0000 (12:05 +1000)]
drm/nv50: parse/use some more de-magiced parts of gpio table entries

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nouveau: store raw gpio table entry in bios gpio structs
Ben Skeggs [Wed, 7 Apr 2010 02:00:14 +0000 (12:00 +1000)]
drm/nouveau: store raw gpio table entry in bios gpio structs

And use our own version of the GPIO table for the INIT_GPIO opcode.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv40: Init some tiling-related PGRAPH state.
Francisco Jerez [Tue, 6 Apr 2010 19:11:58 +0000 (21:11 +0200)]
drm/nv40: Init some tiling-related PGRAPH state.

Fixes garbled 3D on an nv46 card.

Reported-by: Francesco Marella <francesco.marella@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: Add NVA3 support in ctxprog/ctxvals generator.
Marcin Kościelnicki [Fri, 2 Apr 2010 10:28:18 +0000 (10:28 +0000)]
drm/nv50: Add NVA3 support in ctxprog/ctxvals generator.

Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: another dodgy DP hack
Ben Skeggs [Tue, 30 Mar 2010 06:01:41 +0000 (16:01 +1000)]
drm/nv50: another dodgy DP hack

Allows *some* DP cards to keep working in some corner cases that most
people shouldn't hit.  I hit it all the time with development, so this
can stay for now.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: punt hotplug irq handling out to workqueue
Ben Skeggs [Tue, 30 Mar 2010 05:14:41 +0000 (15:14 +1000)]
drm/nv50: punt hotplug irq handling out to workqueue

On DP outputs we'll likely end up running vbios init tables here, which
may sleep.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders
Ben Skeggs [Mon, 29 Mar 2010 00:06:09 +0000 (10:06 +1000)]
drm/nv50: preserve an unknown SOR_MODECTRL value for DP encoders

This value interacts with some registers we don't currently know how to
program properly ourselves.  The default of 5 that we were using matches
what the VBIOS on early DP cards do, but later ones use 6, which would
cause nouveau to program an incorrect mode on these chips.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: Allow using the NVA3 new compute class.
Marcin Kościelnicki [Wed, 24 Mar 2010 13:43:16 +0000 (13:43 +0000)]
drm/nv50: Allow using the NVA3 new compute class.

Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: cleanup properly if PDISPLAY init fails
Ben Skeggs [Thu, 25 Mar 2010 06:01:04 +0000 (16:01 +1000)]
drm/nv50: cleanup properly if PDISPLAY init fails

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nouveau: fixup the init failure paths some more
Ben Skeggs [Thu, 25 Mar 2010 06:00:09 +0000 (16:00 +1000)]
drm/nouveau: fixup the init failure paths some more

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark
Ben Skeggs [Fri, 19 Mar 2010 02:49:59 +0000 (12:49 +1000)]
drm/nv50: fix instmem init on IGPs if stolen mem crosses 4GiB mark

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv40: add LVDS table quirk for Dell Latitude D620
Ben Skeggs [Thu, 18 Mar 2010 03:38:04 +0000 (13:38 +1000)]
drm/nv40: add LVDS table quirk for Dell Latitude D620

Should fix:
 https://bugzilla.redhat.com/show_bug.cgi?id=505132
 https://bugzilla.redhat.com/show_bug.cgi?id=543091
 https://bugzilla.redhat.com/show_bug.cgi?id=530425
 https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/
 +bug/539730

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv40: rework lvds table parsing
Ben Skeggs [Thu, 18 Mar 2010 02:05:43 +0000 (12:05 +1000)]
drm/nv40: rework lvds table parsing

All indications seem to be that the version 0x30 table should be handled
the same way as 0x40 (as used on G80), at least for the parts that we
currently try use.

This commit cleans up the parsing to make it clearer about what we're
actually trying to achieve, and unifies the 0x30/0x40 parsing.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nouveau: detect vram amount once, and save the value
Ben Skeggs [Wed, 17 Mar 2010 23:45:20 +0000 (09:45 +1000)]
drm/nouveau: detect vram amount once, and save the value

As opposed to repeatedly reading the amount back from the GPU every
time we need to know the VRAM size.

We should now fail to load gracefully on detecting no VRAM, rather than
something potentially messy happening.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nouveau: remove some unused members from drm_nouveau_private
Ben Skeggs [Wed, 17 Mar 2010 23:23:19 +0000 (09:23 +1000)]
drm/nouveau: remove some unused members from drm_nouveau_private

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nouveau: Make use of TTM busy_placements.
Francisco Jerez [Thu, 18 Mar 2010 12:07:47 +0000 (13:07 +0100)]
drm/nouveau: Make use of TTM busy_placements.

Previously we were filling it the same as "placements", but in some
cases there're valid alternatives that we were ignoring completely.
Keeping a back-up memory type helps on several low-mem situations.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: add more 0x100c80 flushy magic
Ben Skeggs [Mon, 15 Mar 2010 06:43:47 +0000 (16:43 +1000)]
drm/nv50: add more 0x100c80 flushy magic

Fixes the !vbo_fifo path in the 3D driver on certain chipsets.  Still not
really any good idea of what exactly the magic achieves, but it makes
things work.

While we're at it, in the PCIEGART path, flush on unbinding also.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: fix fbcon when framebuffer above 4GiB mark
Ben Skeggs [Tue, 16 Mar 2010 03:20:58 +0000 (13:20 +1000)]
drm/nv50: fix fbcon when framebuffer above 4GiB mark

This can't actually happen right now, but lets fix it anyway.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agodrm/nv50: Fix NEWCTX_DONE flag number
Marcin Kościelnicki [Wed, 17 Mar 2010 00:58:47 +0000 (00:58 +0000)]
drm/nv50: Fix NEWCTX_DONE flag number

Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
14 years agoloop: Update mtime when writing using aops
Nikanth Karthikesan [Thu, 8 Apr 2010 19:39:31 +0000 (21:39 +0200)]
loop: Update mtime when writing using aops

Update mtime when writing to backing filesystem using the address space
operations write_begin and write_end.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Thu, 8 Apr 2010 18:58:14 +0000 (11:58 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  not overwriting file_lock structure after GET_LK
  cifs: Fix a kernel BUG with remote OS/2 server (try #3)
  [CIFS] initialize nbytes at the beginning of CIFSSMBWrite()
  [CIFS] Add mmap for direct, nobrl cifs mount types

14 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Thu, 8 Apr 2010 17:02:02 +0000 (10:02 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)

14 years agolibata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)
Mark Lord [Wed, 7 Apr 2010 17:52:08 +0000 (13:52 -0400)]
libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)

Most drives from Seagate, Hitachi, and possibly other brands,
do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1).
So instead use LBA48 for such accesses.

This bug could bite a lot of systems, especially when the user has
taken care to align partitions to 4KB boundaries. On misaligned systems,
it is less likely to be encountered, since a 4KB read would end at
0x10000000 rather than at 0x0fffffff.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 8 Apr 2010 15:37:05 +0000 (08:37 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix sched_getaffinity()

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6
Linus Torvalds [Thu, 8 Apr 2010 14:45:36 +0000 (07:45 -0700)]
Merge git://git./linux/kernel/git/davem/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
  ide: Fix IDE taskfile with cfq scheduler
  ide: Must hold queue lock when requeueing
  ide: Requeue request after DMA timeout

14 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Thu, 8 Apr 2010 14:44:53 +0000 (07:44 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI / PM: Move ACPI video resume to a PM notifier
  ACPI: Reduce ACPI resource conflict message to KERN_WARNING, printk cleanup
  ACPI: battery drivers should call power_supply_changed()
  ACPI: battery: Fix CONFIG_ACPI_SYSFS_POWER=n
  PNPACPI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
  ACPI: Don't send KEY_UNKNOWN for random video notifications
  ACPI: NUMA: map pxms to low node ids
  ACPI: use _HID when supplied by root-level devices
  ACPI / ACPICA: Do not check reference counters in acpi_ev_enable_gpe()
  ACPI: fixes a false alarm from lockdep
  ACPI dock: support multiple ACPI dock devices
  ACPI: EC: Allow multibyte access to EC

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Thu, 8 Apr 2010 01:49:20 +0000 (18:49 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  hvc_console: Fix race between hvc_close and hvc_remove
  virtio: disable multiport console support.
  virtio: console makes incorrect assumption about virtio API
  virtio: console: Fix early_put_chars usage
  MAINTAINERS: Put the virtio-console entry in correct alphabetical order

14 years agohvc_console: Fix race between hvc_close and hvc_remove
Anton Blanchard [Tue, 6 Apr 2010 11:42:38 +0000 (21:42 +1000)]
hvc_console: Fix race between hvc_close and hvc_remove

I don't claim to understand the tty layer, but it seems like hvc_open and
hvc_close should be balanced in their kref reference counting.

Right now we get a kref every call to hvc_open:

        if (hp->count++ > 0) {
                tty_kref_get(tty); <----- here
                spin_unlock_irqrestore(&hp->lock, flags);
                hvc_kick();
                return 0;
        } /* else count == 0 */

        tty->driver_data = hp;

        hp->tty = tty_kref_get(tty); <------ or here if hp->count was 0

But hvc_close has:

        tty_kref_get(tty);

        if (--hp->count == 0) {
...
                /* Put the ref obtained in hvc_open() */
                tty_kref_put(tty);
...
        }

        tty_kref_put(tty);

Since the outside kref get/put balance we only do a single kref_put when
count reaches 0.

The patch below changes things to call tty_kref_put once for every
hvc_close call, and with that my machine boots fine.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agovirtio: disable multiport console support.
Michael S. Tsirkin [Wed, 31 Mar 2010 18:56:42 +0000 (21:56 +0300)]
virtio: disable multiport console support.

Move MULTIPORT feature and related config changes
out of exported headers, and disable the feature
at runtime.

At this point, it seems less risky to keep code around
until we can enable it than rip it out completely.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agovirtio: console makes incorrect assumption about virtio API
Rusty Russell [Thu, 8 Apr 2010 15:46:16 +0000 (09:46 -0600)]
virtio: console makes incorrect assumption about virtio API

The get_buf() API sets the second arg to the number of bytes *written*
by the other side; in this case it should be zero as these are output buffers.

lguest gets this right (obviously kvm's console doesn't), resulting in
continual buildup of console writes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Amit Shah <amit.shah@redhat.com>
14 years agovirtio: console: Fix early_put_chars usage
François Diakhaté [Tue, 23 Mar 2010 12:53:15 +0000 (18:23 +0530)]
virtio: console: Fix early_put_chars usage

Currently early_put_chars is not used by virtio_console because it can
only be used once a port has been found, at which point it's too late
because it is no longer needed. This patch should fix it.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agoMAINTAINERS: Put the virtio-console entry in correct alphabetical order
Amit Shah [Tue, 23 Mar 2010 12:53:09 +0000 (18:23 +0530)]
MAINTAINERS: Put the virtio-console entry in correct alphabetical order

Move around the entry for virtio-console to keep the file sorted.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agorwsem generic spinlock: use IRQ save/restore spinlocks
Kevin Hilman [Wed, 7 Apr 2010 18:52:46 +0000 (11:52 -0700)]
rwsem generic spinlock: use IRQ save/restore spinlocks

rwsems can be used with IRQs disabled, particularily in early boot
before IRQs are enabled.  Currently the spin_unlock_irq() usage in the
slow-patch will unconditionally enable interrupts and cause problems
since interrupts are not yet initialized or enabled.

This patch uses save/restore versions of IRQ spinlocks in the slowpath
to ensure interrupts are not unintentionally disabled.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoHave nfs ->d_revalidate() report errors properly
Al Viro [Wed, 7 Apr 2010 23:06:07 +0000 (00:06 +0100)]
Have nfs ->d_revalidate() report errors properly

If nfs atomic open implementation ends up doing open request from
->d_revalidate() codepath and gets an error from server, return that error
to caller explicitly and don't bother with lookup_instantiate_filp() at all.
->d_revalidate() can return an error itself just fine...

See
http://bugzilla.kernel.org/show_bug.cgi?id=15674
http://marc.info/?l=linux-kernel&m=126988782722711&w=2

for original report.

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoIB/mlx4: Check correct variable for allocation failure
Dan Carpenter [Wed, 7 Apr 2010 09:39:01 +0000 (09:39 +0000)]
IB/mlx4: Check correct variable for allocation failure

The intent here is to check the "mfrpl->mapped_page_list" allocation.
We checked "mfrpl->ibfrpl.page_list" earlier.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
Chien Tung [Thu, 25 Mar 2010 13:39:50 +0000 (13:39 +0000)]
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()

cap.max_inline_data is incorrectly set in init_attr instead of attr.
Set it in attr so subsequent init_attr.cap assignment will get the
correct value.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoRDMA/cm: Set num_paths when manually assigning path records
Sean Hefty [Thu, 25 Mar 2010 19:12:36 +0000 (19:12 +0000)]
RDMA/cm: Set num_paths when manually assigning path records

When manually assigning the path records to use for a connection, save
the number of paths that were set.  Otherwise, checks against num_path
will show 0, even though path record data is available.

This was discovered by manually setting the path records from user
space, then querying the kernel to see if the correct path records
were assigned, only to discover that the kernel returned 0 path
records to the query.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
14 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Apr 2010 21:01:51 +0000 (14:01 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86: Enable Nehalem-EX support
  perf kmem: Fix breakage introduced by 5a0e3ad slab.h script

14 years agoMerge branch 'davinci-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Apr 2010 18:03:06 +0000 (11:03 -0700)]
Merge branch 'davinci-fixes-for-linus' of git://git./linux/kernel/git/khilman/linux-davinci

* 'davinci-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci:
  davinci: fix compile warning: <mach/da8xx.h>: #include <linux/platform_device.h>
  davinci: DM365: fix duplicate default IRQ priorities
  davinci: edma: clear events in edma_start()
  davinci: da8xx/omap-l1: fix build error when CONFIG_DAVINCI_MUX is undefined
  davinci: timers: don't enable timer until clocksource is initialized

14 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 7 Apr 2010 18:02:23 +0000 (11:02 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/x86/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
  x86: Increase CONFIG_NODES_SHIFT max to 10
  ibft, x86: Change reserve_ibft_region() to find_ibft_region()
  x86, hpet: Fix bug in RTC emulation
  x86, hpet: Erratum workaround for read after write of HPET comparator
  bootmem, x86: Fix 32bit numa system without RAM on node 0
  nobootmem, x86: Fix 32bit numa system without RAM on node 0
  x86: Handle overlapping mptables
  x86: Make e820_remove_range to handle all covered case
  x86-32, resume: do a global tlb flush in S4 resume

14 years agodavinci: fix compile warning: <mach/da8xx.h>: #include <linux/platform_device.h>
Sergei Shtylyov [Fri, 26 Mar 2010 14:56:58 +0000 (17:56 +0300)]
davinci: fix compile warning: <mach/da8xx.h>: #include <linux/platform_device.h>

This hushes the following warning:

arch/arm/mach-davinci/include/mach/da8xx.h:104: warning: ‘struct platform_device’
declared inside parameter list
arch/arm/mach-davinci/include/mach/da8xx.h:104: warning: its scope is only this
definition or declaration, which is probably not what you want

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
14 years agoMerge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
Linus Torvalds [Wed, 7 Apr 2010 15:48:39 +0000 (08:48 -0700)]
Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze

* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Remove unused variable from ptrace
  microblaze: io.h: Add io big-endian function
  microblaze: Enable memory leak detector
  microblaze: Fix futex code
  microblaze: Fix ftrace_update_ftrace_func panic

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Wed, 7 Apr 2010 15:42:25 +0000 (08:42 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: mixart: range checking proc file
  ALSA: hda - Fix a wrong array range check in patch_realtek.c
  ALSA: ASoC: move dma_data from snd_soc_dai to snd_soc_pcm_stream
  ALSA: hda - Enable amplifiers on Acer Inspire 6530G
  ASoC: Only do WM8994 bias off transition from standby
  ASoC: Don't use DCS_DATAPATH_BUSY for WM hubs devices
  ASoC: Don't do runtime wm_hubs DC servo updates if using offset correction
  ASoC: Support second DC servo readback method for wm_hubs
  ASoC: Avoid wraparound in wm_hubs DC servo correction
  ALSA: echoaudio - Eliminate use after free
  ALSA: i2c: cleanup: change parameter to pointer
  ALSA: hda - Add MSI blacklist for Aopen MZ915-M
  ASoC: OMAP: Fix capture pointer handling for OMAP1510 to work correctly with recent ALSA PCM code
  ALSA: hda - Update document about MSI and interrupts
  ALSA: hda: Fix 0 dB offset for Lenovo Thinkpad models using AD1981
  ALSA: hda - Add missing printk argument in previous patch
  ASoC: Fix passing platform_data to ac97 bus users and fix a leak
  ALSA: hda - Fix ADC/MUX assignment of ALC269 codec
  ALSA: hda - Fix invalid bit values passed to snd_hda_codec_amp_stereo()
  ASoC: wm8994: playback => capture

14 years agoMerge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc
Linus Torvalds [Wed, 7 Apr 2010 15:38:27 +0000 (08:38 -0700)]
Merge branch 'slabh' of git://git./linux/kernel/git/tj/misc

* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  nodemask: include slab.h from drivers/base/node.c

14 years agofs-cache: order the debugfs stats correctly
David Howells [Tue, 6 Apr 2010 21:35:09 +0000 (14:35 -0700)]
fs-cache: order the debugfs stats correctly

Order the debugfs statistics correctly.  The values displayed through a
seq_printf() statement should be in the same order as the names in the
format string.

In the 'Lookups' line, objects created ('crt=') and lookups timed out
('tmo=') have their values transposed.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agofrv: fix kernel/user segment handling in NOMMU mode
David Howells [Tue, 6 Apr 2010 21:35:09 +0000 (14:35 -0700)]
frv: fix kernel/user segment handling in NOMMU mode

In NOMMU mode, the FRV segment handling is broken because KERNEL_DS ==
USER_DS.  This causes tests of the following sort:

/* don't pin down non-user-based iovecs */
if (segment_eq(get_fs(), KERNEL_DS))
return NULL;

to malfunction.

To fix this, make USER_DS the top of RAM instead of the top of the non-IO
address space, and make KERNEL_DS one more than the top of the non-IO
address space.

Also get rid of FRV's __addr_ok() as nothing uses it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agofrv: hide uncached_access() when pgprot_noncached is not #defined
David Howells [Tue, 6 Apr 2010 21:35:08 +0000 (14:35 -0700)]
frv: hide uncached_access() when pgprot_noncached is not #defined

Hide uncached_access() when pgprot_noncached is not #defined.  This prevents
the following warning:

  CC      drivers/char/mem.o
drivers/char/mem.c:229: warning: 'uncached_access' defined but not used

Repairs d7d4d849b4e3acc405ec222884936800ffb26d48 ("drivers/char/mem.c:
cleanups").

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agortc-mxc: multiple fixes in rtc-mxc probe method
Vladimir Zapolskiy [Tue, 6 Apr 2010 21:35:07 +0000 (14:35 -0700)]
rtc-mxc: multiple fixes in rtc-mxc probe method

On exit paths in mxc_rtc_probe() method some resources are not freed
correctly.

This patch fixes:
* unrequested memory region containing imx RTC registers
* iounmap() isn't called on exit_free_pdata branch
* clock get rate is called for freed clock source
* clock isn't disabled on exit_put_clk branch

To simplify the fix managed device resources are used.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix race in file_mapped accounting
KAMEZAWA Hiroyuki [Tue, 6 Apr 2010 21:35:05 +0000 (14:35 -0700)]
memcg: fix race in file_mapped accounting

Presently, memcg's FILE_MAPPED accounting has following race with
move_account (happens at rmdir()).

    increment page->mapcount (rmap.c)
    mem_cgroup_update_file_mapped()           move_account()
      lock_page_cgroup()
      check page_mapped() if
      page_mapped(page)>1 {
FILE_MAPPED -1 from old memcg
FILE_MAPPED +1 to old memcg
      }
      .....
      overwrite pc->mem_cgroup
      unlock_page_cgroup()
    lock_page_cgroup()
    FILE_MAPPED + 1 to pc->mem_cgroup
    unlock_page_cgroup()

Then,
old memcg (-1 file mapped)
new memcg (+2 file mapped)

This happens because move_account see page_mapped() which is not guarded
by lock_page_cgroup().  This patch adds FILE_MAPPED flag to page_cgroup
and move account information based on it.  Now, all checks are synchronous
with lock_page_cgroup().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Balbir Singh <balbir@in.ibm.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Andrea Righi <arighi@develer.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agopagemap: fix pfn calculation for hugepage
Naoya Horiguchi [Tue, 6 Apr 2010 21:35:04 +0000 (14:35 -0700)]
pagemap: fix pfn calculation for hugepage

When we look into pagemap using page-types with option -p, the value of
pfn for hugepages looks wrong (see below.) This is because pte was
evaluated only once for one vma although it should be updated for each
hugepage.  This patch fixes it.

  $ page-types -p 3277 -Nl -b huge
  voffset   offset  len     flags
  7f21e8a00 11e400  1       ___U___________H_G________________
  7f21e8a01 11e401  1ff     ________________TG________________
               ^^^
  7f21e8c00 11e400  1       ___U___________H_G________________
  7f21e8c01 11e401  1ff     ________________TG________________
               ^^^

One hugepage contains 1 head page and 511 tail pages in x86_64 and each
two lines represent each hugepage.  Voffset and offset mean virtual
address and physical address in the page unit, respectively.  The
different hugepages should not have the same offset value.

With this patch applied:

  $ page-types -p 3386 -Nl -b huge
  voffset   offset   len    flags
  7fec7a600 112c00   1      ___UD__________H_G________________
  7fec7a601 112c01   1ff    ________________TG________________
               ^^^
  7fec7a800 113200   1      ___UD__________H_G________________
  7fec7a801 113201   1ff    ________________TG________________
               ^^^
               OK

More info:

- This patch modifies walk_page_range()'s hugepage walker.  But the
  change only affects pagemap_read(), which is the only caller of hugepage
  callback.

- Without this patch, hugetlb_entry() callback is called per vma, that
  doesn't match the natural expectation from its name.

- With this patch, hugetlb_entry() is called per hugepte entry and the
  callback can become much simpler.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoratelimit: fix the return value when __ratelimit() fails to acquire the lock
Yong Zhang [Tue, 6 Apr 2010 21:35:03 +0000 (14:35 -0700)]
ratelimit: fix the return value when __ratelimit() fails to acquire the lock

The log of commit edaac8e3167501cda336231d00611bf59c164346 ("ratelimit:
Fix/allow use in atomic contexts"), indicates that we want to suppress the
callback when the trylock fails.

Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agokernel.h: fix wrong usage of __ratelimit()
Yong Zhang [Tue, 6 Apr 2010 21:35:02 +0000 (14:35 -0700)]
kernel.h: fix wrong usage of __ratelimit()

When __ratelimit() returns 1 this means that we can go ahead.

Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoratelimit: annotate ___ratelimit()
Yong Zhang [Tue, 6 Apr 2010 21:35:01 +0000 (14:35 -0700)]
ratelimit: annotate ___ratelimit()

To prevent from wrongly using the return value.

[akpm@linux-foundation.org: fix spello]
Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years ago/dev/mem: allow rewinding
Eric Dumazet [Tue, 6 Apr 2010 21:35:01 +0000 (14:35 -0700)]
/dev/mem: allow rewinding

commit dcefafb6 ("/dev/mem: dont allow seek to last page") inadvertently
disabled rewinding on /dev/mem.

This broke x86info for example.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agovfs: rename block_fsync() to blkdev_fsync()
Andrew Morton [Tue, 6 Apr 2010 21:35:00 +0000 (14:35 -0700)]
vfs: rename block_fsync() to blkdev_fsync()

Requested by hch, for consistency now it is exported.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoraw: fsync method is now required
Anton Blanchard [Tue, 6 Apr 2010 21:34:58 +0000 (14:34 -0700)]
raw: fsync method is now required

Commit 148f948ba877f4d3cdef036b1ff6d9f68986706a (vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke
the raw driver.

We now call through generic_file_aio_write -> generic_write_sync ->
vfs_fsync_range.  vfs_fsync_range has:

        if (!fop || !fop->fsync) {
                ret = -EINVAL;
                goto out;
        }

But drivers/char/raw.c doesn't set an fsync method.

We have two options: fix it or remove the raw driver completely.  I'm
happy to do either, the fact this has been broken for so long suggests it
is rarely used.

The patch below adds an fsync method to the raw driver.  My knowledge of
the block layer is pretty sketchy so this could do with a once over.

If we instead decide to remove the raw driver, this patch might still be
useful as a backport to 2.6.33 and 2.6.32.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomb862xxfb: update Valentin's email address
Alexander Shishkin [Tue, 6 Apr 2010 21:34:57 +0000 (14:34 -0700)]
mb862xxfb: update Valentin's email address

Since Valentin's email address @siemens.com is no longer valid, it's time
to change it to the one that actually works so that I don't have to
manually forward patches against mb862xx to him every time.

Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
Acked-by: Valentin Sitdikov <v.sitdikov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomb862xxfb: fix acceleration module license
Randy Dunlap [Tue, 6 Apr 2010 21:34:57 +0000 (14:34 -0700)]
mb862xxfb: fix acceleration module license

mb862xxfb_accel built as a separate module, but it does not have a
MODULE_LICENSE, so it taints the kernel.  Add a MODULE_LICENSE to it (same
as mb862xxfb license).

mb862xxfb_accel: module license 'unspecified' taints kernel.

Or should mb862xxfb_accel be built into the mb862xxfb binary file instead?

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Alexander Shishkin <virtuoso@slind.org>
Cc: Valentin Sitdikov <v.sitdikov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: revert "vmscan: get_scan_ratio() cleanup"
KOSAKI Motohiro [Tue, 6 Apr 2010 21:34:56 +0000 (14:34 -0700)]
mm: revert "vmscan: get_scan_ratio() cleanup"

Shaohua Li reported his tmpfs streaming I/O test can lead to make oom.
The test uses a 6G tmpfs in a system with 3G memory.  In the tmpfs, there
are 6 copies of kernel source and the test does kbuild for each copy.  His
investigation shows the test has a lot of rotated anon pages and quite few
file pages, so get_scan_ratio calculates percent[0] (i.e.  scanning
percent for anon) to be zero.  Actually the percent[0] shoule be a big
value, but our calculation round it to zero.

Although before commit 84b18490 ("vmscan: get_scan_ratio() cleanup") , we
have the same problem too.  But the old logic can rescue percent[0]==0
case only when priority==0.  It had hided the real issue.  I didn't think
merely streaming io can makes percent[0]==0 && priority==0 situation.  but
I was wrong.

So, definitely we have to fix such tmpfs streaming io issue.  but anyway I
revert the regression commit at first.

This reverts commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: Shaohua Li <shaohua.li@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agodevmem: handle class_create() failure
Anton Blanchard [Tue, 6 Apr 2010 21:34:55 +0000 (14:34 -0700)]
devmem: handle class_create() failure

I hit this when we had a bug in IDR for a few days.  Basically sysfs would
fail to create new inodes since it uses an IDR and therefore class_create
would fail.

While we are unlikely to see this fail we may as well handle it instead of
oopsing.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>