GitHub/LineageOS/G12/android_kernel_amlogic_linux-4.9.git
14 years ago  splice: fix misuse of SPLICE_F_NONBLOCK
Miklos Szeredi [Tue, 3 Aug 2010 10:48:50 +0000 (12:48 +0200)]
splice: fix misuse of SPLICE_F_NONBLOCK

SPLICE_F_NONBLOCK is clearly documented to only affect blocking on the
pipe.  In __generic_file_splice_read(), however, it causes an EAGAIN
if the page is currently being read.

This makes it impossible to write an application that only wants
failure if the pipe is full, for example when the same process is
handling both ends of a pipe and isn't otherwise able to determine
whether a splice to the pipe will fill it or not.

We could make the read non-blocking on O_NONBLOCK or some other splice
flag, but for now this is the simplest fix.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
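A minimal userspace sketch of the semantics the splice commit above restores, assuming Linux and the glibc splice(2) wrapper: with SPLICE_F_NONBLOCK the caller expects EAGAIN only when the pipe is full, not because a page of the input is still being read. try_fill_pipe() is a hypothetical helper name, not something taken from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_fd into the write end of a pipe.
 * Returns the number of bytes moved (0 at end of input), -EAGAIN
 * when nothing could be moved without blocking, or another
 * negative errno value on failure.
 */
ssize_t try_fill_pipe(int in_fd, int pipe_wr_fd, size_t len)
{
	ssize_t n;

	assert(len > 0);
	n = splice(in_fd, NULL, pipe_wr_fd, NULL, len, SPLICE_F_NONBLOCK);
	return n < 0 ? -errno : n;
}
```

A process that drives both ends of the pipe can treat -EAGAIN as the signal to drain the read end before trying again.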
14 years ago  xen/blkfront: Use QUEUE_ORDERED_DRAIN for old backends
Jeremy Fitzhardinge [Wed, 28 Jul 2010 17:49:29 +0000 (10:49 -0700)]
xen/blkfront: Use QUEUE_ORDERED_DRAIN for old backends

If there's no feature-barrier key in xenstore, then it means it's a fairly
old backend which does uncached in-order writes, so ORDERED_DRAIN
is appropriate.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  xen/blkfront: use tagged queuing for barriers
Jeremy Fitzhardinge [Thu, 22 Jul 2010 21:17:00 +0000 (14:17 -0700)]
xen/blkfront: use tagged queuing for barriers

When barriers are supported, then use QUEUE_ORDERED_TAG to tell the block
subsystem that it doesn't need to do anything else with the barriers.
Previously we used ORDERED_DRAIN which caused the block subsystem to
drain all pending IO before submitting the barrier, which would be
very expensive.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
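Taken together, the two blkfront commits above amount to a small decision rule. The sketch below uses hypothetical names (the enum and pick_ordered_mode() are illustrations, not driver code). The third branch, where the key exists but reports no barrier support, is an assumption that neither message states.

```c
#include <assert.h>

enum ordered_mode {
	ORDERED_NONE,   /* assumption: backend says barriers are unsupported */
	ORDERED_DRAIN,  /* block layer drains pending I/O around the barrier */
	ORDERED_TAG     /* backend orders the barrier itself, no draining */
};

/*
 * has_key:  nonzero if the feature-barrier key was found in xenstore.
 * supported: value of the key when present (ignored otherwise).
 */
enum ordered_mode pick_ordered_mode(int has_key, int supported)
{
	if (!has_key)
		return ORDERED_DRAIN;   /* old backend: in-order, uncached writes */
	return supported ? ORDERED_TAG : ORDERED_NONE;
}
```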
14 years ago  scsi: use REQ_TYPE_FS for flush request
FUJITA Tomonori [Fri, 9 Jul 2010 00:38:26 +0000 (09:38 +0900)]
scsi: use REQ_TYPE_FS for flush request

scsi-ml uses REQ_TYPE_BLOCK_PC for flush requests from file
systems. The definition of REQ_TYPE_BLOCK_PC is that we don't retry
requests even when we can (e.g. UNIT ATTENTION) and we send the
response to the callers (then the callers can decide what they want).
We need a workaround such as commit
77a4229719e511a0d38d9c355317ae1469adeb54 to retry BLOCK_PC flush
requests. We would need a similar workaround for discard requests too,
since scsi-ml handles them as BLOCK_PC internally.

This uses REQ_TYPE_FS for flush requests from file systems instead of
REQ_TYPE_BLOCK_PC.

scsi-ml retries only REQ_TYPE_FS requests that have data to
transfer when we can retry them (e.g. UNIT_ATTENTION). However, we
also need to retry REQ_TYPE_FS requests without data because the
callers don't.

This also changes scsi_check_sense() to retry all the REQ_TYPE_FS
requests when appropriate. Thanks to scsi_noretry_cmd(),
REQ_TYPE_BLOCK_PC requests are still not retried, as before.

Note that this basically reverts commit
77a4229719e511a0d38d9c355317ae1469adeb54 since now we use REQ_TYPE_FS
for flush requests.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  block: set up rq->rq_disk properly for flush requests
FUJITA Tomonori [Fri, 9 Jul 2010 00:38:25 +0000 (09:38 +0900)]
block: set up rq->rq_disk properly for flush requests

q->bar_rq.rq_disk is NULL. Use the rq_disk of the original request
instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  block: set REQ_TYPE_FS on flush requests
FUJITA Tomonori [Fri, 9 Jul 2010 00:38:24 +0000 (09:38 +0900)]
block: set REQ_TYPE_FS on flush requests

The block layer doesn't set rq->cmd_type on flush requests. By
definition, it should be REQ_TYPE_FS (the lower layers build a command
and interpret the result of it, that is, the block layer doesn't know
the details).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  floppy: make controller const
Stephen Hemminger [Wed, 21 Jul 2010 02:09:00 +0000 (20:09 -0600)]
floppy: make controller const

The struct cont_t is just a set of virtual function pointers.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  drivers/block: use memdup_user
Julia Lawall [Wed, 21 Jul 2010 02:08:59 +0000 (20:08 -0600)]
drivers/block: use memdup_user

Use memdup_user when user data is immediately copied into the
allocated region.  Some checkpatch cleanups in nearby code.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Chirag Kantharia <chirag.kantharia@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
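To make the effect of the semantic patch above concrete, here is a userspace model of the converted call site. Everything in it is a stand-in: err_ptr(), is_err() and ptr_err() only imitate the kernel's ERR_PTR helpers, memdup_user_model() imitates memdup_user() with malloc() and memcpy(), and a NULL source plays the role of a faulting user address. It is a sketch of the pattern, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR(). */
void *err_ptr(long err) { return (void *)err; }
long ptr_err(const void *p) { return (long)p; }
int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

/* Allocate and copy in one step; failure is encoded in the pointer. */
void *memdup_user_model(const void *from, size_t size)
{
	void *to;

	if (from == NULL)
		return err_ptr(-EFAULT);   /* models a failed user copy */
	to = malloc(size);
	if (to == NULL)
		return err_ptr(-ENOMEM);
	memcpy(to, from, size);
	return to;
}

/* Call site after the conversion: one check replaces two. */
long copy_request(const void *user_buf, size_t size, void **out)
{
	void *buf;

	assert(out != NULL);
	buf = memdup_user_model(user_buf, size);
	if (is_err(buf))
		return ptr_err(buf);
	*out = buf;
	return 0;
}
```

The caller no longer needs a separate free-on-copy-failure path, which is the cleanup the semantic patch automates.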
14 years ago  scsi: convert discard to REQ_TYPE_FS from REQ_TYPE_BLOCK_PC
FUJITA Tomonori [Wed, 21 Jul 2010 01:29:37 +0000 (10:29 +0900)]
scsi: convert discard to REQ_TYPE_FS from REQ_TYPE_BLOCK_PC

The block layer (file systems) sends discard requests as REQ_TYPE_FS
(the role of REQ_TYPE_FS is setting up commands and interpreting
the results). But scsi-ml treats discard requests as
REQ_TYPE_BLOCK_PC.

scsi-ml can handle discard requests as REQ_TYPE_FS
easily. scsi_setup_discard_cmnd() sets up struct request and the bio
nicely. The only remaining issue is that discard requests can't be
completed partially, so we need to modify sd_done.

This conversion also fixes the problem that discard requests aren't
retried when possible (e.g. UNIT ATTENTION).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: cleanup interrupt_not_for_us
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:54 +0000 (13:46 -0500)]
cciss: cleanup interrupt_not_for_us

In the case of MSI/MSIX interrupts, we don't need to check
if the interrupt is for us, and in the case of the intx interrupt
handler, when checking if the interrupt is for us, we don't need
to check if we're using MSI/MSIX; we know we're not.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: change printks to dev_warn, etc.
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:48 +0000 (13:46 -0500)]
cciss: change printks to dev_warn, etc.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: separate cmd_alloc() and cmd_special_alloc()
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:43 +0000 (13:46 -0500)]
cciss: separate cmd_alloc() and cmd_special_alloc()

cmd_alloc() took a parameter which caused it to either allocate
from a pre-allocated pool, or allocate using pci_alloc_consistent.
This parameter is always known at compile time, so this would
be better handled by breaking the function into two functions
and differentiating the cases by function names.  Same goes
for cmd_free().

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
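The refactoring described in the cmd_alloc() commit above can be modelled in userspace as follows. This is a sketch under stated assumptions: the struct layouts and POOL_SIZE are invented for illustration, malloc-family calls stand in for pci_alloc_consistent, and cmd_special_free() is the assumed name of the second half of the split cmd_free().

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_SIZE 4                     /* hypothetical pool depth */

struct cmd { int in_use; };
struct hba { struct cmd pool[POOL_SIZE]; };

/* Normal path: take a free slot from the pre-allocated pool. */
struct cmd *cmd_alloc(struct hba *h)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!h->pool[i].in_use) {
			h->pool[i].in_use = 1;
			return &h->pool[i];
		}
	}
	return NULL;                    /* pool exhausted */
}

/* Special path: allocate outside the pool. */
struct cmd *cmd_special_alloc(void)
{
	struct cmd *c = calloc(1, sizeof(*c));

	if (c != NULL)
		c->in_use = 1;
	return c;
}

/* Each allocator gets its own matching release function. */
void cmd_free(struct cmd *c) { c->in_use = 0; }
void cmd_special_free(struct cmd *c) { free(c); }
```

With the choice encoded in the function name, no call site can pass the wrong flag or pair an allocation with the wrong release.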
14 years ago  cciss: use consistent variable names
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:38 +0000 (13:46 -0500)]
cciss: use consistent variable names

Use "h" for the hba structure and "c" for the command structures,
and get rid of the trivial CCISS_LOCK macro.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: forbid hard reset of 640x boards
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:33 +0000 (13:46 -0500)]
cciss: forbid hard reset of 640x boards

The 6402/6404 are two PCI devices -- two Smart Array controllers
-- that fit into one slot.  It is possible to reset them independently,
however, they share a battery backed cache module.  One of the pair
controls the cache and the 2nd one accesses the cache through the first
one.  If you reset the one controlling the cache, the other one will
not be a happy camper.  So we just forbid resetting this conjoined
mess.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: sanitize max commands
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:28 +0000 (13:46 -0500)]
cciss: sanitize max commands

Some controllers might try to tell us they support 0 commands
in performant mode.  This is a lie told by buggy firmware.
We have to be wary of this lest we try to allocate a negative
number of command blocks, which will be treated as unsigned,
and get an out of memory condition.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
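The failure mode described in the "sanitize max commands" commit above is easy to reproduce in isolation. In the sketch below, RESERVED_CMDS and MIN_SANE_CMDS are assumed values chosen for illustration, not constants taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define RESERVED_CMDS 4    /* assumption: command blocks the driver holds back */
#define MIN_SANE_CMDS 16   /* assumption: floor applied when firmware reports nonsense */

/* Without a check, a negative count turns into an enormous unsigned size. */
size_t unchecked_block_count(int max_commands)
{
	return (size_t)(max_commands - RESERVED_CMDS);
}

/* Clamp the firmware-reported value before deriving the count. */
int sane_block_count(int max_commands)
{
	if (max_commands < MIN_SANE_CMDS)
		max_commands = MIN_SANE_CMDS;
	return max_commands - RESERVED_CMDS;
}
```

An allocation sized from unchecked_block_count(0) would be requested with a value close to SIZE_MAX, which is the out of memory condition the message refers to.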
14 years ago  cciss: fix hard reset code.
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:22 +0000 (13:46 -0500)]
cciss: fix hard reset code.

Smart Array controllers newer than the P600 do not honor the
PCI power state method of resetting the controllers.  Instead,
in these cases we can get them to reset via the "doorbell" register.

This escaped notice until we began using "performant" mode because
the fact that the controllers did not reset did not normally
impede subsequent operation, and so things generally appeared to
"work".  Once the performant mode code was added, if the controller
does not reset, it remains in performant mode.  The code immediately
after the reset presumes the controller is in "simple" mode
(previously, it had remained in simple mode the whole time).
If the controller remains in performant mode any code which presumes
it is in simple mode will not work.  So the reset needs to be fixed.

Unfortunately there are some controllers which cannot be reset by
either method (e.g. the P800).  We detect these cases by noticing that
the controller seems to remain in performant mode even after a
reset has been attempted.  In those cases we ignore the controller,
as any commands outstanding on it will result in stale completions.
To sum up, we try to do a better job of resetting the controller if
"reset_devices" is set, and if it doesn't work, we ignore that
controller.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_reset_devices()
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:17 +0000 (13:46 -0500)]
cciss: factor out cciss_reset_devices()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_find_cfg_addrs.
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:12 +0000 (13:46 -0500)]
cciss: factor out cciss_find_cfg_addrs.

Rationale for this is that I will also need to use this code
in fixing kdump host reset code prior to having the hba structure.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_enter_performant_mode
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:07 +0000 (13:46 -0500)]
cciss: factor out cciss_enter_performant_mode

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_wait_for_mode_change_ack()
Stephen M. Cameron [Mon, 19 Jul 2010 18:46:01 +0000 (13:46 -0500)]
cciss: factor out cciss_wait_for_mode_change_ack()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: make cciss_put_controller_into_performant_mode as __devinit
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:56 +0000 (13:45 -0500)]
cciss: make cciss_put_controller_into_performant_mode as __devinit

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: cleanup some debug ifdefs
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:51 +0000 (13:45 -0500)]
cciss: cleanup some debug ifdefs

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_p600_dma_prefetch_quirk()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:46 +0000 (13:45 -0500)]
cciss: factor out cciss_p600_dma_prefetch_quirk()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_enable_scsi_prefetch()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:41 +0000 (13:45 -0500)]
cciss: factor out cciss_enable_scsi_prefetch()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out CISS_signature_present()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:36 +0000 (13:45 -0500)]
cciss: factor out CISS_signature_present()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_find_board_params
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:31 +0000 (13:45 -0500)]
cciss: factor out cciss_find_board_params

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: fix leak of ioremapped memory
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:26 +0000 (13:45 -0500)]
cciss: fix leak of ioremapped memory

Fix leak of ioremapped memory in the cciss_pci_init error path.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_find_cfgtables
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:21 +0000 (13:45 -0500)]
cciss: factor out cciss_find_cfgtables

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_wait_for_board_ready()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:15 +0000 (13:45 -0500)]
cciss: factor out cciss_wait_for_board_ready()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_find_memory_BAR()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:10 +0000 (13:45 -0500)]
cciss: factor out cciss_find_memory_BAR()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: remove board_id parameter from cciss_interrupt_mode()
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:05 +0000 (13:45 -0500)]
cciss: remove board_id parameter from cciss_interrupt_mode()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_board_disabled
Stephen M. Cameron [Mon, 19 Jul 2010 18:45:00 +0000 (13:45 -0500)]
cciss: factor out cciss_board_disabled

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: factor out cciss_lookup_board_id
Stephen M. Cameron [Mon, 19 Jul 2010 18:44:55 +0000 (13:44 -0500)]
cciss: factor out cciss_lookup_board_id

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: save pdev pointer in per hba structure early to avoid passing it around so...
Stephen M. Cameron [Mon, 19 Jul 2010 18:44:50 +0000 (13:44 -0500)]
cciss: save pdev pointer in per hba structure early to avoid passing it around so much.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  cciss: Set the performant mode bit in the scsi half of the driver
Stephen M. Cameron [Mon, 19 Jul 2010 18:44:45 +0000 (13:44 -0500)]
cciss: Set the performant mode bit in the scsi half of the driver

In a couple of places, the performant mode bit wasn't being set in
the scsi half of the driver, causing commands to seem to hang.  Use
enqueue_cmd_and_start_io() where appropriate.  This fixes a bug that

echo engage scsi > /proc/driver/cciss/cciss0

would hang.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Klog the unclean release path
Daniel Stodden [Sat, 7 Aug 2010 16:51:21 +0000 (18:51 +0200)]
blkfront: Klog the unclean release path

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Remove obsolete info->users
Daniel Stodden [Fri, 30 Apr 2010 22:01:23 +0000 (22:01 +0000)]
blkfront: Remove obsolete info->users

This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Remove obsolete info->users
Daniel Stodden [Sat, 7 Aug 2010 16:47:26 +0000 (18:47 +0200)]
blkfront: Remove obsolete info->users

This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Lock blockfront_info during xbdev removal
Daniel Stodden [Fri, 30 Apr 2010 22:01:22 +0000 (22:01 +0000)]
blkfront: Lock blockfront_info during xbdev removal

Same approach as blkfront_closing:
 * Grab the bdev safely, holding the info mutex.
 * Zap xbdev safely, holding the info mutex.
 * Try bdev removal safely, holding bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  blkfront: Fix blkfront backend switch race (bdev release)
Daniel Stodden [Sat, 7 Aug 2010 16:45:12 +0000 (18:45 +0200)]
blkfront: Fix blkfront backend switch race (bdev release)

We cannot read backend state within bdev operations, because it risks
grabbing the state change before xenbus gets to do it.

Fixed by tracking deferral with a frontend switch to Closing. State
exposure isn't strictly necessary, but the backends won't mind.

For a 'clean' deferral this seems actually a more decent protocol than
raising errors.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Fix blkfront backend switch race (bdev open)
Daniel Stodden [Sat, 7 Aug 2010 16:36:53 +0000 (18:36 +0200)]
blkfront: Fix blkfront backend switch race (bdev open)

We need not mind if users grab a late handle on a closing disk. We
probably even should not. But we have to make sure it's not a dead
one already.

Let the bdev deal with a gendisk deleted under its feet. Takes the
info mutex to decide a race against backend closing.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Lock blkfront_info when closing
Daniel Stodden [Fri, 30 Apr 2010 22:01:19 +0000 (22:01 +0000)]
blkfront: Lock blkfront_info when closing

The bdev .open/.release fops race against backend switches to Closing,
handled by the XenBus thread.

The original code attempted to serialize block device holders and
xenbus only via bd_mutex. This is insufficient: the info->bd pointer
may already be stale (or null) while xenbus tries to bump up the
refcount.

Protect blkfront_info with a dedicated mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  blkfront: Clean up vbd release
Daniel Stodden [Sat, 7 Aug 2010 16:33:17 +0000 (18:33 +0200)]
blkfront: Clean up vbd release

 * Current blkfront_closing is rather a xlvbd_release_gendisk.
   Renamed in preparation of later patches (need the name again).

 * Removed the misleading comment -- this only applied to the backend
   switch handler, and the queue is already flushed btw.

 * Break out the xenbus call, callers know better when to switch
   frontend state.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: Fix gendisk leak
Daniel Stodden [Fri, 30 Apr 2010 22:01:17 +0000 (22:01 +0000)]
blkfront: Fix gendisk leak

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  blkfront: Fix backtrace in del_gendisk
Daniel Stodden [Fri, 30 Apr 2010 22:01:16 +0000 (22:01 +0000)]
blkfront: Fix backtrace in del_gendisk

The call to del_gendisk follows a non-refcounted gd->queue
pointer. We release the last ref in blk_cleanup_queue. Fixed by
reordering releases accordingly.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  xenbus: Make xenbus_switch_state transactional
Daniel Stodden [Fri, 30 Apr 2010 22:01:15 +0000 (22:01 +0000)]
xenbus: Make xenbus_switch_state transactional

According to the comments, this was how it was done years ago, but
apparently it took an xbt pointer from elsewhere back then. The code was
removed because of consistency issues: cancellation won't roll back
the saved xbdev->state.

Still, unsolicited writes to the state field remain an issue,
especially if device shutdown takes thread synchronization, and subtle
races cause accidental recreation of the device node.

Fixed by reintroducing the transaction. An internal one is sufficient,
so the xbdev->state value remains consistent.

Also fixes the original hack to prevent infinite recursion. Instead of
bailing out on the first attempt to switch to Closing, checks call
depth now.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  xen/blkfront: revalidate after setting capacity
K. Y. Srinivasan [Thu, 18 Mar 2010 22:00:54 +0000 (15:00 -0700)]
xen/blkfront: revalidate after setting capacity

Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  xen/blkfront: avoid compiler warning from missing cases
Jeremy Fitzhardinge [Thu, 11 Mar 2010 23:10:40 +0000 (15:10 -0800)]
xen/blkfront: avoid compiler warning from missing cases

Fix:
drivers/block/xen-blkfront.c: In function ‘blkfront_connect’:
drivers/block/xen-blkfront.c:933: warning: enumeration value ‘BLKIF_STATE_DISCONNECTED’ not handled in switch

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  xen/front: Propagate changed size of VBDs
K. Y. Srinivasan [Thu, 11 Mar 2010 21:42:26 +0000 (13:42 -0800)]
xen/front: Propagate changed size of VBDs

Support dynamic resizing of virtual block devices. This patch supports
both file backed block devices as well as physical devices that can be
dynamically resized on the host side.

Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  blkfront: don't access freed struct xenbus_device
Jan Beulich [Sat, 7 Aug 2010 16:31:12 +0000 (18:31 +0200)]
blkfront: don't access freed struct xenbus_device

Unfortunately commit "blkfront: fixes for 'xm block-detach ... --force'"
still wasn't quite right - there was a reference to freed memory left
from blkfront_closing().

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  blkfront: fixes for 'xm block-detach ... --force'
Jan Beulich [Sat, 7 Aug 2010 16:28:55 +0000 (18:28 +0200)]
blkfront: fixes for 'xm block-detach ... --force'

Prevent prematurely freeing 'struct blkfront_info' instances (when the
xenbus data structures are gone, but the Linux ones are still needed).

Prevent adding a disk with the same (major, minor) [and hence the same
name and sysfs entries, which leads to oopses] when the previous
instance wasn't fully de-allocated yet.

This still doesn't address all issues resulting from forced detach:
I/O submitted after the detach still blocks forever, likely preventing
subsequent un-mounting from completing. It's not clear to me (not
knowing much about the block layer) how this can be avoided.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  xen: use less generic names in blkfront driver.
Ian Campbell [Fri, 4 Dec 2009 15:33:54 +0000 (15:33 +0000)]
xen: use less generic names in blkfront driver.

All Xen frontend drivers have a couple of identically named functions which
makes figuring out which device went wrong from a stacktrace harder than it
needs to be. Rename them to something specific to the device type.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
14 years ago  writeback.h: needs linux/device.h
Randy Dunlap [Mon, 19 Jul 2010 23:49:17 +0000 (16:49 -0700)]
writeback.h: needs linux/device.h

include/trace/events/writeback.h uses dev_name(), so it needs to
include linux/device.h.

include/trace/events/writeback.h:12: error: implicit declaration of function 'dev_name'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  block: fix problem with sending down discard that isn't of correct granularity
Jens Axboe [Thu, 15 Jul 2010 16:49:31 +0000 (10:49 -0600)]
block: fix problem with sending down discard that isn't of correct granularity

If the queue doesn't have a limit set, or it is just set to UINT_MAX like
we default to, we could be sending down a discard request that isn't
of the correct granularity if the block size is > 512b.

Fix this by adjusting max_discard_sectors down to the proper
alignment.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
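The adjustment described in the discard commit above can be sketched as a pure function. This is an illustration rather than the patch itself: aligned_max_discard_sectors() is a hypothetical name, the granularity is assumed to be a power of two expressed in bytes, and 0 is taken to mean that no granularity was reported.

```c
#include <assert.h>
#include <limits.h>

#define SECTOR_SHIFT 9   /* 512-byte sectors */

/*
 * Round the per-request discard limit down to a whole number of
 * discard granules so that no request ends on a partial granule.
 */
unsigned int aligned_max_discard_sectors(unsigned int max_sectors,
					 unsigned int granularity_bytes)
{
	unsigned int granule_sectors = granularity_bytes >> SECTOR_SHIFT;

	/* The byte length must fit in an unsigned int, so cap the sector count. */
	if (max_sectors > (UINT_MAX >> SECTOR_SHIFT))
		max_sectors = UINT_MAX >> SECTOR_SHIFT;

	/* Power-of-two granule: clear the low bits to round down. */
	if (granule_sectors > 1)
		max_sectors &= ~(granule_sectors - 1);

	return max_sectors;
}
```

With a 4096-byte granule a limit of 1003 sectors becomes 1000, while a 512-byte granule leaves it untouched.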
14 years ago  blkdev: check for valid request queue before issuing flush
Dave Chinner [Tue, 13 Jul 2010 07:50:50 +0000 (17:50 +1000)]
blkdev: check for valid request queue before issuing flush

Issuing a blkdev_issue_flush() on an unconfigured loop device causes a panic as
q->make_request_fn is not configured. This can occur when trying to mount the
unconfigured loop device as an XFS filesystem. There are no guards that catch
the bio before the request function is called because we don't add a payload to
the bio. Instead, manually check this case as soon as we have a pointer to the
queue to flush.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
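A userspace model of the guard described in the flush commit above. The struct and function names are stand-ins invented for this sketch, and the choice of -ENXIO as the error code is an assumption; the point is only that the function pointer is checked before it is called.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct bio_model;                        /* opaque stand-in for struct bio */

struct queue_model {                     /* stand-in for struct request_queue */
	void (*make_request_fn)(struct queue_model *q, struct bio_model *bio);
};

/* Trivial request function used to model a configured device. */
void noop_submit(struct queue_model *q, struct bio_model *bio)
{
	(void)q;
	(void)bio;
}

int issue_flush_model(struct queue_model *q)
{
	if (q == NULL)
		return -ENXIO;

	/*
	 * An unconfigured device has no request function. An empty
	 * flush carries no payload, so nothing else would catch it
	 * before the call through a NULL pointer.
	 */
	if (q->make_request_fn == NULL)
		return -ENXIO;

	q->make_request_fn(q, NULL);
	return 0;
}
```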
14 years ago  block: fix for block tracing build error
Stephen Rothwell [Fri, 9 Jul 2010 04:24:38 +0000 (14:24 +1000)]
block: fix for block tracing build error

block/compat_ioctl.c: In function 'compat_blkdev_ioctl':
block/compat_ioctl.c:754: error: 'BLKTRACESETUP32' undeclared (first use in this function)

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  scsi/i2o: restore ioctl changes
Arnd Bergmann [Thu, 8 Jul 2010 12:57:03 +0000 (14:57 +0200)]
scsi/i2o: restore ioctl changes

This restores the changes from "scsi/i2o_block: cleanup ioctl
handling", which accidentally got reverted.

Original changelog:
      This fixes the ioctl function of the i2o_block driver, which
      has multiple problems:

      * The BLKI2OSRSTRAT and BLKI2OSWSTRAT commands always return
        -ENOTTY on success, where they should return 0.
      * Support for 32 bit compat is missing
      * The driver should use the .ioctl function because
        .locked_ioctl is going away.

      The use of the big kernel lock remains for now, but gets
      made explicit in the ioctl function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  scsi/sd: remove big kernel lock
Arnd Bergmann [Wed, 7 Jul 2010 14:51:29 +0000 (16:51 +0200)]
scsi/sd: remove big kernel lock

Every user of the BKL in the sd driver is the
result of the pushdown from the block layer
into the open/close/ioctl functions.

The only place that used to rely on the BKL is
the sdkp->openers variable, which gets converted
into an atomic_t.

Nothing else seems to rely on the BKL, since the
functions do not touch global data without holding
another lock, and the open/close functions are
still protected from concurrent execution using
the bdev->bd_mutex.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <James.Bottomley@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
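The sd commit above replaces a lock-protected open counter with an atomic one. The sketch below models that idea in userspace with C11 atomics, which stand in for the kernel's atomic_t; struct disk_model and both function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct disk_model {
	atomic_int openers;   /* models sdkp->openers after the conversion */
};

/* Returns 1 if the caller is the first opener, 0 otherwise. */
int disk_open(struct disk_model *d)
{
	return atomic_fetch_add(&d->openers, 1) == 0;
}

/* Returns 1 if the caller is the last one to release, 0 otherwise. */
int disk_release(struct disk_model *d)
{
	return atomic_fetch_sub(&d->openers, 1) == 1;
}
```

Because the increment and the test happen in one atomic operation, the first-open and last-close decisions stay correct without a global lock.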
14 years ago  block: remove BKL from partition ioctls
Arnd Bergmann [Wed, 7 Jul 2010 14:51:28 +0000 (16:51 +0200)]
block: remove BKL from partition ioctls

The blkpg_ioctl and blkdev_reread_part access fields of
the bdev and gendisk structures, yet they always do so
under the protection of bdev->bd_mutex, which seems
sufficient.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years ago  block: remove BKL from BLKROSET and BLKFLSBUF
Arnd Bergmann [Wed, 7 Jul 2010 14:51:27 +0000 (16:51 +0200)]
block: remove BKL from BLKROSET and BLKFLSBUF

We only call the functions set_device_ro(),
invalidate_bdev(), sync_filesystem() and sync_blockdev()
while holding the BKL in these commands. All
of these are also done in other code paths without
the BKL, which leads me to the conclusion that
the BKL is not needed here either.

The reason we hold it here is that it was originally
pushed down into the ioctl function from vfs_ioctl.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: push BKL into blktrace ioctls
Arnd Bergmann [Wed, 7 Jul 2010 14:51:26 +0000 (16:51 +0200)]
block: push BKL into blktrace ioctls

The blktrace driver currently needs the BKL, but
we should not need to take that in the block layer,
so just push it down into the driver itself.

It is quite likely that the BKL is not actually
required in blktrace code and could be removed
in a follow-on patch.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: push down BKL into .open and .release
Arnd Bergmann [Sat, 7 Aug 2010 16:25:34 +0000 (18:25 +0200)]
block: push down BKL into .open and .release

The open and release block_device_operations are currently
called with the BKL held. In order to change that, we must
first make sure that all drivers that currently rely
on this have no regressions.

This blindly pushes the BKL into all .open and .release
operations for all block drivers to prepare for the
next step. The drivers can subsequently replace the BKL
with their own locks or remove it completely when it can
be shown that it is not needed.

The functions blkdev_get and blkdev_put are the only
remaining users of the big kernel lock in the block
layer, besides a few uses in the ioctl code, none
of which need to serialize with blkdev_{get,put}.

Most of the code in these two functions is also under the protection
of bdev->bd_mutex, including the actual calls to
->open and ->release, and the common code does not
access any global data structures that need the BKL.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
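The pushdown pattern described above can be sketched in userspace as follows. All names here (`big_lock`, `toy_open`, `struct toy_dev`) are hypothetical, and the spinning `atomic_flag` is only a stand-in: the real BKL was recursive and was dropped when its holder slept, which this toy lock does not model. The sketch shows the shape of the change only: the caller no longer takes the global lock, and each driver's open wraps its old body in it instead.

```c
#include <stdatomic.h>

/* Hypothetical global lock standing in for the BKL (not recursive). */
static atomic_flag big_lock = ATOMIC_FLAG_INIT;

static void big_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&big_lock))
        ;   /* spin until the flag was previously clear */
}

static void big_lock_release(void)
{
    atomic_flag_clear(&big_lock);
}

struct toy_dev {
    int users;
};

/* The driver's original open body, written assuming the lock is held. */
static int toy_open_locked(struct toy_dev *dev)
{
    dev->users++;
    return 0;
}

/* After the pushdown: the driver takes the lock itself, so the
 * common code can call ->open without holding it. */
static int toy_open(struct toy_dev *dev)
{
    int ret;

    big_lock_acquire();
    ret = toy_open_locked(dev);
    big_lock_release();
    return ret;
}
```

A driver converted this way behaves as before, and can later replace the global lock with a private one, or drop it, without touching the common code.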
14 years agoblock: push down BKL into .locked_ioctl
Arnd Bergmann [Thu, 8 Jul 2010 08:18:46 +0000 (10:18 +0200)]
block: push down BKL into .locked_ioctl

As a preparation for the removal of the big kernel
lock in the block layer, this removes the BKL
from the common ioctl handling code, moving it
into every single driver still using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoscsi/i2o_block: cleanup ioctl handling
Arnd Bergmann [Wed, 7 Jul 2010 14:51:23 +0000 (16:51 +0200)]
scsi/i2o_block: cleanup ioctl handling

This fixes the ioctl function of the i2o_block driver, which
has multiple problems:

* The BLKI2OSRSTRAT and BLKI2OSWSTRAT commands always return
  -ENOTTY on success, where they should return 0.
* Support for 32 bit compat is missing
* The driver should use the .ioctl function because
  .locked_ioctl is going away.

The use of the big kernel lock remains for now, but gets
made explicit in the ioctl function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
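One way a handler can end up returning -ENOTTY on success is sketched below. This is an illustration, not the i2o_block code: the command numbers, `struct toy_dev` and both functions are hypothetical. In the buggy variant every handled case breaks out of the switch and reaches the shared "unknown command" return; the fixed variant tracks the result explicitly.

```c
#include <errno.h>

/* Hypothetical command numbers, not the real BLKI2O* values. */
enum { TOY_GET_STRAT = 1, TOY_SET_STRAT = 2 };

struct toy_dev {
    int strat;
};

/* Buggy shape: handled commands break and hit the final -ENOTTY. */
static int toy_ioctl_buggy(struct toy_dev *dev, unsigned int cmd, int *arg)
{
    switch (cmd) {
    case TOY_GET_STRAT:
        *arg = dev->strat;
        break;
    case TOY_SET_STRAT:
        dev->strat = *arg;
        break;
    }
    return -ENOTTY;
}

/* Fixed shape: -ENOTTY is only the default for unknown commands. */
static int toy_ioctl_fixed(struct toy_dev *dev, unsigned int cmd, int *arg)
{
    int ret = -ENOTTY;

    switch (cmd) {
    case TOY_GET_STRAT:
        *arg = dev->strat;
        ret = 0;
        break;
    case TOY_SET_STRAT:
        dev->strat = *arg;
        ret = 0;
        break;
    }
    return ret;
}
```

Note that the buggy variant still performs the requested operation; only the return value misleads the caller.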
14 years agoscsi: fix discard page leak
FUJITA Tomonori [Thu, 8 Jul 2010 08:16:17 +0000 (10:16 +0200)]
scsi: fix discard page leak

We leak a page allocated for discard on some error conditions
(e.g. scsi_prep_state_check returns BLKPREP_DEFER in
scsi_setup_blk_pc_cmnd).

We unprep requests that weren't prepped in the error path of
scsi_init_io. This makes the error path that cleans up scsi commands messy.

Let's strictly apply the rule that we can't unprep on a request that
wasn't prepped.

Calling just scsi_put_command() in the error path of scsi_init_io() is
enough. We don't set REQ_DONTPREP yet.

scsi_setup_discard_cmnd can safely free a page on the error case with
the above rule.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agowriteback: Add tracing to write_cache_pages
Dave Chinner [Wed, 7 Jul 2010 03:24:08 +0000 (13:24 +1000)]
writeback: Add tracing to write_cache_pages

Add a trace event to the ->writepage loop in write_cache_pages to give
visibility into how the ->writepage call is changing variables within the
writeback control structure. Of most interest is how wbc->nr_to_write changes
from call to call, especially with filesystems that write multiple pages
in ->writepage.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agowriteback: Add tracing to balance_dirty_pages
Dave Chinner [Wed, 7 Jul 2010 03:24:07 +0000 (13:24 +1000)]
writeback: Add tracing to balance_dirty_pages

Tracing high level background writeback events is good, but it doesn't
give the entire picture. Add visibility into write throttling to catch IO
dispatched by foreground throttling of processes dirtying lots of pages.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agowriteback: Initial tracing support
Dave Chinner [Wed, 7 Jul 2010 03:24:06 +0000 (13:24 +1000)]
writeback: Initial tracing support

Trace queue/sched/exec parts of the writeback loop. This provides
insight into when and why flusher threads are scheduled to run. e.g.
a sync invocation leaves traces like:

     sync-[...]: writeback_queue: bdi 8:0: sb_dev 8:1 nr_pages=7712 sync_mode=0 kupdate=0 range_cyclic=0 background=0
flush-8:0-[...]: writeback_exec: bdi 8:0: sb_dev 8:1 nr_pages=7712 sync_mode=0 kupdate=0 range_cyclic=0 background=0

This also lays the foundation for adding more writeback tracing to
provide deeper insight into the whole writeback path.

The original tracing code is from Jens Axboe, though this version is
a rewrite as a result of the code being traced changing
significantly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: remove unused REQ_TYPE_LINUX_BLOCK
FUJITA Tomonori [Tue, 6 Jul 2010 07:03:18 +0000 (09:03 +0200)]
block: remove unused REQ_TYPE_LINUX_BLOCK

Nobody uses REQ_TYPE_LINUX_BLOCK (and its REQ_LB_OP_*).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoscsi: need to reset unprep_rq_fn in sd_remove
FUJITA Tomonori [Sat, 3 Jul 2010 14:07:04 +0000 (08:07 -0600)]
scsi: need to reset unprep_rq_fn in sd_remove

This is for block's for-2.6.36.

We need to reset q->unprep_rq_fn in sd_remove. Otherwise we hit a kernel
oops if we access a scsi disk device via sg after removing the scsi
disk module.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: remove q->prepare_flush_fn completely
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:40 +0000 (17:45 +0900)]
block: remove q->prepare_flush_fn completely

This removes q->prepare_flush_fn completely (changes the
blk_queue_ordered API).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoide: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:39 +0000 (17:45 +0900)]
ide: stop using q->prepare_flush_fn

use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agovirtio_blk: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:38 +0000 (17:45 +0900)]
virtio_blk: stop using q->prepare_flush_fn

use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agodm: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:37 +0000 (17:45 +0900)]
dm: stop using q->prepare_flush_fn

use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agops3disk: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:36 +0000 (17:45 +0900)]
ps3disk: stop using q->prepare_flush_fn

REQ_FLUSH flag enables us to kill ps3disk_prepare_flush().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoosdblk: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:35 +0000 (17:45 +0900)]
osdblk: stop using q->prepare_flush_fn

use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoscsi: stop using q->prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:34 +0000 (17:45 +0900)]
scsi: stop using q->prepare_flush_fn

scsi-ml builds flush requests via q->prepare_flush_fn(), but
builds discard requests via q->prep_rq_fn.

Using two different mechanisms for similar requests (building
commands in SCSI ULD) doesn't make sense.

Handling both via q->prep_rq_fn makes the code design simpler.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: permit PREFLUSH and POSTFLUSH without prepare_flush_fn
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:33 +0000 (17:45 +0900)]
block: permit PREFLUSH and POSTFLUSH without prepare_flush_fn

This is preparation for removing q->prepare_flush_fn.

Temporarily, blk_queue_ordered() permits QUEUE_ORDERED_DO_PREFLUSH and
QUEUE_ORDERED_DO_POSTFLUSH without prepare_flush_fn.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: introduce REQ_FLUSH flag
FUJITA Tomonori [Sat, 3 Jul 2010 08:45:32 +0000 (17:45 +0900)]
block: introduce REQ_FLUSH flag

SCSI-ml needs a way to mark a request as flush request in
q->prepare_flush_fn because it needs to identify them later (e.g. in
q->request_fn or prep_rq_fn).

queue_flush sets REQ_HARDBARRIER in rq->cmd_flags however the block
layer also sends normal REQ_TYPE_FS requests with REQ_HARDBARRIER. So
SCSI-ml can't use REQ_HARDBARRIER to identify flush requests.

We could change the block layer to clear REQ_HARDBARRIER bit before
sending non flush requests to the lower layers. However, introducing
the new flag looks cleaner (surely easier).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
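The ambiguity described above can be shown with a minimal sketch. The bit positions and names (`TOY_HARDBARRIER`, `TOY_FLUSH`, `struct toy_request`) are hypothetical and do not match the kernel's values. Both a barrier write and a flush carry the barrier bit, so testing that bit cannot tell them apart, while a dedicated flush bit can.

```c
/* Hypothetical bit positions, chosen only for illustration. */
#define TOY_HARDBARRIER (1u << 0)
#define TOY_FLUSH       (1u << 1)

struct toy_request {
    unsigned int cmd_flags;
};

/* Ambiguous test: true for barrier writes and for flushes alike. */
static int toy_has_barrier(const struct toy_request *rq)
{
    return (rq->cmd_flags & TOY_HARDBARRIER) != 0;
}

/* Unambiguous test: only true for requests marked as flushes. */
static int toy_is_flush(const struct toy_request *rq)
{
    return (rq->cmd_flags & TOY_FLUSH) != 0;
}
```

A lower layer that wants to build a cache-flush command would key off the second test and leave barrier writes on the normal path.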
14 years agoscsi: remove unused free discard page in sd_done
FUJITA Tomonori [Thu, 1 Jul 2010 10:49:19 +0000 (19:49 +0900)]
scsi: remove unused free discard page in sd_done

- sd_done isn't called for pc request so we never call the code.
- we use sd_unprep to free discard page now.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoscsi: add sd_unprep_fn to free discard page
FUJITA Tomonori [Thu, 1 Jul 2010 10:49:18 +0000 (19:49 +0900)]
scsi: add sd_unprep_fn to free discard page

This fixes discard page leak by using q->unprep_rq_fn facility.

q->unprep_rq_fn is called when all the data buffer (req->bio and
scsi_data_buffer) in the request is freed.

sd_unprep() uses rq->buffer to free discard page allocated in
sd_prepare_discard().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
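The prep/unprep pairing can be sketched in userspace as below. This is not the sd code: `struct toy_request`, `TOY_DONTPREP`, `toy_prep()` and `toy_unprep()` are hypothetical, and `malloc()` stands in for the page allocation. The sketch assumes a "prepped" flag is set only after a successful prep, so that unprep can refuse to touch a request that was never prepped.

```c
#include <stdlib.h>

/* Hypothetical analogue of a "request has been prepped" flag. */
#define TOY_DONTPREP 0x1u

struct toy_request {
    unsigned int flags;
    void *buffer;   /* payload allocated during prep */
};

/* Allocate the payload once; returns 0 on success, -1 on failure. */
static int toy_prep(struct toy_request *rq)
{
    if (rq->flags & TOY_DONTPREP)
        return 0;               /* already prepped, nothing to do */

    rq->buffer = malloc(4096);
    if (!rq->buffer)
        return -1;              /* flag stays clear: nothing to unprep */

    rq->flags |= TOY_DONTPREP;
    return 0;
}

/* Undo prep, but only for requests that were actually prepped. */
static void toy_unprep(struct toy_request *rq)
{
    if (!(rq->flags & TOY_DONTPREP))
        return;

    free(rq->buffer);
    rq->buffer = NULL;
    rq->flags &= ~TOY_DONTPREP;
}
```

With this rule the error paths stay simple: a failed prep leaves nothing behind, and a successful one is undone in exactly one place.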
14 years agoblock: implement an unprep function corresponding directly to prep
James Bottomley [Thu, 1 Jul 2010 10:49:17 +0000 (19:49 +0900)]
block: implement an unprep function corresponding directly to prep

Reviewed-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agodrivers/cdrom: use pr_<level>
Joe Perches [Thu, 1 Jul 2010 06:24:32 +0000 (08:24 +0200)]
drivers/cdrom: use pr_<level>

- add pr_fmt.

- convert printks to pr_<level>

- add if (0) and printf argument checking to cdinfo

- coalesce consecutive printks to single pr_

- fix a typo "back ground" to "background"

- convert printks without level to pr_info

- remove VIOCD_ prefixes and use pr_fmt/pr_<level>

- add a missing newline to an OS/400 message

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Folded in tab indentation fix from Andrew.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
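The pr_fmt convention used by this conversion can be imitated in userspace. In the sketch below, `toy_pr_info` is a hypothetical stand-in that formats into a buffer with `snprintf()` instead of calling printk, and the "toycd: " prefix is made up. It relies on the GNU `##__VA_ARGS__` extension (accepted by gcc and clang), as the kernel macros do.

```c
#include <stdio.h>

/* Defined before any use of the logging macro; every message
 * emitted from this file picks up the prefix automatically. */
#define pr_fmt(fmt) "toycd: " fmt

/* Userspace stand-in: writes "<info> " plus the prefixed message
 * into buf and returns what snprintf() returns. */
#define toy_pr_info(buf, len, fmt, ...) \
    snprintf((buf), (len), "<info> " pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller: note that the call site carries no prefix. */
static int toy_report_tray(char *buf, size_t len, const char *state)
{
    return toy_pr_info(buf, len, "tray %s\n", state);
}
```

The prefix is spliced in by string-literal concatenation at compile time, which is why the format argument has to be a literal.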
14 years agoblock: fixup missing conversion from BIO_RW_DISCARD to REQ_DISCARD
Jens Axboe [Tue, 29 Jun 2010 11:33:38 +0000 (13:33 +0200)]
block: fixup missing conversion from BIO_RW_DISCARD to REQ_DISCARD

Didn't cause a merge conflict, so fixed this one up manually
post merge.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock/xd.c: fix brace typo
Randy Dunlap [Tue, 22 Jun 2010 16:03:43 +0000 (09:03 -0700)]
block/xd.c: fix brace typo

Fix extra brace typo that is causing build errors.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agogcc-4.6: fs: fix unused but set warnings
Andi Kleen [Mon, 21 Jun 2010 09:02:48 +0000 (11:02 +0200)]
gcc-4.6: fs: fix unused but set warnings

No real bugs I believe, just some dead code, and some
shut up code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agogcc-4.6: block: fix unused but set variables in blk-merge
Andi Kleen [Mon, 21 Jun 2010 09:02:47 +0000 (11:02 +0200)]
gcc-4.6: block: fix unused but set variables in blk-merge

Just some dead code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: don't allocate a payload for discard request
Christoph Hellwig [Fri, 18 Jun 2010 14:59:42 +0000 (16:59 +0200)]
block: don't allocate a payload for discard request

Allocating a fixed payload for discard requests always was a horrible hack,
and it's now coming to bite us when adding support for discard in DM/MD.

So change the code to leave the allocation of a payload to the lowlevel
driver.  Unfortunately that means we'll need another hack, which allows
us to update the various block layer length fields indicating that we
have a payload.  Instead of hiding this in sd.c, which we already partially
do for UNMAP support, add a documented helper in the core block layer for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agowriteback: merge bdi_writeback_task and bdi_start_fn
Christoph Hellwig [Sat, 19 Jun 2010 21:08:22 +0000 (23:08 +0200)]
writeback: merge bdi_writeback_task and bdi_start_fn

Move all code for the writeback thread into fs/fs-writeback.c instead of
splitting it over two functions in two files.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agowriteback: remove wb_list
Christoph Hellwig [Sat, 19 Jun 2010 21:08:06 +0000 (23:08 +0200)]
writeback: remove wb_list

The wb_list member of struct backing_device_info always has exactly one
element.  Just use the direct bdi->wb pointer instead and simplify some
code.

Also remove bdi_task_init which is now trivial to prepare for the next
patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: fix some more cmd_type cleanup fallout
Christoph Hellwig [Sat, 19 Jun 2010 15:26:47 +0000 (17:26 +0200)]
block: fix some more cmd_type cleanup fallout

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agovirtio_blk: add default case to cmd type switch
Jens Axboe [Fri, 18 Jun 2010 10:10:18 +0000 (12:10 +0200)]
virtio_blk: add default case to cmd type switch

On compilation, gcc correctly detects that we do not handle
all types:

In function ‘blk_done’:
warning: enumeration value ‘REQ_TYPE_FS’ not handled in switch
warning: enumeration value ‘REQ_TYPE_SENSE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SUSPEND’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_RESUME’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SHUTDOWN’ not handled in switch
warning: enumeration value ‘REQ_TYPE_LINUX_BLOCK’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_TASKFILE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_PC’ not handled in switch

which is a bit pointless since this is at the end of the request
processing. Add a default case that just breaks out.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
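The warning and its fix can be reproduced with a small sketch. The enum and function below are hypothetical and much smaller than the real request types. Without the default label, gcc's -Wswitch (enabled by -Wall) reports every enumerator that has no case; with it, the unhandled types fall through silently.

```c
/* Hypothetical request types, standing in for the REQ_TYPE_* enum. */
enum toy_req_type {
    TOY_TYPE_FS,
    TOY_TYPE_BLOCK_PC,
    TOY_TYPE_SPECIAL,
    TOY_TYPE_SENSE
};

struct toy_request {
    enum toy_req_type cmd_type;
    int errors;
};

/* Completion hook: only two types need extra bookkeeping. */
static int toy_done(struct toy_request *rq, int status)
{
    switch (rq->cmd_type) {
    case TOY_TYPE_BLOCK_PC:
        rq->errors = status;
        break;
    case TOY_TYPE_SPECIAL:
        rq->errors = (status != 0);
        break;
    default:
        /* Remaining types need no work here; the label also
         * tells the compiler the omission is deliberate. */
        break;
    }
    return rq->errors;
}
```

The trade-off is that a default label also hides the warning for enumerators added later, so it fits best where ignoring unknown types is the intended behaviour.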
14 years agoblock: unify flags for struct bio and struct request
Christoph Hellwig [Sat, 7 Aug 2010 16:20:39 +0000 (18:20 +0200)]
block: unify flags for struct bio and struct request

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver.  There were two flags in the bio that were
missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: remove wrappers for request type/flags
Christoph Hellwig [Sat, 7 Aug 2010 16:17:56 +0000 (18:17 +0200)]
block: remove wrappers for request type/flags

Remove all the trivial wrappers for the cmd_type and cmd_flags fields in
struct requests.  This allows much easier grepping for different request
types instead of unwinding through macros.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoremove needless ISA_DMA_THRESHOLD
FUJITA Tomonori [Mon, 31 May 2010 06:59:04 +0000 (15:59 +0900)]
remove needless ISA_DMA_THRESHOLD

Architectures don't need to define ISA_DMA_THRESHOLD anymore.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: kill ISA_DMA_THRESHOLD usage
FUJITA Tomonori [Mon, 31 May 2010 06:59:03 +0000 (15:59 +0900)]
block: kill ISA_DMA_THRESHOLD usage

block uses ISA_DMA_THRESHOLD for BLK_BOUNCE_ISA. Only SCSI uses
ISA_DMA_THRESHOLD for ancient drivers with non-zero
unchecked_isa_dma. Nowadays drivers (and subsystems) use dma_mask
properly instead of ISA_DMA_THRESHOLD.

Documentation/scsi/scsi_mid_low_api.txt says:

unchecked_isa_dma - 1=>only use bottom 16 MB of ram (ISA DMA addressing
                   restriction), 0=>can use full 32 bit (or better) DMA
                   address space

So block simply uses DMA_BIT_MASK(24) for BLK_BOUNCE_ISA for SCSI.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
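The arithmetic behind the 16 MB limit can be checked with a short sketch. The macro below has the same shape as the kernel's DMA_BIT_MASK; `toy_needs_bounce()` is a hypothetical helper added only to show how such a limit would be compared against an address.

```c
#include <stdint.h>

/* n low bits set; n == 64 is special-cased to avoid an oversized shift. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical check: an address above the limit must be bounced. */
static int toy_needs_bounce(uint64_t addr, uint64_t limit)
{
    return addr > limit;
}
```

DMA_BIT_MASK(24) is 0xffffff, the highest address in the bottom 16 MB, so the first byte that would need bouncing is at 0x1000000.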
14 years agoaha1532: remove ISA_DMA_THRESHOLD usage
FUJITA Tomonori [Thu, 17 Jun 2010 12:58:21 +0000 (14:58 +0200)]
aha1532: remove ISA_DMA_THRESHOLD usage

We can safely remove ISA_DMA_THRESHOLD usage in aha1542. aha1542 uses
ISA_DMA_THRESHOLD to see if:

- the buffers in scatter/list are below 16MB.
- scsi_host is below 16MB.

Both checks were added in ancient times but aren't necessary
nowadays since we properly bounce the buffers and allocate scsi_host
below 16MB with non-zero unchecked_isa_dma.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agoblock: BARRIER request should imply SYNC
Christoph Hellwig [Thu, 17 Jun 2010 06:54:16 +0000 (08:54 +0200)]
block: BARRIER request should imply SYNC

A barrier request should by definition have priority in get_request
and let the queue be unplugged immediately as it's blocking all forward
progress due to the queue draining.

Most filesystems already get this implicitly by the way submit_bh
treats the buffer_ordered flag, and gfs2 sets it explicitly.  But btrfs
and XFS are still forgetting to set the flag, as are blkdev_issue_flush
and some places in DM/MD.

For XFS on metadata heavy workloads this gives a consistent speedup
in the 2-3% range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
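The idea that a barrier implies sync can be expressed as a flag normalization step. The sketch below is one possible way to state the rule and is not how the patch itself is implemented: the flag names, bit positions and `toy_normalize_flags()` are all hypothetical.

```c
/* Hypothetical bit positions, chosen only for illustration. */
#define TOY_WRITE       (1u << 0)
#define TOY_SYNC        (1u << 1)
#define TOY_HARDBARRIER (1u << 2)

/* Any request carrying the barrier bit is also treated as sync,
 * so callers that forget to set it still get sync handling. */
static unsigned int toy_normalize_flags(unsigned int flags)
{
    if (flags & TOY_HARDBARRIER)
        flags |= TOY_SYNC;
    return flags;
}
```

Putting the rule in one place means submitters that forget the sync bit still get it; defining the barrier write constant to include the sync bit would achieve the same effect.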
14 years agofloppy: use warning macros
Stephen Hemminger [Tue, 15 Jun 2010 11:21:11 +0000 (13:21 +0200)]
floppy: use warning macros

Convert assertions to use WARN().  There are several error checks in the
code for things that should never happen.  Convert them to standard
warnings so kerneloops.org will see them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
14 years agofloppy: use wait_event_interruptible
Stephen Hemminger [Tue, 15 Jun 2010 11:21:11 +0000 (13:21 +0200)]
floppy: use wait_event_interruptible

Convert wait loops to use wait_event_ macros.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>