Sasha Levin [Mon, 29 Apr 2013 02:30:08 +0000 (12:00 +0930)]
virtio-net: fill only rx queues which are being used
Due to MQ support we may allocate a whole bunch of rx queues but
never use them. With this patch we'll safe the space used by
the receive buffers until they are actually in use:
sh-4.2# free -h
total used free shared buffers cached
Mem: 490M 35M 455M 0B 0B 4.1M
-/+ buffers/cache: 31M 459M
Swap: 0B 0B 0B
sh-4.2# ethtool -L eth0 combined 8
sh-4.2# free -h
total used free shared buffers cached
Mem: 490M 162M 327M 0B 0B 4.1M
-/+ buffers/cache: 158M 331M
Swap: 0B 0B 0B
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:42 +0000 (14:10 +0930)]
lguest: map Switcher below fixmap.
Now we've adjusted all the code, we can simply set switcher_addr to
wherever it needs to go below the fixmaps, rather than asserting that
it should be so.
With large NR_CPUS and PAE, people were hitting the "mapping switcher
would thwack fixmap" message.
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:41 +0000 (14:10 +0930)]
lguest: cache last cpu we ran on.
This optimizes the frobbing of our Switcher map.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:41 +0000 (14:10 +0930)]
lguest: map Switcher text whenever we allocate a new pagetable.
It's always to same, so no need to put in the PTE every time we're
about to run. Keep a flag to track whether the pagetable has the
Switcher entries allocated, and when allocating always initialize the
Switcher text PTE.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:40 +0000 (14:10 +0930)]
lguest: don't share Switcher PTE pages between guests.
We currently use the whole top PGD entry for the switcher, so we
simply share a fixed page of PTEs between all guests (actually, it's
one per Host CPU, to ensure isolation between guests).
Changes to a scheme where every guest has its own mappings.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:40 +0000 (14:10 +0930)]
lguest: expost switcher_pages array (as lg_switcher_pages).
We will need this in page_table.c soon.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:39 +0000 (14:10 +0930)]
lguest: extract shadow PTE walking / allocating.
We want a separate find_pte() function so we can call it for populating the
switcher PTE entries.
We can also use it in page_writable().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:39 +0000 (14:10 +0930)]
lguest: make check_gpte et. al return bool.
This is a bit neater: we can immediately return if a PTE/PGD/PMD entry
is invalid (which also kills the guest). It means we don't risk using
invalid entries as we reshuffle the code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:38 +0000 (14:10 +0930)]
lguest: assume Switcher text is a single page.
ie. SHARED_SWITCHER_PAGES == 1. It is well under a page, and it's a
minor simplification: it's nice to have *one* simplification in a
patch series!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:38 +0000 (14:10 +0930)]
lguest: rename switcher_page to switcher_pages.
There is a single page with the Switcher in it, but it's followed by 2
pages per Host CPU.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:37 +0000 (14:10 +0930)]
lguest: remove RESERVE_MEM constant.
We can use switcher_addr directly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:37 +0000 (14:10 +0930)]
lguest: check vaddr not pgd for Switcher protection.
We currently assume that the Switcher the top pgd; we want to remove
this assumption, so check that vaddr is OK, rather then checking pgd
index.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Apr 2013 04:40:37 +0000 (14:10 +0930)]
lguest: prepare to make SWITCHER_ADDR a variable.
We currently use the whole top PGD entry for the switcher, but that's
hitting the fixmap in some configurations (mainly, large NR_CPUS).
Introduce a variable, currently set to the constant.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 15 Apr 2013 02:30:15 +0000 (12:00 +0930)]
virtio: console: replace EMFILE with EBUSY for already-open port
Returning EMFILE (process has too many open files) is incorrect to
indicate a port is already open by another process. Use EBUSY for that.
This does change what we report to userspace, but I believe userspace
can look at it this way: it gets EBUSY, a new error code, instead of
EMFILE. It's still an error, and that's not changing.
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wanlong Gao [Mon, 8 Apr 2013 13:35:49 +0000 (23:05 +0930)]
virtio-scsi: reset virtqueue affinity when doing cpu hotplug
Add hot cpu notifier to reset the request virtqueue affinity
when doing cpu hotplug.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paolo Bonzini [Mon, 8 Apr 2013 13:33:25 +0000 (23:03 +0930)]
virtio-scsi: introduce multiqueue support
This patch adds queue steering to virtio-scsi. When a target is sent
multiple requests, we always drive them to the same queue so that FIFO
processing order is kept. However, if a target was idle, we can choose
a queue arbitrarily. In this case the queue is chosen according to the
current VCPU, so the driver expects the number of request queues to be
equal to the number of VCPUs. This makes it easy and fast to select
the queue, and also lets the driver optimize the IRQ affinity for the
virtqueues (each virtqueue's affinity is set to the CPU that "owns"
the queue).
The speedup comes from improving cache locality and giving CPU affinity
to the virtqueues, which is why this scheme was selected. Assuming that
the thread that is sending requests to the device is I/O-bound, it is
likely to be sleeping at the time the ISR is executed, and thus executing
the ISR on the same processor that sent the requests is cheap.
However, the kernel will not execute the ISR on the "best" processor
unless you explicitly set the affinity. This is because in practice
you will have many such I/O-bound processes and thus many otherwise
idle processors. Then the kernel will execute the ISR on a random
processor, rather than the one that is sending requests to the device.
The alternative to per-CPU virtqueues is per-target virtqueues. To
achieve the same locality, we could dynamically choose the virtqueue's
affinity based on the CPU of the last task that sent a request. This
is less appealing because we do not set the affinity directly---we only
provide a hint to the irqbalanced running in userspace. Dynamically
changing the affinity only works if the userspace applies the hint
fast enough.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Tested-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paolo Bonzini [Mon, 8 Apr 2013 13:32:07 +0000 (23:02 +0930)]
virtio-scsi: push vq lock/unlock into virtscsi_vq_done
Avoid duplicated code in all of the callers.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paolo Bonzini [Mon, 8 Apr 2013 13:31:38 +0000 (23:01 +0930)]
virtio-scsi: pass struct virtio_scsi to virtqueue completion function
This will be needed soon in order to retrieve the per-target
struct.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wanlong Gao [Mon, 8 Apr 2013 13:31:16 +0000 (23:01 +0930)]
virtio-scsi: redo allocation of target data
virtio_scsi_target_state is now empty. We will find new uses for it in
the next few patches, so this patch does not drop it completely.
And as James suggested, we use entries target_alloc and target_destroy
in the host template to allocate and destroy the virtio_scsi_target_state
of each target, attach this struct to scsi_target->hostdata. Now
we can get at it from the sdev with scsi_target(sdev)->hostdata.
No messing around with fixed size arrays and bulk memory allocation
and no need to pass in the maximum target size as a parameter because
everything should now happen dynamically.
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wei Yongjun [Mon, 8 Apr 2013 06:43:59 +0000 (16:13 +0930)]
virtio_console: make local symbols static
Those symbols only used within this file, and should be static.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wei Yongjun [Tue, 2 Apr 2013 06:15:56 +0000 (16:45 +1030)]
caif_virtio: fix error return code in cfv_create_genpool()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amos Kong [Tue, 2 Apr 2013 06:12:02 +0000 (16:42 +1030)]
MAINTAINERS: add missing entries for virtio
Some head files were split or moved to uapi/ without
updating MAINTAINERS.
Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Paul Bolle [Mon, 1 Apr 2013 23:29:17 +0000 (09:59 +1030)]
virtio: do not export "u16" and "u64" to userspace
virtio_balloon.h exports "u16" and "u64" to userspace. Use "__u16" and
"__u64" instead.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sjur Brændeland [Sun, 24 Mar 2013 03:49:59 +0000 (14:19 +1030)]
caif_virtio: Check that vringh_config is not null
Check that vringh_config is not NULL before using it.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sjur Brændeland [Sun, 24 Mar 2013 03:49:44 +0000 (14:19 +1030)]
caif_virtio: Use vringh_notify_enable correctly
Check on the correct return value from
vringh_notify_enable_kern(). It returns false if
more packets are available, not true.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:30 +0000 (15:44 +1030)]
tools/virtio: remove virtqueue_add_buf() from tests.
Make the rest of the paths use virtqueue_add_sgs or add_outbuf.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:30 +0000 (15:44 +1030)]
9p/trans_virtio.c: use virtio_add_sgs[]
virtio_add_buf() is going away, replaced with virtio_add_sgs() which
takes multiple terminated scatterlists.
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:30 +0000 (15:44 +1030)]
virtio_balloon: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:29 +0000 (15:44 +1030)]
virtio_rpmsg_bus: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:29 +0000 (15:44 +1030)]
caif_virtio: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Cc: Sjur Brendeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:29 +0000 (15:44 +1030)]
virtio_console: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:29 +0000 (15:44 +1030)]
virtio_rng: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Asias He <asias@redhat.com>
Rusty Russell [Wed, 20 Mar 2013 05:14:28 +0000 (15:44 +1030)]
virtio_net: use simplified virtqueue accessors.
We never add buffers with input and output parts, so use the new accessors.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:28 +0000 (15:44 +1030)]
virtio_net: use virtqueue_add_sgs[] for command buffers.
It's a bit cleaner to hand multiple sgs, rather than one big one.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:28 +0000 (15:44 +1030)]
virtio_scsi: use virtqueue_add_inbuf() for virtscsi_kick_event.
It's a bit clearer, and add_buf is going away.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Asias He <asias@redhat.com>
Wanlong Gao [Wed, 20 Mar 2013 05:14:28 +0000 (15:44 +1030)]
virtio-scsi: use virtqueue_add_sgs for command buffers
Using the new virtqueue_add_sgs function lets us simplify the queueing
path. In particular, all data protected by the tgt_lock is just gone
(multiqueue will find a new use for the lock).
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:27 +0000 (15:44 +1030)]
virtio_blk: remove nents member.
It's simply a flag as to whether we have data now, so make it an
explicit function parameter rather than a member of struct
virtblk_req.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Asias He <asias@redhat.com>
Paolo Bonzini [Wed, 20 Mar 2013 05:14:27 +0000 (15:44 +1030)]
virtio-blk: use virtqueue_add_sgs on req path
(This is a respin of Paolo Bonzini's patch, but it calls
virtqueue_add_sgs() instead of his multi-part API).
This is similar to the previous patch, but a bit more radical
because the bio and req paths now share the buffer construction
code. Because the req path doesn't use vbr->sg, however, we
need to add a couple of arguments to __virtblk_add_req.
We also need to teach __virtblk_add_req how to build SCSI command
requests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Asias He <asias@redhat.com>
Paolo Bonzini [Wed, 20 Mar 2013 05:14:27 +0000 (15:44 +1030)]
virtio-blk: use virtqueue_add_sgs on bio path
(This is a respin of Paolo Bonzini's patch, but it calls
virtqueue_add_sgs() instead of his multi-part API).
Move the creation of the request header and response footer to
__virtblk_add_req. vbr->sg only contains the data scatterlist,
the header/footer are added separately using virtqueue_add_sgs().
With this change, virtio-blk (with use_bio) is not relying anymore on
the virtio functions ignoring the end markers in a scatterlist.
The next patch will do the same for the other path.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Asias He <asias@redhat.com>
Paolo Bonzini [Wed, 20 Mar 2013 05:14:27 +0000 (15:44 +1030)]
virtio-blk: reorganize virtblk_add_req
Right now, both virtblk_add_req and virtblk_add_req_wait call
virtqueue_add_buf. To prepare for the next patches, abstract the call
to virtqueue_add_buf into a new function __virtblk_add_req, and include
the waiting logic directly in virtblk_add_req.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:26 +0000 (15:44 +1030)]
tools/virtio: make vringh_test use inbuf/outbuf.
As expected, the simplified accessors are faster.
for i in `seq 50`; do /usr/bin/time -f 'Wall time:%e' ./vringh_test --indirect --eventidx --parallel --fast-vringh; done 2>&1 | stats --trim-outliers:
Before:
Using CPUS 0 and 3
Guest: notified 0, pinged 39062-39063(39063)
Host: notified 39062-39063(39063), pinged 0
Wall time:1.760000-2.220000(1.789167)
After:
Using CPUS 0 and 3
Guest: notified 0, pinged 39037-39063(39062)
Host: notified 39037-39063(39062), pinged 0
Wall time:1.640000-1.810000(1.676875)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:14:26 +0000 (15:44 +1030)]
virtio_ring: virtqueue_add_outbuf / virtqueue_add_inbuf.
These are specialized versions of virtqueue_add_buf(), which cover
over 80% of cases and are far clearer.
In particular, the scatterlists passed to these functions don't have
to be clean (ie. we ignore end markers).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 05:07:09 +0000 (15:37 +1030)]
virtio_ring: virtqueue_add_sgs, to add multiple sgs.
virtio_scsi can really use this, to avoid the current hack of copying
the whole sg array. Some other things get slightly neater, too.
This causes a slowdown in virtqueue_add_buf(), which is implemented as
a wrapper. This is addressed in the next patches.
for i in `seq 50`; do /usr/bin/time -f 'Wall time:%e' ./vringh_test --indirect --eventidx --parallel --fast-vringh; done 2>&1 | stats --trim-outliers:
Before:
Using CPUS 0 and 3
Guest: notified 0, pinged 39009-39063(39062)
Host: notified 39009-39063(39062), pinged 0
Wall time:1.700000-1.950000(1.723542)
After:
Using CPUS 0 and 3
Guest: notified 0, pinged 39062-39063(39063)
Host: notified 39062-39063(39063), pinged 0
Wall time:1.760000-2.220000(1.789167)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Asias He <asias@redhat.com>
Paolo Bonzini [Wed, 20 Mar 2013 05:07:08 +0000 (15:37 +1030)]
scatterlist: introduce sg_unmark_end
This is useful in places that recycle the same scatterlist multiple
times, and do not want to incur the cost of sg_init_table every
time in hot paths.
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Erwan Yvin [Wed, 20 Mar 2013 03:22:24 +0000 (13:52 +1030)]
caif_virtio: Introduce caif over virtio
Add the CAIF Virtio shared memory driver for talking
to a modem.
This CAIF Link layer communicates to the modem over
shared memory. It is implemented as a virtio_driver.
The underlying virtio device is managed by the remoteproc
framework. The Virtio queue is used for transmitting data
to the modem, and the new vringh is used for receiving data.
Genalloc is used for managing the shared memory used for TX
data. The default dma-alloc-coherent allocator can only
allocate whole pages, and this wastes too much shared memory.
Flow control is implemented by stopping the TX-queues if the
virtio queues go full or we run out of memory. Queued are
reopened when queues are below the watermark.
NAPI is used in RX path, and a dedicated tasklet is used
for releasing TX buffers.
Signed-off-by: Erwan Yvin <erwan.yvin@stericsson.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor fixes)
Sjur Brændeland [Wed, 20 Mar 2013 03:21:24 +0000 (13:51 +1030)]
virtio: Introduce vringh wrappers in virtio_config
Add wrappers for the host vrings to support loose
coupling between the virtio device and driver.
A new struct vringh_config_ops with the functions
find_vrhs() and del_vrhs() is added to the virtio_device
struct. This enables virtio drivers to manage virtio
host rings without detailed knowledge of how the
vrings are created and deleted.
The function vringh_notify() is added so vringh clients
can notify the other side that buffers are added to the
used-ring.
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (constified vringh_config)
Rusty Russell [Wed, 20 Mar 2013 03:20:24 +0000 (13:50 +1030)]
tools/virtio: add vring_test.
This is mainly to test the drivers/vhost/vringh.c code, but it also
uses the drivers/virtio/virtio_ring.c code for the guest side.
Usage for testing the basic implementation:
./vringh_test
# Test with indirect descriptors
./vringh_test --indirect
# Test with indirect descriptors and event indexex
./vringh_test --indirect --eventidx
You can run a parallel stress test by adding --parallel to any of the
above options.
eg ./vringh_test --parallel:
Using CPUS 0 and 3
Guest: notified
10107974, pinged 107970
Host: notified 108158, pinged
3172148
./vringh_test --indirect --eventidx --parallel:
Using CPUS 0 and 3
Guest: notified 156357, pinged 156251
Host: notified 156251, pinged 78179
Average of 50 times doing ./vringh_test --indirect --eventidx --parallel:
2.840000-3.040000(2.927292)user
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 20 Mar 2013 03:20:14 +0000 (13:50 +1030)]
vringh: host-side implementation of virtio rings.
Getting use of virtio rings correct is tricky, and a recent patch saw
an implementation of in-kernel rings (as separate from userspace).
This abstracts the business of dealing with the virtio ring layout
from the access (userspace or direct); to do this, we use function
pointers, which gcc inlines correctly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Rusty Russell [Mon, 18 Mar 2013 02:52:19 +0000 (13:22 +1030)]
tools/virtio: separate headers more.
This makes them a bit more like the kernel headers, so we can include more
real kernel headers in our tests.
In addition this means that we don't break tools/virtio with the next
patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 18 Mar 2013 02:52:19 +0000 (13:22 +1030)]
virtio_ring: expose virtio barriers for use in vringh.
The host side of ring needs this logic too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 18 Mar 2013 02:52:18 +0000 (13:22 +1030)]
tools/virtio: fix build for 3.8
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 12 Mar 2013 05:07:33 +0000 (15:37 +1030)]
Remove Documentation/virtual/virtio-spec.txt
We haven't been keeping it in sync, so just remove it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Milos Vyletel [Tue, 12 Mar 2013 05:04:40 +0000 (15:34 +1030)]
virtio-blk: emit udev event when device is resized
When virtio-blk device is resized from host (using block_resize from QEMU) emit
KOBJ_CHANGE uevent to notify guest about such change. This allows user to have
custom udev rules which would take whatever action if such event occurs. As a
proof of concept I've created simple udev rule that automatically resize
filesystem on virtio-blk device.
ACTION=="change", KERNEL=="vd*", \
ENV{RESIZE}=="1", \
ENV{ID_FS_TYPE}=="ext[3-4]", \
RUN+="/sbin/resize2fs /dev/%k"
ACTION=="change", KERNEL=="vd*", \
ENV{RESIZE}=="1", \
ENV{ID_FS_TYPE}=="LVM2_member", \
RUN+="/sbin/pvresize /dev/%k"
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
Tested-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor simplification)
Wanlong Gao [Tue, 12 Mar 2013 05:04:40 +0000 (15:34 +1030)]
virtio-scsi: use pr_err() instead of printk()
Convert the virtio-scsi driver to use pr_err() instead of printk().
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Wanlong Gao [Wed, 6 Mar 2013 21:53:00 +0000 (08:53 +1100)]
lguest: fix paths in comments
After commit
07fe997, lguest tool has already moved from
Documentation/virtual/lguest/ to tools/lguest/.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Linus Torvalds [Sun, 3 Mar 2013 23:11:05 +0000 (15:11 -0800)]
Linux 3.9-rc1
Linus Torvalds [Sun, 3 Mar 2013 22:24:59 +0000 (14:24 -0800)]
Merge tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers
Pull fbdev UAPI disintegration from David Howells:
"You'll be glad to here that the end is nigh for the UAPI patches.
Only the fbdev/framebuffer piece remains now that the SCSI stuff has
gone in.
Here are the UAPI disintegration bits for the fbdev drivers. It
appears that Florian hasn't had time to deal with my patch, but back
in December he did say he didn't mind if I pushed it forward."
Yay. No more uapi movement. And hopefully no more big header file
cleanups coming up either, it just tends to be very painful.
* tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/video
Linus Torvalds [Sun, 3 Mar 2013 22:22:53 +0000 (14:22 -0800)]
Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git./linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
- Update the Xen ACPI memory and CPU hotplug locking mechanism.
- Fix PAT issues wherein various applications would not start
- Fix handling of multiple MSI as AHCI now does it.
- Fix ARM compile failures.
* tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus: fix compile failure on ARM with Xen enabled
xen/pci: We don't do multiple MSI's.
xen/pat: Disable PAT using pat_enabled value.
xen/acpi: xen cpu hotplug minor updates
xen/acpi: xen memory hotplug minor updates
Linus Torvalds [Sun, 3 Mar 2013 21:23:02 +0000 (13:23 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull more VFS bits from Al Viro:
"Unfortunately, it looks like xattr series will have to wait until the
next cycle ;-/
This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
more file_inode() work"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify path_get/path_put and fs_struct.c stuff
fix nommu breakage in shmem.c
cache the value of file_inode() in struct file
9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
9p: make sure ->lookup() adds fid to the right dentry
9p: untangle ->lookup() a bit
9p: double iput() in ->lookup() if d_materialise_unique() fails
9p: v9fs_fid_add() can't fail now
v9fs: get rid of v9fs_dentry
9p: turn fid->dlist into hlist
9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
more file_inode() open-coded instances
selinux: opened file can't have NULL or negative ->f_path.dentry
(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
Linus Torvalds [Sun, 3 Mar 2013 21:13:20 +0000 (13:13 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixup from Chris Mason:
"Geert and James both sent this one in, sorry guys"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs/raid56: Add missing #include <linux/vmalloc.h>
Linus Torvalds [Sun, 3 Mar 2013 20:58:43 +0000 (12:58 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull second set of s390 patches from Martin Schwidefsky:
"The main part of this merge are Heikos uaccess patches. Together with
commit
09884964335e ("mm: do not grow the stack vma just because of an
overrun on preceding vma") the user string access is hopefully fixed
for good.
In addition some bug fixes and two cleanup patches."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/module: fix compile warning
qdio: remove unused parameters
s390/uaccess: fix kernel ds access for page table walk
s390/uaccess: fix strncpy_from_user string length check
input: disable i8042 PC Keyboard controller for s390
s390/dis: Fix invalid array size
s390/uaccess: remove pointless access_ok() checks
s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
s390/uaccess: shorten strncpy_from_user/strnlen_user
s390/dasd: fix unresponsive device after all channel paths were lost
s390/mm: ignore change bit for vmemmap
s390/page table dumper: add support for change-recording override bit
Linus Torvalds [Sun, 3 Mar 2013 20:57:38 +0000 (12:57 -0800)]
Merge branch 'fixes-for-3.9-latest' of git://git./linux/kernel/git/deller/parisc-linux
Pull second round of PARISC updates from Helge Deller:
"The most important fix in this branch is the switch of io_setup,
io_getevents and io_submit syscalls to use the available compat
syscalls when running 32bit userspace on 64bit kernel. Other than
that it's mostly removal of compile warnings."
* 'fixes-for-3.9-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix redefinition of SET_PERSONALITY
parisc: do not install modules when installing kernel
parisc: fix compile warnings triggered by atomic_sub(sizeof(),v)
parisc: check return value of down_interruptible() in hp_sdc_rtc.c
parisc: avoid unitialized variable warning in pa_memcpy()
parisc: remove unused variable 'compat_val'
parisc: switch to compat_functions of io_setup, io_getevents and io_submit
parisc: select ARCH_WANT_FRAME_POINTERS
Linus Torvalds [Sun, 3 Mar 2013 20:06:09 +0000 (12:06 -0800)]
Merge tag 'metag-v3.9-rc1-v4' of git://git./linux/kernel/git/jhogan/metag
Pull new ImgTec Meta architecture from James Hogan:
"This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:
- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of
metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()"
* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
metag: Provide dma_get_sgtable()
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
metag: copy devicetree to non-init memory
metag: cleanup metag_ksyms.c includes
metag: move mm/init.c exports out of metag_ksyms.c
metag: move usercopy.c exports out of metag_ksyms.c
metag: move setup.c exports out of metag_ksyms.c
metag: move kick.c exports out of metag_ksyms.c
metag: move traps.c exports out of metag_ksyms.c
metag: move irq enable out of irqflags.h on SMP
genksyms: fix metag symbol prefix on crc symbols
metag: hugetlb: convert to vm_unmapped_area()
metag: export clear_page and copy_page
metag: export metag_code_cache_flush_all
metag: protect more non-MMU memory regions
metag: make TXPRIVEXT bits explicit
metag: kernel/setup.c: sort includes
perf: Enable building perf tools for Meta
metag: add boot time LNKGET/LNKSET check
metag: add __init to metag_cache_probe()
...
Linus Torvalds [Sun, 3 Mar 2013 19:54:39 +0000 (11:54 -0800)]
Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull late ARM updates from Russell King:
"Here is the late set of ARM updates for this merge window; in here is:
- The ARM parts of the broadcast timer support, core parts merged
through tglx's tree. This was left over from the previous merge to
allow the dependency on tglx's tree to be resolved.
- A fix to the VFP code which shows up on Raspberry Pi's, as well as
fixing the fallout from a previous commit in this area.
- A number of smaller fixes scattered throughout the ARM tree"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Fix broken commit
0cc41e4a21d43 corrupting kernel messages
ARM: fix scheduling while atomic warning in alignment handling code
ARM: VFP: fix emulation of second VFP instruction
ARM: 7656/1: uImage: Error out on build of multiplatform without LOADADDR
ARM: 7640/1: memory: tegra_ahb_enable_smmu() depends on TEGRA_IOMMU_SMMU
ARM: 7654/1: Preserve L_PTE_VALID in pte_modify()
ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock
ARM: 7651/1: remove unused smp_timer_broadcast #define
Linus Torvalds [Sun, 3 Mar 2013 18:25:47 +0000 (10:25 -0800)]
Merge tag 'char-misc-3.9-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc patch from Greg Kroah-Hartman:
"Here is one remaining patch for 3.9-rc1. It is for the hyper-v
drivers, and had to wait until some other patches went in through the
x86 tree."
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'char-misc-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Use the new infrastructure for delivering VMBUS interrupts
Linus Torvalds [Sun, 3 Mar 2013 18:24:57 +0000 (10:24 -0800)]
Merge tag 'usb-3.9-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB patch revert from Greg Kroah-Hartman:
"Here is one remaining USB patch for 3.9-rc1, it reverts a 3.8 patch
that has caused a lot of regressions for some VIA EHCI controllers."
* tag 'usb-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: EHCI: revert "remove ASS/PSS polling timeout"
Linus Torvalds [Sun, 3 Mar 2013 18:23:29 +0000 (10:23 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
"This contains:
- fixes and improvements
- devicetree bindings
- conversion to watchdog generic framework of the following drivers:
- booke_wdt
- bcm47xx_wdt.c
- at91sam9_wdt
- Removal of old STMP3xxx driver
- Addition of following new drivers:
- new driver for STMP3xxx and i.MX23/28
- Retu watchdog driver"
* git://www.linux-watchdog.org/linux-watchdog: (30 commits)
watchdog: sp805_wdt depends on ARM
watchdog: davinci_wdt: update to devm_* API
watchdog: davinci_wdt: use devm managed clk get
watchdog: at91rm9200: add DT support
watchdog: add timeout-sec property binding
watchdog: at91sam9_wdt: Convert to use the watchdog framework
watchdog: omap_wdt: Add option nowayout
watchdog: core: dt: add support for the timeout-sec dt property
watchdog: bcm47xx_wdt.c: add hard timer
watchdog: bcm47xx_wdt.c: rename wdt_time to timeout
watchdog: bcm47xx_wdt.c: rename ops methods
watchdog: bcm47xx_wdt.c: use platform device
watchdog: bcm47xx_wdt.c: convert to watchdog core api
watchdog: Convert BookE watchdog driver to watchdog infrastructure
watchdog: s3c2410_wdt: Use devm_* functions
watchdog: remove old STMP3xxx driver
watchdog: add new driver for STMP3xxx and i.MX23/28
rtc: stmp3xxx: add wdt-accessor function
watchdog: introduce retu_wdt driver
watchdog: intel_scu_watchdog: fix Kconfig dependency
...
Linus Torvalds [Sun, 3 Mar 2013 18:20:22 +0000 (10:20 -0800)]
Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull second set of slave-dmaengine updates from Vinod Koul:
"Arnd's patch moves the dw_dmac to use generic DMA binding. I agreed
to merge this late as it will avoid the conflicts between trees.
The second patch from Matt adding a dma_request_slave_channel_compat
API was supposed to be picked up, but somehow never got picked up.
Some patches dependent on this are already in -next :("
* 'next' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dw_dmac: move to generic DMA binding
dmaengine: add dma_request_slave_channel_compat()
Linus Torvalds [Sun, 3 Mar 2013 18:16:19 +0000 (10:16 -0800)]
Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver updates from Matthew Garrett:
"Mostly relatively small updates, along with some hardware enablement
for Sony hardware and a pile of updates to Google's Chromebook driver"
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86: (49 commits)
ideapad-laptop: Depend on BACKLIGHT_CLASS_DEVICE instead of selecting it
ideapad: depends on backlight subsystem and update comment
Platform: x86: chromeos_laptop - add i915 gmbuses to adapter names
Platform: x86: chromeos_laptop - Add isl light sensor for Pixel
Platform: x86: chromeos_laptop - Add a more general add_i2c_device
Platform: x86: chromeos_laptop - Add Pixel Touchscreen
Platform: x86: chromeos_laptop - Add support for probing devices
Platform: x86: chromeos_laptop - Add Pixel Trackpad
hp-wmi: fix handling of platform device
sony-laptop: leak in error handling sony_nc_lid_resume_setup()
hp-wmi: Add support for SMBus hotkeys
asus-wmi: Fix unused function build warning
acer-wmi: avoid the warning of 'devices' may be used uninitialized
drivers/platform/x86/thinkpad_acpi.c: Handle HKEY event 0x6040
Platform: x86: chromeos_laptop - Add HP Pavilion 14
Platform: x86: chromeos_laptop - Add Taos tsl2583 device
Platform: x86: chromeos_laptop - Add Taos tsl2563 device
Platform: x86: chromeos_laptop - Add Acer C7 trackpad
Platform: x86: chromeos_laptop - Rename setup_lumpy_tp to setup_cyapa_smbus_tp
asus-laptop: always report brightness key events
...
Geert Uytterhoeven [Sun, 3 Mar 2013 11:44:41 +0000 (04:44 -0700)]
btrfs/raid56: Add missing #include <linux/vmalloc.h>
tilegx_defconfig:
fs/btrfs/raid56.c: In function 'btrfs_alloc_stripe_hash_table':
fs/btrfs/raid56.c:206:3: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
fs/btrfs/raid56.c:206:9: warning: assignment makes pointer from integer without a cast [enabled by default]
fs/btrfs/raid56.c:226:4: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Linus Torvalds [Sun, 3 Mar 2013 03:33:21 +0000 (19:33 -0800)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Ted Ts'o:
"Various bug fixes for ext4. The most important is a fix for the new
extent cache's slab shrinker which can cause significant, user-visible
pauses when the system is under memory pressure."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: enable quotas before orphan cleanup
ext4: don't allow quota mount options when quota feature enabled
ext4: fix a warning from sparse check for ext4_dir_llseek
ext4: convert number of blocks to clusters properly
ext4: fix possible memory leak in ext4_remount()
jbd2: fix ERR_PTR dereference in jbd2__journal_start
ext4: use percpu counter for extent cache count
ext4: optimize ext4_es_shrink()
Linus Torvalds [Sun, 3 Mar 2013 03:32:06 +0000 (19:32 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/signal
Pull sigprocmask compat fix from Al Viro:
"generic compat_sys_rt_sigprocmask() had a very dumb braino; I'd spent
quite a while staring at the offending commit before finally managing
to spot the idiocy ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
fix compat_sys_rt_sigprocmask()
Al Viro [Sun, 3 Mar 2013 01:39:15 +0000 (20:39 -0500)]
fix compat_sys_rt_sigprocmask()
Converting bitmask to 32bit granularity is fine, but we'd better
_do_ something with the result. Such as "copy it to userland"...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sun, 3 Mar 2013 00:46:07 +0000 (16:46 -0800)]
Merge tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"We've just concluded another Connectathon interoperability testing
week, and so here are the fixes for the bugs that were discovered:
- Don't allow NFS silly-renamed files to be deleted
- Don't start the retransmission timer when out of socket space
- Fix a couple of pnfs-related Oopses.
- Fix one more NFSv4 state recovery deadlock
- Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"
* tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: One line comment fix
NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
SUNRPC: add call to get configured timeout
PNFS: set the default DS timeout to 60 seconds
NFSv4: Fix another open/open_recovery deadlock
nfs: don't allow nfs_find_actor to match inodes of the wrong type
NFSv4.1: Hold reference to layout hdr in layoutget
pnfs: fix resend_to_mds for directio
SUNRPC: Don't start the retransmission timer when out of socket space
NFS: Don't allow NFS silly-renamed files to be deleted, no signal
Linus Torvalds [Sun, 3 Mar 2013 00:41:54 +0000 (16:41 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
"The biggest feature in the pull is the new (and still experimental)
raid56 code that David Woodhouse started long ago. I'm still working
on the parity logging setup that will avoid inconsistent parity after
a crash, so this is only for testing right now. But, I'd really like
to get it out to a broader audience to hammer out any performance
issues or other problems.
scrub does not yet correct errors on raid5/6 either.
Josef has another pass at fsync performance. The big change here is
to combine waiting for metadata with waiting for data, which is a big
latency win. It is also step one toward using atomics from the
hardware during a commit.
Mark Fasheh has a new way to use btrfs send/receive to send only the
metadata changes. SUSE is using this to make snapper more efficient
at finding changes between snapshosts.
Snapshot-aware defrag is also included.
Otherwise we have a large number of fixes and cleanups. Eric Sandeen
wins the award for removing the most lines, and I'm hoping we steal
this idea from XFS over and over again."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
btrfs: fixup/remove module.h usage as required
Btrfs: delete inline extents when we find them during logging
btrfs: try harder to allocate raid56 stripe cache
Btrfs: cleanup to make the function btrfs_delalloc_reserve_metadata more logic
Btrfs: don't call btrfs_qgroup_free if just btrfs_qgroup_reserve fails
Btrfs: remove reduplicate check about root in the function btrfs_clean_quota_tree
Btrfs: return ENOMEM rather than use BUG_ON when btrfs_alloc_path fails
Btrfs: fix missing deleted items in btrfs_clean_quota_tree
btrfs: use only inline_pages from extent buffer
Btrfs: fix wrong reserved space when deleting a snapshot/subvolume
Btrfs: fix wrong reserved space in qgroup during snap/subv creation
Btrfs: remove unnecessary dget_parent/dput when creating the pending snapshot
btrfs: remove a printk from scan_one_device
Btrfs: fix NULL pointer after aborting a transaction
Btrfs: fix memory leak of log roots
Btrfs: copy everything if we've created an inline extent
btrfs: cleanup for open-coded alignment
Btrfs: do not change inode flags in rename
Btrfs: use reserved space for creating a snapshot
clear chunk_alloc flag on retryable failure
...
Linus Torvalds [Sun, 3 Mar 2013 00:33:54 +0000 (16:33 -0800)]
Merge tag 'for-linus-
20130301' of git://git.infradead.org/linux-mtd
Pull MTD update from David Woodhouse:
"Fairly unexciting MTD merge for 3.9:
- misc clean-ups in the MTD command-line partitioning parser
(cmdlinepart)
- add flash locking support for STmicro chips serial flash chips, as
well as for CFI command set 2 chips.
- new driver for the ELM error correction HW module found in various
TI chips, enable the OMAP NAND driver to use the ELM HW error
correction
- added number of new serial flash IDs
- various fixes and improvements in the gpmi NAND driver
- bcm47xx NAND driver improvements
- make the mtdpart module actually removable"
* tag 'for-linus-
20130301' of git://git.infradead.org/linux-mtd: (45 commits)
mtd: map: BUG() in non handled cases
mtd: bcm47xxnflash: use pr_fmt for module prefix in messages
mtd: davinci_nand: Use managed resources
mtd: mtd_torturetest can cause stack overflows
mtd: physmap_of: Convert device allocation to managed devm_kzalloc()
mtd: at91: atmel_nand: for PMECC, add code to check the ONFI parameter ECC requirement.
mtd: atmel_nand: make pmecc-cap, pmecc-sector-size in dts is optional.
mtd: atmel_nand: avoid to report an error when lookup table offset is 0.
mtd: bcm47xxsflash: adjust names of bus-specific functions
mtd: bcm47xxpart: improve probing of nvram partition
mtd: bcm47xxpart: add support for other erase sizes
mtd: bcm47xxnflash: register this as normal driver
mtd: bcm47xxnflash: fix message
mtd: bcm47xxsflash: register this as normal driver
mtd: bcm47xxsflash: write number of written bytes
mtd: gpmi: add sanity check for the ECC
mtd: gpmi: set the Golois Field bit for mx6q's BCH
mtd: devices: elm: Removes <xx> literals in elm DT node
mtd: gpmi: fix a dereferencing freed memory error
mtd: fix the wrong timeo for panic_nand_wait()
...
Russell King [Sun, 3 Mar 2013 00:32:50 +0000 (00:32 +0000)]
Merge branches 'devel-stable', 'fixes' and 'mmci' into for-linus
Trond Myklebust [Sat, 2 Mar 2013 23:54:11 +0000 (15:54 -0800)]
SUNRPC: One line comment fix
Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jan Kara [Sat, 2 Mar 2013 23:22:38 +0000 (18:22 -0500)]
ext4: enable quotas before orphan cleanup
When using quota feature we need to enable quotas before orphan cleanup
so that changes happening during it are properly reflected in quota
accounting.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Sat, 2 Mar 2013 22:57:08 +0000 (17:57 -0500)]
ext4: don't allow quota mount options when quota feature enabled
So far we silently ignored when quota mount options were set while quota
feature was enabled. But this can create confusion in userspace when
mount options are set but silently ignored and also creates opportunities
for bugs when we don't properly test all quota types. Actually
ext4_mark_dquot_dirty() forgets to test for quota feature so it was
dependent on journaled quota options being set. OTOH ext4_orphan_cleanup()
tries to enable journaled quota when quota options are specified which is
wrong when quota feature is enabled.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Zheng Liu [Sat, 2 Mar 2013 22:24:05 +0000 (17:24 -0500)]
ext4: fix a warning from sparse check for ext4_dir_llseek
ext4_dir_llseek is only used as a callback function, and no one calls
it directly. So make it as a static function in order to remove a
warning message from sparse check.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Lukas Czerner [Sat, 2 Mar 2013 22:18:58 +0000 (17:18 -0500)]
ext4: convert number of blocks to clusters properly
We're using macro EXT4_B2C() to convert number of blocks to number of
clusters for bigalloc file systems. However, we should be using
EXT4_NUM_B2C().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
Wei Yongjun [Sat, 2 Mar 2013 22:13:55 +0000 (17:13 -0500)]
ext4: fix possible memory leak in ext4_remount()
'orig_data' is malloced in ext4_remount() and should be freed
before leaving from the error handling cases, otherwise it will
cause memory leak.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org
Dmitry Monakhov [Sat, 2 Mar 2013 22:08:46 +0000 (17:08 -0500)]
jbd2: fix ERR_PTR dereference in jbd2__journal_start
If start_this_handle() failed handle will be initialized
to ERR_PTR() and can not be dereferenced.
paging request at
fffffffffffffff6
IP: [<
ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
PGD
200e067 PUD
200f067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
CPU 0 journal commit I/O error
Pid: 2694, comm: fio Not tainted 3.8.0-rc3+ #79 /DQ67SW
RIP: 0010:[<
ffffffff813c073f>] [<
ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
RSP: 0018:
ffff880233b8ba58 EFLAGS:
00010292
RAX:
00000000ffffffe2 RBX:
ffffffffffffffe2 RCX:
0000000000000006
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
ffffffff82128f48
RBP:
ffff880233b8ba98 R08:
0000000000000000 R09:
ffff88021440a6e0
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
James Hogan [Tue, 19 Feb 2013 13:25:46 +0000 (13:25 +0000)]
metag: Provide dma_get_sgtable()
metag/allmodconfig:
drivers/media/v4l2-core/videobuf2-dma-contig.c: In function 'vb2_dc_get_base_sgt':
drivers/media/v4l2-core/videobuf2-dma-contig.c:387: error: implicit declaration of function 'dma_get_sgtable'
For architectures using dma_map_ops, dma_get_sgtable() is provided in
<asm-generic/dma-mapping-common.h>.
Metag does not use dma_map_ops yet, hence it should implement it as an
inline stub using dma_common_get_sgtable().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
James Hogan [Wed, 20 Feb 2013 14:19:54 +0000 (14:19 +0000)]
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
Metag doesn't have a metag_dt_memblock_reserve() function so remove the
declaration from asm/prom.h.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 20 Feb 2013 13:59:13 +0000 (13:59 +0000)]
metag: copy devicetree to non-init memory
Make a copy of the device tree blob in non-init memory. It is required
when using built-in device tree files that the platform code copies the
blob to non-init memory prior to calling unflatten_device_tree(),
otherwise the strings that the device tree refer to will get poisoned
and potentially reused, breaking later reading of the device tree
post-init (such as compatible matching in modules, debugfs, and the
procfs interface).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Vineet Gupta <vgupta@synopsys.com>
James Hogan [Wed, 13 Feb 2013 13:30:15 +0000 (13:30 +0000)]
metag: cleanup metag_ksyms.c includes
Minimise metag_ksyms.c includes to directly include the <asm/*.h> files
that declare a particular symbol, and not include any unnecessary ones.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 13:19:15 +0000 (13:19 +0000)]
metag: move mm/init.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in mm/init.c into mm/init.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:57:10 +0000 (12:57 +0000)]
metag: move usercopy.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in usercopy.c into usercopy.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:55:51 +0000 (12:55 +0000)]
metag: move setup.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in setup.c into setup.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 13:01:55 +0000 (13:01 +0000)]
metag: move kick.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in kick.c into kick.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:19:03 +0000 (12:19 +0000)]
metag: move traps.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in traps.c into traps.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Tue, 12 Feb 2013 16:04:53 +0000 (16:04 +0000)]
metag: move irq enable out of irqflags.h on SMP
The SMP version of arch_local_irq_enable() uses preempt_disable(), but
<asm/irqflags.h> doesn't include <linux/preempt.h> causing the following
errors on SMP when pstore/ftrace is enabled (caught by buildbot smp
allyesconfig):
In file included from include/linux/irqflags.h:15,
from fs/pstore/ftrace.c:16:
arch/metag/include/asm/irqflags.h: In function 'arch_local_irq_enable':
arch/metag/include/asm/irqflags.h:84: error: implicit declaration of function 'preempt_disable'
arch/metag/include/asm/irqflags.h:86: error: implicit declaration of function 'preempt_enable_no_resched'
However <linux/preempt.h> cannot be easily included from
<asm/irqflags.h> as it can cause circular include dependencies in the
!SMP case, and potentially in the SMP case in the future. Therefore move
the SMP implementation of arch_local_irq_enable() into traps.c and use
an inline version of get_trigger_mask() which is also defined in traps.c
for SMP.
This adds an extra layer of function call / stack push when
preempt_disable needs to call other functions, however in the
non-preemptive SMP case it should be about as fast, as it was already
calling the get_trigger_mask() function which is now used inline.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 15:40:24 +0000 (15:40 +0000)]
genksyms: fix metag symbol prefix on crc symbols
Meta uses symbol prefixes, so add "metag" to the list of architectures
to set the mod_prefix to "_" for. This fixes __crc_* symbols to add the
extra underscore to match _CRC_SYMBOL macro in <linux/export.h> and so
that modpost finds them.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 17:28:10 +0000 (17:28 +0000)]
metag: hugetlb: convert to vm_unmapped_area()
Convert hugetlb_get_unmapped_area_new_pmd() to use vm_unmapped_area()
rather than searching the virtual address space itself. This fixes the
following errors in linux-next due to the specified members being
removed after other architectures have already been converted:
arch/metag/mm/hugetlbpage.c: In function 'hugetlb_get_unmapped_area_new_pmd':
arch/metag/mm/hugetlbpage.c:199: error: 'struct mm_struct' has no member named 'cached_hole_size'
arch/metag/mm/hugetlbpage.c:200: error: 'struct mm_struct' has no member named 'free_area_cache'
arch/metag/mm/hugetlbpage.c:215: error: 'struct mm_struct' has no member named 'cached_hole_size'
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Michel Lespinasse <walken@google.com>
James Hogan [Mon, 11 Feb 2013 11:47:02 +0000 (11:47 +0000)]
metag: export clear_page and copy_page
Various file systems use clear_page() and copy_page(), so when they're
built as modules we get build errors like the following:
ERROR: "clear_page" [fs/ntfs/ntfs.ko] undefined!
ERROR: "copy_page" [fs/nilfs2/nilfs2.ko] undefined!
Therefore export these functions to modules from metag_ksyms.c to fix
the errors. This was hit by a randconfig build.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 11:43:20 +0000 (11:43 +0000)]
metag: export metag_code_cache_flush_all
Various file systems indirectly use metag_code_cache_flush_all(), so
when they're built as modules we get build errors like the following:
ERROR: "metag_code_cache_flush_all" [fs/xfs/xfs.ko] undefined!
Therefore export this function to modules to fix the errors. This was
hit by a randconfig build.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 13:42:03 +0000 (13:42 +0000)]
metag: protect more non-MMU memory regions
Rename setup_txprivext() to setup_priv() and add initialisation of some
more per-thread privilege protection registers:
- TxPRIVSYSR: 0x04400000-0x047fffff
0x05000000-0x07ffffff
0x84000000-0x87ffffff
- TxPIOREG: 0x02000000-0x02ffffff
0x04800000-0x048fffff
- TxSYREG: 0x04000000-0x04000fff (except write fetch system event)
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 13:27:35 +0000 (13:27 +0000)]
metag: make TXPRIVEXT bits explicit
Define PRIV_BITS using explicit constants from <asm/metag_regs.h> rather
than with a hard coded value. This also adds a couple of missing
definitions for the TXPRIVEXT priv bits for protecting writes to TXTIMER
and the trace registers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>