We do not need the sglist after calling virtqueue_add_buf. Hence we
can "pipeline" the locked operations and start preparing the sglist
for the next request while we kick the virtqueue.
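
As a rough userspace illustration of this lock handoff (not the driver
code), the sketch below uses a toy spinlock built on C11 atomic_flag in
place of the kernel's spinlock_t and does not model interrupt disabling.
Every name in it (demo_dev, demo_submit, DEMO_QUEUE_LEN) is hypothetical.
It drops the sglist lock as soon as the request has been copied into the
queue, while the queue lock is still held for the kick preparation.

```c
#include <stdatomic.h>

#define DEMO_QUEUE_LEN 4

/* Toy spinlock; a userspace stand-in for the kernel's spinlock_t. */
struct demo_lock { atomic_flag held; };

static void demo_lock_init(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

static void demo_lock_acquire(struct demo_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin until the current holder releases */
}

static void demo_lock_release(struct demo_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical device state: one shared "sglist" slot and one queue. */
struct demo_dev {
	struct demo_lock sg_lock;	/* protects sg */
	struct demo_lock vq_lock;	/* protects queue, queued, kicks */
	int sg;				/* stand-in for the shared sglist */
	int queue[DEMO_QUEUE_LEN];
	int queued;
	int kicks;
};

static void demo_init(struct demo_dev *d)
{
	demo_lock_init(&d->sg_lock);
	demo_lock_init(&d->vq_lock);
	d->sg = 0;
	d->queued = 0;
	d->kicks = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int demo_submit(struct demo_dev *d, int req)
{
	int ret = -1;

	demo_lock_acquire(&d->sg_lock);
	d->sg = req;			/* "prepare the sglist" */

	demo_lock_acquire(&d->vq_lock);
	if (d->queued < DEMO_QUEUE_LEN) {
		d->queue[d->queued++] = d->sg;	/* "add_buf": sg is consumed here */
		ret = 0;
	}
	/* sg is no longer needed, so the next submitter may start preparing. */
	demo_lock_release(&d->sg_lock);

	if (ret == 0)
		d->kicks++;		/* "kick prepare", still under vq_lock */
	demo_lock_release(&d->vq_lock);

	return ret;
}
```

With this ordering a second caller can take sg_lock and begin preparing
its request while the first caller still holds vq_lock.
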
Together with the previous two patches, this improves performance as
follows. For a simple "dd if=/dev/sda of=/dev/null bs=128M iflag=direct"
(the source being a 10G disk, residing entirely in the host buffer cache),
the additional locking does not cause any penalty with only one dd
process, but two simultaneous dd processes improve their times by
roughly 4%:
             number of simultaneous dd
                 1            2
----------------------------------------
current       5.9958s      10.2640s
patched       5.9531s       9.8663s
(Times are best of 10).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 	spin_lock(&vq->vq_lock);
 	ret = virtqueue_add_buf(vq->vq, vscsi->sg, out_num, in_num, cmd, gfp);
+	spin_unlock(&vscsi->sg_lock);
 	if (ret >= 0)
 		ret = virtqueue_kick_prepare(vq->vq);
-	spin_unlock(&vq->vq_lock);
-	spin_unlock_irqrestore(&vscsi->sg_lock, flags);
+	spin_unlock_irqrestore(&vq->vq_lock, flags);
 	if (ret > 0)
 		virtqueue_notify(vq->vq);