Bound workqueues might be too restrictive since they allow
only a single core per session for processing completions.
WQ_UNBOUND will allow bouncing to another CPU if the running
CPU is currently busy. Luckily, our workqueues are NUMA aware
and will first try to bounce within the same NUMA socket.
My measurements with NULL backend devices show that there is
no (noticeable) additional latency as a result of the change.
I'd expect even to gain performance when working with fast
devices that also allocate MSIX interrupt vectors.
While we're at it, make it WQ_HIGHPRI since processing
completions is really a high priority for performance.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Moussa Ba <moussaba@micron.com>
Signed-off-by: Moussa Ba <moussaba@micron.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
{
int ret;
- isert_comp_wq = alloc_workqueue("isert_comp_wq", 0, 0);
+ isert_comp_wq = alloc_workqueue("isert_comp_wq",
+ WQ_UNBOUND | WQ_HIGHPRI, 0);
if (!isert_comp_wq) {
isert_err("Unable to allocate isert_comp_wq\n");
ret = -ENOMEM;