The path_rec_completion() callback may be invoked asynchronously, even
in the middle of the driver uninit process. This can lead to scheduling
a task that touches members of the priv object that are no longer
valid. For example, ipoib_cm_create_tx_qp() can attempt to create a QP
when priv->pd is no longer valid.

The following crash is one of the results:

 RIP: 0010:[<ffffffffa021bb47>] [<ffffffffa021bb47>] ipoib_cm_create_tx_qp+0x57/0x90 [ib_ipoib]
 Process ipoib (pid: 5916, threadinfo ffff8803786e4000, task ffff8804150e1500)
 Stack:
 Call Trace:
 [<ffffffff81309ef0>] ? get_random_bytes+0x20/0x30
 [<ffffffffa021be2a>] ipoib_cm_tx_init+0xca/0x340 [ib_ipoib]
 [<ffffffffa021f765>] ipoib_cm_tx_start+0x215/0x3f0 [ib_ipoib]
 [<ffffffffa021f550>] ? ipoib_cm_tx_start+0x0/0x3f0 [ib_ipoib]
 [<ffffffff8108b2b0>] worker_thread+0x170/0x2a0
 [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
 [<ffffffff81090886>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffff810907f0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20

Fix this by flushing all pending path queries at the start of the
ib_dev cleanup, before the multicast thread is stopped.

Signed-off-by: Alex Markuze <markuze@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
struct ipoib_dev_priv *priv = netdev_priv(dev);
ipoib_dbg(priv, "cleaning up ib_dev\n");
+ /*
+ * We must make sure there are no more (path) completions
+ * that may wish to touch priv fields that are no longer valid
+ */
+ ipoib_flush_paths(dev);
ipoib_mcast_stop_thread(dev, 1);
ipoib_mcast_dev_flush(dev);
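
The ordering that the hunk above enforces can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name in it (fake_priv, fake_path_completion, fake_flush_paths, fake_deliver_completions, fake_cleanup) is hypothetical and does not exist in the ipoib driver, and the "flush" is modelled as simply dropping the outstanding queries rather than cancelling real SA queries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private state; not the real ipoib_dev_priv. */
struct fake_priv {
    int *pd;              /* models priv->pd; NULL once torn down */
    int pending_queries;  /* path queries whose completion has not fired yet */
    int invalid_accesses; /* completions that ran after pd was gone */
};

/* Models a path completion that needs a valid pd. */
static void fake_path_completion(struct fake_priv *p)
{
    assert(p != NULL);
    if (p->pd == NULL)
        p->invalid_accesses++;   /* a real driver could crash at this point */
    else
        (*p->pd)++;
}

/* Models the flush: drop every outstanding query so none can complete later. */
static void fake_flush_paths(struct fake_priv *p)
{
    p->pending_queries = 0;
}

/* Models asynchronous completions arriving at some later time. */
static void fake_deliver_completions(struct fake_priv *p)
{
    while (p->pending_queries > 0) {
        p->pending_queries--;
        fake_path_completion(p);
    }
}

/* Tears the state down and returns how many completions touched invalid state. */
static int fake_cleanup(int pending, int flush_first)
{
    int pd_storage = 0;
    struct fake_priv p = { &pd_storage, pending, 0 };

    if (flush_first)
        fake_flush_paths(&p);
    p.pd = NULL;                   /* resources released */
    fake_deliver_completions(&p);  /* late completions, if any remain */
    return p.invalid_accesses;
}
```

In this model, fake_cleanup(2, 0) reports two invalid accesses because both completions run after pd is released, while fake_cleanup(2, 1) reports none because the flush happens first.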