RDMA/iwcm: Fix hang in uninterruptible wait on cm_id destroy
author Animesh K Trivedi <ATR@zurich.ibm.com>
Tue, 28 Sep 2010 14:44:02 +0000 (14:44 +0000)
committer Roland Dreier <rolandd@cisco.com>
Tue, 12 Oct 2010 03:24:04 +0000 (20:24 -0700)
A process can get stuck in an uninterruptible wait in the
kernel while destroying a cm_id after iw_cm_connect() fails.

For example, when creation of a PD fails but the user goes on to
connect to the server without checking the return value,
iw_cm_connect() finds a NULL qp and fails.  However, the
IWCM_F_CONNECT_WAIT bit is not cleared on that error path, so
destroy_cm_id() then waits forever for the bit to be cleared.

The same problem exists on the passive side in iw_cm_accept().

Fix this by clearing the bit and waking up any waiters on the
NULL-qp error paths of iw_cm_accept() and iw_cm_connect().
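
The failure mode can be illustrated outside the kernel with a small single-threaded model. This is only a sketch under stated assumptions: fake_cm_id, FAKE_F_CONNECT_WAIT, fake_connect() and fake_destroy_would_block() are hypothetical names invented for illustration, not part of iwcm.c, and a counter stands in for wake_up_all(). The with_fix parameter exists only to contrast the old and new error paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the IWCM_F_CONNECT_WAIT bit. */
#define FAKE_F_CONNECT_WAIT (1UL << 0)

/* Hypothetical, heavily reduced model of the private cm_id state. */
struct fake_cm_id {
	unsigned long flags;
	unsigned int wakeups;	/* counts wake-ups; models wake_up_all() */
};

/*
 * Models the connect/accept entry path: the wait bit is set first,
 * then the qp lookup result is checked. With with_fix == false the
 * error path leaves the bit set, as before this patch.
 */
int fake_connect(struct fake_cm_id *id, const void *qp, bool with_fix)
{
	assert(id != NULL);

	id->flags |= FAKE_F_CONNECT_WAIT;

	if (qp == NULL) {
		if (with_fix) {
			/* Clear the bit and wake waiters before failing. */
			id->flags &= ~FAKE_F_CONNECT_WAIT;
			id->wakeups++;
		}
		return -EINVAL;
	}
	return 0;
}

/*
 * Models the condition the destroy path waits on: it cannot make
 * progress while the wait bit is still set.
 */
bool fake_destroy_would_block(const struct fake_cm_id *id)
{
	assert(id != NULL);
	return (id->flags & FAKE_F_CONNECT_WAIT) != 0;
}
```

Calling fake_connect() with a NULL qp and with_fix set to false leaves fake_destroy_would_block() returning true indefinitely, which corresponds to the hang described above. With with_fix set to true, the bit is cleared and the destroy path can proceed.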

Signed-off-by: Animesh Trivedi <atr@zurich.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
drivers/infiniband/core/iwcm.c

index bfead5bc25f6e14efcc09c4e5be51050785acb53..2a1e9ae134b4c330bbffd048d6ce037e75652d45 100644
@@ -506,6 +506,8 @@ int iw_cm_accept(struct iw_cm_id *cm_id,
        qp = cm_id->device->iwcm->get_qp(cm_id->device, iw_param->qpn);
        if (!qp) {
                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+               clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
+               wake_up_all(&cm_id_priv->connect_wait);
                return -EINVAL;
        }
        cm_id->device->iwcm->add_ref(qp);
@@ -565,6 +567,8 @@ int iw_cm_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *iw_param)
        qp = cm_id->device->iwcm->get_qp(cm_id->device, iw_param->qpn);
        if (!qp) {
                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+               clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
+               wake_up_all(&cm_id_priv->connect_wait);
                return -EINVAL;
        }
        cm_id->device->iwcm->add_ref(qp);