Until now, while creating a subscription the subscriber lock
protects only the subscribers subscription list and not the
nametable. The call to tipc_nametbl_subscribe() is outside
the lock. However, at subscription timeout and cancel both
the subscribers subscription list and the nametable are
protected by the subscriber lock.
This asymmetric locking mechanism leads to the following problem:
In a SMP system, the timer can be fire on another core before
the create request is complete.
When the timer thread calls tipc_nametbl_unsubscribe() before create
thread calls tipc_nametbl_subscribe(), we get a nullptr exception.
This can be simulated by creating subscription with timeout=0 and
sometimes the timeout occurs before the create request is complete.
The following is the oops:
[57.569661] BUG: unable to handle kernel NULL pointer dereference at (null)
[57.577498] IP: [<
ffffffffa02135aa>] tipc_nametbl_unsubscribe+0x8a/0x120 [tipc]
[57.584820] PGD 0
[57.586834] Oops: 0002 [#1] SMP
[57.685506] CPU: 14 PID: 10077 Comm: kworker/u40:1 Tainted: P OENX 3.12.48-52.27.1. 9688.1.PTF-default #1
[57.703637] Workqueue: tipc_rcv tipc_recv_work [tipc]
[57.708697] task:
ffff88064c7f00c0 ti:
ffff880629ef4000 task.ti:
ffff880629ef4000
[57.716181] RIP: 0010:[<
ffffffffa02135aa>] [<
ffffffffa02135aa>] tipc_nametbl_unsubscribe+0x8a/ 0x120 [tipc]
[...]
[57.812327] Call Trace:
[57.814806] [<
ffffffffa0211c77>] tipc_subscrp_delete+0x37/0x90 [tipc]
[57.821357] [<
ffffffffa0211e2f>] tipc_subscrp_timeout+0x3f/0x70 [tipc]
[57.827982] [<
ffffffff810618c1>] call_timer_fn+0x31/0x100
[57.833490] [<
ffffffff81062709>] run_timer_softirq+0x1f9/0x2b0
[57.839414] [<
ffffffff8105a795>] __do_softirq+0xe5/0x230
[57.844827] [<
ffffffff81520d1c>] call_softirq+0x1c/0x30
[57.850150] [<
ffffffff81004665>] do_softirq+0x55/0x90
[57.855285] [<
ffffffff8105aa35>] irq_exit+0x95/0xa0
[57.860290] [<
ffffffff815215b5>] smp_apic_timer_interrupt+0x45/0x60
[57.866644] [<
ffffffff8152005d>] apic_timer_interrupt+0x6d/0x80
[57.872686] [<
ffffffffa02121c5>] tipc_subscrb_rcv_cb+0x2a5/0x3f0 [tipc]
[57.879425] [<
ffffffffa021c65f>] tipc_receive_from_sock+0x9f/0x100 [tipc]
[57.886324] [<
ffffffffa021c826>] tipc_recv_work+0x26/0x60 [tipc]
[57.892463] [<
ffffffff8106fb22>] process_one_work+0x172/0x420
[57.898309] [<
ffffffff8107079a>] worker_thread+0x11a/0x3c0
[57.903871] [<
ffffffff81077114>] kthread+0xb4/0xc0
[57.908751] [<
ffffffff8151f318>] ret_from_fork+0x58/0x90
In this commit, we do the following at subscription creation:
1. set the subscription's subscriber pointer before performing
tipc_nametbl_subscribe(), as this value is required further in
the call chain ex: by tipc_subscrp_send_event().
2. move tipc_nametbl_subscribe() under the scope of subscriber lock
Acked-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>