IB/hfi1: Prevent NULL pointer deferences in caching code
There is a potential kernel crash when the MMU notifier calls the
invalidation routines in the hfi1 pinned page caching code for sdma.
The invalidation routine could call the remove callback
for the node, which in turn ends up dereferencing the
current task_struct to get a pointer to the mm_struct.
However, the mm_struct pointer could be NULL resulting in
the following backtrace:
BUG: unable to handle kernel NULL pointer dereference at
00000000000000a8
IP: [<
ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
15
task:
ffff88085e66e080 ti:
ffff88085c244000 task.ti:
ffff88085c244000
RIP: 0010:[<
ffffffffa041f75a>] [<
ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
RSP: 0000:
ffff88085c245878 EFLAGS:
00010002
RAX:
0000000000000000 RBX:
ffff88105b9bbd40 RCX:
ffffea003931a830
RDX:
0000000000000004 RSI:
ffff88105754a9c0 RDI:
ffff88105754a9c0
RBP:
ffff88085c245890 R08:
ffff88105b9bbd70 R09:
00000000fffffffb
R10:
ffff88105b9bbd58 R11:
0000000000000013 R12:
ffff88105754a9c0
R13:
0000000000000001 R14:
0000000000000001 R15:
ffff88105b9bbd40
FS:
0000000000000000(0000) GS:
ffff88107ef40000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000000a8 CR3:
0000000001a0b000 CR4:
00000000001407e0
Stack:
ffff88105b9bbd40 ffff88080ec481a8 ffff88080ec481b8 ffff88085c2458c0
ffffffffa03fa00e ffff88080ec48190 ffff88080ed9cd00 0000000001024000
0000000000000000 ffff88085c245920 ffffffffa03fa0e7 0000000000000282
Call Trace:
[<
ffffffffa03fa00e>] __mmu_rb_remove.isra.5+0x5e/0x70 [hfi1]
[<
ffffffffa03fa0e7>] mmu_notifier_mem_invalidate+0xc7/0xf0 [hfi1]
[<
ffffffffa03fa143>] mmu_notifier_page+0x13/0x20 [hfi1]
[<
ffffffff81156dd0>] __mmu_notifier_invalidate_page+0x50/0x70
[<
ffffffff81140bbb>] try_to_unmap_one+0x20b/0x470
[<
ffffffff81141ee7>] try_to_unmap_anon+0xa7/0x120
[<
ffffffff81141fad>] try_to_unmap+0x4d/0x60
[<
ffffffff8111fd7b>] shrink_page_list+0x2eb/0x9d0
[<
ffffffff81120ab3>] shrink_inactive_list+0x243/0x490
[<
ffffffff81121491>] shrink_lruvec+0x4c1/0x640
[<
ffffffff81121641>] shrink_zone+0x31/0x100
[<
ffffffff81121b0f>] kswapd_shrink_zone.constprop.62+0xef/0x1c0
[<
ffffffff811229e3>] kswapd+0x403/0x7e0
[<
ffffffff811225e0>] ? shrink_all_memory+0xf0/0xf0
[<
ffffffff81068ac0>] kthread+0xc0/0xd0
[<
ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
[<
ffffffff814ff8ec>] ret_from_fork+0x7c/0xb0
[<
ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
To correct this, the mm_struct passed to us by the MMU notifier is
used (which is what should have been done to begin with). This avoids
the broken derefences and ensures that the correct mm_struct is used.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>