Testing with CONFIG_SLUB_DEBUG_ON=y resulted in the kernel panic below.

The panic occurs because the mm_struct is sometimes freed before
hfi1_file_close is called.

Two factors combine to cause this:

1) hfi1_file_close is deferred during process exit, so it may not be
   called synchronously with process exit.
2) exit_mm is called prior to exit_files in do_exit. Normally this is
   fine; however, our kernel bypass code needs access to the mm_struct
   for housekeeping both at "normal" close time and at process exit.

Therefore, the fix is to hold a reference to the mm_struct until we are
done with it.
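
The ordering problem and the fix can be illustrated with a small
userspace model. This is only a sketch: mock_mm, file_data and every
helper below are hypothetical stand-ins, not the hfi1 driver or the
kernel mm API. In the kernel, the analogous operations would presumably
be taking an mm_count reference at open time and dropping it with
mmdrop() at close time; that mapping is an assumption, since the patch
itself is not reproduced here.

```c
/*
 * Userspace model only. mock_mm, file_data and all helpers are
 * hypothetical names used for illustration, not kernel or hfi1 code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct mock_mm {
	int refcount;	/* stands in for the mm_struct reference count */
	bool *freed;	/* set to true when the last reference is dropped */
};

struct file_data {
	struct mock_mm *mm;	/* reference held from open until close */
};

static struct mock_mm *mock_mm_create(bool *freed_flag)
{
	struct mock_mm *mm;

	*freed_flag = false;
	mm = malloc(sizeof(*mm));
	if (!mm)
		return NULL;
	mm->refcount = 1;	/* the reference owned by the process */
	mm->freed = freed_flag;
	return mm;
}

static void mock_mm_get(struct mock_mm *mm)
{
	mm->refcount++;
}

static void mock_mm_put(struct mock_mm *mm)
{
	if (--mm->refcount == 0) {
		*mm->freed = true;
		free(mm);
	}
}

/* Open: take our own reference so the mm outlives process exit. */
static void file_open(struct file_data *fd, struct mock_mm *mm)
{
	mock_mm_get(mm);
	fd->mm = mm;
}

/* Close: housekeeping may touch the mm, then drop our reference. */
static void file_close(struct file_data *fd)
{
	assert(fd->mm != NULL && fd->mm->refcount > 0);
	mock_mm_put(fd->mm);
	fd->mm = NULL;
}

/*
 * Models exit_mm running before the deferred close. Returns 1 if the
 * mm was still alive when close ran and was freed afterwards, 0 if
 * not, and -1 on allocation failure.
 */
int simulate_exit_then_deferred_close(void)
{
	bool freed;
	bool alive_before_close;
	struct file_data fd;
	struct mock_mm *mm = mock_mm_create(&freed);

	if (!mm)
		return -1;
	file_open(&fd, mm);
	mock_mm_put(mm);		/* process exit drops its reference first */
	alive_before_close = !freed;
	file_close(&fd);		/* deferred close runs afterwards */
	return alive_before_close && freed;
}
```

Without the extra reference taken in file_open, the mock_mm_put made at
process exit would free the structure and file_close would touch freed
memory, which is the pattern behind the panic below.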
[ 3006.340150] general protection fault: 0000 [#1] SMP
[ 3006.346469] Modules linked in: hfi1 rdmavt rpcrdma ib_isert
iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm
ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log
dm_mod snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic
iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal
intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel lrw snd_hda_intel gf128mul snd_hda_codec
glue_helper snd_hda_core ablk_helper snd_hwdep cryptd snd_seq
snd_seq_device snd_pcm snd_timer snd soundcore pcspkr shpchp mei_me sg
lpc_ich mei i2c_i801 mfd_core ioatdma ipmi_devintf wmi ipmi_si
ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
ip_tables ext4 jbd2 mbcache mlx4_en ib_core sr_mod sd_mod cdrom
crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect igb
sysimgblt fb_sys_fops ptp mlx4_core ttm isci pps_core ahci drm libsas
libahci dca firewire_ohci i2c_algo_bit scsi_transport_sas firewire_core
crc_itu_t i2c_core libata [last unloaded: mlx4_ib]
[ 3006.461759] CPU: 16 PID: 11624 Comm: mpi_stress Not tainted 4.7.0-rc5+ #1
[ 3006.469915] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 3006.483027] task: ffff8804102f0040 ti: ffff8804102f8000 task.ti: ffff8804102f8000
[ 3006.491971] RIP: 0010:[<ffffffff810f0383>] [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
[ 3006.501905] RSP: 0018:ffff8804102fb908 EFLAGS: 00010002
[ 3006.508447] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
[ 3006.517012] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880410b56a40
[ 3006.525569] RBP: ffff8804102fb9b0 R08: 0000000000000001 R09: 0000000000000000
[ 3006.534119] R10: ffff8804102f0040 R11: 0000000000000000 R12: 0000000000000000
[ 3006.542664] R13: ffff880410b56a40 R14: 0000000000000000 R15: 0000000000000000
[ 3006.551203] FS: 00007ff478c08700(0000) GS:ffff88042e200000(0000) knlGS:0000000000000000
[ 3006.560814] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3006.567806] CR2: 00007f667f5109e0 CR3: 0000000001c06000 CR4: 00000000000406e0
[ 3006.576352] Stack:
[ 3006.579157] ffffffff8124b819 ffffffffffffffff 0000000000000000 ffff8804102fb940
[ 3006.588072] 0000000000000002 0000000000000000 ffff8804102f0040 0000000000000007
[ 3006.596971] 0000000000000006 ffff8803cad6f000 0000000000000000 ffff8804102f0040
[ 3006.605878] Call Trace:
[ 3006.609220] [<ffffffff8124b819>] ? uncharge_batch+0x109/0x250
[ 3006.616382] [<ffffffff810f2313>] lock_acquire+0xd3/0x220
[ 3006.623056] [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.631593] [<ffffffff81775579>] down_write+0x49/0x80
[ 3006.638022] [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.646569] [<ffffffffa0a30bfc>] hfi1_release_user_pages+0x7c/0xa0 [hfi1]
[ 3006.654898] [<ffffffffa0a2efb6>] cacheless_tid_rb_remove+0x106/0x330 [hfi1]
[ 3006.663417] [<ffffffff810efd36>] ? mark_held_locks+0x66/0x90
[ 3006.670498] [<ffffffff817771f6>] ? _raw_spin_unlock_irqrestore+0x36/0x60
[ 3006.678741] [<ffffffffa0a2f1ee>] tid_rb_remove+0xe/0x10 [hfi1]
[ 3006.686010] [<ffffffffa0a0c5d5>] hfi1_mmu_rb_unregister+0xc5/0x100 [hfi1]
[ 3006.694387] [<ffffffffa0a2fcb9>] hfi1_user_exp_rcv_free+0x39/0x120 [hfi1]
[ 3006.702732] [<ffffffffa09fc6ea>] hfi1_file_close+0x17a/0x330 [hfi1]
[ 3006.710489] [<ffffffff81263e9a>] __fput+0xfa/0x230
[ 3006.716595] [<ffffffff8126400e>] ____fput+0xe/0x10
[ 3006.722696] [<ffffffff810b95c6>] task_work_run+0x86/0xc0
[ 3006.729379] [<ffffffff81099933>] do_exit+0x323/0xc40
[ 3006.735672] [<ffffffff8109a2dc>] do_group_exit+0x4c/0xc0
[ 3006.742371] [<ffffffff810a7f55>] get_signal+0x345/0x940
[ 3006.748958] [<ffffffff810340c7>] do_signal+0x37/0x700
[ 3006.755328] [<ffffffff8127872a>] ? poll_select_set_timeout+0x5a/0x90
[ 3006.763146] [<ffffffff811609cb>] ? __audit_syscall_exit+0x1db/0x260
[ 3006.770853] [<ffffffff8110f3e3>] ? rcu_read_lock_sched_held+0x93/0xa0
[ 3006.778765] [<ffffffff812347a4>] ? kfree+0x1e4/0x2a0
[ 3006.784986] [<ffffffff8108e75a>] ? exit_to_usermode_loop+0x33/0xac
[ 3006.792551] [<ffffffff8108e785>] exit_to_usermode_loop+0x5e/0xac
[ 3006.799907] [<ffffffff81003dca>] do_syscall_64+0x12a/0x190
[ 3006.806664] [<ffffffff81777a7f>] entry_SYSCALL64_slow_path+0x25/0x25
[ 3006.814396] Code: 24 08 44 89 44 24 10 89 4c 24 18 e8 a8 d8 ff ff 48 85 c0
8b 4c 24 18 44 8b 44 24 10 44 8b 4c 24 08 4c 8b 14 24 0f 84 30
08 00 00 <f0> ff 80 98 01 00 00 8b 3d 48 ad be 01 45 8b a2 90 0b 00 00 85
[ 3006.837158] RIP [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
[ 3006.844401] RSP <ffff8804102fb908>
[ 3006.851170] ---[ end trace b7b9f21cf06c27df ]---
[ 3006.927420] Kernel panic - not syncing: Fatal exception
[ 3006.933954] Kernel Offset: disabled
[ 3006.940961] ---[ end Kernel panic - not syncing: Fatal exception
[ 3006.948249] ------------[ cut here ]------------

Fixes: 3faa3d9a308e ("IB/hfi1: Make use of mm consistent")
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>