slub: Check for page NULL before doing the node_match check
In the -rt kernel (mrg), we hit the following dump:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
PGD a2d39067 PUD b1641067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU 3
Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992
RIP: 0010:[<ffffffff811573f1>] [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
RSP: 0018:ffff8800a9b17d70 EFLAGS: 00010213
RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000
RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500
RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd
R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500
R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000
FS: 00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000)
Stack:
ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011
0000000001200011 0000000001200011 0000000000000000 0000000000000000
00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd
Call Trace:
[<ffffffff81202e08>] ? current_has_perm+0x68/0x80
[<ffffffff81041cbd>] copy_process+0xdd/0x15b0
[<ffffffff810a2125>] ? rt_up_read+0x25/0x30
[<ffffffff8104369a>] do_fork+0x5a/0x360
[<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220
[<ffffffff8100b068>] sys_clone+0x28/0x30
[<ffffffff81527423>] stub_clone+0x13/0x20
[<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b
Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2
RIP [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
RSP <ffff8800a9b17d70>
CR2: 0000000000000000
---[ end trace 0000000000000002 ]---
Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel
with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do
disable migration. But the SLUB code is relatively lockless, and the
spin_locks there are raw_spin_locks (not converted to mutexes), thus I
believe this bug can happen in mainline without -rt features. The -rt
patch is just good at triggering mainline bugs ;-)
Anyway, looking at where this crashed, it seems that the page variable
can be NULL when passed to the node_match() function (which does not
check if it is NULL). When this happens we get the above panic.
As page is only used in slab_alloc() to check if the node matches, if
it's NULL I'm assuming that we can say it doesn't and call the
__slab_alloc() code. Is this a correct assumption?
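To make the idea concrete, here is a minimal userspace sketch of the check ordering described above. It is not the kernel code: struct page is reduced to a single hypothetical nid field, node_match() is a simplified stand-in that dereferences page without testing it, and can_use_fastpath() is an illustrative name that does not exist in SLUB. It only shows that testing page for NULL first keeps node_match() from ever seeing a NULL pointer, and that a NULL page is treated as "no match" so the caller would take the slow path.

```c
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Simplified stand-in for the kernel's struct page (assumption):
 * only the NUMA node id is modeled. */
struct page {
	int nid;
};

/* Stand-in for node_match(): like the function described above,
 * it dereferences page without checking it for NULL. */
static int node_match(const struct page *page, int node)
{
	if (node != NUMA_NO_NODE && page->nid != node)
		return 0;
	return 1;
}

/* Hypothetical helper: returns 1 if the fast path may be used,
 * 0 if the caller should fall back to the slow path
 * (__slab_alloc() in the real code). */
static int can_use_fastpath(const void *object, const struct page *page,
			    int node)
{
	/* !page is evaluated before node_match(), so short-circuiting
	 * guarantees a NULL page never reaches the dereference. */
	if (!object || !page || !node_match(page, node))
		return 0;
	return 1;
}
```

With this ordering, a NULL page behaves the same as a node mismatch: the fast path is skipped rather than crashing.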
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>