During memory hotplug, __build_all_zonelists() may be called under
stop_machine_run(), and it in turn calls setup_zone_pageset(). This is a
bug, because setup_zone_pageset() allocates memory, which can sleep, and
sleeping is not allowed under stop_machine_run().

Here is a report from Alok Kataria:

BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 4, name: migration/0
Pid: 4, comm: migration/0 Not tainted 2.6.35.6-45.fc14.x86_64 #1
Call Trace:
[<ffffffff8103d12b>] __might_sleep+0xeb/0xf0
[<ffffffff81468245>] mutex_lock+0x24/0x50
[<ffffffff8110eaa6>] pcpu_alloc+0x6d/0x7ee
[<ffffffff81048888>] ? load_balance+0xbe/0x60e
[<ffffffff8103a1b3>] ? rt_se_boosted+0x21/0x2f
[<ffffffff8103e1cf>] ? dequeue_rt_stack+0x18b/0x1ed
[<ffffffff8110f237>] __alloc_percpu+0x10/0x12
[<ffffffff81465e22>] setup_zone_pageset+0x38/0xbe
[<ffffffff810d6d81>] ? build_zonelists_node.clone.58+0x79/0x8c
[<ffffffff81452539>] __build_all_zonelists+0x419/0x46c
[<ffffffff8108ef01>] ? cpu_stopper_thread+0xb2/0x198
[<ffffffff8108f075>] stop_machine_cpu_stop+0x8e/0xc5
[<ffffffff8108efe7>] ? stop_machine_cpu_stop+0x0/0xc5
[<ffffffff8108ef57>] cpu_stopper_thread+0x108/0x198
[<ffffffff81467a37>] ? schedule+0x5b2/0x5cc
[<ffffffff8108ee4f>] ? cpu_stopper_thread+0x0/0x198
[<ffffffff81065f29>] kthread+0x7f/0x87
[<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
[<ffffffff81065eaa>] ? kthread+0x0/0x87
[<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Built 5 zonelists in Node order, mobility grouping on. Total pages: 289456
Policy zone: Normal

This patch fixes the issue by moving the setup_zone_pageset() call out of
stop_machine_run(). The call does not need to run under
stop_machine_run(), so it is now made just before it.
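
For illustration only, the general pattern (do anything that may sleep
before entering the no-sleep section, then pass nothing that needs
allocation into it) can be sketched in userspace C. This is not kernel
code: in_atomic_section, may_sleep_alloc(), run_stopped(),
rebuild_callback(), hotplug_rebuild() and struct zone_model are all
hypothetical stand-ins for the stop_machine context, alloc_percpu(),
stop_machine(), __build_all_zonelists(), build_all_zonelists() and
struct zone respectively.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag modelling "all CPUs stopped, IRQs off". */
static bool in_atomic_section;

/* Counts how often the modelled zonelist rebuild ran. */
static int zonelists_built;

/* Hypothetical model of a zone that needs a per-cpu pageset. */
struct zone_model {
	int *pageset;
};

/* Stand-in for an allocator that may sleep (e.g. it takes a mutex). */
static void *may_sleep_alloc(size_t size)
{
	if (in_atomic_section) {
		fprintf(stderr, "BUG: sleeping allocation in atomic section\n");
		abort();
	}
	return malloc(size);
}

/* Stand-in for the callback run with the machine stopped: no allocation. */
static int rebuild_callback(void *data)
{
	(void)data;		/* the callback no longer needs the zone */
	zonelists_built++;
	return 0;
}

/* Stand-in for stop_machine(): runs fn with sleeping forbidden. */
static int run_stopped(int (*fn)(void *), void *data)
{
	int ret;

	in_atomic_section = true;
	ret = fn(data);
	in_atomic_section = false;
	return ret;
}

/* Allocate first, while sleeping is still allowed, then stop the machine. */
static int hotplug_rebuild(struct zone_model *zone)
{
	if (zone) {
		zone->pageset = may_sleep_alloc(sizeof(*zone->pageset));
		if (!zone->pageset)
			return -1;
	}
	return run_stopped(rebuild_callback, NULL);
}
```

Calling may_sleep_alloc() from inside rebuild_callback() would abort in
this model, which mirrors the __might_sleep() warning in the report
above.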
[akpm@linux-foundation.org: remove unneeded local]
Reported-by: Alok Kataria <akataria@vmware.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Petr Vandrovec <petr@vmware.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
build_zonelist_cache(pgdat);
}
-#ifdef CONFIG_MEMORY_HOTPLUG
- /* Setup real pagesets for the new zone */
- if (data) {
- struct zone *zone = data;
- setup_zone_pageset(zone);
- }
-#endif
-
/*
* Initialize the boot_pagesets that are going to be used
* for bootstrapping processors. The real pagesets for
} else {
/* we have to stop all cpus to guarantee there is no user
of zonelist */
- stop_machine(__build_all_zonelists, data, NULL);
+#ifdef CONFIG_MEMORY_HOTPLUG
+ if (data)
+ setup_zone_pageset((struct zone *)data);
+#endif
+ stop_machine(__build_all_zonelists, NULL, NULL);
/* cpuset refresh routine should be here */
}
vm_total_pages = nr_free_pagecache_pages();