mm: make mm_percpu_wq non freezable
authorMichal Hocko <mhocko@suse.com>
Wed, 19 Apr 2017 07:52:46 +0000 (09:52 +0200)
committerLinus Torvalds <torvalds@linux-foundation.org>
Wed, 19 Apr 2017 22:53:48 +0000 (15:53 -0700)
Geert has reported a freeze during PM resume and some additional
debugging has shown that the device_resume worker cannot make a forward
progress because it waits for an event which is stuck waiting in
drain_all_pages:

  INFO: task kworker/u4:0:5 blocked for more than 120 seconds.
        Not tainted 4.11.0-rc7-koelsch-00029-g005882e53d62f25d-dirty #3476
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/u4:0    D    0     5      2 0x00000000
  Workqueue: events_unbound async_run_entry_fn
    __schedule
    schedule
    schedule_timeout
    wait_for_common
    dpm_wait_for_superior
    device_resume
    async_resume
    async_run_entry_fn
    process_one_work
    worker_thread
    kthread
  [...]
  bash            D    0  1703   1694 0x00000000
    __schedule
    schedule
    schedule_timeout
    wait_for_common
    flush_work
    drain_all_pages
    start_isolate_page_range
    alloc_contig_range
    cma_alloc
    __alloc_from_contiguous
    cma_allocator_alloc
    __dma_alloc
    arm_dma_alloc
    sh_eth_ring_init
    sh_eth_open
    sh_eth_resume
    dpm_run_callback
    device_resume
    dpm_resume
    dpm_resume_end
    suspend_devices_and_enter
    pm_suspend
    state_store
    kernfs_fop_write
    __vfs_write
    vfs_write
    SyS_write
  [...]
  Showing busy workqueues and worker pools:
  [...]
  workqueue mm_percpu_wq: flags=0xc
    pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=0/0
      delayed: drain_local_pages_wq, vmstat_update
    pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
      delayed: drain_local_pages_wq BAR(1703), vmstat_update

Tetsuo has properly noted that mm_percpu_wq is created as WQ_FREEZABLE
so it is frozen this early during resume so we are effectively
deadlocked.  Fix this by dropping WQ_FREEZABLE when creating
mm_percpu_wq.  We really want to have it operational all the time.

Fixes: ce612879ddc7 ("mm: move pcp and lru-pcp draining into single wq")
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/vmstat.c

index 809025ed97ea0eee97573a32ba2764c63ee2dffd..5a4f5c5a31e88ee558f536d22f61f05a3fd13c45 100644 (file)
@@ -1768,8 +1768,7 @@ void __init init_mm_internals(void)
 {
        int ret __maybe_unused;
 
-       mm_percpu_wq = alloc_workqueue("mm_percpu_wq",
-                                      WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+       mm_percpu_wq = alloc_workqueue("mm_percpu_wq", WQ_MEM_RECLAIM, 0);
 
 #ifdef CONFIG_SMP
        ret = cpuhp_setup_state_nocalls(CPUHP_MM_VMSTAT_DEAD, "mm/vmstat:dead",