mm,compaction: serialize waitqueue_active() checks
author Davidlohr Bueso <dave@stgolabs.net>
Wed, 22 Feb 2017 23:44:55 +0000 (15:44 -0800)
committer Linus Torvalds <torvalds@linux-foundation.org>
Thu, 23 Feb 2017 00:41:29 +0000 (16:41 -0800)
Without a memory barrier, the following race can occur with a high-order
allocation:

wakeup_kcompactd(order == 1)              kcompactd()
  [L] waitqueue_active(kcompactd_wait)
                                            [S] prepare_to_wait_event(kcompactd_wait)
                                            [L] (kcompactd_max_order == 0)
  [S] kcompactd_max_order = order;          schedule()

Here the waitqueue_active() check is speculatively reordered to before
the store that sets the actual condition (max_order), so it does not see
the thread that is about to block, and we miss a wakeup.  There are a
couple of options to fix this, including calling wq_has_sleeper(), which
adds a full barrier, or unconditionally doing the wake_up_interruptible()
and serializing on the q->lock.  However, to make use of the control
dependency, we just need to add L->L (load->load) guarantees.
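
The waker-side ordering described above can be approximated in userspace
with C11 atomics.  This is only a sketch under stated assumptions, not the
kernel code: struct node_state, has_waiter and waker_should_wake() are
hypothetical names, and an acquire fence stands in for
smp_acquire__after_ctrl_dep().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-node kcompactd state. */
struct node_state {
	atomic_int  max_order;   /* models pgdat->kcompactd_max_order */
	atomic_bool has_waiter;  /* models waitqueue_active(kcompactd_wait) */
};

/* Returns true when the waker would go on to issue the wakeup. */
static bool waker_should_wake(struct node_state *n, int order)
{
	/* [L] then conditional [S]: the store is control-dependent on the load. */
	if (atomic_load_explicit(&n->max_order, memory_order_relaxed) < order)
		atomic_store_explicit(&n->max_order, order, memory_order_relaxed);

	/*
	 * Assumed userspace analogue of smp_acquire__after_ctrl_dep():
	 * orders the load above before the loads that follow (L->L).
	 */
	atomic_thread_fence(memory_order_acquire);

	/* [L] the lockless "is anybody waiting?" check. */
	return atomic_load_explicit(&n->has_waiter, memory_order_relaxed);
}
```

With has_waiter set, waker_should_wake(&n, 1) raises max_order from 0 to 1
and reports that a wakeup is due; a later call with a lower order leaves
max_order untouched.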

While this bug is theoretical, the lockless waitqueue_active() has been
misused elsewhere in the past; the hazard is also documented in the
comment above the call itself.
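
For comparison, the full-barrier pairing that the waitqueue_active()
documentation describes can be sketched the same way.  Again this is an
illustrative userspace model, not kernel code: struct wait_state,
publish_and_check() and waiter_must_sleep() are hypothetical names, and a
sequentially consistent fence stands in for smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a condition plus a one-slot wait queue. */
struct wait_state {
	atomic_bool cond;        /* the condition the sleeper waits for */
	atomic_bool has_waiter;  /* models waitqueue_active() */
};

/* Waker: publish the condition, full barrier, then look for waiters. */
static bool publish_and_check(struct wait_state *w)
{
	atomic_store_explicit(&w->cond, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* assumed smp_mb() analogue */
	return atomic_load_explicit(&w->has_waiter, memory_order_relaxed);
}

/* Sleeper: enqueue, full barrier, then re-check the condition. */
static bool waiter_must_sleep(struct wait_state *w)
{
	atomic_store_explicit(&w->has_waiter, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* assumed smp_mb() analogue */
	return !atomic_load_explicit(&w->cond, memory_order_relaxed);
}
```

Because each side stores, fences, and then loads the other side's
variable, at least one of them observes the other: either the waker sees
the waiter, or the waiter sees the condition and does not sleep.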

Link: http://lkml.kernel.org/r/1483975528-24342-1-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/compaction.c b/mm/compaction.c
index c6178bbd3e044ed08476cf52cb83626c836bbfc2..0aa2757399ee00ace11a0492df09517fdea99f9e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1966,6 +1966,13 @@ void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx)
        if (pgdat->kcompactd_max_order < order)
                pgdat->kcompactd_max_order = order;
 
+       /*
+        * Pairs with implicit barrier in wait_event_freezable()
+        * such that wakeups are not missed in the lockless
+        * waitqueue_active() call.
+        */
+       smp_acquire__after_ctrl_dep();
+
        if (pgdat->kcompactd_classzone_idx > classzone_idx)
                pgdat->kcompactd_classzone_idx = classzone_idx;