Currently, lowmemorykiller (LMK) is using TIF_MEMDIE for two purposes.
One is to remember processes killed by LMK, and the other is to
accelerate termination of processes killed by LMK.
But since LMK is invoked as a memory shrinker function, there still
should be some memory available. It is very likely that memory
allocations by processes killed by LMK will succeed without using
ALLOC_NO_WATERMARKS via TIF_MEMDIE. Even if their allocations cannot
escape from memory allocation loop unless they use ALLOC_NO_WATERMARKS,
lowmem_deathpending_timeout can guarantee forward progress by choosing
next victim process.
On the other hand, mark_oom_victim() assumes that it must be called with
oom_lock held and it must not be called after oom_killer_disable() was
called. But LMK is calling it without holding oom_lock and checking
oom_killer_disabled. It is possible that LMK calls mark_oom_victim()
due to allocation requests by kernel threads after current thread
returned from oom_killer_disabled(). This will break synchronization
for PM/suspend.
This patch introduces per a task_struct flag for remembering processes
killed by LMK, and replaces TIF_MEMDIE with that flag. By applying this
patch, assumption by mark_oom_victim() becomes true.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Arve Hjonnevag <arve@android.com>
Cc: Riley Andrews <riandrews@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
if (!p)
continue;
- if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
+ if (task_lmk_waiting(p) && p->mm &&
time_before_eq(jiffies, lowmem_deathpending_timeout)) {
task_unlock(p);
rcu_read_unlock();
if (selected) {
task_lock(selected);
send_sig(SIGKILL, selected, 0);
- /*
- * FIXME: lowmemorykiller shouldn't abuse global OOM killer
- * infrastructure. There is no real reason why the selected
- * task should have access to the memory reserves.
- */
if (selected->mm)
- mark_oom_victim(selected);
+ task_set_lmk_waiting(selected);
task_unlock(selected);
lowmem_print(1, "Killing '%s' (%d), adj %hd,\n"
" to free %ldkB on behalf of '%s' (%d) because\n"
#define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */
#define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */
#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */
+#define PFA_LMK_WAITING 3 /* Lowmemorykiller is waiting */
#define TASK_PFA_TEST(name, func) \
TASK_PFA_SET(SPREAD_SLAB, spread_slab)
TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab)
+TASK_PFA_TEST(LMK_WAITING, lmk_waiting)
+TASK_PFA_SET(LMK_WAITING, lmk_waiting)
+
/*
* task->jobctl flags
*/