memcg: allow a memcg with kmem charges to be destructed
author     Glauber Costa <glommer@parallels.com>
           Tue, 18 Dec 2012 22:22:11 +0000 (14:22 -0800)
committer  Linus Torvalds <torvalds@linux-foundation.org>
           Tue, 18 Dec 2012 23:02:13 +0000 (15:02 -0800)
Because the ultimate goal of the kmem tracking in memcg is to track slab
pages as well, we can't guarantee that we'll always be able to attribute a
page to a particular process and migrate the charges along with it:
in the common case, a page will contain data belonging to multiple
processes.

Because of that, when we destroy a memcg, we ensure the destruction can
succeed by discounting the kmem charges from the user charges when we
try to empty the cgroup.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: JoonSoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memcontrol.c

index bc70254558fac5ce8d0e4dfe13f901466bd51bac..f96ccc90fa664a08064436f28e8ba7f9fb205da3 100644 (file)
@@ -547,6 +547,11 @@ static void disarm_kmem_keys(struct mem_cgroup *memcg)
 {
        if (memcg_kmem_is_active(memcg))
                static_key_slow_dec(&memcg_kmem_enabled_key);
+       /*
+        * This check can't live in kmem destruction function,
+        * since the charges will outlive the cgroup
+        */
+       WARN_ON(res_counter_read_u64(&memcg->kmem, RES_USAGE) != 0);
 }
 #else
 static void disarm_kmem_keys(struct mem_cgroup *memcg)
@@ -4025,6 +4030,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
 {
        int node, zid;
+       u64 usage;
 
        do {
                /* This is for making all *used* pages to be on LRU. */
@@ -4045,13 +4051,20 @@ static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
                cond_resched();
 
                /*
+                * Kernel memory may not necessarily be trackable to a specific
+                * process. So they are not migrated, and therefore we can't
+                * expect their value to drop to 0 here.
+                * Having res filled up with kmem only is enough.
+                *
                 * This is a safety check because mem_cgroup_force_empty_list
                 * could have raced with mem_cgroup_replace_page_cache callers
                 * so the lru seemed empty but the page could have been added
                 * right after the check. RES_USAGE should be safe as we always
                 * charge before adding to the LRU.
                 */
-       } while (res_counter_read_u64(&memcg->res, RES_USAGE) > 0);
+               usage = res_counter_read_u64(&memcg->res, RES_USAGE) -
+                       res_counter_read_u64(&memcg->kmem, RES_USAGE);
+       } while (usage > 0);
 }
 
 /*