cgroup: avoid attaching a cgroup root to two different superblocks
author Zefan Li <lizefan@huawei.com>
Fri, 7 Apr 2017 08:51:55 +0000 (16:51 +0800)
committer Tejun Heo <tj@kernel.org>
Tue, 11 Apr 2017 00:00:57 +0000 (09:00 +0900)
Run this:

    touch file0
    for ((; ;))
    {
        mount -t cpuset xxx file0
    }

And this concurrently:

    touch file1
    for ((; ;))
    {
        mount -t cpuset xxx file1
    }

We'll trigger a warning like this:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 4675 at lib/percpu-refcount.c:317 percpu_ref_kill_and_confirm+0x92/0xb0
 percpu_ref_kill_and_confirm called more than once on css_release!
 CPU: 1 PID: 4675 Comm: mount Not tainted 4.11.0-rc5+ #5
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 Call Trace:
  dump_stack+0x63/0x84
  __warn+0xd1/0xf0
  warn_slowpath_fmt+0x5f/0x80
  percpu_ref_kill_and_confirm+0x92/0xb0
  cgroup_kill_sb+0x95/0xb0
  deactivate_locked_super+0x43/0x70
  deactivate_super+0x46/0x60
 ...
 ---[ end trace a79f61c2a2633700 ]---

Here's a race:

  Thread A                              Thread B

  cgroup1_mount()
    # alloc a new cgroup root
    cgroup_setup_root()
                                        cgroup1_mount()
                                          # no sb yet, returns NULL
                                          kernfs_pin_sb()

                                          # but succeeds in getting the refcnt,
                                          # so re-use cgroup root
                                          percpu_ref_tryget_live()
    # alloc sb with cgroup root
    cgroup_do_mount()

  cgroup_kill_sb()
                                          # alloc another sb with same root
                                          cgroup_do_mount()

                                        cgroup_kill_sb()

We end up using the same cgroup root for two different superblocks,
so percpu_ref_kill() will be called twice on the same root when the
two superblocks are destroyed.
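
The double kill can be illustrated with a toy userspace model. This is only a
sketch: struct toy_ref, struct toy_root, struct toy_sb and the toy_*()
functions are hypothetical stand-ins, not the kernel's percpu_ref or cgroup
code, and the model assumes each superblock teardown kills the root's refcnt
exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a percpu_ref: it only tracks whether it was killed. */
struct toy_ref {
	bool dead;
	int double_kills;	/* how often kill was called on a dead ref */
};

/* Toy stand-ins for a cgroup root and a superblock that points at it. */
struct toy_root { struct toy_ref refcnt; };
struct toy_sb { struct toy_root *root; };

/* Killing an already dead ref is the condition the warning reports. */
static void toy_ref_kill(struct toy_ref *ref)
{
	if (ref->dead) {
		ref->double_kills++;
		fprintf(stderr, "toy_ref_kill called more than once\n");
		return;
	}
	ref->dead = true;
}

/* Assumption of this model: tearing down a superblock kills the root's ref. */
static void toy_kill_sb(struct toy_sb *sb)
{
	assert(sb && sb->root);
	toy_ref_kill(&sb->root->refcnt);
}

/* Two superblocks share one root, as at the end of the race above. */
static int toy_race_outcome(void)
{
	struct toy_root root = { .refcnt = { .dead = false, .double_kills = 0 } };
	struct toy_sb sb_a = { .root = &root };
	struct toy_sb sb_b = { .root = &root };

	toy_kill_sb(&sb_a);	/* Thread A's cgroup_kill_sb() */
	toy_kill_sb(&sb_b);	/* Thread B's cgroup_kill_sb() */
	return root.refcnt.double_kills;
}
```

With one root per superblock, the second kill would hit a different ref and
the counter would stay at zero.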

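Fix this by making sure the superblock pinning really succeeded:
kernfs_pin_sb() returns NULL when there is no superblock yet, so check
the result with IS_ERR_OR_NULL() instead of IS_ERR().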
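
The two checks differ only for a NULL return. A minimal userspace sketch of
that difference, assuming simplified stand-ins for the kernel's error-pointer
helpers (the real ones live in include/linux/err.h); must_retry_old() and
must_retry_new() are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical: the pinning half of the retry condition before the patch. */
static bool must_retry_old(const void *pinned_sb)
{
	return IS_ERR(pinned_sb);
}

/* Hypothetical: the pinning half of the retry condition after the patch. */
static bool must_retry_new(const void *pinned_sb)
{
	return IS_ERR_OR_NULL(pinned_sb);
}

/* NULL slips past the old check but is caught by the new one. */
static void check_null_case(void)
{
	assert(!must_retry_old(NULL));
	assert(must_retry_new(NULL));
}
```

For a real error pointer such as ERR_PTR(-12), and for a valid pointer, both
variants agree; only the "no superblock yet" case changes behaviour.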

Cc: stable@vger.kernel.org # 3.16+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
kernel/cgroup/cgroup-v1.c

index 1dc22f6b49f5e06c4af22222dfb1b32c885ce16a..12e19f0636ea8366b190ab2ef41a0de84e0829f5 100644
@@ -1146,7 +1146,7 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
                 * path is super cold.  Let's just sleep a bit and retry.
                 */
                pinned_sb = kernfs_pin_sb(root->kf_root, NULL);
-               if (IS_ERR(pinned_sb) ||
+               if (IS_ERR_OR_NULL(pinned_sb) ||
                    !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
                        mutex_unlock(&cgroup_mutex);
                        if (!IS_ERR_OR_NULL(pinned_sb))