tmpfs: distribute interleave better across nodes
author Nathan Zimmer <nzimmer@sgi.com>
Tue, 31 Jul 2012 23:46:17 +0000 (16:46 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Wed, 1 Aug 2012 01:42:50 +0000 (18:42 -0700)
When tmpfs has the interleave memory policy, it always starts allocating
for each file from node 0 at offset 0.  When there are many small files,
the lower nodes fill up disproportionately.

This patch spreads out node usage by starting files at nodes other than 0,
by using the inode number to bias the starting node for interleave.

Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/shmem.c

index c15b998e5a860f9d3c375ad31ce9d5f34fbb5ea0..d4e184e2a38ea590350e5f31073c1ed8ad6690e0 100644
@@ -929,7 +929,8 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 
        /* Create a pseudo vma that just contains the policy */
        pvma.vm_start = 0;
-       pvma.vm_pgoff = index;
+       /* Bias interleave by inode number to distribute better across nodes */
+       pvma.vm_pgoff = index + info->vfs_inode.i_ino;
        pvma.vm_ops = NULL;
        pvma.vm_policy = spol;
        return swapin_readahead(swap, gfp, &pvma, 0);
@@ -942,7 +943,8 @@ static struct page *shmem_alloc_page(gfp_t gfp,
 
        /* Create a pseudo vma that just contains the policy */
        pvma.vm_start = 0;
-       pvma.vm_pgoff = index;
+       /* Bias interleave by inode number to distribute better across nodes */
+       pvma.vm_pgoff = index + info->vfs_inode.i_ino;
        pvma.vm_ops = NULL;
        pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);