mm/mprotect.c: don't touch single threaded PTEs which are on the right node
author    Andi Kleen <ak@linux.intel.com>
          Tue, 13 Dec 2016 00:41:47 +0000 (16:41 -0800)
committer Linus Torvalds <torvalds@linux-foundation.org>
          Tue, 13 Dec 2016 02:55:07 +0000 (18:55 -0800)
We had some problems with pages getting unmapped in single-threaded
processes pinned to a node.  This was tracked down to NUMA scanning.

It makes no sense to unmap a page if the process is single-threaded and
the page is already on the node the process is running on.

Add a check for this case to the NUMA protection code, and skip the
unmapping when it holds.
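
As an illustration only, the check can be modelled in userspace C roughly
as follows.  This is a simplified sketch, not the kernel code: struct
scan_ctx, pick_target_node(), skip_page() and NO_NODE are hypothetical
stand-ins for the vma/mm state, the once-per-range setup, the per-page
test and NUMA_NO_NODE.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

/* Hypothetical, flattened view of the state the check consults. */
struct scan_ctx {
	bool prot_numa;		/* range is being changed for NUMA hinting */
	bool vma_shared;	/* VM_SHARED is set on the VMA */
	int mm_users;		/* number of users of the address space */
	int current_node;	/* node of the CPU doing the scan */
};

/*
 * Computed once per PTE range: the node to compare pages against,
 * or NO_NODE when the shortcut does not apply.
 */
static int pick_target_node(const struct scan_ctx *ctx)
{
	if (ctx->prot_numa && !ctx->vma_shared && ctx->mm_users == 1)
		return ctx->current_node;
	return NO_NODE;
}

/*
 * Evaluated per page: true if the PTE should be left untouched.
 * Real node ids are never negative, so NO_NODE never matches.
 */
static bool skip_page(int target_node, int page_nid)
{
	return target_node == page_nid;
}

/* Small self-check showing the intended behaviour of the model. */
void numa_skip_selfcheck(void)
{
	struct scan_ctx single = { true, false, 1, 0 };
	struct scan_ctx multi = { true, false, 4, 0 };

	assert(skip_page(pick_target_node(&single), 0));   /* local page */
	assert(!skip_page(pick_target_node(&single), 1));  /* remote page */
	assert(!skip_page(pick_target_node(&multi), 0));   /* multi-user mm */
}
```

Computing the target node once per range keeps the per-page cost to a
single integer comparison, and the NO_NODE default means shared or
multi-user mappings fall through to the existing behaviour.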

In theory the process could be migrated to another node later, but a
subsequent scan will then unmap the pages and they will be migrated.

This could be made more sophisticated by remembering this state per
process or even per mm.  However, that would need extra tracking and be
more complicated, and the simple check seems to work fine so far.

[ak@linux.intel.com: v3: Minor updates from Mel. Change code layout]
Link: http://lkml.kernel.org/r/1476382117-5440-1-git-send-email-andi@firstfloor.org
Link: http://lkml.kernel.org/r/1476288949-20970-1-git-send-email-andi@firstfloor.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/mprotect.c

index 11936526b08b8c5f5c5d0454f6789008a0c6f313..05a02b72c98dae137d6d3e8aa2a75e4d9c8ba2f5 100644
@@ -69,11 +69,17 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
        pte_t *pte, oldpte;
        spinlock_t *ptl;
        unsigned long pages = 0;
+       int target_node = NUMA_NO_NODE;
 
        pte = lock_pte_protection(vma, pmd, addr, prot_numa, &ptl);
        if (!pte)
                return 0;
 
+       /* Get target node for single threaded private VMAs */
+       if (prot_numa && !(vma->vm_flags & VM_SHARED) &&
+           atomic_read(&vma->vm_mm->mm_users) == 1)
+               target_node = numa_node_id();
+
        arch_enter_lazy_mmu_mode();
        do {
                oldpte = *pte;
@@ -95,6 +101,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
                                /* Avoid TLB flush if possible */
                                if (pte_protnone(oldpte))
                                        continue;
+
+                               /*
+                                * Don't mess with PTEs if page is already on the node
+                                * a single-threaded process is running on.
+                                */
+                               if (target_node == page_to_nid(page))
+                                       continue;
                        }
 
                        ptent = ptep_modify_prot_start(mm, addr, pte);