[IA64] optimize flush_tlb_range on large numa box
authorChen, Kenneth W <kenneth.w.chen@intel.com>
Mon, 6 Mar 2006 22:12:54 +0000 (14:12 -0800)
committerTony Luck <tony.luck@intel.com>
Mon, 27 Mar 2006 18:20:03 +0000 (10:20 -0800)
It was reported from a field customer that global spin lock ptcg_lock
is giving a lot of grief on munmap performance running on a large numa
machine.  What appears to be a problem coming from flush_tlb_range(),
which currently unconditionally calls platform_global_tlb_purge().
For some of the numa machines in existence today, this function is
mapped into ia64_global_tlb_purge(), which holds ptcg_lock spin lock
while executing ptc.ga instruction.

Here is a patch that attempt to avoid global tlb purge whenever
possible.  It will use local tlb purge as much as possible. Though the
conditions to use local tlb purge is pretty restrictive.  One of the
side effect of having flush tlb range instruction on ia64 is that
kernel don't get a chance to clear out cpu_vm_mask.  On ia64, this mask
is sticky and it will accumulate if process bounces around.  Thus
diminishing the possible use of ptc.l.  Thoughts?

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
arch/ia64/mm/tlb.c

index 6a4eec9113e8383efd029253a9a038637d6819d0..4dbbca0b5e9c5698286c34462d98a246fd3d6cf7 100644 (file)
@@ -156,17 +156,19 @@ flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
                nbits = purge.max_bits;
        start &= ~((1UL << nbits) - 1);
 
-# ifdef CONFIG_SMP
-       platform_global_tlb_purge(mm, start, end, nbits);
-# else
        preempt_disable();
+#ifdef CONFIG_SMP
+       if (mm != current->active_mm || cpus_weight(mm->cpu_vm_mask) != 1) {
+               platform_global_tlb_purge(mm, start, end, nbits);
+               preempt_enable();
+               return;
+       }
+#endif
        do {
                ia64_ptcl(start, (nbits<<2));
                start += (1UL << nbits);
        } while (start < end);
        preempt_enable();
-# endif
-
        ia64_srlz_i();                  /* srlz.i implies srlz.d */
 }
 EXPORT_SYMBOL(flush_tlb_range);