dax: increase granularity of dax_clear_blocks() operations
author Dan Williams <dan.j.williams@intel.com>
Sat, 16 Jan 2016 00:55:53 +0000 (16:55 -0800)
committer Linus Torvalds <torvalds@linux-foundation.org>
Sat, 16 Jan 2016 01:56:32 +0000 (17:56 -0800)
dax_clear_blocks() currently calls cond_resched() after every PAGE_SIZE
memset.  We need not check so frequently; for example, md-raid only
calls cond_resched() at stripe granularity.  Also, in preparation for
introducing a dax_map_atomic() operation that temporarily pins a dax
mapping, move the call to cond_resched() to the outer loop.
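
For illustration only, the coarser granularity can be sketched in user
space.  This is not the kernel code: memset() stands in for clear_pmem(),
sched_yield() stands in for cond_resched(), and the names CHUNK_BYTES and
clear_in_chunks() are hypothetical.  It also omits the per-iteration
bdev_direct_access() lookup that the real loop performs.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES ((size_t)128 * 1024)  /* mirrors SZ_128K in the patch */

/*
 * Hypothetical user-space analogue of the reworked loop: zero `size`
 * bytes at `buf` in chunks of at most 128 KiB and yield once per chunk.
 * Returns the number of yield points, i.e. the number of chunks cleared.
 */
static size_t clear_in_chunks(unsigned char *buf, size_t size)
{
	size_t yields = 0;

	/* Like the BUG_ON(sz & 511) check: whole 512-byte sectors only. */
	assert((size & 511) == 0);

	while (size > 0) {
		size_t sz = size < CHUNK_BYTES ? size : CHUNK_BYTES;

		memset(buf, 0, sz);	/* stands in for clear_pmem() */
		buf += sz;
		size -= sz;
		sched_yield();		/* stands in for cond_resched() */
		yields++;
	}
	return yields;
}
```

Under these assumptions, clearing a 1 MiB buffer reaches the yield point
8 times, where a per-page loop with 4 KiB pages would reach it 256 times.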

The worst-case latency between calls to cond_resched() after this change
is 500us, and the average latency is 133us.  This is up from a 10us max
and a 4us average.
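
As a rough sanity check on these figures, and assuming 4 KiB pages (the
commit does not state the page size), one 128 KiB chunk covers 32 of the
old per-page steps.  Scaling the old 4us average by 32 gives 128us, close
to the reported 133us.  Scaling the old 10us max gives 320us, which is
below the reported 500us worst case, so the worst case evidently does not
scale linearly.  The helper names below are hypothetical.

```c
#include <assert.h>

#define ASSUMED_PAGE_BYTES 4096UL		/* assumption: 4 KiB pages */
#define CHUNK_BYTES	   (128UL * 1024UL)	/* SZ_128K from the patch */

static_assert(CHUNK_BYTES % ASSUMED_PAGE_BYTES == 0,
	      "chunk must be a whole number of pages");

/* Number of page-sized steps folded into one cond_resched() interval. */
static unsigned long pages_per_chunk(void)
{
	return CHUNK_BYTES / ASSUMED_PAGE_BYTES;
}

/* Naive linear estimate: old per-page latency times pages per chunk. */
static unsigned long scaled_latency_us(unsigned long per_page_us)
{
	return pages_per_chunk() * per_page_us;
}
```

This is only a linear estimate; it ignores the cost of cond_resched()
itself and any variation in clear_pmem() throughput.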

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/dax.c

index 19492cc65a302ce2285f179d056bc9f860cbea9c..11721c0fc12765b80e25de26904565fd0be3505a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -28,6 +28,7 @@
 #include <linux/sched.h>
 #include <linux/uio.h>
 #include <linux/vmstat.h>
+#include <linux/sizes.h>
 
 /*
  * dax_clear_blocks() is called from within transaction context from XFS,
@@ -43,24 +44,17 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
        do {
                void __pmem *addr;
                unsigned long pfn;
-               long count;
+               long count, sz;
 
                count = bdev_direct_access(bdev, sector, &addr, &pfn, size);
                if (count < 0)
                        return count;
-               BUG_ON(size < count);
-               while (count > 0) {
-                       unsigned pgsz = PAGE_SIZE - offset_in_page(addr);
-                       if (pgsz > count)
-                               pgsz = count;
-                       clear_pmem(addr, pgsz);
-                       addr += pgsz;
-                       size -= pgsz;
-                       count -= pgsz;
-                       BUG_ON(pgsz & 511);
-                       sector += pgsz / 512;
-                       cond_resched();
-               }
+               sz = min_t(long, count, SZ_128K);
+               clear_pmem(addr, sz);
+               size -= sz;
+               BUG_ON(sz & 511);
+               sector += sz / 512;
+               cond_resched();
        } while (size);
 
        wmb_pmem();