Commit
20dd4c624d251 ('powerpc/perf: Fix SDAR_MODE value for continous
sampling on Power9') set the default sdar_mode value in MMCRA[SDAR_MODE]
to be used as 0b01 (Update on TLB miss). And this value is set if sdar_mode
from event is zero, or we are in continous sampling mode in power9 dd1.
But it is preferred to have the sdar_mode value for power9 as
0b10 (Update on dcache miss) for better sampling updates instead
of 0b01 (Update on TLB miss).
From Anton:
Using a bandwidth test case with a 1MB footprint, I profiled cycles and
chose TLB updates of the SDAR:
$ perf record -d -e r000400000000001E:u ./bw2001 1M
^
SDAR TLB
$ perf report -D | grep PERF_RECORD_SAMPLE | sed 's/.*addr: //' | sort -u | wc -l
4
I get 4 unique addresses. If I ran with dcache misses:
$ perf record -d -e r000800000000001E:u ./bw2001 1M
^
SDAR dcache miss
$ perf report -D|grep PERF_RECORD_SAMPLE| sed 's/.*addr: //'|sort -u | wc -l
5217
I get 5217 unique addresses. No surprises here, but it does show why
TLB misses is the wrong event to default to - we get very little useful
information out of it.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
else if (!cpu_has_feature(CPU_FTR_POWER9_DD1) && p9_SDAR_MODE(event))
*mmcra |= p9_SDAR_MODE(event) << MMCRA_SDAR_MODE_SHIFT;
else
- *mmcra |= MMCRA_SDAR_MODE_TLB;
+ *mmcra |= MMCRA_SDAR_MODE_DCACHE;
} else
*mmcra |= MMCRA_SDAR_MODE_TLB;
}
#define MMCRA_SDAR_MODE_SHIFT 42
#define MMCRA_SDAR_MODE_TLB (1ull << MMCRA_SDAR_MODE_SHIFT)
#define MMCRA_SDAR_MODE_NO_UPDATES ~(0x3ull << MMCRA_SDAR_MODE_SHIFT)
+#define MMCRA_SDAR_MODE_DCACHE (2ull << MMCRA_SDAR_MODE_SHIFT)
#define MMCRA_IFM_SHIFT 30
#define MMCRA_THR_CTR_MANT_SHIFT 19
#define MMCRA_THR_CTR_MANT_MASK 0x7Ful