perf_events: Optimize the swcounter hotpath
The structure initialization generates a big zeroing memset, which
dominates the perf annotate output:
         :      ffffffff810a859d <__perf_sw_event>:
    1.68 :      ffffffff810a859d:       55                      push   %rbp
    1.69 :      ffffffff810a859e:       41 89 fa                mov    %edi,%r10d
    0.01 :      ffffffff810a85a1:       49 89 c9                mov    %rcx,%r9
    0.00 :      ffffffff810a85a4:       31 c0                   xor    %eax,%eax
    1.71 :      ffffffff810a85a6:       b9 16 00 00 00          mov    $0x16,%ecx
    0.00 :      ffffffff810a85ab:       48 89 e5                mov    %rsp,%rbp
    0.00 :      ffffffff810a85ae:       48 83 ec 60             sub    $0x60,%rsp
    1.52 :      ffffffff810a85b2:       48 8d 7d a0             lea    -0x60(%rbp),%rdi
   85.20 :      ffffffff810a85b6:       f3 ab                   rep stos %eax,%es:(%rdi)
None of the callees depends on the structure being pre-initialized,
so only initialize ->addr. This gets rid of the zero-fill (rep stos)
overhead.
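
For illustration only, a minimal userspace sketch of the two
initialization styles. The struct layout and the function names below
are hypothetical stand-ins, not the kernel's definitions; the struct is
sized at 88 bytes only to match the 0x16 dword stores (22 * 4 bytes)
seen in the annotate output.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-event sample data (11 * 8 = 88 bytes). */
struct sample_data {
	uint64_t type;
	uint64_t ip;
	uint64_t addr;
	uint64_t id;
	uint64_t period;
	uint64_t raw[6];
};

/*
 * Before: a designated initializer zeroes every member that is not
 * named, so all 88 bytes are written on each call.
 */
static void init_full(struct sample_data *data, uint64_t addr)
{
	struct sample_data tmp = { .addr = addr };

	*data = tmp;
}

/*
 * After: store only the field the callees read. The other members
 * keep whatever value they had, so this is valid only if nothing
 * downstream depends on them being zero.
 */
static void init_addr_only(struct sample_data *data, uint64_t addr)
{
	data->addr = addr;
}
```

The second form trades a guarantee (all other fields are zero) for a
single store, which is why it is only safe once every consumer has been
checked.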
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>