perf: optimize perf_fetch_caller_regs
author Alexei Starovoitov <ast@fb.com>
Thu, 7 Apr 2016 01:43:22 +0000 (18:43 -0700)
committer David S. Miller <davem@davemloft.net>
Fri, 8 Apr 2016 01:04:26 +0000 (21:04 -0400)
commit ec5e099d6e941668d121ea9ca7057f4fa00830b0
tree 4ec273d88501a68ce23eb53649530161a7fa8440
parent b33b0a1bf69faff89693df49519fa7b459f5d807

Avoid the memset in perf_fetch_caller_regs, since it is on the critical path of all tracepoints.
It is called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call
with this_cpu_ptr(&__perf_regs[..]), which are zero-initialized by the percpu init logic. The
subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs,
so we can safely drop the memset from all of the above cases and move it into
perf_ftrace_function_call, which calls it with a stack-allocated pt_regs.
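
The reasoning above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct mock_regs, mock_percpu_regs and every mock_* function are hypothetical stand-ins for pt_regs, the per-CPU __perf_regs buffers and the perf helpers, and the set of fields written by the arch hook is invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs; the real layout is per-arch. */
struct mock_regs {
    unsigned long ip;
    unsigned long sp;
    unsigned long flags;
    unsigned long other[4];   /* fields the arch hook never writes */
};

/* Static storage is zero-initialized, modelling the per-CPU buffers. */
static struct mock_regs mock_percpu_regs;

/* Models perf_arch_fetch_caller_regs: writes the same fields on every call. */
static void mock_arch_fetch_caller_regs(struct mock_regs *regs,
                                        unsigned long ip, unsigned long sp)
{
    regs->ip = ip;
    regs->sp = sp;
    regs->flags = 0;
}

/* Models perf_fetch_caller_regs after the change: no memset here. */
static void mock_fetch_caller_regs(struct mock_regs *regs,
                                   unsigned long ip, unsigned long sp)
{
    mock_arch_fetch_caller_regs(regs, ip, sp);
}

/* Returns 1 if every field the arch hook skips is still zero. */
static int untouched_fields_are_zero(const struct mock_regs *regs)
{
    for (size_t i = 0; i < sizeof(regs->other) / sizeof(regs->other[0]); i++)
        if (regs->other[i] != 0)
            return 0;
    return 1;
}

/* Hot path: reuses the zero-initialized static buffer, so no memset. */
static int mock_tracepoint_hit(unsigned long ip, unsigned long sp)
{
    mock_fetch_caller_regs(&mock_percpu_regs, ip, sp);
    return untouched_fields_are_zero(&mock_percpu_regs);
}

/* Models perf_ftrace_function_call: a stack buffer starts out with
 * indeterminate contents, so this caller must clear it itself. */
static int mock_ftrace_function_call(unsigned long ip, unsigned long sp)
{
    struct mock_regs regs;

    memset(&regs, 0, sizeof(regs));
    mock_fetch_caller_regs(&regs, ip, sp);
    return untouched_fields_are_zero(&regs);
}

/* Exercises both paths; repeated hot-path calls only overwrite the
 * same fields, so the untouched ones stay zero. */
void mock_demo(void)
{
    assert(mock_tracepoint_hit(0x1000, 0x2000));
    assert(mock_tracepoint_hit(0x1100, 0x2100));
    assert(mock_ftrace_function_call(0x3000, 0x4000));
}
```

In this model the hot path pays only for the field writes, while the one caller that uses stack storage keeps the cost of clearing its own buffer.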

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/linux/perf_event.h
kernel/trace/trace_event_perf.c