For hardware events, the userspace page of the event gets updated in
context switches, so if we read the timestamp in the page, we get
fresh info.
For software events, this is missing currently. This patch makes the
behavior consistent.
With this patch, we can implement clock_gettime(THREAD_CPUTIME) with
PERF_COUNT_SW_DUMMY in userspace as suggested by Andy and Peter. Code
like this:
if (pc->cap_user_time) {
do {
seq = pc->lock;
barrier();
running = pc->time_running;
cyc = rdtsc();
time_mult = pc->time_mult;
time_shift = pc->time_shift;
time_offset = pc->time_offset;
barrier();
} while (pc->lock != seq);
quot = (cyc >> time_shift);
rem = cyc & ((1 << time_shift) - 1);
delta = time_offset + quot * time_mult +
((rem * time_mult) >> time_shift);
running += delta;
return running;
}
I tried it on a busy system, the userspace page updating doesn't
have noticeable overhead.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/aa2dd2e4f1e9f2225758be5ba00f14d6909a8ce1.1423180257.git.shli@fb.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
}
hlist_add_head_rcu(&event->hlist_entry, head);
+ perf_event_update_userpage(event);
return 0;
}
{
if (flags & PERF_EF_START)
cpu_clock_event_start(event, flags);
+ perf_event_update_userpage(event);
return 0;
}
{
if (flags & PERF_EF_START)
task_clock_event_start(event, flags);
+ perf_event_update_userpage(event);
return 0;
}