GitHub/mt8127/android_kernel_alcatel_ttab.git
14 years agox86, apic: Allow to use certain functions without APIC built-in support
Cyrill Gorcunov [Wed, 17 Mar 2010 10:37:00 +0000 (13:37 +0300)]
x86, apic: Allow to use certain functions without APIC built-in support

In case even if the kernel is configured so that
no APIC support is built-in we still may allow
to use certain apic functions as dummy calls.

In particular we start using it in perf-events code.

Note that this is not that same as NOOP apic driver (which
is used if APIC support is present but no physical APIC is
available), this is for the case when we don't have apic code
compiled in at all.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20100317104356.011052632@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf events: Change perf parameter --pid to process-wide collection instead of thread...
Zhang, Yanmin [Thu, 18 Mar 2010 14:36:05 +0000 (11:36 -0300)]
perf events: Change perf parameter --pid to process-wide collection instead of thread-wide

Parameter --pid (or -p) of perf currently means a thread-wide
collection. For exmaple, if a process whose id is 8888 has 10
threads, 'perf top -p 8888' just collects the main thread
statistics. That's misleading. Users are used to attach a whole
process when debugging a process by gdb. To follow normal usage
style, the patch change --pid to process-wide collection and add
--tid (-t) to mean a thread-wide collection.

Usage example is:

 # perf top -p 8888
 # perf record -p 8888 -f sleep 10
 # perf stat -p 8888 -f sleep 10

Above commands collect the statistics of all threads of process
8888.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: zhiteng.huang@intel.com
Cc: Zachary Amsden <zamsden@redhat.com>
LKML-Reference: <1268922965-14774-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf record: Enable counters only when kernel is execing subcommand
Zhang, Yanmin [Thu, 18 Mar 2010 14:36:04 +0000 (11:36 -0300)]
perf record: Enable counters only when kernel is execing subcommand

'perf record' starts counters before subcommand is execed, so
the statistics is not precise because it includes data of some
preparation steps. I fix it with the patch.

In addition, change the condition to fork/exec subcommand. If
there is a subcommand parameter, perf always fork/exec it. The
usage example is:

 # perf record -f -a sleep 10

So this command could collect statistics for 10 seconds
precisely. User still could stop it by CTRL+C. Without the new
capability, user could only input CTRL+C to stop it without
precise time clock.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: oerg Roedel <joro@8bytes.org>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: <zhiteng.huang@intel.com>
Cc: Zachary Amsden <zamsden@redhat.com>
LKML-Reference: <1268922965-14774-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf stat: Enable counters when collecting process-wide or system-wide data
Zhang, Yanmin [Thu, 18 Mar 2010 14:36:03 +0000 (11:36 -0300)]
perf stat: Enable counters when collecting process-wide or system-wide data

Command 'perf stat' doesn't enable counters when collecting an
existing (by -p) process or system-wide statistics. Fix the
issue.

Change the condition of fork/exec subcommand. If there is a
subcommand parameter, perf always forks/execs it. The usage
example is:

 # perf stat -a sleep 10

So this command could collect statistics for 10 seconds
precisely. User still could stop it by CTRL+C. Without the new
capability, user could only use CTRL+C to stop it without
precise time clock.

Another issue is 'perf stat -a' consumes 100% time of a full
single logical cpu. It has a bad impact on running workload.

Fix it by adding a sleep(1) in the while(!done) loop in function
run_perf_stat.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sheng Yang <sheng@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Zachary Amsden <zamsden@redhat.com>
Cc: <zhiteng.huang@intel.com>
LKML-Reference: <1268922965-14774-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf/core, x86: Remove duplicate perf_event_mask variable
Robert Richter [Wed, 17 Mar 2010 11:49:13 +0000 (12:49 +0100)]
perf/core, x86: Remove duplicate perf_event_mask variable

The same information is stored also in x86_pmu.intel_ctrl. This
patch removes perf_event_mask and instead uses
x86_pmu.intel_ctrl directly.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268826553-19518-5-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf/core, x86: Remove cpu_hw_events.interrupts
Robert Richter [Wed, 17 Mar 2010 11:49:12 +0000 (12:49 +0100)]
perf/core, x86: Remove cpu_hw_events.interrupts

This member in the struct is not used anymore and can be
removed.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268826553-19518-4-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf/core: Correct files in MAINTAINERS entry
Robert Richter [Wed, 17 Mar 2010 11:49:11 +0000 (12:49 +0100)]
perf/core: Correct files in MAINTAINERS entry

This corrects the file entries for perf_events. The following
files are caught now:

 $ xargs | eval ls $(cat) | sort -u
 kernel/perf_event*.c
 include/linux/perf_event.h
 arch/*/kernel/perf_event*.c
 arch/*/kernel/*/perf_event*.c
 arch/*/kernel/*/*/perf_event*.c
 arch/*/include/asm/perf_event.h
 arch/*/lib/perf_event*.c
 arch/*/kernel/perf_callchain.c

 arch/alpha/include/asm/perf_event.h
 arch/arm/include/asm/perf_event.h
 arch/arm/kernel/perf_event.c
 arch/frv/include/asm/perf_event.h
 arch/frv/lib/perf_event.c
 arch/parisc/include/asm/perf_event.h
 arch/powerpc/include/asm/perf_event.h
 arch/powerpc/kernel/perf_callchain.c
 arch/powerpc/kernel/perf_event.c
 arch/s390/include/asm/perf_event.h
 arch/sh/include/asm/perf_event.h
 arch/sh/kernel/cpu/sh4a/perf_event.c
 arch/sh/kernel/cpu/sh4/perf_event.c
 arch/sh/kernel/perf_callchain.c
 arch/sh/kernel/perf_event.c
 arch/sparc/include/asm/perf_event.h
 arch/sparc/kernel/perf_event.c
 arch/x86/include/asm/perf_event.h
 arch/x86/kernel/cpu/perf_event_amd.c
 arch/x86/kernel/cpu/perf_event.c
 arch/x86/kernel/cpu/perf_event_intel.c
 arch/x86/kernel/cpu/perf_event_p6.c
 include/linux/perf_event.h
 kernel/perf_event.c

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268826553-19518-3-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf/core, x86: Reduce number of CONFIG_X86_LOCAL_APIC macros
Robert Richter [Wed, 17 Mar 2010 11:49:10 +0000 (12:49 +0100)]
perf/core, x86: Reduce number of CONFIG_X86_LOCAL_APIC macros

The function reserve_pmc_hardware() and release_pmc_hardware()
were hard to read. This patch improves readability of the code by
removing most of the CONFIG_X86_LOCAL_APIC macros.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268826553-19518-2-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Fix !dwarf build
Ingo Molnar [Wed, 17 Mar 2010 11:13:28 +0000 (12:13 +0100)]
perf probe: Fix !dwarf build

Fix the !drawf build.

This uses the existing NO_DWARF_SUPPORT mechanism we use for that,
but it's really fragile and needs a cleanup. (in a separate patch)

1) Such uses:

 #ifndef NO_DWARF_SUPPORT

are double inverted logic a'la 'not not'. Instead the flag should
be called DWARF_SUPPORT.

2) Furthermore, assymetric #ifdef polluted code flow like:

        if (need_dwarf)
 #ifdef NO_DWARF_SUPPORT
                die("Debuginfo-analysis is not supported");
 #else   /* !NO_DWARF_SUPPORT */
                pr_debug("Some probes require debuginfo.\n");

        fd = open_vmlinux();

is very fragile and not acceptable. Instead of that helper functions
should be created and the dwarf/no-dwarf logic should be separated more
cleanly.

3) Local variable #ifdefs like this:

 #ifndef NO_DWARF_SUPPORT
        int fd;
 #endif

Are fragile as well and should be eliminated. Helper functions achieve
that too.

Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220612.32050.33806.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Add data structure member access support
Masami Hiramatsu [Tue, 16 Mar 2010 22:06:26 +0000 (18:06 -0400)]
perf probe: Add data structure member access support

Support accessing members in the data structures. With this,
perf-probe accepts data-structure members(IOW, it now accepts
dot '.' and arrow '->' operators) as probe arguemnts.

e.g.

 ./perf probe --add 'schedule:44 rq->curr'

 ./perf probe --add 'vfs_read file->f_op->read file->f_path.dentry'

Note that '>' can be interpreted as redirection in command-line.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220626.32050.57552.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: List probes with line number and file name
Masami Hiramatsu [Tue, 16 Mar 2010 22:06:19 +0000 (18:06 -0400)]
perf probe: List probes with line number and file name

Improve --list to show current exist probes with line number and
file name. This enables user easily to check which line is
already probed.

for example:

 ./perf probe --list
 probe:vfs_read       (on vfs_read:8@linux-2.6-tip/fs/read_write.c)

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220619.32050.48702.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Introduce kprobe_trace_event and perf_probe_event
Masami Hiramatsu [Tue, 16 Mar 2010 22:06:12 +0000 (18:06 -0400)]
perf probe: Introduce kprobe_trace_event and perf_probe_event

Introduce kprobe_trace_event and perf_probe_event and replace
old probe_point structure with it. probe_point structure is
not enough flexible nor extensible. New data structures
will help implementing further features.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220612.32050.33806.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Add --dry-run option
Masami Hiramatsu [Tue, 16 Mar 2010 22:06:05 +0000 (18:06 -0400)]
perf probe: Add --dry-run option

Add --dry-run option for debugging and testing.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220605.32050.6571.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Introduce die_find_child() function
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:58 +0000 (18:05 -0400)]
perf probe: Introduce die_find_child() function

Introduce die_find_child() function to integrate DIE-tree
searching functions.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220558.32050.7905.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Rename some die_get_* functions
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:51 +0000 (18:05 -0400)]
perf probe: Rename some die_get_* functions

Rename die_get_real_subprogram and die_get_inlinefunc to
die_find_real_subprogram and die_find_inlinefunc respectively,
because these functions search its children. After that,
'die_get_' means getting a property of that die, and
'die_find_' means searching DIE-tree to get an appropriate
child die.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220551.32050.36181.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Rename session to param
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:44 +0000 (18:05 -0400)]
perf probe: Rename session to param

Since this name 'session' conflicts with 'perf_session', and
this structure just holds parameters anymore.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220544.32050.8788.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Move add-probe routine to util/
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:37 +0000 (18:05 -0400)]
perf probe: Move add-probe routine to util/

Move add-probe routine to util/probe_event.c. This simplifies
main routine for reducing maintenance cost.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220537.32050.72214.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Use wrapper functions
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:30 +0000 (18:05 -0400)]
perf probe: Use wrapper functions

Use wrapped functions as much as possible, to check out of
memory conditions in perf probe.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220530.32050.53951.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Introduce xzalloc() for detecting out of memory conditions
Masami Hiramatsu [Tue, 16 Mar 2010 22:05:21 +0000 (18:05 -0400)]
perf tools: Introduce xzalloc() for detecting out of memory conditions

Introducing xzalloc() which wrapping zalloc() for detecting out
of memory conditions.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100316220521.32050.85155.stgit@localhost6.localdomain6>
[ -v2: small cleanups in surrounding code ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoMerge branch 'perf/urgent' into perf/core
Ingo Molnar [Wed, 17 Mar 2010 10:31:45 +0000 (11:31 +0100)]
Merge branch 'perf/urgent' into perf/core

Merge reason: We'll be queueing dependent changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Fix unexported generic perf_arch_fetch_caller_regs
Frederic Weisbecker [Tue, 16 Mar 2010 00:05:02 +0000 (01:05 +0100)]
perf: Fix unexported generic perf_arch_fetch_caller_regs

perf_arch_fetch_caller_regs() is exported for the overriden x86
version, but not for the generic weak version.

As a general rule, weak functions should not have their symbol
exported in the same file they are defined.

So let's export it on trace_event_perf.c as it is used by trace
events only.

This fixes:

ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined!
ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!

-v2: And also only build it if trace events are enabled.
-v3: Fix changelog mistake

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Report error code that returned from x86_pmu.hw_config()
Robert Richter [Tue, 16 Mar 2010 16:07:33 +0000 (17:07 +0100)]
perf, x86: Report error code that returned from x86_pmu.hw_config()

If x86_pmu.hw_config() fails a fixed error code (-EOPNOTSUPP) is
returned even if a different error was reported. This patch fixes
this.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Lin Ming <ming.m.lin@intel.com>
Cc: acme@redhat.com
Cc: eranian@google.com
Cc: gorcunov@openvz.org
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
LKML-Reference: <20100316160733.GR1585@erda.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf top: Add missing initialization to zero
Arnaldo Carvalho de Melo [Tue, 16 Mar 2010 21:28:46 +0000 (18:28 -0300)]
perf top: Add missing initialization to zero

The dso_short_width has to start as zero, as we're calculating
the maximum short DSO name length, somehow I missed this one.

Reported-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268774926-27488-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agokprobes: Hide CONFIG_OPTPROBES and set if arch supports optimized kprobes
Masami Hiramatsu [Mon, 15 Mar 2010 17:00:54 +0000 (13:00 -0400)]
kprobes: Hide CONFIG_OPTPROBES and set if arch supports optimized kprobes

Hide CONFIG_OPTPROBES and set if the arch supports optimized
kprobes (IOW, HAVE_OPTPROBES=y), since this option doesn't
change the major behavior of kprobes, and workarounds for minor
changes are documented.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Dieter Ries <mail@dieterries.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100315170054.31593.3153.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Use original address instead of CU-based address
Masami Hiramatsu [Mon, 15 Mar 2010 17:02:35 +0000 (13:02 -0400)]
perf probe: Use original address instead of CU-based address

Use original address for looking up the location of variables
for dwarf_getlocation_addr() instead of CU-based address.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100315170235.31852.91195.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Fix offset to allow signed value
Masami Hiramatsu [Mon, 15 Mar 2010 17:02:28 +0000 (13:02 -0400)]
perf probe: Fix offset to allow signed value

Fix dereference offset to intmax_t from uintmax_t, because
it can have negative values (for example local variable's offset
from frame pointer).

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100315170228.31852.71946.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf top: Improve the autosizing of column lenghts
Arnaldo Carvalho de Melo [Mon, 15 Mar 2010 18:03:50 +0000 (15:03 -0300)]
perf top: Improve the autosizing of column lenghts

When profiling C++ workloads the symbol name length can be
really big, so cap it before it garbles the result.

This builds upon the autosizing already present where we choose
to use the short, basename of DSOs instead of its long, full
pathname.

Reported-by: Pavel Krauz <krauz@cngroup.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268676230-9261-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agox86, perf: Fix comments in Pentium-4 PMU definitions
Lin Ming [Tue, 16 Mar 2010 02:12:36 +0000 (10:12 +0800)]
x86, perf: Fix comments in Pentium-4 PMU definitions

Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1268705556.3379.8.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Fix unexported generic perf_arch_fetch_caller_regs
Frederic Weisbecker [Tue, 16 Mar 2010 00:05:02 +0000 (01:05 +0100)]
perf: Fix unexported generic perf_arch_fetch_caller_regs

perf_arch_fetch_caller_regs() is exported for the overriden x86
version, but not for the generic weak version.

As a general rule, weak functions should not have their symbol
exported in the same file they are defined.

So let's export it on trace_event_perf.c as it is used by trace
events only.

This fixes:

ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined!
ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!

-v2: And also only build it if trace events are enabled.
-v3: Fix changelog mistake

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf annotate: Properly notify the user that vmlinux is missing
Arnaldo Carvalho de Melo [Mon, 15 Mar 2010 16:04:33 +0000 (13:04 -0300)]
perf annotate: Properly notify the user that vmlinux is missing

Before this patch we would not find a vmlinux, then try to pass
objdump "[kernel.kallsyms]" as the filename, it would get
confused and produce no output:

 [root@doppio ~]# perf annotate n_tty_write

 ------------------------------------------------
  Percent |      Source code & Disassembly of [kernel.kallsyms]
 ------------------------------------------------

Now we check that and emit meaningful warning:

 [root@doppio ~]# perf annotate n_tty_write
 Can't annotate n_tty_write: No vmlinux file was found in the
 path: [0] vmlinux
 [1] /boot/vmlinux
 [2] /boot/vmlinux-2.6.34-rc1-tip+
 [3] /lib/modules/2.6.34-rc1-tip+/build/vmlinux
 [4] /usr/lib/debug/lib/modules/2.6.34-rc1-tip+/vmlinux
 [root@doppio ~]#

This bug was introduced when we added automatic search for
vmlinux, before that time the user had to specify a vmlinux
file.

v2: Print the warning just for the first symbol found when no
    symbol name is specified, otherwise it will spam the screen
    repeating the warning for each symbol.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268669073-6856-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf top: Properly notify the user that vmlinux is missing
Arnaldo Carvalho de Melo [Mon, 15 Mar 2010 14:46:58 +0000 (11:46 -0300)]
perf top: Properly notify the user that vmlinux is missing

Before this patch this message would very briefly appear on the
screen and then the screen would get updates only on the top,
for number of interrupts received, etc, but no annotation would
be performed:

 [root@doppio linux-2.6-tip]# perf top -s n_tty_write > /tmp/bla
 objdump: '[kernel.kallsyms]': No such file

Now this is what the user gets:

 [root@doppio linux-2.6-tip]# perf top -s n_tty_write
 Can't annotate n_tty_write: No vmlinux file was found in the
 path: [0] vmlinux
 [1] /boot/vmlinux
 [2] /boot/vmlinux-2.6.33-rc5
 [3] /lib/modules/2.6.33-rc5/build/vmlinux
 [4] /usr/lib/debug/lib/modules/2.6.33-rc5/vmlinux
 [root@doppio linux-2.6-tip]#

This bug was introduced when we added automatic search for
vmlinux, before that time the user had to specify a vmlinux
file.

Reported-by: David S. Miller <davem@davemloft.net>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268664418-28328-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf record: Enable the enable_on_exec flag if record forks the target
Eric B Munson [Mon, 15 Mar 2010 14:46:57 +0000 (11:46 -0300)]
perf record: Enable the enable_on_exec flag if record forks the target

When forking its target, perf record can capture data from
before the target application is started.  Perf stat uses the
enable_on_exec flag in the event attributes to keep from
displaying events from before the target program starts, this
patch adds the same functionality to perf record when it is will
fork the target process.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1268664418-28328-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Enable not tagged retired instruction counting on P4s
Cyrill Gorcunov [Mon, 15 Mar 2010 04:58:22 +0000 (12:58 +0800)]
perf, x86: Enable not tagged retired instruction counting on P4s

This should turn on instruction counting on P4s, which was missing in
the first version of the new PMU driver.

It's inaccurate for now, we still need dependant event to tag mops
before we can count them precisely. The result is that the number of
instruction may be lifted up.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1268629102.3355.11.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agox86, perf: Unmask LVTPC only if we have APIC supported
Cyrill Gorcunov [Sat, 13 Mar 2010 08:11:16 +0000 (11:11 +0300)]
x86, perf: Unmask LVTPC only if we have APIC supported

Ingo reported:

 |
 | There's a build failure on -tip with the P4 driver, on UP 32-bit, if
 | PERF_EVENTS is enabled but UP_APIC is disabled:
 |
 | arch/x86/built-in.o: In function `p4_pmu_handle_irq':
 | perf_event.c:(.text+0xa756): undefined reference to `apic'
 | perf_event.c:(.text+0xa76e): undefined reference to `apic'
 |

So we have to unmask LVTPC only if we're configured to have one.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100313081116.GA5179@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Fix need_dwarf flag if lazy matching is used
Masami Hiramatsu [Fri, 12 Mar 2010 23:22:24 +0000 (18:22 -0500)]
perf probe: Fix need_dwarf flag if lazy matching is used

Set need_dwarf if lazy matching pattern is specified, because
lazy matching requires real source path for which we must use
debuginfo.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100312232224.2017.54550.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Fix probe_point buffer overrun
Masami Hiramatsu [Fri, 12 Mar 2010 23:22:17 +0000 (18:22 -0500)]
perf probe: Fix probe_point buffer overrun

Fix probe_point array-size overrun problem. In some cases (e.g.
inline function), one user-specified probe-point can be
translated to many probe address, and it overruns pre-defined
array-size. This also removes redundant MAX_PROBES macro
definition.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: <stable@kernel.org>
LKML-Reference: <20100312232217.2017.45017.stgit@localhost6.localdomain6>
[ Note that only root can create new probes. Eventually we should remove
  the MAX_PROBES limit, but that is a larger patch not eligible to
  perf/urgent treatment. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Fix non-newt build
Arnaldo Carvalho de Melo [Sat, 13 Mar 2010 00:05:10 +0000 (21:05 -0300)]
perf tools: Fix non-newt build

The use_browser needs to be in a file that is always built and
also we need a browser__show_help stub in that case.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268438710-32697-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoMerge branch 'perf/x86' into perf/core
Ingo Molnar [Fri, 12 Mar 2010 20:06:35 +0000 (21:06 +0100)]
Merge branch 'perf/x86' into perf/core

Merge reason: The new P4 driver is stable and ready now for more
              testing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf hist: Don't fprintf the callgraph unconditionally
Arnaldo Carvalho de Melo [Fri, 12 Mar 2010 15:46:48 +0000 (12:46 -0300)]
perf hist: Don't fprintf the callgraph unconditionally

[root@doppio ~]# perf report -i newt.data | head -10
  # Samples: 11999679868
  #
  # Overhead  Command                  Shared Object  Symbol
  # ........  .......  .............................  ......
  #
      63.61%     perf  libslang.so.2.1.4              [.] SLsmg_write_chars
       6.30%     perf  perf                           [.] symbols__find
       2.19%     perf  libnewt.so.0.52.10             [.] newtListboxAppendEntry
       2.08%     perf  libslang.so.2.1.4              [.] SLsmg_write_chars@plt
       1.99%     perf  libc-2.10.2.so                 [.] _IO_vfprintf_internal
  [root@doppio ~]#

Not good, the newt form for report works, but slang has to eat
the cost of the additional callgraph lines everytime it prints a
line, and the callgraph doesn't appear on the screen, so move
the callgraph printing to a separate function and don't use it
in newt.c.

Newt tree widgets are being investigated to properly support
callgraphs, but till that gets merged, lets remove this huge
overhead and show at least the symbol overheads for a callgraph
rich perf.data with good performance.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268408808-13595-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf newt: Use newtGetScreenSize
Arnaldo Carvalho de Melo [Fri, 12 Mar 2010 15:46:47 +0000 (12:46 -0300)]
perf newt: Use newtGetScreenSize

For consistency, use the newt API more fully.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268408808-13595-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf newt: Add 'Q', 'q' and Ctrl+C as ways to exit from forms
Arnaldo Carvalho de Melo [Fri, 12 Mar 2010 13:48:12 +0000 (10:48 -0300)]
perf newt: Add 'Q', 'q' and Ctrl+C as ways to exit from forms

These are keys people expect when pressed to exit the current
widget, so have associate all of them to this semantic.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268401692-9361-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf report: Implement initial UI using newt
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 23:12:44 +0000 (20:12 -0300)]
perf report: Implement initial UI using newt

Newt has widespread availability and provides a rather simple
API as can be seen by the size of this patch.

The work needed to support it will benefit other frontends too.

In this initial patch it just checks if the output is a tty, if
not it falls back to the previous behaviour, also if
newt-devel/libnewt-dev is not installed the previous behaviour
is maintaned.

Pressing enter on a symbol will annotate it, ESC in the
annotation window will return to the report symbol list.

More work will be done to remove the special casing in
color_fprintf, stop using fmemopen/FILE in the printing of
hist_entries, etc.

Also the annotation doesn't need to be done via spawning "perf
annotate" and then browsing its output, we can do better by
calling directly the builtin-annotate.c functions, that would
then be moved to tools/perf/util/annotate.c and shared with perf
top, etc

But lets go by baby steps, this patch already improves perf
usability by allowing to quickly do annotations on symbols from
the report screen and provides a first experimentation with
libnewt/TUI integration of tools.

Tested on RHEL5 and Fedora12 X86_64 and on Debian PARISC64 to
browse a perf.data file collected on a Fedora12 x86_64 box.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Add missing bytes printed in hist_entry__fprintf
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 23:12:43 +0000 (20:12 -0300)]
perf tools: Add missing bytes printed in hist_entry__fprintf

We need those to properly size the browser widht in the newt
TUI.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Use eprintf for pr_{err,warning,info} too
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 23:12:42 +0000 (20:12 -0300)]
perf tools: Use eprintf for pr_{err,warning,info} too

Just like we do for pr_debug, so that we can have a single point
where to redirect to the currently used output system, be it
stdio or newt.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf top: Export get_window_dimensions
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 23:12:41 +0000 (20:12 -0300)]
perf top: Export get_window_dimensions

Will be used by the newt code too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf symbols: Bump plt synthesizing warning debug level
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 23:12:40 +0000 (20:12 -0300)]
perf symbols: Bump plt synthesizing warning debug level

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268349164-5822-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoMerge branch 'perf/urgent' into perf/core
Ingo Molnar [Fri, 12 Mar 2010 09:20:57 +0000 (10:20 +0100)]
Merge branch 'perf/urgent' into perf/core

Merge reason: We want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agox86, perf: Fix NULL deref on not assigned x86_pmu
Cyrill Gorcunov [Thu, 11 Mar 2010 21:50:16 +0000 (00:50 +0300)]
x86, perf: Fix NULL deref on not assigned x86_pmu

In case of not assigned x86_pmu and software events NULL dereference may
being hit via x86_pmu::schedule_events method.

Fix it by checking if x86_pmu is initialized at all.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100311215016.GG25162@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf record: Mention paranoid sysctl when failing to create counter
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 18:53:12 +0000 (15:53 -0300)]
perf record: Mention paranoid sysctl when failing to create counter

[acme@mica linux-2.6-tip]$ perf record -a -f
   Fatal: Permission error - are you root?
   Consider tweaking /proc/sys/kernel/perf_event_paranoid.

 [acme@mica linux-2.6-tip]$

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268333592-30872-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf record: Don't try to find buildids in a zero sized file
Arnaldo Carvalho de Melo [Thu, 11 Mar 2010 18:53:11 +0000 (15:53 -0300)]
perf record: Don't try to find buildids in a zero sized file

Fixing this symptom:

 [acme@mica linux-2.6-tip]$ perf record -a -f
   Fatal: Permission error - are you root?

 Bus error
 [acme@mica linux-2.6-tip]$

I.e. if for some reason no data is collected, in this case a non
root user trying to do systemwide profiling, no data will be
collected, and then we end up trying to mmap a zero sized file
and access the file header, b00m.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Implement initial P4 PMU driver
Cyrill Gorcunov [Thu, 11 Mar 2010 16:54:39 +0000 (19:54 +0300)]
perf, x86: Implement initial P4 PMU driver

The netburst PMU is way different from the "architectural
perfomance monitoring" specification that current CPUs use.
P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
perfomance monitoring events.

A few implementational details:

1) We need a separate x86_pmu::hw_config helper in struct
   x86_pmu since register bit-fields are quite different from P6,
   Core and later cpu series.

2) For the same reason is a x86_pmu::schedule_events helper
   introduced.

3) hw_perf_event::config consists of packed ESCR+CCCR values.
   It's allowed since in reality both registers only use a half
   of their size. Of course before making a real write into a
   particular MSR we need to unpack the value and extend it to
   a proper size.

4) The tuple of packed ESCR+CCCR in hw_perf_event::config
   doesn't describe the memory address of ESCR MSR register
   so that we need to keep a mapping between these tuples
   used and available ESCR (various P4 events may use same
   ESCRs but not simultaneously), for this sake every active
   event has a per-cpu map of hw_perf_event::idx <--> ESCR
   addresses.

5) Since hw_perf_event::idx is an offset to counter/control register
   we need to lift X86_PMC_MAX_GENERIC up, otherwise kernel
   strips it down to 8 registers and event armed may never be turned
   off (ie the bit in active_mask is set but the loop never reaches
   this index to check), thanks to Peter Zijlstra

Restrictions:

 - No cascaded counters support (do we ever need them?)
 - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
   doesn't work for now)
 - There are events with same counters which can't work simultaneously
   (need to use intersected ones due to broken counter 1)
 - No PERF_COUNT_HW_CACHE_ events yet

Todo:

 - Implement dependent events
 - Need proper hashing for event opcodes (no linear search, good for
   debugging stage but not in real loads)
 - Some events counted during a clock cycle -- need to set threshold
   for them and count every clock cycle just to get summary statistics
   (ie to behave the same way as other PMUs do)
 - Need to swicth to use event_constraints
 - To support RAW events we need to encode a global list of P4 events
   into p4_templates
 - Cache events need to be added

Event support status matrix:

 Event status
 -----------------------------
 cycles works
 cache-references works
 cache-misses works
 branch-misses works
 bus-cycles partially (does not work on 64bit cpu with HT enabled)
 instruction doesnt work (needs dependent event [mop tagging])
 branches doesnt work

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100311165439.GB5129@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf_events: Improve task_sched_in()
eranian@google.com [Thu, 11 Mar 2010 06:26:05 +0000 (22:26 -0800)]
perf_events: Improve task_sched_in()

This patch is an optimization in perf_event_task_sched_in() to avoid
scheduling the events twice in a row.

Without it, the perf_disable()/perf_enable() pair is invoked twice,
thereby pinned events counts while scheduling flexible events and we go
throuh hw_perf_enable() twice.

By encapsulating, the whole sequence into perf_disable()/perf_enable() we
ensure, hw_perf_enable() is going to be invoked only once because of the
refcount protection.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268288765-5326-1-git-send-email-eranian@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: export perf_trace_regs and perf_arch_fetch_caller_regs
Xiao Guangrong [Thu, 11 Mar 2010 07:30:35 +0000 (15:30 +0800)]
perf: export perf_trace_regs and perf_arch_fetch_caller_regs

Export perf_trace_regs and perf_arch_fetch_caller_regs since module will
use these.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
[ use EXPORT_PER_CPU_SYMBOL_GPL() ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix hw_perf_enable() event assignment
Peter Zijlstra [Thu, 11 Mar 2010 12:40:30 +0000 (13:40 +0100)]
perf, x86: Fix hw_perf_enable() event assignment

What happens is that we schedule badly like:

<...>-1987  [019]   280.252808: x86_pmu_start: event-46/1300c0: idx: 0
<...>-1987  [019]   280.252811: x86_pmu_start: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252812: x86_pmu_start: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252813: x86_pmu_start: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252814: x86_pmu_start: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
<...>-1987  [019]   280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252834: x86_pmu_start: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252834: x86_pmu_start: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252835: x86_pmu_start: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252836: x86_pmu_start: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*

This happens because we only iterate the n_running events in the first
pass, and reset their index to -1 if they don't match to force a
re-assignment.

Now, in our RR example, n_running == 0 because we fully unscheduled, so
event-50 will retain its idx==32, even though in scheduling it will have
gotten idx=0, and we don't trigger the re-assign path.

The easiest way to fix this is the below patch, which simply validates
the full assignment in the second pass.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268311069.5037.31.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, ppc: Fix compile error due to new cpu notifiers
Peter Zijlstra [Thu, 11 Mar 2010 12:06:56 +0000 (13:06 +0100)]
perf, ppc: Fix compile error due to new cpu notifiers

Fix:

  arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
  arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
  arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
  arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
  arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'

Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Make the install relative to DESTDIR if specified
John Kacur [Thu, 11 Mar 2010 12:57:00 +0000 (13:57 +0100)]
perf: Make the install relative to DESTDIR if specified

Without this change, the install path is relative to
prefix/DESTDIR where prefix is automatically set to $HOME.

This can produce unexpected results. For example:

  make -C tools/perf DESTDIR=/home/jkacur/tmp install-man

creates the directory: /home/jkacur/home/jkacur/tmp/share/...
instead of the expected: /home/jkacur/tmp/share/...

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Kyle McMartin <kyle@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agokprobes: Calculate the index correctly when freeing the out-of-line execution slot
Masami Hiramatsu [Tue, 9 Mar 2010 15:22:19 +0000 (10:22 -0500)]
kprobes: Calculate the index correctly when freeing the out-of-line execution slot

From : Ananth N Mavinakayanahalli <ananth@in.ibm.com>

When freeing the instruction slot, the arithmetic to calculate
the index of the slot in the page needs to account for the total
size of the instruction on the various architectures.

Calculate the index correctly when freeing the out-of-line
execution slot.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <4B9667AB.9050507@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Fix sparse CPU numbering related bugs
Paul Mackerras [Wed, 10 Mar 2010 09:36:09 +0000 (20:36 +1100)]
perf tools: Fix sparse CPU numbering related bugs

At present, the perf subcommands that do system-wide monitoring
(perf stat, perf record and perf top) don't work properly unless
the online cpus are numbered 0, 1, ..., N-1.  These tools ask
for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
and then try to create events for cpus 0, 1, ..., N-1.

This creates problems for systems where the online cpus are
numbered sparsely.  For example, a POWER6 system in
single-threaded mode (i.e. only running 1 hardware thread per
core) will have only even-numbered cpus online.

This fixes the problem by reading the /sys/devices/system/cpu/online
file to find out which cpus are online.  The code that does that is in
tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
function that sets up a cpumap[] array and returns the number of
online cpus.  If /sys/devices/system/cpu/online can't be read or
can't be parsed successfully, it falls back to using sysconf to
ask how many cpus are online and sets up an identity map in cpumap[].

The perf record, perf stat and perf top code then calls
read_cpu_map() in the system-wide monitoring case (instead of
sysconf) and uses cpumap[] to get the cpu numbers to pass to
perf_event_open.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf_event: Fix oops triggered by cpu offline/online
Paul Mackerras [Wed, 10 Mar 2010 09:45:52 +0000 (20:45 +1100)]
perf_event: Fix oops triggered by cpu offline/online

Anton Blanchard found that he could reliably make the kernel hit a
BUG_ON in the slab allocator by taking a cpu offline and then online
while a system-wide perf record session was running.

The reason is that when the cpu comes up, we completely reinitialize
the ctx field of the struct perf_cpu_context for the cpu.  If there is
a system-wide perf record session running, then there will be a struct
perf_event that has a reference to the context, so its refcount will
be 2.  (The perf_event has been removed from the context's group_entry
and event_entry lists by perf_event_exit_cpu(), but that doesn't
remove the perf_event's reference to the context and doesn't decrement
the context's refcount.)

When the cpu comes up, perf_event_init_cpu() gets called, and it calls
__perf_event_init_context() on the cpu's context.  That resets the
refcount to 1.  Then when the perf record session finishes and the
perf_event is closed, the refcount gets decremented to 0 and the
context gets kfreed after an RCU grace period.  Since the context
wasn't kmalloced -- it's part of a per-cpu variable -- bad things
happen.

In fact we don't need to completely reinitialize the context when the
cpu comes up.  It's sufficient to initialize the context once at boot,
but we need to do it for all possible cpus.

This moves the context initialization to happen at boot time.  With
this, we don't trash the refcount and the context never gets kfreed,
and we don't hit the BUG_ON.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Anton Blanchard <anton@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Drop the obsolete profile naming for trace events
Frederic Weisbecker [Fri, 5 Mar 2010 04:35:37 +0000 (05:35 +0100)]
perf: Drop the obsolete profile naming for trace events

Drop the obsolete "profile" naming used by perf for trace events.
Perf can now do more than simple events counting, so generalize
the API naming.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
14 years agoperf: Take a hot regs snapshot for trace events
Frederic Weisbecker [Wed, 3 Mar 2010 06:16:16 +0000 (07:16 +0100)]
perf: Take a hot regs snapshot for trace events

We are taking a wrong regs snapshot when a trace event triggers.
Either we use get_irq_regs(), which gives us the interrupted
registers if we are in an interrupt, or we use task_pt_regs()
which gives us the state before we entered the kernel, assuming
we are lucky enough to be no kernel thread, in which case
task_pt_regs() returns the initial set of regs when the kernel
thread was started.

What we want is different. We need a hot snapshot of the regs,
so that we can get the instruction pointer to record in the
sample, the frame pointer for the callchain, and some other
things.

Let's use the new perf_fetch_caller_regs() for that.

Comparison with perf record -e lock: -R -a -f -g
Before:

        perf  [kernel]                   [k] __do_softirq
               |
               --- __do_softirq
                  |
                  |--55.16%-- __open
                  |
                   --44.84%-- __write_nocancel

After:

            perf  [kernel]           [k] perf_tp_event
               |
               --- perf_tp_event
                  |
                  |--41.07%-- lock_acquire
                  |          |
                  |          |--39.36%-- _raw_spin_lock
                  |          |          |
                  |          |          |--7.81%-- hrtimer_interrupt
                  |          |          |          smp_apic_timer_interrupt
                  |          |          |          apic_timer_interrupt

The old case was producing unreliable callchains. Now having
right frame and instruction pointers, we have the trace we
want.

Also syscalls and kprobe events already have the right regs,
let's use them instead of wasting a retrieval.

v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>
14 years agoperf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
Frederic Weisbecker [Thu, 4 Mar 2010 20:15:56 +0000 (21:15 +0100)]
perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot

Events that trigger overflows by interrupting a context can
use get_irq_regs() or task_pt_regs() to retrieve the state
when the event triggered. But this is not the case for some
other class of events like trace events as tracepoints are
executed in the same context than the code that triggered
the event.

It means we need a different api to capture the regs there,
namely we need a hot snapshot to get the most important
informations for perf: the instruction pointer to get the
event origin, the frame pointer for the callchain, the code
segment for user_mode() tests (we always use __KERNEL_CS as
trace events always occur from the kernel) and the eflags
for further purposes.

v2: rename perf_save_regs to perf_fetch_caller_regs as per
Masami's suggestion.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>
14 years agoperf/x86-64: Use frame pointer to walk on irq and process stacks
Frederic Weisbecker [Wed, 3 Mar 2010 06:38:37 +0000 (07:38 +0100)]
perf/x86-64: Use frame pointer to walk on irq and process stacks

We were using the frame pointer based stack walker on every
contexts in x86-32, but not in x86-64 where we only use the
seven-league boots on the exception stacks.

Use it also on irq and process stacks. This utterly accelerate
the captures.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
14 years agolockdep: Move lock events under lockdep recursion protection
Frederic Weisbecker [Tue, 2 Feb 2010 22:34:40 +0000 (23:34 +0100)]
lockdep: Move lock events under lockdep recursion protection

There are rcu locked read side areas in the path where we submit
a trace event. And these rcu_read_(un)lock() trigger lock events,
which create recursive events.

One pair in do_perf_sw_event:

__lock_acquire
      |
      |--96.11%-- lock_acquire
      |          |
      |          |--27.21%-- do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--49.62%-- ftrace_profile_lock_release
      |          |          |          lock_release
      |          |          |          |
      |          |          |          |--33.85%-- _raw_spin_unlock

Another pair in perf_output_begin/end:

__lock_acquire
      |--23.40%-- perf_output_begin
      |          |          __perf_event_overflow
      |          |          perf_swevent_overflow
      |          |          perf_swevent_add
      |          |          perf_swevent_ctx_event
      |          |          do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--55.37%-- ftrace_profile_lock_acquire
      |          |          |          lock_acquire
      |          |          |          |
      |          |          |          |--37.31%-- _raw_spin_lock

The problem is not that much the trace recursion itself, as we have a
recursion protection already (though it's always wasteful to recurse).
But the trace events are outside the lockdep recursion protection, then
each lockdep event triggers a lock trace, which will trigger two
other lockdep events. Here the recursive lock trace event won't
be taken because of the trace recursion, so the recursion stops there
but lockdep will still analyse these new events:

To sum up, for each lockdep events we have:

lock_*()
     |
             trace lock_acquire
                  |
                  ----- rcu_read_lock()
                  |          |
                  |          lock_acquire()
                  |          |
                  |          trace_lock_acquire() (stopped)
                  |          |
  |          lockdep analyze
                  |
                  ----- rcu_read_unlock()
                             |
                             lock_release
                             |
                             trace_lock_release() (stopped)
                             |
                             lockdep analyze

And you can repeat the above two times as we have two rcu read side
sections when we submit an event.

This is fixed in this patch by moving the lock trace event under
the lockdep recursion protection.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
14 years agoperf report: Print the map table just after samples for which no map was found
Arnaldo Carvalho de Melo [Tue, 9 Mar 2010 18:58:17 +0000 (15:58 -0300)]
perf report: Print the map table just after samples for which no map was found

If -vv is used just the map table will be printed, -vvv will
print the symbol table too, with it we can see that we have a
bug where some samples are not being resolved to a map when we
get them in the perf.data stream, but after we have it all
processed, we can find the right map, some reordering probably
is happening.

Upcoming patches will provide ways to ask for most PERF_SAMPLE_
conditional samples to be taken for !PERF_RECORD_SAMPLE events
too, then we'll be able to ask for PERF_SAMPLE_TIME and
PERF_SAMPLE_CPU to help diagnose this.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268161097-17761-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf report: Add multiple event support
Eric B Munson [Fri, 5 Mar 2010 15:51:09 +0000 (12:51 -0300)]
perf report: Add multiple event support

Perf report does not handle multiple events being reported, even
though perf record stores them properly on disk.  This patch
addresses that issue by adding the logic to perf report to use
the event stream id that is saved by record and the new data
structures to seperate the event streams and report them
individually.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf session: Change perf_session post processing functions to take histogram tree
Eric B Munson [Fri, 5 Mar 2010 15:51:08 +0000 (12:51 -0300)]
perf session: Change perf_session post processing functions to take histogram tree

Now that report can store historgrams for multiple events we
need to be able to do the post processing work for each
histogram. This patch changes the post processing functions so
that they can be called individually for each event's histogram.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
[ Guarantee bisectabilty by fixing up builtin-report.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf session: Add storage for seperating event types in report
Eric B Munson [Fri, 5 Mar 2010 15:51:07 +0000 (12:51 -0300)]
perf session: Add storage for seperating event types in report

This patch adds the structures necessary to count each event
type independently in perf report.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf session: Change add_hist_entry to take the tree root instead of session
Eric B Munson [Fri, 5 Mar 2010 15:51:06 +0000 (12:51 -0300)]
perf session: Change add_hist_entry to take the tree root instead of session

In order to minimize the impact of storing multiple events in a
report this function will now take the root of the histogram
tree so that the logic for selecting the proper tree can be
inserted before the call.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf record: Add ID and to recorded event data when recording multiple events
Eric B Munson [Fri, 5 Mar 2010 15:51:05 +0000 (12:51 -0300)]
perf record: Add ID and to recorded event data when recording multiple events

Currently perf record does not write the ID or the to disk for
events. This doesn't allow report to tell if an event stream
contains one or more types of events.  This patch adds this
entry to the list of data that record will write to disk if more
than one event was requested.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf probe: Add missing variable initialization
Arnaldo Carvalho de Melo [Fri, 5 Mar 2010 15:51:04 +0000 (12:51 -0300)]
perf probe: Add missing variable initialization

cc1: warnings being treated as errors
 util/probe-finder.c: In function 'find_line_range':
 util/probe-finder.c:172: warning: 'src' may be used
 uninitialized in this function make: *** [util/probe-finder.o]
 Error 1

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf tools: Don't trow away old map slices not overlapped by new maps
Arnaldo Carvalho de Melo [Fri, 5 Mar 2010 14:54:02 +0000 (11:54 -0300)]
perf tools: Don't trow away old map slices not overlapped by new maps

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267800842-22324-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix the !CONFIG_CPU_SUP_INTEL build
Ingo Molnar [Sat, 6 Jun 2009 11:58:12 +0000 (13:58 +0200)]
perf, x86: Fix the !CONFIG_CPU_SUP_INTEL build

Fix typo. But the modularization here is ugly and should be improved.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Add INSTRUCTION_DECODER config flag
Ingo Molnar [Sat, 6 Jun 2009 11:58:12 +0000 (13:58 +0200)]
perf, x86: Add INSTRUCTION_DECODER config flag

The PEBS+LBR decoding magic needs the insn_get_length() infrastructure
to be able to decode x86 instruction length.

So split it out of KPROBES dependency and make it enabled when either
KPROBES or PERF_EVENTS is enabled.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix LBR read-out
Peter Zijlstra [Tue, 9 Mar 2010 10:51:02 +0000 (11:51 +0100)]
perf, x86: Fix LBR read-out

Don't decrement the TOS twice...

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fixup the PEBS handler for Core2 cpus
Peter Zijlstra [Tue, 9 Mar 2010 10:41:02 +0000 (11:41 +0100)]
perf, x86: Fixup the PEBS handler for Core2 cpus

Pull the core handler in line with the nhm one, also make sure we always
drain the buffer.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Remove checking_{wr,rd}msr() usage
Peter Zijlstra [Mon, 8 Mar 2010 12:51:31 +0000 (13:51 +0100)]
perf, x86: Remove checking_{wr,rd}msr() usage

We don't need checking_{wr,rd}msr() calls, since we should know what cpu
we're running on and not use blindly poke at msrs.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Don't reset the LBR as frequently
Peter Zijlstra [Mon, 8 Mar 2010 12:51:12 +0000 (13:51 +0100)]
perf, x86: Don't reset the LBR as frequently

If we reset the LBR on each first counter, simple counter rotation which
first deschedules all counters and then reschedules the new ones will
lead to LBR reset, even though we're still in the same task context.

Reduce this by not flushing on the first counter but only flushing on
different task contexts.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix silly bug in intel_pmu_pebs_{enable,disable}
Peter Zijlstra [Sat, 6 Mar 2010 18:49:06 +0000 (19:49 +0100)]
perf, x86: Fix silly bug in intel_pmu_pebs_{enable,disable}

We need to use the actual cpuc->pebs_enabled value, not a local copy for
the changes to take effect.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Deal with multiple state bits for pebs-fmt1
Peter Zijlstra [Sat, 6 Mar 2010 17:57:38 +0000 (18:57 +0100)]
perf, x86: Deal with multiple state bits for pebs-fmt1

Its unclear if the PEBS state record will have only a single bit set, in
case it does not and accumulates bits, deal with that by only processing
each event once.

Also, robustify some of the code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Reorder intel_pmu_enable_all()
Peter Zijlstra [Mon, 8 Mar 2010 12:57:14 +0000 (13:57 +0100)]
perf, x86: Reorder intel_pmu_enable_all()

The documentation says we have to enable PEBS before we enable the PMU
proper.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix LBR enable/disable vs cpuc->enabled
Peter Zijlstra [Sat, 6 Mar 2010 12:48:54 +0000 (13:48 +0100)]
perf, x86: Fix LBR enable/disable vs cpuc->enabled

We should never call ->enable with the pmu enabled, and we _can_ have
->disable called with the pmu enabled.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix PEBS enable/disable vs cpuc->enabled
Peter Zijlstra [Sat, 6 Mar 2010 12:47:07 +0000 (13:47 +0100)]
perf, x86: Fix PEBS enable/disable vs cpuc->enabled

We should never call ->enable with the pmu enabled, and we _can_ have
->disable called with the pmu enabled.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix pebs drains
Peter Zijlstra [Sat, 6 Mar 2010 12:26:11 +0000 (13:26 +0100)]
perf, x86: Fix pebs drains

I overlooked the perf_disable()/perf_enable() calls in
intel_pmu_handle_irq(), (pointed out by Markus) so we should not
explicitly disable_all/enable_all pebs counters in the drain functions,
these are already disabled and enabling them early is confusing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE)
Peter Zijlstra [Mon, 8 Mar 2010 16:51:33 +0000 (17:51 +0100)]
perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE)

Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result
in a double disable, cure this by using x86_pmu_{start,stop} for
throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Robustify PEBS fixup
Peter Zijlstra [Fri, 5 Mar 2010 15:29:14 +0000 (16:29 +0100)]
perf, x86: Robustify PEBS fixup

It turns out the LBR is massively unreliable on certain CPUs, so code the
fixup a little more defensive to avoid crashing the kernel.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100305154129.042271287@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Clear the LBRs on init
Peter Zijlstra [Fri, 5 Mar 2010 12:49:35 +0000 (13:49 +0100)]
perf, x86: Clear the LBRs on init

Some CPUs have errata where the LBR is not cleared on Power-On. So always
clear the LBRs before use.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100305154128.966563424@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Disable PEBS on clovertown chips
Peter Zijlstra [Thu, 4 Mar 2010 20:49:01 +0000 (21:49 +0100)]
perf, x86: Disable PEBS on clovertown chips

This CPU has just too many handycaps to be really useful.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100305154128.890278662@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix silly bug in data store buffer allocation
Peter Zijlstra [Fri, 5 Mar 2010 11:09:29 +0000 (12:09 +0100)]
perf, x86: Fix silly bug in data store buffer allocation

Fix up the ds allocation error path, where we could free @buffer before
we used it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100305154128.813452402@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agox86: Move MAX_INSN_SIZE into asm/insn.h
Peter Zijlstra [Thu, 4 Mar 2010 12:49:21 +0000 (13:49 +0100)]
x86: Move MAX_INSN_SIZE into asm/insn.h

Since there's now two users for this, place it in a common header.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.923774125@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Expose the full PEBS record using PERF_SAMPLE_RAW
Peter Zijlstra [Thu, 4 Mar 2010 11:38:03 +0000 (12:38 +0100)]
perf, x86: Expose the full PEBS record using PERF_SAMPLE_RAW

Expose the full PEBS record using PERF_SAMPLE_RAW

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.847218224@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Clean up IA32_PERF_CAPABILITIES usage
Peter Zijlstra [Wed, 3 Mar 2010 16:07:40 +0000 (17:07 +0100)]
perf, x86: Clean up IA32_PERF_CAPABILITIES usage

Saner PERF_CAPABILITIES support, which also exposes pebs_trap. Use that
latter to make PEBS's use of LBR conditional since a fault-like pebs
should already report the correct IP.

( As of this writing there is no known hardware that implements
  !pebs_trap )

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.770650663@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf-top: Show the percentage of successfull PEBS-fixups
Peter Zijlstra [Thu, 4 Mar 2010 13:19:36 +0000 (14:19 +0100)]
perf-top: Show the percentage of successfull PEBS-fixups

Use the PERF_RECORD_MISC_EXACT information to measure the success rate of
the PEBS fix-up.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.694233760@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: use LBR for PEBS IP+1 fixup
Peter Zijlstra [Wed, 3 Mar 2010 12:12:23 +0000 (13:12 +0100)]
perf, x86: use LBR for PEBS IP+1 fixup

Use the LBR to fix up the PEBS IP+1 issue.

As said, PEBS reports the next instruction, here we use the LBR to find
the last branch and from that construct the actual IP. If the IP matches
the LBR-TO, we use LBR-FROM, otherwise we use the LBR-TO address as the
beginning of the last basic block and decode forward.

Once we find a match to the current IP, we use the previous location.

This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which
conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction
that caused the event (barring CPU errata).

The fixup can fail due to various reasons:

 1) LBR contains invalid data (quite possible)
 2) part of the basic block got paged out
 3) the reported IP isn't part of the basic block (see 1)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.619375431@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Implement simple LBR support
Peter Zijlstra [Wed, 3 Mar 2010 11:02:30 +0000 (12:02 +0100)]
perf, x86: Implement simple LBR support

Implement simple suport Intel Last-Branch-Record, it supports all
hardware that implements FREEZE_LBRS_ON_PMI, but does not (yet) implement
the LBR config register.

The Intel LBR is a FIFO of From,To addresses describing the last few
branches the hardware took.

This patch does not add perf interface to the LBR, but merely provides an
interface for internal use.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.544191154@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Add attr->precise support to raw event parsing
Peter Zijlstra [Thu, 4 Mar 2010 12:57:24 +0000 (13:57 +0100)]
perf: Add attr->precise support to raw event parsing

Minimal userspace interface to the new 'precise' events flag.

Can be used like "perf top -e r00c0p" which will use PEBS to sample
retired instructions.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.468665803@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Add PEBS infrastructure
Peter Zijlstra [Tue, 2 Mar 2010 18:52:12 +0000 (19:52 +0100)]
perf, x86: Add PEBS infrastructure

This patch implements support for Intel Precise Event Based Sampling,
which is an alternative counter mode in which the counter triggers a
hardware assist to collect information on events. The hardware assist
takes a trap like snapshot of a subset of the machine registers.

This data is written to the Intel Debug-Store, which can be programmed
with a data threshold at which to raise a PMI.

With the PEBS hardware assist being trap like, the reported IP is always
one instruction after the actual instruction that triggered the event.

This implements a simple PEBS model that always takes a single PEBS event
at a time. This is done so that the interaction with the rest of the
system is as expected (freq adjust, period randomization, lbr,
callchains, etc.).

It adds an ABI element: perf_event_attr::precise, which indicates that we
wish to use this (constrained, but precise) mode.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.392111285@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf: Provide better condition for event rotation
Peter Zijlstra [Mon, 8 Mar 2010 12:51:20 +0000 (13:51 +0100)]
perf: Provide better condition for event rotation

Try to avoid useless rotation and PMU disables.

[ Could be improved by keeping a nr_runnable count to better account
  for the < PERF_STAT_INACTIVE counters ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix double enable calls
Peter Zijlstra [Sat, 6 Mar 2010 12:24:58 +0000 (13:24 +0100)]
perf, x86: Fix double enable calls

hw_perf_enable() would enable already enabled events.

This causes problems with code that assumes that ->enable/->disable calls
are balanced (like the LBR code does).

What happens is that events that were already running and left in place
would get enabled again.

Avoid this by only enabling new events that match their previous
assignment.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoperf, x86: Fix double disable calls
Peter Zijlstra [Sat, 6 Mar 2010 12:20:40 +0000 (13:20 +0100)]
perf, x86: Fix double disable calls

hw_perf_enable() would disable events that were not yet enabled.

This causes problems with code that assumes that ->enable/->disable calls
are balanced (like the LBR code does).

What happens is that we disable newly added counters that match their
previous assignment, even though they are not yet programmed on the
hardware.

Avoid this by only doing the first pass over the existing events.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>