GitHub/exynos8895/android_kernel_samsung_universal8895.git
15 years agoperf_counter: x86: Protect against infinite loops in intel_pmu_handle_irq()
Ingo Molnar [Fri, 15 May 2009 06:26:20 +0000 (08:26 +0200)]
perf_counter: x86: Protect against infinite loops in intel_pmu_handle_irq()

intel_pmu_handle_irq() can lock up in an infinite loop if the hardware
does not allow the acking of irqs. Alas, this happened in testing so
make this robust and emit a warning if it happens in the future.

Also, clean up the IRQ handlers a bit.

[ Impact: improve perfcounter irq/nmi handling robustness ]

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: Disallow interval of 1
Ingo Molnar [Fri, 15 May 2009 06:25:22 +0000 (08:25 +0200)]
perf_counter: x86: Disallow interval of 1

On certain CPUs i have observed a stuck PMU if interval was set to
1 and NMIs were used. The PMU had PMC0 set in MSR_CORE_PERF_GLOBAL_STATUS,
but it was not possible to ack it via MSR_CORE_PERF_GLOBAL_OVF_CTRL,
and the NMI loop got stuck infinitely.

[ Impact: fix rare hangs during high perfcounter load ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: Robustify interrupt handling
Peter Zijlstra [Thu, 14 May 2009 12:52:17 +0000 (14:52 +0200)]
perf_counter: x86: Robustify interrupt handling

Two consecutive NMIs could daze and confuse the machine when the
first would handle the overflow of both counters.

[ Impact: fix false-positive syslog messages under multi-session profiling ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: Rework the perf counter disable/enable
Peter Zijlstra [Wed, 13 May 2009 14:21:38 +0000 (16:21 +0200)]
perf_counter: Rework the perf counter disable/enable

The current disable/enable mechanism is:

token = hw_perf_save_disable();
...
/* do bits */
...
hw_perf_restore(token);

This works well, provided that the use nests properly. Except we don't.

x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
provide a reference counter disable/enable interface, where the first disable
disables the hardware, and the last enable enables the hardware again.

[ Impact: refactor, simplify the PMU disable/enable logic ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: Fix up the amd NMI/INT throttle
Peter Zijlstra [Wed, 13 May 2009 11:21:36 +0000 (13:21 +0200)]
perf_counter: x86: Fix up the amd NMI/INT throttle

perf_counter_unthrottle() restores throttle_ctrl, buts its never set.
Also, we fail to disable all counters when throttling.

[ Impact: fix rare stuck perf-counters when they are throttled ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: Fix perf_output_copy() WARN to account for overflow
Peter Zijlstra [Wed, 13 May 2009 19:26:19 +0000 (21:26 +0200)]
perf_counter: Fix perf_output_copy() WARN to account for overflow

The simple reservation test in perf_output_copy() failed to take
unsigned int overflow into account, fix this.

[ Impact: fix false positive warning with more than 4GB of profiling data ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: Allow unpriviliged use of NMIs
Peter Zijlstra [Wed, 13 May 2009 08:02:57 +0000 (10:02 +0200)]
perf_counter: x86: Allow unpriviliged use of NMIs

Apply sysctl_perf_counter_priv to NMIs. Also, fail the counter
creation instead of silently down-grading to regular interrupts.

[ Impact: allow wider perf-counter usage ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: Fix throttling
Ingo Molnar [Wed, 13 May 2009 10:54:01 +0000 (12:54 +0200)]
perf_counter: x86: Fix throttling

If counters are disabled globally when a perfcounter IRQ/NMI hits,
and if we throttle in that case, we'll promote the '0' value to
the next lapic IRQ and disable all perfcounters at that point,
permanently ...

Fix it.

[ Impact: fix hung perfcounters under load ]

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: More accurate counter update
Peter Zijlstra [Wed, 13 May 2009 07:45:19 +0000 (09:45 +0200)]
perf_counter: x86: More accurate counter update

Take the counter width into account instead of assuming 32 bits.

In particular Nehalem has 44 bit wide counters, and all
arithmetics should happen on a 44-bit signed integer basis.

[ Impact: fix rare event imprecision, warning message on Nehalem ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf record: Allow specifying a pid to record
Arnaldo Carvalho de Melo [Fri, 15 May 2009 01:50:46 +0000 (22:50 -0300)]
perf record: Allow specifying a pid to record

Allow specifying a pid instead of always fork+exec'ing a command.

Because the PERF_EVENT_COMM and PERF_EVENT_MMAP events happened before
we connected, we must synthesize them so that 'perf report' can get what
it needs.

[ Impact: add new command line option ]

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20090515015046.GA13664@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: fix print debug irq disable
Peter Zijlstra [Wed, 13 May 2009 06:12:51 +0000 (08:12 +0200)]
perf_counter: fix print debug irq disable

inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
bash/15802 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (sysrq_key_table_lock){?.....},

Don't unconditionally enable interrupts in the perf_counter_print_debug()
path.

[ Impact: fix potential deadlock pointed out by lockdep ]

LKML-Reference: <new-submission>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
15 years agoperf_counter: call hw_perf_save_disable/restore around group_sched_in
Paul Mackerras [Tue, 12 May 2009 11:59:01 +0000 (21:59 +1000)]
perf_counter: call hw_perf_save_disable/restore around group_sched_in

I noticed that when enabling a group via the PERF_COUNTER_IOC_ENABLE
ioctl on the group leader, the counters weren't enabled and counting
immediately on return from the ioctl, but did start counting a little
while later (presumably after a context switch).

The reason was that __perf_counter_enable calls group_sched_in which
calls hw_perf_group_sched_in, which on powerpc assumes that the caller
has called hw_perf_save_disable already.  Until commit 46d686c6
("perf_counter: put whole group on when enabling group leader") it was
true that all callers of group_sched_in had called
hw_perf_save_disable first, and the powerpc hw_perf_group_sched_in
relies on that (there isn't an x86 version).

This fixes the problem by putting calls to hw_perf_save_disable /
hw_perf_restore around the calls to group_sched_in and
counter_sched_in in __perf_counter_enable.  Having the calls to
hw_perf_save_disable/restore around the counter_sched_in call is
harmless and makes this call consistent with the other call sites
of counter_sched_in, which have all called hw_perf_save_disable first.

[ Impact: more precise counter group disable/enable functionality ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18953.25733.53359.147452@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: call atomic64_set for counter->count
Paul Mackerras [Mon, 11 May 2009 05:50:21 +0000 (15:50 +1000)]
perf_counter: call atomic64_set for counter->count

A compile warning triggered because we are calling
atomic_set(&counter->count). But since counter->count
is an atomic64_t, we have to use atomic64_set.

So the count can be set short, resulting in the reset ioctl
only resetting the low word.

[ Impact: clear counter properly during the reset ioctl ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18951.48285.270311.981806@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: don't count scheduler ticks as context switches
Paul Mackerras [Mon, 11 May 2009 05:46:10 +0000 (15:46 +1000)]
perf_counter: don't count scheduler ticks as context switches

The context-switch software counter gives inflated values at present
because each scheduler tick and each process-wide counter
enable/disable prctl gets counted as a context switch.

This happens because perf_counter_task_tick, perf_counter_task_disable
and perf_counter_task_enable all call perf_counter_task_sched_out,
which calls perf_swcounter_event to record a context switch event.

This fixes it by introducing a variant of perf_counter_task_sched_out
with two underscores in front for internal use within the perf_counter
code, and makes perf_counter_task_{tick,disable,enable} call it.  This
variant doesn't record a context switch event, and takes a struct
perf_counter_context *.  This adds the new variant rather than
changing the behaviour or interface of perf_counter_task_sched_out
because that is called from other code.

[ Impact: fix inflated context-switch event counts ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18951.48034.485580.498953@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: Put whole group on when enabling group leader
Paul Mackerras [Mon, 11 May 2009 02:08:02 +0000 (12:08 +1000)]
perf_counter: Put whole group on when enabling group leader

Currently, if you have a group where the leader is disabled and there
are siblings that are enabled, and then you enable the leader, we only
put the leader on the PMU, and not its enabled siblings.  This is
incorrect, since the enabled group members should be all on or all off
at any given point.

This fixes it by adding a call to group_sched_in in
__perf_counter_enable in the case where we're enabling a group leader.

To avoid the need for a forward declaration this also moves
group_sched_in up before __perf_counter_enable.  The actual content of
group_sched_in is unchanged by this patch.

[ Impact: fix bug in counter enable code ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18951.34946.451546.691693@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: clean up throttling printk
Mike Galbraith [Sun, 10 May 2009 08:53:05 +0000 (10:53 +0200)]
perf_counter, x86: clean up throttling printk

s/PERFMON/perfcounters for perfcounter interrupt throttling warning.

'perfmon' is the CPU feature name that is Intel-only, while we do
throttling in a generic way.

[ Impact: cleanup ]

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter tools: fix buffer overwrite problem for perf top command
Erdem Aktas [Sun, 10 May 2009 06:13:19 +0000 (02:13 -0400)]
perf_counter tools: fix buffer overwrite problem for perf top command

There is a buffer overwrite problem in builtin-top.c line 526, When I
tried to use ./perf top command, it was giving memory corruption
problem.

[ Impact: fix 'perf top' crash ]

LKML-Reference: <3fee128b0905092313x608e65e0l7b1116d86914114f@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter tools: remove debug code from builtin-stat.c
Ingo Molnar [Sat, 9 May 2009 08:04:22 +0000 (10:04 +0200)]
perf_counter tools: remove debug code from builtin-stat.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: add PERF_RECORD_CPU
Peter Zijlstra [Fri, 8 May 2009 16:52:24 +0000 (18:52 +0200)]
perf_counter: add PERF_RECORD_CPU

Allow recording the CPU number the event was generated on.

RFC: this leaves a u32 as reserved, should we fill in the
     node_id() there, or leave this open for future extention,
     as userspace can already easily do the cpu->node mapping
     if needed.

[ Impact: extend perfcounter output record format ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090508170029.008627711@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: add PERF_RECORD_CONFIG
Peter Zijlstra [Fri, 8 May 2009 16:52:23 +0000 (18:52 +0200)]
perf_counter: add PERF_RECORD_CONFIG

Much like CONFIG_RECORD_GROUP records the hw_event.config to
identify the values, allow to record this for all counters.

[ Impact: extend perfcounter output record format ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090508170028.923228280@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: rework ioctl()s
Peter Zijlstra [Fri, 8 May 2009 16:52:22 +0000 (18:52 +0200)]
perf_counter: rework ioctl()s

Corey noticed that ioctl()s on grouped counters didn't work on
the whole group. This extends the ioctl() interface to take a
second argument that is interpreted as a flags field. We then
provide PERF_IOC_FLAG_GROUP to toggle the behaviour.

Having this flag gives the greatest flexibility, allowing you
to individually enable/disable/reset counters in a group, or
all together.

[ Impact: fix group counter enable/disable semantics ]

Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090508170028.837558214@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: optimize perf_counter_task_tick()
Peter Zijlstra [Fri, 8 May 2009 16:52:21 +0000 (18:52 +0200)]
perf_counter: optimize perf_counter_task_tick()

perf_counter_task_tick() does way too much work to find out
there's nothing to do. Provide an easy short-circuit for the
normal case where there are no counters on the system.

[ Impact: micro-optimization ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090508170028.750619201@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'core/locking' into perfcounters/core
Ingo Molnar [Wed, 6 May 2009 06:46:27 +0000 (08:46 +0200)]
Merge branch 'core/locking' into perfcounters/core

Merge reason: we moved a mutex.h commit that originated from the
              perfcounters tree into core/locking - but now merge
      back that branch to solve a merge artifact and to
      pick up cleanups of this commit that happened in
      core/locking.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: tools: update the tools to support process and inherited counters
Peter Zijlstra [Tue, 5 May 2009 15:50:27 +0000 (17:50 +0200)]
perf_counter: tools: update the tools to support process and inherited counters

"perf record":
 - per task counter
 - inherit switch
 - nmi switch

"perf report":
 - userspace/kernel filter

"perf stat":
 - userspace/kernel filter

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090505155437.389163017@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: inheritable sample counters
Peter Zijlstra [Tue, 5 May 2009 15:50:26 +0000 (17:50 +0200)]
perf_counter: inheritable sample counters

Redirect the output to the parent counter and put in some sanity checks.

[ Impact: new perfcounter feature - inherited sampling counters ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090505155437.331556171@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: fix the output lock
Peter Zijlstra [Tue, 5 May 2009 15:50:25 +0000 (17:50 +0200)]
perf_counter: fix the output lock

Use -1 instead of 0 as unlocked, since 0 is a valid cpu number.

( This is not an issue right now but will be once we allow multiple
  counters to output to the same mmap area. )

[ Impact: prepare code for multi-counter profile output ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090505155437.232686598@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: provide an mlock threshold
Peter Zijlstra [Tue, 5 May 2009 15:50:24 +0000 (17:50 +0200)]
perf_counter: provide an mlock threshold

Provide a threshold to relax the mlock accounting, increasing usability.

Each counter gets perf_counter_mlock_kb for free.

[ Impact: allow more mmap buffering ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090505155437.112113632@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: add ioctl(PERF_COUNTER_IOC_RESET)
Peter Zijlstra [Tue, 5 May 2009 15:50:23 +0000 (17:50 +0200)]
perf_counter: add ioctl(PERF_COUNTER_IOC_RESET)

Provide a way to reset an existing counter - this eases PAPI
libraries around perfcounters.

Similar to read() it doesn't collapse pending child counters.

[ Impact: new perfcounter fd ioctl method to reset counters ]

Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090505155437.022272933@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: uncouple data_head updates from wakeups
Peter Zijlstra [Tue, 5 May 2009 15:50:22 +0000 (17:50 +0200)]
perf_counter: uncouple data_head updates from wakeups

Keep data_head up-to-date irrespective of notifications. This fixes
the case where you disable a counter and don't get a notification for
the last few pending events, and it also allows polling usage.

[ Impact: increase precision of perfcounter mmap-ed fields ]

Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090505155436.925084300@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: fix fixed-purpose counter support on v2 Intel-PERFMON
Ingo Molnar [Mon, 4 May 2009 17:04:09 +0000 (19:04 +0200)]
perf_counter: fix fixed-purpose counter support on v2 Intel-PERFMON

Fixed-purpose counters stopped working in a simple 'perf stat ls' run:

   <not counted>  cache references
   <not counted>  cache misses

Due to:

  ef7b3e0: perf_counter, x86: remove vendor check in fixed_mode_idx()

Which made x86_pmu.num_counters_fixed matter: if it's nonzero, the
fixed-purpose counters are utilized.

But on v2 perfmon this field is not set (despite there being
fixed-purpose PMCs). So add a quirk to set the number of fixed-purpose
counters to at least three.

[ Impact: add quirk for three fixed-purpose counters on certain Intel CPUs ]

Cc: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-28-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: convert perf_resource_mutex to a spinlock
Ingo Molnar [Mon, 4 May 2009 17:23:18 +0000 (19:23 +0200)]
perf_counter: convert perf_resource_mutex to a spinlock

Now percpu counters can be initialized very early. But the init
sequence uses mutex_lock(). Fortunately, perf_resource_mutex should
be a spinlock anyway, so convert it.

[ Impact: fix crash due to early init mutex use ]

LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: initialize the per-cpu context earlier
Ingo Molnar [Mon, 4 May 2009 17:13:30 +0000 (19:13 +0200)]
perf_counter: initialize the per-cpu context earlier

percpu scheduling for perfcounters wants to take the context lock,
but that lock first needs to be initialized. Currently it is an
early_initcall() - but that is too late, the task tick runs much
sooner than that.

Call it explicitly from the scheduler init sequence instead.

[ Impact: fix access-before-init crash ]

LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: x86: fixup nmi_watchdog vs perf_counter boo-boo
Peter Zijlstra [Mon, 4 May 2009 16:47:44 +0000 (18:47 +0200)]
perf_counter: x86: fixup nmi_watchdog vs perf_counter boo-boo

Invert the atomic_inc_not_zero() test so that we will indeed detect the
first activation.

Also rename the global num_counters, since its easy to confuse with
x86_pmu.num_counters.

[ Impact: fix non-working perfcounters on AMD CPUs, cleanup ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241455664.7620.4938.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: round-robin per-CPU counters too
Ingo Molnar [Mon, 4 May 2009 16:54:32 +0000 (18:54 +0200)]
perf_counter: round-robin per-CPU counters too

This used to be unstable when we had the rq->lock dependencies,
but now that they are that of the past we can turn on percpu
counter RR too.

[ Impact: handle counter over-commit for per-CPU counters too ]

LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter tools: fix build error
Mike Galbraith [Sat, 2 May 2009 06:02:36 +0000 (08:02 +0200)]
perf_counter tools: fix build error

ctype.h crawled out of the bit bucket :)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperfcounter tools: get the syscall number from arch/*/include/asm/unistd.h
Thomas Gleixner [Fri, 1 May 2009 16:48:06 +0000 (18:48 +0200)]
perfcounter tools: get the syscall number from arch/*/include/asm/unistd.h

Avoid further confusion during development

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agoperfcounter tools: fix pointer mismatch
Thomas Gleixner [Fri, 1 May 2009 16:42:47 +0000 (18:42 +0200)]
perfcounter tools: fix pointer mismatch

Neither process_options nor execvp take an const **char as argument.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agoperfcounter tools: make rdclock an inline function
Thomas Gleixner [Fri, 1 May 2009 16:39:47 +0000 (18:39 +0200)]
perfcounter tools: make rdclock an inline function

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agoperfcounter tools: move common defines ... to local header file
Thomas Gleixner [Fri, 1 May 2009 16:29:57 +0000 (18:29 +0200)]
perfcounter tools: move common defines ... to local header file

No change, move of duplicated stuff only.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agoperf_counter tools: remove build generated files
Thomas Gleixner [Fri, 1 May 2009 15:37:51 +0000 (17:37 +0200)]
perf_counter tools: remove build generated files

These files are generated during the build process. No need to have
them in the git repository.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agoperf_counter tools: fix x86 syscall numbers
Ingo Molnar [Fri, 1 May 2009 14:51:44 +0000 (16:51 +0200)]
perf_counter tools: fix x86 syscall numbers

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: documentation update
Peter Zijlstra [Fri, 1 May 2009 10:23:19 +0000 (12:23 +0200)]
perf_counter: documentation update

Update the documentation to reflect the current state of affairs

[ Impact: documentation update ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090501102533.296727903@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: tool: handle 0-length data files
Peter Zijlstra [Fri, 1 May 2009 10:23:18 +0000 (12:23 +0200)]
perf_counter: tool: handle 0-length data files

Avoid perf-report barfing on 0-length data files.

[ Impact: fix perf-report SIGBUS ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090501102533.196245693@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: fix nmi-watchdog interaction
Peter Zijlstra [Fri, 1 May 2009 10:23:17 +0000 (12:23 +0200)]
perf_counter: fix nmi-watchdog interaction

When we don't have any perf-counters active, don't act like we know
what the NMI is for.

[ Impact: fix hard hang with nmi_watchdog=2 ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090501102533.109867793@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: fix race in perf_output_*
Peter Zijlstra [Fri, 1 May 2009 10:23:16 +0000 (12:23 +0200)]
perf_counter: fix race in perf_output_*

When two (or more) contexts output to the same buffer, it is possible
to observe half written output.

Suppose we have CPU0 doing perf_counter_mmap(), CPU1 doing
perf_counter_overflow(). If CPU1 does a wakeup and exposes head to
user-space, then CPU2 can observe the data CPU0 is still writing.

[ Impact: fix occasionally corrupted profiling records ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090501102533.007821627@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'core/signal' into perfcounters/core
Thomas Gleixner [Thu, 30 Apr 2009 19:12:13 +0000 (21:12 +0200)]
Merge branch 'core/signal' into perfcounters/core

This is necessary to avoid the conflict of syscall numbers.

Conflicts:
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h

Fixes up the borked syscall numbers of perfcounters versus
preadv/pwritev as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agox86: hookup sys_rt_tgsigqueueinfo
Thomas Gleixner [Sat, 4 Apr 2009 21:01:10 +0000 (21:01 +0000)]
x86: hookup sys_rt_tgsigqueueinfo

Make the new sys_rt_tgsigqueueinfo available for x86.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
15 years agosignals: implement sys_rt_tgsigqueueinfo
Thomas Gleixner [Sat, 4 Apr 2009 21:01:06 +0000 (21:01 +0000)]
signals: implement sys_rt_tgsigqueueinfo

sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is
missing a thread directed counterpart. Such an interface is important
for migrating applications from other OSes which have the per thread
delivery implemented.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Ulrich Drepper <drepper@redhat.com>
15 years agosignals: split do_tkill
Thomas Gleixner [Sat, 4 Apr 2009 21:01:01 +0000 (21:01 +0000)]
signals: split do_tkill

Split out the code from do_tkill to make it reusable by the follow up
patch which implements sys_rt_tgsigqueueinfo

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
15 years agoperf_counter tools: fix infinite loop in perf-report on zeroed event records
Ingo Molnar [Thu, 30 Apr 2009 12:14:37 +0000 (14:14 +0200)]
perf_counter tools: fix infinite loop in perf-report on zeroed event records

Bail out early if a record has zero size - we have no chance to make
reliable progress in that case. Print out the offset where this happens,
and print the number of bytes we missed out on.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter tools: perf stat: make -l default-on
Ingo Molnar [Thu, 30 Apr 2009 11:53:33 +0000 (13:53 +0200)]
perf_counter tools: perf stat: make -l default-on

Turn on scaling display by default - this is less confusing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter tools: add perf-report to the Makefile
Ingo Molnar [Thu, 30 Apr 2009 11:52:19 +0000 (13:52 +0200)]
perf_counter tools: add perf-report to the Makefile

Build it explicitly until it's a proper builtin command.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomutex: add atomic_dec_and_mutex_lock(), fix
Andrew Morton [Wed, 29 Apr 2009 22:59:58 +0000 (15:59 -0700)]
mutex: add atomic_dec_and_mutex_lock(), fix

include/linux/mutex.h:136: warning: 'mutex_lock' declared inline after being called
 include/linux/mutex.h:136: warning: previous declaration of 'mutex_lock' was here

uninline it.

[ Impact: clean up and uninline, address compiler warning ]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <200904292318.n3TNIsi6028340@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: update copyright notice
Paul Mackerras [Wed, 29 Apr 2009 23:48:16 +0000 (09:48 +1000)]
perf_counter: update copyright notice

This adds my name to the list of copyright holders on the core
perf_counter.c, since I have contributed a significant amount of the
code in there.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <18936.59200.888049.746658@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoLinux 2.6.30-rc4
Linus Torvalds [Thu, 30 Apr 2009 04:48:16 +0000 (21:48 -0700)]
Linux 2.6.30-rc4

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs...
Linus Torvalds [Thu, 30 Apr 2009 04:25:53 +0000 (21:25 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ecryptfs/ecryptfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
  eCryptfs: Fix min function comparison warning
  ecryptfs: fix printk format warning

15 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 30 Apr 2009 04:25:39 +0000 (21:25 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  V4L/DVB (11652): au0828: fix kernel oops regression on USB disconnect.
  V4L/DVB (11626): cx23885: Two fixes for DViCO FusionHDTV DVB-T Dual Express
  V4L/DVB (11612): mx3_camera: Fix compilation with CONFIG_PM
  V4L/DVB (11570): patch: s2255drv: fix race condition on set mode
  V4L/DVB (11568): cx18: Fix the handling of i2c bus registration error
  V4L/DVB (11561a): move media after i2c
  V4L/DVB (11516): drivers/media/video/saa5246a.c: fix use-after-free
  V4L/DVB (11515): drivers/media/video/saa5249.c: fix use-after-free and leak
  V4L/DVB (11494a): cx231xx Kconfig fixes
  V4L/DVB (11494): cx18: Send correct input routing value to external audio multiplexers

15 years agolocking, rtmutex.c: Documentation cleanup
Luis Henriques [Wed, 29 Apr 2009 20:54:51 +0000 (21:54 +0100)]
locking, rtmutex.c: Documentation cleanup

Two minor updates on functions documentation:
 - Updated documentation for function rt_mutex_unlock(), which contained an
   incorrect name
 - Removed extra '*' from comment in function rt_mutex_destroy()

[ Impact: cleanup ]

Signed-off-by: Luis Henriques <henrix@sapo.pt>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20090429205451.GA23154@hades.domain.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rename bitmasks to ->used_mask and ->active_mask
Robert Richter [Wed, 29 Apr 2009 14:55:56 +0000 (16:55 +0200)]
perf_counter, x86: rename bitmasks to ->used_mask and ->active_mask

Standardize on explicitly mentioning '_mask' in fields that
are not plain flags but masks. This avoids typos like:

       if (cpuc->used)

(which could easily slip through review unnoticed), while if a
typo looks like this:

       if (cpuc->used_mask)

it might get noticed during review.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1241016956-24648-1-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoV4L/DVB (11652): au0828: fix kernel oops regression on USB disconnect.
Devin Heitmueller [Tue, 28 Apr 2009 16:14:07 +0000 (13:14 -0300)]
V4L/DVB (11652): au0828: fix kernel oops regression on USB disconnect.

A regression was introduced in hg changeset 33810c734a0d, which resulted in
a kernel panic whenever the device was disconnected from USB.  The call to
4l2_device_register() was overwriting the pointer for usb_set_intfdata(), so
when au0828_usb_disconnect() was called, the usb_get_intfdata() returned a
pointer to the v4l2_device instead of the au0828_dev structure.

Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11626): cx23885: Two fixes for DViCO FusionHDTV DVB-T Dual Express
Christopher Pascoe [Mon, 27 Apr 2009 14:27:04 +0000 (11:27 -0300)]
V4L/DVB (11626): cx23885: Two fixes for DViCO FusionHDTV DVB-T Dual Express

Two fixes for DViCO FusionHDTV DVB-T Dual Express:

 * Reset correct tuner when reinitializing xc3028.
 * Disable the I2C gate control to avoid locking up the I2C bus.

Tested-by: John Knops <jknops@australiaonline.net.au>
Reviewed-by: Steven Toth <stoth@linuxtv.org>
Signed-off-by: Christopher Pascoe <linuxdvb@itee.uq.edu.au>
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11612): mx3_camera: Fix compilation with CONFIG_PM
Sascha Hauer [Fri, 24 Apr 2009 15:58:24 +0000 (12:58 -0300)]
V4L/DVB (11612): mx3_camera: Fix compilation with CONFIG_PM

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11570): patch: s2255drv: fix race condition on set mode
Dean Anderson [Mon, 20 Apr 2009 22:07:44 +0000 (19:07 -0300)]
V4L/DVB (11570): patch: s2255drv: fix race condition on set mode

set_modeready flag must be set before command sent to USB in
s2255_write_config.

Signed-off-by: Dean Anderson <dean@sensoray.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11568): cx18: Fix the handling of i2c bus registration error
Jean Delvare [Fri, 17 Apr 2009 13:56:51 +0000 (10:56 -0300)]
V4L/DVB (11568): cx18: Fix the handling of i2c bus registration error

* Return actual error values as returned by the i2c subsystem, rather
  than 0 or 1.
* If the registration of the second bus fails, unregister the first one
  before exiting, otherwise we are leaking resources.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11561a): move media after i2c
Guennadi Liakhovetski [Tue, 21 Apr 2009 07:22:38 +0000 (04:22 -0300)]
V4L/DVB (11561a): move media after i2c

Currently drivers/media drivers are linked very early - directly after
base, block, misc, and mfd and before ata, scsi, ide, input, firewire,
usb, and i2c. This breaks static build of video4linux drivers, that use
generic CPU i2c adapter drivers and the v4l2-subdev subsystem, because
during video4linux probing the v4l2-subdev core requires a struct
i2c_adapter context, which cannot be satisfied before the i2c subsystem is
initialised. Moving drivers/media after drivers/i2c fixes this problem.

The best way to trigger action is by submitting a patch:-) So, let's see
what comes out of it - on the one hand I don't see any reason why media
has to be linked this early, and nobody was able to give me one yesterday
as this problem has been discussed on linux-media, OTOH, maybe indeed it
would be better to move i2c the whole way up above media, but that'd be
much bigger of a change, I think.
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11516): drivers/media/video/saa5246a.c: fix use-after-free
Dan Carpenter [Tue, 14 Apr 2009 22:51:30 +0000 (19:51 -0300)]
V4L/DVB (11516): drivers/media/video/saa5246a.c: fix use-after-free

I lowered the kfree(t) down a couple lines and removed the superflous
"t->vdev = NULL;"

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11515): drivers/media/video/saa5249.c: fix use-after-free and leak
Dan Carpenter [Tue, 14 Apr 2009 22:50:33 +0000 (19:50 -0300)]
V4L/DVB (11515): drivers/media/video/saa5249.c: fix use-after-free and leak

I moved the kfree() down a couple lines.  t->vdev is going to be in freed
memory so there is no point setting it to NULL.  I added a kfree(t) on a

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11494a): cx231xx Kconfig fixes
Mauro Carvalho Chehab [Tue, 14 Apr 2009 16:56:58 +0000 (13:56 -0300)]
V4L/DVB (11494a): cx231xx Kconfig fixes

selecting ALSA module breaks if !SND. Just remove select.

While here, let's fix the whitespacing at the Kconfig.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoV4L/DVB (11494): cx18: Send correct input routing value to external audio multiplexers
Andy Walls [Tue, 14 Apr 2009 00:43:31 +0000 (21:43 -0300)]
V4L/DVB (11494): cx18: Send correct input routing value to external audio multiplexers

A late v4l2_subdev framework change accidentally sent the audio input
routing value to the external multiplexer, instead of the muxer input routing
value to the external multiplexer.  This change corrects that error.

Signed-off-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Wed, 29 Apr 2009 14:55:45 +0000 (07:55 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
  e100: do not go D3 in shutdown unless system is powering off
  netfilter: revised locking for x_tables
  Bluetooth: Fix connection establishment with low security requirement
  Bluetooth: Add different pairing timeout for Legacy Pairing
  Bluetooth: Ensure that HCI sysfs add/del is preempt safe
  net: Avoid extra wakeups of threads blocked in wait_for_packet()
  net: Fix typo in net_device_ops description.
  ipv4: Limit size of route cache hash table
  Add reference to CAPI 2.0 standard
  Documentation/isdn/INTERFACE.CAPI
  update Documentation/isdn/00-INDEX
  ixgbe: Fix WoL functionality for 82599 KX4 devices
  veth: prevent oops caused by netdev destructor
  xfrm: wrong hash value for temporary SA
  forcedeth: tx timeout fix
  net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE
  mlx4_en: Handle page allocation failure during receive
  mlx4_en: Fix cleanup flow on cq activation
  vlan: update vlan carrier state for admin up/down
  netfilter: xt_recent: fix stack overread in compat code
  ...

15 years agomutex: add atomic_dec_and_mutex_lock()
Eric Paris [Mon, 23 Mar 2009 17:22:09 +0000 (18:22 +0100)]
mutex: add atomic_dec_and_mutex_lock()

Much like the atomic_dec_and_lock() function in which we take an hold a
spin_lock if we drop the atomic to 0 this function takes and holds the
mutex if we dec the atomic to 0.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.410913479@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: powerpc: allow use of limited-function counters
Paul Mackerras [Wed, 29 Apr 2009 12:38:51 +0000 (22:38 +1000)]
perf_counter: powerpc: allow use of limited-function counters

POWER5+ and POWER6 have two hardware counters with limited functionality:
PMC5 counts instructions completed in run state and PMC6 counts cycles
in run state.  (Run state is the state when a hardware RUN bit is 1;
the idle task clears RUN while waiting for work to do and sets it when
there is work to do.)

These counters can't be written to by the kernel, can't generate
interrupts, and don't obey the freeze conditions.  That means we can
only use them for per-task counters (where we know we'll always be in
run state; we can't put a per-task counter on an idle task), and only
if we don't want interrupts and we do want to count in all processor
modes.

Obviously some counters can't go on a limited hardware counter, but there
are also situations where we can only put a counter on a limited hardware
counter - if there are already counters on that exclude some processor
modes and we want to put on a per-task cycle or instruction counter that
doesn't exclude any processor mode, it could go on if it can use a
limited hardware counter.

To keep track of these constraints, this adds a flags argument to the
processor-specific get_alternatives() functions, with three bits defined:
one to say that we can accept alternative event codes that go on limited
counters, one to say we only want alternatives on limited counters, and
one to say that this is a per-task counter and therefore events that are
gated by run state are equivalent to those that aren't (e.g. a "cycles"
event is equivalent to a "cycles in run state" event).  These flags
are computed for each counter and stored in the counter->hw.counter_base
field (slightly wonky name for what it does, but it was an existing
unused field).

Since the limited counters don't freeze when we freeze the other counters,
we need some special handling to avoid getting skew between things counted
on the limited counters and those counted on normal counters.  To minimize
this skew, if we are using any limited counters, we read PMC5 and PMC6
immediately after setting and clearing the freeze bit.  This is done in
a single asm in the new write_mmcr0() function.

The code here is specific to PMC5 and PMC6 being the limited hardware
counters.  Being more general (e.g. having a bitmap of limited hardware
counter numbers) would have meant more complex code to read the limited
counters when freezing and unfreezing the normal counters, with
conditional branches, which would have increased the skew.  Since it
isn't necessary for the code to be more general at this stage, it isn't.

This also extends the back-ends for POWER5+ and POWER6 to be able to
handle up to 6 counters rather than the 4 they previously handled.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <18936.19035.163066.892208@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: add/update copyrights
Ingo Molnar [Wed, 29 Apr 2009 12:52:50 +0000 (14:52 +0200)]
perf_counter: add/update copyrights

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter: update 'perf top' documentation
Robert Richter [Wed, 29 Apr 2009 10:47:26 +0000 (12:47 +0200)]
perf_counter: update 'perf top' documentation

The documentation about the perf-top build was outdated after
perfstat has been implemented. This updates it.

[ Impact: update documentation ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-30-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: remove unused function argument in intel_pmu_get_status()
Robert Richter [Wed, 29 Apr 2009 10:47:25 +0000 (12:47 +0200)]
perf_counter, x86: remove unused function argument in intel_pmu_get_status()

The mask argument is unused and thus can be removed.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-29-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: remove vendor check in fixed_mode_idx()
Robert Richter [Wed, 29 Apr 2009 10:47:24 +0000 (12:47 +0200)]
perf_counter, x86: remove vendor check in fixed_mode_idx()

The function fixed_mode_idx() is used generically. Now it checks the
num_counters_fixed value instead of the vendor to decide if fixed
counters are present.

[ Impact: generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-28-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: introduce max_period variable
Robert Richter [Wed, 29 Apr 2009 10:47:23 +0000 (12:47 +0200)]
perf_counter, x86: introduce max_period variable

In x86 pmus the allowed counter period to programm differs. This
introduces a max_period value and allows the generic implementation
for all models to check the max period.

[ Impact: generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-27-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: return raw count with x86_perf_counter_update()
Robert Richter [Wed, 29 Apr 2009 10:47:22 +0000 (12:47 +0200)]
perf_counter, x86: return raw count with x86_perf_counter_update()

To check on AMD cpus if a counter overflows, the upper bit of the raw
counter value must be checked. This value is already internally
available in x86_perf_counter_update(). Now, the value is returned so
that it can be used directly to check for overflows.

[ Impact: micro-optimization ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-26-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: implement the interrupt handler for AMD cpus
Robert Richter [Wed, 29 Apr 2009 10:47:21 +0000 (12:47 +0200)]
perf_counter, x86: implement the interrupt handler for AMD cpus

This patch implements the interrupt handler for AMD performance
counters. In difference to the Intel pmu, there is no single status
register and also there are no fixed counters. This makes the handler
very different and it is useful to make the handler vendor
specific. To check if a counter is overflowed the upper bit of the
counter is checked. Only counters where the active bit is set are
checked.

With this patch throttling is enabled for AMD performance counters.

This patch also reenables Linux performance counters on AMD cpus.

[ Impact: re-enable perfcounters on AMD CPUs ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-25-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: change and remove pmu initialization checks
Robert Richter [Wed, 29 Apr 2009 10:47:20 +0000 (12:47 +0200)]
perf_counter, x86: change and remove pmu initialization checks

Some functions are only called if the pmu was proper initialized. That
initalization checks can be removed. The way to check initialization
changed too. Now, the pointer to the interrupt handler is checked. If
it exists the pmu is initialized. This also removes a static variable
and uses struct x86_pmu as only data source for the check.

[ Impact: simplify code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-24-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rework counter disable functions
Robert Richter [Wed, 29 Apr 2009 10:47:19 +0000 (12:47 +0200)]
perf_counter, x86: rework counter disable functions

As for the enable function, this patch reworks the disable functions
and introduces x86_pmu_disable_counter(). The internal function i/f in
struct x86_pmu changed too.

[ Impact: refactor and generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-23-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rework counter enable functions
Robert Richter [Wed, 29 Apr 2009 10:47:18 +0000 (12:47 +0200)]
perf_counter, x86: rework counter enable functions

There is vendor specific code in generic x86 code, and there is vendor
specific code that could be generic. This patch introduces
x86_pmu_enable_counter() for x86 generic code. Fixed counter code for
Intel is moved to Intel only functions. In the end, checks and calls
via function pointers were reduced to the necessary. Also, the
internal function i/f changed.

[ Impact: refactor and generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-22-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: consistent use of type int for counter index
Robert Richter [Wed, 29 Apr 2009 10:47:17 +0000 (12:47 +0200)]
perf_counter, x86: consistent use of type int for counter index

The type of counter index is sometimes implemented as unsigned
int. This patch changes this to have a consistent usage of int.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-21-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: generic use of cpuc->active
Robert Richter [Wed, 29 Apr 2009 10:47:16 +0000 (12:47 +0200)]
perf_counter, x86: generic use of cpuc->active

cpuc->active will now be used to indicate an enabled counter which
implies also valid pointers of cpuc->counters[]. In contrast,
cpuc->used only locks the counter, but it can be still uninitialized.

[ Impact: refactor and generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-20-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rename cpuc->active_mask
Robert Richter [Wed, 29 Apr 2009 10:47:15 +0000 (12:47 +0200)]
perf_counter, x86: rename cpuc->active_mask

This is to have a consistent naming scheme with cpuc->used.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-19-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: make x86_pmu_read() static inline
Robert Richter [Wed, 29 Apr 2009 10:47:14 +0000 (12:47 +0200)]
perf_counter, x86: make x86_pmu_read() static inline

[ Impact: micro-optimization ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-18-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: make pmu version generic
Robert Richter [Wed, 29 Apr 2009 10:47:13 +0000 (12:47 +0200)]
perf_counter, x86: make pmu version generic

This makes the use of the version variable generic. Also, some debug
messages have been generalized.

[ Impact: refactor and generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-17-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: move counter parameters to struct x86_pmu
Robert Richter [Wed, 29 Apr 2009 10:47:12 +0000 (12:47 +0200)]
perf_counter, x86: move counter parameters to struct x86_pmu

[ Impact: refactor and generalize code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-16-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: make x86_pmu data a static struct
Robert Richter [Wed, 29 Apr 2009 10:47:11 +0000 (12:47 +0200)]
perf_counter, x86: make x86_pmu data a static struct

Instead of using a pointer to reference to the x86 pmu we now have one
single data structure that is initialized at the beginning. This saves
the pointer access when using this memory.

[ Impact: micro-optimization ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-15-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: modify initialization of struct x86_pmu
Robert Richter [Wed, 29 Apr 2009 10:47:10 +0000 (12:47 +0200)]
perf_counter, x86: modify initialization of struct x86_pmu

This patch adds an error handler and changes initialization of struct
x86_pmu. No functional changes. Needed for follow-on patches.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-14-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rename intel only functions
Robert Richter [Wed, 29 Apr 2009 10:47:09 +0000 (12:47 +0200)]
perf_counter, x86: rename intel only functions

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-13-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rename __hw_perf_counter_set_period into x86_perf_counter_set_period
Robert Richter [Wed, 29 Apr 2009 10:47:08 +0000 (12:47 +0200)]
perf_counter, x86: rename __hw_perf_counter_set_period into x86_perf_counter_set_period

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-12-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: remove ack_status() from struct x86_pmu
Robert Richter [Wed, 29 Apr 2009 10:47:07 +0000 (12:47 +0200)]
perf_counter, x86: remove ack_status() from struct x86_pmu

This function is Intel only and not necessary for AMD cpus.

[ Impact: simplify code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-11-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: remove get_status() from struct x86_pmu
Robert Richter [Wed, 29 Apr 2009 10:47:06 +0000 (12:47 +0200)]
perf_counter, x86: remove get_status() from struct x86_pmu

This function is Intel only and not necessary for AMD cpus.

[ Impact: simplify code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-10-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: make interrupt handler model specific
Robert Richter [Wed, 29 Apr 2009 10:47:05 +0000 (12:47 +0200)]
perf_counter, x86: make interrupt handler model specific

This separates the perfcounter interrupt handler for AMD and Intel
cpus. The AMD interrupt handler implementation is a follow-on patch.

[ Impact: refactor and clean up code ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-9-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rename struct pmc_x86_ops into struct x86_pmu
Robert Richter [Wed, 29 Apr 2009 10:47:04 +0000 (12:47 +0200)]
perf_counter, x86: rename struct pmc_x86_ops into struct x86_pmu

This patch renames struct pmc_x86_ops into struct x86_pmu. It
introduces a structure to describe an x86 model specific pmu
(performance monitoring unit). It may contain ops and data. The new
name of the structure fits better, is shorter, and thus better to
handle. Where it was appropriate, names of function and variable have
been changed too.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-8-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperfcounters: rename struct hw_perf_counter_ops into struct pmu
Robert Richter [Wed, 29 Apr 2009 10:47:03 +0000 (12:47 +0200)]
perfcounters: rename struct hw_perf_counter_ops into struct pmu

This patch renames struct hw_perf_counter_ops into struct pmu. It
introduces a structure to describe a cpu specific pmu (performance
monitoring unit). It may contain ops and data. The new name of the
structure fits better, is shorter, and thus better to handle. Where it
was appropriate, names of function and variable have been changed too.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-7-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: protect per-cpu variables with compile barriers only
Robert Richter [Wed, 29 Apr 2009 10:47:02 +0000 (12:47 +0200)]
perf_counter, x86: protect per-cpu variables with compile barriers only

Per-cpu variables needn't to be protected with cpu barriers
(smp_wmb()). Protection is only needed for preemption on the same cpu
(rescheduling or the nmi handler). This can be done using a compiler
barrier only.

[ Impact: micro-optimization ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-6-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: rework pmc_amd_save_disable_all() and pmc_amd_restore_all()
Robert Richter [Wed, 29 Apr 2009 10:47:01 +0000 (12:47 +0200)]
perf_counter, x86: rework pmc_amd_save_disable_all() and pmc_amd_restore_all()

MSR reads and writes are expensive. This patch adds checks to avoid
its usage where possible.

[ Impact: micro-optimization on AMD CPUs ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-5-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoperf_counter, x86: add default path to cpu detection
Robert Richter [Wed, 29 Apr 2009 10:47:00 +0000 (12:47 +0200)]
perf_counter, x86: add default path to cpu detection

This quits hw counter initialization immediately if no cpu is
detected.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-4-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>