GitHub/LineageOS/android_kernel_motorola_exynos9610.git
Jiri Olsa [Tue, 26 Nov 2013 12:54:12 +0000 (13:54 +0100)]
perf tools: Fix tags/TAGS targets rebuilding

Once the tags/TAGS file is generated it's never rebuilt until it's
removed by hand.

The reason is that the Makefile does not treat tags/TAGS as targets but
as files and thus won't rebuild them once they are in place.

Add PHONY tags/TAGS targets to the Makefile.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20131126125412.GJ1267@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 27 Nov 2013 19:32:56 +0000 (16:32 -0300)]
perf timechart: Remove misplaced __maybe_unused

The 'event' parameter _is_ used.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-`ranpwd
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 27 Nov 2013 19:29:50 +0000 (16:29 -0300)]
perf timechart: Remove some needless struct forward declarations

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-jomi6mjv5zi9vsn4vmih5xps@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Wed, 27 Nov 2013 10:45:00 +0000 (14:45 +0400)]
perf timechart: dynamically determine event fields offset

Since b000c8065a92 "tracing: Remove the extra 4 bytes of padding in
events" removed padding bytes, perf timechart got out of sync with the
kernel's trace_entry structure.

Convert perf timechart to use dynamic field offsets (via
perf_evsel__intval) instead of relying on a hardcoded copy of the field
layout from the kernel.
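
As an illustration of the idea only, and not the perf implementation, the
sketch below looks up a field by name in a table of offsets and sizes instead
of casting the raw record to a hardcoded struct. The names field_desc and
field_intval are hypothetical; the real code uses perf_evsel__intval, whose
internals are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one field in a tracepoint's format. */
struct field_desc {
	const char *name;
	size_t offset;	/* byte offset inside the raw record */
	size_t size;	/* 1, 2, 4 or 8 bytes */
};

/*
 * Read an unsigned integer field by name from a raw record (host byte order).
 * Returns 0 on success, -1 if the field is unknown or has an unsupported size.
 */
int field_intval(const struct field_desc *fields, size_t nr_fields,
		 const void *raw, const char *name, uint64_t *val)
{
	assert(raw && name && val);

	for (size_t i = 0; i < nr_fields; i++) {
		const struct field_desc *f = &fields[i];
		const unsigned char *p = (const unsigned char *)raw + f->offset;

		if (strcmp(f->name, name) != 0)
			continue;

		/* memcpy avoids unaligned loads from the raw buffer. */
		switch (f->size) {
		case 1: { uint8_t v;  memcpy(&v, p, sizeof(v)); *val = v; return 0; }
		case 2: { uint16_t v; memcpy(&v, p, sizeof(v)); *val = v; return 0; }
		case 4: { uint32_t v; memcpy(&v, p, sizeof(v)); *val = v; return 0; }
		case 8: { uint64_t v; memcpy(&v, p, sizeof(v)); *val = v; return 0; }
		default: return -1;
		}
	}
	return -1;
}
```

Because the offsets come from a description of the event rather than from a
struct compiled into the tool, a change in the kernel's record layout only
changes the table contents, not the reader.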

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Chia-I Wu <olvaffe@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20131127104459.GB3309@stfomichev-desktop
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Tue, 26 Nov 2013 13:19:24 +0000 (15:19 +0200)]
perf symbols: Fix not finding kcore in buildid cache

The logic was not looking in the buildid cache for kcore if the host
kernel buildid did not match the recorded kernel buildid.

This affects the non-live case i.e. the kernel has changed and we are
looking at a special copy of kcore that we placed in the buildid cache
(using "perf buildid-cache -v -k /proc/kcore") when the data was
recorded.

After this fix kernel symbols get resolved/annotated correctly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1385471964-4037-1-git-send-email-adrian.hunter@intel.com
[ Added further explanation extracted from conversation between Ingo & Adrian on lkml ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 26 Nov 2013 08:54:26 +0000 (17:54 +0900)]
perf script: Print mmap[2] events also

If the --show-mmap-events option is given, also print internal MMAP and
MMAP2 events.  This is helpful for debugging.

  $ perf script --show-mmap-events
  ...
           sleep  9486 [009] 3350640.335531: PERF_RECORD_MMAP 9486/9486: [0x400000(0x6000) @ 0]: x /usr/bin/sleep
           sleep  9486 [009] 3350640.335542: PERF_RECORD_MMAP 9486/9486: [0x3153a00000(0x223000) @ 0]: x /usr/lib64/ld-2.17.so
           sleep  9486 [009] 3350640.335553: PERF_RECORD_MMAP 9486/9486: [0x7fff8b5fe000(0x2000) @ 0x7fff8b5fe000]: x [vdso]
           sleep  9486 [009] 3350640.335643: PERF_RECORD_MMAP 9486/9486: [0x3153e00000(0x3c0000) @ 0]: x /usr/lib64/libc-2.17.so

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385456066-26592-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 26 Nov 2013 08:51:12 +0000 (17:51 +0900)]
perf script: Print comm, fork and exit events also

If the --show-task-events option is given, also print internal COMM, FORK
and EXIT events.  This is helpful for debugging.

  $ perf script --show-task-events
  ...
         swapper     0 [009] 3350640.335261: sched:sched_switch: prev_comm=swapper/9
           sleep  9486 [009] 3350640.335509: PERF_RECORD_COMM: sleep:9486
           sleep  9486 [009] 3350640.335806: sched:sched_stat_runtime: comm=sleep pid=9486
         firefox  2635 [003] 3350641.275896: PERF_RECORD_FORK(2635:9487):(2635:2635)
         firefox  2635 [003] 3350641.275896: sched:sched_process_fork: comm=firefox pid=2635
           sleep  9486 [009] 3350641.336009: PERF_RECORD_EXIT(9486:9486):(9486:9486)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385455873-25865-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Wed, 20 Nov 2013 04:07:37 +0000 (21:07 -0700)]
perf script: Print callchains and symbols if they exist

The intent of perf-script is to dump the events and information in the
file. H/W, S/W and raw events all dump callchains if they are present;
might as well make that the default for tracepoints too.

v2: Only add options for sym, dso and ip if callchains are present

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1384920457-5986-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Mon, 18 Nov 2013 20:32:48 +0000 (13:32 -0700)]
perf tools: Export setup_list

Used in upcoming patches (perf sched timehist command).

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Mon, 18 Nov 2013 20:32:47 +0000 (13:32 -0700)]
perf thread: Move comm_list check into function

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Mon, 18 Nov 2013 20:32:45 +0000 (13:32 -0700)]
perf symbols: Move idle syms check from top to generic function

Allows list of idle symbols to be leveraged by other commands, such as
the upcoming timehist command.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Mon, 18 Nov 2013 20:32:44 +0000 (13:32 -0700)]
perf evsel: Skip ignored symbols while printing callchain

Allows a command to have a symbol_filter controlled by the user to skip
certain functions in a backtrace. One example is to allow the user to
reduce repeating patterns like:

    do_select  core_sys_select  sys_select

to just sys_select when dumping callchains, consuming less real estate
on the screen while still conveying the essential message - the process
is in a select call.

This option is leveraged by the upcoming timehist command.
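
A minimal sketch of the skipping logic, assuming a hypothetical frame struct
whose ignored flag has already been set by a user-controlled filter. The
names frame and format_callchain are made up for illustration; the NULL check
mirrors the note below about checking al.sym before touching al.sym->ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical resolved callchain entry. */
struct frame {
	const char *sym;	/* NULL when the address could not be resolved */
	bool ignored;		/* set by a user-controlled symbol filter */
};

/*
 * Write the names of the non-ignored frames to 'out', separated by spaces.
 * Returns the number of frames written.
 */
size_t format_callchain(const struct frame *frames, size_t nr,
			char *out, size_t outsz)
{
	size_t used = 0, printed = 0;

	assert(out && outsz > 0);
	out[0] = '\0';

	for (size_t i = 0; i < nr; i++) {
		int n;

		/* Only resolved frames can be ignored; check sym first. */
		if (frames[i].sym && frames[i].ignored)
			continue;

		n = snprintf(out + used, outsz - used, "%s%s",
			     printed ? " " : "",
			     frames[i].sym ? frames[i].sym : "[unknown]");
		if (n < 0 || (size_t)n >= outsz - used)
			break;	/* buffer full; stop after a partial write */
		used += (size_t)n;
		printed++;
	}
	return printed;
}
```

With do_select and core_sys_select marked as ignored, the chain from the
example above collapses to sys_select.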

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-2-git-send-email-dsahern@gmail.com
[ Checked if al.sym is NULL before touching al.sym->ignored, as noted by Adrian Hunter ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:51 +0000 (20:25 +0400)]
perf timechart: Add backtrace support

Add -g flag to `perf timechart record` which saves callchain info in the
perf.data.

When generating SVG, add backtrace information to the figure details, so
now it's possible to see which code path woke up the task and why some
task went to sleep.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-8-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:50 +0000 (20:25 +0400)]
perf timechart: Add support for -P and -T in timechart recording

If we don't want either power or task events we may use -T or -P with
the `perf timechart record` command to filter out events while recording
to keep perf.data small.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-7-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:49 +0000 (20:25 +0400)]
perf timechart: Group figures and add title with details

Add titles to figures so we can run SVG interactively in Firefox and
check event details in the tooltips.

This also aids exploring the SVG with Inkscape because when the user clicks
on one part of a logical figure, all of its parts are selected.

It's also possible to read titles with Inkscape in the object details.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-6-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:48 +0000 (20:25 +0400)]
perf timechart: Add support for displaying only tasks related data

To make the SVG smaller and faster to browse, add the ability to switch
off power-related information with the -T switch.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-5-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:47 +0000 (20:25 +0400)]
perf timechart: Use proc_num to implement --power-only

Don't use special flag to indicate power-only mode, just set proc_num to
0.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-4-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:46 +0000 (20:25 +0400)]
perf timechart: Add option to limit number of tasks

Add -n option to specify min. number of tasks to print.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-3-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stanislav Fomichev [Fri, 1 Nov 2013 16:25:45 +0000 (20:25 +0400)]
perf timechart: Always try to print at least 15 tasks

Always try to print at least 15 tasks no matter how long they run.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-2-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 18 Nov 2013 09:55:57 +0000 (11:55 +0200)]
perf record: Default -t option to no inheritance

The change to per-cpu mmaps causes the -p, -t and -u options now to have
inheritance enabled by default.  Change that back to no inheritance but
for the -t option only.

Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384768557-23331-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 18 Nov 2013 09:55:56 +0000 (11:55 +0200)]
perf tools: Add option macro OPT_BOOLEAN_SET

OPT_BOOLEAN_SET records whether a boolean option was set by the user.

That information can be used to change the default value for the option
after the options have been parsed.
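
The mechanism can be sketched as follows. This is a simplified model, not
the parse-options code: struct bool_opt, bool_opt_apply and bool_opt_default
are hypothetical names, and the real macro's arguments are not shown in this
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified option record. */
struct bool_opt {
	bool *value;	/* the option's value */
	bool *set;	/* becomes true only if the user passed the option */
};

/* Called by the (hypothetical) parser when the option appears on the command line. */
void bool_opt_apply(struct bool_opt *opt, bool unset)
{
	assert(opt && opt->value);
	*opt->value = !unset;
	if (opt->set)
		*opt->set = true;
}

/* After parsing: apply a context-dependent default unless the user chose a value. */
void bool_opt_default(struct bool_opt *opt, bool dflt)
{
	assert(opt && opt->value);
	if (opt->set && *opt->set)
		return;
	*opt->value = dflt;
}
```

The separate set flag is what distinguishes "the user asked for false" from
"the user said nothing", which a plain boolean cannot express.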

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384768557-23331-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 18 Nov 2013 09:55:55 +0000 (11:55 +0200)]
perf tools: Allow '--inherit' as the negation of '--no-inherit'

Long options can be negated by prefixing them with 'no-'.  However
options that already start with 'no-', such as '--no-inherit' result in
ugly double 'no's.

Avoid that by accepting that the removal of 'no-' also negates the long
option.
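
A rough sketch of the matching rule described above, assuming exact-name
comparison only (abbreviated options and option arguments are left out). The
function match_long_opt and its enum are hypothetical and not taken from the
perf sources.

```c
#include <assert.h>
#include <string.h>

enum match { NO_MATCH, MATCH_SET, MATCH_UNSET };

/*
 * Compare a user-supplied long option (without the leading "--") against an
 * option's declared long name.
 */
enum match match_long_opt(const char *arg, const char *long_name)
{
	assert(arg && long_name);

	if (strcmp(arg, long_name) == 0)
		return MATCH_SET;

	/* Usual rule: "no-" followed by the name negates the option. */
	if (strncmp(arg, "no-", 3) == 0 && strcmp(arg + 3, long_name) == 0)
		return MATCH_UNSET;

	/* Added rule: if the name itself starts with "no-", dropping it negates. */
	if (strncmp(long_name, "no-", 3) == 0 && strcmp(arg, long_name + 3) == 0)
		return MATCH_UNSET;

	return NO_MATCH;
}
```

In this model "--inherit" negates an option declared as "no-inherit", while
the older "--no-no-inherit" spelling keeps working.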

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384768557-23331-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 15 Nov 2013 13:52:29 +0000 (15:52 +0200)]
perf record: Make per-cpu mmaps the default.

This affects the -p, -t and -u options that previously defaulted to
per-thread mmaps.

Consequently add an option to select per-thread mmaps to support the old
behaviour.

Note that per-thread can be used with a workload only (i.e. when none of
-p, -t, -u, -a or -C is selected) to get a per-thread mmap with no
inheritance.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/5286271D.3020808@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 18 Nov 2013 05:34:52 +0000 (14:34 +0900)]
perf script: Move evname print code to process_event()

print_sample_start() will be reused by other printing routines for
internal events like COMM, FORK and EXIT, starting with the next patch.
Because those events are not tied to a specific event name, move the
evname print code to its caller.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1384752894-10974-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ramkumar Ramachandra [Sun, 17 Nov 2013 16:13:27 +0000 (21:43 +0530)]
perf completion: Rename file to reflect zsh support

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1384704807-15779-6-git-send-email-artagnon@gmail.com
[ Fix 'make install' target ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ramkumar Ramachandra [Sun, 17 Nov 2013 16:13:26 +0000 (21:43 +0530)]
perf completion: Introduce zsh support

__perfcomp(), __perfcomp_colon(), and _perf() have to be overridden.
Inspired by the way the git.git completion system is structured.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1384704807-15779-5-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ramkumar Ramachandra [Sun, 17 Nov 2013 16:13:25 +0000 (21:43 +0530)]
perf completion: Factor out call to __ltrim_colon_completions

In our sole callsite, __ltrim_colon_completions is called after
__perfcomp, to modify the COMPREPLY set by the invocation.

This is problematic, because in the zsh equivalent (using compset/
compadd), we'll have to generate completions in one shot.

So factor out this entire callsite into a special override'able
__perfcomp_colon function; we will override it when introducing zsh
support.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1384704807-15779-4-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ramkumar Ramachandra [Sun, 17 Nov 2013 16:13:24 +0000 (21:43 +0530)]
perf completion: Factor out compgen stuff

compgen is a bash-builtin; factor out the invocations into a separate
function to give us a chance to override it with a zsh equivalent in
future patches.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1384704807-15779-3-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ramkumar Ramachandra [Sun, 17 Nov 2013 16:13:23 +0000 (21:43 +0530)]
perf completion: Introduce a layer of indirection

Define the variables cur, words, cword, and prev outside the main
completion function so that we have a chance to override them when we
introduce zsh support.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1384704807-15779-2-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Ahern [Fri, 15 Nov 2013 03:51:30 +0000 (20:51 -0700)]
perf top: Make -g refer to callchains

In most commands -g is used for callchains. Make perf-top follow suit.
Move group to just --group with no short cut making it similar to
perf-record.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1384487490-6865-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pekka Enberg [Thu, 14 Nov 2013 16:43:30 +0000 (18:43 +0200)]
perf trace: Remove thread summary coloring

Thread summary line coloring looks ugly.  It doesn't add much value so
remove coloring completely.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1384447410-1771-1-git-send-email-penberg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt [Tue, 19 Nov 2013 23:29:37 +0000 (18:29 -0500)]
tools lib traceevent: Use helper trace-seq in print functions like kernel does

Jiri Olsa reported that his plugin for scsi was chopping off part of the
output. Investigating this, I found that Jiri used the same helper
functions as the kernel, which add the following:

trace_seq_putc(p, 0);

This adds a '\0' to the output string. The reason this works in the
kernel is that the "p" that is passed to the function helper is a
temporary trace_seq. But in the libtraceevent library, it's the pointer
to the trace_seq used to output. By adding the '\0', it truncates the
line and nothing added after that will be printed.

We can solve this in two ways. One is to have the helper functions for
the library not add the unnecessary '\0'. The other is to change the
library to also use a helper trace_seq structure that gets copied to the
main trace_seq just like the kernel does.

The latter allows the helper functions in the plugins to be the same as
the kernel, which is the better solution.
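
To make the truncation and the chosen fix concrete, here is a small
self-contained model. struct mini_seq and its functions are hypothetical
stand-ins for trace_seq; only the pattern (run the helper on a scratch
sequence, then copy the result) reflects the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a trace_seq: a fixed buffer plus a length. */
struct mini_seq {
	char buf[128];
	size_t len;
};

void mini_seq_putc(struct mini_seq *s, char c)
{
	assert(s);
	if (s->len < sizeof(s->buf) - 1)
		s->buf[s->len++] = c;
}

void mini_seq_puts(struct mini_seq *s, const char *str)
{
	while (*str)
		mini_seq_putc(s, *str++);
}

/* A print helper written the kernel way: it terminates its own output. */
void print_flags_helper(struct mini_seq *p)
{
	mini_seq_puts(p, "READ|WRITE");
	mini_seq_putc(p, 0);
}

/* Run the helper on a scratch sequence, then copy its C string into 'out'. */
void print_flags(struct mini_seq *out)
{
	struct mini_seq tmp = { .len = 0 };

	print_flags_helper(&tmp);
	mini_seq_puts(out, tmp.buf);	/* the copy stops at the helper's '\0' */
}
```

Calling print_flags_helper() directly on the output sequence embeds a '\0'
in the middle of the line, so anything appended later is never printed;
going through print_flags() keeps the terminator out of the main sequence.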

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131119182937.401668e3@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stephane Eranian [Tue, 12 Nov 2013 16:58:51 +0000 (17:58 +0100)]
perf/x86: Add RAPL hrtimer support

The RAPL PMU counters do not interrupt on overflow.
Therefore, the kernel needs to poll the counters
to avoid missing an overflow. This patch adds
the hrtimer code to do this.

The timer interval is calculated at boot time
based on the power unit used by the HW.

There is one hrtimer per-cpu to handle the case
of multiple simultaneous use across cores on
the same package + hotplug CPU.

Thanks to Maria Dimakopoulou for her contributions
to this patch especially on the math aspects.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
[ Applied 32-bit build fix. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1384275531-10892-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Stephane Eranian [Tue, 12 Nov 2013 16:58:50 +0000 (17:58 +0100)]
perf/x86: Add Intel RAPL PMU support

This patch adds a new uncore PMU to expose the Intel
RAPL energy consumption counters. Up to 3 counters,
each counting a particular RAPL event are exposed.

The RAPL counters are available on Intel SandyBridge, IvyBridge and
Haswell. The server SKUs add a 3rd counter.

The following events are available and exposed in sysfs:

  - power/energy-cores: power consumption of all cores on socket
  - power/energy-pkg: power consumption of all cores + LLC cache
  - power/energy-dram: power consumption of DRAM (servers only)

For each event both the unit (Joules) and scale (2^-32 J)
are exposed in sysfs for use by perf stat and other tools.
The files are:

/sys/devices/power/events/energy-*.unit
/sys/devices/power/events/energy-*.scale

The RAPL PMU is uncore by nature and is implemented such
that it only works in system-wide mode. Measuring only
one CPU per socket is sufficient. The /sys/devices/power/cpumask
file can be used by tools to figure out which CPUs to monitor
by default. For instance, on a 2-socket system, 2 CPUs
(one on each socket) will be shown.

All the counters measure in the same unit (exposed via sysfs).
The perf_events API exposes all RAPL counters as 64-bit integers
counting in unit of 1/2^32 Joules (about 0.23 nJ). User level tools
must convert the counts by multiplying them by 2^-32 to obtain
Joules. The reason for this is that the kernel avoids
doing floating point math whenever possible because it is
expensive (user floating-point state must be saved). The method
used avoids kernel floating-point usage. There is no loss of
precision. Thanks to PeterZ for suggesting this approach.

To convert the raw count to Watts:
   W = C * 2.3 / (1e10 * time)
or, equivalently, W = ldexp(C, -32) / time.
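
A user-space sketch of this conversion, assuming the raw count C is read as
a 64-bit integer and the measurement interval is known in seconds. The
function names rapl_joules and rapl_watts are made up for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Raw RAPL count to Joules: one count is 2^-32 J. */
double rapl_joules(uint64_t count)
{
	/* The cast is exact for counts below 2^53. */
	return ldexp((double)count, -32);
}

/* Average power in Watts over an interval of 'seconds'. */
double rapl_watts(uint64_t count, double seconds)
{
	assert(seconds > 0.0);
	return rapl_joules(count) / seconds;
}
```

ldexp() scales by a power of two, so the conversion itself adds no rounding
error beyond the integer-to-double cast.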

RAPL PMU is a new standalone PMU which registers with the
perf_event core subsystem. The PMU type (attr->type) is
dynamically allocated and is available from /sys/devices/power/type.

Sampling is not supported by the RAPL PMU. There is no
privilege level filtering either.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1384275531-10892-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Stephane Eranian [Tue, 12 Nov 2013 16:58:49 +0000 (17:58 +0100)]
tools/perf/stat: Add event unit and scale support

This patch adds perf stat support for handling event units and
scales as exported by the kernel.

The kernel can export a PMU event's actual unit and scaling factor
via sysfs:

  $ ls -1 /sys/devices/power/events/energy-*
  /sys/devices/power/events/energy-cores
  /sys/devices/power/events/energy-cores.scale
  /sys/devices/power/events/energy-cores.unit
  /sys/devices/power/events/energy-pkg
  /sys/devices/power/events/energy-pkg.scale
  /sys/devices/power/events/energy-pkg.unit
  $ cat /sys/devices/power/events/energy-cores.scale
  2.3283064365386962890625e-10
  $ cat /sys/devices/power/events/energy-cores.unit
  Joules

This patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:

   # perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
   #          time             counts   unit events
       1.000214717               3.07 Joules power/energy-pkg/         [100.00%]
       1.000214717               0.53 Joules power/energy-cores/
       1.000214717           12965028        cycles                    [100.00%]
       2.000749289               3.01 Joules power/energy-pkg/
       2.000749289               0.52 Joules power/energy-cores/
       2.000749289           15817043        cycles

When the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.
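
The scaling step described above amounts to one multiplication. The
following is a minimal sketch, not the perf implementation;
apply_event_scale() is a hypothetical helper name. The scale shown for
energy-cores, 2.3283064365386962890625e-10, equals 2^-32, so a raw count
of 2^32 corresponds to 1 Joule.

```c
#include <stdint.h>

/*
 * Convert a raw counter value into the unit named by the event's
 * .unit file by multiplying it with the factor read from the matching
 * .scale file. Hypothetical helper for illustration only.
 */
double apply_event_scale(uint64_t raw_count, double scale)
{
	return (double)raw_count * scale;
}
```

An event without a .scale file would presumably be handled by passing
a scale of 1.0, which leaves the count unchanged.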

Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago perf: Add active_entry list head to struct perf_event
Stephane Eranian [Tue, 12 Nov 2013 16:58:48 +0000 (17:58 +0100)]
perf: Add active_entry list head to struct perf_event

This patch adds a new field to the struct perf_event.
It is intended to be used to chain events which are
active (enabled). It helps in the hardware layer
for PMUs which do not have actual counter restrictions, i.e.,
free running read-only counters. Active events are chained
as opposed to being tracked via the counter they use.

To save space we use a union with hlist_entry as both
are mutually exclusive (suggested by Jiri Olsa).
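
The space saving can be sketched as follows. This is a reduced
illustration, not the real struct perf_event: struct demo_event and the
two list types are simplified stand-ins defined here so the fragment
compiles outside the kernel.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's list types. */
struct list_head { struct list_head *next, *prev; };
struct hlist_node { struct hlist_node *next, **pprev; };

/* Hypothetical, heavily reduced event structure. */
struct demo_event {
	union {
		struct hlist_node hlist_entry;  /* existing member */
		struct list_head active_entry;  /* new member, shares the storage */
	};
	int state;
};
```

Because an event uses only one of the two members at a time, both can
start at the same offset and the structure does not grow.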

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1384275531-10892-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg...
Ingo Molnar [Thu, 21 Nov 2013 08:59:27 +0000 (09:59 +0100)]
Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core

Pull uprobes cleanups from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago uprobes: Document xol_area and arch_uprobe->insn/ixol
Oleg Nesterov [Tue, 19 Nov 2013 16:20:21 +0000 (17:20 +0100)]
uprobes: Document xol_area and arch_uprobe->insn/ixol

Document xol_area and arch_uprobe.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
10 years ago uprobes: Cleanup !CONFIG_UPROBES decls, unexport xol_area
Oleg Nesterov [Sat, 9 Nov 2013 18:49:39 +0000 (19:49 +0100)]
uprobes: Cleanup !CONFIG_UPROBES decls, unexport xol_area

1. Don't include asm/uprobes.h unconditionally, we only need
   it if CONFIG_UPROBES.

2. Move the definition of "struct xol_area" into uprobes.c.

   Perhaps we should simply kill struct uprobes_state, it buys
   nothing.

3. Kill the dummy definition of uprobe_get_swbp_addr(), nobody
   except handle_swbp() needs it.

4. Purely cosmetic, but move the decl of uprobe_get_swbp_addr()
   up, close to other __weak helpers.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
10 years ago uprobes/powerpc: Kill arch_uprobe->ainsn
Oleg Nesterov [Sat, 9 Nov 2013 17:44:19 +0000 (18:44 +0100)]
uprobes/powerpc: Kill arch_uprobe->ainsn

powerpc has both arch_uprobe->insn and arch_uprobe->ainsn to
make the generic code happy. This is no longer needed after
the previous change, powerpc can just use "u32 insn".

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
10 years ago uprobes: Don't assume that arch_uprobe->insn/ixol is u8[MAX_UINSN_BYTES]
Oleg Nesterov [Sat, 9 Nov 2013 16:58:54 +0000 (17:58 +0100)]
uprobes: Don't assume that arch_uprobe->insn/ixol is u8[MAX_UINSN_BYTES]

arch_uprobe should be opaque as much as possible to the generic
code, but currently it assumes that insn/ixol must be u8[] of the
known size. Remove this unnecessary dependency; we can use "&" and
sizeof() with the same effect.
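
A minimal sketch of the idea, assuming two hypothetical layouts for the
insn member (a byte array and a single 32-bit word). The struct names
and the demo_copy_insn() macro are illustrative and do not appear in
the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Two hypothetical layouts of the stored instruction. */
struct demo_uprobe_bytes { uint8_t insn[16]; };
struct demo_uprobe_word { uint32_t insn; };

/*
 * Copy the stored instruction without knowing its type: "&" yields the
 * start address and sizeof() the length for either layout.
 */
#define demo_copy_insn(dst, auprobe) \
	memcpy((dst), &(auprobe)->insn, sizeof((auprobe)->insn))
```

The same macro works for both structures, which is the property the
generic code needs once it stops assuming a u8[] of known size.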

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
10 years ago uprobes: Add uprobe_task->dup_xol_work/dup_xol_addr
Oleg Nesterov [Fri, 8 Nov 2013 17:52:21 +0000 (18:52 +0100)]
uprobes: Add uprobe_task->dup_xol_work/dup_xol_addr

uprobe_task->vaddr is a bit strange. The generic code uses it only
to pass the additional argument to arch_uprobe_pre_xol(), and since
it is always equal to instruction_pointer() this looks even more
strange.

And both utask->vaddr and utask->autask have the same scope,
they only have the meaning when the task executes the probed insn
out-of-line, so it is safe to reuse both in UTASK_RUNNING state.

This all means that logically ->vaddr belongs to arch_uprobe_task
and we should probably move it there, arch_uprobe_pre_xol() can
record instruction_pointer() itself.

OTOH, it is also used by uprobe_copy_process() and dup_xol_work()
for another purpose; this doesn't look clean and prevents moving
this member into arch_uprobe_task.

This patch adds the union with 2 anonymous structs into uprobe_task.

The first struct is autask + vaddr, this way we "almost" move vaddr
into autask.

The second struct has 2 new members for uprobe_copy_process() paths:
->dup_xol_addr which can be used instead of ->vaddr, and ->dup_xol_work
which can be used to avoid kmalloc() and simplify the code.

Note that this union will likely have another member(s), we need
something like "private_data_for_handlers" so that the tracing
handlers could use it to communicate with call_fetch() methods.
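
The layout described above might look roughly like this. It is a sketch
under stated assumptions: struct demo_arch_task and struct demo_work
are stand-ins for the arch-specific task state and the work item, and
struct demo_uprobe_task is not the kernel's definition.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles outside the kernel. */
struct demo_arch_task { unsigned long saved; };
struct demo_work { void (*func)(struct demo_work *work); };

struct demo_uprobe_task {
	int state;
	union {
		/* meaningful only while the probed insn runs out of line */
		struct {
			struct demo_arch_task autask;
			unsigned long vaddr;
		};
		/* meaningful only on the uprobe_copy_process() paths */
		struct {
			struct demo_work dup_xol_work;
			unsigned long dup_xol_addr;
		};
	};
};
```

Both anonymous structs overlay the same storage, so the new members add
no space as long as the two uses never overlap in time.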

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
10 years ago Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 20 Nov 2013 13:29:46 +0000 (14:29 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  * Tag thread comm as overridden, showing the right comm for threads after forks.
    (Frederic Weisbecker)

  * Fix memory leak when processing perf.data file header. (Namhyung Kim)

  * Don't try to free string constant used for anonymous event groups. (Namhyung Kim)

  * Fix use of multiple options in processing field in libtraceevent. (Steven Rostedt)

  * Fix conversion of pointer to integer of different size in libtraceevent.
    (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago tools lib traceevent: Fix conversion of pointer to integer of different size
Arnaldo Carvalho de Melo [Tue, 19 Nov 2013 19:14:51 +0000 (16:14 -0300)]
tools lib traceevent: Fix conversion of pointer to integer of different size

gcc complains on 32-bit systems:

  /home/acme/git/linux/tools/lib/traceevent/event-parse.c: In function ‘eval_num_arg’:
  /home/acme/git/linux/tools/lib/traceevent/event-parse.c:3468:9: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

This is because eval_num_arg() returns everything as an 'unsigned long long',
so it converts a void pointer to a wider integer. Fix it by converting the void
pointer to an integer of the same size, 'unsigned long', before casting it to
'unsigned long long'.
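
The two-step cast can be sketched as below. demo_ptr_to_ull() is a
hypothetical helper, not the code in event-parse.c, and it assumes
'unsigned long' has pointer width, as on Linux.

```c
/*
 * Convert a pointer to 'unsigned long long' without the
 * -Wpointer-to-int-cast warning on 32-bit builds: cast to an integer of
 * pointer width first, then widen the integer.
 */
unsigned long long demo_ptr_to_ull(const void *ptr)
{
	return (unsigned long long)(unsigned long)ptr;
}
```

Code that must also build where 'unsigned long' is narrower than a
pointer could use uintptr_t from <stdint.h> for the intermediate step.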

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-yllx4aqcg06v5n4vjpwiiuld@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years ago perf/trace: Properly use u64 to hold event_id
Vince Weaver [Fri, 15 Nov 2013 17:39:45 +0000 (12:39 -0500)]
perf/trace: Properly use u64 to hold event_id

The 64-bit attr.config value for perf trace events was being copied into
an "int" before doing a comparison, meaning the top 32 bits were
being truncated.

As far as I can tell this didn't cause any errors, but it did mean
it was possible to create valid aliases for all the tracepoint ids
which I don't think was intended.  (For example, 0xffffffff00000018
and 0x18 both enable the same tracepoint).
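
The aliasing can be reproduced in a few lines. This is an illustrative
sketch with hypothetical function names; it narrows to uint32_t rather
than 'int' so that the conversion is well defined in portable C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old behaviour: the 64-bit config is narrowed before the comparison. */
bool demo_id_matches_truncated(uint64_t config, uint32_t event_id)
{
	uint32_t narrowed = (uint32_t)config;  /* top 32 bits are lost */

	return narrowed == event_id;
}

/* Fixed behaviour: compare the full 64-bit value. */
bool demo_id_matches(uint64_t config, uint32_t event_id)
{
	return config == (uint64_t)event_id;
}
```

With the truncating version, 0xffffffff00000018 matches tracepoint id
0x18; with the 64-bit comparison it does not.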

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1311151236100.11932@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago perf: Remove fragile swevent hlist optimization
Peter Zijlstra [Fri, 13 Sep 2013 11:14:47 +0000 (13:14 +0200)]
perf: Remove fragile swevent hlist optimization

Currently we only allocate a single cpu hashtable for per-cpu
swevents; do away with this optimization for it is fragile in the face
of things like perf_pmu_migrate_context().

The easiest thing is to make sure all CPUs are consistent wrt state.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130913111447.GN31370@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago ftrace, perf: Avoid infinite event generation loop
Peter Zijlstra [Thu, 14 Nov 2013 15:23:04 +0000 (16:23 +0100)]
ftrace, perf: Avoid infinite event generation loop

Vince's perf-trinity fuzzer found yet another 'interesting' problem.

When we sample the irq_work_exit tracepoint with period==1 (or
PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an
infinite event generation loop:

  ,-> <IPI>
  |     irq_work_exit() ->
  |       trace_irq_work_exit() ->
  |         ...
  |           __perf_event_overflow() -> (due to fasync)
  |             irq_work_queue() -> (irq_work_list must be empty)
  '---------      arch_irq_work_raise()

Similar things can happen due to regular poll() wakeups if we exceed
the ring-buffer wakeup watermark, or have an event_limit.

To avoid this, disallow sampling this particular tracepoint.

In order to achieve this, create a special perf_perm function pointer
for each event and call this (when set) on trying to create a
tracepoint perf event.

[ roasted: use expr... to allow for ',' in your expression ]

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years ago tools lib traceevent: Fix use of multiple options in processing field
Steven Rostedt [Mon, 18 Nov 2013 19:23:14 +0000 (14:23 -0500)]
tools lib traceevent: Fix use of multiple options in processing field

Jiri Olsa reported that the scsi_dispatch_cmd_done event failed to parse
with:

  Error: expected type 5 but read 4
  Error: expected type 5 but read 4

The problem is with this part of the print_fmt:

  __print_symbolic(((REC->result) >> 24) & 0xff, ...

The __print_symbolic() helper function's first parameter is the field to
use to determine what symbol to print based on the value of the result.
The parser can handle one operation, but it can not handle multiple
operations ('>>' and '&').

Add code to process all operations for the field argument for
__print_symbolic() as well as __print_flags().
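
For reference, the field expression from the print_fmt chains two
operations: a shift followed by a mask. A minimal sketch, assuming a
32-bit result value; demo_symbolic_key() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Evaluate ((result) >> 24) & 0xff: the shift moves the top byte down,
 * the mask then isolates it as the lookup key.
 */
unsigned int demo_symbolic_key(uint32_t result)
{
	return (result >> 24) & 0xff;
}
```

A parser that stops after the first operator would only see the shift
and fail on the '&' that follows.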

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131118142314.27ca334b@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years ago perf header: Fix possible memory leaks in process_group_desc()
Namhyung Kim [Mon, 18 Nov 2013 02:20:44 +0000 (11:20 +0900)]
perf header: Fix possible memory leaks in process_group_desc()

After processing all group descriptors or encountering an error, it
frees all descriptors.  However, the current logic can leak memory since
it might not traverse all descriptors.

Note that 'i' can differ from nr_groups when an error occurs, and it's
safe to call free(desc[i].name) for every desc since we already set it
to NULL when it's reused for group names.
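
The cleanup pattern can be sketched as follows. The struct and function
names are hypothetical and much simpler than the perf header code; the
point is only that the loop bound is nr_groups, not the index reached
when an error occurred.

```c
#include <stdlib.h>

struct demo_group_desc {
	char *name;
};

/*
 * Free every descriptor name. free(NULL) is a no-op, so entries whose
 * name was handed over elsewhere and reset to NULL are skipped safely.
 */
void demo_free_group_descs(struct demo_group_desc *desc, size_t nr_groups)
{
	size_t i;

	for (i = 0; i < nr_groups; i++)
		free(desc[i].name);
	free(desc);
}
```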

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1384741244-7271-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years ago perf header: Fix bogus group name
Namhyung Kim [Mon, 18 Nov 2013 02:20:43 +0000 (11:20 +0900)]
perf header: Fix bogus group name

When processing an event group descriptor in the perf file header, we
reuse an allocated group name but forgot to prevent it from being freed.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1384741244-7271-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years ago perf tools: Tag thread comm as overriden
Frederic Weisbecker [Sat, 16 Nov 2013 01:02:09 +0000 (02:02 +0100)]
perf tools: Tag thread comm as overriden

The problem is that when a thread overrides its default ":%pid" comm, we
forget to tag the thread comm as overridden. Hence, this overridden comm
is not inherited on future forks. Fix it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131116010207.GA18855@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 years ago seq_file: always clear m->count when we free m->buf
Al Viro [Tue, 19 Nov 2013 01:20:43 +0000 (01:20 +0000)]
seq_file: always clear m->count when we free m->buf

Once we'd freed m->buf, m->count should become zero - we have no valid
contents reachable via m->buf.
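
The invariant can be illustrated with a user-space sketch. struct
demo_seq and demo_seq_release_buf() are hypothetical and use free()
where the kernel would use its own allocator.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_seq {
	char *buf;     /* output buffer */
	size_t size;   /* allocated size of buf */
	size_t count;  /* number of valid bytes in buf */
};

/* Release the buffer and reset count so no stale length survives. */
void demo_seq_release_buf(struct demo_seq *m)
{
	free(m->buf);
	m->buf = NULL;
	m->count = 0;  /* nothing is reachable through m->buf any more */
}
```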

Reported-by: Charley (Hao Chuan) Chu <charley.chu@broadcom.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years ago Merge git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Mon, 18 Nov 2013 23:56:13 +0000 (15:56 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog changes from Wim Van Sebroeck:
 - addition of MOXA ART watchdog driver (moxart_wdt)
 - addition of CSR SiRFprimaII and SiRFatlasVI watchdog driver
   (sirfsoc_wdt)
 - addition of ralink watchdog driver (rt2880_wdt)
 - various fixes and cleanups (__user annotation, ioctl return codes,
   removal of redundant of_match_ptr, removal of unnecessary
   amba_set_drvdata(), use allocated buffer for usb_control_msg, ...)
 - removal of MODULE_ALIAS_MISCDEV statements
 - watchdog related DT bindings
 - first set of improvements on the w83627hf_wdt driver

* git://www.linux-watchdog.org/linux-watchdog: (26 commits)
  watchdog: w83627hf: Use helper functions to access superio registers
  watchdog: w83627hf: Enable watchdog device only if not already enabled
  watchdog: w83627hf: Enable watchdog only once
  watchdog: w83627hf: Convert to watchdog infrastructure
  watchdog: omap_wdt: raw read and write endian fix
  watchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE
  watchdog: pcwd_usb: overflow in usb_pcwd_send_command()
  watchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe()
  watchdog: watchdog_core: Fix a trivial typo
  watchdog: dw: Enable OF support for DW watchdog timer
  watchdog: Get rid of MODULE_ALIAS_MISCDEV statements
  watchdog: ts72xx_wdt: Propagate return value from timeout_to_regval
  watchdog: pcwd_usb: Use allocated buffer for usb_control_msg
  watchdog: sp805_wdt: Remove unnecessary amba_set_drvdata()
  watchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI
  watchdog: Remove redundant of_match_ptr
  watchdog: ts72xx_wdt: cleanup return codes in ioctl
  documentation/devicetree: Move DT bindings from gpio to watchdog
  watchdog: add ralink watchdog driver
  watchdog: Add MOXA ART watchdog driver
  ...

10 years ago Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Mon, 18 Nov 2013 23:50:07 +0000 (15:50 -0800)]
Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c changes from Wolfram Sang:
 - new drivers for exynos5, bcm kona, and st micro
 - bigger overhauls for drivers mxs and rcar
 - typical driver bugfixes, cleanups, improvements
 - got rid of the superfluous 'driver' member in i2c_client struct. This
   touches a few drivers in other subsystems.  All acked.

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
  i2c: bcm-kona: fix error return code in bcm_kona_i2c_probe()
  i2c: i2c-eg20t: do not print error message in syslog if no ACK received
  i2c: bcm-kona: Introduce Broadcom I2C Driver
  i2c: cbus-gpio: Fix device tree binding
  i2c: wmt: add missing clk_disable_unprepare() on error
  i2c: designware: add new ACPI IDs
  i2c: i801: Add Device IDs for Intel Wildcat Point-LP PCH
  i2c: exynos5: Remove incorrect clk_disable_unprepare
  i2c: i2c-st: Add ST I2C controller
  i2c: exynos5: add High Speed I2C controller driver
  i2c: rcar: fixup rcar type naming
  i2c: scmi: remove some bogus NULL checks
  i2c: sh_mobile & rcar: Enable the driver on all ARM platforms
  i2c: sh_mobile: Convert to clk_prepare/unprepare
  i2c: mux: gpio: use reg value for i2c_add_mux_adapter
  i2c: mux: gpio: use gpio_set_value_cansleep()
  i2c: Include linux/of.h header
  i2c: mxs: Fix PIO mode on i.MX23
  i2c: mxs: Rework the PIO mode operation
  i2c: mxs: distinguish i.MX23 and i.MX28 based I2C controller
  ...

10 years ago Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Mon, 18 Nov 2013 23:36:04 +0000 (15:36 -0800)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma updates from Roland Dreier:
 - Re-enable flow steering verbs with new improved userspace ABI
 - Fixes for slow connection due to GID lookup scalability
 - IPoIB fixes
 - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
 - Further improvements to SRP error handling
 - Add new transport type for Cisco usNIC

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
  IB/core: Re-enable create_flow/destroy_flow uverbs
  IB/core: extended command: an improved infrastructure for uverbs commands
  IB/core: Remove ib_uverbs_flow_spec structure from userspace
  IB/core: Use a common header for uverbs flow_specs
  IB/core: Make uverbs flow structure use names like verbs ones
  IB/core: Rename 'flow' structs to match other uverbs structs
  IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
  IB/ucma: Convert use of typedef ctl_table to struct ctl_table
  IB/cm: Convert to using idr_alloc_cyclic()
  IB/mlx5: Fix page shift in create CQ for userspace
  IB/mlx4: Fix device max capabilities check
  IB/mlx5: Fix list_del of empty list
  IB/mlx5: Remove dead code
  IB/core: Encorce MR access rights rules on kernel consumers
  IB/mlx4: Fix endless loop in resize CQ
  RDMA/cma: Remove unused argument and minor dead code
  RDMA/ucma: Discard events for IDs not yet claimed by user space
  IB/core: Add Cisco usNIC rdma node and transport types
  RDMA/nes: Remove self-assignment from nes_query_qp()
  IB/srp: Report receive errors correctly
  ...

10 years ago Merge tag 'for-v3.13' of git://git.infradead.org/battery-2.6
Linus Torvalds [Mon, 18 Nov 2013 23:35:09 +0000 (15:35 -0800)]
Merge tag 'for-v3.13' of git://git.infradead.org/battery-2.6

Pull battery updates from Anton Vorontsov:
 "Highlights:
   - A new driver for TI BQ24735 Battery Chargers, courtesy of NVidia.
   - Device tree bindings for TWL4030 chips.
   - Random fixes and cleanups"

* tag 'for-v3.13' of git://git.infradead.org/battery-2.6:
  pm2301-charger: Remove unneeded NULL checks
  twl4030_charger: Add devicetree support
  power_supply: Fix documentation for TEMP_*ALERT* properties
  max17042_battery: Support regmap to access device's registers
  max17042_battery: Use SIMPLE_DEV_PM_OPS
  charger-manager : Replace kzalloc to devm_kzalloc and remove uneccessary code
  bq2415x_charger: Fix max battery regulation voltage
  tps65090-charger: Use "IS_ENABLED(CONFIG_OF)" for DT code
  tps65090-charger: Drop devm_free_irq of devm_ allocated irq
  power_supply: Add support for bq24735 charger
  pm2301-charger: Staticize pm2xxx_charger_die_therm_mngt
  pm2301-charger: Check return value of regulator_enable
  ab8500-charger: Remove redundant break
  ab8500-charger: Check return value of regulator_enable
  isp1704_charger: Fix driver to work with changes introduced in v3.5

10 years ago Merge branch 'topic/kbuild-fixes-for-next' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 18 Nov 2013 23:10:05 +0000 (15:10 -0800)]
Merge branch 'topic/kbuild-fixes-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media build fixes from Mauro Carvalho Chehab:
 "A series of patches that fix compilation on non-x86 archs.

  While most of them are just build fixes, there are some fixes for real
  bugs, as there are a number of drivers using dynamic stack allocation.
  A few of those might be considered a security risk, if the i2c-dev
  module is loaded, as someone could be sending very long I2C data that
  could potentially overflow the Kernel stack.  Ok, as using /dev/i2c-*
  devnodes usually requires root on usual distros, and exploiting it
  would require a DVB board or USB stick, the risk is not high"

* 'topic/kbuild-fixes-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (28 commits)
  [media] platform drivers: Fix build on frv arch
  [media] lirc_zilog: Don't use dynamic static allocation
  [media] mxl111sf: Don't use dynamic static allocation
  [media] af9035: Don't use dynamic static allocation
  [media] af9015: Don't use dynamic static allocation
  [media] dw2102: Don't use dynamic static allocation
  [media] dibusb-common: Don't use dynamic static allocation
  [media] cxusb: Don't use dynamic static allocation
  [media] v4l2-async: Don't use dynamic static allocation
  [media] cimax2: Don't use dynamic static allocation
  [media] tuner-xc2028: Don't use dynamic static allocation
  [media] tuners: Don't use dynamic static allocation
  [media] av7110_hw: Don't use dynamic static allocation
  [media] stv090x: Don't use dynamic static allocation
  [media] stv0367: Don't use dynamic static allocation
  [media] stb0899_drv: Don't use dynamic static allocation
  [media] dvb-frontends: Don't use dynamic static allocation
  [media] dvb-frontends: Don't use dynamic static allocation
  [media] s5h1420: Don't use dynamic static allocation
  [media] uvc/lirc_serial: Fix some warnings on parisc arch
  ...

10 years ago Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Mon, 18 Nov 2013 23:08:02 +0000 (15:08 -0800)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:
 "This series include:
   - a new Remote Controller driver for ST SoC with the corresponding DT
     bindings
   - a new frontend (cx24117)
   - a new I2C camera flash driver (lm3560)
   - a new mem2mem driver for TI SoC (ti-vpe)
   - support for Raphael r828d added to r820t driver
   - some improvements on buffer allocation at VB2 core
   - usual driver fixes and improvements

  PS this time, we have a smaller number of patches.  While it is hard
  to pinpoint to the reasons, I believe that it is mainly due to:

   1) there are several patch series ready, but depending on DT review.
      I decided to grant some extra time for DT maintainers to look on
      it, as they're expecting to have more time with the changes agreed
      during ARM mini-summit and KS.  If they can't review in time for
      3.14, I'll review myself and apply for the next merge window.

   2) I suspect that having both LinuxCon EU and LinuxCon NA happening
      during the same merge window affected the development
      productivity, as several core media developers participated on
      both events"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (151 commits)
  [media] media: st-rc: Add ST remote control driver
  [media] gpio-ir-recv: Include linux/of.h header
  [media] tvp7002: Include linux/of.h header
  [media] tvp514x: Include linux/of.h header
  [media] ths8200: Include linux/of.h header
  [media] adv7343: Include linux/of.h header
  [media] v4l: Fix typo in v4l2_subdev_get_try_crop()
  [media] media: i2c: add driver for dual LED Flash, lm3560
  [media] rtl28xxu: add 15f4:0131 Astrometa DVB-T2
  [media] rtl28xxu: add RTL2832P + R828D support
  [media] rtl2832: add new tuner R828D
  [media] r820t: add support for R828D
  [media] media/i2c: ths8200: fix build failure with gcc 4.5.4
  [media] Add support for KWorld UB435-Q V2
  [media] staging/media: fix msi3101 build errors
  [media] ddbridge: Remove casting the return value which is a void pointer
  [media] ngene: Remove casting the return value which is a void pointer
  [media] dm1105: remove unneeded not-null test
  [media] sh_mobile_ceu_camera: remove deprecated IRQF_DISABLED
  [media] media: rcar_vin: Add preliminary r8a7790 support
  ...

10 years ago Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Mon, 18 Nov 2013 22:51:52 +0000 (14:51 -0800)]
Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac

Pull EDAC driver updates from Mauro Carvalho Chehab:
 - sb_edac: add support for Ivy Bridge
 - cell_edac: add a missing of_node_put() call

* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  cell_edac: fix missing of_node_put
  sb_edac: add support for Ivy Bridge
  sb_edac: avoid decoding the same error multiple times
  sb_edac: rename mci_bind_devs()
  sb_edac: enable multiple PCI id tables to be used
  sb_edac: rework sad_pkg
  sb_edac: allow different interleave lists
  sb_edac: allow different dram_rule arrays
  sb_edac: isolate TOHM retrieval
  sb_edac: rename pci_br
  sb_edac: isolate TOLM retrieval
  sb_edac: make RANK_CFG_A value part of sbridge_info

10 years ago Merge tag 'edac_for_3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Linus Torvalds [Mon, 18 Nov 2013 22:50:17 +0000 (14:50 -0800)]
Merge tag 'edac_for_3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "Following up on last week's discussion, here's my part of the EDAC
  pile, highlights in the signed tag.

  The last two patches have a date from just now because I've just
  applied them to the tree after Johannes sent them to me earlier.  I
  decided to forward them now because they're trivial.

  There's a third one for MPC85xx which adds PCIe error interrupt
  support but since it is not so trivial and hasn't seen any linux-next
  time, I'm deferring it to 3.14

  EDAC update highlights:
   - Support for Calxeda ECX-2000 memory controller, from Robert Richter
   - Misc Calxeda Highbank drivers and EDAC core cleanups, from Rob
     Herring and Robert Richter
   - New maintainer for Freescale's MPC85xx EDAC driver: Johannes
     Thumshirn"

* tag 'edac_for_3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  edac/85xx: Remove mpc85xx_pci_err_remove
  EDAC: Add edac-mpc85xx driver to MAINTAINERS
  edac, highbank: Moving error injection to sysfs for edac
  edac, highbank: Add MAINTAINERS entry
  edac: Unify reporting of device info for device, mc and pci
  edac, highbank: Improve and unify naming
  edac, highbank: Add Calxeda ECX-2000 support
  ARM: dts: calxeda: move memory-controller node out of ecx-common.dtsi
  edac, highbank: Fix interrupt setup of mem and l2 controller

10 years ago Merge tag 'mmc-updates-for-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 18 Nov 2013 22:47:30 +0000 (14:47 -0800)]
Merge tag 'mmc-updates-for-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC updates from Chris Ball:
 "MMC highlights for 3.13:

  Core:
   - Improve runtime PM support, remove mmc_{suspend,resume}_host().
   - Add MMC_CAP_RUNTIME_RESUME, for delaying MMC resume until we're
     outside of the resume sequence (in runtime_resume) to decrease
     system resume time.

  Drivers:
   - dw_mmc: Support HS200 mode.
   - sdhci-esdhc-imx: Support SD3.0 SDR clock tuning, DDR on IMX6.
   - sdhci-pci: Add support for Intel Clovertrail and Merrifield"

* tag 'mmc-updates-for-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (108 commits)
  mmc: wbsd: Silence compiler warning
  mmc: core: Silence compiler warning in __mmc_switch
  mmc: sh_mmcif: Convert to clk_prepare|unprepare
  mmc: sh_mmcif: Convert to PM macros when defining dev_pm_ops
  mmc: dw_mmc: exynos: Revert the sdr_timing assignment
  mmc: sdhci: Avoid needless loop while handling SDIO interrupts in sdhci_irq
  mmc: core: Add MMC_CAP_RUNTIME_RESUME to resume at runtime_resume
  mmc: core: Improve runtime PM support during suspend/resume for sd/mmc
  mmc: core: Remove redundant mmc_power_up|off at runtime callbacks
  mmc: Don't force card to active state when entering suspend/shutdown
  MIPS: db1235: Don't use MMC_CLKGATE
  mmc: core: Remove deprecated mmc_suspend|resume_host APIs
  mmc: mmci: Move away from using deprecated APIs
  mmc: via-sdmmc: Move away from using deprecated APIs
  mmc: tmio: Move away from using deprecated APIs
  mmc: sh_mmcif: Move away from using deprecated APIs
  mmc: sdricoh_cs: Move away from using deprecated APIs
  mmc: rtsx: Remove redundant suspend and resume callbacks
  mmc: wbsd: Move away from using deprecated APIs
  mmc: pxamci: Remove redundant suspend and resume callbacks
  ...

10 years ago watchdog: w83627hf: Use helper functions to access superio registers
Guenter Roeck [Sat, 17 Aug 2013 20:58:42 +0000 (13:58 -0700)]
watchdog: w83627hf: Use helper functions to access superio registers

Use helper functions, named like those in other drivers, to access
superio registers.

Request memory region only when needed, and use request_muxed_region().
This lets other devices (hwmon, gpio) use the same region.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years ago watchdog: w83627hf: Enable watchdog device only if not already enabled
Guenter Roeck [Sat, 17 Aug 2013 20:58:41 +0000 (13:58 -0700)]
watchdog: w83627hf: Enable watchdog device only if not already enabled

There is no need to enable the watchdog device if it is already enabled.
Also, when enabling the watchdog device, only set the watchdog device
enable bit and do not touch other bits; depending on the chip type,
those bits may enable other functionality.
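
The read-modify-write described above can be sketched like this. The
bit position DEMO_WDT_ENABLE and the function name are assumptions made
for illustration; the real register layout depends on the chip.

```c
#include <stdint.h>

#define DEMO_WDT_ENABLE 0x01u  /* hypothetical enable bit */

/*
 * Return the register value to write back: unchanged if the watchdog
 * is already enabled, otherwise with only the enable bit set.
 */
uint8_t demo_wdt_enable(uint8_t reg)
{
	if (reg & DEMO_WDT_ENABLE)
		return reg;  /* already enabled, leave the register alone */

	return reg | DEMO_WDT_ENABLE;  /* all other bits are preserved */
}
```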

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years ago watchdog: w83627hf: Enable watchdog only once
Guenter Roeck [Sat, 17 Aug 2013 20:58:40 +0000 (13:58 -0700)]
watchdog: w83627hf: Enable watchdog only once

It is unnecessary to enable the logical device and WDT0 each time
the watchdog is accessed. Do it only once during initialization.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: w83627hf: Convert to watchdog infrastructure
Guenter Roeck [Tue, 29 Oct 2013 02:43:57 +0000 (19:43 -0700)]
watchdog: w83627hf: Convert to watchdog infrastructure

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agoi2c: bcm-kona: fix error return code in bcm_kona_i2c_probe()
Wei Yongjun [Mon, 18 Nov 2013 13:03:08 +0000 (21:03 +0800)]
i2c: bcm-kona: fix error return code in bcm_kona_i2c_probe()

Fix to return a negative error code from the bus speed parse
error handling case instead of 0.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Tim Kryger <tim.kryger@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
10 years agoRevert "init/Kconfig: add option to disable kernel compression"
H. Peter Anvin [Fri, 15 Nov 2013 05:43:47 +0000 (21:43 -0800)]
Revert "init/Kconfig: add option to disable kernel compression"

This reverts commit 69f0554ec261fd686ac7fa1c598cc9eb27b83a80.

This patch breaks randconfig on at least the x86-64 architecture, and
most likely on others.  There is work underway to support uncompressed
kernels in a generic way, but it looks like it will amount to
rewriting the support from scratch; see the LKML thread in the Link:
for info.

Therefore, revert this change and wait for the fix.

Reported-by: Pavel Roskin <proski@gnu.org>
Cc: Christian Ruppert <christian.ruppert@abilis.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131113113418.167b8ffd@IRBT4585
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoedac/85xx: Remove mpc85xx_pci_err_remove
Johannes Thumshirn [Sun, 17 Nov 2013 18:25:14 +0000 (19:25 +0100)]
edac/85xx: Remove mpc85xx_pci_err_remove

Remove mpc85xx_pci_err_remove(...), which is obsolete. This also removes
the compiler warning which can be seen when building the driver either
statically or as a module.

Signed-off-by: Johannes Thumshirn <morbidrsa@gmail.com>
Link: https://lkml.kernel.org/r/20131112161901.GA15637@jtlinux
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@men.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
10 years agoEDAC: Add edac-mpc85xx driver to MAINTAINERS
Johannes Thumshirn [Sun, 17 Nov 2013 18:25:12 +0000 (19:25 +0100)]
EDAC: Add edac-mpc85xx driver to MAINTAINERS

Add drivers/edac/mpc85xx_edac.[ch] to MAINTAINERS file and me as
maintainer.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@men.de>
Link: https://lkml.kernel.org/r/20131112161901.GA15637@jtlinux
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Dave Jiang <dave.jiang@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
10 years agowatchdog: omap_wdt: raw read and write endian fix
Victor Kamensky [Sat, 16 Nov 2013 00:01:05 +0000 (02:01 +0200)]
watchdog: omap_wdt: raw read and write endian fix

All OMAP IP blocks expect LE data, but the CPU may operate in BE mode,
so endian-neutral functions are needed to read/write h/w registers.
I.e. instead of the __raw_read[lw] and __raw_write[lw] functions, the
code needs to use the read[lw]_relaxed and write[lw]_relaxed functions.
While the former simply read/write the register, the latter byteswap
the value if the host operates in BE mode.

The changes are a trivial sed-like replacement of the __raw_xxx
functions with their xxx_relaxed variants.
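
The difference between the two accessor families can be sketched in
userspace as follows. This is only a model: struct fake_reg and the
model_* helpers are hypothetical stand-ins, not the kernel's
__raw_readl() or readl_relaxed() implementations.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-bit little-endian device register. */
struct fake_reg {
	uint8_t bytes[4];
};

/* Models a raw read: copy the bytes in host order, no swapping. */
static uint32_t model_raw_readl(const struct fake_reg *r)
{
	uint32_t v;

	memcpy(&v, r->bytes, sizeof(v));
	return v;
}

/* Models a relaxed read: always interpret the register as LE data. */
static uint32_t model_readl_relaxed(const struct fake_reg *r)
{
	return (uint32_t)r->bytes[0] |
	       ((uint32_t)r->bytes[1] << 8) |
	       ((uint32_t)r->bytes[2] << 16) |
	       ((uint32_t)r->bytes[3] << 24);
}
```

On a little-endian host both helpers return the same value; on a
big-endian host only the second one yields the value the device meant.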

Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE
Uwe Kleine-König [Mon, 11 Nov 2013 20:33:44 +0000 (21:33 +0100)]
watchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE

Now that CSR SiRF is converted to multiplatform, the CLOCK_TICK_RATE in
use is a dummy value that only seems to match the right value.
(arch/arm/mach-prima2/include/mach/timex.h, which defined CLOCK_TICK_RATE
as 1000000, was removed in commit cf82e0e (ARM: sirf: enable
multiplatform support); marco used the same file.)

To not depend on that dummy value use a local #define instead.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: pcwd_usb: overflow in usb_pcwd_send_command()
Dan Carpenter [Fri, 8 Nov 2013 09:24:19 +0000 (01:24 -0800)]
watchdog: pcwd_usb: overflow in usb_pcwd_send_command()

We changed "buf" from being an array of 6 chars to being a pointer, so
the sizeof(buf) needs to be updated as well.
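
The pitfall can be reproduced in a few lines of userspace C. This is a
sketch, not the driver code: CMD_LEN and the two helper names are
hypothetical, with 6 taken from the message above.

```c
#include <stddef.h>
#include <stdlib.h>

#define CMD_LEN 6	/* hypothetical name for the 6-byte command size */

/* With an array, sizeof yields the number of bytes in the array. */
static size_t len_from_array(void)
{
	unsigned char buf[CMD_LEN];

	return sizeof(buf);
}

/* With a pointer, sizeof yields the pointer size, not the buffer size. */
static size_t len_from_pointer(void)
{
	unsigned char *buf = malloc(CMD_LEN);
	size_t n = sizeof(buf);

	free(buf);
	return n;
}
```

On a 64-bit build the second helper returns 8 rather than 6, so code
that sizes a memset or a transfer with it writes past the allocation.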

Fixes: 2ddb8089a7e5 ('watchdog: pcwd_usb: Use allocated buffer for usb_control_msg')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe()
Wei Yongjun [Thu, 31 Oct 2013 07:50:55 +0000 (15:50 +0800)]
watchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe()

In case of error, the function devm_request_and_ioremap() returns a NULL
pointer, not ERR_PTR(). Fix it by using devm_ioremap_resource() instead
of devm_request_and_ioremap().
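
Why an error-pointer check misses a NULL return can be shown with a
userspace model of the error-pointer convention. The model_* names and
MODEL_MAX_ERRNO are hypothetical stand-ins; the sketch assumes the
usual scheme in which small negative errno values are encoded in the
topmost addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True only for pointers in the top MODEL_MAX_ERRNO addresses. */
static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-MODEL_MAX_ERRNO);
}
```

Since NULL is address 0, model_is_err(NULL) is false, so a probe
function that only tests for an error pointer would carry on with a
NULL mapping.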

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: watchdog_core: Fix a trivial typo
Sachin Kamat [Mon, 28 Oct 2013 08:47:17 +0000 (14:17 +0530)]
watchdog: watchdog_core: Fix a trivial typo

Fixed a trivial typo.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: dw: Enable OF support for DW watchdog timer
Dinh Nguyen [Tue, 22 Oct 2013 16:59:12 +0000 (11:59 -0500)]
watchdog: dw: Enable OF support for DW watchdog timer

Add device tree support to the DW watchdog timer.

Signed-off-by: Dinh Nguyen <dinguyen@altera.com>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Viresh Kumar <viresh.linux@gmail.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: devicetree@vger.kernel.org
Cc: linux-watchdog@vger.kernel.org
10 years agowatchdog: Get rid of MODULE_ALIAS_MISCDEV statements
Jean Delvare [Mon, 21 Oct 2013 15:38:49 +0000 (17:38 +0200)]
watchdog: Get rid of MODULE_ALIAS_MISCDEV statements

I just can't find any value in MODULE_ALIAS_MISCDEV(WATCHDOG_MINOR)
and MODULE_ALIAS_MISCDEV(TEMP_MINOR) statements.

Either the device is enumerated and the driver already has a module
alias (e.g. PCI, USB etc.) that will get the right driver loaded
automatically.

Or the device is not enumerated and loading its driver will lead to
more or less intrusive hardware poking. Such hardware poking should be
limited to a bare minimum, so the user should really decide which
drivers should be tried and in what order. Trying them all in
arbitrary order can't do any good.

On top of that, loading that many drivers at once bloats the kernel
log. Also many drivers will stay loaded afterward, bloating the output
of "lsmod" and wasting memory. Some modules (cs5535_mfgpt which gets
loaded as a dependency) can't even be unloaded!

If defining char-major-10-130 is needed then it should happen in
user-space.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Cc: Wan ZongShun <mcuos.com@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: Jim Cromie <jim.cromie@gmail.com>
10 years agowatchdog: ts72xx_wdt: Propagate return value from timeout_to_regval
Guenter Roeck [Mon, 14 Oct 2013 16:32:48 +0000 (09:32 -0700)]
watchdog: ts72xx_wdt: Propagate return value from timeout_to_regval

timeout_to_regval() returns a valid error code. Might as well use it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: pcwd_usb: Use allocated buffer for usb_control_msg
Guenter Roeck [Mon, 14 Oct 2013 16:29:34 +0000 (09:29 -0700)]
watchdog: pcwd_usb: Use allocated buffer for usb_control_msg

usb_control_msg() must use a dma-capable buffer.

This fixes the following error reported by smatch:

drivers/watchdog/pcwd_usb.c:257 usb_pcwd_send_command() error: doing dma on the
stack (buf)

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: sp805_wdt: Remove unnecessary amba_set_drvdata()
Michal Simek [Thu, 3 Oct 2013 09:46:30 +0000 (11:46 +0200)]
watchdog: sp805_wdt: Remove unnecessary amba_set_drvdata()

Driver core clears the driver data to NULL after device_release
or on probe failure, so just remove it from here.

Driver core change:
"device-core: Ensure drvdata = NULL when no driver is bound"
(sha1: 0998d0631001288a5974afc0b2a5f568bcdecb4d)

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI
Xianglong Du [Wed, 2 Oct 2013 00:13:49 +0000 (08:13 +0800)]
watchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI

On CSR SiRFprimaII and SiRFatlasVI, the 6th timer can act as a watchdog
timer when the Watchdog mode is enabled.

A watchdog event occurs when the TIMER watchdog counter matches the
value preset by software; when this event occurs, the effect is the
same as a system software reset.

Signed-off-by: Xianglong Du <Xianglong.Du@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Cc: Romain Izard <romain.izard.pro@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: Remove redundant of_match_ptr
Sachin Kamat [Mon, 30 Sep 2013 04:42:51 +0000 (10:12 +0530)]
watchdog: Remove redundant of_match_ptr

of_match_ptr() is a macro used to avoid undefined reference error if
CONFIG_OF is used to selectively compile in or out the
data structure. It is defined as follows:

#ifdef CONFIG_OF
#define of_match_ptr(ptr) ptr
#else
#define of_match_ptr(ptr) NULL
#endif

In the case of this series, none of the drivers use CONFIG_OF macro to
compile out the data structure (i.e., the data structure is always
defined).
Hence the use of of_match_ptr() does not make any sense. Thus removing
it to make the code look simpler for readability.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: ts72xx_wdt: cleanup return codes in ioctl
Dan Carpenter [Fri, 23 Aug 2013 09:38:32 +0000 (12:38 +0300)]
watchdog: ts72xx_wdt: cleanup return codes in ioctl

There seems to be some confusion here about which functions return
positive numbers and which return negative error codes.

copy_to_user() returns the number of bytes remaining to be copied but we
want to return -EFAULT.

The rest is just clean up.  get_user() actually returns zero on success
and -EFAULT on error so we can preserve the error code.  The
timeout_to_regval() function returns -EINVAL on failure, but we can
propagate that back instead of hardcoding -EINVAL ourselves.
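
The return-value convention discussed above can be sketched in
userspace. model_copy_to_user() is a hypothetical stand-in that, like
the function it models, returns the number of bytes it could not copy;
the "writable" parameter is an assumption used only to simulate a
partial fault.

```c
#include <errno.h>
#include <string.h>

/* Copies up to 'writable' bytes and returns how many were NOT copied. */
static unsigned long model_copy_to_user(void *to, const void *from,
					unsigned long n,
					unsigned long writable)
{
	unsigned long done = n < writable ? n : writable;

	memcpy(to, from, done);
	return n - done;
}

/* An ioctl-style helper must turn "bytes remaining" into -EFAULT. */
static long model_ioctl_get(void *user_buf, unsigned long writable,
			    int value)
{
	if (model_copy_to_user(user_buf, &value, sizeof(value), writable))
		return -EFAULT;

	return 0;
}
```

Returning the helper's result directly would hand a positive byte count
back to the caller, which looks like success rather than an error.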

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
--

10 years agodocumentation/devicetree: Move DT bindings from gpio to watchdog
Johannes Thumshirn [Wed, 21 Aug 2013 12:42:09 +0000 (14:42 +0200)]
documentation/devicetree: Move DT bindings from gpio to watchdog

I accidentally put the devicetree bindings for the MEN A21 watchdog driver
in Documentation/devicetree/bindings/gpio instead of
Documentation/devicetree/bindings/watchdog. This patch corrects that error.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@men.de>
Acked-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Rob Landley <rob@landley.net>
10 years agowatchdog: add ralink watchdog driver
John Crispin [Thu, 8 Aug 2013 09:31:43 +0000 (11:31 +0200)]
watchdog: add ralink watchdog driver

Add a driver for the watchdog timer found on Ralink SoC

Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: linux-watchdog@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: devicetree-discuss@lists.ozlabs.org
10 years agowatchdog: Add MOXA ART watchdog driver
Jonas Jensen [Fri, 2 Aug 2013 14:40:45 +0000 (16:40 +0200)]
watchdog: Add MOXA ART watchdog driver

This patch adds a watchdog driver for the main hardware watchdog timer
found on MOXA ART SoCs.

The MOXA ART SoC provides one writable timer register, restarting
the hardware once it reaches zero. The register is auto decremented
every APB clock cycle.

Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: kempld_wdt: Add __user annotation
Jingoo Han [Thu, 1 Aug 2013 05:39:46 +0000 (14:39 +0900)]
watchdog: kempld_wdt: Add __user annotation

Added __user annotation to fix the following sparse warnings.
Also, it makes 'kempld_prescaler' static because it is used
only in this file.

drivers/watchdog/kempld_wdt.c:70:11: warning: symbol 'kempld_prescaler' was not declared. Should it be static?
drivers/watchdog/kempld_wdt.c:364:23: warning: incorrect type in initializer (different address spaces)
drivers/watchdog/kempld_wdt.c:364:23:    expected int const [noderef] <asn:1>*register __p
drivers/watchdog/kempld_wdt.c:364:23:    got int *<noident>

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: dw_wdt: Add __user annotation
Jingoo Han [Thu, 1 Aug 2013 05:38:36 +0000 (14:38 +0900)]
watchdog: dw_wdt: Add __user annotation

Added __user annotation to fix the following sparse warnings.

drivers/watchdog/dw_wdt.c:206:38: warning: incorrect type in argument 1 (different address spaces)
drivers/watchdog/dw_wdt.c:206:38:    expected void [noderef] <asn:1>*to
drivers/watchdog/dw_wdt.c:206:38:    got struct watchdog_info *<noident>
drivers/watchdog/dw_wdt.c:211:24: warning: incorrect type in initializer (different address spaces)
drivers/watchdog/dw_wdt.c:211:24:    expected int const [noderef] <asn:1>*register __p
drivers/watchdog/dw_wdt.c:211:24:    got int *<noident>

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: use dev_get_platdata()
Jingoo Han [Tue, 30 Jul 2013 10:58:51 +0000 (19:58 +0900)]
watchdog: use dev_get_platdata()

Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: imx2_wdt: expose module alias for loading from device-tree
Niels de Vos [Mon, 29 Jul 2013 07:38:18 +0000 (09:38 +0200)]
watchdog: imx2_wdt: expose module alias for loading from device-tree

Enable auto loading by udev when imx2_wdt is compiled as a module.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: dw_wdt: use clk_prepare_enable and clk_disable_unprepare
Heiko Stübner [Wed, 26 Jun 2013 18:04:31 +0000 (20:04 +0200)]
watchdog: dw_wdt: use clk_prepare_enable and clk_disable_unprepare

This is necessary to make the driver work with platforms using the
common clock framework.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agowatchdog: dw_wdt: convert to SIMPLE_DEV_PM_OPS
Heiko Stübner [Wed, 26 Jun 2013 18:03:52 +0000 (20:03 +0200)]
watchdog: dw_wdt: convert to SIMPLE_DEV_PM_OPS

The dw_wdt only provides PM_SLEEP operations, so convert the driver
to use SIMPLE_DEV_PM_OPS instead of populating the struct manually.
This has the added effect of simplifying the CONFIG_PM ifdefs.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
10 years agoi2c: i2c-eg20t: do not print error message in syslog if no ACK received
Andreas Werner [Sun, 17 Nov 2013 17:46:20 +0000 (18:46 +0100)]
i2c: i2c-eg20t: do not print error message in syslog if no ACK received

When using the i2c-eg20t driver and calling i2cdetect or probing the
bus, the driver prints a lot of error messages if no ACK was
received.

i2cdetect normally prints a table with all the available devices. If
there is no device at the address, the table will be empty.
Currently with the i2c-eg20t driver, the table is not visible because
the error messages destroy the table.

Error message: pch_i2c_getack return -71

This patch prevents the driver from printing the messages to syslog.
pch_i2c_wait_for_check_xfer is the only function that calls
pch_i2c_getack, so we can delete pch_i2c_getack and move the
read into pch_i2c_wait_for_check_xfer.
If no ACK is received, the message will be printed as a dbg
message.

Fixed print message to be a one liner so we can grep for the
error message.

Tested on Intel Atom E6xx and Eg20t Chipset.

Signed-off-by: Andreas Werner <wernerandy@gmx.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
10 years agoMerge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'nes...
Roland Dreier [Sun, 17 Nov 2013 16:22:19 +0000 (08:22 -0800)]
Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'nes', 'ocrdma', 'qib' and 'srp' into for-next

10 years agoIB/core: Re-enable create_flow/destroy_flow uverbs
Matan Barak [Wed, 6 Nov 2013 22:21:50 +0000 (23:21 +0100)]
IB/core: Re-enable create_flow/destroy_flow uverbs

This commit reverts commit 7afbddfae993 ("IB/core: Temporarily disable
create_flow/destroy_flow uverbs").  Since the uverbs extensions
functionality was experimental for v3.12, this patch re-enables the
support for them and flow-steering for v3.13.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: extended command: an improved infrastructure for uverbs commands
Yann Droneaud [Wed, 6 Nov 2013 22:21:49 +0000 (23:21 +0100)]
IB/core: extended command: an improved infrastructure for uverbs commands

Commit 400dbc96583f ("IB/core: Infrastructure for extensible uverbs
commands") added an infrastructure for extensible uverbs commands
while later commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow
through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
using this new infrastructure.

According to the commit 400dbc96583f, the purpose of this
infrastructure is to support passing around provider (eg. hardware)
specific buffers when userspace issue commands to the kernel, so that
it would be possible to extend uverbs (eg. core) buffers independently
from the provider buffers.

But the new kernel command function prototypes were not modified to
take advantage of this extension. This issue was exposed by Roland
Dreier in a previous review[1].

So the following patch is an attempt at a revised extensible command
infrastructure.

This improved extensible command infrastructure distinguishes
core (eg. legacy)'s command/response buffers from provider
(eg. hardware)'s command/response buffers: each extended command
implementing function is given a struct ib_udata to hold core
(eg. uverbs) input and output buffers, and another struct ib_udata to
hold the hw (eg. provider) input and output buffers.

Having those buffers identified separately makes it easier to increase
one buffer to support an extension without having to add code to
guess the exact size of each command/response part. This should make
the extended functions more reliable.

Additionally, instead of relying on the command identifier being greater
than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure relies on
unused bits in the command field: of the 32 bits provided by the command
field, only 6 bits are really needed to encode the identifier of
commands currently supported by the kernel. (Even using only 6 bits
leaves room for about 23 new commands).

So this patch makes use of some high order bits in command field to
store flags, leaving enough room for more command identifiers than one
will ever need (eg. 256).

The new flags are used to specify if the command should be processed
as an extended one or a legacy one. While designing the new command
format, care was taken to make usage of flags itself extensible.

Using high order bits of the command field ensures that a newer
libibverbs on an older kernel will properly fail when trying to call
extended commands. On the other hand, an older libibverbs on a newer
kernel will never be able to issue calls to extended commands.
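
A rough sketch of how flags in the high order bits might be split from
the command identifier. All MODEL_* names and mask values here are
assumptions for illustration (8 flag bits, 16 reserved zero bits and 8
identifier bits, following the "flags | 00 00 | command" layout drawn
in this message); they are not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical layout: 8 flag bits, 16 zero bits, 8 identifier bits. */
#define MODEL_CMD_MASK		0x000000ffu
#define MODEL_FLAGS_MASK	0xff000000u
#define MODEL_FLAGS_SHIFT	24
#define MODEL_FLAG_EXTENDED	0x80u

/* Low bits: which command is being requested. */
static uint32_t model_cmd_id(uint32_t command)
{
	return command & MODEL_CMD_MASK;
}

/* High bits: how the command should be processed. */
static uint32_t model_cmd_flags(uint32_t command)
{
	return (command & MODEL_FLAGS_MASK) >> MODEL_FLAGS_SHIFT;
}

/* Any bit outside the two known fields makes the command invalid. */
static int model_cmd_valid(uint32_t command)
{
	return (command & ~(MODEL_CMD_MASK | MODEL_FLAGS_MASK)) == 0;
}

static int model_cmd_is_extended(uint32_t command)
{
	return (model_cmd_flags(command) & MODEL_FLAG_EXTENDED) != 0;
}
```

With such a split, a value that carries a flag is numerically far above
any plain identifier, which is consistent with the message's point that
a kernel without flag support would not mistake it for a legacy command.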

The extended command header includes the optional response pointer so
that output buffer length and output buffer pointer are located
together in the command, allowing proper parameters checking. This
should make implementing functions easier and safer.

Additionally, the extended header ensures 64-bit alignment, while making
all sizes multiples of 8 bytes, extending the maximum buffer size:

                             legacy      extended

   Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
  Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
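
The figures in the table can be reproduced with a short calculation,
assuming each size is carried in a 16-bit word count, with legacy
commands counting 4-byte words (an assumption; the message itself only
states the 8-byte unit used by extended commands). model_max_bytes()
is a hypothetical helper.

```c
#include <stdint.h>

/* Largest byte size expressible by a 16-bit count of fixed-size units. */
static uint32_t model_max_bytes(uint32_t unit_bytes)
{
	return (uint32_t)UINT16_MAX * unit_bytes;
}
```

Under these assumptions model_max_bytes(4) is 262140 bytes (just under
256KBytes) and model_max_bytes(8) is 524280 bytes (just under
512KBytes); the extended format carries one such count for the core
part and one for the provider part, hence roughly 1024KBytes in total.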

For the purpose of doing proper buffer size accounting, the header
sizes are no longer taken into account in "in_words".

One of the oddities of the current extensible infrastructure, reading
the "legacy" command header twice, is fixed by removing the "legacy"
command header from the extended command header: they are processed as
two different parts of the command, so memory is read once and
information is not duplicated. This makes it clear that this is an
extended command scheme and not a different command scheme.

The proposed scheme will format input (command) and output (response)
buffers this way:

- command:

  legacy header +
  extended header +
  command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |        in_words    |   out_words       |
    +----------------------------------------+
    |                 response               |
    |                 response               |
    | provider_in_words | provider_out_words |
    |                 padding                |
    +----------------------------------------+
    |                                        |
    .              <uverbs input>            .
    .              (in_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .             <provider input>           .
    .          (provider_in_words * 8)       .
    |                                        |
    +----------------------------------------+

- response, if present:

    +----------------------------------------+
    |                                        |
    .          <uverbs output space>         .
    .             (out_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .         <provider output space>        .
    .         (provider_out_words * 8)       .
    |                                        |
    +----------------------------------------+

The overall design is to ensure that the extensible infrastructure is
itself extensible while being more reliable with more input and bound
checking.

Note:

The unused field in the extended header would be a perfect candidate to
hold the command "comp_mask" (eg. bit field used to handle
compatibility).  This was suggested by Roland Dreier in a previous
review[2].  But "comp_mask" field is likely to be present in the uverb
input and/or provider input, likewise for the response, as noted by
Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
header.

[1]:
http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com

[2]:
http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com

[3]:
http://marc.info/?i=525C1149.6000701@mellanox.com

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
[ Convert "ret ? ret : 0" to the equivalent "ret".  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Remove ib_uverbs_flow_spec structure from userspace
Yann Droneaud [Wed, 6 Nov 2013 22:21:48 +0000 (23:21 +0100)]
IB/core: Remove ib_uverbs_flow_spec structure from userspace

The structure holding any types of flow_spec is of no use to
userspace.  It would be wrong for userspace to do:

  struct ib_uverbs_flow_spec flow_spec;

  flow_spec.type = IB_FLOW_SPEC_TCP;
  flow_spec.size = sizeof(flow_spec);

Instead, userspace should use the dedicated flow_spec structure for
  - Ethernet : struct ib_uverbs_flow_spec_eth,
  - IPv4     : struct ib_uverbs_flow_spec_ipv4,
  - TCP/UDP  : struct ib_uverbs_flow_spec_tcp_udp.

In other words, struct ib_uverbs_flow_spec is a "virtual" data
structure that can only be used by the kernel as an alias to the others.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Use a common header for uverbs flow_specs
Yann Droneaud [Wed, 6 Nov 2013 22:21:47 +0000 (23:21 +0100)]
IB/core: Use a common header for uverbs flow_specs

A common header allows better checking of flow spec sizes, while
ensuring strict alignment to 64 bits.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Make uverbs flow structure use names like verbs ones
Yann Droneaud [Wed, 6 Nov 2013 22:21:46 +0000 (23:21 +0100)]
IB/core: Make uverbs flow structure use names like verbs ones

This patch adds a "flow" prefix to most of the data structures added as
part of commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow
through uverbs") to keep those names in sync with the data structures
added in commit 319a441d1361 ("IB/core: Add receive flow steering support").

It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Rename 'flow' structs to match other uverbs structs
Yann Droneaud [Wed, 6 Nov 2013 22:21:45 +0000 (23:21 +0100)]
IB/core: Rename 'flow' structs to match other uverbs structs

Commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through
uverbs") added public data structures to support receive flow
steering.  The new structs are not following the 'uverbs' pattern:
they're lacking the common prefix 'ib_uverbs'.

This patch replaces the ib_kern prefix with ib_uverbs.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: clarify overflow/underflow checks on ib_create/destroy_flow
Matan Barak [Wed, 6 Nov 2013 22:21:44 +0000 (23:21 +0100)]
IB/core: clarify overflow/underflow checks on ib_create/destroy_flow

This patch fixes the following issues:

1. Unneeded checks were removed

2. Removed the fixed size out of flow_attr.size, thus simplifying the checks.

3. Remove a 32bit hole on 64bit systems with strict alignment in
   struct ib_kern_flow_att by adding a reserved field.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>