Arnaldo Carvalho de Melo [Tue, 29 Sep 2015 20:34:46 +0000 (17:34 -0300)]
perf maps: Introduce maps__find_symbol_by_name()
Out of map_groups__find_symbol_by_name(), so that we can turn this later
one first into a call to maps__find_symbol_by_name(MAP__FUNCTION) +
MAP__VARIABLE, and then to just one call, we'll merge MAP__FUNCTION with
MAP__VARIABLE maps, to simplify the code.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pvkar0jacqn92g148u9sqttt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 29 Sep 2015 15:05:31 +0000 (17:05 +0200)]
perf tools: Fix shadowed declaration in parse-events.c
The error variable breaks build on CentOS 6.7, due to a collision with a
global error symbol:
CC util/parse-events.o
cc1: warnings being treated as errors
util/parse-events.c:419: error: declaration of ‘error’ shadows a global
declaration
util/util.h:135: error: shadowed declaration is here
util/parse-events.c: In function ‘add_tracepoint_multi_event’:
...
Using different argument names instead to fix it.
Reported-by: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-tip-commits@vger.kernel.org
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150929150531.GI27383@krava.redhat.com
[ Fix one more case, at line 770 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 29 Sep 2015 14:53:12 +0000 (16:53 +0200)]
tools: Fix shadowed declaration in err.h
The error variable breaks build on CentOS 6.7, due to collision with
global error symbol:
CC util/evlist.o
cc1: warnings being treated as errors
In file included from util/evlist.c:28:
tools/include/linux/err.h: In function ‘ERR_PTR’:
tools/include/linux/err.h:34: error: declaration of ‘error’ shadows a global declaration
util/util.h:135: error: shadowed declaration is here
Using 'error_' name instead to fix it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-i9mdgdbrgauy3fe76s9rd125@git.kernel.org
Reported-by: Vinson Lee <vlee@twopensource.com>
[ Use 'error_' instead of 'err' to, visually, not diverge too much from include/linux/err.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo Molnar [Tue, 29 Sep 2015 07:43:46 +0000 (09:43 +0200)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Accept a zero --itrace period, meaning "as often as possible". In the case
of Intel PT that is the same as a period of 1 and a unit of 'instructions'
(i.e. --itrace=i1i). (Adrian Hunter)
- Harmonize itrace's synthesized callchains with the existing --max-stack
tool option. (Adrian Hunter)
- Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter)
- Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter)
- Slighly improve Intel PT debug logging. (Adrian Hunter)
- Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST.
(Adrian Hunter)
- Further document export-to-postgresql.py script. (Adrian Hunter)
- Add option to synthesize branch stack from auxtrace data. (Adrian Hunter)
- Use equivalent logic to avoid using dso->kernel. (Arnaldo Carvalho de Melo)
- Show proper error messages when parsing bad terms for hw/sw events. (He Kuang)
- Tracepoint event parsing improvements. (He Kuang)
- Store tracing mountpoint for better error message. (Jiri Olsa)
- Add fixdep to tools/build, bringing it closer to the kernel counterpart, from
where it is being lifted. (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
He Kuang [Mon, 28 Sep 2015 03:52:16 +0000 (03:52 +0000)]
perf tools: Enable event_config terms to tracepoint events
This patch enables config terms for tracepoint perf events. Valid terms
for tracepoint events are 'call-graph' and 'stack-size', so we can use
different callgraph settings for each event and eliminate unnecessary
overhead.
Here is an example for using different call-graph config for each
tracepoint.
$ perf record -e syscalls:sys_enter_write/call-graph=fp/
-e syscalls:sys_exit_write/call-graph=no/
dd if=/dev/zero of=test bs=4k count=10
$ perf report --stdio
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'syscalls:sys_enter_write'
# Event count (approx.): 13
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .................. ......................
#
76.92% 76.92% dd libpthread-2.20.so [.] __write_nocancel
|
---__write_nocancel
23.08% 23.08% dd libc-2.20.so [.] write
|
---write
|
|--33.33%-- 0x2031342820736574
|
|--33.33%-- 0xa6e69207364726f
|
--33.33%-- 0x34202c7320393039
...
# Samples: 13 of event 'syscalls:sys_exit_write'
# Event count (approx.): 13
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .................. ......................
#
76.92% 76.92% dd libpthread-2.20.so [.] __write_nocancel
23.08% 23.08% dd libc-2.20.so [.] write
7.69% 0.00% dd [unknown] [.] 0x0a6e69207364726f
7.69% 0.00% dd [unknown] [.] 0x2031342820736574
7.69% 0.00% dd [unknown] [.] 0x34202c7320393039
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-4-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Mon, 28 Sep 2015 03:52:15 +0000 (03:52 +0000)]
perf tools: Adds the tracepoint name parsing support
Adds rules for parsing tracepoint names. Change rules of tracepoint which
derives from PE_NAMEs into tracepoint names directly, so adding more rules
based on tracepoint names will be easier.
Changes v2-v3:
- Change __event_legacy_tracepoint label in bison file to tracepoint_name
- Fix formats error.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Mon, 28 Sep 2015 03:52:14 +0000 (03:52 +0000)]
perf tools: Show proper error message for wrong terms of hw/sw events
Show proper error message and show valid terms when wrong config terms
is specified for hw/sw type perf events.
This patch makes the original error format function formats_error_string()
more generic, which only outputs the static config terms for hw/sw perf
events, and prepends pmu formats for pmu events.
Before this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
invalid or unsupported event: 'cpu-clock/freqx=200/'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
After this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
event syntax error: 'cpu-clock/freqx=200/'
\___ unknown term
valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Mon, 28 Sep 2015 03:52:13 +0000 (03:52 +0000)]
perf tools: Adds the config_term callback for different type events
Currently, function config_term() is used for checking config terms of
all types of events, while unknown terms is not reported as an error
because pmu events have valid terms in sysfs.
But this is wrong when unknown terms are specificed to hw/sw events.
This patch Adds the config_term callback so we can use separate check
routines for each type of events.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:56 +0000 (16:15 +0300)]
perf intel-pt: Add mispred-all config option to aid use with autofdo
autofdo incorrectly expects branch flags to include either mispred or
predicted. In fact mispred = predicted = 0 is valid and means the flags
are not supported, which they aren't by Intel PT.
To make autofdo work, add a config option which will cause Intel PT
decoder to set the mispred flag on all branches.
Below is an example of using Intel PT with autofdo. The example is
also added to the Intel PT documentation. It requires autofdo
(https://github.com/google/autofdo) and gcc version 5. The bubble
sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
amended to take the number of elements as a parameter.
$ gcc-5 -O3 sort.c -o sort_optimized
$ ./sort_optimized 30000
Bubble sorting array of 30000 elements
2254 ms
$ cat ~/.perfconfig
[intel-pt]
mispred-all
$ perf record -e intel_pt//u ./sort 3000
Bubble sorting array of 3000 elements
58 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 3.939 MB perf.data ]
$ perf inject -i perf.data -o inj --itrace=i100usle --strip
$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
$ ./sort_autofdo 30000
Bubble sorting array of 30000 elements
2155 ms
Note there is currently no advantage to using Intel PT instead of LBR,
but that may change in the future if greater use is made of the data.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:55 +0000 (16:15 +0300)]
perf inject: Add --strip option to strip out non-synthesized events
Add a new option --strip which is used with --itrace to strip out
non-synthesized events. This results in a perf.data file that is
simpler for external tools to parse. In particular, this can be used to
prepare a perf.data file for consumption by autofdo.
A subsequent patch makes a change to Intel PT also to enable use with
autofdo and gives an example of that use.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-25-git-send-email-adrian.hunter@intel.com
[ Made it use perf_evlist__remove() + perf_evsel__delete() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:54 +0000 (16:15 +0300)]
perf inject: Remove more aux-related stuff when processing instruction traces
perf inject can process instruction traces (using the --itrace option)
which removes aux-related events and replaces them with the requested
synthesized events.
However there are still some leftovers, namely PERF_RECORD_ITRACE_START
events and the original evsel (selected event) e.g. intel_pt//
For the sake of completeness, remove them too.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-24-git-send-email-adrian.hunter@intel.com
[ Made it use perf_evlist__remove() + perf_evsel__delete() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:53 +0000 (16:15 +0300)]
perf evlist: Add perf_evlist__remove()
Add a counterpart to perf_evlist__add() that does the opposite and
deletes the evsel.
This will be used by perf inject to remove unwanted evsels.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-23-git-send-email-adrian.hunter@intel.com
[ Renamed it from perf_evlist__del() to perf_evlist__remove() and removed the perf_evsel__delete() call ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:52 +0000 (16:15 +0300)]
perf evlist: Add perf_evlist__id2evsel_strict()
perf_evlist__id2evsel_strict() is the same as perf_evlist__id2evsel()
except that it ensures that the id must match.
This will be used by perf inject to find a specific evsel that is to be
deleted, hence the need to match exactly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-22-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:51 +0000 (16:15 +0300)]
perf script: Make scripting_max_stack value allow for synthesized callchains
perf script has a setting to set the maximum stack depth when processing
callchains. The setting defaults to the hard-coded maximum definition
PERF_MAX_STACK_DEPTH which is 127.
It is possible, when processing instruction traces, to synthesize
callchains. Synthesized callchains do not have the kernel size
limitation and are whatever size the user requests, although validation
presently prevents the user requested a value greater that 1024. The
default value is 16.
To allow for synthesized callchains, make the scripting_max_stack value
at least the same size as the synthesized callchain size.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-21-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:50 +0000 (16:15 +0300)]
perf scripting python: Allow for max_stack greater than PERF_MAX_STACK_DEPTH
Use the scripting_max_stack value to allow for values greater than
PERF_MAX_STACK_DEPTH.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:49 +0000 (16:15 +0300)]
perf script: Add a setting for maximum stack depth
Add a setting for maximum stack depth in preparation for allowing for
synthesized callchains.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:47 +0000 (16:15 +0300)]
perf hists: Allow for max_stack greater than PERF_MAX_STACK_DEPTH
Use the max_stack value instead of PERF_MAX_STACK_DEPTH so that
arbitrary-sized callchains can be supported.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:46 +0000 (16:15 +0300)]
perf report: Make max_stack value allow for synthesized callchains
perf report has an option (--max-stack) to set the maximum stack depth
when processing callchains. The option defaults to the hard-coded
maximum definition PERF_MAX_STACK_DEPTH which is 127. The intention of
the option is to allow the user to reduce the processing time by
reducing the amount of the callchain that is processed.
It is also possible, when processing instruction traces, to synthesize
callchains. Synthesized callchains do not have the kernel size
limitation and are whatever size the user requests, although validation
presently prevents the user requested a value greater that 1024. The
default value is 16.
To allow for synthesized callchains, make the max_stack value at least
the same size as the synthesized callchain size.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:45 +0000 (16:15 +0300)]
perf intel-pt: Support generating branch stack
Add support for generating branch stack context for PT samples. The
decoder reports a configurable number of branches as branch context for
each sample. Internally it keeps track of them by using a simple sliding
window. We also flush the last branch buffer on each sample to avoid
overlapping intervals.
This is useful for:
- Reporting accurate basic block edge frequencies through the perf
report branch view
- Using with --branch-history to get the wider context of samples
- Other users of LBRs
Also the Documentation is updated.
Examples:
Record with Intel PT:
perf record -e intel_pt//u ls
Branch stacks are used by default if synthesized so:
perf report --itrace=ile
is the same as:
perf report --itrace=ile -b
Branch history can be requested also:
perf report --itrace=igle --branch-history
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:44 +0000 (16:15 +0300)]
perf intel-pt: Move branch filter logic
intel_pt_synth_branch_sample() skips synthesizing if the branch does not
match the branch filter. That logic was sitting in the middle of the
function but is more efficiently placed at the start of the function, so
move it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:43 +0000 (16:15 +0300)]
perf inject: Set branch stack feature flag when synthesizing branch stacks
The branch stack feature flag is set by 'perf record' when recording
data that contains branch stacks. Consequently, when 'perf inject'
synthesizes branch stacks, the feature flag should be set also.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:42 +0000 (16:15 +0300)]
perf report: Skip events with null branch stacks
A non-synthesized event might not have a branch stack if branch stacks
have been synthesized (using itrace options).
An example of that is when Intel PT records sched_switch events for
decoding purposes. Those sched_switch events do not have branch stacks
even though the Intel PT decoder may be synthesizing other events that
do due to the itrace options.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:41 +0000 (16:15 +0300)]
perf report: Also do default setup for synthesized branch stacks
The 'perf report' tool will default to displaying branch stacks (-b
option) if they are present. Make that also happen for synthesized
branch stacks.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:40 +0000 (16:15 +0300)]
perf report: Adjust sample type validation for synthesized branch stacks
perf report looks at event sample types to determine if branch stacks
have been sampled. Adjust the validation to know about instruction
tracing options.
This change allows the use of the -b option which otherwise would
complain with an error like:
Error:
Selected -b but no branch data. Did you call perf record without -b?
# To display the perf.data header info,
# please use --header/--header-only options.
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:39 +0000 (16:15 +0300)]
perf auxtrace: Add option to synthesize branch stacks on samples
Add AUX area tracing option 'l' to synthesize branch stacks on samples
just like sample type PERF_SAMPLE_BRANCH_STACK. This is taken into use
by Intel PT in a subsequent patch.
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:38 +0000 (16:15 +0300)]
perf tools: Add more documentation to export-to-postgresql.py script
Add some comments to the script and some 'views' to the created database
that better illustrate the database structure and how it can be used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:37 +0000 (16:15 +0300)]
perf session: Warn when AUX data has been lost
By default 'perf record' will postprocess the perf.data file to
determine build-ids. When that happens, the number of lost perf events
is displayed.
Make that also happen for AUX events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:36 +0000 (16:15 +0300)]
perf script: Allow time to be displayed in nanoseconds
Add option --ns to display time to 9 decimal places. That is useful in
some cases, for example when using Intel PT cycle accurate mode.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:35 +0000 (16:15 +0300)]
perf intel-pt: Make logging slightly more efficient
Logging is only used for debugging. Use macros to save calling into the
functions only to return immediately when logging is not enabled.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:34 +0000 (16:15 +0300)]
perf intel-pt: Fix potential loop forever
TSC packets contain only 7 bytes of TSC. The 8th byte is assumed to
change so infrequently that its value can be inferred. However the
logic must cater for a 7 byte wraparound, which it does by adding 1 to
the top byte.
The existing code was doing that with a while loop even though the
addition should only need to be done once. That logic won't work (will
loop forever) if TSC wraps around at the 8th byte. Theoretically that
would take at least 10 years, unless something else went wrong.
And what else could go wrong. Well, if the chunks of trace data are
processed out of order, it will make it look like the 7-byte TSC has
gone backwards (i.e. wrapped). If that happens 256 times then stuck in
the while loop it will be.
Fix that by getting rid of the unnecessary while loop.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:33 +0000 (16:15 +0300)]
perf report: Fix sample type validation for synthesized callchains
Processing instruction tracing data (e.g. Intel PT) can synthesize
callchains e.g.
$ perf record -e intel_pt//u uname
$ perf report --stdio --itrace=ige
However perf report's callgraph option gets extra validation, so:
$ perf report --stdio --itrace=ige -gflat
Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?
# To display the perf.data header info,
# please use --header/--header-only options.
#
Fix the validation to know about instruction tracing options so
above command works.
A side-effect of the change is that the default option to
accumulate the callchain of child functions comes into force.
To get the previous behaviour the --no-children option can be
used e.g.
$ perf report --stdio --itrace=ige -gflat --no-children
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 25 Sep 2015 13:15:32 +0000 (16:15 +0300)]
perf auxtrace: Fix 'instructions' period of zero
Instruction tracing options (i.e. --itrace) include an option for
sampling instructions at an arbitrary period. e.g.
--itrace=i10us
means make an 'instructions' sample for every 10us of trace.
Currently the logic does not distinguish between a period of
zero and no period being specified at all, so it gets treated
as the default period which is 100000. That doesn't really
make sense.
Fix it so that zero period is accepted and treated as meaning
"as often as possible".
In the case of Intel PT that is the same as a period of 1 and
a unit of 'instructions' (i.e. --itrace=i1i).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-2-git-send-email-adrian.hunter@intel.com
[ Add a few lines describing this in the Documentation/intel-pt.txt file ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:34:02 +0000 (12:34 +0200)]
tools build: Build fixdep helper from perf and basic libs
Adding the fixdep target into the Makefile.include to ease up building of
fixdep helper, that needs to be built before we dive in to the build itself.
The user can invoke the fixdep target to build the helper.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:34:01 +0000 (12:34 +0200)]
perf tools: Rename the 'single_dep' target to 'prepare'
And use the new 'prepare' target for the $(PERF_IN) target.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:34:00 +0000 (12:34 +0200)]
tools build: Make the fixdep helper part of the build process
Making the fixdep helper to be invoked within dep-cmd.
Each user of the build framework needs to make sure fixdep exists before
executing the build itself.
If the build doesn't find fixdep, it falls back to the old style
dependency tracking.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:33:59 +0000 (12:33 +0200)]
tools build: Move dependency copy into function
So it's easier to add more functionality in the following commit.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:33:58 +0000 (12:33 +0200)]
tools build: Add fixdep dependency helper
For dependency tracking we currently use targets that fall out of the
gcc -MD command. We store this info in the .cmd file and include as
makefile during the build.
This format put object as target and all the c and header files as
dependencies, like:
util/abspath.o: util/abspath.c /usr/include/stdc-predef.h util/cache.h \
/usr/include/bits/wordsize.h /usr/include/gnu/stubs.h \
...
If any of those dependency header files (krava.h below) is removed the
build fails on:
make[1]: *** No rule to make target 'krava.h', needed by 'inc.o'. Stop.
This patch adds fixdep helper, that is used by kbuild to alter the shape
of the object dependencies like:
source_util/abspath.o := util/abspath.c
deps_util/abspath.o := \
/usr/include/stdc-predef.h \
util/cache.h \
...
util/abspath.o: $(deps_util/abspath.o)
$(deps_util/abspath.o):
With this format the header removal won't make the build fail, because
it'll be picked up by the last empty target defined for each header.
As previously mentioned the fixdep tool is taken from kbuild. It's not
complete backport, only the part that alters the standard dependency
info was taken, the part that adds the CONFIG_* dependency logic will be
probably taken later on.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kai Germaschewski <kai.germaschewski@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:33:57 +0000 (12:33 +0200)]
tools build: Add test for missing include
The current build framework fails to cope with header file removal. The
reason is that the removed header file stays in the .cmd file target
rule and forces the build to fail.
This issue is fixed and explained in the following patches.
Adding a new build test that simulates header removal.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 23 Sep 2015 10:33:56 +0000 (12:33 +0200)]
tools build: Add Makefile.include
To ease up build framework code setup for users.
More shared code will be added in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443004442-32660-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Sat, 19 Sep 2015 14:47:07 +0000 (16:47 +0200)]
tools lib api fs: Store tracing mountpoint for better error message
Storing the actual tracing path mountpoint to display correct
error message hint ('Hint:' line). The error hint rediscovers
mountpoints, but it could be different from what we actually
used in tracing path.
Before we'd display debugfs mount even though tracefs was used:
$ perf record -e sched:sched_krava ls
event syntax error: 'sched:sched_krava'
\___ can't access trace events
Error: No permissions to read /sys/kernel/debug/tracing/events/sched/sched_krava
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug'
...
After this change, correct mountpoint is displayed:
$ perf record -e sched:sched_krava ls
event syntax error: 'sched:sched_krava'
\___ can't access trace events
Error: No permissions to read /sys/kernel/debug/tracing/events/sched/sched_krava
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
Link: http://lkml.kernel.org/r/1442674027-19427-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 23 Sep 2015 18:45:20 +0000 (15:45 -0300)]
perf tools: Use __map__is_kernel() when synthesizing kernel module mmap records
Equivalent and removes one more case of using dso->kernel.
# perf record -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.768 MB perf.data (30 samples) ]
Before:
[root@zoo ~]# perf script --show-task --show-mmap | head -3
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko
#
# perf script --show-task --show-mmap | head -3
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa0000000(0xa000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/acpi/video.ko
swapper 0 [0] 0.0: PERF_RECORD_MMAP -1/0: [0xffffffffa000a000(0x5000) @ 0]: x /lib/modules/4.3.0-rc1+/kernel/drivers/i2c/algos/i2c-algo-bit.ko
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b65xe578dwq22mzmmj5y94wr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 23 Sep 2015 18:38:55 +0000 (15:38 -0300)]
perf hists browser: Use the map to determine if a DSO is being used as a kernel
The map is what should say if an ELF (or some other format) image is
being used for some particular purpose, as a kernel, host or guest.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zufousvfar0710p4qj71c32d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 23 Sep 2015 18:15:54 +0000 (15:15 -0300)]
perf top: Filter symbols based on __map__is_kernel(map)
Instead of using dso->kernel, this is equivalent at the moment,
and helps in reducing the accesses to dso->kernel.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1pc2v63iphtifovw3bv0bo1v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Geliang Tang [Sun, 27 Sep 2015 15:25:50 +0000 (23:25 +0800)]
perf/core, perf/x86: Change needlessly global functions and a variable to static
Fixes various sparse warnings.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/70c14234da1bed6e3e67b9c419e2d5e376ab4f32.1443367286.git.geliangtang@163.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 28 Sep 2015 06:06:57 +0000 (08:06 +0200)]
Merge branch 'linus' into perf/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sun, 27 Sep 2015 22:22:34 +0000 (18:22 -0400)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
- Properly setup irq handling for ATH79 platforms
- Fix bootmem mapstart calculation for contiguous maps
- Handle little endian and older CPUs correct in BPF
- Fix console for Fulong 2E systems
- Handle FTLB correctly on R6 CPUs
- Fixes for CM, GIC and MAAR support code
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Initialise MAARs on secondary CPUs
MIPS: print MAAR configuration during boot
MIPS: mm: compile maar_init unconditionally
irqchip: mips-gic: Fix pending & mask reads for MIPS64 with 32b GIC.
irqchip: mips-gic: Convert CPU numbers to VP IDs.
MIPS: CM: Provide a function to map from CPU to VP ID.
MIPS: Fix FTLB detection for R6
MIPS: cpu-features: Add cpu_has_ftlb
MIPS: ATH79: Add irq chip ar7240-misc-intc
MIPS: ATH79: Set missing irq ack handler for ar7100-misc-intc irq chip
MIPS: BPF: Fix build on pre-R2 little endian CPUs
MIPS: BPF: Avoid unreachable code on little endian
MIPS: bootmem: Fix mapstart calculation for contiguous maps
MIPS: Fix console output for Fulong2e system
Linus Torvalds [Sun, 27 Sep 2015 16:51:39 +0000 (12:51 -0400)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"Another pile of fixes for perf:
- Plug overflows and races in the core code
- Sanitize the flow of the perf syscall so we error out before
handling the more complex and hard to undo setups
- Improve and fix Broadwell and Skylake hardware support
- Revert a fix which broke what it tried to fix in perf tools
- A couple of smaller fixes in various places of perf tools"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix copying of /proc/kcore
perf intel-pt: Remove no_force_psb from documentation
perf probe: Use existing routine to look for a kernel module by dso->short_name
perf/x86: Change test_aperfmperf() and test_intel() to static
tools lib traceevent: Fix string handling in heterogeneous arch environments
perf record: Avoid infinite loop at buildid processing with no samples
perf: Fix races in computing the header sizes
perf: Fix u16 overflows
perf: Restructure perf syscall point of no return
perf/x86/intel: Fix Skylake FRONTEND MSR extrareg mask
perf/x86/intel/pebs: Add PEBS frontend profiling for Skylake
perf/x86/intel: Make the CYCLE_ACTIVITY.* constraint on Broadwell more specific
perf tools: Bool functions shouldn't return -1
tools build: Add test for presence of __get_cpuid() gcc builtin
tools build: Add test for presence of numa_num_possible_cpus() in libnuma
Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum"
perf stat: Fix per-pkg event reporting bug
Linus Torvalds [Sun, 27 Sep 2015 16:50:27 +0000 (12:50 -0400)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"A single bug fix for the scheduler to prevent dequeueing of the idle
task when setting the cpus allowed mask"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix crash trying to dequeue/enqueue the idle thread
Linus Torvalds [Sun, 27 Sep 2015 16:47:20 +0000 (12:47 -0400)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Thomas Gleixner:
"A single bugfix for lockdep to preserve the pinning counter when
rebuilding the lock stack"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Fix hlock->pin_count reset on lock stack rebuilds
Paul Burton [Fri, 25 Sep 2015 15:59:38 +0000 (08:59 -0700)]
MIPS: Initialise MAARs on secondary CPUs
MAARs should be initialised on each CPU (or rather, core) in the system
in order to achieve consistent behaviour & performance. Previously they
have only been initialised on the boot CPU which leads to performance
problems if tasks are later scheduled on a secondary CPU, particularly
if those tasks make use of unaligned vector accesses where some CPUs
don't handle any cases in hardware for non-speculative memory regions.
Fix this by recording the MAAR configuration from the boot CPU and
applying it to secondary CPUs as part of their bringup.
Reported-by: Doug Gilmore <doug.gilmore@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Andrew Bresticker <abrestic@chromium.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11239/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 25 Sep 2015 15:59:37 +0000 (08:59 -0700)]
MIPS: print MAAR configuration during boot
Verifying that the MAAR configuration is as expected is useful when
debugging the performance of a system. Print out the memory regions
configured via MAAR along with their attributes.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11238/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 25 Sep 2015 15:59:36 +0000 (08:59 -0700)]
MIPS: mm: compile maar_init unconditionally
maar_init was previously only compiled when CONFIG_NEED_MULTIPLE_NODES
was not set, which has been fine since it is only called from the
standard implementation of mem_init which has the same condition. In
preparation for calling it from the SMP startup code on secondary CPUs,
move maar_init outside of the #ifndef such that it is always compiled.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/11237/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Tue, 22 Sep 2015 18:29:11 +0000 (11:29 -0700)]
irqchip: mips-gic: Fix pending & mask reads for MIPS64 with 32b GIC.
gic_handle_shared_int reads the GIC interrupt pending & mask registers
directly into a bitmap, which is defined as an array of unsigned longs.
The GIC pending registers may be 32 bits wide if the CM is older than
CM3, regardless of the bit width of the CPU, but for MIPS64 kernels
the unsigned longs in the bitmap will be 64 bits wide. In this case we
need to perform 2 x 32 bit reads per 64 bit unsigned long in order to
avoid missing interrupts.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11213/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Tue, 22 Sep 2015 18:29:10 +0000 (11:29 -0700)]
irqchip: mips-gic: Convert CPU numbers to VP IDs.
Make use of the mips_cm_vp_id function to convert from Linux CPU numbers
to the VP IDs used by hardware, which are not identical in all systems.
Without doing so we map interrupts to incorrect VP(E)s.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11212/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Tue, 22 Sep 2015 18:29:09 +0000 (11:29 -0700)]
MIPS: CM: Provide a function to map from CPU to VP ID.
The VP ID of a given CPU may not match up with the CPU number used by
Linux. For example, if the width of the VP part of the VP ID is wider
than log2(number of VPs per core) and the system has multiple cores then
this will be the case. Alternatively, if a pre-r6 system implements the
MT ASE with multiple VPEs per core and Linux is built without support
for the MT ASE then the numbers won't match up either. Provide a
function to convert from CPU number to VP ID.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/11211/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Sun, 27 Sep 2015 11:50:08 +0000 (07:50 -0400)]
Linux 4.3-rc3
Linus Torvalds [Sun, 27 Sep 2015 10:51:42 +0000 (06:51 -0400)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Two bugfixes from Andy addressing at least some of the subtle NMI
related wreckage which has been reported by Sasha Levin"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code
x86/paravirt: Replace the paravirt nop with a bona fide empty function
Linus Torvalds [Sun, 27 Sep 2015 10:50:23 +0000 (06:50 -0400)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Thomass Gleixner:
"A bugfix for the atmel aic5 irq chip driver which caches the wrong
data and thereby breaking resume"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/atmel-aic5: Use per chip mask caches in mask/unmask()
Linus Torvalds [Sun, 27 Sep 2015 10:48:48 +0000 (06:48 -0400)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Just two fixes: wire up the new system calls added during the last
merge window, and fix another user access site"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: alignment: fix alignment handling for uaccess changes
ARM: wire up new syscalls
Linus Torvalds [Sun, 27 Sep 2015 10:45:18 +0000 (06:45 -0400)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Our first real batch of fixes this release cycle. Nothing really
concerning, and diffstat is a bit inflated due to some DT contents
moving around on STi platforms.
There's a collection of them here:
- A fixup for a build breakage that hits on arm64 allmodconfig in
QCOM SCM firmware drivers
- MMC fixes for OMAP that had quite a bit of breakage this merge
window.
- Misc build/warning fixes on PXA and OMAP
- A couple of minor fixes for Beagleboard X15 which is now starting
to see a few more users in the wild"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
ARM: sti: dt: adapt DT to fix probe/bind issues in DRM driver
ARM: dts: fix omap2+ address translation for pbias
firmware: qcom: scm: Add function stubs for ARM64
ARM: dts: am57xx-beagle-x15: use palmas-usb for USB2
ARM: omap2plus_defconfig: enable GPIO_PCA953X
ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets
ARM: OMAP2+: AM43XX: Enable autoidle for clks in am43xx_init_late
ARM: dts: am57xx-beagle-x15: Update Phy supplies
ARM: pxa: balloon3: Fix build error
ARM: dts: Fixup model name for HP t410 dts
ARM: dts: DRA7: fix a typo in ethernet
ARM: omap2plus_defconfig: make PCF857x built-in
ARM: dts: Use ti,pbias compatible string for pbias
ARM: OMAP5: Cleanup options for SoC only build
ARM: DRA7: Select missing options for SoC only build
ARM: OMAP2+: board-generic: Remove stale of_irq macros
ARM: OMAP4+: PM: erratum is used by OMAP5 and DRA7 as well
ARM: dts: omap3-igep: Move eth IRQ pinmux to IGEPv2 common dtsi
ARM: dts: am57xx-beagle-x15: Add wakeup irq for mcp79410
ARM: dts: am335x-phycore-som: Fix mpu voltage
...
Linus Torvalds [Sun, 27 Sep 2015 10:42:54 +0000 (06:42 -0400)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"Four fixes from testing at the recent SMB3 Plugfest including two
important authentication ones (one fixes authentication problems to
some popular servers when clock times differ more than two hours
between systems, the other fixes Kerberos authentication for SMB3)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
fix encryption error checks on mount
[SMB3] Fix sec=krb5 on smb3 mounts
cifs: use server timestamp for ntlmv2 authentication
disabling oplocks/leases via module parm enable_oplocks broken for SMB3
Olof Johansson [Sun, 27 Sep 2015 05:23:26 +0000 (22:23 -0700)]
Merge tag 'pxa-fixes-v4.3' of https://github.com/rjarzmik/linux into fixes
ARM: pxa: fixes for v4.3
These fixes are mainly regression fixes triggered by irq changes,
common clock framework introduction and sound side-effect of
other platforms.
* tag 'pxa-fixes-v4.3' of https://github.com/rjarzmik/linux:
ARM: pxa: balloon3: Fix build error
ARM: pxa: ssp: Fix build error by removing originally incorrect DT binding
ARM: pxa: fix DFI bus lockups on startup
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Sun, 27 Sep 2015 05:22:31 +0000 (22:22 -0700)]
Merge tag 'omap-for-v4.3/fixes-rc2' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for v4.3-rc cycle:
- Two more patches to fix most of the MMC regressions with the
PBIAS regulator changes. At least two MMC driver related issues
still seems to remain for omap3 legacy booting and omap4 duovero.
Note that the dts changes depend on a recent regulator fix, and
are based on the regulator commit now in mainline kernel
- Enable autoidle for am43xx clocks to prevent clocks from staying
always on
- Fix i2c5 pinctrl offsets for omap5-uevm
- Enable PCA953X as that's needed for HDMI to work on omap5
- Update phy supplies for beagle x15 beta board
- Use palmas-usb for on beagle x15 to start using the related
driver that recently got merged
* tag 'omap-for-v4.3/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: fix omap2+ address translation for pbias
ARM: dts: am57xx-beagle-x15: use palmas-usb for USB2
ARM: omap2plus_defconfig: enable GPIO_PCA953X
ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets
ARM: OMAP2+: AM43XX: Enable autoidle for clks in am43xx_init_late
ARM: dts: am57xx-beagle-x15: Update Phy supplies
regulator: pbias: program pbias register offset in pbias driver
ARM: omap2plus_defconfig: Enable MUSB DMA support
ARM: DRA752: Add ID detect for ES2.0
ARM: OMAP3: vc: fix 'or' always true warning
ARM: OMAP2+: Fix booting if no timer parent clock is available
ARM: OMAP2+: omap-device: fix race deferred probe of omap_hsmmc vs omap_device_late_init
Signed-off-by: Olof Johansson <olof@lixom.net>
Linus Torvalds [Sun, 27 Sep 2015 01:05:23 +0000 (21:05 -0400)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- check the return value of platform_get_irq as signed int in xgene.
- skip adf_dev_restore on virtual functions in qat.
- fix double-free with backlogged requests in marvell_cesa"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: xgene - fix handling platform_get_irq
crypto: qat - VF should never trigger SBR on PH
crypto: marvell - properly handle CRYPTO_TFM_REQ_MAY_BACKLOG-flagged requests
Linus Torvalds [Sun, 27 Sep 2015 01:02:42 +0000 (21:02 -0400)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"This includes a iser-target series from Jenny + Sagi @ Mellanox that
addresses the few remaining active I/O shutdown bugs, along with a
patch to support zero-copy for immediate data payloads that gives a
nice performance improvement for small block WRITEs.
Also included are some recent >= v4.2 regression bug-fixes. The most
notable is a RCU conversion regression for SPC-3 PR registrations, and
recent removal of obsolete RFC-3720 markers that introduced a login
regression bug with MSFT iSCSI initiators.
Thanks to everyone who has been testing + reporting bugs for v4.x"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Avoid OFMarker + IFMarker negotiation
target: Make TCM_WRITE_PROTECT failure honor D_SENSE bit
target: Fix target_sense_desc_format NULL pointer dereference
target: Propigate backend read-only to core_tpg_add_lun
target: Fix PR registration + APTPL RCU conversion regression
iser-target: Skip data copy if all the command data comes as immediate
iser-target: Change the recv buffers posting logic
iser-target: Fix pending connections handling in target stack shutdown sequnce
iser-target: Remove np_ prefix from isert_np members
iser-target: Remove unused variables
iser-target: Put the reference on commands waiting for unsol data
iser-target: remove command with state ISTATE_REMOVE
Linus Torvalds [Sun, 27 Sep 2015 01:00:28 +0000 (21:00 -0400)]
Merge tag 'usb-4.3-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB driver fixes for 4.3-rc3.
There's the usual assortment of new device ids, combined with xhci and
gadget driver fixes. Full details in the shortlog. All of these have
been in linux-next with no reported problems"
* tag 'usb-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (34 commits)
MAINTAINERS: remove amd5536udc USB gadget driver maintainer
USB: whiteheat: fix potential null-deref at probe
xhci: init command timeout timer earlier to avoid deleting it uninitialized
xhci: change xhci 1.0 only restrictions to support xhci 1.1
usb: xhci: exit early in xhci_setup_device() if we're halted or dying
usb: xhci: stop everything on the first call to xhci_stop
usb: xhci: Clear XHCI_STATE_DYING on start
usb: xhci: lock mutex on xhci_stop
xhci: Move xhci_pme_quirk() behind #ifdef CONFIG_PM
xhci: give command abortion one more chance before killing xhci
usb: Use the USB_SS_MULT() macro to get the burst multiplier.
usb: dwc3: gadget: Fix BUG in RT config
usb: musb: fix cppi channel teardown for isoch transfer
usb: phy: isp1301: Export I2C module alias information
usb: gadget: drop null test before destroy functions
usb: gadget: dummy_hcd: in transfer(), return data sent, not limit
usb: gadget: dummy_hcd: fix rescan logic for transfer
usb: gadget: dummy_hcd: fix unneeded else-if condition
usb: gadget: dummy_hcd: emulate sending zlp in packet logic
usb: musb: dsps: fix polling in device-only mode
...
Linus Torvalds [Sun, 27 Sep 2015 00:58:38 +0000 (20:58 -0400)]
Merge tag 'tty-4.3-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull serial driver fix from Greg KH:
"Here is one serial driver fix for 4.3-rc3 that resolves a module
loading issue due to splitting up of the 8250 driver into smaller
pieces. It's been in linux-next with no reported problems"
* tag 'tty-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: Add missing module license for 8250_base.ko
Linus Torvalds [Sun, 27 Sep 2015 00:56:50 +0000 (20:56 -0400)]
Merge tag 'staging-4.3-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some tiny staging driver and documentation fixes for 4.3-rc3.
All of these resolve reported issues that people have found and have
been in the linux-next tree for a while with no problems"
* tag 'staging-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
MAINTAINERS: Update email address for Martyn Welch
staging: ion: fix corruption of ion_import_dma_buf
staging: dgap: Remove myself from the MAINTAINERS file
staging: most: Add dependency to HAS_IOMEM
staging: unisys: remove reference of visorutil
staging: unisys: visornic: handle error return from device registration
staging: unisys: stop device registration before visorbus registration
staging: unisys: visorbus: Unregister driver on error
staging: unisys: visornic: Fix receive bytes statistics
staging: unisys: unregister netdev when create debugfs fails
staging: fbtft: replace master->setup() with spi_setup()
staging: fbtft: fix 9-bit SPI support detection
staging/lustre: change Lustre URLs and mailing list
staging/android: Update ION TODO per LPC discussion
Staging: most: MOST and MOSTCORE should depend on HAS_DMA
staging: most: fix HDM_USB dependencies and build errors
Linus Torvalds [Sun, 27 Sep 2015 00:54:53 +0000 (20:54 -0400)]
Merge tag 'driver-core-4.3-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is one driver core fix for 4.3-rc3 that resolves a reported oops"
* tag 'driver-core-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
cpu/cacheinfo: Fix teardown path
Linus Torvalds [Sun, 27 Sep 2015 00:53:15 +0000 (20:53 -0400)]
Merge tag 'char-misc-4.3-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here's some tiny char and misc driver fixes that resolve some reported
errors for 4.3-rc3.
All of these have been in linux-next with no problems for a while"
* tag 'char-misc-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
extcon: Fix attached value returned by is_extcon_changed
Drivers: hv: vmbus: fix init_vp_index() for reloading hv_netvsc
mei: fix debugfs files leak on error path
thunderbolt: Allow loading of module on recent Apple MacBooks with thunderbolt 2 controller
Linus Torvalds [Sat, 26 Sep 2015 10:01:33 +0000 (06:01 -0400)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
instead of cloning them. From Daniel Borkmann.
2) When converting classical BPF into eBPF, fix the setting of the
source reg to BPF_REG_X. From Tycho Andersen.
3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
Linus Lussing.
4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.
5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
Roopa Prabhu.
6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.
7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
Florian Westphal.
8) Check DMA mapping errors in bna driver, from Ivan Vecera.
9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
misclearing of interrupt status in TX timeout handler, etc.) from
David Woodhouse.
10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
Hugne.
11) Fix autobind races et al. in netlink code, from Herbert Xu with
help from Tejun Heo and others.
12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.
13) Fix various races in timewait timer and reqsk_queue_hadh_req, from
Eric Dumazet.
14) Fix array overruns in mac80211, from Johannes Berg and Dan
Carpenter.
15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.
16) Fix race between poll_one_napi and napi_disable, from Neil Horman.
17) Fix byte order in geneve tunnel port config, from John W Linville.
18) Fix handling of ARP replies over lightweight tunnels, from Jiri
Benc.
19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
Kok and Roopa Prabhu.
20) Several reference count handling bug fixes in the PHY/MDIO layer
from Russel King.
21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.
22) Fix crash in icmp_route_lookup(), from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net: Fix panic in icmp_route_lookup
net: update docbook comment for __mdiobus_register()
ppp: fix lockdep splat in ppp_dev_uninit()
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
phy: marvell: add link partner advertised modes
net: fix net_device refcounting
phy: add phy_device_remove()
phy: fixed-phy: properly validate phy in fixed_phy_update_state()
net: fix phy refcounting in a bunch of drivers
of_mdio: fix MDIO phy device refcounting
phy: add proper phy struct device refcounting
phy: fix mdiobus module safety
net: dsa: fix of_mdio_find_bus() device refcount leak
phy: fix of_mdio_find_bus() device refcount leak
ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
ip6_gre: Reduce log level in ip6gre_err() to debug
fib_rules: fix fib rule dumps across multiple skbs
bnx2x: byte swap rss_key to comply to Toeplitz specs
net: revert "net_sched: move tp->root allocation into fw_init()"
lwtunnel: remove source and destination UDP port config option
...
Ingo Molnar [Sat, 26 Sep 2015 06:15:52 +0000 (08:15 +0200)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix copying of /proc/kcore made to the ~/.debug/ DSO cache to allow using
objdump with kcore files. (Adrian Hunter)
- Fix adding perf probes in kernel module functions. (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
David Ahern [Thu, 24 Sep 2015 21:31:29 +0000 (15:31 -0600)]
net: Fix panic in icmp_route_lookup
Andrey reported a panic:
[ 7249.865507] BUG: unable to handle kernel pointer dereference at
000000b4
[ 7249.865559] IP: [<
c16afeca>] icmp_route_lookup+0xaa/0x320
[ 7249.865598] *pdpt =
0000000030f7f001 *pde =
0000000000000000
[ 7249.865637] Oops: 0000 [#1]
...
[ 7249.866811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.3.0-999-generic #
201509220155
[ 7249.866876] Hardware name: MSI MS-7250/MS-7250, BIOS 080014 08/02/2006
[ 7249.866916] task:
c1a5ab00 ti:
c1a52000 task.ti:
c1a52000
[ 7249.866949] EIP: 0060:[<
c16afeca>] EFLAGS:
00210246 CPU: 0
[ 7249.866981] EIP is at icmp_route_lookup+0xaa/0x320
[ 7249.867012] EAX:
00000000 EBX:
f483ba48 ECX:
00000000 EDX:
f2e18a00
[ 7249.867045] ESI:
000000c0 EDI:
f483ba70 EBP:
f483b9ec ESP:
f483b974
[ 7249.867077] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 7249.867108] CR0:
8005003b CR2:
000000b4 CR3:
36ee07c0 CR4:
000006f0
[ 7249.867141] Stack:
[ 7249.867165]
320310ee 00000000 00000042 320310ee 00000000 c1aeca00
f3920240 f0c69180
[ 7249.867268]
f483ba04 f855058b a89b66cd f483ba44 f8962f4b 00000000
e659266c f483ba54
[ 7249.867361]
8004753c f483ba5c f8962f4b f2031140 000003c1 ffbd8fa0
c16b0e00 00000064
[ 7249.867448] Call Trace:
[ 7249.867494] [<
f855058b>] ? e1000_xmit_frame+0x87b/0xdc0 [e1000e]
[ 7249.867534] [<
f8962f4b>] ? tcp_in_window+0xeb/0xb10 [nf_conntrack]
[ 7249.867576] [<
f8962f4b>] ? tcp_in_window+0xeb/0xb10 [nf_conntrack]
[ 7249.867615] [<
c16b0e00>] ? icmp_send+0xa0/0x380
[ 7249.867648] [<
c16b102f>] icmp_send+0x2cf/0x380
[ 7249.867681] [<
f89c8126>] nf_send_unreach+0xa6/0xc0 [nf_reject_ipv4]
[ 7249.867714] [<
f89cd0da>] reject_tg+0x7a/0x9f [ipt_REJECT]
[ 7249.867746] [<
f88c29a7>] ipt_do_table+0x317/0x70c [ip_tables]
[ 7249.867780] [<
f895e0a6>] ? __nf_conntrack_find_get+0x166/0x3b0
[nf_conntrack]
[ 7249.867838] [<
f895eea8>] ? nf_conntrack_in+0x398/0x600 [nf_conntrack]
[ 7249.867889] [<
f84c0035>] iptable_filter_hook+0x35/0x80 [iptable_filter]
[ 7249.867933] [<
c16776a1>] nf_iterate+0x71/0x80
[ 7249.867970] [<
c1677715>] nf_hook_slow+0x65/0xc0
[ 7249.868002] [<
c1681811>] __ip_local_out_sk+0xc1/0xd0
[ 7249.868034] [<
c1680f30>] ? ip_forward_options+0x1a0/0x1a0
[ 7249.868066] [<
c1681836>] ip_local_out_sk+0x16/0x30
[ 7249.868097] [<
c1684054>] ip_send_skb+0x14/0x80
[ 7249.868129] [<
c16840f4>] ip_push_pending_frames+0x34/0x40
[ 7249.868163] [<
c16844a2>] ip_send_unicast_reply+0x282/0x310
[ 7249.868196] [<
c16a0863>] tcp_v4_send_reset+0x1b3/0x380
[ 7249.868227] [<
c16a1b63>] tcp_v4_rcv+0x323/0x990
[ 7249.868257] [<
c16776a1>] ? nf_iterate+0x71/0x80
[ 7249.868289] [<
c167dc2b>] ip_local_deliver_finish+0x8b/0x230
[ 7249.868322] [<
c167df4c>] ip_local_deliver+0x4c/0xa0
[ 7249.868353] [<
c167dba0>] ? ip_rcv_finish+0x390/0x390
[ 7249.868384] [<
c167d88c>] ip_rcv_finish+0x7c/0x390
[ 7249.868415] [<
c167e280>] ip_rcv+0x2e0/0x420
...
Prior to the VRF change the oif was not set in the flow struct, so the
VRF support should really have only added the vrf_master_ifindex lookup.
Fixes:
613d09b30f8b ("net: Use VRF device index for lookups on TX")
Cc: Andrey Melnikov <temnota.am@gmail.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 25 Sep 2015 10:56:56 +0000 (11:56 +0100)]
net: update docbook comment for __mdiobus_register()
Update the docbook comment for __mdiobus_register() to include the new
module owner argument. This resolves a warning found by the 0-day
builder.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Kroah-Hartman [Sat, 26 Sep 2015 03:45:27 +0000 (20:45 -0700)]
MAINTAINERS: remove amd5536udc USB gadget driver maintainer
Thomas can no longer work on the driver, so he asked me to mark the
MAINTAINER entry as "Orphan" with the hope that someone else would
someday pick it up.
Cc: Thomas Dahlmann <dahlmann.thomas@arcor.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Fri, 25 Sep 2015 23:20:55 +0000 (16:20 -0700)]
Merge branch 'for-4.3-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull another cgroup fix from Tejun Heo:
"The cgroup writeback support got inadvertently enabled for traditional
hierarchies revealing two regressions which are currently being worked
on. It shouldn't have been enabled on traditional hierarchies, so
disable it on them. This is enough to make the regressions go away
for people who aren't experimenting with cgroup"
* 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup, writeback: don't enable cgroup writeback on traditional hierarchies
Guillaume Nault [Thu, 24 Sep 2015 10:54:01 +0000 (12:54 +0200)]
ppp: fix lockdep splat in ppp_dev_uninit()
ppp_dev_uninit() locks all_ppp_mutex while under rtnl mutex protection.
ppp_create_interface() must then lock these mutexes in that same order
to avoid possible deadlock.
[ 120.880011] ======================================================
[ 120.880011] [ INFO: possible circular locking dependency detected ]
[ 120.880011] 4.2.0 #1 Not tainted
[ 120.880011] -------------------------------------------------------
[ 120.880011] ppp-apitest/15827 is trying to acquire lock:
[ 120.880011] (&pn->all_ppp_mutex){+.+.+.}, at: [<
ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
[ 120.880011]
[ 120.880011] but task is already holding lock:
[ 120.880011] (rtnl_mutex){+.+.+.}, at: [<
ffffffff812e4255>] rtnl_lock+0x12/0x14
[ 120.880011]
[ 120.880011] which lock already depends on the new lock.
[ 120.880011]
[ 120.880011]
[ 120.880011] the existing dependency chain (in reverse order) is:
[ 120.880011]
[ 120.880011] -> #1 (rtnl_mutex){+.+.+.}:
[ 120.880011] [<
ffffffff81073a6f>] lock_acquire+0xcf/0x10e
[ 120.880011] [<
ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
[ 120.880011] [<
ffffffff812e4255>] rtnl_lock+0x12/0x14
[ 120.880011] [<
ffffffff812d9d94>] register_netdev+0x11/0x27
[ 120.880011] [<
ffffffffa0147b17>] ppp_ioctl+0x289/0xc98 [ppp_generic]
[ 120.880011] [<
ffffffff8113b367>] do_vfs_ioctl+0x4ea/0x532
[ 120.880011] [<
ffffffff8113b3fd>] SyS_ioctl+0x4e/0x7d
[ 120.880011] [<
ffffffff813ad7d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
[ 120.880011]
[ 120.880011] -> #0 (&pn->all_ppp_mutex){+.+.+.}:
[ 120.880011] [<
ffffffff8107334e>] __lock_acquire+0xb07/0xe76
[ 120.880011] [<
ffffffff81073a6f>] lock_acquire+0xcf/0x10e
[ 120.880011] [<
ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
[ 120.880011] [<
ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
[ 120.880011] [<
ffffffff812d5263>] rollback_registered_many+0x19e/0x252
[ 120.880011] [<
ffffffff812d5381>] rollback_registered+0x29/0x38
[ 120.880011] [<
ffffffff812d53fa>] unregister_netdevice_queue+0x6a/0x77
[ 120.880011] [<
ffffffffa0146a94>] ppp_release+0x42/0x79 [ppp_generic]
[ 120.880011] [<
ffffffff8112d9f6>] __fput+0xec/0x192
[ 120.880011] [<
ffffffff8112dacc>] ____fput+0x9/0xb
[ 120.880011] [<
ffffffff8105447a>] task_work_run+0x66/0x80
[ 120.880011] [<
ffffffff81001801>] prepare_exit_to_usermode+0x8c/0xa7
[ 120.880011] [<
ffffffff81001900>] syscall_return_slowpath+0xe4/0x104
[ 120.880011] [<
ffffffff813ad931>] int_ret_from_sys_call+0x25/0x9f
[ 120.880011]
[ 120.880011] other info that might help us debug this:
[ 120.880011]
[ 120.880011] Possible unsafe locking scenario:
[ 120.880011]
[ 120.880011] CPU0 CPU1
[ 120.880011] ---- ----
[ 120.880011] lock(rtnl_mutex);
[ 120.880011] lock(&pn->all_ppp_mutex);
[ 120.880011] lock(rtnl_mutex);
[ 120.880011] lock(&pn->all_ppp_mutex);
[ 120.880011]
[ 120.880011] *** DEADLOCK ***
Fixes:
8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudip Mukherjee [Thu, 24 Sep 2015 10:16:53 +0000 (15:46 +0530)]
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
The builds of allmodconfig of avr32 is failing with:
drivers/net/ethernet/via/via-rhine.c:1098:2: error: implicit declaration
of function 'pci_iomap' [-Werror=implicit-function-declaration]
drivers/net/ethernet/via/via-rhine.c:1119:2: error: implicit declaration
of function 'pci_iounmap' [-Werror=implicit-function-declaration]
The generic empty pci_iomap and pci_iounmap is used only if CONFIG_PCI
is not defined and CONFIG_GENERIC_PCI_IOMAP is defined.
Add GENERIC_PCI_IOMAP in the dependency list for VIA_RHINE as we are
getting build failure when CONFIG_PCI and CONFIG_GENERIC_PCI_IOMAP both
are not defined.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Wed, 23 Sep 2015 23:07:17 +0000 (00:07 +0100)]
phy: marvell: add link partner advertised modes
Read the standard link partner advertisment registers and store it in
phydev->lp_advertising, so ethtool can report this information to
userspace via ethtool. Zero it as per genphy if autonegotiation is
disabled. Tested with a Marvell
88E1512 PHY.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 25 Sep 2015 19:08:41 +0000 (12:08 -0700)]
Merge branch 'for-linus-4.3' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This is an assorted set I've been queuing up:
Jeff Mahoney tracked down a tricky one where we ended up starting IO
on the wrong mapping for special files in btrfs_evict_inode. A few
people reported this one on the list.
Filipe found (and provided a test for) a difficult bug in reading
compressed extents, and Josef fixed up some quota record keeping with
snapshot deletion. Chandan killed off an accounting bug during DIO
that lead to WARN_ONs as we freed inodes"
* 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: keep dropped roots in cache until transaction commit
Btrfs: Direct I/O: Fix space accounting
btrfs: skip waiting on ordered range for special files
Btrfs: fix read corruption of compressed and shared extents
Btrfs: remove unnecessary locking of cleaner_mutex to avoid deadlock
Btrfs: don't initialize a space info as full to prevent ENOSPC
Linus Torvalds [Fri, 25 Sep 2015 18:33:52 +0000 (11:33 -0700)]
Merge tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable patches:
- fix v4.2 SEEK on files over 2 gigs
- Fix a layout segment reference leak when pNFS I/O falls back to inband I/O.
- Fix recovery of recalled read delegations
Bugfixes:
- Fix a case where NFSv4 fails to send CLOSE after a server reboot
- Fix sunrpc to wait for connections to complete before retrying
- Fix sunrpc races between transport connect/disconnect and shutdown
- Fix an infinite loop when layoutget fail with BAD_STATEID
- nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
- Fix a bogus WARN_ON_ONCE() in O_DIRECT when layout commit_through_mds is set
- Fix layoutreturn/close ordering issues"
* tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS41: make close wait for layoutreturn
NFS: Skip checking ds_cinfo.buckets when lseg's commit_through_mds is set
NFSv4.x/pnfs: Don't try to recover stateids twice in layoutget
NFSv4: Recovery of recalled read delegations is broken
NFS: Fix an infinite loop when layoutget fail with BAD_STATEID
NFS: Do cleanup before resetting pageio read/write to mds
SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose
SUNRPC: Lock the transport layer on shutdown
nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
SUNRPC: Ensure that we wait for connections to complete before retrying
SUNRPC: drop null test before destroy functions
nfs: fix v4.2 SEEK on files over 2 gigs
SUNRPC: Fix races between socket connection and destroy code
nfs: fix pg_test page count calculation
Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
Linus Torvalds [Fri, 25 Sep 2015 18:25:30 +0000 (11:25 -0700)]
Merge tag 'sound-4.3-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This ended up with a larger set of fixes than wished, unfortunately.
As diffstat shows, the majority of changes are for various ASoC
drivers (Realtek, Wolfson codec drivers, etc), in addition to a couple
of HD-audio regression fixes. All these are reasonably small and
nothing to scare much"
* tag 'sound-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
ALSA: hda - Disable power_save_node for Thinkpads
ALSA: hda/tegra - async probe for avoiding module loading deadlock
ASoC: rt5645: Prevent the pop sound in case of playback and the jack is plugging
ASoC: rt5645: Increase the delay time to remove the pop sound
ASoC: rt5645: Use the type SOC_DAPM_SINGLE_AUTODISABLE to prevent the weird sound in runtime of power up
ASoC: pxa: pxa2xx-ac97: fix dma requestor lines
MAINTAINERS: Update website and git repo for Wolfson Microelectronics
ASoC: fsl_ssi: Fix checking of dai format for AC97 mode
ASoC: wm0010: fix error path
ASoC: wm0010: fix memory leak
ASoC: wm8960: correct the max register value of mic boost pga
ASoC: wm8962: remove 64k sample rate support
ASoC: davinci-mcasp: Fix devm_kasprintf format string
ASoC: fix broken pxa SoC support
ASoC: davinci-mcasp: Set .symmetric_rates = 1 in snd_soc_dai_driver
ASoC: au1x: psc-i2s: Fix unused variable 'ret' warning
ASoC: SPEAr: Make SND_SPEAR_SOC select SND_SOC_GENERIC_DMAENGINE_PCM
ASoC: mediatek: Increase periods_min in capture
ASoC: davinci-mcasp: Revise the FIFO threshold calculation
ASoC: wm8960: correct gain value for input PGA and add microphone PGA
...
Linus Torvalds [Fri, 25 Sep 2015 18:16:53 +0000 (11:16 -0700)]
Merge tag 'pci-v4.3-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"These are fixes for things we merged for v4.3 (VPD, MSI, and bridge
window management), and a new Renesas R8A7794 SoC device ID.
Details:
Resource management:
- Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
- Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn
Helgaas)
MSI:
- Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
Renesas R-Car host bridge driver:
- Add R8A7794 support (Sergei Shtylyov)
Miscellaneous:
- Fix devfn for VPD access through function 0 (Alex Williamson)
- Use function 0 VPD only for identical functions (Alex Williamson)"
* tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: rcar: Add R8A7794 support
PCI: Use function 0 VPD for identical functions, regular VPD for others
PCI: Fix devfn for VPD access through function 0
PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses
PCI: Clear IORESOURCE_UNSET when clipping a bridge window
PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
Linus Torvalds [Fri, 25 Sep 2015 17:51:40 +0000 (10:51 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC
bug fixes too"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: disable halt_poll_ns as default for s390x
KVM: x86: fix off-by-one in reserved bits check
KVM: x86: use correct page table format to check nested page table reserved bits
KVM: svm: do not call kvm_set_cr0 from init_vmcb
KVM: x86: trap AMD MSRs for the TSeg base and mask
KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()
KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit
KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs
kvm: svm: reset mmu on VCPU reset
Linus Torvalds [Fri, 25 Sep 2015 17:11:26 +0000 (10:11 -0700)]
Merge tag 'powerpc-4.3-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Wire up sys_membarrier()
- cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav
* tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
cxl: Fix lockdep warning while creating afu_err_buff attribute
powerpc: Wire up sys_membarrier()
Adrian Hunter [Thu, 24 Sep 2015 10:05:22 +0000 (13:05 +0300)]
perf tools: Fix copying of /proc/kcore
A copy of /proc/kcore containing the kernel text can be made to the
buildid cache. e.g.
perf buildid-cache -v -k /proc/kcore
To workaround objdump limitations, a copy is also made when annotating
against /proc/kcore.
The copying process stops working from libelf about v1.62 onwards (the
problem was found with v1.63).
The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails
because additional validation has been added to gelf_getphdr().
The use of gelf_getphdr() is a misguided attempt to get default
initialization of the Gelf_Phdr structure. That should not be
necessary because every member of the Gelf_Phdr structure is
subsequently assigned. So just remove the call to gelf_getphdr().
Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be
removed also.
Committer notes:
Note to stable@kernel.org, from Adrian in the cover letter for this
patchkit:
The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think
it is important enough for stable.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Thu, 24 Sep 2015 10:05:21 +0000 (13:05 +0300)]
perf intel-pt: Remove no_force_psb from documentation
no_force_psb was dropped as a late change to the kernel driver.
Consequently, remove it from the documentation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443089122-19082-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 24 Sep 2015 14:24:18 +0000 (11:24 -0300)]
perf probe: Use existing routine to look for a kernel module by dso->short_name
We have map_groups__find_by_name() to look at the list of modules that
are in place for a given machine, so use it instead of traversing the
machine dso list, which also includes DSOs for userspace.
When merging the user and kernel DSO lists a bug was introduced where
'perf probe' stopped being able to add probes to modules using its short
name:
# perf probe -m usbnet --add usbnet_start_xmit
usbnet_start_xmit is out of .text, skip it.
Error: Failed to add events.
#
With this fix it works again:
# perf probe -m usbnet --add usbnet_start_xmit
Added new event:
probe:usbnet_start_xmit (on usbnet_start_xmit in usbnet)
You can now use it in all perf tools, such as:
perf record -e probe:usbnet_start_xmit -aR sleep 1
#
Reported-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes:
3d39ac538629 ("perf machine: No need to have two DSOs lists")
Link: http://lkml.kernel.org/r/20150924015008.GE1897@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Hildenbrand [Fri, 18 Sep 2015 10:34:53 +0000 (12:34 +0200)]
KVM: disable halt_poll_ns as default for s390x
We observed some performance degradation on s390x with dynamic
halt polling. Until we can provide a proper fix, let's enable
halt_poll_ns as default only for supported architectures.
Architectures are now free to set their own halt_poll_ns
default value.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 22 Sep 2015 08:15:59 +0000 (10:15 +0200)]
KVM: x86: fix off-by-one in reserved bits check
29ecd6601904 ("KVM: x86: avoid uninitialized variable warning",
2015-09-06) introduced a not-so-subtle problem, which probably
escaped review because it was not part of the patch context.
Before the patch, leaf was always equal to iterator.level. After,
it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
and when is_shadow_zero_bits_set does another "-1" the check on
reserved bits becomes incorrect. Using "iterator.level" in the call
fixes this call trace:
WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[...]
Call Trace:
dump_stack+0x4e/0x84
warn_slowpath_common+0x95/0xe0
warn_slowpath_null+0x1a/0x20
handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
tdp_page_fault+0x231/0x290 [kvm]
? emulator_pio_in_out+0x6e/0xf0 [kvm]
kvm_mmu_page_fault+0x36/0x240 [kvm]
? svm_set_cr0+0x95/0xc0 [kvm_amd]
pf_interception+0xde/0x1d0 [kvm_amd]
handle_exit+0x181/0xa70 [kvm_amd]
? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
? preempt_count_sub+0x9b/0xf0
? mutex_lock_killable_nested+0x26f/0x490
? preempt_count_sub+0x9b/0xf0
kvm_vcpu_ioctl+0x358/0x710 [kvm]
? __fget+0x5/0x210
? __fget+0x101/0x210
do_vfs_ioctl+0x2f4/0x560
? __fget_light+0x29/0x90
SyS_ioctl+0x4c/0x90
entry_SYSCALL_64_fastpath+0x16/0x73
---[ end trace
37901c8686d84de6 ]---
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 22 Sep 2015 21:02:14 +0000 (23:02 +0200)]
KVM: x86: use correct page table format to check nested page table reserved bits
Intel CPUID on AMD host or vice versa is a weird case, but it can
happen. Handle it by checking the host CPU vendor instead of the
guest's in reset_tdp_shadow_zero_bits_mask. For speed, the
check uses the fact that Intel EPT has an X (executable) bit while
AMD NPT has NX.
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 21 Sep 2015 05:46:55 +0000 (07:46 +0200)]
KVM: svm: do not call kvm_set_cr0 from init_vmcb
kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the
memslots array (SRCU protected). Using a mini SRCU critical section
is ugly, and adding it to kvm_arch_vcpu_create doesn't work because
the VMX vcpu_create callback calls synchronize_srcu.
Fixes this lockdep splat:
===============================
[ INFO: suspicious RCU usage. ]
4.3.0-rc1+ #1 Not tainted
-------------------------------
include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-i38/17000:
#0: (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm]
[...]
Call Trace:
dump_stack+0x4e/0x84
lockdep_rcu_suspicious+0xfd/0x130
kvm_zap_gfn_range+0x188/0x1a0 [kvm]
kvm_set_cr0+0xde/0x1e0 [kvm]
init_vmcb+0x760/0xad0 [kvm_amd]
svm_create_vcpu+0x197/0x250 [kvm_amd]
kvm_arch_vcpu_create+0x47/0x70 [kvm]
kvm_vm_ioctl+0x302/0x7e0 [kvm]
? __lock_is_held+0x51/0x70
? __fget+0x101/0x210
do_vfs_ioctl+0x2f4/0x560
? __fget_light+0x29/0x90
SyS_ioctl+0x4c/0x90
entry_SYSCALL_64_fastpath+0x16/0x73
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Geliang Tang [Thu, 24 Sep 2015 11:48:53 +0000 (04:48 -0700)]
perf/x86: Change test_aperfmperf() and test_intel() to static
Fixes the following sparse warnings:
arch/x86/kernel/cpu/perf_event_msr.c:13:6: warning: symbol
'test_aperfmperf' was not declared. Should it be static?
arch/x86/kernel/cpu/perf_event_msr.c:18:6: warning: symbol
'test_intel' was not declared. Should it be static?
Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4588e8ab09638458f2451af572827108be3b4a36.1443123796.git.geliangtang@163.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nicholas Bellinger [Wed, 23 Sep 2015 05:32:14 +0000 (22:32 -0700)]
iscsi-target: Avoid OFMarker + IFMarker negotiation
This patch fixes a v4.2+ regression introduced by commit
c04a6091
that removed support for obsolete sync-and-steering markers usage
as originally defined in RFC-3720.
The regression would involve attempting to send OFMarker=No +
IFMarker=No keys during opertional negotiation login phase,
including when initiators did not actually propose these keys.
The result for MSFT iSCSI initiators would be random junk in
TCP stream after the last successful login request was been sent
signaling the move to full feature phase (FFP) operation.
To address this bug, go ahead and avoid negotiating these keys
by default unless the initiator explicitly proposes them, but
still respond to them with 'No' if they are proposed.
Reported-by: Dragan Milivojević <galileo@pkm-inc.com>
Bisected-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm>
Tested-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 17 Sep 2015 03:23:53 +0000 (20:23 -0700)]
target: Make TCM_WRITE_PROTECT failure honor D_SENSE bit
This patch changes transport_lookup_cmd_lun() to obtain
se_lun->lun_ref + se_cmd->se_device rcu_dereference during
TCM_WRITE_PROTECT -> CHECK_CONDITION failure status.
Do this to ensure the active control D_SENSE mode page bit
is being honored.
Reported-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Wed, 16 Sep 2015 06:07:45 +0000 (23:07 -0700)]
target: Fix target_sense_desc_format NULL pointer dereference
This patch allows target_sense_desc_format() to be called without a
valid se_device pointer, which can occur during an early exception
ahead of transport_lookup_cmd_lun() setting up se_cmd->se_device.
This addresses a v4.3-rc1 specific NULL pointer dereference
regression introduced by commit
4e4937e8.
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Wed, 16 Sep 2015 00:27:35 +0000 (17:27 -0700)]
target: Propigate backend read-only to core_tpg_add_lun
This patch adds a DF_READ_ONLY flag that is used by IBLOCK to
signal when a backend has been set to read-only mode, in order
to propigate read-only status up to core_tpg_add_lun() for all
future LUN fabric exports.
With this is place, existing emulation for reporting read-only
in spc_emulate_modesense() and normal transport_lookup_cmd_lun()
TCM_WRITE_PROTECTED status checking just works as expected.
Reported-by: Joeue Deng <joeue404@gmail.com>
Reported-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Sun, 13 Sep 2015 09:30:46 +0000 (02:30 -0700)]
target: Fix PR registration + APTPL RCU conversion regression
This patch fixes a v4.2+ regression introduced by commit
79dc9c9e86
where lookup of t10_pr_registration->pr_reg_deve and associated
->pr_kref get was missing from __core_scsi3_do_alloc_registration(),
which is responsible for setting DEF_PR_REG_ACTIVE.
This would result in REGISTER operations completing successfully,
but subsequent core_scsi3_pr_seq_non_holder() checking would fail
with !DEF_PR_REG_ACTIVE -> RESERVATION CONFLICT status.
Update __core_scsi3_add_registration() to drop ->pr_kref reference
after registration and any optional ALL_TG_PT=1 processing has
completed. Update core_scsi3_decode_spec_i_port() to release
the new parent local_pr_reg->pr_kref as well.
Also, update __core_scsi3_check_aptpl_registration() to perform
the same target_nacl_find_deve() lookup + ->pr_kref get, now that
__core_scsi3_add_registration() expects to drop the reference.
Finally, since there are cases when se_dev_entry->se_lun_acl can
still be dereferenced in core_scsi3_lunacl_undepend_item() while
holding ->pr_kref, go ahead and move explicit rcu_assign_pointer()
NULL assignments within core_disable_device_list_for_node() until
after orig->pr_comp finishes.
Reported-by: Scott L. Lykens <scott@lykens.org>
Tested-by: Scott L. Lykens <scott@lykens.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Lee Duncan <lduncan@suse.com>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
David S. Miller [Fri, 25 Sep 2015 06:04:53 +0000 (23:04 -0700)]
Merge branch 'phy-mdio-refcnt'
Russell King says:
====================
Phy, mdiobus, and netdev struct device fixes
The third version of this series fixes the build error which David
identified, and drops the broken changes for the Cavium Thunger BGX
ethernet driver as this driver requires some complex changes to
resolve the leakage - and this is best done by people who can test
the driver.
Compared to v2, the only patch which has changed is patch 6
"net: fix phy refcounting in a bunch of drivers"
I _think_ I've been able to build-test all the drivers touched by
that patch to some degree now, though several of them needed the
Kconfig hacked to allow it (not all had || COMPILE_TEST clause on
their dependencies.)
Previous cover letters below:
This is the second version of the series, with the comments David had
on the first patch fixed up. Original series description with updated
diffstat below.
While looking at the DSA code, I noticed we have a
of_find_net_device_by_node(), and it looks like users of that are
similarly buggy - it looks like net/dsa/dsa.c is the only user. Fix
that too.
Hi,
While looking at the phy code, I identified a number of weaknesses
where refcounting on device structures was being leaked, where
modules could be removed while in-use, and where the fixed-phy could
end up having unintended consequences caused by incorrect calls to
fixed_phy_update_state().
This patch series resolves those issues, some of which were discovered
with testing on an Armada 388 board. Not all patches are fully tested,
particularly the one which touches several network drivers.
When resolving the struct device refcounting problems, several different
solutions were considered before settling on the implementation here -
one of the considerations was to avoid touching many network drivers.
The solution here is:
phy_attach*() - takes a refcount
phy_detach*() - drops the phy_attach refcount
Provided drivers always attach and detach their phys, which they should
already be doing, this should change nothing, even if they leak a refcount.
of_phy_find_device() and of_* functions which use that take
a refcount. Arrange for this refcount to be dropped once
the phy is attached.
This is the reason why the previous change is important - we can't drop
this refcount taken by of_phy_find_device() until something else holds
a reference on the device. This resolves the leaked refcount caused by
using of_phy_connect() or of_phy_attach().
Even without the above changes, these drivers are leaking by calling
of_phy_find_device(). These drivers are addressed by adding the
appropriate release of that refcount.
The mdiobus code also suffered from the same kind of leak, but thankfully
this only happened in one place - the mdio-mux code.
I also found that the try_module_get() in the phy layer code was utterly
useless: phydev->dev.driver was guaranteed to always be NULL, so
try_module_get() was always being called with a NULL argument. I proved
this with my SFP code, which declares its own MDIO bus - the module use
count was never incremented irrespective of how I set the MDIO bus up.
This allowed the MDIO bus code to be removed from the kernel while there
were still PHYs attached to it.
One other bug was discovered: while using in-band-status with mvneta, it
was found that if a real phy is attached with in-band-status enabled,
and another ethernet interface is using the fixed-phy infrastructure, the
interface using the fixed-phy infrastructure is configured according to
the other interface using the in-band-status - which is caused by the
fixed-phy code not verifying that the phy_device passed in is actually
a fixed-phy device, rather than a real MDIO phy.
Lastly, having mdio_bus reversing phy_device_register() internals seems
like a layering violation - it's trivial to move that code to the phy
device layer.
====================
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Thu, 24 Sep 2015 19:36:33 +0000 (20:36 +0100)]
net: fix net_device refcounting
of_find_net_device_by_node() uses class_find_device() internally to
lookup the corresponding network device. class_find_device() returns
a reference to the embedded struct device, with its refcount
incremented.
Add a comment to the definition in net/core/net-sysfs.c indicating the
need to drop this refcount, and fix the DSA code to drop this refcount
when the OF-generated platform data is cleaned up and freed. Also
arrange for the ref to be dropped when handling errors.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>