Steven Rostedt [Mon, 16 Mar 2009 23:20:15 +0000 (19:20 -0400)]
tracing: protect reader of cmdline output
Impact: fix to one cause of incorrect comm outputs in trace
The spinlock only protected the creation of a comm <=> pid pair.
But it was possible that a reader could look up a pid, and get the
wrong comm because it had no locking.
This also required changing trace_find_cmdline to copy the comm cache
and not just send back a pointer to it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
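The reader-side change described above can be sketched in user space as follows. This is only an illustration, not the kernel code: the cache layout, the helper names (record_cmdline, find_cmdline), the C11 atomic_flag spinlock, and the "<...>" placeholder are all assumptions made for the sketch. The point it shows is that the reader takes the same lock as the writer and copies the comm into a caller-supplied buffer instead of returning a pointer into the cache.

```c
#include <stdatomic.h>
#include <string.h>

#define COMM_LEN   16   /* assumed comm length for this sketch */
#define CACHE_SIZE 128  /* hypothetical cache size */

/* Hypothetical pid <=> comm cache; names are illustrative only. */
struct comm_entry { int used; int pid; char comm[COMM_LEN]; };
static struct comm_entry cache[CACHE_SIZE];
static atomic_flag cache_lock = ATOMIC_FLAG_INIT;

static void cache_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&cache_lock, memory_order_acquire))
		; /* spin until the lock is free */
}

static void cache_lock_release(void)
{
	atomic_flag_clear_explicit(&cache_lock, memory_order_release);
}

/* Writer: record a pid/comm pair under the lock. */
void record_cmdline(int pid, const char *comm)
{
	struct comm_entry *e = &cache[(unsigned int)pid % CACHE_SIZE];

	cache_lock_acquire();
	e->used = 1;
	e->pid = pid;
	strncpy(e->comm, comm, COMM_LEN - 1);
	e->comm[COMM_LEN - 1] = '\0';
	cache_lock_release();
}

/* Reader: copy the comm out while holding the lock, never hand out a pointer. */
void find_cmdline(int pid, char out[COMM_LEN])
{
	const struct comm_entry *e = &cache[(unsigned int)pid % CACHE_SIZE];

	cache_lock_acquire();
	if (e->used && e->pid == pid)
		strcpy(out, e->comm);
	else
		strcpy(out, "<...>"); /* placeholder for an unknown pid */
	cache_lock_release();
}
```

A caller passes its own char buffer of COMM_LEN bytes, so a concurrent record_cmdline() cannot change the string after the lookup returns.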
Frederic Weisbecker [Mon, 16 Mar 2009 21:41:00 +0000 (22:41 +0100)]
tracing/ftrace: fix the check on nopped sites
Impact: fix a dynamic tracing failure
Recently, the function and function graph tracers failed to use dynamic
tracing after the following commit:
fa9d13cf135efbd454453a53b6299976bea245a9
(ftrace: don't try to __ftrace_replace_code on !FTRACE_FL_CONVERTED rec)
The patch is right except for a mistake in the check for the FTRACE_FL_CONVERTED
flag: the code patching is aborted for sites that were nopped successfully.
What we want is the opposite: ignore the callsites that haven't been nopped.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
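The inverted check can be illustrated with a minimal sketch. The flag value and the function names below are assumptions made for the example; only the direction of the test is taken from the message above.

```c
#include <stdbool.h>

/* Hypothetical flag value; the kernel's FTRACE_FL_CONVERTED may differ. */
#define FL_CONVERTED (1UL << 3)

/* Buggy check: skips exactly the sites that were nopped successfully. */
bool skip_record_buggy(unsigned long flags)
{
	return (flags & FL_CONVERTED) != 0;
}

/* Intended check: skip only the sites that have not been nopped. */
bool skip_record_fixed(unsigned long flags)
{
	return (flags & FL_CONVERTED) == 0;
}
```

With the buggy variant every converted record is skipped, so nothing ever gets patched, which matches the reported dynamic tracing failure.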
Lai Jiangshan [Fri, 13 Mar 2009 07:10:26 +0000 (15:10 +0800)]
kallsyms, tracing: output more proper symbol name
Impact: bugfix, output more reliable symbol lookup result
Debug tools (dump_stack(), ftrace...) like to print out symbols.
But they always print out the first aliased symbol (aliased symbols
are symbols with the same address), and the first aliased symbol is
sometimes not the proper one.
# echo function_graph > current_tracer
# cat trace
......
1) 1.923 us | select_nohz_load_balancer();
1) + 76.692 us | }
1) | default_idle() {
1) ==========> | __irqentry_text_start() {
1) 0.000 us | native_apic_mem_write();
1) | irq_enter() {
1) 0.000 us | idle_cpu();
1) | tick_check_idle() {
1) 0.000 us | tick_check_oneshot_broadcast();
1) | tick_nohz_stop_idle() {
......
It's very embarrassing: it outputs "__irqentry_text_start()" when
it should output "smp_apic_timer_interrupt()".
(These two symbols have the same address, but "__irqentry_text_start"
is deemed the first aliased symbol by scripts/kallsyms.)
This patch demotes symbols like "__irqentry_text_start" to second
place among aliases, so that a more proper symbol name becomes the first.
Aliased symbols mostly come from the linker script. The solution is
to guess whether a symbol is defined in the linker script; symbols
defined in the linker script will not become the first aliased symbol.
And if symbols are equal under this "linker script provided"
criterion, they are sorted by the number of prefix underscores.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Paulo Marques <pmarques@grupopie.com>
LKML-Reference: <49BA06E2.7080807@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
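The two-level ordering described above can be sketched as a comparison function. The "linker script provided" guess used here (a "__" prefix together with a "_start" or "_end" suffix) is a simplified assumption for illustration, and the function names are hypothetical; the real heuristic in scripts/kallsyms may differ.

```c
#include <stdbool.h>
#include <string.h>

static bool has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Assumed heuristic: "__" prefix plus "_start" or "_end" suffix. */
bool looks_linker_provided(const char *name)
{
	return strncmp(name, "__", 2) == 0 &&
	       (has_suffix(name, "_start") || has_suffix(name, "_end"));
}

static int prefix_underscores(const char *name)
{
	int n = 0;

	while (name[n] == '_')
		n++;
	return n;
}

/*
 * Order two aliases of the same address.
 * Negative: a is listed first; positive: b is listed first; 0: tie.
 */
int alias_order(const char *a, const char *b)
{
	int la = looks_linker_provided(a);
	int lb = looks_linker_provided(b);

	if (la != lb)
		return la - lb; /* linker-script symbols go last */
	return prefix_underscores(a) - prefix_underscores(b);
}
```

Under these assumptions "smp_apic_timer_interrupt" sorts ahead of "__irqentry_text_start", so it becomes the name that is printed.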
Lai Jiangshan [Fri, 13 Mar 2009 09:51:27 +0000 (17:51 +0800)]
ftrace: remove struct list_head from struct dyn_ftrace
Impact: save memory
The struct dyn_ftrace table is very large; this patch saves
about 50% of its memory.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2C9F.8020009@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lai Jiangshan [Fri, 13 Mar 2009 09:47:23 +0000 (17:47 +0800)]
ftrace: use seq_read
Impact: cleanup
The VFS layer has already tested the file mode, so we do not need to test it again.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2BAB.6010608@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Zhaolei [Fri, 13 Mar 2009 09:16:34 +0000 (17:16 +0800)]
ftrace: don't try to __ftrace_replace_code on !FTRACE_FL_CONVERTED rec
Calling __ftrace_replace_code on a !FTRACE_FL_CONVERTED rec will always
fail, so we should ignore such a rec.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: "Steven Rostedt ;" <rostedt@goodmis.org>
LKML-Reference: <49BA2472.4060206@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Zhaolei [Fri, 13 Mar 2009 09:14:01 +0000 (17:14 +0800)]
ftrace: avoid double-free of dyn_ftrace
If dyn_ftrace is freed before ftrace_release(), ftrace_release()
will free it again and make ftrace_free_records wrong.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: "Steven Rostedt ;" <rostedt@goodmis.org>
LKML-Reference: <49BA23D9.1050900@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 13 Mar 2009 09:23:39 +0000 (10:23 +0100)]
Merge branches 'tracing/ftrace' and 'tracing/syscalls'; commit 'v2.6.29-rc8' into tracing/core
Ingo Molnar [Fri, 13 Mar 2009 05:30:52 +0000 (06:30 +0100)]
Merge branch 'tip/tracing/ftrace' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Ingo Molnar [Fri, 13 Mar 2009 05:29:58 +0000 (06:29 +0100)]
Merge commit 'v2.6.29-rc8' into tracing/ftrace
Frederic Weisbecker [Sat, 7 Mar 2009 04:53:00 +0000 (05:53 +0100)]
tracing/x86: basic implementation of syscall tracing for x86
Provide the x86 trace callbacks to trace syscalls.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sat, 7 Mar 2009 04:52:59 +0000 (05:52 +0100)]
tracing/ftrace: syscall tracing infrastructure, basics
Provide basic callbacks to do syscall tracing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <1236401580-5758-2-git-send-email-fweisbec@gmail.com>
[ simplified it to a trace_printk() for now. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 13 Mar 2009 04:43:33 +0000 (00:43 -0400)]
softirq: no need to have SOFTIRQ in softirq name
Impact: clean up
It is redundant to have 'SOFTIRQ' in the softirq names.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Fri, 13 Mar 2009 04:37:42 +0000 (00:37 -0400)]
tracing: move binary buffers into per cpu directory
The binary_buffers directory in /debugfs/tracing held the files
to read the trace buffers in a binary format. This held one file
per CPU buffer. But we also have a per_cpu directory that holds
a way to read the pretty-print formats.
This patch moves the binary buffers into the per_cpu directory:
# ls /debug/tracing/per_cpu/cpu1/
trace trace_pipe trace_pipe_raw
The new file is called "trace_pipe_raw". The binary buffers always
acted like trace_pipe, except that they produce raw data.
Requested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Fri, 13 Mar 2009 04:12:52 +0000 (00:12 -0400)]
tracing: add comment for use of double __builtin_constant_p
Impact: documentation
The use of the double __builtin_constant_p checks in event_trace_printk
can be confusing to developers and reviewers. This patch adds a comment
to explain why they are there.
Requested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
LKML-Reference: <20090313122235.43EB.A69D9226@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Fri, 13 Mar 2009 04:00:58 +0000 (00:00 -0400)]
tracing: left align location header in stack_trace
Ingo Molnar suggested, instead of:
        Depth    Size      Location    (27 entries)
        -----    ----      --------
  0)     2880      48   lock_timer_base+0x2b/0x4f
  1)     2832      80   __mod_timer+0x33/0xe0
  2)     2752      16   __ide_set_handler+0x63/0x65
To have it be:
        Depth    Size   Location    (27 entries)
        -----    ----   --------
  0)     2880      48   lock_timer_base+0x2b/0x4f
  1)     2832      80   __mod_timer+0x33/0xe0
  2)     2752      16   __ide_set_handler+0x63/0x65
Requested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Ingo Molnar [Fri, 13 Mar 2009 03:34:09 +0000 (04:34 +0100)]
Merge branch 'tip/tracing/ftrace' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Ingo Molnar [Fri, 13 Mar 2009 03:33:17 +0000 (04:33 +0100)]
Merge branches 'tracing/ftrace' and 'linus' into tracing/core
Linus Torvalds [Fri, 13 Mar 2009 02:39:28 +0000 (19:39 -0700)]
Linus 2.6.29-rc8
Linus Torvalds [Fri, 13 Mar 2009 02:32:51 +0000 (19:32 -0700)]
bitmap: fix end condition in bitmap_find_free_region
Guennadi Liakhovetski noticed that the end condition for the loop in
bitmap_find_free_region() is wrong, and the "return if error" was also
using the wrong conditional that would only trigger if the bitmap was an
exact multiple of the allocation size, which is not necessarily the case
with dma_alloc_from_coherent().
Such a failure would end up in bitmap_find_free_region() accessing
beyond the end of the bitmap.
Reported-by: Guennadi Liakhovetski <lg@denx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
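The end condition at issue can be sketched with a simplified user-space search. This is not the kernel's bitmap_find_free_region(); the helper names and the -1 error return are assumptions for the example. What it shows is that a candidate region is only examined when all of its (1 << order) bits lie inside the bitmap, even when the bitmap size is not a multiple of the region size.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static int bit_is_set(const unsigned long *map, int bit)
{
	return (int)((map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}

static void mark_bit(unsigned long *map, int bit)
{
	map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

/*
 * Find and claim a free, naturally aligned run of (1 << order) bits.
 * Returns the start position, or -1 if no region fits.
 */
int find_free_region(unsigned long *map, int bits, int order)
{
	int step = 1 << order;

	/* Only look at a region if it ends at or before the end of the map. */
	for (int pos = 0; pos + step <= bits; pos += step) {
		int busy = 0;

		for (int i = 0; i < step; i++)
			busy |= bit_is_set(map, pos + i);
		if (busy)
			continue;

		for (int i = 0; i < step; i++)
			mark_bit(map, pos + i);
		return pos;
	}
	return -1;
}
```

For a 10-bit map and order 2, positions 0 and 4 are candidates; position 8 is not, because bits 8..11 would run past the end of the map.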
Steven Rostedt [Fri, 13 Mar 2009 02:24:17 +0000 (22:24 -0400)]
ring-buffer: document reader page design
In a private email conversation I explained how the ring buffer
page worked by using silly ASCII art. Ingo suggested that I add
that to the comments of the code.
Here it is.
Requested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Fri, 13 Mar 2009 02:00:19 +0000 (22:00 -0400)]
tracing: show event name in trace for TRACE_EVENT created events
Unlike TRACE_FORMAT() macros, the TRACE_EVENT() macros do not show
the event name in the trace file. Knowing the event type in the trace
output is very useful.
Instead of:
task swapper:0 [140] ==> ntpd:3308 [120]
We now have:
sched_switch: task swapper:0 [140] ==> ntpd:3308 [120]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
KOSAKI Motohiro [Fri, 13 Mar 2009 00:03:04 +0000 (09:03 +0900)]
tracing: Don't use tracing_record_cmdline() in workqueue tracer fix
commit c3ffc7a40b7e94b094efe1c8ab4e24370a782b65
"Don't use tracing_record_cmdline() in workqueue tracer"
has a race window.
find_task_by_vpid() requires task_list_lock().
LKML-Reference: <20090313090042.43CD.A69D9226@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Jason Baron [Thu, 12 Mar 2009 18:36:03 +0000 (14:36 -0400)]
tracing: tracepoints for softirq entry/exit - tracepoints
Introduce softirq entry/exit tracepoints. These are useful for
augmenting existing tracers, and to figure out softirq frequencies and
timings.
[
s/irq_softirq_/softirq_/ for trace point names and
Fixed printf format in TRACE_FORMAT macro
- Steven Rostedt
]
LKML-Reference: <20090312183603.GC3352@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Jason Baron [Thu, 12 Mar 2009 18:33:36 +0000 (14:33 -0400)]
tracing: tracepoints for softirq entry/exit - add softirq-to-name array
Create a 'softirq_to_name' array, which is indexed by softirq #, so
that we can easily convert between the softirq index # and its name, in
order to get more meaningful output messages.
LKML-Reference: <20090312183336.GB3352@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
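A name table indexed by softirq number could look roughly like the sketch below. The enum is a reduced, illustrative subset rather than the kernel's full list, and the softirq_name() helper with its "UNKNOWN" fallback is a hypothetical addition for bounds checking.

```c
/* Reduced, illustrative set of softirq numbers; not the kernel's full list. */
enum {
	HI_SOFTIRQ,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	NR_SOFTIRQS
};

/* Designated initializers keep each name tied to its index. */
static const char *const softirq_to_name[NR_SOFTIRQS] = {
	[HI_SOFTIRQ]     = "HI",
	[TIMER_SOFTIRQ]  = "TIMER",
	[NET_TX_SOFTIRQ] = "NET_TX",
	[NET_RX_SOFTIRQ] = "NET_RX",
};

/* Hypothetical bounds-checked lookup helper. */
const char *softirq_name(unsigned int nr)
{
	return nr < NR_SOFTIRQS ? softirq_to_name[nr] : "UNKNOWN";
}
```

Using designated initializers means the table stays correct if the enum is reordered.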
Steven Rostedt [Thu, 12 Mar 2009 23:42:29 +0000 (19:42 -0400)]
tracing: explain why stack tracer is empty
If the stack tracing is disabled (by default) the stack_trace file
will only contain the header:
# cat /debug/tracing/stack_trace
Depth Size Location (0 entries)
----- ---- --------
This can be frustrating to a developer that does not realize that the
stack tracer is disabled. This patch adds the following text:
# cat /debug/tracing/stack_trace
Depth Size Location (0 entries)
----- ---- --------
#
# Stack tracer disabled
#
# To enable the stack tracer, either add 'stacktrace' to the
# kernel command line
# or 'echo 1 > /proc/sys/kernel/stack_tracer_enabled'
#
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 22:57:51 +0000 (18:57 -0400)]
tracing: fix stack tracer header
The stack tracer used to look like this:
# cat /debug/tracing/stack_trace
Depth Size Location (57 entries)
----- ---- --------
0) 5088 16 mempool_alloc_slab+0x16/0x18
1) 5072 144 mempool_alloc+0x4d/0xfe
2) 4928 16 scsi_sg_alloc+0x48/0x4a [scsi_mod]
Now it looks like this:
# cat /debug/tracing/stack_trace
Depth Size Location (57 entries)
----- ---- --------
0) 5088 16 mempool_alloc_slab+0x16/0x18
1) 5072 144 mempool_alloc+0x4d/0xfe
2) 4928 16 scsi_sg_alloc+0x48/0x4a [scsi_mod]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 18:23:17 +0000 (14:23 -0400)]
tracing: export trace formats to user space
The binary printk saves a pointer to the format string in the ring buffer.
On output, the format is processed. But if the user is reading the
ring buffer through a binary interface, the pointer is meaningless.
This patch creates a file called printk_formats that maps the pointers
to the formats.
# cat /debug/tracing/printk_formats
0xffffffff80713d40 : "irq_handler_entry: irq=%d handler=%s\n"
0xffffffff80713d48 : "lock_acquire: %s%s%s\n"
0xffffffff80713d50 : "lock_release: %s\n"
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
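A user-space tool reading the binary interface would need to load this mapping. A minimal parser for one line of the file, assuming the 'address : "format"' layout shown above, might look like this. The function name and error convention are hypothetical, and escape sequences such as \n are kept verbatim rather than decoded.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of the form:  0xADDR : "format"
 * Copies the text between the first and last double quote into fmt.
 * Returns 0 on success and -1 on a malformed or oversized line.
 */
int parse_format_line(const char *line, unsigned long long *addr,
		      char *fmt, size_t fmt_len)
{
	const char *open = strchr(line, '"');
	const char *close = strrchr(line, '"');
	size_t len;

	if (sscanf(line, "%llx", addr) != 1)
		return -1;
	if (!open || close == open)
		return -1;

	len = (size_t)(close - open - 1);
	if (len >= fmt_len)
		return -1;

	memcpy(fmt, open + 1, len);
	fmt[len] = '\0';
	return 0;
}
```

The resulting address-to-format pairs can then be used to resolve the format pointers found in the raw ring buffer data.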
Steven Rostedt [Thu, 12 Mar 2009 18:19:25 +0000 (14:19 -0400)]
tracing: have event_trace_printk use static tracer
Impact: speed up on event tracing
The event_trace_printk is currently a wrapper function that calls
trace_vprintk. Because it uses a variable for the fmt it misses out
on the optimization of using the binary printk.
This patch turns event_trace_printk into a macro wrapper so that it handles
the fmt the same way the trace_printks do.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 18:14:31 +0000 (14:14 -0400)]
tracing: make bprint event use the proper event id
The bprint record is using TRACE_PRINT when it should be TRACE_BPRINT.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Frederic Weisbecker [Thu, 12 Mar 2009 17:24:49 +0000 (18:24 +0100)]
tracing/core: bring back raw trace_printk for dynamic formats strings
Impact: fix callsites with dynamic format strings
Since its new binary implementation, trace_printk() internally uses static
containers for the format strings at each callsite. But the value is
assigned once at build time, which means that it can't take dynamic
formats.
So this patch unearths the raw trace_printk implementation for the callers
that will need trace_printk to be able to carry these dynamic format
strings. The trace_printk() macro will use the appropriate implementation
for each callsite. Most of the time however, the binary implementation will
still be used.
The other impact of this patch is that mmiotrace_printk() will use the old
implementation because it calls the low level trace_vprintk and we can't
guess here whether the format passed in it is dynamic or not.
Some parts of this patch have been written by Steven Rostedt (most notably
the part that chooses the appropriate implementation for each callsite).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
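The per-callsite selection can be sketched with the GCC/Clang builtin __builtin_constant_p. This is a much simplified stand-in and not the kernel macro: the two path functions and the macro name are hypothetical, and the real implementation also has to store the format pointer for later decoding.

```c
/* Which implementation a call site ended up using. */
enum printk_path { PATH_BINARY, PATH_RAW };

/* Stand-in for the binary implementation (format known at build time). */
static enum printk_path binary_printk(const char *fmt)
{
	(void)fmt;
	return PATH_BINARY;
}

/* Stand-in for the raw implementation (format only known at run time). */
static enum printk_path raw_printk(const char *fmt)
{
	(void)fmt;
	return PATH_RAW;
}

/*
 * Pick the implementation per call site: a string literal is a
 * compile-time constant, a run-time pointer is not.
 */
#define sketch_trace_printk(fmt) \
	(__builtin_constant_p(fmt) ? binary_printk(fmt) : raw_printk(fmt))
```

A call with a string literal takes the binary path, while a call through a pointer that the compiler cannot prove constant falls back to the raw path.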
Steven Rostedt [Thu, 12 Mar 2009 17:53:25 +0000 (13:53 -0400)]
tracing: show that buffer size is not expanded
Impact: do not confuse user on small trace buffer sizes
When the system boots up, the trace buffer is small to conserve memory.
It is only two pages per online CPU. When the tracer is used, it expands
to the default value.
This can confuse the user if they look at the buffer size and see only
7, but then later they see 1408.
# cat /debug/tracing/buffer_size_kb
7
# echo sched_switch > /debug/tracing/current_tracer
# cat /debug/tracing/buffer_size_kb
1408
This patch tries to help remove this confusion by showing that the
buffer has not been expanded.
# cat /debug/tracing/buffer_size_kb
7 (expanded: 1408)
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
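The output format shown above could be produced along these lines. The function and parameter names are hypothetical, chosen only for this sketch; the strings follow the example output in the message.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format the buffer size in KB. While the buffer is still at its small
 * boot-time size, also show the size it will expand to on first use.
 */
int format_buffer_size(char *buf, size_t len, unsigned long cur_kb,
		       unsigned long full_kb, bool expanded)
{
	if (expanded)
		return snprintf(buf, len, "%lu\n", cur_kb);
	return snprintf(buf, len, "%lu (expanded: %lu)\n", cur_kb, full_kb);
}
```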
Steven Rostedt [Thu, 12 Mar 2009 17:13:49 +0000 (13:13 -0400)]
ring-buffer: remove unneeded get_online_cpus
Impact: speed up and remove possible races
The get_online_cpus was added to the ring buffer because the original
design would free the ring buffer on a CPU that was being taken
off line. The final design kept the ring buffer around even when the
CPU was taken off line. This is to allow a user to still read the
information on that ring buffer.
Most of the get_online_cpus are no longer needed since the ring buffer will
not disappear from the use cases.
Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 15:46:03 +0000 (11:46 -0400)]
ring-buffer: use CONFIG_HOTPLUG_CPU not CONFIG_HOTPLUG
The hotplug code in the ring buffers is for use with CPU hotplug,
not generic hotplug.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 15:33:20 +0000 (11:33 -0400)]
tracing: protect ring_buffer_expanded with trace_types_lock
Impact: prevent races with ring_buffer_expanded
This patch places the expanding of the tracing buffer under the
protection of the trace_types_lock mutex. It is highly unlikely
that there would be any contention, but better safe than sorry.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Thu, 12 Mar 2009 15:21:08 +0000 (11:21 -0400)]
tracing: fix comments about trace buffer resizing
Impact: cleanup
Some of the comments about the trace buffer resizing is gobbledygook.
And I wonder why people question if I'm a native English speaker.
This patch makes the comments make a bit more sense.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Fri, 13 Mar 2009 01:12:46 +0000 (21:12 -0400)]
Merge branch 'tracing/ftrace' of git://git./linux/kernel/git/tip/linux-2.6-tip into trace/tip/tracing/ftrace-merge
Ingo Molnar [Fri, 13 Mar 2009 00:33:21 +0000 (01:33 +0100)]
Merge branch 'core/locking' into tracing/ftrace
Ingo Molnar [Fri, 13 Mar 2009 00:30:40 +0000 (01:30 +0100)]
locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]
Impact: cleanup
The naming clashes with upcoming softirq tracepoints, so rename the
APIs to lockdep_*().
Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 13 Mar 2009 00:29:17 +0000 (01:29 +0100)]
Merge branch 'linus' into core/locking
Linus Torvalds [Thu, 12 Mar 2009 23:35:26 +0000 (16:35 -0700)]
Merge git://git./linux/kernel/git/sam/kbuild-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: remove unused -r option for module-init-tool depmod
kbuild: fix 'make rpm' when CONFIG_LOCALVERSION_AUTO=y and using SCM tree
kbuild: fix mkspec to cleanup RPM_BUILD_ROOT
kbuild: fix C library confusion in unifdef.c due to getline()
Linus Torvalds [Thu, 12 Mar 2009 23:34:59 +0000 (16:34 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
cpumask: mm_cpumask for accessing the struct mm_struct's cpu_vm_mask.
cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.
Linus Torvalds [Thu, 12 Mar 2009 23:32:36 +0000 (16:32 -0700)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: Valid filesystems are flagged as bad by the corrupted fs patch
Linus Torvalds [Thu, 12 Mar 2009 23:25:04 +0000 (16:25 -0700)]
Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: (f75375s) Remove unnecessary and confusing initialization
hwmon: (it87) Properly decode -128 degrees C temperature
hwmon: (lm90) Document support for the MAX6648/6692 chips
hwmon: (abituguru3) Fix I/O error handling
Jody McIntyre [Thu, 12 Mar 2009 21:39:23 +0000 (17:39 -0400)]
trivial: fix bad links in the ext2 and ext3 documentation
Trivial patch to fix bad links in the ext2 and ext3 documentation.
Signed-off-by: Jody McIntyre <scjody@sun.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 12 Mar 2009 23:22:51 +0000 (16:22 -0700)]
Merge branch 'fixes-20090312' of git://git./linux/kernel/git/willy/pci
* 'fixes-20090312' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/pci:
PCIe: portdrv: call pci_disable_device during remove
pci: Fix typo in message while disabling HT MSI mapping
pci: don't disable too many HT MSI mapping
powerpc/pseries: The RPA PCI hotplug driver depends on EEH
PCIe: AER: during disable, check subordinate before walking
PCI: Add PCI quirk to disable L0s ASPM state for 82575 and 82598
Faisal Latif [Thu, 12 Mar 2009 21:34:59 +0000 (14:34 -0700)]
RDMA/nes: Don't allow userspace QPs to use STag zero
STag zero is a special STag that allows consumers to access any bus
address without registering memory. The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system). Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.
The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.
Cc: <stable@kernel.org>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 12 Mar 2009 21:31:38 +0000 (14:31 -0700)]
fs: new inode i_state corruption fix
There was a report of a data corruption
http://lkml.org/lkml/2008/11/14/121. There is a script included to
reproduce the problem.
During testing, I encountered a number of strange things with ext3, so I
tried ext2 to attempt to reduce complexity of the problem. I found that
fsstress would quickly hang in wait_on_inode, waiting for I_LOCK to be
cleared, even though instrumentation showed that unlock_new_inode had
already been called for that inode. This points to a memory scribble or a
synchronisation problem.
i_state of I_NEW inodes is not protected by inode_lock because other
processes are not supposed to touch them until I_LOCK (and I_NEW) is
cleared. Adding WARN_ON(inode->i_state & I_NEW) to sites where we modify
i_state revealed that generic_sync_sb_inodes is picking up new inodes from
the inode lists and passing them to __writeback_single_inode without
waiting for I_NEW. Subsequently modifying i_state causes corruption. In
my case it would look like this:
CPU0                            CPU1
unlock_new_inode()              __sync_single_inode()
 reg <- inode->i_state
 reg -> reg & ~(I_LOCK|I_NEW)   reg <- inode->i_state
 reg -> inode->i_state          reg -> reg | I_SYNC
                                reg -> inode->i_state
Non-atomic RMW on CPU1 overwrites CPU0 store and sets I_LOCK|I_NEW again.
The fix for this is to skip over I_NEW inodes rather than wait for them:
inodes concurrently being created are not subject to data integrity
operations, and should not significantly contribute to dirty memory
either.
After this change, I'm unable to reproduce any of the added warnings or
hangs after ~1hour of running. Previously, the new warnings would start
immediately and hang would happen in under 5 minutes.
I'm also testing on ext3 now, and so far no problems there either. I
don't know whether this fixes the problem reported above, but it fixes a
real problem for me.
Cc: "Jorge Boncompte [DTI2]" <jorge@dti2.net>
Reported-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
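The lost update can be replayed deterministically in a single thread by treating each CPU's register as a local variable. The flag values and function names below are assumptions for the sketch and do not match the kernel's I_* definitions.

```c
/* Hypothetical flag values; the kernel's I_* bits differ. */
#define I_SYNC (1UL << 0)
#define I_LOCK (1UL << 1)
#define I_NEW  (1UL << 2)

/* Replay the interleaving of the two non-atomic read-modify-writes. */
unsigned long replay_race(unsigned long i_state)
{
	unsigned long reg0, reg1;

	reg0 = i_state;              /* CPU0: load */
	reg0 &= ~(I_LOCK | I_NEW);   /* CPU0: clear the "new" bits */
	reg1 = i_state;              /* CPU1: load, still sees I_NEW */
	i_state = reg0;              /* CPU0: store */
	reg1 |= I_SYNC;              /* CPU1: modify its stale copy */
	i_state = reg1;              /* CPU1: store, undoing CPU0 */
	return i_state;
}

/* The fix in spirit: writeback simply skips inodes that are still new. */
int may_write_back(unsigned long i_state)
{
	return !(i_state & I_NEW);
}
```

Starting from I_LOCK|I_NEW, the replay ends with I_LOCK|I_NEW|I_SYNC, so a waiter on I_LOCK never wakes up; skipping I_NEW inodes means the second read-modify-write never happens on them.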
KOSAKI Motohiro [Thu, 12 Mar 2009 21:31:36 +0000 (14:31 -0700)]
memcg: use correct scan number at reclaim
Even when page reclaim is under mem_cgroup, the number of pages to scan is
determined by the status of the global LRU. Fix that.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Brown [Thu, 12 Mar 2009 21:31:36 +0000 (14:31 -0700)]
mfd: add support for WM8351 revision B
No software visible difference from revision A.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Spang [Thu, 12 Mar 2009 21:31:34 +0000 (14:31 -0700)]
acer-wmi: fix regression in backlight detection
Currently we disable the Acer WMI backlight device if there is no ACPI
backlight device. As a result, we end up with no backlight device at all.
We should instead disable it if there is an ACPI device, as the other
laptop drivers do. This regression was introduced in
febf2d9 ("Acer-WMI: fingers off backlight if video.ko is serving this functionality").
Each laptop driver with backlight support got a similar change around
febf2d9. The changes to the other drivers look correct; see e.g.
a598c82f for a similar but correct change. The regression is also in
2.6.28.
Signed-off-by: Michael Spang <mspang@csclub.uwaterloo.ca>
Acked-by: Thomas Renninger <trenn@suse.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carlos Corbacho <carlos@strangeworlds.co.uk>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 12 Mar 2009 21:31:33 +0000 (14:31 -0700)]
mmc: s3cmci: fix s3c2410_dma_config() arguments.
The s3cmci driver is calling s3c2410_dma_config with incorrect data for
the DCON register. The S3C2410_DCON_HWTRIG is implicit in the channel
configuration and the device selection of S3C2410_DCON_CH0_SDI is
incorrect as the DMA system may not select channel 0.
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Kerrisk [Thu, 12 Mar 2009 21:31:32 +0000 (14:31 -0700)]
MAINTAINERS: downgrade support for man-pages
Unfortunately, Linux Foundation funding for my work on
man-pages/testing/doc under the auspices of the LF documentation
fellowship ran out a short while ago (after earlier attempts
to seek funding, only Google stepped forward with a bit of further funding
for the position), so the patch below acknowledges something closer to
reality.
Unfortunately, there will (probably very) soon be a further downgrade from
"Maintained" to "Odd Fixes" or "Orphan", unless some funding miracle
occurs. So, if anyone is looking to become man-pages maintainer, there
may soon be an opening (okay, don't trample me in the rush ;-).)
Signed-off-by: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Thu, 12 Mar 2009 21:31:30 +0000 (14:31 -0700)]
ds2760_battery.c: fix division by zero
The 'battery remaining capacity' calculation in
drivers/power/ds2760_battery.c lacks a parameter check to a division
operation which causes the kernel to oops on my board.
[ 21.233750] Division by zero in kernel.
[ 21.237646] [<c002955c>] (__div0+0x0/0x20) from [<c012561c>] (Ldiv0+0x8/0x10)
[ 21.244816] [<c01bef34>] (ds2760_battery_read_status+0x0/0x2a4) from [<c01bf3a4>] (ds2760_battery_get_property+0x30/0xdc)
[ 21.255803] r8:c03a22c0 r7:c7886100 r6:00000009 r5:c782fe7c r4:c7886084
[ 21.262518] [<c01bf374>] (ds2760_battery_get_property+0x0/0xdc) from [<c01bde98>] (power_supply_show_property+0x48/0x114)
[ 21.273480] r6:c7996000 r5:00000009 r4:00000000
[ 21.278111] [<c01bde50>] (power_supply_show_property+0x0/0x114) from [<c01be158>] (power_supply_uevent+0x188/0x280)
[ 21.288537] r8:00000001 r7:c7886100 r6:c7996000 r5:000000b4 r4:00000000
[ 21.295222] [<c01bdfd0>] (power_supply_uevent+0x0/0x280) from [<c015c664>] (dev_uevent+0xd4/0x10c)
[ 21.304199] [<c015c590>] (dev_uevent+0x0/0x10c) from [<c0128440>] (kobject_uevent_env+0x180/0x390)
[ 21.313170] r5:00000000 r4:c78860ac
[ 21.316725] [<c01282c0>] (kobject_uevent_env+0x0/0x390) from [<c0128664>] (kobject_uevent+0x14/0x18)
[ 21.325850] [<c0128650>] (kobject_uevent+0x0/0x18) from [<c01bdc34>] (power_supply_changed_work+0x5c/0x70)
[ 21.335506] [<c01bdbd8>] (power_supply_changed_work+0x0/0x70) from [<c004d290>] (run_workqueue+0xbc/0x144)
[ 21.345167] r4:c7812040
[ 21.347716] [<c004d1d4>] (run_workqueue+0x0/0x144) from [<c004d94c>] (worker_thread+0xa8/0xbc)
[ 21.356296] r7:c7812040 r6:c7820b00 r5:c782ffa4 r4:c7812048
[ 21.361957] [<c004d8a4>] (worker_thread+0x0/0xbc) from [<c0051008>] (kthread+0x5c/0x94)
[ 21.369971] r7:00000000 r6:c004d8a4 r5:c7812040 r4:c782e000
[ 21.375612] [<c0050fac>] (kthread+0x0/0x94) from [<c00403d0>] (do_exit+0x0/0x688)
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Szabolcs Gyurko <szabolcs.gyurko@tlt.hu>
Acked-by: Matt Reimer <mreimer@vpop.net>
Acked-by: Anton Vorontsov <cbou@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
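The missing parameter check amounts to guarding the divisor. A minimal sketch follows, with a hypothetical helper and parameter names that do not come from the driver; returning 0 for an unknown full capacity is a choice made for the example.

```c
/*
 * Hypothetical helper: remaining capacity as a percentage of full
 * capacity. A zero full capacity (for example, not yet read from the
 * chip) must not reach the division.
 */
int remaining_capacity_percent(int remaining_uAh, int full_uAh)
{
	if (full_uAh == 0)
		return 0; /* assumption: report 0% rather than divide */

	return (int)(((long long)remaining_uAh * 100) / full_uAh);
}
```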
Li Zefan [Thu, 12 Mar 2009 21:31:29 +0000 (14:31 -0700)]
vfs: add missing unlock in sget()
In sget(), destroy_super(s) is called with s->s_umount held, which makes
lockdep unhappy.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Thu, 12 Mar 2009 21:31:28 +0000 (14:31 -0700)]
pipe_rdwr_fasync: fix the error handling to prevent the leak/crash
If the second fasync_helper() fails, pipe_rdwr_fasync() returns the error
but leaves the file on ->fasync_readers.
This was always wrong, but since 233e70f4228e78eb2f80dc6650f65d3ae3dbf17c
"saner FASYNC handling on file close" we have the new problem. Because in
this case setfl() doesn't set FASYNC bit, __fput() will not do
->fasync(0), and we leak fasync_struct with ->fa_file pointing to the
freed file.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
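The error handling described above follows a register-both-or-neither pattern, sketched here with stand-ins. The fake_list type and fake_fasync_helper() are hypothetical and merely count registrations; they are not the kernel's fasync API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a fasync list: it only counts entries. */
struct fake_list {
	int entries;
	bool fail_next; /* force the next registration to fail */
};

static int fake_fasync_helper(struct fake_list *l, int on)
{
	if (on && l->fail_next)
		return -1;
	l->entries += on ? 1 : -1;
	return 0;
}

/* Register on both lists, or on neither if the second step fails. */
int rdwr_fasync(struct fake_list *readers, struct fake_list *writers, int on)
{
	int ret = fake_fasync_helper(readers, on);

	if (ret < 0)
		return ret;

	ret = fake_fasync_helper(writers, on);
	if (ret < 0 && on)
		fake_fasync_helper(readers, 0); /* undo the first step */
	return ret;
}
```

Without the undo step the file would stay on the readers list after an error return, which is the leak the message describes.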
Daniel Mack [Thu, 12 Mar 2009 21:31:25 +0000 (14:31 -0700)]
drivers/w1/masters/w1-gpio.c: fix read_bit()
W1 master implementations are expected to return 0 or 1 from their
read_bit() function. However, not all platforms do return these values
from gpio_get_value() - namely PXAs won't. Hence the w1 gpio-master needs
to break the result down to 0 or 1 itself.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Ville Syrjala <syrjala@sci.fi>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
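Breaking the result down to 0 or 1 can be sketched as below. fake_gpio_get_value() is a hypothetical stand-in for a platform GPIO read that returns the masked register value, which is the behaviour the message attributes to PXA.

```c
/* Stand-in for a platform GPIO read that may return any non-zero value. */
static int fake_gpio_get_value(unsigned int raw_register, unsigned int pin)
{
	return (int)(raw_register & (1u << pin));
}

/* read_bit() must hand back exactly 0 or 1 to the w1 core. */
int read_bit(unsigned int raw_register, unsigned int pin)
{
	return fake_gpio_get_value(raw_register, pin) ? 1 : 0;
}
```

With pin 6 set, the raw read yields 0x40; the normalized result is 1.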
akpm@linux-foundation.org [Thu, 12 Mar 2009 21:31:24 +0000 (14:31 -0700)]
uml: fix WARNING: vmlinux: 'memcpy' exported twice
Fix the following warning on x86_64:
LD vmlinux.o
MODPOST vmlinux.o
WARNING: vmlinux: 'memcpy' exported twice. Previous export was in vmlinux
For x86_64, this symbol is already exported from arch/um/sys-x86_64/ksyms.c.
Reported-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Renzo Davoli [Thu, 12 Mar 2009 21:31:23 +0000 (14:31 -0700)]
UML on UML fixed: it did not start
It is currently impossible to run a user-mode linux machine inside another
user-mode linux (UML on UML). It breaks after a few instructions. When
it tries to check whether SYSEMU is installed, the inner UML receives an
inconsistent result from the outer UML.
This is the output of a broken attempt:
$ ./linux mem=256m ubd0=cow
Locating the bottom of the address space ... 0x0
Locating the top of the address space ... 0xc0000000
Core dump limits :
soft - 0
hard - NONE
Checking that ptrace can change system call numbers...OK
Checking ptrace new tags for syscall emulation...unsupported
Checking syscall emulation patch for ptrace...check_sysemu : expected SIGTRAP, got status = 256
$
The problem is the following:
PTRACE_SYSCALL/SINGLESTEP is currently managed inside arch_ptrace for ARCH=um.
PTRACE_SYSEMU/SYSEMU_SINGLESTEP is not captured in arch_ptrace's switch,
therefore it is erroneously passed back to ptrace_request (in
kernel/ptrace).
This simple patch forces ptrace to return an error on
PTRACE_SYSEMU/SYSEMU_SINGLESTEP as it is unsupported on ARCH=um, and fixes
the problem.
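The dispatch described above can be sketched as follows. This is an illustration only: the `FAKE_*` request numbers and both functions are hypothetical, and the choice of `-EIO` as the error value is an assumption.

```c
#include <errno.h>

/* Hypothetical request numbers, for illustration only. */
enum fake_ptrace_req {
    FAKE_PTRACE_SYSCALL = 1,
    FAKE_PTRACE_SINGLESTEP,
    FAKE_PTRACE_SYSEMU,
    FAKE_PTRACE_SYSEMU_SINGLESTEP,
    FAKE_PTRACE_OTHER
};

/* Stand-in for the generic fallback handler; always reports success. */
static long fake_ptrace_request(enum fake_ptrace_req req)
{
    (void)req;
    return 0;
}

static long fake_arch_ptrace(enum fake_ptrace_req req)
{
    switch (req) {
    case FAKE_PTRACE_SYSCALL:
    case FAKE_PTRACE_SINGLESTEP:
        return 0;    /* handled by the architecture code */
    case FAKE_PTRACE_SYSEMU:
    case FAKE_PTRACE_SYSEMU_SINGLESTEP:
        return -EIO; /* unsupported here: fail explicitly */
    default:
        return fake_ptrace_request(req); /* everything else goes to the generic code */
    }
}
```

Catching the unsupported requests in the switch keeps them from reaching the generic handler, so the caller probing for SYSEMU gets a clear error.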
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alex Chiang [Sun, 8 Mar 2009 02:35:47 +0000 (19:35 -0700)]
PCIe: portdrv: call pci_disable_device during remove
The PCIe port driver calls pci_enable_device() during probe but
never calls pci_disable_device() during remove.
Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Prakash Punnoor [Fri, 6 Mar 2009 09:10:35 +0000 (10:10 +0100)]
pci: Fix typo in message while disabling HT MSI mapping
"Enabling" should read "Disabling"
Signed-off-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Prakash Punnoor [Thu, 5 Mar 2009 23:45:12 +0000 (00:45 +0100)]
pci: don't disable too many HT MSI mapping
Prakash's system needs MSI disabled on some bridges, but not all.
This seems to be the minimal fix for 2.6.29, but should be replaced
during 2.6.30.
Signed-off-by: Prakash Punnoor <prakash@punnoor.de>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Michael Ellerman [Fri, 6 Mar 2009 03:39:14 +0000 (14:39 +1100)]
powerpc/pseries: The RPA PCI hotplug driver depends on EEH
The RPA PCI hotplug driver calls EEH routines, so should depend on
EEH. Also PPC_PSERIES implies PPC64, so remove that.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Alex Chiang [Fri, 6 Mar 2009 02:28:40 +0000 (19:28 -0700)]
PCIe: AER: during disable, check subordinate before walking
Commit 47a8b0cc (Enable PCIe AER only after checking firmware
support) wants to walk the PCI bus in the remove path to disable
AER, and calls pci_walk_bus for downstream bridges.
Unfortunately, in the remove path, we remove devices and bridges
in a depth-first manner, starting with the furthest downstream
bridge and working our way backwards.
The furthest downstream bridges will not have a dev->subordinate,
and we hit a NULL deref in pci_walk_bus.
Check for dev->subordinate first before attempting to walk the
PCI hierarchy below us.
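A toy model of the guard, assuming simplified structures. `struct fake_bus`, `struct fake_dev` and both functions are hypothetical; only the field name `subordinate` comes from the description above.

```c
#include <stddef.h>

struct fake_bus { int ndevs; };
struct fake_dev { struct fake_bus *subordinate; };

/* Stand-in for a bus walk: dereferences the bus unconditionally. */
static int fake_walk_bus(struct fake_bus *bus)
{
    return bus->ndevs;
}

/* Walk below the device only if there is a bus below it. */
static int fake_disable_aer_below(struct fake_dev *dev)
{
    if (!dev->subordinate)
        return 0; /* leaf bridge in the remove path: nothing to walk */
    return fake_walk_bus(dev->subordinate);
}
```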
Acked-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Alexander Duyck [Thu, 5 Mar 2009 18:57:28 +0000 (13:57 -0500)]
PCI: Add PCI quirk to disable L0s ASPM state for 82575 and 82598
This patch is intended to disable L0s ASPM link state for 82598 (ixgbe)
parts due to the fact that it is possible to corrupt TX data when coming
back out of L0s on some systems. The workaround had been added for 82575
(igb) previously, but did not use the ASPM API. This quirk uses the ASPM
API to prevent the ASPM subsystem from re-enabling the L0s state.
Instead of adding the fix in igb to the ixgbe driver as well it was
decided to move it into a pci quirk. It is necessary to move the fix out
of the driver and into a pci quirk in order to prevent the issue from
occurring prior to driver load to handle the possibility of the device being
passed to a VM via direct assignment.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Linus Torvalds [Thu, 12 Mar 2009 16:27:53 +0000 (09:27 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sunhme: Fix qfe parent detection.
sparc64: Fix lost interrupts on sun4u.
sparc64: wait_event_interruptible_timeout may return -ERESTARTSYS
jsflash: stop defining MAJOR_NR
Linus Torvalds [Thu, 12 Mar 2009 16:25:10 +0000 (09:25 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: IP27: Enable RAID5 module
MIPS: TXx9: update defconfigs
MIPS: NEC VR5500 processor support fixup
MIPS: Fix build of non-CONFIG_SYSVIPC version of sys_32_ipc
Andrew Klossner [Thu, 12 Mar 2009 12:36:39 +0000 (13:36 +0100)]
hwmon: (f75375s) Remove unnecessary and confusing initialization
f75375_probe calls i2c_get_clientdata to initialize the data pointer,
but there isn't yet any client data to get, and the value is never
used before the variable is assigned a new value seven lines later.
The call doesn't hurt anything and wastes only a couple of cycles.
The reason to fix it is that this module serves as an example to
hackers writing new hwmon drivers, and this part of the example is
confusing.
Signed-off-by: Andrew Klossner <andrew@cesa.opbu.xerox.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Thu, 12 Mar 2009 12:36:39 +0000 (13:36 +0100)]
hwmon: (it87) Properly decode -128 degrees C temperature
The it87 driver is reporting -128 degrees C as +128 degrees C.
That's not a terribly likely temperature value but let's still
get it right, especially when it simplifies the code.
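A short sketch of the decoding, assuming the register holds an 8-bit two's-complement value and the result is wanted in millidegrees C. `temp_from_reg()` is a hypothetical helper, not the driver's actual code.

```c
#include <stdint.h>

/* Convert a raw 8-bit two's-complement register value to millidegrees C. */
static int temp_from_reg(uint8_t reg)
{
    int val = reg;

    if (val >= 128)
        val -= 256; /* sign-extend: 0x80 is -128, 0xff is -1 */
    return val * 1000;
}
```

Treating the register as signed over its whole range removes the need to special-case 0x80.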
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Darrick J. Wong [Thu, 12 Mar 2009 12:36:38 +0000 (13:36 +0100)]
hwmon: (lm90) Document support for the MAX6648/6692 chips
Update documentation to prevent further confusion/duplication.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Thu, 12 Mar 2009 12:36:38 +0000 (13:36 +0100)]
hwmon: (abituguru3) Fix I/O error handling
Fix a logic bug reported by Roel Kluin, by rewriting the error
handling code in a clearer way.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Alistair John Strachan <alistair@devzero.co.uk>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Rusty Russell [Thu, 12 Mar 2009 20:35:44 +0000 (14:35 -0600)]
cpumask: mm_cpumask for accessing the struct mm_struct's cpu_vm_mask.
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
mm_cpumask() now and won't see a difference as the changes roll into
linux-next.
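The accessor idea can be sketched with toy types. Everything prefixed `toy_` is hypothetical; only the field name `cpu_vm_mask` is taken from the description above.

```c
/* Toy representation; the real types are more involved. */
struct toy_cpumask { unsigned long bits; };

struct toy_mm {
    struct toy_cpumask cpu_vm_mask; /* representation may change later */
};

/* Callers go through the accessor instead of touching the field. */
static struct toy_cpumask *toy_mm_cpumask(struct toy_mm *mm)
{
    return &mm->cpu_vm_mask;
}

/* Set one CPU in a mask; cpu must be below the width of unsigned long. */
static void toy_cpumask_set_cpu(unsigned int cpu, struct toy_cpumask *mask)
{
    mask->bits |= 1UL << cpu;
}
```

If the field later becomes a pointer or a trailing bitmap, only `toy_mm_cpumask()` has to change; callers such as `toy_cpumask_set_cpu(2, toy_mm_cpumask(&mm))` stay as they are.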
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 12 Mar 2009 20:35:43 +0000 (14:35 -0600)]
cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.
This allows us to change the representation (to a dangling bitmap or
cpumask_var_t) without breaking all the callers: they can use
tsk_cpumask() now and won't see a difference as the changes roll into
linux-next.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Phillip Lougher [Thu, 12 Mar 2009 03:23:48 +0000 (03:23 +0000)]
Squashfs: Valid filesystems are flagged as bad by the corrupted fs patch
The corrupted filesystem patch added a check against zlib trying to
output too much data in the presence of data corruption. This check
triggered if zlib_inflate asked to be called again (Z_OK) with
avail_out == 0 and no more output buffers available. This check proves
to be rather dumb, as it incorrectly catches the case where zlib has
generated all the output, but there are still input bytes to be processed.
This patch does a number of things. It removes the original check and
replaces it with code to not move to the next output buffer if there
are no more output buffers available, relying on zlib to error if it
wants an extra output buffer in the case of data corruption. It
also replaces the Z_NO_FLUSH flag with the more correct Z_SYNC_FLUSH
flag, and makes the error messages more understandable to
non-technical users.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de>
Steven Rostedt [Thu, 12 Mar 2009 02:00:13 +0000 (22:00 -0400)]
ring-buffer: only allocate buffers for online cpus
Impact: save on memory
Currently, a ring buffer is allocated for each "possible_cpus". On
some systems, this is the same as NR_CPUS. Thus, if a system defined
NR_CPUS = 64 but it only had 1 CPU, we could have possibly 63 useless
ring buffers taking up space. With a default buffer of 3 megs, this
could be quite drastic.
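The arithmetic behind that estimate, as a small sketch. `unused_ring_buffer_bytes()` is a hypothetical helper, and treating "3 megs" as 3 * 1024 * 1024 bytes is an assumption.

```c
#include <stddef.h>

/* Bytes held by ring buffers of CPUs that are possible but not online. */
static size_t unused_ring_buffer_bytes(unsigned int possible_cpus,
                                       unsigned int online_cpus,
                                       size_t bytes_per_cpu)
{
    if (online_cpus >= possible_cpus)
        return 0;
    return (size_t)(possible_cpus - online_cpus) * bytes_per_cpu;
}
```

For 64 possible CPUs, 1 online CPU and 3 MB per buffer this gives 63 * 3 MB = 189 MB.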
This patch changes the ring buffer code to only allocate ring buffers
for online CPUs. If a CPU goes off line, we do not free the buffer.
This is because the user may still have trace data in that buffer
that they would like to look at.
Perhaps in the future we could add code to delete a ring buffer if
the CPU is offline and the ring buffer becomes empty.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Wed, 11 Mar 2009 23:52:30 +0000 (19:52 -0400)]
tracing: fix trace_wait to know to wait on all cpus or just one
Impact: fix to task live locking on reading trace_pipe on one CPU
The same code is used for both trace_pipe (all CPUs) and the per_cpu
trace_pipe file. When there is no data to read, it will check for
signals and wait on the trace wait queue.
The problem happens with the per_cpu wait. The trace_wait code checks
all CPUs. Thus, if there's data in another CPU buffer, then it will
exit the wait, without checking for signals or waiting on the wait queue.
It would then try to read the empty buffer, and since that will just
return nothing, then it will try to wait again. Unfortunately, that will
again fail due to there still being data in the other buffers. This
ends up with a live lock for the task.
This patch fixes the trace_wait to be aware that the iterator may only
be waiting on a single buffer.
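A toy model of the corrected emptiness check, not the kernel implementation. `toy_trace_empty()`, `TOY_ALL_CPUS` and the `entries` array are hypothetical; the sketch only shows that a per-CPU reader must look at its own buffer and nothing else.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ALL_CPUS (-1)

/* entries[i] is the number of unread events in CPU i's buffer. */
static bool toy_trace_empty(const unsigned int *entries, size_t ncpus, int cpu)
{
    size_t i;

    if (cpu != TOY_ALL_CPUS)
        return entries[cpu] == 0; /* only the buffer being read matters */

    for (i = 0; i < ncpus; i++) {
        if (entries[i] != 0)
            return false;
    }
    return true;
}
```

With `entries = {0, 5}`, a reader of CPU 0 sees an empty buffer and can sleep, even though CPU 1 has data. Checking all CPUs in that case is what made the wait return immediately and loop.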
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Wed, 11 Mar 2009 18:33:00 +0000 (14:33 -0400)]
tracing: expand the ring buffers when an event is activated
To save memory, the tracer ring buffers are set to a minimum.
Activating a tracer expands the ring buffer size. This patch
adds the same expansion when an event is activated.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Steven Rostedt [Wed, 11 Mar 2009 17:42:01 +0000 (13:42 -0400)]
tracing: keep ring buffer to minimum size till used
Impact: less memory impact on systems not using tracer
When a kernel that has tracing configured boots up, it allocates
the default size of the ring buffer. This currently happens to be
1.4 Megs per possible CPU. This is quite a bit of wasted memory if
the system is never using the tracer.
The current solution is to keep the ring buffers to a minimum size
until the user uses them. Once a tracer is piped into the current_tracer
the ring buffer will be expanded to the default size. If the user
changes the size of the ring buffer, it will take the size given
by the user immediately.
If the user adds a "ftrace=" to the kernel command line, then the ring
buffers will be set to the default size on initialization.
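The sizing policy above can be modelled as a small state machine. This is a sketch under assumptions: `struct toy_tracer_state` and the `toy_*` functions are hypothetical, and sizes are plain byte counts.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_tracer_state {
    size_t size;     /* current per-CPU buffer size in bytes */
    size_t dflt;     /* size to expand to on first use */
    bool   expanded; /* true once the buffer has left its minimal size */
};

/* Boot: stay minimal unless the boot parameter asked for tracing. */
static void toy_init(struct toy_tracer_state *s, size_t min, size_t dflt,
                     bool boot_param)
{
    s->dflt = dflt;
    s->expanded = boot_param;
    s->size = boot_param ? dflt : min;
}

/* Selecting a tracer expands the buffer once, if still minimal. */
static void toy_set_tracer(struct toy_tracer_state *s)
{
    if (!s->expanded) {
        s->size = s->dflt;
        s->expanded = true;
    }
}

/* An explicit resize by the user takes effect immediately. */
static void toy_resize(struct toy_tracer_state *s, size_t size)
{
    s->size = size;
    s->expanded = true;
}
```

Marking the buffer as expanded on an explicit resize keeps a later tracer selection from overriding the size the user chose.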
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Linus Torvalds [Wed, 11 Mar 2009 21:29:03 +0000 (14:29 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: only issues a cache flush on unmount if barriers are enabled
xfs: prevent lockdep false positive in xfs_iget_cache_miss
xfs: prevent kernel crash due to corrupted inode log format
Ralf Baechle [Wed, 11 Mar 2009 20:08:50 +0000 (21:08 +0100)]
MIPS: IP27: Enable RAID5 module
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Wed, 4 Mar 2009 14:45:44 +0000 (23:45 +0900)]
MIPS: TXx9: update defconfigs
Enable following features:
* MTD (PHYSMAP)
* LED (LEDS_GPIO)
* RBTX4939
* 7SEGLED
* IDE (IDE_TX4938, IDE_TX4939)
* SMC91X
* RTC_DRV_TX4939
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Shinya Kuribayashi [Tue, 3 Mar 2009 09:05:51 +0000 (18:05 +0900)]
MIPS: NEC VR5500 processor support fixup
Current VR5500 processor support lacks some functions which are
expected to be configured/synthesized on arch initialization.
Here are some VR5500A spec notes:
* All execution hazards are handled in hardware.
* Once VR5500A stops the operation of the pipeline by WAIT instruction,
it could return from the standby mode only when either a reset, NMI
request, or all enabled interrupts is/are detected. In other words,
if interrupts are disabled by Status.IE=0, it stays in standby mode
even when interrupts are internally asserted.
Notes on WAIT: The operation of the processor is undefined if WAIT
insn is in the branch delay slot. The operation is also undefined
if WAIT insn is executed when Status.EXL and Status.ERL are set to 1.
* VR5500A core only implements the Load prefetch.
With these changes, it boots fine.
Signed-off-by: Shinya Kuribayashi <shinya.kuribayashi@necel.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Xiaotian Feng [Mon, 9 Mar 2009 01:45:12 +0000 (09:45 +0800)]
MIPS: Fix build of non-CONFIG_SYSVIPC version of sys_32_ipc
Signed-off-by: Xiaotian Feng <xiaotian.feng@windriver.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ingo Molnar [Wed, 11 Mar 2009 19:47:23 +0000 (20:47 +0100)]
Merge branch 'tip/tracing/ftrace' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Linus Torvalds [Wed, 11 Mar 2009 19:14:55 +0000 (12:14 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: fix EDID parser problem with positive/negative hsync/vsync
Linus Torvalds [Wed, 11 Mar 2009 19:14:04 +0000 (12:14 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
radeonfb/aty128fb: Disable broken early resume hook for PowerBooks
hvc_console: Remove tty->low_latency on pseries backends
powerpc: fix linkstation and storcenter compilation breakage
powerpc/4xx: Enable SERIAL_OF support by default for Virtex platforms
Linus Torvalds [Wed, 11 Mar 2009 19:09:45 +0000 (12:09 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/anholt/drm-intel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: fix 945 fence register writes for fence 8 and above.
drm/i915: Protect active fences on i915
drm/i915: Check to see if we've pinned all available fences
drm/i915: Check fence status on every pin.
drm/i915: First recheck for an empty fence register.
drm/i915: Fix bad \n in MTRR failure notice.
drm/i915: Don't restore palettes through VGA registers.
i915: add newline to i915_gem_object_pin failure msg
drm: Return EINVAL on duplicate objects in execbuffer object list
Linus Torvalds [Wed, 11 Mar 2009 19:04:51 +0000 (12:04 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: work around Fedora-11 x86-32 kernel failures on Intel Atom CPUs
OGAWA Hirofumi [Wed, 11 Mar 2009 17:03:23 +0000 (02:03 +0900)]
Fix _fat_bmap() locking
On the swapon() path, i_mutex is already held, so this uses i_alloc_sem
instead.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Laurent GUERBY <laurent@guerby.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Anholt [Wed, 11 Mar 2009 05:34:49 +0000 (22:34 -0700)]
drm/i915: fix 945 fence register writes for fence 8 and above.
The last 8 fence registers sit at a different offset, so when we went to set
fence number 8 in the lower offset, we instead set PGETBL_CTL, and the GPU
got all sorts of angry at us.
fd.o bug #20567. Easily reproducible by running glxgears and killing it about
6 times.
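A sketch of the split register layout described above. The two base offsets and the 4-byte stride are assumptions chosen for illustration, and `toy_fence_reg_offset()` is a hypothetical helper.

```c
#include <stdint.h>

#define TOY_FENCE_LOW_BASE  0x2000u /* fences 0..7 (assumed value) */
#define TOY_FENCE_HIGH_BASE 0x3000u /* fences 8..15 (assumed value) */

/* Map a fence number to its register offset, honoring the split. */
static uint32_t toy_fence_reg_offset(unsigned int regnum)
{
    if (regnum < 8)
        return TOY_FENCE_LOW_BASE + regnum * 4;
    return TOY_FENCE_HIGH_BASE + (regnum - 8) * 4;
}
```

Computing fence 8 as `TOY_FENCE_LOW_BASE + 8 * 4` would land just past the first bank, on whatever register happens to live there, which is the kind of stray write the commit describes.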
Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Wed, 11 Feb 2009 14:26:47 +0000 (14:26 +0000)]
drm/i915: Protect active fences on i915
The i915 also uses the fence registers for GPU access to tiled buffers so
we cannot reallocate one whilst it is on the active list. By performing a
LRU scan of the fenced buffers we also avoid the possibility of
waiting on a pinned, or otherwise unusable, buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Ingo Molnar [Tue, 10 Mar 2009 21:31:03 +0000 (22:31 +0100)]
x86: work around Fedora-11 x86-32 kernel failures on Intel Atom CPUs
Impact: work around boot crash
Work around Intel Atom erratum AAH41 (probabilistically) - it's triggering
in the field.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Wu Fengguang [Wed, 11 Mar 2009 01:00:04 +0000 (09:00 +0800)]
proc: fix kflags to uflags copying in /proc/kpageflags
Fix kpf_copy_bit(src,dst) to be kpf_copy_bit(dst,src) to match the
actual call patterns, e.g. kpf_copy_bit(kflags, KPF_LOCKED, PG_locked).
This misplacement of src/dst only affected reporting of PG_writeback,
PG_reclaim and PG_buddy. For others kflags==uflags so not affected.
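A minimal sketch of a bit-copy helper with the destination-first argument order that the call sites use. `toy_kpf_copy_bit()` is a hypothetical stand-in, not the kernel function itself.

```c
#include <stdint.h>

/* Copy bit number kbit of kflags into bit position ubit of the result. */
static uint64_t toy_kpf_copy_bit(uint64_t kflags, int ubit, int kbit)
{
    return ((kflags >> kbit) & 1) << ubit;
}
```

When `ubit` and `kbit` are the same number, swapping them changes nothing, which is why only flags whose kernel and user bit numbers differ were misreported.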
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Wilson [Wed, 11 Feb 2009 14:26:46 +0000 (14:26 +0000)]
drm/i915: Check to see if we've pinned all available fences
We need to check and report if there are no available fences - or else we
spin endlessly waiting for a buffer to magically unpin itself.
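The check can be sketched as a scan that reports failure instead of retrying forever. `toy_find_stealable_fence()` is hypothetical, and returning `-ENOMEM` is an assumption about the error value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Return the index of the first unpinned fence, or -ENOMEM if every
 * fence is pinned and waiting could never succeed. */
static int toy_find_stealable_fence(const bool *pinned, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!pinned[i])
            return (int)i;
    }
    return -ENOMEM;
}
```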
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Wed, 11 Feb 2009 14:26:45 +0000 (14:26 +0000)]
drm/i915: Check fence status on every pin.
As we may steal the fence register of an unpinned buffer for another,
every time we repin the buffer we need to recheck whether it needs to be
allocated a fence.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Wed, 11 Feb 2009 14:26:44 +0000 (14:26 +0000)]
drm/i915: First recheck for an empty fence register.
If we wait upon a request and successfully unbind a buffer occupying a
fence register, then that slot will be freed and cause a NULL dereference
upon rescanning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Pantelis Koukousoulas [Tue, 10 Mar 2009 11:16:14 +0000 (13:16 +0200)]
drm: fix EDID parser problem with positive/negative hsync/vsync
Comparing the layouts of struct detail_pixel_timing with
x.org's struct detailed_timings and how those are handled,
it appears that the hsync_positive and vsync_positive
fields are backwards.
This patch fixes https://bugs.freedesktop.org/show_bug.cgi?id=20019
for me. It was tested on 2 monitors, LG FLATRON L225WS 22" and
a YAKUMO 17" for which more details are unknown.
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Benjamin Herrenschmidt [Tue, 10 Mar 2009 23:45:17 +0000 (10:45 +1100)]
radeonfb/aty128fb: Disable broken early resume hook for PowerBooks
radeonfb and aty128fb have a special hook called by the PowerMac platform
code very very early on resume from sleep to bring the screen back. This
is useful for debugging wakeup problems, but unfortunately, this also became
a source of problems of its own.
The hook is called extremely early, with interrupts still off, and the code
path involved with that code nowadays relies on things like taking mutexes,
GFP_KERNEL allocations, etc...
In addition, the driver now relies on the PCI core to restore the standard
config space before calling resume which doesn't happen with this early
code path.
I'm keeping the code in but commented out along with a fixup call to
pci_restore_state(). The reason is that I still want to make it easy to
re-enable temporarily to track wake up problems, and it's possible that
I can revive it at some stage if we make sleeping things safe to call
in early resume using a system state.
In the meantime, this should fix several reported regressions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Mon, 9 Mar 2009 14:36:15 +0000 (14:36 +0000)]
hvc_console: Remove tty->low_latency on pseries backends
The hvcs and hvsi backends both set tty->low_latency to one, along
with more or less scary comments regarding bugs or races that would
happen if not doing so.
However, they also both call tty_flip_buffer_push() in contexts where
it's illegal to do so since some recent tty changes (or at least it
may have been illegal always but it now blows up) when low_latency is
set (ie, hard interrupt or with spinlock held and irqs disabled).
This removes the setting for now to get them back to working condition,
we'll have to address the races described in the comments separately
if they are still an issue (some of this might have been fixed already).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Tue, 10 Mar 2009 23:40:29 +0000 (10:40 +1100)]
Merge commit 'gcl/merge' into merge