Peter Zijlstra [Tue, 28 Apr 2009 12:56:18 +0000 (14:56 +0200)]
perf_counter tools: fix Documentation/perf_counter build error
Mike Galbraith reported:
> marge:..Documentation/perf_counter # make
> CC builtin-stat.o
> In file included from builtin-stat.c:71:
> /usr/include/ctype.h:102: error: expected expression before ‘]’ token
Remove the ctype.h include.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 27 Apr 2009 06:02:14 +0000 (08:02 +0200)]
perf_counter tools: move helper library to util/*
Clean up the top level directory a bit by moving all the helper libraries
to util/*.[ch].
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 18:38:21 +0000 (20:38 +0200)]
perfcounters, sched: remove __task_delta_exec()
This function was left orphan by the latest round of sw-counter
cleanups.
[ Impact: remove unused kernel function ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 14:13:46 +0000 (16:13 +0200)]
perf_counter tools: fix 'make install'
Remove Git leftovers from this area.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 14:05:55 +0000 (16:05 +0200)]
perf_counter tools: add 'perf help'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 14:01:30 +0000 (16:01 +0200)]
perf_counter tools: fix --version
Hook up the 'perf version' built-in command.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 13:58:01 +0000 (15:58 +0200)]
perf_counter tools: add 'perf record' command
Move perf-record.c into the perf suite of commands.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 13:52:29 +0000 (15:52 +0200)]
perf_counter tools: add help texts
Add Documentation/perf-stat.txt and Documentation/perf-top.txt.
The template that was used for it: Documentation/git-add.txt from Git.
Fix up small bugs to make these help texts show up both in the 'perf'
common-command summary output screen, and on the individual help screens.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 13:37:32 +0000 (15:37 +0200)]
perf_counter tools: separate kerneltop into 'perf top' and 'perf stat'
Lets use the Git framework of built-in commands.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 13:22:22 +0000 (15:22 +0200)]
perf_counter tools: clean up after introduction of the Git command framework
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 13:00:56 +0000 (15:00 +0200)]
perf_counter tools: add in basic glue from Git
First very raw version at having a central 'perf' command and
a list of subcommands:
perf top
perf stat
perf record
perf report
...
This is done by picking up Git's collection of utility functions,
and hacking them to build fine in this new environment.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 20 Apr 2009 11:32:07 +0000 (13:32 +0200)]
perf_counter: copy in Git's top Makefile
We'd like to have a similar user-space structure as Git has, for the
perfcounter tools - so copy in Git's toplevel makefile as-is.
We'll strip it down in subsequent commits to make it fit the
perfcounters code.
The Git version used:
66996ec: Sync with 1.6.2.4
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Whitehouse [Wed, 15 Apr 2009 15:55:05 +0000 (16:55 +0100)]
perfcounters: export perf_tpcounter_event
Needed for modular tracepoint support.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Luis Henriques [Mon, 13 Apr 2009 19:24:50 +0000 (20:24 +0100)]
perf_counter: fix alignment in /proc/interrupts
Trivial fix on columns alignment in /proc/interrupts file.
Signed-off-by: Luis Henriques <henrix@sapo.pt>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <
20090413192449.GA3920@hades.domain.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 9 Apr 2009 08:53:46 +0000 (10:53 +0200)]
perf_counter: log full path names
Impact: fix perf-report output for /home mounted binaries, etc.
dentry_path() only provide path-names up to the mount root, which is
unsuited for out purpose, use d_path() instead.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090409085524.
601794134@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 9 Apr 2009 08:53:45 +0000 (10:53 +0200)]
perf_counter: sysctl for system wide perf counters
Impact: add sysctl for paranoid/relaxed perfcounters policy
Allow the use of system wide perf counters to everybody, but provide
a sysctl to disable it for the paranoid security minded.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090409085524.
514046352@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Thu, 9 Apr 2009 08:53:44 +0000 (10:53 +0200)]
perf_counter: optimize mmap/comm tracking
Impact: performance optimization
The mmap/comm tracking code does quite a lot of work before it discovers
there's no interest in it, avoid that by keeping a counter.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090409085524.
427173196@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 9 Apr 2009 07:50:04 +0000 (09:50 +0200)]
perf_counter tools: include PID in perf-report output, tweak user/kernel printut
It's handier than an <unknown> entry.
Also replace the kernel/user column with a more compact version:
0.52 cc1 [k] page_fault
0.57 :0 [k] _spin_lock
0.59 :7506 [.] <unknown>
0.69 as [.] /usr/bin/as: <unknown>
0.76 cc1 [.] /lib64/libc-2.8.so: _int_free
0.92 cc1 [k] clear_page_c
1.00 :7465 [.] <unknown>
1.43 cc1 [.] /lib64/libc-2.8.so: memset
1.86 cc1 [.] /lib64/libc-2.8.so: _int_malloc
70.33 cc1 [.] /usr/libexec/gcc/x86_64-redhat-linux/4.3.2/cc1: <unknown>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 9 Apr 2009 07:48:22 +0000 (09:48 +0200)]
perf_counter: fix off task->comm by one
strlen() does not include the \0.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Thu, 9 Apr 2009 04:42:56 +0000 (14:42 +1000)]
perf_counter: powerpc: add nmi_enter/nmi_exit calls
Impact: fix potential deadlocks on powerpc
Now that the core is using in_nmi() (added in
e30e08f6, "perf_counter:
fix NMI race in task clock"), we need the powerpc perf_counter_interrupt
to call nmi_enter() and nmi_exit() in those cases where the interrupt
happens when interrupts are soft-disabled.
If interrupts were soft-enabled, we can treat it as a regular interrupt
and do irq_enter/irq_exit around the whole routine. This lets us get rid
of the test_perf_counter_pending() call at the end of
perf_counter_interrupt, thus simplifying things a little.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18909.31952.873098.336615@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Wed, 8 Apr 2009 23:27:37 +0000 (09:27 +1000)]
perf_counter: add MAINTAINERS entry
This adds an entry in MAINTAINERS for the perf_counter subsystem.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18909.13033.345975.434902@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:33 +0000 (15:01 +0200)]
perf_counter: allow for data addresses to be recorded
Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.
For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.
x86 doesn't seem capable of providing this atm.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130409.
394816925@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:32 +0000 (15:01 +0200)]
perf_counter: move PERF_RECORD_TIME
Move PERF_RECORD_TIME so that all the fixed length items come before
the variable length ones.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130409.
307926436@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:31 +0000 (15:01 +0200)]
perf_counter: some simple userspace profiling
# perf-record make -j4 kernel/
# perf-report | tail -15
0.39 cc1 [kernel] lock_acquired
0.42 cc1 [kernel] lock_acquire
0.51 cc1 [ user ] /lib64/libc-2.8.90.so: _int_free
0.51 as [kernel] clear_page_c
0.53 cc1 [ user ] /lib64/libc-2.8.90.so: memcpy
0.56 cc1 [ user ] /lib64/libc-2.8.90.so: _IO_vfprintf
0.63 cc1 [kernel] lock_release
0.67 cc1 [ user ] /lib64/libc-2.8.90.so: strlen
0.68 cc1 [kernel] debug_smp_processor_id
1.38 cc1 [ user ] /lib64/libc-2.8.90.so: _int_malloc
1.55 cc1 [ user ] /lib64/libc-2.8.90.so: memset
1.77 cc1 [kernel] __lock_acquire
1.88 cc1 [kernel] clear_page_c
3.61 as [ user ] /usr/bin/as: <unknown>
59.16 cc1 [ user ] /usr/libexec/gcc/x86_64-redhat-linux/4.3.2/cc1: <unknown>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <
20090408130409.
220518450@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:30 +0000 (15:01 +0200)]
perf_counter: track task-comm data
Similar to the mmap data stream, add one that tracks the task COMM field,
so that the userspace reporting knows what to call a task.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130409.
127422406@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:29 +0000 (15:01 +0200)]
perf_counter: add some comments
Add a few comments because I was forgetting what field what for what
functionality.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130409.
036984214@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:28 +0000 (15:01 +0200)]
perf_counter: kerneltop: keep up with ABI changes
Update kerneltop to use PERF_EVENT_MISC_OVERFLOW
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130408.
947197470@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:27 +0000 (15:01 +0200)]
perf_counter: use misc field to widen type
Push the PERF_EVENT_COUNTER_OVERFLOW bit into the misc field so that
we can have the full 32bit for PERF_RECORD_ bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130408.
891867663@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:26 +0000 (15:01 +0200)]
perf_counter: provide misc bits in the event header
Limit the size of each record to 64k (or should we count in multiples
of u64 and have a 512K limit?), this gives 16 bits or spare room in the
header, which we can use for misc bits, so as to not have to grow the
record with u64 every time we have a few bits to report.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130408.
769271806@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 8 Apr 2009 13:01:25 +0000 (15:01 +0200)]
perf_counter: fix NMI race in task clock
We should not be updating ctx->time from NMI context, work around that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <
20090408130408.
681326666@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hidetoshi Seto [Wed, 25 Mar 2009 01:50:34 +0000 (10:50 +0900)]
x86: smarten /proc/interrupts output for new counters
Now /proc/interrupts of tip tree has new counters:
CNT: Performance counter interrupts
Format change of output, as like that by commit:
commit
7a81d9a7da03d2f27840d659f97ef140d032f609
x86: smarten /proc/interrupts output
should be applied to these new counters too.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <
49C98DEA.
8060208@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Wed, 8 Apr 2009 10:30:18 +0000 (20:30 +1000)]
perf_counter: powerpc: set sample enable bit for marked instruction events
Impact: enable access to hardware feature
POWER processors have the ability to "mark" a subset of the instructions
and provide more detailed information on what happens to the marked
instructions as they flow through the pipeline. This marking is
enabled by the "sample enable" bit in MMCRA, and there are
synchronization requirements around setting and clearing the bit.
This adds logic to the processor-specific back-ends so that they know
which events relate to marked instructions and set the sampling enable
bit if any event that we want to put on the PMU is a marked instruction
event. It also adds logic to the generic powerpc code to do the
necessary synchronization if that bit is set.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31930.1024.228867@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Paul Mackerras [Wed, 8 Apr 2009 10:30:10 +0000 (20:30 +1000)]
perf_counter: fix powerpc build
Commit
4af4998b ("perf_counter: rework context time") changed struct
perf_counter_context to have a 'time' field instead of a 'time_now'
field, but neglected to fix the place in the powerpc perf_counter.c
where the time_now field was accessed. This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31922.411398.147810@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 8 Apr 2009 08:35:30 +0000 (10:35 +0200)]
Merge commit 'v2.6.30-rc1' into perfcounters/core
Conflicts:
arch/powerpc/include/asm/systbl.h
arch/powerpc/include/asm/unistd.h
include/linux/init_task.h
Merge reason: the conflicts are non-trivial: PowerPC placement
of sys_perf_counter_open has to be mixed with the
new preadv/pwrite syscalls.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Tue, 7 Apr 2009 21:25:01 +0000 (14:25 -0700)]
Linux 2.6.30-rc1
Linus Torvalds [Tue, 7 Apr 2009 21:11:07 +0000 (14:11 -0700)]
Merge branch 'core/softlockup' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core/softlockup' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: make DETECT_HUNG_TASK default depend on DETECT_SOFTLOCKUP
softlockup: move 'one' to the softlockup section in sysctl.c
softlockup: ensure the task has been switched out once
softlockup: remove timestamp checking from hung_task
softlockup: convert read_lock in hung_task to rcu_read_lock
softlockup: check all tasks in hung_task
softlockup: remove unused definition for spawn_softlockup_task
softlockup: fix potential race in hung_task when resetting timeout
softlockup: fix to allow compiling with !DETECT_HUNG_TASK
softlockup: decouple hung tasks check from softlockup detection
Linus Torvalds [Tue, 7 Apr 2009 21:10:10 +0000 (14:10 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y
branch tracer: Fix for enabling branch profiling makes sparse unusable
ftrace: Correct a text align for event format output
Update /debug/tracing/README
tracing/ftrace: alloc the started cpumask for the trace file
tracing, x86: remove duplicated #include
ftrace: Add check of sched_stopped for probe_sched_wakeup
function-graph: add proper initialization for init task
tracing/ftrace: fix missing include string.h
tracing: fix incorrect return type of ns2usecs()
tracing: remove CALLER_ADDR2 from wakeup tracer
blktrace: fix pdu_len when tracing packet command requests
blktrace: small cleanup in blk_msg_write()
blktrace: NUL-terminate user space messages
tracing: move scripts/trace/power.pl to scripts/tracing/power.pl
Linus Torvalds [Tue, 7 Apr 2009 21:07:52 +0000 (14:07 -0700)]
Merge branch 'irq/threaded' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq/threaded' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: fix devres.o build for GENERIC_HARDIRQS=n
genirq: provide old request_irq() for CONFIG_GENERIC_HARDIRQ=n
genirq: threaded irq handlers review fixups
genirq: add support for threaded interrupts to devres
genirq: add threaded interrupt handler support
Trond Myklebust [Tue, 7 Apr 2009 21:02:53 +0000 (14:02 -0700)]
NFS: Fix the return value in nfs_page_mkwrite()
Commit
c2ec175c39f62949438354f603f4aa170846aabb ("mm: page_mkwrite
change prototype to match fault") exposed a bug in the NFS
implementation of page_mkwrite. We should be returning 0 on success...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 7 Apr 2009 18:24:19 +0000 (11:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: pci_slot: grab refcount on slot's bus
PCI Hotplug: acpiphp: grab refcount on p2p subordinate bus
PCI: allow PCI core hotplug to remove PCI root bus
PCI: Fix oops in pci_vpd_truncate
PCI: don't corrupt enable_cnt when doing manual resource alignment
PCI: annotate pci_rescan_bus as __ref, not __devinit
PCI-IOV: fix missing kernel-doc
PCI: Setup disabled bridges even if buses are added
PCI: SR-IOV quirk for Intel 82576 NIC
Linus Torvalds [Tue, 7 Apr 2009 18:06:41 +0000 (11:06 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
loop: mutex already unlocked in loop_clr_fd()
cfq-iosched: don't let idling interfere with plugging
block: remove unused REQ_UNPLUG
cfq-iosched: kill two unused cfqq flags
cfq-iosched: change dispatch logic to deal with single requests at the time
mflash: initial support
cciss: change to discover first memory BAR
cciss: kernel scan thread for MSA2012
cciss: fix residual count for block pc requests
block: fix inconsistency in I/O stat accounting code
block: elevator quiescing helpers
Linus Torvalds [Tue, 7 Apr 2009 14:59:41 +0000 (07:59 -0700)]
Fix build errors due to CONFIG_BRANCH_TRACER=y
The code that enables branch tracing for all (non-constant) branches
plays games with the preprocessor and #define's the C 'if ()' construct
to do tracing.
That's all fine, but it fails for some unusual but valid C code that is
sometimes used in macros, notably by the intel-iommu code:
if (i=drhd->iommu, drhd->ignored) ..
because now the preprocessor complains about multiple arguments to the
'if' macro.
So make the macro expansion of this particularly horrid trick use
varargs, and handle the case of comma-expressions in if-statements. Use
another macro to do it cleanly in just one place.
This replaces a patch by David (and acked by Steven) that did this all
inside that one already-too-horrid macro.
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 7 Apr 2009 15:54:43 +0000 (08:54 -0700)]
Merge branch 'for-2.6.30' of git://git./linux/kernel/git/broonie/sound-2.6
* 'for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound-2.6:
ASoC: TWL4030: Compillation error fix
Linus Torvalds [Tue, 7 Apr 2009 15:53:38 +0000 (08:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (36 commits)
ALSA: hda - Add VREF powerdown sequence for another board
ALSA: oss - volume control for CSWITCH and CROUTE
ALSA: hda - add missing comma in ad1884_slave_vols
sound: usb-audio: allow period sizes less than 1 ms
sound: usb-audio: save data packet interval in audioformat structure
sound: usb-audio: remove check_hw_params_convention()
sound: usb-audio: show sample format width in proc file
ASoC: fsl_dma: Pass the proper device for dma mapping routines
ASoC: Fix null dereference in ak4535_remove()
ALSA: hda - enable SPDIF output for Intel DX58SO board
ALSA: snd-atmel-abdac: increase periods_min to 6 instead of 4
ALSA: snd-atmel-abdac: replace bus_id with dev_name()
ALSA: snd-atmel-ac97c: replace bus_id with dev_name()
ALSA: snd-atmel-ac97c: cleanup registers when removing driver
ALSA: snd-atmel-ac97c: do a proper reset of the external codec
ALSA: snd-atmel-ac97c: enable interrupts to catch events for error reporting
ALSA: snd-atmel-ac97c: set correct size for buffer hardware parameter
ALSA: snd-atmel-ac97c: do not overwrite OCA and ICA when assigning channels
ALSA: snd-atmel-ac97c: remove dead break statements after return in switch case
ALSA: snd-atmel-ac97c: cleanup register definitions
...
Linus Torvalds [Tue, 7 Apr 2009 15:53:02 +0000 (08:53 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_mv: shorten register names
sata_mv: workaround errata SATA#13
sata_mv: cosmetic renames
sata_mv: workaround errata SATA#26
sata_mv: workaround errata PCI#7
sata_mv: replace 0x1f with ATA_PIO4 (v2)
sata_mv: fix irq mask races
sata_mv: revert SoC irq breakage
libata: ahci enclosure management bios workaround
ata: Add TRIM infrastructure
ata_piix: VGN-BX297XP wants the controller power up on suspend
libata: Remove some redundant casts from pata_octeon_cf.c
pata_artop: typo
Linus Torvalds [Tue, 7 Apr 2009 15:45:12 +0000 (08:45 -0700)]
Merge branch 'i2c-for-2630-v2' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'i2c-for-2630-v2' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c: imx: Make disable_delay a per-device variable
i2c: xtensa s6000 i2c driver
powerpc/85xx: i2c-mpc: use new I2C bindings for the Socates board
i2c: i2c-mpc: make I2C bus speed configurable
i2c: i2c-mpc: use dev based printout function
i2c: i2c-mpc: various coding style fixes
i2c: imx: Add missing request_mem_region in probe()
i2c: i2c-s3c2410: Initialise Samsung I2C controller early
i2c-s3c2410: Simplify bus frequency calculation
i2c-s3c2410: sda_delay should be in ns, not clock ticks
i2c: iMX/MXC support
Linus Torvalds [Tue, 7 Apr 2009 15:44:43 +0000 (08:44 -0700)]
Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: Add Asus ATK0110 support
hwmon: (lm95241) Convert to new-style i2c driver
Alan Cox [Tue, 7 Apr 2009 14:30:57 +0000 (15:30 +0100)]
parport: Use the PCI IRQ if offered
PCI parallel port devices can IRQ share so we should stop them hogging
the line and making a mess on modern PC systems. We know the sharing
side works as the PCMCIA driver has shared the parallel port IRQ for
some time.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Breno Leitao [Tue, 7 Apr 2009 15:53:48 +0000 (16:53 +0100)]
tty: jsm cleanups
Here are some cleanups, mainly removing unused variables and silly
declarations.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:53:11 +0000 (16:53 +0100)]
Adjust path to gpio headers
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:52:49 +0000 (16:52 +0100)]
KGDB_SERIAL_CONSOLE check for module
Depend on KGDB_SERIAL_CONSOLE being set to N rather than !Y, since it can
be built as a module.
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 7 Apr 2009 15:52:39 +0000 (16:52 +0100)]
Change KCONFIG name
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sonic Zhang [Tue, 7 Apr 2009 15:52:26 +0000 (16:52 +0100)]
tty: Blackin CTS/RTS
Both software emulated and hardware based CTS and RTS are enabled in
serial driver.
The CTS RTS PIN connection on BF548 UART port is defined as a modem
device not as a host device. In order to test it under Linux, please
nake a cross UART cable to exchange CTS and RTS signal.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sonic Zhang [Tue, 7 Apr 2009 15:51:15 +0000 (16:51 +0100)]
Change hardware flow control from poll to interrupt driven
Only the CTS bit is affected.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Pellegrin [Tue, 7 Apr 2009 15:48:51 +0000 (16:48 +0100)]
Add support for the MAX3100 SPI UART.
(akpm: queued pending confirmation of the new major number)
[randy.dunlap@oracle.com: select SERIAL_CORE]
Signed-off-by: Christian Pellegrin <chripell@fsfe.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 7 Apr 2009 15:48:35 +0000 (16:48 +0100)]
lanana: assign a device name and numbering for MAX3100
This is a low density serial port so needs a real major/minor
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 7 Apr 2009 15:48:27 +0000 (16:48 +0100)]
serqt: initial clean up pass for tty side
Avoid using port->tty where possible (makes refcount fixing easier
later).
Remove unused code (the ioctl path is not used if the device has
mget/mset functions)
Remove various un-needed typecasts and long names so it could read it to
do the changes.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Claudio Scordino [Tue, 7 Apr 2009 15:48:19 +0000 (16:48 +0100)]
tty: Use the generic RS485 ioctl on CRIS
Use the new general RS485 Linux data structure (introduced by Alan with
commit number
c26c56c0f40e200e61d1390629c806f6adaffbcc) in the Cris
architecture too (currently, Cris still uses the old private data
structure instead of the new one).
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Tested-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si>
Tested-by: Janez Cufer <janez.cufer@cetrtapot.si>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Tue, 7 Apr 2009 15:48:07 +0000 (16:48 +0100)]
tty: Correct inline types for tty_driver_kref_get()
tty_driver_kref_get() should be static inline and not extern inline
(the latter even changed it's semantics in gcc >= 4.3).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Miklos Szeredi [Mon, 6 Apr 2009 15:41:00 +0000 (17:41 +0200)]
splice: fix deadlock in splicing to file
There's a possible deadlock in generic_file_splice_write(),
splice_from_pipe() and ocfs2_file_splice_write():
- task A calls generic_file_splice_write()
- this calls inode_double_lock(), which locks i_mutex on both
pipe->inode and target inode
- ordering depends on inode pointers, can happen that pipe->inode is
locked first
- __splice_from_pipe() needs more data, calls pipe_wait()
- this releases lock on pipe->inode, goes to interruptible sleep
- task B calls generic_file_splice_write(), similarly to the first
- this locks pipe->inode, then tries to lock inode, but that is
already held by task A
- task A is interrupted, it tries to lock pipe->inode, but fails, as
it is already held by task B
- ABBA deadlock
Fix this by explicitly ordering locks: the outer lock must be on
target inode and the inner lock (which is later unlocked and relocked)
must be on pipe->inode. This is OK, pipe inodes and target inodes
form two nonoverlapping sets, generic_file_splice_write() and friends
are not called with a target which is a pipe.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:02:00 +0000 (19:02 -0700)]
nilfs2: support nanosecond timestamp
After a review of user's feedback for finding out other compatibility
issues, I found nilfs improperly initializes timestamps in inode;
CURRENT_TIME was used there instead of CURRENT_TIME_SEC even though nilfs
didn't have nanosecond timestamps on disk. A few users gave us the report
that the tar program sometimes failed to expand symbolic links on nilfs,
and it turned out to be the cause.
Instead of applying the above displacement, I've decided to support
nanosecond timestamps on this occation. Fortunetaly, a needless 64-bit
field was in the nilfs_inode struct, and I found it's available for this
purpose without impact for the users.
So, this will do the enhancement and resolve the tar problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:59 +0000 (19:01 -0700)]
nilfs2: introduce secondary super block
The former versions didn't have extra super blocks. This improves the
weak point by introducing another super block at unused region in tail of
the partition.
This doesn't break disk format compatibility; older versions just ingore
the secondary super block, and new versions just recover it if it doesn't
exist. The partition created by an old mkfs may not have unused region,
but in that case, the secondary super block will not be added.
This doesn't make more redundant copies of the super block; it is a future
work.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:58 +0000 (19:01 -0700)]
nilfs2: simplify handling of active state of segments
will reduce some lines of segment constructor. Previously, the state was
complexly controlled through a list of segments in order to keep
consistency in meta data of usage state of segments. Instead, this
presents ``calculated'' active flags to userland cleaner program and stop
maintaining its real flag on disk.
Only by this fake flag, the cleaner cannot exactly know if each segment is
reclaimable or not. However, the recent extension of nilfs_sustat ioctl
struct (nilfs2-extend-nilfs_sustat-ioctl-struct.patch) can prevent the
cleaner from reclaiming in-use segment wrongly.
So, now I can apply this for simplification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:57 +0000 (19:01 -0700)]
nilfs2: mark minor flag for checkpoint created by internal operation
Nilfs creates checkpoints even for garbage collection or metadata updates
such as checkpoint mode change. So, user often sees checkpoints created
only by such internal operations.
This is inconvenient in some situations. For example, application that
monitors checkpoints and changes them to snapshots, will fall into an
infinite loop because it cannot distinguish internally created
checkpoints.
This patch solves this sort of problem by adding a flag to checkpoint for
identification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:56 +0000 (19:01 -0700)]
nilfs2: clean up sketch file
The sketch file is a file to mark checkpoints with user data. It was
experimentally introduced in the original implementation, and now
obsolete. The file was handled differently with regular files; the file
size got truncated when a checkpoint was created.
This stops the special treatment and will treat it as a regular file.
Most users are not affected because mkfs.nilfs2 no longer makes this file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:55 +0000 (19:01 -0700)]
nilfs2: super block operations fix endian bug
This adds a missing endian conversion of checksum field in the super
block. This fixes compatibility issue on big endian machines which will
come to surface after supporting recovery of super block.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:55 +0000 (19:01 -0700)]
nilfs2: replace BUG_ON and BUG calls triggerable from ioctl
Pekka Enberg advised me:
> It would be nice if BUG(), BUG_ON(), and panic() calls would be
> converted to proper error handling using WARN_ON() calls. The BUG()
> call in nilfs_cpfile_delete_checkpoints(), for example, looks to be
> triggerable from user-space via the ioctl() system call.
This will follow the comment and keep them to a minimum.
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:54 +0000 (19:01 -0700)]
nilfs2: extend nilfs_sustat ioctl struct
This adds a new argument to the nilfs_sustat structure.
The extended field allows to delete volatile active state of segments,
which was needed to protect freshly-created segments from garbage
collection but has confused code dealing with segments. This
extension alleviates the mess and gives room for further
simplifications.
The volatile active flag is not persistent, so it's eliminable on this
occasion without affecting compatibility other than the ioctl change.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:53 +0000 (19:01 -0700)]
nilfs2: use unlocked_ioctl
Pekka Enberg suggested converting ->ioctl operations to use
->unlocked_ioctl to avoid BKL.
The conversion was verified to be safe, so I will take it on this
occasion.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:53 +0000 (19:01 -0700)]
nilfs2: remove compat ioctl code
This removes compat code from the nilfs ioctls and applies the same
function for both .ioctl and .compat_ioctl file operations.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:52 +0000 (19:01 -0700)]
nilfs2: use fixed sized types for ioctl structures
Nilfs ioctl had structures not having fixed sized types such as:
struct nilfs_argv {
void *v_base;
size_t v_nmembs;
size_t v_size;
int v_index;
int v_flags;
};
Further, some of them are wrongly aligned:
e.g.
struct nilfs_cpmode {
__u64 cm_cno;
int cm_mode;
};
The size of wrongly aligned structures varies depending on
architectures, and it breaks the identity of ioctl commands, which
leads to arch dependent errors.
Previously, these are compensated by using compat_ioctl.
This fixes these problems and allows removal of compat ioctl.
Since this will change sizes of those structures, binary compatibility
for the past utilities will once break; new utilities have to be used
instead. However, it would be helpful to avoid platform dependent
problems in the long term.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:51 +0000 (19:01 -0700)]
nilfs2: remove timedwait ioctl command
This removes NILFS_IOCTL_TIMEDWAIT command from ioctl interface along
with the related flags and wait queue.
The command is terrible because it just sleeps in the ioctl. I prefer
to avoid this by devising means of event polling in userland program.
By reconsidering the userland GC daemon, I found this is possible
without changing behaviour of the daemon and sacrificing efficiency.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:50 +0000 (19:01 -0700)]
nilfs2: fix buggy behavior seen in enumerating checkpoints
This will fix the weird behavior of lscp command in listing continuously
created checkpoints; the output of lscp is rewinded regularly for the
recent nilfs. As a result of debugging, a defect was found in
nilfs_cpfile_do_get_cpinfo() function.
Though the function can be repeatedly called to enumerate checkpoints and
it can skip invalid checkpoint entries, the index value was not carried
between successive calls.
The bug has long been present, and came to surface after applying a bugfix
nilfs2-fix-problems-of-memory-allocation-in-ioctl.patch, which increased
frequency of calling the function. The similar bugfix was already applied
for ``snapshots'' by
nilfs2-fix-gc-failure-on-volumes-keeping-numerous-snapshots.patch.
This fixes the problem by making the index argument bidirectional on the
function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Tue, 7 Apr 2009 02:01:49 +0000 (19:01 -0700)]
nilfs2: clean up indirect function calling conventions
This cleans up the strange indirect function calling convention used in
nilfs to follow the normal kernel coding style.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:48 +0000 (19:01 -0700)]
nilfs2: fix improper return values of nilfs_get_cpinfo ioctl
A few tool developers gave me requests for fixing inconvenient return
value of nilfs_get_cpinfo() ioctl; if the requested mode is NILFS_SNAPSHOT
and the specified start entry is not a snapshot, the ioctl unnaturally
returns one as the number of acquired snapshot item.
In addition, the ioctl function returns an ENOENT error for checkpoints
within blocks deleted by garbage collection.
These behaviors require corrections for programs which enumerate
snapshots. This resolves the inconvenience by changing the return values
to zero for the above cases.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:47 +0000 (19:01 -0700)]
nilfs2: fix gc failure on volumes keeping numerous snapshots
This resolves the following failure of nilfs2 cleaner daemon:
nilfs_cleanerd[20670]: cannot clean segments: No such file or directory
nilfs_cleanerd[20670]: shutdown
When creating thousands of snapshots, the cleaner daemon had rarely died
as above due to an error returned from the kernel code.
After applying the recent patch which fixed memory allocation problems in
ioctl (Message-Id: <
20081215.155840.
105124170.ryusuke@osrg.net>), the
problem gets more frequent.
It turned out to be a bug of nilfs_ioctl_wrap_copy function and one of its
callback routines to read out information of snapshots; if the
nilfs_ioctl_wrap_copy function divided a large read request into multiple
requests, the second and later requests have failed since a restart
position on snapshot meta data was not properly set forward.
It's a deficiency of the callback interface that cannot pass the restart
position among multiple requests. This patch fixes the issue by allowing
nilfs_ioctl_wrap_copy and snapshot read functions to exchange a position
argument.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:46 +0000 (19:01 -0700)]
nilfs2: add maintainer
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:45 +0000 (19:01 -0700)]
nilfs2: insert explanations in gcinode file
The file gcinode.c gives buffer cache functions for on-disk blocks
moved in garbage collection. Joern Engel has suggested inserting its
explanations in the source file (Message-ID:
<
20080917144146.GD8750@logfs.org> and
<
20080917224953.GB14644@logfs.org>).
This follows the comment.
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:45 +0000 (19:01 -0700)]
nilfs2: avoid double error caused by nilfs_transaction_end
Pekka Enberg pointed out that double error handlings found after
nilfs_transaction_end() can be avoided by separating abort operation:
OK, I don't understand this. The only way nilfs_transaction_end() can
fail is if we have NILFS_TI_SYNC set and we fail to construct the
segment. But why do we want to construct a segment if we don't commit?
I guess what I'm asking is why don't we have a separate
nilfs_transaction_abort() function that can't fail for the erroneous
case to avoid this double error value tracking thing?
This does the separation and renames nilfs_transaction_end() to
nilfs_transaction_commit() for clarification.
Since, some calls of these functions were used just for exclusion control
against the segment constructor, they are replaced with semaphore
operations.
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:44 +0000 (19:01 -0700)]
nilfs2: cleanup nilfs_clear_inode
This will remove the following unnecessary locks and cleanup code in
nilfs_clear_inode():
- unnecessary protection using nilfs_transaction_begin() and
nilfs_transaction_end().
- cleanup code of i_dirty list field which is never chained
when this function is called.
- spinlock used when releasing i_bh field.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:43 +0000 (19:01 -0700)]
nilfs2: fix problems of memory allocation in ioctl
This is another patch for fixing the following problems of a memory
copy function in nilfs2 ioctl:
(1) It tries to allocate 128KB size of memory even for small objects.
(2) Though the function repeatedly tries large memory allocations
while reducing the size, GFP_NOWAIT flag is not specified.
This increases the possibility of system memory shortage.
(3) During the retries of (2), verbose warnings are printed
because _GFP_NOWARN flag is not used for the kmalloc calls.
The first patch was still doing large allocations by kmalloc which are
repeatedly tried while reducing the size.
Andi Kleen told me that using copy_from_user for large memory is not
good from the viewpoint of preempt latency:
On Fri, 12 Dec 2008 21:24:11 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> > In the current interface, each data item is copied twice: one is to
> > the allocated memory from user space (via copy_from_user), and another
>
> For such large copies it is better to use multiple smaller (e.g. 4K)
> copy user, that gives better real time preempt latencies. Each cfu has a
> cond_resched(), but only one, not multiple times in the inner loop.
He also advised me that:
On Sun, 14 Dec 2008 16:13:27 +0100, Andi Kleen <andi@firstfloor.org> wrote:
> Better would be if you could go to PAGE_SIZE. order 0 allocations
> are typically the fastest / least likely to stall.
>
> Also in this case it's a good idea to use __get_free_pages()
> directly, kmalloc tends to be become less efficient at larger
> sizes.
For the function in question, the size of buffer memory can be reduced
since the buffer is repeatedly used for a number of small objects. On
the other hand, it may incur large preempt latencies for larger buffer
because a copy_from_user (and a copy_to_user) was applied only once
each cycle.
With that, this revision uses the order 0 allocations with
__get_free_pages() to fix the original problems.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:41 +0000 (19:01 -0700)]
nilfs2: update makefile and Kconfig
This adds a Makefile for the nilfs2 file system, and updates the
makefile and Kconfig file in the file system directory.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:41 +0000 (19:01 -0700)]
nilfs2: ioctl operations
This adds userland interface implemented with ioctl.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:40 +0000 (19:01 -0700)]
nilfs2: block cache for garbage collection
This adds the cache of on-disk blocks to be moved in garbage
collection. The disk blocks are held with dummy inodes (called
gcinodes), and this file provides lookup function of the dummy inodes,
and their buffer read function.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:39 +0000 (19:01 -0700)]
nilfs2: another dat for garbage collection
NILFS2 uses another DAT inode during garbage collection to ensure
atomicity and consistency of the DAT in the transient state. This
twin inode is called GCDAT.
This adds functions to initialize the GCDAT and to switch page caches
and B-tree node caches between these two inodes.
Signed-off-by: Seiji Kihara <kihara.seiji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:38 +0000 (19:01 -0700)]
nilfs2: recovery functions
This adds recovery function on mount.
Usually the recovery is achieved by just finding the latest super
root. When logs without checkpoints were appended for data sync
operations after the latest super root, the recovery function will
perform roll forwarding and reconstruct new log(s) with a super root.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:38 +0000 (19:01 -0700)]
nilfs2: fix missed-sync issue for do_sync_mapping_range()
Chris Mason pointed out that there is a missed sync issue in
nilfs_writepages():
On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
> It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
> do_sync_mapping_range().
where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
WB_SYNC_ALL by Nick's patch (commit:
ee53a891f47444c53318b98dac947ede963db400).
This fixes the problem by letting nilfs_writepages() write out the log of
file data within the range if sync_mode is WB_SYNC_ALL.
This involves removal of nilfs_file_aio_write() which was previously
needed to ensure O_SYNC sync writes.
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:37 +0000 (19:01 -0700)]
nilfs2: segment constructor
This adds the segment constructor (also called log writer).
The segment constructor collects dirty buffers for every dirty inode,
makes summaries of the buffers, assigns disk block addresses to the
buffers, and then submits BIOs for the buffers.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:36 +0000 (19:01 -0700)]
nilfs2: segment buffer
This adds the segment buffer which is used to constuct logs.
[akpm@linux-foundation.org: BIO_RW_SYNC got removed]
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:35 +0000 (19:01 -0700)]
nilfs2: super block operations
This adds super block operations for the nilfs2 file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:35 +0000 (19:01 -0700)]
nilfs2: operations for the_nilfs core object
This adds functions on the_nilfs object, which keeps shared resources and
states among a read/write mount and snapshots mounts going individually.
the_nilfs is allocated per block device; it is created when user first
mount a snapshot or a read/write mount on the device, then it is reused
for successive mounts. It will be freed when all mount instances on the
device are detached.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:34 +0000 (19:01 -0700)]
nilfs2: pathname operations
This adds pathname operations, most of which comes from the ext2 file
system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yoshiji Amagai [Tue, 7 Apr 2009 02:01:34 +0000 (19:01 -0700)]
nilfs2: directory entry operations
This adds directory handling functions, most of which comes from the ext2
file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:33 +0000 (19:01 -0700)]
nilfs2: file operations
This adds primitives for regular file handling.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:32 +0000 (19:01 -0700)]
nilfs2: inode operations
This adds inode level operations of the nilfs2 file system.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:32 +0000 (19:01 -0700)]
nilfs2: segment usage file
This adds a meta data file which stores the allocation state of segments.
[konishi.ryusuke@lab.ntt.co.jp: fix wrong counting of checkpoints and dirty segments]
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:31 +0000 (19:01 -0700)]
nilfs2: checkpoint file
This adds a meta data file which holds checkpoint entries in its data
blocks.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:30 +0000 (19:01 -0700)]
nilfs2: inode map file
This adds a meta data file which stores on-disk inodes in its data blocks.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Koji Sato [Tue, 7 Apr 2009 02:01:30 +0000 (19:01 -0700)]
nilfs2: disk address translator
This adds the disk address translation file (DAT) whose primary function
is to convert virtual disk block numbers to actual disk block numbers.
The virtual block numbers of NILFS are associated with checkpoint
generation numbers, and this file also provides functions to manage the
lifetime information of each virtual block number.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 7 Apr 2009 02:01:29 +0000 (19:01 -0700)]
nilfs2: persistent object allocator
This adds common functions to allocate or deallocate entries with bitmaps
on a meta data file. This feature is used by the DAT and ifile.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Yoshiji Amagai <amagai.yoshiji@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>