GitHub/MotorolaMobilityLLC/kernel-slsi.git
15 years agoLinuxPPS: core support
Rodolfo Giometti [Wed, 17 Jun 2009 23:28:37 +0000 (16:28 -0700)]
LinuxPPS: core support

This patch adds the kernel side of the PPS support currently named
"LinuxPPS".

PPS means "pulse per second" and a PPS source is just a device which
provides a high precision signal each second so that an application can
use it to adjust system clock time.

A common use is to combine NTPD as the userland program with a GPS
receiver as the PPS source to obtain wall-clock time with sub-millisecond
synchronisation to UTC.

To achieve this, userland programs should use the PPS API specification
(RFC 2783 - Pulse-Per-Second API for UNIX-like Operating Systems,
Version 1.0), which this patch implements in part.  It provides a set of
char devices, one per PPS source, which can be used to get the time
signal.  The RFC's functions can be implemented by accessing these char
devices.
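
As a rough illustration of what an application does with a PPS timestamp,
the user-space sketch below computes how far the system clock was from the
nearest whole second when a pulse was captured.  pps_offset_ns() is a
hypothetical helper invented for this note; it is not part of RFC 2783 or
of this patch, and it assumes the clock is already within half a second of
the true time.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper (not an RFC 2783 function): given the nanosecond
 * part of the system time captured at a PPS edge, return the signed
 * distance in ns to the nearest whole second.  Positive means the system
 * clock reads later than the pulse, negative means it reads earlier.
 */
int64_t pps_offset_ns(long captured_nsec)
{
	if (captured_nsec >= NSEC_PER_SEC / 2)
		return (int64_t)captured_nsec - NSEC_PER_SEC;
	return captured_nsec;
}
```

A time daemon would feed such offsets into its clock discipline loop rather
than stepping the clock directly.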

Signed-off-by: Rodolfo Giometti <giometti@linux.it>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: copyright fixes
Jack Steiner [Wed, 17 Jun 2009 23:28:36 +0000 (16:28 -0700)]
gru: copyright fixes

Fix the copyright statements in a couple of GRU files.  No functional
changes are being made.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: remove references to the obsolete global status handle
Jack Steiner [Wed, 17 Jun 2009 23:28:35 +0000 (16:28 -0700)]
gru: remove references to the obsolete global status handle

Delete references to the SGI GRU GSH hardware resources.  These GRU
resources have been deleted from the hardware.  (These resources have
never been used, anyway).

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: fixes to grudump utility
Jack Steiner [Wed, 17 Jun 2009 23:28:34 +0000 (16:28 -0700)]
gru: fixes to grudump utility

Minor fixes to the SGI GRU grudump facility:
- fix address where user data is written
- add gru number to data passed to user
- indicate if context is locked

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: fix potential use-after-free when purging GRU tlbs
Jack Steiner [Wed, 17 Jun 2009 23:28:33 +0000 (16:28 -0700)]
gru: fix potential use-after-free when purging GRU tlbs

Fix potential SGI GRU bug that could cause a use-after-free.  If one
thread in a task is flushing the GRU and another thread destroys the GRU
context, there is the potential to access a table after it has been freed.

Copy the gms pointer to a local variable before unlocking the gts table.
Note that no refcnt is needed for the gms - the reference is held
indirectly by the task's mm_struct.
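
The pattern can be sketched in plain C as follows.  All names here
(gts_like, gms_like, flush_sketch) are hypothetical stand-ins rather than
the driver's real types, and the "lock" is a plain flag so that the example
stays self-contained.

```c
#include <stddef.h>

struct gms_like { int flushes; };	/* stand-in for the shared gms */
struct gts_like {
	int locked;			/* toy lock, not a real mutex */
	struct gms_like *gms;
};

void gts_lock(struct gts_like *gts)   { gts->locked = 1; }
void gts_unlock(struct gts_like *gts) { gts->locked = 0; }

void flush_sketch(struct gts_like *gts)
{
	struct gms_like *gms;

	gts_lock(gts);
	gms = gts->gms;		/* snapshot the pointer while the lock is held */
	gts_unlock(gts);

	/*
	 * From here on gts must not be dereferenced: another thread may
	 * already have destroyed it.  Only the local copy is used.
	 */
	if (gms)
		gms->flushes++;
}
```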

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: add user request to specify gru slice
Jack Steiner [Wed, 17 Jun 2009 23:28:33 +0000 (16:28 -0700)]
gru: add user request to specify gru slice

Add a user request to specify the gru instruction slice parameter for user
contexts.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: generic infrastructure for context options
Jack Steiner [Wed, 17 Jun 2009 23:28:32 +0000 (16:28 -0700)]
gru: generic infrastructure for context options

Change the user GRU request for specifying the "task_slice" option to use
a generic infrastructure that can be expanded in the future to include
additional context options.  No new capabilities are added with this
patch.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: cleanup gru inline functions
Jack Steiner [Wed, 17 Jun 2009 23:28:31 +0000 (16:28 -0700)]
gru: cleanup gru inline functions

Cleanup of GRU inline functions to eliminate unnecessary inline code.
Update function descriptions.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: delete user request for fetching chiplet status
Jack Steiner [Wed, 17 Jun 2009 23:28:31 +0000 (16:28 -0700)]
gru: delete user request for fetching chiplet status

Delete the user request for fetching the status of a GRU chiplet.  This
request has been made obsolete by other changes.  Note: this is not a
change to a user API - there are no compatibility issues with this change.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: collect per-context user statistics
Jack Steiner [Wed, 17 Jun 2009 23:28:30 +0000 (16:28 -0700)]
gru: collect per-context user statistics

Collect GRU statistics for each user GRU context.  Statistics are kept for
TLB misses & content resource contention.  Add user request for retrieving
the statistics.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: fix automatic retry of gru instruction failures
Jack Steiner [Wed, 17 Jun 2009 23:28:29 +0000 (16:28 -0700)]
gru: fix automatic retry of gru instruction failures

Fix bug in automatic retry of GRU instruction failures.  CBR substatus
(message queue failure) was being checked incorrectly.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: add user request to explicitly unload a gru context
Jack Steiner [Wed, 17 Jun 2009 23:28:28 +0000 (16:28 -0700)]
gru: add user request to explicitly unload a gru context

Add user function to explicitly unload GRU kernel contexts from the GRU.
Only contexts that are not in-use will be unloaded.

This function is primarily for testing.  It is not expected that this will
be used in normal production systems.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: fix cache coherency issues with instruction retry
Jack Steiner [Wed, 17 Jun 2009 23:28:28 +0000 (16:28 -0700)]
gru: fix cache coherency issues with instruction retry

Fix two problems related to GRU instruction failures.  Cache coherency is
not maintained for CBEs except when loading or unloading contexts.  When
reading a CBE to extract error information, the CBE must first be flushed
from the cache.

The function that reads kernel CBEs was reading the wrong CBE.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: update to rev 0.9 of gru spec
Jack Steiner [Wed, 17 Jun 2009 23:28:27 +0000 (16:28 -0700)]
gru: update to rev 0.9 of gru spec

Update GRU driver to the latest version of the GRU spec. This consists
of minor updates:
- changes & additions to error status bits
- new restriction on handling of TLB misses while in FMM mode
- new field (not used by software) in TFH

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: update gru kernel self tests
Jack Steiner [Wed, 17 Jun 2009 23:28:26 +0000 (16:28 -0700)]
gru: update gru kernel self tests

Change the kernel self tests that can be optionally executed on GRU
initialization.  This is primarily for testing.

Eliminate the BUG statements on failure and return bad status.  Add ioctl
interface to execute the tests on demand.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: support for asynchronous gru instructions
Jack Steiner [Wed, 17 Jun 2009 23:28:25 +0000 (16:28 -0700)]
gru: support for asynchronous gru instructions

Add support for asynchronous GRU instructions.  Currently, asynchronous
instructions are supported only for GRU instructions issued by the kernel.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: support instruction completion interrupts
Jack Steiner [Wed, 17 Jun 2009 23:28:25 +0000 (16:28 -0700)]
gru: support instruction completion interrupts

Add support for interrupts generated by GRU instruction completion.
Previously, the only interrupts were for TLB misses.  The hardware also
supports interrupts on instruction completion.  This will be supported for
instructions issued by the kernel.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: check context state on reload
Jack Steiner [Wed, 17 Jun 2009 23:28:24 +0000 (16:28 -0700)]
gru: check context state on reload

Check whether the gru state being loaded into a gru is from a new context
or a previously unloaded context.  If new, simply zero out the hardware
context; if unloaded and valid, reload the old state.

This change is primarily for reloading kernel contexts where the previous
state is not required to be saved.
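
A minimal sketch of the decision described above, with hypothetical names
and a four-word array standing in for the hardware context:

```c
#include <string.h>

#define CTX_WORDS 4

struct saved_ctx {
	int valid;	/* nonzero if 'state' holds a previously unloaded context */
	unsigned long state[CTX_WORDS];
};

/* Load 'ctx' into the (simulated) hardware context 'hw'. */
void load_ctx_sketch(unsigned long hw[CTX_WORDS], const struct saved_ctx *ctx)
{
	if (ctx->valid)
		memcpy(hw, ctx->state, sizeof(ctx->state));	/* reload old state */
	else
		memset(hw, 0, sizeof(ctx->state));		/* new context: zero it */
}
```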

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: fix handling of mesq failures
Jack Steiner [Wed, 17 Jun 2009 23:28:23 +0000 (16:28 -0700)]
gru: fix handling of mesq failures

Fix endcase in handling GRU message queue failures due to NACKs of PUT
requests.  Must ensure that the "present" bits are cleared before
resending the message.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: support contexts with zero dsrs or cbrs
Jack Steiner [Wed, 17 Jun 2009 23:28:23 +0000 (16:28 -0700)]
gru: support contexts with zero dsrs or cbrs

Support allocation of GRU contexts that contain zero DSR or CBR resources.
Some instructions do not require DSR resources.  Contexts without CBR
resources are useful for diagnostics.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: change resource assignment for kernel threads
Jack Steiner [Wed, 17 Jun 2009 23:28:22 +0000 (16:28 -0700)]
gru: change resource assignment for kernel threads

Change the way GRU resources are assigned for kernel threads.  GRU
contexts for kernel threads are now allocated on demand and can be stolen
by user processes when idle.  This allows MPI jobs to use ALL of the GRU
resources when the kernel is not using them.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: support cch_allocate for kernel threads
Jack Steiner [Wed, 17 Jun 2009 23:28:21 +0000 (16:28 -0700)]
gru: support cch_allocate for kernel threads

Change the interface to cch_allocate so that it can be used to allocate
GRU contexts for kernel threads.  Kernel threads use the GRU in unmapped
mode and do not require ASIDs for the GRU TLB.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: change context load and unload
Jack Steiner [Wed, 17 Jun 2009 23:28:20 +0000 (16:28 -0700)]
gru: change context load and unload

Remove "static" from the functions for loading/unloading GRU contexts.
These functions will be called from other GRU files.  Fix bug in unlocking
gru context.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: dynamic allocation of kernel contexts
Jack Steiner [Wed, 17 Jun 2009 23:28:20 +0000 (16:28 -0700)]
gru: dynamic allocation of kernel contexts

Change the interface to gru_alloc_gts() so that it can be used to allocate
GRU contexts for kernel threads.  Kernel threads do not have vdata
structures for the GRU contexts.  The GRU resource counts are now passed
explicitly instead of inside the vdata structure.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: dump chiplet state
Jack Steiner [Wed, 17 Jun 2009 23:28:19 +0000 (16:28 -0700)]
gru: dump chiplet state

Add support for dumping the state of an entire GRU chiplet.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogru: bug fixes for GRU exception handling
Jack Steiner [Wed, 17 Jun 2009 23:28:19 +0000 (16:28 -0700)]
gru: bug fixes for GRU exception handling

Bug fixes for GRU exception handling.  Additional fields from the CBR must
be returned to the user to allow the user to correctly diagnose GRU
exceptions.

Handle endcase in TFH TLB miss handling.  Verify that TFH actually
indicates a pending exception.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokexec: sysrq: simplify sysrq-c handler
Neil Horman [Wed, 17 Jun 2009 23:28:17 +0000 (16:28 -0700)]
kexec: sysrq: simplify sysrq-c handler

Currently the sysrq-c handler is a bit over-engineered.  Its behavior
depends on a few compile-time and run-time factors, which is really
unnecessary.

If CONFIG_KEXEC is not configured, sysrq-c crashes the system with a NULL
pointer dereference.  If CONFIG_KEXEC is configured, it calls crash_kexec
directly, which implies that the kexec kernel will either be booted (if
it has been previously loaded), or it will simply do nothing (if no kexec
kernel has been loaded).

It would be much easier to just simplify the whole thing to dereference a
NULL pointer all the time regardless of configuration.  That way, it will
always try to crash the system, and if a kexec kernel has been loaded into
reserved space, it will still boot from the page fault trap handler
(assuming panic_on_oops is set appropriately).

[akpm@linux-foundation.org: build fix]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Brayan Arraes <brayan@yack.com.br>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agow1-gpio: add external pull-up enable callback
Daniel Mack [Wed, 17 Jun 2009 23:28:15 +0000 (16:28 -0700)]
w1-gpio: add external pull-up enable callback

On embedded devices, sleep mode conditions can be tricky to handle,
especially when processors tend to pull-down the w1 bus during sleep.  Bus
slaves (such as the ds2760) may interpret this as a reason for power-down
conditions and entirely switch off the device.

This patch adds a callback function pointer to let users switch on and off
the external pull-up resistor.  This lets the outside world know whether
the processor is currently actively driving the bus or not.

When this callback is not provided, the code behaviour won't change.
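
The optional-callback idea can be sketched like this.  The structure and
function names below are illustrative assumptions, not the driver's actual
platform data, and the message does not say at which points the driver
invokes the hook.

```c
#include <stddef.h>

struct w1_pdata_sketch {
	/* Optional: board code may leave this NULL. */
	void (*enable_external_pullup)(int enable);
};

void w1_set_pullup_sketch(const struct w1_pdata_sketch *pdata, int enable)
{
	if (pdata->enable_external_pullup)
		pdata->enable_external_pullup(enable);
	/* No callback: nothing extra happens, so behaviour is unchanged. */
}

/* Example board hook that only records the requested state. */
int board_pullup_state = -1;

void board_pullup(int enable)
{
	board_pullup_state = enable;
}
```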

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Acked-by: Ville Syrjala <syrjala@sci.fi>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodma-mapping: ia64: add CONFIG_DMA_API_DEBUG support
FUJITA Tomonori [Wed, 17 Jun 2009 23:28:14 +0000 (16:28 -0700)]
dma-mapping: ia64: add CONFIG_DMA_API_DEBUG support

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodma-mapping: ia64: use asm-generic/dma-mapping-common.h
FUJITA Tomonori [Wed, 17 Jun 2009 23:28:13 +0000 (16:28 -0700)]
dma-mapping: ia64: use asm-generic/dma-mapping-common.h

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodma-mapping: x86: use asm-generic/dma-mapping-common.h
FUJITA Tomonori [Wed, 17 Jun 2009 23:28:12 +0000 (16:28 -0700)]
dma-mapping: x86: use asm-generic/dma-mapping-common.h

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodma-mapping: add asm-generic/dma-mapping-common.h
FUJITA Tomonori [Wed, 17 Jun 2009 23:28:10 +0000 (16:28 -0700)]
dma-mapping: add asm-generic/dma-mapping-common.h

We unified x86 and IA64's handling of multiple dma mapping operations
(struct dma_map_ops in linux/dma-mapping.h) so we can remove duplication
in their arch/include/asm/dma-mapping.h.

This patchset adds include/asm-generic/dma-mapping-common.h that provides
some generic dma mapping function definitions for the users of struct
dma_map_ops.  This enables us to remove about 100 lines.  This also
enables us to easily add CONFIG_DMA_API_DEBUG support, which only x86
supports for now.  The 4th patch adds CONFIG_DMA_API_DEBUG support to IA64
by adding only 8 lines.

This patch:

This header file provides some mapping function definitions that the users
of struct dma_map_ops can use.
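
To illustrate why a shared header removes duplication, here is a toy
user-space analogue: one generic wrapper dispatches through an operations
table, so each architecture only supplies the table.  Every name ending in
_sketch is hypothetical, and the two-member table is far smaller than the
real struct dma_map_ops.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_sketch_t;

struct dma_ops_sketch {
	dma_addr_sketch_t (*map_single)(void *cpu_addr, size_t size);
	void (*unmap_single)(dma_addr_sketch_t addr, size_t size);
};

/* One "architecture" implementation: identity mapping. */
dma_addr_sketch_t direct_map(void *cpu_addr, size_t size)
{
	(void)size;
	return (dma_addr_sketch_t)(uintptr_t)cpu_addr;
}

void direct_unmap(dma_addr_sketch_t addr, size_t size)
{
	(void)addr;
	(void)size;
}

const struct dma_ops_sketch direct_ops = { direct_map, direct_unmap };

/* Generic wrapper: written once, shared by every user of the ops table. */
static inline dma_addr_sketch_t
dma_map_single_sketch(const struct dma_ops_sketch *ops, void *cpu_addr,
		      size_t size)
{
	/* A debugging hook could be added here once for all users. */
	return ops->map_single(cpu_addr, size);
}
```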

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogcov: enable GCOV_PROFILE_ALL for x86_64
Peter Oberparleiter [Wed, 17 Jun 2009 23:28:09 +0000 (16:28 -0700)]
gcov: enable GCOV_PROFILE_ALL for x86_64

Enable gcov profiling of the entire kernel on x86_64. Required changes
include disabling profiling for:

* arch/kernel/acpi/realmode and arch/kernel/boot/compressed:
  not linked to main kernel
* arch/vdso, arch/kernel/vsyscall_64 and arch/kernel/hpet:
  profiling causes segfaults during boot (incompatible context)

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Li Wei <W.Li@Sun.COM>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agogcov: add gcov profiling infrastructure
Peter Oberparleiter [Wed, 17 Jun 2009 23:28:08 +0000 (16:28 -0700)]
gcov: add gcov profiling infrastructure

Enable the use of GCC's coverage testing tool gcov [1] with the Linux
kernel.  gcov may be useful for:

 * debugging (has this code been reached at all?)
 * test improvement (how do I change my test to cover these lines?)
 * minimizing kernel configurations (do I need this option if the
   associated code is never run?)

The profiling patch incorporates the following changes:

 * change kbuild to include profiling flags
 * provide functions needed by profiling code
 * present profiling data as files in debugfs

Note that on some architectures, enabling gcc's profiling option
"-fprofile-arcs" for the entire kernel may trigger compile/link/
run-time problems, some of which are caused by toolchain bugs and
others which require adjustment of architecture code.

For this reason profiling the entire kernel is initially restricted
to those architectures for which it is known to work without changes.
This restriction can be lifted once an architecture has been tested
and found compatible with gcc's profiling. Profiling of single files
or directories is still available on all platforms (see config help
text).

[1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Li Wei <W.Li@Sun.COM>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoseq_file: add function to write binary data
Peter Oberparleiter [Wed, 17 Jun 2009 23:28:05 +0000 (16:28 -0700)]
seq_file: add function to write binary data

seq_write() can be used to construct seq_files containing arbitrary data.
Required by the gcov-profiling interface to synthesize binary profiling
data files.
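
The idea is an append of raw bytes rather than a formatted string.  The
sketch below is a user-space analogue with hypothetical names; how the real
seq_write() signals an overflow is not described in this message, so the
return convention shown is an assumption.

```c
#include <stddef.h>
#include <string.h>

struct seq_buf_sketch {
	char *buf;
	size_t size;	/* capacity of buf */
	size_t count;	/* bytes used so far, always <= size */
};

/* Append len raw bytes; return 0 on success, -1 if they do not fit. */
int seq_write_sketch(struct seq_buf_sketch *s, const void *data, size_t len)
{
	if (len <= s->size - s->count) {
		memcpy(s->buf + s->count, data, len);
		s->count += len;
		return 0;
	}
	s->count = s->size;	/* mark the buffer as full */
	return -1;
}
```

Unlike a printf-style helper, nothing here stops at a NUL byte, which is
what makes it usable for binary data.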

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Li Wei <W.Li@Sun.COM>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokernel: constructor support
Peter Oberparleiter [Wed, 17 Jun 2009 23:28:03 +0000 (16:28 -0700)]
kernel: constructor support

Call constructors (gcc-generated initcall-like functions) during kernel
start and module load.  Constructors are used, e.g., for gcov data
initialization.

Disable constructor support for usermode Linux to prevent conflicts with
host glibc.
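
For readers unfamiliar with constructors, the user-space sketch below shows
the compiler feature itself.  In a normal program the C runtime runs such
functions before main(); the kernel has no such runtime, which is why it
has to call them itself.  The function and variable names are made up for
this example.

```c
/* Set by the constructor below before main() starts. */
int coverage_ready;

/* GCC/Clang extension: this function is run automatically at startup. */
__attribute__((constructor))
void init_coverage_sketch(void)
{
	coverage_ready = 1;	/* stands in for registering profiling data */
}
```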

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Li Wei <W.Li@Sun.COM>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: Kconfig: fix the meaning of EDAC abbreviation
GeunSik Lim [Wed, 17 Jun 2009 23:28:02 +0000 (16:28 -0700)]
edac: Kconfig: fix the meaning of EDAC abbreviation

Fix the expansion of the EDAC abbreviation (Error Detection And Correction).

[akpm@linux-foundation.org: add missing space]
Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: add missing __devexit_p()
Mike Frysinger [Wed, 17 Jun 2009 23:28:01 +0000 (16:28 -0700)]
edac: add missing __devexit_p()

The remove function uses __devexit, so the .remove assignment needs
__devexit_p() to fix a build error with hotplug disabled.
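
The mechanism can be mimicked in user space as below.  HOTPLUG_SKETCH and
devexit_p_sketch() are hypothetical names standing in for the kernel's
configuration option and macro; the point is only that the pointer becomes
NULL when the function it would name has been compiled out.

```c
#include <stddef.h>

/* Uncomment to simulate a build with hotplug support enabled. */
/* #define HOTPLUG_SKETCH 1 */

#ifdef HOTPLUG_SKETCH
#define devexit_p_sketch(fn) (fn)
#else
#define devexit_p_sketch(fn) NULL	/* fn does not exist: never name it */
#endif

struct driver_sketch {
	int (*remove)(void);
};

#ifdef HOTPLUG_SKETCH
int my_remove(void)
{
	return 0;
}
#endif

/* Without the macro, this initializer would reference a missing symbol. */
struct driver_sketch my_driver = {
	.remove = devexit_p_sketch(my_remove),
};
```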

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: cpc925 MC platform device setup
Harry Ciao [Wed, 17 Jun 2009 23:28:00 +0000 (16:28 -0700)]
edac: cpc925 MC platform device setup

Fix up the number of cells for the values of the CPC925 Memory Controller,
and set up the related platform device during system boot, against which
the CPC925 Memory Controller EDAC driver will be matched.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: add edac_device_alloc_index()
Harry Ciao [Wed, 17 Jun 2009 23:27:59 +0000 (16:27 -0700)]
edac: add edac_device_alloc_index()

Add edac_device_alloc_index(), because on the MAPLE platform several EDAC
driver modules may use the edac_device_ctl_info structure at the same
time. The index allocation for these structures should be taken care of
by the EDAC core.
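
One simple way to hand out unique indexes from a central place is an atomic
counter, sketched below with C11 atomics.  This is an assumption about the
general approach, not a copy of the EDAC core's code, and the names are
hypothetical.

```c
#include <stdatomic.h>

/* Shared counter owned by the "core"; drivers never pick indexes themselves. */
static atomic_int next_index_sketch;

int alloc_index_sketch(void)
{
	/* atomic_fetch_add returns the previous value, so indexes start at 0. */
	return atomic_fetch_add(&next_index_sketch, 1);
}
```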

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoedac: add CPC925 Memory Controller driver
Harry Ciao [Wed, 17 Jun 2009 23:27:58 +0000 (16:27 -0700)]
edac: add CPC925 Memory Controller driver

Introduce IBM CPC925 EDAC driver, which makes use of ECC, CPU and
HyperTransport Link error detections and corrections on the IBM
CPC925 Bridge and Memory Controller.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofutex: documentation: fix inconsistent description of futex list_op_pending
Matt Helsley [Wed, 17 Jun 2009 23:27:58 +0000 (16:27 -0700)]
futex: documentation: fix inconsistent description of futex list_op_pending

Strictly speaking list_op_pending points to the 'lock entry', not the
'lock word' (which is actually at 'offset' from 'lock entry').  We can
infer this based on reading the code in kernel/futex.c:

    struct robust_list __user *entry, *next_entry, *pending;
...
            if (fetch_robust_entry(&pending, &head->list_op_pending, &pip))
                    return;
            ...
            if (pending)
                    handle_futex_death((void __user *)pending + futex_offset,
                                       curr, pip);

Which is also consistent with the rest of the docs on robust futex lists.
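
The relationship between the two addresses can be shown with a small
user-space sketch.  The structure layout and names are hypothetical; only
the "entry address plus offset" arithmetic mirrors the kernel snippet above.

```c
#include <stddef.h>
#include <stdint.h>

struct list_entry_sketch {
	struct list_entry_sketch *next;
};

/* A user-space lock object: list entry first, lock word at some offset. */
struct lock_sketch {
	struct list_entry_sketch entry;	/* the 'lock entry' a pending pointer names */
	uint32_t futex_word;		/* the 'lock word' that must be inspected */
};

/* Same arithmetic as "(void __user *)pending + futex_offset". */
uint32_t *lock_word_of(struct list_entry_sketch *pending, long futex_offset)
{
	return (uint32_t *)((char *)pending + futex_offset);
}
```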

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linuxtronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipcns: move free_ipcs() proto
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:57 +0000 (16:27 -0700)]
ipcns: move free_ipcs() proto

The function is really private to ipc/, and moving its prototype avoids a
forward declaration of struct kern_ipc_perm.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipcns: make free_ipc_ns() static
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:56 +0000 (16:27 -0700)]
ipcns: make free_ipc_ns() static

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agonsproxy: extract create_nsproxy()
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:56 +0000 (16:27 -0700)]
nsproxy: extract create_nsproxy()

clone_nsproxy() does useless copying of the old nsproxy -- every pointer
will be rewritten to a new ns or to the old ns.  Remove the copying and
rename clone_nsproxy() to create_nsproxy(), which will be used by C/R code
to create a fresh nsproxy on restart.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipcns: extract create_ipc_ns()
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:55 +0000 (16:27 -0700)]
ipcns: extract create_ipc_ns()

clone_ipc_ns() is misnamed: it doesn't clone anything and doesn't use the
passed parameter.  Rename it.

create_ipc_ns() will be used by C/R to create fresh ipcns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipcns: remove useless get/put while CLONE_NEWIPC
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:54 +0000 (16:27 -0700)]
ipcns: remove useless get/put while CLONE_NEWIPC

copy_ipcs() doesn't actually copy anything. If a new ipcns is created, it
is created from scratch, so get/put on the old ipcns isn't needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoutsns: extract create_uts_ns()
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:54 +0000 (16:27 -0700)]
utsns: extract create_uts_ns()

create_uts_ns() will be used by C/R to create fresh uts_ns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopidns: rewrite copy_pid_ns()
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:53 +0000 (16:27 -0700)]
pidns: rewrite copy_pid_ns()

copy_pid_ns() is a perfect example of a case where unwinding leads to more
code and makes it less clear.  Watch the diffstat.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopidns: make create_pid_namespace() accept parent pidns
Alexey Dobriyan [Wed, 17 Jun 2009 23:27:52 +0000 (16:27 -0700)]
pidns: make create_pid_namespace() accept parent pidns

create_pid_namespace() creates everything, but caller has to assign parent
pidns by hand, which is unnatural.  At the moment of call new ->level has
to be taken from somewhere and parent pidns is already available.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agopids: clean up find_task_by_pid variants
Christoph Hellwig [Wed, 17 Jun 2009 23:27:51 +0000 (16:27 -0700)]
pids: clean up find_task_by_pid variants

find_task_by_pid_type_ns is only used to implement find_task_by_vpid and
find_task_by_pid_ns, but both of them pass PIDTYPE_PID as first argument.
So just fold find_task_by_pid_type_ns into find_task_by_pid_ns and use
find_task_by_pid_ns to implement find_task_by_vpid.

While we're at it also remove the exports for find_task_by_pid_ns and
find_task_by_vpid - we don't have any modular callers left as the only
modular caller of the old pre-pid-namespace find_task_by_pid (gfs2) was
switched to pid_task which operates on a struct pid pointer instead of a
pid_t.  Given the confusion about pid_t values vs namespace that's
generally the better option anyway and I think we're better off restricting
modules to do it that way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosysctl.c: remove unused variable
Sukanto Ghosh [Wed, 17 Jun 2009 23:27:50 +0000 (16:27 -0700)]
sysctl.c: remove unused variable

Remove the unused variable 'val' from __do_proc_dointvec()

The integer has been declared and used as 'val = -val' and there is no
reference to it anywhere.

Signed-off-by: Sukanto Ghosh <sukanto.cse.iitb@gmail.com>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Sukanto Ghosh <sukanto.cse.iitb@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoppdev: reduce kernel log spam
Michael Buesch [Wed, 17 Jun 2009 23:27:49 +0000 (16:27 -0700)]
ppdev: reduce kernel log spam

One of my programs frequently grabs the parport, does something with it
and then drops it again. This results in spamming of the kernel log with

"... registered pardevice"
"... unregistered pardevice"

These messages are completely useless, except for debugging ppdev,
probably.  So put them under DEBUG (or dynamic debug).

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoChar: isicom: fix build warning
Jiri Slaby [Wed, 17 Jun 2009 23:27:48 +0000 (16:27 -0700)]
Char: isicom: fix build warning

Fix this:
isicom.c: In function `isicom_probe':
isicom.c:1587: warning: `signature' may be used uninitialized in this function
by uninitialized_var(), because if the signature is not initialized in
reset_card(), we won't use it.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodrivers/char/mem.c: memory_open() cleanup: lookup minor device number from devlist
Adriano dos Santos Fernandes [Wed, 17 Jun 2009 23:27:48 +0000 (16:27 -0700)]
drivers/char/mem.c: memory_open() cleanup: lookup minor device number from devlist

memory_open() ignores devlist and does a switch for each item, duplicating
code and conditional definitions.

Clean it up by adding backing_dev_info to devlist and using it to look up
the minor device.
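
As a rough user-space illustration of the table-driven idea (not the
actual drivers/char/mem.c code), a per-minor switch can be replaced by a
scan over one table.  The struct, table and function names below are
hypothetical, and the real devlist entries carry more fields than shown.

```c
#include <stddef.h>

/* Hypothetical stand-in for one devlist entry; the real table also
 * carries file_operations and backing_dev_info pointers. */
struct memdev_sketch {
	unsigned int minor;
	const char *name;
};

/* A few well-known memory device minors, for illustration only. */
static const struct memdev_sketch devlist_sketch[] = {
	{ 1, "mem"  },
	{ 3, "null" },
	{ 5, "zero" },
	{ 7, "full" },
};

/* Table-driven replacement for a per-minor switch: scan the list and
 * return the matching entry, or NULL for an unknown minor. */
static const struct memdev_sketch *memdev_lookup(unsigned int minor)
{
	size_t i;

	for (i = 0; i < sizeof(devlist_sketch) / sizeof(devlist_sketch[0]); i++)
		if (devlist_sketch[i].minor == minor)
			return &devlist_sketch[i];
	return NULL;
}
```

With this shape, adding a device means adding one table row rather than
another switch case plus its conditional definitions.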

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Adriano dos Santos Fernandes <adrianosf@uol.com.br>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoipc: use __ARCH_WANT_IPC_PARSE_VERSION in ipc/util.h
Arnd Bergmann [Wed, 17 Jun 2009 23:27:46 +0000 (16:27 -0700)]
ipc: use __ARCH_WANT_IPC_PARSE_VERSION in ipc/util.h

The definition of ipc_parse_version depends on
__ARCH_WANT_IPC_PARSE_VERSION, but the header file declares it
conditionally based on the architecture.

Use the macro consistently to make it easier to add new architectures.
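
A minimal sketch of the pattern, assuming a single opt-in macro guards
both the declaration and the definition.  All SKETCH_/sketch_ names are
hypothetical stand-ins and this is not the contents of ipc/util.h.

```c
/* Hypothetical stand-ins for the two IPC structure versions. */
#define SKETCH_IPC_OLD 0
#define SKETCH_IPC_64  0x0100

/* An architecture opts in by defining this in its own headers. */
#define SKETCH_ARCH_WANT_IPC_PARSE_VERSION

#ifdef SKETCH_ARCH_WANT_IPC_PARSE_VERSION
/* Strip the version flag from *cmd and report which layout was asked for. */
static int sketch_ipc_parse_version(int *cmd)
{
	if (*cmd & SKETCH_IPC_64) {
		*cmd ^= SKETCH_IPC_64;
		return SKETCH_IPC_64;
	}
	return SKETCH_IPC_OLD;
}
#else
/* Architectures with a single layout never need to parse anything. */
#define sketch_ipc_parse_version(cmd) SKETCH_IPC_64
#endif
```

Because the same macro selects both branches, a new architecture only has
to define (or not define) that one macro; no per-architecture list in the
header needs to be extended.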

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokthreads: simplify migration_thread() exit path
Oleg Nesterov [Wed, 17 Jun 2009 23:27:45 +0000 (16:27 -0700)]
kthreads: simplify migration_thread() exit path

Now that kthread_stop() can be used even if the task has already exited,
we can kill the "wait_to_die:" loop in migration_thread().  But we must
pin rq->migration_thread after creation.

Actually, I don't think CPU_UP_CANCELED or CPU_DEAD should wait for
->migration_thread exit.  Perhaps we can simplify this code a bit more.
migration_call() can set ->should_stop and forget about this thread.  But
we need a new helper in kthread.c for that.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Vitaliy Gusev <vgusev@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokthreads: rework kthread_stop()
Oleg Nesterov [Wed, 17 Jun 2009 23:27:45 +0000 (16:27 -0700)]
kthreads: rework kthread_stop()

Based on Eric's patch which in turn was based on my patch.

kthread_stop() has several nasty problems:

- it runs unpredictably long with the global semaphore held.

- it deadlocks if kthread itself does kthread_stop() before it obeys
  the kthread_should_stop() request.

- it is not useable if kthread exits on its own, see for example the
  ugly "wait_to_die:" hack in migration_thread()

- it is not possible to just tell kthread it should stop, we must always
  wait for its exit.

With this patch kthread() allocates all necessary data (struct kthread) on
its own stack, globals kthread_stop_xxx are deleted.  ->vfork_done is used
as a pointer into "struct kthread", this means kthread_stop() can easily
wait for kthread's exit.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Vitaliy Gusev <vgusev@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokthreads: simplify the startup synchronization
Oleg Nesterov [Wed, 17 Jun 2009 23:27:43 +0000 (16:27 -0700)]
kthreads: simplify the startup synchronization

We use two completions to create the kernel thread, this is a bit ugly.
kthread() wakes up create_kthread() via ->started, then create_kthread()
wakes up the caller kthread_create() via ->done.  But create_kthread() does
not need to wait for kthread(), it can just return.  Instead kthread()
itself can wake up the caller of kthread_create().

Kill kthread_create_info->started, ->done is enough.  This improves the
scalability a bit and simplifies the code.

The only problem is if kernel_thread() fails, in that case create_kthread()
must do complete(&create->done).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Vitaliy Gusev <vgusev@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: exit.c reorder wait_opts to remove padding on 64 bit builds
Richard Kennedy [Wed, 17 Jun 2009 23:27:42 +0000 (16:27 -0700)]
mm: exit.c reorder wait_opts to remove padding on 64 bit builds

Reorder struct wait_opts to remove 8 bytes of alignment padding on 64 bit
builds.
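
The effect can be sketched with two hypothetical layouts (these are not
the real fields of struct wait_opts).  Interleaving 4-byte and 8-byte
members forces alignment holes on 64-bit targets; grouping the 4-byte
members lets two of them share one 8-byte slot.

```c
#include <stddef.h>

/* Hypothetical layout with ints and pointers interleaved. */
struct opts_padded {
	int   type;	/* 4 bytes, then a 4-byte hole on LP64 */
	void *pid;
	int   flags;	/* 4 bytes, then a 4-byte hole on LP64 */
	void *info;
	int   error;	/* 4 bytes, then 4 bytes of tail padding on LP64 */
};

/* Same members, with the first two ints placed next to each other. */
struct opts_reordered {
	int   type;
	int   flags;	/* shares an 8-byte slot with type */
	void *pid;
	void *info;
	int   error;	/* tail padding remains */
};

/* Number of bytes the reordering saves on the current target. */
static size_t bytes_saved(void)
{
	return sizeof(struct opts_padded) - sizeof(struct opts_reordered);
}
```

On a 32-bit target where int and pointers are both 4 bytes, the two
layouts have the same size and nothing is saved.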

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodo_wait: fix the theoretical race with stop/trace/cont
Oleg Nesterov [Wed, 17 Jun 2009 23:27:42 +0000 (16:27 -0700)]
do_wait: fix the theoretical race with stop/trace/cont

do_wait:

current->state = TASK_INTERRUPTIBLE;

read_lock(&tasklist_lock);
... search for the task to reap ...

In theory, the ->state changing can leak into the critical section.  Since
the child can change its status under read_lock(tasklist) in parallel
(finish_stop/ptrace_stop), we can miss the wakeup if __wake_up_parent()
sees us in TASK_RUNNING state.  Add the barrier.

Also, use __set_current_state() to set TASK_RUNNING.
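
The ordering requirement can be sketched in user space with C11 atomics.
This is an analogy under stated assumptions, not kernel code: the struct
and function names are hypothetical, and atomic_thread_fence() stands in
for the barrier the patch adds.  Each side stores its own flag, issues a
full fence, and only then reads the other side's flag, so at least one of
them must observe the other's store.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { SK_RUNNING, SK_INTERRUPTIBLE };

struct waiter_sketch {
	atomic_int  state;		/* plays the role of current->state */
	atomic_bool child_event;	/* plays the role of the child's status change */
};

/* Waiter side: publish the sleeping state, fence, then look for work.
 * Without the fence the load could be satisfied before the store is
 * visible.  Returns true if the caller should go to sleep. */
static bool waiter_should_sleep(struct waiter_sketch *w)
{
	atomic_store_explicit(&w->state, SK_INTERRUPTIBLE, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&w->child_event, memory_order_relaxed)) {
		atomic_store_explicit(&w->state, SK_RUNNING, memory_order_relaxed);
		return false;
	}
	return true;
}

/* Waker side: publish the event, fence, then wake the waiter only if it
 * is marked as sleeping.  Returns true if a wakeup was delivered. */
static bool waker_notify(struct waiter_sketch *w)
{
	atomic_store_explicit(&w->child_event, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&w->state, memory_order_relaxed) == SK_INTERRUPTIBLE) {
		atomic_store_explicit(&w->state, SK_RUNNING, memory_order_relaxed);
		return true;
	}
	return false;
}
```

If the waiter's fence is removed, both sides may read stale values: the
waker sees SK_RUNNING and skips the wakeup while the waiter sees no event
and sleeps, which is the missed wakeup described above.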

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodo_wait: kill the old BUG_ON, use while_each_thread()
Oleg Nesterov [Wed, 17 Jun 2009 23:27:41 +0000 (16:27 -0700)]
do_wait: kill the old BUG_ON, use while_each_thread()

do_wait() does BUG_ON(tsk->signal != current->signal), this looks like a
rather obsolete check.  At least, I don't think do_wait() is the best place
to verify that all threads have the same ->signal.  Remove it.

Also, change the code to use while_each_thread().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodo_wait: simplify retval/tsk_result/notask_error mess
Oleg Nesterov [Wed, 17 Jun 2009 23:27:40 +0000 (16:27 -0700)]
do_wait: simplify retval/tsk_result/notask_error mess

Now that we don't pass &retval down to other helpers we can simplify
the code more.

- kill tsk_result, just use retval

- add the "notask" label right after the main loop, and
  s/goto end/goto notask/ after the fastpath pid check.

  This way we don't need to initialize retval before this
  check and the code becomes a bit more clean, if this pid
  has no attached tasks we should just skip the list search.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agointroduce "struct wait_opts" to simplify do_wait() paths
Oleg Nesterov [Wed, 17 Jun 2009 23:27:39 +0000 (16:27 -0700)]
introduce "struct wait_opts" to simplify do_wait() paths

Introduce "struct wait_opts" which holds the parameters for misc helpers
in do_wait() paths.

This adds 13 lines to kernel/exit.c, but saves 256 bytes from .o and imho
makes the code much more readable.

This patch temporarily uglifies rusage/siginfo code a little bit, will be
addressed by further cleanups.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoshift "ptrace implies WUNTRACED" from ptrace_do_wait() to wait_task_stopped()
Oleg Nesterov [Wed, 17 Jun 2009 23:27:39 +0000 (16:27 -0700)]
shift "ptrace implies WUNTRACED" from ptrace_do_wait() to wait_task_stopped()

No functional changes, preparation for the next patch.

ptrace_do_wait() adds WUNTRACED to options for wait_task_stopped() which
should always accept the stopped tracee, even if do_wait() was called
without WUNTRACED.

Change wait_task_stopped() to check "ptrace || WUNTRACED" instead.  This
makes the code more explicit, and "int options" argument becomes const in
do_wait() paths.
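
The check can be sketched as a small predicate.  The function name and
its bool parameter are hypothetical; only WUNTRACED comes from the
standard <sys/wait.h> header.  The point is that the caller's options are
read but never modified.

```c
#include <stdbool.h>
#include <sys/wait.h>	/* WUNTRACED */

/* A stopped task is reportable if we are on the ptrace path, or if the
 * caller asked for stopped children explicitly.  The options value is
 * only tested, never OR-ed with extra flags. */
static bool stop_is_reportable(bool ptrace, int options)
{
	return ptrace || (options & WUNTRACED) != 0;
}
```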

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoelf_core_dump: use rcu_read_lock() to access ->real_parent
Oleg Nesterov [Wed, 17 Jun 2009 23:27:38 +0000 (16:27 -0700)]
elf_core_dump: use rcu_read_lock() to access ->real_parent

In theory it is not safe to dereference ->parent/real_parent without
tasklist or rcu lock, we can race with re-parenting.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agocopy_process(): remove the unneeded clear_tsk_thread_flag(TIF_SIGPENDING)
Oleg Nesterov [Wed, 17 Jun 2009 23:27:37 +0000 (16:27 -0700)]
copy_process(): remove the unneeded clear_tsk_thread_flag(TIF_SIGPENDING)

The forked child can have TIF_SIGPENDING if it was copied from parent's
ti->flags.  But this is harmless and actually almost never happens,
because copy_process() can't succeed if signal_pending() == T.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agowait_task_zombie: do not use thread_group_cputime()
Oleg Nesterov [Wed, 17 Jun 2009 23:27:36 +0000 (16:27 -0700)]
wait_task_zombie: do not use thread_group_cputime()

There is no reason for thread_group_cputime() in wait_task_zombie(), there
must be no other threads.

This call was previously needed to collect the per-cpu data which we do
not have any longer.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Vitaly Mayatskikh <vmayatsk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: don't take tasklist to get/set ->last_siginfo
Oleg Nesterov [Wed, 17 Jun 2009 23:27:36 +0000 (16:27 -0700)]
ptrace: don't take tasklist to get/set ->last_siginfo

Change ptrace_getsiginfo/ptrace_setsiginfo to use lock_task_sighand()
without tasklist_lock.  Perhaps it makes sense to make a single helper
with "bool rw" argument.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage
Oleg Nesterov [Wed, 17 Jun 2009 23:27:35 +0000 (16:27 -0700)]
ptrace: do_notify_parent_cldstop: fix the wrong ->nsproxy usage

If the non-traced sub-thread calls do_notify_parent_cldstop(), we send the
notification to group_leader->real_parent and we report group_leader's
pid.

But, if group_leader is traced we use the wrong ->parent->nsproxy->pid_ns,
the tracer and parent can live in different namespaces.  Change the code
to use "parent" instead of tsk->parent.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: wait_task_zombie: s/->parent/->real_parent/
Oleg Nesterov [Wed, 17 Jun 2009 23:27:34 +0000 (16:27 -0700)]
ptrace: wait_task_zombie: s/->parent/->real_parent/

Change wait_task_zombie() to use ->real_parent instead of ->parent.  We
could even use current afaics, but ->real_parent is more clean.

We know that the child is not ptrace_reparented() and thus they are equal.
But we should avoid using task_struct->parent, we are going to remove it.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace_get_task_struct: s/tasklist/rcu/, make it static
Oleg Nesterov [Wed, 17 Jun 2009 23:27:34 +0000 (16:27 -0700)]
ptrace_get_task_struct: s/tasklist/rcu/, make it static

- Use rcu_read_lock() instead of tasklist_lock to find/get the task
  in ptrace_get_task_struct().

- Make it static, it has no callers outside of ptrace.c.

- The comment doesn't match the reality, this helper does not do
  any checks. Because it is really trivial and static I removed the
  whole comment.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: do not use task_lock() for attach
Oleg Nesterov [Wed, 17 Jun 2009 23:27:33 +0000 (16:27 -0700)]
ptrace: do not use task_lock() for attach

Remove the "Nasty, nasty" lock dance in ptrace_attach()/ptrace_traceme() -
from now task_lock() has nothing to do with ptrace at all.

With the recent changes nobody uses task_lock() to serialize with ptrace,
but in fact it was never needed and it was never used consistently.

However ptrace_attach() calls __ptrace_may_access() and needs task_lock()
to pin task->mm for get_dumpable().  But we can call __ptrace_may_access()
before we take tasklist_lock, ->cred_exec_mutex protects us against
do_execve() path which can change creds and MMF_DUMP* flags.

(ugly, but we can't use ptrace_may_access() because it hides the error
code, so we have to take task_lock() and use __ptrace_may_access()).

NOTE: this change assumes that LSM hooks, security_ptrace_may_access() and
security_ptrace_traceme(), can be called without task_lock() held.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: cleanup check/set of PT_PTRACED during attach
Oleg Nesterov [Wed, 17 Jun 2009 23:27:32 +0000 (16:27 -0700)]
ptrace: cleanup check/set of PT_PTRACED during attach

ptrace_attach() and ptrace_traceme() are the last functions which look as
if the untraced task can have task->ptrace != 0, this must not be
possible.  Change the code to just check ->ptrace != 0 and s/|=/=/ to set
PT_PTRACED.

Also, a couple of trivial whitespace cleanups in ptrace_attach().

And move ptrace_traceme() up near ptrace_attach() to keep them close to
each other.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: ptrace_attach: check PF_KTHREAD + exit_state instead of ->mm
Oleg Nesterov [Wed, 17 Jun 2009 23:27:31 +0000 (16:27 -0700)]
ptrace: ptrace_attach: check PF_KTHREAD + exit_state instead of ->mm

- Add PF_KTHREAD check to prevent attaching to the kernel thread
  with a borrowed ->mm.

  With or without this change we can race with daemonize() which
  can set PF_KTHREAD or clear ->mm after ptrace_attach() does the
  check, but this doesn't matter because reparent_to_kthreadd()
  does ptrace_unlink().

- Kill "!task->mm" check. We don't really care about ->mm != NULL,
  and the task can call exit_mm() right after we drop task_lock().
  What we need is to make sure we can't attach after exit_notify(),
  check task->exit_state != 0 instead.

Also, move the "already traced" check down for cosmetic reasons.
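
The resulting order of checks can be sketched as below.  This is a
simplified model, not the body of ptrace_attach(): the struct, its fields
and the function name are hypothetical, and returning -EPERM for every
refusal is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the task fields the checks look at. */
struct task_sketch {
	bool is_kthread;	/* stands in for PF_KTHREAD in ->flags */
	int exit_state;		/* non-zero once the task has passed exit_notify() */
	unsigned int ptrace;	/* non-zero if the task is already traced */
};

/* Returns 0 if attaching would be allowed, or a negative errno. */
static int may_attach_sketch(const struct task_sketch *t)
{
	if (t->is_kthread)
		return -EPERM;	/* kernel thread, even with a borrowed ->mm */
	if (t->exit_state != 0)
		return -EPERM;	/* too late, the task is already exiting */
	if (t->ptrace != 0)
		return -EPERM;	/* "already traced" check, now done last */
	return 0;
}
```

Note that ->mm does not appear in the sketch at all, matching the removal
of the "!task->mm" check.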

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: do not use task->ptrace directly in core kernel
Oleg Nesterov [Wed, 17 Jun 2009 23:27:30 +0000 (16:27 -0700)]
ptrace: do not use task->ptrace directly in core kernel

No functional changes.

- Nobody except ptrace.c & co should use ptrace flags directly, we have
  task_ptrace() for that.

- No need to specially check PT_PTRACED, we must not have other PT_ bits
  set without PT_PTRACED. And no need to know this flag exists.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: tracehook_unsafe_exec(): remove the stale comment
Oleg Nesterov [Wed, 17 Jun 2009 23:27:29 +0000 (16:27 -0700)]
ptrace: tracehook_unsafe_exec(): remove the stale comment

tracehook_unsafe_exec() doesn't need task_lock(), remove the old comment.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: mm_need_new_owner: use ->real_parent to search in the siblings
Oleg Nesterov [Wed, 17 Jun 2009 23:27:29 +0000 (16:27 -0700)]
ptrace: mm_need_new_owner: use ->real_parent to search in the siblings

"Search in the siblings" should use ->real_parent, not ->parent.  If the
task is traced then ->parent == tracer, while the task's parent is always
->real_parent.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: remove PT_DTRACE from arch/m32r
Oleg Nesterov [Wed, 17 Jun 2009 23:27:27 +0000 (16:27 -0700)]
ptrace: remove PT_DTRACE from arch/m32r

m32r: PTRACE_SINGLESTEP sets PT_DTRACE, it is never used except cleared
after do_execve().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: remove PT_DTRACE from m68k, m68knommu
Oleg Nesterov [Wed, 17 Jun 2009 23:27:27 +0000 (16:27 -0700)]
ptrace: remove PT_DTRACE from m68k, m68knommu

m68k sets PT_DTRACE in trap_c() but never uses it.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: remove PT_DTRACE from avr32, mn10300, parisc, s390, sh, xtensa
Oleg Nesterov [Wed, 17 Jun 2009 23:27:25 +0000 (16:27 -0700)]
ptrace: remove PT_DTRACE from avr32, mn10300, parisc, s390, sh, xtensa

avr32, mn10300, parisc, s390, sh, xtensa:

They never set PT_DTRACE, but clear it after do_execve().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Chris Zankel <chris@zankel.net>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoptrace: remove PT_DTRACE from arch/h8300
Oleg Nesterov [Wed, 17 Jun 2009 23:27:23 +0000 (16:27 -0700)]
ptrace: remove PT_DTRACE from arch/h8300

h8300 defines PT_DTRACE for asm but never uses it.

DEFINE(PT_PTRACED, PT_PTRACED) seems to be unused too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoallow_signal: kill the bogus ->mm check, add a note about CLONE_SIGHAND
Oleg Nesterov [Wed, 17 Jun 2009 23:27:23 +0000 (16:27 -0700)]
allow_signal: kill the bogus ->mm check, add a note about CLONE_SIGHAND

allow_signal() checks ->mm == NULL.  Not sure why.  Perhaps to make sure
current is the kernel thread.  But this helper must not be used unless we
are the kernel thread, kill this check.

Also, document the fact that the CLONE_SIGHAND kthread must not use
allow_signal(), unless the caller really wants to change the parent's
->sighand->action as well.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix lru rotation in isolate_pages
KAMEZAWA Hiroyuki [Wed, 17 Jun 2009 23:27:21 +0000 (16:27 -0700)]
memcg: fix lru rotation in isolate_pages

Try to fix memcg's lru rotation sanity: make memcg use the same logic as
the global LRU does.

Now, when __isolate_lru_page() returns -EBUSY, the page is rotated to the
tail of the LRU in the global LRU's page isolation.  But in memcg, this is
not handled.  This patch makes memcg behave the same as the global LRU and
rotate the LRU if the page is busy.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: add interface to reset limits
Daisuke Nishimura [Wed, 17 Jun 2009 23:27:20 +0000 (16:27 -0700)]
memcg: add interface to reset limits

We don't have an interface to reset mem.limit or memsw.limit now.

This patch allows mem.limit or memsw.limit to be reset by setting them
to -1.
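
A minimal user-space sketch of such a parser, assuming "-1" is the only
accepted negative value and maps to a "no limit" constant.  The function
name and SKETCH_LIMIT_MAX are hypothetical, and size suffixes are not
handled here.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_LIMIT_MAX ULLONG_MAX	/* hypothetical "no limit" value */

/* Parse a limit written by the user.  "-1" resets the limit to the
 * maximum; any other input must be a plain decimal number.
 * Returns 0 on success or -EINVAL on malformed input. */
static int parse_limit_sketch(const char *buf, unsigned long long *res)
{
	char *end;

	if (strcmp(buf, "-1") == 0) {
		*res = SKETCH_LIMIT_MAX;
		return 0;
	}
	if (buf[0] < '0' || buf[0] > '9')
		return -EINVAL;	/* other negatives, empty input, whitespace */

	errno = 0;
	*res = strtoull(buf, &end, 10);
	if (errno != 0 || *end != '\0')
		return -EINVAL;	/* overflow or trailing garbage */
	return 0;
}
```

Treating "-1" as a special token rather than a number keeps every other
negative value invalid.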

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix behavior under memory.limit equals to memsw.limit
KAMEZAWA Hiroyuki [Wed, 17 Jun 2009 23:27:19 +0000 (16:27 -0700)]
memcg: fix behavior under memory.limit equals to memsw.limit

A user can set memcg.limit_in_bytes == memcg.memsw.limit_in_bytes when the
user just wants to limit the total size of applications, in other words,
is not very interested in memory usage itself.  In this case, swap-out will
be done only by global-LRU.

But, under the current implementation, memory.limit_in_bytes is checked
first and try_to_free_page() may do swap-out.  But, that swap-out is
useless for memsw.limit_in_bytes and the thread may hit the limit again.

This patch tries to fix the current behavior at memory.limit ==
memsw.limit case.  And documentation is updated to explain the behavior of
this special case.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix swap accounting
KAMEZAWA Hiroyuki [Wed, 17 Jun 2009 23:27:17 +0000 (16:27 -0700)]
memcg: fix swap accounting

This patch fixes mis-accounting of swap usage in memcg.

In the current implementation, memcg's swap account is uncharged only when
swap is completely freed.  But there are several cases where swap cannot
be freed cleanly.  To handle that, this patch makes memcg uncharge the
swap account when swap has no references other than cache.

By this, memcg's swap entry accounting can be fully synchronous with the
application's behavior.

This patch also changes memcg's hooks for swap-out.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: remove unneeded forward declaration from sched.h
Li Zefan [Wed, 17 Jun 2009 23:27:16 +0000 (16:27 -0700)]
memcg: remove unneeded forward declaration from sched.h

This forward declaration seems pointless.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: remove some redundant checks
Li Zefan [Wed, 17 Jun 2009 23:27:15 +0000 (16:27 -0700)]
memcg: remove some redundant checks

We don't need to check do_swap_account in functions that will never get
called when do_swap_account == 0.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago memcg: remove mem_cgroup_cache_charge_swapin()
Daisuke Nishimura [Wed, 17 Jun 2009 23:27:15 +0000 (16:27 -0700)]
memcg: remove mem_cgroup_cache_charge_swapin()

mem_cgroup_cache_charge_swapin() isn't used any more, so remove its no-op
definition from the header file.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago memcg: add file-based RSS accounting
Balbir Singh [Wed, 17 Jun 2009 23:26:34 +0000 (16:26 -0700)]
memcg: add file-based RSS accounting

Add file RSS tracking per memory cgroup.

We currently don't track file RSS; the RSS we report is actually anon RSS.
All file-mapped pages come in through the page cache and get accounted
there.  This patch adds support for accounting file RSS pages.  It should:

1. Help improve the metrics reported by the memory resource controller
2. Form the basis for a future shared memory accounting heuristic
   that has been proposed by Kamezawa

Unfortunately, we cannot rename the existing "rss" keyword used in
memory.stat to "anon_rss".  We do, however, add "mapped_file" data and
hope to educate the end user through documentation.

[hugh.dickins@tiscali.co.uk: fix mem_cgroup_update_mapped_file_stat oops]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.cn>
Cc: Paul Menage <menage@google.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago devcgroup: skip superfluous checks when found the DEV_ALL elem
Li Zefan [Wed, 17 Jun 2009 23:26:33 +0000 (16:26 -0700)]
devcgroup: skip superfluous checks when found the DEV_ALL elem

While walking through the whitelist, if the DEV_ALL item is found, no
further checks are needed.
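
The early exit can be sketched roughly as follows.  This is not the
devcgroup code itself: the struct layout, the DEV_ALL value and the
function name are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's whitelist types. */
#define DEV_ALL 4 /* assumed marker meaning "matches every device" */

struct wl_item {
	short type;             /* DEV_ALL or a concrete device type */
	unsigned int major;
	unsigned int minor;
};

/* Return true if (type, major, minor) is allowed by the whitelist. */
static bool wl_allows(const struct wl_item *wl, size_t n,
		      short type, unsigned int major, unsigned int minor)
{
	for (size_t i = 0; i < n; i++) {
		/* DEV_ALL matches everything: skip the remaining checks. */
		if (wl[i].type == DEV_ALL)
			return true;
		if (wl[i].type != type)
			continue;
		if (wl[i].major != major || wl[i].minor != minor)
			continue;
		return true;
	}
	return false;
}
```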

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago cgroups: forbid noprefix if mounting more than just cpuset subsystem
Li Zefan [Wed, 17 Jun 2009 23:26:33 +0000 (16:26 -0700)]
cgroups: forbid noprefix if mounting more than just cpuset subsystem

The 'noprefix' option was introduced for backwards compatibility with
cpuset, but it can actually be used when mounting other subsystems too.

This makes name collisions possible, and a collision can now really
happen, because both the memory and cpuacct subsystems have a 'stat'
file:

# mount -t cgroup -o noprefix,memory,cpuacct xxx /mnt

Cgroup will happily mount the 2 subsystems, but only the 'stat' file of the
memory subsystem can be seen.

We don't want users to use noprefix, and we also want to avoid name
collisions, so noprefix is now allowed only when mounting just the cpuset
subsystem.
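
A minimal sketch of such a check, assuming the requested subsystems are
collected in an unsigned long bitmask.  The ids, function names and the -1
error value are hypothetical and do not come from the patch.

```c
#include <stdbool.h>

/* Hypothetical subsystem ids; the real ones are generated by the kernel. */
enum { CPUSET_ID, CPUACCT_ID, MEMORY_ID };

/*
 * subsys_bits has one bit set per subsystem requested at mount time.
 * Shifting 1UL rather than 1 matches the width of the unsigned long mask.
 */
static bool noprefix_allowed(unsigned long subsys_bits)
{
	return subsys_bits == (1UL << CPUSET_ID);
}

/* Returns 0 if the option combination is acceptable, -1 otherwise. */
static int check_mount_opts(unsigned long subsys_bits, bool noprefix)
{
	if (noprefix && !noprefix_allowed(subsys_bits))
		return -1;
	return 0;
}
```

With this shape, the memory,cpuacct mount shown above would be rejected
when noprefix is given and still accepted without it.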

[akpm@linux-foundation.org: fix shift for cpuset_subsys_id >= 32]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago cgroups: make messages more readable
Randy Dunlap [Wed, 17 Jun 2009 23:26:32 +0000 (16:26 -0700)]
cgroups: make messages more readable

Fix some cgroup messages to read better.
Update MAINTAINERS to include mm/*cgroup* files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago Documentation/connector/cn_test.c comment unused cn_test_want_notify()
Jaswinder Singh Rajput [Wed, 17 Jun 2009 23:26:30 +0000 (16:26 -0700)]
Documentation/connector/cn_test.c comment unused cn_test_want_notify()

Currently cn_test_want_notify() has no user.

So add an ifdef and a comment which tells us to not remove it.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago Documentation/Changes: perl is needed to build the kernel
Jose Luis Perez Diez [Wed, 17 Jun 2009 23:26:30 +0000 (16:26 -0700)]
Documentation/Changes: perl is needed to build the kernel

Perl is used by the kernel Makefile to generate documentation, firmware
in C source form, sources, graphs, and some headers, and this fact is
undocumented.

[akpm@linux-foundation.org: 80-columns, please]
Signed-off-by: Jose Luis Perez Diez <jluis@escomposlinux.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago reiserfs: fix warnings with gcc 4.4
Jeff Mahoney [Wed, 17 Jun 2009 23:26:29 +0000 (16:26 -0700)]
reiserfs: fix warnings with gcc 4.4

Several code paths in reiserfs have a construct like:

 if (is_direntry_le_ih(ih = B_N_PITEM_HEAD(src, item_num))) ...

which, in addition to being ugly, ends up causing compiler warnings with
gcc 4.4.0.  Previous compilers didn't issue a warning.

fs/reiserfs/do_balan.c:1273: warning: operation on `aux_ih' may be undefined
fs/reiserfs/lbalance.c:393: warning: operation on `ih' may be undefined
fs/reiserfs/lbalance.c:421: warning: operation on `ih' may be undefined
fs/reiserfs/lbalance.c:777: warning: operation on `ih' may be undefined

I believe this is due to the ih being passed to macros which evaluate the
argument more than once.  This is old code and we haven't seen any
problems with it, but this patch eliminates the warnings.

It converts the multiple evaluation macros to static inlines and does a
preassignment for the cases that were causing the warnings because that
code is just ugly.
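
The difference can be sketched as follows.  The types, names and the
TYPE_DIRENTRY value here are simplified assumptions for illustration, not
the reiserfs definitions.

```c
#include <stdbool.h>

#define TYPE_DIRENTRY 3 /* hypothetical value */

struct item_head {
	int version;
	int type;
};

static int key_type(int version, const struct item_head *ih)
{
	(void)version; /* unused in this sketch */
	return ih->type;
}

/*
 * Old style: the argument is expanded twice, as two arguments of one call.
 * IS_DIRENTRY_MACRO(ih = next_head()) would assign to ih twice with no
 * sequence point in between, hence "operation on 'ih' may be undefined".
 */
#define IS_DIRENTRY_MACRO(ih) \
	(key_type((ih)->version, (ih)) == TYPE_DIRENTRY)

/* New style: a static inline evaluates its argument exactly once. */
static inline bool is_direntry(const struct item_head *ih)
{
	return key_type(ih->version, ih) == TYPE_DIRENTRY;
}

/* Preassignment: assign first, then test, instead of nesting the two. */
static bool head_is_direntry(struct item_head *items, int item_num)
{
	struct item_head *ih;

	ih = &items[item_num];
	return is_direntry(ih);
}
```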

Reported-by: Chris Mason <mason@oracle.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago ufs: sector_t cannot be negative
Roel Kluin [Wed, 17 Jun 2009 23:26:28 +0000 (16:26 -0700)]
ufs: sector_t cannot be negative

i_block and fragment are unsigned, so they cannot be negative.
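
As an illustration only, a range check on an unsigned block number needs
just the upper bound.  The sector_like_t typedef and the function name are
stand-ins, not the ufs code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_like_t; /* stand-in for the kernel's sector_t */

static bool block_in_range(sector_like_t block, sector_like_t nblocks)
{
	/* No "block < 0" test: an unsigned value is never negative. */
	return block < nblocks;
}
```

A value that was meant to be negative, such as (sector_like_t)-1, wraps to
a very large number and is still caught by the upper bound.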

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago isofs: cleanup mount option processing
Jan Kara [Wed, 17 Jun 2009 23:26:27 +0000 (16:26 -0700)]
isofs: cleanup mount option processing

Remove unused variables from isofs_sb_info (they used to be mount
options), unify the option variables to use 0/1 (some options used
'y'/'n'), and use bit fields for option flags in the superblock.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago isofs: fix setting of uid and gid to 0
Jan Kara [Wed, 17 Jun 2009 23:26:27 +0000 (16:26 -0700)]
isofs: fix setting of uid and gid to 0

isofs allows setting a default uid and gid for files, but the value 0 was
used to indicate that the user did not specify any uid/gid mount option.
Since this option also overrides the uid/gid set in the Rock Ridge
extension, it makes sense to allow forcing uid/gid 0.  Fix option
processing to allow this.
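
One way to tell "not specified" apart from "0" is a separate flag per
option.  The sketch below assumes that approach; the struct and field
names are hypothetical.

```c
#include <stdbool.h>

struct iso_opts {
	unsigned int uid;
	unsigned int gid;
	unsigned int uid_set:1; /* hypothetical flag: uid= was given */
	unsigned int gid_set:1; /* hypothetical flag: gid= was given */
};

/* Owner to report: the mount option wins if given, even when it is 0. */
static unsigned int effective_uid(const struct iso_opts *o, unsigned int rr_uid)
{
	return o->uid_set ? o->uid : rr_uid;
}
```

The gid case would follow the same pattern with gid_set.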

Cc: <Hans-Joachim.Baader@cjt.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>