Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:39 +0000 (01:47 -0700)]
ACPI: Do not prepare for hibernation in acpi_shutdown
Since we are now explicitly calling hibernation_ops->prepare() before
hibernation_ops->enter() in hibernation_platform_enter() (defined in
kernel/power/disk.c), ACPI should not call acpi_sleep_prepare(ACPI_STATE_S4)
from acpi_shutdown().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:38 +0000 (01:47 -0700)]
PM: Reduce code duplication between main.c and user.c
The SNAPSHOT_S2RAM ioctl code is outdated and it should not duplicate the
suspend code in kernel/power/main.c. Fix that.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:37 +0000 (01:47 -0700)]
PM: prevent frozen user mode helpers from failing the freezing of tasks
At present, if a user mode helper is running while
usermodehelper_pm_callback() is executed, the helper may be frozen and the
completion in call_usermodehelper_exec() won't be completed until user
space processes are thawed. As a result, the freezing of kernel threads
may fail, which is not desirable.
Prevent this from happening by introducing a counter of running user mode
helpers and allowing usermodehelper_pm_callback() to succeed for action =
PM_HIBERNATION_PREPARE or action = PM_SUSPEND_PREPARE only if there are no
helpers running. [Namely, usermodehelper_pm_callback() waits for at most
RUNNING_HELPERS_TIMEOUT for the number of running helpers to become zero
and fails if that doesn't happen.]
Special thanks to Uli Luckas <u.luckas@road.de>, Pavel Machek
<pavel@ucw.cz> and Oleg Nesterov <oleg@tv-sign.ru> for reviewing the
previous versions of this patch and for very useful comments.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Uli Luckas <u.luckas@road.de>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:36 +0000 (01:47 -0700)]
PM: disable usermode helper before hibernation and suspend
Use a hibernation and suspend notifier to disable the user mode helper before
a hibernation/suspend and enable it after the operation.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:36 +0000 (01:47 -0700)]
PM: introduce hibernation and suspend notifiers
Make it possible to register hibernation and suspend notifiers, so that
subsystems can perform hibernation-related or suspend-related operations that
should not be carried out by device drivers' .suspend() and .resume()
routines.
[akpm@linux-foundation.org: build fixes]
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:35 +0000 (01:47 -0700)]
Freezer: remove redundant check in try_to_freeze_tasks
We don't need to check if todo is positive before calling time_after() in
try_to_freeze_tasks(), because if todo is zero at this point, the loop will be
broken anyway due to the while () condition being false.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:34 +0000 (01:47 -0700)]
Freezer: return int from freeze_processes
Make try_to_freeze_tasks() and freeze_processes() return -EBUSY on failure
instead of the number of unfrozen tasks (none of the callers actually uses
this number).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:33 +0000 (01:47 -0700)]
Freezer: use __set_current_state in refrigerator
Use __set_current_state() as appropriate in refrigerator() instead of
accessing current->state directly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:33 +0000 (01:47 -0700)]
Freezer: avoid freezing kernel threads prematurely
Kernel threads should not have TIF_FREEZE set when user space processes are
being frozen, since otherwise some of them might be frozen prematurely.
To prevent this from happening we can (1) make exit_mm() unset TIF_FREEZE
unconditionally just after clearing tsk->mm and (2) make try_to_freeze_tasks()
check if p->mm is different from zero and PF_BORROWED_MM is unset in p->flags
when user space processes are to be frozen.
Namely, when user space processes are being frozen, we only should set
TIF_FREEZE for tasks that have p->mm different from NULL and don't have
PF_BORROWED_MM set in p->flags. For this reason task_lock() must be used to
prevent try_to_freeze_tasks() from racing with use_mm()/unuse_mm(), in which
p->mm and p->flags.PF_BORROWED_MM are changed under task_lock(p). Also, we
need to prevent the following scenario from happening:
* daemonize() is called by a task spawned from a user space code path
* freezer checks if the task has p->mm set and the result is positive
* task enters exit_mm() and clears its TIF_FREEZE
* freezer sets TIF_FREEZE for the task
* task calls try_to_freeze() and goes to the refrigerator, which is wrong at
that point
This requires us to acquire task_lock(p) before p->flags.PF_BORROWED_MM and
p->mm are examined and release it after TIF_FREEZE is set for p (or it turns
out that TIF_FREEZE should not be set).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:31 +0000 (01:47 -0700)]
Hibernation: prepare to enter the low power state
During hibernation we call hibernation_ops->prepare() before creating the image,
but then, before saving it, we cancel the power transition by calling
hibernation_ops->finish(). Thus prior to calling hibernation_ops->enter() we
should let the platform firmware know that we're going to enter the low power
state after all.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:31 +0000 (01:47 -0700)]
swsusp: fix hibernation code ordering
Change the code ordering so that hibernation_ops->prepare() is called after
device_suspend(). This is needed so that we don't violate the ACPI
specification, which states that the _PTS and _GTS system-control methods,
executed from acpi_sleep_prepare(), ought to be called after devices have been
put in low power states.
The "Finish" label in hibernation_restore() is moved, because device_suspend()
resumes devices if the suspending of them fails and the restore code ordering
should reflect the hibernation code ordering.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:30 +0000 (01:47 -0700)]
swsusp: introduce restore platform operations
At least on some machines it is necessary to prepare the ACPI firmware for the
restoration of the system memory state from the hibernation image if the
"platform" mode of hibernation has been used. Namely, in that cases we need
to disable the GPEs before replacing the "boot" kernel with the "frozen"
kernel (cf. http://bugzilla.kernel.org/show_bug.cgi?id=7887). After the
restore they will be re-enabled by hibernation_ops->finish(), but if the
restore fails, they have to be re-enabled by the restore code explicitly.
For this purpose we can introduce two additional hibernation operations,
called pre_restore() and restore_cleanup() and call them from the restore code
path. Still, they should be called if the "platform" mode of hibernation has
been used, so we need to pass the information about the hibernation mode from
the "frozen" kernel to the "boot" kernel in the image header.
Apparently, we can't drop the disabling of GPEs before the restore because of
Bug #7887 . We also can't do it unconditionally, because the GPEs wouldn't
have been enabled after a successful restore if the suspend had been done in
the 'shutdown' or 'reboot' mode.
In principle we could (and probably should) unconditionally disable the GPEs
before each snapshot creation *and* before the restore, but then we'd have to
unconditionally enable them after the snapshot creation as well as after the
restore (or restore failure) Still, for this purpose we'd need to modify
acpi_enter_sleep_state_prep() and acpi_leave_sleep_state() and we'd have to
introduce some mechanism synchronizing the disablind/enabling of the GPEs with
the device drivers' .suspend()/.resume() routines and with
disable_/enable_nonboot_cpus(). However, this would have affected the
suspend (ie. s2ram) code as well as the hibernation, which I'd like to avoid
in this patch series.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:29 +0000 (01:47 -0700)]
swsusp: remove code duplication between disk.c and user.c
Currently, much of the code in kernel/power/disk.c is duplicated in
kernel/power/user.c , mainly for historical reasons. By eliminating this code
duplication we can reduce the size of user.c quite substantially and remove
the maintenance difficulty resulting from it.
[bunk@stusta.de: kernel/power/disk.c: make code static]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Thu, 19 Jul 2007 08:47:28 +0000 (01:47 -0700)]
swsusp: remove incorrect code from user.c
In the face of the recent change of suspend code ordering (cf.
http://marc.info/?l=linux-acpi&m=
117938245931603&w=2) we should also modify
the code ordering in swsusp so that hibernation_ops->prepare() is executed
after device_suspend().
However, for this purpose it seems reasonable to eliminate the code
duplication between kernel/power/disk.c and kernel/power/user.c first. By
eliminating it we can reduce the size of user.c quite substantially and remove
the maintenance difficulty with making essentially the same changes in two
different places.
Moreover, we should also remove the calls to "platform" functions from the
restore code path, since it doesn't carry out any power transition of the
system, but we generally need to disable the GPEs before the restore if the
'platform' hibernation mode has been used. To do this, we can introduce two
new hibernation_ops to be used in the restore code.
This patch:
Make the code hibernation code in kernel/power/user.c be functionally
equivalent to the corresponding code in kernel/power/disk.c , as it should be.
The calls to the platform functions removed by this patch are incorrect. They
should be replaced with some other "platform" invocations that will be
introduced in one of the subsequent patches.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Collins [Thu, 19 Jul 2007 08:47:27 +0000 (01:47 -0700)]
PM: Do not require dev spew to get PM_DEBUG
In order to enable things like PM_TRACE, you're required to enable
PM_DEBUG, which sends a large spew of messages on boot, and often times can
overflow dmesg buffer.
Create new PM_VERBOSE and shift that to be the option that enables
drivers/base/power's messages.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 19 Jul 2007 08:47:26 +0000 (01:47 -0700)]
freezer: run show_state() when freezing times out
To see which tasks are stuck where.
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Miklos Szeredi [Thu, 19 Jul 2007 08:47:24 +0000 (01:47 -0700)]
only allow nonlinear vmas for ram backed filesystems
page_mkclean() doesn't re-protect ptes for non-linear mappings, so a later
re-dirty through such a mapping will not generate a fault, PG_dirty will
not reflect the dirty state and the dirty count will be skewed. This
implies that msync() is also currently broken for nonlinear mappings.
The easiest solution is to emulate remap_file_pages on non-linear mappings
with simple mmap() for non ram-backed filesystems. Applications continue
to work (albeit slower), as long as the number of remappings remain below
the maximum vma count.
However all currently known real uses of non-linear mappings are for ram
backed filesystems, which this patch doesn't affect.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Jul 2007 08:47:23 +0000 (01:47 -0700)]
Remove alloc_zeroed_user_highpage()
alloc_zeroed_user_highpage() has no in-tree users and it is not exported.
As it is not exported, it can simply be removed.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 19 Jul 2007 08:47:22 +0000 (01:47 -0700)]
mm: fix clear_page_dirty_for_io vs fault race
Fix msync data loss and (less importantly) dirty page accounting
inaccuracies due to the race remaining in clear_page_dirty_for_io().
The deleted comment explains what the race was, and the added comments
explain how it is fixed.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 19 Jul 2007 08:47:05 +0000 (01:47 -0700)]
mm: fault feedback #2
This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).
[akpm@linux-foundation.org: fix alpha build]
[akpm@linux-foundation.org: fix s390 build]
[akpm@linux-foundation.org: fix sparc build]
[akpm@linux-foundation.org: fix sparc64 build]
[akpm@linux-foundation.org: fix ia64 build]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 19 Jul 2007 08:47:03 +0000 (01:47 -0700)]
mm: fault feedback #1
Change ->fault prototype. We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things. This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).
This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.
struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.
The page is now returned via a page pointer in the vm_fault struct.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Fasheh [Thu, 19 Jul 2007 08:47:01 +0000 (01:47 -0700)]
Document ->page_mkwrite() locking
There seems to be very little documentation about this callback in general.
The locking in particular is a bit tricky, so it's worth having this in
writing.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Fasheh [Thu, 19 Jul 2007 08:47:00 +0000 (01:47 -0700)]
ocfs2: release page lock before calling ->page_mkwrite
__do_fault() was calling ->page_mkwrite() with the page lock held, which
violates the locking rules for that callback. Release and retake the page
lock around the callback to avoid deadlocking file systems which manually
take it.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 19 Jul 2007 08:46:59 +0000 (01:46 -0700)]
mm: merge populate and nopage into fault (fixes nonlinear)
Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.
->populate is a layering violation because the filesystem/pagecache code
should need to know anything about the virtual memory mapping. The hitch here
is that the ->nopage handler didn't pass down enough information (ie. pgoff).
But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).
Having the populate handler install the pte itself is likewise a nasty thing
to be doing.
This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn. Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.
The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.
After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache. Seems like a fringe functionality anyway.
NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
users have hit mainline yet.
[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Thu, 19 Jul 2007 08:46:57 +0000 (01:46 -0700)]
mm: fix fault vs invalidate race for linear mappings
Fix the race between invalidate_inode_pages and do_no_page.
Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.
The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache. Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.
The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.
Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).
Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size. do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size). The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data. Valid mappings to the same
place will see a different page.
Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit. He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight. However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit. Scalability is not an issue.
This patch implements this latter approach. ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so). do_no_page only unlocks the page after setting up the mapping
completely. invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).
This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 19 Jul 2007 01:38:25 +0000 (18:38 -0700)]
Merge branch 'isdn-fix' of /linux/kernel/git/jgarzik/misc-2.6
* 'isdn-fix' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
ISDN HiSax: uninitialized return in hisax_cs_setup
Linus Torvalds [Thu, 19 Jul 2007 01:33:45 +0000 (18:33 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
eHEA: Fix bonding support
Blackfin ethernet driver: on chip ethernet MAC controller driver
fix wrong argument of tc35815_read_plat_dev_addr()
ARM/ETHER3: Handle multicast frames.
SAA9730: Handle multicast frames.
NI5010: Handle multicast frames.
NS83820: Handle multicast frames.
Fix RGMII-ID handling in gianfar
Fix Vitesse RGMII-ID support
Add phy-connection-type to gianfar nodes
Fix Vitesse 824x PHY interrupt acking
[PATCH] zd1211rw: Add ID for Siemens Gigaset USB Stick 54
[PATCH] zd1211rw: Add ID for Planex GW-US54GXS
[PATCH] Update version ipw2200 stamp to 1.2.2
[PATCH] ipw2200: Fix ipw_isr() comments error on shared IRQ
[PATCH] Fix ipw2200 set wrong power parameter causing firmware error
[PATCH] ipw2100: Fix `iwpriv set_power` error
[PATCH] softmac: Channel is listed twice in scan output
Linus Torvalds [Thu, 19 Jul 2007 01:32:28 +0000 (18:32 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (24 commits)
[CIFS] merge conflict in fs/cifs/export.c
[CIFS] Allow disabling CIFS Unix Extensions as mount option
[CIFS] More whitespace/formatting fixes (noticed by checkpatch)
[CIFS] Typo in previous patch
[CIFS] zero_user_page() conversions
[CIFS] use simple_prepare_write to zero page data
[CIFS] Fix build break - inet.h not included when experimental ifdef off
[CIFS] Add support for new POSIX unlink
[CIFS] whitespace/formatting fixes
[CIFS] Fix oops in cifs_create when nfsd server exports cifs mount
[CIFS] whitespace cleanup
[CIFS] Fix packet signatures for NTLMv2 case
[CIFS] more whitespace fixes
[CIFS] more whitespace cleanup
[CIFS] whitespace cleanup
[CIFS] whitespace cleanup
[CIFS] ipv6 support no longer experimental
[CIFS] Mount should fail if server signing off but client mount option requires it
[CIFS] whitespace fixes
[CIFS] Fix sign mount option and sign proc config setting
...
Linus Torvalds [Thu, 19 Jul 2007 01:28:34 +0000 (18:28 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/docs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/docs-2.6:
zh_CN/HOWTO: update URLs of git trees
Chinese translation of Documentation/stable_api_nonsense.txt
HOWTO: add Chinese translation of Documentation/HOWTO
Documentation: add Japanese translated stable_api_nonsense.txt
HOWTO: add Japanese translation of Documentation/HOWTO
Linus Torvalds [Thu, 19 Jul 2007 01:28:08 +0000 (18:28 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
sysfs: cosmetic clean up on node creation failure paths
sysfs: kill an extra put in sysfs_create_link() failure path
Driver core: check return code of sysfs_create_link()
HOWTO: Add the knwon_regression URI to the documentation
dev_vdbg() documentation
dev_vdbg(), available with -DVERBOSE_DEBUG
sysfs: make sysfs_init_inode() static
sysfs: fix sysfs root inode nlink accounting
Documentation fix devres.txt: lib/iomap.c -> lib/devres.c
sysfs: avoid kmem_cache_free(NULL)
PM: remove deprecated dpm_runtime_* routines
PM: Remove deprecated sysfs files
Driver core: accept all valid action-strings in uevent-trigger
debugfs: remove rmdir() non-empty complaint
Linus Torvalds [Thu, 19 Jul 2007 01:27:50 +0000 (18:27 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/uio-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/uio-2.6:
UIO: Hilscher CIF card driver
UIO: Documentation
UIO: Add the User IO core code
Linus Torvalds [Thu, 19 Jul 2007 01:27:00 +0000 (18:27 -0700)]
Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
locks: fix vfs_test_lock() comment
locks: make posix_test_lock() interface more consistent
nfs: disable leases over NFS
gfs2: stop giving out non-cluster-coherent leases
locks: export setlease to filesystems
locks: provide a file lease method enabling cluster-coherent leases
locks: rename lease functions to reflect locks.c conventions
locks: share more common lease code
locks: clean up lease_alloc()
locks: convert an -EINVAL return to a BUG
leases: minor break_lease() comment clarification
Linus Torvalds [Thu, 19 Jul 2007 01:26:18 +0000 (18:26 -0700)]
Merge branch 'for-linus' of /linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (29 commits)
IB/mthca: Simplify use of size0 in work request posting
IB/mthca: Factor out setting WQE UD segment entries
IB/mthca: Factor out setting WQE remote address and atomic segment entries
IB/mlx4: Factor out setting other WQE segments
IB/mlx4: Factor out setting WQE data segment entries
IB/mthca: Factor out setting WQE data segment entries
IB/mlx4: Return receive queue sizes for userspace QPs from query QP
IB/mlx4: Increase max outstanding RDMA reads as target
RDMA/cma: Remove local write permission from QP access flags
IB/mthca: Use uninitialized_var() for f0
IB/cm: Make internal function cm_get_ack_delay() static
IB/ipath: Remove ipath_get_user_pages_nocopy()
IB/ipath: Make a few functions static
mlx4_core: Reset device when internal error is detected
IB/iser: Make a couple of functions static
IB/mthca: Fix printk format used for firmware version in warning
IB/mthca: Schedule MSI support for removal
IB/ehca: Fix warnings issued by checkpatch.pl
IB/ehca: Restructure ehca_set_pagebuf()
IB/ehca: MR/MW structure refactoring
...
Steve French [Thu, 19 Jul 2007 00:38:57 +0000 (00:38 +0000)]
Merge branch 'master' of /linux/kernel/git/torvalds/linux-2.6
Conflicts:
fs/cifs/export.c
Steve French [Thu, 19 Jul 2007 00:32:25 +0000 (00:32 +0000)]
[CIFS] merge conflict in fs/cifs/export.c
Signed-off-by: Steve French <sfrench@us.ibm.com>
Steve French [Wed, 18 Jul 2007 23:21:09 +0000 (23:21 +0000)]
[CIFS] Allow disabling CIFS Unix Extensions as mount option
Previously the only way to do this was to umount all mounts to that server,
turn off a proc setting (/proc/fs/cifs/LinuxExtensionsEnabled).
Fixes Samba bugzilla bug number: 4582 (and also 2008)
Signed-off-by: Steve French <sfrench@us.ibm.com>
J. Bruce Fields [Fri, 11 May 2007 20:22:50 +0000 (16:22 -0400)]
locks: fix vfs_test_lock() comment
Thanks to Doug Chapman for pointing out that the comment here is
inconsistent with the function prototype.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields [Fri, 11 May 2007 20:09:32 +0000 (16:09 -0400)]
locks: make posix_test_lock() interface more consistent
Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or
presence of a conflict lock by setting fl_type to, respectively, F_UNLCK
or something other than F_UNLCK, the return value is no longer needed.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields [Fri, 8 Jun 2007 19:23:34 +0000 (15:23 -0400)]
nfs: disable leases over NFS
As Peter Staubach says elsewhere
(http://marc.info/?l=linux-kernel&m=
118113649526444&w=2):
> The problem is that some file system such as NFSv2 and NFSv3 do
> not have sufficient support to be able to support leases correctly.
> In particular for these two file systems, there is no over the wire
> protocol support.
>
> Currently, these two file systems fail the fcntl(F_SETLEASE) call
> accidentally, due to a reference counting difference. These file
> systems should fail more consciously, with a proper error to
> indicate that the call is invalid for them.
Define an nfs setlease method that just returns -EINVAL.
If someone can demonstrate a real need, perhaps we could reenable
them in the presence of the "nolock" mount option.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Marc Eshel [Mon, 15 Jan 2007 23:33:36 +0000 (18:33 -0500)]
gfs2: stop giving out non-cluster-coherent leases
Since gfs2 can't prevent conflicting opens or leases on other nodes, we
probably shouldn't allow it to give out leases at all.
Put the newly defined lease operation into use in gfs2 by turning off
lease, unless we're using the "nolock' locking module (in which case all
locking is local anyway).
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
J. Bruce Fields [Wed, 4 Jul 2007 21:21:37 +0000 (17:21 -0400)]
locks: export setlease to filesystems
Export setlease so it can used by filesystems to implement their lease
methods.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields [Tue, 14 Nov 2006 20:51:40 +0000 (15:51 -0500)]
locks: provide a file lease method enabling cluster-coherent leases
Currently leases are only kept locally, so there's no way for a distributed
filesystem to enforce them against multiple clients. We're particularly
interested in the case of nfsd exporting a cluster filesystem, in which
case nfsd needs cluster-coherent leases in order to implement delegations
correctly.
Also add some documentation.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Thu, 7 Jun 2007 21:09:49 +0000 (17:09 -0400)]
locks: rename lease functions to reflect locks.c conventions
We've been using the convention that vfs_foo is the function that calls
a filesystem-specific foo method if it exists, or falls back on a
generic method if it doesn't; thus vfs_foo is what is called when some
other part of the kernel (normally lockd or nfsd) wants to get a lock,
whereas foo is what filesystems call to use the underlying local
functionality as part of their lock implementation.
So rename setlease to vfs_setlease (which will call a
filesystem-specific setlease after a later patch) and __setlease to
setlease.
Also, vfs_setlease need only be GPL-exported as long as it's only needed
by lockd and nfsd.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields [Thu, 31 May 2007 21:03:46 +0000 (17:03 -0400)]
locks: share more common lease code
Share more code between setlease (used by nfsd) and fcntl.
Also some minor cleanup.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by: Christoph Hellwig <hch@infradead.org>
J. Bruce Fields [Thu, 1 Mar 2007 19:34:35 +0000 (14:34 -0500)]
locks: clean up lease_alloc()
Return the newly allocated structure as the return value instead of
using a struct ** parameter.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Sat, 30 Jun 2007 16:40:32 +0000 (12:40 -0400)]
locks: convert an -EINVAL return to a BUG
There's no point trying to return an error in these cases, which all represent
bugs in the callers.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
david m. richter [Wed, 9 May 2007 20:10:27 +0000 (16:10 -0400)]
leases: minor break_lease() comment clarification
clarify that break_lease() checks for presence of any lock, not just leases.
Signed-off-by: David M. Richter <richterd@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Li Yang [Tue, 26 Jun 2007 03:15:27 +0000 (11:15 +0800)]
zh_CN/HOWTO: update URLs of git trees
Addressing patch from Stefan Richter:
HOWTO: update URLs of git trees
(It will be better if we update this to commit-id later)
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
TripleX [Thu, 21 Jun 2007 17:20:36 +0000 (01:20 +0800)]
Chinese translation of Documentation/stable_api_nonsense.txt
This is a Chinese translated version of
Documentation/stable_api_nonsense.txt.
From: TripleX <zhongyu@18mail.cn>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Li Yang [Thu, 21 Jun 2007 14:40:17 +0000 (22:40 +0800)]
HOWTO: add Chinese translation of Documentation/HOWTO
This is a Chinese translated version of Documentation/HOWTO. Currently
Chinese involvement in Linux kernel is very low, especially comparing to
its largest population base. Language could be the main obstacle. Hope
this document will help more Chinese to contribute to Linux kernel.
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: TripleX Chung <xxx.phy@gmail.com>
Signed-off-by: Maggie Chen <chenqi@beyondsoft.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
IKEDA, Munehiro [Tue, 12 Jun 2007 00:46:04 +0000 (09:46 +0900)]
Documentation: add Japanese translated stable_api_nonsense.txt
Signed-off-by: IKEDA, Munehiro <m-ikeda@ds.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tsugikazu Shibata [Tue, 12 Jun 2007 08:16:12 +0000 (17:16 +0900)]
HOWTO: add Japanese translation of Documentation/HOWTO
Add the japanese translation of the Documentation/HOWTO file.
Signed-off-by: Tsugikazu Shibata <tshibata@ab.jp.nec.com>
Cc: IKEDA Munehiro <m-ikeda@ds.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hans-Jürgen Koch [Fri, 2 Mar 2007 12:03:12 +0000 (13:03 +0100)]
UIO: Hilscher CIF card driver
this is a patch that adds support for Hilscher CIF DeviceNet and
Profibus cards. I tested it on a Kontron CPX board, and Thomas reviewed
it.
You can find the user space part here:
http://www.osadl.org/projects/downloads/UIO/user/cif-0.1.0.tar.gz
Notes: cif_api.c is the main file you want to look at. It contains the
functions to open, close, mmap and so on. cif_dps.c adds functions
specific to Profibus cards, and cif_dn.c contains functions for
DeviceNet cards. cif.c is a universal playground, it's just a small
test program. The user space part of this UIO driver is still work in
progress, and not everything is tested yet. At the moment, the thread in
cif_api.c contains some code that artificially makes the card generate
interrupts, this was added for testing and will be removed later. But
the driver already contains all the functions needed for useful
operation, so it gives a good idea of how such a thing looks like.
For comparison, here's what you get from the manufacturer
(www.hilscher.com) when you ask for a Linux 2.6 driver:
http://www.tglx.de/private/hjk/cif-orig-2.6.tar.bz2
WARNING: Don't look at the code for too long, you might become sick :-)
Signed-off-by: Hans-Jürgen Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hans J. Koch [Mon, 11 Dec 2006 15:59:59 +0000 (16:59 +0100)]
UIO: Documentation
Documentation for the UIO interface
From: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hans J. Koch [Thu, 7 Dec 2006 09:58:29 +0000 (10:58 +0100)]
UIO: Add the User IO core code
This interface allows the ability to write the majority of a driver in
userspace with only a very small shell of a driver in the kernel itself.
It uses a char device and sysfs to interact with a userspace process to
process interrupts and control memory accesses.
See the docbook documentation for more details on how to use this
interface.
From: Hans J. Koch <hjk@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Wed, 18 Jul 2007 07:38:11 +0000 (16:38 +0900)]
sysfs: cosmetic clean up on node creation failure paths
Node addition failure is detected by testing return value of
sysfs_addfm_finish() which returns the number of added and removed
nodes. As the function is called as the last step of addition right
on top of error handling block, the if blocks looked like the
following.
if (sysfs_addrm_finish(&acxt))
success handling, usually return;
/* fall through to error handling */
This is the opposite of usual convention in sysfs and makes the code
difficult to understand. This patch inverts the test and makes those
blocks look more like others.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Wed, 18 Jul 2007 07:14:45 +0000 (16:14 +0900)]
sysfs: kill an extra put in sysfs_create_link() failure path
There is a subtle bug in sysfs_create_link() failure path. When
symlink creation fails because there's already a node with the same
name, the target sysfs_dirent is put twice - once by failure path of
sysfs_create_link() and once more when the symlink is released.
Fix it by making only the symlink node responsible for putting
target_sd.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cornelia Huck [Wed, 18 Jul 2007 08:43:47 +0000 (01:43 -0700)]
Driver core: check return code of sysfs_create_link()
Check for return value of sysfs_create_link() in device_add() and
device_rename(). Add helper functions device_add_class_symlinks() and
device_remove_class_symlinks() to make the code easier to read.
[akpm@linux-foundation.org: fix unused var warnings]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paolo Ciarrocchi [Mon, 16 Jul 2007 21:55:05 +0000 (23:55 +0200)]
HOWTO: Add the knwon_regression URI to the documentation
We should let everybody know about where the regression
list is hosted. The more is known the more it is used.
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Cc: Li Yang <leoli@freescale.com>
Cc: TripleX Chung <xxx.phy@gmail.com>
Cc: Maggie Chen <chenqi@beyondsoft.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Tsugikazu Shibata <tshibata@ab.jp.nec.com>
Cc: IKEDA Munehiro <m-ikeda@ds.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Fri, 13 Jul 2007 23:32:09 +0000 (16:32 -0700)]
dev_vdbg() documentation
Update CodingStyle to talk about "-DDEBUG" message conventions and the
new "-DVERBOSE_DEBUG" convention.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Fri, 13 Jul 2007 05:08:22 +0000 (22:08 -0700)]
dev_vdbg(), available with -DVERBOSE_DEBUG
This defines a dev_vdbg() call, which is enabled with -DVERBOSE_DEBUG.
When enabled, dev_vdbg() acts just like dev_dbg(). When disabled, it is a
NOP ... just like dev_dbg() without -DDEBUG. The specific code was moved
out of a USB patch, but lots of drivers have similar support.
That is, code can now be written to use an additional level of debug
output, selected at compile time. Many driver authors have found this
idiom to be very useful. A typical usage model is for "normal" debug
messages to focus on fault paths and not be very "chatty", so that those
messages can be left on during normal operation without much of a
performance or syslog load. On the other hand "verbose" messages would be
noisy enough that they wouldn't normally be enabled; they might even affect
timings enough to change system or driver behavior.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Wed, 18 Jul 2007 05:30:28 +0000 (14:30 +0900)]
sysfs: make sysfs_init_inode() static
With sysfs_fill_super() converted to use sysfs_get_inode(), there is
no user of sysfs_init_inode() outside of fs/sysfs/inode.c. Make it
static.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Wed, 18 Jul 2007 05:29:06 +0000 (14:29 +0900)]
sysfs: fix sysfs root inode nlink accounting
While making sysfs indoes hashed, sysfs root inode was left out. Now
that nlink accounting depends on the inode being on the hash, sysfs
root inode nlink isn't adjusted properly.
Put sysfs root inode on the inode hash by allocating it using
sysfs_get_inode() like other sysfs inodes. While at it, massage
comments a bit.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Brandon Philips [Wed, 18 Jul 2007 05:09:34 +0000 (22:09 -0700)]
Documentation fix devres.txt: lib/iomap.c -> lib/devres.c
Signed-off-by: Brandon Philips <bphilips@suse.de>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akinobu Mita [Sat, 14 Jul 2007 02:03:35 +0000 (11:03 +0900)]
sysfs: avoid kmem_cache_free(NULL)
kmem_cache_free() with NULL is not allowed. But it may happen
if out of memory error is triggered in sysfs_new_dirent().
This patch fixes that error handling.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Thu, 12 Jul 2007 20:57:22 +0000 (16:57 -0400)]
PM: remove deprecated dpm_runtime_* routines
This patch (as933) removes the deprecated dpm_runtime_suspend() and
dpm_runtime_resume() routines from the PM core. The only user of
those routines is the PCMCIA ds driver; local replacements are added.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Thu, 12 Jul 2007 20:55:07 +0000 (16:55 -0400)]
PM: Remove deprecated sysfs files
This patch (as932) removes the deprecated sysfs .../power/state
attribute files.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Sun, 8 Jul 2007 20:29:26 +0000 (22:29 +0200)]
Driver core: accept all valid action-strings in uevent-trigger
This allows the uevent file to handle any type of uevent action to be
triggered by userspace instead of just the "add" uevent.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jens Axboe [Wed, 11 Jul 2007 12:53:28 +0000 (14:53 +0200)]
debugfs: remove rmdir() non-empty complaint
Hi,
This patch kills the pointless debugfs rmdir() printk() when called on a
non-empty directory. blktrace will sometimes have to call it a few times
when forcefully ending a trace, which polutes the log with pointless
warnings.
Rationale:
- It's more code to work-around this "problem" in the debugfs users, and
you would have to add code to check for empty directories to do so (or
assume that debugfs is using simple_ helpers, but that would be a
layering violation).
- Other rmdir() implementations don't complain about something this
silly.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Klein [Wed, 18 Jul 2007 15:34:09 +0000 (17:34 +0200)]
eHEA: Fix bonding support
The driver didn't allow an interface's MAC address to be modified if the
respective interface wasn't setup - a failing Hcall was the result. Thus
bonding wasn't usable. The fix moves the failing Hcall which was registering
a MAC address for the reception of BC packets in firmware from the port up
and down functions to the port resources setup functions. Additionally the
missing update of the last_rx member of the netdev structure was added.
Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Bryan Wu [Tue, 17 Jul 2007 06:43:44 +0000 (14:43 +0800)]
Blackfin ethernet driver: on chip ethernet MAC controller driver
This patch implements the driver necessary use the Analog Devices
Blackfin processor's on-chip ethernet MAC controller.
[try#2]
- add timeout control
- kill dma_config_reg bitfields
- some trivial cleanup
[try#3]
- add endianess check
- add DRV_NAME, DRV_VERSION... driver information string
- add some comments for silicon anomaly and dma API confusion
- some code trivial cleanup
[try#4]
- add Blackfin latest GPIO pin mux opertion with Michael Hennerich's
help and Dan's review
- rewrite the DMA descriptor list operation in a more readable way
by Joe's review
[try#5]
- cleanup some coding style by Joe's review.
[try#6]
- 1.1 version fix a bug when set up multicast list pointed by Mr. yoshfuji
- rearrange the desc_list_free function.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Yoichi Yuasa [Wed, 18 Jul 2007 02:13:42 +0000 (11:13 +0900)]
fix wrong argument of tc35815_read_plat_dev_addr()
Fix wrong argument of tc35815_read_plat_dev_addr()
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Wed, 18 Jul 2007 22:31:03 +0000 (18:31 -0400)]
Merge branch 'upstream-jgarzik' of git://git./linux/kernel/git/linville/wireless-2.6 into upstream
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 17 Jul 2007 04:45:43 +0000 (13:45 +0900)]
ARM/ETHER3: Handle multicast frames.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
--
Signed-off-by: Jeff Garzik <jeff@garzik.org>
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 17 Jul 2007 04:46:00 +0000 (13:46 +0900)]
SAA9730: Handle multicast frames.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
--
Signed-off-by: Jeff Garzik <jeff@garzik.org>
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 17 Jul 2007 04:45:50 +0000 (13:45 +0900)]
NI5010: Handle multicast frames.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
--
Signed-off-by: Jeff Garzik <jeff@garzik.org>
YOSHIFUJI Hideaki / 吉藤英明 [Tue, 17 Jul 2007 04:45:54 +0000 (13:45 +0900)]
NS83820: Handle multicast frames.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
--
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Andy Fleming [Wed, 11 Jul 2007 16:43:07 +0000 (11:43 -0500)]
Fix RGMII-ID handling in gianfar
The TSEC/eTSEC can detect the interface to the PHY automatically,
but it isn't able to detect whether the RGMII connection needs internal
delay. So we need to detect that change in the device tree, propagate
it to the platform data, and then check it if we're in RGMII. This fixes
a bug on the 8641D HPCN board where the Vitesse PHY doesn't use the delay
for RGMII.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Andy Fleming [Wed, 11 Jul 2007 16:42:35 +0000 (11:42 -0500)]
Fix Vitesse RGMII-ID support
The Vitesse PHY on the 8641D needs to be set up with internal delay to
work in RGMII mode. So we add skew when it is set to RGMII_ID mode.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Haruki Dai <Dai.Haruki@freescale.com>
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Andy Fleming [Tue, 10 Jul 2007 22:28:49 +0000 (17:28 -0500)]
Add phy-connection-type to gianfar nodes
The TSEC/eTSEC automatically detect their PHY interface type, unless
the type is RGMII-ID (RGMII with internal delay). In that situation,
it just detects RGMII. In order to fix this, we need to pass in rgmii-id
if that is the connection type.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Andy Fleming [Tue, 10 Jul 2007 21:42:04 +0000 (16:42 -0500)]
Fix Vitesse 824x PHY interrupt acking
The Vitesse 824x PHY doesn't allow an interrupt to be cleared if
the mask bit for that interrupt isn't set. This means that the PHY
Lib's order of handling interrupts (disable, then clear) breaks on this
PHY. However, clearing then disabling the interrupt opens up the code
for a silly race condition. So rather than change the PHY Lib, we change
the Vitesse driver so it always clears interrupts before disabling them.
Further, the ack function only clears the interrupt if interrupts are
enabled.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: York Sun <yorksun@freescale.com>
Acked-by: Haiying Wang <Haiying.Wang@freescale.com>
Florin Malita [Wed, 18 Jul 2007 22:04:46 +0000 (18:04 -0400)]
ISDN HiSax: uninitialized return in hisax_cs_setup
Coverity (1792) spotted a possibly uninitialized return value in case of
kmalloc() failure:
1116 static int hisax_cs_setup(int cardnr, struct IsdnCard *card,
1117 struct IsdnCardState *cs)
1119 int ret;
1120
1121 if (!(cs->rcvbuf = kmalloc(MAX_DFRAME_LEN_L1, GFP_ATOMIC))) {
1122 printk(KERN_WARNING "HiSax: No memory for isac rcvbuf\n");
1123 ll_unload(cs);
1124 goto outf_cs;
...
1165 outf_cs:
1166 kfree(cs);
1167 card->cs = NULL;
1168 return ret;
The straightforward solution would be to just add the missing
initialization but hardcoding the return value in the out_cs branch
(only taken on failure) seems to work just as well and it allows killing
a couple of other lines too.
Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stefan Richter [Mon, 16 Jul 2007 19:05:41 +0000 (21:05 +0200)]
firewire: fw-sbp2: convert to new SCSI data buffer accessors
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Stefan Richter [Tue, 17 Jul 2007 00:15:36 +0000 (02:15 +0200)]
firewire: fix memory leak of fw_request instances
Found and debugged by Jay Fenlason <fenlason@redhat.com>.
The bug was especially noticeable with direct I/O over fw-sbp2.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Stefan Richter [Tue, 17 Jul 2007 00:13:48 +0000 (02:13 +0200)]
firewire: remove bogus check in fw_core_handle_request
This check is bogus:
- Maximum asynchronous payload size for S800...S3200 is 4096.
- The p->payload_length is totally uninteresting. Only the
request->length of the subsequently allocated and initialized
struct fw_request is of significance.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Stefan Richter [Thu, 12 Jul 2007 20:25:14 +0000 (22:25 +0200)]
firewire: fw-ohci: fix "scheduling while atomic"
context_stop is called by bus_reset_tasklet, among else.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Stefan Richter [Thu, 12 Jul 2007 20:24:19 +0000 (22:24 +0200)]
firewire: fw-ohci: flush MMIO write before msleep
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Roland Dreier [Wed, 18 Jul 2007 20:28:29 +0000 (13:28 -0700)]
IB/mthca: Simplify use of size0 in work request posting
Current code sets size0 to 0 at the start of work request posting
functions and then handles size0 == 0 specially within the loop over
work requests. Change this so size0 is set along with f0 the first
time through the loop (when nreq == 0). This makes the code easier to
understand by making it clearer that f0 and size0 are always
initialized if nreq != 0 without having to know that size0 == 0
implies nreq == 0.
Also annotate size0 with uninitialized_var() so that this doesn't
introduce a new compiler warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Wed, 18 Jul 2007 20:21:14 +0000 (13:21 -0700)]
IB/mthca: Factor out setting WQE UD segment entries
Factor code to set UD entries out of the work request posting
functions into inline functions set_tavor_ud_seg() and
set_arbel_ud_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Wed, 18 Jul 2007 19:57:52 +0000 (12:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hskinnemoen/avr32-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
[AVR32] Initialize phy_mask for both macb devices
[AVR32] Fix atomic_add_unless() and atomic_sub_unless()
[AVR32] Correct misspelled CONFIG_BLK_DEV_INITRD variable.
[AVR32] Fix build error in parse_tag_rdimg()
[AVR32] Don't wire up macb0 unless SW6 is in default position
[AVR32] Wire up SSC platform device 0 as TX on ATSTK1000 board
[AVR32] Add Atmel SSC driver platform device to AT32AP architecture
[AVR32] Remove optimization of unaligned word loads
[AVR32] Make STK1000 mux settings configurable
[AVR32] CPU frequency scaling for AT32AP
[AVR32] Split SM device into PM, RTC, WDT and EIC
[AVR32] faster avr32 unaligned access
Roland Dreier [Wed, 18 Jul 2007 19:55:42 +0000 (12:55 -0700)]
IB/mthca: Factor out setting WQE remote address and atomic segment entries
Factor code to set remote address and atomic segment entries out of the
work request posting functions into inline functions set_raddr_seg()
and set_atomic_seg(). This doesn't change the generated code in any
significant way, and makes the source easier on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Wed, 18 Jul 2007 19:13:02 +0000 (12:13 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hpa/linux-2.6-x86setup
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
[PATCH] x86: do not recompile boot for each build
[x86 setup] Save/restore DS around invocations of INT 10h
[x86 setup] VGA: Clear the Protect bit before setting the vertical height
[x86 setup] Fix assembly constraints
[x86 setup] build/tools.c: fix comment
[x86 setup] MAINTAINERS: document x86 setup code git tree
Peter Zijlstra [Wed, 18 Jul 2007 18:59:22 +0000 (20:59 +0200)]
i386: fixup TRACE_IRQ breakage
The TRACE_IRQS_ON function in iret_exc: calls a C function without
ensuring that the segments are set properly. Move the trace function and
the enabling of interrupt into the C stub.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Mon, 16 Jul 2007 08:03:16 +0000 (01:03 -0700)]
Handle bogus %cs selector in single-step instruction decoding
The code for LDT segment selectors was not robust in the face of a bogus
selector set in %cs via ptrace before the single-step was done.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland Dreier [Wed, 18 Jul 2007 18:47:55 +0000 (11:47 -0700)]
IB/mlx4: Factor out setting other WQE segments
Factor code to set remote address, atomic and datagram segments out of
mlx4_ib_post_send() into small helper functions. This doesn't change
the generated code in any significant way, and makes the source easier
on the eyes.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Haavard Skinnemoen [Wed, 18 Jul 2007 18:32:49 +0000 (20:32 +0200)]
[AVR32] Initialize phy_mask for both macb devices
The STK1000 uses pullups on the MDIO lines to the PHY, but they are
too weak. This causes the PHY layer to detect PHYs on all possible MII
addresses. Mask out all but the correct address to prevent this from
happening.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Haavard Skinnemoen [Wed, 18 Jul 2007 18:06:04 +0000 (20:06 +0200)]
[AVR32] Fix atomic_add_unless() and atomic_sub_unless()
These functions depend on "result" being initalized to 0, but "result"
is not included as an input constraint to the inline assembly block
following its initialization, only as an output constraint. Thus gcc
thinks it doesn't need to initialize it, so result ends up undefined
if the "unless" condition is true.
This fixes an oops in sunrpc where the faulty atomics caused
rpciod_up() to not start the workqueue as it should.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Robert P. J. Day [Thu, 12 Jul 2007 22:31:08 +0000 (18:31 -0400)]
[AVR32] Correct misspelled CONFIG_BLK_DEV_INITRD variable.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Haavard Skinnemoen [Fri, 13 Jul 2007 09:26:01 +0000 (11:26 +0200)]
[AVR32] Fix build error in parse_tag_rdimg()
This code is inside an #ifdef with a misspelled config symbol, so it
hasn't been used for a long time. Fix it before fixing the config
symbol to keep bisection working.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Roland Dreier [Wed, 18 Jul 2007 18:46:27 +0000 (11:46 -0700)]
IB/mlx4: Factor out setting WQE data segment entries
Factor code to set data segment entries out of mlx4_ib_post_send()
into set_data_seg(). This cleans up the code and lets the compiler do
a better job -- on x86_64:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-16 (-16)
function old new delta
mlx4_ib_post_send 1598 1582 -16
Signed-off-by: Roland Dreier <rolandd@cisco.com>