Len Brown [Thu, 7 Feb 2008 08:19:43 +0000 (03:19 -0500)]
Merge branches 'release' and 'wmi-2.6.25' into release
Len Brown [Thu, 7 Feb 2008 08:18:04 +0000 (03:18 -0500)]
Merge branches 'release' and 'menlo' into release
Conflicts:
drivers/acpi/video.c
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Thu, 7 Feb 2008 08:13:36 +0000 (03:13 -0500)]
Merge branches 'release' and 'stats' into release
Len Brown [Thu, 7 Feb 2008 08:13:13 +0000 (03:13 -0500)]
Merge branches 'release', 'misc' and 'misc-2.6.25' into release
Len Brown [Thu, 7 Feb 2008 08:12:17 +0000 (03:12 -0500)]
Merge branches 'release' and 'ppc-workaround' into release
Len Brown [Thu, 7 Feb 2008 08:11:56 +0000 (03:11 -0500)]
Merge branches 'release' and 'hp-cid' into release
Len Brown [Thu, 7 Feb 2008 08:11:47 +0000 (03:11 -0500)]
Merge branches 'release' and 'gpe-ack' into release
Len Brown [Thu, 7 Feb 2008 08:11:31 +0000 (03:11 -0500)]
Merge branches 'release' and 'dmi' into release
Len Brown [Thu, 7 Feb 2008 08:11:05 +0000 (03:11 -0500)]
Merge branches 'release', 'cpuidle-2.6.25' and 'idle' into release
Len Brown [Thu, 7 Feb 2008 08:09:43 +0000 (03:09 -0500)]
Merge branches 'release', 'bugzilla-6217', 'bugzilla-6629', 'bugzilla-6933', 'bugzilla-7186', 'bugzilla-8269', 'bugzilla-8570', 'bugzilla-9139', 'bugzilla-9277', 'bugzilla-9341', 'bugzilla-9444', 'bugzilla-9614', 'bugzilla-9643' and 'bugzilla-9644' into release
Len Brown [Thu, 7 Feb 2008 08:07:55 +0000 (03:07 -0500)]
Merge branches 'release' and 'autoload' into release
Len Brown [Thu, 7 Feb 2008 08:07:35 +0000 (03:07 -0500)]
Merge branches 'release', 'asus', 'sony-laptop' and 'thinkpad' into release
Len Brown [Thu, 7 Feb 2008 08:07:03 +0000 (03:07 -0500)]
Merge branches 'release', 'acpi_pm_device_sleep_state' and 'battery' into release
venkatesh.pallipadi@intel.com [Fri, 1 Feb 2008 01:35:06 +0000 (17:35 -0800)]
cpuidle: Add a poll_idle method
Add a default poll idle state with 0 latency. Provides an option to users
to use poll_idle by using 0 as the latency requirement.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
venkatesh.pallipadi@intel.com [Fri, 1 Feb 2008 01:35:05 +0000 (17:35 -0800)]
ACPI: cpuidle: Support C1 idle time accounting
Show C1 idle time in /sysfs cpuidle interface. C1 idle time may not
be entirely accurate in all cases. It includes the time spent
in the interrupt handler after wakeup with "hlt" based C1. But, it will
be accurate with "mwait" based C1.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
venkatesh.pallipadi@intel.com [Fri, 1 Feb 2008 01:35:04 +0000 (17:35 -0800)]
ACPI: enable MWAIT for C1 idle
Add MWAIT idle for C1 state instead of halt, on platforms that support
C1 state with MWAIT.
Renames cx->space_id to something more appropriate.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
venkatesh.pallipadi@intel.com [Fri, 1 Feb 2008 01:35:03 +0000 (17:35 -0800)]
ACPI: idle: Fix acpi_safe_halt usages and interrupt enabling/disabling
acpi_safe_halt() needs interrupts to be disabled for atomic
need_resched check and safe halt. Otherwise we may miss an
interrupt and go into halt.
acpi_safe_halt() also does not enable interrupts on all return paths.
So the callers should handle enable and disable interrupts around it.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Pavel Machek [Tue, 5 Feb 2008 18:27:12 +0000 (19:27 +0100)]
PM: documentation cleanups
Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Roel Kluin [Mon, 4 Feb 2008 23:24:56 +0000 (00:24 +0100)]
ACPI: thinkpad-acpi: second TP_EC_FAN_FULLSPEED should be TP_EC_FAN_AUTO
fix bug in safety net for TPEC fan control mode
eaa7571b2d1a08873e4bdd8e6db3431df61cd9ad
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Len Brown <len.brown@intel.com>
Matthias Kaehlcke [Tue, 5 Feb 2008 07:31:20 +0000 (23:31 -0800)]
ACPI: acpi_pci_irq_find_prt_entry(): use list_for_each_entry() instead of list_for_each()
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Miguel Botón [Tue, 5 Feb 2008 07:31:19 +0000 (23:31 -0800)]
ACPI: remove duplicated warning message
Remove duplicated warning message in acpi_power_transition()
ACPI: Transitioning device [%s] to D%d\n
This warning message is printed by acpi_bus_set_power() so we don't
need to print it again.
Signed-off-by: Miguel Botón <mboton@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Luca Tettamanti [Tue, 5 Feb 2008 07:31:18 +0000 (23:31 -0800)]
asus_acpi: add support for F3Sa
Add support for ASUS F3Sa notebook. Features:
- LCD on/off
- Brightness
- Wifi kill
- Bluetooth kill
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Corentin Chary <corentincj@iksaif.net>
Cc: Karol Kozimor <sziwan@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Roel Kluin [Sat, 2 Feb 2008 20:07:38 +0000 (21:07 +0100)]
asus-laptop: add parentheses
'!' has a higher priority than '&': bitanding has no effect.
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Len Brown <len.brown@intel.com>
Corentin CHARY [Wed, 16 Jan 2008 15:56:42 +0000 (16:56 +0100)]
asus-laptop new write_acpi_int
Just a little modification of write_acpi_int
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Wed, 6 Feb 2008 06:26:55 +0000 (01:26 -0500)]
ACPI: create /sys/firmware/acpi/interrupts
See Documentation/ABI/testing/sysfs-firmware-acpi
Based-on-original-patch-by: Luming Yu <luming.yu@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Thu, 7 Feb 2008 02:06:58 +0000 (18:06 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
9p: fix p9_printfcall export
9p: transport API reorganization
9p: add remove function to trans_virtio
9p: Convert semaphore to spinlock for p9_idpool
9p: fix mmap to be read-only
9p: add support for sticky bit
9p: Fix soft lockup in virtio transport
9p: fix bug in attach-per-user
9p: block-based virtio client
9p: create transport rpc cut-thru
9p: fix bug in p9_clone_stat
Andrew Morton [Thu, 7 Feb 2008 01:25:01 +0000 (19:25 -0600)]
9p: fix p9_printfcall export
ERROR: "p9_printfcall" [net/9p/9pnet_virtio.ko] undefined!
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:03 +0000 (19:25 -0600)]
9p: transport API reorganization
This merges the mux.c (including the connection interface) with trans_fd
in preparation for transport API changes. Ultimately, trans_fd will need
to be rewritten to clean it up and simplify the implementation, but this
reorganization is viewed as the first step.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:04 +0000 (19:25 -0600)]
9p: add remove function to trans_virtio
Request from rusty:
Just cleaning up patches for 2.6.25 merge, and noticed that
net/9p/trans_virtio.c doesn't have a remove function. This will crash when
removing the module (console doesn't have one because it can't really be
removed).
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Anthony Liguori [Thu, 7 Feb 2008 01:25:04 +0000 (19:25 -0600)]
9p: Convert semaphore to spinlock for p9_idpool
When booting from v9fs, down_interruptible in p9_idpool_get() triggered a BUG
as it was being called with IRQs disabled. A spinlock seems like the right
thing to be using since the idr functions go out of their way not to sleep.
This patch eliminates the BUG by converting the semaphore to a spinlock.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:05 +0000 (19:25 -0600)]
9p: fix mmap to be read-only
v9fs was allowing writable mmap which could lead to kernel BUG() cases.
This sets the mmap function to generic_file_readonly_mmap which (correctly)
returns an error to applications which open mmap for writing.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Anthony Liguori [Thu, 7 Feb 2008 01:25:06 +0000 (19:25 -0600)]
9p: add support for sticky bit
GDM gets unhappy if /var/gdm doesn't have the sticky bit set. This patch adds
support for the sticky bit in much the same way setuid/setgid is supported.
With this patch, I can launch X from a v9fs rootfs (although I quickly run out
of fds in the server once gnome starts up).
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:07 +0000 (19:25 -0600)]
9p: Fix soft lockup in virtio transport
This fixes a poorly placed spinlock which could result in a
soft lockup condition.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:08 +0000 (19:25 -0600)]
9p: fix bug in attach-per-user
When a new user attached at a directory other than the root, he would end
up in the parent directory of the cwd. This was due to a logic error in
the code which attaches the user at the mount point and walks back to the
cwd. This patch fixes that.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:58 +0000 (19:25 -0600)]
9p: block-based virtio client
This replaces the console-based virto client with a block-based
client using a single request queue.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Eric Van Hensbergen [Thu, 7 Feb 2008 01:25:09 +0000 (19:25 -0600)]
9p: create transport rpc cut-thru
Add a new transport function which allows a cut-thru directly to
the transport instead of processing request through the mux if the
cut-thru exists.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Martin Stava [Tue, 5 Feb 2008 15:27:09 +0000 (09:27 -0600)]
9p: fix bug in p9_clone_stat
This patch fixes a bug in the copying of 9P
stat information where string references
weren't being updated properly.
Signed-off-by: Martin Sava <martin.stava@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Randy Dunlap [Thu, 7 Feb 2008 00:29:55 +0000 (16:29 -0800)]
docbook: dmapool: fix fatal changed filename
Docbook fatal error, file was moved:
docproc: linux-2.6.24-git15/drivers/base/dmapool.c: No such file or directory
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 6 Feb 2008 21:54:09 +0000 (13:54 -0800)]
Merge git://git./linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: fix deadlock, make pgd_lock irq-safe
virtio: fix trivial build bug
x86: fix mttr trimming
x86: delay CPA self-test and repeat it
x86: fix 64-bit sections
generic: add __FINITDATA
x86: remove suprious ifdefs from pageattr.c
x86: mark the .rodata section also NX
x86: fix iret exception recovery on 64-bit
cpuidle: dubious one-bit signed bitfield in cpuidle.h
x86: fix sparse warnings in powernow-k8.c
x86: fix sparse error in traps_32.c
x86: trivial sparse/checkpatch in quirks.c
x86 ptrace: disallow null cs/ss
MAINTAINERS: RDC R-321x SoC maintainer
brk randomization: introduce CONFIG_COMPAT_BRK
brk: check the lower bound properly
x86: remove X2 workaround
x86: make spurious fault handler aware of large mappings
x86: make traps on entry code be debuggable in user space, 64-bit
Ingo Molnar [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: fix deadlock, make pgd_lock irq-safe
lockdep just caught this one:
=================================
[ INFO: inconsistent lock state ]
2.6.24 #38
---------------------------------
inconsistent {in-softirq-W} -> {softirq-on-W} usage.
swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
(pgd_lock){-+..}, at: [<
ffffffff8022a9ea>] mm_init+0x1da/0x250
{in-softirq-W} state was registered at:
[<
ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 394559
hardirqs last enabled at (394559): [<
ffffffff80267f0a>] get_page_from_freelist+0x30a/0x4c0
hardirqs last disabled at (394558): [<
ffffffff80267d25>] get_page_from_freelist+0x125/0x4c0
softirqs last enabled at (393952): [<
ffffffff80232f8e>] __do_softirq+0xce/0xe0
softirqs last disabled at (393945): [<
ffffffff8020c57c>] call_softirq+0x1c/0x30
other info that might help us debug this:
no locks held by swapper/1.
stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.24 #38
Call Trace:
[<
ffffffff8024e1fb>] print_usage_bug+0x18b/0x190
[<
ffffffff8024f55d>] mark_lock+0x53d/0x560
[<
ffffffff8024fffa>] __lock_acquire+0x3ca/0xed0
[<
ffffffff80250ba8>] lock_acquire+0xa8/0xe0
[<
ffffffff8022a9ea>] ? mm_init+0x1da/0x250
[<
ffffffff809bcd10>] _spin_lock+0x30/0x70
[<
ffffffff8022a9ea>] mm_init+0x1da/0x250
[<
ffffffff8022aa99>] mm_alloc+0x39/0x50
[<
ffffffff8028b95a>] bprm_mm_init+0x2a/0x1a0
[<
ffffffff8028d12b>] do_execve+0x7b/0x220
[<
ffffffff80209776>] sys_execve+0x46/0x70
[<
ffffffff8020c214>] kernel_execve+0x64/0xd0
[<
ffffffff8020901e>] ? _stext+0x1e/0x20
[<
ffffffff802090ba>] init_post+0x9a/0xf0
[<
ffffffff809bc5f6>] ? trace_hardirqs_on_thunk+0x35/0x3a
[<
ffffffff8024f75a>] ? trace_hardirqs_on+0xba/0xd0
[<
ffffffff8020c1a8>] ? child_rip+0xa/0x12
[<
ffffffff8020bcbc>] ? restore_args+0x0/0x44
[<
ffffffff8020c19e>] ? child_rip+0x0/0x12
turns out that pgd_lock has been used on 64-bit x86 in an irq-unsafe
way for almost two years, since commit
8c914cb704a11460e.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
virtio: fix trivial build bug
fix build bug:
drivers/virtio/virtio_balloon.c: In function 'fill_balloon':
drivers/virtio/virtio_balloon.c:98: error: implicit declaration of function 'msleep'
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: fix mttr trimming
Pavel Emelyanov reported that his networking card did not work
and bisected it down to:
"
The commit
093af8d7f0ba3c6be1485973508584ef081e9f93
x86_32: trim memory by updating e820
broke my e1000 card: on loading driver says that
e1000: probe of 0000:04:03.0 failed with error -5
and the interface doesn't appear.
"
on a 32-bit kernel, base will overflow when try to do PAGE_SHIFT,
and highest_addr will always less 4G.
So use pfn instead of address to avoid the overflow when more than
4g RAM is installed on a 32-bit kernel.
Many thanks to Pavel Emelyanov for reporting and testing it.
Bisected-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Tested-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: delay CPA self-test and repeat it
delay the CPA self-test so that any impact (corruption) of
user-space pagetables can be triggered. Repeat the test
every 30 seconds.
this would have prevented the bug fixed by
8cb2a7c1e95e472b5,
at its source.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Sam Ravnborg [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: fix 64-bit sections
fix 64-bit section warnings.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
generic: add __FINITDATA
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan van de Ven [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: remove suprious ifdefs from pageattr.c
The .rodata section really should just be read only; the config option
is there to make breaking up the 2Mb page an option (so people whos machines
give more performance for the 2Mb case can opt to do so).
But when the page gets split anyway, this is no longer an issue, so
clean up the code and remove the ifdefs
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan van de Ven [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: mark the .rodata section also NX
The .rodata section shouldn't just be read-only,
but also non-executable. This is free since we've broken
up the 2MB page already anyway.
also update test_nx to check for this.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Wed, 6 Feb 2008 21:39:45 +0000 (22:39 +0100)]
x86: fix iret exception recovery on 64-bit
This change broke recovery of exceptions in iret:
commit
72fe4858544292ad64600765cb78bc02298c6b1c
Author: Glauber de Oliveira Costa <gcosta@redhat.com>
x86: replace privileged instructions with paravirt macros
The ENTRY(native_iret) macro adds alignment padding before the iretq
instruction, so "iret_label" no longer points exactly at the instruction.
It was sloppy to leave the old "iret_label" label behind when replacing
its nearby use. Removing it would have revealed the other use of the
label later in the file, and upon noticing that use, anyone exercising
the minimum of attention to detail expected of anyone touching this
subtle code would realize it needed to change as well.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Harvey Harrison [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
cpuidle: dubious one-bit signed bitfield in cpuidle.h
fix these sparse warnings:
CHECK arch/x86/kernel/acpi/cstate.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
CHECK arch/x86/kernel/acpi/processor.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
CHECK arch/x86/kernel/cpu/cpufreq/powernow-k7.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
CHECK arch/x86/kernel/cpu/cpufreq/powernow-k8.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
CHECK arch/x86/kernel/cpu/cpufreq/longhaul.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
CHECK arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Harvey Harrison [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
x86: fix sparse warnings in powernow-k8.c
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:830:7: warning: symbol 'hi' shadows an earlier one
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:824:6: originally declared here
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:830:15: warning: symbol 'lo' shadows an earlier one
arch/x86/kernel/cpu/cpufreq/powernow-k8.c:824:14: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Harvey Harrison [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
x86: fix sparse error in traps_32.c
This was being used to ensure the proper alignment of the FXSAVE/FXRSTOR data.
This would create a sparse error in the _correct_ cases, hiding further
warnings. Use BUILD_BUG_ON instead.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Harvey Harrison [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
x86: trivial sparse/checkpatch in quirks.c
arch/x86/kernel/quirks.c:384:3: warning: returning void-valued expression
arch/x86/kernel/quirks.c:387:3: warning: returning void-valued expression
arch/x86/kernel/quirks.c:390:3: warning: returning void-valued expression
arch/x86/kernel/quirks.c:393:3: warning: returning void-valued expression
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
x86 ptrace: disallow null cs/ss
In my revamp of the x86 ptrace code for setting register values,
I accidentally omitted a check that was there in the old code.
Allowing %cs to be 0 causes a bad crash in recovery from iret failure.
This patch fixes that regression against 2.6.24, and adds a comment
that should help prevent this subtlety from being overlooked again.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Florian Fainelli [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
MAINTAINERS: RDC R-321x SoC maintainer
Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
brk randomization: introduce CONFIG_COMPAT_BRK
based on similar patch from: Pavel Machek <pavel@ucw.cz>
Introduce CONFIG_COMPAT_BRK. If disabled then the kernel is free
(but not obliged to) randomize the brk area.
Heap randomization breaks ancient binaries, so we keep COMPAT_BRK
enabled by default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jiri Kosina [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
brk: check the lower bound properly
There is a check in sys_brk(), that tries to make sure that we do not
underflow the area that is dedicated to brk heap.
The check is however wrong, as it assumes that brk area starts immediately
after the end of the code (+bss), which is wrong for example in
environments with randomized brk start. The proper way is to check whether
the address is not below the start_brk address.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 6 Feb 2008 21:39:44 +0000 (22:39 +0100)]
x86: remove X2 workaround
With the spurious handler fix, the X2 does not lock up anymore.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Wed, 6 Feb 2008 21:39:43 +0000 (22:39 +0100)]
x86: make spurious fault handler aware of large mappings
In very rare cases, on certain CPUs, we could end up in the spurious
fault handler and ignore a large pud/pmd mapping. The resulting pte
pointer points into the mapped physical space and dereferencing it
will fault recursively.
Make the code aware of large mappings and do the permission check
on the pmd/pud entry, when a large pud/pmd mapping is detected.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Wed, 6 Feb 2008 21:39:43 +0000 (22:39 +0100)]
x86: make traps on entry code be debuggable in user space, 64-bit
Unify the x86-64 behavior for 32-bit processes that set
bogus %cs/%ss values (the only ones that can fault in iret)
match what the native i386 behavior is. (do not kill the task
via do_exit but generate a SIGSEGV signal)
[ tglx@linutronix.de: build fix ]
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Wed, 6 Feb 2008 19:16:11 +0000 (11:16 -0800)]
Merge branch 'async-tx-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop into fix
* 'async-tx-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop:
async_tx: allow architecture specific async_tx_find_channel implementations
async_tx: replace 'int_en' with operation preparation flags
async_tx: kill tx_set_src and tx_set_dest methods
async_tx: kill ASYNC_TX_ASSUME_COHERENT
iop-adma: use LIST_HEAD instead of LIST_HEAD_INIT
async_tx: use LIST_HEAD instead of LIST_HEAD_INIT
async_tx: fix compile breakage, mark do_async_xor __always_inline
Daniel Walker [Wed, 6 Feb 2008 14:50:37 +0000 (06:50 -0800)]
scsi: megaraid: trivial drop duplicate mutex.h include
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 6 Feb 2008 18:48:34 +0000 (10:48 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
SELinux: Remove security_get_policycaps()
security: allow Kconfig to set default mmap_min_addr protection
Linus Torvalds [Wed, 6 Feb 2008 18:47:46 +0000 (10:47 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata_piix.c:piix_init_one() must be __devinit
sata_via.c: Remove missleading comment.
libata-core: unblacklist HITACHI drives
sata_nv: fix ATAPI issues with memory over 4GB (v7)
ata: drivers/ata/sata_mv.c needs dmapool.h
libata: kill now unused n_iter and fix sata_fsl
ahci: fix CAP.NP and PI handling
sata_mv: Support SoC controllers
Rename: linux/pata_platform.h to linux/ata_platform.h
Linus Torvalds [Wed, 6 Feb 2008 18:47:18 +0000 (10:47 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits)
virtio net: fix oops on interface-up
Fix PHY Lib support for gianfar and ucc_geth
forcedeth: preserve registers
forcedeth: phy status fix
forcedeth: restart tx/rx
ipvs: Make wrr "no available servers" error message rate-limited
[PPPOL2TP]: Label unused warning when CONFIG_PROC_FS is not set.
[NET_SCHED]: cls_flow: support classification based on VLAN tag
[VLAN]: Constify skb argument to vlan_get_tag()
[NET_SCHED]: cls_flow: fix key mask validity check
[NET_SCHED]: em_meta: fix compile warning
b43: Fix DMA for 30/32-bit DMA engines
b43: fix build with CONFIG_SSB_PCIHOST=n
mac80211: Is not EXPERIMENTAL anymore
iwl3945-base.c: fix off-by-one errors
b43legacy: fix DMA slot resource leakage
b43legacy: drop packets we are not able to encrypt
b43legacy: fix suspend/resume
b43legacy: fix PIO crash
Generic HDLC - use random_ether_addr()
...
Linus Torvalds [Wed, 6 Feb 2008 18:46:58 +0000 (10:46 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Temporarily remove IOMMU merging code.
[SPARC64]: Update defconfig.
[SPARC]: Add new timerfd syscall entries.
Anton Vorontsov [Wed, 6 Feb 2008 09:40:23 +0000 (01:40 -0800)]
fb: fix warning: no return statement in function returning non-void
Warning is reproducible with selected FB_CFB_REV_PIXELS_IN_BYTE.
CC drivers/video/sysfillrect.o
In file included from drivers/video/sysfillrect.c:18:
drivers/video/fb_draw.h: In function `fb_rev_pixels_in_long':
drivers/video/fb_draw.h:94: warning: no return statement in function returning non-void
CC drivers/video/syscopyarea.o
In file included from drivers/video/syscopyarea.c:22:
drivers/video/fb_draw.h: In function `fb_rev_pixels_in_long':
drivers/video/fb_draw.h:94: warning: no return statement in function returning non-void
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johann Felix Soden [Wed, 6 Feb 2008 09:40:22 +0000 (01:40 -0800)]
virtio: add missing #include <linux/delay.h>
Include linux/delay.h to fix compiler error:
drivers/virtio/virtio_balloon.c: In function 'fill_balloon':
drivers/virtio/virtio_balloon.c:98: error: implicit declaration of function 'msleep'
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Wed, 6 Feb 2008 09:40:21 +0000 (01:40 -0800)]
ext3: fix lock inversion in direct IO
We cannot start transaction in ext3_direct_IO() and just let it last during
the whole write because dio_get_page() acquires mmap_sem which ranks above
transaction start (e.g. because we have dependency chain
mmap_sem->PageLock->journal_start, or because we update atime while holding
mmap_sem) and thus deadlocks could happen. We solve the problem by
starting a transaction separately for each ext3_get_block() call.
We *could* have a problem that we allocate a block and before its data are
written out the machine crashes and thus we expose stale data. But that
does not happen because for hole-filling generic code falls back to
buffered writes and for file extension, we add inode to orphan list and
thus in case of crash, journal replay will truncate inode back to the
original size.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Olaf Hering [Wed, 6 Feb 2008 09:40:19 +0000 (01:40 -0800)]
jbd.h: hide kernel only code
Move a few kernel-only things into __KERNEL__.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mariusz Kozlowski [Wed, 6 Feb 2008 09:40:18 +0000 (01:40 -0800)]
ext3: remove unused code from ext3_find_entry()
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 6 Feb 2008 09:40:17 +0000 (01:40 -0800)]
ext[234]: cleanup ext[234]_bg_num_gdb()
Use ext[234]_bg_has_super() to remove duplicate code.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 6 Feb 2008 09:40:16 +0000 (01:40 -0800)]
ext[234]: remove unused argument for ext[234]_find_goal()
The argument chain for ext[234]_find_goal() is not used. This patch removes
it and fixes comment as well.
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 6 Feb 2008 09:40:16 +0000 (01:40 -0800)]
ext[234]: use ext[234]_get_group_desc()
Use ext[234]_get_group_desc() to get group descriptor from group number.
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 6 Feb 2008 09:40:15 +0000 (01:40 -0800)]
ext[234]: fix comment for nonexistent variable
The comment in ext[234]_new_blocks() describes about "i". But there is no
local variable called "i" in that scope. I guess it has been renamed to
group_no.
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Wed, 6 Feb 2008 09:40:14 +0000 (01:40 -0800)]
ext3: change the default behaviour on error
ext3 file system was by default ignoring errors and continuing. This is
not a good default as continuing on error could lead to file system
corruption. Change the default to mark the file system readonly. Debian
and ubuntu already does this as the default in their fstab.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Wed, 6 Feb 2008 09:40:13 +0000 (01:40 -0800)]
ext3: return after ext3_error in case of failures
This fixes some instances where we were continuing after calling
ext3_error. ext3_error calls panic only if errors=panic mount option is
set. So we need to make sure we return correctly after ext3_error call
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Wed, 6 Feb 2008 09:40:12 +0000 (01:40 -0800)]
make jbd/journal.c:__journal_abort_hard() static
__journal_abort_hard() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 6 Feb 2008 09:40:11 +0000 (01:40 -0800)]
BKL-removal: remove incorrect comment refering to lock_kernel() from jbd/jbd2
None of the callers of this function does actually take the BKL as far as I
can see. So remove the comment refering to the BKL.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 6 Feb 2008 09:40:11 +0000 (01:40 -0800)]
BKL-removal: remove incorrect BKL comment in ext2
No BKL used anywhere, so don't mention it.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 6 Feb 2008 09:40:10 +0000 (01:40 -0800)]
BKL-removal: convert ext2 over to use unlocked_ioctl
I checked ext2_ioctl and could not find anything in there that would need the
BKL. So convert it over to use unlocked_ioctl
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Wed, 6 Feb 2008 09:40:09 +0000 (01:40 -0800)]
ext3: add block bitmap validation
When a new block bitmap is read from disk in read_block_bitmap() there are a
few bits that should ALWAYS be set. In particular, the blocks given
corresponding to block bitmap, inode bitmap and inode tables. Validate the
block bitmap against these blocks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Wed, 6 Feb 2008 09:40:08 +0000 (01:40 -0800)]
ext2: add block bitmap validation
When a new block bitmap is read from disk in read_block_bitmap() there are a
few bits that should ALWAYS be set. In particular, the blocks given
corresponding to block bitmap, inode bitmap and inode tables. Validate the
block bitmap against these blocks.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Wed, 6 Feb 2008 09:40:08 +0000 (01:40 -0800)]
PNP: disable Supermicro H8DCE motherboard resources that overlap SATA BARs
Some Supermicro BIOSes describe a SATA PCI BAR as a motherboard resource.
The PNP system driver claims motherboard resources, and this prevents the
sata_nv driver from requesting it later.
This patch disables the PNP0C01/PNP0C02 resources so they won't be claimed
by the PNP system driver, so they'll available for sata_nv.
This fixes the bugs below, where sata_nv detects only two out of four SATA
drives. The signature includes dmesg lines similar to these:
pnp: 00:09: iomem range 0xdfefc000-0xdfefcfff has been reserved
pnp: 00:09: iomem range 0xdfefd000-0xdfefd3ff has been reserved
pnp: 00:09: iomem range 0xdfefe000-0xdfefe3ff has been reserved
PCI: Unable to reserve mem region #6:1000@
dfefd000 for device 0000:80:07.0
sata_nv: probe of 0000:80:07.0 failed with error -16
PCI: Unable to reserve mem region #6:1000@
dfefe000 for device 0000:80:08.0
sata_nv: probe of 0000:80:08.0 failed with error -16
References:
https://bugzilla.redhat.com/show_bug.cgi?id=280641
https://bugzilla.redhat.com/show_bug.cgi?id=313491
http://lkml.org/lkml/2008/1/9/449
http://thread.gmane.org/gmane.linux.acpi.devel/27312
This is post-2.6.24 material.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rene Herman [Wed, 6 Feb 2008 09:40:05 +0000 (01:40 -0800)]
PNP: do not test PNP_DRIVER_RES_DO_NOT_CHANGE on suspend/resume
The PNP_DRIVER_RES_DO_NOT_CHANGE flag is meant to signify that the PNP core
should not change resources for the device -- not that it shouldn't
disable/enable the device on suspend/resume.
ALSA ISAPnP drivers set PNP_DRIVER_RES_DO_NOT_CHANAGE (0x0001) through
setting PNP_DRIVER_RES_DISABLE (0x0003). The latter including the former
may in itself be considered rather unexpected but doesn't change that
suspend/resume wouldn't seem to have any business testing the flag.
As reported by Ondrej Zary for snd-cs4236, ALSA driven ISAPnP cards don't
survive swsusp hibernation with the resume skipping setting the resources
due to testing the flag -- the same test in the suspend path isn't enough
to keep hibernation from disabling the card it seems.
These tests were added (in 2005) by Piere Ossman in commit
68094e3251a664ee1389fcf179497237cbf78331, "alsa: Improved PnP suspend
support" who doesn't remember why. This deletes them.
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Tested-by: Ondrej Zary <linux@rainbow-software.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Walker [Wed, 6 Feb 2008 09:40:04 +0000 (01:40 -0800)]
isapnp driver semaphore to mutex
Changed the isapnp semaphore to a mutex.
[akpm@linux-foundation.org: no externs-in-c]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Renninger [Wed, 6 Feb 2008 09:40:03 +0000 (01:40 -0800)]
pnp: declare PNP option parsing functions as __init
There are three kind of parse functions provided by PNP acpi/bios:
- get current resources
- set resources
- get possible resources
The first two may be needed later at runtime.
The possible resource settings should never change dynamically.
And even if this would make any sense (I doubt it), the current implementation
only parses possible resource settings at early init time:
-> declare all the option parsing __init
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-By: Rene Herman <rene.herman@gmail.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Wed, 6 Feb 2008 09:40:02 +0000 (01:40 -0800)]
simplify pnp_activate_dev() and pnp_disable_dev() return values
Make pnp_activate_dev() and pnp_disable_dev() return only 0 (success) or a
negative error value, as pci_enable_device() and pci_disable_device() do.
Previously they returned:
0: device was already active (or disabled)
1: we just activated (or disabled) device
<0: -EBUSY or error from pnp_start_dev() (or pnp_stop_dev())
Now we return only 0 (device is active or disabled) or <0 (error).
All in-tree callers either ignore the return values or check only for
errors (negative values).
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:40:00 +0000 (01:40 -0800)]
md: fix an occasional deadlock in raid5
raid5's 'make_request' function calls generic_make_request on underlying
devices and if we run out of stripe heads, it could end up waiting for one of
those requests to complete. This is bad as recursive calls to
generic_make_request go on a queue and are not even attempted until
make_request completes.
So: don't make any generic_make_request calls in raid5 make_request until all
waiting has been done. We do this by simply setting STRIPE_HANDLE instead of
calling handle_stripe().
If we need more stripe_heads, raid5d will get called to process the pending
stripe_heads which will call generic_make_request from a
This change by itself causes a performance hit. So add a change so that
raid5_activate_delayed is only called at unplug time, never in raid5. This
seems to bring back the performance numbers. Calling it in raid5d was
sometimes too soon...
Neil said:
How about we queue it for 2.6.25-rc1 and then about when -rc2 comes out,
we queue it for 2.6.24.y?
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Tested-by: dean gaudet <dean@arctic.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:59 +0000 (01:39 -0800)]
md: change ITERATE_RDEV_GENERIC to rdev_for_each_list, and remove ITERATE_RDEV_PENDING.
Finish ITERATE_ to for_each conversion.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:59 +0000 (01:39 -0800)]
md: change ITERATE_RDEV to rdev_for_each
As this is more in line with common practice in the kernel. Also swap the
args around to be more like list_for_each.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:58 +0000 (01:39 -0800)]
md: change INTERATE_MDDEV to for_each_mddev
As this is more consistent with kernel style.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:57 +0000 (01:39 -0800)]
md: change a few 'int' to 'size_t' in md
As suggested by Andrew Morton.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:56 +0000 (01:39 -0800)]
md: fix use-after-free bug when dropping an rdev from an md array
Due to possible deadlock issues we need to use a schedule work to kobject_del
an 'rdev' object from a different thread.
A recent change means that kobject_add no longer gets a refernce, and
kobject_del doesn't put a reference. Consequently, we need to explicitly hold
a reference to ensure that the last reference isn't dropped before the
scheduled work get a chance to call kobject_del.
Also, rename delayed_delete to md_delayed_delete to that it is more obvious in
a stack trace which code is to blame.
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:55 +0000 (01:39 -0800)]
md: allow an md array to appear with 0 drives if it has external metadata
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:55 +0000 (01:39 -0800)]
md: lock address when changing attributes of component devices
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:54 +0000 (01:39 -0800)]
md: allow devices to be shared between md arrays
Currently, a given device is "claimed" by a particular array so that it cannot
be used by other arrays.
This is not ideal for DDF and other metadata schemes which have their own
partitioning concept.
So for externally managed metadata, just claim the device for md in general,
require that "offset" and "size" are set properly for each device, and make
sure that if a device is included in different arrays then the active sections
do not overlap.
This involves adding another flag to the rdev which makes it awkward to set
"->flags = 0" to clear certain flags. So now clear flags explicitly by name
when we want to clear things.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:53 +0000 (01:39 -0800)]
md: set and test the ->persistent flag for md devices more consistently
If you try to start an array for which the number of raid disks is listed as
zero, md will currently try to read metadata off any devices that have been
given. This was done because the value of raid_disks is used to signal
whether array details have been provided by userspace (raid_disks > 0) or must
be read from the devices (raid_disks == 0).
However for an array without persistent metadata (or with externally managed
metadata) this is the wrong thing to do. So we add a test in do_md_run to
give an error if raid_disks is zero for non-persistent arrays.
This requires that mddev->persistent is set corrently at this point, which it
currently isn't for in-kernel autodetected arrays.
So set ->persistent for autodetect arrays, and remove the settign in
super_*_validate which is now redundant.
Also clear ->persistent when stopping an array so it is consistently zero when
starting an array.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:52 +0000 (01:39 -0800)]
md: allow a maximum extent to be set for resyncing
This allows userspace to control resync/reshape progress and synchronise it
with other activities, such as shared access in a SAN, or backing up critical
sections during a tricky reshape.
Writing a number of sectors (which must be a multiple of the chunk size if
such is meaningful) causes a resync to pause when it gets to that point.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:51 +0000 (01:39 -0800)]
md: give userspace control over removing failed devices when external metdata in use
When a device fails, we must not allow an further writes to the array until
the device failure has been recorded in array metadata. When metadata is
managed externally, this requires some synchronisation...
Allow/require userspace to explicitly remove failed devices from active
service in the array by writing 'none' to the 'slot' attribute. If this
reduces the number of failed devices to 0, the write block will automatically
be lowered.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Wed, 6 Feb 2008 09:39:51 +0000 (01:39 -0800)]
md: support 'external' metadata for md arrays
- Add a state flag 'external' to indicate that the metadata is managed
externally (by user-space) so important changes need to be
left of user-space to handle.
Alternates are non-persistant ('none') where there is no stable metadata -
after the array is stopped there is no record of it's status - and
internal which can be version 0.90 or version 1.x
These are selected by writing to the 'metadata' attribute.
- move the updating of superblocks (sync_sbs) to after we have checked if
there are any superblocks or not.
- New array state 'write_pending'. This means that the metadata records
the array as 'clean', but a write has been requested, so the metadata has
to be updated to record a 'dirty' array before the write can continue.
This change is reported to md by writing 'active' to the array_state
attribute.
- tidy up marking of sb_dirty:
- don't set sb_dirty when resync finishes as md_check_recovery
calls md_update_sb when the sync thread finishes anyway.
- Don't set sb_dirty in multipath_run as the array might not be dirty.
- don't mark superblock dirty when switching to 'clean' if there
is no internal superblock (if external, userspace can choose to
update the superblock whenever it chooses to).
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>