Alexander van Heukelum [Wed, 17 Jun 2009 22:35:58 +0000 (00:35 +0200)]
i386: fix/simplify espfix stack switching, move it into assembly
The espfix code triggers if we have a protected mode userspace
application with a 16-bit stack. On returning to userspace, with iret,
the CPU doesn't restore the high word of the stack pointer. This is an
"official" bug, and the work-around used in the kernel is to temporarily
switch to a 32-bit stack segment/pointer pair where the high word of the
pointer is equal to the high word of the userspace stackpointer.
The current implementation uses THREAD_SIZE to determine the cut-off,
but there is no good reason not to use the more natural 64kb... However,
implementing this by simply substituting THREAD_SIZE with 65536 in
patch_espfix_desc crashed the test application. patch_espfix_desc tries
to do what is described above, but gets it subtly wrong if the userspace
stack pointer is just below a multiple of THREAD_SIZE: an overflow
occurs to bit 13... With a bit of luck, when the kernelspace
stackpointer is just below a 64kb-boundary, the overflow then ripples
trough to bit 16 and userspace will see its stack pointer changed by
65536.
This patch moves all espfix code into entry_32.S. Selecting a 16-bit
cut-off simplifies the code. The game with changing the limit dynamically
is removed too. It complicates matters and I see no value in it. Changing
only the top 16-bit word of ESP is one instruction and it also implies
that only two bytes of the ESPFIX GDT entry need to be changed and this
can be implemented in just a handful simple to understand instructions.
As a side effect, the operation to compute the original ESP from the
ESPFIX ESP and the GDT entry simplifies a bit too, and the remaining
three instructions have been expanded inline in entry_32.S.
impact: can now reliably run userspace with ESP=xxxxfffc on 16-bit
stack segment
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Alexander van Heukelum [Wed, 17 Jun 2009 22:35:57 +0000 (00:35 +0200)]
i386: fix return to 16-bit stack from NMI handler
Returning to a task with a 16-bit stack requires special care: the iret
instruction does not restore the high word of esp in that case. The
espfix code fixes this, but currently is not invoked on NMIs. This means
that a running task gets the upper word of esp clobbered due intervening
NMIs. To reproduce, compile and run the following program with the nmi
watchdog enabled (nmi_watchdog=2 on the command line). Using gdb you can
see that the high bits of esp contain garbage, while the low bits are
still correct.
This patch puts the espfix code back into the NMI code path.
The patch is slightly complicated due to the irqtrace infrastructure not
being NMI-safe. The NMI return path cannot call TRACE_IRQS_IRET.
Otherwise, the tail of the normal iret-code is correct for the nmi code
path too. To be able to share this code-path, the TRACE_IRQS_IRET was
move up a bit. The espfix code exists after the TRACE_IRQS_IRET, but
this code explicitly disables interrupts. This short interrupts-off
section is now not traced anymore. The return-to-kernel path now always
includes the preliminary test to decide if the espfix code should be
called. This is never the case, but doing it this way keeps the patch as
simple as possible and the few extra instructions should not affect
timing in any significant way.
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/ldt.h>
int modify_ldt(int func, void *ptr, unsigned long bytecount)
{
return syscall(SYS_modify_ldt, func, ptr, bytecount);
}
/* this is assumed to be usable */
#define SEGBASEADDR 0x10000
#define SEGLIMIT 0x20000
/* 16-bit segment */
struct user_desc desc = {
.entry_number = 0,
.base_addr = SEGBASEADDR,
.limit = SEGLIMIT,
.seg_32bit = 0,
.contents = 0, /* ??? */
.read_exec_only = 0,
.limit_in_pages = 0,
.seg_not_present = 0,
.useable = 1
};
int main(void)
{
setvbuf(stdout, NULL, _IONBF, 0);
/* map a 64 kb segment */
char *pointer = mmap((void *)SEGBASEADDR, SEGLIMIT+1,
PROT_EXEC|PROT_READ|PROT_WRITE,
MAP_SHARED|MAP_ANONYMOUS, -1, 0);
if (pointer == NULL) {
printf("could not map space\n");
return 0;
}
/* write ldt, new mode */
int err = modify_ldt(0x11, &desc, sizeof(desc));
if (err) {
printf("error modifying ldt: %i\n", err);
return 0;
}
for (int i=0; i<1000; i++) {
asm volatile (
"pusha\n\t"
"mov %ss, %eax\n\t" /* preserve ss:esp */
"mov %esp, %ebp\n\t"
"push $7\n\t" /* index 0, ldt, user mode */
"push $65536-4096\n\t" /* esp */
"lss (%esp), %esp\n\t" /* switch to new stack */
"push %eax\n\t" /* save old ss:esp on new stack */
"push %ebp\n\t"
"add $17*65536, %esp\n\t" /* set high bits */
"mov %esp, %edx\n\t"
"mov $
10000000, %ecx\n\t" /* wait... */
"1: loop 1b\n\t" /* ... a bit */
"cmp %esp, %edx\n\t"
"je 1f\n\t"
"ud2\n\t" /* esp changed inexplicably! */
"1:\n\t"
"sub $17*65536, %esp\n\t" /* restore high bits */
"lss (%esp), %esp\n\t" /* restore old ss:esp */
"popa\n\t");
printf("\rx%ix", i);
}
return 0;
}
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cyrill Gorcunov [Wed, 17 Jun 2009 18:13:22 +0000 (22:13 +0400)]
x86, ioapic: Don't call disconnect_bsp_APIC if no APIC present
Vegard Nossum reported:
[ 503.576724] ACPI: Preparing to enter system sleep state S5
[ 503.710857] Disabling non-boot CPUs ...
[ 503.716853] Power down.
[ 503.717770] ------------[ cut here ]------------
[ 503.717770] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_du)
[ 503.717770] Hardware name: OptiPlex GX100
[ 503.717770] Modules linked in:
[ 503.717770] Pid: 2136, comm: halt Not tainted 2.6.30 #443
[ 503.717770] Call Trace:
[ 503.717770] [<
c154d327>] ? printk+0x18/0x1a
[ 503.717770] [<
c1017358>] ? native_apic_write_dummy+0x38/0x50
[ 503.717770] [<
c10360fc>] warn_slowpath_common+0x6c/0xc0
[ 503.717770] [<
c1017358>] ? native_apic_write_dummy+0x38/0x50
[ 503.717770] [<
c1036165>] warn_slowpath_null+0x15/0x20
[ 503.717770] [<
c1017358>] native_apic_write_dummy+0x38/0x50
[ 503.717770] [<
c1017173>] disconnect_bsp_APIC+0x63/0x100
[ 503.717770] [<
c1019e48>] disable_IO_APIC+0xb8/0xc0
[ 503.717770] [<
c1214231>] ? acpi_power_off+0x0/0x29
[ 503.717770] [<
c1015e55>] native_machine_shutdown+0x65/0x80
[ 503.717770] [<
c1015c36>] native_machine_power_off+0x26/0x30
[ 503.717770] [<
c1015c49>] machine_power_off+0x9/0x10
[ 503.717770] [<
c1046596>] kernel_power_off+0x36/0x40
[ 503.717770] [<
c104680d>] sys_reboot+0xfd/0x1f0
[ 503.717770] [<
c109daa0>] ? perf_swcounter_event+0xb0/0x130
[ 503.717770] [<
c109db7d>] ? perf_counter_task_sched_out+0x5d/0x120
[ 503.717770] [<
c102dfc6>] ? finish_task_switch+0x56/0xd0
[ 503.717770] [<
c154da1e>] ? schedule+0x49e/0xb40
[ 503.717770] [<
c10444b0>] ? sys_kill+0x70/0x160
[ 503.717770] [<
c119d9db>] ? selinux_file_ioctl+0x3b/0x50
[ 503.717770] [<
c10dd443>] ? sys_ioctl+0x63/0x70
[ 503.717770] [<
c1003024>] sysenter_do_call+0x12/0x22
[ 503.717770] ---[ end trace
8157b5d0ed378f15 ]---
|
| That's including this commit:
|
| commit
103428e57be323c3c5545db8ad12667099bc6005
|Author: Cyrill Gorcunov <gorcunov@openvz.org>
|Date: Sun Jun 7 16:48:40 2009 +0400
|
| x86, apic: Fix dummy apic read operation together with broken MP handling
|
If we have apic disabled we don't even switch to APIC mode and do not
calling for connect_bsp_APIC. Though on SMP compiled kernel the
native_machine_shutdown does try to write the apic register anyway.
Fix it with explicit check if we really should touch apic registers.
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <
20090617181322.GG10822@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Huang Weiyi [Sat, 13 Jun 2009 12:21:26 +0000 (20:21 +0800)]
x86: Remove duplicated #include's
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
LKML-Reference: <
1244895686-2348-1-git-send-email-weiyi.huang@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jaswinder Singh Rajput [Wed, 17 Jun 2009 08:41:10 +0000 (14:11 +0530)]
x86: msr.h linux/types.h is only required for __KERNEL__
<linux/types.h> is only required for __KERNEL__ as whole file is covered with it
Also fixed some spacing issues for usr/include/asm-x86/msr.h
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: "H. Peter Anvin" <hpa@kernel.org>
LKML-Reference: <
1245228070.2662.1.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Prarit Bhargava [Wed, 10 Jun 2009 19:41:02 +0000 (12:41 -0700)]
x86: nmi: Add Intel processor 0x6f4 to NMI perfctr1 workaround
Expand Intel NMI perfctr1 workaround to include a Core2 processor stepping
(cpuid family-6, model-f, stepping-4). Resolves a situation where the NMI
would not enable on these processors.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: prarit@redhat.com
Cc: suresh.b.siddha@intel.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jaswinder Singh Rajput [Wed, 10 Jun 2009 19:41:01 +0000 (12:41 -0700)]
x86: apic/io_apic.c: dmar_msi_type should be static
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Figo.zhang [Wed, 17 Jun 2009 14:25:20 +0000 (22:25 +0800)]
x86, io_apic.c: Work around compiler warning
This compiler warning:
arch/x86/kernel/apic/io_apic.c: In function ‘ioapic_write_entry’:
arch/x86/kernel/apic/io_apic.c:466: warning: ‘eu’ is used uninitialized in this function
arch/x86/kernel/apic/io_apic.c:465: note: ‘eu’ was declared here
Is bogus as 'eu' is always initialized. But annotate it away by
initializing the variable, to make it easier for people to notice
real warnings. A compiler that sees through this logic will
optimize away the initialization.
Signed-off-by: Figo.zhang <figo1802@gmail.com>
LKML-Reference: <
1245248720.3312.27.camel@myhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cyrill Gorcunov [Mon, 15 Jun 2009 18:26:33 +0000 (22:26 +0400)]
x86: mce: Don't touch THERMAL_APIC_VECTOR if no active APIC present
If APIC was disabled (for some reason) and as result
it's not even mapped we should not try to enable thermal
interrupts at all.
Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Tested-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <
20090615182633.GA7606@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andi Kleen [Mon, 15 Jun 2009 12:52:01 +0000 (14:52 +0200)]
x86: mce: Handle banks == 0 case in K7 quirk
Vegard Nossum reported:
> I get an MCE-related crash like this in latest linus tree:
>
> [ 0.115341] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> [ 0.116396] CPU: L2 Cache: 512K (64 bytes/line)
> [ 0.120570] mce: CPU supports 0 MCE banks
> [ 0.124870] BUG: unable to handle kernel NULL pointer dereference at
00000000 00000010
> [ 0.128001] IP: [<
ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] PGD 0
> [ 0.128001] Thread overran stack, or stack corrupted
> [ 0.128001] Oops: 0002 [#1] PREEMPT SMP
> [ 0.128001] last sysfs file:
> [ 0.128001] CPU 0
> [ 0.128001] Modules linked in:
> [ 0.128001] Pid: 0, comm: swapper Not tainted 2.6.30 #426
> [ 0.128001] RIP: 0010:[<
ffffffff813b98ad>] [<
ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] RSP: 0018:
ffffffff81595e38 EFLAGS:
00000246
> [ 0.128001] RAX:
0000000000000010 RBX:
ffffffff8158f900 RCX:
0000000000000000
> [ 0.128001] RDX:
0000000000000000 RSI:
00000000000000ff RDI:
0000000000000010
> [ 0.128001] RBP:
ffffffff81595e68 R08:
0000000000000001 R09:
0000000000000000
> [ 0.128001] R10:
0000000000000010 R11:
0000000000000000 R12:
0000000000000000
> [ 0.128001] R13:
00000000ffffffff R14:
0000000000000000 R15:
0000000000000000
> [ 0.128001] FS:
0000000000000000(0000) GS:
ffff880002288000(0000) knlGS:00000
>
00000000000
> [ 0.128001] CS: 0010 DS: 0018 ES: 0018 CR0:
000000008005003b
> [ 0.128001] CR2:
0000000000000010 CR3:
0000000001001000 CR4:
00000000000006b0
> [ 0.128001] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
> [ 0.128001] DR3:
0000000000000000 DR6:
0000000000000000 DR7:
0000000000000000
> [ 0.128001] Process swapper (pid: 0, threadinfo
ffffffff81594000, task ffffff
>
ff8152a4a0)
> [ 0.128001] Stack:
> [ 0.128001]
0000000081595e68 5aa50ed3b4ddbe6e ffffffff8158f900 ffffffff8158f
> 914
> [ 0.128001]
ffffffff8158f948 0000000000000000 ffffffff81595eb8 ffffffff813b8
> 69c
> [ 0.128001]
5aa50ed3b4ddbe6e 00000001078bfbfd 0000062300000800 5aa50ed3b4ddb
> e6e
> [ 0.128001] Call Trace:
> [ 0.128001] [<
ffffffff813b869c>] identify_cpu+0x331/0x392
> [ 0.128001] [<
ffffffff815a1445>] identify_boot_cpu+0x23/0x6e
> [ 0.128001] [<
ffffffff815a14ac>] check_bugs+0x1c/0x60
> [ 0.128001] [<
ffffffff8159c075>] start_kernel+0x403/0x46e
> [ 0.128001] [<
ffffffff8159b2ac>] x86_64_start_reservations+0xac/0xd5
> [ 0.128001] [<
ffffffff8159b3ea>] x86_64_start_kernel+0x115/0x14b
> [ 0.128001] [<
ffffffff8159b140>] ? early_idt_handler+0x0/0x71
This happens on QEMU which reports MCA capability, but no banks.
Without this patch there is a buffer overrun and boot ops because
the code would try to initialize the 0 element of a zero length
kmalloc() buffer.
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <
20090615125200.GD31969@one.firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 17 Jun 2009 06:59:01 +0000 (08:59 +0200)]
Merge branch 'linus' into x86/urgent
Merge reason: pull in latest to fix a bug in it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
H. Peter Anvin [Wed, 17 Jun 2009 00:39:04 +0000 (17:39 -0700)]
x86, boot: use .code16gcc instead of .code16
Use .code16gcc to compile arch/x86/boot/bioscall.S rather than
.code16, since some older versions of binutils can't generate 32-bit
addressing expressions (67 prefixes) in .code16 mode, only in
.code16gcc mode.
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cliff Wickman [Tue, 16 Jun 2009 21:43:40 +0000 (16:43 -0500)]
x86: correct the conversion of EFI memory types
This patch causes all the EFI_RESERVED_TYPE memory reservations to be recorded
in the e820 table as type E820_RESERVED.
(This patch replaces one called 'x86: vendor reserved memory type'.
This version has been discussed a bit with Peter and Yinghai but not given
a final opinion.)
Without this patch EFI_RESERVED_TYPE memory reservations may be
marked usable in the e820 table. There may be a collision between
kernel use and some reserver's use of this memory.
(An example use of this functionality is the UV system, which
will access extremely large areas of memory with a memory engine
that allows a user to address beyond the processor's range. Such
areas are reserved in the EFI table by the BIOS.
Some loaders have a restricted number of entries possible in the e820 table,
hence the need to record the reservations in the unrestricted EFI table.)
The call to do_add_efi_memmap() is only made if "add_efi_memmap" is specified
on the kernel command line.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
H. Peter Anvin [Wed, 10 Jun 2009 01:20:39 +0000 (18:20 -0700)]
x86: cap iomem_resource to addressable physical memory
iomem_resource is by default initialized to -1, which means 64 bits of
physical address space if 64-bit resources are enabled. However, x86
CPUs cannot address 64 bits of physical address space. Thus, we want
to cap the physical address space to what the union of all CPU can
actually address.
Without this patch, we may end up assigning inaccessible values to
uninitialized 64-bit PCI memory resources.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Martin Mares <mj@ucw.cz>
Cc: stable@kernel.org
Linus Torvalds [Tue, 16 Jun 2009 19:11:57 +0000 (12:11 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2/net: Use wait_event() in o2net_send_message_vec()
ocfs2: Adjust rightmost path in ocfs2_add_branch.
ocfs2: fdatasync should skip unimportant metadata writeout
ocfs2: Remove redundant gotos in ocfs2_mount_volume()
ocfs2: Add statistics for the checksum and ecc operations.
ocfs2 patch to track delayed orphan scan timer statistics
ocfs2: timer to queue scan of all orphan slots
ocfs2: Correct ordering of ip_alloc_sem and localloc locks for directories
ocfs2: Fix possible deadlock in quota recovery
ocfs2: Fix possible deadlock with quotas in ocfs2_setattr()
ocfs2: Fix lock inversion in ocfs2_local_read_info()
ocfs2: Fix possible deadlock in ocfs2_global_read_dquot()
ocfs2: update comments in masklog.h
ocfs2: Don't printk the error when listing too many xattrs.
Linus Torvalds [Tue, 16 Jun 2009 19:10:31 +0000 (12:10 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: ctxfi - Fix deadlock with xfi-timer
ALSA: intel8x0 - Fix PCM position craziness
ALSA: usb-audio - rework quirk for TerraTec Aureon USB 5.1 MkII
ASoC: magician: fix PXA SSP clock polarity
ASoC: Instantiate any forgotten DAPM widgets
ASoC: Revert duplicated code in SSM2602 driver
ALSA: hda - Add quirk for Acer Aspire 6935G
ALSA: ctxfi - Replace atc lock to mutex
ASoC: Remove odd bit clock ratios for WM8903
Linus Torvalds [Tue, 16 Jun 2009 19:03:43 +0000 (12:03 -0700)]
Merge branch 'serial'
* serial:
imx: Check for NULL pointer deref before calling tty_encode_baud_rate
atmel_serial: fix hang in set_termios when crtscts is enabled
MAINTAINERS: update 8250 section, give Alan Cox a name
tty: fix sanity check
pty: Narrow the race on ldisc locking
tty: fix unused warning when TCGETX is not defined
ldisc: debug aids
ldisc: Make sure the ldisc isn't active when we close it
tty: Fix leaks introduced by the shift to separate ldisc objects
Fix conflicts in drivers/char/pty.c due to earlier version of the ldisc
race narrowing.
Sascha Hauer [Tue, 16 Jun 2009 16:02:15 +0000 (17:02 +0100)]
imx: Check for NULL pointer deref before calling tty_encode_baud_rate
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Haavard Skinnemoen [Tue, 16 Jun 2009 16:02:03 +0000 (17:02 +0100)]
atmel_serial: fix hang in set_termios when crtscts is enabled
After enabling hardware flow control, any subsequent termios call may hang
waiting for the transmitter to drain. This appears to be caused by a
busy-loop in set_termios() waiting for the transmitter to become empty,
which may take a very long time (or hang indefinitely) if the device at
the other end is blocking us.
A quick look through the tty and serial_core code indicates that any
necessary flushing (which is optional) has already been done at this
point, so there's no need for the driver to flush the transmitter on its
own.
Fix it by removing the busy-loop altogether.
Tested-by: Eirik Aanonsen <eaa@wprmedical.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 16:01:52 +0000 (17:01 +0100)]
MAINTAINERS: update 8250 section, give Alan Cox a name
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 16:01:33 +0000 (17:01 +0100)]
tty: fix sanity check
The WARN_ON() that was added to tty_reopen can be triggered in the specific
case of a hangup occurring during a re-open of a tty which is not in the
middle of being otherwise closed.
In that case however the WARN() is bogus as we don't hold the neccessary
locks to make a correct decision.
The case we should be checking is "if the ldisc is not changing and reopen
is occuring". We could drop the WARN_ON but for the moment the debug is more
valuable even if it means taking a mutex as it will find any other cases.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 16:01:13 +0000 (17:01 +0100)]
pty: Narrow the race on ldisc locking
The pty code has always been buggy on its ldisc handling. The recent
changes made the window for the race much bigger. Pending fixing it
properly which is not at all trivial, at least make the race small again so
we don't disrupt other dev work.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 16 Jun 2009 16:01:02 +0000 (17:01 +0100)]
tty: fix unused warning when TCGETX is not defined
If TCGETX is not defined, we end up with this warning:
drivers/char/tty_ioctl.c: In function ‘tty_mode_ioctl’:
drivers/char/tty_ioctl.c:950: warning: unused variable ‘ktermx’
Since the variable is only used in one case statement, push it down to
the local case scope.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 16:00:53 +0000 (17:00 +0100)]
ldisc: debug aids
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 16:00:40 +0000 (17:00 +0100)]
ldisc: Make sure the ldisc isn't active when we close it
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 16:00:26 +0000 (17:00 +0100)]
tty: Fix leaks introduced by the shift to separate ldisc objects
Gold star for the kmemleak detector.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 16 Jun 2009 18:52:41 +0000 (11:52 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (30 commits)
sparc64: Update defconfig.
sparc: Wire up sys_rt_tgsigqueueinfo().
openprom: Squelch useless GCC warning.
sparc: replace uses of CPU_MASK_ALL_PTR
sparc64: Add proper dynamic ftrace support.
sparc: Simplify code using is_power_of_2() routine.
sparc: move of_device common code to of_device_common
sparc: remove dma-mapping_{32|64}.h
sparc: use dma_map_page instead of dma_map_single
sparc: add sync_single_for_device and sync_sg_for_device to struct dma_ops
sparc: move the duplication in dma-mapping_{32|64}.h to dma-mapping.h
p9100: use standard fields for framebuffer physical address and length
leo: use standard fields for framebuffer physical address and length
cg6: use standard fields for framebuffer physical address and length
cg3: use standard fields for framebuffer physical address and length
cg14: use standard fields for framebuffer physical address and length
bw2: use standard fields for framebuffer physical address and length
sparc64: fix and optimize irq distribution
sparc64: Use new dynamic per-cpu allocator.
sparc64: Only allocate per-cpu areas for possible cpus.
...
Linus Torvalds [Tue, 16 Jun 2009 18:49:58 +0000 (11:49 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/vapier/blackfin
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin: (27 commits)
Blackfin: hook up new rt_tgsigqueueinfo syscall
Blackfin: improve CLKIN_HZ config default
Blackfin: initial support for ftrace grapher
Blackfin: initial support for ftrace
Blackfin: enable support for LOCKDEP
Blackfin: add preliminary support for STACKTRACE
Blackfin: move custom sections into sections.h
Blackfin: punt unused/wrong mutex-dec.h
Blackfin: add support for irqflags
Blackfin: add support for bzip2/lzma compressed kernel images
Blackfin: convert Kconfig style to def_bool
Blackfin: bf548-ezkit: update smsc911x resources
Blackfin: update aedos-ipipe code to upstream 1.10-00
Blackfin: bf537-stamp: update ADP5520 resources
Blackfin: bf518f-ezbrd: fix SPI CS for SPI flash
Blackfin: define SPI IRQ in board resources
Blackfin: do not configure the UART early if on wrong processor
Blackfin: fix deadlock in SMP IPI handler
Blackfin: fix flag storage for irq funcs
Blackfin: push down exception oops checking
...
Linus Torvalds [Tue, 16 Jun 2009 18:48:13 +0000 (11:48 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (33 commits)
[S390] s390: hibernation support for s390
[S390] pm: dcssblk power management callbacks.
[S390] pm: monreader power management callbacks.
[S390] pm: monwriter power management callbacks.
[S390] pm: memory hotplug power management callbacks
[S390] pm: con3270 power management callbacks.
[S390] pm: smsgiucv power management callbacks.
[S390] pm: hvc_iucv power management callbacks
[S390] PM: af_iucv power management callbacks.
[S390] pm: netiucv power management callbacks.
[S390] pm: iucv power management callbacks.
[S390] iucv: establish reboot notifier
[S390] pm: power management support for SCLP drivers.
[S390] pm: tape power management callbacks
[S390] pm: vmlogrdr power management callbacks
[S390] pm: vmur driver power management callbacks
[S390] pm: appldata power management callbacks
[S390] pm: vmwatchdog power management callbacks.
[S390] pm: zfcp driver power management callbacks
[S390] pm: claw driver power management callbacks
...
Linus Torvalds [Tue, 16 Jun 2009 18:46:45 +0000 (11:46 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: remove some includings of blktrace_api.h
mg_disk: seperate mg_disk.h again
block: Introduce helper to reset queue limits to default values
cfq: remove extraneous '\n' in blktrace output
ubifs: register backing_dev_info
btrfs: properly register fs backing device
block: don't overwrite bdi->state after bdi_init() has been run
cfq: cleanup for last_end_request in cfq_data
Linus Torvalds [Tue, 16 Jun 2009 18:30:37 +0000 (11:30 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (38 commits)
ps3flash: Always read chunks of 256 KiB, and cache them
ps3flash: Cache the last accessed FLASH chunk
ps3: Replace direct file operations by callback
ps3: Switch ps3_os_area_[gs]et_rtc_diff to EXPORT_SYMBOL_GPL()
ps3: Correct debug message in dma_ioc0_map_pages()
drivers/ps3: Add missing annotations
ps3fb: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
ps3flash: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
ps3: shorten ps3_system_bus_[gs]et_driver_data to ps3_system_bus_[gs]et_drvdata
ps3: Use dev_[gs]et_drvdata() instead of direct access for system bus devices
block/ps3: remove driver_data direct access of struct device
ps3vram: Make ps3vram_priv.reports a void *
ps3vram: Remove no longer used ps3vram_priv.ddr_base
ps3vram: Replace mutex by spinlock + bio_list
block: Add bio_list_peek()
powerpc: Use generic atomic64_t implementation on 32-bit processors
lib: Provide generic atomic64_t implementation
powerpc: Add compiler memory barrier to mtmsr macro
powerpc/iseries: Mark signal_vsp_instruction() as maybe unused
powerpc/iseries: Fix unused function warning in iSeries DT code
...
Linus Torvalds [Tue, 16 Jun 2009 18:30:16 +0000 (11:30 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: always update root items for fs trees at commit time
Linus Torvalds [Tue, 16 Jun 2009 18:29:44 +0000 (11:29 -0700)]
Merge git://git./linux/kernel/git/hirofumi/fatfs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6:
fat: split fat_generic_ioctl
FAT: add 'errors' mount option
Linus Torvalds [Tue, 16 Jun 2009 18:29:17 +0000 (11:29 -0700)]
Merge branch 'i2c-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
therm_windtunnel: Convert to a new-style i2c driver
therm_adt746x: Convert to a new-style i2c driver
windfarm: Convert to new-style i2c drivers
therm_pm72: Convert to a new-style i2c driver
i2c-viapro: Add new PCI device ID for VX855
i2c/chips: Move max6875 to drivers/misc/eeprom
i2c: Do not give adapters a default parent
i2c: Do not probe for TV chips on Voodoo3 adapters
i2c: Retry automatically on arbitration loss
i2c: Remove void casts
Linus Torvalds [Tue, 16 Jun 2009 18:28:50 +0000 (11:28 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (max6650) Add support for alarms
hwmon: (f71882fg) Add support for the
F71858F
hwmon: (f71882fg) Add temp#_fault sysfs attr for f8000
hwmon: (f71882fg) Sanity check f8000 pwm settings
hwmon: (f71882fg) Cleanup f8000 pwm handling
hwmon: PCI quirk for hwmon access on MSI MS-7031 board
hwmon: (w83627ehf) Add W83627DHG-P support
hwmon: (tmp401) Add documentation
hwmon: (tmp401) Add support for TI's TMP411 sensors chip
hwmon: (tmp401) Add support for TI's TMP401 sensor chip
hwmon: (ibmaem) Automatically load on HC10 blade
hwmon: Fix more __devexit_p glitches
Linus Torvalds [Tue, 16 Jun 2009 18:24:23 +0000 (11:24 -0700)]
Merge branch 'acpica' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
ACPICA: Update version to
20090521.
ACPICA: Disable preservation of SCI enable bit (SCI_EN)
ACPICA: Region deletion: Ensure region object is removed from handler list
ACPICA: Eliminate extra call to NsGetParentNode
ACPICA: Simplify internal operation region interface
ACPICA: Update Load() to use operation region interfaces
ACPICA: New: AcpiInstallMethod - install a single control method
ACPICA: Invalidate DdbHandle after table unload
ACPICA: Fix reference count issues for DdbHandle object
ACPICA: Simplify and optimize NsGetNextNode function
ACPICA: Additional validation of _PRT packages (resource mgr)
ACPICA: Fix DebugObject output for DdbHandle objects
ACPICA: Fix allowable release order for ASL mutex objects
ACPICA: Mutex support: Fix release ordering issue and current sync level
ACPICA: Update version to
20090422.
ACPICA: Linux OSL: cleanup/update/merge
ACPICA: Fix implementation of AML BreakPoint operator (break to debugger)
ACPICA: Fix miscellaneous warnings under gcc 4+
ACPICA: Miscellaneous lint changes
ACPICA: Fix possible dereference of null pointer
...
Alan Cox [Mon, 15 Jun 2009 15:28:29 +0000 (16:28 +0100)]
tty: Fix leaks introduced by the shift to separate ldisc objects
Gold star for the kmemleak detector.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Mon, 15 Jun 2009 15:27:29 +0000 (16:27 +0100)]
pty: Narrow the race on ldisc locking
The pty code has always been buggy on its ldisc handling. The recent
changes made the window for the race much bigger. Pending fixing it
properly which is not at all trivial, at least make the race small again so
we don't disrupt other dev work.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 16 Jun 2009 18:07:14 +0000 (11:07 -0700)]
printk: add KERN_DEFAULT loglevel to print_modules()
Several WARN_ON() messages omit the '\n' at the end of the string, which
is a simple (and understandable) error. The next line printed after
that warning line is usually the current module list, and that printk
does not have a log-level marker - resulting in one long mixed-up line.
Adding this loglevel marker will now avoid this unreadable mess.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 16 Jun 2009 18:02:28 +0000 (11:02 -0700)]
printk: Add KERN_DEFAULT printk log-level
This adds a KERN_DEFAULT loglevel marker, for when you cannot decide
which loglevel you want, and just want to keep an existing printk
with the default loglevel.
The difference between having KERN_DEFAULT and having no log-level
marker at all is two-fold:
- having the log-level marker will now force a new-line if the
previous printout had not added one (perhaps because it forgot,
but perhaps because it expected a continuation)
- having a log-level marker is required if you are printing out a
message that otherwise itself could perhaps otherwise be mistaken
for a log-level.
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 16 Jun 2009 17:57:02 +0000 (10:57 -0700)]
printk: clean up handling of log-levels and newlines
It used to be that we would only look at the log-level in a printk()
after explicit newlines, which can cause annoying problems when the
previous printk() did not end with a '\n'. In that case, the log-level
marker would be just printed out in the middle of the line, and be
seen as just noise rather than change the logging level.
This changes things to always look at the log-level in the first
bytes of the printout. If a log level marker is found, it is always
used as the log-level. Additionally, if no newline existed, one is
added (unless the log-level is the explicit KERN_CONT marker, to
explicitly show that it's a continuation of a previous line).
Acked-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Tue, 16 Jun 2009 15:38:50 +0000 (17:38 +0200)]
Merge branch 'topic/usb-audio' into for-linus
* topic/usb-audio:
ALSA: usb-audio - rework quirk for TerraTec Aureon USB 5.1 MkII
Takashi Iwai [Tue, 16 Jun 2009 15:38:46 +0000 (17:38 +0200)]
Merge branch 'topic/intel8x0' into for-linus
* topic/intel8x0:
ALSA: intel8x0 - Fix PCM position craziness
Takashi Iwai [Tue, 16 Jun 2009 15:38:43 +0000 (17:38 +0200)]
Merge branch 'topic/hda' into for-linus
* topic/hda:
ALSA: hda - Add quirk for Acer Aspire 6935G
Takashi Iwai [Tue, 16 Jun 2009 15:38:40 +0000 (17:38 +0200)]
Merge branch 'topic/ctxfi' into for-linus
* topic/ctxfi:
ALSA: ctxfi - Fix deadlock with xfi-timer
ALSA: ctxfi - Replace atc lock to mutex
Takashi Iwai [Tue, 16 Jun 2009 15:38:36 +0000 (17:38 +0200)]
Merge branch 'topic/asoc' into for-linus
* topic/asoc:
ASoC: magician: fix PXA SSP clock polarity
ASoC: Instantiate any forgotten DAPM widgets
ASoC: Revert duplicated code in SSM2602 driver
ASoC: Remove odd bit clock ratios for WM8903
David S. Miller [Tue, 16 Jun 2009 11:59:53 +0000 (04:59 -0700)]
sparc64: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Jun 2009 11:17:58 +0000 (04:17 -0700)]
sparc: Wire up sys_rt_tgsigqueueinfo().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Jun 2009 10:46:54 +0000 (03:46 -0700)]
openprom: Squelch useless GCC warning.
drivers/sbus/char/openprom.c: In function ‘openprom_sunos_ioctl’:
drivers/sbus/char/openprom.c:306: warning: ‘opp’ may be used uninitialized in this function
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Rothwell [Mon, 15 Jun 2009 10:06:18 +0000 (03:06 -0700)]
sparc: replace uses of CPU_MASK_ALL_PTR
CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so:
#define CPU_MASK_ALL (cpumask_t) { { ... } }
Taking the address of such a temporary is questionable at best,
unfortunately
321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
CPU_MASK_ALL_PTR:
#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)
Which formalizes this practice. One day gcc could bite us over this
usage (though we seem to have gotten away with it so far).
[Description by Rusty Russell]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 13 Jun 2009 08:03:24 +0000 (01:03 -0700)]
sparc64: Add proper dynamic ftrace support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Robert P. J. Day [Fri, 24 Apr 2009 03:58:24 +0000 (03:58 +0000)]
sparc: Simplify code using is_power_of_2() routine.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Reif [Thu, 4 Jun 2009 09:00:02 +0000 (02:00 -0700)]
sparc: move of_device common code to of_device_common
This patch moves code common to of_device_32.c and of_device_64.c into
of_device_common.h and of_device_common.c.
The only functional difference is in sparc32 where of_bus_default_map is
used in place of of_bus_sbus_map because they are equivelent.
There is still room for further code consolidation with some minor
refactoring.
Boot tested on sparc32 and compile tested on sparc64.
Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Thu, 14 May 2009 16:23:11 +0000 (16:23 +0000)]
sparc: remove dma-mapping_{32|64}.h
This modifies SPARC32 to use struct dma_map ops. It means that we can
remove dma-mapping_{32|64}.h.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Thu, 14 May 2009 16:23:10 +0000 (16:23 +0000)]
sparc: use dma_map_page instead of dma_map_single
This patch converts dma_map_single and dma_unmap_single to use
map_page and unmap_page respectively and removes unnecessary
map_single and unmap_single. map_page can be used to implement
map_single but the opposite is impossible. Having only dma_map_page in
struct dma_ops is enough.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Thu, 14 May 2009 16:23:09 +0000 (16:23 +0000)]
sparc: add sync_single_for_device and sync_sg_for_device to struct dma_ops
This adds sync_single_for_device() and sync_sg_for_device() to struct
dma_ops in order to unify dma-mpping_{32|64}.h. dma-mpping_32.h needs them though dma-mpping_64.h doesn't.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Thu, 14 May 2009 16:23:08 +0000 (16:23 +0000)]
sparc: move the duplication in dma-mapping_{32|64}.h to dma-mapping.h
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:42:57 +0000 (10:42 +0000)]
p9100: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer are displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:41:56 +0000 (10:41 +0000)]
leo: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer are displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:39:45 +0000 (10:39 +0000)]
cg6: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer are displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:37:46 +0000 (10:37 +0000)]
cg3: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer is displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:36:50 +0000 (10:36 +0000)]
cg14: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer is displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Helt [Mon, 4 May 2009 10:35:36 +0000 (10:35 +0000)]
bw2: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len
for physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length
of the framebuffer is displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hong H. Pham [Thu, 4 Jun 2009 09:10:11 +0000 (02:10 -0700)]
sparc64: fix and optimize irq distribution
irq_choose_cpu() should compare the affinity mask against cpu_online_map
rather than CPU_MASK_ALL, since irq_select_affinity() sets the interrupt's
affinity mask to cpu_online_map "and" CPU_MASK_ALL (which ends up being
just cpu_online_map). The mask comparison in irq_choose_cpu() will always
fail since the two masks are not the same. So the CPU chosen is the first CPU
in the intersection of cpu_online_map and CPU_MASK_ALL, which is always CPU0.
That means all interrupts are reassigned to CPU0...
Distributing interrupts to CPUs in a linearly increasing round robin fashion
is not optimal for the UltraSPARC T1/T2. Also, the irq_rover in
irq_choose_cpu() causes an interrupt to be assigned to a different
processor each time the interrupt is allocated and released. This may lead
to an unbalanced distribution over time.
A static mapping of interrupts to processors is done to optimize and balance
interrupt distribution. For the T1/T2, interrupts are spread to different
cores first, and then to strands within a core.
The following is some benchmarks showing the effects of interrupt
distribution on a T2. The test was done with iperf using a pair of T5220
boxes, each with a 10GBe NIU (XAUI) connected back to back.
TCP | Stock Linear RR IRQ Optimized IRQ
Streams | 2.6.30-rc5 Distribution Distribution
| GBits/sec GBits/sec GBits/sec
--------+-----------------------------------------
1 0.839 0.862 0.868
8 1.16 4.96 5.88
16 1.15 6.40 8.04
100 1.09 7.28 8.68
Signed-off-by: Hong H. Pham <hong.pham@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Apr 2009 03:32:02 +0000 (20:32 -0700)]
sparc64: Use new dynamic per-cpu allocator.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 23:22:27 +0000 (16:22 -0700)]
sparc64: Only allocate per-cpu areas for possible cpus.
This gets us real close to the generic implementation of
setup_per_cpu_areas() except:
1) We store the per-cpu offset into the trap_block[], whereas
the generic code has it's own static array.
2) We have to initialize the %g5 register to hold the boot cpu's
per-cpu area offset.
3) The OBP/MDESC cpu info scan is performed at the end.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 23:15:20 +0000 (16:15 -0700)]
sparc64: Get rid of real_setup_per_cpu_areas().
Now that we defer the cpu_data() initializations to the end of per-cpu
setup, we can get rid of this local hack we had to setup the per-cpu
areas eary.
This is a necessary step in order to support HAVE_DYNAMIC_PER_CPU_AREA
since the per-cpu setup must run when page structs are available.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 May 2009 05:37:25 +0000 (22:37 -0700)]
sparc64: Defer cpu_data() setup until end of per-cpu data initialization.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 10:15:11 +0000 (03:15 -0700)]
sparc64: Make mdesc_fill_in_cpu_data take a cpumask_t pointer.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 10:13:15 +0000 (03:13 -0700)]
sparc: Call OF and MD cpu scanning explicitly from paging_init()
We need to split up the cpu present mask setup from the cpu_data
initialization, and this is a first step towards that.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Apr 2009 02:55:47 +0000 (19:55 -0700)]
sparc64: Refactor MDESC cpu scanning code using an iterator.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 27 May 2009 04:46:14 +0000 (21:46 -0700)]
sparc64: Refactor OBP cpu scanning code using an iterator.
With feedback from Sam Ravnborg.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 08:57:36 +0000 (01:57 -0700)]
sparc64: Use BUILD_BUG_ON() in trap_init().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 08:47:10 +0000 (01:47 -0700)]
sparc64: Store per-cpu offset in trap_block[]
Surprisingly this actually makes LOAD_PER_CPU_BASE() a little
more efficient.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 08:26:43 +0000 (01:26 -0700)]
sparc64: Move trap_block[] definitions into a new header file.
Later we're going to want to get at these definitions from
asm/percpu.h and that's not possible via cpudata.h because
of the set of dependencies the non-trap_block[] stuff has.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Apr 2009 00:15:40 +0000 (17:15 -0700)]
sparc64: Reclaim trap_block[]->hdesc
This really isn't necessary at all, a local variable suits the
job just fine.
This frees up 8 bytes in the trap_block[] that we can use later
to store the per-cpu base addresses.
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Tue, 16 Jun 2009 09:51:24 +0000 (11:51 +0200)]
Merge branch 'amd-iommu/fixes' of git://git./linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
Li Zefan [Tue, 16 Jun 2009 09:19:36 +0000 (11:19 +0200)]
block: remove some includings of blktrace_api.h
When porting blktrace to tracepoints, we changed to trace/block.h
for trace prober declarations.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Hans-Joachim Picht [Tue, 16 Jun 2009 08:30:52 +0000 (10:30 +0200)]
[S390] s390: hibernation support for s390
This patch introduces the hibernation backend support to the
s390 architecture. Now it is possible to suspend a mainframe Linux
guest using the following command:
echo disk > /sys/power/state
Signed-off-by: Hans-Joachim Picht <hans@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Tue, 16 Jun 2009 08:30:51 +0000 (10:30 +0200)]
[S390] pm: dcssblk power management callbacks.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Tue, 16 Jun 2009 08:30:50 +0000 (10:30 +0200)]
[S390] pm: monreader power management callbacks.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Tue, 16 Jun 2009 08:30:49 +0000 (10:30 +0200)]
[S390] pm: monwriter power management callbacks.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Tue, 16 Jun 2009 08:30:48 +0000 (10:30 +0200)]
[S390] pm: memory hotplug power management callbacks
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 16 Jun 2009 08:30:47 +0000 (10:30 +0200)]
[S390] pm: con3270 power management callbacks.
Introduce the power management callbacks to the 3270 driver. On suspend
the current 3270 view is deactivated and for non-console 3270 device
the release callback is done. This disconnects the current tty /
fullscreen application from the 3270 device. On resume the current
view is reactivated, on the tty you get a fresh login.
If the system panics before the 3270 device has been resumed, the ccw
device for the 3270 console is reactivated with ccw_device_force_console.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 16 Jun 2009 08:30:46 +0000 (10:30 +0200)]
[S390] pm: smsgiucv power management callbacks.
Create dummy iucv-device to get control when the system is suspended
and resumed. Server the smsg iucv path on suspend, reestablish the
path on resume.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hendrik Brueckner [Tue, 16 Jun 2009 08:30:45 +0000 (10:30 +0200)]
[S390] pm: hvc_iucv power management callbacks
The patch adds supporting for suspending and resuming IUCV HVC terminal
devices from disk. The obligatory Linux device driver interfaces has
been added by registering a device driver on the IUCV bus.
For each IUCV HVC terminal device the driver creates a respective device
on the IUCV bus.
To support suspend and resume, the PM freeze callback severs any established
IUCV communication path and triggers a HVC tty hang-up when the system image
is restored.
IUCV communication path are no longer valid when the z/VM guest is halted.
The device driver initialization has been updated to register devices and
the a new routine has been extracted to facilitate the hang-up of IUCV HVC
terminal devices.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Tue, 16 Jun 2009 08:30:44 +0000 (10:30 +0200)]
[S390] PM: af_iucv power management callbacks.
Patch establishes a dummy afiucv-device to make sure af_iucv is
notified as iucv-bus device about suspend/resume.
The PM freeze callback severs all iucv pathes of connected af_iucv sockets.
The PM thaw/restore callback switches the state of all previously connected
sockets to IUCV_DISCONN.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Tue, 16 Jun 2009 08:30:43 +0000 (10:30 +0200)]
[S390] pm: netiucv power management callbacks.
Patch establishes a dummy netiucv device to make sure iucv is notified
about suspend/resume even if netiucv is the only loaded iucv-exploting
module without any real net_device defined.
The PM freeze callback closes all open netiucv connections. Thus the
corresponding iucv path is removed.
The PM thaw/restore callback re-opens previously closed netiucv
connections.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Tue, 16 Jun 2009 08:30:42 +0000 (10:30 +0200)]
[S390] pm: iucv power management callbacks.
Patch calls the PM callback functions of iucv-bus devices, which are
responsible for removal of their established iucv pathes.
The PM freeze callback for the first iucv-bus device disables all iucv
interrupts except the connection severed interrupt.
The PM freeze callback for the last iucv-bus device shuts down iucv.
The PM thaw callback for the first iucv-bus device re-enables iucv
if it has been shut down during freeze. If freezing has been interrupted,
it re-enables iucv interrupts according to the needs of iucv-exploiters.
The PM restore callback for the first iucv-bus device re-enables iucv.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Tue, 16 Jun 2009 08:30:41 +0000 (10:30 +0200)]
[S390] iucv: establish reboot notifier
To guarantee a proper cleanup, patch adds a reboot notifier to
the iucv base code, which disables iucv interrupts, shuts down
established iucv pathes, and removes iucv declarations for z/VM.
Checks have to be added to the iucv-API functions, whether
iucv-buffers removed at reboot time are still declared.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Tue, 16 Jun 2009 08:30:40 +0000 (10:30 +0200)]
[S390] pm: power management support for SCLP drivers.
The SCLP base driver defines a new notifier call back for all upper level SCLP
drivers, like the SCLP console, etc. This guarantees that in suspend first the
upper level drivers are suspended and afterwards the SCLP base driver. For
resume it is the other way round. The SCLP base driver itself registers a
new platform device at the platform bus and gets PM notifications via
the dev_pm_ops.
In suspend, the SCLP base driver switches off the receiver and sender mask
This is done in sclp_deactivate(). After suspend all new requests will be
rejected with -EIO and no more interrupts will be received, because the masks
are switched off. For resume the sender and receiver masks are reset in
the sclp_reactivate() function.
When the SCLP console is suspended, all new messages are cached in the
sclp console buffers. In resume, all the cached messages are written to the
console. In addition to that we have an early resume function that removes
the cached messages from the suspend image.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Munzert [Tue, 16 Jun 2009 08:30:39 +0000 (10:30 +0200)]
[S390] pm: tape power management callbacks
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Weinhuber [Tue, 16 Jun 2009 08:30:38 +0000 (10:30 +0200)]
[S390] pm: vmlogrdr power management callbacks
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Munzert [Tue, 16 Jun 2009 08:30:37 +0000 (10:30 +0200)]
[S390] pm: vmur driver power management callbacks
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Tue, 16 Jun 2009 08:30:36 +0000 (10:30 +0200)]
[S390] pm: appldata power management callbacks
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Christian Borntraeger [Tue, 16 Jun 2009 08:30:35 +0000 (10:30 +0200)]
[S390] pm: vmwatchdog power management callbacks.
This patch implements suspend/hibernation for the vmwatchdog driver. The
pm_notifier_callchain is used to get control on PM events. Since watchdog
operation and suspend cannot work together in a reliable fashion, the open
flag is also used to prevent suspend and open from happening at the same
time.
The watchdog can also be active with no open file descriptor. This patch
adds another flag which is only changed in vmwdt_keep_alive and
vmwdt_disable.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Petermann [Tue, 16 Jun 2009 08:30:34 +0000 (10:30 +0200)]
[S390] pm: zfcp driver power management callbacks
Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Blaschka [Tue, 16 Jun 2009 08:30:33 +0000 (10:30 +0200)]
[S390] pm: claw driver power management callbacks
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Blaschka [Tue, 16 Jun 2009 08:30:32 +0000 (10:30 +0200)]
[S390] pm: ctcm driver power management callbacks
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Blaschka [Tue, 16 Jun 2009 08:30:31 +0000 (10:30 +0200)]
[S390] pm: qeth driver power management callbacks
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>