Ingo Molnar [Wed, 18 Mar 2009 12:19:49 +0000 (13:19 +0100)]
Merge branches 'x86/cleanups', 'x86/cpu', 'x86/debug', 'x86/mce2', 'x86/mm', 'x86/mtrr', 'x86/setup', 'x86/setup-memory', 'x86/urgent', 'x86/uv', 'x86/x2apic' and 'linus' into x86/core
Conflicts:
arch/parisc/kernel/irq.c
Jaswinder Singh Rajput [Wed, 18 Mar 2009 12:06:55 +0000 (17:36 +0530)]
x86: cpu/mttr/cleanup.c fix compilation warning
arch/x86/kernel/cpu/mtrr/cleanup.c:197: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <
1237378015.13488.1.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rusty Russell [Tue, 17 Mar 2009 21:52:30 +0000 (08:22 +1030)]
x86, uv: fix cpumask iterator in uv_bau_init()
Impact: fix boot crash on UV systems
Commit
76ba0ecda0de9accea9a91cb6dbde46782110e1c "cpumask: use
cpumask_var_t in uv_flush_tlb_others" used cur_cpu as an iterator;
it was supposed to be zero for the code below it.
Reported-by: Cliff Wickman <cpw@sgi.com>
Original-From: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: steiner@sgi.com
Cc: <stable@kernel.org>
LKML-Reference: <
200903180822.31196.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Suresh Siddha [Tue, 17 Mar 2009 18:16:54 +0000 (10:16 -0800)]
x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths
Impact: optimize APIC IPI related barriers
Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.
x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visisble before the x2apic msr write for IPI.
Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <
1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andrew Morton [Wed, 18 Mar 2009 00:10:25 +0000 (10:40 +1030)]
x86: use smp_call_function_single() in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Attempting to rid us of the problematic work_on_cpu(). Just use
smp_call_function_single() here.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <
20090318042217.
EF3F1DDF39@ozlabs.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Suresh Siddha [Tue, 17 Mar 2009 00:05:05 +0000 (17:05 -0700)]
x86, dmar: use atomic allocations for QI and Intr-remapping init
Impact: invalid use of GFP_KERNEL in interrupt context
Queued invalidation and interrupt-remapping will get initialized with
interrupts disabled (while enabling interrupt-remapping). So use
GFP_ATOMIC instead of GFP_KERNEL for memory alloacations.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:05:04 +0000 (17:05 -0700)]
x86: fix broken irq migration logic while cleaning up multiple vectors
Impact: fix spurious IRQs
During irq migration, we send a low priority interrupt to the previous
irq destination. This happens in non interrupt-remapping case after interrupt
starts arriving at new destination and in interrupt-remapping case after
modifying and flushing the interrupt-remapping table entry caches.
This low priority irq cleanup handler can cleanup multiple vectors, as
multiple irq's can be migrated at almost the same time. While
there will be multiple invocations of irq cleanup handler (one cleanup
IPI for each irq migration), first invocation of the cleanup handler
can potentially cleanup more than one vector (as the first invocation can
see the requests for more than vector cleanup). When we cleanup multiple
vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
other vectors that are to be cleanedup can still be pending in the local
cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).
When we are ready to unhook a vector corresponding to an irq, check if that
vector is registered in the local cpu's IRR. If so skip that cleanup and
do a self IPI with the cleanup vector, so that we give a chance to
service the pending vector interrupt and then cleanup that vector
allocation once we execute the lowest priority handler.
This fixes spurious interrupts seen when migrating multiple vectors
at the same time.
[ This is apparently possible even on conventional xapic, although to
the best of our knowledge it has never been seen. The stable
maintainers may wish to consider this one for -stable. ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: stable@kernel.org
Suresh Siddha [Tue, 17 Mar 2009 00:05:03 +0000 (17:05 -0700)]
x86, ioapic: Fix non atomic allocation with interrupts disabled
Impact: fix possible race
save_mask_IO_APIC_setup() was using non atomic memory allocation while getting
called with interrupts disabled. Fix this by splitting this into two different
function. Allocation part save_IO_APIC_setup() now happens before
disabling interrupts.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:05:02 +0000 (17:05 -0700)]
x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code
Impact: cleanup
Clean up #ifdefs and replace them with helper functions.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:05:01 +0000 (17:05 -0700)]
x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping
Impact: simplification
In the current code, for level triggered migration, we need to modify the
io-apic RTE with the update vector information, along with modifying interrupt
remapping table entry(IRTE) with vector and destination. This is to ensure that
remote IRR bit inthe IOAPIC RTE gets cleared when the cpu does EOI.
With this patch, for level triggered, we eliminate the io-apic RTE modification
(with the updated vector information), by using a virtual vector (io-apic pin
number). Real vector that is used for interrupting cpu will be coming from
the interrupt-remapping table entry. Trigger mode in the IRTE will always be
edge, and the actual level or edge trigger will be setup in the IO-APIC RTE.
So a level triggered interrupt will appear as an edge to the local apic
cpu but still as level to the IO-APIC.
With this change, level irq migration can be done by simply modifying
the interrupt-remapping table entry with out changing the io-apic RTE.
And as the interrupt appears as edge at the cpu, in addition to do the
local apic EOI, we need to do IO-APIC directed EOI to clear the remote
IRR bit in the IO-APIC RTE.
This simplies the irq migration in the presence of interrupt-remapping.
Idea-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:05:00 +0000 (17:05 -0700)]
x86, x2apic: fix clear_local_APIC() in the presence of x2apic
Impact: cleanup, paranoia
We were not clearing the local APIC in clear_local_APIC() in the
presence of x2apic. Fix it.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:59 +0000 (17:04 -0700)]
x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping
Impact: make kexec work with x2apic
disable_IO_APIC() gets called during crashdump aswell, which configures the
IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel.
In the presence of interrupt-remapping, we need to change the
interrupt-remapping configuration aswell as modifying IO-APIC for virtual wire
B mode.
To keep things simple during the crash, use virtual wire A mode
(for which we don't need to touch io-apic and interrupt-remapping tables).
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:58 +0000 (17:04 -0700)]
x86, intr-remapping: fix free_irte() to clear all the IRTE entries
Impact: fix interrupt table entry leak
Fix the typo which was not clearing all the interrupt remapping table
entries corresponding to an irq.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:57 +0000 (17:04 -0700)]
x86, dmar: start with sane state while enabling dma and interrupt-remapping
Impact: cleanup/sanitization
Start from a sane state while enabling dma and interrupt-remapping, by
clearing the previous recorded faults and disabling previously
enabled queued invalidation and interrupt-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:56 +0000 (17:04 -0700)]
x86, dmar: routines for disabling queued invalidation and intr remapping
Impact: new interfaces (not yet used)
Routines for disabling queued invalidation and interrupt remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:55 +0000 (17:04 -0700)]
x86, x2apic: enable fault handling for intr-remapping
Impact: interface augmentation (not yet used)
Enable fault handling flow for intr-remapping aswell. Fault handling
code now shared by both dma-remapping and intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:54 +0000 (17:04 -0700)]
x86, dmar: move page fault handling code to dmar.c
Impact: code movement
Move page fault handling code to dmar.c
This will be shared both by DMA-remapping and Intr-remapping code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Tue, 17 Mar 2009 00:04:53 +0000 (17:04 -0700)]
x86, x2apic: fix lock ordering during IRQ migration
Impact: fix potential deadlock on x2apic
fix "hard-safe -> hard-unsafe lock order detected" with irq_2_ir_lock
On x2apic enabled system:
[ INFO: hard-safe -> hard-unsafe lock order detected ]
2.6.27-03151-g4480f15b #1
------------------------------------------------------
swapper/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(irq_2_ir_lock){--..}, at: [<
ffffffff8038ebc0>] get_irte+0x2f/0x95
and this task is already holding:
(&irq_desc_lock_class){+...}, at: [<
ffffffff802649ed>] setup_irq+0x67/0x281
which would create a new lock dependency:
(&irq_desc_lock_class){+...} -> (irq_2_ir_lock){--..}
but this new dependency connects a hard-irq-safe lock:
(&irq_desc_lock_class){+...}
... which became hard-irq-safe at:
[<
ffffffffffffffff>] 0xffffffffffffffff
to a hard-irq-unsafe lock:
(irq_2_ir_lock){--..}
... which became hard-irq-unsafe at:
... [<
ffffffff802547b5>] __lock_acquire+0x571/0x706
[<
ffffffff8025499f>] lock_acquire+0x55/0x71
[<
ffffffff8062f2c4>] _spin_lock+0x2c/0x38
[<
ffffffff8038ee50>] alloc_irte+0x8a/0x14b
[<
ffffffff8021f733>] setup_IO_APIC_irq+0x119/0x30e
[<
ffffffff8090860e>] setup_IO_APIC+0x146/0x6e5
[<
ffffffff809058fc>] native_smp_prepare_cpus+0x24e/0x2e9
[<
ffffffff808f982c>] kernel_init+0x5a/0x176
[<
ffffffff8020c289>] child_rip+0xa/0x11
[<
ffffffffffffffff>] 0xffffffffffffffff
Fix this theoretical lock order issue by using spin_lock_irqsave() instead of
spin_lock()
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
H. Peter Anvin [Tue, 17 Mar 2009 22:26:06 +0000 (15:26 -0700)]
x86, setup: move 32-bit code to .text32
Impact: cleanup
The setup code is mostly 16-bit code, but there is a small stub of
32-bit code at the end. Move the 32-bit code to a separate segment,
.text32, to avoid scrambling the disassembly.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
H. Peter Anvin [Tue, 17 Mar 2009 21:14:31 +0000 (14:14 -0700)]
x86-32: move _end to a dummy section
Impact: build fix with CONFIG_RELOCATABLE
Move _end into a dummy section, so that relocs.c will know it is a
relocatable symbol.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Jeremy Fitzhardinge [Sun, 15 Mar 2009 06:20:47 +0000 (23:20 -0700)]
x86/brk: put the brk reservations in their own section
Impact: disambiguate real .bss variables from .brk storage
Add a .brk section after the .bss section. This has no effect
on the final vmlinux, but it more clearly distinguishes the space
taken by actual .bss symbols, and the variable space reserved
by .brk users.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Jeremy Fitzhardinge [Sun, 15 Mar 2009 06:19:38 +0000 (23:19 -0700)]
x86/brk: make the brk reservation symbols inaccessible from C
Impact: bulletproofing, clarification
The brk reservation symbols are just there to document the amount
of space reserved by brk users in the final vmlinux file. Their
addresses are irrelevent, and using their addresses will cause
certain havok. Name them ".brk.NAME", which is a valid asm symbol
but C can't reference it; it also highlights their special
role in the symbol table.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
H. Peter Anvin [Tue, 17 Mar 2009 18:38:23 +0000 (11:38 -0700)]
x86-32: tighten the bound on additional memory to map
Impact: Tighten bound to avoid masking errors
The definition of MAPPING_BEYOND_END was excessive; this has a nasty
tendency to mask bugs. We have learned over time that this kind of
bug hiding can cause some very strange errors. Therefore, tighten the
bound to only need to map the actual kernel area.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Jeremy Fitzhardinge [Mon, 16 Mar 2009 19:10:07 +0000 (12:10 -0700)]
x86-32: remove ALLOCATOR_SLOP from head_32.S
Impact: cleanup
ALLOCATOR_SLOP is a vestigial remain from when we used the
bootmem allocator to allocate the kernel's linear memory mapping.
Now we directly reserve pages from the e820 mapping, and no
longer require secondary structures to keep track of allocated
pages.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Jeremy Fitzhardinge [Mon, 16 Mar 2009 19:07:54 +0000 (12:07 -0700)]
x86-32: make sure we map enough to fit linear map pagetables
Impact: crash fix
head_32.S needs to map the kernel itself, and enough space so
that mm/init.c can allocate space from the e820 allocator
for the linear map of low memory.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Linus Torvalds [Tue, 17 Mar 2009 17:02:35 +0000 (10:02 -0700)]
Avoid 64-bit "switch()" statements on 32-bit architectures
Commit
ee6f779b9e0851e2f7da292a9f58e0095edf615a ("filp->f_pos not
correctly updated in proc_task_readdir") changed the proc code to use
filp->f_pos directly, rather than through a temporary variable. In the
process, that caused the operations to be done on the full 64 bits, even
though the offset is never that big.
That's all fine and dandy per se, but for some unfathomable reason gcc
generates absolutely horrid code when using 64-bit values in switch()
statements. To the point of actually calling out to gcc helper
functions like __cmpdi2 rather than just doing the trivial comparisons
directly the way gcc does for normal compares. At which point we get
link failures, because we really don't want to support that kind of
crazy code.
Fix this by just casting the f_pos value to "unsigned long", which
is plenty big enough for /proc, and avoids the gcc code generation issue.
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Zhang Le <r0bertz@gentoo.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Masami Hiramatsu [Mon, 16 Mar 2009 22:57:22 +0000 (18:57 -0400)]
prevent boosting kprobes on exception address
Don't boost at the addresses which are listed on exception tables,
because major page fault will occur on those addresses. In that case,
kprobes can not ensure that when instruction buffer can be freed since
some processes will sleep on the buffer.
kprobes-ia64 already has same check.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 17 Mar 2009 15:59:33 +0000 (08:59 -0700)]
Merge git://git./linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm crypt: wait for endio to complete before destruction
dm crypt: fix kcryptd_async_done parameter
dm io: respect BIO_MAX_PAGES limit
dm table: rework reference counting fix
dm ioctl: validate name length when renaming
Linus Torvalds [Tue, 17 Mar 2009 15:13:17 +0000 (08:13 -0700)]
Fast TSC calibration: calculate proper frequency error bounds
In order for ntpd to correctly synchronize the clocks, the frequency of
the system clock must not be off by more than 500 ppm (or, put another
way, 1:2000), or ntpd will end up giving up on trying to synchronize
properly, and ends up reseting the clock in jumps instead.
The fast TSC PIT calibration sometimes failed this test - it was
assuming that the PIT reads always took about one microsecond each (2us
for the two reads to get a 16-bit timer), and that calibrating TSC to
the PIT over 15ms should thus be sufficient to get much closer than
500ppm (max 2us error on both sides giving 4us over 15ms: a 270 ppm
error value).
However, that assumption does not always hold: apparently some hardware
is either very much slower at reading the PIT registers, or there was
other noise causing at least one machine to get 700+ ppm errors.
So instead of using a fixed 15ms timing loop, this changes the fast PIT
calibration to read the TSC delta over the individual PIT timer reads,
and use the result to calculate the error bars on the PIT read timing
properly. We then successfully calibrate the TSC only if the maximum
error bars fall below 500ppm.
In the process, we also relax the timing to allow up to 25ms for the
calibration, although it can happen much faster depending on hardware.
Reported-and-tested-by: Jesper Krogh <jesper@krogh.cc>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 17 Mar 2009 14:58:26 +0000 (07:58 -0700)]
Fix potential fast PIT TSC calibration startup glitch
During bootup, when we reprogram the PIT (programmable interval timer)
to start counting down from 0xffff in order to use it for the fast TSC
calibration, we should also make sure to delay a bit afterwards to allow
the PIT hardware to actually start counting with the new value.
That will happens at the next CLK pulse (1.193182 MHz), so the easiest
way to do that is to just wait at least one microsecond after
programming the new PIT counter value. We do that by just reading the
counter value back once - which will take about 2us on PC hardware.
Reported-and-tested-by: john stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Mon, 16 Mar 2009 23:33:59 +0000 (16:33 -0700)]
x86: MTRR workaround for system with stange var MTRRs
Impact: don't trim e820 according to wrong mtrr
Ozan reports that his server emits strange warning.
it turns out the BIOS sets the MTRRs incorrectly.
Ignore those strange ranges, and don't trim e820,
just emit one warning about BIOS
Reported-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <
49BEE1E7.
7020706@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge [Tue, 17 Mar 2009 00:24:34 +0000 (17:24 -0700)]
x86, paravirt: prevent gcc from generating the wrong addressing mode
Impact: fix crash on VMI (VMware)
When we generate a call sequence for calling a paravirtualized
function, we presume that the generated code is "call *0xXXXXX",
which is a 6 byte opcode; this is larger than a normal
direct call, and so we can patch a direct call over it.
At the moment, however we give gcc enough rope to hang us by
putting the address in a register and generating a two byte
indirect-via-register call. Prevent this by explicitly
dereferencing the function pointer and passing it into the
asm as a constant.
This prevents crashes in VMI, as it cannot handle unpatchable
callsites.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Alok Kataria <akataria@vmware.com>
LKML-Reference: <
49BEEDC2.
2070809@goop.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Linus Torvalds [Mon, 16 Mar 2009 19:49:12 +0000 (12:49 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
acpi-wmi: unsigned cannot be less than 0
thinkpad-acpi: fix module autoloading for older models
acer-wmi: Unmark as 'experimental'
acpi-wmi: Unmark as 'experimental'
acer-wmi: double free in acer_rfkill_exit()
platform/x86: depends instead of select for laptop platform drivers
asus-laptop: use select instead of depends on
eeepc-laptop: restore acpi_generate_proc_event()
asus-laptop: restore acpi_generate_proc_event()
acpi: check for pxm_to_node_map overflow
ACPI: remove doubled status checking
ACPI suspend: Blacklist Toshiba Satellite L300 that requires to set SCI_EN directly on resume
Revert "ACPI: make some IO ports off-limits to AML"
suspend: switch the Asus Pundit P1-AH2 to old ACPI sleep ordering
Milan Broz [Mon, 16 Mar 2009 17:44:36 +0000 (17:44 +0000)]
dm crypt: wait for endio to complete before destruction
The following oops has been reported when dm-crypt runs over a loop device.
...
[ 70.381058] Process loop0 (pid: 4268, ti=
cf3b2000 task=
cf1cc1f0 task.ti=
cf3b2000)
...
[ 70.381058] Call Trace:
[ 70.381058] [<
d0d76601>] ? crypt_dec_pending+0x5e/0x62 [dm_crypt]
[ 70.381058] [<
d0d767b8>] ? crypt_endio+0xa2/0xaa [dm_crypt]
[ 70.381058] [<
d0d76716>] ? crypt_endio+0x0/0xaa [dm_crypt]
[ 70.381058] [<
c01a2f24>] ? bio_endio+0x2b/0x2e
[ 70.381058] [<
d0806530>] ? dec_pending+0x224/0x23b [dm_mod]
[ 70.381058] [<
d08066e4>] ? clone_endio+0x79/0xa4 [dm_mod]
[ 70.381058] [<
d080666b>] ? clone_endio+0x0/0xa4 [dm_mod]
[ 70.381058] [<
c01a2f24>] ? bio_endio+0x2b/0x2e
[ 70.381058] [<
c02bad86>] ? loop_thread+0x380/0x3b7
[ 70.381058] [<
c02ba8a1>] ? do_lo_send_aops+0x0/0x165
[ 70.381058] [<
c013754f>] ? autoremove_wake_function+0x0/0x33
[ 70.381058] [<
c02baa06>] ? loop_thread+0x0/0x3b7
When a table is being replaced, it waits for I/O to complete
before destroying the mempool, but the endio function doesn't
call mempool_free() until after completing the bio.
Fix it by swapping the order of those two operations.
The same problem occurs in dm.c with md referenced after dec_pending.
Again, we swap the order.
Cc: stable@kernel.org
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Huang Ying [Mon, 16 Mar 2009 17:44:33 +0000 (17:44 +0000)]
dm crypt: fix kcryptd_async_done parameter
In the async encryption-complete function (kcryptd_async_done), the
crypto_async_request passed in may be different from the one passed to
crypto_ablkcipher_encrypt/decrypt. Only crypto_async_request->data is
guaranteed to be same as the one passed in. The current
kcryptd_async_done uses the passed-in crypto_async_request directly
which may cause the AES-NI-based AES algorithm implementation to panic.
This patch fixes this bug by only using crypto_async_request->data,
which points to dm_crypt_request, the crypto_async_request passed in.
The original data (convert_context) is gotten from dm_crypt_request.
[mbroz@redhat.com: reworked]
Cc: stable@kernel.org
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Mon, 16 Mar 2009 17:44:30 +0000 (17:44 +0000)]
dm io: respect BIO_MAX_PAGES limit
dm-io calls bio_get_nr_vecs to get the maximum number of pages to use
for a given device. It allocates one additional bio_vec to use
internally but failed to respect BIO_MAX_PAGES, so fix this.
This was the likely cause of:
https://bugzilla.redhat.com/show_bug.cgi?id=173153
Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Mon, 16 Mar 2009 17:44:26 +0000 (17:44 +0000)]
dm table: rework reference counting fix
Fix an error introduced in dm-table-rework-reference-counting.patch.
When there is failure after table initialization, we need to use
dm_table_destroy, not dm_table_put, to free the table.
dm_table_put may be used only after dm_table_get.
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Mon, 16 Mar 2009 16:56:01 +0000 (16:56 +0000)]
dm ioctl: validate name length when renaming
When renaming a mapped device validate the length of the new name.
The rename ioctl accepted any correctly-terminated string enclosed
within the data passed from userspace. The other ioctls enforce a
size limit of DM_NAME_LEN. If the name is changed and becomes longer
than that, the device can no longer be addressed by name.
Fix it by properly checking for device name length (including
terminating zero).
Cc: stable@kernel.org
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Linus Torvalds [Mon, 16 Mar 2009 14:56:58 +0000 (07:56 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)"
r8169: use hardware auto-padding.
igb: remove ASPM L0s workaround
netxen: remove old flash check.
mv643xx_eth: fix unicast address filter corruption on mtu change
xfrm: Fix xfrm_state_find() wrt. wildcard source address.
emac: Fix clock control for 405EX and 405EXr chips
ixgbe: fix multiple unicast address support
via-velocity: Fix DMA mapping length errors on transmit.
qlge: bugfix: Pad outbound frames smaller than 60 bytes.
qlge: bugfix: Move netif_napi_del() to common call point.
qlge: bugfix: Tell hw to strip vlan header.
qlge: bugfix: Increase filter on inbound csum.
dnet: replace obsolete *netif_rx_* functions with *napi_*
net: Add be2net driver.
dnet: Fix warnings on 64-bit.
dnet: Dave DNET ethernet controller driver (updated)
ipv6: Fix BUG when disabled ipv6 module is unloaded
bnx2x: Using DMAE to initialize the chip
bnx2x: Casting page alignment
...
Rusty Russell [Sun, 15 Mar 2009 22:35:07 +0000 (09:05 +1030)]
linux.conf.au 2009: Tuz
Impact: help prevent extinction of species
The Tasmanian Devil is a shy iconic Australian creature named for its
spine-chilling screech. It is threatened with extinction due to a
scientifically interesting but horrific transmissible facial cancer.
This one is standing in for Tux for one release using the far less-known
Devil Facial Tux Disguise.
Save The Tasmanian Devil http://tassiedevil.com.au
Signed-off-by: Linux.conf.au Hobart Team <contact@marchsouth.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhang Le [Mon, 16 Mar 2009 06:44:31 +0000 (14:44 +0800)]
filp->f_pos not correctly updated in proc_task_readdir
filp->f_pos only get updated at the end of the function. Thus d_off of those
dirents who are in the middle will be 0, and this will cause a problem in
glibc's readdir implementation, specifically endless loop. Because when overflow
occurs, f_pos will be set to next dirent to read, however it will be 0, unless
the next one is the last one. So it will start over again and again.
There is a sample program in man 2 gendents. This is the output of the program
running on a multithread program's task dir before this patch is applied:
$ ./a.out /proc/3807/task
--------------- nread=128 ---------------
i-node# file type d_reclen d_off d_name
506442 directory 16 1 .
506441 directory 16 0 ..
506443 directory 16 0 3807
506444 directory 16 0 3809
506445 directory 16 0 3812
506446 directory 16 0 3861
506447 directory 16 0 3862
506448 directory 16 8 3863
This is the output after this patch is applied
$ ./a.out /proc/3807/task
--------------- nread=128 ---------------
i-node# file type d_reclen d_off d_name
506442 directory 16 1 .
506441 directory 16 2 ..
506443 directory 16 3 3807
506444 directory 16 4 3809
506445 directory 16 5 3812
506446 directory 16 6 3861
506447 directory 16 7 3862
506448 directory 16 8 3863
Signed-off-by: Zhang Le <r0bertz@gentoo.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Mon, 16 Mar 2009 12:07:21 +0000 (13:07 +0100)]
x86: reduce preemption off section in exit thread
Impact: latency improvement
No need to keep preemption disabled over the kfree call.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Hidetoshi Seto [Mon, 16 Mar 2009 08:07:33 +0000 (17:07 +0900)]
x86, mce: remove incorrect __cpuinit for intel_init_cmci()
Impact: Bug fix on UP
Referring commit
cc3ca22063784076bd240fda87217387a8f2ae92,
Peter removed __cpuinit annotations for mce_cpu_features()
and its successor functions, which caused troubles on UP
configurations.
However the intel_init_cmci() was introduced after that and
it also has __cpuinit annotation even though it is called from
mce_cpu_features(). Remove the annotation from that function
too.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Len Brown [Mon, 16 Mar 2009 04:38:52 +0000 (00:38 -0400)]
Merge branches 'misc-up-now' and 'platform-drivers' into release
Roel Kluin [Wed, 4 Mar 2009 19:55:30 +0000 (11:55 -0800)]
acpi-wmi: unsigned cannot be less than 0
include/linux/pci-acpi.h:74:
typedef u32 acpi_status;
result is unsigned, so an error returned by acpi_bus_register_driver()
will not be noticed.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Mathieu Chouquet-Stringer [Sat, 14 Mar 2009 15:35:26 +0000 (16:35 +0100)]
thinkpad-acpi: fix module autoloading for older models
Looking at the source, there seems to be a missing * to match my DMI
string. I mean for newer IBM and Lenovo's laptops you match either one
of the following:
MODULE_ALIAS("dmi:bvnIBM:*:svnIBM:*:pvrThinkPad*:rvnIBM:*");
MODULE_ALIAS("dmi:bvnLENOVO:*:svnLENOVO:*:pvrThinkPad*:rvnLENOVO:*");
While for older Thinkpads, you do this (for instance):
IBM_BIOS_MODULE_ALIAS("1[0,3,6,8,A-G,I,K,M-P,S,T]");
with IBM_BIOS_MODULE_ALIAS being MODULE_ALIAS("dmi:bvnIBM:bvr" __type "ET??WW")
Note there's no * terminating the string. As result, udev doesn't load
anything because modprobe cannot find anything matching this (my
machine actually):
udevtest: run: '/sbin/modprobe dmi:bvnIBM:bvr1IET71WW(2.10):bd06/16/2006:svnIBM:pn236621U:pvrNotAv
Signed-off-by: Mathieu Chouquet-Stringer <mchouque@free.fr>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Len Brown <len.brown@intel.com>
Carlos Corbacho [Sat, 14 Feb 2009 09:53:59 +0000 (09:53 +0000)]
acer-wmi: Unmark as 'experimental'
This driver has been around and used long enough that we can drop the
'experimental'.
Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
Carlos Corbacho [Sat, 14 Feb 2009 09:53:53 +0000 (09:53 +0000)]
acpi-wmi: Unmark as 'experimental'
ACPI-WMI isn't experimental anymore, and there are other drivers that now
depend on it that aren't either.
Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
Dan Carpenter [Sat, 14 Feb 2009 09:53:48 +0000 (09:53 +0000)]
acer-wmi: double free in acer_rfkill_exit()
This is acer_rfkill_exit() from drivers/platform/x86/acer-wmi.c.
The code frees wireless_rfkill->data again instead of
bluetooth_rfkill->data.
This was found using a code checker (http://repo.or.cz/w/smatch.git/).
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
Corentin Chary [Wed, 25 Feb 2009 08:37:09 +0000 (09:37 +0100)]
platform/x86: depends instead of select for laptop platform drivers
"I hate `select' and will gleefully leap on any s/select/depends/ patch,
whether it works or not :)"
Andrew Morton
select INPUT is not needed here, because if someone doesn't want INPUT,
he won't want these drivers either.
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Corentin Chary [Sun, 15 Feb 2009 18:30:21 +0000 (19:30 +0100)]
asus-laptop: use select instead of depends on
Like thinkpad_acpi or eeepc-laptop, asus-laptop will
now use "select" instead of "depends on"
for LEDS_CLASS, NEW_LEDS and BACKLIGHT_CLASS_DEVICE
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Corentin Chary [Sun, 15 Feb 2009 18:30:20 +0000 (19:30 +0100)]
eeepc-laptop: restore acpi_generate_proc_event()
Restore acpi_generate_proc_event() for backward
compatibility with old acpi scripts.
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Corentin Chary [Sun, 15 Feb 2009 18:30:19 +0000 (19:30 +0100)]
asus-laptop: restore acpi_generate_proc_event()
Restore acpi_generate_proc_event() for backward
compatibility with old acpi scripts.
Signed-off-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Cyrill Gorcunov [Wed, 4 Mar 2009 19:55:29 +0000 (11:55 -0800)]
acpi: check for pxm_to_node_map overflow
It is hardly (if ever) possible but in case of broken _PXM entry we could
reach out of pxm_to_node_map array bounds in acpi_map_pxm_to_node() call.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Jiri Slaby [Wed, 4 Mar 2009 19:55:27 +0000 (11:55 -0800)]
ACPI: remove doubled status checking
There was a misplaced status test (two consequent tests without a
statement in between) in acpi_bus_init for ages. Remove it, since the
function which should be checked (acpi_os_initialize1) has BUG_ONs on
failure paths.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Mon, 16 Mar 2009 02:13:44 +0000 (22:13 -0400)]
ACPI suspend: Blacklist Toshiba Satellite L300 that requires to set SCI_EN directly on resume
This is a supplement of commit
65df78473ffbf3bff5e2034df1638acc4f3ddd50.
http://bugzilla.kernel.org/show_bug.cgi?id=12798
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Wed, 25 Feb 2009 23:00:18 +0000 (18:00 -0500)]
Revert "ACPI: make some IO ports off-limits to AML"
This reverts commit
5ec5d38a1c8af255ffc481c81eef13e9155524b3.
because it caused spurious dmesg warmings.
We'll implement the check for off-limit ports
in a more clever way in the future.
http://bugzilla.kernel.org/show_bug.cgi?id=12758
Signed-off-by: Len Brown <len.brown@intel.com>
Andy Whitcroft [Wed, 11 Feb 2009 18:11:22 +0000 (18:11 +0000)]
suspend: switch the Asus Pundit P1-AH2 to old ACPI sleep ordering
Switch the Asus Pundit P1-AH2 (M2N8L motherboard) to the old ACPI 1.0
sleep ordering by default. Without this it will not suspend/resume
correctly.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Tested-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
françois romieu [Sun, 15 Mar 2009 01:10:50 +0000 (01:10 +0000)]
r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)"
It fails on the following systems:
- RTL8169sc/8110sc (XID
18000000)
reported by Tim Durack <tdurack@gmail.com> (x86)
- RTL8169sb/8110sb (XID
10000000)
reported by Mikael Pettersson <mikpe@it.uu.se> (ARM)
The patch appeared to work on x86 for the following systems:
RTL8169sb/8110sb
10000000 PCI (EXT)
RTL8110s
04000000 PCI (EXT)
RTL8102e
24a00000 PCI-E (LOM)
RTL8168c/8111c
3c2000c0 PCI-E (LOM)
RTL8168b/8111b
38000000 PCI-E (LOM)
RTL8168b/8111b
38000000 PCI-E (EXT)
The patch exposes two problems:
1) while not completely wrong, mac addresses are not read correctly
from the EEPROM
2) the MAC address registers are not correctly set
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Sun, 15 Mar 2009 01:09:54 +0000 (01:09 +0000)]
r8169: use hardware auto-padding.
It shortens the code and fixes the current pci_unmap leak with
padded skb reported by Dave Jones.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kyle McMartin [Sat, 14 Mar 2009 23:40:59 +0000 (19:40 -0400)]
parisc: sba_iommu: fix build bug when CONFIG_PARISC_AGP=y
CC drivers/parisc/sba_iommu.o
drivers/parisc/sba_iommu.c:1373: error: expected identifier or '('
before '}' token
make[2]: *** [drivers/parisc/sba_iommu.o] Error 1
make[1]: *** [drivers/parisc] Error 2
make: *** [drivers] Error 2
Don't know how this has gone missed for so long... clearly I need
to do builds on my C8000 more often.
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 15 Mar 2009 20:34:56 +0000 (13:34 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (23 commits)
[ARM] Fix virtual to physical translation macro corner cases
[ARM] update mach-types
[ARM] 5421/1: ftrace: fix crash due to tracing of __naked functions
MX1 fix include
[ARM] 5419/1: ep93xx: fix build warnings about struct i2c_board_info
[ARM] 5418/1: restore lr before leaving mcount
ARM: OMAP: board-omap3beagle: set i2c-3 to 100kHz
ARM: OMAP: Allow I2C bus driver to be compiled as a module
ARM: OMAP: sched_clock() corrected
ARM: OMAP: Fix compile error if pm.h is included
[ARM] orion5x: pass dram mbus data to xor driver
[ARM] S3C64XX: Fix s3c64xx_setrate_clksrc
[ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/irq.c
[ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/s3c6400-clock.c
[ARM] S3C64XX: Fix USB host clock mux list
[ARM] S3C64XX: Fix name of USB host clock.
[ARM] S3C64XX: Rename IRQ_UHOST to IRQ_USBH
[ARM] S3C64XX: Do gpiolib configuration earlier
[ARM] S3C64XX: Staticise s3c64xx_init_irq_eint()
[ARM] SMDK6410: Declare iodesc table static
...
Akinobu Mita [Sun, 15 Mar 2009 15:15:18 +0000 (00:15 +0900)]
x86, mm: remove unnecessary include file from iomap_32.c
asm/highmem.h inclusion is added to use kmap_atomic_prot_pfn()
by commit
bb6d59ca927d855ffac567b35c0a790c67016103
Now kmap_atomic_prot_pfn is moved to iomap_32.c
by commit
dd63fdcc63f0f853b116b52e56200a0e0227cf5f
So the asm/highmem.h inclusion in iomap_32.c is unnecessary now.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
LKML-Reference: <
20090315151517.GA29074@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Sun, 15 Mar 2009 07:59:19 +0000 (00:59 -0700)]
x86: print out more info in e820_update_range()
Impact: help debug e820 bugs
Try to print out more info, to catch wrong call parameters.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <
49BCB557.
3030000@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Sat, 14 Mar 2009 21:32:41 +0000 (14:32 -0700)]
x86: fix 64k corruption-check
Impact: fix boot crash
Need to exit early if the addr is far above 64k.
The crash got exposed by:
78a8b35: x86: make e820_update_range() handle small range update
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>
LKML-Reference: <
49BC2279.
2030101@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Alexander Duyck [Sun, 15 Mar 2009 05:26:40 +0000 (22:26 -0700)]
igb: remove ASPM L0s workaround
The L0s workaround should be moved into a pci quirk and so it is not
necessary in the driver. This update removes the L0s workaround from the
igb driver.
This was the second half of the PCI quirk patch that Matthew Wilcox did
not pick up when he picked up the quirk patch.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yinghai Lu [Mon, 9 Mar 2009 08:15:57 +0000 (01:15 -0700)]
x86: put initial_pg_tables into .bss
Impact: makes vmlinux section information more useful
Don't use ram after _end blindly for pagetables. aka init pages is before _end
put those pg table into .bss
[Adapted to use brk segment - Jeremy]
v2: keep initial page table up to 512M only.
v4: put initial page tables just before _end
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jeremy Fitzhardinge [Thu, 12 Mar 2009 23:09:49 +0000 (16:09 -0700)]
x86: allow extend_brk users to reserve brk space
Impact: new interface; remove hard-coded limit
Add RESERVE_BRK(name, size) macro to reserve space in the brk
area. This should be a conservative (ie, larger) estimate of
how much space might possibly be required from the brk area.
Any unused space will be freed, so there's no real downside
on making the reservation too large (within limits).
The name should be unique within a given file, and somewhat
descriptive.
The C definition of RESERVE_BRK() ends up being more complex than
one would expect to work around a cluster of gcc infelicities:
The first attempt was to simply try putting __section(.brk_reservation)
on a variable. This doesn't work because it ends up making it a
@progbits section, which gets actual space allocated in the vmlinux
executable.
The second attempt was to emit the space into a section using asm,
but gcc doesn't allow arguments to be passed to file-level asm()
statements, making it hard to pass in the size.
The final attempt is to wrap the asm() in a function to allow
it to have arguments, and put the function itself into the
.discard section, which vmlinux*.lds drops entirely from the
emitted vmlinux.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Yinghai Lu [Thu, 12 Mar 2009 23:04:42 +0000 (16:04 -0700)]
x86-32: compute initial mapping size more accurately
Impact: simplification
We only need to map the kernel in head_32.S, not the whole of
lowmem. We use 512MB as a reasonable (but arbitrary) limit on
the maximum size of the kernel image.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jeremy Fitzhardinge [Fri, 27 Feb 2009 21:35:45 +0000 (13:35 -0800)]
x86: use brk allocation for DMI
Impact: use new interface instead of previous ad hoc implementation
Use extend_brk() to allocate memory for DMI rather than having an
ad-hoc allocator.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jeremy Fitzhardinge [Fri, 27 Feb 2009 21:27:38 +0000 (13:27 -0800)]
x86-32: use brk segment for allocating initial kernel pagetable
Impact: use new interface instead of previous ad hoc implementation
Rather than having special purpose init_pg_table_start/end variables
to delimit the kernel pagetable built by head_32.S, just use the brk
mechanism to extend the bss for the new pagetable.
This patch removes init_pg_table_start/end and pg0, defines __brk_base
(which is page-aligned and immediately follows _end), initializes
the brk region to start there, and uses it for the 32-bit pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
H. Peter Anvin [Sun, 15 Mar 2009 00:19:51 +0000 (17:19 -0700)]
x86: move brk initialization out of #ifdef CONFIG_BLK_DEV_INITRD
Impact: build fix
The brk initialization functions were incorrectly located inside
an #ifdef CONFIG_VLK_DEV_INITRD block, causing the obvious build failure in
minimal configurations.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Jeremy Fitzhardinge [Fri, 27 Feb 2009 01:35:44 +0000 (17:35 -0800)]
x86: add brk allocation for very, very early allocations
Impact: new interface
Add a brk()-like allocator which effectively extends the bss in order
to allow very early code to do dynamic allocations. This is better than
using statically allocated arrays for data in subsystems which may never
get used.
The space for brk allocations is in the bss ELF segment, so that the
space is mapped properly by the code which maps the kernel, and so
that bootloaders keep the space free rather than putting a ramdisk or
something into it.
The bss itself, delimited by __bss_stop, ends before the brk area
(__brk_base to __brk_limit). The kernel text, data and bss is reserved
up to __bss_stop.
Any brk-allocated data is reserved separately just before the kernel
pagetable is built, as that code allocates from unreserved spaces
in the e820 map, potentially allocating from any unused brk memory.
Ultimately any unused memory in the brk area is used in the general
kernel memory pool.
Initially the brk space is set to 1MB, which is probably much larger
than any user needs (the largest current user is i386 head_32.S's code
to build the pagetables to map the kernel, which can get fairly large
with a big kernel image and no PSE support). So long as the system
has sufficient memory for the bootloader to reserve the kernel+1MB brk,
there are no bad effects resulting from an over-large brk.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jeremy Fitzhardinge [Tue, 10 Mar 2009 18:19:18 +0000 (11:19 -0700)]
x86: make section delimiter symbols part of their section
Impact: cleanup
Move the symbols delimiting a section part of the section
(section relative) rather than absolute. This avoids any
unexpected gaps between the section-start symbol and the first
data in the section, which could be caused by implicit
alignment of the section data. It also makes the general
form of vmlinux_64.lds.S consistent with vmlinux_32.lds.S.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Linus Torvalds [Sat, 14 Mar 2009 20:43:18 +0000 (13:43 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Fix Xilinx SystemACE driver to handle empty CF slot
block: fix memory leak in bio_clone()
block: Add gfp_mask parameter to bio_integrity_clone()
Grant Likely [Mon, 9 Mar 2009 12:42:24 +0000 (13:42 +0100)]
Fix Xilinx SystemACE driver to handle empty CF slot
The SystemACE driver does not handle an empty CF slot gracefully. An
empty CF slot ends up hanging the system. This patch adds a check for
the CF state and stops trying to process requests if the slot is empty.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Li Zefan [Mon, 9 Mar 2009 09:42:45 +0000 (10:42 +0100)]
block: fix memory leak in bio_clone()
If bio_integrity_clone() fails, bio_clone() returns NULL without freeing
the newly allocated bio.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
un'ichi Nomura [Mon, 9 Mar 2009 09:40:52 +0000 (10:40 +0100)]
block: Add gfp_mask parameter to bio_integrity_clone()
Stricter gfp_mask might be required for clone allocation.
For example, request-based dm may clone bio in interrupt context
so it has to use GFP_ATOMIC.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Sat, 14 Mar 2009 19:02:21 +0000 (12:02 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Mark Eins: Fix configuration.
MIPS: Fix TIF_32BIT undefined problem when seccomp is disabled
Linus Torvalds [Sat, 14 Mar 2009 19:01:37 +0000 (12:01 -0700)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (31 commits)
[SCSI] qla2xxx: Update version number to 8.03.00-k4.
[SCSI] qla2xxx: Correct overwrite of pre-assigned init-control-block structure size.
[SCSI] qla2xxx: Correct truncation in return-code status checking.
[SCSI] qla2xxx: Correct vport delete bug.
[SCSI] qla2xxx: Use correct value for max vport in LOOP topology.
[SCSI] qla2xxx: Correct address range checking for option-rom updates.
[SCSI] fcoe: Change fcoe receive thread nice value from 19 (lowest priority) to -20
[SCSI] fcoe: fix handling of pending queue, prevent out of order frames (v3)
[SCSI] fcoe: Out of order tx frames was causing several check condition SCSI status
[SCSI] fcoe: fix kfree(skb)
[SCSI] fcoe: ETH_P_8021Q is already in if_ether and fcoe is not using it anyway
[SCSI] libfc: do not change the fh_rx_id of a recevied frame
[SCSI] fcoe: Correct fcoe_transports initialization vs. registration
[SCSI] fcoe: Use setup_timer() and mod_timer()
[SCSI] libfc, fcoe: Remove unnecessary cast by removing inline wrapper
[SCSI] libfc, fcoe: Cleanup function formatting and minor typos
[SCSI] libfc, fcoe: Fix kerneldoc comments
[SCSI] libfc: Cleanup libfc_function_template comments
[SCSI] libfc: check for err when recv and state is incorrect
[SCSI] libfc: rename rp to rdata in fc_disc_new_target()
...
Linus Torvalds [Sat, 14 Mar 2009 19:00:42 +0000 (12:00 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata_piix: add workaround for Samsung DB-P70
libata: Keep shadow last_ctl up to date during resets
sata_mv: fix MSI irq race condition
Linus Torvalds [Sat, 14 Mar 2009 19:00:18 +0000 (12:00 -0700)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...
SUNRPC: xprt_connect() don't abort the task if the transport isn't bound
SUNRPC: Fix an Oops due to socket not set up yet...
Bug 11061, NFS mounts dropped
NFS: Handle -ESTALE error in access()
NLM: Fix GRANT callback address comparison when IPv6 is enabled
NLM: Shrink the IPv4-only version of nlm_cmp_addr()
NFSv3: Fix posix ACL code
NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)
SUNRPC: Tighten up the task locking rules in __rpc_execute()
Linus Torvalds [Sat, 14 Mar 2009 18:59:22 +0000 (11:59 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: Use xs->bucket to set xattr value outside
ocfs2: Fix a bug found by sparse check.
ocfs2: tweak to get the maximum inline data size with xattr
ocfs2: reserve xattr block for new directory with inline data
Linus Torvalds [Sat, 14 Mar 2009 18:59:05 +0000 (11:59 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
V4L/DVB (10978): Report tuning algorith correctly
V4L/DVB (10977): STB6100 init fix, the call to stb6100_set_bandwidth needs an argument
V4L/DVB (10976): Bug fix: For legacy applications stv0899 performs search only first time after insmod.
V4L/DVB (10975): Bug: Use signed types, Offsets and range can be negative
V4L/DVB (10974): Use Diseqc 3/3 mode to send data
V4L/DVB (10972): zl10353: i2c_gate_ctrl bug fix
V4L/DVB (10834): zoran: auto-select bt866 for AverMedia 6 Eyes
V4L/DVB (10832): tvaudio: Avoid breakage with tda9874a
V4L/DVB (10789): m5602-s5k4aa: Split up the initial sensor probe in chunks.
Linus Torvalds [Sat, 14 Mar 2009 18:58:38 +0000 (11:58 -0700)]
Merge git://git./linux/kernel/git/kyle/parisc-2.6.29
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6.29:
parisc: update defconfigs
parisc: define x->x mmio accessors
parisc: dino: struct device - replace bus_id with dev_name(), dev_set_name()
parisc: convert cpu_check_affinity to new cpumask api
parisc: convert (read|write)bwlq to inlines
parisc: fix use of new cpumask api in irq.c
parisc: update parisc for new irq_desc
parisc: update MAINTAINERS
parisc: fix wrong assumption about bus->self
parisc: fix 64bit build
parisc: add braces around arguments in assembler macros
parisc: fix dev_printk() compile warnings for accessing a device struct
parisc: remove unused local out_putf label
parisc: fix `struct pt_regs' declared inside parameter list warning
parisc: fix section mismatch warnings
parisc: remove klist iterators
parisc: BUG_ON() cleanup
Linus Torvalds [Sat, 14 Mar 2009 18:58:10 +0000 (11:58 -0700)]
Merge git://git./linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ide: save the returned value of dma_map_sg
ide-floppy: do not map dataless cmds to an sg
Daisuke Nishimura [Fri, 13 Mar 2009 20:52:00 +0000 (13:52 -0700)]
vmscan: pgmoved should be cleared after updating recent_rotated
pgmoved should be cleared after updating recent_rotated.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tyler Hicks [Fri, 13 Mar 2009 20:51:59 +0000 (13:51 -0700)]
eCryptfs: don't encrypt file key with filename key
eCryptfs has file encryption keys (FEK), file encryption key encryption
keys (FEKEK), and filename encryption keys (FNEK). The per-file FEK is
encrypted with one or more FEKEKs and stored in the header of the
encrypted file. I noticed that the FEK is also being encrypted by the
FNEK. This is a problem if a user wants to use a different FNEK than
their FEKEK, as their file contents will still be accessible with the
FNEK.
This is a minimalistic patch which prevents the FNEKs signatures from
being copied to the inode signatures list. Ultimately, it keeps the FEK
from being encrypted with a FNEK.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 13 Mar 2009 20:51:58 +0000 (13:51 -0700)]
nommu: ramfs: don't leak pages when adding to page cache fails
When a ramfs nommu mapping is expanded, contiguous pages are allocated
and added to the pagecache. The caller's reference is then passed on
by moving whole pagevecs to the file lru list.
If the page cache adding fails, make sure that the error path also
moves the pagevec contents which might still contain up to PAGEVEC_SIZE
successfully added pages, of which we would leak references otherwise.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Enrik Berkhan <Enrik.Berkhan@ge.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enrik Berkhan [Fri, 13 Mar 2009 20:51:56 +0000 (13:51 -0700)]
nommu: ramfs: pages allocated to an inode's pagecache may get wrongly discarded
The pages attached to a ramfs inode's pagecache by truncation from nothing
- as done by SYSV SHM for example - may get discarded under memory
pressure.
The problem is that the pages are not marked dirty. Anything that creates
data in an MMU-based ramfs will cause the pages holding that data will
cause the set_page_dirty() aop to be called.
For the NOMMU-based mmap, set_page_dirty() may be called by write(), but
it won't be called by page-writing faults on writable mmaps, and it isn't
called by ramfs_nommu_expand_for_mapping() when a file is being truncated
from nothing to allocate a contiguous run.
The solution is to mark the pages dirty at the point of allocation by the
truncation code.
Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dhananjay Phadke [Fri, 6 Mar 2009 14:52:12 +0000 (14:52 +0000)]
netxen: remove old flash check.
Remove flash size check which made sense only for ancient
boards with 1MB flash. The check is based on values read
from specific locations and fails with firmware size changes.
This prevents driver from getting right mac addresses.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jaswinder Singh Rajput [Fri, 13 Mar 2009 14:29:26 +0000 (19:59 +0530)]
x86: cpu_debug add support for various AMD CPUs
Impact: Added AMD CPUs support
Added flags for various AMD CPUs.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Sebastian Andrzej Siewior [Sat, 14 Mar 2009 11:24:02 +0000 (12:24 +0100)]
x86/centaur: merge 32 & 64 bit version
there should be no difference, except:
* the 64bit variant now also initializes the padlock unit.
* ->c_early_init() is executed again from ->c_init()
* the 64bit fixups made into 32bit path.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: herbert@gondor.apana.org.au
LKML-Reference: <
1237029843-28076-2-git-send-email-sebastian@breakpoint.cc>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 14 Mar 2009 15:25:40 +0000 (16:25 +0100)]
Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug', 'x86/kconfig', 'x86/mm', 'x86/ptrace', 'x86/setup' and 'x86/urgent'; commit 'v2.6.29-rc8' into x86/core
Yinghai Lu [Fri, 13 Mar 2009 21:08:49 +0000 (14:08 -0700)]
x86: print the continous part of fixed mtrrs together
Impact: print out fewer lines
1. print continuous range with same type together
2. change _INFO to _DEBUG
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <
49BACB61.
8000302@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Fri, 13 Mar 2009 19:46:07 +0000 (12:46 -0700)]
x86: fix get_mtrr() warning about smp_processor_id() with CONFIG_PREEMPT=y
Impact: fix debug warning
Jaswinder noticed that there is a warning about smp_processor_id()
in get_mtrr().
Fix it by wrapping the printout into a get/put_cpu() pair.
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <
49BAB7FF.
4030107@kernel.org>
[ changed to get/put_cpu(), cleaned up surrounding code a it. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Fri, 13 Mar 2009 05:36:01 +0000 (22:36 -0700)]
x86: make e820_update_range() handle small range update
Impact: enhance e820 code to handle more cases
Try to handle new range which could be covered by one entry.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: jbeulich@novell.com
LKML-Reference: <
49B9F0C1.10402@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 14 Mar 2009 07:46:17 +0000 (08:46 +0100)]
x86: cpu/common.c more cleanups
Complete/fix the cleanups of cpu/common.c:
- fix ugly warning due to asm/topology.h -> linux/topology.h change
- standardize the style across the file
- simplify/refactor the code flow where possible
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
LKML-Reference: <
1237009789.4387.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 14 Mar 2009 08:50:10 +0000 (09:50 +0100)]
Merge branch 'core/percpu' into x86/core
Pallipadi, Venkatesh [Fri, 13 Mar 2009 23:35:44 +0000 (16:35 -0700)]
VM, x86, PAT: add a new vm flag to track full pfnmap at mmap
Impact: cleanup
Add a new vm flag VM_PFN_AT_MMAP to identify a PFNMAP that is
fully mapped with remap_pfn_range. Patch removes the overloading
of VM_INSERTPAGE from the earlier patch.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Nick Piggin <npiggin@suse.de>
LKML-Reference: <
20090313233543.GA19909@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>