Christoph Lameter [Thu, 27 Mar 2008 04:03:04 +0000 (21:03 -0700)]
x86: stricter check in follow_huge_addr()
The first page of the compound page is determined in follow_huge_addr()
but then PageCompound() only checks if the page is part of a compound page.
PageHead() allows checking if this is indeed the first page of the
compound.
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Florian Fainelli [Wed, 26 Mar 2008 21:39:15 +0000 (22:39 +0100)]
rdc321x: GPIO routines bugfixes
This patch fixes the use of GPIO routines which are in the PCI
configuration space of the RDC321x, therefore reading/writing
to this space without spinlock protection can be problematic.
We also now request and free GPIOs and support the MGB100
board, previous code was very AR525W-centric.
Signed-off-by: Volker Weiss <volker@tintuc.de>
Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andrew Morton [Tue, 4 Mar 2008 23:05:39 +0000 (15:05 -0800)]
x86: ptrace.c: fix defined-but-unused warnings
arch/x86/kernel/ptrace.c:548: warning: 'ptrace_bts_get_size' defined but not used
arch/x86/kernel/ptrace.c:558: warning: 'ptrace_bts_read_record' defined but not used
arch/x86/kernel/ptrace.c:607: warning: 'ptrace_bts_clear' defined but not used
arch/x86/kernel/ptrace.c:617: warning: 'ptrace_bts_drain' defined but not used
arch/x86/kernel/ptrace.c:720: warning: 'ptrace_bts_config' defined but not used
arch/x86/kernel/ptrace.c:788: warning: 'ptrace_bts_status' defined but not used
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar [Thu, 27 Mar 2008 14:58:28 +0000 (15:58 +0100)]
x86: fix prefetch workaround
some early Athlon XP's and Opterons generate bogus faults on prefetch
instructions. The workaround for this regressed over .24 - reinstate it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Wed, 26 Mar 2008 22:02:12 +0000 (15:02 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: fix performance drop for glx
x86: fix trim mtrr not to setup_memory two times
x86: GEODE: add missing module.h include
x86, cpufreq: fix Speedfreq-SMI call that clobbers ECX
x86: fix memoryless node oops during boot
x86: add dmi quirk for io_delay
x86: convert mtrr/generic.c to kernel-doc
x86: Documentation/i386/IO-APIC.txt: fix description
Nishanth Aravamudan [Wed, 26 Mar 2008 21:40:20 +0000 (14:40 -0700)]
hugetlb: fix potential livelock in return_unused_surplus_hugepages()
Running the counters testcase from libhugetlbfs results in on 2.6.25-rc5
and 2.6.25-rc5-mm1:
BUG: soft lockup - CPU#3 stuck for 61s! [counters:10531]
NIP:
c0000000000d1f3c LR:
c0000000000d1f2c CTR:
c0000000001b5088
REGS:
c000005db12cb360 TRAP: 0901 Not tainted (2.6.25-rc5-autokern1)
MSR:
8000000000009032 <EE,ME,IR,DR> CR:
48008448 XER:
20000000
TASK =
c000005dbf3d6000[10531] 'counters' THREAD:
c000005db12c8000 CPU: 3
GPR00:
0000000000000004 c000005db12cb5e0 c000000000879228 0000000000000004
GPR04:
0000000000000010 0000000000000000 0000000000200200 0000000000100100
GPR08:
c0000000008aba10 000000000000ffff 0000000000000004 0000000000000000
GPR12:
0000000028000442 c000000000770080
NIP [
c0000000000d1f3c] .return_unused_surplus_pages+0x84/0x18c
LR [
c0000000000d1f2c] .return_unused_surplus_pages+0x74/0x18c
Call Trace:
[
c000005db12cb5e0] [
c000005db12cb670] 0xc000005db12cb670 (unreliable)
[
c000005db12cb670] [
c0000000000d24c4] .hugetlb_acct_memory+0x2e0/0x354
[
c000005db12cb740] [
c0000000001b5048] .truncate_hugepages+0x1d4/0x214
[
c000005db12cb890] [
c0000000001b50a4] .hugetlbfs_delete_inode+0x1c/0x3c
[
c000005db12cb920] [
c000000000103fd8] .generic_delete_inode+0xf8/0x1c0
[
c000005db12cb9b0] [
c0000000001b5100] .hugetlbfs_drop_inode+0x3c/0x24c
[
c000005db12cba50] [
c00000000010287c] .iput+0xdc/0xf8
[
c000005db12cbad0] [
c0000000000fee54] .dentry_iput+0x12c/0x194
[
c000005db12cbb60] [
c0000000000ff050] .d_kill+0x6c/0xa4
[
c000005db12cbbf0] [
c0000000000ffb74] .dput+0x18c/0x1b0
[
c000005db12cbc70] [
c0000000000e9e98] .__fput+0x1a4/0x1e8
[
c000005db12cbd10] [
c0000000000e61ec] .filp_close+0xb8/0xe0
[
c000005db12cbda0] [
c0000000000e62d0] .sys_close+0xbc/0x134
[
c000005db12cbe30] [
c00000000000872c] syscall_exit+0x0/0x40
Instruction dump:
ebbe8038 38800010 e8bf0002 3bbd0008 7fa3eb78 38a50001 7ca507b4 4818df25
60000000 38800010 38a00000 7c601b78 <
7fa3eb78>
2f800010 409d0008 38000010
This was tracked down to a potential livelock in
return_unused_surplus_hugepages(). In the case where we have surplus
pages on some node, but no free pages on the same node, we may never
break out of the loop. To avoid this livelock, terminate the search if
we iterate a number of times equal to the number of online nodes without
freeing a page.
Thanks to Andy Whitcroft and Adam Litke for helping with debugging and
the patch.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nishanth Aravamudan [Wed, 26 Mar 2008 21:37:53 +0000 (14:37 -0700)]
hugetlb: indicate surplus huge page counts in per-node meminfo
Currently we show the surplus hugetlb pool state in /proc/meminfo, but
not in the per-node meminfo files, even though we track the information
on a per-node basis. Printing it there can help track down dynamic pool
bugs including the one in the follow-on patch.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Suresh Siddha [Wed, 26 Mar 2008 00:39:12 +0000 (17:39 -0700)]
x86: fix performance drop for glx
fix the 3D performance drop reported at:
http://bugzilla.kernel.org/show_bug.cgi?id=10328
fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.
The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
fb drivers to use this new ioremap_wc() API.
We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Sun, 23 Mar 2008 07:16:49 +0000 (00:16 -0700)]
x86: fix trim mtrr not to setup_memory two times
we could call find_max_pfn() directly instead of setup_memory() to get
max_pfn needed for mtrr trimming.
otherwise setup_memory() is called two times... that is duplicated...
[ mingo@elte.hu: both Thomas and me simulated a double call to
setup_bootmem_allocator() and can confirm that it is a real bug
which can hang in certain configs. It's not been reported yet but
that is probably due to the relatively scarce nature of
MTRR-trimming systems. ]
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andres Salomon [Wed, 26 Mar 2008 18:13:01 +0000 (14:13 -0400)]
x86: GEODE: add missing module.h include
On Wed, 26 Mar 2008 11:56:22 -0600
Jordan Crouse <jordan.crouse@amd.com> wrote:
> On 26/03/08 14:31 +0100, Stefan Pfetzing wrote:
> > Hello Jordan,
> >
> > I just tried to build your geodwdt driver for the geode watchdog. Therefore
> > I pulled your repository from http://git.infradead.org/geode.git (or more,
> > the git url).
> >
> > I tried to build the geodewdt driver as a module - which didn't work, and
> > it failed with the same problem as earlier mentioned on lkmk [1]. I also
> > checked the fix [2], but that seems to be already in your (or linus) tree -
> > and so I'm unsure what the problem is.
> >
> > [1] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884074
> > [2] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884174
> >
> > Building directly into the kernel seems to work.
> >
> > Maybe you have some idea?
>
> Hmm - that is strange. Exporting the symbols should work. I recommend
> starting over with a clean tree.
>
> CCing Andres - any thoughts?
>
> Jordan
>
Er, yeah. The patch below should fix it. This should probably go into
2.6.25.
Oops, EXPORT_SYMBOL_GPL wasn't being declared due to this header
being missing.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephan Diestelhorst [Mon, 10 Mar 2008 15:05:41 +0000 (16:05 +0100)]
x86, cpufreq: fix Speedfreq-SMI call that clobbers ECX
I have found that using SMI to change the cpu's frequency on my DELL
Latitude L400 clobbers the ECX register in speedstep_set_state, causing
unneccessary retries because the "state" variable has changed silently (GCC
assumes it is still present in ECX).
play safe and avoid gcc caching any register across IO port accesses
that trigger SMIs.
Signed-off by: <Stephan.Diestelhorst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Mon, 25 Feb 2008 07:23:09 +0000 (23:23 -0800)]
x86: fix memoryless node oops during boot
fix oops during boot reported in this thread:
http://lkml.org/lkml/2008/2/6/65
enable booting on memoryless nodes.
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 21 Mar 2008 09:06:32 +0000 (10:06 +0100)]
x86: add dmi quirk for io_delay
reported by mereandor@gmail.com, in:
http://bugzilla.kernel.org/show_bug.cgi?id=6307
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Randy Dunlap [Thu, 13 Mar 2008 23:59:12 +0000 (16:59 -0700)]
x86: convert mtrr/generic.c to kernel-doc
Convert function comment blocks to kernel-doc notation.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Nick Andrew [Tue, 4 Mar 2008 23:05:40 +0000 (15:05 -0800)]
x86: Documentation/i386/IO-APIC.txt: fix description
The description of the interrupt routing doesn't match the (nice) diagram.
Signed-off-by: Nick Andrew <nick@nick-andrew.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Masami Hiramatsu [Wed, 26 Mar 2008 19:53:19 +0000 (15:53 -0400)]
kprobes: MAINTAINERS update
Add Masami Hiramatsu to kprobes maintainers
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 26 Mar 2008 18:30:17 +0000 (11:30 -0700)]
Merge branch 'slab-linus' of git://git./linux/kernel/git/christoph/vm
* 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm:
slab: fix cache_cache bootstrap in kmem_cache_init()
count_partial() is not used if !SLUB_DEBUG and !CONFIG_SLABINFO
Linus Torvalds [Wed, 26 Mar 2008 18:29:35 +0000 (11:29 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/tglx/linux-2.6-hrt
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
NOHZ: reevaluate idle sleep length after add_timer_on()
clocksource: revert: use init_timer_deferrable for clocksource_watchdog
Linus Torvalds [Wed, 26 Mar 2008 18:27:24 +0000 (11:27 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
relay: set an spd_release() hook for splice
set relay file can not be read by pread(2)
Tom Tucker [Wed, 26 Mar 2008 02:27:19 +0000 (22:27 -0400)]
SVCRDMA: Check num_sge when setting LAST_CTXT bit
The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly
when the last chunk in the read-list spanned multiple pages. This
resulted in a kernel panic when the wrong context was used to
build the RPC iovec page list.
RDMA_READ is used to fetch RPC data from the client for
NFS_WRITE requests. A scatter-gather is used to map the
advertised client side buffer to the server-side iovec and
associated page list.
WR contexts are used to convey which scatter-gather entries are
handled by each WR. When the write data is large, a single RPC may
require multiple RDMA_READ requests so the contexts for a single RPC
are chained together in a linked list. The last context in this list
is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes,
the CQ handler code can enqueue the RPC for processing.
The code in rdma_read_xdr was setting this bit on the last two
contexts on this list when the last read-list chunk spanned multiple
pages. This caused the svc_rdma_recvfrom logic to incorrectly build
the RPC and caused the kernel to crash because the second-to-last
context doesn't contain the iovec page list.
Modified the condition that sets this bit so that it correctly detects
the last context for the RPC.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tested-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 26 Mar 2008 18:22:40 +0000 (11:22 -0700)]
Revert "PCI: remove transparent bridge sizing"
This reverts commit
8fa5913d54f3b1e09948e6a0db34da887e05ff1f, which
caused various interesting problems for people, including wrong resource
allocations. See for example bugzilla entry "2.6.25-rc2: ohci1394
problem (MMIO broken)" at
http://bugzilla.kernel.org/show_bug.cgi?id=10080
And Gary Hade says:
"The same change had also exposed an issue reported by Paul Martin that
has been causing an Oops while hotplugging ThinkPads to a ThinkPad
Dock II. See
http://lkml.org/lkml/2008/2/19/405
http://bugzilla.kernel.org/show_bug.cgi?id=9961
I have a fix for the ThinkPad docking Oops but if the issue being
discussed here is caused by the transparent bridge sizing removal
change I totally agree that it should be reverted."
The transparent bridge sizing removal change was motivated by
insufficient PCI memory resource for a transparent bridge window that
was being created as a result of expansion ROM(s) being included in
the transparent bridge sizing calculations.
A later "PCI: Remove default PCI expansion ROM memory allocation"
change ( re: http://lkml.org/lkml/2007/12/11/361 ) removes the
expansion ROM(s) from the transparent bridge sizing calculations which
actually resolves the original issue in a different manner. So, even
if the "PCI: remove transparent bridge sizing" is not problematic it
is no longer needed anyway."
Identified-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Tested-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Yeisley [Tue, 25 Mar 2008 21:59:08 +0000 (23:59 +0200)]
slab: fix cache_cache bootstrap in kmem_cache_init()
Commit
556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Christoph Lameter [Wed, 19 Mar 2008 20:42:07 +0000 (13:42 -0700)]
count_partial() is not used if !SLUB_DEBUG and !CONFIG_SLABINFO
Avoid warnings about unused functions if neither SLUB_DEBUG nor CONFIG_SLABINFO
is defined. This patch will be reversed when slab defrag is merged since slab
defrag requires count_partial() to determine the fragmentation status of
slab caches.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Jens Axboe [Wed, 26 Mar 2008 11:04:09 +0000 (12:04 +0100)]
relay: set an spd_release() hook for splice
relay doesn't reference the pages it adds, however we need a non-NULL
hook or splice_to_pipe() can oops.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Lai Jiangshan [Wed, 26 Mar 2008 11:01:28 +0000 (12:01 +0100)]
set relay file can not be read by pread(2)
I found that relay files can be read by pread(2). I fix it,
for relay files are not capable of seeking.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Thomas Gleixner [Sat, 22 Mar 2008 08:20:24 +0000 (09:20 +0100)]
NOHZ: reevaluate idle sleep length after add_timer_on()
add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.
To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.
Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.
Call this function from add_timer_on().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
--
include/linux/sched.h | 6 ++++++
kernel/sched.c | 43 +++++++++++++++++++++++++++++++++++++++++++
kernel/timer.c | 10 +++++++++-
3 files changed, 58 insertions(+), 1 deletion(-)
Linus Torvalds [Wed, 26 Mar 2008 01:38:14 +0000 (18:38 -0700)]
Linux 2.6.25-rc7
Bjorn Helgaas [Tue, 25 Mar 2008 17:21:11 +0000 (11:21 -0600)]
ACPI: fix Medion _PRT quirk (use "ISA_", not "ISA")
This fixes the builtin RTL8139 NIC on the Medion MD9580-F laptop. The
BIOS reports the interrupt routing incorrectly. I recently added a
quirk to work around this, and this patch fixes a typo in the quirk.
We pad every ACPI pathname component to four characters, so ".ISA." will
never match anything. We need ".ISA_." instead.
Thank you Johann-Nikolaus Andreae <johann-nikolaus.andreae@nacs.de>
for patiently testing this patch.
See http://bugzilla.kernel.org/show_bug.cgi?id=4773
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Tue, 25 Mar 2008 08:01:51 +0000 (09:01 +0100)]
clocksource: revert: use init_timer_deferrable for clocksource_watchdog
Revert
commit
1077f5a917b7c630231037826b344b2f7f5b903f
Author: Parag Warudkar <parag.warudkar@gmail.com>
Date: Wed Jan 30 13:30:01 2008 +0100
clocksource.c: use init_timer_deferrable for clocksource_watchdog
clocksource_watchdog can use a deferrable timer - reduces wakeups from
idle per second.
The watchdog timer needs to run with the specified interval. Otherwise
it will miss the possible wrap of the watchdog clocksource.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Linus Torvalds [Tue, 25 Mar 2008 16:06:44 +0000 (09:06 -0700)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Fix docbook problem
ASoC/TLV320AIC3X: Stop I2C driver ID abuse
i2c-omap: Fix unhandled fault
i2c-bfin-twi: Disable BF54x support for now
Linus Torvalds [Tue, 25 Mar 2008 16:06:19 +0000 (09:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/avi/kvm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: MMU: Fix memory leak on guest demand faults
KVM: VMX: convert init_rmode_tss() to slots_lock
KVM: MMU: handle page removal with shadow mapping
KVM: MMU: Fix is_rmap_pte() with io ptes
KVM: VMX: Restore tss even on x86_64
Linus Torvalds [Tue, 25 Mar 2008 15:57:47 +0000 (08:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
[PATCH] get stack footprint of pathname resolution back to relative sanity
[PATCH] double iput() on failure exit in hugetlb
[PATCH] double dput() on failure exit in tiny-shmem
[PATCH] fix up new filp allocators
[PATCH] check for null vfsmount in dentry_open()
[PATCH] reiserfs: eliminate private use of struct file in xattr
[PATCH] sanitize hppfs
hppfs pass vfsmount to dentry_open()
[PATCH] restore export of do_kern_mount()
Avi Kivity [Sun, 23 Mar 2008 12:21:08 +0000 (14:21 +0200)]
KVM: MMU: Fix memory leak on guest demand faults
While backporting
72dc67a69690288538142df73a7e3ac66fea68dc, a gfn_to_page()
call was duplicated instead of moved (due to an unrelated patch not being
present in mainline). This caused a page reference leak, resulting in a
fairly massive memory leak.
Fix by removing the extraneous gfn_to_page() call.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Marcelo Tosatti [Tue, 18 Mar 2008 20:42:34 +0000 (17:42 -0300)]
KVM: VMX: convert init_rmode_tss() to slots_lock
init_rmode_tss was forgotten during the conversion from mmap_sem to
slots_lock.
INFO: task qemu-system-x86:3748 blocked for more than 120 seconds.
Call Trace:
[<
ffffffff8053d100>] __down_read+0x86/0x9e
[<
ffffffff8053fb43>] do_page_fault+0x346/0x78e
[<
ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a
[<
ffffffff8053dcad>] error_exit+0x0/0xa9
[<
ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40
[<
ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f
[<
ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9
[<
ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a
[<
ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53
[<
ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf
[<
ffffffff80249dea>] __lock_acquire+0x4f7/0xc28
[<
ffffffff8028fad9>] vfs_ioctl+0x21/0x6b
[<
ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b
[<
ffffffff8028fdca>] sys_ioctl+0x3c/0x5e
[<
ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Marcelo Tosatti [Mon, 17 Mar 2008 13:08:18 +0000 (10:08 -0300)]
KVM: MMU: handle page removal with shadow mapping
Do not assume that a shadow mapping will always point to the same host
frame number. Fixes crash with madvise(MADV_DONTNEED).
[avi: move after first printk(), add another printk()]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Sun, 23 Mar 2008 10:18:19 +0000 (12:18 +0200)]
KVM: MMU: Fix is_rmap_pte() with io ptes
is_rmap_pte() doesn't take into account io ptes, which have the avail bit set.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Sun, 16 Mar 2008 16:48:26 +0000 (18:48 +0200)]
KVM: VMX: Restore tss even on x86_64
The vmx hardware state restore restores the tss selector and base address, but
not its length. Usually, this does not matter since most of the tss contents
is within the default length of 0x67. However, if a process is using ioperm()
to grant itself I/O port permissions, an additional bitmap within the tss,
but outside the default length is consulted. The effect is that the process
will receive a SIGSEGV instead of transparently accessing the port.
Fix by restoring the tss length. Note that i386 had this working already.
Closes bugzilla 10246.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Linus Torvalds [Tue, 25 Mar 2008 06:24:16 +0000 (23:24 -0700)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: Fix cut-and-paste error in rtl8150.c
USB: ehci: stop vt6212 bus hogging
USB: sierra: add another device id
USB: sierra: dma fixes
USB: add support for Motorola ROKR Z6 cellphone in mass storage mode
USB: isd200: fix memory leak in isd200_get_inquiry_data
USB: pl2303: another product ID
USB: new quirk flag to avoid Set-Interface
USB: fix gadgetfs class request delegation
Linus Torvalds [Tue, 25 Mar 2008 06:23:39 +0000 (23:23 -0700)]
Merge git://git./linux/kernel/git/gregkh/driver-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
driver core: debug for bad dev_attr_show() return value.
UIO: add pgprot_noncached() to UIO mmap code
Andrew Morton [Mon, 17 Mar 2008 21:21:18 +0000 (14:21 -0700)]
PCI: revert "pcie: utilize pcie transaction pending bit"
Revert as it is reported to cause problems for people.
commit
4348a2dc49f9baecd34a9b0904245488c6189398
Author: Shaohua Li <shaohua.li@intel.com>
Date: Wed Oct 24 10:45:08 2007 +0800
pcie: utilize pcie transaction pending bit
PCIE has a mechanism to wait for Non-Posted request to complete. I think
pci_disable_device is a good place to do this.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Due to the regression reported at
http://bugzilla.kernel.org/show_bug.cgi?id=10065
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Soeren Sonnenburg <kernel@nn7.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mark Gross [Tue, 4 Mar 2008 22:59:31 +0000 (14:59 -0800)]
PCI: iova: lockdep false alarm fix
lockdep goes off on the iova copy_reserved_iova() because it and a function
it calls grabs locks in the from, and the to of the copy operation.
The function grab locks of the same lock classes triggering the warning. The
first lock grabbed is for the constant reserved areas that is never accessed
after early boot. Technically you could do without grabbing the locks for the
"from" structure its copying reserved areas from.
But dropping the from locks to me looks wrong, even though it would be ok.
The affected code only runs in early boot as its setting up the DMAR
engines.
This patch gives the reserved_ioval_list locks special lockdep classes.
Signed-off-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andrew Morton [Tue, 4 Mar 2008 23:09:07 +0000 (15:09 -0800)]
driver core: debug for bad dev_attr_show() return value.
Try to find the culprit who caused
http://bugzilla.kernel.org/show_bug.cgi?id=10150
Cc: <balajirrao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jean-Samuel Chenard [Fri, 14 Mar 2008 10:28:36 +0000 (11:28 +0100)]
UIO: add pgprot_noncached() to UIO mmap code
Mapping of physical memory in UIO needs pgprot_noncached() to ensure
that IO memory is not cached. Without pgprot_noncached(), it (accidentally)
works on x86 and arm, but fails on PPC.
Signed-off-by: Jean-Samuel Chenard <jsamch@gmail.com>
Signed-off-by: Hans J Koch <hjk@linutronix.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Robert P. J. Day [Mon, 10 Mar 2008 00:44:52 +0000 (20:44 -0400)]
USB: Fix cut-and-paste error in rtl8150.c
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rene Herman [Thu, 20 Mar 2008 07:58:16 +0000 (00:58 -0700)]
USB: ehci: stop vt6212 bus hogging
The VIA VT6212 defaults to only waiting 1us between passes over EHCI's
async ring, which hammers PCI badly ... and by preventing other devices
from accessing the bus, causes problems like drops in IDE throughput,
a problem that's been bugging users of those chips for several years.
A (partial) datasheet for this chip eventually turned up, letting us
see how to make it use a VIA-specific register to switch over to the
the normal 10us value instead, as suggested by the EHCI specification
Solution noted by Lev A. Melnikovsky.
It's not clear whether this register exists on other VIA chips; we
know that it's ineffective on the vt8235. So this patch only applies
to chips that seem to be incarnations of the (discrete) vt6212.
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Tested-by: Lev A. Melnikovsky <melnikovsky@mail.ru>
Tested-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kevin Lloyd [Fri, 14 Mar 2008 07:53:24 +0000 (00:53 -0700)]
USB: sierra: add another device id
Add support for the MC8775 device to the sierra driver.
Signed-off-by: Kevin Lloyd <klloyd@sierrawireless.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Oliver Neukum [Fri, 14 Mar 2008 07:53:24 +0000 (00:53 -0700)]
USB: sierra: dma fixes
while I was adding autosuspend to that driver I noticed a few issues.
You were having DMAed buffers as a part of a structure.
This will fail on platforms that are not DMA-coherent (arm, sparc, ppc, ...)
Please test this patch to fix it.
Signed-off-by: Kevin Lloyd <klloyd@sierrawireless.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Constantin Baranov [Sun, 16 Mar 2008 20:04:23 +0000 (20:04 +0000)]
USB: add support for Motorola ROKR Z6 cellphone in mass storage mode
Motorola ROKR Z6 cellphone has bugs in its USB, so it is impossible to use
it as mass storage. Patch describes new "unusual" USB device for it with
FIX_INQUIRY and FIX_CAPACITY flags and new BULK_IGNORE_TAG flag.
Last flag relaxes check for equality of bcs->Tag and us->tag in
usb_stor_Bulk_transport routine.
Signed-off-by: Constantin Baranov <const@tltsu.ru>
Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Boaz Harrosh [Mon, 17 Mar 2008 21:21:01 +0000 (14:21 -0700)]
USB: isd200: fix memory leak in isd200_get_inquiry_data
If the inquiry fails then the info structure on us->extra was not freed.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Max Arnold [Thu, 20 Mar 2008 09:43:56 +0000 (16:43 +0700)]
USB: pl2303: another product ID
Device like this http://aldiga.com/english/A-100-USB-EDGE10.htm
contains Prolific 2303 chip.
Actually their site a bit outdated - I have AlDiga AL-11U
GSM/GPRS/EDGE modem and it works with pl2303 module after adding
corresponding product ID.
By default modem uses baud rate 460800. GSM chipset - SIMCom SIM600,
quad band 850/900/1800/1900 MHz
Device info:
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 5 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=067b ProdID=0611 Rev= 0.00
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=pl2303
E: Ad=81(I) Atr=03(Int.) MxPS= 10 Ivl=1ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
From: Max Arnold <lwarxx@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Tue, 11 Mar 2008 14:20:12 +0000 (10:20 -0400)]
USB: new quirk flag to avoid Set-Interface
This patch (as1057) fixes a problem with the X-Rite/Gretag-Macbeth
Eye-One Pro display colorimeter; the device crashes when it receives a
Set-Interface request. A new quirk (USB_QUIRK_NO_SET_INTF) is
introduced and a quirks entry is created for this device.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Roy Hashimoto [Wed, 12 Mar 2008 21:55:31 +0000 (13:55 -0800)]
USB: fix gadgetfs class request delegation
gadgetfs (drivers/usb/gadget/inode.c) was not delegating all
non-device requests to userspace. This patch makes the handling of
all request cases consistent.
Signed-off-by: Roy Hashimoto <hashimot@alumni.caltech.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus Torvalds [Tue, 25 Mar 2008 03:35:48 +0000 (20:35 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] mpc5200: Fix incorrect compatible string for the mdio node
[POWERPC] Update some defconfigs
Linus Torvalds [Tue, 25 Mar 2008 03:02:32 +0000 (20:02 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] ahci: SB600 workaround is suspect... play it safe for now
sata_promise: fix hardreset hotplug events, take 2
libata: improve HPA error handling
libata: assume no device is attached if both IDENTIFYs are aborted
pata_it821x: use raw nbytes in check_atapi_dma
libata: implement ata_qc_raw_nbytes()
Jeff Garzik [Tue, 25 Mar 2008 02:40:40 +0000 (22:40 -0400)]
[libata] ahci: SB600 workaround is suspect... play it safe for now
At least one report claims that
a878539ef994787c447a98c2e3ba0fe3dad984ec
failed to solve lockups, whereas the old limit-to-32-bit trick worked.
Restore the 32-bit limit, but also leave the 255-sector limit in place,
because we know that's needed as well.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Tue, 25 Mar 2008 02:32:38 +0000 (19:32 -0700)]
Merge git://git./linux/kernel/git/sam/kbuild-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-linus:
kbuild: soften modpost checks when doing cross builds
Paul Mackerras [Tue, 25 Mar 2008 02:31:46 +0000 (13:31 +1100)]
Merge branch 'linux-2.6' into merge
Mikael Pettersson [Sun, 23 Mar 2008 17:41:01 +0000 (18:41 +0100)]
sata_promise: fix hardreset hotplug events, take 2
A Promise SATA controller will signal hotplug events when a hard
reset (COMRESET) is done on a port. These events aren't masked by
the driver, and the unexpected interrupts will cause a sequence
of failed reset attempts util libata's EH finally gives up.
This has not been a common problem so far, but the pending libata
hardreset-by-default changes makes it a critical issue.
The solution is to disable hotplug events before a reset, and to
reenable them afterwards. (Promise's driver does this too.)
This patch adds SATA-specific versions of ->freeze() and ->thaw()
that also disable and enable hotplug events. PATA ports continue
to use the old versions of ->freeze() and ->thaw().
Accesses to the hotplug register must be serialised via host->lock.
We rely on ap->lock == &ap->host->lock and that libata takes this
lock before ->freeze() and ->thaw(). Document this requirement.
The interrupt handler is adjusted so its hotplug register accesses
are inside the region protected by host->lock.
Tested on various chips (SATA300TX4, SATA300TX2plus, SATAII150TX4,
FastTrack TX4000) with various combinations of SATA and PATA disks,
with and without the pending hardreset-by-default changes.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Tue, 25 Mar 2008 02:25:08 +0000 (19:25 -0700)]
Make printk() console semaphore accesses sensible
The printk() logic on when/how to get the console semaphore was
unreadable, this splits the code up into a few helper functions and
makes it easier to follow what is going on.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Emelyanov [Mon, 24 Mar 2008 19:29:53 +0000 (12:29 -0700)]
bsd_acct: using task_struct->tgid is not right in pid-namespaces
In case we're accounting from a sub-namespace, the tgids reported will not
refer to the right namespace.
Save the pid_namespace we're accounting in on the acct_glbs and use it in
do_acct_process.
Two less :) places using the task_struct.tgid member.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Emelyanov [Mon, 24 Mar 2008 19:29:52 +0000 (12:29 -0700)]
bsd_acct: plain current->real_parent access is not always safe
This is minor, but dereferencing even current real_parent is not safe on debug
kernels, since the memory, this points to, can be unmapped - RCU protection is
required.
Besides, the tgid field is deprecated and is to be replaced with task_tgid_xxx
call (the 2nd patch), so RCU will be required anyway.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Mon, 24 Mar 2008 19:29:52 +0000 (12:29 -0700)]
revert "kswapd should only wait on IO if there is IO"
Revert commit
f1a9ee758de7de1e040de849fdef46e6802ea117:
Author: Rik van Riel <riel@redhat.com>
Date: Thu Feb 7 00:14:08 2008 -0800
kswapd should only wait on IO if there is IO
The current kswapd (and try_to_free_pages) code has an oddity where the
code will wait on IO, even if there is no IO in flight. This problem is
notable especially when the system scans through many unfreeable pages,
causing unnecessary stalls in the VM.
Additionally, tasks without __GFP_FS or __GFP_IO in the direct reclaim path
will sleep if a significant number of pages are encountered that should be
written out. This gives kswapd a chance to write out those pages, while
the direct reclaim task sleeps.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because of large latencies and interactivity problems reported by Carlos,
here: http://lkml.org/lkml/2008/3/22/211
Cc: Rik van Riel <riel@redhat.com>
Cc: "Carlos R. Mafra" <crmafra2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Mon, 24 Mar 2008 19:29:51 +0000 (12:29 -0700)]
hw_random doc updates
Update documentation for the hw_random support to be current:
- Documentation/hw_random.txt has been updated to reflect the
current code: it's a framework now, a "core" with a small
sysfs interface, that hardware-specific drivers plug in to.
Text specific to Intel hardware is now at the end.
- Kconfig now references the Documentation/hw_random.txt file
and better explains what this really does.
Both chunks of documentation now higlight the fact that the kernel entropy
pool is maintained by "rngd", and this driver has nothing directly to do with
that important task.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ahmed S. Darwish [Mon, 24 Mar 2008 19:29:49 +0000 (12:29 -0700)]
smackfs: remove redundant lock, fix open(,O_RDWR)
Older smackfs was parsing MAC rules by characters, thus a need of locking
write sessions on open() was needed. This lock is no longer useful now since
each rule is handled by a single write() call.
This is also a bugfix since seq_open() was not called if an open() O_RDWR flag
was given, leading to a seq_read() without an initialized seq_file, thus an
Oops.
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Reported-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mathieu Desnoyers [Mon, 24 Mar 2008 19:29:49 +0000 (12:29 -0700)]
markers: remove ACCESS_ONCE
As Paul pointed out, the ACCESS_ONCE are not needed because we already have
the explicit surrounding memory barriers.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Mason <mmlnx@us.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: David Smith <dsmith@redhat.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Adrian Bunk <adrian.bunk@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mathieu Desnoyers [Mon, 24 Mar 2008 19:29:47 +0000 (12:29 -0700)]
markers: update preempt_disable. call_rcu, rcu_barrier comments
Add comments requested by Andrew.
Updated comments about synchronize_sched(). Since we use call_rcu and
rcu_barrier now, these comments were out of sync with the code.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Mason <mmlnx@us.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: David Smith <dsmith@redhat.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Adrian Bunk <adrian.bunk@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Mon, 24 Mar 2008 19:29:45 +0000 (12:29 -0700)]
mm: fix boundary checking in free_bootmem_core
With numa enabled, some callers could have a range of memory on one node
but try to free that on other node. This can cause some pages to be
freed wrongly.
For example: when we try to allocate 128g boot ram early for
gart/swiotlb, and free that range later so gart/swiotlb can get some
range afterwards.
With this patch, we don't need to care which node holds the range, just
loop to call free_bootmem_node for all online nodes.
This patch makes free_bootmem_core() more robust by trimming the sidx
and eidx according the ram range that the node has.
And make the free_bootmem_core handle this out of range case. We could
use bdata_list to make sure the range can be freed for sure. So next
time, we don't need to loop online nodes and could use free_bootmem
directly.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo van Lil [Mon, 24 Mar 2008 19:29:44 +0000 (12:29 -0700)]
mtd: memory corruption in block2mtd.c
The block2mtd driver (drivers/mtd/devices/block2mtd.c) will kfree an on-stack
pointer when handling an invalid argument line (e.g.
block2mtd=/dev/loop0,xxx).
The kfree was added some time ago when "name" was dynamically allocated.
Signed-off-by: Ingo van Lil <inguin@gmx.de>
Acked-by: Joern Engel <joern@lazybastard.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <stable@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Machek [Mon, 24 Mar 2008 19:29:43 +0000 (12:29 -0700)]
kernel-parameters.txt: document memmap option better
Provide example for memmap exclude option (it is slightly strange and
non-trivial) and provide nice small HOWTO for people with bad memory.
Signed-off-by: Jan-Simon Moeller <dl9pf@gmx.de>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Grant Likely [Sat, 22 Mar 2008 03:25:15 +0000 (14:25 +1100)]
[POWERPC] mpc5200: Fix incorrect compatible string for the mdio node
The MDIO node in the lite5200b.dts file needs to also claim compatibility
with the older mpc5200 chip. Otherwise the driver won't find the device.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tejun Heo [Sun, 23 Mar 2008 12:05:15 +0000 (21:05 +0900)]
libata: improve HPA error handling
There's no point in retrying and eventually failing device detection
when the device rejects READ_NATIVE_MAX[_EXT]. Disable HPA unlocking
if READ_NATIVE_MAX[_EXT] is rejected as done when SET_MAX[_EXT] is
rejected.
This allows some old drives to work even if they aren't blacklisted.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Sun, 23 Mar 2008 06:16:53 +0000 (15:16 +0900)]
libata: assume no device is attached if both IDENTIFYs are aborted
This is to fix bugzilla #10254. QSI cdrom attached to pata_sis as
secondary master appears as phantom device for the slave.
Interestingly, instead of not setting DRQ after IDENTIFY which
triggers NODEV_HINT, it aborts both IDENTIFY and IDENTIFY PACKET which
makes EH retry.
Modify EH such that it assumes no device is attached if both flavors
of IDENTIFY are aborted by the device. There really isn't much point
in retrying when the device actively aborts the commands.
While at it, convert NODEV detection message to ata_dev_printk() to
help debugging obscure detection problems.
This problem was reported by Jan Bücken.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jan Bücken <jb.faq@gmx.de>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Tue, 18 Mar 2008 08:56:12 +0000 (17:56 +0900)]
pata_it821x: use raw nbytes in check_atapi_dma
pata_it821x needs to look at raw request size in check_atapi_dma().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Tue, 18 Mar 2008 08:47:43 +0000 (17:47 +0900)]
libata: implement ata_qc_raw_nbytes()
Implement ata_qc_raw_nbytes() which determines the raw user-requested
size of a PC command.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Mon, 24 Mar 2008 20:09:34 +0000 (13:09 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix Oops with TQM5200 on TQM5200
[POWERPC] mpc5200: Fix null dereference if bestcomm fails to initialize
[POWERPC] mpc5200-fec: Fix possible NULL dereference in mdio driver
[POWERPC] Fix crash in init_ipic_sysfs on efika
[POWERPC] Don't use 64k pages for ioremap on pSeries
Linus Torvalds [Mon, 24 Mar 2008 20:08:01 +0000 (13:08 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: exec PT_DTRACE
[SPARC64]: Use shorter list_splice_init() for brevity.
[SPARC64]: Remove most limitations to kernel image size.
Linus Torvalds [Mon, 24 Mar 2008 20:07:24 +0000 (13:07 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
sch_htb: fix "too many events" situation
connector: convert to single-threaded workqueue
[ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
[SUNGEM]: Fix NAPI assertion failure.
BNX2X: prevent ethtool from setting port type
[9P] net/9p/trans_fd.c: remove unused variable
[IPV6] net/ipv6/ndisc.c: remove unused variable
[IPV4] fib_trie: fix warning from rcu_assign_poinger
[TCP]: Let skbs grow over a page on fast peers
[DLCI]: Fix tiny race between module unload and sock_ioctl.
[SCTP]: Fix build warnings with IPV6 disabled.
[IPV4]: Fix null dereference in ip_defrag
Roland Dreier [Mon, 24 Mar 2008 16:03:03 +0000 (12:03 -0400)]
SVCRDMA: Use only 1 RDMA read scatter entry for iWARP adapters
The iWARP protocol limits RDMA read requests to a single scatter
entry. NFS/RDMA has code in rdma_read_max_sge() that is supposed to
limit the sge_count for RDMA read requests to 1, but the code to do
that is inside an #ifdef RDMA_TRANSPORT_IWARP block. In the mainline
kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a
preprocessor #define, so the #ifdef'ed code is never compiled.
In my test of a kernel build with -j8 on an NFS/RDMA mount, this
problem eventually leads to trouble starting with:
svcrdma: Error posting send = -22
svcrdma : RDMA_READ error = -22
and things go downhill from there.
The trivial fix is to delete the #ifdef guard. The check seems to be
a remnant of when the NFS/RDMA code was not merged and needed to
compile against multiple kernel versions, although I don't think it
ever worked as intended. In any case now that the code is upstream
there's no need to test whether the RDMA_TRANSPORT_IWARP constant is
defined or not.
Without this patch, my kernel build on an NFS/RDMA mount using NetEffect
adapters quickly and 100% reproducibly failed with an error like:
ld: final link failed: Software caused connection abort
With the patch applied I was able to complete a kernel build on the
same setup.
(Tom Tucker says this is "actually an _ancient_ remnant when it had to
compile against iWARP vs. non-iWARP enabled OFA trees.")
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 24 Mar 2008 18:22:39 +0000 (11:22 -0700)]
x86-32: Pass the full resource data to ioremap()
It appears that 64-bit PCI resources cannot possibly ever have worked on
x86-32 even when the RESOURCES_64BIT config option was set, because any
driver that tried to [pci_]ioremap() the resource would have been unable
to do so because the high 32 bits would have been silently dropped on
the floor by the ioremap() routines that only used "unsigned long".
Change them to use "resource_size_t" instead, which properly encodes the
whole 64-bit resource data if RESOURCES_64BIT is enabled.
Acked-by: H. Peter Anvin <hpa@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 24 Mar 2008 18:07:15 +0000 (11:07 -0700)]
Don't 'printk()' while holding xtime lock for writing
The printk() can deadlock because it can wake up klogd(), and
task enqueueing will try to read the time in order to set a hrtimer.
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Debugged-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kumar Gala [Mon, 24 Mar 2008 13:56:06 +0000 (08:56 -0500)]
[POWERPC] Update some defconfigs
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anatolij Gustschin [Sat, 22 Mar 2008 10:49:05 +0000 (21:49 +1100)]
[POWERPC] Fix Oops with TQM5200 on TQM5200
The "bestcomm-core" driver defines its of_match table as follows
static struct of_device_id mpc52xx_bcom_of_match[] = {
{ .type = "dma-controller", .compatible = "fsl,mpc5200-bestcomm", },
{ .type = "dma-controller", .compatible = "mpc5200-bestcomm", },
{},
};
so while registering the driver, the driver's probe function won't be
called, because the device tree node doesn't have a device_type
property. Thus the driver's bcom_engine structure won't be allocated.
Referencing this structure later causes observed Oops.
Checking bcom_eng pointer for NULL before referencing data pointed
by it prevents oopsing, but fec driver still doesn't work (because
of the lost bestcomm match and resulted task allocation failure).
Actually the compatible property exists and should match and so
the fec driver should work.
This removes .type = "dma-controller" from the bestcomm driver's
mpc52xx_bcom_of_match table to solve the problem.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Sat, 22 Mar 2008 03:41:05 +0000 (14:41 +1100)]
[POWERPC] mpc5200: Fix null dereference if bestcomm fails to initialize
If the bestcomm initialization fails, calls to the task allocate
function should fail gracefully instead of oopsing with a NULL deref.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Sat, 22 Mar 2008 03:20:29 +0000 (14:20 +1100)]
[POWERPC] mpc5200-fec: Fix possible NULL dereference in mdio driver
If the reg property is missing from the phy node (unlikely, but possible),
then the kernel will oops with a NULL pointer dereference. This fixes
it by checking the pointer first.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Mon, 17 Mar 2008 19:53:05 +0000 (06:53 +1100)]
[POWERPC] Fix crash in init_ipic_sysfs on efika
The global primary_ipic in arch/powerpc/sysdev/ipic.c can remain NULL
if ipic_init() fails, which will happen on machines that don't have an
ipic interrupt controller. init_ipic_sysfs() will crash in that case.
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Mon, 24 Mar 2008 06:41:22 +0000 (17:41 +1100)]
[POWERPC] Don't use 64k pages for ioremap on pSeries
On pSeries, the hypervisor doesn't let us map in the eHEA ethernet
adapter using 64k pages, and thus the ehea driver will fail if 64k
pages are configured. This works around the problem by always
using 4k pages for ioremap on pSeries (but not on other platforms).
A better fix would be to check whether the partition could ever
have an eHEA adapter, and only force 4k pages if it could, but this
will do for 2.6.25.
This is based on an earlier patch by Tony Breeds.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Roland McGrath [Mon, 24 Mar 2008 05:50:16 +0000 (22:50 -0700)]
[SPARC64]: exec PT_DTRACE
The PT_DTRACE flag is meaningless and obsolete.
Don't touch it.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert P. J. Day [Mon, 24 Mar 2008 05:48:29 +0000 (22:48 -0700)]
[SPARC64]: Use shorter list_splice_init() for brevity.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Devera [Mon, 24 Mar 2008 05:00:38 +0000 (22:00 -0700)]
sch_htb: fix "too many events" situation
HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.
This patch limits processing up to 2 jiffies (why not 1 jiffie ?
because it might stop prematurely when only fraction of jiffie
remains).
Signed-off-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Evgeniy Polyakov [Mon, 24 Mar 2008 04:51:12 +0000 (21:51 -0700)]
connector: convert to single-threaded workqueue
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
We don't need one cqueue thread for each CPU. cqueue is used for
receiving userspace datagrams, which are very rare and thus will
happily live with a single queue.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Mon, 24 Mar 2008 04:45:36 +0000 (21:45 -0700)]
[ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sun, 23 Mar 2008 20:38:54 +0000 (21:38 +0100)]
kbuild: soften modpost checks when doing cross builds
The module alias support in the kernel have a consistency
check where it is checked that the size of a structure
in the kernel and on the build host are the same.
For cross builds this check does not make sense so detect
when we do cross builds and silently skip the check in these
situations.
This fixes a build bug for a wireless driver when cross building
for arm.
Acked-by: Michael Buesch <mb@bu3sch.de>
Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: stable@kernel.org
Randy Dunlap [Sun, 23 Mar 2008 19:28:20 +0000 (20:28 +0100)]
i2c: Fix docbook problem
Sometimes kernel-doc and xmlto conspire to create output that is invalid
and causes problems. Until I know a real/better solution, change the
source code that causes this.
If anyone has better fixes or can just explain what is happening here,
that would be great.
xmlto: input does not validate (status 1)
mmotm-2008-0314-1449/Documentation/DocBook/kernel-api.xml:71468: parser error : Opening and ending tag mismatch: programlisting line 71464 and para
</para><para>
^
mmotm-2008-0314-1449/Documentation/DocBook/kernel-api.xml:71480: parser error : Opening and ending tag mismatch: para line 71473 and programlisting
</programlisting></informalexample>
^
make[1]: *** [Documentation/DocBook/kernel-api.html] Error 1
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 23 Mar 2008 19:28:20 +0000 (20:28 +0100)]
ASoC/TLV320AIC3X: Stop I2C driver ID abuse
Please stop using random I2C driver IDs.
Also removed a pointless initialization to 0 of a static struct member.
Acked-by: Takashi Iwai <tiwai@suse.de>
Cc: Jarkko Nikula <jarkko.nikula@nokia.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tony Lindgren [Sun, 23 Mar 2008 19:28:20 +0000 (20:28 +0100)]
i2c-omap: Fix unhandled fault
If an I2C interrupt happens between disabling interface clock
and functional clock, the interrupt handler will produce an
external abort on non-linefetch error when trying to access
driver registers while interface clock is disabled.
This patch fixes the problem by saving and disabling i2c-omap
interrupt before turning off the clocks. Also disable functional
clock before the interface clock as suggested by Paul Walmsley.
Patch also renames enable/disable_clocks functions to unidle/idle
functions. Note that the driver is currently not taking advantage
of the idle interrupts. To use the idle interrupts, driver would
have to enable interface clock based on the idle interrupt
and dev->idle flag.
This patch has been tested in linux-omap tree with various omaps.
Cc: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Bryan Wu [Sun, 23 Mar 2008 19:28:20 +0000 (20:28 +0100)]
i2c-bfin-twi: Disable BF54x support for now
The i2c-bfin-twi driver doesn't support BF54x for now due to
missing header definitions causing the build to fail. Exclude
it for now, it will be enabled again later.
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
David S. Miller [Sun, 23 Mar 2008 10:35:12 +0000 (03:35 -0700)]
[SUNGEM]: Fix NAPI assertion failure.
As reported by Johannes Berg:
I started getting this warning with recent kernels:
[ 773.908927] ------------[ cut here ]------------
[ 773.908954] Badness at net/core/dev.c:2204
...
If we loop more than once in gem_poll(), we'll
use more than the real budget in our gem_rx()
calls, thus eventually trigger the caller's
assertions in net_rx_action().
Subtract "work_done" from "budget" for the second
arg to gem_rx() to fix the bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliezer Tamir [Sun, 23 Mar 2008 10:07:45 +0000 (03:07 -0700)]
BNX2X: prevent ethtool from setting port type
On 10GBaseT boards setting the type to TP will cause the driver to try
to configure 1GBaseT.
Since there are currently no boards that support setting of the port
type, disable this for now.
Signed-off-by: Eliezer Tamir <eliezert@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 23 Mar 2008 01:05:33 +0000 (18:05 -0700)]
[9P] net/9p/trans_fd.c: remove unused variable
The variable cb is initialized but never used otherwise.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
type T;
identifier i;
constant C;
@@
(
extern T i;
|
- T i;
<+... when != i
- i = C;
...+>
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Sun, 23 Mar 2008 01:04:16 +0000 (18:04 -0700)]
[IPV6] net/ipv6/ndisc.c: remove unused variable
The variable hlen is initialized but never used otherwise.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
type T;
identifier i;
constant C;
@@
(
extern T i;
|
- T i;
<+... when != i
- i = C;
...+>
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>