Jan Beulich [Thu, 12 Mar 2009 13:07:23 +0000 (13:07 +0000)]
x86: fix code paths used by update_mptable
Impact: fix crashes under Xen due to unrobust e820 code
find_e820_area_size() must return a properly distinguishable and
out-of-bounds value when it fails, and -1UL does not meet that
criterion on i386/PAE. Additionally, callers of the function must
check against that value.
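The -1UL problem can be illustrated with a small user-space sketch. It assumes a 36-bit PAE physical address space and models the 32-bit `unsigned long` of i386 with `uint32_t`; the helper names are hypothetical and are not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: classic PAE exposes 36 physical address bits. */
#define PAE_PHYS_BITS 36

/* Hypothetical helper: true if addr could be a real physical address. */
static bool addr_in_range(uint64_t addr)
{
	return (addr >> PAE_PHYS_BITS) == 0;
}

/* -1UL as seen on i386, where unsigned long is 32 bits wide. */
static uint64_t sentinel_ulong32(void)
{
	return (uint64_t)(uint32_t)-1;	/* 0x00000000ffffffff */
}

/* A sentinel built from the full 64-bit type instead. */
static uint64_t sentinel_u64(void)
{
	return ~(uint64_t)0;		/* 0xffffffffffffffff */
}
```

In this model the 32-bit sentinel still looks like a usable address just below 4 GiB, while the 64-bit one is clearly out of bounds, which is the property the failure value needs.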
early_reserve_e820() should be prepared for the region found to be
outside of the addressable range on 32-bits.
e820_update_range_map() should not blindly update e820, but should do
all its work on the map it was passed a pointer to (which in 50% of the
cases is &e820_saved). It must also not call e820_add_region(), as that
again acts on e820 unconditionally.
The issues were found when trying to make this option work in our Xen
kernel (i.e. where some of the silent assumptions made in the code
would not hold).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B9171B.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 12 Mar 2009 12:57:10 +0000 (12:57 +0000)]
x86: clean up output resulting from update_mptable option
Impact: cleanup
Without apic=verbose, using the update_mptable option would result in
garbled and confusing output due to the inconsistent use of printk() vs
apic_printk().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B914B6.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 12 Mar 2009 12:41:23 +0000 (12:41 +0000)]
x86: properly __init-annotate recent early_printk additions
Impact: cleanup, save memory
Don't keep code resident that's only needed during startup.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B91103.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 12 Mar 2009 12:40:06 +0000 (12:40 +0000)]
x86: move save_mr() into .meminit.text
Impact: cleanup, save memory
The function is only being called from boot or memory hotplug paths.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B910B6.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 12 Mar 2009 12:37:34 +0000 (12:37 +0000)]
x86, 32-bit: also use cpuinfo_x86's x86_{phys,virt}_bits members
Impact: 32/64-bit consolidation
In a first step, this allows fixing phys_addr_valid() for PAE (which
until now reported all addresses to be valid). Subsequently, this will
also allow simplifying some MTRR handling code.
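A minimal sketch of the kind of check this enables, assuming the number of physical address bits is known per CPU. The `cpu_model` struct is a hypothetical stand-in for the `x86_phys_bits` member of cpuinfo_x86, and the function name is illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the x86_phys_bits member of cpuinfo_x86. */
struct cpu_model {
	unsigned int phys_bits;
};

static bool phys_addr_valid_sketch(const struct cpu_model *c, uint64_t addr)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (c->phys_bits >= 64)
		return true;

	/* Valid only if no bit at or above phys_bits is set. */
	return (addr >> c->phys_bits) == 0;
}
```

With `phys_bits` set to 36, the highest valid address in this model is 0xfffffffff, and anything with bit 36 or above set is rejected.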
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B9101E.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 12 Mar 2009 12:33:06 +0000 (12:33 +0000)]
x86, 32-bit: also limit NODES_HIGH_SHIFT here
Impact: configuration bug fix
Just like for x86-64, the range of widths valid for NODE_SHIFT is not
unbounded. The upper bound 64-bit uses is definitely also an upper
bound for 32-bit.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B90F12.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 13 Mar 2009 02:20:49 +0000 (03:20 +0100)]
x86: unify kmap_atomic_pfn() and iomap_atomic_prot_pfn(), fix
Impact: build fix
Move kmap_atomic_prot_pfn() to iomap_32.c. It is used on all 32-bit
kernels, while highmem_32.c is only built on highmem kernels.
( Note: the debug_kmap_atomic_prot() check is removed for now, that
problem is handled via another patch. )
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
LKML-Reference: <20090311143317.GA22244@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Akinobu Mita [Wed, 11 Mar 2009 14:34:50 +0000 (23:34 +0900)]
x86: debug check for kmap_atomic_pfn and iomap_atomic_prot_pfn()
It may be useful for kmap_atomic_pfn() and iomap_atomic_prot_pfn()
to check for invalid kmap usage, as kmap_atomic does.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
LKML-Reference: <20090311143449.GB22244@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Akinobu Mita [Wed, 11 Mar 2009 14:33:18 +0000 (23:33 +0900)]
x86: unify kmap_atomic_pfn() and iomap_atomic_prot_pfn()
kmap_atomic_pfn() and iomap_atomic_prot_pfn() are almost the same
except for pgprot. This patch removes the code duplication between these
two functions.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
LKML-Reference: <20090311143317.GA22244@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 11 Mar 2009 09:49:15 +0000 (10:49 +0100)]
Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core
Thomas Gleixner [Mon, 9 Mar 2009 21:04:45 +0000 (22:04 +0100)]
x86: convert obsolete irq_desc_t typedef to struct irq_desc
Impact: cleanup
Convert the last remaining users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
KOSAKI Motohiro [Wed, 11 Mar 2009 01:14:26 +0000 (10:14 +0900)]
x86, mce: use round_jiffies() instead of round_jiffies_relative()
Impact: saving power _very_ little
round_jiffies() rounds absolute jiffies up to a full second.
round_jiffies_relative() rounds relative jiffies up to a full second.
"t->expires" holds absolute jiffies, so round_jiffies() should be
used instead of round_jiffies_relative().
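The difference between the two helpers can be sketched in user space. This is a deliberately simplified model that only rounds up to the next multiple of an assumed tick rate (`HZ_SKETCH`); the kernel helpers also apply a per-CPU skew and other details not modeled here, and `round_abs()`/`round_rel()` are hypothetical names.

```c
/* Assumption: 1000 ticks per second. */
#define HZ_SKETCH 1000UL

/* Models rounding an absolute tick count up to a full second. */
static unsigned long round_abs(unsigned long j)
{
	unsigned long rem = j % HZ_SKETCH;

	return rem ? j + (HZ_SKETCH - rem) : j;
}

/* Models rounding a relative delay: convert to absolute, round, convert back. */
static unsigned long round_rel(unsigned long delta, unsigned long now)
{
	return round_abs(delta + now) - now;
}
```

With `now` at 5300 and an absolute expiry of 6800, `round_abs(6800)` gives 7000 as intended, whereas feeding the absolute value to `round_rel()` adds `now` a second time and yields 7700, which is not on a second boundary.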
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Huang Ying [Tue, 10 Mar 2009 02:57:16 +0000 (10:57 +0800)]
x86, kexec: x86_64: add kexec jump support for x86_64
Impact: New major feature
This patch adds kexec jump support for x86_64. More information about
kexec jump can be found in the corresponding x86_32 support patch.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Huang Ying [Tue, 10 Mar 2009 02:57:04 +0000 (10:57 +0800)]
x86, kexec: x86_64: add identity map for pages at image->start
Impact: Fix corner case that cannot yet occur
image->start may be outside of 0 ~ max_pfn, for example when jumping
back to the original kernel from the kexeced kernel. This patch adds an
identity map for pages at image->start.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Huang Ying [Tue, 10 Mar 2009 02:56:57 +0000 (10:56 +0800)]
x86, kexec: fix kexec x86 coding style
Impact: Cleanup
Fix some coding style issues for kexec x86.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Samuel CUELLA [Tue, 10 Mar 2009 19:56:00 +0000 (12:56 -0700)]
i810: fix kernel crash when struct fb_var_screeninfo is supplied
Prevent the kernel from being crashed by a divide-by-zero operation when
supplied an incorrectly filled 'struct fb_var_screeninfo' from userland.
Previously i810_main.c:1005 (i810_check_params) was using the global
'yres' symbol previously defined at i810_main.c:145 as a module parameter
value holder (i810_main.c:2174). If i810fb is compiled-in or if this
param doesn't get a default value, this direct usage leads to a
divide-by-zero at i810_main.c:1005 (i810_check_params). The patch simply
replaces use of the global, possibly unset 'yres' symbol with a lookup in
the supplied parameter structure.
This problem occurs with directfb, mplayer -vo fbdev, SDL library.
It was also reported (but not solved) at:
http://mail.directfb.org/pipermail/directfb-dev/2008-March/004050.html
Signed-off-by: Samuel CUELLA <samuel.cuella@supinfo.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven King [Tue, 10 Mar 2009 19:55:58 +0000 (12:55 -0700)]
m68knommu: m528x build fix
There isn't any mcfqspi.h in the tree, and without it everything inside the
#ifdef CONFIG_SPI is uncompilable.
Signed-off-by: Steven King <sfking@fdwdc.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven King [Tue, 10 Mar 2009 19:55:57 +0000 (12:55 -0700)]
m68knommu: m5206e build fix
Signed-off-by: Steven King <sfking@fdwdc.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul E. McKenney [Tue, 10 Mar 2009 19:55:57 +0000 (12:55 -0700)]
rcu: documentation 1Q09 update
Update the RCU documentation to call out the need for callers of
primitives like call_rcu() and synchronize_rcu() to prevent subsequent RCU
readers from hazard.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dhaval Giani [Tue, 10 Mar 2009 19:55:56 +0000 (12:55 -0700)]
kernel/user.c: fix a memory leak when freeing up non-init usernamespaces users
We were returning early in the sysfs directory cleanup function if the
user belonged to a non init usernamespace. Due to this a lot of the
cleanup was not done and we were left with a leak. Fix the leak.
Reported-by: Serge Hallyn <serue@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atsushi Nemoto [Tue, 10 Mar 2009 19:55:55 +0000 (12:55 -0700)]
mtd: physmap: fix NULL pointer dereference in error path
commit e480814f138cd5d78a8efe397756ba6b6518fdb6 ("[MTD] [MAPS] physmap:
fix wrong free and del_mtd_{partition,device}") introduces a NULL pointer
dereference in physmap_flash_remove when called from the error path in
physmap_flash_probe (if map_probe failed).
Call del_mtd_{partition,device} only if info->cmtd was not NULL.
Reported-by: pHilipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lubomir Rintel [Tue, 10 Mar 2009 19:55:54 +0000 (12:55 -0700)]
intel-agp: fix a panic with 1M of shared memory, no GTT entries
When the GTT size is equal to the amount of video memory, the number of GTT
entries is computed as less than zero, which is invalid and leads to an
off-by-one error in intel_i915_configure().
Originally posted here:
http://bugzilla.kernel.org/show_bug.cgi?id=12539
http://bugzilla.redhat.com/show_bug.cgi?id=445592
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: Lubomir Rintel <lkundrak@v3.sk>
Cc: Dave Airlie <airlied@linux.ie>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Newton [Tue, 10 Mar 2009 19:55:53 +0000 (12:55 -0700)]
mtd_dataflash: fix probing of AT45DB321C chips.
Commit 771999b65f79264acde4b855e5d35696eca5e80c ("[MTD] DataFlash: bugfix,
binary page sizes now handled") broke support for probing AT45DB321C flash
chips. These chips do not support the "page size" status bit, so if we
match the JEDEC id return early.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Will Newton <will.newton@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul E. McKenney [Tue, 10 Mar 2009 19:55:52 +0000 (12:55 -0700)]
idr: make idr_remove_all() do removal -before- free_layer()
Fix a problem in the IDR system, where an idr_remove_all() hands a data
element to call_rcu() (via free_layer()) before making that data element
inaccessible to new readers. This is very bad, and results in readers
still having a reference to this data element at the end of the grace
period.
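The ordering requirement can be sketched with a single-threaded model. The types and helpers here are hypothetical: `defer_free()` stands in for handing an object to call_rcu() and simply records whether the object was still reachable through the published slot at that moment.

```c
#include <stddef.h>

struct layer {
	int id;
};

/* Hypothetical stand-ins: a published slot and a one-entry deferred-free queue. */
struct slot {
	struct layer *ptr;
};

struct deferred {
	struct layer *pending;
	int still_reachable;
};

/* Models call_rcu(): records the object and whether readers could still find it. */
static void defer_free(struct deferred *d, const struct slot *s, struct layer *l)
{
	d->pending = l;
	d->still_reachable = (s->ptr == l);
}

/* Buggy order: the object is queued while it is still published. */
static void remove_wrong(struct slot *s, struct deferred *d)
{
	struct layer *l = s->ptr;

	defer_free(d, s, l);
	s->ptr = NULL;
}

/* Fixed order: unpublish first, then queue for freeing. */
static void remove_right(struct slot *s, struct deferred *d)
{
	struct layer *l = s->ptr;

	s->ptr = NULL;
	defer_free(d, s, l);
}
```

In the buggy order a reader starting after the object was queued can still pick up the pointer, so it may hold a reference past the end of the grace period. Real kernel code would use the RCU pointer-publication primitives rather than plain stores.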
Tests on large machines that concurrently map and unmap user-space memory
within the same multithreaded process result in crashes within about five
minutes. Applying this patch increases the kernel's longevity to the
three-to-eight-hour range.
There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first. It is
therefore no surprise that failures still occur.
Located-by: Milton Miller II <miltonm@austin.ibm.com>
Tested-by: Milton Miller II <miltonm@austin.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 10 Mar 2009 19:55:51 +0000 (12:55 -0700)]
devpts: remove graffiti
Very annoying when working with containers.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Tue, 10 Mar 2009 19:55:50 +0000 (12:55 -0700)]
x86/agp: tighten check to update amd nb aperture
Impact: fix bug to make agp work with dri
Jeffrey reported that dri works with 64bit, but doesn't work with
32bit. It turns out the NB aperture is 32M while the aperture on agp is 128M.
64bit uses 64M for validation for the 64bit iommu/gart; 32bit only uses
32M, and will not update the nb aperture.
So compare the nb aperture and the agp aperture before deciding to leave
the nb aperture untouched.
Reported-by: Jeffrey Trull <jetrull@sbcglobal.net>
Tested-by: Jeffrey Trull <jetrull@sbcglobal.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Dave Airlie <airlied@linux.ie>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 10 Mar 2009 19:55:49 +0000 (12:55 -0700)]
xtensa: fix compilation somewhat
* ->put_char changes
* HIGHMEM is bogus it seems, there is no kmap_atomic() et al
* some includes
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Chris Zankel <zankel@tensilica.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darrick J. Wong [Tue, 10 Mar 2009 19:55:48 +0000 (12:55 -0700)]
lm85: add VRM10 support for adt7468 chip
The adt7468 chip supports VRM10 sensors just like the adt7463; add a
missing check for it.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darrick J. Wong [Tue, 10 Mar 2009 19:55:47 +0000 (12:55 -0700)]
lm85: fix the version check that broke adt7468 probing
The verstep check in the lm85 driver fails because the upper nibble of
the version register is 0x7, not 0x6, on the adt7468 chip. Probing of
all adt7468s was broken by 69fc1feba2d5856ff74dedb6ae9d8c490210825c
("hwmon: (lm85) Rework the device detection"), and this patch fixes
that. Also add in a missing i2c_device_id that accidentally got dropped
from the original patch.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Tue, 10 Mar 2009 19:55:46 +0000 (12:55 -0700)]
menu: fix embedded menu snafu
The COMPAT_BRK kconfig symbol does not depend on EMBEDDED, but it is in
the midst of the EMBEDDED menu symbols, so it mucks up the EMBEDDED menu.
Fix by moving it to just after all of the EMBEDDED menu symbols. Also,
ANON_INODES has a similar problem, so move it to just above the EMBEDDED
menu items since it is used in the EMBEDDED menu.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 10 Mar 2009 19:55:45 +0000 (12:55 -0700)]
mm: get_nid_for_pfn() returns int
get_nid_for_pfn() returns int
Presumably the (nid < 0) case has never happened.
We do know that it is happening on one system while creating a symlink for
a memory section so it should also happen on the same system if
unregister_mem_sect_under_nodes() were called to remove the same symlink.
The test was actually added in response to a problem with an earlier
version reported by Yasunori Goto where one or more of the leading pages
of a memory section on the 2nd node of one of his systems was
uninitialized because I believe they coincided with a memory hole.
That earlier version did not ignore uninitialized pages and determined
the nid by considering only the 1st page of each memory section. This
caused the symlink to the 1st memory section on the 2nd node to be
incorrectly created in /sys/devices/system/node/node0 instead of
/sys/devices/system/node/node1. The problem was fixed by adding the
test to skip over uninitialized pages.
I suspect we have not seen any reports of the non-removal
of a symlink due to the incorrect declaration of the nid
variable in unregister_mem_sect_under_nodes() because
- systems where a memory section could have an uninitialized
range of leading pages are probably rare.
- memory remove is probably not done very frequently on the
systems that are capable of demonstrating the problem.
- lingering symlink(s) that should have been removed may
have simply gone unnoticed.
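Why the declaration matters can be shown with a short sketch. It assumes, as the changelog implies, that a negative return value marks a page to skip; the helper names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: a negative value means "no node, skip this page". */
#define NO_NID (-1)

/* What an unsigned nid variable ends up holding after the assignment. */
static unsigned int as_unsigned_nid(int ret)
{
	return (unsigned int)ret;
}

/* The intended test, which only works when nid is a signed int. */
static bool should_skip(int nid)
{
	return nid < 0;
}
```

Stored in an unsigned variable, `NO_NID` becomes `UINT_MAX`, so a `nid < 0` test on that variable can never be true and the page is not skipped. With a signed `int` the test behaves as intended.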
[garyhade@us.ibm.com: wrote changelog]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 10 Mar 2009 19:03:30 +0000 (12:03 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86 mmiotrace: fix remove_kmmio_fault_pages()
Linus Torvalds [Tue, 10 Mar 2009 16:31:19 +0000 (09:31 -0700)]
Merge branch 'sh/for-2.6.29' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
video: deferred io cleanup fix for sh_mobile_lcdcfb
sh: Add media/soc_camera.h to board setup of Renesas AP325RXA
Ingo Molnar [Tue, 10 Mar 2009 08:26:38 +0000 (09:26 +0100)]
Merge branches 'x86/apic', 'x86/asm', 'x86/fixmap', 'x86/memtest', 'x86/mm', 'x86/urgent', 'linus' and 'core/percpu' into x86/core
Magnus Damm [Tue, 10 Mar 2009 06:08:49 +0000 (06:08 +0000)]
video: deferred io cleanup fix for sh_mobile_lcdcfb
Fix deferred io cleanup patch in the sh_mobile_lcdcfb driver.
If probe() fails early the sh_mobile_lcdc_stop() function will
be called to clean up deferred io. This patch modifies the
code to only call fb_deferred_io_cleanup() after deferred io
has been initialized.
With this patch applied we no longer hit BUG_ON() inside
fb_deferred_io_cleanup(). Triggers on a Migo-R with the
SYS QVGA panel board unmounted.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Nobuhiro Iwamatsu [Fri, 6 Mar 2009 02:51:14 +0000 (02:51 +0000)]
sh: Add media/soc_camera.h to board setup of Renesas AP325RXA
Some compilation errors were fixed by the commit
"sh: ap325rxa: Revert ov772x support"
(08c2f5b4d76f83213e379b12df504269d21c9e7c), but other compilation
errors still occur.
We revert that commit and add the new header (media/soc_camera.h).
This change fixes the new compilation error.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Tue, 10 Mar 2009 03:50:11 +0000 (20:50 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: fix deadlock when stopping arrays
Linus Torvalds [Mon, 9 Mar 2009 20:23:59 +0000 (13:23 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
Oleg Nesterov [Mon, 2 Mar 2009 21:58:45 +0000 (22:58 +0100)]
copy_process: fix CLONE_PARENT && parent_exec_id interaction
CLONE_PARENT can fool the ->self_exec_id/parent_exec_id logic. If we
re-use the old parent, we must also re-use ->parent_exec_id to make
sure exit_notify() sees the right ->xxx_exec_id's when the CLONE_PARENT'ed
task exits.
Also, move down the "p->parent_exec_id = p->self_exec_id" thing, to place
two different cases together.
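The two cases can be sketched as a simplified user-space model. The struct and flag below are hypothetical stand-ins (the real task_struct and clone flags are much richer), and only CLONE_PARENT is modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value for the sketch only. */
#define SK_CLONE_PARENT 0x1u

struct task {
	struct task *real_parent;
	uint32_t self_exec_id;
	uint32_t parent_exec_id;
};

static void set_parent(struct task *child, struct task *cur, unsigned int flags)
{
	if (flags & SK_CLONE_PARENT) {
		/* Re-use the old parent, and the exec id recorded for it. */
		child->real_parent = cur->real_parent;
		child->parent_exec_id = cur->parent_exec_id;
	} else {
		/* The caller is the parent; record its current exec id. */
		child->real_parent = cur;
		child->parent_exec_id = cur->self_exec_id;
	}
}
```

Keeping both assignments in the same branch makes it hard for the parent pointer and the recorded exec id to get out of step.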
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jones [Mon, 9 Mar 2009 19:14:37 +0000 (15:14 -0400)]
[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Dave Jones [Mon, 9 Mar 2009 19:07:33 +0000 (15:07 -0400)]
Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6.
Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.
Course of action:
- Find out the remaining causes of overheating, and fix them
if possible. ACPI should be doing the right thing automatically.
If it isn't, we need to fix that.
- mark p4-clockmod ui as deprecated
- try again with the removal in six months.
It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
Signed-off-by: Dave Jones <davej@redhat.com>
Linus Torvalds [Mon, 9 Mar 2009 16:15:40 +0000 (09:15 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
p54: fix race condition in memory management
cfg80211: test before subtraction on unsigned
iwlwifi: fix error flow in iwl*_pci_probe
rt2x00 : more devices to rt73usb.c
rt2x00 : more devices to rt2500usb.c
bonding: Fix device passed into ->ndo_neigh_setup().
vlan: Fix vlan-in-vlan crashes.
net: Fix missing dev->neigh_setup in register_netdevice().
tmspci: fix request_irq race
pkt_sched: act_police: Fix a rate estimator test.
tg3: Fix 5906 link problems
SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
IPv6: add "disable" module parameter support to ipv6.ko
sungem: another error printed one too early
aoe: error printed 1 too early
net pcmcia: worklimit reaches -1
net: more timeouts that reach -1
net: fix tokenring license
dm9601: new vendor/product IDs
netlink: invert error code in netlink_set_err()
...
Linus Torvalds [Mon, 9 Mar 2009 16:14:17 +0000 (09:14 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: fix for CONFIG_SPARSE_IRQ=y
lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'
Linus Torvalds [Mon, 9 Mar 2009 16:13:16 +0000 (09:13 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix spinlock assertions on UP systems
Chris Mason [Mon, 9 Mar 2009 15:45:38 +0000 (11:45 -0400)]
Btrfs: fix spinlock assertions on UP systems
btrfs_tree_locked was being used to make sure a given extent_buffer was
properly locked in a few places. But, it wasn't correct for UP compiled
kernels.
This switches it to using assert_spin_locked instead, and renames it to
btrfs_assert_tree_locked to better reflect how it was really being used.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Heiko Carstens [Mon, 9 Mar 2009 12:31:59 +0000 (13:31 +0100)]
Fix fixpoint divide exception in acct_update_integrals
Frans Pop reported the crash below when running an s390 kernel under Hercules:
Kernel BUG at 000738b4 [verbose debug info unavailable]
fixpoint divide exception: 0009 [#1] SMP
Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
   cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
   dm_mod dasd_eckd_mod dasd_mod
CPU: 0 Not tainted 2.6.27.19 #13
Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
Krnl GPRS: 00000000 000007d0 7fffffff fffff830
           00000000 ffffffff 00000002 0f9ed9b8
           00000000 00008ca0 00000000 0f9ed9b8
           0f9edda4 8007386e 0f4f7ec8 0f4f7e98
Krnl Code: 800738aa: a71807d0 lhi %r1,2000
           800738ae: 8c200001 srdl %r2,1
           800738b2: 1d21 dr %r2,%r1
          >800738b4: 5810d10e l %r1,270(%r13)
           800738b8: 1823 lr %r2,%r3
           800738ba: 4130f060 la %r3,96(%r15)
           800738be: 0de1 basr %r14,%r1
           800738c0: 5800f060 l %r0,96(%r15)
Call Trace:
([<000000000004fdea>] blocking_notifier_call_chain+0x1e/0x2c)
 [<0000000000038502>] do_exit+0x106/0x7c0
 [<0000000000038c36>] do_group_exit+0x7a/0xb4
 [<0000000000038c8e>] SyS_exit_group+0x1e/0x30
 [<0000000000021c28>] sysc_do_restart+0x12/0x16
 [<0000000077e7e924>] 0x77e7e924
Reason for this is that cpu time accounting usually only happens from
interrupt context, but acct_update_integrals gets also called from
process context with interrupts enabled.
So in acct_update_integrals we may end up with the following scenario:
Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
happens which updates the accounting values. This causes acct_timexpd to be
greater than the former stime + utime. The subsequent calculation of
dtime = cputime_sub(time, tsk->acct_timexpd);
will be negative and the division performed by
cputime_to_jiffies(dtime)
will generate an exception since the result won't fit into a 32 bit
register.
In order to fix this just always disable interrupts while accessing any
of the accounting values.
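The race can be replayed deterministically in a small user-space sketch. The struct and helpers are hypothetical models of the accounting fields named above; `tick()` plays the role of the interrupt that updates them between the two reads.

```c
#include <stdint.h>

/* Hypothetical model of the per-task accounting fields. */
struct acct {
	uint64_t utime;
	uint64_t stime;
	uint64_t acct_timexpd;
};

/* Models the interrupt: account more user time and refresh acct_timexpd. */
static void tick(struct acct *a, uint64_t t)
{
	a->utime += t;
	a->acct_timexpd = a->utime + a->stime;
}

/* Delta between a utime + stime snapshot and the acct_timexpd read afterwards. */
static int64_t delta(uint64_t time_snapshot, uint64_t acct_timexpd)
{
	return (int64_t)(time_snapshot - acct_timexpd);
}
```

If the snapshot of `utime + stime` is taken before `tick()` runs and `acct_timexpd` is read after it, the delta comes out negative. Reading both with interrupts disabled, as the fix does, keeps the two values consistent.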
Reported by: Frans Pop <elendil@planet.nl>
Tested by: Frans Pop <elendil@planet.nl>
Cc: stable@kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rusty Russell [Mon, 9 Mar 2009 16:06:28 +0000 (10:06 -0600)]
lguest: fix for CONFIG_SPARSE_IRQ=y
Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y
We now need to call irq_to_desc_alloc_cpu() before
set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
kmalloc available).
So do it as we use interrupts instead. Also means we only alloc for
irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
Rusty Russell [Mon, 9 Mar 2009 16:06:22 +0000 (10:06 -0600)]
lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'
Impact: fix lguest boot crash on modern Intel machines
The code in early_init_intel does:
if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
u64 misc_enable;
rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
And that rdmsr faults (not allowed from non-0 PL). We can get around
this by mugging the family ID part of the cpuid. 5 seems like a good
number.
Of course, this is a hack (how very lguest!). We could just indicate
that we don't support MSRs, or implement lguest_rdmst.
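A minimal sketch of what changing the family ID amounts to, assuming the standard layout of CPUID leaf 1 where the base family sits in bits 11:8 of EAX. The function names are hypothetical.

```c
#include <stdint.h>

/* Extract the base family from CPUID leaf 1 EAX (bits 11:8). */
static unsigned int base_family(uint32_t eax)
{
	return (eax >> 8) & 0xFu;
}

/* Overwrite the base family field with 5, leaving all other bits alone. */
static uint32_t force_family_5(uint32_t eax)
{
	eax &= 0xFFFFF0FFu;	/* clear bits 11:8 */
	eax |= 0x00000500u;	/* family = 5 */
	return eax;
}
```

With the family reported as 5, the `c->x86 > 6 || (c->x86 == 6 && ...)` condition quoted above is false, so the rdmsrl() path is not taken.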
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Patrick McHardy <kaber@trash.net>
Jeremy Fitzhardinge [Fri, 6 Mar 2009 18:09:26 +0000 (10:09 -0800)]
x86-32: make sure virt_addr_valid() returns false for fixmap addresses
I found that virt_addr_valid() was returning true for fixmap addresses.
I'm not sure whether pfn_valid() is supposed to include this test,
but there's no harm in being explicit.
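A rough sketch of why an explicit test helps, using made-up layout constants. `SK_PAGE_OFFSET`, `SK_FIXADDR_START`, the 4 KiB page shift and the simple `pfn < max_pfn` validity rule are all assumptions for illustration, not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, for illustration only. */
#define SK_PAGE_OFFSET   0xC0000000u
#define SK_FIXADDR_START 0xFFF00000u
#define SK_PAGE_SHIFT    12

/* Simplified stand-in for pfn_valid(): any pfn below max_pfn counts. */
static bool pfn_valid_sketch(uint32_t pfn, uint32_t max_pfn)
{
	return pfn < max_pfn;
}

static bool virt_addr_valid_sketch(uint32_t x, uint32_t max_pfn)
{
	if (x < SK_PAGE_OFFSET)
		return false;
	if (x >= SK_FIXADDR_START)	/* the explicit fixmap test */
		return false;
	return pfn_valid_sketch((x - SK_PAGE_OFFSET) >> SK_PAGE_SHIFT, max_pfn);
}
```

In this model, on a machine with enough memory the pfn derived from a fixmap address can still pass the pfn test, so only the explicit range check rejects it.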
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B166D6.2080505@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stuart Bennett [Sun, 8 Mar 2009 18:21:35 +0000 (20:21 +0200)]
x86 mmiotrace: fix remove_kmmio_fault_pages()
Impact: fix race+crash in mmiotrace
The list manipulation in remove_kmmio_fault_pages() was broken. If more
than one consecutive kmmio_fault_page was re-added during the grace
period between unregister_kmmio_probe() and remove_kmmio_fault_pages(),
the list manipulation failed to remove pages from the release list.
After a second grace period the pages get into rcu_free_kmmio_fault_pages()
and raise a BUG_ON() kernel crash.
The list manipulation is fixed to properly remove pages from the release
list.
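The general idiom for unlinking several consecutive entries from a singly linked list can be sketched as follows. This is an illustration of the technique under simplified, hypothetical types, not the mmiotrace code itself: `readded` marks entries that must leave the release list.

```c
#include <stddef.h>

struct node {
	int readded;		/* non-zero: must be dropped from the list */
	struct node *next;
};

/* Unlink every node with readded set; returns how many were unlinked. */
static int drop_readded(struct node **head)
{
	struct node **prevp = head;
	struct node *p = *head;
	int dropped = 0;

	while (p) {
		if (p->readded) {
			*prevp = p->next;	/* unlink; prevp stays where it is */
			dropped++;
		} else {
			prevp = &p->next;	/* keep; advance the link pointer */
		}
		p = p->next;
	}
	return dropped;
}
```

The key point is that `prevp` only advances past entries that stay on the list. If it advanced past an unlinked entry, the next consecutive removal would write into a node that is no longer on the list and the entry would remain linked.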
This bug has been present from the very beginning of mmiotrace in the
mainline kernel. It was introduced in 0fd0e3da ("x86: mmiotrace full
patch, preview 1").
An urgent fix for Linus. Tested by Stuart (on 32-bit) and Pekka
(on amd and intel 64-bit systems, nouveau and nvidia proprietary).
Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
Signed-off-by: Pekka Paalanen <pq@iki.fi>
LKML-Reference: <20090308202135.34933feb@daedalus.pq.iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Thu, 5 Mar 2009 20:04:57 +0000 (12:04 -0800)]
x86: fix warning about nodeid
Impact: cleanup
Ingo found a warning about nodeid with some configs.
Try to use for_each_online_node for non-NUMA too; in that case
nodeid will be 0.
Also move the boundary checking out of setup_node_bootmem(), so
non-NUMA configs will not check it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <49B03069.80001@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Sun, 8 Mar 2009 17:37:57 +0000 (10:37 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: fix data timeout for SEND_EXT_CSD
Linus Torvalds [Sun, 8 Mar 2009 17:30:18 +0000 (10:30 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: increment quiescent state counter in ksoftirqd()
Linus Torvalds [Sun, 8 Mar 2009 17:27:13 +0000 (10:27 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
x86, bts: remove bad warning
x86: add Dell XPS710 reboot quirk
x86, math-emu: fix init_fpu for task != current
x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP
x86: fix DMI on EFI
Linus Torvalds [Sun, 8 Mar 2009 17:25:13 +0000 (10:25 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared
[WATCHDOG] gef_wdt.c: fsl_get_sys_freq() failure not noticed
[WATCHDOG] ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared
[WATCHDOG] rc32434_wdt: fix sections
[WATCHDOG] rc32434_wdt: fix watchdog driver
Linus Torvalds [Sun, 8 Mar 2009 17:24:57 +0000 (10:24 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix ext4_free_inode() vs. ext4_claim_inode() race
Linus Torvalds [Sun, 8 Mar 2009 17:24:39 +0000 (10:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/cooloney/blackfin-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (28 commits)
Blackfin arch: SPI_MMC is now mainlined MMC_SPI
Blackfin arch: disable legacy /proc/scsi/ support by default
Blackfin arch: remove duplicated ANOMALY_05000448 ifdef check
Blackfin arch: add stubs for anomalies 447 and 448
Blackfin arch: cleanup bfin_sport.h header and export it to userspace
Blackfin arch: fix bug - gdb signull case make trunk kernel panic frequently
Blackfin arch: remove spurious dash when dcache is off
Blackfin arch: mark init_pda as __init as only __init funcs all it
Blackfin arch: fix bug - On bf548-ezkit, ethernet fails to work after wakeup from "mem"
Blackfin arch: Random read/write errors are a bad thing
Blackfin arch: update default kernel config, select KSZ8893M driver for BF518
Blackfin arch: Fix bug - KGDB single step into the middle of a 4 bytes instruction on bf561 after soft bp is hit
Blackfin arch: Fix bug - make ksz8893m driver available when bfin_mac is enabled
Blackfin arch: make sure people do not set the kernel load address too high
Blackfin arch: fix bug - The SPORT_HYS bit is not set for BF561 0.5
Blackfin arch: update anomaly sheets to match latest public info
Blackfin arch: Fix BUG - kernel fails to build in pm.c when allow wakeup fromi standby by GPIO
Blackfin arch: PM_BFIN_WAKE_GP: update help
Blackfin arch: fix bug - kgdb fails to continue after setting breakpoint on bf561-ezkit kernel with smp patch
Blackfin arch: Enable Write Back Cache on all Blackfin Boards
...
Linus Torvalds [Sun, 8 Mar 2009 17:23:05 +0000 (10:23 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/djbw/async_tx
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmatest: fix use after free in dmatest_exit
ipu_idmac: fix spinlock type
iop-adma, mv_xor: fix mem leak on self-test setup failure
fsldma: fix off by one in dma_halt
I/OAT: fail self-test if callback test reaches timeout
I/OAT: update driver version and copyright dates
I/OAT: list usage cleanup
I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
I/OAT: cancel watchdog before dma remove
I/OAT: fail initialization on zero channels detection
I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
dmaengine: update kerneldoc
Linus Torvalds [Sun, 8 Mar 2009 17:22:22 +0000 (10:22 -0700)]
Merge git://git./linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ata: add CFA specific identify data words
remove stale comment from <linux/hdreg.h>
AT91: initialize Compact Flash on AT91SAM9263 cpu
ide: add at91_ide driver
ide: allow to wrap interrupt handler
ide-iops: fix odd-length ATAPI PIO transfers
ide: NULL noise: drivers/ide/ide-*.c
ide: expiry() returns int, negative expiry() return values won't be noticed
Linus Torvalds [Sun, 8 Mar 2009 17:22:01 +0000 (10:22 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Don't trust current capacity values in identify words 57-58
libata: make sure port is thawed when skipping resets
sata_nv: fix module parameter description
ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
libata: don't use on-stack sense buffer
libata: align ap->sector_buf
libata: fix dma_unmap_sg misuse
libata: change drive ready wait after hard reset to 5s
Linus Torvalds [Sun, 8 Mar 2009 17:21:31 +0000 (10:21 -0700)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: frag_size should be signed, as it can hold an error result
Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB
Squashfs: Fix oops when reading fsfuzzer corrupted filesystems
Linus Torvalds [Sun, 8 Mar 2009 17:21:10 +0000 (10:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smack: fixes for unlabeled host support
Linus Torvalds [Sun, 8 Mar 2009 17:14:19 +0000 (10:14 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: serio - fix protocol number for TouchIT213
Linus Torvalds [Sun, 8 Mar 2009 17:13:28 +0000 (10:13 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs
Linus Torvalds [Sun, 8 Mar 2009 17:08:57 +0000 (10:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix missing bio back/front segment size setting in blk_recount_segments()
loop: don't increment p->offset with (size_t) -EINVAL
cciss: remove 30 second initial timeout on controller reset
Fix kernel NULL pointer dereference in xen-blkfront
Linus Torvalds [Sun, 8 Mar 2009 17:03:31 +0000 (10:03 -0700)]
Merge branch 'fix/hda' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix headphone-detect regression with multiple HP jacks
ALSA: hda - Fix typos in slave controls in patch_sigmatel.c
Ralf Baechle [Thu, 5 Mar 2009 10:45:48 +0000 (11:45 +0100)]
MIPS: compat: Implement is_compat_task.
This is a build fix required after "x86-64: seccomp: fix 32/64 syscall
hole" (commit 5b1017404aea6d2e552e991b3fd814d839e9cd67). MIPS doesn't
have the issue that was fixed for x86-64 by that patch.
This also doesn't solve the N32 issue, which is that N32 seccomp processes
will be treated as non-compat processes and thus only have access to N64
syscalls.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wang Chen [Sat, 7 Mar 2009 05:34:19 +0000 (13:34 +0800)]
x86: don't define __this_fixmap_does_not_exist()
Impact: improve out-of-range fixmap index debugging
Commit "1b42f51630c7eebce6fb780b480731eb81afd325"
defined the __this_fixmap_does_not_exist() function
with a WARN_ON(1) in it.
This causes the linker to not report an error when
__this_fixmap_does_not_exist() is called with a
non-constant parameter.
Ingo defined __this_fixmap_does_not_exist() because he
wanted to get the virt addresses of the fixmap memory for a
nesting level by a non-constant index.
But we can fix this and still keep the link-time check:
We can get the four slot virt addresses at link time and
store them in the array slot_virt[].
We can then refer to slot_virt[] with a non-constant index
in the ioremap-leak detection code.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
LKML-Reference: <49B2075B.4070509@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Sun, 8 Mar 2009 07:46:26 +0000 (23:46 -0800)]
x86: remove smp_apply_quirks()/smp_checks()
Impact: cleanup and code size reduction on 64-bit
This code only applies to Intel Pentium and AMD K7 32-bit CPUs.
Move those checks to intel_init()/amd_init() for 32-bit
so 64-bit will not build this code.
Also switch to a cpu_index check to see if we need to emit a warning.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B377D2.8030108@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adrian Hunter [Tue, 10 Feb 2009 14:32:33 +0000 (16:32 +0200)]
mmc: fix data timeout for SEND_EXT_CSD
Commit 0d3e0460f307e84904968aad6cff97bd688583d8
"MMC: CSD and CID timeout values" inadvertently broke
the timeout for the MMC command SEND_EXT_CSD.
This patch puts it back again.
Depending on the characteristics of the controller,
this bug may prevent the use of MMC cards.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cliff Wickman [Fri, 6 Mar 2009 23:30:56 +0000 (17:30 -0600)]
x86: UV: remove uv_flush_tlb_others() WARN_ON
In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
And CONFIG_PREEMPT is not enabled by default in the distribution that
most UV owners will use.
We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
is not on.
As Ingo commented:
> and we have no proper primitive to test for atomicity. (mainly
> because we dont know about atomicity on a non-preempt kernel)
So we drop the WARN_ON.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitry Torokhov [Sat, 7 Mar 2009 21:39:22 +0000 (13:39 -0800)]
Input: serio - fix protocol number for TouchIT213
Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.
Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Tejun Heo [Fri, 6 Mar 2009 15:44:13 +0000 (00:44 +0900)]
percpu: finer grained locking to break deadlock and allow atomic free
Impact: fix deadlock and allow atomic free
Percpu allocation always uses GFP_KERNEL and the whole alloc/free paths
were protected by a single mutex. All percpu allocations have been from
GFP_KERNEL-safe context and the original allocator had this assumption
too. However, by protecting both alloc and free paths with the same
mutex, the new allocator creates a free -> alloc -> GFP_KERNEL
dependency which the original allocator didn't have. This can lead to
deadlock if free is called from FS or IO paths. Also, in general,
allocators are expected to allow free to be called from atomic
context.
This patch implements finer grained locking to break the deadlock and
allow atomic free. For details, please read the "Synchronization
rules" comment.
While at it, also add CONTEXT: to function comments to describe which
context they expect to be called from and what they do to it.
This problem was reported by Thomas Gleixner and Peter Zijlstra.
http://thread.gmane.org/gmane.linux.kernel/802384
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Christian Lamparter [Thu, 5 Mar 2009 23:53:59 +0000 (00:53 +0100)]
p54: fix race condition in memory management
This patch fixes a number of race conditions in the driver.
Up until now, the "entry" pointer was initialized before acquiring the right lock.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Roel Kluin [Tue, 3 Mar 2009 21:55:21 +0000 (22:55 +0100)]
cfg80211: test before subtraction on unsigned
freq_diff is unsigned, so test before subtraction
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
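The pitfall behind the cfg80211 fix above can be shown with a minimal sketch in plain C. The helper names and the sample values are hypothetical and are not taken from the cfg80211 source; the only point is that an unsigned difference wraps around instead of going negative, so the comparison has to come before the subtraction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a and b differ by at most max_diff.
 * The operands are compared first, so the subtraction never wraps. */
static bool freq_within(uint32_t a, uint32_t b, uint32_t max_diff)
{
	uint32_t diff = (a > b) ? a - b : b - a;

	return diff <= max_diff;
}

/* The pattern to avoid: when a < b the unsigned subtraction wraps to a
 * huge positive value, so close frequencies are reported as far apart. */
static bool freq_within_buggy(uint32_t a, uint32_t b, uint32_t max_diff)
{
	uint32_t diff = a - b;

	return diff <= max_diff;
}
```

With a = 2412 and b = 2437, the first helper sees a difference of 25, while the second one computes a wrapped value just below 2^32.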
Jeremy Higdon [Wed, 4 Mar 2009 20:09:46 +0000 (12:09 -0800)]
[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs
We recently discovered a problem with passing of DMA attributes on SN
systems with the older PIC chips.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Cc: <habeck@sgi.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cyrill Gorcunov [Fri, 6 Mar 2009 16:08:34 +0000 (19:08 +0300)]
x86: linkage.h - guard assembler specifics by __ASSEMBLY__
Stephen Rothwell reported:
|Today's linux-next build (x86_64 allmodconfig) produced this warning:
|
|In file included from drivers/char/epca.c:49:
|drivers/char/digiFep1.h:7:1: warning: "GLOBAL" redefined
|In file included from include/linux/linkage.h:5,
| from include/linux/kernel.h:11,
| from arch/x86/include/asm/system.h:10,
| from arch/x86/include/asm/processor.h:17,
| from include/linux/prefetch.h:14,
| from include/linux/list.h:6,
| from include/linux/module.h:9,
| from drivers/char/epca.c:29:
|arch/x86/include/asm/linkage.h:55:1: warning: this is the location of the previous definition
|
|Probably introduced by commit 95695547a7db44b88a7ee36cf5df188de267e99e
|("x86: asm linkage - introduce GLOBAL macro") from the x86 tree.
Any assembler-specific snippets placed in headers
must be protected by __ASSEMBLY__. Fixed.
Also move the __ALIGN definition under the same protection.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <20090306160833.GB7420@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
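The guarding convention described in the linkage.h commit above can be sketched as follows. This is a hypothetical header fragment, not the actual arch/x86/include/asm/linkage.h: EXAMPLE_GLOBAL and example_global_visible() are illustrative names. It assumes, as in the kernel build, that __ASSEMBLY__ is defined only when the header is included from assembler sources.

```c
/* Hypothetical header fragment shared by C and assembler sources. */
#ifdef __ASSEMBLY__

/* Assembler-only helper: emits a global label. C files never see it,
 * so it cannot clash with a driver that defines a similar macro. */
#define EXAMPLE_GLOBAL(name)	\
	.globl name;		\
	name:

#else /* !__ASSEMBLY__ */

/* C-only side: reports whether the assembler macro leaked into C. */
static inline int example_global_visible(void)
{
#ifdef EXAMPLE_GLOBAL
	return 1;
#else
	return 0;
#endif
}

#endif /* __ASSEMBLY__ */
```

When compiled as C, the macro is not defined, so a driver header is free to define its own macro with the same name without triggering a redefinition warning.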
Tejun Heo [Fri, 6 Mar 2009 15:44:11 +0000 (00:44 +0900)]
percpu: move fully free chunk reclamation into a work
Impact: code reorganization for later changes
Do fully free chunk reclamation using a work. This change is to
prepare for locking changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 15:44:09 +0000 (00:44 +0900)]
percpu: move chunk area map extension out of area allocation
Impact: code reorganization for later changes
Separate out chunk area map extension into a separate function -
pcpu_extend_area_map() - and call it directly from pcpu_alloc() such
that pcpu_alloc_area() is guaranteed to have enough area map slots on
invocation.
With this change, pcpu_alloc_area() does only area allocation and the
only failure mode is when the chunk doesn't have enough room, so
there's no need to distinguish it from memory allocation failures.
Make it return -1 in such cases instead of the hacky -ENOSPC.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 15:44:09 +0000 (00:44 +0900)]
percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free()
Impact: code reorganization for later changes
With static map handling moved to pcpu_split_block(), pcpu_realloc()
only clutters the code and it's also unsuitable for scheduled locking
changes. Implement and use pcpu_mem_alloc/free() instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Markus Metzger [Thu, 5 Mar 2009 07:57:21 +0000 (08:57 +0100)]
x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration being written.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Thu, 5 Mar 2009 07:49:54 +0000 (08:49 +0100)]
x86, bts: remove bad warning
In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.
Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.
Remove the bad warning. I will add it again when Oleg's changes are in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Enberg [Thu, 5 Mar 2009 15:04:57 +0000 (17:04 +0200)]
x86: rename do_not_nx to disable_nx in mm/init_64.c
As a preparatory step for unifying noexec handling on 32-bit and 64-bit,
rename the do_not_nx variable to disable_nx on 64-bit.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1236265497.31324.11.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Enberg [Thu, 5 Mar 2009 15:04:26 +0000 (17:04 +0200)]
x86: fix uninitialized variable in init_memory_mapping()
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1236265466.31324.9.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Fri, 6 Mar 2009 11:12:50 +0000 (03:12 -0800)]
x86: make "memtest" like "memtest=17"
Impact: make boot command line "memtest" do one loop by default
so users don't need to guess how many patterns make up one loop.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49B10532.3020105@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Takashi Iwai [Fri, 6 Mar 2009 08:43:58 +0000 (09:43 +0100)]
ALSA: hda - Fix headphone-detect regression with multiple HP jacks
The recent changes to the DAC detection mechanism in patch_sigmatel.c
break the HP detection on machines with multiple HP jacks.
This is basically because of the workaround to support the multi-channel
output. Since the HP detection is a more important feature, disable
the HP-swap workaround temporarily.
Reference: Novell bnc#482052
https://bugzilla.novell.com/show_bug.cgi?id=482052
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Fri, 6 Mar 2009 08:42:07 +0000 (09:42 +0100)]
ALSA: hda - Fix typos in slave controls in patch_sigmatel.c
"Headphone Playback ..." appears twice in slave_vols[] and slave_sws[].
They should be "Headphone Playback2 ..."
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Jens Axboe [Fri, 6 Mar 2009 07:55:24 +0000 (08:55 +0100)]
block: fix missing bio back/front segment size setting in blk_recount_segments()
Commit 1e42807918d17e8c93bf14fbb74be84b141334c1 introduced a bug where we
don't get front/back segment sizes in the bio in blk_recount_segments().
Fix this by tracking the back bio as well as the front bio in
__blk_recalc_rq_segments(). This also cleans up the interface by getting
rid of the segment size pointer passing.
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
x86, percpu: setup reserved percpu area for x86_64
Impact: fix relocation overflow during module load
x86_64 uses 32bit relocations for symbol access, and static percpu
symbols, whether in core or modules, must be inside 2GB of the percpu
segment base, which the dynamic percpu allocator doesn't guarantee.
This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
first chunk so that module percpu areas are always allocated from the
first chunk which is always inside the relocatable range.
This problem exists for any percpu allocator but is easily triggered
when using the embedding allocator because the second chunk is located
beyond 2GB on it.
This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
that it only indicates the size of the area to reserve for dynamic
allocation as static and dynamic areas can be separate. The new
PERCPU_DYNAMIC_RESERVE is increased by 4k for both 32 and 64bits as
the reserved area separation eats away some allocatable space and
having slightly more headroom (currently between 4 and 8k after
minimal boot sans module area) makes sense for common case
performance.
x86_32 can address anywhere from anywhere and doesn't need reserving.
Mike Galbraith first reported the problem and bisected it to the
embedding percpu allocator commit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
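The 2GB constraint described in the commit above comes from a signed 32-bit displacement. The following is a simplified sketch of that range check, not kernel code: fits_rel32() is a hypothetical helper, and it models the relocation as a plain signed 32-bit offset from a base address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when target can be reached from base with
 * a signed 32-bit displacement, i.e. target - base lies in
 * [-2^31, 2^31 - 1]. Both directions are handled without relying on
 * signed overflow or out-of-range conversions. */
static bool fits_rel32(uint64_t base, uint64_t target)
{
	if (target >= base)
		return target - base <= (uint64_t)INT32_MAX;

	return base - target <= (uint64_t)INT32_MAX + 1;
}
```

An area placed in a second chunk more than 2GB away from the base fails this check, which is why the commit keeps module percpu areas in the first chunk.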
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
percpu, module: implement reserved allocation and use it for module percpu variables
Impact: add reserved allocation functionality and use it for module
percpu variables
This patch implements reserved allocation from the first chunk. When
setting up the first chunk, arch can ask to set aside a certain number
of bytes right after the core static area which is available only
through a separate reserved allocator. This will be used primarily
for module static percpu variables on architectures with limited
relocation range to ensure that the module percpu symbols are inside
the relocatable range.
If a reserved area is requested, the first chunk becomes reserved and
isn't available for regular allocation. If the first chunk also
includes a piggy-back dynamic allocation area, a separate chunk mapping
the same region is created to serve dynamic allocation. The first one
is called static first chunk and the second dynamic first chunk.
Although they share the page map, their different area map
initializations guarantee they serve disjoint areas according to their
purposes.
If arch doesn't set up a reserved area, reserved allocation is handled
like any other allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
percpu: add an indirection ptr for chunk page map access
Impact: allow sharing page map, no functional difference yet
Make chunk->page access indirect by adding a pointer and renaming the
actual array to page_ar. This will be used by future changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
x86: make embedding percpu allocator return excessive free space
Impact: reduce unnecessary memory usage on certain configurations
The embedding percpu allocator allocates unit_size *
smp_num_possible_cpus() bytes consecutively and uses them for the first
chunk. However, if the static area is small, this can result in
excessive preallocated free space in the first chunk due to the
PCPU_MIN_UNIT_SIZE restriction.
This patch makes the embedding percpu allocator preallocate only what's
necessary as described by PERCPU_DYNAMIC_RESERVE and return the
leftover to the bootmem allocator.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
percpu: use negative for auto for pcpu_setup_first_chunk() arguments
Impact: argument semantic cleanup
In pcpu_setup_first_chunk(), zero @unit_size and @dyn_size meant
auto-sizing. It's okay for @unit_size as 0 doesn't make sense but 0
dynamic reserve size is valid. Also, if arch @dyn_size is calculated
from other parameters, it might end up passing in 0 @dyn_size and
malfunction when the size is automatically adjusted.
This patch makes both @unit_size and @dyn_size ssize_t and uses -1 for
auto sizing.
Signed-off-by: Tejun Heo <tj@kernel.org>
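The argument convention introduced by the commit above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the percpu allocator code: resolve_dyn_size() and its auto_size parameter are hypothetical names. The point is that a negative ssize_t selects auto-sizing, which leaves 0 free to mean "reserve nothing".

```c
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical helper: a negative request means "size automatically",
 * while any non-negative value, including 0, is used as given. */
static size_t resolve_dyn_size(ssize_t dyn_size, size_t auto_size)
{
	if (dyn_size < 0)
		return auto_size;

	return (size_t)dyn_size;
}
```

A caller that computes its reserve from other parameters and ends up with 0 now gets a zero-sized reserve rather than an unexpected automatic one.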
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
percpu: improve first chunk initial area map handling
Impact: no functional change
When the first chunk is created, its initial area map is not allocated
because kmalloc isn't online yet. The map is allocated and
initialized on the first allocation request on the chunk. This works
fine but the scattering of initialization logic between the init
function and allocation path is a bit confusing.
This patch makes the first chunk initialize and use a minimal statically
allocated map from pcpu_setup_first_chunk(). The map resizing path
still needs to handle this specially but it's more straight-forward
and gives more latitude to the init path. This will ease future
changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 05:33:59 +0000 (14:33 +0900)]
percpu: cosmetic renames in pcpu_setup_first_chunk()
Impact: cosmetic, preparation for future changes
Make the following renames in pcpu_setup_first_chunk() in preparation
for future changes.
* s/free_size/dyn_size/
* s/static_vm/first_vm/
* s/static_chunk/schunk/
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Fri, 6 Mar 2009 05:33:58 +0000 (14:33 +0900)]
percpu: clean up percpu constants
Impact: cleanup
Make the following cleanups.
* There isn't much arch-specific about PERCPU_MODULE_RESERVE. Always
define it whether arch overrides PERCPU_ENOUGH_ROOM or not.
* blackfin overrides PERCPU_ENOUGH_ROOM to align static area size. Do
it by default.
* percpu allocation sizes don't have much to do with the page size.
Don't use PAGE_SHIFT in their definition.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bryan Wu <cooloney@kernel.org>
Ingo Molnar [Thu, 5 Mar 2009 20:49:47 +0000 (21:49 +0100)]
Merge branch 'x86/uv' into x86/core
Ingo Molnar [Thu, 5 Mar 2009 20:49:44 +0000 (21:49 +0100)]
Merge branch 'x86/doc' into x86/core
Ingo Molnar [Thu, 5 Mar 2009 20:49:35 +0000 (21:49 +0100)]
Merge branch 'x86/mm' into x86/core
Ingo Molnar [Thu, 5 Mar 2009 20:49:25 +0000 (21:49 +0100)]
Merge branch 'x86/mce2' into x86/core