Linus Torvalds [Tue, 9 Jan 2007 17:35:16 +0000 (09:35 -0800)]
Merge branch 'merge' of /linux/kernel/git/paulus/powerpc
* 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix bugs in the hypervisor call stats code
[POWERPC] Fix corruption in hcall9
[POWERPC] iSeries: fix setup initcall
[POWERPC] iSeries: fix viopath initialisation
[POWERPC] iSeries: fix lpevents initialisation
[POWERPC] iSeries: fix proc/iSeries initialisation
[POWERPC] iSeries: fix mf proc initialisation
[POWERPC] disable PReP and EFIKA during make oldconfig
[POWERPC] Fix mpc52xx serial driver to work for arch/ppc again
[POWERPC] Don't include powerpc/sysdev/rom.o for arch/ppc builds
[POWERPC] Fix mpc52xx fdt to use correct device_type for sound devices
[POWERPC] 52xx: Don't use device_initcall to probe of_platform_bus
[POWERPC] Add legacy iSeries to ppc64_defconfig
[POWERPC] Update ppc64_defconfig
[POWERPC] Fix manual assembly WARN_ON() in enter_rtas().
[POWERPC] Avoid calling get_irq_server() with a real, not virtual irq.
[POWERPC] Fix unbalanced uses of of_node_put
[POWERPC] Fix bogus BUG_ON() in in hugetlb_get_unmapped_area()
Linus Torvalds [Tue, 9 Jan 2007 17:34:20 +0000 (09:34 -0800)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] locking problem with __cpcmd.
[S390] don't call handle_mm_fault() if in an atomic context.
[S390] Fix vmalloc area size calculation.
[S390] Fix cpu hotplug (missing 'online' attribute).
[S390] cio: use barrier() in stsch_reset.
[S390] memory detection misses 128k.
Mark M. Hoffman [Tue, 9 Jan 2007 03:11:29 +0000 (22:11 -0500)]
[PATCH] i2c/pci: fix sis96x smbus quirk once and for all
The sis96x SMBus PCI device depends on two different quirks to run
in a specific order. Apart from being fragile, this was found to
actually break on (at least) recent FC4, FC5, and FC6 kernels. This
patch fixes the quirks so that they work without relying on the
compiler and/or linker to put them in any specific order.
http://lists.lm-sensors.org/pipermail/lm-sensors/2006-April/015962.html
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=189719
I tested this patch.
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Greg K-H <greg@kroah.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christian Borntraeger [Tue, 9 Jan 2007 09:19:03 +0000 (10:19 +0100)]
[S390] locking problem with __cpcmd.
Changeset
740b5706b9c4b3767f597b3ea76654c6f2a800b2 moved the protecting
spinlock from __cpcmd to cpcmd. Therefore vmcp can no longer use __cpcmd,
instead we have to use cpcmd.
Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 9 Jan 2007 09:18:50 +0000 (10:18 +0100)]
[S390] don't call handle_mm_fault() if in an atomic context.
There are several places in the futex code where a spin_lock is held
and still uaccesses happen. Deadlocks are avoided by increasing the
preempt count. The pagefault handler will then not take any locks
but will immediately search the fixup tables.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 9 Jan 2007 09:18:47 +0000 (10:18 +0100)]
[S390] Fix vmalloc area size calculation.
setup_memory_end() uses VMALLOC_END instead of VMALLOC_END_INIT to
calculate the maximum supported size of physical memory. Since
VMALLOC_END is zero, this will cause a crash on 31 bit systems.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 9 Jan 2007 09:18:44 +0000 (10:18 +0100)]
[S390] Fix cpu hotplug (missing 'online' attribute).
72486f1f8f0a2bc828b9d30cf4690cf2dd6807fc inverts the logic if an
'online' attribute in /sys/devices/system/cpu/cpuX should appear.
So we end up with no hotpluggable cpus at all...
Set the hotpluggable value to one to make sure the online
attribute appears again.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 9 Jan 2007 09:18:41 +0000 (10:18 +0100)]
[S390] cio: use barrier() in stsch_reset.
Use barrier() in stsch_reset() instead of duplicating the stsch()
inline assembly and adding "memory" to the clobberlist.
Pointed out by Chuck Ebbert.
Real fix would be to add a fixup section to the stsch() and extend the
basic program check handler so it searches the exception tables in case
of a program check.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hongjie Yang [Tue, 9 Jan 2007 09:18:36 +0000 (10:18 +0100)]
[S390] memory detection misses 128k.
Fix a memory leak problem in the memory detection routines. A memory leak
of 128k occurs when we have a contiguous memory with mixed access-mode
(read or write) ranges.
Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Anton Blanchard [Mon, 8 Jan 2007 15:43:02 +0000 (02:43 +1100)]
[POWERPC] Fix bugs in the hypervisor call stats code
There were a few issues with the HCALL_STATS code:
- PURR cpu feature checks were backwards
- We iterated one entry off the end of the hcall_stats array
- Remove dead update_hcall_stats() function prototype
I noticed one thing while debugging, and that is we call H_ENTER (to set
up the MMU hashtable in early init) before we have done the cpu fixups.
This means we will execute the PURR SPR reads even on a CPU that isnt
capable of it. I wonder if we can move the CPU feature fixups earlier.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Mon, 8 Jan 2007 15:37:16 +0000 (02:37 +1100)]
[POWERPC] Fix corruption in hcall9
It looks to me like we are corrupting r12 in the hcall9 function.
Although we have r0 free we cant use offsets against it, so save
away r12 in there instead. r12 holds the ninth return value from
the hypervisor call, so without this fix, the caller will see the
wrong value for the ninth element in the array that gets the return
values.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Thu, 4 Jan 2007 06:06:21 +0000 (17:06 +1100)]
[POWERPC] iSeries: fix setup initcall
Clearing the progress indicator should only be done if we are running
on legacy iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Thu, 4 Jan 2007 06:05:13 +0000 (17:05 +1100)]
[POWERPC] iSeries: fix viopath initialisation
/proc/iSeries/config should only be created if we are running on legacy
iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Thu, 4 Jan 2007 06:04:21 +0000 (17:04 +1100)]
[POWERPC] iSeries: fix lpevents initialisation
/proc/iSeries/lpevents should only be created if we are running
on legacy iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Thu, 4 Jan 2007 06:03:16 +0000 (17:03 +1100)]
[POWERPC] iSeries: fix proc/iSeries initialisation
These proc files should only be created if we are running on legacy
iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Thu, 4 Jan 2007 06:01:51 +0000 (17:01 +1100)]
[POWERPC] iSeries: fix mf proc initialisation
This proc file should only be created if we are running on legacy
iSeries. Since we can now run the same kernel on legacy iSeries and
other machines, we currently get the /proc/iSeries directory and the
files in it on non-iSeries machines, and accessing them causes an oops
in some cases. This and the following patches make sure that these
files are not created on non-iSeries machines, thus avoiding the oops.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Wed, 3 Jan 2007 17:33:56 +0000 (18:33 +0100)]
[POWERPC] disable PReP and EFIKA during make oldconfig
New boards should not be enabled per default.
Disable EFIKA and PReP per default.
Anyone who really needes the new code can enable it during make oldconfig.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Tue, 2 Jan 2007 22:45:37 +0000 (15:45 -0700)]
[POWERPC] Fix mpc52xx serial driver to work for arch/ppc again
The mpc52xx_uart_of_enumerate() function was added when adding 52xx
support to arch/powerpc, but it must not be called for arch/ppc.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Tue, 2 Jan 2007 22:45:29 +0000 (15:45 -0700)]
[POWERPC] Don't include powerpc/sysdev/rom.o for arch/ppc builds
sysdev/rom.c is for arch/powerpc only. Don't compile it when building
an arch/ppc kernel.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Tue, 2 Jan 2007 22:44:44 +0000 (15:44 -0700)]
[POWERPC] Fix mpc52xx fdt to use correct device_type for sound devices
This corrects the documented interface for mpc52xx device trees.
Sound devices should be using 'sound' for the device_type field, not
the type of sound interface.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Sylvain Munaut [Tue, 2 Jan 2007 22:29:53 +0000 (23:29 +0100)]
[POWERPC] 52xx: Don't use device_initcall to probe of_platform_bus
Using device_initcall makes it happen for every platform that
compiles this file in. This is really bad, for obvious reasons.
Instead, we use the .init field of the machine description. If
the platform needs the hook to do something specific it can provides
its own function and call mpc52xx_declare_of_platform_devices from
there. If not, the mpc52xx_declare_of_platform_devices function can
directly be used as the init hook.
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Tue, 2 Jan 2007 05:13:50 +0000 (16:13 +1100)]
[POWERPC] Add legacy iSeries to ppc64_defconfig
Since we can now boot legacy iSeries and other machines with the same
config, enable legacy iSeries in ppc64_defconfig.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Stephen Rothwell [Tue, 2 Jan 2007 05:11:09 +0000 (16:11 +1100)]
[POWERPC] Update ppc64_defconfig
Enabled new netfilter stuff corresponding to what was enabled before
under different names, and turned on the gxt4500 video driver;
otherwise just took the defaults.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
David Woodhouse [Mon, 1 Jan 2007 18:45:34 +0000 (18:45 +0000)]
[POWERPC] Fix manual assembly WARN_ON() in enter_rtas().
When we switched over to the generic BUG mechanism we forgot to change
the assembly code which open-codes a WARN_ON() in enter_rtas(), so the
bug table got corrupted.
This patch provides an EMIT_BUG_ENTRY macro for use in assembly code,
and uses it in entry_64.S. Tested with CONFIG_DEBUG_BUGVERBOSE on ppc64
but not without -- I tried to turn it off but it wouldn't go away; I
suspect Aunt Tillie probably needed it.
This version gets __FILE__ and __LINE__ right in the assembly version --
rather than saying include/asm-powerpc/bug.h line 21 every time which is
a little suboptimal.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michal Ostrowski [Wed, 20 Dec 2006 13:29:40 +0000 (07:29 -0600)]
[POWERPC] Avoid calling get_irq_server() with a real, not virtual irq.
We can use default_server when masking an interrupt vector.
get_irq_server() assumes a virtual irq, so badness may happen if we
give it a real one.
Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Nathan Lynch [Tue, 2 Jan 2007 22:37:06 +0000 (16:37 -0600)]
[POWERPC] Fix unbalanced uses of of_node_put
The (maple|pasemi)_init_IRQ functions call of_node_put(root) once more
than they should, causing the refcount of the root node to underflow,
which triggers the WARN_ON in kref_get.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Thu, 21 Dec 2006 22:23:03 +0000 (09:23 +1100)]
[POWERPC] Fix bogus BUG_ON() in in hugetlb_get_unmapped_area()
The powerpc specific version of hugetlb_get_unmapped_area() makes some
unwarranted assumptions about what checks have been made to its
parameters by its callers. This will lead to a BUG_ON() if a 32-bit
process attempts to make a hugepage mapping which extends above
TASK_SIZE (4GB).
I'm not sure if these assumptions came about because they were valid
with earlier versions of the get_unmapped_area() path, or if it was
always broken. Nonetheless this patch fixes the logic, and removes
the crash.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Linus Torvalds [Mon, 8 Jan 2007 23:08:22 +0000 (15:08 -0800)]
Merge branch 'for-linus' of /linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
selinux: Delete mls_copy_context
Linus Torvalds [Mon, 8 Jan 2007 23:07:31 +0000 (15:07 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] PNX8550: Fix system timer support
[MIPS] TX49: Fix use of CDEX build_store_reg()
[MIPS] pnx8550: Fix write_config_byte() PCI config space accessor
[MIPS] Fix build errors on SEAD
[MIPS] SMTC build fix
[MIPS] csum_partial and copy in parallel
[MIPS] Malta: Add missing MTD file.
Linus Torvalds [Mon, 8 Jan 2007 23:06:39 +0000 (15:06 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Provide basic printk_clock() implementation
[ARM] Resolve fuse and direct-IO failures due to missing cache flushes
[ARM] pass vma for flush_anon_page()
[ARM] Fix potential MMCI bug
[ARM] Fix kernel-mode undefined instruction aborts
[ARM] 4082/1: iop3xx: fix iop33x gpio register offset
[ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes
[ARM] 4079/1: iop: Update MAINTAINERS
Linus Torvalds [Mon, 8 Jan 2007 23:04:46 +0000 (15:04 -0800)]
Revert "[PATCH] x86-64: Try multiple timer variants in check_timer"
This reverts commit
b026872601976f666bae77b609dc490d1834bf77, which has
been linked to several problem reports with IO-APIC and the timer.
Machines either don't boot because the timer doesn't happen, or we get
double timer interrupts because we end up double-routing the timer irq
through multiple interfaces.
See for example
http://lkml.org/lkml/2006/12/16/101
http://lkml.org/lkml/2007/1/3/9
http://bugzilla.kernel.org/show_bug.cgi?id=7789
about some of the discussion.
Patches to fix this cleanup exist (and have been confirmed to work fine
at least for some of the affected cases) and we'll revisit it for
2.6.21, but this late in the -rc series we're better off just reverting
the incomplete commit that caused the problems.
Suggested-by: Adrian Bunk <bunk@stusta.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Yinghai Lu <yinghai.lu@amd.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Venkat Yekkirala [Tue, 12 Dec 2006 19:02:41 +0000 (13:02 -0600)]
selinux: Delete mls_copy_context
This deletes mls_copy_context() in favor of mls_context_cpy() and
replaces mls_scopy_context() with mls_context_cpy_low().
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Vitaly Wool [Thu, 28 Dec 2006 14:14:05 +0000 (17:14 +0300)]
[MIPS] PNX8550: Fix system timer support
the patch inlined below restores proper time accounting for PNX8550-based
boards. It also gets rid of #ifdef in the generic code which becomes
unnecessary then.
It's functionally identical to the previous patch with the same name but
it has minor comments from Atsushi and Sergei taken into account.
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Sun, 17 Dec 2006 15:38:21 +0000 (00:38 +0900)]
[MIPS] TX49: Fix use of CDEX build_store_reg()
The commit
a923660d786a53e78834b19062f7af2535f7f8ad accidently
prevents TX49 from using CDEX. Use build_dst_pref() only if prefetch
for store was really available.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Davy Chan [Fri, 5 Jan 2007 05:56:46 +0000 (13:56 +0800)]
[MIPS] pnx8550: Fix write_config_byte() PCI config space accessor
There's a serious typo in the function:
arch/mips/pci/ops-pnx8550.c:write_config_byte()
The parameter passed to the function config_access() is PCI_CMD_CONFIG_READ
instead of PCI_CMD_CONFIG_WRITE. This renders any attempts to write
a single byte to the PCI configuration registers useless.
This problem does not exist for write_config_word() nor write_config_dword().
This problem has been there since kernel v2.6.17 and is still there
as of kernel v2.6.19.1.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Sun, 7 Jan 2007 16:27:40 +0000 (01:27 +0900)]
[MIPS] Fix build errors on SEAD
Quick and dirty fix for build errors on SEAD.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Sun, 7 Jan 2007 15:50:34 +0000 (00:50 +0900)]
[MIPS] SMTC build fix
Pass "irq" to __DO_IRQ_SMTC_HOOK() macro.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Tue, 12 Dec 2006 16:22:06 +0000 (01:22 +0900)]
[MIPS] csum_partial and copy in parallel
Implement optimized asm version of csum_partial_copy_nocheck,
csum_partial_copy_from_user and csum_and_copy_to_user which can do
calculate and copy in parallel, based on memcpy.S.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 12 Dec 2006 11:52:34 +0000 (11:52 +0000)]
[MIPS] Malta: Add missing MTD file.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Russell King [Mon, 8 Jan 2007 19:49:12 +0000 (19:49 +0000)]
[ARM] Provide basic printk_clock() implementation
Current sched_clock() implementations on ARM cause unbootable kernels
with PRINTK_TIME support enabled. To avoid this, provide a basic
printk_clock() implementation which avoids sched_clock() being called
before the page tables have been set up.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 30 Dec 2006 23:17:40 +0000 (23:17 +0000)]
[ARM] Resolve fuse and direct-IO failures due to missing cache flushes
fuse does not work on ARM due to cache incoherency issues - fuse wants
to use get_user_pages() to copy data from the current process into
kernel space. However, since this accesses userspace via the kernel
mapping, the kernel mapping can be out of date wrt data written to
userspace.
This can lead to unpredictable behaviour (in the case of fuse) or data
corruption for direct-IO.
This resolves debian bug #402876
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 30 Dec 2006 22:24:19 +0000 (22:24 +0000)]
[ARM] pass vma for flush_anon_page()
Since get_user_pages() may be used with processes other than the
current process and calls flush_anon_page(), flush_anon_page() has to
cope in some way with non-current processes.
It may not be appropriate, or even desirable to flush a region of
virtual memory cache in the current process when that is different to
the process that we want the flush to occur for.
Therefore, pass the vma into flush_anon_page() so that the architecture
can work out whether the 'vmaddr' is for the current process or not.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 8 Jan 2007 16:42:51 +0000 (16:42 +0000)]
[ARM] Fix potential MMCI bug
The MMCI driver might end up aborting the initial command and leaving
the data part of the command sequence still in place. Avoid this
problem by ensuring that any data sequence is properly cleared out
when a command completes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sun, 7 Jan 2007 05:45:51 +0000 (21:45 -0800)]
Linux 2.6.20-rc4
Russell King [Sat, 6 Jan 2007 22:53:48 +0000 (22:53 +0000)]
[ARM] Fix kernel-mode undefined instruction aborts
If the kernel attempts to execute a CP1 or CP2 instruction and it
aborts, and a FP emulator is not loaded, we try to return as if to
a user context, instead of the proper kernel context. Since the
fault came from kernel mode, we must use the kernel return paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sat, 6 Jan 2007 21:28:21 +0000 (13:28 -0800)]
Revert "[PATCH] binfmt_elf: randomize PIE binaries (2nd try)"
This reverts commit
59287c0913cc9a6c75712a775f6c1c1ef418ef3b.
Hugh Dickins reports that it causes random failures on x86 with SuSE
10.2, and points out
"Isn't that randomization, anywhere from 0x10000 to ELF_ET_DYN_BASE,
sure to place the ET_DYN from time to time just where the comment
says it's trying to avoid? I assume that somehow results in the error
reported."
(where the comment in question is the existing comment in the source
code about mmap/brk clashes).
Suggested-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Marcus Meissner <meissner@suse.de>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dan Williams [Thu, 4 Jan 2007 01:14:49 +0000 (02:14 +0100)]
[ARM] 4082/1: iop3xx: fix iop33x gpio register offset
iop33x gpio offset is correct in include/asm-arm/arch-iop33x/iop33x.h, but
include/asm-arm/hardware/iop3xx.h adds 4.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ben Dooks [Sun, 24 Dec 2006 00:36:35 +0000 (01:36 +0100)]
[ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes
Include <asm/io.h> to fix the warning:
arch/arm/kernel/traps.c:647:6: warning: symbol '__readwrite_bug' was not declared. Should it be static?
Include <linux/mc146818rtc.h> to fix the warning:
arch/arm/kernel/time.c:42:1: warning: symbol 'rtc_lock' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dan Williams [Tue, 2 Jan 2007 17:32:37 +0000 (18:32 +0100)]
[ARM] 4079/1: iop: Update MAINTAINERS
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sat, 6 Jan 2007 08:10:55 +0000 (00:10 -0800)]
Merge /pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
[PATCH] Driver core: Fix prefix driver links in /sys/module by bus-name
Linus Torvalds [Sat, 6 Jan 2007 08:10:37 +0000 (00:10 -0800)]
Merge /pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
[PATCH] PCI: disable PCI_MULTITHREAD_PROBE
Linus Torvalds [Sat, 6 Jan 2007 08:10:21 +0000 (00:10 -0800)]
Merge /pub/scm/linux/kernel/git/gregkh/usb-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: asix: Fix AX88772 device PHY selection
USB: usblp.c - add Kyocera Mita FS 820 to list of "quirky" printers
sisusb_con warning fixes
USB: Fixed bug in endpoint release function.
USB: small update to Documentation/usb/acm.txt
USB storage: fix ipod ejecting issue
USB Storage: unusual_devs: add supertop drives
USB: omap_udc build fixes (sync with linux-omap)
USB: funsoft is borken on sparc
USB: fix interaction between different interfaces in an "Option" usb device
UHCI: support device_may_wakeup
UHCI: make test for ASUS motherboard more specific
Linus Torvalds [Sat, 6 Jan 2007 08:09:14 +0000 (00:09 -0800)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c/m41t00: Do not forget to write year
i2c-mv64xxx: Fix random oops at boot
i2c: Migration aids for i2c_adapter.dev removal
i2c-pnx: Add entry to MAINTAINERS
i2c-pnx: Fix interrupt handler, get rid of EARLY config option
Erik Jacobson [Sat, 6 Jan 2007 00:37:05 +0000 (16:37 -0800)]
[PATCH] connector: some fixes for ia64 unaligned access errors
On ia64, the various functions that make up cn_proc.c cause kernel
unaligned access errors.
If you are using these, for example, to get notification about all tasks
forking and exiting, you get multiple unaligned access errors per process.
Use put_unaligned() in the appropriate palces to fix this.
Signed-off-by: Erik Jacobson <erikj@sgi.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sat, 6 Jan 2007 00:37:05 +0000 (16:37 -0800)]
[PATCH] shrink_all_memory(): fix lru_pages handling
At the end of shrink_all_memory() we forget to recalculate lru_pages: it can
be zero.
Fix that up, and add a helper function for this operation too.
Also, recalculate lru_pages each time around the inner loop to get the
balancing correct.
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Evgeniy Dushistov [Sat, 6 Jan 2007 00:37:04 +0000 (16:37 -0800)]
[PATCH] fix garbage instead of zeroes in UFS
Looks like this is the problem, which point Al Viro some time ago:
ufs's get_block callback allocates 16k of disk at a time, and links that
entire 16k into the file's metadata. But because get_block is called for only
a single buffer_head (a 2k buffer_head in this case?) we are only able to tell
the VFS that this 2k is buffer_new().
So when ufs_getfrag_block() is later called to map some more data in the file,
and when that data resides within the remaining 14k of this fragment,
ufs_getfrag_block() will incorrectly return a !buffer_new() buffer_head.
I don't see _right_ way to do nullification of whole block, if use inode
page cache, some pages may be outside of inode limits (inode size), and
will be lost; if use blockdev page cache it is possible to zero real data,
if later inode page cache will be used.
The simpliest way, as can I see usage of block device page cache, but not only
mark dirty, but also sync it during "nullification". I use my simple tests
collection, which I used for check that create,open,write,read,close works on
ufs, and I see that this patch makes ufs code 18% slower then before.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Sat, 6 Jan 2007 00:37:03 +0000 (16:37 -0800)]
[PATCH] fix OOM killing of swapoff
These days, if you swapoff when there isn't enough memory, OOM killer gives
"BUG: scheduling while atomic" and the machine hangs: badness() needs to do
its PF_SWAPOFF return after the task_unlock (tasklist_lock is also held
here, so p isn't going to be freed: PF_SWAPOFF might get turned off at any
moment, but that doesn't really matter).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matthijs van Otterdijk [Sat, 6 Jan 2007 00:37:03 +0000 (16:37 -0800)]
[PATCH] fix the toshiba_acpi write_lcd return value
write_lcd() in toshiba_acpi returns 0 on success since the big ACPI patch
merged in 2.6.20-rc2. It should return count.
Signed-off-by: Matthijs van Otterdijk <thotter@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cyrill V. Gorcunov [Sat, 6 Jan 2007 00:37:02 +0000 (16:37 -0800)]
[PATCH] qconf: fix SIGSEGV on empty menu items
qconf may cause SIGSEGV by trying to show debug information on empty menu
items
Signed-off-by: Cyrill V. Gorcunov <gorcunov@gmail.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Sat, 6 Jan 2007 00:37:02 +0000 (16:37 -0800)]
[PATCH] Check for populated zone in __drain_pages
Both process_zones() and drain_node_pages() check for populated zones
before touching pagesets. However, __drain_pages does not do so,
This may result in a NULL pointer dereference for pagesets in unpopulated
zones if a NUMA setup is combined with cpu hotplug.
Initially the unpopulated zone has the pcp pointers pointing to the boot
pagesets. Since the zone is not populated the boot pageset pointers will
not be changed during page allocator and slab bootstrap.
If a cpu is later brought down (first call to __drain_pages()) then the pcp
pointers for cpus in unpopulated zones are set to NULL since __drain_pages
does not first check for an unpopulated zone.
If the cpu is then brought up again then we call process_zones() which will
ignore the unpopulated zone. So the pageset pointers will still be NULL.
If the cpu is then again brought down then __drain_pages will attempt to
drain pages by following the NULL pageset pointer for unpopulated zones.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan [Sat, 6 Jan 2007 00:37:01 +0000 (16:37 -0800)]
[PATCH] hpt37x: Two important bug fixes
The HPT37x driver very carefully handles DMA completions and the needed
fixups are done on pci registers 0x50 and 0x52. This is unfortunate
because the actual registers are 0x50 and 0x54. Fixing this offset cures
the second channel problems reported.
Secondly there are some problems with the HPT370 and certain ATA drives.
The filter code however only filters ATAPI devices due to a reversed type
check.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alexey Dobriyan [Sat, 6 Jan 2007 00:37:00 +0000 (16:37 -0800)]
[PATCH] pata_optidma: typo in Kconfig
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dor Laor [Sat, 6 Jan 2007 00:37:00 +0000 (16:37 -0800)]
[PATCH] KVM: Simplify test for interrupt window
No need to test for rflags.if as both VT and SVM specs assure us that on exit
caused from interrupt window opening, 'if' is set.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)]
[PATCH] KVM: Simplify mmu_alloc_roots()
Small optimization/cleanup:
page == page_header(page->page_hpa)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)]
[PATCH] KVM: Make loading cr3 more robust
Prevent the guest's loading of a corrupt cr3 (pointing at no guest phsyical
page) from crashing the host.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Add missing dirty bit
If we emulate a write, we fail to set the dirty bit on the guest pte, leading
the guest to believe the page is clean, and thus lose data. Bad.
Fix by setting the guest pte dirty bit under such conditions.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:58 +0000 (16:36 -0800)]
[PATCH] KVM: Don't set guest cr3 from vmx_vcpu_setup()
It overwrites the right cr3 set from mmu setup. Happens only with the test
harness.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:58 +0000 (16:36 -0800)]
[PATCH] KVM: Add missing 'break'
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Sat, 6 Jan 2007 00:36:57 +0000 (16:36 -0800)]
[PATCH] KVM: Avoid oom on cr3 switch
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:57 +0000 (16:36 -0800)]
[PATCH] KVM: Initialize vcpu->kvm a little earlier
Fixes oops on early close of /dev/kvm.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:56 +0000 (16:36 -0800)]
[PATCH] KVM: Improve reporting of vmwrite errors
This will allow us to see the root cause when a vmwrite error happens.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:56 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: add audit code to check mappings, etc are correct
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:55 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Destroy mmu while we still have a vcpu left
mmu_destroy flushes the guest tlb (indirectly), which needs a valid vcpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:55 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Flush guest tlb when reducing permissions on a pte
If we reduce permissions on a pte, we must flush the cached copy of the pte
from the guest's tlb.
This is implemented at the moment by flushing the entire guest tlb, and can be
improved by flushing just the relevant virtual address, if it is known.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:54 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:53 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects
The mmu sometimes needs memory for reverse mapping and parent pte chains.
however, we can't allocate from within the mmu because of the atomic context.
So, move the allocations to a central place that can be executed before the
main mmu machinery, where we can bail out on failure before any damage is
done.
(error handling is deffered for now, but the basic structure is there)
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:52 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Free pages on kvm destruction
Because mmu pages have attached rmap and parent pte chain structures, we need
to zap them before freeing so the attached structures are freed.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:52 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Treat user-mode faults as a hint that a page is no longer a page table
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:51 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Fix cmpxchg8b emulation
cmpxchg8b uses edx:eax as the compare operand, not edi:eax.
cmpxchg8b is used by 32-bit pae guests to set page table entries atomically,
and this is emulated touching shadowed guest page tables.
Also, implement it for 32-bit hosts.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:51 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Never free a shadow page actively serving as a root
We always need cr3 to point to something valid, so if we detect that we're
freeing a root page, simply push it back to the top of the active list.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:50 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Page table write flood protection
In fork() (or when we protect a page that is no longer a page table), we can
experience floods of writes to a page, which have to be emulated. This is
expensive.
So, if we detect such a flood, zap the page so subsequent writes can proceed
natively.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:50 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: If an empty shadow page is not empty, report more info
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:49 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Ensure freed shadow pages are clean
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:49 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: <ove is_empty_shadow_page() above kvm_mmu_free_page()
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:48 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Handle misaligned accesses to write protected guest page tables
A misaligned access affects two shadow ptes instead of just one.
Since a misaligned access is unlikely to occur on a real page table, just zap
the page out of existence, avoiding further trouble.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:48 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Remove release_pt_page_64()
Unused.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Remove invlpg interception
Since we write protect shadowed guest page tables, there is no need to trap
page invalidations (the guest will always change the mapping before issuing
the invlpg instruction).
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: oom handling
When beginning to process a page fault, make sure we have enough shadow pages
available to service the fault. If not, free some pages.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: kvm_mmu_put_page() only removes one link to the page
... and so must not free it unconditionally.
Move the freeing to kvm_mmu_zap_page().
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:46 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Implement child shadow unlinking
When removing a page table, we must maintain the parent_pte field all child
shadow page tables.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:45 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: If emulating an instruction fails, try unprotecting the page
A page table may have been recycled into a regular page, and so any
instruction can be executed on it. Unprotect the page and let the cpu do its
thing.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:45 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Zap shadow page table entries on writes to guest page tables
Iterate over all shadow pages which correspond to a the given guest page table
and remove the mappings.
A subsequent page fault will reestablish the new mapping.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:44 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Support emulated writes into RAM
As the mmu write protects guest page table, we emulate those writes. Since
they are not mmio, there is no need to go to userspace to perform them.
So, perform the writes in the kernel if possible, and notify the mmu about
them so it can take the approriate action.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:44 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Let the walker extract the target page gfn from the pte
This fixes a problem where set_pte_common() looked for shadowed pages based on
the page directory gfn (a huge page) instead of the actual gfn being mapped.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:43 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Write protect guest pages when a shadow is created for them
When we cache a guest page table into a shadow page table, we need to prevent
further access to that page by the guest, as that would render the cache
incoherent.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:43 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Shadow page table caching
Define a hashtable for caching shadow page tables. Look up the cache on
context switch (cr3 change) or during page faults.
The key to the cache is a combination of
- the guest page table frame number
- the number of paging levels in the guest
* we can cache real mode, 32-bit mode, pae, and long mode page
tables simultaneously. this is useful for smp bootup.
- the guest page table table
* some kernels use a page as both a page table and a page directory. this
allows multiple shadow pages to exist for that page, one per level
- the "quadrant"
* 32-bit mode page tables span 4MB, whereas a shadow page table spans
2MB. similarly, a 32-bit page directory spans 4GB, while a shadow
page directory spans 1GB. the quadrant allows caching up to 4 shadow page
tables for one guest page in one level.
- a "metaphysical" bit
* for real mode, and for pse pages, there is no guest page table, so set
the bit to avoid write protecting the page.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:42 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Make kvm_mmu_alloc_page() return a kvm_mmu_page pointer
This allows further manipulation on the shadow page table.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:41 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Make the shadow page tables also special-case pae
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:41 +0000 (16:36 -0800)]
[PATCH] KVM: MMU: Use the guest pdptrs instead of mapping cr3 in pae mode
This lets us not write protect a partial page, and is anyway what a real
processor does.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avi Kivity [Sat, 6 Jan 2007 00:36:40 +0000 (16:36 -0800)]
[PATCH] KVM: MU: Special treatment for shadow pae root pages
Since we're not going to cache the pae-mode shadow root pages, allocate a
single pae shadow that will hold the four lower-level pages, which will act as
roots.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>