Takuya Yoshikawa [Sat, 23 Apr 2011 09:48:02 +0000 (18:48 +0900)]
KVM: x86 emulator: Use opcode::execute for Group 1, CMPS and SCAS
The following instructions are changed to use opcode::execute.
Group 1 (80-83)
ADD (00-05), OR (08-0D), ADC (10-15), SBB (18-1D), AND (20-25),
SUB (28-2D), XOR (30-35), CMP (38-3D)
CMPS (A6-A7), SCAS (AE-AF)
The last two do the same as CMP in the emulator, so em_cmp() is used.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Thu, 21 Apr 2011 15:34:44 +0000 (00:34 +0900)]
KVM: MMU: Optimize guest page table walk
This patch optimizes the guest page table walk by using get_user()
instead of copy_from_user().
With this patch applied, paging64_walk_addr_generic() has become
about 0.5us to 1.0us faster on my Phenom II machine with NPT on.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:35:41 +0000 (12:35 +0300)]
KVM: SVM: Get rid of x86_intercept_map::valid
By reserving 0 as an invalid x86_intercept_stage, we no longer
need to store a valid flag in x86_intercept_map.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:21:50 +0000 (12:21 +0300)]
KVM: x86 emulator: Use opcode::execute for 0F 01 opcode
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:17:13 +0000 (12:17 +0300)]
KVM: x86 emulator: Don't force #UD for 0F 01 /5
While it isn't defined, no need to force a #UD. If it becomes defined
in the future this can cause wierd problems for the guest.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 21 Apr 2011 09:07:59 +0000 (12:07 +0300)]
KVM: x86 emulator: move 0F 01 sub-opcodes into their own functions
Signed-off-by: Avi Kivity <avi@redhat.com>
Randy Dunlap [Thu, 21 Apr 2011 16:09:22 +0000 (09:09 -0700)]
KVM: x86 emulator: fix const value warning on i386 in svm insn RAX check
arch/x86/kvm/emulate.c:2598: warning: integer constant is too large for 'long' type
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Clemens Noss [Thu, 21 Apr 2011 19:16:05 +0000 (21:16 +0200)]
KVM: x86 emulator: avoid calling wbinvd() macro
Commit
0b56652e33c72092956c651ab6ceb9f0ad081153 fails to build:
CC [M] arch/x86/kvm/emulate.o
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:4095:25: error: macro "wbinvd" passed 1 arguments, but takes just 0
arch/x86/kvm/emulate.c:4095:3: warning: statement with no effect
make[2]: *** [arch/x86/kvm/emulate.o] Error 1
make[1]: *** [arch/x86/kvm] Error 2
make: *** [arch/x86] Error 2
Work around this for now.
Signed-off-by: Clemens Noss <cnoss@gmx.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Liu Yuan [Thu, 21 Apr 2011 06:53:57 +0000 (14:53 +0800)]
KVM: ioapic: Fix an error field reference
Function ioapic_debug() in the ioapic_deliver() misnames
one filed by reference. This patch correct it.
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Roedel, Joerg [Wed, 20 Apr 2011 13:33:16 +0000 (15:33 +0200)]
KVM: MMU: Make cmpxchg_gpte aware of nesting too
This patch makes the cmpxchg_gpte() function aware of the
difference between l1-gfns and l2-gfns when nested
virtualization is in use. This fixes a potential
data-corruption problem in the l1-guest and makes the code
work correct (at least as correct as the hardware which is
emulated in this code) again.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:56:20 +0000 (15:56 +0300)]
KVM: x86 emulator: drop x86_emulate_ctxt::vcpu
No longer used.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:55:40 +0000 (15:55 +0300)]
KVM: Avoid using x86_emulate_ctxt.vcpu
We can use container_of() instead.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:53:23 +0000 (15:53 +0300)]
KVM: x86 emulator: add new ->wbinvd() callback
Instead of calling kvm_emulate_wbinvd() directly.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:47:13 +0000 (15:47 +0300)]
KVM: x86 emulator: add ->fix_hypercall() callback
Artificial, but needed to remove direct calls to KVM.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:43:05 +0000 (15:43 +0300)]
KVM: x86 emulator: add new ->halt() callback
Instead of reaching into vcpu internals.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:38:44 +0000 (15:38 +0300)]
KVM: x86 emulator: make emulate_invlpg() an emulator callback
Removing direct calls to KVM.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:32:49 +0000 (15:32 +0300)]
KVM: x86 emulator: emulate CLTS internally
Avoid using ctxt->vcpu; we can do everything with ->get_cr() and ->set_cr().
A side effect is that we no longer activate the fpu on emulated CLTS; but that
should be very rare.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:24:32 +0000 (15:24 +0300)]
KVM: x86 emulator: Replace calls to is_pae() and is_paging with ->get_cr()
Avoid use of ctxt->vcpu.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:21:35 +0000 (15:21 +0300)]
KVM: x86 emulator: drop use of is_long_mode()
Requires ctxt->vcpu, which is to be abolished. Replace with open calls
to get_msr().
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:12:00 +0000 (15:12 +0300)]
KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()
Replacing direct calls to realmode_lgdt(), realmode_lidt().
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 12:01:23 +0000 (15:01 +0300)]
KVM: x86 emulator: avoid using ctxt->vcpu in check_perm() callbacks
Unneeded for register access.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from intercept callback
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from cr/dr/cpl/msr callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from segment/gdt/idt callbacks
Making the emulator caller agnostic.
[Takuya Yoshikawa: fix typo leading to LDT failures]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from pio callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:37:53 +0000 (13:37 +0300)]
KVM: x86 emulator: drop vcpu argument from memory read/write callbacks
Making the emulator caller agnostic.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Apr 2011 10:12:27 +0000 (13:12 +0300)]
KVM: x86 emulator: whitespace cleanups
Clean up lines longer than 80 columns. No code changes.
Signed-off-by: Avi Kivity <avi@redhat.com>
Nelson Elhage [Mon, 18 Apr 2011 16:05:53 +0000 (12:05 -0400)]
KVM: emulator: Use linearize() when fetching instructions
Since segments need to be handled slightly differently when fetching
instructions, we add a __linearize helper that accepts a new 'fetch' boolean.
[avi: fix oops caused by wrong segmented_address initialization order]
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 18 Apr 2011 09:42:53 +0000 (11:42 +0200)]
KVM: X86: Update last_guest_tsc in vcpu_put
The last_guest_tsc is used in vcpu_load to adjust the
tsc_offset since tsc-scaling is merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the the last_host_tsc. This is fixed with this patch.
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 18 Apr 2011 09:42:52 +0000 (11:42 +0200)]
KVM: SVM: Fix nested sel_cr0 intercept path with decode-assists
This patch fixes a bug in the nested-svm path when
decode-assists is available on the machine. After a
selective-cr0 intercept is detected the rip is advanced
unconditionally. This causes the l1-guest to continue
running with an l2-rip.
This bug was with the sel_cr0 unit-test on decode-assists
capable hardware.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Nelson Elhage [Wed, 13 Apr 2011 15:44:13 +0000 (11:44 -0400)]
KVM: x86 emulator: Handle wraparound in (cs_base + offset) when fetching insns
Currently, setting a large (i.e. negative) base address for %cs does not work on
a 64-bit host. The "JOS" teaching operating system, used by MIT and other
universities, relies on such segments while bootstrapping its way to full
virtual memory management.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Duan Jiong [Mon, 11 Apr 2011 04:44:06 +0000 (12:44 +0800)]
KVM: remove useless function declaration kvm_inject_pit_timer_irqs()
Just remove useless function define kvm_inject_pit_timer_irqs() from
file arch/x86/kvm/i8254.h
Signed-off-by:Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Duan Jiong [Mon, 11 Apr 2011 04:56:01 +0000 (12:56 +0800)]
KVM: remove useless function declarations from file arch/x86/kvm/irq.h
Just remove useless function define kvm_pic_clear_isr_ack() and
pit_has_pending_timer()
Signed-off-by: Duan Jiong<djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jeff Mahoney [Wed, 13 Apr 2011 01:30:17 +0000 (21:30 -0400)]
KVM: Fix off by one in kvm_for_each_vcpu iteration
This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1:
warning: array subscript is above array bounds
kvm_for_each_vcpu currently checks to see if the index for the vcpu is
valid /after/ loading it. We don't run into problems because the address
is still inside the enclosing struct kvm and we never deference or write
to it, so this isn't a security issue.
The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of
the loop will *always* cause the loop to load an invalid location since
++idx will always be > 0.
This patch moves the load so that the check occurs before the load and
we don't run into the compiler warning.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Serge E. Hallyn [Wed, 13 Apr 2011 14:12:54 +0000 (09:12 -0500)]
KVM: fix push of wrong eip when doing softint
When doing a soft int, we need to bump eip before pushing it to
the stack. Otherwise we'll do the int a second time.
[apw@canonical.com: merged eip update as per Jan's recommendation.]
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:31:23 +0000 (00:31 +0900)]
KVM: x86 emulator: Use em_push() instead of emulate_push()
em_push() is a simple wrapper of emulate_push(). So this patch replaces
emulate_push() with em_push() and removes the unnecessary former.
In addition, the unused ops arguments are removed from emulate_pusha()
and emulate_grp45().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:29:09 +0000 (00:29 +0900)]
KVM: x86 emulator: Make emulate_push() store the value directly
PUSH emulation stores the value by calling writeback() after setting
the dst operand appropriately in emulate_push().
This writeback() using dst is not needed at all because we know the
target is the stack. So this patch makes emulate_push() call, newly
introduced, segmented_write() directly.
By this, many inlined writeback()'s are removed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Tue, 12 Apr 2011 15:24:55 +0000 (00:24 +0900)]
KVM: x86 emulator: Disable writeback for CMP emulation
This stops "CMP r/m, reg" to write back the data into memory.
Pointed out by Avi.
The writeback suppression now covers CMP, CMPS, SCAS.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Tue, 12 Apr 2011 23:27:55 +0000 (01:27 +0200)]
KVM: VMX: Ensure that vmx_create_vcpu always returns proper error
In case certain allocations fail, vmx_create_vcpu may return 0 as error
instead of a negative value encoded via ERR_PTR. This causes a NULL
pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Gleb Natapov [Thu, 31 Mar 2011 10:06:41 +0000 (12:06 +0200)]
KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu
Currently we sync registers back and forth before/after exiting
to userspace for IO, but during IO device model shouldn't need to
read/write the registers, so we can as well skip those sync points. The
only exaception is broken vmware backdor interface. The new code sync
registers content during IO only if registers are read from/written to
by userspace in the middle of the IO operation and this almost never
happens in practise.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 09:32:09 +0000 (12:32 +0300)]
KVM: x86 emulator: implement segment permission checks
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 11:08:51 +0000 (14:08 +0300)]
KVM: x86 emulator: move desc_limit_scaled()
For reuse later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 09:33:12 +0000 (12:33 +0300)]
KVM: x86 emulator: move linearize() downwards
So it can call emulate_gp() without forward declarations.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Apr 2011 08:31:19 +0000 (11:31 +0300)]
KVM: x86 emulator: pass access size and read/write intent to linearize()
Needed for segment read/write checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 16:54:30 +0000 (18:54 +0200)]
KVM: x86 emulator: change address linearization to return an error code
Preparing to add segment checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 16:48:09 +0000 (18:48 +0200)]
KVM: x86 emulator: move invlpg emulation into a function
It's going to get more complicated soon.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 31 Mar 2011 14:52:26 +0000 (16:52 +0200)]
KVM: x86 emulator: Add helpers for memory access using segmented addresses
Will help later adding proper segment checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Wed, 6 Apr 2011 10:30:03 +0000 (12:30 +0200)]
KVM: SVM: Fix fault-rip on vmsave/vmload emulation
When the emulation of vmload or vmsave fails because the
guest passed an unsupported physical address it gets an #GP
with rip pointing to the instruction after vmsave/vmload.
This is a bug and fixed by this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:51 +0000 (09:44 +0100)]
KVM: X86: Implement userspace interface to set virtual_tsc_khz
This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:50 +0000 (09:44 +0100)]
KVM: X86: Delegate tsc-offset calculation to architecture code
With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:49 +0000 (09:44 +0100)]
KVM: X86: Implement call-back to propagate virtual_tsc_khz
This patch implements a call-back into the architecture code
to allow the propagation of changes to the virtual tsc_khz
of the vcpu.
On SVM it updates the tsc_ratio variable, on VMX it does
nothing.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:48 +0000 (09:44 +0100)]
KVM: X86: Make tsc_delta calculation a function of guest tsc
The calculation of the tsc_delta value to ensure a
forward-going tsc for the guest is a function of the
host-tsc. This works as long as the guests tsc_khz is equal
to the hosts tsc_khz. With tsc-scaling hardware support this
is not longer true and the tsc_delta needs to be calculated
using guest_tsc values.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:47 +0000 (09:44 +0100)]
KVM: X86: Let kvm-clock report the right tsc frequency
This patch changes the kvm_guest_time_update function to use
TSC frequency the guest actually has for updating its clock.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:46 +0000 (09:44 +0100)]
KVM: SVM: Implement infrastructure for TSC_RATE_MSR
This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:25:20 +0000 (16:25 +0300)]
KVM: x86 emulator: Drop EFER.SVME requirement from VMMCALL
VMMCALL requires EFER.SVME to be enabled in the host, not in the guest, which
is what check_svme() checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:21:58 +0000 (16:21 +0300)]
KVM: x86 emulator: Re-add VendorSpecific tag to VMMCALL insn
VMMCALL needs the VendorSpecific tag so that #UD emulation
(called if a guest running on AMD was migrated to an Intel host)
is allowed to process the instruction.
Signed-off-by: Avi Kivity <avi@redhat.com>
Bharat Bhushan [Fri, 25 Mar 2011 05:02:13 +0000 (10:32 +0530)]
KVM: PPC: Fix issue clearing exit timing counters
Following dump is observed on host when clearing the exit timing counters
[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo D
0ff5bf94 0 1276 1190 0x00000000
Call Trace:
[
c2157e40] [
c0007908] __switch_to+0x9c/0xc4
[
c2157e50] [
c040293c] schedule+0x1b4/0x3bc
[
c2157e90] [
c04032dc] __mutex_lock_slowpath+0x74/0xc0
[
c2157ec0] [
c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[
c2157ed0] [
c0036b00] kvmppc_exit_timing_write+0x84/0x98
[
c2157ef0] [
c00b9f90] vfs_write+0xc0/0x16c
[
c2157f10] [
c00ba284] sys_write+0x4c/0x90
[
c2157f40] [
c000e320] ret_from_syscall+0x0/0x3c
The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was
used when clearing the stats (in kvmppc_init_timing_stats()). What happens
is that when the guest is idle then it held the vcpu->mutx. While the
exiting timing process waits for guest to release the vcpu->mutex and
a hang state is reached.
Now using seprate lock for exit timing stats.
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 28 Mar 2011 02:29:27 +0000 (10:29 +0800)]
KVM: MMU: remove mmu_seq verification on pte update path
The mmu_seq verification can be removed since we get the pfn in the
protection of mmu_lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 28 Mar 2011 14:57:49 +0000 (16:57 +0200)]
KVM: x86 emulator: do not open code return values from the emulator
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Justin P. Mattock [Wed, 30 Mar 2011 16:54:47 +0000 (09:54 -0700)]
KVM: Remove base_addresss in kvm_pit since it is unused
The patch below removes unsigned long base_addresss; in i8254.h
since it is unused.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:36 +0000 (12:39 +0200)]
KVM: SVM: Remove nested sel_cr0_write handling code
This patch removes all the old code which handled the nested
selective cr0 write intercepts. This code was only in place
as a work-around until the instruction emulator is capable
of doing the same. This is the case with this patch-set and
so the code can be removed.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:35 +0000 (12:39 +0200)]
KVM: SVM: Add checks for IO instructions
This patch adds code to check for IOIO intercepts on
instructions decoded by the KVM instruction emulator.
[avi: fix build error due to missing #define D2bvIP]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:34 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for one-byte instructions
This patch add intercept checks for emulated one-byte
instructions to the KVM instruction emulation path.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:33 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining twobyte instructions
This patch adds intercepts checks for the remaining twobyte
instructions to the KVM instruction emulator.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:32 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining group7 instructions
This patch implements the emulator intercept checks for the
RDTSCP, MONITOR, and MWAIT instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:31 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for SVM instructions
This patch adds the necessary code changes in the
instruction emulator and the extensions to svm.c to
implement intercept checks for the svm instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:30 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for descriptor table accesses
This patch add intercept checks into the KVM instruction
emulator to check for the 8 instructions that access the
descriptor table addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:29 +0000 (12:39 +0200)]
KVM: SVM: Add intercept check for accessing dr registers
This patch adds the intercept checks for instruction
accessing the debug registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:28 +0000 (12:39 +0200)]
KVM: SVM: Add intercept check for emulated cr accesses
This patch adds all necessary intercept checks for
instructions that access the crX registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:27 +0000 (12:39 +0200)]
KVM: x86: Add x86 callback for intercept check
This patch adds a callback into kvm_x86_ops so that svm and
vmx code can do intercept checks on emulated instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:26 +0000 (12:39 +0200)]
KVM: x86 emulator: Add flag to check for protected mode instructions
This patch adds a flag for the opcoded to tag instruction
which are only recognized in protected mode. The necessary
check is added too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:25 +0000 (12:39 +0200)]
KVM: x86 emulator: Add check_perm callback
This patch adds a check_perm callback for each opcode into
the instruction emulator. This will be used to do all
necessary permission checks on instructions before checking
whether they are intercepted or not.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:24 +0000 (12:39 +0200)]
KVM: x86 emulator: Don't write-back cpu-state on X86EMUL_INTERCEPTED
This patch prevents the changed CPU state to be written back
when the emulator detected that the instruction was
intercepted by the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 4 Apr 2011 10:39:23 +0000 (12:39 +0200)]
KVM: x86 emulator: add SVM intercepts
Add intercept codes for instructions defined by SVM as
interceptable.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 4 Apr 2011 10:39:22 +0000 (12:39 +0200)]
KVM: x86 emulator: add framework for instruction intercepts
When running in guest mode, certain instructions can be intercepted by
hardware. This also holds for nested guests running on emulated
virtualization hardware, in particular instructions emulated by kvm
itself.
This patch adds a framework for intercepting instructions. If an
instruction is marked for interception, and if we're running in guest
mode, a callback is called to check whether an intercept is needed or
not. The callback is called at three points in time: immediately after
beginning execution, after checking privilge exceptions, and after
checking memory exception. This suits the different interception points
defined for different instructions and for the various virtualization
instruction sets.
In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
memory access may define, allowing the more complicated intercepts to be
implemented in existing callbacks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Jan 2010 16:09:23 +0000 (18:09 +0200)]
KVM: x86 emulator: implement movdqu instruction (f3 0f 6f, f3 0f 7f)
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Mar 2011 09:41:27 +0000 (11:41 +0200)]
KVM: x86 emulator: SSE support
Add support for marking an instruction as SSE, switching registers used
to the SSE register file.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Mar 2011 09:34:38 +0000 (11:34 +0200)]
KVM: x86 emulator: Specialize decoding for insns with 66/f2/f3 prefixes
Most SIMD instructions use the 66/f2/f3 prefixes to distinguish between
different variants of the same instruction. Usually the encoding is quite
regular, but in some cases (including non-SIMD instructions) the prefixes
generate very different instructions. Examples include XCHG/PAUSE,
MOVQ/MOVDQA/MOVDQU, and MOVBE/CRC32.
Allow the emulator to handle these special cases by splitting such opcodes
into groups, with different decode flags and execution functions for different
prefixes.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 28 Mar 2011 14:53:59 +0000 (16:53 +0200)]
KVM: x86 emulator: define callbacks for using the guest fpu within the emulator
Needed for emulating fpu instructions.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Jan 2010 14:00:35 +0000 (16:00 +0200)]
KVM: x86 emulator: do not munge rep prefix
Currently we store a rep prefix as 1 or 2 depending on whether it is a REPE or
REPNE. Since sse instructions depend on the prefix value, store it as the
original opcode to simplify things further on.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Jan 2010 10:01:20 +0000 (12:01 +0200)]
KVM: 16-byte mmio support
Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 19 Jan 2010 12:20:10 +0000 (14:20 +0200)]
KVM: Split mmio completion into a function
Make room for sse mmio completions.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 19 Jan 2010 10:51:22 +0000 (12:51 +0200)]
KVM: extend in-kernel mmio to handle >8 byte transactions
Needed for coalesced mmio using sse.
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Fri, 1 Apr 2011 14:26:29 +0000 (11:26 -0300)]
KVM: x86: better fix for race between nmi injection and enabling nmi window
Fix race between nmi injection and enabling nmi window in a simpler way.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Fri, 1 Apr 2011 14:25:03 +0000 (11:25 -0300)]
Revert "KVM: Fix race between nmi injection and enabling nmi window"
This reverts commit
f86368493ec038218e8663cc1b6e5393cd8e008a.
Simpler fix to follow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Glauber Costa [Wed, 23 Mar 2011 16:40:42 +0000 (13:40 -0300)]
KVM: expose async pf through our standard mechanism
As Avi recently mentioned, the new standard mechanism for exposing features
is KVM_GET_SUPPORTED_CPUID, not spamming CAPs. For some reason async pf
missed that.
So expose async_pf here.
Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Gleb Natapov <gleb@redhat.com>
CC: Avi Kivity <avi@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 23 Mar 2011 13:02:47 +0000 (15:02 +0200)]
KVM: VMX: simplify NMI mask management
Use vmx_set_nmi_mask() instead of open-coding management of
the hardware bit and the software hint (nmi_known_unmasked).
There's a slight change of behaviour when running without
hardware virtual NMI support - we now clear the NMI mask if
NMI delivery faulted in that case as well. This improves
emulation accuracy.
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Thu, 24 Mar 2011 08:45:10 +0000 (09:45 +0100)]
KVM: SVM: Remove unused svm_features
We use boot_cpu_has now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:39:45 +0000 (17:39 +0200)]
KVM: VMX: Use cached VM_EXIT_INTR_INFO in handle_exception
vmx_complete_atomic_exit() cached it for us, so we can use it here.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:37:37 +0000 (17:37 +0200)]
KVM: VMX: Don't VMREAD VM_EXIT_INTR_INFO unconditionally
Only read it if we're going to use it later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:24:54 +0000 (17:24 +0200)]
KVM: VMX: Refactor vmx_complete_atomic_exit()
Move the exit reason checks to the front of the function, for early
exit in the common case.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:20:29 +0000 (17:20 +0200)]
KVM: VMX: Qualify check for host NMI
Check for the exit reason first; this allows us, later,
to avoid a VMREAD for VM_EXIT_INTR_INFO_FIELD.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 14:52:07 +0000 (16:52 +0200)]
KVM: VMX: Avoid vmx_recover_nmi_blocking() when unneeded
When we haven't injected an interrupt, we don't need to recover
the nmi blocking state (since the guest can't set it by itself).
This allows us to avoid a VMREAD later on.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 13:26:44 +0000 (15:26 +0200)]
KVM: VMX: Cache cpl
We may read the cpl quite often in the same vmexit (instruction privilege
check, memory access checks for instruction and operands), so we gain
a bit if we cache the value.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 12:54:28 +0000 (14:54 +0200)]
KVM: VMX: Optimize vmx_get_cpl()
In long mode, vm86 mode is disallowed, so we need not check for
it. Reading rflags.vm may require a VMREAD, so it is expensive.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 10:51:22 +0000 (12:51 +0200)]
KVM: VMX: Optimize vmx_get_rflags()
If called several times within the same exit, return cached results.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 2 Aug 2010 12:30:20 +0000 (15:30 +0300)]
KVM: Use kvm_get_rflags() and kvm_set_rflags() instead of the raw versions
Some rflags bits are owned by the host, not guest, so we need to use
kvm_get_rflags() to strip those bits away or kvm_set_rflags() to add them
back.
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 9 Mar 2011 07:41:59 +0000 (15:41 +0800)]
KVM: cleanup memslot_id function
We can get memslot id from memslot->id directly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Linus Torvalds [Wed, 11 May 2011 00:39:01 +0000 (17:39 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
slcan: fix ldisc->open retval
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
xfrm: Don't allow esn with disabled anti replay detection
xfrm: Assign the inner mode output function to the dst entry
net: dev_close() should check IFF_UP
vlan: fix GVRP at dismantle time
netfilter: revert
a2361c8735e07322023aedc36e4938b35af31eb0
netfilter: IPv6: fix DSCP mangle code
netfilter: IPv6: initialize TOS field in REJECT target module
IPVS: init and cleanup restructuring
IPVS: Change of socket usage to enable name space exit.
netfilter: ebtables: only call xt_compat_add_offset once per rule
netfilter: fix ebtables compat support
netfilter: ctnetlink: fix timestamp support for new conntracks
pch_gbe: support ML7223 IOH
PCH_GbE : Fixed the issue of checksum judgment
PCH_GbE : Fixed the issue of collision detection
NET: slip, fix ldisc->open retval
be2net: Fixed bugs related to PVID.
ehea: fix wrongly reported speed and port
...
David Rientjes [Wed, 11 May 2011 00:08:54 +0000 (17:08 -0700)]
slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM"
This reverts commit
4a5fa3590f09, which did not allow SLUB to be used
on architectures that use DISCONTIGMEM without compiling NUMA support
without CONFIG_BROKEN also set.
The slub panic that it was intended to prevent is addressed by
d9b41e0b54fd ("[PARISC] set memory ranges in N_NORMAL_MEMORY when
onlined") on parisc so there is no further slub issues with such a
configuration.
The reverts allows SLUB now to be used on such architectures since
there haven't been any reports of additional errors.
Cc: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>