Eduardo Habkost [Mon, 17 Nov 2008 21:03:19 +0000 (19:03 -0200)]
x86: cpu_emergency_vmxoff() function
Add cpu_emergency_vmxoff() and its friends: cpu_vmx_enabled() and
__cpu_emergency_vmxoff().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:18 +0000 (19:03 -0200)]
KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable()
Along with some comments on why it is different from the core cpu_vmxoff()
function.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:17 +0000 (19:03 -0200)]
x86: asm/virtext.h: add cpu_vmxoff() inline function
Unfortunately we can't use exactly the same code from vmx
hardware_disable(), because the KVM function uses the
__kvm_handle_fault_on_reboot() tricks.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:16 +0000 (19:03 -0200)]
KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.h
It will be used by core code on kdump and reboot, to disable
vmx if needed.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:15 +0000 (19:03 -0200)]
KVM: VMX: move ASM_VMX_* definitions from asm/kvm_host.h to asm/vmx.h
Those definitions will be used by code outside KVM, so move it outside
of a KVM-specific source file.
Those definitions are used only on kvm/vmx.c, that already includes
asm/vmx.h, so they can be moved safely.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:14 +0000 (19:03 -0200)]
KVM: SVM: move svm.h to include/asm
svm.h will be used by core code that is independent of KVM, so I am
moving it outside the arch/x86/kvm directory.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Eduardo Habkost [Mon, 17 Nov 2008 21:03:13 +0000 (19:03 -0200)]
KVM: VMX: move vmx.h to include/asm
vmx.h will be used by core code that is independent of KVM, so I am
moving it outside the arch/x86/kvm directory.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Mon, 10 Nov 2008 20:57:36 +0000 (14:57 -0600)]
KVM: ppc: fix userspace mapping invalidation on context switch
We used to defer invalidating userspace TLB entries until jumping out of the
kernel. This was causing MMU weirdness most easily triggered by using a pipe in
the guest, e.g. "dmesg | tail". I believe the problem was that after the guest
kernel changed the PID (part of context switch), the old process's mappings
were still present, and so copy_to_user() on the "return to new process" path
ended up using stale mappings.
Testing with large pages (64K) exposed the problem, probably because with 4K
pages, pressure on the TLB faulted all process A's mappings out before the
guest kernel could insert any for process B.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Mon, 10 Nov 2008 20:57:35 +0000 (14:57 -0600)]
KVM: ppc: use prefetchable mappings for guest memory
Bare metal Linux on 440 can "overmap" RAM in the kernel linear map, so that it
can use large (256MB) mappings even if memory isn't a multiple of 256MB. To
prevent the hardware prefetcher from loading from an invalid physical address
through that mapping, it's marked Guarded.
However, KVM must ensure that all guest mappings are backed by real physical
RAM (since a deliberate access through a guarded mapping could still cause a
machine check). Accordingly, we don't need to make our mappings guarded, so
let's allow prefetching as the designers intended.
Curiously this patch didn't affect performance at all on the quick test I
tried, but it's clearly the right thing to do anyways and may improve other
workloads.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Mon, 10 Nov 2008 20:57:34 +0000 (14:57 -0600)]
KVM: ppc: use MMUCR accessor to obtain TID
We have an accessor; might as well use it.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 11 Nov 2008 07:30:40 +0000 (15:30 +0800)]
KVM: Fix kernel allocated memory slot
Commit
7fd49de9773fdcb7b75e823b21c1c5dc1e218c14 "KVM: ensure that memslot
userspace addresses are page-aligned" broke kernel space allocated memory
slot, for the userspace_addr is invalid.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiantao Zhang [Fri, 24 Oct 2008 03:47:57 +0000 (11:47 +0800)]
KVM: ia64: Remove some macro definitions in asm-offsets.c.
Use kernel's corresponding macro instead.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Fri, 7 Nov 2008 19:15:13 +0000 (13:15 -0600)]
KVM: ppc: fix Kconfig constraints
Make sure that CONFIG_KVM cannot be selected without processor support
(currently, 440 is the only processor implementation available).
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Fri, 7 Nov 2008 19:32:12 +0000 (13:32 -0600)]
KVM: ensure that memslot userspace addresses are page-aligned
Bad page translation and silent guest failure ensue if the userspace address is
not page-aligned. I hit this problem using large (host) pages with qemu,
because qemu currently has a hardcoded 4096-byte alignment for guest memory
allocations.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Nitin A Kamble [Wed, 5 Nov 2008 23:56:21 +0000 (15:56 -0800)]
KVM: Fix cpuid iteration on multiple leaves per eac
The code to traverse the cpuid data array list for counting type of leaves is
currently broken.
This patches fixes the 2 things in it.
1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without
it the code will never find a valid entry.
2. Also the stop condition in the for loop while looking for the next unflaged
entry is broken. It needs to stop when it find one matching entry;
and in the case of count of 1, it will be the same entry found in this
iteration.
Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Nitin A Kamble [Wed, 5 Nov 2008 23:37:36 +0000 (15:37 -0800)]
KVM: Fix cpuid leaf 0xb loop termination
For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting
leaf. The previous code was using bits 0-7 for this purpose, which is
a bug.
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:24 +0000 (09:36 -0600)]
KVM: ppc: improve trap emulation
set ESR[PTR] when emulating a guest trap. This allows Linux guests to
properly handle WARN_ON() (i.e. detect that it's a non-fatal trap).
Also remove debugging printk in trap emulation.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:23 +0000 (09:36 -0600)]
KVM: ppc: optimize irq delivery path
In kvmppc_deliver_interrupt is just one case left in the switch and it is a
rare one (less than 8%) when looking at the exit numbers. Therefore we can
at least drop the switch/case and if an if. I inserted an unlikely too, but
that's open for discussion.
In kvmppc_can_deliver_interrupt all frequent cases are in the default case.
I know compilers are smart but we can make it easier for them. By writing
down all options and removing the default case combined with the fact that
ithe values are constants 0..15 should allow the compiler to write an easy
jump table.
Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems
to be unable to reduce priority_exception[x] to a build time constant.
Therefore I changed the usage of the translation arrays in the interrupt
delivery path completely. It is now using priority without translation to irq
on the full irq delivery path.
To be able to do that ivpr regs are stored by their priority now.
Additionally the decision made in kvmppc_can_deliver_interrupt is already
sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace
the 16x4byte array used here with a single 4byte variable (might still be one
miss, but the chance to find this in cache should be better than the right
entry of the whole array).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:22 +0000 (09:36 -0600)]
KVM: ppc: optimize find first bit
Since we use a unsigned long here anyway we can use the optimized __ffs.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:21 +0000 (09:36 -0600)]
KVM: ppc: optimize kvm stat handling
Currently we use an unnecessary if&switch to detect some cases.
To be honest we don't need the ligh_exits counter anyway, because we can
calculate it out of others. Sum_exits can also be calculated, so we can
remove that too.
MMIO, DCR and INTR can be counted on other places without these
additional control structures (The INTR case was never hit anyway).
The handling of BOOKE_INTERRUPT_EXTERNAL/BOOKE_INTERRUPT_DECREMENTER is
similar, but we can avoid the additional if when copying 3 lines of code.
I thought about a goto there to prevent duplicate lines, but rewriting three
lines should be better style than a goto cross switch/case statements (its
also not enough code to justify a new inline function).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:20 +0000 (09:36 -0600)]
KVM: ppc: fix set regs to take care of msr change
When changing some msr bits e.g. problem state we need to take special
care of that. We call the function in our mtmsr emulation (not needed for
wrtee[i]), but we don't call kvmppc_set_msr if we change msr via set_regs
ioctl.
It's a corner case we never hit so far, but I assume it should be
kvmppc_set_msr in our arch set regs function (I found it because it is also
a corner case when using pv support which would miss the update otherwise).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:19 +0000 (09:36 -0600)]
KVM: ppc: adjust vcpu types to support 64-bit cores
However, some of these fields could be split into separate per-core structures
in the future.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:18 +0000 (09:36 -0600)]
KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor
This patch doesn't yet move all 44x-specific data into the new structure, but
is the first step down that path. In the future we may also want to create a
struct kvm_vcpu_booke.
Based on patch from Liu Yu <yu.liu@freescale.com>.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:17 +0000 (09:36 -0600)]
KVM: ppc: Move the last bits of 44x code out of booke.c
Needed to port to other Book E processors.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:16 +0000 (09:36 -0600)]
KVM: ppc: refactor instruction emulation into generic and core-specific pieces
Cores provide 3 emulation hooks, implemented for example in the new
4xx_emulate.c:
kvmppc_core_emulate_op
kvmppc_core_emulate_mtspr
kvmppc_core_emulate_mfspr
Strictly speaking the last two aren't necessary, but provide for more
informative error reporting ("unknown SPR").
Long term I'd like to have instruction decoding autogenerated from tables of
opcodes, and that way we could aggregate universal, Book E, and core-specific
instructions more easily and without redundant switch statements.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:15 +0000 (09:36 -0600)]
ppc: Create disassemble.h to extract instruction fields
This is used in a couple places in KVM, but isn't KVM-specific.
However, this patch doesn't modify other in-kernel emulation code:
- xmon uses a direct copy of ppc_opc.c from binutils
- emulate_instruction() doesn't need it because it can use a series
of mask tests.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:14 +0000 (09:36 -0600)]
KVM: ppc: Refactor powerpc.c to relocate 440-specific code
This introduces a set of core-provided hooks. For 440, some of these are
implemented by booke.c, with the rest in (the new) 44x.c.
Note that these hooks are link-time, not run-time. Since it is not possible to
build a single kernel for both e500 and 440 (for example), using function
pointers would only add overhead.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:13 +0000 (09:36 -0600)]
KVM: ppc: combine booke_guest.c and booke_host.c
The division was somewhat artificial and cumbersome, and had no functional
benefit anyways: we can only guests built for the real host processor.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:12 +0000 (09:36 -0600)]
KVM: ppc: Rename "struct tlbe" to "struct kvmppc_44x_tlbe"
This will ease ports to other cores.
Also remove unused "struct kvm_tlb" while we're at it.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:11 +0000 (09:36 -0600)]
KVM: ppc: Move 440-specific TLB code into 44x_tlb.c
This will make it easier to provide implementations for other cores.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Izik Eidus [Fri, 3 Oct 2008 14:40:32 +0000 (17:40 +0300)]
KVM: MMU: Fix aliased gfns treated as unaliased
Some areas of kvm x86 mmu are using gfn offset inside a slot without
unaliasing the gfn first. This patch makes sure that the gfn will be
unaliased and add gfn_to_memslot_unaliased() to save the calculating
of the gfn unaliasing in case we have it unaliased already.
Signed-off-by: Izik Eidus <ieidus@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Fri, 31 Oct 2008 04:37:41 +0000 (12:37 +0800)]
KVM: Enable Function Level Reset for assigned device
Ideally, every assigned device should in a clear condition before and after
assignment, so that the former state of device won't affect later work.
Some devices provide a mechanism named Function Level Reset, which is
defined in PCI/PCI-e document. We should execute it before and after device
assignment.
(But sadly, the feature is new, and most device on the market now don't
support it. We are considering using D0/D3hot transmit to emulate it later,
but not that elegant and reliable as FLR itself.)
[Update: Reminded by Xiantao, execute FLR after we ensure that the device can
be assigned to the guest.]
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiantao Zhang [Thu, 23 Oct 2008 07:03:38 +0000 (15:03 +0800)]
KVM: ia64: Remove lock held by halted vcpu
Remove the lock protection for kvm halt logic, otherwise,
once other vcpus want to acquire the lock, and they have to
wait all vcpus are waken up from halt.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiantao Zhang [Thu, 23 Oct 2008 06:56:44 +0000 (14:56 +0800)]
KVM: ia64: Re-organize data sturure of guests' data area
1. Increase the size of data area to 64M
2. Support more vcpus and memory, 128 vcpus and 256G memory are supported
for guests.
3. Add the boundary check for memory and vcpu allocation.
With this patch, kvm guest's data area looks as follow:
*
* +----------------------+ ------- KVM_VM_DATA_SIZE
* | vcpu[n]'s data | | ___________________KVM_STK_OFFSET
* | | | / |
* | .......... | | /vcpu's struct&stack |
* | .......... | | /---------------------|---- 0
* | vcpu[5]'s data | | / vpd |
* | vcpu[4]'s data | |/-----------------------|
* | vcpu[3]'s data | / vtlb |
* | vcpu[2]'s data | /|------------------------|
* | vcpu[1]'s data |/ | vhpt |
* | vcpu[0]'s data |____________________________|
* +----------------------+ |
* | memory dirty log | |
* +----------------------+ |
* | vm's data struct | |
* +----------------------+ |
* | | |
* | | |
* | | |
* | | |
* | | |
* | | |
* | | |
* | vm's p2m table | |
* | | |
* | | |
* | | | |
* vm's data->| | | |
* +----------------------+ ------- 0
* To support large memory, needs to increase the size of p2m.
* To support more vcpus, needs to ensure it has enough space to
* hold vcpus' data.
*/
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Guillaume Thouvenin [Wed, 29 Oct 2008 08:39:42 +0000 (09:39 +0100)]
KVM: VMX: Handle mmio emulation when guest state is invalid
If emulate_invalid_guest_state is enabled, the emulator is called
when guest state is invalid. Until now, we reported an mmio failure
when emulate_instruction() returned EMULATE_DO_MMIO. This patch adds
the case where emulate_instruction() failed and an MMIO emulation
is needed.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Guillaume Thouvenin [Tue, 28 Oct 2008 09:51:30 +0000 (10:51 +0100)]
KVM: allow emulator to adjust rip for emulated pio instructions
If we call the emulator we shouldn't call skip_emulated_instruction()
in the first place, since the emulator already computes the next rip
for us. Thus we move ->skip_emulated_instruction() out of
kvm_emulate_pio() and into handle_io() (and the svm equivalent). We
also replaced "return 0" by "break" in the "do_io:" case because now
the shadow register state needs to be committed. Otherwise eip will never
be updated.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Amit Shah [Mon, 27 Oct 2008 09:04:18 +0000 (09:04 +0000)]
KVM: SVM: Set the 'busy' flag of the TR selector
The busy flag of the TR selector is not set by the hardware. This breaks
migration from amd hosts to intel hosts.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Amit Shah [Mon, 27 Oct 2008 09:04:17 +0000 (09:04 +0000)]
KVM: SVM: Set the 'g' bit of the cs selector for cross-vendor migration
The hardware does not set the 'g' bit of the cs selector and this breaks
migration from amd hosts to intel hosts. Set this bit if the segment
limit is beyond 1 MB.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Amit Shah [Wed, 22 Oct 2008 11:09:47 +0000 (16:39 +0530)]
KVM: x86: Fix typo in function name
get_segment_descritptor_dtable() contains an obvious type.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Mon, 20 Oct 2008 08:07:10 +0000 (16:07 +0800)]
KVM: IRQ ACK notifier should be used with in-kernel irqchip
Also remove unnecessary parameter of unregister irq ack notifier.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Mon, 20 Oct 2008 08:20:03 +0000 (10:20 +0200)]
KVM: x86: Optimize NMI watchdog delivery
As suggested by Avi, this patch introduces a counter of VCPUs that have
LVT0 set to NMI mode. Only if the counter > 0, we push the PIT ticks via
all LAPIC LVT0 lines to enable NMI watchdog support.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Mon, 20 Oct 2008 08:20:02 +0000 (10:20 +0200)]
KVM: x86: Fix and refactor NMI watchdog emulation
This patch refactors the NMI watchdog delivery patch, consolidating
tests and providing a proper API for delivering watchdog events.
An included micro-optimization is to check only for apic_hw_enabled in
kvm_apic_local_deliver (the test for LVT mask is covering the
soft-disabled case already).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Guillaume Thouvenin [Mon, 20 Oct 2008 11:11:58 +0000 (13:11 +0200)]
KVM: x86 emulator: Add decode entries for 0x04 and 0x05 opcodes (add acc, imm)
Add decode entries for 0x04 and 0x05 (ADD) opcodes, execution is already
implemented.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 16 Oct 2008 09:30:58 +0000 (17:30 +0800)]
KVM: VMX: Move private memory slot position
PCI device assignment would map guest MMIO spaces as separate slot, so it is
possible that the device has more than 2 MMIO spaces and overwrite current
private memslot.
The patch move private memory slot to the top of userspace visible memory slots.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 16 Oct 2008 09:30:57 +0000 (17:30 +0800)]
KVM: MMU: Extend kvm_mmu_page->slot_bitmap size
Otherwise set_bit() for private memory slot(above KVM_MEMORY_SLOTS) would
corrupted memory in 32bit host.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 14 Oct 2008 07:59:10 +0000 (15:59 +0800)]
KVM: Clean up kvm_x86_emulate.h
Remove one left improper comment of removed CR2.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:57 +0000 (16:01 +0800)]
KVM: Enable MTRR for EPT
The effective memory type of EPT is the mixture of MSR_IA32_CR_PAT and memory
type field of EPT entry.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:56 +0000 (16:01 +0800)]
KVM: Add local get_mtrr_type() to support MTRR
For EPT memory type support.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:55 +0000 (16:01 +0800)]
KVM: VMX: Add PAT support for EPT
GUEST_PAT support is a new feature introduced by Intel Core i7 architecture.
With this, cpu would save/load guest and host PAT automatically, for EPT memory
type in guest depends on MSR_IA32_CR_PAT.
Also add save/restore for MSR_IA32_CR_PAT.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:54 +0000 (16:01 +0800)]
KVM: Improve MTRR structure
As well as reset mmu context when set MTRR.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:53 +0000 (16:01 +0800)]
x86: Export some definition of MTRR
For KVM can reuse the type define, and need them to support shadow MTRR.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 9 Oct 2008 08:01:52 +0000 (16:01 +0800)]
x86: Rename mtrr_state struct and macro names
Prepare for exporting them.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Tue, 7 Oct 2008 13:42:33 +0000 (15:42 +0200)]
KVM: call kvm_arch_vcpu_reset() instead of the kvm_x86_ops callback
Call kvm_arch_vcpu_reset() instead of directly using arch callback.
The function does additional things.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:57 +0000 (09:30 +0200)]
KVM: VMX: work around lacking VNMI support
Older VMX supporting CPUs do not provide the "Virtual NMI" feature for
tracking the NMI-blocked state after injecting such events. For now
KVM is unable to inject NMIs on those CPUs.
Derived from Sheng Yang's suggestion to use the IRQ window notification
for detecting the end of NMI handlers, this patch implements virtual
NMI support without impact on the host's ability to receive real NMIs.
The downside is that the given approach requires some heuristics that
can cause NMI nesting in vary rare corner cases.
The approach works as follows:
- inject NMI and set a software-based NMI-blocked flag
- arm the IRQ window start notification whenever an NMI window is
requested
- if the guest exits due to an opening IRQ window, clear the emulated
NMI-blocked flag
- if the guest net execution time with NMI-blocked but without an IRQ
window exceeds 1 second, force NMI-blocked reset and inject anyway
This approach covers most practical scenarios:
- succeeding NMIs are seperated by at least one open IRQ window
- the guest may spin with IRQs disabled (e.g. due to a bug), but
leaving the NMI handler takes much less time than one second
- the guest does not rely on strict ordering or timing of NMIs
(would be problematic in virtualized environments anyway)
Successfully tested with the 'nmi n' monitor command, the kgdbts
testsuite on smp guests (additional patches required to add debug
register support to kvm) + the kernel's nmi_watchdog=1, and a Siemens-
specific board emulation (+ guest) that comes with its own NMI
watchdog mechanism.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:56 +0000 (09:30 +0200)]
KVM: VMX: Provide support for user space injected NMIs
This patch adds the required bits to the VMX side for user space
injected NMIs. As with the preexisting in-kernel irqchip support, the
CPU must provide the "virtual NMI" feature for proper tracking of the
NMI blocking state.
Based on the original patch by Sheng Yang.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:55 +0000 (09:30 +0200)]
KVM: x86: Support for user space injected NMIs
Introduces the KVM_NMI IOCTL to the generic x86 part of KVM for
injecting NMIs from user space and also extends the statistic report
accordingly.
Based on the original patch by Sheng Yang.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:54 +0000 (09:30 +0200)]
KVM: Kick NMI receiving VCPU
Kick the NMI receiving VCPU in case the triggering caller runs in a
different context.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:53 +0000 (09:30 +0200)]
KVM: x86: VCPU with pending NMI is runnabled
Ensure that a VCPU with pending NMIs is considered runnable.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:52 +0000 (09:30 +0200)]
KVM: x86: Enable NMI Watchdog via in-kernel PIT source
LINT0 of the LAPIC can be used to route PIT events as NMI watchdog ticks
into the guest. This patch aligns the in-kernel irqchip emulation with
the user space irqchip with already supports this feature. The trick is
to route PIT interrupts to all LAPIC's LVT0 lines.
Rebased and slightly polished patch originally posted by Sheng Yang.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:51 +0000 (09:30 +0200)]
KVM: VMX: fix real-mode NMI support
Fix NMI injection in real-mode with the same pattern we perform IRQ
injection.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:50 +0000 (09:30 +0200)]
KVM: VMX: refactor IRQ and NMI window enabling
do_interrupt_requests and vmx_intr_assist go different way for
achieving the same: enabling the nmi/irq window start notification.
Unify their code over enable_{irq|nmi}_window, get rid of a redundant
call to enable_intr_window instead of direct enable_nmi_window
invocation and unroll enable_intr_window for both in-kernel and user
space irq injection accordingly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:49 +0000 (09:30 +0200)]
KVM: VMX: refactor/fix IRQ and NMI injectability determination
There are currently two ways in VMX to check if an IRQ or NMI can be
injected:
- vmx_{nmi|irq}_enabled and
- vcpu.arch.{nmi|interrupt}_window_open.
Even worse, one test (at the end of vmx_vcpu_run) uses an inconsistent,
likely incorrect logic.
This patch consolidates and unifies the tests over
{nmi|interrupt}_window_open as cache + vmx_update_window_states
for updating the cache content.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:48 +0000 (09:30 +0200)]
KVM: x86: Reset pending/inject NMI state on CPU reset
CPU reset invalidates pending or already injected NMIs, therefore reset
the related state variables.
Based on original patch by Gleb Natapov.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:47 +0000 (09:30 +0200)]
KVM: VMX: Support for NMI task gates
Properly set GUEST_INTR_STATE_NMI and reset nmi_injected when a
task-switch vmexit happened due to a task gate being used for handling
NMIs. Also avoid the false warning about valid vectoring info in
kvm_handle_exit.
Based on original patch by Gleb Natapov.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:46 +0000 (09:30 +0200)]
KVM: VMX: Use INTR_TYPE_NMI_INTR instead of magic value
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Fri, 26 Sep 2008 07:30:45 +0000 (09:30 +0200)]
KVM: VMX: include all IRQ window exits in statistics
irq_window_exits only tracks IRQ window exits due to user space
requests, nmi_window_exits include all exits. The latter makes more
sense, so let's adjust irq_window_exits accounting.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Guillaume Thouvenin [Mon, 22 Sep 2008 14:08:06 +0000 (16:08 +0200)]
KVM: x86 emulator: consolidate push reg
This patch consolidate the emulation of push reg instruction.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Linus Torvalds [Wed, 31 Dec 2008 01:48:25 +0000 (17:48 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (184 commits)
[XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI
[XFS] handle unaligned data in xfs_bmbt_disk_get_all
[XFS] avoid memory allocations in xfs_fs_vcmn_err
[XFS] Fix speculative allocation beyond eof
[XFS] Remove XFS_BUF_SHUT() and friends
[XFS] Use the incore inode size in xfs_file_readdir()
[XFS] set b_error from bio error in xfs_buf_bio_end_io
[XFS] use inode_change_ok for setattr permission checking
[XFS] add a FMODE flag to make XFS invisible I/O less hacky
[XFS] resync headers with libxfs
[XFS] simplify projid check in xfs_rename
[XFS] replace b_fspriv with b_mount
[XFS] Remove unused tracing code
[XFS] Remove unnecessary assertion
[XFS] Remove unused variable in ktrace_free()
[XFS] Check return value of xfs_buf_get_noaddr()
[XFS] Fix hang after disallowed rename across directory quota domains
[XFS] Fix compile with CONFIG_COMPAT enabled
move inode tracing out of xfs_vnode.
move vn_iowait / vn_iowake into xfs_aops.c
...
Linus Torvalds [Wed, 31 Dec 2008 01:45:45 +0000 (17:45 -0800)]
Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (70 commits)
fs/nfs/nfs4proc.c: make nfs4_map_errors() static
rpc: add service field to new upcall
rpc: add target field to new upcall
nfsd: support callbacks with gss flavors
rpc: allow gss callbacks to client
rpc: pass target name down to rpc level on callbacks
nfsd: pass client principal name in rsc downcall
rpc: implement new upcall
rpc: store pointer to pipe inode in gss upcall message
rpc: use count of pipe openers to wait for first open
rpc: track number of users of the gss upcall pipe
rpc: call release_pipe only on last close
rpc: add an rpc_pipe_open method
rpc: minor gss_alloc_msg cleanup
rpc: factor out warning code from gss_pipe_destroy_msg
rpc: remove unnecessary assignment
NFS: remove unused status from encode routines
NFS: increment number of operations in each encode routine
NFS: fix comment placement in nfs4xdr.c
NFS: fix tabs in nfs4xdr.c
...
Linus Torvalds [Wed, 31 Dec 2008 01:45:28 +0000 (17:45 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Fix reading SL field out of cqe->sl_vid
RDMA/addr: Fix build breakage when IPv6 is disabled
Linus Torvalds [Wed, 31 Dec 2008 01:43:10 +0000 (17:43 -0800)]
Merge git://git./linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (104 commits)
[SCSI] fcoe: fix configuration problems
[SCSI] cxgb3i: fix select/depend problem
[SCSI] fcoe: fix incorrect use of struct module
[SCSI] cxgb3i: remove use of skb->sp
[SCSI] cxgb3i: Add cxgb3i iSCSI driver.
[SCSI] zfcp: Remove unnecessary warning message
[SCSI] zfcp: Add support for unchained FSF requests
[SCSI] zfcp: Remove busid macro
[SCSI] zfcp: remove DID_DID flag
[SCSI] zfcp: Simplify mask lookups for incoming RSCNs
[SCSI] zfcp: Remove initial device data from zfcp_data
[SCSI] zfcp: fix compile warning
[SCSI] zfcp: Remove adapter list
[SCSI] zfcp: Simplify SBAL allocation to fix sparse warnings
[SCSI] zfcp: register with SCSI layer on ccw registration
[SCSI] zfcp: Fix message line break
[SCSI] qla2xxx: changes in multiq code
[SCSI] eata: fix the data buffer accessors conversion regression
[SCSI] ibmvfc: Improve async event handling
[SCSI] lpfc : correct printk types on PPC compiles
...
Linus Torvalds [Wed, 31 Dec 2008 01:41:32 +0000 (17:41 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (583 commits)
V4L/DVB (10130): use USB API functions rather than constants
V4L/DVB (10129): dvb: remove deprecated use of RW_LOCK_UNLOCKED in frontends
V4L/DVB (10128): modify V4L documentation to be a valid XHTML
V4L/DVB (10127): stv06xx: Avoid having y unitialized
V4L/DVB (10125): em28xx: Don't do AC97 vendor detection for i2s audio devices
V4L/DVB (10124): em28xx: expand output formats available
V4L/DVB (10123): em28xx: fix reversed definitions of I2S audio modes
V4L/DVB (10122): em28xx: don't load em28xx-alsa for em2870 based devices
V4L/DVB (10121): em28xx: remove worthless Pinnacle PCTV HD Mini 80e device profile
V4L/DVB (10120): em28xx: remove redundant Pinnacle Dazzle DVC 100 profile
V4L/DVB (10119): em28xx: fix corrupted XCLK value
V4L/DVB (10118): zoran: fix warning for a variable not used
V4L/DVB (10116): af9013: Fix gcc false warnings
V4L/DVB (10111a): usbvideo.h: remove an useless blank line
V4L/DVB (10111): quickcam_messenger.c: fix a warning
V4L/DVB (10110): v4l2-ioctl: Fix warnings when using .unlocked_ioctl = __video_ioctl2
V4L/DVB (10109): anysee: Fix usage of an unitialized function
V4L/DVB (10104): uvcvideo: Add support for video output devices
V4L/DVB (10102): uvcvideo: Ignore interrupt endpoint for built-in iSight webcams.
V4L/DVB (10101): uvcvideo: Fix bulk URB processing when the header is erroneous
...
Linus Torvalds [Wed, 31 Dec 2008 01:39:37 +0000 (17:39 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: Fix percpu counters deadlock
cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits: net
drivers/net/usb: use USB API functions rather than constants
cls_cgroup: clean up Kconfig
cls_cgroup: clean up for cgroup part
cls_cgroup: fix an oops when removing a cgroup
EtherExpress16: fix printing timed out status
mlx4_en: Added "set_ringparam" Ethtool interface implementation
mlx4_en: Always allocate RX ring for each interrupt vector
mlx4_en: Verify number of RX rings doesn't exceed MAX_RX_RINGS
IPVS: Make "no destination available" message more consistent between schedulers
net: KS8695: removed duplicated #include
tun: Fix SIOCSIFHWADDR error.
smsc911x: compile fix re netif_rx signature changes
netns: foreach_netdev_safe is insufficient in default_device_exit
net: make xfrm_statistics_seq_show use generic snmp_fold_field
net: Fix more NAPI interface netdev argument drop fallout.
net: Fix unused variable warnings in pasemi_mac.c and spider_net.c
Linus Torvalds [Wed, 31 Dec 2008 01:37:25 +0000 (17:37 -0800)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: struct device - replace bus_id with dev_name()
lguest: move the initial guest page table creation code to the host
kvm-s390: implement config_changed for virtio on s390
virtio_console: support console resizing
virtio: add PCI device release() function
virtio_blk: fix type warning
virtio: block: dynamic maximum segments
virtio: set max_segment_size and max_sectors to infinite.
virtio: avoid implicit use of Linux page size in balloon interface
virtio: hand virtio ring alignment as argument to vring_new_virtqueue
virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize
virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci.
virtio: rename 'pagesize' arg to vring_init/vring_size
virtio: Don't use PAGE_SIZE in virtio_pci.c
virtio: struct device - replace bus_id with dev_name(), dev_set_name()
virtio-pci queue allocation not page-aligned
Linus Torvalds [Wed, 31 Dec 2008 01:36:49 +0000 (17:36 -0800)]
Merge branch 'devel' of /home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (407 commits)
[ARM] pxafb: add support for overlay1 and overlay2 as framebuffer devices
[ARM] pxafb: cleanup of the timing checking code
[ARM] pxafb: cleanup of the color format manipulation code
[ARM] pxafb: add palette format support for LCCR4_PAL_FOR_3
[ARM] pxafb: add support for FBIOPAN_DISPLAY by dma braching
[ARM] pxafb: allow pxafb_set_par() to start from arbitrary yoffset
[ARM] pxafb: allow video memory size to be configurable
[ARM] pxa: add document on the MFP design and how to use it
[ARM] sa1100_wdt: don't assume CLOCK_TICK_RATE to be a constant
[ARM] rtc-sa1100: don't assume CLOCK_TICK_RATE to be a constant
[ARM] pxa/tavorevb: update board support (smartpanel LCD + keypad)
[ARM] pxa: Update eseries defconfig
[ARM] 5352/1: add w90p910-plat config file
[ARM] s3c: S3C options should depend on PLAT_S3C
[ARM] mv78xx0: implement GPIO and GPIO interrupt support
[ARM] Kirkwood: implement GPIO and GPIO interrupt support
[ARM] Orion: share GPIO IRQ handling code
[ARM] Orion: share GPIO handling code
[ARM] s3c: define __io using the typesafe version
[ARM] S3C64XX: Ensure CPU_V6 is selected
...
Huang Weiyi [Mon, 29 Dec 2008 22:41:44 +0000 (06:41 +0800)]
tracing: removed duplicated #include
Removed duplicated #include in kernel/trace/trace.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 31 Dec 2008 01:34:37 +0000 (17:34 -0800)]
Merge git://git./linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (33 commits)
ide-cd: remove dead dsc_overlap setting
ide: push local_irq_{save,restore}() to do_identify()
ide: remove superfluous local_irq_{save,restore}() from ide_dump_status()
ide: move legacy ISA/VLB ports handling to ide-legacy.c (v2)
ide: move Power Management support to ide-pm.c
ide: use ATA_DMA_* defines in ide-dma-sff.c
ide: checkpatch.pl fixes for ide-lib.c
ide: remove inline tags from ide-probe.c
ide: remove redundant code from ide_end_drive_cmd()
ide: struct device - replace bus_id with dev_name(), dev_set_name()
ide: rework handling of serialized ports (v2)
cy82c693: remove superfluous ide_cy82c693 chipset type
trm290: add IDE_HFLAG_TRM290 host flag
ide: add ->max_sectors field to struct ide_port_info
rz1000: apply chipset quirks early (v2)
ide: always set nIEN on idle devices
ide: fix ->quirk_list checking in ide_do_request()
gayle: set IDE_HFLAG_SERIALIZE explictly
cmd64x: set IDE_HFLAG_SERIALIZE explictly for CMD646
ali14xx: doesn't use shared IRQs
...
Linus Torvalds [Wed, 31 Dec 2008 01:33:33 +0000 (17:33 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/shaggy/jfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
jfs: ensure symlinks are NUL-terminated
Linus Torvalds [Wed, 31 Dec 2008 01:32:25 +0000 (17:32 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_sil: add Large Block Transfer support
[libata] ata_piix: cleanup dmi strings checking
DMI: add dmi_match
libata: blacklist NCQ on OCZ CORE 2 SSD (resend)
[libata] Update kernel-doc comments to match source code
libata: perform port detach in EH
libata: when restoring SControl during detach do the PMP links first
libata: beef up iterators
Linus Torvalds [Wed, 31 Dec 2008 01:31:25 +0000 (17:31 -0800)]
Merge branch 'oprofile-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: select RING_BUFFER
ring_buffer: adding EXPORT_SYMBOLs
oprofile: fix lost sample counter
oprofile: remove nr_available_slots()
oprofile: port to the new ring_buffer
ring_buffer: add remaining cpu functions to ring_buffer.h
oprofile: moving cpu_buffer_reset() to cpu_buffer.h
oprofile: adding cpu_buffer_entries()
oprofile: adding cpu_buffer_write_commit()
oprofile: adding cpu buffer r/w access functions
ftrace: remove unused function arg in trace_iterator_increment()
ring_buffer: update description for ring_buffer_alloc()
oprofile: set values to default when creating oprofilefs
oprofile: implement switch/case in buffer_sync.c
x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c
x86/oprofile: reordering IBS code in op_model_amd.c
oprofile: fix typo
oprofile: whitspace changes only
oprofile: update comment for oprofile_add_sample()
oprofile: comment cleanup
Linus Torvalds [Wed, 31 Dec 2008 01:28:09 +0000 (17:28 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: avoid leaking caches or refcounts on sysfs error
slab: Fix comment on #endif
slab: remove GFP_THISNODE clearing from alloc_slabmgmt()
slub: Add might_sleep_if() to slab_alloc()
SLUB: failslab support
slub: Fix incorrect use of loose
slab: Update the kmem_cache_create documentation regarding the name parameter
slub: make early_kmem_cache_node_alloc void
slab: unsigned slabp->inuse cannot be less than 0
slub - fix get_object_page comment
SLUB: Replace __builtin_return_address(0) with _RET_IP_.
SLUB: cleanup - define macros instead of hardcoded numbers
Linus Torvalds [Wed, 31 Dec 2008 01:25:49 +0000 (17:25 -0800)]
Merge branch 'drm-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (37 commits)
drm/i915: fix modeset devname allocation + agp init return check.
drm/i915: Remove redundant test in error path.
drm: Add a debug node for vblank state.
drm: Avoid use-before-null-test on dev in drm_cleanup().
drm/i915: Don't print to dmesg when taking signal during object_pin.
drm: pin new and unpin old buffer when setting a mode.
drm/i915: un-EXPORT and make 'intelfb_panic' static
drm/i915: Delete unused, pointless i915_driver_firstopen.
drm/i915: fix sparse warnings: returning void-valued expression
drm/i915: fix sparse warnings: move 'extern' decls to header file
drm/i915: fix sparse warnings: make symbols static
drm/i915: fix sparse warnings: declare one-bit bitfield as unsigned
drm/i915: Don't double-unpin buffers if we take a signal in evict_everything().
drm/i915: Fix fbcon setup to align display pitch to 64b.
drm/i915: Add missing userland definitions for gem init/execbuffer.
i915/drm: provide compat defines for userspace for certain struct members.
drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well.
drm: sanitise drm modesetting API + remove unused hotplug
drm: fix allowing master ioctls on non-master fds.
drm/radeon: use locked rmmap to remove sarea mapping.
...
Linus Torvalds [Wed, 31 Dec 2008 01:25:29 +0000 (17:25 -0800)]
Merge branch 'agp-next' of git://git./linux/kernel/git/airlied/agp-2.6
* 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp/intel: Fix broken ® symbol in device name.
agp/intel: add support for G41 chipset
Linus Torvalds [Wed, 31 Dec 2008 01:23:31 +0000 (17:23 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits)
sparc: move select of ARCH_SUPPORTS_MSI
sparc: drop SUN_IO
sparc: unify sections.h
sparc: use .data.init_task section for init_thread_union
sparc: fix array overrun check in of_device_64.c
sparc: unify module.c
sparc64: prepare module_64.c for unification
sparc64: use bit neutral Elf symbols
sparc: unify module.h
sparc: introduce CONFIG_BITS
sparc: fix hardirq.h removal fallout
sparc64: do not export pus_fs_struct
sparc: use sparc64 version of scatterlist.h
sparc: Commonize memcmp assembler.
sparc: Unify strlen assembler.
sparc: Add asm/asm.h
sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries.
sparc: replace for_each_cpu_mask_nr with for_each_cpu
sparc: fix sparse warnings in irq_32.c
sparc: add include guards to kernel.h
...
Linus Torvalds [Wed, 31 Dec 2008 01:20:05 +0000 (17:20 -0800)]
Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits)
bio: get rid of bio_vec clearing
bounce: don't rely on a zeroed bio_vec list
cciss: simplify parameters to deregister_disk function
cfq-iosched: fix race between exiting queue and exiting task
loop: Do not call loop_unplug for not configured loop device.
loop: Flush possible running bios when loop device is released.
alpha: remove dead BIO_VMERGE_BOUNDARY
Get rid of CONFIG_LSF
block: make blk_softirq_init() static
block: use min_not_zero in blk_queue_stack_limits
block: add one-hit cache for disk partition lookup
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
nbd: tell the block layer that it is not a rotational device
block: get rid of elevator_t typedef
aio: make the lookup_ioctx() lockless
bio: add support for inlining a number of bio_vecs inside the bio
bio: allow individual slabs in the bio_set
bio: move the slab pointer inside the bio_set
bio: only mempool back the largest bio_vec slab cache
block: don't use plugging on SSD devices
...
Linus Torvalds [Wed, 31 Dec 2008 00:20:19 +0000 (16:20 -0800)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, sparseirq: clean up Kconfig entry
x86: turn CONFIG_SPARSE_IRQ off by default
sparseirq: fix numa_migrate_irq_desc dependency and comments
sparseirq: add kernel-doc notation for new member in irq_desc, -v2
locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
sparseirq, xen: make sure irq_desc is allocated for interrupts
sparseirq: fix !SMP building, #2
x86, sparseirq: move irq_desc according to smp_affinity, v7
proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
sparse irqs: add irqnr.h to the user headers list
sparse irqs: handle !GENIRQ platforms
sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
sparseirq: fix Alpha build failure
sparseirq: fix typo in !CONFIG_IO_APIC case
x86, MSI: pass irq_cfg and irq_desc
x86: MSI start irq numbering from nr_irqs_gsi
x86: use NR_IRQS_LEGACY
sparse irq_desc[] array: core kernel and x86 changes
genirq: record IRQ_LEVEL in irq_desc[]
irq.h: remove padding from irq_desc on 64bits
Linus Torvalds [Wed, 31 Dec 2008 00:16:21 +0000 (16:16 -0800)]
Merge branch 'timers-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix warning in kernel/hrtimer.c
x86: make sure we really have an hpet mapping before using it
x86: enable HPET on Fujitsu u9200
linux/timex.h: cleanup for userspace
posix-timers: simplify de_thread()->exit_itimers() path
posix-timers: check ->it_signal instead of ->it_pid to validate the timer
posix-timers: use "struct pid*" instead of "struct task_struct*"
nohz: suppress needless timer reprogramming
clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
nohz: no softirq pending warnings for offline cpus
hrtimer: removing all ur callback modes, fix
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes
x86: correct link to HPET timer specification
rtc-cmos: export second NVRAM bank
Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
manually.
Linus Torvalds [Wed, 31 Dec 2008 00:10:19 +0000 (16:10 -0800)]
Merge branch 'core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
stacktrace: provide save_stack_trace_tsk() weak alias
rcu: provide RCU options on non-preempt architectures too
printk: fix discarding message when recursion_bug
futex: clean up futex_(un)lock_pi fault handling
"Tree RCU": scalable classic RCU implementation
futex: rename field in futex_q to clarify single waiter semantics
x86/swiotlb: add default swiotlb_arch_range_needs_mapping
x86/swiotlb: add default phys<->bus conversion
x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
x86: add swiotlb allocation functions
swiotlb: consolidate swiotlb info message printing
swiotlb: support bouncing of HighMem pages
swiotlb: factor out copy to/from device
swiotlb: add arch hook to force mapping
swiotlb: allow architectures to override phys<->bus<->phys conversions
swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
rcu: fix rcutorture behavior during reboot
resources: skip sanity check of busy resources
swiotlb: move some definitions to header
swiotlb: allow architectures to override swiotlb pool allocation
...
Fix up trivial conflicts in
arch/x86/kernel/Makefile
arch/x86/mm/init_32.c
include/linux/hardirq.h
as per Ingo's suggestions.
Roland Dreier [Tue, 30 Dec 2008 23:36:58 +0000 (15:36 -0800)]
Merge branches 'cma' and 'mlx4' into for-linus
Roland Dreier [Tue, 30 Dec 2008 23:30:26 +0000 (15:30 -0800)]
IB/mlx4: Fix reading SL field out of cqe->sl_vid
Commit
f780a9f1 ("mlx4_core: Add ethernet fields to CQE struct")
introduced a bug in how wc->sl is set in mlx4_ib_poll_one() -- since
cqe->sl_vid is a big-endian value, the shift must be done after
converting to host endianness.
This bug was found using sparse endianness checking.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Trond Myklebust [Tue, 30 Dec 2008 21:51:43 +0000 (16:51 -0500)]
Merge branch 'devel' into next
WANG Cong [Tue, 30 Dec 2008 21:35:55 +0000 (16:35 -0500)]
fs/nfs/nfs4proc.c: make nfs4_map_errors() static
nfs4_map_errors() can become static.
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
James Bottomley [Tue, 30 Dec 2008 15:44:29 +0000 (09:44 -0600)]
[SCSI] fcoe: fix configuration problems
fcoe selects libfc and requires SCSI and PCI (the SCSI requirement is
implicitly covered by an enclosing if). Fix them both up so they
cannot be configured in an invalid state: make LIBFC select
SCSI_FC_ATTRS and make FCOE depend on PCI and select LIBFC.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
James Bottomley [Tue, 30 Dec 2008 16:20:24 +0000 (10:20 -0600)]
[SCSI] cxgb3i: fix select/depend problem
cxgb3i requires the cxgb3 net driver, so it selects it. However,
cxgb3 has dependencies which the select cannot see. Fix this by
separating out the cxgb3 dependencies into a separate hidden config
option (CONFIG_CHELSIO_T3_DEPENDS) and make both cxgb3 and cxgb3i
depend on it.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Karen Xie <kxie@chelsio.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
James Bottomley [Mon, 29 Dec 2008 21:45:41 +0000 (15:45 -0600)]
[SCSI] fcoe: fix incorrect use of struct module
This structure may not be defined if CONFIG_MODULE=n, so never deref
it. Change uses of module->name to module_name(module) and corrects
some dyslexic printks and docbook comments.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Karen Xie [Tue, 30 Dec 2008 05:43:25 +0000 (21:43 -0800)]
[SCSI] cxgb3i: remove use of skb->sp
The cxgb3i was using skb->sp pointer for some internal book-keeping
which is not related to the secure path. Changed it to use skb->cb[]
instead.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Karen Xie [Tue, 9 Dec 2008 22:15:32 +0000 (14:15 -0800)]
[SCSI] cxgb3i: Add cxgb3i iSCSI driver.
This patch implements the cxgb3i iscsi connection acceleration for the
open-iscsi initiator.
The cxgb3i driver offers the iscsi PDU based offload:
- digest insertion and verification
- payload direct-placement into host memory buffer.
Signed-off-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Julia Lawall [Tue, 30 Dec 2008 00:49:22 +0000 (21:49 -0300)]
V4L/DVB (10130): use USB API functions rather than constants
This set of patches introduces calls to the following set of functions:
usb_endpoint_dir_in(epd)
usb_endpoint_dir_out(epd)
usb_endpoint_is_bulk_in(epd)
usb_endpoint_is_bulk_out(epd)
usb_endpoint_is_int_in(epd)
usb_endpoint_is_int_out(epd)
usb_endpoint_is_isoc_in(epd)
usb_endpoint_is_isoc_out(epd)
usb_endpoint_num(epd)
usb_endpoint_type(epd)
usb_endpoint_xfer_bulk(epd)
usb_endpoint_xfer_control(epd)
usb_endpoint_xfer_int(epd)
usb_endpoint_xfer_isoc(epd)
In some cases, introducing one of these functions is not possible, and it
just replaces an explicit integer value by one of the following constants:
USB_ENDPOINT_XFER_BULK
USB_ENDPOINT_XFER_CONTROL
USB_ENDPOINT_XFER_INT
USB_ENDPOINT_XFER_ISOC
An extract of the semantic patch that makes these changes is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r1@ struct usb_endpoint_descriptor *epd; @@
- ((epd->bmAttributes & \(USB_ENDPOINT_XFERTYPE_MASK\|3\)) ==
- \(USB_ENDPOINT_XFER_CONTROL\|0\))
+ usb_endpoint_xfer_control(epd)
@r5@ struct usb_endpoint_descriptor *epd; @@
- ((epd->bEndpointAddress & \(USB_ENDPOINT_DIR_MASK\|0x80\)) ==
- \(USB_DIR_IN\|0x80\))
+ usb_endpoint_dir_in(epd)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Steven Rostedt [Fri, 12 Dec 2008 02:01:14 +0000 (23:01 -0300)]
V4L/DVB (10129): dvb: remove deprecated use of RW_LOCK_UNLOCKED in frontends
Impact: clean up
RW_LOCK_UNLOCKED is deprecated. This patch replaces it with the
__RW_LOCK_UNLOCKED(lock) macro. This change was a little trickier than
others due to the macro being used in another macro that fills an array.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Németh Márton [Mon, 29 Dec 2008 19:37:14 +0000 (16:37 -0300)]
V4L/DVB (10128): modify V4L documentation to be a valid XHTML
Modify Documentation/video4linux/API.html to be a valid XHTML 1.0 Strict.
The result was verified using the http://validator.w3.org/ service.
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>