GitHub/LineageOS/android_kernel_motorola_exynos9610.git
7 years agokvm/svm: Setup MCG_CAP on AMD properly
Borislav Petkov [Sun, 26 Mar 2017 21:51:24 +0000 (23:51 +0200)]
kvm/svm: Setup MCG_CAP on AMD properly

MCG_CAP[63:9] bits are reserved on AMD. However, on an AMD guest, this
MSR returns 0x100010a. More specifically, bit 24 is set, which is simply
wrong. That bit is MCG_SER_P and is present only on Intel. Thus, clean
up the reserved bits in order not to confuse guests.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: nVMX: single function for switching between vmcs
David Hildenbrand [Mon, 20 Mar 2017 09:00:08 +0000 (10:00 +0100)]
KVM: nVMX: single function for switching between vmcs

Let's combine it in a single function vmx_switch_vmcs().

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agokvm: vmx: Don't use INVVPID when EPT is enabled
Jim Mattson [Wed, 15 Mar 2017 14:56:11 +0000 (07:56 -0700)]
kvm: vmx: Don't use INVVPID when EPT is enabled

According to the Intel SDM, volume 3, section 28.3.2: Creating and
Using Cached Translation Information, "No linear mappings are used
while EPT is in use." INVEPT will invalidate both the guest-physical
mappings and the combined mappings in the TLBs and paging-structure
caches, so an INVVPID is superfluous.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoMerge tag 'kvm_mips_4.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan...
Radim Krčmář [Thu, 6 Apr 2017 12:47:03 +0000 (14:47 +0200)]
Merge tag 'kvm_mips_4.12_1' of git://git./linux/kernel/git/jhogan/kvm-mips

From: James Hogan <james.hogan@imgtec.com>

KVM: MIPS: VZ support, Octeon III, and TLBR

Add basic support for the MIPS Virtualization Module (generally known as
MIPS VZ) in KVM. We primarily support the ImgTec P5600, P6600, I6400,
and Cavium Octeon III cores so far. Support is included for the
following VZ / guest hardware features:
- MIPS32 and MIPS64, r5 (VZ requires r5 or later) and r6
- TLBs with GuestID (IMG cores) or Root ASID Dealias (Octeon III)
- Shared physical root/guest TLB (IMG cores)
- FPU / MSA
- Cop0 timer (up to 1GHz for now due to soft timer limit)
- Segmentation control (EVA)
- Hardware page table walker (HTW) both for root and guest TLB

Also included is a proper implementation of the TLBR instruction for the
trap & emulate MIPS KVM implementation.

Preliminary MIPS architecture changes are applied directly with Ralf's
ack.

7 years agotools/kvm_stat: add '%Total' column
Stefan Raspl [Fri, 10 Mar 2017 12:40:16 +0000 (13:40 +0100)]
tools/kvm_stat: add '%Total' column

Add column '%Total' next to 'Total' for easier comparison of numbers between
hosts.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: add interactive command 'r'
Stefan Raspl [Fri, 10 Mar 2017 12:40:15 +0000 (13:40 +0100)]
tools/kvm_stat: add interactive command 'r'

Provide an interactive command to reset the tracepoint statistics.
Requires some extra work for debugfs, as the counters cannot be reset.

On the up side, this offers us the opportunity to have debugfs values
reset on startup and whenever a filter is modified, becoming consistent
with the tracepoint provider. As a bonus, 'kvmstat -dt' will now provide
useful output, instead of mixing values in totally different orders of
magnitude.
Furthermore, we avoid unnecessary resets when any of the filters is
"changed" interactively to the previous value.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: add interactive command 'c'
Stefan Raspl [Fri, 10 Mar 2017 12:40:14 +0000 (13:40 +0100)]
tools/kvm_stat: add interactive command 'c'

Provide a real simple way to erase any active filter.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: add option '--guest'
Stefan Raspl [Fri, 10 Mar 2017 12:40:13 +0000 (13:40 +0100)]
tools/kvm_stat: add option '--guest'

Add a new option '-g'/'--guest' to select a particular process by providing
the QEMU guest name.
Notes:
- The logic to figure out the pid corresponding to the guest name might look
  scary, but works pretty reliably in practice; in the unlikely event that it
  returns add'l flukes, it will bail out and hint at using '-p' instead, no
  harm done.
- Mixing '-g' and '-p' is possible, and the final instance specified on the
  command line is the significant one. This is consistent with current
  behavior for '-p' which, if specified multiple times, also regards the final
  instance as the significant one.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: remove regex filter on empty input
Stefan Raspl [Fri, 10 Mar 2017 12:40:12 +0000 (13:40 +0100)]
tools/kvm_stat: remove regex filter on empty input

Behavior on empty/0 input for regex and pid filtering was inconsistent, as
the former would keep the current filter, while the latter would (naturally)
remove any pid filtering.
Make things consistent by falling back to the default filter on empty input
for the regex filter dialogue.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: display regex when set to non-default
Stefan Raspl [Fri, 10 Mar 2017 12:40:11 +0000 (13:40 +0100)]
tools/kvm_stat: display regex when set to non-default

If a user defines a regex filter through the interactive command, display
the active regex in the header's second line.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: print error messages on faulty pid filter input
Stefan Raspl [Fri, 10 Mar 2017 12:40:10 +0000 (13:40 +0100)]
tools/kvm_stat: print error messages on faulty pid filter input

Print helpful messages in case users enter invalid input or invalid pids in
the interactive pid filter dialogue.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: remove pid filter on empty input
Stefan Raspl [Fri, 10 Mar 2017 12:40:09 +0000 (13:40 +0100)]
tools/kvm_stat: remove pid filter on empty input

Improve consistency in the interactive dialogue for pid filtering by
removing any filters on empty input (in addition to entering 0).

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: display guest name when using pid filter
Stefan Raspl [Fri, 10 Mar 2017 12:40:08 +0000 (13:40 +0100)]
tools/kvm_stat: display guest name when using pid filter

When running kvm_stat with option '-p' to filter per process, display
the QEMU guest name next to the pid, if available.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-By: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: document list of interactive commands
Stefan Raspl [Fri, 10 Mar 2017 12:40:07 +0000 (13:40 +0100)]
tools/kvm_stat: document list of interactive commands

Apart from the source code, there does not seem to be a place that documents
the interactive capabilities of kvm_stat yet.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: reduce perceived idle time on filter updates
Stefan Raspl [Fri, 10 Mar 2017 12:40:06 +0000 (13:40 +0100)]
tools/kvm_stat: reduce perceived idle time on filter updates

Whenever a user adds a filter, we
* redraw the header immediately for a snappy response
* print a message indicating to the user that we're busy while the
  noticeable delay induced by updating all of the stats objects takes place
* update the statistics ASAP (i.e. after 0.25s instead of 3s) to be
  consistent with behavior on startup
To do so, we split the Tui's refresh() method to allow for drawing header
and stats separately, and trigger a header refresh whenever we are about
to do something that takes a while - like updating filters.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: full PEP8 compliance
Stefan Raspl [Fri, 10 Mar 2017 12:40:05 +0000 (13:40 +0100)]
tools/kvm_stat: full PEP8 compliance

Provides all missing empty lines as required for full PEP compliance.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: fix trace setup glitch on field updates in TracepointProvider
Stefan Raspl [Fri, 10 Mar 2017 12:40:04 +0000 (13:40 +0100)]
tools/kvm_stat: fix trace setup glitch on field updates in TracepointProvider

Updating the fields of the TracepointProvider does not propagate changes to the
tracepoints. This shows when a pid filter is enabled, whereby subsequent
extensions of the fields of the Tracepoint provider (e.g. by toggling
drilldown) will not modify the tracepoints as required.
To reproduce, select a specific process via interactive command 'p', and
enable drilldown via 'x' - none of the fields with the braces will appear
although they should.
The fix will always leave all available fields in the TracepointProvider
enabled.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Based-on-text-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: fix misc glitches
Stefan Raspl [Fri, 10 Mar 2017 12:40:03 +0000 (13:40 +0100)]
tools/kvm_stat: fix misc glitches

Addresses
- eliminate extra import
- missing variable initialization
- type redefinition from int to float
- passing of int type argument instead of string
- a couple of PEP8-reported indentation/formatting glitches
- remove unused variable drilldown in class Tui

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: handle SIGINT in log and batch modes
Stefan Raspl [Fri, 10 Mar 2017 12:40:02 +0000 (13:40 +0100)]
tools/kvm_stat: handle SIGINT in log and batch modes

SIGINT causes ugly unhandled exceptions in log and batch mode, which we
prevent by catching the exceptions accordingly.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: catch curses exceptions only
Stefan Raspl [Fri, 10 Mar 2017 12:40:01 +0000 (13:40 +0100)]
tools/kvm_stat: catch curses exceptions only

The previous version was catching all exceptions, including SIGINT.
We only want to catch the curses exceptions here.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agotools/kvm_stat: hide cursor
Stefan Raspl [Fri, 10 Mar 2017 12:40:00 +0000 (13:40 +0100)]
tools/kvm_stat: hide cursor

When running kvm_stat in interactive mode, the cursor appears at the lower
left corner, which looks a bit distracting.
This patch hides the cursor by turning it invisible.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: MIPS/Emulate: Properly implement TLBR for T&E
James Hogan [Tue, 14 Mar 2017 17:00:08 +0000 (17:00 +0000)]
KVM: MIPS/Emulate: Properly implement TLBR for T&E

Properly implement emulation of the TLBR instruction for Trap & Emulate.
This instruction reads the TLB entry pointed at by the CP0_Index
register into the other TLB registers, which may have the side effect of
changing the current ASID. Therefore abstract the CP0_EntryHi and ASID
changing code into a common function in the process.

A comment indicated that Linux doesn't use TLBR, which is true during
normal use, however dumping of the TLB does use it (for example with the
relatively recent 'x' magic sysrq key), as does a wired TLB entries test
case in my KVM tests.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Allow KVM to be enabled on Octeon CPUs
James Hogan [Tue, 14 Mar 2017 10:25:51 +0000 (10:25 +0000)]
MIPS: Allow KVM to be enabled on Octeon CPUs

Octeon III has VZ ASE support, so allow KVM to be enabled on Octeon
CPUs as it should now be functional.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Handle Octeon III guest.PRid register
James Hogan [Tue, 14 Mar 2017 10:25:50 +0000 (10:25 +0000)]
KVM: MIPS/VZ: Handle Octeon III guest.PRid register

Octeon III implements a read-only guest CP0_PRid register, so add cases
to the KVM register access API for Octeon to ensure the correct value is
read and writes are ignored.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Emulate hit CACHE ops for Octeon III
James Hogan [Tue, 14 Mar 2017 10:25:49 +0000 (10:25 +0000)]
KVM: MIPS/VZ: Emulate hit CACHE ops for Octeon III

Octeon III doesn't implement the optional GuestCtl0.CG bit to allow
guest mode to execute virtual address based CACHE instructions, so
implement emulation of a few important ones specifically for Octeon III
in response to a GPSI exception.

Currently the main reason to perform these operations is for icache
synchronisation, so they are implemented as a simple icache flush with
local_flush_icache_range().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: VZ hardware setup for Octeon III
James Hogan [Tue, 14 Mar 2017 10:25:48 +0000 (10:25 +0000)]
KVM: MIPS/VZ: VZ hardware setup for Octeon III

Set up hardware virtualisation on Octeon III cores, configuring guest
interrupt routing and carving out half of the root TLB for guest use,
restoring it back again afterwards.

We need to be careful to inhibit TLB shutdown machine check exceptions
while invalidating guest TLB entries, since TLB invalidation is not
available so guest entries must be invalidated by setting them to unique
unmapped addresses, which could conflict with mappings set by the guest
or root if recently repartitioned.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/T&E: Report correct dcache line size
James Hogan [Tue, 14 Mar 2017 10:25:47 +0000 (10:25 +0000)]
KVM: MIPS/T&E: Report correct dcache line size

Octeon CPUs don't report the correct dcache line size in CP0_Config1.DL,
so encode the correct value for the guest CP0_Config1.DL based on
cpu_dcache_line_size().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/TLB: Handle virtually tagged icaches
James Hogan [Tue, 14 Mar 2017 10:25:46 +0000 (10:25 +0000)]
KVM: MIPS/TLB: Handle virtually tagged icaches

When TLB entries are invalidated in the presence of a virtually tagged
icache, such as that found on Octeon CPUs, flush the icache so that we
don't get a reserved instruction exception even though the TLB mapping
is removed.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Emulate: Adapt T&E CACHE emulation for Octeon
James Hogan [Tue, 14 Mar 2017 10:25:45 +0000 (10:25 +0000)]
KVM: MIPS/Emulate: Adapt T&E CACHE emulation for Octeon

Cache management is implemented separately for Cavium Octeon CPUs, so
r4k_blast_[id]cache aren't available. Instead for Octeon perform a local
icache flush using local_flush_icache_range(), and for other platforms
which don't use c-r4k.c use __flush_cache_all() / flush_icache_all().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Add Octeon III register accessors & definitions
James Hogan [Tue, 14 Mar 2017 10:25:44 +0000 (10:25 +0000)]
MIPS: Add Octeon III register accessors & definitions

Add accessors for some VZ related Cavium Octeon III specific COP0
registers, along with field definitions. These will mostly be used by
KVM to set up interrupt routing and partition the TLB between root and
guest.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Trace guest mode changes
James Hogan [Tue, 14 Mar 2017 10:15:40 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Trace guest mode changes

Create a trace event for guest mode changes, and enable VZ's
GuestCtl0.MC bit after the trace event is enabled to trap all guest mode
changes.

The MC bit causes Guest Hardware Field Change (GHFC) exceptions whenever
a guest mode change occurs (such as an exception entry or return from
exception), so we need to handle this exception now. The MC bit is only
enabled when restoring register state, so enabling the trace event won't
take immediate effect.

Tracing guest mode changes can be particularly handy when trying to work
out what a guest OS gets up to before something goes wrong, especially
if the problem occurs as a result of some previous guest userland
exception which would otherwise be invisible in the trace.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support hardware guest timer
James Hogan [Tue, 14 Mar 2017 10:15:39 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support hardware guest timer

Transfer timer state to the VZ guest context (CP0_GTOffset & guest
CP0_Count) when entering guest mode, enabling direct guest access to it,
and transfer back to soft timer when saving guest register state.

This usually allows guest code to directly read CP0_Count (via MFC0 and
RDHWR) and read/write CP0_Compare, without trapping to the hypervisor
for it to emulate the guest timer. Writing to CP0_Count or CP0_Cause.DC
is much less common and still triggers a hypervisor GPSI exception, in
which case the timer state is transferred back to an hrtimer before
emulating the write.

We are careful to prevent small amounts of drift from building up due to
undeterministic time intervals between reading of the ktime and reading
of CP0_Count. Some drift is expected however, since the system
clocksource may use a different timer to the local CP0_Count timer used
by VZ. This is permitted to prevent guest CP0_Count from appearing to go
backwards.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Emulate MAARs when necessary
James Hogan [Tue, 14 Mar 2017 10:15:38 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Emulate MAARs when necessary

Add emulation of Memory Accessibility Attribute Registers (MAARs) when
necessary. We can't actually do anything with whatever the guest
provides, but it may not be possible to clear Guest.Config5.MRP so we
have to emulate at least a pair of MAARs.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support guest load-linked bit
James Hogan [Tue, 14 Mar 2017 10:15:37 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support guest load-linked bit

When restoring guest state after another VCPU has run, be sure to clear
CP0_LLAddr.LLB in order to break any interrupted atomic critical
section. Without this SMP guest atomics don't work when LLB is present
as one guest can complete the atomic section started by another guest.

MIPS VZ guest read of CP0_LLAddr causes Guest Privileged Sensitive
Instruction (GPSI) exception due to the address being root physical.
Handle this by reporting only the LLB bit, which contains the bit for
whether a ll/sc atomic is in progress without any reason for failure.

Similarly on P5600 a guest write to CP0_LLAddr also causes a GPSI
exception. Handle this also by clearing the guest LLB bit from root
mode.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support guest hardware page table walker
James Hogan [Tue, 14 Mar 2017 10:15:36 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support guest hardware page table walker

Add support for VZ guest CP0_PWBase, CP0_PWField, CP0_PWSize, and
CP0_PWCtl registers for controlling the guest hardware page table walker
(HTW) present on P5600 and P6600 cores. These guest registers need
initialising on R6, context switching, and exposing via the KVM ioctl
API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support guest segmentation control
James Hogan [Tue, 14 Mar 2017 10:15:35 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support guest segmentation control

Add support for VZ guest CP0_SegCtl0, CP0_SegCtl1, and CP0_SegCtl2
registers, as found on P5600 and P6600 cores. These guest registers need
initialising, context switching, and exposing via the KVM ioctl API when
they are present.

They also require the GVA -> GPA translation code for handling a GVA
root exception to be updated to interpret the segmentation registers and
decode the faulting instruction enough to detect EVA memory access
instructions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support guest CP0_[X]ContextConfig
James Hogan [Tue, 14 Mar 2017 10:15:34 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support guest CP0_[X]ContextConfig

Add support for VZ guest CP0_ContextConfig and CP0_XContextConfig
(MIPS64 only) registers, as found on P5600 and P6600 cores. These guest
registers need initialising, context switching, and exposing via the KVM
ioctl API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS/VZ: Support guest CP0_BadInstr[P]
James Hogan [Tue, 14 Mar 2017 10:15:33 +0000 (10:15 +0000)]
KVM: MIPS/VZ: Support guest CP0_BadInstr[P]

Add support for VZ guest CP0_BadInstr and CP0_BadInstrP registers, as
found on most VZ capable cores. These guest registers need context
switching, and exposing via the KVM ioctl API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS: Add VZ support to build system
James Hogan [Tue, 14 Mar 2017 10:15:32 +0000 (10:15 +0000)]
KVM: MIPS: Add VZ support to build system

Add support for the MIPS Virtualization (VZ) ASE to the MIPS KVM build
system. For now KVM can only be configured for T&E or VZ and not both,
but the design of the user facing APIs support the possibility of having
both available, so this could change in future.

Note that support for various optional guest features (some of which
can't be turned off) are implemented in immediately following commits,
so although it should now be possible to build VZ support, it may not
work yet on your hardware.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Implement VZ support
James Hogan [Tue, 14 Mar 2017 10:15:31 +0000 (10:15 +0000)]
KVM: MIPS: Implement VZ support

Add the main support for the MIPS Virtualization ASE (A.K.A. VZ) to MIPS
KVM. The bulk of this work is in vz.c, with various new state and
definitions elsewhere.

Enough is implemented to be able to run on a minimal VZ core. Further
patches will fill out support for guest features which are optional or
can be disabled.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS: Update exit handler for VZ
James Hogan [Tue, 14 Mar 2017 10:15:30 +0000 (10:15 +0000)]
KVM: MIPS: Update exit handler for VZ

The general guest exit handler needs a few tweaks for VZ compared to
trap & emulate, which for now are made directly depending on
CONFIG_KVM_MIPS_VZ:

- There is no need to re-enable the hardware page table walker (HTW), as
  it can be left enabled during guest mode operation with VZ.

- There is no need to perform a privilege check, as any guest privilege
  violations should have already been detected by the hardware and
  triggered the appropriate guest exception.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Emulate: Drop CACHE emulation for VZ
James Hogan [Tue, 14 Mar 2017 10:15:29 +0000 (10:15 +0000)]
KVM: MIPS/Emulate: Drop CACHE emulation for VZ

Ifdef out the trap & emulate CACHE instruction emulation functions for
VZ. We will provide separate CACHE instruction emulation in vz.c, and we
need to avoid linker errors due to the use of T&E specific MMU helpers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Emulate: Update CP0_Compare emulation for VZ
James Hogan [Tue, 14 Mar 2017 10:15:28 +0000 (10:15 +0000)]
KVM: MIPS/Emulate: Update CP0_Compare emulation for VZ

Update emulation of guest writes to CP0_Compare for VZ. There are two
main differences compared to trap & emulate:

 - Writing to CP0_Compare in the VZ hardware guest context acks any
   pending timer, clearing CP0_Cause.TI. If we don't want an ack to take
   place we must carefully restore the TI bit if it was previously set.

 - Even with guest timer access disabled in CP0_GuestCtl0.GT, if the
   guest CP0_Count reaches the guest CP0_Compare the timer interrupt
   will assert. To prevent this we must set CP0_GTOffset to move the
   guest CP0_Count out of the way of the new guest CP0_Compare, either
   before or after depending on whether it is a forwards or backwards
   change.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/TLB: Add VZ TLB management
James Hogan [Tue, 14 Mar 2017 10:15:27 +0000 (10:15 +0000)]
KVM: MIPS/TLB: Add VZ TLB management

Add functions for MIPS VZ TLB management to tlb.c.

kvm_vz_host_tlb_inv() will be used for invalidating root TLB entries
after GPA page tables have been modified due to a KVM page fault. It
arranges for a root GPA mapping to be flushed from the TLB, using the
gpa_mm ASID or the current GuestID to do the probe.

kvm_vz_local_flush_roottlb_all_guests() and
kvm_vz_local_flush_guesttlb_all() flush all TLB entries in the
corresponding TLB for guest mappings (GPA->RPA for root TLB with
GuestID, and all entries for guest TLB). They will be used when starting
a new GuestID cycle, when VZ hardware is enabled/disabled, and also when
switching to a guest when the guest TLB contents may be stale or belong
to a different VM.

kvm_vz_guest_tlb_lookup() converts a guest virtual address to a guest
physical address using the guest TLB. This will be used to decode guest
virtual addresses which are sometimes provided by VZ hardware in
CP0_BadVAddr for certain exceptions when the guest physical address is
unavailable.

kvm_vz_save_guesttlb() and kvm_vz_load_guesttlb() will be used to
preserve wired guest VTLB entries while a guest isn't running.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Entry: Update entry code to support VZ
James Hogan [Tue, 14 Mar 2017 10:15:26 +0000 (10:15 +0000)]
KVM: MIPS/Entry: Update entry code to support VZ

Update MIPS KVM entry code to support VZ:

 - We need to set GuestCtl0.GM while in guest mode.

 - For cores supporting GuestID, we need to set the root GuestID to
   match the main GuestID while in guest mode so that the root TLB
   refill handler writes the correct GuestID into the TLB.

 - For cores without GuestID where the root ASID dealiases RVA/GPA
   mappings, we need to load that ASID from the gpa_mm rather than the
   per-VCPU guest_kernel_mm or guest_user_mm, since the root TLB maps
   guest physical addresses. We also need to restore the normal process
   ASID on exit.

 - The normal linux process pgd needs restoring on exit, as we can't
   leave the GPA mappings active for kernel code.

 - GuestCtl0 needs saving on exit for the GExcCode field, as it may be
   clobbered if a preemption occurs.

We also need to move the TLB refill handler to the XTLB vector at offset
0x80 on 64-bit VZ kernels, as hardware will use Root.Status.KX to
determine whether a TLB refill or XTLB Refill exception is to be taken
on a root TLB miss from guest mode, and KX needs to be set for kernel
code to be able to access the 64-bit segments.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Abstract guest CP0 register access for VZ
James Hogan [Tue, 14 Mar 2017 10:15:25 +0000 (10:15 +0000)]
KVM: MIPS: Abstract guest CP0 register access for VZ

Abstract the MIPS KVM guest CP0 register access macros into inline
functions which are generated by macros. This allows them to be
generated differently for VZ, where they will usually need to access the
hardware guest CP0 context rather than the saved values in RAM.

Accessors for each individual register are generated using these macros:

 - __BUILD_KVM_*_SW() for registers which are not present in the VZ
   hardware guest context, so kvm_{read,write}_c0_guest_##name() will
   access the saved value in RAM regardless of whether VZ is enabled.

 - __BUILD_KVM_*_HW() for registers which are present in the VZ hardware
   guest context, so kvm_{read,write}_c0_guest_##name() will access the
   hardware register when VZ is enabled.

These build the underlying accessors using further macros:

 - __BUILD_KVM_*_SAVED() builds e.g. kvm_{read,write}_sw_gc0_##name()
   functions for accessing the saved versions of the registers in RAM.
   This is used for implementing the common
   kvm_{read,write}_c0_guest_##name() accessors with T&E where registers
   are always stored in RAM, but are also available with VZ HW registers
   to allow them to be accessed while saved.

 - __BUILD_KVM_*_VZ() builds e.g. kvm_{read,write}_vz_gc0_##name()
   functions for accessing the VZ hardware guest context registers
   directly. This is used for implementing the common
   kvm_{read,write}_c0_guest_##name() accessors with VZ.

 - __BUILD_KVM_*_WRAP() builds wrappers with different names, which
   allows the common kvm_{read,write}_c0_guest_##name() functions to be
   implemented using the VZ accessors while still having the SAVED
   accessors available too.

 - __BUILD_KVM_SAVE_VZ() builds functions for saving and restoring VZ
   hardware guest context register state to RAM, improving conciseness
   of VZ context saving and restoring.

Similar macros exist for generating modifiers (set, clear, change),
either with a normal unlocked read/modify/write, or using atomic LL/SC
sequences.

These changes change the types of 32-bit registers to u32 instead of
unsigned long, which requires some changes to printk() functions in MIPS
KVM.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Add guest exit exception callback
James Hogan [Tue, 14 Mar 2017 10:15:24 +0000 (10:15 +0000)]
KVM: MIPS: Add guest exit exception callback

Add a callback for MIPS KVM implementations to handle the VZ guest
exit exception. Currently the trap & emulate implementation contains a
stub which reports an internal error, but the callback will be used
properly by the VZ implementation.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Add hardware_{enable,disable} callback
James Hogan [Tue, 14 Mar 2017 10:15:23 +0000 (10:15 +0000)]
KVM: MIPS: Add hardware_{enable,disable} callback

Add an implementation callback for the kvm_arch_hardware_enable() and
kvm_arch_hardware_disable() architecture functions, with simple stubs
for trap & emulate. This is in preparation for VZ which will make use of
them.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Add callback to check extension
James Hogan [Tue, 14 Mar 2017 10:15:22 +0000 (10:15 +0000)]
KVM: MIPS: Add callback to check extension

Add an implementation callback for checking presence of KVM extensions.
This allows implementation specific extensions to be provided without
ifdefs in mips.c.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Init timer frequency from callback
James Hogan [Tue, 14 Mar 2017 10:15:21 +0000 (10:15 +0000)]
KVM: MIPS: Init timer frequency from callback

Currently the software emulated timer is initialised to a frequency of
100MHz by kvm_mips_init_count(), but this isn't suitable for VZ where
the frequency of the guest timer matches that of the host.

Add a count_hz argument so the caller can specify the default frequency,
and move the call from kvm_arch_vcpu_create() to the implementation
specific vcpu_setup() callback, so that VZ can specify a different
frequency.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Add 64BIT capability
James Hogan [Tue, 14 Mar 2017 10:15:20 +0000 (10:15 +0000)]
KVM: MIPS: Add 64BIT capability

Add a new KVM_CAP_MIPS_64BIT capability to indicate that 64-bit MIPS
guests are available and supported. In this case it should still be
possible to run 32-bit guest code. If not available it won't be possible
to run 64-bit guest code and the instructions may not be available, or
the kernel may not support full context switching of 64-bit registers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS: Add VZ & TE capabilities
James Hogan [Tue, 14 Mar 2017 10:15:19 +0000 (10:15 +0000)]
KVM: MIPS: Add VZ & TE capabilities

Add new KVM_CAP_MIPS_VZ and KVM_CAP_MIPS_TE capabilities, and in order
to allow MIPS KVM to support VZ without confusing old users (which
expect the trap & emulate implementation), define and start checking
KVM_CREATE_VM type codes.

The codes available are:

 - KVM_VM_MIPS_TE = 0

   This is the current value expected from the user, and will create a
   VM using trap & emulate in user mode, confined to the user mode
   address space. This may in future become unavailable if the kernel is
   only configured to support VZ, in which case the EINVAL error will be
   returned and KVM_CAP_MIPS_TE won't be available even though
   KVM_CAP_MIPS_VZ is.

 - KVM_VM_MIPS_VZ = 1

   This can be provided when the KVM_CAP_MIPS_VZ capability is available
   to create a VM using VZ, with a fully virtualized guest virtual
   address space. If VZ support is unavailable in the kernel, the EINVAL
   error will be returned (although old kernels without the
   KVM_CAP_MIPS_VZ capability may well succeed and create a trap &
   emulate VM).

This is designed to allow the desired implementation (T&E vs VZ) to be
potentially chosen at runtime rather than being fixed in the kernel
configuration.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoKVM: MIPS: Extend counters & events for VZ GExcCodes
James Hogan [Tue, 14 Mar 2017 10:15:18 +0000 (10:15 +0000)]
KVM: MIPS: Extend counters & events for VZ GExcCodes

Extend MIPS KVM stats counters and kvm_transition trace event codes to
cover hypervisor exceptions, which have their own GExcCode field in
CP0_GuestCtl0 with up to 32 hypervisor exception cause codes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Update kvm_lose_fpu() for VZ
James Hogan [Tue, 14 Mar 2017 10:15:17 +0000 (10:15 +0000)]
KVM: MIPS: Update kvm_lose_fpu() for VZ

Update the implementation of kvm_lose_fpu() for VZ, where there is no
need to enable the FPU/MSA in the root context if the FPU/MSA state is
loaded but disabled in the guest context.

The trap & emulate implementation needs to disable FPU/MSA in the root
context when the guest disables them in order to catch the COP1 unusable
or MSA disabled exception when they're used and pass it on to the guest.

For VZ however as long as the context is loaded and enabled in the root
context, the guest can enable and disable it in the guest context
without the hypervisor having to do much, and will take guest exceptions
without hypervisor intervention if used without being enabled in the
guest context.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Emulate: Implement 64-bit MMIO emulation
James Hogan [Tue, 14 Mar 2017 10:15:16 +0000 (10:15 +0000)]
KVM: MIPS/Emulate: Implement 64-bit MMIO emulation

Implement additional MMIO emulation for MIPS64, including 64-bit
loads/stores, and 32-bit unsigned loads. These are only exposed on
64-bit VZ hosts.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS/Emulate: De-duplicate MMIO emulation
James Hogan [Tue, 14 Mar 2017 10:15:15 +0000 (10:15 +0000)]
KVM: MIPS/Emulate: De-duplicate MMIO emulation

Refactor MIPS KVM MMIO load/store emulation to reduce code duplication.
Each duplicate differed slightly anyway, and it will simplify adding
64-bit MMIO support for VZ.

kvm_mips_emulate_store() and kvm_mips_emulate_load() can now return
EMULATE_DO_MMIO (as possibly originally intended). We therefore stop
calling either of these from kvm_mips_emulate_inst(), which is now only
used by kvm_trap_emul_handle_cop_unusable() which is picky about return
values.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: MIPS: Implement HYPCALL emulation
James Hogan [Tue, 14 Mar 2017 10:15:14 +0000 (10:15 +0000)]
KVM: MIPS: Implement HYPCALL emulation

Emulate the HYPCALL instruction added in the VZ ASE and used by the MIPS
paravirtualised guest support that is already merged. The new hypcall.c
handles arguments and the return value. No actual hypercalls are yet
supported, but this still allows us to safely step over hypercalls and
set an error code in the return value for forward compatibility.

Non-zero HYPCALL codes are not handled.

We also document the hypercall ABI which asm/kvm_para.h uses.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
7 years agoMIPS: asm/tlb.h: Add UNIQUE_GUEST_ENTRYHI() macro
James Hogan [Tue, 14 Mar 2017 10:15:13 +0000 (10:15 +0000)]
MIPS: asm/tlb.h: Add UNIQUE_GUEST_ENTRYHI() macro

Add a distinct UNIQUE_GUEST_ENTRYHI() macro for invalidation of guest
TLB entries by KVM, using addresses in KSeg1 rather than KSeg0. This
avoids conflicts with guest invalidation routines when there is no EHINV
bit to mark the whole entry as invalid, avoiding guest machine check
exceptions on Cavium Octeon III.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Add some missing guest CP0 accessors & defs
James Hogan [Tue, 14 Mar 2017 10:15:12 +0000 (10:15 +0000)]
MIPS: Add some missing guest CP0 accessors & defs

Add some missing guest accessors and register field definitions for KVM
for MIPS VZ to make use of.

Guest CP0_LLAddr register accessors and definitions for the LLB field
allow KVM to clear the guest LLB to cancel in-progress LL/SC atomics on
restore, and to emulate accesses by the guest to the CP0_LLAddr
register.

Bitwise modifiers and definitions for the guest CP0_Wired and
CP0_Config1 registers allow KVM to modify fields within the CP0_Wired
and CP0_Config1 registers.

Finally a definition for the CP0_Config5.SBRI bit allows KVM to
initialise and allow modification of the guest version of the SBRI bit.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Probe guest MVH
James Hogan [Tue, 14 Mar 2017 10:15:11 +0000 (10:15 +0000)]
MIPS: Probe guest MVH

Probe for availablility of M{T,F}HC0 instructions used with e.g. XPA in
the VZ guest context, and make it available via cpu_guest_has_mvh. This
will be helpful in properly emulating the MAAR registers in KVM for MIPS
VZ.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Probe guest CP0_UserLocal
James Hogan [Tue, 14 Mar 2017 10:15:10 +0000 (10:15 +0000)]
MIPS: Probe guest CP0_UserLocal

Probe for presence of guest CP0_UserLocal register and expose via
cpu_guest_has_userlocal. This register is optional pre-r6, so this will
allow KVM to only save/restore/expose the guest CP0_UserLocal register
if it exists.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Separate MAAR V bit into VL and VH for XPA
James Hogan [Tue, 14 Mar 2017 10:15:09 +0000 (10:15 +0000)]
MIPS: Separate MAAR V bit into VL and VH for XPA

The MAAR V bit has been renamed VL since another bit called VH is added
at the top of the register when it is extended to 64-bits on a 32-bit
processor with XPA. Rename the V definition, fix the various users, and
add definitions for the VH bit. Also add a definition for the MAARI
Index field.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoMIPS: Add defs & probing of UFR
James Hogan [Tue, 14 Mar 2017 10:15:08 +0000 (10:15 +0000)]
MIPS: Add defs & probing of UFR

Add definitions and probing of the UFR bit in Config5. This bit allows
user mode control of the FR bit (floating point register mode). It is
present if the UFRP bit is set in the floating point implementation
register.

This is a capability KVM may want to expose to guest kernels, even
though Linux is unlikely to ever use it due to the implications for
multi-threaded programs.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
7 years agoKVM: x86: cleanup the page tracking SRCU instance
Paolo Bonzini [Mon, 27 Mar 2017 15:53:50 +0000 (17:53 +0200)]
KVM: x86: cleanup the page tracking SRCU instance

SRCU uses a delayed work item.  Skip cleaning it up, and
the result is use-after-free in the work item callbacks.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 0eb05bf290cfe8610d9680b49abef37febd1c38a
Reviewed-by: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: nVMX: fix nested EPT detection
Ladi Prosek [Thu, 23 Mar 2017 06:18:08 +0000 (07:18 +0100)]
KVM: nVMX: fix nested EPT detection

The nested_ept_enabled flag introduced in commit 7ca29de2136 was not
computed correctly. We are interested only in L1's EPT state, not the
the combined L0+L1 value.

In particular, if L0 uses EPT but L1 does not, nested_ept_enabled must
be false to make sure that PDPSTRs are loaded based on CR3 as usual,
because the special case described in 26.3.2.4 Loading Page-Directory-
Pointer-Table Entries does not apply.

Fixes: 7ca29de21362 ("KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT")
Cc: qemu-stable@nongnu.org
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: pci-assign: do not map smm memory slot pages in vt-d page tables
Herongguang (Stephen) [Mon, 27 Mar 2017 07:21:17 +0000 (15:21 +0800)]
KVM: pci-assign: do not map smm memory slot pages in vt-d page tables

or VM memory are not put thus leaked in kvm_iommu_unmap_memslots() when
destroy VM.

This is consistent with current vfio implementation.

Signed-off-by: herongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: kvm_io_bus_unregister_dev() should never fail
David Hildenbrand [Thu, 23 Mar 2017 17:24:19 +0000 (18:24 +0100)]
KVM: kvm_io_bus_unregister_dev() should never fail

No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting at least used again, when the iobus gets teared down on
kvm_destroy_vm() - leading to use after free errors.

There is nothing the callers could do, except retrying over and over
again.

So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).

Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
Cc: stable@vger.kernel.org # 3.4+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: VMX: Fix enable VPID conditions
Wanpeng Li [Thu, 23 Mar 2017 12:30:08 +0000 (05:30 -0700)]
KVM: VMX: Fix enable VPID conditions

This can be reproduced by running L2 on L1, and disable VPID on L0
if w/o commit "KVM: nVMX: Fix nested VPID vmx exec control", the L2
crash as below:

KVM: entry failed, hardware error 0x7
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306c3
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000

Reference SDM 30.3 INVVPID:

Protected Mode Exceptions
- #UD
  - If not in VMX operation.
  - If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0).
  - If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does
    not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0).

So we should check both VPID enable bit in vmx exec control and INVVPID support bit
in vmx capability MSRs to enable VPID. This patch adds the guarantee to not enable
VPID if either INVVPID or single-context/all-context invalidation is not exposed in
vmx capability MSRs.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: nVMX: Fix nested VPID vmx exec control
Wanpeng Li [Tue, 21 Mar 2017 04:18:53 +0000 (21:18 -0700)]
KVM: nVMX: Fix nested VPID vmx exec control

This can be reproduced by running kvm-unit-tests/vmx.flat on L0 w/ vpid disabled.

Test suite: VPID
Unhandled exception 6 #UD at ip 00000000004051a6
error_code=0000      rflags=00010047      cs=00000008
rax=0000000000000000 rcx=0000000000000001 rdx=0000000000000047 rbx=0000000000402f79
rbp=0000000000456240 rsi=0000000000000001 rdi=0000000000000000
r8=000000000000000a  r9=00000000000003f8 r10=0000000080010011 r11=0000000000000000
r12=0000000000000003 r13=0000000000000708 r14=0000000000000000 r15=0000000000000000
cr0=0000000080010031 cr2=0000000000000000 cr3=0000000007fff000 cr4=0000000000002020
cr8=0000000000000000
STACK: @4051a6 40523e 400f7f 402059 40028f

We should hide and forbid VPID in L1 if it is disabled on L0. However, nested VPID
enable bit is set unconditionally during setup nested vmx exec controls though VPID
is not exposed through nested VMX capablity. This patch fixes it by don't set nested
VPID enable bit if it is disabled on L0.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 5c614b3583e (KVM: nVMX: nested VPID emulation)
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: x86: correct async page present tracepoint
Wanpeng Li [Tue, 21 Mar 2017 04:18:55 +0000 (21:18 -0700)]
KVM: x86: correct async page present tracepoint

After async pf setup successfully, there is a broadcast wakeup w/ special
token 0xffffffff which tells vCPU that it should wake up all processes
waiting for APFs though there is no real process waiting at the moment.

The async page present tracepoint print prematurely and fails to catch the
special token setup. This patch fixes it by moving the async page present
tracepoint after the special token setup.

Before patch:

qemu-system-x86-8499  [006] ...1  5973.473292: kvm_async_pf_ready: token 0x0 gva 0x0

After patch:

qemu-system-x86-8499  [006] ...1  5973.473292: kvm_async_pf_ready: token 0xffffffff gva 0x0

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agokvm: vmx: Flush TLB when the APIC-access address changes
Jim Mattson [Thu, 16 Mar 2017 20:53:59 +0000 (13:53 -0700)]
kvm: vmx: Flush TLB when the APIC-access address changes

Quoting from the Intel SDM, volume 3, section 28.3.3.4: Guidelines for
Use of the INVEPT Instruction:

If EPT was in use on a logical processor at one time with EPTP X, it
is recommended that software use the INVEPT instruction with the
"single-context" INVEPT type and with EPTP X in the INVEPT descriptor
before a VM entry on the same logical processor that enables EPT with
EPTP X and either (a) the "virtualize APIC accesses" VM-execution
control was changed from 0 to 1; or (b) the value of the APIC-access
address was changed.

In the nested case, the burden falls on L1, unless L0 enables EPT in
vmcs02 when L1 doesn't enable EPT in vmcs12.

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: x86: use pic/ioapic destructor when destroy vm
Peter Xu [Wed, 15 Mar 2017 08:01:19 +0000 (16:01 +0800)]
KVM: x86: use pic/ioapic destructor when destroy vm

We have specific destructors for pic/ioapic, we'd better use them when
destroying the VM as well.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: x86: check existance before destroy
Peter Xu [Wed, 15 Mar 2017 08:01:18 +0000 (16:01 +0800)]
KVM: x86: check existance before destroy

Mostly used for split irqchip mode. In that case, these two things are
not inited at all, so no need to release.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: x86: clear bus pointer when destroyed
Peter Xu [Wed, 15 Mar 2017 08:01:17 +0000 (16:01 +0800)]
KVM: x86: clear bus pointer when destroyed

When releasing the bus, let's clear the bus pointers to mark it out. If
any further device unregister happens on this bus, we know that we're
done if we found the bus being released already.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: Documentation: document MCE ioctls
Luiz Capitulino [Mon, 13 Mar 2017 13:08:20 +0000 (09:08 -0400)]
KVM: Documentation: document MCE ioctls

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: nVMX: don't reset kvm mmu twice
Wanpeng Li [Sun, 12 Mar 2017 08:53:52 +0000 (00:53 -0800)]
KVM: nVMX: don't reset kvm mmu twice

kvm mmu is reset once successfully loading CR3 as part of emulating vmentry
in nested_vmx_load_cr3(). We should not reset kvm mmu twice.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoPTP: fix ptr_ret.cocci warnings
kbuild test robot [Fri, 27 Jan 2017 21:01:51 +0000 (05:01 +0800)]
PTP: fix ptr_ret.cocci warnings

drivers/ptp/ptp_kvm.c:229:1-3: WARNING: PTR_ERR_OR_ZERO can be used

 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

CC: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agokvm: fix usage of uninit spinlock in avic_vm_destroy()
Dmitry Vyukov [Tue, 24 Jan 2017 13:06:48 +0000 (14:06 +0100)]
kvm: fix usage of uninit spinlock in avic_vm_destroy()

If avic is not enabled, avic_vm_init() does nothing and returns early.
However, avic_vm_destroy() still tries to destroy what hasn't been created.
The only bad consequence of this now is that avic_vm_destroy() uses
svm_vm_data_hash_lock that hasn't been initialized (and is not meant
to be used at all if avic is not enabled).

Return early from avic_vm_destroy() if avic is not enabled.
It has nothing to destroy.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: kvm@vger.kernel.org
Cc: syzkaller@googlegroups.com
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoKVM: VMX: downgrade warning on unexpected exit code
Radim Krčmář [Fri, 13 Jan 2017 17:59:04 +0000 (18:59 +0100)]
KVM: VMX: downgrade warning on unexpected exit code

We never needed the call trace and we better rate-limit if it can be
triggered by a guest.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
7 years agoLinux 4.11-rc3
Linus Torvalds [Mon, 20 Mar 2017 02:09:39 +0000 (19:09 -0700)]
Linux 4.11-rc3

7 years agomm/swap: don't BUG_ON() due to uninitialized swap slot cache
Linus Torvalds [Mon, 20 Mar 2017 02:00:47 +0000 (19:00 -0700)]
mm/swap: don't BUG_ON() due to uninitialized swap slot cache

This BUG_ON() triggered for me once at shutdown, and I don't see a
reason for the check.  The code correctly checks whether the swap slot
cache is usable or not, so an uninitialized swap slot cache is not
actually problematic afaik.

I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
I'm not sure why that seemingly pointless check was there.  I suspect
the real fix is to just remove it entirely, but for now we'll warn about
it but not bring the machine down.

Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Mon, 20 Mar 2017 01:49:28 +0000 (18:49 -0700)]
Merge tag 'powerpc-4.11-5' of git://git./linux/kernel/git/powerpc/linux

Pull more powerpc fixes from Michael Ellerman:
 "A couple of minor powerpc fixes for 4.11:

   - wire up statx() syscall

   - don't print a warning on memory hotplug when HPT resizing isn't
     available

  Thanks to: David Gibson, Chandan Rajendra"

* tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Don't give a warning when HPT resizing isn't available
  powerpc: Wire up statx() syscall

7 years agoMerge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Mon, 20 Mar 2017 01:11:13 +0000 (18:11 -0700)]
Merge branch 'parisc-4.11-2' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:

 - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
   modules with CONFIG_MODVERSIONS.

 - Dave Anglin optimized the cache flushing for vmap ranges.

 - Arvind Yadav provided a fix for a potential NULL pointer dereference
   in the parisc perf code (and some code cleanups).

 - I wired up the new statx system call, fixed some compiler warnings
   with the access_ok() macro and fixed shutdown code to really halt a
   system at shutdown instead of crashing & rebooting.

* 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix system shutdown halt
  parisc: perf: Fix potential NULL pointer dereference
  parisc: Avoid compiler warnings with access_ok()
  parisc: Wire up statx system call
  parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
  parisc: support R_PARISC_SECREL32 relocation in modules

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds [Mon, 20 Mar 2017 01:06:31 +0000 (18:06 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the changes are in qla2xxx target driver code to address
  various issues found during Cavium/QLogic's internal testing (stable
  CC's included), along with a few other stability and smaller
  miscellaneous improvements.

  There are also a couple of different patch sets from Mike Christie,
  which have been a result of his work to use target-core ALUA logic
  together with tcm-user backend driver.

  Finally, a patch to address some long standing issues with
  pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
  which will make folks using physical (or virtual) magnetic tape happy"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
  qla2xxx: Update driver version to 9.00.00.00-k
  qla2xxx: Fix delayed response to command for loop mode/direct connect.
  qla2xxx: Change scsi host lookup method.
  qla2xxx: Add DebugFS node to display Port Database
  qla2xxx: Use IOCB interface to submit non-critical MBX.
  qla2xxx: Add async new target notification
  qla2xxx: Export DIF stats via debugfs
  qla2xxx: Improve T10-DIF/PI handling in driver.
  qla2xxx: Allow relogin to proceed if remote login did not finish
  qla2xxx: Fix sess_lock & hardware_lock lock order problem.
  qla2xxx: Fix inadequate lock protection for ABTS.
  qla2xxx: Fix request queue corruption.
  qla2xxx: Fix memory leak for abts processing
  qla2xxx: Allow vref count to timeout on vport delete.
  tcmu: Convert cmd_time_out into backend device attribute
  tcmu: make cmd timeout configurable
  tcmu: add helper to check if dev was configured
  target: fix race during implicit transition work flushes
  target: allow userspace to set state to transitioning
  target: fix ALUA transition timeout handling
  ...

7 years agoMerge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Sun, 19 Mar 2017 22:45:02 +0000 (15:45 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull device-dax fixes from Dan Williams:
 "The device-dax driver was not being careful to handle falling back to
  smaller fault-granularity sizes.

  The driver already fails fault attempts that are smaller than the
  device's alignment, but it also needs to handle the cases where a
  larger page mapping could be established. For simplicity of the
  immediate fix the implementation just signals VM_FAULT_FALLBACK until
  fault-size == device-alignment.

  One fix is for -stable to address pmd-to-pte fallback from the
  original implementation, another fix is for the new (introduced in
  4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
  ride.

  These have received a build success notification from the kbuild
  robot"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fix debug output typo
  device-dax: fix pud fault fallback handling
  device-dax: fix pmd/pte fault fallback handling

7 years agoqla2xxx: Update driver version to 9.00.00.00-k
Himanshu Madhani [Wed, 15 Mar 2017 16:48:56 +0000 (09:48 -0700)]
qla2xxx: Update driver version to 9.00.00.00-k

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Fix delayed response to command for loop mode/direct connect.
Quinn Tran [Wed, 15 Mar 2017 16:48:55 +0000 (09:48 -0700)]
qla2xxx: Fix delayed response to command for loop mode/direct connect.

Current driver wait for FW to be in the ready state before
processing in-coming commands. For Arbitrated Loop or
Point-to- Point (not switch), FW Ready state can take a while.
FW will transition to ready state after all Nports have been
logged in. In the mean time, certain initiators have completed
the login and starts IO. Driver needs to start processing all
queues if FW is already started.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Change scsi host lookup method.
Quinn Tran [Wed, 15 Mar 2017 16:48:54 +0000 (09:48 -0700)]
qla2xxx: Change scsi host lookup method.

For target mode, when new scsi command arrive, driver first performs
a look up of the SCSI Host. The current look up method is based on
the ALPA portion of the NPort ID. For Cisco switch, the ALPA can
not be used as the index. Instead, the new search method is based
on the full value of the Nport_ID via btree lib.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Add DebugFS node to display Port Database
Himanshu Madhani [Wed, 15 Mar 2017 16:48:53 +0000 (09:48 -0700)]
qla2xxx: Add DebugFS node to display Port Database

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Use IOCB interface to submit non-critical MBX.
Quinn Tran [Wed, 15 Mar 2017 16:48:52 +0000 (09:48 -0700)]
qla2xxx: Use IOCB interface to submit non-critical MBX.

The Mailbox interface is currently over subscribed. We like
to reserve the Mailbox interface for the chip managment and
link initialization. Any non essential Mailbox command will
be routed through the IOCB interface. The IOCB interface is
able to absorb more commands.

Following commands are being routed through IOCB interface

- Get ID List (007Ch)
- Get Port DB (0064h)
- Get Link Priv Stats (006Dh)

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Add async new target notification
Quinn Tran [Wed, 15 Mar 2017 16:48:51 +0000 (09:48 -0700)]
qla2xxx: Add async new target notification

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Export DIF stats via debugfs
Anil Gurumurthy [Wed, 15 Mar 2017 16:48:50 +0000 (09:48 -0700)]
qla2xxx: Export DIF stats via debugfs

Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Improve T10-DIF/PI handling in driver.
Quinn Tran [Wed, 15 Mar 2017 16:48:49 +0000 (09:48 -0700)]
qla2xxx: Improve T10-DIF/PI handling in driver.

Add routines to support T10 DIF tag.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Allow relogin to proceed if remote login did not finish
Quinn Tran [Wed, 15 Mar 2017 16:48:48 +0000 (09:48 -0700)]
qla2xxx: Allow relogin to proceed if remote login did not finish

If the remote port have started the login process, then the
PLOGI and PRLI should be back to back. Driver will allow
the remote port to complete the process. For the case where
the remote port decide to back off from sending PRLI, this
local port sets an expiration timer for the PRLI. Once the
expiration time passes, the relogin retry logic is allowed
to go through and perform login with the remote port.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Fix sess_lock & hardware_lock lock order problem.
Quinn Tran [Wed, 15 Mar 2017 16:48:47 +0000 (09:48 -0700)]
qla2xxx: Fix sess_lock & hardware_lock lock order problem.

The main lock that needs to be held for CMD or TMR submission
to upper layer is the sess_lock. The sess_lock is used to
serialize cmd submission and session deletion. The addition
of hardware_lock being held is not necessary. This patch removes
hardware_lock dependency from CMD/TMR submission.

Use hardware_lock only for error response in this case.

Path1
       CPU0                    CPU1
       ----                    ----
  lock(&(&ha->tgt.sess_lock)->rlock);
                               lock(&(&ha->hardware_lock)->rlock);
                               lock(&(&ha->tgt.sess_lock)->rlock);
  lock(&(&ha->hardware_lock)->rlock);

Path2/deadlock
*** DEADLOCK ***
Call Trace:
dump_stack+0x85/0xc2
print_circular_bug+0x1e3/0x250
__lock_acquire+0x1425/0x1620
lock_acquire+0xbf/0x210
_raw_spin_lock_irqsave+0x53/0x70
qlt_sess_work_fn+0x21d/0x480 [qla2xxx]
process_one_work+0x1f4/0x6e0

Cc: <stable@vger.kernel.org>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Fix inadequate lock protection for ABTS.
Quinn Tran [Wed, 15 Mar 2017 16:48:46 +0000 (09:48 -0700)]
qla2xxx: Fix inadequate lock protection for ABTS.

Normally, ABTS is sent to Target Core as Task MGMT command.
In the case of error, qla2xxx needs to send response, hardware_lock
is required to prevent request queue corruption.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Fix request queue corruption.
Quinn Tran [Wed, 15 Mar 2017 16:48:45 +0000 (09:48 -0700)]
qla2xxx: Fix request queue corruption.

When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Fix memory leak for abts processing
Quinn Tran [Wed, 15 Mar 2017 16:48:44 +0000 (09:48 -0700)]
qla2xxx: Fix memory leak for abts processing

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agoqla2xxx: Allow vref count to timeout on vport delete.
Joe Carnuccio [Wed, 15 Mar 2017 16:48:43 +0000 (09:48 -0700)]
qla2xxx: Allow vref count to timeout on vport delete.

Cc: <stable@vger.kernel.org>
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
7 years agotcmu: Convert cmd_time_out into backend device attribute
Nicholas Bellinger [Sat, 18 Mar 2017 22:04:13 +0000 (15:04 -0700)]
tcmu: Convert cmd_time_out into backend device attribute

Instead of putting cmd_time_out under ../target/core/user_0/foo/control,
which has historically been used by parameters needed for initial
backend device configuration, go ahead and move cmd_time_out into
a backend device attribute.

In order to do this, tcmu_module_init() has been updated to create
a local struct configfs_attribute **tcmu_attrs, that is based upon
the existing passthrough_attrib_attrs along with the new cmd_time_out
attribute.  Once **tcm_attrs has been setup, go ahead and point
it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core.

Also following MNC's previous change, ->cmd_time_out is stored in
milliseconds but exposed via configfs in seconds.  Also, note this
patch restricts the modification of ->cmd_time_out to before +
after the TCMU device has been configured, but not while it has
active fabric exports.

Cc: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>