Although the possible race described in
commit
85b7059169e128c57a3a8a3e588fb89cb2031da1
KVM: MMU: fix shrinking page from the empty mmu
was correct, the real cause of that issue was a more trivial bug of
mmu_shrink() introduced by
commit
1952639665e92481c34c34c3e2a71bf3e66ba362
KVM: MMU: do not iterate over all VMs in mmu_shrink()
Here is the bug:
if (kvm->arch.n_used_mmu_pages > 0) {
if (!nr_to_scan--)
break;
continue;
}
We skip VMs whose n_used_mmu_pages is not zero and try to shrink others:
in other words we try to shrink empty ones by mistake.
This patch reverses the logic so that mmu_shrink() can free pages from
the first VM whose n_used_mmu_pages is not zero. Note that we also add
comments explaining the role of nr_to_scan which is not practically
important now, hoping this will be improved in the future.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
int idx;
LIST_HEAD(invalid_list);
+ /*
+ * Never scan more than sc->nr_to_scan VM instances.
+ * Will not hit this condition practically since we do not try
+ * to shrink more than one VM and it is very unlikely to see
+ * !n_used_mmu_pages so many times.
+ */
+ if (!nr_to_scan--)
+ break;
/*
* n_used_mmu_pages is accessed without holding kvm->mmu_lock
* here. We may skip a VM instance errorneosly, but we do not
* want to shrink a VM that only started to populate its MMU
* anyway.
*/
- if (kvm->arch.n_used_mmu_pages > 0) {
- if (!nr_to_scan--)
- break;
+ if (!kvm->arch.n_used_mmu_pages)
continue;
- }
idx = srcu_read_lock(&kvm->srcu);
spin_lock(&kvm->mmu_lock);