The ARM architecture allows the caching of intermediate page table
levels and page table freeing requires a sequence like:
pmd_clear()
TLB invalidation
pte page freeing
With commit
5e5f6dc10546 (arm64: mm: enable HAVE_RCU_TABLE_FREE logic),
the page table freeing batching was moved from tlb_remove_page() to
tlb_remove_table(). The former takes care of TLB invalidation as this is
also shared with pte clearing and page cache page freeing. The latter,
however, does not invalidate the TLBs for intermediate page table levels
as it probably relies on the architecture code to do it if required.
When the mm->mm_users < 2, tlb_remove_table() does not do any batching
and page table pages are freed before tlb_finish_mmu() which performs
the actual TLB invalidation.
This patch introduces __tlb_flush_pgtable() for arm64 and calls it from
the {pte,pmd,pud}_free_tlb() directly without relying on deferred page
table freeing.
Fixes:
5e5f6dc10546 arm64: mm: enable HAVE_RCU_TABLE_FREE logic
Reported-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
Tested-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
unsigned long addr)
{
+ __flush_tlb_pgtable(tlb->mm, addr);
pgtable_page_dtor(pte);
tlb_remove_entry(tlb, pte);
}
static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
unsigned long addr)
{
+ __flush_tlb_pgtable(tlb->mm, addr);
tlb_remove_entry(tlb, virt_to_page(pmdp));
}
#endif
static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
unsigned long addr)
{
+ __flush_tlb_pgtable(tlb->mm, addr);
tlb_remove_entry(tlb, virt_to_page(pudp));
}
#endif
flush_tlb_all();
}
+/*
+ * Used to invalidate the TLB (walk caches) corresponding to intermediate page
+ * table levels (pgd/pud/pmd).
+ */
+static inline void __flush_tlb_pgtable(struct mm_struct *mm,
+ unsigned long uaddr)
+{
+ unsigned long addr = uaddr >> 12 | ((unsigned long)ASID(mm) << 48);
+
+ dsb(ishst);
+ asm("tlbi vae1is, %0" : : "r" (addr));
+ dsb(ish);
+}
/*
* On AArch64, the cache coherency is handled via the set_pte_at() function.
*/