File xsa254-x86-reduce-Meltdown-IPI-overhead.patch of Package xen.6649
If we can detect single-threaded guest processes (by checking
whether we can account for all root page table uses locally on the vCPU
that's running), there's no point in issuing a sync IPI upon an L4 entry
update, as no other vCPU of the guest will have that page table loaded.
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
This will apply cleanly only on top of all of the previously posted
follow-ups to the Meltdown band-aid, but it wouldn't be difficult to
move it ahead of some or all of them.
On my test system, this improves kernel build times by only 0.5...1%, but
the effect may well be bigger on larger systems. Of course, no improvement
is expected for heavily multi-threaded guests/processes.
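
The accounting described above can be sketched in isolation. This is an
illustrative stand-alone model only, not Xen code: the demo_* names, the
bit layout of type_info, and the struct are all hypothetical stand-ins
for PGT_pinned, PGT_count_mask and the vCPU's guest_table fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout; Xen's real PGT_* masks differ. */
#define DEMO_PINNED      (1u << 31)
#define DEMO_COUNT_MASK  ((1u << 31) - 1)

/* Hypothetical stand-in for the root table fields of the current vCPU. */
struct demo_vcpu {
    uint64_t guest_table;       /* MFN of the kernel-mode root table */
    uint64_t guest_table_user;  /* MFN of the user-mode root table */
};

/*
 * Return true if at least one type reference on the L4 page at `mfn`
 * cannot be attributed to the page lock held by the caller, the page's
 * pinned status, or the current vCPU's own root tables. Only in that
 * case might another vCPU have the table loaded, so a sync is needed.
 */
static bool demo_need_sync(uint32_t type_info, uint64_t mfn,
                           const struct demo_vcpu *curr)
{
    unsigned int local = 1 /* the page lock held during the update */
                         + !!(type_info & DEMO_PINNED)
                         + (curr->guest_table == mfn)
                         + (curr->guest_table_user == mfn);

    return (type_info & DEMO_COUNT_MASK) > local;
}
```

With a pinned page whose count is 3 and which is the current vCPU's
kernel root table, every reference is accounted for locally and the
sketch reports that no sync is needed; one more reference flips the
result.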
Index: xen-4.7.4-testing/xen/arch/x86/mm.c
===================================================================
--- xen-4.7.4-testing.orig/xen/arch/x86/mm.c
+++ xen-4.7.4-testing/xen/arch/x86/mm.c
@@ -4006,8 +4006,18 @@ long do_mmu_update(
                 case PGT_l4_page_table:
                     rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
                                       cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
-                    if ( !rc )
-                        sync_guest = !!this_cpu(root_pgt);
+                    /*
+                     * No need to sync if all uses of the page can be accounted
+                     * to the lock we hold, its pinned status, and uses on this
+                     * (v)CPU.
+                     */
+                    if ( !rc &&
+                         ((page->u.inuse.type_info & PGT_count_mask) >
+                          (1 + !!(page->u.inuse.type_info & PGT_pinned) +
+                           (pagetable_get_pfn(curr->arch.guest_table) == mfn) +
+                           (pagetable_get_pfn(curr->arch.guest_table_user) ==
+                            mfn))) )
+                        sync_guest = 1;
                     break;
                 case PGT_writable_page:
                     perfc_incr(writable_mmu_updates);