From: Mathias Krause

The x86-64 implementation of setup_mmu() doesn't initialize 'vfree_top'
and leaves it at its zero-value. This isn't wrong per se, however, it
leads to odd configurations when the first vmalloc/vmap page gets
allocated. It'll be the very last page in the virtual address space --
which is an interesting corner case -- but its boundary will probably
wrap. It does so, for CET's shadow stack, at least, which loads the
shadow stack pointer with the base address of the mapped page plus its
size, i.e. 0xffffffff_fffff000 + 4096, which wraps to 0x0.

The CPU seems to handle such configurations just fine. However, it feels
odd to set the shadow stack pointer to "NULL".

To avoid the wrapping, ignore the top most page by initializing
'vfree_top' to just one page below.

Reviewed-by: Chao Gao
Signed-off-by: Mathias Krause
Signed-off-by: Sean Christopherson
---
 lib/x86/vm.c |  2 ++
 x86/lam.c    | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 90f73fbb..27e7bb40 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -191,6 +191,8 @@ void *setup_mmu(phys_addr_t end_of_memory, void *opt_mask)
 		end_of_memory = (1ul << 32);  /* map mmio 1:1 */
 
 	setup_mmu_range(cr3, 0, end_of_memory);
+	/* skip the last page for out-of-bound and wrap-around reasons */
+	init_alloc_vpage((void *)(~(PAGE_SIZE - 1)));
 #else
 	setup_mmu_range(cr3, 0, (2ul << 30));
 	setup_mmu_range(cr3, 3ul << 30, (1ul << 30));
diff --git a/x86/lam.c b/x86/lam.c
index 1af6c5fd..87efc5dd 100644
--- a/x86/lam.c
+++ b/x86/lam.c
@@ -197,11 +197,11 @@ static void test_lam_sup(void)
 	int vector;
 
 	/*
-	 * KUT initializes vfree_top to 0 for X86_64, and each virtual address
-	 * allocation decreases the size from vfree_top. It's guaranteed that
-	 * the return value of alloc_vpage() is considered as kernel mode
-	 * address and canonical since only a small amount of virtual address
-	 * range is allocated in this test.
+	 * KUT initializes vfree_top to -PAGE_SIZE for X86_64, and each virtual
+	 * address allocation decreases the size from vfree_top. It's
+	 * guaranteed that the return value of alloc_vpage() is considered as
+	 * kernel mode address and canonical since only a small amount of
+	 * virtual address range is allocated in this test.
 	 */
 	vaddr = alloc_vpage();
 	vaddr_mmio = alloc_vpage();
-- 
2.52.0.rc1.455.g30608eb744-goog
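
The wrap-around described in the commit message can be reproduced in
plain user-space C. The following is only a sketch under stated
assumptions: 4 KiB pages, 64-bit addresses, and a top-down allocator that
lowers 'vfree_top' by one page per allocation, as the lam.c comment
describes. model_alloc_vpage(), page_end() and demo() are hypothetical
names for illustration and are not part of kvm-unit-tests.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 4 KiB pages (named MODEL_PAGE_SIZE to avoid clashing with
 * any real PAGE_SIZE definition). */
#define MODEL_PAGE_SIZE UINT64_C(4096)

/* Hypothetical model of a top-down page allocator: lower the top by one
 * page and return the new top as the base of the allocated page. */
static uint64_t model_alloc_vpage(uint64_t *vfree_top)
{
	*vfree_top -= MODEL_PAGE_SIZE;
	return *vfree_top;
}

/* One-past-the-end address of a page; unsigned arithmetic wraps mod 2^64. */
static uint64_t page_end(uint64_t base)
{
	return base + MODEL_PAGE_SIZE;
}

/* Compare the first allocation with vfree_top = 0 (before the patch)
 * against vfree_top = ~(PAGE_SIZE - 1) (after the patch). */
static void demo(void)
{
	uint64_t old_top = 0;
	uint64_t new_top = ~(MODEL_PAGE_SIZE - 1);
	uint64_t old_page = model_alloc_vpage(&old_top);
	uint64_t new_page = model_alloc_vpage(&new_top);

	/* Before: the last page of the address space, whose end wraps to 0. */
	assert(old_page == UINT64_C(0xfffffffffffff000));
	assert(page_end(old_page) == 0);

	/* After: one page lower, whose end is still a non-zero address. */
	assert(new_page == UINT64_C(0xffffffffffffe000));
	assert(page_end(new_page) == UINT64_C(0xfffffffffffff000));

	printf("old: base=0x%016" PRIx64 " end=0x%016" PRIx64 "\n",
	       old_page, page_end(old_page));
	printf("new: base=0x%016" PRIx64 " end=0x%016" PRIx64 "\n",
	       new_page, page_end(new_page));
}
```

Calling demo() from a main() shows the difference: with a zero
'vfree_top' the end of the first page is 0x0, which is the value the
shadow stack pointer would be loaded with, while starting one page below
the top keeps the end address at 0xffffffff_fffff000.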