For PIE binaries (ET_DYN), the load address is randomized at PAGE_SIZE
granularity via arch_mmap_rnd(). On arm64 with 64K base pages, this
means the binary is 64K-aligned, but contpte mapping requires 2M
(CONT_PTE_SIZE) alignment.

Without proper virtual address alignment, the readahead patches that
allocate 2M folios with 2M-aligned file offsets and physical addresses
cannot benefit from contpte mapping. The contpte fold check in
contpte_set_ptes() requires the virtual address to be
CONT_PTE_SIZE-aligned, and since the misalignment from vma->vm_start is
constant across all folios in the VMA, no folio gets the contiguous PTE
bit set, resulting in zero iTLB coalescing benefit.

Fix this by bumping the ELF alignment to PAGE_SIZE << exec_folio_order()
when the arch defines a non-zero exec_folio_order(). This ensures
load_bias is aligned to the folio size, so that file-offset-aligned
folios map to properly aligned virtual addresses.

Signed-off-by: Usama Arif
---
 fs/binfmt_elf.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 8e89cc5b28200..2d2b3e9fd474f 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include

 #ifndef ELF_COMPAT
 #define ELF_COMPAT 0
@@ -1106,6 +1107,20 @@ static int load_elf_binary(struct linux_binprm *bprm)
 			/* Calculate any requested alignment. */
 			alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum);

+			/*
+			 * If the arch requested large folios for exec
+			 * memory via exec_folio_order(), ensure the
+			 * binary is mapped with sufficient alignment so
+			 * that virtual addresses of exec pages are
+			 * aligned to the folio boundary. Without this,
+			 * the hardware cannot coalesce PTEs (e.g. arm64
+			 * contpte) even though the physical memory and
+			 * file offset are correctly aligned.
+			 */
+			if (exec_folio_order())
+				alignment = max(alignment,
+					(unsigned long)PAGE_SIZE << exec_folio_order());
+
 			/**
 			 * DOC: PIE handling
 			 *
-- 
2.47.3
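
For illustration only, the alignment arithmetic described above can be sketched in userspace C. This is not kernel code. The `sketch_*` names are hypothetical, and the constants are assumptions chosen to match the arm64 64K-page case in the description: a 64K base page, a 2M CONT_PTE_SIZE, and a folio order of 5 so that 64K << 5 equals 2M. The sketch also assumes the load bias is rounded down to the chosen alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for arm64 with 64K base pages; illustrative only. */
#define SKETCH_PAGE_SIZE     (UINT64_C(64) * 1024)        /* 64K base page */
#define SKETCH_FOLIO_ORDER   5u                           /* assumed: 64K << 5 == 2M */
#define SKETCH_CONT_PTE_SIZE (UINT64_C(2) * 1024 * 1024)  /* 2M contpte block */

/* Mirror of the patch's max(): never go below the exec folio size. */
static uint64_t sketch_exec_alignment(uint64_t elf_align)
{
	uint64_t folio_size = SKETCH_PAGE_SIZE << SKETCH_FOLIO_ORDER;

	return elf_align > folio_size ? elf_align : folio_size;
}

/* Round addr down to a power-of-two alignment. */
static uint64_t sketch_align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* True if vaddr could start a contiguous-PTE block. */
static bool sketch_contpte_aligned(uint64_t vaddr)
{
	return (vaddr & (SKETCH_CONT_PTE_SIZE - 1)) == 0;
}
```

With these assumptions, a randomized bias such as 0xaaaab5d30000 is 64K-aligned but not 2M-aligned. Rounding it down with the bumped alignment gives 0xaaaab5c00000, which passes the contpte alignment check.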