For PIE binaries (ET_DYN), the load address is randomized at PAGE_SIZE
granularity via arch_mmap_rnd(). On arm64 with 64K base pages, this
means the binary is 64K-aligned, but contpte mapping requires 2M
(CONT_PTE_SIZE) alignment. Without proper virtual address alignment,
readahead patches that allocate large folios with aligned file offsets
and physical addresses cannot benefit from contpte mapping, as the
contpte fold check in contpte_set_ptes() requires the virtual address
to be CONT_PTE_SIZE-aligned.

Fix this by extending maximum_alignment() to consider folio alignment
at two tiers, matching the readahead allocation strategy:

- HPAGE_PMD_SIZE, so large folios can be PMD-mapped on architectures
  where PMD_SIZE is reasonable (e.g. 2M on x86-64 and arm64 with 4K
  pages).

- exec_folio_order(), the minimum order for hardware TLB coalescing
  (e.g. arm64 contpte/HPA).

For each PT_LOAD segment, folio_alignment() tries both tiers and
returns the largest power-of-2 alignment that fits within the segment
size, with both p_vaddr and p_offset aligned to that size. This
ensures load_bias is folio-aligned so that file-offset-aligned folios
map to properly aligned virtual addresses, enabling hardware PTE
coalescing and PMD mappings for large folios.

The segment size check in folio_alignment() avoids reducing ASLR
entropy for small binaries that cannot benefit from large folio
alignment.

Signed-off-by: Usama Arif
---
 fs/binfmt_elf.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 16a56b6b3f6c..f84fae6daf23 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -488,6 +488,54 @@ static int elf_read(struct file *file, void *buf, size_t len, loff_t pos)
 	return 0;
 }
 
+/*
+ * Return the largest folio alignment for a PT_LOAD segment, so the
+ * hardware can coalesce PTEs (e.g. arm64 contpte) or use PMD mappings
+ * for large folios.
+ *
+ * Try PMD alignment so large folios can be PMD-mapped. Then try
+ * exec_folio_order() alignment for hardware TLB coalescing (e.g.
+ * arm64 contpte/HPA).
+ *
+ * Use the largest power-of-2 that fits within the segment size, capped
+ * by the target folio size.
+ * Only align when the segment's virtual address and file offset are
+ * already aligned to that size, as misalignment would prevent
+ * coalescing anyway.
+ *
+ * The segment size check avoids reducing ASLR entropy for small
+ * binaries that cannot benefit.
+ */
+static unsigned long folio_alignment(struct elf_phdr *cmd)
+{
+	unsigned long alignment = 0;
+	unsigned long seg_size;
+
+	if (!cmd->p_filesz)
+		return 0;
+
+	seg_size = rounddown_pow_of_two(cmd->p_filesz);
+
+	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+		unsigned long size = min(seg_size, HPAGE_PMD_SIZE);
+
+		if (size > PAGE_SIZE &&
+		    IS_ALIGNED(cmd->p_vaddr | cmd->p_offset, size))
+			alignment = size;
+	}
+
+	if (!alignment && exec_folio_order()) {
+		unsigned long size = min(seg_size,
+					 PAGE_SIZE << exec_folio_order());
+
+		if (size > PAGE_SIZE &&
+		    IS_ALIGNED(cmd->p_vaddr | cmd->p_offset, size))
+			alignment = size;
+	}
+
+	return alignment;
+}
+
 static unsigned long maximum_alignment(struct elf_phdr *cmds, int nr)
 {
 	unsigned long alignment = 0;
@@ -501,6 +549,8 @@ static unsigned long maximum_alignment(struct elf_phdr *cmds, int nr)
 			if (!is_power_of_2(p_align))
 				continue;
 			alignment = max(alignment, p_align);
+			alignment = max(alignment,
+					folio_alignment(&cmds[i]));
 		}
 	}
-- 
2.52.0