This patch introduces two new fields to /proc/[pid]/status to display the
set of CPUs, representing the CPU affinity of the process's active
memory context, in both mask and list format: "Cpus_active_mm" and
"Cpus_active_mm_list". The mm_cpumask is primarily used for TLB and
cache synchronisation.

Exposing this information allows userspace to easily describe the
relationship between CPUs where a memory descriptor is "active" and the
CPUs where the thread is allowed to execute. The primary intent is to
provide visibility into the "memory footprint" across CPUs, which is
invaluable for debugging performance issues related to IPI storms and
TLB shootdowns in large-scale NUMA systems. The CPU-affinity sets the
boundary; the mm_cpumask records the arrival; they complement each
other.

Frequent mm_cpumask changes may indicate instability in placement
policies or excessive task migration overhead.

Signed-off-by: Aaron Tomlin
---
 Documentation/filesystems/proc.rst |  3 +++
 fs/proc/array.c                    | 22 +++++++++++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 8256e857e2d7..c92e95e28047 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -291,6 +291,9 @@ It's slow but very precise.
  SpeculationIndirectBranch   indirect branch speculation mode
  Cpus_allowed                mask of CPUs on which this process may run
  Cpus_allowed_list           Same as previous, but in "list format"
+ Cpus_active_mm              mask of CPUs on which this process has an active
+                             memory context
+ Cpus_active_mm_list         Same as previous, but in "list format"
  Mems_allowed                mask of memory nodes allowed to this process
  Mems_allowed_list           Same as previous, but in "list format"
  voluntary_ctxt_switches     number of voluntary context switches
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 42932f88141a..8887c5e38e51 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -409,6 +409,23 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
 		   cpumask_pr_args(&task->cpus_mask));
 }
 
+/**
+ * task_cpus_active_mm - Show the mm_cpumask for a process
+ * @m: The seq_file structure for the /proc/PID/status output
+ * @mm: The memory descriptor of the process
+ *
+ * Prints the set of CPUs, representing the CPU affinity of the process's
+ * active memory context, in both mask and list format. This mask is
+ * primarily used for TLB and cache synchronisation.
+ */
+static void task_cpus_active_mm(struct seq_file *m, struct mm_struct *mm)
+{
+	seq_printf(m, "Cpus_active_mm:\t%*pb\n",
+		   cpumask_pr_args(mm_cpumask(mm)));
+	seq_printf(m, "Cpus_active_mm_list:\t%*pbl\n",
+		   cpumask_pr_args(mm_cpumask(mm)));
+}
+
 static inline void task_core_dumping(struct seq_file *m, struct task_struct *task)
 {
 	seq_put_decimal_ull(m, "CoreDumping:\t", !!task->signal->core_state);
@@ -450,12 +467,15 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 		task_core_dumping(m, task);
 		task_thp_status(m, mm);
 		task_untag_mask(m, mm);
-		mmput(mm);
 	}
 	task_sig(m, task);
 	task_cap(m, task);
 	task_seccomp(m, task);
 	task_cpus_allowed(m, task);
+	if (mm) {
+		task_cpus_active_mm(m, mm);
+		mmput(mm);
+	}
 	cpuset_task_status_allowed(m, task);
 	task_context_switch_counts(m, task);
 	arch_proc_pid_thread_features(m, task);
-- 
2.51.0
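
Usage sketch (not part of the patch): the commit message describes comparing the CPUs where the mm is active with the CPUs the task may run on. The userspace helpers below illustrate one way a tool might do that from the two "list format" fields. Everything here is an assumption for illustration: the names read_status_field(), parse_cpulist(), count_outside() and the MAX_CPUS bound are hypothetical, and "Cpus_active_mm_list" is only present on a kernel carrying this patch. How to interpret a non-empty difference depends on how the architecture maintains mm_cpumask.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 1024	/* illustrative bound, not taken from the kernel */

struct cpuset_bits {
	unsigned char bit[MAX_CPUS];	/* one byte per CPU, for simplicity */
};

/* Copy the value of "key:" from a status-style file into out.
 * Returns 0 if the key was found, -1 otherwise. */
static int read_status_field(const char *path, const char *key,
			     char *out, size_t outlen)
{
	char line[4096];
	size_t klen = strlen(key);
	int ret = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			const char *v = line + klen + 1;

			while (*v == '\t' || *v == ' ')
				v++;
			snprintf(out, outlen, "%s", v);
			out[strcspn(out, "\n")] = '\0';
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}

/* Parse a CPU list such as "0-3,8" (the form printed by %*pbl).
 * Returns 0 on success, -1 on malformed input. */
static int parse_cpulist(const char *s, struct cpuset_bits *set)
{
	assert(s && set);
	memset(set, 0, sizeof(*set));
	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return -1;
		if (*end == '-') {
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= MAX_CPUS)
			return -1;
		for (long c = lo; c <= hi; c++)
			set->bit[c] = 1;
		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return 0;
}

/* Count CPUs present in 'active' but absent from 'allowed'. */
static int count_outside(const struct cpuset_bits *active,
			 const struct cpuset_bits *allowed)
{
	int n = 0;

	for (int c = 0; c < MAX_CPUS; c++)
		if (active->bit[c] && !allowed->bit[c])
			n++;
	return n;
}
```

A caller would read "Cpus_active_mm_list" and "Cpus_allowed_list" from /proc/<pid>/status with read_status_field(), parse both with parse_cpulist(), and pass the results to count_outside(). On a kernel without this patch the first lookup returns -1, which the caller should treat as "field not available" rather than as an empty set.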