Commit bda420b98505 ("numa balancing: migrate on fault among multiple
bound nodes") added a new flag, MPOL_F_NUMA_BALANCING, to enable NUMA
balancing for the MPOL_BIND memory policy.

When the cpuset of a task changes, the mempolicy of the task is rebound
by mpol_rebind_nodemask(). The intended rebinding behavior for
MPOL_F_NUMA_BALANCING is the same as when neither MPOL_F_STATIC_NODES
nor MPOL_F_RELATIVE_NODES is set. However, that commit breaks it.

struct mempolicy has a union member as below:

	union {
		nodemask_t cpuset_mems_allowed;	/* relative to these nodes */
		nodemask_t user_nodemask;	/* nodemask passed by user */
	} w;

w.cpuset_mems_allowed and w.user_nodemask are both of type nodemask_t;
they differ only in which nodemask is stored.

mpol_set_nodemask() initializes the union as below:

	static int mpol_set_nodemask(...)
	{
		if (mpol_store_user_nodemask(pol))
			pol->w.user_nodemask = *nodes;
		else
			pol->w.cpuset_mems_allowed = cpuset_current_mems_allowed;
	}

mpol_store_user_nodemask() incorrectly returns true for
MPOL_F_NUMA_BALANCING, so the union stores the user-passed nodemask.
Consequently, mpol_rebind_nodemask() ends up rebinding based on the
user-passed nodemask rather than the cpuset_mems_allowed nodemask as
intended.

To fix this, only store the user nodemask if MPOL_F_STATIC_NODES or
MPOL_F_RELATIVE_NODES is present.

Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes")
Reviewed-by: Gregory Price
Signed-off-by: Jinjiang Tu
---
Change since v1:
 * update changelog and comments.
 * collect RB from Gregory.

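
Note for reviewers: the effect of narrowing the mask can be sketched in
userspace as below. This is only an illustration, not kernel code. The
flag values are copied locally and assumed to match
include/uapi/linux/mempolicy.h, and store_user_nodemask_old() and
store_user_nodemask_new() are hypothetical stand-ins for
mpol_store_user_nodemask() before and after this patch.

```c
#include <assert.h>

/* Local copies for illustration; assumed to match the uapi header. */
#define MPOL_F_STATIC_NODES	(1 << 15)
#define MPOL_F_RELATIVE_NODES	(1 << 14)
#define MPOL_F_NUMA_BALANCING	(1 << 13)

#define MPOL_MODE_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)
#define MPOL_USER_NODEMASK_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)

/* Hypothetical stand-in for the check before the patch. */
int store_user_nodemask_old(unsigned int flags)
{
	return (flags & MPOL_MODE_FLAGS) != 0;
}

/* Hypothetical stand-in for the check after the patch. */
int store_user_nodemask_new(unsigned int flags)
{
	return (flags & MPOL_USER_NODEMASK_FLAGS) != 0;
}

/* Checks the one case that changes and two cases that must not. */
void check_user_nodemask_flags(void)
{
	/* MPOL_F_NUMA_BALANCING alone: old check selects user_nodemask. */
	assert(store_user_nodemask_old(MPOL_F_NUMA_BALANCING) == 1);
	/* New check falls through to cpuset_mems_allowed instead. */
	assert(store_user_nodemask_new(MPOL_F_NUMA_BALANCING) == 0);

	/* Static and relative policies behave the same either way. */
	assert(store_user_nodemask_old(MPOL_F_STATIC_NODES) == 1);
	assert(store_user_nodemask_new(MPOL_F_STATIC_NODES) == 1);
	assert(store_user_nodemask_new(MPOL_F_RELATIVE_NODES) == 1);
}
```

Only the MPOL_F_NUMA_BALANCING-only case changes, which matches the
intent described in the changelog above.
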
 include/uapi/linux/mempolicy.h | 3 +++
 mm/mempolicy.c                 | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 8fbbe613611a..6c962d866e86 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -39,6 +39,9 @@ enum {
 #define MPOL_MODE_FLAGS							\
 	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)
 
+/* Whether the nodemask is specified by users */
+#define MPOL_USER_NODEMASK_FLAGS (MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)
+
 /* Flags for get_mempolicy */
 #define MPOL_F_NODE	(1<<0)	/* return next IL mode instead of node mask */
 #define MPOL_F_ADDR	(1<<1)	/* look up vma using address */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 68a98ba57882..76da50425712 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -365,7 +365,7 @@ static const struct mempolicy_operations {
 
 static inline int mpol_store_user_nodemask(const struct mempolicy *pol)
 {
-	return pol->flags & MPOL_MODE_FLAGS;
+	return pol->flags & MPOL_USER_NODEMASK_FLAGS;
 }
 
 static void mpol_relative_nodemask(nodemask_t *ret, const nodemask_t *orig,
-- 
2.43.0