When a mempolicy is rebound, either because the process moves to a
different cpuset context or because the set of nodes allowed by the
current cpuset context changes, mpol_rebind_nodemask() by default remaps
the nodemask according to the old and new cpuset_mems_allowed. To make
this possible, mempolicy.w.cpuset_mems_allowed stores the old nodemask
allowed by the cpuset.

MPOL_F_STATIC_NODES suppresses the node remap and instead intersects the
nodemask passed by the user with the nodes allowed by the new cpuset
context. For MPOL_F_RELATIVE_NODES, the nodemask passed by the user is
interpreted as node IDs relative to the set of node IDs allowed by the
process's current cpuset. In both cases, mempolicy.w.user_nodemask stores
the nodemask passed by the user.

Commit bda420b98505 ("numa balancing: migrate on fault among multiple
bound nodes") added a new flag, MPOL_F_NUMA_BALANCING, to enable NUMA
balancing for MPOL_BIND. Rebinding with this flag should behave the same
as the default behaviour. However, mpol_store_user_nodemask() returns
true for MPOL_F_NUMA_BALANCING. As a result, the nodemask passed by the
user is stored in place of cpuset_current_mems_allowed (user_nodemask and
cpuset_mems_allowed share the union mempolicy.w), and
mpol_rebind_nodemask() remaps the nodes incorrectly.

Fix this by making mpol_store_user_nodemask() test only
MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES, via the new
MPOL_USER_NODEMASK_FLAGS mask.

Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes")
Signed-off-by: Jinjiang Tu
---
 include/uapi/linux/mempolicy.h | 6 ++++++
 mm/mempolicy.c                 | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 8fbbe613611a..1802b6c89603 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -39,6 +39,12 @@ enum {
 #define MPOL_MODE_FLAGS							\
 	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)
 
+/*
+ * MPOL_USER_NODEMASK_FLAGS is used to determine if nodemask passed by
+ * users should be used in mpol_rebind_nodemask().
+ */
+#define MPOL_USER_NODEMASK_FLAGS	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)
+
 /* Flags for get_mempolicy */
 #define MPOL_F_NODE	(1<<0)	/* return next IL mode instead of node mask */
 #define MPOL_F_ADDR	(1<<1)	/* look up vma using address */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 68a98ba57882..76da50425712 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -365,7 +365,7 @@ static const struct mempolicy_operations {
 
 static inline int mpol_store_user_nodemask(const struct mempolicy *pol)
 {
-	return pol->flags & MPOL_MODE_FLAGS;
+	return pol->flags & MPOL_USER_NODEMASK_FLAGS;
 }
 
 static void mpol_relative_nodemask(nodemask_t *ret, const nodemask_t *orig,
-- 
2.43.0
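
For illustration only, here is a minimal userspace sketch of the effect described in the changelog. It is not kernel code. The flag values are assumed to match include/uapi/linux/mempolicy.h and should be checked against the tree in use. struct toy_policy, toy_store() and toy_self_check() are hypothetical names, and a single unsigned long stands in for nodemask_t. The sketch assumes that the kernel stores the user nodemask when mpol_store_user_nodemask() returns true and the cpuset nodemask otherwise, as the changelog implies.

```c
#include <assert.h>

/* Assumed to mirror include/uapi/linux/mempolicy.h; verify against your tree. */
#define MPOL_F_STATIC_NODES	(1 << 15)
#define MPOL_F_RELATIVE_NODES	(1 << 14)
#define MPOL_F_NUMA_BALANCING	(1 << 13)

#define MPOL_MODE_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)
#define MPOL_USER_NODEMASK_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)

/* Hypothetical toy policy: one word stands in for nodemask_t. */
struct toy_policy {
	unsigned short flags;
	union {
		unsigned long cpuset_mems_allowed;	/* default rebind */
		unsigned long user_nodemask;		/* static/relative nodes */
	} w;
};

/* Predicate before the patch: any mode flag selects the user nodemask. */
static int store_user_nodemask_old(const struct toy_policy *pol)
{
	return pol->flags & MPOL_MODE_FLAGS;
}

/* Predicate after the patch: only static/relative nodes select it. */
static int store_user_nodemask_new(const struct toy_policy *pol)
{
	return pol->flags & MPOL_USER_NODEMASK_FLAGS;
}

/* Toy store step: record either the user mask or the cpuset mask. */
static void toy_store(struct toy_policy *pol, unsigned long user_nodes,
		      unsigned long cpuset_nodes,
		      int (*store_user)(const struct toy_policy *))
{
	if (store_user(pol))
		pol->w.user_nodemask = user_nodes;
	else
		pol->w.cpuset_mems_allowed = cpuset_nodes;
}

/* Shows which mask ends up in the union for a NUMA-balancing policy. */
static void toy_self_check(void)
{
	struct toy_policy pol = { .flags = MPOL_F_NUMA_BALANCING };

	/* Old predicate: the user mask (0x2) overwrites the cpuset slot. */
	toy_store(&pol, 0x2, 0xf, store_user_nodemask_old);
	assert(pol.w.cpuset_mems_allowed == 0x2);

	/* New predicate: the cpuset mask (0xf) is kept, as in the default case. */
	toy_store(&pol, 0x2, 0xf, store_user_nodemask_new);
	assert(pol.w.cpuset_mems_allowed == 0xf);
}
```

Calling toy_self_check() from a small main() exercises both predicates. Because the two fields share one union, the old predicate leaves the user's mask where the rebind path expects the old cpuset mask.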