When performing a mount-beneath operation the target mount can often be
locked:

    unshare(CLONE_NEWUSER | CLONE_NEWNS);
    mount --beneath -t tmpfs tmpfs /proc

will fail because the procfs mount on /proc became locked when the mount
namespace was created from the parent mount namespace. The same logic
applies to:

    unshare(CLONE_NEWUSER | CLONE_NEWNS);
    mount --beneath -t tmpfs tmpfs /

MNT_LOCKED is raised to prevent an unprivileged mount namespace from
revealing whatever is under a given mount. To replace the rootfs we need
to handle that case though.

We can simply transfer the locked mount property from the top mount to
the mount beneath. The new mount we mounted beneath the top mount takes
over the job of the top mount in protecting the parent mount from being
revealed. This leaves us free to allow the top mount to be unmounted.

This also works during mount propagation and for the
non-MOVE_MOUNT_BENEATH case:

(1) move_mount(MOVE_MOUNT_BENEATH): @source_mnt->overmount always NULL
(2) move_mount():                   @source_mnt->overmount maybe !NULL

For (1) can_move_mount_beneath() rejects an overmounted @source_mnt. We
could allow this, but it's not really a use-case and it's fugly to move
an overmounted mount stack around. What are you even doing? So let's
keep that restriction.

For (2) we can have @source_mnt overmounted (someone overmounted us
while we locked the target mount).

Both are fine. @source_mnt will be mounted on whatever @q was mounted on
and @q will be mounted on the top of the @source_mnt mount stack. Even
in such cases we can unlock @q and lock @source_mnt if @q was locked.

This effectively makes mount propagation useful in cases where a mount
namespace has a locked mount somewhere and we propagate a new mount
beneath it. Until now the mount namespace could never get at the new
mount because the old top mount remained locked. Again, we just let the
newly propagated mount take over the protection and unlock the top
mount.

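For illustration only, the handoff can be modelled in a few lines of
userspace C. This is a toy sketch, not kernel code: struct toy_mount,
TOY_MNT_LOCKED and the toy_*() helpers are hypothetical names made up
for this example, and the model ignores propagation, overmounts and
locking entirely. It only shows that after mounting beneath a locked
top mount the lock sits on the new mount and the old top mount becomes
unmountable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit for this toy model; not taken from kernel headers. */
#define TOY_MNT_LOCKED 0x1u

/* Toy stand-in for a mount: only the flags and the parent link. */
struct toy_mount {
	unsigned int flags;
	struct toy_mount *parent;
};

static bool toy_is_locked(const struct toy_mount *m)
{
	return (m->flags & TOY_MNT_LOCKED) != 0;
}

/*
 * Slot @child between @top and @top's current parent. If @top was
 * locked, @child takes over the job of hiding the parent and @top is
 * unlocked.
 */
static void toy_mount_beneath(struct toy_mount *child, struct toy_mount *top)
{
	assert(child != NULL && top != NULL);

	child->parent = top->parent;
	top->parent = child;

	if (toy_is_locked(top)) {
		child->flags |= TOY_MNT_LOCKED;
		top->flags &= ~TOY_MNT_LOCKED;
	}
}

/* In this model a locked mount is the only thing that cannot be unmounted. */
static bool toy_can_umount(const struct toy_mount *m)
{
	return !toy_is_locked(m);
}
```

In this model a locked /proc mount with a tmpfs mounted beneath it ends
up unlocked, while the tmpfs carries the lock and keeps the parent
hidden.
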
Signed-off-by: Christian Brauner
---
 fs/namespace.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index ebe19ded293a..cdde6c6a30ee 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2636,6 +2636,19 @@ static int attach_recursive_mnt(struct mount *source_mnt,
 
 			if (unlikely(shorter) && child != source_mnt)
 				mp = shorter;
+			/*
+			 * If @q was locked it was meant to hide
+			 * whatever was under it. Let @child take over
+			 * that job and lock it, then we can unlock @q.
+			 * That'll allow another namespace to shed @q
+			 * and reveal @child. Clearly, that mounter
+			 * consented to this by not severing the mount
+			 * relationship. Otherwise, what's the point.
+			 */
+			if (IS_MNT_LOCKED(q)) {
+				child->mnt.mnt_flags |= MNT_LOCKED;
+				q->mnt.mnt_flags &= ~MNT_LOCKED;
+			}
 			mnt_change_mountpoint(r, mp, q);
 		}
 	}
@@ -3529,9 +3542,6 @@ static int can_move_mount_beneath(const struct mount *mnt_from,
 {
 	struct mount *parent_mnt_to = mnt_to->mnt_parent;
 
-	if (IS_MNT_LOCKED(mnt_to))
-		return -EINVAL;
-
 	/* Avoid creating shadow mounts during mount propagation. */
 	if (mnt_from->overmount)
 		return -EINVAL;
-- 
2.47.3