Disconnected files or directories can appear when they are visible and opened from a bind mount, but have been renamed or moved from the source of the bind mount in a way that makes them inaccessible from the mount point (i.e. out of scope). Previously, access rights tied to files or directories opened through a disconnected directory were collected by walking the related hierarchy down to the root of the filesystem, without taking into account the mount point because it couldn't be found. This could lead to inconsistent access results, potential access right widening, and hard-to-debug renames, especially since such paths cannot be printed. For a sandboxed task to create a disconnected directory, it needs to have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to the underlying source of the bind mount, and read access to the related mount point. Because a sandboxed task cannot acquire more access rights than those defined by its Landlock domain, this could lead to inconsistent access rights due to missing permissions that should be inherited from the mount point hierarchy, while inheriting permissions from the filesystem hierarchy hidden by this mount point instead. Landlock now handles files and directories opened from disconnected directories by taking into account the filesystem hierarchy when the mount point is not found in the hierarchy walk, and also always taking into account the mount point from which these disconnected directories were opened. This ensures that a rename is not allowed if it would widen access rights [1]. The rationale is that, even if disconnected hierarchies might not be visible or accessible to a sandboxed task, relying on the collected access rights from them improves the guarantee that access rights will not be widened during a rename because of the access right comparison between the source and the destination (see LANDLOCK_ACCESS_FS_REFER). It may look like this would grant more access on disconnected files and directories, but the security policies are always enforced for all the evaluated hierarchies. This new behavior should be less surprising to users and safer from an access control perspective. Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and fix the related comment. Because opened files have their access rights stored in the related file security properties, there is no impact for disconnected or unlinked files. Cc: Christian Brauner Cc: Günther Noack Cc: Song Liu Reported-by: Tingmao Wang Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org Closes: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control") Link: https://lore.kernel.org/r/b0f46246-f2c5-42ca-93ce-0d629702a987@maowtm.org [1] Reviewed-by: Tingmao Wang Signed-off-by: Mickaël Salaün --- Changes since v4: - Improve comment. - Add Reviewed-by. Changes since v3: - Simplify the approach by checking both the filesystem hierarchy and the mount point hierarchy for disconnected directories. Remove the related snapshot mechanism. See [1]. - Update erratum. - Do not expose path_connected() anymore, and remove related Acked-by. Changes since v2: - Fix domain check mode reset, spotted by Tingmao. Replace a conditional branch (when all accesses are granted) by only looking for a rule when needed. This simplifies code and remove a call to path_connected(). Add a new is_dom_check_bkp to safely handle race conditions and move initial backup for layer_masks_parent1. - Fix layer masks parent backup initialization, spotted by Tingmao. - Add more comments, suggested by Tingmao. - Reformat erratum. Changes since v1: - Remove useless else branch in is_access_to_paths_allowed(). - Update commit message and squash "landlock: Remove warning in collect_domain_accesses()": https://lore.kernel.org/r/20250618134734.1673254-1-mic@digikod.net - Remove "extern" for path_connected() in fs.h, requested by Christian. - Add Acked-by Christian. - Fix docstring and improve doc for collect_domain_accesses(). - Remove path_connected() check for the real root. - Fix allowed_parent* resets to be consistent with their initial values infered from the evaluated domain. - Add Landlock erratum. --- security/landlock/errata/abi-1.h | 16 +++++++++++++ security/landlock/fs.c | 40 ++++++++++++++++++++++---------- 2 files changed, 44 insertions(+), 12 deletions(-) create mode 100644 security/landlock/errata/abi-1.h diff --git a/security/landlock/errata/abi-1.h b/security/landlock/errata/abi-1.h new file mode 100644 index 000000000000..e8a2bff2e5b6 --- /dev/null +++ b/security/landlock/errata/abi-1.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +/** + * DOC: erratum_3 + * + * Erratum 3: Disconnected directory handling + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * + * This fix addresses an issue with disconnected directories that occur when a + * directory is moved outside the scope of a bind mount. The change ensures + * that evaluated access rights include both those from the disconnected file + * hierarchy down to its filesystem root and those from the related mount point + * hierarchy. This prevents access right widening through rename or link + * actions. + */ +LANDLOCK_ERRATUM(3) diff --git a/security/landlock/fs.c b/security/landlock/fs.c index d9c12b993fa7..a2ed0e76938a 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -909,21 +909,31 @@ static bool is_access_to_paths_allowed( break; } } + if (unlikely(IS_ROOT(walker_path.dentry))) { - /* - * Stops at disconnected root directories. Only allows - * access to internal filesystems (e.g. nsfs, which is - * reachable through /proc//ns/). - */ - if (walker_path.mnt->mnt_flags & MNT_INTERNAL) { + if (likely(walker_path.mnt->mnt_flags & MNT_INTERNAL)) { + /* + * Stops and allows access when reaching disconnected root + * directories that are part of internal filesystems (e.g. nsfs, + * which is reachable through /proc//ns/). + */ allowed_parent1 = true; allowed_parent2 = true; + break; } - break; + + /* + * We reached a disconnected root directory from a bind mount. + * Let's continue the walk with the mount point we missed. + */ + dput(walker_path.dentry); + walker_path.dentry = walker_path.mnt->mnt_root; + dget(walker_path.dentry); + } else { + parent_dentry = dget_parent(walker_path.dentry); + dput(walker_path.dentry); + walker_path.dentry = parent_dentry; } - parent_dentry = dget_parent(walker_path.dentry); - dput(walker_path.dentry); - walker_path.dentry = parent_dentry; } path_put(&walker_path); @@ -1021,6 +1031,9 @@ static access_mask_t maybe_remove(const struct dentry *const dentry) * file. While walking from @dir to @mnt_root, we record all the domain's * allowed accesses in @layer_masks_dom. * + * Because of disconnected directories, this walk may not reach @mnt_dir. In + * this case, the walk will continue to @mnt_dir after this call. + * * This is similar to is_access_to_paths_allowed() but much simpler because it * only handles walking on the same mount point and only checks one set of * accesses. @@ -1062,8 +1075,11 @@ static bool collect_domain_accesses( break; } - /* We should not reach a root other than @mnt_root. */ - if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir))) + /* + * Stops at the mount point or the filesystem root for a disconnected + * directory. + */ + if (dir == mnt_root || unlikely(IS_ROOT(dir))) break; parent_dentry = dget_parent(dir); -- 2.51.0