fsnotify_recalc_mask() fails to handle the return value of
__fsnotify_recalc_mask(), which may return an inode pointer that needs
to be released via fsnotify_drop_object() when the connector's HAS_IREF
flag transitions from set to cleared.

This manifests as a hung task with the following call trace:

  INFO: task umount:1234 blocked for more than 120 seconds.
  Call Trace:
    __schedule
    schedule
    fsnotify_sb_delete
    generic_shutdown_super
    kill_anon_super
    cleanup_mnt
    task_work_run
    do_exit
    do_group_exit

The race window that triggers the iref leak:

  Thread A (adding mark)              Thread B (removing mark)
  ──────────────────────              ────────────────────────
  fsnotify_add_mark_locked():
    fsnotify_add_mark_list():
      spin_lock(conn->lock)
      add mark_B(evictable) to list
      spin_unlock(conn->lock)
      return

    /* ---- gap: no lock held ---- */

                                      fsnotify_detach_mark(mark_A):
                                        spin_lock(mark_A->lock)
                                        clear ATTACHED flag on mark_A
                                        spin_unlock(mark_A->lock)

    fsnotify_recalc_mask():
      spin_lock(conn->lock)
      __fsnotify_recalc_mask():
        /* mark_A skipped: ATTACHED cleared */
        /* only mark_B(evictable) remains */
        want_iref = false
        has_iref  = true  /* not yet cleared */
        -> HAS_IREF transitions true -> false
        -> returns inode pointer
      spin_unlock(conn->lock)
      /* BUG: return value discarded!
       * iput() and fsnotify_put_sb_watched_objects()
       * are never called */

Fix this by capturing the return value of __fsnotify_recalc_mask() and
passing it to fsnotify_drop_object() after releasing the spinlock, which
is the same pattern used in fsnotify_put_mark().

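The capture-then-drop pattern described above can be sketched in userspace
C. This is only an illustration under stated assumptions, not the kernel
code: struct object, struct connector, recalc_locked(), drop_object() and
recalc() are hypothetical stand-ins, and a pthread mutex takes the place of
the connector spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pinned object (the inode in the patch). */
struct object {
	int refcount;
};

/* Hypothetical stand-in for the mark connector. */
struct connector {
	pthread_mutex_t lock;
	struct object *obj;
	bool has_ref;	/* plays the role of the HAS_IREF flag */
	bool want_ref;	/* true if any remaining mark needs the reference */
};

/*
 * Call with conn->lock held. Returns the object whose reference the
 * caller must drop after unlocking, or NULL if nothing is to be dropped.
 */
static struct object *recalc_locked(struct connector *conn)
{
	if (conn->has_ref && !conn->want_ref) {
		/* Flag goes from set to cleared: hand the reference back. */
		conn->has_ref = false;
		return conn->obj;
	}
	if (!conn->has_ref && conn->want_ref) {
		conn->obj->refcount++;
		conn->has_ref = true;
	}
	return NULL;
}

/* Release one reference; tolerates NULL so callers need not check. */
static void drop_object(struct object *obj)
{
	if (!obj)
		return;
	assert(obj->refcount > 0);
	obj->refcount--;
}

/* Wrapper that keeps the return value and drops it outside the lock. */
static void recalc(struct connector *conn)
{
	struct object *to_drop;

	pthread_mutex_lock(&conn->lock);
	to_drop = recalc_locked(conn);
	pthread_mutex_unlock(&conn->lock);

	/* Discarding to_drop here would leak the reference. */
	drop_object(to_drop);
}
```

In this sketch the drop happens only after the lock is released, mirroring
the ordering in the fix; in the kernel the release path includes iput(),
which may sleep and so cannot run under a spinlock.
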
Fixes: c3638b5b1374 ("fsnotify: allow adding an inode mark without pinning inode")
Signed-off-by: Xin Yin
---
 fs/notify/mark.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index c2ed5b11b0fe6..cc93fcc2c5a9c 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -283,6 +283,8 @@ static void fsnotify_conn_set_children_dentry_flags(
 	fsnotify_set_children_dentry_flags(fsnotify_conn_inode(conn));
 }
 
+static void fsnotify_drop_object(unsigned int type, void *objp);
+
 /*
  * Calculate mask of events for a list of marks. The caller must make sure
  * connector and connector->obj cannot disappear under us. Callers achieve
@@ -292,15 +294,19 @@ static void fsnotify_conn_set_children_dentry_flags(
 void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn)
 {
 	bool update_children;
+	unsigned int type;
+	void *objp;
 
 	if (!conn)
 		return;
 
 	spin_lock(&conn->lock);
 	update_children = !fsnotify_conn_watches_children(conn);
-	__fsnotify_recalc_mask(conn);
+	objp = __fsnotify_recalc_mask(conn);
+	type = conn->type;
 	update_children &= fsnotify_conn_watches_children(conn);
 	spin_unlock(&conn->lock);
+	fsnotify_drop_object(type, objp);
 	/*
 	 * Set children's PARENT_WATCHED flags only if parent started watching.
 	 * When parent stops watching, we clear false positive PARENT_WATCHED
-- 
2.20.1