When multiple blkgs in the same blkcg are released concurrently, a
use-after-free can occur. The race happens when one blkg's
__blkcg_rstat_flush() removes another blkg's iostat entries via
llist_del_all(). The second blkg sees an empty list and proceeds to
free itself while the first is still iterating over its entries.

Fix by deferring blkg_free() via an additional call_rcu(). The second
RCU grace period ensures any concurrent flush holding rcu_read_lock()
has completed before the blkg memory is freed.

Cc: stable@vger.kernel.org
Cc: Jay Shin
Cc: Tejun Heo
Cc: Waiman Long
Fixes: 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq")
Reported-by: coregee2000@gmail.com
Closes: https://lore.kernel.org/linux-block/CAHPqNmwT9oRpem3J3erS_W0uSQND47LGGSBsNxP8E6uSUish1w@mail.gmail.com/
Signed-off-by: Ming Lei
---
 block/blk-cgroup.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 3cffb68ba5d8..dc0cccfdca68 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -160,6 +160,20 @@ static void blkg_free(struct blkcg_gq *blkg)
 	schedule_work(&blkg->free_work);
 }
 
+/*
+ * RCU callback to free blkg after an additional grace period.
+ * This ensures any concurrent __blkcg_rstat_flush() that might have
+ * removed our iostat entries via llist_del_all() has completed.
+ */
+static void __blkg_release_free_rcu(struct rcu_head *rcu)
+{
+	struct blkcg_gq *blkg = container_of(rcu, struct blkcg_gq, rcu_head);
+
+	/* release the blkcg and parent blkg refs this blkg has been holding */
+	css_put(&blkg->blkcg->css);
+	blkg_free(blkg);
+}
+
 static void __blkg_release(struct rcu_head *rcu)
 {
 	struct blkcg_gq *blkg = container_of(rcu, struct blkcg_gq, rcu_head);
@@ -178,9 +192,14 @@ static void __blkg_release(struct rcu_head *rcu)
 	for_each_possible_cpu(cpu)
 		__blkcg_rstat_flush(blkcg, cpu);
 
-	/* release the blkcg and parent blkg refs this blkg has been holding */
-	css_put(&blkg->blkcg->css);
-	blkg_free(blkg);
+	/*
+	 * Defer freeing via another call_rcu() to ensure any concurrent
+	 * __blkcg_rstat_flush() (under rcu_read_lock) that might have removed
+	 * our iostat entries via llist_del_all() has completed its iteration.
+	 * The second grace period guarantees those RCU read-side critical
+	 * sections have finished.
+	 */
+	call_rcu(&blkg->rcu_head, __blkg_release_free_rcu);
 }
 
 /*
-- 
2.47.0