Once btrfs_dev_replace_start() has set the replace state to STARTED and
dropped dev_replace->rwsem, writes to the source device are duplicated
to the target device (see btrfs_map_block() and
handle_ops_on_dev_replace()), and the in-flight bios are accounted by
dev_replace->bio_counter.

If the following btrfs_start_transaction() then fails, the error path
resets the state to NEVER_STARTED, clears ->srcdev/->tgtdev and jumps to
the 'leave' label, which frees the target with
btrfs_destroy_dev_replace_tgtdev(). That helper does a synchronize_rcu()
(to fence readers of the device list) but does not wait for the
duplicated write bios to drain. A bio that completes after the free
dereferences the freed tgt_device (e.g. btrfs_log_dev_io_error() ->
btrfs_dev_stat_inc_and_print()), and btrfs_close_bdev() tears the block
device down while I/O is still in flight against it -- a use-after-free.

btrfs_start_transaction() failing here is reachable (e.g. -ENOMEM, or
-EROFS on an aborted transaction).

btrfs_dev_replace_finishing() handles this correctly on its own error
path: it calls btrfs_rm_dev_replace_blocked() -- which blocks new bios
and waits for bio_counter to reach zero -- before
btrfs_destroy_dev_replace_tgtdev(), then calls
btrfs_rm_dev_replace_unblocked(). The start-failure path simply omitted
the drain.

Mirror the finishing error path: drain the in-flight bios before
destroying the target, and return directly. The shared 'leave' label
stays for the earlier failure case (the unexpected STARTED/SUSPENDED
state), which never published ->tgtdev to btrfs_map_block() and so needs
no drain.

Fixes: e93c89c1aaaa ("Btrfs: add new sources for device replace code")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner (Amutable)
---
 fs/btrfs/dev-replace.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 318ddb790429..51665ed09798 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -66,6 +66,8 @@
 static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 				       int scrub_ret);
 static int btrfs_dev_replace_kthread(void *data);
+static void btrfs_rm_dev_replace_blocked(struct btrfs_fs_info *fs_info);
+static void btrfs_rm_dev_replace_unblocked(struct btrfs_fs_info *fs_info);
 
 int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info)
 {
@@ -690,7 +692,11 @@ static int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info,
 		dev_replace->srcdev = NULL;
 		dev_replace->tgtdev = NULL;
 		up_write(&dev_replace->rwsem);
-		goto leave;
+		/* Drain writes already duplicated to tgtdev before freeing it. */
+		btrfs_rm_dev_replace_blocked(fs_info);
+		btrfs_destroy_dev_replace_tgtdev(tgt_device);
+		btrfs_rm_dev_replace_unblocked(fs_info);
+		return ret;
 	}
 
 	ret = btrfs_commit_transaction(trans);
-- 
2.47.3
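
For illustration only, the ordering problem described in the commit
message can be modelled in plain user-space C. This is a single-threaded
sketch, not kernel code: struct target, struct replace_ctx,
submit_write(), complete_one(), destroy_target() and late_completions()
are all hypothetical names, and "waiting for bio_counter to reach zero"
is approximated by running the pending completions before the target is
marked dead. Marking the target dead stands in for the real free so the
model never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BIOS 8

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct target {
	bool dead;		/* set where the kernel would free the device */
	int completions;	/* completions that ran against a live target */
};

struct bio {
	struct target *tgt;	/* device this duplicated write was sent to */
};

struct replace_ctx {
	struct target *tgt;		/* models dev_replace->tgtdev */
	struct bio inflight[MAX_BIOS];	/* submitted, not yet completed */
	int bio_counter;		/* models dev_replace->bio_counter */
	int late;			/* completions that hit a dead target */
};

/* Duplicate a write to the target if one is currently published. */
static bool submit_write(struct replace_ctx *c)
{
	if (!c->tgt || c->bio_counter == MAX_BIOS)
		return false;
	c->inflight[c->bio_counter++].tgt = c->tgt;
	return true;
}

/* Complete one in-flight write; it touches its target like a stats update. */
static void complete_one(struct replace_ctx *c)
{
	struct target *t;

	assert(c->bio_counter > 0);
	t = c->inflight[--c->bio_counter].tgt;
	if (t->dead)
		c->late++;	/* would be a use-after-free in the kernel */
	else
		t->completions++;
}

/* Unpublish the target, optionally drain in-flight writes, then "free" it. */
static void destroy_target(struct replace_ctx *c, bool drain)
{
	struct target *t = c->tgt;

	c->tgt = NULL;		/* no new writes are duplicated from here on */
	while (drain && c->bio_counter > 0)
		complete_one(c);	/* stands in for waiting on the counter */
	t->dead = true;
}

/* Returns how many completions ran against an already-dead target. */
int late_completions(bool drain)
{
	struct target tgt = { 0 };
	struct replace_ctx c = { .tgt = &tgt };

	for (int i = 0; i < 3; i++)
		submit_write(&c);
	destroy_target(&c, drain);
	while (c.bio_counter > 0)
		complete_one(&c);	/* stragglers finishing after teardown */
	return c.late;
}
```

In this model late_completions(false) corresponds to the old 'goto leave'
path and reports all three outstanding writes completing against a dead
target, while late_completions(true) corresponds to draining first and
reports none.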