When one process(such as udev) opens ublk block device (e.g., to read the partition table via bdev_open()), a deadlock[1] can occur: 1. bdev_open() grabs disk->open_mutex 2. The process issues read I/O to ublk backend to read partition table 3. In __ublk_complete_rq(), blk_update_request() or blk_mq_end_request() runs bio->bi_end_io() callbacks 4. If this triggers fput() on file descriptor of ublk block device, the work may be deferred to current task's task work (see fput() implementation) 5. This eventually calls blkdev_release() from the same context 6. blkdev_release() tries to grab disk->open_mutex again 7. Deadlock: same task waiting for a mutex it already holds The fix is to run blk_update_request() and blk_mq_end_request() with bottom halves disabled. This forces blkdev_release() to run in kernel work-queue context instead of current task work context, and allows ublk server to make forward progress, and avoids the deadlock. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Link: https://github.com/ublk-org/ublksrv/issues/170 [1] Signed-off-by: Ming Lei --- drivers/block/ublk_drv.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 2c715df63f23..f69da449727f 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1086,6 +1086,7 @@ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, { unsigned int unmapped_bytes; blk_status_t res = BLK_STS_OK; + bool requeue; /* failed read IO if nothing is read */ if (!io->res && req_op(req) == REQ_OP_READ) @@ -1117,14 +1118,28 @@ static inline void __ublk_complete_rq(struct request *req, struct ublk_io *io, if (unlikely(unmapped_bytes < io->res)) io->res = unmapped_bytes; - if (blk_update_request(req, BLK_STS_OK, io->res)) + /* + * Run bio->bi_end_io() from softirq context for preventing this + * ublk's blkdev_release() from being called on current's task + * work, see fput() implementation. + * + * Otherwise, ublk server may not provide forward progress in + * case of reading partition table from bdev_open() with + * disk->open_mutex grabbed, and causes dead lock. + */ + local_bh_disable(); + requeue = blk_update_request(req, BLK_STS_OK, io->res); + local_bh_enable(); + if (requeue) blk_mq_requeue_request(req, true); else if (likely(!blk_should_fake_timeout(req->q))) __blk_mq_end_request(req, BLK_STS_OK); return; exit: + local_bh_disable(); blk_mq_end_request(req, res); + local_bh_enable(); } static struct io_uring_cmd *__ublk_prep_compl_io_cmd(struct ublk_io *io, -- 2.47.1