A subsequent commit will allow uring_cmds that don't use iopoll on
IORING_SETUP_IOPOLL io_urings. As a result, CQEs can be posted without
setting the iopoll_completed flag for a request in iopoll_list or going
through task work. For example, a UBLK_U_IO_FETCH_IO_CMDS command could
call io_uring_mshot_cmd_post_cqe() to directly post a CQE.

The io_iopoll_check() loop currently only counts completions posted in
io_do_iopoll() when determining whether the min_events threshold has
been met. It also exits early if there are any existing CQEs before
polling, or if any CQEs are posted while running task work. CQEs posted
via io_uring_mshot_cmd_post_cqe() or other mechanisms won't be counted
against min_events.

Explicitly check the available CQEs in each io_iopoll_check() loop
iteration to account for CQEs posted in any fashion.

Signed-off-by: Caleb Sander Mateos
---
 io_uring/io_uring.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 46f39831d27c..5f694052f501 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1184,11 +1184,10 @@ __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
 		io_move_task_work_from_local(ctx);
 }
 
 static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
 {
-	unsigned int nr_events = 0;
 	unsigned long check_cq;
 
 	min_events = min(min_events, ctx->cq_entries);
 
 	lockdep_assert_held(&ctx->uring_lock);
@@ -1205,19 +1204,12 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
 		 * dropped CQE.
 		 */
 		if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
 			return -EBADR;
 	}
-	/*
-	 * Don't enter poll loop if we already have events pending.
-	 * If we do, we can potentially be spinning for commands that
-	 * already triggered a CQE (eg in error).
-	 */
-	if (io_cqring_events(ctx))
-		return 0;
 
-	do {
+	while (io_cqring_events(ctx) < min_events) {
 		int ret = 0;
 
 		/*
 		 * If a submit got punted to a workqueue, we can have the
 		 * application entering polling for a command before it gets
@@ -1227,34 +1219,30 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
 		 * the poll to the issued list. Otherwise we can spin here
 		 * forever, while the workqueue is stuck trying to acquire the
 		 * very same mutex.
 		 */
 		if (list_empty(&ctx->iopoll_list) || io_task_work_pending(ctx)) {
-			u32 tail = ctx->cached_cq_tail;
-
 			(void) io_run_local_work_locked(ctx, min_events);
 
 			if (task_work_pending(current) || list_empty(&ctx->iopoll_list)) {
 				mutex_unlock(&ctx->uring_lock);
 				io_run_task_work();
 				mutex_lock(&ctx->uring_lock);
 			}
 			/* some requests don't go through iopoll_list */
-			if (tail != ctx->cached_cq_tail || list_empty(&ctx->iopoll_list))
+			if (list_empty(&ctx->iopoll_list))
 				break;
 		}
 		ret = io_do_iopoll(ctx, !min_events);
 		if (unlikely(ret < 0))
 			return ret;
 
 		if (task_sigpending(current))
 			return -EINTR;
 		if (need_resched())
 			break;
-
-		nr_events += ret;
-	} while (nr_events < min_events);
+	}
 
 	return 0;
 }
 
 void io_req_task_complete(struct io_tw_req tw_req, io_tw_token_t tw)
-- 
2.45.2