io_ring_ctx's mutex uring_lock can be quite expensive in high-IOPS
workloads. Even when only one thread pinned to a single CPU is
accessing the io_ring_ctx, the atomic CAS required to lock and unlock
the mutex is a very hot instruction. The mutex's primary purpose is to
prevent concurrent io_uring system calls on the same io_ring_ctx.
However, there is already a flag IORING_SETUP_SINGLE_ISSUER that
promises only one task will make io_uring_enter() and
io_uring_register() system calls on the io_ring_ctx once it's enabled.

So if the io_ring_ctx is set up with IORING_SETUP_SINGLE_ISSUER, skip
the uring_lock mutex_lock() and mutex_unlock() for the io_uring_enter()
submission as well as for io_handle_tw_list(). io_uring_enter()
submission calls __io_uring_add_tctx_node_from_submit() to verify the
current task matches submitter_task for IORING_SETUP_SINGLE_ISSUER. And
task work can only be scheduled on tasks that submit io_uring requests,
so io_handle_tw_list() will also only be called on submitter_task.

There is a goto from the io_uring_enter() submission to the middle of
the IOPOLL block which assumed the uring_lock would already be held.
This is no longer the case for IORING_SETUP_SINGLE_ISSUER, so goto the
preceding mutex_lock() in that case.

It may be possible to avoid taking uring_lock in other places too for
IORING_SETUP_SINGLE_ISSUER, but these two cover the primary hot paths.
The uring_lock in io_uring_register() is necessary at least before the
io_uring is enabled because submitter_task isn't set yet. uring_lock is
also used to synchronize IOPOLL on submitting tasks with io_uring
worker tasks, so it's still needed there. But in principle, it should
be possible to remove the mutex entirely for IORING_SETUP_SINGLE_ISSUER
by running any code needing exclusive access to the io_ring_ctx in task
work context on submitter_task.

Signed-off-by: Caleb Sander Mateos
---
 io_uring/io_uring.c |  6 +++++-
 io_uring/io_uring.h | 14 ++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 7f19b6da5d3d..5793f6122159 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3534,12 +3534,15 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		if (ret != to_submit) {
 			io_ring_ctx_unlock(ctx);
 			goto out;
 		}
 		if (flags & IORING_ENTER_GETEVENTS) {
-			if (ctx->syscall_iopoll)
+			if (ctx->syscall_iopoll) {
+				if (ctx->flags & IORING_SETUP_SINGLE_ISSUER)
+					goto iopoll;
 				goto iopoll_locked;
+			}
 			/*
 			 * Ignore errors, we'll soon call io_cqring_wait() and
 			 * it should handle ownership problems if any.
 			 */
 			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
@@ -3556,10 +3559,11 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			 * We disallow the app entering submit/complete with
 			 * polling, but we still need to lock the ring to
 			 * prevent racing with polled issue that got punted to
 			 * a workqueue.
 			 */
+iopoll:
 			mutex_lock(&ctx->uring_lock);
 iopoll_locked:
 			ret2 = io_validate_ext_arg(ctx, flags, argp, argsz);
 			if (likely(!ret2))
 				ret2 = io_iopoll_check(ctx, min_complete);

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index a0580a1bf6b5..7296b12b0897 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -121,20 +121,34 @@ bool io_match_task_safe(struct io_kiocb *head, struct io_uring_task *tctx,
 			bool cancel_all);
 void io_activate_pollwq(struct io_ring_ctx *ctx);
 
 static inline void io_ring_ctx_lock(struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+		WARN_ON_ONCE(current != ctx->submitter_task);
+		return;
+	}
+
 	mutex_lock(&ctx->uring_lock);
 }
 
 static inline void io_ring_ctx_unlock(struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+		WARN_ON_ONCE(current != ctx->submitter_task);
+		return;
+	}
+
 	mutex_unlock(&ctx->uring_lock);
 }
 
 static inline void io_ring_ctx_assert_locked(const struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER &&
+	    current == ctx->submitter_task)
+		return;
+
 	lockdep_assert_held(&ctx->uring_lock);
 }
 
 static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
 {
-- 
2.45.2
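For context, a minimal userspace sketch (not part of this patch) of the
workload the optimization targets: a ring created with
IORING_SETUP_SINGLE_ISSUER and driven entirely from the creating task.
All liburing calls below are the standard API; with this patch applied,
the io_uring_submit() path no longer takes uring_lock in the kernel.
Issuing from any other task is rejected when the tctx node is set up
(the kernel returns -EEXIST for a submitter_task mismatch).

/* Sketch only: single-issuer ring, submit + reap a NOP from one task */
#include <stdio.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	int ret;

	/* Promise the kernel that only this task will issue requests */
	ret = io_uring_queue_init(8, &ring, IORING_SETUP_SINGLE_ISSUER);
	if (ret < 0) {
		fprintf(stderr, "queue_init: %d\n", ret);
		return 1;
	}

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_nop(sqe);

	/* io_uring_enter() submission: the hot path this patch unlocks */
	ret = io_uring_submit(&ring);
	if (ret < 0)
		goto out;

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret == 0)
		io_uring_cqe_seen(&ring, cqe);
out:
	io_uring_queue_exit(&ring);
	return ret < 0;
}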