Currently, if a user enqueues a work item using schedule_delayed_work(),
the workqueue used is "system_wq" (a per-CPU wq), while
queue_delayed_work() uses WORK_CPU_UNBOUND (used when a CPU is not
specified). The same applies to schedule_work(), which uses system_wq,
and queue_work(), which again uses WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Add system_dfl_wq to encourage its use when unbound work should be used.

queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new unbound wq: if the user still uses the old wq, a warning will be
printed and the work will be redirected to the new one.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo
Signed-off-by: Marco Crivellari
---
 drivers/block/nbd.c    | 2 +-
 drivers/block/sunvdc.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7bdc7eb808ea..7738fce177fa 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -311,7 +311,7 @@ static void nbd_mark_nsock_dead(struct nbd_device *nbd, struct nbd_sock *nsock,
 		if (args) {
 			INIT_WORK(&args->work, nbd_dead_link_work);
 			args->index = nbd->index;
-			queue_work(system_wq, &args->work);
+			queue_work(system_percpu_wq, &args->work);
 		}
 	}
 	if (!nsock->dead) {
diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index b5727dea15bd..442546b05df8 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -1187,7 +1187,7 @@ static void vdc_ldc_reset(struct vdc_port *port)
 	}

 	if (port->ldc_timeout)
-		mod_delayed_work(system_wq, &port->ldc_reset_timer_work,
+		mod_delayed_work(system_percpu_wq, &port->ldc_reset_timer_work,
 			  round_jiffies(jiffies + HZ * port->ldc_timeout));
 	mod_timer(&port->vio.timer, round_jiffies(jiffies + HZ));
 	return;
--
2.51.0

Currently, if a user enqueues a
work item using schedule_delayed_work(), the workqueue used is
"system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Add system_dfl_wq to encourage its use when unbound work should be used.

queue_work() / queue_delayed_work() / mod_delayed_work() will now use the
new unbound wq: if the user still uses the old wq, a warning will be
printed and the work will be redirected to the new one.

The old system_unbound_wq will be kept for a few release cycles.

Suggested-by: Tejun Heo
Signed-off-by: Marco Crivellari
---
 drivers/block/zram/zram_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fda7d8624889..c7e0fa29a572 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -975,7 +975,7 @@ static int read_from_bdev_sync(struct zram *zram, struct page *page,
 	work.entry = entry;

 	INIT_WORK_ONSTACK(&work.work, zram_sync_read);
-	queue_work(system_unbound_wq, &work.work);
+	queue_work(system_dfl_wq, &work.work);
 	flush_work(&work.work);
 	destroy_work_on_stack(&work.work);

--
2.51.0

Currently, if a user enqueues a work item using schedule_delayed_work(),
the workqueue used is "system_wq" (a per-CPU wq), while
queue_delayed_work() uses WORK_CPU_UNBOUND (used when a CPU is not
specified). The same applies to schedule_work(), which uses system_wq,
and queue_work(), which again uses WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they're needed and
reducing noise when CPUs are isolated.

This patch adds a new WQ_PERCPU flag to explicitly request the per-CPU
behavior. Both flags coexist for one release cycle to allow callers to
transition their calls.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn't explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

All existing users have been updated accordingly.

Suggested-by: Tejun Heo
Signed-off-by: Marco Crivellari
---
 drivers/block/aoe/aoemain.c   | 2 +-
 drivers/block/rbd.c           | 2 +-
 drivers/block/rnbd/rnbd-clt.c | 2 +-
 drivers/block/sunvdc.c        | 2 +-
 drivers/block/virtio_blk.c    | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoemain.c b/drivers/block/aoe/aoemain.c
index cdf6e4041bb9..3b21750038ee 100644
--- a/drivers/block/aoe/aoemain.c
+++ b/drivers/block/aoe/aoemain.c
@@ -44,7 +44,7 @@ aoe_init(void)
 {
 	int ret;

-	aoe_wq = alloc_workqueue("aoe_wq", 0, 0);
+	aoe_wq = alloc_workqueue("aoe_wq", WQ_PERCPU, 0);
 	if (!aoe_wq)
 		return -ENOMEM;

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index faafd7ff43d6..af0e21149dbc 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -7389,7 +7389,7 @@ static int __init rbd_init(void)
 	 * The number of active work items is limited by the number of
 	 * rbd devices * queue depth, so leave @max_active at default.
 	 */
-	rbd_wq = alloc_workqueue(RBD_DRV_NAME, WQ_MEM_RECLAIM, 0);
+	rbd_wq = alloc_workqueue(RBD_DRV_NAME, WQ_MEM_RECLAIM | WQ_PERCPU, 0);
 	if (!rbd_wq) {
 		rc = -ENOMEM;
 		goto err_out_slab;
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index 15627417f12e..b3a0470f9e80 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1809,7 +1809,7 @@ static int __init rnbd_client_init(void)
 		unregister_blkdev(rnbd_client_major, "rnbd");
 		return err;
 	}
-	rnbd_clt_wq = alloc_workqueue("rnbd_clt_wq", 0, 0);
+	rnbd_clt_wq = alloc_workqueue("rnbd_clt_wq", WQ_PERCPU, 0);
 	if (!rnbd_clt_wq) {
 		pr_err("Failed to load module, alloc_workqueue failed.\n");
 		rnbd_clt_destroy_sysfs_files();
diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
index 442546b05df8..851763e5dd18 100644
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -1215,7 +1215,7 @@ static int __init vdc_init(void)
 {
 	int err;

-	sunvdc_wq = alloc_workqueue("sunvdc", 0, 0);
+	sunvdc_wq = alloc_workqueue("sunvdc", WQ_PERCPU, 0);
 	if (!sunvdc_wq)
 		return -ENOMEM;

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7cffea01d868..a5a48f976a20 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -1683,7 +1683,7 @@ static int __init virtio_blk_init(void)
 {
 	int error;

-	virtblk_wq = alloc_workqueue("virtio-blk", 0, 0);
+	virtblk_wq = alloc_workqueue("virtio-blk", WQ_PERCPU, 0);
 	if (!virtblk_wq)
 		return -ENOMEM;

--
2.51.0