The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(), which does not guarantee that the delayed work item 'delete_task' has fully completed if it was already running. Additionally, the delayed work item is cyclic, flush_workqueue() in cnic_cm_stop_bnx2x_hw() could not prevent the new incoming ones. This leads to use-after-free scenarios where the cnic_dev is deallocated by cnic_free_dev(), while delete_task remains active and attempt to dereference cnic_dev in cnic_delete_task(). A typical race condition is illustrated below: CPU 0 (cleanup) | CPU 1 (delayed work callback) cnic_stop_hw() | cnic_delete_task() cnic_cm_stop_bnx2x_hw() | cancel_delayed_work() | queue_delayed_work() flush_workqueue() | | cnic_delete_task() //new instance cnic_free_dev(dev)//free | | dev = cp->dev; //use This is confirmed by a KASAN report: BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0 Write of size 8 at addr ffff88800c0a55f0 by task kworker/u16:2/63 ... Call Trace: dump_stack_lvl+0x55/0x70 print_report+0xcf/0x610 ? __run_timer_base.part.0+0x7d7/0x8c0 kasan_report+0xb8/0xf0 ? __run_timer_base.part.0+0x7d7/0x8c0 __run_timer_base.part.0+0x7d7/0x8c0 ? rcu_sched_clock_irq+0xa57/0x27d0 ? __pfx___run_timer_base.part.0+0x10/0x10 ? __update_load_avg_cfs_rq+0x5f0/0xa50 ? _raw_spin_lock_irq+0x80/0xe0 ? __pfx__raw_spin_lock_irq+0x10/0x10 ? tmigr_next_groupevt+0x99/0x140 tmigr_handle_remote_up+0x603/0x7e0 ? __pfx_tmigr_handle_remote_up+0x10/0x10 ? sched_balance_trigger+0x199/0x9f0 ? sched_tick+0x221/0x5a0 ? _raw_spin_lock_irq+0x80/0xe0 ? timerqueue_add+0x21b/0x320 ? tick_nohz_handler+0x199/0x440 ? __pfx_tmigr_handle_remote_up+0x10/0x10 __walk_groups.isra.0+0x42/0x150 tmigr_handle_remote+0x1f4/0x2e0 ? __pfx_tmigr_handle_remote+0x10/0x10 ? ktime_get+0x60/0x140 ? lapic_next_event+0x11/0x20 ? clockevents_program_event+0x1d4/0x2a0 ? hrtimer_interrupt+0x322/0x780 handle_softirqs+0x16a/0x550 irq_exit_rcu+0xaf/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 ... Allocated by task 141: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 cnic_alloc_dev.isra.0+0x40/0x310 is_cnic_dev+0x795/0x11e0 cnic_netdev_event+0xde/0xd30 notifier_call_chain+0xc0/0x280 register_netdevice+0xfb5/0x16c0 register_netdev+0x1b/0x40 ... Freed by task 63: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3a/0x60 __kasan_slab_free+0x3f/0x50 kfree+0x137/0x370 cnic_netdev_event+0x972/0xd30 ... Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the delayed work item is properly canceled and any executing delayed work has finished before the cnic_dev is deallocated. Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup") Signed-off-by: Duoming Zhou --- drivers/net/ethernet/broadcom/cnic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c index a9040c42d2ff..73dd7c25d89e 100644 --- a/drivers/net/ethernet/broadcom/cnic.c +++ b/drivers/net/ethernet/broadcom/cnic.c @@ -4230,7 +4230,7 @@ static void cnic_cm_stop_bnx2x_hw(struct cnic_dev *dev) cnic_bnx2x_delete_wait(dev, 0); - cancel_delayed_work(&cp->delete_task); + cancel_delayed_work_sync(&cp->delete_task); flush_workqueue(cnic_wq); if (atomic_read(&cp->iscsi_conn) != 0) -- 2.34.1