When system suspend is triggered while the DPMAIF TX kthread (t7xx_dpmaif_tx_hw_push_thread) is running, a deadlock can occur leading to a CPU soft lockup. The root cause is two-fold: 1. t7xx_dpmaif_suspend() calls t7xx_dpmaif_tx_stop() which only stops the TX work-queue items (by clearing txq->que_started and waiting on txq->tx_processing). It does NOT signal the kthread and does NOT update dpmaif_ctrl->state, which stays DPMAIF_STATE_PWRON. 2. The kthread's state guard is only checked at the top of each loop iteration. If the thread already passed this guard, it proceeds unconditionally to call pm_runtime_resume_and_get() — which tries to acquire dev->power.lock also contended by the system PM suspend path. The result is a spinlock deadlock observed as: watchdog: BUG: soft lockup - CPU#N stuck for 26s! [dpmaif_tx_hw_pu] RIP: _raw_spin_unlock_irqrestore Call Trace: __pm_runtime_resume+0x5b/0x80 t7xx_dpmaif_tx_hw_push_thread+0xc4 [mtk_t7xx] The condition requires ASPM L1 enabled on the endpoint (which extends the time pm_runtime_resume_and_get() holds dev->power.lock during L1.2 link retraining) and hundreds of repeated suspend/resume cycles to trigger reliably. Fix by introducing tx_pm_lock (struct mutex) and several coordinated changes: t7xx_dpmaif_suspend(): After t7xx_dpmaif_tx_stop(), acquire tx_pm_lock. Under the lock, snapshot dpmaif_ctrl->state into pre_suspend_state (capturing the modem state atomically with respect to the kthread's PM section), then set DPMAIF_STATE_PWROFF via WRITE_ONCE(). Release the lock and call wake_up() so any sleeping kthread re-evaluates the wait_event condition and exits. t7xx_dpmaif_suspend() acquires tx_pm_lock without holding any PM lock. While it waits, the kthread may call pm_runtime_resume_and_get() which briefly takes and releases dev->power.lock independently. Because the suspend callback does not compete for dev->power.lock at this point, the original spinlock deadlock cannot occur. Suspend latency increases by at most one TX burst drain time, which is bounded by the DRB ring depth. t7xx_dpmaif_resume(): When pre_suspend_state is DPMAIF_STATE_PWRON, re-arm the HW fully (start_txrx_qs, enable_irq, unmask_dlq_intr, start_hw) before publishing the new state. This ensures the kthread cannot issue ul_update_hw_drb_cnt() MMIO writes before UL_ALL_Q_EN is set by t7xx_dpmaif_start_hw(). Publish the restored state under tx_pm_lock to serialise with the kthread's under-lock state check. Wake up the kthread only after HW and state are both consistent. When pre_suspend_state is DPMAIF_STATE_PWROFF (modem was already stopped or in exception before suspend), skip HW re-arming entirely to avoid leaving DMA engines running while the MD state machine considers the modem inactive. t7xx_dpmaif_tx_hw_push_thread(): Hold tx_pm_lock across the [state check -> pm_runtime_resume_and_get -> pm_runtime_put_autosuspend] sequence. A second READ_ONCE() state check under the lock closes the TOCTOU window between the wait_event guard at the loop top and the pm_runtime call. READ_ONCE() is used in all unguarded state reads in this function. t7xx_dpmaif_start() / t7xx_dpmaif_stop(): Use WRITE_ONCE() for state writes to match the READ_ONCE() reads used throughout the driver and prevent compiler optimisations from obscuring concurrent access. t7xx_do_tx_hw_push(): Use READ_ONCE() in the do/while termination condition to match the WRITE_ONCE() annotations on the write side. t7xx_dpmaif_tx_thread_init(): Initialise tx_pm_lock with mutex_init(). Note: t7xx_dpmaif_start() and t7xx_dpmaif_stop() (called from the MD-FSM kthread via t7xx_dpmaif_md_state_callback()) do not hold tx_pm_lock. A race where the FSM transitions the modem to DPMAIF_STATE_PWROFF concurrently with the TX kthread's last burst is pre-existing and not introduced by this patch; the do/while condition in t7xx_do_tx_hw_push() now re-checks state with READ_ONCE() at each iteration boundary, limiting exposure to at most one burst. Tested: no soft lockup observed over 500+ suspend/resume cycles with SIM registered and ASPM L1 enabled (previously triggered in < 300). Fixes: 46e8f49ed7b3 ("net: wwan: t7xx: Introduce power management") Signed-off-by: Tim JH Chen --- v2 -> v3: Process fixes (per Documentation/process/maintainer-netdev.rst): - Add target tree (net) and revision (v3) to subject prefix - Fix Fixes tag to point to 46e8f49ed7b3 ("net: wwan: t7xx: Introduce power management") instead of a kernel release tag - Move version changelog after '---' separator Code fixes (addressing AI-assisted code review of v2): - Capture pre_suspend_state inside tx_pm_lock (was outside the lock in v2), closing a race where a concurrent t7xx_dpmaif_stop() from the MD-FSM kthread could flip state between the snapshot and the mutex acquisition, causing resume to incorrectly restore PWRON - In resume, re-arm HW before publishing state under tx_pm_lock; in v2 state was published before t7xx_dpmaif_start_hw(), allowing the TX kthread to call ul_update_hw_drb_cnt() while UL_ALL_Q_EN=0 - Skip HW re-arming in resume when pre_suspend_state==PWROFF, to avoid leaving DMA engines and IRQs live when the MD state machine considers the modem stopped or in exception - Add WRITE_ONCE() to t7xx_dpmaif_start()/stop() state writes and READ_ONCE() to t7xx_do_tx_hw_push() while condition - Document why pm_runtime_resume_and_get() under tx_pm_lock cannot cause a new deadlock against the suspend path - Document the pre-existing MD-FSM kthread / TX kthread race v1 -> v2: - Resume no longer unconditionally restores DPMAIF_STATE_PWRON; pre_suspend_state saves the pre-suspend modem state across the cycle - Replace the second plain state check with mutex (tx_pm_lock) that wraps the full pm_runtime section, eliminating the TOCTOU window rather than narrowing it - Add READ_ONCE/WRITE_ONCE at state accesses crossing the suspend/resume boundary Signed-off-by: Tim JH Chen --- drivers/net/wwan/t7xx/t7xx_hif_dpmaif.c | 25 ++++++++++++++++------ drivers/net/wwan/t7xx/t7xx_hif_dpmaif.h | 3 +++ drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c | 18 ++++++++++++---- 3 files changed, 35 insertions(+), 11 deletions(-) diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.c index 7ff33c1d6ac7..845a42fdf507 100644 --- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.c +++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.c @@ -363,7 +363,7 @@ static int t7xx_dpmaif_start(struct dpmaif_ctrl *dpmaif_ctrl) t7xx_dpmaif_ul_clr_all_intr(hw_info); t7xx_dpmaif_dl_clr_all_intr(hw_info); - dpmaif_ctrl->state = DPMAIF_STATE_PWRON; + WRITE_ONCE(dpmaif_ctrl->state, DPMAIF_STATE_PWRON); t7xx_dpmaif_enable_irq(dpmaif_ctrl); wake_up(&dpmaif_ctrl->tx_wq); return 0; @@ -400,7 +400,7 @@ static int t7xx_dpmaif_stop(struct dpmaif_ctrl *dpmaif_ctrl) return -EFAULT; t7xx_dpmaif_disable_irq(dpmaif_ctrl); - dpmaif_ctrl->state = DPMAIF_STATE_PWROFF; + WRITE_ONCE(dpmaif_ctrl->state, DPMAIF_STATE_PWROFF); t7xx_dpmaif_stop_sw(dpmaif_ctrl); t7xx_dpmaif_tx_clear(dpmaif_ctrl); t7xx_dpmaif_rx_clear(dpmaif_ctrl); @@ -412,6 +412,11 @@ static int t7xx_dpmaif_suspend(struct t7xx_pci_dev *t7xx_dev, void *param) struct dpmaif_ctrl *dpmaif_ctrl = param; t7xx_dpmaif_tx_stop(dpmaif_ctrl); + mutex_lock(&dpmaif_ctrl->tx_pm_lock); + dpmaif_ctrl->pre_suspend_state = READ_ONCE(dpmaif_ctrl->state); + WRITE_ONCE(dpmaif_ctrl->state, DPMAIF_STATE_PWROFF); + mutex_unlock(&dpmaif_ctrl->tx_pm_lock); + wake_up(&dpmaif_ctrl->tx_wq); t7xx_dpmaif_hw_stop_all_txq(&dpmaif_ctrl->hw_info); t7xx_dpmaif_hw_stop_all_rxq(&dpmaif_ctrl->hw_info); t7xx_dpmaif_disable_irq(dpmaif_ctrl); @@ -451,11 +456,17 @@ static int t7xx_dpmaif_resume(struct t7xx_pci_dev *t7xx_dev, void *param) if (!dpmaif_ctrl) return 0; - t7xx_dpmaif_start_txrx_qs(dpmaif_ctrl); - t7xx_dpmaif_enable_irq(dpmaif_ctrl); - t7xx_dpmaif_unmask_dlq_intr(dpmaif_ctrl); - t7xx_dpmaif_start_hw(&dpmaif_ctrl->hw_info); - wake_up(&dpmaif_ctrl->tx_wq); + if (dpmaif_ctrl->pre_suspend_state == DPMAIF_STATE_PWRON) { + t7xx_dpmaif_start_txrx_qs(dpmaif_ctrl); + t7xx_dpmaif_enable_irq(dpmaif_ctrl); + t7xx_dpmaif_unmask_dlq_intr(dpmaif_ctrl); + t7xx_dpmaif_start_hw(&dpmaif_ctrl->hw_info); + } + mutex_lock(&dpmaif_ctrl->tx_pm_lock); + WRITE_ONCE(dpmaif_ctrl->state, dpmaif_ctrl->pre_suspend_state); + mutex_unlock(&dpmaif_ctrl->tx_pm_lock); + if (dpmaif_ctrl->pre_suspend_state == DPMAIF_STATE_PWRON) + wake_up(&dpmaif_ctrl->tx_wq); return 0; } diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.h b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.h index 0ce4505e813d..670ed2cca761 100644 --- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.h +++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif.h @@ -20,6 +20,7 @@ #include #include +#include #include #include #include @@ -172,6 +173,8 @@ struct dpmaif_ctrl { struct t7xx_pci_dev *t7xx_dev; struct md_pm_entity dpmaif_pm_entity; enum dpmaif_state state; + enum dpmaif_state pre_suspend_state; + struct mutex tx_pm_lock; bool dpmaif_sw_init_done; struct dpmaif_hw_info hw_info; struct dpmaif_tx_queue txq[DPMAIF_TXQ_NUM]; diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c index 236d632cf591..e278e9703c69 100644 --- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c +++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c @@ -439,7 +439,7 @@ static void t7xx_do_tx_hw_push(struct dpmaif_ctrl *dpmaif_ctrl) cond_resched(); } while (!t7xx_tx_lists_are_all_empty(dpmaif_ctrl) && !kthread_should_stop() && - (dpmaif_ctrl->state == DPMAIF_STATE_PWRON)); + READ_ONCE(dpmaif_ctrl->state) == DPMAIF_STATE_PWRON); } static int t7xx_dpmaif_tx_hw_push_thread(void *arg) @@ -449,10 +449,10 @@ static int t7xx_dpmaif_tx_hw_push_thread(void *arg) while (!kthread_should_stop()) { if (t7xx_tx_lists_are_all_empty(dpmaif_ctrl) || - dpmaif_ctrl->state != DPMAIF_STATE_PWRON) { + READ_ONCE(dpmaif_ctrl->state) != DPMAIF_STATE_PWRON) { if (wait_event_interruptible(dpmaif_ctrl->tx_wq, (!t7xx_tx_lists_are_all_empty(dpmaif_ctrl) && - dpmaif_ctrl->state == DPMAIF_STATE_PWRON) || + READ_ONCE(dpmaif_ctrl->state) == DPMAIF_STATE_PWRON) || kthread_should_stop())) continue; @@ -460,14 +460,23 @@ static int t7xx_dpmaif_tx_hw_push_thread(void *arg) break; } + mutex_lock(&dpmaif_ctrl->tx_pm_lock); + if (READ_ONCE(dpmaif_ctrl->state) != DPMAIF_STATE_PWRON) { + mutex_unlock(&dpmaif_ctrl->tx_pm_lock); + continue; + } + ret = pm_runtime_resume_and_get(dpmaif_ctrl->dev); - if (ret < 0 && ret != -EACCES) + if (ret < 0 && ret != -EACCES) { + mutex_unlock(&dpmaif_ctrl->tx_pm_lock); return ret; + } t7xx_pci_disable_sleep(dpmaif_ctrl->t7xx_dev); t7xx_do_tx_hw_push(dpmaif_ctrl); t7xx_pci_enable_sleep(dpmaif_ctrl->t7xx_dev); pm_runtime_put_autosuspend(dpmaif_ctrl->dev); + mutex_unlock(&dpmaif_ctrl->tx_pm_lock); } return 0; @@ -475,6 +484,7 @@ static int t7xx_dpmaif_tx_hw_push_thread(void *arg) int t7xx_dpmaif_tx_thread_init(struct dpmaif_ctrl *dpmaif_ctrl) { + mutex_init(&dpmaif_ctrl->tx_pm_lock); init_waitqueue_head(&dpmaif_ctrl->tx_wq); dpmaif_ctrl->tx_thread = kthread_run(t7xx_dpmaif_tx_hw_push_thread, dpmaif_ctrl, "dpmaif_tx_hw_push"); -- 2.43.0