rtw89_usb_ops_check_and_reclaim_tx_resource() returns a hardcoded placeholder value (42) instead of actual TX resource availability. This violates mac80211's flow control contract, preventing backpressure and causing uncontrolled URB accumulation under sustained TX load. Fix by adding per-channel atomic counters (tx_inflight[]) that track in-flight URBs: - Increment before usb_submit_urb() with rollback on failure - Decrement in completion callback - Return (MAX_URBS - inflight) to mac80211, or 0 when at capacity - Exclude firmware command channel (CH12) from tracking The pre-increment pattern prevents a race where the USB core completes the URB (possibly on another CPU) before the submitting code increments the counter. The 64-URB-per-channel limit was determined empirically. Testing across 32, 64, and 128 URBs showed that 32 causes a 10% throughput regression on single-stream USB3 5GHz upload (844 to 763 Mbps) and a 35% drop under parallel streams (858 to 556 Mbps). 64 URBs recovers to stock throughput while maintaining correct flow control. 128 URBs shows no further gain. On USB2 the bus speed is the bottleneck, so URB count has no effect. Tested on D-Link DWA-X1850 (RTL8832AU), kernel 6.19.8: Unpatched -> Patched (64 URBs) USB3 5GHz DL: 509 -> 475 Mbps USB3 5GHz UL: 844 -> 840 Mbps (retransmits: 3 -> 0) USB3 5GHz UL 4-stream: 858 -> 837 Mbps USB3 5GHz UL 8-stream: 872 -> 830 Mbps USB3 2.4GHz DL: 82 -> 104 Mbps (+27%) USB3 2.4GHz UL: 162 -> 163 Mbps USB2 5GHz UL: 250 -> 248 Mbps (bus-limited, no change) UDP flood (64 URBs): 930 Mbps, 0% loss, 4M+ datagrams 60s sustained soak: 844 Mbps, 0 retransmits Signed-off-by: Lucid Duck --- Changes since v2: - Increased MAX_TX_URBS_PER_CH from 32 to 64 (Ping-Ke Shih): 32 constrains USB3 5GHz throughput, especially under parallel streams. 64 recovers fully. 128 shows no further gain. - Removed redundant comments on define and struct member (Ping-Ke Shih) - Added uplink and multi-stream test data to commit message (Ping-Ke Shih) Changes since v1: - Removed duplicate "TX flow control" comment (Ping-Ke Shih) - Added test results to commit message (Ping-Ke Shih) Bitterblue's CH12 question from v1: the CH12 guards in tx_kick_off() and write_port_complete() are a matched pair. tx_kick_off() skips atomic_inc for CH12, so the completion handler must skip atomic_dec to match. Removing only the completion side causes counter underflow. Additional validation: 100-iteration stress test, 50-iteration teardown (rmmod/modprobe under load), 10x hot-unplug during active TX, and 30-minute soak -- all pass with counters balanced at idle. drivers/net/wireless/realtek/rtw89/usb.c | 27 +++++++++++++++++++++++++-- drivers/net/wireless/realtek/rtw89/usb.h | 4 ++++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/realtek/rtw89/usb.c b/drivers/net/wireless/realtek/rtw89/usb.c index eb489df..faafa3c 100644 --- a/drivers/net/wireless/realtek/rtw89/usb.c +++ b/drivers/net/wireless/realtek/rtw89/usb.c @@ -166,16 +166,25 @@ static u32 rtw89_usb_ops_check_and_reclaim_tx_resource(struct rtw89_dev *rtwdev, u8 txch) { + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev); + int inflight; + + /* Firmware command channel is not tracked */ if (txch == RTW89_TXCH_CH12) return 1; - return 42; /* TODO some kind of calculation? */ + inflight = atomic_read(&rtwusb->tx_inflight[txch]); + if (inflight >= RTW89_USB_MAX_TX_URBS_PER_CH) + return 0; + + return RTW89_USB_MAX_TX_URBS_PER_CH - inflight; } static void rtw89_usb_write_port_complete(struct urb *urb) { struct rtw89_usb_tx_ctrl_block *txcb = urb->context; struct rtw89_dev *rtwdev = txcb->rtwdev; + struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev); struct ieee80211_tx_info *info; struct rtw89_txwd_body *txdesc; struct sk_buff *skb; @@ -234,6 +243,10 @@ static void rtw89_usb_write_port_complete(struct urb *urb) break; } + /* Decrement in-flight counter (skip firmware command channel) */ + if (txcb->txch != RTW89_TXCH_CH12) + atomic_dec(&rtwusb->tx_inflight[txcb->txch]); + kfree(txcb); } @@ -311,9 +324,17 @@ static void rtw89_usb_ops_tx_kick_off(struct rtw89_dev *rtwdev, u8 txch) skb_queue_tail(&txcb->tx_ack_queue, skb); + /* Increment BEFORE submit to avoid race with completion */ + if (txch != RTW89_TXCH_CH12) + atomic_inc(&rtwusb->tx_inflight[txch]); + ret = rtw89_usb_write_port(rtwdev, txch, skb->data, skb->len, txcb); if (ret) { + /* Rollback increment on failure */ + if (txch != RTW89_TXCH_CH12) + atomic_dec(&rtwusb->tx_inflight[txch]); + if (ret != -ENODEV) rtw89_err(rtwdev, "write port txch %d failed: %d\n", txch, ret); @@ -694,8 +715,10 @@ static void rtw89_usb_init_tx(struct rtw89_dev *rtwdev) struct rtw89_usb *rtwusb = rtw89_usb_priv(rtwdev); int i; - for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) + for (i = 0; i < ARRAY_SIZE(rtwusb->tx_queue); i++) { skb_queue_head_init(&rtwusb->tx_queue[i]); + atomic_set(&rtwusb->tx_inflight[i], 0); + } } static void rtw89_usb_deinit_tx(struct rtw89_dev *rtwdev) diff --git a/drivers/net/wireless/realtek/rtw89/usb.h b/drivers/net/wireless/realtek/rtw89/usb.h index 9f554b5..1459122 100644 --- a/drivers/net/wireless/realtek/rtw89/usb.h +++ b/drivers/net/wireless/realtek/rtw89/usb.h @@ -21,6 +21,8 @@ #define RTW89_MAX_BULKIN_NUM 2 #define RTW89_MAX_BULKOUT_NUM 7 +#define RTW89_USB_MAX_TX_URBS_PER_CH 64 + struct rtw89_usb_info { u32 usb_host_request_2; u32 usb_wlan0_1; @@ -67,6 +69,8 @@ struct rtw89_usb { struct usb_anchor tx_submitted; struct sk_buff_head tx_queue[RTW89_TXCH_NUM]; + + atomic_t tx_inflight[RTW89_TXCH_NUM]; }; static inline struct rtw89_usb *rtw89_usb_priv(struct rtw89_dev *rtwdev) -- 2.53.0