This patch addresses a robustness issue of TCP BBR when run in virtual
machines, a common use case considering modern web hosting. Prior
experiments in AWS VMs (https://ieeexplore.ieee.org/abstract/document/9546441)
and our recent measurements in a controlled setup show that VM-based BBR
senders heavily underestimate the available bandwidth during periods of
CPU contention. In our configuration with 10ms periodic timeslices, BBR
already degrades in VMs that receive less than a 70% CPU time share, and
throughput keeps dropping with decreasing CPU shares until it is capped
at 10 Mbps, independently of the link bandwidth, for CPU time shares of
35% and lower. Given the common use of BBR in (resource-limited) Linux
VMs, this issue could compromise the robustness of the Internet's
transport layer.

In contrast to CUBIC, BBR is very sensitive to off-CPU time. This is
because pacing spreads the target inflight evenly over the RTT,
implicitly assuming that the CPU is available during the whole RTT. If
this assumption is not met, the BBR sender cannot achieve its target
pacing rate and concludes that the full bandwidth has been reached, even
though the throughput is far below the actual bandwidth limit.

This commit detects the problematic condition in bbr_update_gains() by
comparing BBR's current target inflight (bbr_inflight() * tp->mss_cache)
with the actual number of bytes in flight (tp->bytes_sent -
tp->bytes_acked), and applies a high pacing gain (bbr_high_gain) until
the inflight deficit recovers. With a higher pacing gain, BBR can send
faster while the VM does have the CPU, so that the target inflight
volume can be reached despite off-CPU times. Reusing the constant
STARTUP gain only solves the issue up to a certain point, but avoids
more complex algorithm changes.

Effectively, the patch solves the degradation problem for the most
critical cases: with 10ms periodic timeslices, BBRv1 is robust for CPU
time shares of 35% and higher, instead of 70% and higher with the
original code. We can share further results of our measurement study
upon request.

Signed-off-by: Kathrin Elmenhorst
---
 net/ipv4/tcp_bbr.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 760941e55153..ca6361931491 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -1011,6 +1011,13 @@ static void bbr_update_gains(struct sock *sk)
 		WARN_ONCE(1, "BBR bad mode: %u\n", bbr->mode);
 		break;
 	}
+	/* Overwrite pacing gain if the sender fails to put enough data in flight */
+	struct tcp_sock *tp = tcp_sk(sk);
+	u64 real_inflight = tp->bytes_sent - tp->bytes_acked;
+	u32 target_inflight = bbr_inflight(sk, bbr_bw(sk), BBR_UNIT) * tp->mss_cache;
+
+	if (real_inflight < target_inflight)
+		bbr->pacing_gain = bbr_high_gain;
 }
 
 static void bbr_update_model(struct sock *sk, const struct rate_sample *rs)
-- 
2.43.0