XDP_TX typically uses no offloads. To optimize XDP we added a "default
descriptor" feature to the chip, which allows us to send XDP frames with
just the buffer descriptors (DMA address + length). All the metadata
descriptors are derived from the queue config.

The commit under Fixes missed setting up the defaults when transplanting
the code from the prototype driver. Importantly, after reset the
"request completion" bit is not set. Packets still get sent, but there
are no completions, so the ring is never cleaned up. We can send one
ring's worth of packets, and then we start dropping all frames that get
the XDP_TX action from the XDP prog.

Fixes: 168deb7b31b2 ("eth: fbnic: Add support for XDP_TX action")
Signed-off-by: Jakub Kicinski
---
CC: alexanderduyck@fb.com
CC: jacob.e.keller@intel.com
CC: mohsin.bashr@gmail.com
---
 drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
index 8f998d26b9a3..2a84bd1d7e26 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
@@ -83,8 +83,16 @@ static void fbnic_mac_init_axi(struct fbnic_dev *fbd)
 
 static void fbnic_mac_init_qm(struct fbnic_dev *fbd)
 {
+        u64 default_meta = FIELD_PREP(FBNIC_TWD_L2_HLEN_MASK, ETH_HLEN) |
+                           FBNIC_TWD_FLAG_REQ_COMPLETION;
         u32 clock_freq;
 
+        /* Configure default TWQ Metadata descriptor */
+        wr32(fbd, FBNIC_QM_TWQ_DEFAULT_META_L,
+             lower_32_bits(default_meta));
+        wr32(fbd, FBNIC_QM_TWQ_DEFAULT_META_H,
+             upper_32_bits(default_meta));
+
         /* Configure TSO behavior */
         wr32(fbd, FBNIC_QM_TQS_CTL0,
              FIELD_PREP(FBNIC_QM_TQS_CTL0_LSO_TS_MASK,
-- 
2.51.0

Make XDP-handled packets appear in the Rx stats.

The driver has been counting XDP_TX packets on the Tx ring, but there
wasn't much accounting on the Rx side (the Rx bytes appear to be
incremented on XDP_TX, but XDP_DROP / XDP_ABORT are only counted as Rx
drops). The failure to count XDP_TX packets (not just bytes) in the Rx
stats looks like a simple bug of omission. The XDP_DROP handling appears
to be intentional. Whether XDP_DROP packets should be counted in
interface-level Rx stats has historically been a bit unclear. When we
were defining qstats, however, we clarified based on operational
experience that in this context:

    name: rx-packets
    doc: |
      Number of wire packets successfully received and passed to the
      stack. For drivers supporting XDP, XDP is considered the first
      layer of the stack, so packets consumed by XDP are still counted
      here.

fbnic does not obey this requirement. Since XDP support was added in the
current release cycle, instead of splitting the interface and qstat
handling, make them both follow the qstat definition.

Another small tweak here is that we count bytes as received on the wire
rather than post-XDP bytes (xdp_get_buff_len() vs skb->len).
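To make the rule concrete, here is a rough user space sketch of the
check, using the YNL-based selftest helpers that later patches in this
series rely on (the helper function and the exact assertion below are
illustrative assumptions, not part of the fix):

    # Packets consumed by XDP (even XDP_DROP) must still show up in the
    # interface-level rx-packets qstat.
    from lib.py import NetdevFamily

    def rx_pkts(ifindex):
        netnl = NetdevFamily()
        # dump=True yields one dict of qstats per interface
        return netnl.qstats_get({"ifindex": ifindex},
                                dump=True)[0]["rx-packets"]

    # Sampling rx_pkts() before and after pushing N packets into an
    # XDP_DROP program should show a delta of at least N with this fix.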
Fixes: 5213ff086344 ("eth: fbnic: Collect packet statistics for XDP")
Signed-off-by: Jakub Kicinski
---
CC: alexanderduyck@fb.com
CC: sdf@fomichev.me
CC: mohsin.bashr@gmail.com
CC: bpf@vger.kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 28 +++++++++++---------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index cf773cc78e40..b00d44926ba1 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -1242,6 +1242,7 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
         /* Walk the completion queue collecting the heads reported by NIC */
         while (likely(packets < budget)) {
                 struct sk_buff *skb = ERR_PTR(-EINVAL);
+                u32 pkt_bytes;
                 u64 rcd;
 
                 if ((*raw_rcd & cpu_to_le64(FBNIC_RCD_DONE)) == done)
@@ -1272,37 +1273,38 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
                         /* We currently ignore the action table index */
                         break;
                 case FBNIC_RCD_TYPE_META:
-                        if (unlikely(pkt->add_frag_failed))
-                                skb = NULL;
-                        else if (likely(!fbnic_rcd_metadata_err(rcd)))
+                        if (likely(!fbnic_rcd_metadata_err(rcd) &&
+                                   !pkt->add_frag_failed)) {
+                                pkt_bytes = xdp_get_buff_len(&pkt->buff);
                                 skb = fbnic_run_xdp(nv, pkt);
+                        }
 
                         /* Populate skb and invalidate XDP */
                         if (!IS_ERR_OR_NULL(skb)) {
                                 fbnic_populate_skb_fields(nv, rcd, skb, qt,
                                                           &csum_complete,
                                                           &csum_none);
-
-                                packets++;
-                                bytes += skb->len;
-
                                 napi_gro_receive(&nv->napi, skb);
                         } else if (skb == ERR_PTR(-FBNIC_XDP_TX)) {
                                 pkt_tail = nv->qt[0].sub1.tail;
-                                bytes += xdp_get_buff_len(&pkt->buff);
+                        } else if (PTR_ERR(skb) == -FBNIC_XDP_CONSUME) {
+                                fbnic_put_pkt_buff(qt, pkt, 1);
                         } else {
-                                if (!skb) {
+                                if (!skb)
                                         alloc_failed++;
-                                        dropped++;
-                                } else if (skb == ERR_PTR(-FBNIC_XDP_LEN_ERR)) {
+
+                                if (skb == ERR_PTR(-FBNIC_XDP_LEN_ERR))
                                         length_errors++;
-                                } else {
+                                else
                                         dropped++;
-                                }
 
                                 fbnic_put_pkt_buff(qt, pkt, 1);
+                                goto next_dont_count;
                         }
 
+                        packets++;
+                        bytes += pkt_bytes;
+next_dont_count:
                         pkt->buff.data_hard_start = NULL;
                         break;
-- 
2.51.0

When rings are freed, their stats get added to the device-level stat
structs. Save the stats from the XDP_TX ring as Tx stats only.
Previously they would be saved to both the Rx and Tx stats, so we would
not see XDP_TX packets as Rx during runtime, but after a down/up cycle
the packets would suddenly appear in the Rx stats. Also correct the
helper used by the ethtool code which does a runtime config switch.
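To visualize the fixed behavior, a rough selftest-style sketch (the
helper names follow lib.py as used elsewhere in this series; the
assertions assume an otherwise idle link and are illustrative only):

    def check_stats_stable_across_downup(cfg, netnl):
        # XDP_TX traffic already counted at runtime must not re-appear
        # as Rx packets once the rings are freed and re-created
        before = netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
        ip(f"link set dev {cfg.ifname} down")
        ip(f"link set dev {cfg.ifname} up")
        after = netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
        # Tx keeps the XDP_TX packets, Rx must not inherit them
        ksft_ge(after["tx-packets"], before["tx-packets"])
        ksft_eq(after["rx-packets"], before["rx-packets"])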
Fixes: 5213ff086344 ("eth: fbnic: Collect packet statistics for XDP")
Signed-off-by: Jakub Kicinski
---
CC: alexanderduyck@fb.com
CC: sdf@fomichev.me
CC: mohsin.bashr@gmail.com
CC: bpf@vger.kernel.org
---
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h    | 2 ++
 drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c | 2 +-
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c    | 8 +++-----
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 31fac0ba0902..4a41e21ed542 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -167,6 +167,8 @@ void fbnic_aggregate_ring_rx_counters(struct fbnic_net *fbn,
                                       struct fbnic_ring *rxr);
 void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
                                       struct fbnic_ring *txr);
+void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
+                                       struct fbnic_ring *xdpr);
 
 int fbnic_alloc_napi_vectors(struct fbnic_net *fbn);
 void fbnic_free_napi_vectors(struct fbnic_net *fbn);
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index a1c2db69b198..a37906b70c3a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -185,7 +185,7 @@ static void fbnic_aggregate_vector_counters(struct fbnic_net *fbn,
 
         for (i = 0; i < nv->txt_count; i++) {
                 fbnic_aggregate_ring_tx_counters(fbn, &nv->qt[i].sub0);
-                fbnic_aggregate_ring_tx_counters(fbn, &nv->qt[i].sub1);
+                fbnic_aggregate_ring_xdp_counters(fbn, &nv->qt[i].sub1);
                 fbnic_aggregate_ring_tx_counters(fbn, &nv->qt[i].cmpl);
         }
 
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index b00d44926ba1..fee2369c2cef 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -1435,8 +1435,8 @@ void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
         BUILD_BUG_ON(sizeof(fbn->tx_stats.twq) / 8 != 6);
 }
 
-static void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
-                                              struct fbnic_ring *xdpr)
+void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
+                                       struct fbnic_ring *xdpr)
 {
         struct fbnic_queue_stats *stats = &xdpr->stats;
 
@@ -1444,9 +1444,7 @@ static void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
                 return;
 
         /* Capture stats from queues before dissasociating them */
-        fbn->rx_stats.bytes += stats->bytes;
-        fbn->rx_stats.packets += stats->packets;
-        fbn->rx_stats.dropped += stats->dropped;
+        fbn->tx_stats.dropped += stats->dropped;
         fbn->tx_stats.bytes += stats->bytes;
         fbn->tx_stats.packets += stats->packets;
 }
-- 
2.51.0

The test uses "netnl" for the ethtool family, which is quite confusing
(one would expect the netdev family to use this name). Rename it to
"ethnl". No functional changes.

Signed-off-by: Jakub Kicinski
---
CC: shuah@kernel.org
CC: sdf@fomichev.me
CC: linux-kselftest@vger.kernel.org
CC: bpf@vger.kernel.org
---
 tools/testing/selftests/drivers/net/xdp.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/xdp.py b/tools/testing/selftests/drivers/net/xdp.py
index 08fea4230759..a7a4d97aa228 100755
--- a/tools/testing/selftests/drivers/net/xdp.py
+++ b/tools/testing/selftests/drivers/net/xdp.py
@@ -541,11 +541,11 @@ from lib.py import ip, bpftool, defer
     The HDS threshold value. If the threshold is not supported or an error
     occurs, a default value of 1500 is returned.
""" - netnl = cfg.netnl + ethnl = cfg.ethnl hds_thresh = 1500 try: - rings = netnl.rings_get({'header': {'dev-index': cfg.ifindex}}) + rings = ethnl.rings_get({'header': {'dev-index': cfg.ifindex}}) if 'hds-thresh' not in rings: ksft_pr(f'hds-thresh not supported. Using default: {hds_thresh}') return hds_thresh @@ -562,7 +562,7 @@ from lib.py import ip, bpftool, defer Args: cfg: Configuration object containing network settings. - netnl: Network namespace or link object (not used in this function). + ethnl: Network namespace or link object (not used in this function). This function sets up the packet size and offset lists, then performs the head adjustment test by sending and receiving UDP packets. @@ -681,7 +681,7 @@ from lib.py import ip, bpftool, defer function to execute the tests. """ with NetDrvEpEnv(__file__) as cfg: - cfg.netnl = EthtoolFamily() + cfg.ethnl = EthtoolFamily() ksft_run( [ test_xdp_native_pass_sb, -- 2.51.0 Send a non-trivial number of packets and make sure that they are counted correctly in qstats. Per qstats specification XDP is the first layer of the stack so we should see Rx and Tx counters go up for packets which went thru XDP. Signed-off-by: Jakub Kicinski --- CC: shuah@kernel.org CC: sdf@fomichev.me CC: linux-kselftest@vger.kernel.org CC: bpf@vger.kernel.org --- tools/testing/selftests/drivers/net/xdp.py | 91 +++++++++++++++++++++- 1 file changed, 89 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/xdp.py b/tools/testing/selftests/drivers/net/xdp.py index a7a4d97aa228..a148004e1c36 100755 --- a/tools/testing/selftests/drivers/net/xdp.py +++ b/tools/testing/selftests/drivers/net/xdp.py @@ -11,8 +11,9 @@ import string from dataclasses import dataclass from enum import Enum -from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_ne, ksft_pr -from lib.py import KsftFailEx, NetDrvEpEnv, EthtoolFamily, NlError +from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_ge, ksft_ne, ksft_pr +from lib.py import KsftFailEx, NetDrvEpEnv +from lib.py import EthtoolFamily, NetdevFamily, NlError from lib.py import bkg, cmd, rand_port, wait_port_listen from lib.py import ip, bpftool, defer @@ -671,6 +672,88 @@ from lib.py import ip, bpftool, defer _validate_res(res, offset_lst, pkt_sz_lst) +def _test_xdp_native_ifc_stats(cfg, act): + cfg.require_cmd("socat") + + bpf_info = BPFProgInfo("xdp_prog", "xdp_native.bpf.o", "xdp", 1500) + prog_info = _load_xdp_prog(cfg, bpf_info) + port = rand_port() + + _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, act.value) + _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + + # Discard the input, but we need a listener to avoid ICMP errors + rx_udp = f"socat -{cfg.addr_ipver} -T 2 -u UDP-RECV:{port},reuseport " + \ + "/dev/null" + # Listener runs on "remote" in case of XDP_TX + rx_host = cfg.remote if act == XDPAction.TX else None + # We want to spew 2000 packets quickly, bash seems to do a good enough job + tx_udp = f"exec 5<>/dev/udp/{cfg.addr}/{port}; " \ + "for i in `seq 2000`; do echo a >&5; done; exec 5>&-" + + cfg.wait_hw_stats_settle() + # Qstats have more clearly defined semantics than rtnetlink. + # XDP is the "first layer of the stack" so XDP packets should be counted + # as received and sent as if the decision was made in the routing layer. 
+    before = cfg.netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
+
+    with bkg(rx_udp, host=rx_host, exit_wait=True):
+        wait_port_listen(port, proto="udp", host=rx_host)
+        cmd(tx_udp, host=cfg.remote, shell=True)
+
+    cfg.wait_hw_stats_settle()
+    after = cfg.netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
+
+    ksft_ge(after['rx-packets'] - before['rx-packets'], 2000)
+    if act == XDPAction.TX:
+        ksft_ge(after['tx-packets'] - before['tx-packets'], 2000)
+
+    expected_pkts = 2000
+    stats = _get_stats(prog_info["maps"]["map_xdp_stats"])
+    ksft_eq(stats[XDPStats.RX.value], expected_pkts, "XDP RX stats mismatch")
+    if act == XDPAction.TX:
+        ksft_eq(stats[XDPStats.TX.value], expected_pkts, "XDP TX stats mismatch")
+
+    # Flip the ring count back and forth to make sure the stats from XDP rings
+    # don't get lost.
+    chans = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+    if chans.get('combined-count', 0) > 1:
+        cfg.ethnl.channels_set({'header': {'dev-index': cfg.ifindex},
+                                'combined-count': 1})
+        cfg.ethnl.channels_set({'header': {'dev-index': cfg.ifindex},
+                                'combined-count': chans['combined-count']})
+        before = after
+        after = cfg.netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
+
+        ksft_ge(after['rx-packets'], before['rx-packets'])
+        if act == XDPAction.TX:
+            ksft_ge(after['tx-packets'], before['tx-packets'])
+
+
+def test_xdp_native_qstats_pass(cfg):
+    """
+    Send 2000 messages, expect XDP_PASS, make sure the packets were counted
+    to interface level qstats (Rx).
+    """
+    _test_xdp_native_ifc_stats(cfg, XDPAction.PASS)
+
+
+def test_xdp_native_qstats_drop(cfg):
+    """
+    Send 2000 messages, expect XDP_DROP, make sure the packets were counted
+    to interface level qstats (Rx).
+    """
+    _test_xdp_native_ifc_stats(cfg, XDPAction.DROP)
+
+
+def test_xdp_native_qstats_tx(cfg):
+    """
+    Send 2000 messages, expect XDP_TX, make sure the packets were counted
+    to interface level qstats (Rx and Tx)
+    """
+    _test_xdp_native_ifc_stats(cfg, XDPAction.TX)
+
+
 def main():
     """
     Main function to execute the XDP tests.
@@ -682,6 +765,7 @@
     """
     with NetDrvEpEnv(__file__) as cfg:
         cfg.ethnl = EthtoolFamily()
+        cfg.netnl = NetdevFamily()
 
         ksft_run(
             [
                 test_xdp_native_pass_sb,
@@ -694,6 +778,9 @@
                 test_xdp_native_adjst_tail_shrnk_data,
                 test_xdp_native_adjst_head_grow_data,
                 test_xdp_native_adjst_head_shrnk_data,
+                test_xdp_native_qstats_pass,
+                test_xdp_native_qstats_drop,
+                test_xdp_native_qstats_tx,
             ],
             args=(cfg,))
     ksft_exit()
-- 
2.51.0

Rx processing under normal circumstances has 3 rings - 2 buffer rings
(headers and payloads) and a completion ring. All the rings have a
struct fbnic_ring. Make sure we expose the alloc_failed counter from
the buffer rings; previously only the alloc_failed from the completion
ring was reported, even though all ring types may increment this
counter (the buffer rings do so in __fbnic_fill_bdq()).

This makes the pp_alloc_fail.py test pass; it expects the qstat to
increment as page pool injections happen.
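For reference, a rough sketch of how the newly aggregated counter can
be observed per queue from user space (the scope-based dump follows the
netdev family spec; the netnl/cfg objects and the summing logic are
assumed scaffolding):

    # With the buffer (BDQ) rings now flagged for stats, their
    # allocation failures surface in the per-queue rx-alloc-fail qstat
    # that pp_alloc_fail.py polls.
    queues = netnl.qstats_get({"ifindex": cfg.ifindex, "scope": "queue"},
                              dump=True)
    alloc_fail = sum(q.get("rx-alloc-fail", 0)
                     for q in queues if q["queue-type"] == "rx")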
Fixes: 67dc4eb5fc92 ("eth: fbnic: report software Rx queue stats")
Signed-off-by: Jakub Kicinski
---
CC: alexanderduyck@fb.com
CC: mohsin.bashr@gmail.com
CC: vadim.fedorenko@linux.dev
CC: jdamato@fastly.com
CC: aleksander.lobakin@intel.com
---
 .../net/ethernet/meta/fbnic/fbnic_netdev.h    |  1 +
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  5 +++
 .../net/ethernet/meta/fbnic/fbnic_ethtool.c   |  4 +--
 .../net/ethernet/meta/fbnic/fbnic_netdev.c    | 23 ++++++++++--
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 36 +++++++++++++++----
 5 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
index e84e0527c3a9..b0a87c57910f 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
@@ -68,6 +68,7 @@ struct fbnic_net {
         /* Storage for stats after ring destruction */
         struct fbnic_queue_stats tx_stats;
         struct fbnic_queue_stats rx_stats;
+        struct fbnic_queue_stats bdq_stats;
         u64 link_down_events;
 
         /* Time stamping filter config */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 4a41e21ed542..ca37da5a0b17 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -92,6 +92,9 @@ struct fbnic_queue_stats {
                         u64 csum_none;
                         u64 length_errors;
                 } rx;
+                struct {
+                        u64 alloc_failed;
+                } bdq;
         };
         u64 dropped;
         struct u64_stats_sync syncp;
@@ -165,6 +168,8 @@ fbnic_features_check(struct sk_buff *skb, struct net_device *dev,
 
 void fbnic_aggregate_ring_rx_counters(struct fbnic_net *fbn,
                                       struct fbnic_ring *rxr);
+void fbnic_aggregate_ring_bdq_counters(struct fbnic_net *fbn,
+                                       struct fbnic_ring *rxr);
 void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
                                       struct fbnic_ring *txr);
 void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index a37906b70c3a..95fac020eb93 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -190,8 +190,8 @@ static void fbnic_aggregate_vector_counters(struct fbnic_net *fbn,
         }
 
         for (j = 0; j < nv->rxt_count; j++, i++) {
-                fbnic_aggregate_ring_rx_counters(fbn, &nv->qt[i].sub0);
-                fbnic_aggregate_ring_rx_counters(fbn, &nv->qt[i].sub1);
+                fbnic_aggregate_ring_bdq_counters(fbn, &nv->qt[i].sub0);
+                fbnic_aggregate_ring_bdq_counters(fbn, &nv->qt[i].sub1);
                 fbnic_aggregate_ring_rx_counters(fbn, &nv->qt[i].cmpl);
         }
 }
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
index d12b4cad84a5..e95be0e7bd9e 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
@@ -543,17 +543,21 @@ static const struct net_device_ops fbnic_netdev_ops = {
 static void fbnic_get_queue_stats_rx(struct net_device *dev, int idx,
                                      struct netdev_queue_stats_rx *rx)
 {
+        u64 bytes, packets, alloc_fail, alloc_fail_bdq;
         struct fbnic_net *fbn = netdev_priv(dev);
         struct fbnic_ring *rxr = fbn->rx[idx];
         struct fbnic_dev *fbd = fbn->fbd;
         struct fbnic_queue_stats *stats;
-        u64 bytes, packets, alloc_fail;
         u64 csum_complete, csum_none;
+        struct fbnic_q_triad *qt;
         unsigned int start;
 
         if (!rxr)
                 return;
 
+        /* fbn->rx points to completion queues */
+        qt = container_of(rxr, struct fbnic_q_triad, cmpl);
+
         stats = &rxr->stats;
         do {
                 start = u64_stats_fetch_begin(&stats->syncp);
@@ -564,6 +568,20 @@ static void fbnic_get_queue_stats_rx(struct net_device *dev, int idx,
                 csum_none = stats->rx.csum_none;
         } while (u64_stats_fetch_retry(&stats->syncp, start));
 
+        stats = &qt->sub0.stats;
+        do {
+                start = u64_stats_fetch_begin(&stats->syncp);
+                alloc_fail_bdq = stats->bdq.alloc_failed;
+        } while (u64_stats_fetch_retry(&stats->syncp, start));
+        alloc_fail += alloc_fail_bdq;
+
+        stats = &qt->sub1.stats;
+        do {
+                start = u64_stats_fetch_begin(&stats->syncp);
+                alloc_fail_bdq = stats->bdq.alloc_failed;
+        } while (u64_stats_fetch_retry(&stats->syncp, start));
+        alloc_fail += alloc_fail_bdq;
+
         rx->bytes = bytes;
         rx->packets = packets;
         rx->alloc_fail = alloc_fail;
@@ -641,7 +659,8 @@ static void fbnic_get_base_stats(struct net_device *dev,
 
         rx->bytes = fbn->rx_stats.bytes;
         rx->packets = fbn->rx_stats.packets;
-        rx->alloc_fail = fbn->rx_stats.rx.alloc_failed;
+        rx->alloc_fail = fbn->rx_stats.rx.alloc_failed +
+                         fbn->bdq_stats.bdq.alloc_failed;
         rx->csum_complete = fbn->rx_stats.rx.csum_complete;
         rx->csum_none = fbn->rx_stats.rx.csum_none;
 }
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index fee2369c2cef..173a1291b370 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -904,7 +904,7 @@ static void fbnic_fill_bdq(struct fbnic_ring *bdq)
                 netmem = page_pool_dev_alloc_netmems(bdq->page_pool);
                 if (!netmem) {
                         u64_stats_update_begin(&bdq->stats.syncp);
-                        bdq->stats.rx.alloc_failed++;
+                        bdq->stats.bdq.alloc_failed++;
                         u64_stats_update_end(&bdq->stats.syncp);
 
                         break;
@@ -1416,6 +1416,17 @@ void fbnic_aggregate_ring_rx_counters(struct fbnic_net *fbn,
         BUILD_BUG_ON(sizeof(fbn->rx_stats.rx) / 8 != 4);
 }
 
+void fbnic_aggregate_ring_bdq_counters(struct fbnic_net *fbn,
+                                       struct fbnic_ring *bdq)
+{
+        struct fbnic_queue_stats *stats = &bdq->stats;
+
+        /* Capture stats from queues before dissasociating them */
+        fbn->bdq_stats.bdq.alloc_failed += stats->bdq.alloc_failed;
+        /* Remember to add new stats here */
+        BUILD_BUG_ON(sizeof(fbn->rx_stats.bdq) / 8 != 1);
+}
+
 void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
                                       struct fbnic_ring *txr)
 {
@@ -1488,6 +1499,15 @@ static void fbnic_remove_rx_ring(struct fbnic_net *fbn,
         fbn->rx[rxr->q_idx] = NULL;
 }
 
+static void fbnic_remove_bdq_ring(struct fbnic_net *fbn,
+                                  struct fbnic_ring *bdq)
+{
+        if (!(bdq->flags & FBNIC_RING_F_STATS))
+                return;
+
+        fbnic_aggregate_ring_bdq_counters(fbn, bdq);
+}
+
 static void fbnic_free_qt_page_pools(struct fbnic_q_triad *qt)
 {
         page_pool_destroy(qt->sub0.page_pool);
@@ -1507,8 +1527,8 @@ static void fbnic_free_napi_vector(struct fbnic_net *fbn,
         }
 
         for (j = 0; j < nv->rxt_count; j++, i++) {
-                fbnic_remove_rx_ring(fbn, &nv->qt[i].sub0);
-                fbnic_remove_rx_ring(fbn, &nv->qt[i].sub1);
+                fbnic_remove_bdq_ring(fbn, &nv->qt[i].sub0);
+                fbnic_remove_bdq_ring(fbn, &nv->qt[i].sub1);
                 fbnic_remove_rx_ring(fbn, &nv->qt[i].cmpl);
         }
 
@@ -1707,11 +1727,13 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
         while (rxt_count) {
                 /* Configure header queue */
                 db = &uc_addr[FBNIC_QUEUE(rxq_idx) + FBNIC_QUEUE_BDQ_HPQ_TAIL];
-                fbnic_ring_init(&qt->sub0, db, 0, FBNIC_RING_F_CTX);
+                fbnic_ring_init(&qt->sub0, db, 0,
+                                FBNIC_RING_F_CTX | FBNIC_RING_F_STATS);
 
                 /* Configure payload queue */
                 db = &uc_addr[FBNIC_QUEUE(rxq_idx) + FBNIC_QUEUE_BDQ_PPQ_TAIL];
-                fbnic_ring_init(&qt->sub1, db, 0, FBNIC_RING_F_CTX);
+                fbnic_ring_init(&qt->sub1, db, 0,
+                                FBNIC_RING_F_CTX | FBNIC_RING_F_STATS);
 
                 /* Configure Rx completion queue */
                 db = &uc_addr[FBNIC_QUEUE(rxq_idx) + FBNIC_QUEUE_RCQ_HEAD];
@@ -2830,8 +2852,8 @@ static int fbnic_queue_start(struct net_device *dev, void *qmem, int idx)
         real = container_of(fbn->rx[idx], struct fbnic_q_triad, cmpl);
         nv = fbn->napi[idx % fbn->num_napi];
 
-        fbnic_aggregate_ring_rx_counters(fbn, &real->sub0);
-        fbnic_aggregate_ring_rx_counters(fbn, &real->sub1);
+        fbnic_aggregate_ring_bdq_counters(fbn, &real->sub0);
+        fbnic_aggregate_ring_bdq_counters(fbn, &real->sub1);
         fbnic_aggregate_ring_rx_counters(fbn, &real->cmpl);
 
         memcpy(real, qmem, sizeof(*real));
-- 
2.51.0

Fix linter warnings; otherwise it's a bit hard to check for new ones:

  W0311: Bad indentation. Found 16 spaces, expected 12 (bad-indentation)
  C0114: Missing module docstring (missing-module-docstring)
  W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
  C0116: Missing function or method docstring (missing-function-docstring)

Signed-off-by: Jakub Kicinski
---
CC: shuah@kernel.org
CC: johndale@cisco.com
CC: linux-kselftest@vger.kernel.org
---
 .../selftests/drivers/net/hw/pp_alloc_fail.py | 20 +++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
index ad192fef3117..fc66b7a7b149 100755
--- a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
+++ b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
@@ -1,6 +1,10 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: GPL-2.0
 
+"""
+Test driver resilience vs page pool allocation failures.
+"""
+
 import errno
 import time
 import os
@@ -13,7 +17,8 @@ from lib.py import cmd, tool, GenerateTraffic
 
 def _write_fail_config(config):
     for key, value in config.items():
-        with open("/sys/kernel/debug/fail_function/" + key, "w") as fp:
+        path = "/sys/kernel/debug/fail_function/"
+        with open(path + key, "w", encoding='ascii') as fp:
             fp.write(str(value) + "\n")
 
 
@@ -22,8 +27,7 @@ from lib.py import cmd, tool, GenerateTraffic
         raise KsftSkipEx("Kernel built without function error injection (or DebugFS)")
 
     if not os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_netmems"):
-        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
-            fp.write("page_pool_alloc_netmems\n")
+        _write_fail_config({"inject": "page_pool_alloc_netmems"})
 
     _write_fail_config({
         "verbose": 0,
@@ -38,8 +42,7 @@ from lib.py import cmd, tool, GenerateTraffic
         return
 
     if os.path.exists("/sys/kernel/debug/fail_function/page_pool_alloc_netmems"):
-        with open("/sys/kernel/debug/fail_function/inject", "w") as fp:
-            fp.write("\n")
+        _write_fail_config({"inject": ""})
 
     _write_fail_config({
         "probability": 0,
@@ -48,6 +51,10 @@ from lib.py import cmd, tool, GenerateTraffic
 
 
 def test_pp_alloc(cfg, netdevnl):
+    """
+    Configure page pool allocation fail injection while traffic is running.
+ """ + def get_stats(): return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0] @@ -105,7 +112,7 @@ from lib.py import cmd, tool, GenerateTraffic else: ksft_pr("ethtool -G change retval: did not succeed", new_g) else: - ksft_pr("ethtool -G change retval: did not try") + ksft_pr("ethtool -G change retval: did not try") time.sleep(0.1) check_traffic_flowing() @@ -119,6 +126,7 @@ from lib.py import cmd, tool, GenerateTraffic def main() -> None: + """ Ksft boiler plate main """ netdevnl = NetdevFamily() with NetDrvEpEnv(__file__, nsim_test=False) as cfg: -- 2.51.0 Lower the expected level of traffic in the pp_alloc_fail test and calculate failure counter thresholds based on the traffic rather than using a fixed constant. We only have "QEMU HW" in NIPA right now, and the test (due to debug dependencies) only works on debug kernels in the first place. We need some place for it to pass otherwise it seems to be bit rotting. So lower the traffic threshold so that it passes on QEMU and with a debug kernel... Signed-off-by: Jakub Kicinski --- CC: shuah@kernel.org CC: johndale@cisco.com CC: linux-kselftest@vger.kernel.org --- .../selftests/drivers/net/hw/pp_alloc_fail.py | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py index fc66b7a7b149..a4521a912d61 100755 --- a/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py +++ b/tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py @@ -7,6 +7,7 @@ Test driver resilience vs page pool allocation failures. import errno import time +import math import os from lib.py import ksft_run, ksft_exit, ksft_pr from lib.py import KsftSkipEx, KsftFailEx @@ -62,7 +63,7 @@ from lib.py import cmd, tool, GenerateTraffic stat1 = get_stats() time.sleep(1) stat2 = get_stats() - if stat2['rx-packets'] - stat1['rx-packets'] < 15000: + if stat2['rx-packets'] - stat1['rx-packets'] < 4000: raise KsftFailEx("Traffic seems low:", stat2['rx-packets'] - stat1['rx-packets']) @@ -91,9 +92,14 @@ from lib.py import cmd, tool, GenerateTraffic if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 1: raise KsftSkipEx("Allocation failures not increasing") - if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < 100: - raise KsftSkipEx("Allocation increasing too slowly", s2['rx-alloc-fail'] - s1['rx-alloc-fail'], - "packets:", s2['rx-packets'] - s1['rx-packets']) + pkts = s2['rx-packets'] - s1['rx-packets'] + # Expecting one failure per 512 buffers, 3.1x safety margin + want_fails = math.floor(pkts / 512 / 3.1) + seen_fails = s2['rx-alloc-fail'] - s1['rx-alloc-fail'] + if s2['rx-alloc-fail'] - s1['rx-alloc-fail'] < want_fails: + raise KsftSkipEx("Allocation increasing too slowly", seen_fails, + "packets:", pkts) + ksft_pr(f"Seen: pkts:{pkts} fails:{seen_fails} (pass thrs:{want_fails})") # Basic failures are fine, try to wobble some settings to catch extra failures check_traffic_flowing() -- 2.51.0 Add kernel config for error injection as needed by pp_alloc_fail.py Fixes: 9da271f825e4 ("selftests: drv-net-hw: add test for memory allocation failures with page pool") Signed-off-by: Jakub Kicinski --- CC: shuah@kernel.org CC: joe@dama.to CC: willemb@google.com CC: sdf@fomichev.me CC: almasrymina@google.com CC: linux-kselftest@vger.kernel.org --- tools/testing/selftests/drivers/net/hw/config | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tools/testing/selftests/drivers/net/hw/config b/tools/testing/selftests/drivers/net/hw/config index 
index e8a06aa1471c..2307aa001be1 100644
--- a/tools/testing/selftests/drivers/net/hw/config
+++ b/tools/testing/selftests/drivers/net/hw/config
@@ -1,3 +1,7 @@
+CONFIG_FAIL_FUNCTION=y
+CONFIG_FAULT_INJECTION=y
+CONFIG_FAULT_INJECTION_DEBUG_FS=y
+CONFIG_FUNCTION_ERROR_INJECTION=y
 CONFIG_IO_URING=y
 CONFIG_IPV6=y
 CONFIG_IPV6_GRE=y
-- 
2.51.0
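As a closing note on the pp_alloc_fail threshold change above, a quick
worked example of the new arithmetic (illustrative numbers, not
measured data):

    # One injected failure per 512 buffers, 3.1x safety margin; at the
    # new 4000 pkt/s traffic floor a 1s window needs to show at least:
    import math

    pkts = 4000
    want_fails = math.floor(pkts / 512 / 3.1)   # 7.81 / 3.1 -> 2

so even the lowered traffic level still yields a non-zero, meaningful
pass threshold on slow QEMU / debug-kernel runs.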