Currently, the precision backtracking logic in __mark_chain_precision()
does not check whether the target register or stack slot is already
marked as precise. This can lead to redundant work when multiple paths
require the same register to be precise.

This patch moves the "skip if already precise" logic into the core
backtracking loop setup in __mark_chain_precision(). This ensures that
all entry points (mark_chain_precision, propagate_precision) benefit
from the optimization and allows for cleaner code at the call sites.

Specifically:
- In __mark_chain_precision(), before starting the backtracking loop,
  check if the initial register/stack slot is already precise.
- For batch propagation (used by propagate_precision), iterate through
  the requested masks and clear bits that are already precise in the
  starting state.
- Remove the now-redundant checks in mark_chain_precision() and
  propagate_precision().

Performance results:

The optimization significantly reduces verification time by skipping
redundant backtracking for registers already marked as precise.

Key improvements (Duration):
- prog_nested_calls: 671us -> 292us (-56.48%)
- prog_non_constant_callback: 254us -> 111us (-56.30%)
- stack_check: 807us -> 411us (-49.07%)
- arena_strsearch: 120us -> 65us (-45.83%)
- prog_null_ctx: 216us -> 164us (-24.07%)

Verified that total instruction and state counts remain identical
across all tests (0.00% change), confirming logic equivalence.

Test steps and detailed raw performance data are as follows.

The data was collected using the following command:

  $ sudo ./veristat -e duration,total_insns,total_states -o csv \
        bpf_flow.bpf.o bpf_loop.bpf.o arena_strsearch.bpf.o

Baseline (at b54345928fa1):

file_name,prog_name,verdict,duration,total_insns,total_states
arena_strsearch.bpf.o,arena_strsearch,failure,120,20,2
bpf_flow.bpf.o,_dissect,success,465,211,13
bpf_flow.bpf.o,flow_dissector_0,success,2408,1461,68
bpf_flow.bpf.o,flow_dissector_1,success,2698,1567,59
bpf_flow.bpf.o,flow_dissector_2,success,2010,1244,56
bpf_flow.bpf.o,flow_dissector_3,success,2250,1498,57
bpf_flow.bpf.o,flow_dissector_4,success,343,259,4
bpf_flow.bpf.o,flow_dissector_5,success,674,416,21
bpf_loop.bpf.o,prog_invalid_flags,success,292,50,5
bpf_loop.bpf.o,prog_nested_calls,success,671,145,19
bpf_loop.bpf.o,prog_non_constant_callback,success,254,41,5
bpf_loop.bpf.o,prog_null_ctx,success,216,22,3
bpf_loop.bpf.o,stack_check,success,807,325,25
bpf_loop.bpf.o,test_prog,success,493,64,7

Patched:

file_name,prog_name,verdict,duration,total_insns,total_states
arena_strsearch.bpf.o,arena_strsearch,failure,65,20,2
bpf_flow.bpf.o,_dissect,success,467,211,13
bpf_flow.bpf.o,flow_dissector_0,success,2392,1461,68
bpf_flow.bpf.o,flow_dissector_1,success,2722,1567,59
bpf_flow.bpf.o,flow_dissector_2,success,2050,1244,56
bpf_flow.bpf.o,flow_dissector_3,success,2258,1498,57
bpf_flow.bpf.o,flow_dissector_4,success,342,259,4
bpf_flow.bpf.o,flow_dissector_5,success,702,416,21
bpf_loop.bpf.o,prog_invalid_flags,success,239,50,5
bpf_loop.bpf.o,prog_nested_calls,success,292,145,19
bpf_loop.bpf.o,prog_non_constant_callback,success,111,41,5
bpf_loop.bpf.o,prog_null_ctx,success,164,22,3
bpf_loop.bpf.o,stack_check,success,411,325,25
bpf_loop.bpf.o,test_prog,success,484,64,7

Comparison (veristat -C):

$ ./veristat -C prec_skip_baseline.csv prec_skip_patched.csv
File                   Program                     Verdict (A)  Verdict (B)  Verdict (DIFF)  Duration (us) (A)  Duration (us) (B)  Duration (us) (DIFF)  Insns (A)  Insns (B)  Insns (DIFF)  States (A)  States (B)  States (DIFF)
---------------------  --------------------------  -----------  -----------  --------------  -----------------  -----------------  --------------------  ---------  ---------  ------------  ----------  ----------  -------------
arena_strsearch.bpf.o  arena_strsearch             failure      failure      MATCH                         120                 65         -55 (-45.83%)         20         20   +0 (+0.00%)           2           2    +0 (+0.00%)
bpf_flow.bpf.o         _dissect                    success      success      MATCH                         465                467           +2 (+0.43%)        211        211   +0 (+0.00%)          13          13    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_0            success      success      MATCH                        2408               2392          -16 (-0.66%)       1461       1461   +0 (+0.00%)          68          68    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_1            success      success      MATCH                        2698               2722          +24 (+0.89%)       1567       1567   +0 (+0.00%)          59          59    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_2            success      success      MATCH                        2010               2050          +40 (+1.99%)       1244       1244   +0 (+0.00%)          56          56    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_3            success      success      MATCH                        2250               2258           +8 (+0.36%)       1498       1498   +0 (+0.00%)          57          57    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_4            success      success      MATCH                         343                342           -1 (-0.29%)        259        259   +0 (+0.00%)           4           4    +0 (+0.00%)
bpf_flow.bpf.o         flow_dissector_5            success      success      MATCH                         674                702          +28 (+4.15%)        416        416   +0 (+0.00%)          21          21    +0 (+0.00%)
bpf_loop.bpf.o         prog_invalid_flags          success      success      MATCH                         292                239         -53 (-18.15%)         50         50   +0 (+0.00%)           5           5    +0 (+0.00%)
bpf_loop.bpf.o         prog_nested_calls           success      success      MATCH                         671                292        -379 (-56.48%)        145        145   +0 (+0.00%)          19          19    +0 (+0.00%)
bpf_loop.bpf.o         prog_non_constant_callback  success      success      MATCH                         254                111        -143 (-56.30%)         41         41   +0 (+0.00%)           5           5    +0 (+0.00%)
bpf_loop.bpf.o         prog_null_ctx               success      success      MATCH                         216                164         -52 (-24.07%)         22         22   +0 (+0.00%)           3           3    +0 (+0.00%)
bpf_loop.bpf.o         stack_check                 success      success      MATCH                         807                411        -396 (-49.07%)        325        325   +0 (+0.00%)          25          25    +0 (+0.00%)
bpf_loop.bpf.o         test_prog                   success      success      MATCH                         493                484           -9 (-1.83%)         64         64   +0 (+0.00%)           7           7    +0 (+0.00%)

Suggested-by: Eduard Zingerman
Signed-off-by: Qiliang Yuan
---
v1 -> v2:
- Move "skip if already precise" logic into the core backtracking engine
  __mark_chain_precision() as suggested by Eduard Zingerman.
- Add detailed performance benchmark results and veristat comparison.
- Clean up call sites in mark_chain_precision() and propagate_precision().

 kernel/bpf/verifier.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f0ca69f888fa..340d972bd833 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4765,14 +4765,37 @@ static int __mark_chain_precision(struct bpf_verifier_env *env,
 	 * slot, but don't set precise flag in current state, as precision
 	 * tracking in the current state is unnecessary.
 	 */
-	func = st->frame[bt->frame];
 	if (regno >= 0) {
-		reg = &func->regs[regno];
+		reg = &st->frame[bt->frame]->regs[regno];
 		if (reg->type != SCALAR_VALUE) {
 			verifier_bug(env, "backtracking misuse");
 			return -EFAULT;
 		}
+		if (reg->precise)
+			return 0;
 		bt_set_reg(bt, regno);
+	} else {
+		for (fr = bt->frame; fr >= 0; fr--) {
+			u32 reg_mask = bt_frame_reg_mask(bt, fr);
+			u64 stack_mask = bt_frame_stack_mask(bt, fr);
+			DECLARE_BITMAP(mask, 64);
+
+			func = st->frame[fr];
+			if (reg_mask) {
+				bitmap_from_u64(mask, reg_mask);
+				for_each_set_bit(i, mask, 32) {
+					if (func->regs[i].precise)
+						bt_clear_frame_reg(bt, fr, i);
+				}
+			}
+			if (stack_mask) {
+				bitmap_from_u64(mask, stack_mask);
+				for_each_set_bit(i, mask, 64) {
+					if (func->stack[i].spilled_ptr.precise)
+						bt_clear_frame_slot(bt, fr, i);
+				}
+			}
+		}
 	}
 
 	if (bt_empty(bt))
-- 
2.39.5
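
P.S. (illustration only, not part of the patch): below is a minimal,
self-contained userspace sketch of the mask-filtering idea described in
the commit message -- clear the bits of registers whose precise flag is
already set, so the backtracking walk is skipped for them. The names
struct frame_state, MAX_REGS and filter_already_precise() are invented
for this sketch; the in-kernel code does the equivalent on the backtrack
state via bt_frame_reg_mask()/bt_frame_stack_mask() and
bt_clear_frame_reg()/bt_clear_frame_slot(), as in the diff above.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_REGS 11	/* r0..r10, mirroring the BPF register file */

struct reg_state {
	bool precise;
};

struct frame_state {
	struct reg_state regs[MAX_REGS];
};

/* Return @reg_mask with the bits of already-precise registers cleared,
 * so that only registers still needing backtracking remain set.
 */
static uint32_t filter_already_precise(const struct frame_state *frame,
				       uint32_t reg_mask)
{
	uint32_t pending = reg_mask;
	int i;

	for (i = 0; i < MAX_REGS; i++) {
		if ((pending & (1u << i)) && frame->regs[i].precise)
			pending &= ~(1u << i);
	}
	return pending;
}

int main(void)
{
	struct frame_state frame = {0};
	/* Pretend a caller asked for r1, r2 and r6 to become precise. */
	uint32_t want = (1u << 1) | (1u << 2) | (1u << 6);
	uint32_t todo;

	frame.regs[2].precise = true;	/* r2 is already precise */

	todo = filter_already_precise(&frame, want);
	printf("registers still needing backtracking: mask 0x%x\n", todo);
	return 0;
}

If the filtered mask comes back empty, the caller can return before
walking the instruction history at all, which is where the large
duration drops in the veristat numbers above come from.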