In commit 5dbb19b16ac49 ("bpf: Add third round of bounds deduction"), I
added a new round of bounds deduction because two rounds were not enough
to converge to a fixed point. This commit slightly refactors the bounds
deduction logic such that two rounds are enough.

In [1], Eduard noticed that after we improved the refinement logic, a
third call to the bounds deduction (__reg_deduce_bounds) was needed to
converge to a fixed point. More specifically, we needed this third call
to improve the s64 range using the s32 range. We added the third call
and postponed a more detailed analysis of the refinement logic.

I've been looking into this more recently. To help, I wrote a high-level
sequence of all the refinements carried out in reg_bounds_sync.
"u64 -> s32" means we used the u64 ranges to improve the s32 ranges.

  /* __update_reg_bounds: */
  tnum -> {s32, u32, s64, u64}

  /* __reg_deduce_bounds: */
  for (3 times) {
      /* __reg32_deduce_bounds: */
      u64 -> {u32, s32}
      s64 -> {u32, s32}
      u64 -> s32
      s64 -> s32
      u32 -> s32
      s32 -> u32

      /* __reg64_deduce_bounds: */
      u64 -> s64
      s64 -> u64
      {u64, s64} -> {u64, s64}

      /* __reg_deduce_mixed_bounds: */
      u32 -> u64
      u32 -> s64
      {s32, s64} -> {s64, u64, tnum}
  }

  /* __reg_bound_offset: */
  {u64, u32} -> tnum

  /* __update_reg_bounds: */
  tnum -> {s32, u32, s64, u64}

From this, we can observe that we first improve the 32-bit ranges from
the 64-bit ranges in __reg32_deduce_bounds, then improve the 64-bit
ranges on their own in __reg64_deduce_bounds. Intuitively, if we were to
improve the 64-bit ranges on their own *before* we use them to improve
the 32-bit ranges, we may reach a fixed point earlier. Said otherwise,
we would need to move the *64 -> *32 refinements to the beginning of
__reg_deduce_mixed_bounds. (The logic would then also better match the
function names, but that's secondary.)

After testing, this small refactoring does allow us to lose one call to
__reg_deduce_bounds.
Without this refactoring, the test "verifier_bounds/bounds deduction
cross sign boundary, negative overlap" fails when removing one call to
__reg_deduce_bounds.

This patch therefore implements that bit of refactoring and reverts
commit 5dbb19b16ac49 ("bpf: Add third round of bounds deduction"). As
expected, this change didn't have any impact on the number of
instructions processed when running it through the Cilium complexity
test suite [2].

Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Link: https://pchaigno.github.io/test-verifier-complexity.html [2]
Signed-off-by: Paul Chaignon
---
 kernel/bpf/verifier.c | 138 +++++++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fc4ccd1de569..55575f66aad5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2427,74 +2427,6 @@ static void __update_reg_bounds(struct bpf_reg_state *reg)
 /* Uses signed min/max values to inform unsigned, and vice-versa */
 static void __reg32_deduce_bounds(struct bpf_reg_state *reg)
 {
-	/* If upper 32 bits of u64/s64 range don't change, we can use lower 32
-	 * bits to improve our u32/s32 boundaries.
-	 *
-	 * E.g., the case where we have upper 32 bits as zero ([10, 20] in
-	 * u64) is pretty trivial, it's obvious that in u32 we'll also have
-	 * [10, 20] range. But this property holds for any 64-bit range as
-	 * long as upper 32 bits in that entire range of values stay the same.
-	 *
-	 * E.g., u64 range [0x10000000A, 0x10000000F] ([4294967306, 4294967311]
-	 * in decimal) has the same upper 32 bits throughout all the values in
-	 * that range. As such, lower 32 bits form a valid [0xA, 0xF] ([10, 15])
-	 * range.
-	 *
-	 * Note also, that [0xA, 0xF] is a valid range both in u32 and in s32,
-	 * following the rules outlined below about u64/s64 correspondence
-	 * (which equally applies to u32 vs s32 correspondence). In general it
-	 * depends on actual hexadecimal values of 32-bit range. They can form
-	 * only valid u32, or only valid s32 ranges in some cases.
-	 *
-	 * So we use all these insights to derive bounds for subregisters here.
-	 */
-	if ((reg->umin_value >> 32) == (reg->umax_value >> 32)) {
-		/* u64 to u32 casting preserves validity of low 32 bits as
-		 * a range, if upper 32 bits are the same
-		 */
-		reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->umin_value);
-		reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->umax_value);
-
-		if ((s32)reg->umin_value <= (s32)reg->umax_value) {
-			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
-			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
-		}
-	}
-	if ((reg->smin_value >> 32) == (reg->smax_value >> 32)) {
-		/* low 32 bits should form a proper u32 range */
-		if ((u32)reg->smin_value <= (u32)reg->smax_value) {
-			reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->smin_value);
-			reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->smax_value);
-		}
-		/* low 32 bits should form a proper s32 range */
-		if ((s32)reg->smin_value <= (s32)reg->smax_value) {
-			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
-			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
-		}
-	}
-	/* Special case where upper bits form a small sequence of two
-	 * sequential numbers (in 32-bit unsigned space, so 0xffffffff to
-	 * 0x00000000 is also valid), while lower bits form a proper s32 range
-	 * going from negative numbers to positive numbers. E.g., let's say we
-	 * have s64 range [-1, 1] ([0xffffffffffffffff, 0x0000000000000001]).
-	 * Possible s64 values are {-1, 0, 1} ({0xffffffffffffffff,
-	 * 0x0000000000000000, 0x00000000000001}). Ignoring upper 32 bits,
-	 * we still get a valid s32 range [-1, 1] ([0xffffffff, 0x00000001]).
-	 * Note that it doesn't have to be 0xffffffff going to 0x00000000 in
-	 * upper 32 bits. As a random example, s64 range
-	 * [0xfffffff0fffffff0; 0xfffffff100000010], forms a valid s32 range
-	 * [-16, 16] ([0xfffffff0; 0x00000010]) in its 32 bit subregister.
-	 */
-	if ((u32)(reg->umin_value >> 32) + 1 == (u32)(reg->umax_value >> 32) &&
-	    (s32)reg->umin_value < 0 && (s32)reg->umax_value >= 0) {
-		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
-		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
-	}
-	if ((u32)(reg->smin_value >> 32) + 1 == (u32)(reg->smax_value >> 32) &&
-	    (s32)reg->smin_value < 0 && (s32)reg->smax_value >= 0) {
-		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
-		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
-	}
 	/* if u32 range forms a valid s32 range (due to matching sign bit),
 	 * try to learn from that
 	 */
@@ -2649,6 +2581,75 @@ static void __reg64_deduce_bounds(struct bpf_reg_state *reg)
 
 static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
 {
+	/* If upper 32 bits of u64/s64 range don't change, we can use lower 32
+	 * bits to improve our u32/s32 boundaries.
+	 *
+	 * E.g., the case where we have upper 32 bits as zero ([10, 20] in
+	 * u64) is pretty trivial, it's obvious that in u32 we'll also have
+	 * [10, 20] range. But this property holds for any 64-bit range as
+	 * long as upper 32 bits in that entire range of values stay the same.
+	 *
+	 * E.g., u64 range [0x10000000A, 0x10000000F] ([4294967306, 4294967311]
+	 * in decimal) has the same upper 32 bits throughout all the values in
+	 * that range. As such, lower 32 bits form a valid [0xA, 0xF] ([10, 15])
+	 * range.
+	 *
+	 * Note also, that [0xA, 0xF] is a valid range both in u32 and in s32,
+	 * following the rules outlined below about u64/s64 correspondence
+	 * (which equally applies to u32 vs s32 correspondence). In general it
+	 * depends on actual hexadecimal values of 32-bit range. They can form
+	 * only valid u32, or only valid s32 ranges in some cases.
+	 *
+	 * So we use all these insights to derive bounds for subregisters here.
+	 */
+	if ((reg->umin_value >> 32) == (reg->umax_value >> 32)) {
+		/* u64 to u32 casting preserves validity of low 32 bits as
+		 * a range, if upper 32 bits are the same
+		 */
+		reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->umin_value);
+		reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->umax_value);
+
+		if ((s32)reg->umin_value <= (s32)reg->umax_value) {
+			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
+			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
+		}
+	}
+	if ((reg->smin_value >> 32) == (reg->smax_value >> 32)) {
+		/* low 32 bits should form a proper u32 range */
+		if ((u32)reg->smin_value <= (u32)reg->smax_value) {
+			reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->smin_value);
+			reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->smax_value);
+		}
+		/* low 32 bits should form a proper s32 range */
+		if ((s32)reg->smin_value <= (s32)reg->smax_value) {
+			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
+			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
+		}
+	}
+	/* Special case where upper bits form a small sequence of two
+	 * sequential numbers (in 32-bit unsigned space, so 0xffffffff to
+	 * 0x00000000 is also valid), while lower bits form a proper s32 range
+	 * going from negative numbers to positive numbers. E.g., let's say we
+	 * have s64 range [-1, 1] ([0xffffffffffffffff, 0x0000000000000001]).
+	 * Possible s64 values are {-1, 0, 1} ({0xffffffffffffffff,
+	 * 0x0000000000000000, 0x00000000000001}). Ignoring upper 32 bits,
+	 * we still get a valid s32 range [-1, 1] ([0xffffffff, 0x00000001]).
+	 * Note that it doesn't have to be 0xffffffff going to 0x00000000 in
+	 * upper 32 bits. As a random example, s64 range
+	 * [0xfffffff0fffffff0; 0xfffffff100000010], forms a valid s32 range
+	 * [-16, 16] ([0xfffffff0; 0x00000010]) in its 32 bit subregister.
+	 */
+	if ((u32)(reg->umin_value >> 32) + 1 == (u32)(reg->umax_value >> 32) &&
+	    (s32)reg->umin_value < 0 && (s32)reg->umax_value >= 0) {
+		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
+		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
+	}
+	if ((u32)(reg->smin_value >> 32) + 1 == (u32)(reg->smax_value >> 32) &&
+	    (s32)reg->smin_value < 0 && (s32)reg->smax_value >= 0) {
+		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
+		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
+	}
+
 	/* Try to tighten 64-bit bounds from 32-bit knowledge, using 32-bit
 	 * values on both sides of 64-bit range in hope to have tighter range.
 	 * E.g., if r1 is [0x1'00000000, 0x3'80000000], and we learn from
@@ -2741,7 +2742,6 @@ static void reg_bounds_sync(struct bpf_reg_state *reg)
 	/* We might have learned something about the sign bit. */
 	__reg_deduce_bounds(reg);
 	__reg_deduce_bounds(reg);
-	__reg_deduce_bounds(reg);
 	/* We might have learned some bits from the bounds. */
 	__reg_bound_offset(reg);
 	/* Intersecting with the old var_off might have improved our bounds
-- 
2.43.0