On Fri, 2025-08-15 at 19:35 +0530, Nandakumar Edamana wrote:

[...]

> This commit addresses a challenge explained in an open question ("How
> can we incorporate correlation in unknown bits across partial
> products?") left by Harishankar et al. in their paper:
> https://arxiv.org/abs/2105.05398

This is nice work on improving the precision of the tnum_mul algorithm,
Nandakumar. I had a few questions and comments (below).

> When LSB(a) is uncertain, we know for sure that it is either 0 or 1,
> from which we could find two possible partial products and take a
> union. Experiment shows that applying this technique in long
> multiplication improves the precision in a significant number of cases
> (at the cost of losing precision in a relatively lower number of
> cases).

If I understand the idea correctly, when the multiplier's (i.e. tnum a's)
bit is unknown, it can either be 0 or 1. If it is 0, then we add nothing
to the accumulator, i.e. TNUM(0, 0). If it is 1, we can add b to the
accumulator (appropriately shifted). The main idea is to take the union
of these two possible partial products, and add that to the accumulator.

If so, could we also do the following?

acc = tnum_add(acc, tnum_union(TNUM(0, 0), b));

[...]

> +	else if (a.mask & 1) {
> +		/* acc += tnum_union(acc_0, acc_1), where acc_0 and
> +		 * acc_1 are partial accumulators for cases
> +		 * LSB(a) = certain 0 and LSB(a) = certain 1.
> +		 * acc_0 = acc + 0 * b = acc.
> +		 * acc_1 = acc + 1 * b = tnum_add(acc, b).
> +		 */
> +
> +		acc = tnum_union(acc, tnum_add(acc, b));

The comment in your algorithm seems different from what the code
implements. I believe the comment should read:

/* acc = tnum_union(acc_0, acc_1), where acc_0 and ... */

[...]

> +struct tnum tnum_union(struct tnum a, struct tnum b)
> +{
> +	u64 v = a.value & b.value;
> +	u64 mu = (a.value ^ b.value) | a.mask | b.mask;
> +
> +	return TNUM(v & ~mu, mu);
> +}

The tnum_union() algorithm itself seems correct. I checked this in
z3 [1].

[1] https://github.com/bpfverif/tnums-cgo22/blob/05a51a1af6cb72f9694e5034c263ca442b33b976/verification/tnum.py#L801

I also checked this algorithm for the precision gains myself [2]. While
the numbers were not exactly the same as the ones you shared, the
algorithm does seem to be more precise (echoed by Eduard's tests as
well). In the output below, "our_mul" is the existing algorithm and
"new_mul" is your new algorithm. There are two outputs: one for the
exhaustive enumeration at bitwidth 8, and another for 10M randomly
sampled 64-bit tnum pairs.

[2] https://github.com/bpfverif/tnums-cgo22/tree/main/precision-new-mul

$ ./precision 8 64 10000000

number of input tnum pairs where our_mul had better precision: 1420899
number of input tnum pairs where new_mul had better precision: 30063672
number of input tnum pairs where output was the same: 11383710
number of input tnum pairs where output was incomparable: 178440

sampled at bitwidth 64
number of input tnum pairs where our_mul had better precision: 66441
number of input tnum pairs where new_mul had better precision: 241588
number of input tnum pairs where output was the same: 9687187
number of input tnum pairs where output was incomparable: 4784

Finally, what is the guarantee that this algorithm is sound? It will be
useful to have a proof to ensure that we are maintaining correctness.

Best,
Harishankar Vishwanathan
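
P.S. In case a concrete example helps the discussion, below is a small
standalone userspace sketch (my own illustration, not kernel code).
struct tnum, TNUM() and tnum_add() are meant to mirror the kernel's
definitions, tnum_union() is the helper from the patch, and
tnum_contains() plus the inputs in main() are made up for the example.
It shows what tnum_union() computes for two constants, and then runs a
single unknown-LSB step both ways: the update from the patch, and the
alternative I was asking about above (the shifting of a and b that the
real loop does is omitted).

#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

struct tnum {
	u64 value;	/* known bits */
	u64 mask;	/* a set bit means "unknown" */
};

#define TNUM(_v, _m)	((struct tnum){ .value = (_v), .mask = (_m) })

/* Does tristate number t contain the concrete value x? */
static int tnum_contains(struct tnum t, u64 x)
{
	return (x & ~t.mask) == t.value;
}

/* Addition over tnums, as in kernel/bpf/tnum.c. */
static struct tnum tnum_add(struct tnum a, struct tnum b)
{
	u64 sm = a.mask + b.mask;
	u64 sv = a.value + b.value;
	u64 sigma = sm + sv;
	u64 chi = sigma ^ sv;
	u64 mu = chi | a.mask | b.mask;

	return TNUM(sv & ~mu, mu);
}

/* Union from the patch: a bit stays known only if it is known and
 * equal in both operands; every other bit becomes unknown.
 */
static struct tnum tnum_union(struct tnum a, struct tnum b)
{
	u64 v = a.value & b.value;
	u64 mu = (a.value ^ b.value) | a.mask | b.mask;

	return TNUM(v & ~mu, mu);
}

static void print_tnum(const char *tag, struct tnum t)
{
	printf("%-12s value=%#llx mask=%#llx\n", tag,
	       (unsigned long long)t.value, (unsigned long long)t.mask);
}

int main(void)
{
	/* 1) Union of the constants 4 (0b100) and 6 (0b110): they
	 *    differ only in bit 1, so the result is 0b1x0, i.e.
	 *    value=0x4 mask=0x2, which contains exactly {4, 6}.
	 */
	print_tnum("union(4,6)", tnum_union(TNUM(4, 0), TNUM(6, 0)));

	/* 2) One unknown-LSB step with acc = 1 and b = 3 (both
	 *    constants).  The exact result set is {acc, acc + b},
	 *    i.e. {1, 4}.
	 */
	struct tnum acc = TNUM(1, 0), b = TNUM(3, 0);

	/* update as in the patch: union of the two partial accumulators */
	struct tnum patch = tnum_union(acc, tnum_add(acc, b));
	/* the alternative asked about above: add a union'd partial product */
	struct tnum alt = tnum_add(acc, tnum_union(TNUM(0, 0), b));

	print_tnum("patch", patch);	/* value=0 mask=0x5 -> {0,1,4,5} */
	print_tnum("alt", alt);		/* value=0 mask=0x7 -> {0,...,7} */

	/* both must (and do) contain the exact results 1 and 4 */
	printf("sound: %d %d %d %d\n",
	       tnum_contains(patch, 1), tnum_contains(patch, 4),
	       tnum_contains(alt, 1), tnum_contains(alt, 4));
	return 0;
}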
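On that particular input (acc = 1, b = 3) the two updates do not give
the same result: the patch's form yields TNUM(0, 0x5), i.e. {0, 1, 4, 5},
while the alternative yields TNUM(0, 0x7), i.e. {0, ..., 7}. Both still
contain the exact results {1, 4}, so please treat this only as an
illustration of why I was asking, not as a general comparison of the two.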