On Fri, 2025-08-22 at 22:38 +0530, Nandakumar Edamana wrote: > This commit addresses a challenge explained in an open question ("How > can we incorporate correlation in unknown bits across partial > products?") left by Harishankar et al. in their paper: > https://arxiv.org/abs/2105.05398 > > When LSB(a) is uncertain, we know for sure that it is either 0 or 1, > from which we could find two possible partial products and take a > union. Experiment shows that applying this technique in long > multiplication improves the precision in a significant number of cases > (at the cost of losing precision in a relatively lower number of > cases). > > This commit also removes the value-mask decomposition technique > employed by Harishankar et al., as its direct incorporation did not > result in any improvements for the new algorithm. > > Signed-off-by: Nandakumar Edamana <nandakumar@xxxxxxxxxxxxxxxx> > --- Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx> [...] > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c > index fa353c5d550f..50e4d34d4774 100644 > --- a/kernel/bpf/tnum.c > +++ b/kernel/bpf/tnum.c > @@ -116,31 +116,39 @@ struct tnum tnum_xor(struct tnum a, struct tnum b) > return TNUM(v & ~mu, mu); > } > > -/* Generate partial products by multiplying each bit in the multiplier (tnum a) > - * with the multiplicand (tnum b), and add the partial products after > - * appropriately bit-shifting them. Instead of directly performing tnum addition > - * on the generated partial products, equivalenty, decompose each partial > - * product into two tnums, consisting of the value-sum (acc_v) and the > - * mask-sum (acc_m) and then perform tnum addition on them. The following paper > - * explains the algorithm in more detail: https://arxiv.org/abs/2105.05398. > +/* Perform long multiplication, iterating through the trits in a. > + * Inside `else if (a.mask & 1)`, instead of simply multiplying b with LSB(a)'s > + * uncertainty and accumulating directly, we find two possible partial products > + * (one for LSB(a) = 0 and another for LSB(a) = 1), and add their union to the > + * accumulator. This addresses an issue pointed out in an open question ("How > + * can we incorporate correlation in unknown bits across partial products?") > + * left by Harishankar et al. (https://arxiv.org/abs/2105.05398), improving > + * the general precision significantly. > */ Nit: The above comment is good for commit message but not for the code itself. Imo, commit in the code should concentrate on the concrete mechanics. E.g. you had a nice example in the readme for improved-tnum-mul. So, maybe change to something like below? /* * Perform long multiplication, iterating through the trits in a: * - if LSB(a) is a known 0, keep current accumulator * - if LSB(a) is a known 1, add b to current accumulator * - if LSB(a) is unknown, take a union of the above cases. * * For example: * * acc_0: acc_1: * * 11 * -> 11 * -> 11 * -> union(0011, 1001) == x0x1 * x1 01 11 * ------ ------ ------ * 11 11 11 * xx 00 11 * ------ ------ ------ * ???? 0011 1001 */ > struct tnum tnum_mul(struct tnum a, struct tnum b) > { > - u64 acc_v = a.value * b.value; > - struct tnum acc_m = TNUM(0, 0); > + struct tnum acc = TNUM(0, 0); > > while (a.value || a.mask) { > /* LSB of tnum a is a certain 1 */ > if (a.value & 1) > - acc_m = tnum_add(acc_m, TNUM(0, b.mask)); > + acc = tnum_add(acc, b); > /* LSB of tnum a is uncertain */ > - else if (a.mask & 1) > - acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask)); > + else if (a.mask & 1) { > + /* acc = tnum_union(acc_0, acc_1), where acc_0 and > + * acc_1 are partial accumulators for cases > + * LSB(a) = certain 0 and LSB(a) = certain 1. > + * acc_0 = acc + 0 * b = acc. > + * acc_1 = acc + 1 * b = tnum_add(acc, b). > + */ > + > + acc = tnum_union(acc, tnum_add(acc, b)); > + } > /* Note: no case for LSB is certain 0 */ > a = tnum_rshift(a, 1); > b = tnum_lshift(b, 1); > } > - return tnum_add(TNUM(acc_v, 0), acc_m); > + return acc; > } > > /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has [...]