Re: [PATCH v4 bpf-next 1/2] bpf: improve the general precision of tnum_mul

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2025-08-22 at 22:38 +0530, Nandakumar Edamana wrote:
> This commit addresses a challenge explained in an open question ("How
> can we incorporate correlation in unknown bits across partial
> products?") left by Harishankar et al. in their paper:
> https://arxiv.org/abs/2105.05398
> 
> When LSB(a) is uncertain, we know for sure that it is either 0 or 1,
> from which we could find two possible partial products and take a
> union. Experiment shows that applying this technique in long
> multiplication improves the precision in a significant number of cases
> (at the cost of losing precision in a relatively lower number of
> cases).
> 
> This commit also removes the value-mask decomposition technique
> employed by Harishankar et al., as its direct incorporation did not
> result in any improvements for the new algorithm.
> 
> Signed-off-by: Nandakumar Edamana <nandakumar@xxxxxxxxxxxxxxxx>
> ---

Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>


[...]

> diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> index fa353c5d550f..50e4d34d4774 100644
> --- a/kernel/bpf/tnum.c
> +++ b/kernel/bpf/tnum.c
> @@ -116,31 +116,39 @@ struct tnum tnum_xor(struct tnum a, struct tnum b)
>  	return TNUM(v & ~mu, mu);
>  }
>  
> -/* Generate partial products by multiplying each bit in the multiplier (tnum a)
> - * with the multiplicand (tnum b), and add the partial products after
> - * appropriately bit-shifting them. Instead of directly performing tnum addition
> - * on the generated partial products, equivalenty, decompose each partial
> - * product into two tnums, consisting of the value-sum (acc_v) and the
> - * mask-sum (acc_m) and then perform tnum addition on them. The following paper
> - * explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.
> +/* Perform long multiplication, iterating through the trits in a.
> + * Inside `else if (a.mask & 1)`, instead of simply multiplying b with LSB(a)'s
> + * uncertainty and accumulating directly, we find two possible partial products
> + * (one for LSB(a) = 0 and another for LSB(a) = 1), and add their union to the
> + * accumulator. This addresses an issue pointed out in an open question ("How
> + * can we incorporate correlation in unknown bits across partial products?")
> + * left by Harishankar et al. (https://arxiv.org/abs/2105.05398), improving
> + * the general precision significantly.
>   */

Nit: The above comment is good for commit message but not for the code itself.
     Imo, commit in the code should concentrate on the concrete mechanics.
     E.g. you had a nice example in the readme for improved-tnum-mul.
     So, maybe change to something like below?

     /*
      * Perform long multiplication, iterating through the trits in a:
      * - if LSB(a) is a known 0, keep current accumulator
      * - if LSB(a) is a known 1, add b to current accumulator
      * - if LSB(a) is unknown, take a union of the above cases.
      *
      * For example:
      *
      *               acc_0:	    acc_1:	
      *               		   	
      *     11 *  ->      11 *	->      11 *  -> union(0011, 1001) == x0x1
      *     x1	          01	        11	
      * ------	      ------	    ------	
      *     11	          11	        11	
      *    xx	         00	       11	
      * ------	      ------	    ------	
      *   ????	        0011	      1001
      */

>  struct tnum tnum_mul(struct tnum a, struct tnum b)
>  {
> -	u64 acc_v = a.value * b.value;
> -	struct tnum acc_m = TNUM(0, 0);
> +	struct tnum acc = TNUM(0, 0);
>  
>  	while (a.value || a.mask) {
>  		/* LSB of tnum a is a certain 1 */
>  		if (a.value & 1)
> -			acc_m = tnum_add(acc_m, TNUM(0, b.mask));
> +			acc = tnum_add(acc, b);
>  		/* LSB of tnum a is uncertain */
> -		else if (a.mask & 1)
> -			acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask));
> +		else if (a.mask & 1) {
> +			/* acc = tnum_union(acc_0, acc_1), where acc_0 and
> +			 * acc_1 are partial accumulators for cases
> +			 * LSB(a) = certain 0 and LSB(a) = certain 1.
> +			 * acc_0 = acc + 0 * b = acc.
> +			 * acc_1 = acc + 1 * b = tnum_add(acc, b).
> +			 */
> +
> +			acc = tnum_union(acc, tnum_add(acc, b));
> +		}
>  		/* Note: no case for LSB is certain 0 */
>  		a = tnum_rshift(a, 1);
>  		b = tnum_lshift(b, 1);
>  	}
> -	return tnum_add(TNUM(acc_v, 0), acc_m);
> +	return acc;
>  }
>  
>  /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has

[...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux