The following change seems to simplify the code and improve the precision:

-	/* In case LSB(a) is 1 */
-	u64 itermask = b.value | b.mask;
-	struct tnum iterprod = TNUM(b.value & ~itermask, itermask);
-	struct tnum acc_1 = tnum_add(acc, iterprod);
-
-	acc = tnum_union(acc, acc_1);
+	acc = tnum_union(acc, tnum_add(acc, b)); /* tnum_union(acc_0, acc_1) */
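To illustrate where the extra precision can come from, here is a minimal user-space sketch of a single step in both forms. It assumes that tnum_add() and tnum_union() behave as written below; both bodies are reconstructed for illustration and may not match the tree exactly. The names step_old() and step_new() are hypothetical, and the inputs mentioned afterwards were picked only to show a difference.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* value holds the known-1 bits, mask holds the unknown bits. */
struct tnum {
	u64 value;
	u64 mask;
};

#define TNUM(_v, _m) ((struct tnum){ .value = (_v), .mask = (_m) })

/* Assumed to match the existing tnum addition. */
static struct tnum tnum_add(struct tnum a, struct tnum b)
{
	u64 sm = a.mask + b.mask;
	u64 sv = a.value + b.value;
	u64 sigma = sm + sv;
	u64 chi = sigma ^ sv;
	u64 mu = chi | a.mask | b.mask;

	return TNUM(sv & ~mu, mu);
}

/* Assumed definition: a bit stays known only if it is known in both
 * operands and has the same value in both.
 */
static struct tnum tnum_union(struct tnum a, struct tnum b)
{
	u64 v = a.value & b.value;
	u64 mu = (a.value ^ b.value) | a.mask | b.mask;

	return TNUM(v & ~mu, mu);
}

/* Step as in the removed lines: every possibly-set bit of b is first
 * widened to unknown, then added to acc.
 */
static struct tnum step_old(struct tnum acc, struct tnum b)
{
	u64 itermask = b.value | b.mask;
	struct tnum iterprod = TNUM(b.value & ~itermask, itermask);
	struct tnum acc_1 = tnum_add(acc, iterprod);

	return tnum_union(acc, acc_1);
}

/* Step as in the added line: b is added as-is, so its known bits
 * still contribute to the sum before the union.
 */
static struct tnum step_new(struct tnum acc, struct tnum b)
{
	return tnum_union(acc, tnum_add(acc, b));
}
```

With acc = 1 and b = 3 (both fully known), step_old() gives value 0 with mask 7, while step_new() gives value 0 with mask 5. Bit 1 stays known to be zero in the new form because 1 and 1 + 3 = 4 agree on it, whereas the old form loses it when b is widened to an all-unknown mask before the addition.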

I'll check if it can be improved further.

--
Nandakumar Edamana