On Mon, Apr 21, 2025 at 10:01:25PM -0400, Philippe Blain wrote: > I noticed that p5332-multi-pack-reuse.sh, which you added in > ba47d88795 (t/perf: add performance tests for multi-pack reuse, > 2023-12-14) fails early on in the second test ("setup bitmaps for > 1-pack scenario"). Since perf tests run with '--immediate', I do not > know if further tests in that file also fail. It is reproducible on macOS [1] as > well as Linux [2] (I don't know if these logs are public though). > > I also tested on Linux on version 2.44.0 which is the first release > in which this test was added, and it also failed similarily. I think the patch below is probably the right solution. With it I got the output I'd expect (multi-pack reuse with many packs yields a CPU speedup at the cost of increased size): Test this tree ---------------------------------------------------------------------------------- 5332.3: clone for 1-pack scenario (single-pack reuse) 6.66(37.73+0.19) 5332.4: clone size for 1-pack scenario (single-pack reuse) 117.0M 5332.5: clone for 1-pack scenario (multi-pack reuse) 6.89(38.71+0.25) 5332.6: clone size for 1-pack scenario (multi-pack reuse) 117.0M 5332.9: clone for 10-pack scenario (single-pack reuse) 5.67(35.65+0.37) 5332.10: clone size for 10-pack scenario (single-pack reuse) 125.1M 5332.11: clone for 10-pack scenario (multi-pack reuse) 2.47(5.71+0.15) 5332.12: clone size for 10-pack scenario (multi-pack reuse) 134.3M 5332.15: clone for 100-pack scenario (single-pack reuse) 14.50(130.54+0.55) 5332.16: clone size for 100-pack scenario (single-pack reuse) 224.2M 5332.17: clone for 100-pack scenario (multi-pack reuse) 3.34(3.69+0.18) 5332.18: clone size for 100-pack scenario (multi-pack reuse) 307.3M -- >8 -- Subject: [PATCH] p5332: drop "+" from --stdin-packs input This perf script creates a midx by running "git multi-pack-index write" with the "--stdin-packs" option. We feed that stdin by running "find" on .git/objects/pack, using sed to strip off everything but the basename. But that sed invocation also does something peculiar: it adds a "+" to the start of each pack name. This causes the multi-pack-index command to barf. The modified name does not match any pack it knows about, so it ends up with an empty list of packs to put in the midx. And thus nothing matches the --preferred-pack option we pass, which causes it die(). The fix is to remove the extra "+" (which also lets us simplify the sed invocation a bit, as it is now just stripping the leading directories). But that leaves the mystery of why it was ever there in the first place. The answer is that an earlier iteration of the patch series had a concept of "disjoint" packs in the midx. And one of its patches here: https://lore.kernel.org/git/c52d7e7b27a9add4f58b8334db4fe4498af1c90f.1701198172.git.me@xxxxxxxxxxxx/ taught read_packs_from_stdin() to treat a leading "+" as marking a disjoint pack. But in the second version of the series, which was ultimately merged, that disjoint concept went away, and the code to parse "+" did likewise. The regular regression tests were adjusted to match, but this case in t/perf was forgotten. Signed-off-by: Jeff King <peff@xxxxxxxx> --- t/perf/p5332-multi-pack-reuse.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/perf/p5332-multi-pack-reuse.sh b/t/perf/p5332-multi-pack-reuse.sh index d1c89a8b7d..0a2525db44 100755 --- a/t/perf/p5332-multi-pack-reuse.sh +++ b/t/perf/p5332-multi-pack-reuse.sh @@ -58,7 +58,7 @@ do ' test_expect_success "setup bitmaps for $nr_packs-pack scenario" ' - find $packdir -type f -name "*.idx" | sed -e "s/.*\/\(.*\)$/+\1/g" | + find $packdir -type f -name "*.idx" | sed -e "s/.*\///" | git multi-pack-index write --stdin-packs --bitmap \ --preferred-pack="$(find_pack $(git rev-parse HEAD))" ' -- 2.49.0.682.g886cb1c59a