Re: [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal

Derrick Stolee <stolee@xxxxxxxxx> · Thu, 10 Jul 2025 09:49:50 -0400

On 7/10/2025 4:48 AM, Lidong Yan wrote:

> The difference from v4 is:
>   - for the bloom_key_* functions, we now pass struct bloom_key *
>     as the first parameter.
>   - bloom_keyvec_fill_key() and bloom_keyvec_new() have been merged
>     into a single function, so that each key is filled during the
>     creation of the bloom_keyvec.
> 
> Below is a comparison of the time taken to run git log on Git and
> LLVM repositories before and after applying this patch.
> 
> Setup commit-graph:
>   $ cd ~/src/git && git commit-graph write --split --reachable --changed-paths
>   $ cd ~/src/llvm && git commit-graph write --split --reachable --changed-paths
> 
> Run git log on Git repository
>   $ cd ~/src/git
>   $ hash -p /usr/bin/git git # my system git binary is 2.43.0
>   $ time git log -100 -- commit.c commit-graph.c >/dev/null
>   real	0m0.055s
>   user	0m0.040s
>   sys	  0m0.015s
>   $ hash -p ~/bin/git/bin/git git
>   $ time git log -100 -- commit.c commit-graph.c >/dev/null
>   real	0m0.039s
>   user	0m0.020s
>   sys	  0m0.020s
> 
> Run git log in LLVM repository
>   $ cd ~/src/llvm
>   $ hash -p /usr/bin/git git # my system git binary is 2.43.0
>   $ time git log -100 -- llvm/lib/Support/CommandLine.cpp llvm/lib/Support/CommandLine.h >/dev/null
>   real	0m2.365s
>   user	0m2.252s
>   sys	  0m0.113s
>   $ hash -p ~/bin/git/bin/git git
>   $ time git log -100 -- llvm/lib/Support/CommandLine.cpp llvm/lib/Support/CommandLine.h >/dev/null
>   real	0m0.240s
>   user	0m0.158s
>   sys	  0m0.064s

Thanks for these stats, though I'd recommend trying hyperfine [1]
which with this setup would let you run these experiments as the
following:

$ $ hyperfine --warmup=3 \
> -n 'old' '~/_git/git-sparse-checkout-clean/git log -100 -- commit.c commit-graph.c' \
> -n 'new' '~/_git/git/git log -100 -- commit.c commit-graph.c'
Benchmark 1: old
  Time (mean ± σ):      73.1 ms ±   2.9 ms    [User: 48.8 ms, System: 23.9 ms]
  Range (min … max):    69.9 ms …  84.5 ms    42 runs

Benchmark 2: new
  Time (mean ± σ):      55.1 ms ±   2.9 ms    [User: 30.5 ms, System: 24.4 ms]
  Range (min … max):    51.1 ms …  61.2 ms    52 runs

Summary
  'new' ran
    1.33 ± 0.09 times faster than 'old'

And for LLVM:

$ hyperfine --warmup=3 \
> -n 'old' '~/_git/git-sparse-checkout-clean/git log -100 -- llvm/lib/Support/CommandLine.cpp llvm/lib/Support/CommandLine.h' \
> -n 'new' '~/_git/git/git log -100 -- llvm/lib/Support/CommandLine.cpp llvm/lib/Support/CommandLine.h'

Benchmark 1: old
  Time (mean ± σ):      1.974 s ±  0.006 s    [User: 1.877 s, System: 0.097 s]
  Range (min … max):    1.960 s …  1.983 s    10 runs

Benchmark 2: new
  Time (mean ± σ):     262.9 ms ±   2.4 ms    [User: 214.2 ms, System: 48.4 ms]
  Range (min … max):   257.7 ms … 266.2 ms    11 runs

Summary
  'new' ran
    7.51 ± 0.07 times faster than 'old'

[1] https://github.com/sharkdp/hyperfine

Finally, putting these performance numbers in the commit message will
make the results permanently findable in the repo history.

> 3:  c042907b92 ! 3:  60a3b16bbb bloom: replace struct bloom_key * with struct bloom_keyvec

>     -+struct bloom_keyvec *bloom_keyvec_new(size_t count)
>     ++struct bloom_keyvec *bloom_keyvec_new(const char *path, size_t len,
>     ++				      const struct bloom_filter_settings *settings)
>      +{
>      +	struct bloom_keyvec *vec;
>     -+	size_t sz = sizeof(struct bloom_keyvec);
>     -+	sz += count * sizeof(struct bloom_key);
>     ++	const char *p;
>     ++	size_t sz;
>     ++	size_t nr = 1;
>     ++
>     ++	p = path;
>     ++	while (*p) {
>     ++		/*
>     ++		 * At this point, the path is normalized to use Unix-style
>     ++		 * path separators. This is required due to how the
>     ++		 * changed-path Bloom filters store the paths.
>     ++		 */
>     ++		if (*p == '/')
>     ++			nr++;
>     ++		p++;
>     ++	}
>     ++
>     ++	sz = sizeof(struct bloom_keyvec);
>     ++	sz += nr * sizeof(struct bloom_key);
>      +	vec = (struct bloom_keyvec *)xcalloc(1, sz);
>     -+	vec->count = count;
>     ++	if (!vec)
>     ++		return NULL;
>     ++	vec->count = nr;
>     ++
>     ++	bloom_key_fill(&vec->key[0], path, len, settings);
>     ++	nr = 1;
>     ++	p = path + len - 1;
>     ++	while (p > path) {
>     ++		if (*p == '/') {
>     ++			bloom_key_fill(&vec->key[nr++], path, p - path, settings);
>     ++		}
>     ++		p--;
>     ++	}
>     ++	assert(nr == vec->count);
>      +	return vec;
>      +}

These additions to bloom_keyvec_new() certainly help simplify some
of the code movement I was talking about, but there is more that
can be done to simplify patch 4. I'll send a couple example patches
in reply to patch 4 with what I mean.

Overall, the patch series is correct. My complaints are stylistic
more than anything.

Thanks,
-Stolee