On 8/24/25 12:31 PM, Jeff King wrote:
> On Sun, Aug 24, 2025 at 10:37:28AM +0200, René Scharfe wrote:
>
>> We could dedicate an object flag to track queue membership, but that
>> would leave less for candidate tags, affecting the results.  So use a
>> hash table, specifically a khash set of commit pointers, to track that.
>> This avoids quadratic behaviour in all cases and provides a nice
>> performance boost over the previous commit, 08bb69d70f (describe: use
>> prio_queue_replace(), 2025-08-03):
>>
>>   Benchmark 1: ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)
>>     Time (mean ± σ):     851.7 ms ±   1.1 ms    [User: 788.7 ms, System: 49.2 ms]
>>     Range (min … max):   849.4 ms … 852.8 ms    10 runs
>>
>>   Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0)
>>     Time (mean ± σ):     607.1 ms ±   0.9 ms    [User: 544.6 ms, System: 48.6 ms]
>>     Range (min … max):   606.1 ms … 608.3 ms    10 runs
>>
>>   Summary
>>     ./git describe $(git rev-list v2.41.0..v2.47.0) ran
>>       1.40 ± 0.00 times faster than ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)
>>
>> Use the commit index value as a hash because it is unique, has the
>> right size and needs no computation.  We could also derive the hash
>> directly from the pointer value, but that requires slightly more effort.
>
> Interesting.  This is exactly what commit-slabs were created for (and are
> why the convenient index value is there in the first place!).

Kinda -- they have a payload value, while a khash set only stores keys.
A commit slab with an int payload would still be smaller than a khash
set with 64-bit pointers as keys -- IF we were to add all commits.
Here we typically add just a few, but a pathological history could add
a lot; not sure if there's a boundary.  Hmm.

So you might be able to find examples where commit-slabs win.

> The idea of commit-slab was to have a zero-cost lookup, which is done by
> indexing into an array (well, really an array-of-arrays).  The biggest
> downside of commit-slabs is that they allocate one element per commit.
> So for a sparse set you end up over-allocating and possibly suffering
> cache woes due to requests being far apart.

Right, and it doesn't support removal.

> Whereas in your technique we are trading a little bit of computation
> (indexing a bucket and then probing for the match) to get a table that
> scales with the number of elements actually added to it.
>
> It should be easy to convert between the two and time it.  On top of
> your patch, I think this works:
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index edb4dec79d..f1d1ce8c8e 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -287,36 +287,38 @@ static void lazy_queue_clear(struct lazy_queue *queue)
>  	queue->get_pending = false;
>  }
>
> -static inline unsigned int commit_index(const struct commit *commit)
> -{
> -	return commit->index;
> -}
> -
> -static inline int ptr_eq(const void *a, const void *b)
> -{
> -	return a == b;
> -}
> +define_commit_slab(commit_counter, int);

We only need one bit, so a uint8_t or char would suffice.
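E.g. (untested sketch of the narrower slab; commit_counter_at() below
would then hand out uint8_t pointers instead of int pointers):

	#include "git-compat-util.h"
	#include "commit-slab.h"

	/* one byte per commit is plenty for a yes/no membership flag */
	define_commit_slab(commit_counter, uint8_t);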
>
> -KHASH_INIT(commit_set, struct commit *, int, 0, commit_index, ptr_eq)
> +struct commit_set {
> +	int nr;
> +	struct commit_counter present;
> +};
>
> -static void commit_set_insert(kh_commit_set_t *set, struct commit *commit)
> +static void commit_set_insert(struct commit_set *set, struct commit *commit)
>  {
> -	int added;
> -	kh_put_commit_set(set, commit, &added);
> +	int *v = commit_counter_at(&set->present, commit);
> +	if (!*v) {
> +		set->nr++;
> +		*v = 1;
> +	}
>  }
>
> -static void commit_set_remove(kh_commit_set_t *set, struct commit *commit)
> +static void commit_set_remove(struct commit_set *set, struct commit *commit)
>  {
> -	khiter_t pos = kh_get_commit_set(set, commit);
> -	if (pos != kh_end(set))
> -		kh_del_commit_set(set, pos);
> +	int *v = commit_counter_peek(&set->present, commit);
> +	if (*v) {
> +		set->nr--;
> +		*v = 0;
> +	}
>  }
>
>  static unsigned long finish_depth_computation(struct lazy_queue *queue,
>  					      struct possible_tag *best)
>  {
>  	unsigned long seen_commits = 0;
> -	kh_commit_set_t unflagged = { 0 };
> +	struct commit_set unflagged = { 0 };
> +
> +	init_commit_counter(&unflagged.present);
>
>  	for (size_t i = queue->get_pending ? 1 : 0; i < queue->queue.nr; i++) {
>  		struct commit *commit = queue->queue.array[i].data;
> @@ -330,7 +332,7 @@ static unsigned long finish_depth_computation(struct lazy_queue *queue,
>
>  		seen_commits++;
>  		if (c->object.flags & best->flag_within) {
> -			if (!kh_size(&unflagged))
> +			if (!unflagged.nr)
>  				break;
>  		} else {
>  			commit_set_remove(&unflagged, c);
> @@ -354,7 +356,7 @@ static unsigned long finish_depth_computation(struct lazy_queue *queue,
>  			parents = parents->next;
>  		}
>  	}
> -	kh_release_commit_set(&unflagged);
> +	clear_commit_counter(&unflagged.present);
>  	return seen_commits;
>  }
>
>
> Here's what I get for timing:
>
>   Benchmark 1: ./git.orig describe $(git rev-list v2.41.0..v2.47.0)
>     Time (mean ± σ):      1.195 s ±  0.012 s    [User: 1.152 s, System: 0.045 s]
>     Range (min … max):    1.175 s …  1.220 s    10 runs
>
>   Benchmark 2: ./git.khash describe $(git rev-list v2.41.0..v2.47.0)
>     Time (mean ± σ):     912.4 ms ±   5.7 ms    [User: 867.7 ms, System: 46.3 ms]
>     Range (min … max):   901.1 ms … 921.2 ms    10 runs
>
>   Benchmark 3: ./git.slab describe $(git rev-list v2.41.0..v2.47.0)
>     Time (mean ± σ):     937.9 ms ±   7.6 ms    [User: 896.1 ms, System: 43.5 ms]
>     Range (min … max):   924.8 ms … 947.9 ms    10 runs
>
>   Summary
>     ./git.khash describe $(git rev-list v2.41.0..v2.47.0) ran
>       1.03 ± 0.01 times faster than ./git.slab describe $(git rev-list v2.41.0..v2.47.0)
>       1.31 ± 0.02 times faster than ./git.orig describe $(git rev-list v2.41.0..v2.47.0)

I didn't even consider commit-slabs a contender due to the memory
overhead.  The difference is surprisingly small.  I also got 3%:

  Benchmark 1: ./git_khash describe $(git rev-list v2.41.0..v2.47.0)
    Time (mean ± σ):     608.4 ms ±   0.6 ms    [User: 544.6 ms, System: 49.2 ms]
    Range (min … max):   607.5 ms … 609.3 ms    10 runs

  Benchmark 2: ./git_slab describe $(git rev-list v2.41.0..v2.47.0)
    Time (mean ± σ):     624.4 ms ±   1.0 ms    [User: 560.6 ms, System: 49.3 ms]
    Range (min … max):   623.3 ms … 626.5 ms    10 runs

  Summary
    ./git_khash describe $(git rev-list v2.41.0..v2.47.0) ran
      1.03 ± 0.00 times faster than ./git_slab describe $(git rev-list v2.41.0..v2.47.0)

I guess hash tables are just fast, even the slower ones.  Provided
they fit into memory.

> So I see similar speedups vs stock Git using your patch, but the
> commit-slab version is just slightly slower.
> That of course makes me wonder if we could or should replace the guts
> of commit-slab with a hash more like this.  Some obvious questions:
>
>   1. Does the hash always perform better?  For a dense set, might the
>      commit-slab do better (probably something like topo-sort would be
>      a good test there).

By dense you mean that most commits get some data value assigned and
keep it for longer?  That's when commit-slabs should shine.

>   2. Can the hash version handle strides of different sizes?  One of
>      the points of commit-slab is that the fixed size of the value type
>      can be set at runtime (so you could have a slab of 32 bits per
>      commit, or 132, depending on your traversal needs).

Dynamic value sizes require an indirection via a pointer, or at least I
don't see any other way.  What would be a possible use case?  (Don't
see any in-tree.)

>   3. How does it perform if we swap the commit->index field for using
>      the pointer?  If it's similar or faster, we could get rid of the
>      commit->index field entirely.  Besides saving a few bytes and
>      being simpler, that would also mean that we could start to use the
>      same slab techniques for non-commit objects.  There are several
>      cases where we use a few custom bits in object.flags because we
>      need to cover both commits and other objects.  But those are error
>      prone if two sub-systems of Git use the same bits and happen to
>      run at the same time without clearing in between.  It would be
>      great if each algorithm could declare its own unique flag space
>      (and discard them in one action without iterating yet again to
>      clear the bits).

With "commit" being the pointer, returning "(uintptr_t)commit / 8" in
the hash function for the khash set gets me the same performance.  This
assumes an eight-byte alignment and that pointer values are flat
indexes into memory, which might be too much.  A portable solution
would probably have to mix and spread out all bits evenly?

The nice thing about commit-slabs is the lack of required maintenance.
They just allocate as needed and never give anything back, never need
to rehash.  And they don't need to store the keys anywhere.  They
should be good alternatives to object flags used during full history
traversal without flagging non-commits.  Off-the-shelf hash tables like
khash might be slower in these cases, though not far off, I expect.

René
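P.S.: For reference, the pointer-based hash from that test looks
something like this (a sketch; the function name is made up):

	/*
	 * Derive the hash directly from the pointer value.  Commits are
	 * allocated with at least eight-byte alignment, so the lowest
	 * three bits carry no information; dividing by 8 drops them.
	 * This treats pointer values as flat indexes into memory, which
	 * the C standard does not guarantee.
	 */
	static inline unsigned int commit_ptr_hash(const struct commit *commit)
	{
		return (uintptr_t)commit / 8;
	}

	KHASH_INIT(commit_set, struct commit *, int, 0, commit_ptr_hash, ptr_eq)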