On 6/18/2025 2:08 PM, Phil Hord wrote: > From: Phil Hord <phil.hord@xxxxxxxxx> > > When pruning during `git fetch` we check each pruned ref against the > ref_store one at a time to decide whether to report it as dangling. > This causes every local ref to be scanned for each ref being pruned. > > If there are N refs in the repo and M refs being pruned, this code is > O(M*N). However, `git remote prune` uses a very similar function that > is only O(N*log(M)). > > Remove the wasteful ref scanning for each pruned ref and use the faster > version already available in refs_warn_dangling_symrefs. > > In a repo with 126,000 refs, where I was pruning 28,000 refs, this > code made about 3.6 billion calls to strcmp and consumed 410 seconds > of CPU. (Invariably in that time, my remote would timeout and the > fetch would fail anyway.) > > After this change, the same operation completes in under 4 seconds. > The cover letter said "under a second". Is this a different example? > I considered further optimizing this function to be O(N), but this > requires ref_store iterators to be sorted, too. I found some suggestions > that this is always the case, but I'm not certain it is. > > The current speedup is enough for our needs at the moment. > Yep. Logarithmic scaling grows slow enough that this is probably reasonable unless someone wants to put the remaining effort in. > This change causes a reordering of the output for any reported dangling > refs. Previously they would be reported inline with the "fetch: prune" > messages. Now they will be reported after all the original prune > messages are complete. > I think this is reasonable especially for the speedup. > Signed-off-by: Phil Hord <phil.hord@xxxxxxxxx> > --- Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx> > builtin/fetch.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 40a0e8d24434..11ce51da780a 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -1383,10 +1383,14 @@ static int prune_refs(struct display_state *display_state, > int result = 0; > struct ref *ref, *stale_refs = get_stale_heads(rs, ref_map); > struct strbuf err = STRBUF_INIT; > + struct string_list refnames = STRING_LIST_INIT_NODUP; > const char *dangling_msg = dry_run > ? _(" (%s will become dangling)") > : _(" (%s has become dangling)"); > > + for (ref = stale_refs; ref; ref = ref->next) > + string_list_append(&refnames, ref->name); > + > if (!dry_run) { > if (transaction) { > for (ref = stale_refs; ref; ref = ref->next) { > @@ -1396,15 +1400,9 @@ static int prune_refs(struct display_state *display_state, > goto cleanup; > } > } else { > - struct string_list refnames = STRING_LIST_INIT_NODUP; > - > - for (ref = stale_refs; ref; ref = ref->next) > - string_list_append(&refnames, ref->name); > - > result = refs_delete_refs(get_main_ref_store(the_repository), > "fetch: prune", &refnames, > 0); > - string_list_clear(&refnames, 0); > } > } > > @@ -1416,12 +1414,14 @@ static int prune_refs(struct display_state *display_state, > _("(none)"), ref->name, > &ref->new_oid, &ref->old_oid, > summary_width); > - refs_warn_dangling_symref(get_main_ref_store(the_repository), > - stderr, dangling_msg, ref->name); > } > + string_list_sort(&refnames); > + refs_warn_dangling_symrefs(get_main_ref_store(the_repository), > + stderr, dangling_msg, &refnames); > } > > cleanup: > + string_list_clear(&refnames, 0); > strbuf_release(&err); > free_refs(stale_refs); > return result;