On Thu, Jul 10, 2025 at 07:56:28PM -0400, Taylor Blau wrote: > On Wed, Jul 09, 2025 at 09:54:53AM +0200, Patrick Steinhardt wrote: > > diff --git a/builtin/repack.c b/builtin/repack.c > > index 9bbf032b6dd..5956df5d927 100644 > > --- a/builtin/repack.c > > +++ b/builtin/repack.c > > @@ -218,9 +218,9 @@ static void mark_packs_for_deletion(struct existing_packs *existing, > > static void remove_redundant_pack(const char *dir_name, const char *base_name) > > { > > struct strbuf buf = STRBUF_INIT; > > - struct multi_pack_index *m = get_local_multi_pack_index(the_repository); > > + struct multi_pack_index *m = get_multi_pack_index(the_repository->objects->sources); > > Is the first source always guaranteed to be the local one? I assume that the > answer here is "yes", and there are certainly other places in the code > where we make a similar assumption. But just wanted to make sure as > this popped into my head while reading. > > > strbuf_addf(&buf, "%s.pack", base_name); > > - if (m && midx_contains_pack(m, buf.buf)) > > + if (m && m->local && midx_contains_pack(m, buf.buf)) > > ...hmm, maybe not? The answer is "almost always". The only exception is in case there is a temporary object source added via `odb_set_temporary_primary_source()`, for example for quarantine directories. Generally speaking this patch here doesn't really change anything, and we could just as well drop the `m->local` part above. But I think that overall we're not doing a good job to track the local object source, so it felt safer to me to add the above guard. > > diff --git a/midx-write.c b/midx-write.c > > index f2cfb85476e..c1ae62d3549 100644 > > --- a/midx-write.c > > +++ b/midx-write.c > > @@ -916,26 +916,8 @@ static int write_midx_bitmap(struct write_midx_context *ctx, > > static struct multi_pack_index *lookup_multi_pack_index(struct repository *r, > > const char *object_dir) > > { > > - struct multi_pack_index *result = NULL; > > - struct multi_pack_index *cur; > > - char *obj_dir_real = real_pathdup(object_dir, 1); > > - struct strbuf cur_path_real = STRBUF_INIT; > > - > > - /* Ensure the given object_dir is local, or a known alternate. */ > > - odb_find_source(r->objects, obj_dir_real); > > - > > - for (cur = get_multi_pack_index(r); cur; cur = cur->next) { > > - strbuf_realpath(&cur_path_real, cur->object_dir, 1); > > - if (!strcmp(obj_dir_real, cur_path_real.buf)) { > > - result = cur; > > - goto cleanup; > > - } > > - } > > - > > -cleanup: > > - free(obj_dir_real); > > - strbuf_release(&cur_path_real); > > - return result; > > + struct odb_source *source = odb_find_source(r->objects, object_dir); > > + return get_multi_pack_index(source); > > When I first read this I wondered what would happen if we passed in an > unknown object_dir such that odb_find_source() returned NULL. But that > function will never return NULL, and instead will die() if given a bogus > object_dir. > > So this is fine, though I would have imagined that we'd return NULL > within odb_find_source() and let the caller die() (or not). Fully agreed. I've got a follow-up patch series that does this refactoring. > > diff --git a/object-name.c b/object-name.c > > index ddafe7f9b13..1e7fdcb90a8 100644 > > --- a/object-name.c > > +++ b/object-name.c > > @@ -198,16 +198,19 @@ static void unique_in_pack(struct packed_git *p, > > > > static void find_short_packed_object(struct disambiguate_state *ds) > > { > > - struct multi_pack_index *m; > > struct packed_git *p; > > > > /* Skip, unless oids from the storage hash algorithm are wanted */ > > if (ds->bin_pfx.algo && (&hash_algos[ds->bin_pfx.algo] != ds->repo->hash_algo)) > > return; > > > > - for (m = get_multi_pack_index(ds->repo); m && !ds->ambiguous; > > - m = m->next) > > - unique_in_midx(m, ds); > > + odb_prepare_alternates(ds->repo->objects); > > + for (struct odb_source *source = ds->repo->objects->sources; source && !ds->ambiguous; source = source->next) { > > + struct multi_pack_index *m = get_multi_pack_index(source); > > + if (m) > > + unique_in_midx(m, ds); > > + } > > + > > Makes sense, though now having seen this pattern a few times, I am > wondering if it would be worth it to add a utility function that takes a > callback and iterates over the various MIDXs. But perhaps that is taking > DRY-ing things up a little too far ;-). > > For what it's worth, I do think that what you wrote here makes more > logical sense: MIDXs are tied to individual alternate object DBs, which > here means that there is one MIDX per "struct odb_source". It just is a > little more verbose to type out. I was wondering whether it would make sense to add a looping macro, but I ultimately decided against it as it doesn't make too much of a difference to really matter. I wouldn't mind adding it though if you or others feel strongly about it. Patrick