On Wed, Jul 09, 2025 at 09:54:53AM +0200, Patrick Steinhardt wrote: > The function `get_multi_pack_index()` loads multi-pack indices via > `prepare_packed_git()` and then returns the linked list of multi-pack > indices that is stored in `struct object_database`. That list is in the > process of being removed though in favor of storing the MIDX as part of > the object database source it belongs to. > > Refactor `get_multi_pack_index()` so that it returns the multi-pack > index for a single object source. Callers are now expected to call this > function for each source they are interested in. This requires them to > iterate through alternates, so we have to prepare alternate object > sources before doing so. > > Signed-off-by: Patrick Steinhardt <ps@xxxxxx> > --- > builtin/pack-objects.c | 9 ++++++--- > builtin/repack.c | 4 ++-- > midx-write.c | 22 ++-------------------- > object-name.c | 21 ++++++++++++++------- > pack-bitmap.c | 20 ++++++++++++++------ > packfile.c | 30 +++++++++++------------------- > packfile.h | 3 +-- > 7 files changed, 50 insertions(+), 59 deletions(-) > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 5781dec9808..f889e69e07d 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -1701,7 +1701,6 @@ static int want_object_in_pack_mtime(const struct object_id *oid, > { > int want; > struct list_head *pos; > - struct multi_pack_index *m; > > if (!exclude && local && has_loose_object_nonlocal(oid)) > return 0; > @@ -1721,9 +1720,13 @@ static int want_object_in_pack_mtime(const struct object_id *oid, > *found_offset = 0; > } > > - for (m = get_multi_pack_index(the_repository); m; m = m->next) { > + odb_prepare_alternates(the_repository->objects); > + > + for (struct odb_source *source = the_repository->objects->sources; source; source = source->next) { > + struct multi_pack_index *m = get_multi_pack_index(source); All makes sense up to this point. I would suggest adding the following to the top of this function: struct repository *r = the_repository; struct odb_source *source; to shorten up these lines a bit, but I'm nit-picking so if you feel strongly that the existing patch is clearer, I won't object. > struct pack_entry e; > - if (fill_midx_entry(the_repository, oid, &e, m)) { > + > + if (m && fill_midx_entry(the_repository, oid, &e, m)) { Good. > want = want_object_in_pack_one(e.p, oid, exclude, found_pack, found_offset, found_mtime); > if (want != -1) > return want; > diff --git a/builtin/repack.c b/builtin/repack.c > index 9bbf032b6dd..5956df5d927 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -218,9 +218,9 @@ static void mark_packs_for_deletion(struct existing_packs *existing, > static void remove_redundant_pack(const char *dir_name, const char *base_name) > { > struct strbuf buf = STRBUF_INIT; > - struct multi_pack_index *m = get_local_multi_pack_index(the_repository); > + struct multi_pack_index *m = get_multi_pack_index(the_repository->objects->sources); Is the first source always guaranteed to be the local one? I assume that the answer here is "yes", and there are certainly other places in the code where we make a similar assumption. But just wanted to make sure as this popped into my head while reading. > strbuf_addf(&buf, "%s.pack", base_name); > - if (m && midx_contains_pack(m, buf.buf)) > + if (m && m->local && midx_contains_pack(m, buf.buf)) ...hmm, maybe not? > clear_midx_file(the_repository); > strbuf_insertf(&buf, 0, "%s/", dir_name); > unlink_pack_path(buf.buf, 1); > diff --git a/midx-write.c b/midx-write.c > index f2cfb85476e..c1ae62d3549 100644 > --- a/midx-write.c > +++ b/midx-write.c > @@ -916,26 +916,8 @@ static int write_midx_bitmap(struct write_midx_context *ctx, > static struct multi_pack_index *lookup_multi_pack_index(struct repository *r, > const char *object_dir) > { > - struct multi_pack_index *result = NULL; > - struct multi_pack_index *cur; > - char *obj_dir_real = real_pathdup(object_dir, 1); > - struct strbuf cur_path_real = STRBUF_INIT; > - > - /* Ensure the given object_dir is local, or a known alternate. */ > - odb_find_source(r->objects, obj_dir_real); > - > - for (cur = get_multi_pack_index(r); cur; cur = cur->next) { > - strbuf_realpath(&cur_path_real, cur->object_dir, 1); > - if (!strcmp(obj_dir_real, cur_path_real.buf)) { > - result = cur; > - goto cleanup; > - } > - } > - > -cleanup: > - free(obj_dir_real); > - strbuf_release(&cur_path_real); > - return result; > + struct odb_source *source = odb_find_source(r->objects, object_dir); > + return get_multi_pack_index(source); When I first read this I wondered what would happen if we passed in an unknown object_dir such that odb_find_source() returned NULL. But that function will never return NULL, and instead will die() if given a bogus object_dir. So this is fine, though I would have imagined that we'd return NULL within odb_find_source() and let the caller die() (or not). > } > > static int fill_packs_from_midx(struct write_midx_context *ctx, > diff --git a/object-name.c b/object-name.c > index ddafe7f9b13..1e7fdcb90a8 100644 > --- a/object-name.c > +++ b/object-name.c > @@ -198,16 +198,19 @@ static void unique_in_pack(struct packed_git *p, > > static void find_short_packed_object(struct disambiguate_state *ds) > { > - struct multi_pack_index *m; > struct packed_git *p; > > /* Skip, unless oids from the storage hash algorithm are wanted */ > if (ds->bin_pfx.algo && (&hash_algos[ds->bin_pfx.algo] != ds->repo->hash_algo)) > return; > > - for (m = get_multi_pack_index(ds->repo); m && !ds->ambiguous; > - m = m->next) > - unique_in_midx(m, ds); > + odb_prepare_alternates(ds->repo->objects); > + for (struct odb_source *source = ds->repo->objects->sources; source && !ds->ambiguous; source = source->next) { > + struct multi_pack_index *m = get_multi_pack_index(source); > + if (m) > + unique_in_midx(m, ds); > + } > + Makes sense, though now having seen this pattern a few times, I am wondering if it would be worth it to add a utility function that takes a callback and iterates over the various MIDXs. But perhaps that is taking DRY-ing things up a little too far ;-). For what it's worth, I do think that what you wrote here makes more logical sense: MIDXs are tied to individual alternate object DBs, which here means that there is one MIDX per "struct odb_source". It just is a little more verbose to type out. > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 0a4af199c05..7b51d381837 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -692,14 +692,16 @@ static int open_midx_bitmap(struct repository *r, > struct bitmap_index *bitmap_git) > { > int ret = -1; > - struct multi_pack_index *midx; > > assert(!bitmap_git->map); > > - for (midx = get_multi_pack_index(r); midx; midx = midx->next) { > - if (!open_midx_bitmap_1(bitmap_git, midx)) > + odb_prepare_alternates(r->objects); > + for (struct odb_source *source = r->objects->sources; source; source = source->next) { > + struct multi_pack_index *midx = get_multi_pack_index(source); > + if (midx && !open_midx_bitmap_1(bitmap_git, midx)) > ret = 0; > } Makes sense. > + Stray diff? > @@ -3307,9 +3309,15 @@ int verify_bitmap_files(struct repository *r) > { > int res = 0; > > - for (struct multi_pack_index *m = get_multi_pack_index(r); > - m; m = m->next) { > - char *midx_bitmap_name = midx_bitmap_filename(m); > + odb_prepare_alternates(r->objects); > + for (struct odb_source *source = r->objects->sources; source; source = source->next) { > + struct multi_pack_index *m = get_multi_pack_index(source); > + char *midx_bitmap_name; > + > + if (!m) > + continue; > + > + midx_bitmap_name = midx_bitmap_filename(m); This one looks good as well. > diff --git a/packfile.c b/packfile.c > index e5d9d7ac8bc..e1ced050451 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -963,14 +963,17 @@ static void prepare_packed_git(struct repository *r); > unsigned long repo_approximate_object_count(struct repository *r) > { > if (!r->objects->approximate_object_count_valid) { > - unsigned long count; > - struct multi_pack_index *m; > + unsigned long count = 0; > struct packed_git *p; > > prepare_packed_git(r); I was wondering where the odb_prepare_alternates() call was in this one, but it is a byproduct of calling prepare_packed_git(). So this spot looks OK as well. > - count = 0; > - for (m = get_multi_pack_index(r); m; m = m->next) > - count += m->num_objects; > + > + for (struct odb_source *source = r->objects->sources; source; source = source->next) { > + struct multi_pack_index *m = get_multi_pack_index(source); > + if (m) > + count += m->num_objects; Unrelated to your patch, I wonder if it makes sense to check for overflow here. I think so, though presumably if we are overflowing "unsigned long" then likely we have much bigger problems to worry about ;-). The rest looks good to me. Thanks, Taylor