On Wed, Jul 09, 2025 at 09:54:49AM +0200, Patrick Steinhardt wrote: > Multi-pack indices are tracked via `struct multi_pack_index`. This data > structure is stored as a linked list inside `struct object_database`, > which is the global database that spans across all of the object > sources. > > This layout causes two problems: > > - Multi-pack indices aren't global to an object database, but instead > there can be one multi-pack index per source. This creates a > mismatch between the on-disk layout and how things are organized in > the object database subsystems and makes some parts, like figuring > out whether a source has an MIDX, quite awkward. This is a little confusing to me. What do we consider to be an "object database", and what do we consider to be a "source"? For instance, if I have a repository with one or more alternates, I would imagine that each alternate is a separate "source", and the sources together comprise the object database. Does that match the way you're thinking about it? If so, that makes sense. But if not (i.e., we consider all alternates to belong to the same object database and share a single source), then I don't know how this will interact with the existing MIDX alternates mechanism. Some clarification here would be helpful, I think. > Refactor `prepare_multi_pack_index_one()` so that it works on a specific > source, which allows us to easily store a pointer to the multi-pack > index inside of it. For now, this pointer exists next to the existing > linked list we have in the object database. Users will be adjusted in > subsequent patches to instead use the per-source pointers. OK. > Signed-off-by: Patrick Steinhardt <ps@xxxxxx> > --- > midx.c | 19 +++++++++++-------- > midx.h | 7 ++++--- > odb.h | 11 +++++++++-- > packfile.c | 4 +++- > 4 files changed, 27 insertions(+), 14 deletions(-) > > diff --git a/midx.c b/midx.c > index 3c5bc821730..a91231bfcdf 100644 > --- a/midx.c > +++ b/midx.c > @@ -724,28 +724,29 @@ int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id) > return 0; > } > > -int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local) > +int prepare_multi_pack_index_one(struct odb_source *source, int local) OK, so the combination of a repository and "object_dir" path becomes a new "struct odb_source", which makes sense. But shouldn't "local" also be a property of the odb_source? I am not familiar enough with the odb_source work thus far to know whether that is a good idea or not, but just something that stood out to me while reading. > { > + struct repository *r = source->odb->repo; > struct multi_pack_index *m; > - struct multi_pack_index *m_search; > > prepare_repo_settings(r); > if (!r->settings.core_multi_pack_index) > return 0; > > - for (m_search = r->objects->multi_pack_index; m_search; m_search = m_search->next) > - if (!strcmp(object_dir, m_search->object_dir)) > - return 1; > - > - m = load_multi_pack_index(r, object_dir, local); > + if (source->multi_pack_index) > + return 1; This makes sense. The old code was walking along the linked list of MIDXs across alternates to see if we have already loaded one for the given object_dir. But since there is a one-to-one relationship between odb_source and object_dir, it suffices to see whether or not we have loaded the MIDX for a given source at all. > + m = load_multi_pack_index(r, source->path, local); > if (m) { > struct multi_pack_index *mp = r->objects->multi_pack_index; > if (mp) { > m->next = mp->next; > mp->next = m; > - } else > + } else { > r->objects->multi_pack_index = m; > + } I am glad that we are cleaning these up (since our style guide says that if one or more if/else branch(es) has braces, they all should), but it does make reading this patch a little more difficult. Not a big deal, but I would have rather seen this as a separate cleanup patch. > + source->multi_pack_index = m; As an aside, I see that we're calling this new field "multi_pack_index". I wonder if it would be better to call it "midx", since typing out "multi_pack_index" is a little verbose, and "midx" is a common abbreviation that I don't think we would cause any confusion with. > @@ -837,6 +838,8 @@ void clear_midx_file(struct repository *r) > if (r->objects && r->objects->multi_pack_index) { > close_midx(r->objects->multi_pack_index); > r->objects->multi_pack_index = NULL; > + for (struct odb_source *source = r->objects->sources; source; source = source->next) > + source->multi_pack_index = NULL; Makes sense. > diff --git a/midx.h b/midx.h > index 9d1374cbd58..b1626a9a7c4 100644 > --- a/midx.h > +++ b/midx.h > @@ -3,11 +3,12 @@ > > #include "string-list.h" > > +struct bitmapped_pack; > +struct git_hash_algo; > struct object_id; > +struct odb_source; > struct pack_entry; > struct repository; > -struct bitmapped_pack; > -struct git_hash_algo; (I'm nit-picking, but a similar note here that the unrelated changes make it harder to see what is actually going on in this hunk, which is adding a declaration for "struct odb_source".) > diff --git a/odb.h b/odb.h > index e922f256802..8e79c7be520 100644 > --- a/odb.h > +++ b/odb.h > @@ -9,10 +9,11 @@ > #include "string-list.h" > #include "thread-utils.h" > > +struct multi_pack_index; > struct oidmap; > struct oidtree; > -struct strbuf; > struct repository; > +struct strbuf; (Same here.) Thanks, Taylor