On Thu, Aug 21, 2025 at 09:39:09AM +0200, Patrick Steinhardt wrote: > When adding a packfile to it store we add it both to the list and map of > packfiles, but we don't append it to the most-recently-used list of > packs. We do know to add the packfile to the MRU list as soon as we > access any of its objects, but in between we're being inconistent. It > doesn't help that there are some subsystems that _do_ add the packfile > to the MRU after having added it, which only adds to the confusion. > > Refactor the code so that we unconditionally add packfiles to the MRU > when adding them to a packfile store. I am a little confused why prepare_midx_pack() wants to add packs to the MRU cache so eagerly, and the commit which introduced that behavior (commit af96fe3392 (midx: add packs to packed_git linked list, 2019-04-29)) doesn't focus on that area in detail. (Note that commit af96fe3392 *does* discuss a separate cache's behavior regarding the open file descriptor limit, but that LRU cache is a different one than the MRU cache we're discussing here.) What I do wonder about is why af96fe3392 adds packs to the MRU cache in the first place. As far as I can tell, we never move MIDX'd packs to the front of the MRU cache at all. There are two spots that call list_move() on the MRU cache, which are: - packfile.c::find_pack_entry(), which enumerates MIDX'd packs in a separate loop earlier on in the function, and ignores packs in the MRU cache whose p->multi_pack_index bit is set. - builtin/pack-objects.c::want_object_in_pack_mtime(), which also enumerates MIDX'd packs in a separate loop, though it does not explicitly ignore packs in the MRU cache with the multi_pack_index bit set. In practice, though, I think these two are equivalent, since want_object_in_pack_mtime() will return before it gets to the MRU cache if it found the object in a MIDX'd pack. So I don't think we need to be adding MIDX'd packs to the MRU cache in the first place. > Signed-off-by: Patrick Steinhardt <ps@xxxxxx> > --- > midx.c | 4 +--- > packfile.c | 1 + > 2 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/midx.c b/midx.c > index 95e74c79c1..3cfe7884ad 100644 > --- a/midx.c > +++ b/midx.c > @@ -476,10 +476,8 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, > struct packed_git, packmap_ent); > if (!p) { > p = add_packed_git(r, pack_name.buf, pack_name.len, m->local); > - if (p) { > + if (p) > packfile_store_add_pack(r->objects->packfiles, p); > - list_add_tail(&p->mru, &r->objects->packfiles->mru); > - } All of that aside, this portion of the diff is preserving the existing behavior, since it inlines the list_add_tail() call within packfile_store_add_pack(). OK. > diff --git a/packfile.c b/packfile.c > index c885046d9f..a79d0fc1fa 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -790,6 +790,7 @@ void packfile_store_add_pack(struct packfile_store *store, > > hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name)); > hashmap_add(&store->map, &pack->packmap_ent); > + list_add_tail(&pack->mru, &store->mru); But this spot makes me a little unsure. There are callers of what is now packfile_store_add_pack() that *don't* immediately add the pack to the MRU cache (e.g., packfile.c::prepare_pack()). I think that behavior is intentional, since we don't necessarily want to have packs in the MRU cache which haven't actually received an object lookup yet. So I am not sure I understand the full extent of this change. I think it might be worth investigating the eagerness of prepare_midx_pack() to add packs to the MRU cache, and perhaps drop that behavior altogether, which I think would obviate the need for this patch in the series. Thanks, Taylor