On Thu, Aug 21, 2025 at 09:39:10AM +0200, Patrick Steinhardt wrote:
> We have a recurring pattern where we essentially perform an upsert of a
> packfile in case it isn't yet known by the packfile store. The logic to
> do so is non-trivial as we have to reconstruct the packfile's key, check
> the map of packfiles, then create the new packfile and finally add it to
> the store.
>
> Introduce a new function that does this dance for us. Refactor callsites
> to use it.

Nice, I have definitely noticed this pattern before and thought it would
be nice to DRY it up a bit, but never got around to doing so ;-).

> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  builtin/fast-import.c |  4 ++--
>  builtin/index-pack.c  | 10 +++-------
>  midx.c                | 18 ++----------------
>  packfile.c            | 44 +++++++++++++++++++++++++++++++-------------
>  packfile.h            |  8 ++++++++
>  5 files changed, 46 insertions(+), 38 deletions(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index e9d82b31c3..a26e79689d 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -897,11 +897,11 @@ static void end_packfile(void)
>  		idx_name = keep_pack(create_index());
>
>  		/* Register the packfile with core git's machinery. */
> -		new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
> +		new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
> +						 idx_name, 1);
>  		if (!new_p)
>  			die("core git rejected index %s", idx_name);
>  		all_packs[pack_id] = new_p;
> -		packfile_store_add_pack(the_repository->objects->packfiles, new_p);

OK, we can now avoid calling packfile_store_add_pack() explicitly here,
since that is part of the new packfile_store_load_pack() function which
is called a few lines up.

That does change the order of operations a little bit (previously the
new pack would end up in 'all_packs' first before being installed, now
it's the other way around), but not in a way that I think matters.

> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index ed490dfad4..2b78ba7fe4 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
>  	rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
>  			    hash, "idx", 1);
>
> -	if (do_fsck_object) {
> -		struct packed_git *p;
> -		p = add_packed_git(the_repository, final_index_name,
> -				   strlen(final_index_name), 0);
> -		if (p)
> -			packfile_store_add_pack(the_repository->objects->packfiles, p);
> -	}
> +	if (do_fsck_object)
> +		packfile_store_load_pack(the_repository->objects->packfiles,
> +					 final_index_name, 0);

Looks obviously correct to me.

> diff --git a/midx.c b/midx.c
> index 3cfe7884ad..d30feda019 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>  		      uint32_t pack_int_id)
>  {
>  	struct strbuf pack_name = STRBUF_INIT;
> -	struct strbuf key = STRBUF_INIT;
>  	struct packed_git *p;
>
>  	pack_int_id = midx_for_pack(&m, pack_int_id);
> @@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>
>  	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
>  		    m->pack_names[pack_int_id]);
> -
> -	/* pack_map holds the ".pack" name, but we have the .idx */
> -	strbuf_addbuf(&key, &pack_name);
> -	strbuf_strip_suffix(&key, ".idx");
> -	strbuf_addstr(&key, ".pack");
> -	p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
> -					strhash(key.buf), key.buf,
> -					struct packed_git, packmap_ent);
> -	if (!p) {
> -		p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
> -		if (p)
> -			packfile_store_add_pack(r->objects->packfiles, p);
> -	}
> -
> +	p = packfile_store_load_pack(r->objects->packfiles,
> +				     pack_name.buf, m->local);

Nice.
This all looks like it preserves the right behavior, and it's nice to
see the "we have a thing that ends in '.idx', but we need one that ends
in '.pack'" logic get factored into one place, too.

> diff --git a/packfile.c b/packfile.c
> index a79d0fc1fa..f7a9967c9d 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
>  	list_add_tail(&pack->mru, &store->mru);
>  }
>
> +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> +					    const char *idx_path, int local)
> +{
> +	struct strbuf key = STRBUF_INIT;
> +	struct packed_git *p;
> +
> +	/*
> +	 * We're being called with the path to the index file, but `pack_map`
> +	 * holds the path to the packfile itself.
> +	 */
> +	strbuf_addstr(&key, idx_path);
> +	strbuf_strip_suffix(&key, ".idx");
> +	strbuf_addstr(&key, ".pack");
> +
> +	p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
> +					struct packed_git, packmap_ent);
> +	if (!p) {
> +		p = add_packed_git(store->odb->repo, idx_path,
> +				   strlen(idx_path), local);
> +		if (p)
> +			packfile_store_add_pack(store, p);
> +	}
> +
> +	strbuf_release(&key);
> +	return p;
> +}
> +

This all looks good too, and matches the behavior of the callers which
are being refactored.

>  void (*report_garbage)(unsigned seen_bits, const char *path);
>
>  static void report_helper(const struct string_list *list,
> @@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
>  			 const char *file_name, void *_data)
>  {
>  	struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
> -	struct packed_git *p;
>  	size_t base_len = full_name_len;
>
>  	if (strip_suffix_mem(full_name, &base_len, ".idx") &&
>  	    !(data->m && midx_contains_pack(data->m, file_name))) {
> -		struct hashmap_entry hent;
> -		char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
> -		unsigned int hash = strhash(pack_name);
> -		hashmap_entry_init(&hent, hash);
> -
> -		/* Don't reopen a pack we already have. */
> -		if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
> -			p = add_packed_git(data->r, full_name, full_name_len, data->local);
> -			if (p)
> -				packfile_store_add_pack(data->r->objects->packfiles, p);
> -		}
> -		free(pack_name);
> +		char *trimmed_path = xstrndup(full_name, full_name_len);
> +		packfile_store_load_pack(data->r->objects->packfiles,
> +					 trimmed_path, data->local);

I think we could avoid the allocation here by passing along the length
of the string we want to use, as in:

    packfile_store_load_pack(data->r->objects->packfiles,
                             full_name, full_name_len, data->local);

, but I prefer the way it is written here.

Thanks,
Taylor