Re: [PATCH v2 12/16] packfile: introduce function to load and add packfiles

On Thu, Aug 21, 2025 at 09:39:10AM +0200, Patrick Steinhardt wrote:
> We have a recurring pattern where we essentially perform an upsert of a
> packfile in case it isn't yet known by the packfile store. The logic to
> do so is non-trivial as we have to reconstruct the packfile's key, check
> the map of packfiles, then create the new packfile and finally add it to
> the store.
>
> Introduce a new function that does this dance for us. Refactor callsites
> to use it.

Nice, I have definitely noticed this pattern before and thought it would
be nice to DRY it up a bit, but never got around to doing so ;-).

> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  builtin/fast-import.c |  4 ++--
>  builtin/index-pack.c  | 10 +++-------
>  midx.c                | 18 ++----------------
>  packfile.c            | 44 +++++++++++++++++++++++++++++++-------------
>  packfile.h            |  8 ++++++++
>  5 files changed, 46 insertions(+), 38 deletions(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index e9d82b31c3..a26e79689d 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -897,11 +897,11 @@ static void end_packfile(void)
>  		idx_name = keep_pack(create_index());
>
>  		/* Register the packfile with core git's machinery. */
> -		new_p = add_packed_git(pack_data->repo, idx_name, strlen(idx_name), 1);
> +		new_p = packfile_store_load_pack(pack_data->repo->objects->packfiles,
> +						 idx_name, 1);
>  		if (!new_p)
>  			die("core git rejected index %s", idx_name);
>  		all_packs[pack_id] = new_p;
> -		packfile_store_add_pack(the_repository->objects->packfiles, new_p);

OK, we can now avoid calling packfile_store_add_pack() explicitly here,
since that is part of the new packfile_store_load_pack() function which
is called a few lines up. That does change the order of operations a
little bit (previously the new pack would end up in 'all_packs' first
before being installed, now it's the other way around), but not in a way
that I think matters.

> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index ed490dfad4..2b78ba7fe4 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1640,13 +1640,9 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
>  	rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
>  			    hash, "idx", 1);
>
> -	if (do_fsck_object) {
> -		struct packed_git *p;
> -		p = add_packed_git(the_repository, final_index_name,
> -				   strlen(final_index_name), 0);
> -		if (p)
> -			packfile_store_add_pack(the_repository->objects->packfiles, p);
> -	}
> +	if (do_fsck_object)
> +		packfile_store_load_pack(the_repository->objects->packfiles,
> +					 final_index_name, 0);

Looks obviously correct to me.

> diff --git a/midx.c b/midx.c
> index 3cfe7884ad..d30feda019 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -454,7 +454,6 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>  		      uint32_t pack_int_id)
>  {
>  	struct strbuf pack_name = STRBUF_INIT;
> -	struct strbuf key = STRBUF_INIT;
>  	struct packed_git *p;
>
>  	pack_int_id = midx_for_pack(&m, pack_int_id);
> @@ -466,22 +465,9 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
>
>  	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
>  		    m->pack_names[pack_int_id]);
> -
> -	/* pack_map holds the ".pack" name, but we have the .idx */
> -	strbuf_addbuf(&key, &pack_name);
> -	strbuf_strip_suffix(&key, ".idx");
> -	strbuf_addstr(&key, ".pack");
> -	p = hashmap_get_entry_from_hash(&r->objects->packfiles->map,
> -					strhash(key.buf), key.buf,
> -					struct packed_git, packmap_ent);
> -	if (!p) {
> -		p = add_packed_git(r, pack_name.buf, pack_name.len, m->local);
> -		if (p)
> -			packfile_store_add_pack(r->objects->packfiles, p);
> -	}
> -
> +	p = packfile_store_load_pack(r->objects->packfiles,
> +				     pack_name.buf, m->local);

Nice. This all looks like it preserves the right behavior, and it's nice
to see the "we have a thing that ends in '.idx', but we need one that
ends in '.pack'" logic move into the new helper, too.

> diff --git a/packfile.c b/packfile.c
> index a79d0fc1fa..f7a9967c9d 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -793,6 +793,33 @@ void packfile_store_add_pack(struct packfile_store *store,
>  	list_add_tail(&pack->mru, &store->mru);
>  }
>
> +struct packed_git *packfile_store_load_pack(struct packfile_store *store,
> +					    const char *idx_path, int local)
> +{
> +	struct strbuf key = STRBUF_INIT;
> +	struct packed_git *p;
> +
> +	/*
> +	 * We're being called with the path to the index file, but `pack_map`
> +	 * holds the path to the packfile itself.
> +	 */
> +	strbuf_addstr(&key, idx_path);
> +	strbuf_strip_suffix(&key, ".idx");
> +	strbuf_addstr(&key, ".pack");
> +
> +	p = hashmap_get_entry_from_hash(&store->map, strhash(key.buf), key.buf,
> +					struct packed_git, packmap_ent);
> +	if (!p) {
> +		p = add_packed_git(store->odb->repo, idx_path,
> +				   strlen(idx_path), local);
> +		if (p)
> +			packfile_store_add_pack(store, p);
> +	}
> +
> +	strbuf_release(&key);
> +	return p;
> +}
> +

This all looks good too, and matches the behavior of the callers which
are being refactored.
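
To make the "look up, otherwise load and register" flow concrete, here is
a minimal standalone model of it. Everything in it (struct toy_pack,
struct toy_store, toy_pack_key(), toy_store_load_pack(), and the linked
list standing in for the hashmap) is made up for illustration and is not
git's code; it only assumes the behavior visible in the hunk above.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for git's structures; all names here are hypothetical. */
struct toy_pack {
	char *pack_path;        /* lookup key: path ending in ".pack" */
	int local;
	struct toy_pack *next;
};

struct toy_store {
	struct toy_pack *packs; /* a linked list instead of git's hashmap */
	size_t nr_loaded;       /* how many packs were actually created */
};

/* Build "<base>.pack" from "<base>.idx"; the caller frees the result. */
static char *toy_pack_key(const char *idx_path)
{
	size_t len = strlen(idx_path);
	char *key;

	/* Drop a trailing ".idx" if present, then append ".pack". */
	if (len >= 4 && !strcmp(idx_path + len - 4, ".idx"))
		len -= 4;
	key = malloc(len + sizeof(".pack"));
	if (!key)
		return NULL;
	memcpy(key, idx_path, len);
	memcpy(key + len, ".pack", sizeof(".pack")); /* copies the NUL too */
	return key;
}

/* Look the pack up by key; create and register it only when missing. */
static struct toy_pack *toy_store_load_pack(struct toy_store *store,
					    const char *idx_path, int local)
{
	char *key = toy_pack_key(idx_path);
	struct toy_pack *p;

	if (!key)
		return NULL;

	for (p = store->packs; p; p = p->next)
		if (!strcmp(p->pack_path, key))
			break;

	if (!p) {
		p = calloc(1, sizeof(*p));
		if (p) {
			p->pack_path = key;
			key = NULL; /* ownership moved into the new entry */
			p->local = local;
			p->next = store->packs;
			store->packs = p;
			store->nr_loaded++;
		}
	}

	free(key); /* no-op when the key was handed to the new entry */
	return p;
}
```

In this model, calling toy_store_load_pack() twice with the same ".idx"
path hands back the same entry and bumps nr_loaded only once, which is
the property the refactored callers rely on.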

>  void (*report_garbage)(unsigned seen_bits, const char *path);
>
>  static void report_helper(const struct string_list *list,
> @@ -892,23 +919,14 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
>  			 const char *file_name, void *_data)
>  {
>  	struct prepare_pack_data *data = (struct prepare_pack_data *)_data;
> -	struct packed_git *p;
>  	size_t base_len = full_name_len;
>
>  	if (strip_suffix_mem(full_name, &base_len, ".idx") &&
>  	    !(data->m && midx_contains_pack(data->m, file_name))) {
> -		struct hashmap_entry hent;
> -		char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
> -		unsigned int hash = strhash(pack_name);
> -		hashmap_entry_init(&hent, hash);
> -
> -		/* Don't reopen a pack we already have. */
> -		if (!hashmap_get(&data->r->objects->packfiles->map, &hent, pack_name)) {
> -			p = add_packed_git(data->r, full_name, full_name_len, data->local);
> -			if (p)
> -				packfile_store_add_pack(data->r->objects->packfiles, p);
> -		}
> -		free(pack_name);
> +		char *trimmed_path = xstrndup(full_name, full_name_len);
> +		packfile_store_load_pack(data->r->objects->packfiles,
> +					 trimmed_path, data->local);

I think we could avoid the allocation here by passing along the length
of the string we want to use, as in:

    packfile_store_load_pack(data->r->objects->packfiles,
                             full_name, full_name_len,
                             data->local);

, but I prefer the way it is written here.
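
For reference, a rough sketch of what the length-taking shape could look
like for the key derivation alone. The function name
pack_key_from_idx_mem() is hypothetical and this is plain C rather than
git's strbuf API. Note that the key itself still needs its own buffer;
what goes away is only the extra NUL-terminated copy of the path on the
caller's side.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical length-aware key builder: reads exactly `len` bytes from
 * `idx_path`, so the caller does not need a NUL-terminated copy.
 * Returns a newly allocated "<base>.pack" string, or NULL on allocation
 * failure; the caller frees it.
 */
static char *pack_key_from_idx_mem(const char *idx_path, size_t len)
{
	static const char idx_ext[] = ".idx";
	static const char pack_ext[] = ".pack";
	size_t idx_ext_len = sizeof(idx_ext) - 1;
	char *key;

	/* Strip a trailing ".idx" within the first `len` bytes, if any. */
	if (len >= idx_ext_len &&
	    !memcmp(idx_path + len - idx_ext_len, idx_ext, idx_ext_len))
		len -= idx_ext_len;

	key = malloc(len + sizeof(pack_ext));
	if (!key)
		return NULL;
	memcpy(key, idx_path, len);
	memcpy(key + len, pack_ext, sizeof(pack_ext)); /* copies the NUL too */
	return key;
}
```

The cost is one more parameter at every call site, which is a fair
argument for keeping the simpler NUL-terminated interface.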

Thanks,
Taylor



