Re: [PATCH 1/3] fetch: use batched reference updates

On Wed, May 14, 2025 at 11:03:47AM +0200, Karthik Nayak wrote:
> The reference updates performed as a part of 'git-fetch(1)', take place

s/,//

> one at a time. For each reference update, a new transaction is created
> and committed. This is necessary to ensure we can allow individual
> updates to fail without failing the entire command. The command also
> supports an '--atomic' mode, which uses a single transaction to update
> all of the references. But this mode has an all-or-nothing approach,
> where if a single update fails, all updates would fail.
> 
> In 23fc8e4f61 (refs: implement batch reference update support,
> 2025-04-08), we introduced a new mechanism to batch reference updates.
> Under the hood, this uses a single transaction to perform a batch of
> reference updates, while allowing only individual updates to fail.
> Utilize this newly introduced batch update mechanism in 'git-fetch(1)'.
> This provides a significant bump in performance, especially when dealing
> with repositories with large number of references.
> 
> Adding support for batched updates is simply modifying the flow to also
> create a batch update transaction in the non-atomic flow.
> 
> With the reftable backend there is a 22x performance improvement, when
> performing 'git-fetch(1)' with 10000 refs:
> 
>   Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master)
>     Time (mean ± σ):      3.403 s ±  0.775 s    [User: 1.875 s, System: 1.417 s]
>     Range (min … max):    2.454 s …  4.529 s    10 runs
> 
>   Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD)
>     Time (mean ± σ):     154.3 ms ±  17.6 ms    [User: 102.5 ms, System: 56.1 ms]
>     Range (min … max):   145.2 ms … 220.5 ms    18 runs
> 
>   Summary
>     fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran
>      22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master)

Nice. The speedup is larger than I had originally anticipated, but I
certainly won't complain about that :) For a good part, the speedup
should come from us no longer having to perform 10000 auto-compactions,
one per update, and no longer having to write 10000 new tables.
Instead, we write a single new table and perform compaction only once.
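
For readers who haven't followed the batch-update series, the new flow
roughly has the following shape. This is only a heavily simplified
sketch of how I understand it, not the exact code from this patch: the
helper name `update_refs_batched` and its parameters are made up for
illustration, and `ref_transaction_for_each_rejected_update()` is my
assumption of what the iteration helper from 23fc8e4f61 is called. It
assumes the usual builtin/fetch.c includes ("refs.h", "remote.h",
"repository.h", ...).

static int update_refs_batched(struct ref *ref_map, const char *remote_name)
{
	struct strbuf err = STRBUF_INIT;
	struct ref_transaction *transaction;
	struct ref *rm;
	int retcode = 0;
	struct ref_rejection_data data = {
		.retcode = &retcode,
		.remote_name = remote_name,
	};

	/*
	 * A single transaction for the whole batch; ALLOW_FAILURE means
	 * individual updates may be rejected without aborting the rest.
	 */
	transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
						  REF_TRANSACTION_ALLOW_FAILURE, &err);
	if (!transaction)
		return error("%s", err.buf);

	/* Queue every ref update into that one transaction. */
	for (rm = ref_map; rm; rm = rm->next) {
		if (!rm->peer_ref)
			continue; /* only recorded in FETCH_HEAD */
		if (ref_transaction_update(transaction, rm->peer_ref->name,
					   &rm->old_oid, NULL, NULL, NULL,
					   0, "fetch", &err)) {
			retcode = error("%s", err.buf);
			goto out;
		}
	}

	/*
	 * Commit once: with reftable this writes one new table and triggers
	 * at most one auto-compaction instead of 10000 of them.
	 */
	if (ref_transaction_commit(transaction, &err))
		retcode = error("%s", err.buf);

	/* Only the rejected updates get reported afterwards. */
	ref_transaction_for_each_rejected_update(transaction,
						 ref_transaction_rejection_handler,
						 &data);
out:
	ref_transaction_free(transaction);
	strbuf_release(&err);
	return retcode;
}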

> In similar conditions, the files backend sees a 1.25x performance
> improvement:
> 
>   Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master)
>     Time (mean ± σ):     605.5 ms ±   9.4 ms    [User: 117.8 ms, System: 483.3 ms]
>     Range (min … max):   595.6 ms … 621.5 ms    10 runs
> 
>   Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD)
>     Time (mean ± σ):     485.8 ms ±   4.3 ms    [User: 91.1 ms, System: 396.7 ms]
>     Range (min … max):   477.6 ms … 494.3 ms    10 runs
> 
>   Summary
>     fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran
>       1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master)

Yup, this makes sense, as well. We shed a bunch of overhead by no longer
creating a separate transaction for each of the updates, but the slow
path is still that we have to create 10000 loose ref files, one for each
reference. So a small performance improvement is expected, but nothing
game changing.

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 5279997c96..1558f6d1e8 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1688,6 +1644,32 @@ static int set_head(const struct ref *remote_refs, struct remote *remote)
>  	return result;
>  }
>  
> +struct ref_rejection_data {
> +	int *retcode;
> +	int conflict_msg_shown;
> +	const char *remote_name;
> +};
> +
> +static void ref_transaction_rejection_handler(const char *refname UNUSED,
> +					      const struct object_id *old_oid UNUSED,
> +					      const struct object_id *new_oid UNUSED,
> +					      const char *old_target UNUSED,
> +					      const char *new_target UNUSED,
> +					      enum ref_transaction_error err,
> +					      void *cb_data)
> +{
> +	struct ref_rejection_data *data = (struct ref_rejection_data *)cb_data;
> +
> +	if (err == REF_TRANSACTION_ERROR_NAME_CONFLICT && !data->conflict_msg_shown) {
> +		error(_("some local refs could not be updated; try running\n"
> +			" 'git remote prune %s' to remove any old, conflicting "
> +			"branches"), data->remote_name);
> +		data->conflict_msg_shown = 1;
> +	}
> +
> +	*data->retcode = 1;
> +}
> +
>  static int do_fetch(struct transport *transport,
>  		    struct refspec *rs,
>  		    const struct fetch_config *config)

Okay, so we now handle errors over here. Is the handled error the only
error that we may see, or do we also accept other errors like D/F now?
If the latter, wouldn't it mean that we don't print any error messages
for those other failures at all? That might be quite confusing.
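
If we do accept other rejection types here, I would have expected the
handler to at least print something generic for them, roughly like the
following. This is only to illustrate what I mean, not a concrete
suggestion for the exact wording:

static void ref_transaction_rejection_handler(const char *refname,
					      const struct object_id *old_oid UNUSED,
					      const struct object_id *new_oid UNUSED,
					      const char *old_target UNUSED,
					      const char *new_target UNUSED,
					      enum ref_transaction_error err,
					      void *cb_data)
{
	struct ref_rejection_data *data = (struct ref_rejection_data *)cb_data;

	if (err == REF_TRANSACTION_ERROR_NAME_CONFLICT) {
		if (!data->conflict_msg_shown) {
			error(_("some local refs could not be updated; try running\n"
				" 'git remote prune %s' to remove any old, conflicting "
				"branches"), data->remote_name);
			data->conflict_msg_shown = 1;
		}
	} else {
		/* Surface rejections that are not name conflicts, as well. */
		error(_("failed to update ref '%s'"), refname);
	}

	*data->retcode = 1;
}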

> @@ -1808,6 +1790,20 @@ static int do_fetch(struct transport *transport,
>  			retcode = 1;
>  	}
>  
> +	/*
> +	 * If not atomic, we can still use batched updates, which would be much
> +	 * more performent. We don't initiate the transaction before pruning,

s/performent/performant/

> +	 * since pruning must be an independent step, to avoid F/D conflicts.
> +	 */
> +	if (!transaction) {
> +		transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
> +							  REF_TRANSACTION_ALLOW_FAILURE, &err);
> +		if (!transaction) {
> +			retcode = -1;
> +			goto cleanup;
> +		}
> +	}
> +
>  	if (fetch_and_consume_refs(&display_state, transport, transaction, ref_map,
>  				   &fetch_head, config)) {
>  		retcode = 1;

Don't transactions handle D/F conflicts for us? Isn't that the sole
reason why for example `refs_verify_refname_available()` accepts an
"extras" parameter that is supposed to contain refs that are about to be
deleted?

Patrick



