Re: [PATCH 3/3] receive-pack: use batched reference updates

Patrick Steinhardt <ps@xxxxxx> · Wed, 14 May 2025 14:31:09 +0200

On Wed, May 14, 2025 at 11:03:49AM +0200, Karthik Nayak wrote:
[snip]
> With the reftable backend there is a 18x performance improvement, when
> performing receive-pack with 10000 refs:
> 
>   Benchmark 1: receive: many refs (refformat = reftable, refcount = 10000, revision = master)
>     Time (mean ± σ):      4.276 s ±  0.078 s    [User: 0.796 s, System: 3.318 s]
>     Range (min … max):    4.185 s …  4.430 s    10 runs
> 
>   Benchmark 2: receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD)
>     Time (mean ± σ):     235.4 ms ±   6.9 ms    [User: 75.4 ms, System: 157.3 ms]
>     Range (min … max):   228.5 ms … 254.2 ms    11 runs
> 
>   Summary
>     receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran
>      18.16 ± 0.63 times faster than receive: many refs (refformat = reftable, refcount = 10000, revision = master)
> 
> In similar conditions, the files backend sees a 1.21x performance
> improvement:
> 
>   Benchmark 1: receive: many refs (refformat = files, refcount = 10000, revision = master)
>     Time (mean ± σ):      1.121 s ±  0.021 s    [User: 0.128 s, System: 0.975 s]
>     Range (min … max):    1.097 s …  1.156 s    10 runs
> 
>   Benchmark 2: receive: many refs (refformat = files, refcount = 10000, revision = HEAD)
>     Time (mean ± σ):     927.9 ms ±  22.6 ms    [User: 99.0 ms, System: 815.2 ms]
>     Range (min … max):   903.1 ms … 978.0 ms    10 runs
> 
>   Summary
>     receive: many refs (refformat = files, refcount = 10000, revision = HEAD) ran
>       1.21 ± 0.04 times faster than receive: many refs (refformat = files, refcount = 10000, revision = master)

We see almost the same speedups as we saw in git-fetch(1), and the
reason why we see them is basically the same.

> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index be314879e8..b4fceb3837 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1843,35 +1843,91 @@ static void BUG_if_skipped_connectivity_check(struct command *commands,
>  	BUG_if_bug("connectivity check skipped???");
>  }
>  
> +static void ref_transaction_rejection_handler(const char *refname,
> +					      const struct object_id *old_oid UNUSED,
> +					      const struct object_id *new_oid UNUSED,
> +					      const char *old_target UNUSED,
> +					      const char *new_target UNUSED,
> +					      enum ref_transaction_error err,
> +					      void *cb_data)
> +{
> +	struct strmap *failed_refs = (struct strmap *)cb_data;

This cast is unnecessary.

> +	const char *reason = "";
> +
> +	switch (err) {
> +	case REF_TRANSACTION_ERROR_NAME_CONFLICT:
> +		reason = "refname conflict";
> +		break;
> +	case REF_TRANSACTION_ERROR_CREATE_EXISTS:
> +		reason = "reference already exists";
> +		break;
> +	case REF_TRANSACTION_ERROR_NONEXISTENT_REF:
> +		reason = "reference does not exist";
> +		break;
> +	case REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE:
> +		reason = "incorrect old value provided";
> +		break;
> +	case REF_TRANSACTION_ERROR_INVALID_NEW_VALUE:
> +		reason = "invalid new value provided";
> +		break;
> +	case REF_TRANSACTION_ERROR_EXPECTED_SYMREF:
> +		reason = "expected symref but found regular ref";
> +		break;
> +	default:
> +		reason = "unkown failure";
> +	}
> +
> +	strmap_put(failed_refs, refname, xstrdup(reason));
> +}

I'd have expected something like this for git-fetch(1), as well, so that
we don't silently swallow failed ref updates. Would it make sense to
maybe provide an array of reasons by enum so that we can reuse those
messages?

>  static void execute_commands_non_atomic(struct command *commands,
>  					struct shallow_info *si)
>  {
>  	struct command *cmd;
>  	struct strbuf err = STRBUF_INIT;
> +	const char *reported_error = "";
> +	struct strmap failed_refs = STRMAP_INIT;
> +
> +	transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
> +						  REF_TRANSACTION_ALLOW_FAILURE, &err);
> +	if (!transaction) {
> +		rp_error("%s", err.buf);
> +		strbuf_reset(&err);
> +		reported_error = "transaction failed to start";
> +		goto failure;
> +	}

Okay. We now create a single transaction with failures being allowed.

>  	for (cmd = commands; cmd; cmd = cmd->next) {
>  		if (!should_process_cmd(cmd) || cmd->run_proc_receive)
>  			continue;
>  
> -		transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
> -							  0, &err);
> -		if (!transaction) {
> -			rp_error("%s", err.buf);
> -			strbuf_reset(&err);
> -			cmd->error_string = "transaction failed to start";
> -			continue;
> -		}
> -
>  		cmd->error_string = update(cmd, si);
> +	}

So here we only need to queue each update.

> -		if (!cmd->error_string
> -		    && ref_transaction_commit(transaction, &err)) {
> -			rp_error("%s", err.buf);
> -			strbuf_reset(&err);
> -			cmd->error_string = "failed to update ref";
> -		}
> -		ref_transaction_free(transaction);
> +	if (ref_transaction_commit(transaction, &err)) {
> +		rp_error("%s", err.buf);
> +		reported_error = "failed to update refs";
> +		goto failure;
>  	}
> +
> +	ref_transaction_for_each_rejected_update(transaction,
> +						 ref_transaction_rejection_handler,
> +						 &failed_refs);
> +
> +	if (strmap_empty(&failed_refs))
> +		goto cleanup;
> +
> +failure:
> +	for (cmd = commands; cmd; cmd = cmd->next) {
> +		if (strmap_empty(&failed_refs))
> +			cmd->error_string = reported_error;

The reported error may have one of there values now:

  - The empty string. Is it correct to set the error string to that
    value? Shouldn't it rather be a `NULL` pointer?

  - "transaction failed to start". It makes sense to update every
    command accordingly, as we wouldn't have updated any of them.

  - "failed to update refs", in case the commit failed. Does the commit
    fail only in cases where we couldn't update _any_ reference, or does
    it also retrun an error when only one of the updates failed? If the
    latter, we shouldn't update all the others to "failed", should we?

In any case, it feels weird that we use `strmap_empty()` to check for
this condition. I'd have expected that we rather check for
`reported_error` to be a non-empty string directly to figure out whether
the transaction itself has failed as a whole. Like that, we'd know that
we only ever do this if we hit a `goto failure` branch.

Patrick