Re: [PATCH v2 2/4] fetch: use batched reference updates

Karthik Nayak <karthik.188@xxxxxxxxx> · Sun, 18 May 2025 04:30:24 -0700

Patrick Steinhardt <ps@xxxxxx> writes:

> On Fri, May 16, 2025 at 02:53:22AM -0700, Karthik Nayak wrote:
>> Patrick Steinhardt <ps@xxxxxx> writes:
>>
>> > On Thu, May 15, 2025 at 04:07:26PM +0200, Karthik Nayak wrote:
>> >> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> >> index 5279997c96..15eac2b1c2 100644
>> >> --- a/builtin/fetch.c
>> >> +++ b/builtin/fetch.c
>> >> @@ -1688,6 +1644,37 @@ static int set_head(const struct ref *remote_refs, struct remote *remote)
>> >>  	return result;
>> >>  }
>> >>
>> >> +struct ref_rejection_data {
>> >> +	int *retcode;
>> >> +	int conflict_msg_shown;
>> >> +	const char *remote_name;
>> >> +};
>> >> +
>> >> +static void ref_transaction_rejection_handler(const char *refname,
>> >> +					      const struct object_id *old_oid UNUSED,
>> >> +					      const struct object_id *new_oid UNUSED,
>> >> +					      const char *old_target UNUSED,
>> >> +					      const char *new_target UNUSED,
>> >> +					      enum ref_transaction_error err,
>> >> +					      void *cb_data)
>> >> +{
>> >> +	struct ref_rejection_data *data = (struct ref_rejection_data *)cb_data;
>> >
>> > Nit: unnecessary cast.
>> >
>> >> +	if (err == REF_TRANSACTION_ERROR_NAME_CONFLICT && !data->conflict_msg_shown) {
>> >> +		error(_("some local refs could not be updated; try running\n"
>> >> +			" 'git remote prune %s' to remove any old, conflicting "
>> >> +			"branches"), data->remote_name);
>> >> +		data->conflict_msg_shown = 1;
>> >> +	} else {
>> >> +		char *reason = ref_transaction_error_msg(err);
>> >> +
>> >> +		error(_("fetching ref %s failed: %s"), refname, reason);
>> >> +		free(reason);
>> >> +	}
>> >> +
>> >> +	*data->retcode = 1;
>> >> +}
>> >
>> > Okay, we stopped ignoring generic errors now and will print them. What
>> > I'm still unclear about: which exact errors do we accept now that
>> > `REF_TRANSACTION_ALLOW_FAILURE` is specified? Most of the error codes we
>> > probably want to accept, but what about `REF_TRANSACTION_ERROR_GENERIC`?
>>
>> The current mechanism in `ref_transaction_maybe_set_rejected()` doesn't
>> handle `REF_TRANSACTION_ERROR_GENERIC` errors. This was a design choice
>> (more of a requirement of what this error represents), where
>> `REF_TRANSACTION_ERROR_GENERIC` errors cannot be resolved on an
>> individual reference level. It includes:
>>
>>   - System errors such as I/O errors
>>   - Duplicates present
>>
>> Both of these represent issues which are bigger than a single ref
>> update, so we have to propagate these errors up.
>
> The second case is also why the behaviour changes now, right? If we were
> able to handle duplicates via the same mechanism then it would become
> possible to retain current behaviour for git-receive-pack(1)?
>

Yeah indeed, but if we want to allow users supporting conflict
resolution of duplicates, we'll also have to think about how that would
look. Our discussion till now was around allowing a callback for each
reference update with the associated error.

With duplicates, we'd also want to provide the context of which N
updates are duplicated.

> Not that I'm proposing this -- I very much think that the current
> behaviour in git-receive-pack(1) is a bug that should be fixed. Mostly
> trying to understand.
>

Yeah I agree with you on this!

>> > This makes me wonder a bit about the current layout of how we handle
>> > these errors. If the rejection handler was invoked while preparing the
>> > transaction for each reference as we go instead of afterwards we could
>> > decide on-the-fly whether a specific error should be ignored or not.
>> > That might lead to a design that is both more flexible and more obvious
>> > at the same time because error handling is now handled explicitly by the
>> > callsite that wants to ignore some errors.
>> >
>>
>> I did ponder on this while I was building the batched transaction
>> mechanism. I decided to take it iteratively. We can, for instance,
>> modify `ref_transaction_maybe_set_rejected()` to work with a callback
>> function which would allow the users to accept/reject errors.
>>
>> However, even if we go down that route, `REF_TRANSACTION_ERROR_GENERIC`
>> errors still cannot be overlooked, these errors will abort the entire
>> transaction.
>
> Okay, good.
>
>> That said, I'm not trying to avoid going down that route. I do agree
>> with the flexibility it does provide. Once we hit such a usecase, we
>> should make that change.
>>
>> For 'git-fetch(1)' and 'git-recieve-pack(1)', do you see a usecase?
>
> No, I don't right now. I just want to avoid that we have to eventually
> refactor all of this to support an alternative API. But agreed, there
> isn't really much of a reason why we wouldn't be able to introduce such
> a mechanism retroactively while keeping existing callers intact.
>
> So let's stick with what we have and keep this in the back of our minds
> if we ever need such a mechanism going forward.
>

I think this makes sense!

>> > Last but not least, I think that it would also allow us to decide ahead
>> > of time whether we want to commit. Right now we basically say "just
>> > commit it, whatever happens". But if I'm not mistaken, all the errors
>> > that we care about and that callers may want to ignore are already
>> > detected at prepare time. So if we already bubbled up relevant info
>> > while calling `ref_transaction_prepare()` the caller may then decide to
>> > not commit at all based on some criteria.
>> >
>>
>> Indeed, that is correct. I can confirm that even now all the calls to
>> `ref_transaction_maybe_set_rejected()` are made in the prepare phase, so
>> we could already do this, since `transaction->rejections` is already
>> populated at this stage.
>
> Good. After all, we shouldn't have to perform checks in the "commit"
> phase. Things are locked, things have been checked, so it should
> basically be a mere "let's move everything into place now". Which of
> course can still fail, but the only valid reason should be system
> failures.
>
> Patrick
>

Exactly, totally agreed.

Thanks for the discussion Patrick!
Attachment:
signature.asc

Description: PGP signature