Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD

Jeff King <peff@xxxxxxxx> · Sat, 2 Aug 2025 07:11:28 -0400

On Tue, Jul 29, 2025 at 10:55:25AM +0200, Patrick Steinhardt wrote:

> Unfortunately, this change only helps with the second race. We cannot
> reliably plug the first race without locking the HEAD reference at the
> start of the transaction. Locking HEAD unconditionally would effectively
> serialize all writes though, and that doesn't seem like an option. Also,
> double checking its value at the end of the transaction is not an option
> either, as its target may have flip-flopped during the transaction.

I agree we should not always take a lock on HEAD, since most refs would
not need it. But I wonder if we could do better by examining HEAD, then
taking a lock when we think we'll need it, and then re-checking the
value of HEAD. That is still racy, though (somebody could have pointed
HEAD at us between the two checks). Fundamentally the files backend is
not atomic across the whole namespace, and we are trying to update two
refs. So I think there will always be some race.

It does make me wonder if this race-fix is even worth it, then. We are
catching the case where somebody moves HEAD away from the ref we are
updating while we are updating it. But without atomicity, do we even
know which happened first? That is, would it be incorrect to update
HEAD anyway? I guess the outcome is observable because their movement of
HEAD generated a reflog entry, and thus the entries would be out of
order. So maybe that is worth it.

Anyway, I had two questions about the code:

> @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
>  
>  	update->backend_data = lock;
>  
> -	if (update->type & REF_ISSYMREF) {
> +	if (update->flags & REF_LOG_VIA_SPLIT) {
> +		struct ref_lock *parent_lock;
> +
> +		if (!update->parent_update)
> +			BUG("split update without a parent");
> +
> +		parent_lock = update->parent_update->backend_data;
> +
> +		/*
> +		 * Check that "HEAD" didn't racily change since we have looked
> +		 * it up. If it did we must refuse to write the reflog entry.
> +		 *
> +		 * Note that this does not catch all races: if "HEAD" was
> +		 * racily changed to point to one of the refs part of the
> +		 * transaction then we would miss writing the split reflog
> +		 * entry for "HEAD".
> +		 */
> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			strbuf_addstr(err, "HEAD has been racily updated");
> +			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			goto out;
> +		}

One, what happens with a multi-level ref (e.g., HEAD points to
refs/heads/foo which points to refs/heads/bar)?

We've resolved HEAD to get referent.buf. Do we get "foo" or "bar" here?
If "bar", then a write through "foo" will complain. But if we get "foo",
then theoretically a write through "bar" will complain.

I _think_ we are OK, though. Constructing it like this:

  git init
  git commit --allow-empty -m whatever

  git symbolic-ref refs/heads/foo refs/heads/bar
  git symbolic-ref HEAD refs/heads/foo
  git update-ref refs/heads/foo main

triggers the check and shows that our referent from lock_raw_ref() is
the first level (i.e., "foo"). Which is good.

If we swap out "foo" for "bar" in the update-ref call, then we'd get a
mismatch. But in that case we do not figure out that HEAD needs be
written at all! That is, we only do a single level of look-back to
decide whether to write HEAD at all. So as long as we keep doing so, we
are OK.

> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			strbuf_addstr(err, "HEAD has been racily updated");
> +			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			goto out;
> +		}

And two, is an error the right thing here? The user asked us to update
"foo", and we saw that HEAD pointed to it. So we decided to update
HEAD's reflog, too. And when it came time to do so under lock, we found
that HEAD did not point to "foo" any more.

Shouldn't we quietly drop the HEAD reflog update, rather than forcing
the whole transaction to fail? The user never asked us to update HEAD at
all. It was something we opportunistically decided to do, and now we
find out that it is not appropriate to do so.

-Peff