Re: [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates

Jeff King <peff@xxxxxxxx> · Fri, 25 Jul 2025 07:36:10 -0400

On Fri, Jul 25, 2025 at 08:58:29AM +0200, Patrick Steinhardt wrote:

> The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
> object ID set. If so, the value of that field is used to verify whether
> the current state of the reference matches this expected state. It is
> thus an important part of mitigating races with a concurrent process
> that updates the same set of references.
> 
> When writing reflogs though we explicitly unset that flag. This is a
> sensible thing to do: the old state of reflog entry updates may not
> necessarily match the current on-disk state of its accompanying ref, but
> it's only intended to signal what old object ID we want to write into
> the new reflog entry. For example when migrating refs we end up writing
> many reflog entries for a single reference, and most likely those reflog
> entries will have many different old object IDs.
> 
> But unsetting this flag also removes a useful signal, namely that the
> caller _did_ provide an old object ID for a given reflog entry. This
> signal will become useful in a subsequent commit, where we add a new
> flag that tells the transaction to use the provided old and new object
> IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
> signal to verify that the caller really did provide an old object ID.
> 
> Stop unsetting the flag so that we can use it as this described signal
> in a subsequent commit. Skip checking the old object ID for log-only
> updates so that we don't expect it to match the current on-disk state.

I like this direction, but I happened to be working in this area
yesterday[1] and noticed something interesting. You're effectively
replacing this removal of the HAVE_OLD flag when split a symref update:

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index bf6f89b1d19..8b42fe18901 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
>  	 * done when new_update is processed.
>  	 */
>  	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
> -	update->flags &= ~REF_HAVE_OLD;
>  
>  	return 0;
>  }

and then later we get the same logic by checking for LOG_ONLY:

> @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
>  						struct object_id *oid,
>  						struct strbuf *err)
>  {
> -	if (!(update->flags & REF_HAVE_OLD) ||
> -		   oideq(oid, &update->old_oid))
> +	if (update->flags & REF_LOG_ONLY ||
> +	    !(update->flags & REF_HAVE_OLD) ||
> +	    oideq(oid, &update->old_oid))
>  		return 0;
>  
>  	if (is_null_oid(&update->old_oid)) {

Which make sense to me. But the weird thing I noticed is that when we do
something similar for split_head_update(), we don't strip REF_HAVE_OLD!

(For those not familiar with that function, it notices when we are
updating refs/heads/foo that is pointed-to by HEAD, and then adds an
extra HEAD reflog update to the transaction).

So as I understand it, right now we are doing an extra check_old_oid()
on that log-only HEAD update, and after your patch we would stop doing
so.

Which I _think_ is the right thing to do, but it made me wonder if the
transaction were ever non-atomic. That is, could we split off a log-only
update that succeeds, even though the old-oid check for the actual
ref fails?

Historically, I'd guess the answer is mostly "no", because the point of
ref transactions is to be all-or-nothing, and to do the locking and
old-oid checking before writing out any updates. But I also think I saw
some discussion of non-atomic transactions recently. I didn't really
follow it, but is this a potential problem?

-Peff

[1] If you are wondering what work: it is the fact that at least with
    the files backend, we will happily overwrite a dangling symref even
    when the caller asked us to make sure this is a creation event. That
    is easy to fix, but I was surprised that some HEAD updates failed
    after doing so. The problem is that the reflog update for HEAD did
    not clear the HAVE_OLD flag, and my solution was to do so (just like
    split_symref_update() does). But as your topic here shows, that will
    probably result in broken reflogs. And we should be checking for
    LOG_ONLY in check_old_oid, as you're doing here (which would also
    fix my problem).

    But that also makes me wonder: should ref_update_check_old_target()
    also be checking LOG_ONLY now in your patch? I guess not, as it does
    not use HAVE_OLD at all (that is just about the oid). We get the
    equivalent behavior in the split-off log-only transaction item
    because we just do not set "old_target" in the split-off item.