On Fri, Jul 25, 2025 at 07:36:10AM -0400, Jeff King wrote: > On Fri, Jul 25, 2025 at 08:58:29AM +0200, Patrick Steinhardt wrote: > > > The `REF_HAVE_OLD` flag indicates whether a given ref update has its old > > object ID set. If so, the value of that field is used to verify whether > > the current state of the reference matches this expected state. It is > > thus an important part of mitigating races with a concurrent process > > that updates the same set of references. > > > > When writing reflogs though we explicitly unset that flag. This is a > > sensible thing to do: the old state of reflog entry updates may not > > necessarily match the current on-disk state of its accompanying ref, but > > it's only intended to signal what old object ID we want to write into > > the new reflog entry. For example when migrating refs we end up writing > > many reflog entries for a single reference, and most likely those reflog > > entries will have many different old object IDs. > > > > But unsetting this flag also removes a useful signal, namely that the > > caller _did_ provide an old object ID for a given reflog entry. This > > signal will become useful in a subsequent commit, where we add a new > > flag that tells the transaction to use the provided old and new object > > IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a > > signal to verify that the caller really did provide an old object ID. > > > > Stop unsetting the flag so that we can use it as this described signal > > in a subsequent commit. Skip checking the old object ID for log-only > > updates so that we don't expect it to match the current on-disk state. > > I like this direction, but I happened to be working in this area > yesterday[1] and noticed something interesting. You're effectively > replacing this removal of the HAVE_OLD flag when split a symref update: > > > diff --git a/refs/files-backend.c b/refs/files-backend.c > > index bf6f89b1d19..8b42fe18901 100644 > > --- a/refs/files-backend.c > > +++ b/refs/files-backend.c > > @@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update, > > * done when new_update is processed. > > */ > > update->flags |= REF_LOG_ONLY | REF_NO_DEREF; > > - update->flags &= ~REF_HAVE_OLD; > > > > return 0; > > } > > and then later we get the same logic by checking for LOG_ONLY: > > > @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update, > > struct object_id *oid, > > struct strbuf *err) > > { > > - if (!(update->flags & REF_HAVE_OLD) || > > - oideq(oid, &update->old_oid)) > > + if (update->flags & REF_LOG_ONLY || > > + !(update->flags & REF_HAVE_OLD) || > > + oideq(oid, &update->old_oid)) > > return 0; > > > > if (is_null_oid(&update->old_oid)) { > > Which make sense to me. But the weird thing I noticed is that when we do > something similar for split_head_update(), we don't strip REF_HAVE_OLD! And we shouldn't do that, as in the next commit we actually build on always having `REF_HAVE_OLD` set for reflog-only updates. So I'd argue that the problem is actually the other way round: when splitting off the HEAD update we must resolve the old object ID if `REF_HAVE_OLD` is not set. > (For those not familiar with that