Re: [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD

Jeff King <peff@xxxxxxxx> · Mon, 4 Aug 2025 11:38:03 -0400

On Mon, Aug 04, 2025 at 11:46:07AM +0200, Patrick Steinhardt wrote:

> +		/*
> +		 * Check that "HEAD" didn't racily change since we have looked
> +		 * it up. If it did we remove the reflog-only updateg from the
> +		 * transaction again.
> +		 *
> +		 * Note that this does not catch all races: if "HEAD" was
> +		 * racily changed to point to one of the refs part of the
> +		 * transaction then we would miss writing the split reflog
> +		 * entry for "HEAD".
> +		 */
> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			if (unlock_ref(lock))
> +				strmap_remove(&backend_data->ref_locks,
> +					      update->refname, 0);
> +
> +			memmove(transaction->updates + update_idx,
> +				transaction->updates + update_idx + 1,
> +				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
> +			transaction->nr--;
> +
> +			ret = 1;
> +			goto out;
> +		}

OK, so this is basically the same as the patch you posted earlier. Let's
see how it fares with my hacky GIT_TEST_FORCE_SPLIT_HEAD_UPDATE patch:

  $ GIT_TEST_FORCE_SPLIT_HEAD_UPDATE=1 git branch foo HEAD^
  fatal:

Yikes. I'm not sure if there's a bug here, or if my hacky patch is
violating some other assumption. It looks like we get to the die() call
in branch.c:create_branch() because the transaction reports failure, but
with an empty err strbuf.

Ah, I think I see it. When we return from lock_ref_for_update(), we've
set "ret" to "1", indicating we are skipping the update. But then we do
this:

  if (ret > 0)
	continue;

I think there are two problems there:

  1. That "ret" is also used as our return from
     files_transaction_prepare(). So if this is the last update in the
     transaction, then we return "1", rather than "0" for success, and
     the caller thinks there was an error.

  2. If it's not the last transaction, then we go to the next element in
     the loop. But because it's a for-loop, we still increment "i",
     which is wrong (because we shrunk the transaction list). We need to
     check that "i" again.

So maybe:

@@ -2910,8 +2914,11 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
-			if (ret > 0)
+			if (ret > 0) {
+				ret = 0; /* not an error; we skipped it */
+				i--; /* we shrunk the list */
 				continue;
+			}
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;

I confirmed that fixes case (1). I guess I could test case (2) with a
bigger transaction involving multiple refs, but it's awkward because my
"force split update" patch would try to create multiple HEAD updates. :-/

I guess maybe it should be "pretend HEAD is this", like so:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 851b1b33f4..564b77d0da 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2895,6 +2895,14 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 		FREE_AND_NULL(head_ref);
 	}
 
+	{
+		const char *v = getenv("GIT_TEST_PRETEND_SPLIT_HEAD");
+		if (v) {
+			free(head_ref);
+			head_ref = xstrdup(v);
+		}
+	}
+
 	/*
 	 * Acquire all locks, verify old values if provided, check
 	 * that new values are valid, and write new values to the

We have to be a bit tricky here. The split head update is always added
at the end during the transaction preparation. So we need a situation
where another update is added _after_ that. I guess it would be another
symref split (but done by updating the symref).

So:

  git symbolic-ref refs/heads/SYMREF refs/heads/dest
  (
    echo "create refs/heads/foo HEAD"
    echo "create refs/heads/SYMREF HEAD"
  ) |
  GIT_TEST_PRETEND_SPLIT_HEAD=refs/heads/foo git update-ref --stdin

ends up with four updates:

  - the original create foo
  - the original create SYMREF
  - the reflog update of HEAD from split_head_update()
  - the update of refs/heads/dest from split_symref_update()

And indeed, running that through the debugger shows that we'd otherwise
skip the final update with your patch (but the extra "i--" fixes it).

I also tried this with SANITIZE=leak, and I think you'd need something
like this, as well:

diff --git a/refs.c b/refs.c
index 946eb48941..27c182e107 100644
--- a/refs.c
+++ b/refs.c
@@ -1184,6 +1184,15 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
 	return tr;
 }
 
+void ref_update_free(struct ref_update *u)
+{
+	free(u->msg);
+	free(u->committer_info);
+	free((char *)u->new_target);
+	free((char *)u->old_target);
+	free(u);
+}
+
 void ref_transaction_free(struct ref_transaction *transaction)
 {
 	size_t i;
@@ -1204,13 +1213,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
 		break;
 	}
 
-	for (i = 0; i < transaction->nr; i++) {
-		free(transaction->updates[i]->msg);
-		free(transaction->updates[i]->committer_info);
-		free((char *)transaction->updates[i]->new_target);
-		free((char *)transaction->updates[i]->old_target);
-		free(transaction->updates[i]);
-	}
+	for (i = 0; i < transaction->nr; i++)
+		ref_update_free(transaction->updates[i]);
 
 	if (transaction->rejections)
 		free(transaction->rejections->update_indices);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 851b1b33f4..0246715383 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2640,6 +2640,7 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 				transaction->updates + update_idx + 1,
 				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
 			transaction->nr--;
+			ref_update_free(update);
 
 			ret = 1;
 			goto out;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 95a4dc3902..6b5895a3b3 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -144,6 +144,8 @@ struct ref_update {
 	const char refname[FLEX_ARRAY];
 };
 
+void ref_update_free(struct ref_update *);
+
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno);

It's a little hard to see that freeing update inside
lock_ref_for_update() is safe (but we "goto out" after and don't look at
it again). I think it would all be a bit more obvious if
lock_ref_for_update() just returned 1 for "skip this", and then the
caller actually shrunk the transaction list.

-Peff