[PATCH v4 0/9] refs: fix migration of reflog entries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@xxxxxx

Changes in v3:
  - `git reflog write` now requires fully-qualified refnames.
  - A new commit that plugs one part of the race around splitting of
    reflogs for HEAD in the "files" backend.
  - Link to v2: https://lore.kernel.org/r/20250725-pks-reflog-append-v2-0-e4e7cbe3f578@xxxxxx

Changes in v4:
  - Improve one of the tests to use an existing abbreviated object ID
    instead of a non-existing one to make sure that we indeed fail due
    to the abbreviation.
  - Don't abort the transaction when HEAD has been racily updated, but
    drop the log-only update instead.
  - Link to v3: https://lore.kernel.org/r/20250729-pks-reflog-append-v3-0-9614d310f073@xxxxxx

Thanks!

Patrick

---
Patrick Steinhardt (9):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs/files: detect race when generating reflog entry for HEAD
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  76 +++++++++++++------------
 builtin/reflog.c              | 103 +++++++++++++++++++++++++++-------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++---------
 refs.h                        |  24 +++++++-
 refs/files-backend.c          |  83 +++++++++++++++++++++++++---
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 ++++++---
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 126 ++++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 +++++---
 12 files changed, 420 insertions(+), 108 deletions(-)

Range-diff versus v3:

 1:  5ae2971a55 =  1:  d030117041 Documentation/git-reflog: convert to use synopsis type
 2:  256589c289 =  2:  ad1dd8a226 builtin/reflog: improve grouping of subcommands
 3:  3b9e0a1206 =  3:  d6d6d99421 refs: export `ref_transaction_update_reflog()`
 4:  4fcf540ed6 !  4:  4e5433717e builtin/reflog: implement subcommand to write new entries
    @@ t/t1421-reflog-write.sh (new)
     +	git init repo &&
     +	(
     +		cd repo &&
    -+		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
    ++		test_commit initial &&
    ++		abbreviated_oid=$(git rev-parse HEAD | test_copy_bytes 8) &&
    ++		test_must_fail git reflog write refs/heads/something $abbreviated_oid $ZERO_OID old-object-id 2>err &&
     +		test_grep "invalid old object ID" err &&
    -+		test_must_fail git reflog write refs/heads/something $ZERO_OID 12345 new-object-id 2>err &&
    ++		test_must_fail git reflog write refs/heads/something $ZERO_OID $abbreviated_oid new-object-id 2>err &&
     +		test_grep "invalid new object ID" err
     +	)
     +'
 5:  18b2f61366 =  5:  92e45f582c ident: fix type of string length parameter
 6:  d140c53224 =  6:  e50c5aaae5 refs: fix identity for migrated reflogs
 7:  91c6a7cbcb !  7:  9380dbfdab refs/files: detect race when generating reflog entry for HEAD
    @@ Commit message
             have already processed. This causes us not writing a reflog message
             even though we should have done so.
     
    -      - HEAD gets concurrently updated to point to not point to a reference
    +      - HEAD gets concurrently updated to no longer point to a reference
             anymore that we have already processed. This causes us to write a
             reflog message even though we should _not_ have done so.
     
    @@ refs/files-backend.c
      struct ref_lock {
      	char *ref_name;
      	struct lock_file lk;
    +@@ refs/files-backend.c: int parse_loose_ref_contents(const struct git_hash_algo *algop,
    + 	return 0;
    + }
    + 
    +-static void unlock_ref(struct ref_lock *lock)
    ++static int unlock_ref(struct ref_lock *lock)
    + {
    + 	lock->count--;
    + 	if (!lock->count) {
    + 		rollback_lock_file(&lock->lk);
    + 		free(lock->ref_name);
    + 		free(lock);
    ++		return 1;
    + 	}
    ++	return 0;
    + }
    + 
    + /*
     @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct ref_update *update,
      
      	new_update = ref_transaction_add_update(
    @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct
      
      	/*
      	 * Add "HEAD". This insertion is O(N) in the transaction
    +@@ refs/files-backend.c: struct files_transaction_backend_data {
    +  *   the referent to transaction.
    +  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
    +  *   update of HEAD.
    ++ *
    ++ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
    ++ * error code otherwise.
    +  */
    + static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
    + 						      struct ref_update *update,
     @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
      
      	update->backend_data = lock;
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +
     +		/*
     +		 * Check that "HEAD" didn't racily change since we have looked
    -+		 * it up. If it did we must refuse to write the reflog entry.
    ++		 * it up. If it did we remove the reflog-only updateg from the
    ++		 * transaction again.
     +		 *
     +		 * Note that this does not catch all races: if "HEAD" was
     +		 * racily changed to point to one of the refs part of the
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +		 */
     +		if (!(update->type & REF_ISSYMREF) ||
     +		    strcmp(update->parent_update->refname, referent.buf)) {
    -+			strbuf_addstr(err, "HEAD has been racily updated");
    -+			ret = REF_TRANSACTION_ERROR_GENERIC;
    ++			if (unlock_ref(lock))
    ++				strmap_remove(&backend_data->ref_locks,
    ++					      update->refname, 0);
    ++
    ++			memmove(transaction->updates + update_idx,
    ++				transaction->updates + update_idx + 1,
    ++				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
    ++			transaction->nr--;
    ++
    ++			ret = 1;
     +			goto out;
     +		}
     +
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
      		if (update->flags & REF_NO_DEREF) {
      			/*
      			 * We won't be reading the referent as part of
    +@@ refs/files-backend.c: static int files_transaction_prepare(struct ref_store *ref_store,
    + 					  head_ref, &refnames_to_check,
    + 					  err);
    + 		if (ret) {
    ++			if (ret > 0)
    ++				continue;
    + 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    + 				strbuf_reset(err);
    + 				ret = 0;
 8:  8468947824 =  8:  3c6182c96d refs: stop unsetting REF_HAVE_OLD for log-only updates
 9:  78ca2d46f9 =  9:  eafd8f6d7d refs: fix invalid old object IDs when migrating reflogs

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux