[PATCH v3 0/4] refs/files: fix issues with git-fetch on case-insensitive FS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello!

With Git 2.51 we moved 'git-fetch(1)' and 'git-receive-pack(1)' to use
batched updates while doing reference updates. This provided a nice perf
boost since both commands will now use a single transaction for
reference updates. This removes the overhead of using individual
transaction per reference update and also avoids unnecessary
auto-compaction between reference updates in the reftable backend.

However, in the files-backend it does introduce a few bugs around
conflicts. The reported bug was around case-insensitive filesystems [1],
but we also fix some adjacent issues:

1. When fetching references such as:

   - refs/heads/foo
   - refs/heads/Foo

Earlier we would simply overwrite the first reference with the second
and continue. Since Git 2.51 we simply abort stating a conflict.

This is resolved in the first commit by explicitly categorizing the
error as non-GENERIC. This allows batched updates to reject the
particular update, while updating the rest.

2. When fetching references and a lock for a particular reference
already exits. We treat this is a GENERIC error, which fails the entire
update. By categorizing this error as non-GENERIC, we can reject this
specific update and update the other references.

3. When fetching references such as with F/D conflict:

  - refs/heads/foo
  - refs/heads/Foo/bar

Earlier we would apply the first, while the second would fail due to
conflict. Since Git 2.51, the lock files for both would be created, but
the 'commit' phase would abruptly end leaving the lock files.

The second commit fixes this by ensuring that on case-insensitive
filesystems we lowercase the refnames for availability check to ensure
F/D are caught and reported to the user.

4. When fetching references with D/F conflict:

  - refs/heads/Foo/bar
  - refs/heads/foo

The creation of the second reference's lock in `lock_raw_ref()` catches
the D/F conflict, but we mark this as a GENERIC error. By categorizing
this as non-GENERIC, we can allow the updates to continue while
rejecting this specific error.

This also applies to D/F conflicts in case-sensitive systems where there
exists a lock in the repo 'refs/heads/foo/bar.lock' causing a conflict
while fetching 'refs/heads/foo'.

- Karthik

[1]: https://lore.kernel.org/all/YQXPR01MB3046197EF39296549EE6DD669A33A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
---
Changes in v3:
- Rename duplicate_reference_case_cmp() to
  transaction_has_case_conflicting_update() and add comments.
- Improve commit messages.
- Add an additional test in the 4th commit to showcase D/F conflicts in
  case-sensistive file systems.
- Link to v2: https://lore.kernel.org/r/20250908-587-git-fetch-1-fails-fetches-on-case-insensitive-repositories-v2-0-b2eb2459befb@xxxxxxxxx

Changes in v2:
- This version fixes two more issues:
  - Fetching while locks already exist in the repository
  - D/F conflicts while fetching
- Add a specific error to the first case, so we can nicely show a
  relevant error. Also check explicitly that the issue is due to
  case-insensitive filesystems.
- Cleanup the commit messages.
- Use `string_list_append_nodup()` with `strbuf_detach`, reducing the
  number of allocations.

---
 builtin/fetch.c       |  21 ++++++++--
 refs.c                |  11 ++++-
 refs.h                |   2 +
 refs/files-backend.c  |  77 ++++++++++++++++++++++++++++------
 t/t1400-update-ref.sh |  53 +++++++++++++++++++++++
 t/t5510-fetch.sh      | 114 +++++++++++++++++++++++++++++++++++++++++++++++++-
 6 files changed, 261 insertions(+), 17 deletions(-)

Karthik Nayak (4):
      refs/files: catch conflicts on case-insensitive file-systems
      refs/files: use correct error type when lock exists
      refs/files: handle F/D conflicts in case-insensitive FS
      refs/files: handle D/F conflicts during locking

Range-diff versus v2:

1:  570b24fc09 ! 1:  1d4ff64d90 refs/files: catch conflicts on case-insensitive file-systems
    @@ refs/files-backend.c: static void unlock_ref(struct ref_lock *lock)
      	}
      }
      
    -+static bool duplicate_reference_case_cmp(struct ref_transaction *transaction,
    -+					 struct ref_update *update)
    ++/*
    ++ * Check if the transaction has another update with a case-insensitive refname
    ++ * match.
    ++ *
    ++ * If the update is part of the transaction, we only check up to that index.
    ++ * Further updates are expected to call this function to match previous indices.
    ++ */
    ++static bool transaction_has_case_conflicting_update(struct ref_transaction *transaction,
    ++						    struct ref_update *update)
     +{
     +	for (size_t i = 0; i < transaction->nr; i++) {
     +		if (transaction->updates[i] == update)
    @@ refs/files-backend.c: static enum ref_transaction_error lock_raw_ref(struct file
      		} else {
      			unable_to_lock_message(ref_file.buf, myerr, err);
     +			if (myerr == EEXIST && ignore_case &&
    -+			    duplicate_reference_case_cmp(transaction, update))
    ++			    transaction_has_case_conflicting_update(transaction, update))
     +				ret = REF_TRANSACTION_ERROR_CASE_CONFLICT;
      			goto error_return;
      		}
2:  10bfb8c1f2 ! 2:  fdc69b233e refs/files: use correct error type when lock exists
    @@ Commit message
         refs/files: use correct error type when lock exists
     
         When fetching references into a repository, if a lock for a particular
    -    reference exists, then `lock_raw_ref()` throws the generic error
    -    'REF_TRANSACTION_ERROR_GENERIC'. This causes the entire set of batched
    -    updates to fail.
    +    reference exists, then `lock_raw_ref()` throws:
    +
    +        - REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict
    +        because transaction contains conflicting references while being on a
    +        case-insensitive filesystem.
    +
    +        - REF_TRANSACTION_ERROR_GENERIC: for all other errors.
    +
    +    The latter causes the entire set of batched updates to fail, even in
    +    case sensitive filessystems.
     
         Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This
         allows batched updates to reject the individual update which conflicts
    @@ refs/files-backend.c: static enum ref_transaction_error lock_raw_ref(struct file
      		} else {
      			unable_to_lock_message(ref_file.buf, myerr, err);
     -			if (myerr == EEXIST && ignore_case &&
    --			    duplicate_reference_case_cmp(transaction, update))
    +-			    transaction_has_case_conflicting_update(transaction, update))
     -				ret = REF_TRANSACTION_ERROR_CASE_CONFLICT;
     +			if (myerr == EEXIST) {
    -+				if (ignore_case && duplicate_reference_case_cmp(transaction, update))
    ++				if (ignore_case &&
    ++				    transaction_has_case_conflicting_update(transaction, update))
    ++					/*
    ++					 * In case-insensitive filesystems, ensure that conflicts within a
    ++					 * given transaction are handled. Pre-existing refs on a
    ++					 * case-insensitive system will be overridden without any issue.
    ++					 */
     +					ret = REF_TRANSACTION_ERROR_CASE_CONFLICT;
     +				else
    ++					/*
    ++					 * Pre-existing case-conflicting reference locks should also be
    ++					 * specially categorized to avoid failing all batched updates.
    ++					 */
     +					ret = REF_TRANSACTION_ERROR_CREATE_EXISTS;
     +			}
     +
3:  23868f2218 = 3:  ff7fb95b66 refs/files: handle F/D conflicts in case-insensitive FS
4:  5af3dc2a27 ! 4:  a610f9478b refs/files: handle D/F conflicts during locking
    @@ Metadata
      ## Commit message ##
         refs/files: handle D/F conflicts during locking
     
    -    The previous commit, added the necessary validation and checks for F/D
    +    The previous commit added the necessary validation and checks for F/D
         conflicts in the files backend when working on case insensitive systems.
     
         There is still a possibility for D/F conflicts. This is a different from
    @@ Commit message
             refs/heads/foo.lock
     
         As in `lock_raw_ref()` after creating the lock, we also check for D/F
    -    conflicts. To fix this, simply categorize the error as a name conflict.
    -    Also remove this reference from the list of valid refnames for
    -    availability checks.
    +    conflicts. This can occur in case-insensitive filesystems when trying to
    +    fetch case-conflicted references like:
     
    +        refs/heads/Foo/new
    +        refs/heads/foo
    +
    +    D/F conflicts can also occur in case-sensitive filesystems, when the
    +    repository already contains a directory with a lock file
    +    'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This
    +    doesn't concern directories containing garbage files as those are
    +    handled on a higher level.
    +
    +    To fix this, simply categorize the error as a name conflict. Also remove
    +    this reference from the list of valid refnames for availability checks.
         By categorizing the error and removing it from the list of valid
         references, batched updates now knows to reject such reference updates
         and apply the other reference updates.
    @@ refs.c: int ref_transaction_maybe_set_rejected(struct ref_transaction *transacti
      		return 0;
      
     +	/*
    -+	 * Remove this refname from the list of refnames used for validation
    ++	 * Rejected refnames shouldn't be considered in the availability
    ++	 * checks, so remove them from the list.
     +	 */
     +	string_list_remove(&transaction->refnames,
     +			   transaction->updates[update_idx]->refname, 0);
    @@ refs/files-backend.c: static enum ref_transaction_error lock_raw_ref(struct file
      					goto error_return;
      				} else {
      					/*
    +-					 * We can't delete the directory,
    +-					 * but we also don't know of any
    +-					 * references that it should
    +-					 * contain.
    ++					 * Directory conflicts can occur if there
    ++					 * is an existing lock file in the directory
    ++					 * or if the filesystem is case-insensitive
    ++					 * and the directory contains a valid reference
    ++					 * but conflicts with the update.
    + 					 */
    + 					strbuf_addf(err, "there is a non-empty directory '%s' "
    + 						    "blocking reference '%s'",
     
      ## t/t5510-fetch.sh ##
     @@ t/t5510-fetch.sh: test_expect_success "clone and setup child repos" '
    @@ t/t5510-fetch.sh: test_expect_success CASE_INSENSITIVE_FS,REFFILES 'F/D conflict
     +		test_cmp expect actual
     +	)
     +'
    ++
    ++test_expect_success REFFILES 'D/F conflict on case sensitive filesystem with lock' '
    ++	(
    ++		git init --ref-format=reftable base &&
    ++		cd base &&
    ++		echo >file update &&
    ++		git add . &&
    ++		git commit -m "updated" &&
    ++		git branch -M main &&
    ++
    ++		git update-ref refs/heads/foo @ &&
    ++		git update-ref refs/heads/branch @ &&
    ++		cd .. &&
    ++
    ++		git init --ref-format=files --bare repo &&
    ++		cd repo &&
    ++		git remote add origin ../base &&
    ++		mkdir refs/heads/foo &&
    ++		touch refs/heads/foo/random.lock &&
    ++		test_must_fail git fetch origin "refs/heads/*:refs/heads/*" 2>err &&
    ++		test_grep "some local refs could not be updated; try running" err &&
    ++		git rev-parse refs/heads/main >expect &&
    ++		git rev-parse refs/heads/branch >actual &&
    ++		test_cmp expect actual
    ++	)
    ++'
     +
      . "$TEST_DIRECTORY"/lib-httpd.sh
      start_httpd


base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
change-id: 20250822-587-git-fetch-1-fails-fetches-on-case-insensitive-repositories-0a187c7ac41e

Thanks
- Karthik





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux