[PATCH v6 0/8] refs: introduce support for batched reference updates

Git supports making reference updates with or without transactions.
Updates with transactions are generally better optimized, but
transactions are all or nothing. This means that if a user wants to batch
updates to take advantage of the optimizations without the hard
requirement that all updates must succeed, there is currently no way to
do so. This matters particularly with the reftable backend, where
batching multiple reference updates is more efficient than performing
them sequentially.

This series introduces support for batched reference updates without
the all-or-nothing requirement, allowing individual reference updates to
fail while letting others proceed. This capability is exposed through
git-update-ref(1)'s `--batch-updates` flag, which can be used in
`--stdin` mode to batch updates and handle failures gracefully. Under
the hood, these batched updates still use the transaction
infrastructure, with parts of it modified to tolerate individual
failures.

The changes are structured to carefully build up this functionality:

First, we clean up and consolidate the reference update checking logic.
This includes removing duplicate checks in the files backend and moving
refname tracking to the generic layer, which simplifies the codebase and
prepares it for the new feature.

We then restructure the reftable backend's transaction preparation code,
extracting the update validation logic into a dedicated function. This
not only improves code organization but also sets the stage for
implementing batched update support.

To ensure we only skip errors which are user-oriented, we introduce
typed errors for transactions with 'enum ref_transaction_error'. We
extend the existing errors to include other scenarios and use these new
errors throughout the refs code.
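
The distinction between user-oriented and system failures can be
illustrated with a minimal standalone sketch. All names below
(`example_update_error`, `example_error_is_rejectable()`) are
hypothetical and are not the definitions from this series; only the idea
of a typed error separating skippable failures from fatal ones is taken
from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical error codes for a single reference update. These are
 * illustrative only and do not mirror Git's actual enum values.
 */
enum example_update_error {
	EXAMPLE_OK = 0,
	/* System-level failure (I/O, memory, locking): must abort the batch. */
	EXAMPLE_ERR_GENERIC = -1,
	/* User-oriented failures: candidates for per-update rejection. */
	EXAMPLE_ERR_NAME_CONFLICT = -2,
	EXAMPLE_ERR_CREATE_EXISTS = -3,
	EXAMPLE_ERR_INCORRECT_OLD_VALUE = -4,
};

/*
 * Return true if the error describes a problem with the requested
 * update itself, so a batch may skip that update and carry on.
 */
bool example_error_is_rejectable(enum example_update_error err)
{
	switch (err) {
	case EXAMPLE_ERR_NAME_CONFLICT:
	case EXAMPLE_ERR_CREATE_EXISTS:
	case EXAMPLE_ERR_INCORRECT_OLD_VALUE:
		return true;
	case EXAMPLE_OK:
	case EXAMPLE_ERR_GENERIC:
		return false;
	}
	/* An unknown code is a programming error in this sketch. */
	assert(!"unknown error code");
	return false;
}
```

Using a closed enum rather than a plain int lets the compiler warn when
a newly added error code is not classified in the switch.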

With this groundwork in place, we implement the core batch update
support in the refs subsystem. This adds the necessary infrastructure to
track and report rejected updates while allowing transactions to
proceed. All reference backends are modified to support this behavior
when enabled.
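
As a rough illustration of tracking rejections while letting the rest of
a batch proceed, here is a minimal standalone sketch. It is not the code
from this series: the types, the fixed-size arrays, and the functions
`example_maybe_set_rejected()` and `example_batch_prepare()` are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_UPDATES 16

/* Hypothetical error codes; EXAMPLE_ERR_GENERIC stands for system failures. */
enum example_update_error {
	EXAMPLE_OK = 0,
	EXAMPLE_ERR_GENERIC = -1,
	EXAMPLE_ERR_INCORRECT_OLD_VALUE = -2,
};

struct example_update {
	const char *refname;
	enum example_update_error rejection_err; /* EXAMPLE_OK if not rejected */
};

struct example_batch {
	struct example_update updates[EXAMPLE_MAX_UPDATES];
	size_t nr;
	/* Indices of rejected updates, so reporting need not scan all updates. */
	size_t rejected_idx[EXAMPLE_MAX_UPDATES];
	size_t rejected_nr;
	bool allow_failures;
};

/*
 * Record a rejection for updates[idx] if the batch tolerates failures
 * and the error is user-oriented. Returns true if the caller may skip
 * this update and continue, false if it must abort the whole batch.
 */
bool example_maybe_set_rejected(struct example_batch *batch, size_t idx,
				enum example_update_error err)
{
	assert(idx < batch->nr); /* an invalid index is a programming error */

	if (!batch->allow_failures)
		return false;
	if (err == EXAMPLE_OK || err == EXAMPLE_ERR_GENERIC)
		return false;
	if (batch->updates[idx].rejection_err != EXAMPLE_OK)
		return true; /* already recorded, nothing more to do */

	batch->updates[idx].rejection_err = err;
	batch->rejected_idx[batch->rejected_nr++] = idx;
	return true;
}

/*
 * Validate every update with the caller-supplied check. Rejected
 * updates are skipped; any other failure aborts with -1.
 */
int example_batch_prepare(struct example_batch *batch,
			  enum example_update_error (*check)(const struct example_update *))
{
	for (size_t i = 0; i < batch->nr; i++) {
		enum example_update_error err = check(&batch->updates[i]);

		if (err == EXAMPLE_OK)
			continue;
		if (!example_maybe_set_rejected(batch, i, err))
			return -1;
	}
	return 0;
}
```

Keeping a separate list of rejected indices means a caller can report
failures by walking only `rejected_idx` instead of every update in the
batch.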

Finally, we expose this functionality to users through
git-update-ref(1)'s `--batch-updates` flag, complete with test coverage
and documentation. The flag is specifically limited to `--stdin` mode
where batching multiple updates is most relevant.

This enhancement improves Git's flexibility in handling reference
updates while maintaining the safety of atomic transactions by default.
It's particularly valuable for tools and workflows that need to handle
reference update failures gracefully without abandoning the entire batch
of updates.

This series is based on top of 683c54c999 (Git 2.49, 2025-03-14) with
Patrick's series 'refs: batch refname availability checks' [1] merged
in.

[1]: https://lore.kernel.org/all/20250217-pks-update-ref-optimization-v1-0-a2b6d87a24af@xxxxxx/

---

Changes in v6:
- The synopsis in the 'git update-ref' documentation didn't repeat the command, giving the
  impression that the added newlines were continuations of options rather than alternative
  invocations.
- Link to v5: https://lore.kernel.org/all/20250327-245-partially-atomic-ref-updates-v5-0-4db2a3e34404@xxxxxxxxx

Changes in v5:
- Inline the comments around the 'ref_transaction_error'.
- Use 'strbuf_reset()' wherever possible instead of 'strbuf_setlen(err, 0)'.
- Use an extra 'conflicting_dirnames' strset in 'refs_verify_refnames_available()' to track
  dirnames which were found to be conflicting, this is to avoid re-reading those dirnames.
- Fix curly braces style mismatch in if..else block.
- Link to v4: https://lore.kernel.org/r/20250320-245-partially-atomic-ref-updates-v4-0-3dcc1b311dc9@xxxxxxxxx

Changes in v4:
- Rebased on top of 2.49, since a long time has passed since the
  previous iteration and there is a new release.
- Changed the naming to say 'batched' updates instead of 'partial
  transactions'. While we still use the transaction infrastructure
  underneath, the new naming causes less ambiguity.
- Clean up some of the commit messages.
- Raise BUG for invalid update index while setting rejections.
- Fix an incorrect early return.
- Link to v3: https://lore.kernel.org/r/20250305-245-partially-atomic-ref-updates-v3-0-0c64e3052354@xxxxxxxxx

Changes in v3:
- Changed 'transaction_error' to 'ref_transaction_error' along with the
  error names. Removed 'TRANSACTION_OK' since it is easy to miss in
  favor of a plain 'return 0'.
- Rename 'ref_transaction_set_rejected' to
  'ref_transaction_maybe_set_rejected' and move logic around error
  checks to within this function.
- Add a new struct 'ref_transaction_rejections' to track the rejections
  within a transaction. This allows us to only iterate over rejected
  updates.
- Add a new commit to also support partial transactions within the
  batched F/D checks.
- Remove NUL delimited outputs in 'git-update-ref(1)'.
- Remove translations for plumbing outputs.
- Other small cleanups in the commit message and code.

Changes in v2:
- Introduce and use structured errors. This consolidates the errors
  and their handling between the ref backends.
- In the previous version, we skipped over all failures. This included
  system failures such as low memory or IO problems. Instead, only
  skip user-oriented failures, such as invalid old OID and so on.
- Change the rejection function name to `ref_transaction_set_rejected()`.
- Modify the commit messages and documentation to be a little more
  verbose.
- Link to v1: https://lore.kernel.org/r/20250207-245-partially-atomic-ref-updates-v1-0-e6a3690ff23a@xxxxxxxxx

---

Karthik Nayak (8):
  refs/files: remove redundant check in split_symref_update()
  refs: move duplicate refname update check to generic layer
  refs/files: remove duplicate duplicates check
  refs/reftable: extract code from the transaction preparation
  refs: introduce enum-based transaction error types
  refs: implement batch reference update support
  refs: support rejection in batch updates during F/D checks
  update-ref: add --batch-updates flag for stdin mode

 Documentation/git-update-ref.adoc |  14 +-
 builtin/fetch.c                   |   2 +-
 builtin/update-ref.c              |  66 +++-
 refs.c                            | 171 ++++++++--
 refs.h                            |  70 +++--
 refs/files-backend.c              | 314 ++++++++-----------
 refs/packed-backend.c             |  69 ++--
 refs/refs-internal.h              |  51 ++-
 refs/reftable-backend.c           | 502 +++++++++++++++---------------
 t/t1400-update-ref.sh             | 233 ++++++++++++++
 10 files changed, 969 insertions(+), 523 deletions(-)

---

Range-diff versus v5:

1:  cae24142a1 = 1:  cae24142a1 refs/files: remove redundant check in split_symref_update()
2:  239aecdb0f = 2:  239aecdb0f refs: move duplicate refname update check to generic layer
3:  06404dd350 = 3:  06404dd350 refs/files: remove duplicate duplicates check
4:  a3e645aa37 = 4:  a3e645aa37 refs/reftable: extract code from the transaction preparation
5:  2615bfe78e = 5:  2615bfe78e refs: introduce enum-based transaction error types
6:  d5c1c77b0d = 6:  d5c1c77b0d refs: implement batch reference update support
7:  4bb4902631 = 7:  4bb4902631 refs: support rejection in batch updates during F/D checks
8:  674630f77c ! 8:  ed92beaf18 update-ref: add --batch-updates flag for stdin mode
    @@ Documentation/git-update-ref.adoc: git-update-ref - Update the object name store
     -'git update-ref' [-m <reason>] [--no-deref] (-d <ref> [<old-oid>] | [--create-reflog] <ref> <new-oid> [<old-oid>] | --stdin [-z])
     +[synopsis]
     +git update-ref [-m <reason>] [--no-deref] -d <ref> [<old-oid>]
    -+	       [-m <reason>] [--no-deref] [--create-reflog] <ref> <new-oid> [<old-oid>]
    -+               [-m <reason>] [--no-deref] --stdin [-z] [--batch-updates]
    ++git update-ref [-m <reason>] [--no-deref] [--create-reflog] <ref> <new-oid> [<old-oid>]
    ++git update-ref [-m <reason>] [--no-deref] --stdin [-z] [--batch-updates]
      
      DESCRIPTION
      -----------

-- 
2.48.1




