On Sat, Mar 29, 2025 at 11:02:46PM +0800, Zheng Yuting wrote:
> ## Name and Contact Information
>
> - Full Name: Zheng Yuting
> - Email Address: 05ZYT30@xxxxxxxxx
> - Time Zone: UTC +8:00
>
> ---
>
> ## Abstract
>
> The current Git reference management functionality is fragmented across
> multiple independent commands (git-show-ref, git-for-each-ref,
> git-update-ref, git-pack-refs, git-check-ref-format, and
> git-symbolic-ref), leading to code redundancy and increased maintenance
> costs. Based on Patrick Steinhardt’s integration vision[1], this project
> aims to introduce 8 new subcommands (list, exists, show, resolve, pack,
> update, delete, check-format) under the existing git-refs command to
> achieve the following objectives:

I have a couple of opinions on the exact naming of the subcommands, more
on that below. In any case, I don't think the naming and how exactly
each of these commands should look and work needs to be hashed out in
this document. It's nice to scope out _what_ we want to achieve and
propose what this could look like, but ultimately I think that most of
the design should happen during the project itself.

> - Feature Integration: Consolidate existing reference management
>   commands under git-refs, while maintaining backward compatibility.
> - Feature Enhancement: Introduce recursion depth control for git-refs
>   resolve.
> - Testing & Documentation: Add test cases ensuring consistency and
>   update relevant documentation.
>
> ---
>
> ## Implementation Plan
>
> ### Command Integration Strategy
>
> #### Design Goals
>
> The project will unify scattered reference management functionalities
> under the git-refs subcommand framework, ensuring:
>
> 1. Complete Feature Coverage: Each subcommand fully replaces its
>    corresponding legacy command.
> 2. Parameter Compatibility: Preserve the semantics and output behavior
>    of legacy command options.

This one is something that is up for debate.
While I do expect that most of the commands should retain their current
semantics and options, we could also use this as an opportunity to think
about whether there are any issues with the current design and improve
upon it.

> 3. Code Reusability: Minimize redundancy by sharing underlying modules
>    (e.g., refs/files-backend.c).
>
> #### Subcommand Mapping
>
> - git-refs list
>   Replaces git-show-ref and git-for-each-ref, merging reference listing
>   functionalities with support for formatting (--format), filtering
>   (--heads, --tags), and sorting (--sort).

Yup. One thing to note is that git-show-ref(1) and git-for-each-ref(1)
are very similar, but not quite the same. One should find good arguments
for which of the two semantics is preferable to us and why that is. For
example, git-show-ref(1) outperforms git-for-each-ref(1) due to the
default format:

    Benchmark 1: git show-ref
      Time (mean ± σ):      99.0 ms ±   0.5 ms    [User: 55.6 ms, System: 43.0 ms]
      Range (min … max):    98.0 ms … 100.8 ms    100 runs

    Benchmark 2: git for-each-ref
      Time (mean ± σ):     134.0 ms ±   0.6 ms    [User: 82.3 ms, System: 50.8 ms]
      Range (min … max):   132.7 ms … 135.8 ms    100 runs

    Summary
      git show-ref ran
        1.35 ± 0.01 times faster than git for-each-ref

> - git-refs exists
>   Replaces git-show-ref --exists, providing reference existence checks
>   with positive (<ref>) and exclusion-based (--exclude-existing)
>   verification.

I'm not quite clear on what an exclusion-based existence check is. How
do you check whether something exists when you exclude it? I don't think
that this option is relevant in the context of `git refs exists`.

> - git-refs show
>   Replaces git-show-ref --verify, validating reference correctness with
>   a strict mode (--strict).

Yup. In contrast to `git refs resolve` this command shouldn't resolve
the ref, but directly show what it's pointing to. And this should be
true for both symbolic and normal refs.
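
To make the distinction between the two commands concrete, here is a toy
model in Python. It is only a sketch and not Git code: the dict-based ref
store, the `show()` and `resolve()` helpers and the `max_depth` parameter
are made up for illustration, loosely following the proposal's
`--max-depth` idea. The only detail borrowed from Git is that the "files"
backend stores a symbolic ref as `ref: <target>`.

```python
# Toy model only: a plain dict stands in for the ref store.
# None of these names exist in Git; they are illustrative assumptions.
SYMREF_PREFIX = "ref: "

REFS = {
    "HEAD": "ref: refs/heads/main",
    "refs/heads/main": "1" * 40,  # placeholder object ID
}


def show(refs, name):
    """Return what the ref stores directly, without following symrefs."""
    return refs[name]


def resolve(refs, name, max_depth=5):
    """Follow symbolic refs until an object ID is reached.

    Raises RuntimeError if more than max_depth symrefs would have to be
    followed, which also catches symref cycles.
    """
    value = refs[name]
    hops = 0
    while value.startswith(SYMREF_PREFIX):
        if hops == max_depth:
            raise RuntimeError(f"{name}: symref chain deeper than {max_depth}")
        value = refs[value[len(SYMREF_PREFIX):]]
        hops += 1
    return value
```

With this model, `show(REFS, "HEAD")` returns the symref target as stored,
while `resolve(REFS, "HEAD")` returns the object ID that the branch points
to. For a normal ref such as `refs/heads/main` both return the same value.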

> - git-refs resolve
>   Replaces git-symbolic-ref, resolving symbolic references with added
>   recursion depth control (--max-depth), while retaining deletion (-d)
>   and quiet mode (-q) options.

Not quite. The difference to `git refs show` is that this command always
resolves the ref to an object. So it's more similar to
`git rev-parse --verify`, except that it only ever handles references.

> - git-refs pack
>   Replaces git-pack-refs, packing loose references with support for
>   filtering (--include, --exclude) and automatic cleanup (--prune).

I would probably call this `git refs optimize` or something like that.
git-pack-refs(1) mostly has its name because it was introduced to pack
refs into the "packed-refs" file. But nowadays with the reftable backend
I think that the command name is somewhat inaccurate.

> - git-refs update
>   Replaces git-update-ref, providing transactional reference updates
>   with batch processing (--stdin) and atomic guarantees.
> - git-refs delete
>   Separates the delete functionality from git-update-ref, ensuring
>   explicit handling of reference removals with safety checks and batch
>   operations (--stdin).

It's up for debate whether we should even have something like `git refs
delete`. As you rightly note, `git refs update` already handles the use
case, so it feels like needless duplication.

> - git-refs check-format
>   Replaces git-check-ref-format, validating reference format with
>   support for normalized output (--normalize).

Ah, nice, this is a command I forgot about.

> #### Implementation Strategy
>
> 1. Option Parsing: Each subcommand will reuse the argument parsing
>    logic from legacy commands (e.g., git-pack-refs --prune).

We cannot and do not want to do this for every case. As mentioned above,
we may want to iterate on some of the subcommands to address historic
warts. But overall I agree, we should of course aim to reduce
duplication as far as it is sensible to do.

> 2. Shared Backend Logic: Calls to common functions in refs/ (e.g.,
>    reference traversal, locking mechanisms).
> 3. Error Consistency: Maintain the same error codes and message
>    formats as legacy commands.

Same reasoning here, we may want to adapt some of them. The old commands
won't go away as they are used everywhere, and that makes it more
reasonable for us to change behaviour in their newer equivalents.

> ---
>
> ### Example: Implementing git-refs pack
>
> #### Functional Implementation
>
> 1. Modify builtin/refs.c:
>    - Add cmd_refs_pack function implementing git-pack-refs logic.
>    - Update cmd_refs to include pack with
>      OPT_SUBCOMMAND("pack", &fn, cmd_refs_pack).
>    - Define REFS_PACK_USAGE:
>      git refs pack [--all] [--no-prune] [--auto] [--include <pattern>]
>      [--exclude <pattern>].
> 2. Register New Subcommand in git.c:
>    - Add { "refs-pack", cmd_refs_pack }, to the command array.

You don't actually have to change "git.c" to introduce new subcommands.
We don't want `git refs-pack`, but rather `git refs pack`, which is an
important distinction.

> 3. Reuse refs/files-backend.c Logic:
>    - Ensure cmd_refs_pack calls pack_refs correctly, adjusting as
>      necessary for new options.

We shouldn't have to touch any of the backends at all. You should rather
make sure to integrate with "refs.c", which wraps the backends and
provides a backend-agnostic interface to refs.

> #### Testing Plan
>
> - Test Cases:
>   Add t/txxx-refs-pack.sh, leveraging t/t0601-reffiles-pack-refs.sh
>   scenarios to verify:
>   - --prune removes obsolete references correctly.
>   - --include and --exclude apply filtering as expected.
>   - Packed references match legacy command outputs (diff .git/packed-refs).
> - Performance Benchmarking (if needed):
>   Add performance tests in t/perf to ensure no significant regression
>   in execution time or memory usage.
>
> #### Documentation Updates
>
> - User Manual:
>   Add a pack section to Documentation/git-refs.txt, mapping options to
>   legacy command equivalents.
> - Developer Notes:
>   Comment code to highlight functional parity between git-refs pack
>   and git-pack-refs.
>
> ---
>
> ### Timeline
>
> - May 8 - May 11 (4 days): Initial Testing & Subcommand Framework Setup
> - May 12 - May 28 (17 days): pack Subcommand Implementation
> - May 29 - June 14 (17 days): check-format Subcommand Development
> - June 15 - July 5 (21 days): update and delete Subcommands Development
> - July 6 - July 26 (21 days): show and exists Subcommands Development
> - July 27 - August 16 (21 days): resolve Subcommand Implementation
> - August 17 - September 6 (21 days): list Subcommand Implementation
> - September 7 - September 16 (10 days): Mid-term Review
> - September 17 - September 23 (7 days): Mentor Review & Final Adjustments

You probably underestimate the time to review and land a specific change
quite significantly. Landing new features in ~2 weeks is thus not quite
realistic and you should allocate a lot more time for each of the
specific subcommands.

That of course raises the question of how to squeeze all of the
subcommands into a single GSoC. And the answer is that you don't: it's
perfectly fine to implement only a subset of the newly proposed
subcommands. I'd rather you spend more time thinking about how to
improve upon the status quo for each subcommand than try to do
everything in a hurry.

So: there isn't any expectation that you manage to implement all of
them. I'd recommend picking a subset of commands that you want to
implement as a realistic goal. You may define other commands as a
stretch goal in case you manage to speed through the implementation way
faster than I anticipate.

Thanks!

Patrick