# Proposal for GSoC 2025 to Git

**Machine-Readable Repository Information Query Tool**

## Contact Details

* **Name**: K Jayatheerth
* **Email**: jayatheerthkulkarni2005@xxxxxxxxx
* **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
* **GitHub**: [GitHub](https://github.com/jayatheerthkulkarni)

## Prerequisites & Experience

As part of the GSoC application prerequisites, I have engaged with the Git community and started a microproject. It involved **updating `MyFirstContribution.adoc` so that the documentation matches the current codebase**, which gave me hands-on experience with Git's codebase structure (documentation files), the contribution workflow (submitting patches with `git send-email`, addressing feedback across versions), and communication on the mailing list.

* **Microproject Status:** v4 submitted, incorporating feedback, awaiting further review.
* **Microproject Patch Series:** [Main mail thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t) (link to the most relevant thread, showing the review interaction and patch refinement)
* **Initial Patch:** [First Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@xxxxxxxxx/t/#u)
* **Mailing List Introduction:** [First Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@xxxxxxxxxxxxxx/t/#u)
* **Blog:** My GSoC blog details these interactions: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)

## **Synopsis**

This project focuses on **refactoring Git by creating a dedicated command (tentatively named `git info`, subject to further discussion) to house the low-level repository, path, and format-related query options currently misplaced under the "OPTIONS FOR FILES" section of `git-rev-parse(1)`**. This new command, potentially using a subcommand structure (e.g., `git info path`, `git info repo`), will provide a more logical and maintainable location for this functionality.
This allows `git rev-parse` to better focus on its core purpose of parsing revisions, ultimately improving Git's internal organization and command structure clarity by offering a **cleaner interface** for these specific queries.

## **Benefits to the Community**

### **1. Improves `git rev-parse` Clarity and Maintainability**

- `git rev-parse` has accumulated various options unrelated to its primary purpose of parsing revisions, particularly those for querying low-level repository state and paths.
- This project **directly addresses this issue** by migrating these options to a dedicated, purpose-built command, making `git rev-parse` cleaner and easier to understand and maintain.
- Provides a **clearer separation of concerns** within Git's command suite.

### **2. Provides Reliable Access for Automation and Scripting**

- Scripts often need fundamental repository information like the top-level directory path, the `.git` directory location, or repository state.
- Currently, scripts rely on `git rev-parse` for this, invoking it for tasks outside its core revision-parsing role.
- The new `git info` command will offer a **stable, dedicated, and cleaner interface** for retrieving this specific low-level information, making scripts **more robust and readable** by calling the command designed explicitly for these tasks.

## Deliverables

This project will introduce a new Git command, **tentatively named `git info`**, serving as the designated home for specific low-level query options migrated from `git rev-parse`. The implementation will likely adopt a **subcommand structure**. The key deliverables for this GSoC project include:

1. **New Core Command: `git info` with Subcommands**
   * A new `builtin/info.c` command integrated into the Git source code.
   * Implementation primarily in C, using `parse-options` to handle **subcommands** (e.g., `path`, `repo`, `misc`) and their specific options.
   * Leverages existing internal Git APIs and logic currently within `rev-parse.c`.
2. **Relocated `rev-parse` Options under Subcommands:**
   * Implementation of the core functionality behind selected options from `git-rev-parse(1)`'s "OPTIONS FOR FILES" section, organized under appropriate subcommands within `git info`. *(Specific options and subcommand grouping subject to final confirmation with mentor)*:
     * **`git info path ...` (Example Grouping):**
       * `--show-cdup` -> `git info path --cdup` (or similar)
       * `--show-prefix` -> `git info path --prefix`
       * `--show-toplevel` -> `git info path --toplevel`
       * `--show-superproject-working-tree` -> `git info path --superproject-worktree`
     * **`git info repo ...` (Example Grouping):**
       * `--git-dir` -> `git info repo --git-dir`
       * `--git-common-dir` -> `git info repo --common-dir`
       * `--resolve-git-dir <path>` -> `git info repo --resolve-dir <path>`
       * `--is-bare-repository` -> `git info repo --is-bare`
       * `--is-shallow-repository` -> `git info repo --is-shallow`
     * **`git info misc ...` (Example Grouping for others):**
       * `--is-inside-git-dir` -> `git info misc --inside-gitdir`
       * `--is-inside-work-tree` -> `git info misc --inside-worktree`
       * `--shared-index-path` -> `git info misc --shared-index-path`
   * *(Design Consideration):* Option names within subcommands might be slightly adjusted for clarity/consistency (e.g., dropping "show-").
3. **Multiple Output Formats:**
   * **Default Text Output:** The default output for each implemented option will be simple, human-readable text, **matching the semantics and format** produced by the corresponding `git rev-parse` option (e.g., printing a path string, "true"/"false", or exiting with status 0/1 for boolean checks).
   * **NUL Termination (`-z`):** Implement a `-z` option (standard across many Git plumbing commands) for unambiguous, newline-safe output suitable for scripting, particularly for path-related options.
   * **JSON Output (`--json`):** Implement a `--json` option to provide structured output, mapping query keys (derived from options) to their values.
     This offers maximum flexibility for tools consuming the information. *(The relative priority and implementation details of `-z` vs `--json` to be discussed with mentor, but both are considered core deliverables)*.
4. **Comprehensive Documentation (Incremental):**
   * A clear man page (`git-info.adoc`) explaining the new command's purpose, the subcommand structure, and detailing the usage, options (including `-z`, `--json`), and output formats for each implemented feature. **Relevant sections of the man page will be added or updated within each patch series submitted.**
   * Updates to `git-rev-parse.adoc` to clearly **document the relationship** with `git info` for the migrated options (e.g., noting that `git info` is the preferred command) and potentially marking them for deprecation. **These updates will also be included incrementally with relevant patch series.**
5. **Robust Test Suite (Incremental):**
   * A new test script (`t/tXXXX-info.sh`) using Git's test framework (`test-lib.sh`).
   * Tests covering the subcommand structure, each implemented option, and **all output formats** (`text`, `-z`, `--json`).
   * Tests validating behavior across various repository states (standard, bare, inside `.git`, inside worktree, submodules, shallow clone etc.). **New tests will be added within each patch series for the features implemented.**

## Technical Details

1. **Core `git info` Command Implementation:**
   * **Entry Point:** Create `builtin/info.c` with `cmd_info(...)`. Use `parse-options` to parse the **subcommand** first. Based on the subcommand, invoke a specific helper function (e.g., `cmd_info_path()`, `cmd_info_repo()`) which then uses `parse-options` again to handle the options specific to that subcommand.
   * **Repository Context:** Standard setup using `repo` structure, `startup_info`, and potentially `setup_git_directory_gently`.
   * **Reusing Logic:** Adapt logic from `builtin/rev-parse.c` for the core functionality of each option.
     This might involve direct code migration or creating shared helper functions where appropriate.
   * **Subcommand Implementation:** Implement helper functions for each subcommand (`path`, `repo`, `misc`) containing the `parse_options` calls and logic for the options within that group.
   * **Output Generation:**
     * **Text (Default):** Use `printf("%s\n", ...)` / `puts(...)` for string output; print "true"/"false" or use `exit(0)` / `exit(1)` for boolean checks, mimicking `rev-parse`.
     * **NUL (`-z`):** Use `putchar('\0')` or `fwrite(..., 1, 1, stdout)` instead of newline for string output when `-z` is active. Boolean checks likely remain exit-code based.
     * **JSON (`--json`):** Collect results internally. Use Git's `strbuf` API (with `strbuf_add_json_string` etc.) or potentially an approved C JSON library to construct and print a JSON object mapping keys to values. All requested info within a single invocation should ideally be combined into one JSON object.
2. **Documentation:**
   * Create `Documentation/git-info.adoc`. Structure based on subcommands. Detail each subcommand and its options, including `-z` and `--json` behavior.
   * Modify `Documentation/git-rev-parse.adoc` to add cross-references for relevant options.
   * **Documentation updates will accompany the code changes in each patch series.**
3. **Testing:**
   * Create `t/tXXXX-info.sh`.
   * Use `test_expect_success` with helpers like `test_create_repo`, `test_cmp`, `test_must_fail`.
   * Add tests for:
     * Correct subcommand parsing and error handling.
     * Each option under its subcommand, comparing **text output** against `rev-parse` (where applicable) or expected values/exit codes.
     * **`-z` output** using appropriate comparison methods (e.g., piping to `tr '\0' '\n'`).
     * **`--json` output** using tools like `jq` (if available in test env) or careful `grep`/`sed` checks for structure and values.
   * **Tests will be added incrementally with the features in each patch series.**

## Detailed Project Timeline

**Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**

**Phase 1: Community Bonding & Final Planning (May 8 - May 26, 2025 Approx.)**

* **Focus:** Formal introductions, finalize scope, agree on command structure, setup.
* **Activities:**
  * **(Week 1: May 8 - May 12):** Discuss proposal with mentor(s). Finalize:
    * Command name (`git info` or alternative).
    * Subcommand structure and grouping of options.
    * Exact list of options to port, and any necessary renaming within subcommands.
    * Approach for handling relationship with `rev-parse` (deprecation vs. aliasing vs. simple documentation cross-link).
    * Prioritization/approach for implementing `-z` and `--json`.
  * **(Week 2: May 13 - May 19):** Set up dev environment. Deep dive into agreed-upon code blocks in `rev-parse.c`. Outline `builtin/info.c` structure including subcommand handlers. Outline initial test script `t/tXXXX-info.sh`.
  * **(Week 3: May 20 - May 26):** Implement basic `cmd_info` skeleton, top-level subcommand parsing, repository setup. Implement one simple subcommand handler (e.g., `cmd_info_path`) with basic option parsing structure. Write initial "no-op" / basic structure tests. Post first blog update.

**Phase 2: Implementation in Batches (Coding Weeks 1-8: May 27 - July 21, 2025 Approx.)**

* **Focus:** Implement options within subcommands, including documentation and tests for text output first, then potentially add machine-readable formats. Submit patches early and often.
* **GSoC Milestone:** Midterm Evaluations occur around Week 8.
* **Activities:** *(Structure assumes implementing text output first, then `-z`/`--json` later in the phase)*
  * **(Batch 1 / Weeks 1-2: May 27 - June 9):** Implement `path` subcommand options (`--toplevel`, `--prefix`, `--cdup`). Implement **text output**.
    Add corresponding **tests** and **documentation** snippets (for `git-info.adoc` and `git-rev-parse.adoc`). **Submit Patch Series 1**.
  * **(Batch 2 / Weeks 3-4: June 10 - June 23):** Implement `repo` subcommand options (`--git-dir`, `--common-dir`, `--resolve-dir`, `--is-bare`). Implement **text output**. Add **tests** and **documentation** snippets. **Submit Patch Series 2**. Write blog post update.
  * **(Batch 3 / Weeks 5-6: June 24 - July 7):** Implement remaining `repo` (`--is-shallow`) and `misc` subcommand options (`--inside-gitdir`, `--inside-worktree`, `--shared-index-path`, `--superproject-worktree` - *adjust subcommand grouping based on final plan*). Implement **text output**. Add **tests** and **documentation**. **Submit Patch Series 3**.
  * **(Batch 4 / Weeks 7-8: July 8 - July 21):** Implement **`-z` and `--json` output formats** for all options added in Batches 1-3. Add comprehensive **tests** for these formats. Update **documentation** to fully describe `-z` and `--json` behavior. **Submit Patch Series 4**. Prepare for Midterm evaluation; ensure submitted batches show substantial progress on core functionality and formats. Write blog post update.

**Phase 3: Refinement & Final Polish (Coding Weeks 9-12: July 22 - Aug 18, 2025 Approx.)**

* **Focus:** Address feedback on all patches, ensure robustness, finalize documentation consistency.
* **Activities:**
  * **(Week 9: July 22 - July 28):** Focus on addressing review comments on **all** previous patch series (Code, Tests, Docs). Refactor based on feedback.
  * **(Week 10: July 29 - Aug 4):** Continue addressing feedback. Ensure the test suite is robust and covers edge cases identified in reviews. Perform thorough documentation review for consistency and clarity across the entire man page.
  * **(Week 11: Aug 5 - Aug 11):** Final code cleanup. Final pass on test coverage.
    *(Stretch Goal Idea):* If all core work is stable and time permits, potentially explore adding one or two *new*, simple, agreed-upon repo info queries (not from `rev-parse`) that fit the command's purpose.
  * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final versions of all patch series, incorporating all feedback. Final self-testing. Write blog post update summarizing progress and final state. Code freeze for final evaluation.

**Phase 4: Final Evaluation & Wrap-up (Aug 19 - Nov 19, 2025)**

* Write final GSoC project summary blog post.
* Continue engaging with the community through further contributions beyond GSoC.

Thank You,
Jayatheerth
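
A note on the Testing item under Technical Details: the `-z` comparison mentioned there (translating NUL to newline with `tr`) could look roughly like the sketch below. Since `git info` does not exist yet, `fake_info_path_z` is a hypothetical stand-in that prints two made-up NUL-terminated values, and `nul_to_lf` is an illustrative helper name, not part of Git's test library. In the real test script this logic would presumably sit inside `test_expect_success` and use `test_cmp` instead of plain `cmp`.

```shell
#!/bin/sh
# Hypothetical stand-in for something like `git info path -z --toplevel --prefix`.
# The command does not exist yet; the two values are made-up examples.
fake_info_path_z () {
	printf '%s\0' "/tmp/repo with space" "sub/dir/"
}

# Turn NUL terminators into newlines so line-based tools can compare the output.
nul_to_lf () {
	tr '\0' '\n'
}

workdir=$(mktemp -d)

# Expected values, one per line.
printf '%s\n' "/tmp/repo with space" "sub/dir/" >"$workdir/expect"

# Actual output, converted from NUL-terminated to newline-terminated.
fake_info_path_z | nul_to_lf >"$workdir/actual"

if cmp -s "$workdir/expect" "$workdir/actual"
then
	result=match
else
	result=mismatch
fi

# Count the NUL bytes separately: the tr conversion alone cannot tell a
# terminator apart from a newline embedded in a path.
nul_count=$(fake_info_path_z | tr -cd '\0' | wc -c | tr -d ' ')

rm -rf "$workdir"
echo "$result, $nul_count NUL terminators"
```

Counting the terminators in addition to the line comparison is a design choice of this sketch: it keeps the check meaningful for paths that themselves contain newlines, which is the case `-z` is meant to make safe.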