On Mon, Mar 31, 2025 at 08:21:27PM +0530, JAYATHEERTH K wrote:
> ## **Synopsis**
> This project aims to develop a dedicated Git command that interfaces
> with Git’s internal APIs to produce structured JSON output,
> particularly for repository metadata. By offering a clean,
> machine-readable format, this tool will improve automation, scripting,
> and integration with other developer tools.
>
> ## **Benefits to the Community**
> ### **1. Simplifies Automation and Scripting**
> - Many Git commands output **human-readable text**, making automation
> **error-prone** and **dependent on fragile parsing**.
> - This project introduces **structured JSON output**, allowing scripts
> and tools to consume repository metadata **directly and reliably**.
> - No more **awkward text parsing**, `grep` hacks, or brittle `awk/sed`
> pipelines—just **clean, structured data**.
>
> ### **2. Eliminates the Overuse of `git rev-parse`**
> - `git rev-parse` is widely misused for extracting metadata, despite
> being intended primarily for **parsing revisions**.
> - Developers often **repurpose** it because there’s **no dedicated
> alternative** for metadata queries.
> - This project **corrects that gap** by introducing a **purpose-built
> command** that is **cleaner, more intuitive, and extensible**.
>
> ### **3. Optimizes CI/CD Pipelines**
> - CI/CD systems currently need **multiple Git commands** and
> associated parsing logic to fetch basic metadata:
>
> ```bash
> # Example: Gathering just a few common pieces of info
> BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "DETACHED")
> COMMIT=$(git rev-parse HEAD)
> REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "no-origin")
> # ... often requiring more commands and error handling logic.
> ```
> - The proposed command aims to **replace these multiple calls** with a
> **single, efficient query** returning comprehensive, structured JSON
> data.
> - This **simplifies pipeline scripts**, reduces process overhead, and
> makes CI/CD configurations **cleaner and more robust**.

I already saw this in another proposal, which indicates that the project
idea might be a bit underspecced. In any case, the goal of the project
isn't to write a single tool that is able to surface _all_ information
for a Git repository. It's rather that we want to surface low-level
information around the repository itself.

The basic intent is to give the options listed in git-rev-parse(1) under
the section "Options for Files" a better home. We have a bunch of
command line options there that allow us to parse environment variables,
paths, repository formats and other low-level stuff. But these aren't
really a good fit for git-rev-parse(1) itself because that tool was
intended to be about parsing revisions. So this is one of those
organically grown commands that has started to accumulate all kinds of
unrelated options that didn't have a better home elsewhere.

So the scope of the project is somewhat more limited compared to what
you propose here. As that impacts a lot of the implementation details as
well as the project timeline, I'm not going to comment on these now.

> ## Detailed Project Timeline
>
>
> **Phase 0: Pre-Acceptance Preparation (April 9 - May 7, 2025)**
>
> * **Focus:** Demonstrate continued interest and deepen understanding
> while awaiting results.
> * **Official GSoC Milestone:** April 8, 2025 - Proposal Deadline.
> * **Activities:**
> * **(April 9 - April 21):** Deep dive into Git's source code
> structure, focusing specifically on areas identified in the proposal's
> Technical Details:
> * `builtin/` directory structure and command handling.
> * `repository.h`, `refs.h`, `remote.h`, `config.c`, `strbuf.h`.
> * How existing commands like `git status`, `git branch`, `git
> rev-parse`, `git remote -v` access underlying data.
> * **(April 22 - May 7):**
> * Monitor the Git mailing list for discussions related to repository
> information, command output formats, or JSON usage.
> * Refine understanding of Git's testing framework as I've not done a
> deep dive into tests(`t/test-lib.sh`). Try running and understanding
> existing tests relevant to refs, remotes, or configuration.
> * Review Git's contribution guidelines (`SubmittingPatches`, coding
> style) again since most of my microproject time was related to
> documentation.
> * Try to start some more microprojects or actively converse in other patches.

Note that microprojects are supposed to be finished before submitting
your proposal. They are used for us mentors to figure out whether
candidates would be a good fit or not. So ideally, you would prominently
link to one or more of your finished microprojects in the proposal
itself already.

> **Phase 4: Documentation, Polish & Stretch Goals (Coding Weeks 9-12:
> July 22 - Aug 18, 2025 Approx.)**
>
> * **Focus:** Finalize documentation, implement error handling, address
> feedback, attempt stretch goals if feasible.
> * **Activities:**
> * **(Week 9: July 22 - July 28):** Complete the first draft of the man
> page, detailing usage, JSON schema, and options. Implement the
> `--json-errors` functionality for structured error reporting. Add
> tests for error cases.
> * **(Week 10: July 29 - Aug 4):** *Begin Stretch Goals (Conditional):*
> If core work is stable and time permits, start implementing
> `--head-only` / `--remotes-only` flags or the basic `is_dirty` check.
> Add tests for any implemented stretch goals.
> * **(Week 11: Aug 5 - Aug 11):** Thorough code cleanup, address all
> outstanding review comments on submitted patches. Ensure documentation
> is comprehensive and accurate. Final pass on test suite coverage.
> * **(Week 12: Aug 12 - Aug 18):** Prepare and submit final patches
> incorporating documentation, error handling, and any completed stretch
> goals.
> Final code freeze for GSoC evaluation purposes. Write blog post
> update summarizing final phase.

One thing that I also mentioned to others: instead of planning for one
big batch of work, I would strongly recommend planning your work in
smaller batches. You should ideally have multiple self-contained batches
of work that you can submit as early as possible while still bringing
some value to the project. This ensures that you can get feedback from
the bigger community early on.

> ## Past Communication and Microproject
> * **Blog**: [Blog](https://jayatheerthkulkarni.github.io/gsoc_blog/index.html)
> This blog contains a detailed communication description and blog of my
> microproject experience.
> * First Introduction to the Git Mailing list: [first
> Mail](https://lore.kernel.org/git/CA+rGoLc69R8qgbkYQiKoc2uweDwD10mxZXYFSY8xFs5eKSRVkA@xxxxxxxxxxxxxx/t/#u)
> * First patch to the git mailing list: [First
> Patch](https://lore.kernel.org/git/20250312081534.75536-1-jayatheerthkulkarni2005@xxxxxxxxx/t/#u)
> * Most recent series of patches and back and forth with feedbacks:
> [Main mail thread](https://lore.kernel.org/git/xmqqa59evffd.fsf@gitster.g/T/#t)
>
> I've been maintaining the blog and will maintain the blogs of all the
> communication of mine to the git mailing list.

Ah, you do have a microproject. As this is part of the prerequisites, I
would suggest making it more prominently visible.

Thanks!

Patrick