Hello,

This is my GSoC proposal.

The doc version - https://docs.google.com/document/d/1f1npZ7Ye-FOZENkfaR4SR2TrgSXnNlI8hvC2T0hJA_Y/edit?usp=sharing

# Proposal for GSoC 2025

## Project - Machine-Readable Repository Information Query Tool

## Personal information

* Name - Moumita Dhar
* Email - [dhar61595@xxxxxxxxx](mailto:dhar61595@xxxxxxxxx)
* Github - [https://github.com/Mou887](https://github.com/Mou887)
* LinkedIn - [https://www.linkedin.com/in/moumita-dhar-234940253/](https://www.linkedin.com/in/moumita-dhar-234940253/)

## About me

I’m a self-taught programmer who began coding in 2022. I got started by taking [CS50: Introduction to Computer Science](https://cs50.harvard.edu/x/) by Harvard University on edX, which sparked my curiosity about how software really works. Since then, I’ve been learning independently, completing follow-up CS50 courses such as **CS50’s Web Programming** and **CS50’s Understanding Technology** to build a strong foundation in computer science and software development.

While I hold a university degree, my academic background is **not in computer science**. However, I have consistently dedicated my time and energy to learning programming concepts, tools, and real-world development workflows on my own. I’m passionate about systems programming, developer tools, and contributing to meaningful open-source projects.

I’m participating in GSoC under the **Open Source Beginner** category. Even though I’m not currently a student, GSoC represents a unique opportunity for me to gain valuable mentorship and experience in large-scale software collaboration, while contributing to a project I deeply care about.

Outside of coursework and learning, I’ve also explored Git’s internals through personal projects and patches, and I’m excited to take this further through GSoC.

## Microproject

* Status - Under discussion
* Mail thread - [https://lore.kernel.org/git/20250330134018.9662-2-dhar61595@xxxxxxxxx/](https://lore.kernel.org/git/20250330134018.9662-2-dhar61595@xxxxxxxxx/)
* Description - I contributed to Git’s `userdiff` system by enhancing syntax detection for shell scripts. I focused on improving the patterns Git uses to identify function definitions (shown in hunk headers) and word boundaries (used by word diffs) in Bash scripts. I have iterated over four patch versions based on reviewer feedback.

## Project Overview: Decluttering `git rev-parse`

The core purpose of the command was to:

* **Parse revision identifiers** like `HEAD`, `master~2`, `origin/HEAD`, or tags.
* **Convert symbolic references** into full commit hashes (40 hexadecimal characters with SHA-1).
* **Resolve user input** into unambiguous commit IDs for internal use.

Over time, developers began adding utility options to `git rev-parse` that had **nothing to do with parsing revisions**, such as:

* `--is-bare-repository`
* `--git-dir`
* `--show-toplevel`
* `--is-inside-work-tree`

This project aims to:

1. **Extract non-revision-parsing functionality from `git rev-parse`.**
2. **Create a new structured command** (e.g., `git repo-info`) dedicated to:
   * Repository paths and environment
   * Status checks
   * Format queries
   * Superproject relationships
   * Git environment variables

## Project Timeline

### Community Bonding Period (Before June 2)

* Finalize the scope and confirm overall design with mentors.
* Settle on command name (e.g., `git-info`, `git-meta`) and structure.
* Review how `git-rev-parse` implements the related options.
* Draft the expected JSON output format for each functionality area.

### Week 1 (June 2–8): Repository Path Information

* Implement logic to report on repository layout and paths: `.git` directory, common directory, top-level path, relative and absolute paths, etc.
  Related options: `--git-dir`, `--git-common-dir`, `--git-path`, `--show-toplevel`, `--show-cdup`, `--show-prefix`, `--absolute-git-dir`
* Introduce new command skeleton and first subcommand infrastructure.
* Output structured data (e.g., JSON).
* Write an initial test suite and begin documentation.

### Week 2 (June 9–15): Git Environment Context

* Handle environment reporting: list Git-relevant environment variables (e.g., `GIT_DIR`, `GIT_WORK_TREE`).
  Related option: `--local-env-vars`
* Ensure the output is shell-safe and informative for scripting use.
* Write tests covering multiple shell environments.
* Finalize docs and polish previous week’s code based on mentor feedback.

### Week 3 (June 16–22): Repository State and Status

* Implement checks for the current repo state: whether the repo is bare or a shallow clone, and whether the current directory is inside `.git` or the working tree.
  Related options: `--is-bare-repository`, `--is-shallow-repository`, `--is-inside-git-dir`, `--is-inside-work-tree`
* Add structured output with booleans for each status.
* Test across various repo types (bare, shallow, normal).
* Document usage and update test coverage.

### Week 4 (June 23–29): Object and Ref Format Reporting

* Report the object format and reference storage format used: SHA-1 or SHA-256, and files or reftable.
  Related options: `--show-object-format`, `--show-ref-format`
* Ensure fallback behavior works for older Git versions or partial configurations.
* Add comprehensive tests and documentation for this area.

### Week 5 (June 30–July 6): Review & Midterm Prep

* Integrate feedback on the previous four areas.
* Finalize documentation and tests.
* Clean up patch series.
* Run full test suite and verify output consistency.
* Prepare for **midterm submission**.

### Week 6 (July 7–13): Superproject Awareness

* Implement logic to determine whether the current repo is inside a superproject, and show the outer working tree if present.
  Related option: `--show-superproject-working-tree`
* Handle edge cases where repo is not a submodule.
* Write test coverage and update documentation accordingly.

### Week 7 (July 14–20): Path Resolution Logic

* Add functionality to resolve Git-related paths: handle symlinks, relative paths, and `.git` indirection.
  Related option: `--resolve-git-dir`
* Focus on correctness and compatibility.
* Add comprehensive tests (symlinks, embedded repos, relative vs absolute).
* Document clearly.

### Week 8 (July 21–27): Code Review & Integration

* Submit patch series for areas from Weeks 6–7.
* Begin integrating all subcommands into a consistent command structure.
* Ensure consistent JSON schema and error handling.
* Begin polish and unification.

### Week 9 (July 28–Aug 3): Unified Output and CLI Polish

* Implement a top-level dispatcher for all functionality areas.
* Add `--format=json` or similar flags for consistent CLI interface.
* Write integration tests across all supported repo states.
* Run full test suite in clean and dirty trees.

### Week 10 (Aug 4–10): Final Documentation and Usability

* Write a complete manpage for the new command.
* Add real-world examples and shell usage patterns.
* Run `make check-docs`, validate formatting and help output.

### Week 11 (Aug 11–17): Final Mentor Review and Bugfixes

* Submit a full final patch series.
* Incorporate the last round of mentor feedback.
* Clean up commit messages and inline comments.
* Final CI runs and Git project best practices review.

### Week 12 (Aug 18–24): Submission and Wrap-Up

* Submit final work to Git mailing list (if not already).
* Complete final report, blog post, and GSoC submission.
* Add final tests or polish based on review feedback.

### Final Week

* Reserved for unforeseen delays or last-minute polish.

### Time period from April 9 to May 6

During this period, I plan to work on a **practice patch** based on my current understanding of the project.
This will help me evaluate how well I can implement the ideas outlined in my proposal and whether the timeline I’ve suggested is realistic.

This preparatory work will allow me to:

* Explore the relevant parts of the codebase in more depth
* Validate my implementation approach with a small, isolated prototype
* Build confidence in handling Git’s development workflow (compilation, testing, patch submission, etc.)

I understand that official coding for GSoC begins in June, and I will reserve actual patch submissions for that period, in accordance with GSoC guidelines. The goal of this exercise is solely to prepare myself to contribute effectively and responsibly from day one.

## Blogging

I will maintain a blog to document my progress, challenges, and learnings throughout the program. This will serve both as a personal reflection and a way to give back to the community by helping future contributors understand the development process within Git. I will post regular updates, starting from the community bonding period through to the final evaluation, covering details like subcommand implementation, testing strategies, mailing list interactions, and reviews.

My blog: [https://hashnode.com/@Moumita](https://hashnode.com/@Moumita)

## Post GSoC

My involvement with Git will not end with the GSoC coding period. I intend to continue contributing to the Git project even after GSoC concludes by following up on any remaining feedback related to my project, further refining and expanding the new command as needed, and actively participating in the community through patch reviews and mailing list discussions. I also plan to explore and work on other issues or features in the Git codebase that align with my interests.

Through GSoC, I hope to establish myself as a long-term contributor to Git. I see this project not just as a summer commitment, but as the start of a deeper and ongoing engagement with the Git project and the broader open source community.

## Availability

I am fully available for GSoC and can dedicate **approximately 8 hours per day, 7 days a week**, which comes to **about 50–56 hours per week**. I do not have any academic or job commitments during the GSoC period and can devote my full attention to the project. This flexibility allows me to accommodate feedback, mentor communication, code reviews, and unexpected blockers without falling behind on the proposed timeline. I'm also willing to adjust my schedule if needed to better sync with my mentor’s availability or project needs.
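
## Appendix: Output Sketch

The timeline plans to draft a JSON output format during the community bonding period. As a rough illustration only, the sketch below collects a few of the existing `git rev-parse` options listed in the timeline and groups them into a nested JSON document. It is written in Python purely for prototyping. The key names (`paths`, `state`, `format`, and the keys under them), the `QUERIES` table, and the helper functions are all hypothetical and not part of Git or of any agreed design.

```python
import json
import subprocess

# Existing `git rev-parse` options, grouped by functionality area.
# The grouping and the JSON key names are hypothetical, for illustration only.
QUERIES = {
    "paths": {
        "git_dir": "--git-dir",
        "common_dir": "--git-common-dir",
        "toplevel": "--show-toplevel",
    },
    "state": {
        "is_bare": "--is-bare-repository",
        "is_shallow": "--is-shallow-repository",
        "inside_git_dir": "--is-inside-git-dir",
        "inside_work_tree": "--is-inside-work-tree",
    },
    "format": {
        "object_format": "--show-object-format",
        "ref_format": "--show-ref-format",
    },
}


def run_rev_parse(option):
    """Run `git rev-parse <option>`; return stripped stdout, or None on failure."""
    result = subprocess.run(
        ["git", "rev-parse", option],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def to_value(text):
    """Map rev-parse's "true"/"false" strings to booleans; keep anything else."""
    return {"true": True, "false": False}.get(text, text)


def repo_info(run=run_rev_parse):
    """Collect every query into a nested dict ready for JSON serialization."""
    return {
        area: {key: to_value(run(option)) for key, option in options.items()}
        for area, options in QUERIES.items()
    }


def render(info):
    """Serialize the collected information as indented JSON."""
    return json.dumps(info, indent=2, sort_keys=True)
```

Inside a repository, `print(render(repo_info()))` would print the collected values. Options that fail (for example `--show-toplevel` in a bare repository) appear as `null`. The `run` parameter exists so the collection logic can be exercised without a real repository.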