MOUMITA DHAR <dhar61595@xxxxxxxxx> writes:

Hello Moumita,

> Hello,
> This is my GSOC proposal
> The doc version -
> https://docs.google.com/document/d/1f1npZ7Ye-FOZENkfaR4SR2TrgSXnNlI8hvC2T0hJA_Y/edit?usp=sharing
>
> # Proposal for GSOC 2025
>
> ## Project - Machine-Readable Repository Information Query Tool
>
> ## Personal information
>
> Name - Moumita Dhar
> Email - [dhar61595@xxxxxxxxx](mailto:dhar61595@xxxxxxxxx)
> Github - [https://github.com/Mou887](https://github.com/Mou887)
> LinkedIn - [https://www.linkedin.com/in/moumita-dhar-234940253/](https://www.linkedin.com/in/moumita-dhar-234940253/)
>
> ## About me
>
> I’m a self-taught programmer who began my coding journey in 2022. I
> got started by taking [CS50: Introduction to Computer
> Science](https://cs50.harvard.edu/x/) by Harvard University on edX,
> which sparked my curiosity about how software really works. Since
> then, I’ve been learning independently, completing several follow-up
> CS50 courses like **CS50’s Web Programming**, and **CS50’s
> Understanding Technology**, to build a strong foundation in computer
> science and software development.
>
> While I hold a university degree, my academic background is **not in
> computer science**. However, I have consistently dedicated my time and
> energy to learning programming concepts, tools, and real-world
> development workflows on my own. I’m passionate about systems
> programming, developer tools, and contributing to meaningful
> open-source projects.
>

The rules [1] don't mention anything about requiring a *computer
science* degree, so all participants are welcome!

> I’m participating in GSoC under the **Open Source Beginner** category.
> Even though I’m not currently a student, GSoC represents a unique
> opportunity for me to gain valuable mentorship and experience in
> large-scale software collaboration, while contributing to a project I
> deeply care about.
>
> Outside of coursework and learning, I’ve also explored Git’s internals
> through personal projects and patches, and I’m excited to take this
> further through GSoC.
>
> ## Microproject
>
> Status - Under discussion
>
> Mail thread - [https://lore.kernel.org/git/20250330134018.9662-2-dhar61595@xxxxxxxxx/](https://lore.kernel.org/git/20250330134018.9662-2-dhar61595@xxxxxxxxx/)
>
> Description - I contributed to Git’s `userdiff` system by enhancing
> syntax detection for shell scripts. I focused on improving how Git
> highlights and navigates function definitions and words in Bash
> scripts during diffs. I have iterated over four patch versions based
> on reviewer feedback.
>
> ## Project Overview: Decluttering `git rev-parse`
>
> The core purpose of the command was to -
> **Parse revision identifiers** like `HEAD`, `master~2`, `origin/HEAD`, or tags.
>
> **Convert symbolic references** into full 40-character commit hashes.
>
> **Resolve user input** into unambiguous commit IDs for internal use.
>
> Over time, developers began adding utility options to `git rev-parse`
> that had **nothing to do with parsing revisions**, such as:
>
> * `--is-bare-repository`
>
> * `--git-dir`
>
> * `--show-toplevel`
>
> * `--is-inside-work-tree`
>
> This project aims to:
>
> 1. **Extract non-revision-parsing functionality from `git rev-parse`.**
>
> 2. **Create a new structured command** (e.g., `git
> repo-info`) dedicated to:
>
> * Repository paths and environment
> * Status checks
> * Format queries
> * Superproject relationships
> * Git environment variables
>

Makes sense. As I mentioned on another proposal [2], it would be nice
to mention that everything under the 'Options for Files' section of the
'git rev-parse' manpage probably needs a new home.

I also think you should elaborate on what the new command would look
like. Will we simply copy over the options? Will there be better, more
consistent naming? What would the default output for 'git repo-info' be?
Also, how do you justify the name? Is it consistent with the command
names in Git? Is it self-explanatory?

It would also be nice to write a brief note on how you plan to tackle
this, not from a timeline perspective but from a technical perspective.

> ## Project Timeline
>
> ### Community Bonding Period (Before June 2)
>
> * Finalize the scope and confirm overall design with mentors.
>
> * Settle on command name (e.g., `git-info`, `git-meta`) and structure.
>

I would suggest involving the mailing list as soon as possible, as
you'd get some good feedback around the early design.

> * Review how `git-rev-parse` implements the related options.
>
> * Draft the expected JSON output format for each functionality area.
>
> ### Week 1 (June 2–8): Repository Path Information
>
> * Implement logic to report on repository layout and paths:
>
> `.git` directory, common directory, top-level path, relative and
> absolute paths, etc.
>
> Related options: `--git-dir`, `--git-common-dir`, `--git-path`,
> `--show-toplevel`, `--show-cdup`, `--show-prefix`,
> `--absolute-git-dir`
>

Nice, I like that the project is broken down into smaller modules.

> * Introduce new command skeleton and first subcommand infrastructure.
>
> * Output structured data (e.g., JSON).

How do you plan to tackle this? Have you taken a look at
json-writer.[c,h]?

>
> * Write an initial test suite and begin documentation.
>
> ### Week 2 (June 9–15): Git Environment Context
>
> * Handle environment reporting:-
>
> List Git-relevant environment variables (e.g., `GIT_DIR`,
> `GIT_WORK_TREE`, etc.)
>
> Related option: `--local-env-vars`
>
> * Ensure the output is shell-safe and informative for scripting use.
>
> * Write tests covering multiple shell environments.
>
> * Finalize docs and polish previous week’s code based on mentor feedback.
>

I think this is a good point. Generally, things take a long time, since
we need to sync with the mailing list and ensure it is up to a good
standard.
Then the topic will slowly move from seen -> next -> master.

> ### **Week 3 (June 16–22): Repository State and Status**
>
> * Implement checks for current repo state:-
>
> If the repo is a bare repo, shallow clone, inside
> `.git` or working tree.
>
> Related options: `--is-bare-repository`, `--is-shallow-repository`,
> `--is-inside-git-dir`, `--is-inside-work-tree`
>
> * Add structured output with booleans for each status.
>
> * Test across various repo types (bare, shallow, normal).
>
> * Document usage and update test coverage.
>
> ### Week 4 (June 23–29): Object and Ref Format Reporting
>
> * Report the object format and reference storage format used:-
>
> SHA-1/SHA-256, loose or reftable, etc.
>
> Related options: `--show-object-format`, `--show-ref-format`.
>
> * Ensure fallback behavior works for older Git versions or partial
> configurations.
>
> * Add comprehensive tests and documentation for this area.
>
> ### Week 5 (June 30–July 6): Review & Midterm Prep
>
> * Integrate feedback on the previous four areas.
>
> * Finalize documentation and tests.
>
> * Clean up patch series.
>
> * Run full test suite and verify output consistency.
>
> * Prepare for **midterm submission**.
>
> ### Week 6 (July 7–13): Superproject Awareness
>
> * Implement logic to determine whether the current repo is inside a
> superproject:-
>
> Show the outer working tree if present.
>
> Related option: `--show-superproject-working-tree`
>
> * Handle edge cases where repo is not a submodule.
>
> * Write test coverage and update documentation accordingly.
>
> ### Week 7 (July 14–20): Path Resolution Logic
>
> * Add functionality to resolve Git-related paths:
>
> Handle symlinks, relative paths, and `.git` indirection.
>
> Related option: `--resolve-git-dir`
>
> * Focus on correctness and compatibility.
>
> * Add comprehensive tests (symlinks, embedded repos, relative vs absolute).
>
> * Document clearly.
>
> ### Week 8 (July 21–27): Code Review & Integration
>
> * Submit patch series for areas from Weeks 6–7.
>
> * Begin integrating all subcommands into a consistent command structure.
>

Could you expand on what you mean here?

> * Ensure consistent JSON schema and error handling.
>

And here.

> * Begin polish and unification.
>
> ### Week 9 (July 28–Aug 3): Unified Output and CLI Polish
>
> * Implement a top-level dispatcher for all functionality areas.
>

This too. What is a dispatcher in the context of our codebase?

> * Add `--format=json` or similar flags for consistent CLI interface.
>
> * Write integration tests across all supported repo states.
>

Aren't tests covered as part of each batch of work? What extra do these
tests add, and why aren't they part of the initial tests?

> * Run full test suite in clean and dirty trees.
>

This should be part of each batch, no?

> ### Week 10 (Aug 4–10): Final Documentation and Usability
>
> * Write a complete manpage for the new command.
>

I would say each patch should hold corresponding documentation; it is
not something we want to work on at the end. We don't want a project
left midway without _any_ documentation. It'd be better if sufficient
documentation is added for each new block of changes, so that the state
of the project is not lacking at any point. So code, tests, and
documentation should all be part of each block of work you do.

> * Add real-world examples and shell usage patterns.
>
> * Run `check-docs`, validate formatting and help output.
>
> ### Week 11 (Aug 11–17): Final Mentor Review and Bugfixes
>
> * Submit a full final patch series.
>
> * Incorporate the last round of mentor feedback.
>

I think this, too, is part of each step of the process.

> * Clean up commit messages and inline comments.
>
> * Final CI runs and Git project best practices review.
>
> ### Week 12 (Aug 18–24): Submission and Wrap-Up
>
> * Submit final work to Git mailing list (if not already).
>
> * Complete final report, blog post, and GSoC submission.
>
> * Add final tests or polish based on review feedback.
>
> ### Final Week
>
> * Reserved for unforeseen delays or last-minute polish.
>

Overall, it seems like we're building up to one big patch series at the
end. The recommended route would be to split the work into small chunks
and get each chunk through one at a time. Each chunk would contain the
necessary code, tests, and documentation, and should be in a state where
it can be merged into the main tree.

> ### Time period from April 9 to May 6
>
> During this period, I plan to work on a **practice patch** based on my
> current understanding of the project. This will help me evaluate how
> well I can implement the ideas outlined in my proposal and whether the
> timeline I’ve suggested is realistic.
>
> This preparatory work will allow me to:
>
> * Explore the relevant parts of the codebase in more depth
>
> * Validate my implementation approach with a small, isolated prototype
>
> * Build confidence in handling Git’s development workflow
> (compilation, testing, patch submission, etc.)
>
> I understand that official coding for GSoC begins in June, and I will
> reserve actual patch submissions for that period, in accordance with
> GSoC guidelines. The goal of this exercise is solely to prepare myself
> to contribute effectively and responsibly from day one.
>
> ## Blogging
>
> I will maintain a blog to document my progress, challenges, and
> learnings throughout the program. This will serve both as a personal
> reflection and a way to give back to the community by helping future
> contributors understand the development process within Git. I will
> post regular updates, starting from the community bonding period
> through to the final evaluation, covering details like subcommand
> implementation, testing strategies, mailing list interactions, and
> reviews.
> My blog :- [https://hashnode.com/@Moumita](https://hashnode.com/@Moumita)
>
> ## Post GSOC
>
> My involvement with Git will not end with the GSoC coding period. I
> intend to continue contributing to the Git project even after GSoC
> concludes by following up on any remaining feedback related to my
> project, further refining and expanding the new command as needed, and
> actively participating in the community through patch reviews and
> mailing list discussions. I also plan to explore and work on other
> issues or features in the Git codebase that align with my interests.
> Through GSoC, I hope to establish myself as a long-term contributor to
> Git. I see this project not just as a summer commitment, but as the
> start of a deeper and ongoing engagement with the Git project and the
> broader open source community.
>
> ## Availability
>
> I am fully available for GSoC and can dedicate **approximately 8 hours
> per day, 7 days a week**, which totals to **about 50–56 hours per
> week**. I do not have any academic or job commitments during the GSoC
> period and can devote my full attention to the project.
>
> This flexibility allows me to accommodate feedback, mentor
> communication, code reviews, and unexpected blockers without falling
> behind on the proposed timeline. I'm also willing to adjust my
> schedule if needed to better sync with my mentor’s availability or
> project needs.

Thanks for the proposal!

- Karthik

[1]: https://summerofcode.withgoogle.com/rules
[2]: https://lore.kernel.org/git/CAOLa=ZQSnwSPw1U_-2YZzjK5z_jUEB3vGy=So5e+gpOa87Ei=w@xxxxxxxxxxxxxx/T/#mc4c5c87594cd2e0ea795259a6868b3494781cf86
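
For illustration only: the question of what the structured output might look like can be explored outside Git by wrapping options that already exist in 'git rev-parse'. The sketch below is not part of the proposal and is not how the new command would be implemented in C; the function names and JSON key names are hypothetical, and only the 'git rev-parse' options themselves are real.

```python
import json
import subprocess

# Existing `git rev-parse` options that print "true" or "false".
# The JSON key names on the left are hypothetical, chosen for this sketch.
BOOLEAN_OPTIONS = {
    "is_bare_repository": "--is-bare-repository",
    "is_shallow_repository": "--is-shallow-repository",
    "is_inside_git_dir": "--is-inside-git-dir",
    "is_inside_work_tree": "--is-inside-work-tree",
}

# Existing `git rev-parse` options that print a single string.
STRING_OPTIONS = {
    "git_dir": "--git-dir",
    "git_common_dir": "--git-common-dir",
    "object_format": "--show-object-format",
}


def run_rev_parse(option):
    """Run `git rev-parse <option>` in the current directory.

    Returns the trimmed stdout; raises CalledProcessError if git fails,
    for example when run outside a repository.
    """
    result = subprocess.run(
        ["git", "rev-parse", option],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def collect_repo_info(run=run_rev_parse):
    """Gather repository facts into one dict with typed values.

    `run` is injectable so the logic can be exercised without a real
    repository.
    """
    info = {}
    for key, option in BOOLEAN_OPTIONS.items():
        info[key] = run(option) == "true"
    for key, option in STRING_OPTIONS.items():
        info[key] = run(option)
    return info
```

Inside a repository, `print(json.dumps(collect_repo_info(), indent=2))` would emit one JSON object with real booleans rather than the strings "true"/"false", which is the kind of schema decision worth settling on the mailing list early.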