[GSoC PROPOSAL v2] Refactoring in order to reduce Git’s global state

## Personal Information

- Full name: Arnav Akshaya Bhate
- Email address: bhatearnav@xxxxxxxxx
- Mobile no.: +91 8291328838
- Time zone: UTC+05:30
- Education: IIT Bombay
- Year: Second year
- GitHub: https://github.com/arnavbhate

## About Me

I'm Arnav Bhate, a second-year UG student at the Indian Institute of
Technology Bombay. I love coding, which is why I am a member of IIT
Bombay's Developers' Community (DevCom), a group of roughly 40 people
developing software for use by students and staff of the institute. Most
of the software developed is not open source, so I cannot include
examples of my work there in this proposal. Being a member of DevCom has
exposed me to collaborative software development.

A common link in all software I have worked on is that Git has been used
for version control. I thus see this project as my way of giving back to
the Git community in particular and open source in general. This will be
my first significant contribution to the open source community, and I
wish to stick around afterwards.

## Overview

Git currently uses many global variables, most significantly
`the_repository`, which is used in roughly 290 files. Apart from
`the_repository`, there are many other global variables, some of which
logically belong in `struct repository`, as they represent information
specific to a repository. So even if all instances of `the_repository`
were converted into an extra repository argument for the function, there
would still be many global variables left.

The use of such variables assumes that Git will only operate on one
repository at a time, which renders multi-repository handling
impossible without kludges.

This project aims to move such variables from global scope into more
appropriate local contexts, mainly `struct repository` and
`struct repo_settings`. This will not only make the environment
repository-specific, allowing easy multi-repository handling, but also
make the code easier to maintain.

The project involves identifying suitable locations in
repository-specific structs for the global variables defined in
environment.c, moving them there, and updating all the code affected by
the move.
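As a rough illustration of the idea (this is not Git's actual code; `repo_sketch`, `settings_sketch` and the `ignore_case` field are hypothetical stand-ins), the sketch below contrasts a setting held in a global with the same setting held per repository. With the global, two repositories opened in one process cannot disagree on the value; with the struct, they can.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT Git's real definitions. */
struct settings_sketch {
	int ignore_case; /* illustrative per-repository setting */
};

struct repo_sketch {
	const char *gitdir;
	struct settings_sketch settings;
};

/* Before: a single process-wide value shared by every repository. */
static int global_ignore_case;

/* Shared helper: compare two names, optionally ignoring ASCII case. */
int same_name(const char *a, const char *b, int ignore_case)
{
	for (; *a && *b; a++, b++) {
		int ca = (unsigned char)*a;
		int cb = (unsigned char)*b;
		if (ignore_case) {
			ca = tolower(ca);
			cb = tolower(cb);
		}
		if (ca != cb)
			return 0;
	}
	return *a == *b; /* equal only if both strings ended together */
}

/* Before: behaviour depends on hidden global state. */
int same_name_global(const char *a, const char *b)
{
	return same_name(a, b, global_ignore_case);
}

/* After: the caller says which repository's setting applies. */
int same_name_in_repo(const struct repo_sketch *repo,
		      const char *a, const char *b)
{
	assert(repo != NULL);
	return same_name(a, b, repo->settings.ignore_case);
}
```

In this sketch, a case-insensitive repository and a case-sensitive one can be handled side by side simply by passing a different `repo_sketch` to `same_name_in_repo`.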

## Pre-GSoC

I first got into Git's codebase in February 2025, with my first
contribution in March. My first patch was on my microproject and since
then I have submitted two more patches on a similar topic.

### Patches

- (Microproject) decorate: fix sign comparison warnings  
  Thread: https://lore.kernel.org/git/afa6b428-3190-42ae-9eac-540c95b576fd@xxxxxxxxx/  
  Status: Merged into master  
  Commit hash: 2bfd3b368572cbf1ce287de09db08b7e7e429ecd  
  Description: Refactoring of decorate.c to replace signed variables
  with unsigned ones when they are used to iterate over arrays whose
  sizes are represented by unsigned variables, and remove 2 unnecessary
  variables which just hold the value of another variable without being
  modified, replacing them with the variable whose value they were
  holding.

- rm: fix sign comparison warnings  
  Thread: https://lore.kernel.org/git/38de63ce-6d4e-4f1f-95b1-049df78d9cfc@xxxxxxxxx/  
  Status: Under discussion  
  Description: Refactoring of rm.c to make iterators over arrays whose
  sizes are represented by unsigned variables unsigned. Specifically in
  `get_ours_cache_pos`, where before a signed variable was being passed
  and then inverted in the function, now the already inverted variable
  is passed as an unsigned variable, with the inversion moved to the
  function call.

- pathspec: fix sign comparison warnings  
  Thread: https://lore.kernel.org/git/a3aa5f99-63ce-4be5-8d64-fb6e226b3bf9@xxxxxxxxx/  
  Status: Under discussion  
  Description: Refactoring of pathspec.c to make array iterator
  variables match the type of the variable storing the array's size.
  Where replacing the variable's type is not possible, because of the
  large-scale cascade replacements it would cause, an appropriate cast
  has been added.

- environment.h: remove unused variables  
  Thread: https://lore.kernel.org/git/2c547567-2b72-476c-9fc5-71cac050fa15@xxxxxxxxx/  
  Status: Under discussion  
  Description: Removing two variables which did not have any references
  in the codebase, as they had been moved to `struct repo_settings`, but
  were not removed from environment.h.

## Proposed Plan

- Identifying global variables in environment.c that should be moved and
  identifying suitable locations for them. Some could be moved directly
  into `struct repository`, some into its existing sub-structs, and some
  into newly created sub-structs.

- Identifying and updating occurrences of these variables to reference
  their new locations.

- Identifying all occurrences of `the_repository` and updating them to
  use a `struct repository` passed to the function.

It makes sense that all the variables need not be in the same struct, as
separation would keep the codebase organised, and thus easier to
maintain. It would also make it easier to introduce these changes
systematically, as a group of related variables, combined together in a
struct, could be introduced in a single patch series.
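To make the grouping idea concrete, here is a minimal sketch of one possible shape for such a sub-struct, assuming it is filled in lazily on first use. All names (`core_settings_sketch`, `prepare_core_settings_sketch`, the individual fields and their default values) are hypothetical and chosen only for illustration; a real version would read the repository's config instead of using fixed defaults.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group of related settings; fields are illustrative only. */
struct core_settings_sketch {
	int initialized;       /* non-zero once the values have been loaded */
	int trust_exec_bit;
	int compression_level;
};

struct repo_sketch {
	const char *gitdir;
	struct core_settings_sketch core;
};

/*
 * Fill the group the first time it is needed. This sketch uses fixed
 * defaults (an assumption); later calls are no-ops, so values that
 * were changed afterwards are not overwritten.
 */
void prepare_core_settings_sketch(struct repo_sketch *repo)
{
	assert(repo != NULL);
	if (repo->core.initialized)
		return;
	repo->core.trust_exec_bit = 1;
	repo->core.compression_level = -1;
	repo->core.initialized = 1;
}

/* Accessor that replaces a direct read of a former global variable. */
int repo_compression_level_sketch(struct repo_sketch *repo)
{
	prepare_core_settings_sketch(repo);
	return repo->core.compression_level;
}
```

Because each group lives in its own struct with its own initialisation function, a patch series could introduce one group at a time without touching the others.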

### Timeline

#### Pre-GSoC (Until May 8)

- Explore the codebase, identifying locations where global variables
  from environment.c are used.

- Identify suitable locations for these global variables.

#### Community Bonding Period (May 8 - June 1)

- Interact with mentor, discussing the locations I have chosen, and
  refining the plan if required.

- Start coding early, as my summer break will have started. (See coding
  period)

#### Coding Period (June 2 - August 25)

- Move global variables to their new locations in various structs,
  and refactor functions that depend on them to use their new locations.

  - Variables which represent settings from config (7 weeks)
    - Core (5 weeks)
    - Others (2 weeks)
  - Variables not from config (3 weeks)

- Modify functions to add a `struct repository` argument where they
  depend on `the_repository` and replace all occurrences of it in the
  function.
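The last step could look roughly like the sketch below. It assumes a transitional wrapper is kept so that unconverted callers continue to compile while call sites are migrated one by one; whether such wrappers are wanted is something to settle with the mentors. Every identifier here (`repo_sketch`, `default_repo_sketch`, `repo_count_objects_sketch`, and so on) is hypothetical and does not correspond to a real Git function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a repository object. */
struct repo_sketch {
	const char *gitdir;
	int object_count;
};

/* Stand-in for a process-wide default repository pointer. */
static struct repo_sketch default_repo_storage = { ".git", 0 };
struct repo_sketch *default_repo_sketch = &default_repo_storage;

/* After: the repository is an explicit parameter. */
int repo_count_objects_sketch(const struct repo_sketch *repo)
{
	assert(repo != NULL);
	return repo->object_count;
}

/*
 * Transitional wrapper: keeps the old zero-argument entry point
 * working by forwarding the global default. It can be deleted once
 * no caller is left.
 */
int count_objects_sketch(void)
{
	return repo_count_objects_sketch(default_repo_sketch);
}

/* A converted caller passes each repository down explicitly. */
int total_objects_sketch(const struct repo_sketch *const *repos, size_t n)
{
	int total = 0;
	for (size_t i = 0; i < n; i++)
		total += repo_count_objects_sketch(repos[i]);
	return total;
}
```

Once a function takes the repository as a parameter, code such as `total_objects_sketch` can work across several repositories in one process, which the global-only version cannot do.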

#### Final Week (August 25 - September 1)

- Fix any bugs that may be left.

- Write final report.

### Availability

My summer break from college lasts from May to July. I am planning to
take a vacation of about one week during this period, but the dates
have not been decided yet. Outside of this vacation, I have no other
commitments during the break and can devote up to 60 hours a week
to the project. In August, once classes recommence, I will be
available for 20 hours a week.

## Post-GSoC

After completing my project, I plan to stay active by contributing
patches and to start reviewing code.
-- 
Regards,
Arnav Bhate
(He/Him)




