Re: [GSOC PATCH v6 0/3] environment: remove sparse-checkout related global variables

Derrick Stolee <stolee@xxxxxxxxx> · Thu, 24 Jul 2025 09:25:34 -0400

On 7/23/25 6:14 PM, Junio C Hamano wrote:
Ayush Chandekar <ayu.chandekar@xxxxxxxxx> writes:

This patch series aims to remove global variables related to
sparse-checkout from the global scope and to remove the definition
'#define USE_THE_REPOSITORY_VARIABLE' from a few files.

Sorry that I missed early versions of this thread. It's an
interesting topic to me, but I've been distracted.

Discussions since v5:

* For 1/3 and 2/3, Junio told me that it was concerning to put so
   many calls to `prepare_repo_settings()` so I tried to minimize the
   calls and made sure that there's no useless calling.

I didn't mean that the number of places is the problem.  What I
found troubling was that this is not done in any central place, so
it is hard to notice even if some random cmd_foo() failed to call
the function before doing its real work.  For example, shouldn't we
be able to, at least for built-in commands that have RUN_SETUP bit
set, centrally call prepare_repo_settings() somewhere late in
git.c:run_builtin() after we figure out what should be in
the_repository?  Now historically, setting up a repository may never
have involved opening and parsing tons of configuration files, so
such a change may be incurring extra overhead we did not have to
pay, so it needs a lot more thought than just trying to minimize the
number of calls, but some performance measurement.

I think that the core issue here (and probably causing the issues
that were seen in the user-facing issues) is that the repo settings
struct was intended as a place to fill config for some one-off
"feature flags" and not to replace core functionality for a repo.

There are two ways to change the approach here to fix the problem
of needing prepare_repo_settings() everyhwere:

 1. With the idea that these sparse-checkout variables are
    critical to the functionality of the repo, they should move
    into the repository struct itself and be initialized along
    with all other values there. This changes the patches (and my
    follow-up series) significantly, but mechanically.

 2. If we are going to change the intention of the repo settings
    struct to move from "optional one-off feature flags" to
    "important information about the core behavior of a repo"
    then we should prepare_repo_settings() when initializing the
    repository struct.

My preference is (1). The only argument for (2) that I can think
of is that it is sometimes helpful to share only the settings for
a repo without sharing the whole repo. But that seems like a weak
reason right now.

* For 3/3, Phillip told me that it broke user-facing as it will be
   parsed quite late in the callchain and might throw an error mid
   operation which we do not want.

So has the behaviour change caused by 3/3 been resolved?

  * This throws everything in repo_settings, but these settings are
    inherently per repository and they are meaningful only when you
    are working with a repository.  What makes us choose to make them
    new members in the repo_settings structure, not direct members in
    the repository structure?

(This is the same thought I expressed earlier in this message.)

Thanks,
-Stolee