Re: [PATCH v2] worktree: detect from secondary worktree if main worktree is bare

Thank you for the review. I completely understand the delay in the review process and appreciate the time you spent on this.

> On Jan 19, 2025, at 3:30 PM, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
> 
> On Thu, Jan 16, 2025 at 4:35 PM Olga Pilipenco via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>> Setup:
>> 1. Have a bare repo with core.bare = true in config.worktree
>> 2. Create a new worktree
>> 
>> Behavior:
>> From the secondary worktree the main worktree appears as non-bare.
>> 
>> Expected:
>> From the secondary worktree the main worktree should appear as bare.
>> 
>> Why current behavior is not good?
>> If the main worktree is detected as not bare it doesn't allow
>> checking out the branch of the main worktree. There are possibly
>> other problems associated with that behavior.
>> 
>> Why is it happening?
>> While we're inside the secondary worktree we don't initialize the main
>> worktree's repository with its configuration.
> 
> Okay, this is clearly a very real problem and explains this comment
> added by f3534c98e4 (worktree: update is_bare heuristics, 2019-04-19):
> 
>    NEEDSWORK: If this function is called from a secondary worktree and
>    config.worktree is present, is_bare_repository_cfg will reflect the
>    contents of config.worktree, not the contents of the main worktree.
>    This means that worktree->is_bare may be set to 0 even if the main
>    worktree is configured to be bare.
> 
> (Aside: I recall reading this comment when Jonathan added it but
> wasn't able to dig into it at the time to really understand it, and
> never got back around to it. Now, after studying your patch, I
> understand what it was about.)
> 
>> How is it fixed?
>> Load actual configs of the main worktree. Also, skip the config loading
>> step if we're already inside the current worktree because in that case we
>> rely on is_bare_repository() to return the correct result.
> 
> I found that I had to dig around a bit to fully understand the problem
> expressed by this commit message. Perhaps adding a bit more detail
> would help? Here's my attempt at rewriting the above (also in a way
> which is more idiomatic to this project):
> 
>    When extensions.worktreeConfig is true and the main worktree is
>    bare -- that is, its config.worktree file contains core.bare=true
>    -- commands run from secondary worktrees incorrectly see the main
>    worktree as not bare. As such, those commands incorrectly think
>    that the repository's default branch (typically "main" or
>    "master") is checked out in the bare repository even though it's
>    not. This makes it impossible, for instance, to checkout or delete
>    the default branch from a secondary worktree, among other
>    shortcomings.
> 
>    This problem occurs because, when extensions.worktreeConfig is
>    true, commands run in secondary worktrees only consult
>    $commondir/config and $commondir/worktrees/<id>/config.worktree,
>    thus they never see the main worktree's core.bare=true setting in
>    $commondir/config.worktree.
> 
>    Fix this problem by consulting the main worktree's config.worktree
>    file when checking whether it is bare. (This extra work is
>    performed only when running from a secondary worktree.)


Wow, your explanation is so much better than mine. Thank you for “translating” it for the world :) I’m still getting used to the terminology used in this codebase.
I’ll steal your description for sure (if you don’t mind).

> 
>> Other solutions considered:
>> Alternatively, instead of incorrectly always using
>> `the_repository` as the main worktree's repository, we can detect
>> and load the actual repository of the main worktree and then use
>> that repository's `is_bare` value extracted from correct configs.
>> However, this approach is a bit riskier and could also affect
>> performance. Since we had the assignment `worktree->repo =
>> the_repository` for a long time already, I decided it's safe to
>> keep it as it is for now; it can be still fixed separately from
>> this change.
> 
> I found this paragraph somewhat confusing because it seems to conflate
> a repository (i.e. the shared object database) with the `struct
> repository` type, and the configuration which happens to get loaded
> and stored (as one of *many* members) of the repository structure. I
> had to read it several times to understand that this was talking about
> instantiating a separate `struct repository` initialized from the main
> worktree configuration. I agree that doing so would likely be overkill
> and could impact performance negatively. I understand that you added
> this paragraph because SubmittingPatches suggests to do so, but I
> think it can probably be omitted in this case unless it can be
> rewritten to be more clear (but even then I doubt it is necessary to
> keep it).

Trust me, it took me a while to wrap my head around `struct repository` as well.
I agree: if the explanation is confusing and doesn’t add any value, it can be omitted.

> 
>> Real life use case:
>> 1. Have a bare repo
>> 2. Create a worktree from the bare repo
>> 3. In the secondary worktree enable sparse-checkout - this enables
>> extensions.worktreeConfig and keeps core.bare=true setting in
>> config.worktree of the bare worktree
>> 4. The secondary worktree or any other non-bare worktree created
>> won't be able to use branch main (not even once), but it should be
>> able to.
> 
> This is mostly repeating what was said earlier, thus probably isn't
> adding any value to the commit message. I'd probably drop it.

I agree, your improved description captures this scenario perfectly.

> 
>> Signed-off-by: Olga Pilipenco <olga.pilipenco@xxxxxxxxxxx>
>> ---
>>    Changes since v1:
>> 
>>     * no code changes
>>     * rebased with maint
>>     * CC added
> 
> Sorry. I've had your v1 sitting in my ever-increasingly-large backlog
> of patches to look at, but have been extra busy the last many months
> and never managed to get to it.

Totally understand. Thanks again for getting to it eventually.

> 
>>    Existing broken functionality forces our project to use hacks on bare
>>    repo that we'd like to avoid. I would really appreciate reviews of this
>>    patch to move closer towards fixing the issue. This is my first
>>    contribution to git/git, I apologize if I got lost in the instructions,
>>    but I tried my best to follow the rules.
> 
> Your submission is fine. Unfortunately, the project has a lack of
> reviewers but no lack of submitters, so sometimes patches get
> overlooked or simply buried.
> 
>> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
>> @@ -410,6 +410,20 @@ test_expect_success 'bare main worktree has HEAD at branch deleted by secondary
>> +test_expect_success 'secondary worktree can switch to main if common dir is bare worktree' '
> 
> The use of "common dir" is a bit confusing. Also, this patch is fixing
> the more general problem that secondary worktrees think that the bare
> main worktree has a branch checked out. So, perhaps a better title
> would be:
> 
>    secondary worktrees recognize core.bare=true in main config.worktree
> 
> or something?

Sounds good, will update.

> 
>> +       test_when_finished "rm -rf bare_repo non_bare_repo secondary_worktree" &&
>> +       git init -b main non_bare_repo &&
>> +       test_commit -C non_bare_repo x &&
>> +
>> +       git clone --bare non_bare_repo bare_repo &&
>> +       git -C bare_repo config extensions.worktreeConfig true &&
>> +       git -C bare_repo config unset core.bare &&
>> +       git -C bare_repo config --worktree core.bare true &&
>> +
>> +       git -C bare_repo worktree add ../secondary_worktree &&
>> +       git -C secondary_worktree checkout main
>> +'
> 
> Very straightforward and exactly what I expected to see once I
> understood the problem.
> 
>> diff --git a/worktree.c b/worktree.c
>> @@ -65,6 +65,28 @@ static int is_current_worktree(struct worktree *wt)
>> +static int is_bare_git_dir(const char *git_dir)
> 
> Nit: I wonder if a name such as is_main_worktree_bare() would clue
> readers in a bit more?

I was about to explain how I wanted this function to be more generic and handle all sorts of bare and non-bare cases - whether it’s the main worktree or not. However, after seeing your comments and after revisiting the code, I realized that generalization doesn’t really provide much benefit here. It is much clearer if we're explicit that the bare check in this case is only performed on the main worktree. I’ll update it in the next version.

> 
>> +{
>> +       int bare = 0;
>> +       struct config_set cs = { { 0 } };
> 
> This is not your fault since this construct is used elsewhere in this
> file (from which I presume you copied it), but project consensus is
> that using the notation `{{0}}` to work around a complaint from the
> Apple compiler (and only the Apple compiler) should be avoided, and
> that `{0}` is preferred. So, if you reroll, changing this to `{0}` may
> make other reviewers happy (or you can leave it as is to be consistent
> with existing precedent in this file; I don't feel strongly about
> it).

I’ll fix it; that sounds like a good reason.

> 
>> +       char *config_file;
>> +       char *worktree_config_file;
>> +
>> +       config_file = xstrfmt("%s/config", git_dir);
>> +       worktree_config_file = xstrfmt("%s/config.worktree",  git_dir);
>> +
>> +       git_configset_init(&cs);
>> +       git_configset_add_file(&cs, config_file);
>> +       git_configset_add_file(&cs, worktree_config_file);
> 
> Genuine question: I haven't thought too deeply about it, but do we
> gain anything by loading $commondir/config here -- which is shared by
> the main worktree and all secondary worktrees -- considering that it
> was already loaded and consulted by the earlier is-bare check before
> this function was even called?

This function determines whether a worktree is bare. I want this logic to work even when it is called from a different context, without relying on other is-bare checks (which are a bit confusing, to be honest).

> 
>> +       git_configset_get_bool(&cs, "core.bare", &bare);
>> +
>> +       git_configset_clear(&cs);
>> +       free(config_file);
>> +       free(worktree_config_file);
>> +       return bare;
> 
> Everything gets cleaned up correctly. Good.
> 
>> @@ -77,18 +99,16 @@ static struct worktree *get_main_worktree(int skip_reading_head)
>> +       /*
>> +        * NEEDSWORK: the_repository is not always main worktree's repository
>> +       */
>>        worktree->repo = the_repository;
>>        worktree->path = strbuf_detach(&worktree_path, NULL);
> 
> I found this new NEEDSWORK comment rather confusing the first several
> times I read the patch. It wasn't until I finally realized that the
> reference to `the_repository` here is the same reference to
> `the_repository` in the commit message -- which confused me, as well
> -- that I understood what this was trying to say. The actual problem,
> of course, is that the _configuration_ stored in `the_repository` is
> the secondary worktree's configuration, not the main worktree's
> configuration. Considering that this patch addresses that problem, I'd
> probably just drop this new comment altogether (unless, perhaps, you
> rewrite it to talk about the _configuration_ stored in
> `the_repository`).

This `the_repository` structure is so confusing; it took me a while to figure out what it is! I would feel guilty not mentioning that, under some circumstances, the `the_repository` assigned here may not hold the actual configuration of the worktree object. I don’t know whether that will ever matter, but I find this assignment kind of “stinky” and want everyone to know about it. I don’t want to change the assignment in this patch because it hasn’t caused any harm so far. I’ll try again to rephrase the comment, just to give a heads-up in case someone experiences “weird” behaviour in this area (the same way the previous NEEDSWORK comment gave me ideas about why my workflow didn’t work and inspired me to try to fix it).

> 
>> -       /*
>> -        * NEEDSWORK: If this function is called from a secondary worktree and
>> -        * config.worktree is present, is_bare_repository_cfg will reflect the
>> -        * contents of config.worktree, not the contents of the main worktree.
>> -        * This means that worktree->is_bare may be set to 0 even if the main
>> -        * worktree is configured to be bare.
>> -        */
>> -       worktree->is_bare = (is_bare_repository_cfg == 1) ||
>> -               is_bare_repository();
>>        worktree->is_current = is_current_worktree(worktree);
>> +       worktree->is_bare = (is_bare_repository_cfg == 1) ||
>> +               is_bare_repository() ||
>> +               (!worktree->is_current && is_bare_git_dir(repo_get_common_dir(the_repository)));
> 
> This is performing the expensive check only if the earlier checks left
> the question unanswered. Good.

Thanks for the review. I’ll incorporate the changes in my next version, and hopefully it will be good to go.
I hope I responded to all the comments. It’s a bit nerve-racking to contribute for the first time (so many rules and instructions!) :)





