On Tue, Jul 15, 2025 at 6:38 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 7/9/2025 1:35 PM, Elijah Newren wrote: > > On Wed, Jul 9, 2025 at 9:13 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: > >> > >> On 7/8/2025 5:43 PM, Elijah Newren wrote: > [...] > > Right, for cone mode without the sparse index turned on, this new > > subcommand seems to be a silent no-op (other than burning some > > computation time), > > No, it works without the sparse index off. The sparse index config > is about whether or not Git _writes_ the index in the sparse format. > > This command works even if the sparse index is not enabled for the > written format, since we can manipulate the in-memory index for the > purpose of discovering which tracked directories should be sparse > and thus not in the worktree. Ah...manipulating the in-memory index despite the sparse-index not being enabled was the detail I was missing. Thanks for explaining. [...] > So your concerns here should not be a problem, since the command > _does_ do the expected action even if index.sparse=false. Yep, thanks for straightening me out on this point. [...] > > If someone resolves the conflicted merge or rebase and commits (long > > before running this `git sparse-checkout clean` command), what happens > > to those paths? Do these materialized paths persist in the worktree > > after the commit? [...] > I have a TODO to add test cases around this behavior, so we can > have concrete expectations. I'll incorporate them into the existing > test cases around merge conflicts. Thanks. > This may be another example of Git leaving files around that should > be deleted in order to efficiently work with a sparse index. Deleted...whether or not they have unstaged changes? [...] > One thing to keep in mind is > that the SKIP_WORKTREE bit does some amount of ignoring the worktree > by not modifying what is there, and that may include some issues > around reporting the changes in 'status' or staging the changes in > 'add'. Not sure I follow. SKIP_WORKTREE bit will be cleared for files that are present in the working tree before 'git status' or 'git add' ever perform their core logic (due to 82386b44963f (Merge branch 'en/present-despite-skipped', 2022-03-09)), so isn't this point you raise moot, or am I misunderstanding something here? > A lot of your concerns seem like they would be satisfied by providing > a verbose file-by-file output of what would be deleted and potentially > having --dry-run be the default mode. I think that'd be helpful, but primarily I wanted either the commit message to explain why tracked-but-unmodified files and tracked-with-unstaged-changes files under the intended-to-be-sparse directory aren't expected to ever happen in practice, or for the manual to explain what the clean command does with such files. [...] > > Oh! Based on this hint, I went and looked up the code for this; it's > > from convert_to_sparse_rec(), right? I see something interesting > > there; does the present-despite-skipped checks (from 82386b44963f > > (Merge branch 'en/present-despite-skipped', 2022-03-09)) cause this > > collapsing to also fail for unstaged entries? I.e. this part of > > convert_to_sparse_rec(): > > > > if (ce_stage(ce) || > > S_ISGITLINK(ce->ce_mode) || > > !(ce->ce_flags & CE_SKIP_WORKTREE)) > > can_convert = 0; > > > > The `ce_stage(ce)` part of it is what prevent it from collapsing when > > there are conflicts, and I think the `!(ce->ce_flags & > > CE_SKIP_WORKTREE))` would prevent it from collapsing any tracked files > > whatsoever, whether modified or not, due to the > > present-despite-skipped checks. Does that sound right? > > This matches my expectations. Cool. > > In other words, perhaps your clean command as implemented really does > > only handle untracked and ignored files, and if the user also has > > tracked-but-unmodified or tracked-with-unstaged-changes or > > tracked-with-staged-changes then this command won't actually restore > > performance for them until they _also_ run `git sparse-checkout > > reapply` ? > > Worth adding testing to be sure, though I believe 'git sparse-checkout > clean' would help if they stage any changes (resolving conflicts, if > any) and commit those results. The files won't get cleaned by 'git > commit' but the clean operation should work after that. Wait, what? Doesn't this contradict what you just said above about my explanation matching your expectations? ...or by "work" are you just comparing to when we previously talked about how the clean command would abort early when there are conflicts, so by "work" you just mean that when conflicts are resolved, the clean command will then "run without aborting even though it doesn't actually clean up these tracked files either"? [...] > Thanks for the careful review of these fine details! Thanks for working on this series, being willing to dive into these details, and patiently explain stuff when there were details I misunderstood!