On Wed, Jul 16, 2025 at 6:34 PM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: Sorry for the long delay in responding... [...] > Add a new subcommand to 'git sparse-checkout' that removes these > tracked-but-sparse directories. This necessarily removes all files > contained within, including tracked and untracked files. Of particular Nice to see tracked files also being addressed in v2. > importance are ignored and excluded files which would normally be > ignored even by 'git clean -f' unless the '-x' or '-X' option is > provided. This is the most extreme method for doing this, but it works > when the sparse-checkout is in cone mode and is expected to rescope > based on directories, not files. > > The current implementation always deletes these sparse directories > without warning. This is unacceptable for a released version, but those > features will be added in changes coming immediately after this one. > > Note that untracked directories within the sparse-checkout remain. You've changed the wording here relative to v1, but you haven't addressed the part that was ambiguous/misleading in v1. In fact, you may have made a different part ambiguous as well, and made readers think that this sentence contradicts your above claims that this command is meant to clean out untracked directories underneath sparse directories. Perhaps something like: "Note that untracked directories in the sparse-checkout that are not within sparse directories will not be removed by this command; it only cleans up paths under directories that are supposed to be sparse." > Further, directories that contain staged changes or files in merge > conflict states are not deleted. Doesn't this sentence conflict with your above statement that "This necessarily removes all files contained within, including tracked and untracked files."? > This is a detail that is partly hidden > by the implementation which relies on collapsing the index to a sparse > index in-memory and only deleting directories that are listed as sparse > in the index. > > If a staged change exists, then that entry is not stored as a sparse > tree entry and thus remains on-disk until committed or reset. This seems a bit surprising -- if a file's modifications are staged, then you can use it and other files in the index to write new trees all the way up to the toplevel, so you should be able to get a sparse tree entry without problem. I'd only expect problems if you had unstaged changes, or higher order stages; perhaps you could clarify here? > There are some interesting cases around merge conflict resolution, but > that will be carefully analyzed in the future. ...okay, so you did clarify for the higher order stages, but I'm still confused about staged vs unstaged without conflicts. > > Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx> > --- > Documentation/git-sparse-checkout.adoc | 11 ++++- > builtin/sparse-checkout.c | 64 +++++++++++++++++++++++++- > t/t1091-sparse-checkout-builtin.sh | 38 +++++++++++++++ > 3 files changed, 111 insertions(+), 2 deletions(-) > > diff --git a/Documentation/git-sparse-checkout.adoc b/Documentation/git-sparse-checkout.adoc > index 529a8edd9c1e..6db88f00781d 100644 > --- a/Documentation/git-sparse-checkout.adoc > +++ b/Documentation/git-sparse-checkout.adoc > @@ -9,7 +9,7 @@ git-sparse-checkout - Reduce your working tree to a subset of tracked files > SYNOPSIS > -------- > [verse] > -'git sparse-checkout' (init | list | set | add | reapply | disable | check-rules) [<options>] > +'git sparse-checkout' (init | list | set | add | reapply | disable | check-rules | clean) [<options>] > > > DESCRIPTION > @@ -111,6 +111,15 @@ flags, with the same meaning as the flags from the `set` command, in order > to change which sparsity mode you are using without needing to also respecify > all sparsity paths. > > +'clean':: > + Remove all files in tracked directories that are outside of the > + sparse-checkout definition. This subcommand requires cone-mode So, this sentence implies that it'll wipe out all untracked or ignored files or tracked files with either unstaged, staged, or conflicted entries. Your commit message says it'll discuss conflicted entries later, but conflicts about whether staged or unstaged changes will be wiped. > + sparse-checkout to be sure that we know which directories are > + both tracked and all contained paths are not in the sparse-checkout. > + This command can be used to be sure the sparse index works > + efficiently, though it does not require enabling the sparse index [...] > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c [...] > + if (convert_to_sparse(repo->index, SPARSE_INDEX_MEMORY_ONLY) || > + repo->index->sparse_index == INDEX_EXPANDED) > + die(_("failed to convert index to a sparse index; resolve merge conflicts and try again")); In the commit message, though, you said that it'd also fail to convert a tree with a staged change to a sparse directory. And I thought in our discussion on v1 we found out that unstaged changes would prevent converting to sparse. Shouldn't the error message be more general, then? [...] > diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh > index ab3a105ffff2..a48eedf766d2 100755 > --- a/t/t1091-sparse-checkout-builtin.sh > +++ b/t/t1091-sparse-checkout-builtin.sh > @@ -1050,5 +1050,43 @@ test_expect_success 'check-rules null termination' ' > test_cmp expect actual > ' > > +test_expect_success 'clean' ' > + git -C repo sparse-checkout set --cone deep/deeper1 && > + mkdir repo/deep/deeper2 repo/folder1 && > + touch repo/deep/deeper2/file && > + touch repo/folder1/file && > + > + cat >expect <<-\EOF && > + Removing deep/deeper2/ > + Removing folder1/ > + EOF > + > + git -C repo sparse-checkout clean >out && > + test_cmp expect out && > + > + test_path_is_missing repo/deep/deeper2 && > + test_path_is_missing repo/folder1 > +' > + > +test_expect_success 'clean with staged sparse change' ' > + git -C repo sparse-checkout set --cone deep/deeper1 && > + mkdir repo/deep/deeper2 repo/folder1 repo/folder2 && > + touch repo/deep/deeper2/file && > + touch repo/folder1/file && > + echo dirty >repo/folder2/a && > + > + git -C repo add --sparse folder1/file && > + > + # deletes deep/deeper2/ but leaves folder1/ and folder2/ > + cat >expect <<-\EOF && > + Removing deep/deeper2/ > + EOF > + > + git -C repo sparse-checkout clean >out && > + test_cmp expect out && > + > + test_path_is_missing repo/deep/deeper2 && > + test_path_exists repo/folder1 What about repo/folder2/ ? Anyway, this test shows that neither staged nor unstaged changes are cleaned up (which at least resolves the conflicting documentation you provided on the matter) -- or would if you also checked repo/folder2. What it doesn't show is that tracked files with neither staged nor unstaged changes are not cleaned up either: $ mkdir repo/folder2 $ echo dirty >repo/folder2/a $ touch repo/folder2/untracked $ cd repo $ git status --porcelain M folder2/a ?? folder2/untracked # So, we have both a unstaged change and an untracked file; let's undo the unstaged change $ git checkout HEAD folder2/a Updated 1 path from 8cc814f $ git status --porcelain ?? folder2/untracked $ ls folder2/ a untracked # Both files are still present -- the untracked file, and the untracked file with no changes either staged or unstaged -- what does `git sparse-checkout clean` do? $ git sparse-checkout clean $ ls folder2/ a untracked $ git status --porcelain ?? folder2/untracked # Absolutely nothing. Not only does it not clean anything up, it gives no warnings about not cleaning up what should be cleaned up. Let's try sparse-checkout reapply: $ git sparse-checkout reapply warning: directory 'folder2/' contains untracked files, but is not in the sparse-checkout cone $ git status --porcelain ?? folder2/untracked $ ls folder2/ untracked # So `sparse-checkout reapply` does correctly remove folder2/a for us, while warning about the untracked file. (If folder2/a would have still had changes, it would have warned about it instead of removing.). Let's try `sparse-checkout clean` now... $ git sparse-checkout clean Removing folder2/ $ git status --porcelain $ ls folder2/ ls: cannot access 'folder2/': No such file or directory $ I think these cases either need to be a new testcase or part of this last testcase, and the commit message and documentation should be clearer about tracked-and-staged, tracked-with-unstaged-changes, and tracked-with-no-changes files...or at least comment that they'll be discussed later in the patch series. (I have a feeling I just did a lot of work to discover as I read your next patches that you cover these later...)