On Wed, Jul 16, 2025 at 6:34 PM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > From: Derrick Stolee <stolee@xxxxxxxxx> > > The 'git sparse-checkout clean' command is designed to be a one-command > way to get the worktree in a state such that a sparse index would > operate efficiently. The previous change demonstrated that files outside > the sparse-checkout that were committed due to a merge conflict would > persist despite attempts to run 'git sparse-checkout clean' and instead > a 'git sparse-checkout reapply' would be required. > > Instead of requiring users to run both commands, update 'clean' to be > more ruthless about tracked sparse directories. The key here is to make > sure that the SKIP_WORKTREE bit is removed from more paths in the index > using update_sparsity() before compressing the index to a sparse one > in-memory. > > The tricky part here is that update_sparsity() was previously assuming > that it would be in 'update' mode and would change the worktree as it > made changes. However, we do not want to make these worktree changes at > this point, instead relying on our later logic (that integrates with > --dry-run and --verbose options) to perform those steps. > > One side-effect here is that we also clear out staged files that exist > in the worktree, but they would also appear in the verbose output as > part of the dry run. > > The final test in t1091 demonstrates that we no longer need the > 'reapply' subcommand for merge resolutions. It also fixes an earlier > case where 'git add --sparse' clears the SKIP_WORKTREE bit and avoids a > directory deletion. > > Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx> > --- > builtin/sparse-checkout.c | 8 ++++++++ > t/t1091-sparse-checkout-builtin.sh | 24 +++++++++++++++++------- > unpack-trees.c | 2 +- > 3 files changed, 26 insertions(+), 8 deletions(-) > > diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c > index f38a0809c098..1d1d5208a3ba 100644 > --- a/builtin/sparse-checkout.c > +++ b/builtin/sparse-checkout.c > @@ -962,6 +962,7 @@ static int sparse_checkout_clean(int argc, const char **argv, > size_t worktree_len; > int force = 0, dry_run = 0, verbose = 0; > int require_force = 1; > + struct unpack_trees_options o = { 0 }; > > struct option builtin_sparse_checkout_clean_options[] = { > OPT__DRY_RUN(&dry_run, N_("dry run")), > @@ -990,6 +991,13 @@ static int sparse_checkout_clean(int argc, const char **argv, > if (repo_read_index(repo) < 0) > die(_("failed to read index")); > > + o.verbose_update = verbose; > + o.update = 0; /* skip modifying the worktree here. */ > + o.head_idx = -1; > + o.src_index = o.dst_index = repo->index; > + if (update_sparsity(&o, NULL)) > + warning(_("failed to reapply sparse-checkout patterns")); > + > if (convert_to_sparse(repo->index, SPARSE_INDEX_MEMORY_ONLY) || > repo->index->sparse_index == INDEX_EXPANDED) > die(_("failed to convert index to a sparse index; resolve merge conflicts and try again")); > diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh > index 116ad7c9a20e..4b9078d90a61 100755 > --- a/t/t1091-sparse-checkout-builtin.sh > +++ b/t/t1091-sparse-checkout-builtin.sh > @@ -1104,6 +1104,7 @@ test_expect_success 'clean with staged sparse change' ' > > cat >expect <<-\EOF && > Would remove deep/deeper2/ > + Would remove folder1/ > EOF > > git -C repo sparse-checkout clean --dry-run >out && > @@ -1115,6 +1116,7 @@ test_expect_success 'clean with staged sparse change' ' > # deletes deep/deeper2/ but leaves folder1/ and folder2/ > cat >expect <<-\EOF && > Removing deep/deeper2/ > + Removing folder1/ > EOF > > # The previous test case checked the -f option, so > @@ -1124,7 +1126,7 @@ test_expect_success 'clean with staged sparse change' ' > test_cmp expect out && > > test_path_is_missing repo/deep/deeper2 && > - test_path_exists repo/folder1 && > + test_path_is_missing repo/folder1 && > test_path_exists repo/folder2 What this doesn't show is that afterwards: $ git -C repo status On branch main You are in a sparse checkout with 78% of tracked files present. Changes to be committed: (use "git restore --staged <file>..." to unstage) new file: folder1/file Changes not staged for commit: (use "git add/rm <file>..." to update what will be committed) (use "git restore <file>..." to discard changes in working directory) deleted: folder1/file modified: folder2/a ==> In other words, folder1/file and folder1/ were successfully removed by this clean command, but when the user runs status, they see that folder1/file has been deleted (and has a staged change). This is probably correct behavior and may be expected, but might be surprising to users at first; it probably deserves to be documented, or at least covered in the commit message. Especially since it's somewhat difficult for users to work with: $ cd repo $ git checkout HEAD -- folder1/file error: pathspec 'folder1/file' did not match any file(s) known to git $ git add -- folder1/file The following paths and/or pathspecs matched paths that exist outside of your sparse-checkout definition, so will not be updated in the index: folder1/file hint: If you intend to update such entries, try one of the following: hint: * Use the --sparse option. hint: * Disable or modify the sparsity rules. hint: Disable this message with "git config set advice.updateSparsePath false" $ git add --sparse -- folder1/file fatal: pathspec 'folder1/file' did not match any files $ git reset -- folder1/file Unstaged changes after reset: M folder2/a $ ==> So, both checkout and add fail to work with this path while reset works. `git commit` also would have worked; let's back up to the point of status above, then get rid of the folder2/a modification, and then use git commit to commit our folder1/file change: $ git checkout folder2/a Updated 1 path from the index $ git commit -m "A change" [main aad2d89] A change 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 folder1/file $ git status On branch main You are in a sparse checkout with 78% of tracked files present. Changes not staged for commit: (use "git add/rm <file>..." to update what will be committed) (use "git restore <file>..." to discard changes in working directory) deleted: folder1/file no changes added to commit (use "git add" and/or "git commit -a") ==> Great, so we got rid of the folder2/a modification, and committed the staged folder1/file modification. The folder1/file showing as deleted is annoying but a `sparse-checkout clean` should get rid of it as well as the pesky folder2/ directory, right? $ git sparse-checkout clean -f Removing folder2/ $ git status On branch main You are in a sparse checkout with 78% of tracked files present. Changes not staged for commit: (use "git add/rm <file>..." to update what will be committed) (use "git restore <file>..." to discard changes in working directory) deleted: folder1/file deleted: folder2/a no changes added to commit (use "git add" and/or "git commit -a") ==> Huh, `git sparse-checkout clean -f` got rid of folder2, but now folder2/a shows up as deleted locally...and folder1/file is still showing as deleted locally. And it doesn't matter how many times we run `git sparse-checkout clean -f`; this is the end state for it. But if we run `git sparse-checkout reapply`: $ git sparse-checkout reapply $ git status On branch main You are in a sparse checkout with 56% of tracked files present. nothing to commit, working tree clean ==> So, you still need to use `git sparse-checkout reapply` together with `git sparse-checkout clean -f`; you haven't fixed that yet. > @@ -1147,7 +1149,11 @@ test_expect_success 'sparse-checkout operations with merge conflicts' ' > git commit -a -m "left" && > > git checkout -b merge && > - git sparse-checkout set deep/deeper1 && > + > + touch deep/deeper2/extra && > + git sparse-checkout set deep/deeper1 2>err && > + grep "contains untracked files" err && > + test_path_exists deep/deeper2/extra && > > test_must_fail git merge -m "will-conflict" right && > > @@ -1159,15 +1165,19 @@ test_expect_success 'sparse-checkout operations with merge conflicts' ' > git merge --continue && > > test_path_exists folder1/even/more/dirs/file && > + test_path_exists deep/deeper2/extra && > + > + cat >expect <<-\EOF && > + Removing deep/deeper2/ > + Removing folder1/ > + EOF > > # clean does not remove the file, because the > # SKIP_WORKTREE bit was not cleared by the merge command. Shouldn't the comment be updated, given the testcase updates? > git sparse-checkout clean -f >out && > - test_line_count = 0 out && > - test_path_exists folder1/even/more/dirs/file && > - > - git sparse-checkout reapply && > - test_path_is_missing folder1 > + test_cmp expect out && > + test_path_is_missing folder1 && > + test_path_is_missing deep/deeper2 Yes, but why does `git status` show folder1/even/more/dirs/file as being locally deleted? Does the code forget to update the SKIP_WORKTREE status after clearing out the files?