"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Derrick Stolee <stolee@xxxxxxxxx>
>
> When users change their sparse-checkout definitions to add new
> directories and remove old ones, there may be a few reasons why
> directories no longer in scope remain (ignored or excluded files still
> exist, Windows handles are still open, etc.). When these files still
> exist, the sparse index feature notices that a tracked, but sparse,
> directory still exists on disk and thus the index expands. This causes a
> performance hit _and_ the advice printed isn't very helpful. Using 'git
> clean' isn't enough (generally '-dfx' may be needed) but also this may
> not be sufficient.
>
> Add a new subcommand to 'git sparse-checkout' that removes these
> tracked-but-sparse directories, including any excluded or ignored files

Do excluded files and ignored files form two separate sets, or are
they one and the same?  Are files that users forgot to add (e.g. a
new source file that would not match any pattern listed in
.gitignore) and object files left over from the previous compilation
(most likely matching *.o in .gitignore) treated the same way for
the purpose of determining if the directory that is no longer in the
cone can be removed?

> underneath. This is the most extreme method for doing this, but it works
> when the sparse-checkout is in cone mode and is expected to rescope
> based on directories, not files.
>
> Be sure to add a --dry-run option so users can predict what will be
> deleted. In general, output the directories that are being removed so
> users can know what was removed.

Hmph.  It would be safer to show not just the directories but also
which excluded files are about to be lost, wouldn't it, especially
when the user is trying to play safe and see what potential damage
they are looking at?
Also, even though ignored files are "ignored and expendable", nobody
marks their temporary file as "ignored but precious" (yet), so "it
is listed in .gitignore so we can safely remove it" may not be a
safe assumption for us to be making (yet).  Shouldn't we at least be
listing these ignored files in --dry-run output, next to those files
that the user may have forgotten to add?

> Note that untracked directories remain. Further, directories that
> contain staged changes are not deleted. This is a detail that is partly
> hidden by the implementation which relies on collapsing the index to a
> sparse index in-memory and only deleting directories that are listed as
> sparse in the index. If a staged change exists, then that entry is not
> stored as a sparse tree entry and thus remains on-disk until committed
> or reset.

Removing untracked directories is a job for "clean -d", so it makes
sense for this new command not to touch them.  Losing changes that
have already been added is just as bad as losing new files that the
user forgot to add, so it does make sense not to remove them.

I wonder if we need the "-x" and/or "-X" options that "clean" has
(and perhaps a "-d" that is a no-op, as the whole point of this
subcommand is about removing directories from the working tree) to
control its operation in a more fine-grained way.

> +	for (size_t i = 0; i < repo->index->cache_nr; i++) {
> +		DIR* dir;

The asterisk sticks to the variable, not the type, i.e.

	DIR *dir;

Thanks.