On Fri, Aug 15, 2025 at 9:13 AM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > From: Derrick Stolee <stolee@xxxxxxxxx> > > When running 'git ls-files' with a pathspec, the index entries get > filtered according to that pathspec before iterating over them in When I first read this patch, I missed this part of your commit message and figured there was no possible way your patch could actually speed things up. I verified with your testcase that it worked, though, and had to step through a debugger to find out what I was missing. It's the prune_index() call in cmd_ls_files() that does this -- but only when the pathspecs provided have some common prefix. So, it's not unique to when there's a single pathspec as your commit message claims, and the pointer to prune_index() may have helped save me some head-scratching in review the patch. Perhaps this could be clarified here (and made more explicit for folks like me that gloss over it), something like When running 'git ls-files' with pathspecs with a common prefix, the index entries get filtered according to that common prefix in prune_index() before iterating over them in show_files(). > show_files(). In 78087097b8 (ls-files: add --sparse option, > 2021-12-22), this iteration was prefixed with a check for the '--sparse' > option which allows the command to output directory entries; this > created a pre-loop call to ensure_full_index(). > > However, when a user runs 'git ls-files' where the pathspec matches > directories that are recursively matched in the sparse-checkout, there > are not any sparse directories that match the pathspec so they would not > be written to the output. The expansion in this case is just a > performance drop for no behavior difference. > > Replace this global check to expand the index with a check inside the > loop for a matched sparse directory. If we see one, then expand the > index and continue from the current location. This is safe since the > previous entries in the index did not have any sparse directories and > thus would remain stable in this expansion. > > A test in t1092 confirms that this changes the behavior. > > Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx> > --- > ls-files: conditionally leave index sparse > > Here's a small sparse index performance update based on a user report. > > Thanks, -Stolee > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1955%2Fderrickstolee%2Fls-files-sparse-index-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1955/derrickstolee/ls-files-sparse-index-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1955 > > builtin/ls-files.c | 13 ++++++++++--- > t/t1092-sparse-checkout-compatibility.sh | 13 +++++++++++++ > 2 files changed, 23 insertions(+), 3 deletions(-) > > diff --git a/builtin/ls-files.c b/builtin/ls-files.c > index c06a6f33e41..b148607f7a1 100644 > --- a/builtin/ls-files.c > +++ b/builtin/ls-files.c > @@ -414,14 +414,21 @@ static void show_files(struct repository *repo, struct dir_struct *dir) > if (!(show_cached || show_stage || show_deleted || show_modified)) > return; > > - if (!show_sparse_dirs) > - ensure_full_index(repo->index); > - > for (i = 0; i < repo->index->cache_nr; i++) { > const struct cache_entry *ce = repo->index->cache[i]; > struct stat st; > int stat_err; > > + if (S_ISSPARSEDIR(ce->ce_mode) && !show_sparse_dirs) { > + /* > + * This is the first time we've hit a sparse dir, > + * so expansion will leave the first 'i' entries > + * alone. > + */ > + ensure_full_index(repo->index); > + ce = repo->index->cache[i]; > + } I see how this is safe. I didn't understand how it helped performance until I figured out by stepping through that repo->indexc->cache_nr is much less than I expected, because of the prune_index() call that happened earlier. > construct_fullname(&fullname, repo, ce); > > if ((dir->flags & DIR_SHOW_IGNORED) && > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index d8101139b40..b0f691c151a 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -1506,6 +1506,8 @@ test_expect_success 'sparse-index is not expanded' ' > ensure_not_expanded reset --hard && > ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1 && > > + ensure_not_expanded ls-files deep/deeper1 && > + Thanks, this testcase is exactly what I needed to figure out what I was misunderstanding. > echo >>sparse-index/README.md && > ensure_not_expanded add -A && > echo >>sparse-index/extra.txt && > @@ -1607,6 +1609,17 @@ test_expect_success 'describe tested on all' ' > test_all_match git describe --dirty > ' > > +test_expect_success 'ls-files filtering and expansion' ' > + init_repos && > + > + # This filtering will hit a sparse directory midway > + # through the iteration. > + test_all_match git ls-files deep && > + > + # This pathspec will filter the index to only a sparse > + # directory. > + test_all_match git ls-files folder1 > +' Looks good.