On Tue, Jul 29, 2025 at 08:57:43PM +0200, Toon Claes wrote: > From: Jeff King <peff@xxxxxxxx> > > The within_depth() function is used to check whether pathspecs limited > by a max-depth parameter are acceptable. It takes a path to check, a > maximum depth, and a "base" depth. It counts the components in the > path (by counting slashes), adds them to the base, and compare them to s/compare/&s/ > the maximum. > > However, if the base does not have any slashes at all, we always return > `true`. If the base depth is 0, then this is correct; no matter what the > maximum is, we are always within it. However, if the base depth is > greater than 0, then we might return an erroneous result. > > This ends up not causing any user-visible bugs in the current code. The > call sites in dir.c always pass a base depth of 0, so are unaffected. > But tree_entry_interesting() uses this function differently: it will > pass the prefix of the current entry, along with a `1` if the entry is a > directory, in essence checking whether items inside the entry would be > of interest. It turns out not to make a difference in behavior, but the > reasoning is complex. > > Given a tree like: > > file > a/file > a/b/file > > walking the tree and calling tree_entry_interesting() will yield the > following results: > > (with max_depth=0): > file: yes > a: yes > a/file: no > a/b: no > > (with max_depth=1): > file: yes > a: yes > a/file: yes > a/b: no > > So we have inconsistent behavior in considering directories interesting. > If they are at the edge of our depth but at the root, we will recurse > into them, but then find all of their entries uninteresting (e.g., in > the first case, we will look at "a" but find "a/*" uninteresting). But > if they are at the edge of our depth and not at the root, then we will > not recurse (in the second example, we do not even bother entering > "a/b"). > > This turns out not to matter because the only caller which uses > max-depth pathspecs is cmd_grep(), which only cares about blob entries. > From its perspective, it is exactly the same to not recurse into a > subtree, or to recurse and find that it contains no matching entries. > Not recursing is merely an optimization. Okay, well-explained. > It is debatable whether tree_entry_interesting() should consider such an > entry interesting. The only caller does not care if it sees the tree > itself, and can benefit from the optimization. But if we add a > "max-depth" limiter to regular diffs, then a diff with > DIFF_OPT_TREE_IN_RECURSIVE would probably want to show the tree itself, > but not what it contains. I haven't yet read the cover letter, but I guess that the scenario where we'll care about this is in git-last-modified(1) if we want to teach that command a `--max-depth` parameter? > This patch just fixes within_depth(), which means we consider such > entries uninteresting (and makes the current caller happy). If we want > to change that in the future, then this fix is still the correct first > step, as the current behavior is simply inconsistent. So... do we end up with this now? (with max_depth=1): file: yes a: yes a/file: no a/b: no I think it would be nice to include that in the message to show the change in behaviour at a glance. > Signed-off-by: Jeff King <peff@xxxxxxxx> > Signed-off-by: Toon Claes <toon@xxxxxxxxx> > --- > dir.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/dir.c b/dir.c > index 02873f59ea..900ee5e559 100644 > --- a/dir.c > +++ b/dir.c > @@ -277,7 +277,7 @@ int within_depth(const char *name, int namelen, > if (depth > max_depth) > return 0; > } > - return 1; > + return depth <= max_depth; > } Just for my own understanding: the only difference is when we don't have even a single matching slash, as we don't verify `depth > max_depth` in that case. So in theory, we could modify the function to the following equivalent: int within_depth(const char *name, int namelen, int depth, int max_depth) { const char *cp = name, *cpe = name + namelen; if (depth > max_depth) return 0; while (cp < cpe) { if (*cp++ != '/') continue; depth++; if (depth > max_depth) return 0; } return 1; } (Not saying we should, I'm just double checking my understanding). Patrick