On Thu, Mar 6, 2025 at 7:34 AM Taylor Blau <me@xxxxxxxxxxxx> wrote: > > In the packed-refs backend, our implementation of '--exclude' (dating > back to 59c35fac54 (refs/packed-backend.c: implement jump lists to avoid > excluded pattern(s), 2023-07-10)) considers, for example: > > $ git for-each-ref --exclude=refs/heads/ba > > to exclude "refs/heads/bar", "refs/heads/baz", and so on. > > The files backend, which does not implement '--exclude' (and relies on > the caller to cull out results that don't match) naturally will > enumerate "refs/heads/bar" and so on. > > So in the above example, 'for-each-ref' will try and see if > "refs/heads/ba" matches "refs/heads/bar" (since the files backend simply > enumerated every loose reference), and, realizing that it does not > match, output the reference as expected. (A caller that did want to > exclude "refs/heads/bar" and "refs/heads/baz" might instead run "git > for-each-ref --exclude='refs/heads/ba*'"). > > This can lead to strange behavior, like seeing a different set of > references advertised via 'upload-pack' depending on what set of > references were loose versus packed. > > So there is a subtle bug with '--exclude' which is that in the > packed-refs backend we will consider "refs/heads/bar" to be a pattern > match against "refs/heads/ba" when we shouldn't. Likewise, the reftable > backend (which in this case is bug-compatible with the packed backend) > exhibits the same broken behavior. Yuck; nice to see this being addressed. > There are a few ways to fix this. One is to tighten the rules in > cmp_record_to_refname(), which is used to determine the start/end-points > of the jump list used by the packed backend. In this new "strict" mode, > the comparison function would handle the case where we've reached the > end of the pattern by introducing a new check like so: > > while (1) { > if (*r1 == '\n') > return *r2 ? -1 : 0; > if (!*r2) > if (strict && *r1 != '/') /* <- here */ > return 1; > return start ? 1 : -1; > if (*r1 != *r2) > return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : +1; > r1++; > r2++; > } > > (eliding out the rest of cmp_record_to_refname()). Equivalently, we > could teach refs/packed-backend::populate_excluded_jump_list() to append > a trailing '/' if one does not already exist, forcing an exclude pattern > like "refs/heads/ba" to only match "refs/heads/ba/abc" and so forth. > > But since the same problem exists in reftable, we can fix both at once > by performing this pre-processing step one layer up in refs.c at the > common entrypoint for the two, which is 'refs_ref_iterator_begin()'. > > Since that solution is both the simplest and only requires modification > in one spot, let's normalize exclude patterns so that they end with a > trailing slash. This causes us to unify the behavior between all three > backends. :-) > There is some minor test fallout in the "overlapping excluded regions" > test, which happens to use 'refs/ba' as an exclude pattern, and expects > references under the "refs/heads/bar/*" and "refs/heads/baz/*" > hierarchies to be excluded from the results. > > But that test fallout is expected, because the test was codifying the > buggy behavior to begin with, and should have never been written that > way. Split that into its own test (since the range is no longer > overlapping under the stricter interpretation of --exclude patterns > presented here). Create a new test which does have overlapping > regions by using a refs/heads/bar/4/... hierarchy and excluding both > "refs/heads/bar" and "refs/heads/bar/4". Always nice to see tests corrected. > Reported-by: SURA <surak8806@xxxxxxxxx> > Helped-by: Jeff King <peff@xxxxxxxx> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> > --- > refs.c | 6 +++++- > t/t1419-exclude-refs.sh | 16 ++++++++++++++-- > 2 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/refs.c b/refs.c > index 17d3840aff..2d9a1b51f4 100644 > --- a/refs.c > +++ b/refs.c > @@ -1708,7 +1708,11 @@ struct ref_iterator *refs_ref_iterator_begin( > if (!len) > continue; > > - strvec_push(&normalized_exclude_patterns, pattern); > + if (pattern[len - 1] == '/') > + strvec_push(&normalized_exclude_patterns, pattern); > + else > + strvec_pushf(&normalized_exclude_patterns, "%s/", > + pattern); Doesn't this mean that if the user requested to exclude "refs/heads/bar" and "refs/heads/bar" exists, that we won't exclude it because it doesn't have a trailing slash?