Re: [PATCH v2 2/2] refs.c: stop matching non-directory prefixes in exclude patterns

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 6, 2025 at 7:34 AM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> In the packed-refs backend, our implementation of '--exclude' (dating
> back to 59c35fac54 (refs/packed-backend.c: implement jump lists to avoid
> excluded pattern(s), 2023-07-10)) considers, for example:
>
>     $ git for-each-ref --exclude=refs/heads/ba
>
> to exclude "refs/heads/bar", "refs/heads/baz", and so on.
>
> The files backend, which does not implement '--exclude' (and relies on
> the caller to cull out results that don't match) naturally will
> enumerate "refs/heads/bar" and so on.
>
> So in the above example, 'for-each-ref' will try and see if
> "refs/heads/ba" matches "refs/heads/bar" (since the files backend simply
> enumerated every loose reference), and, realizing that it does not
> match, output the reference as expected. (A caller that did want to
> exclude "refs/heads/bar" and "refs/heads/baz" might instead run "git
> for-each-ref --exclude='refs/heads/ba*'").
>
> This can lead to strange behavior, like seeing a different set of
> references advertised via 'upload-pack' depending on what set of
> references were loose versus packed.
>
> So there is a subtle bug with '--exclude' which is that in the
> packed-refs backend we will consider "refs/heads/bar" to be a pattern
> match against "refs/heads/ba" when we shouldn't. Likewise, the reftable
> backend (which in this case is bug-compatible with the packed backend)
> exhibits the same broken behavior.

Yuck; nice to see this being addressed.

> There are a few ways to fix this. One is to tighten the rules in
> cmp_record_to_refname(), which is used to determine the start/end-points
> of the jump list used by the packed backend. In this new "strict" mode,
> the comparison function would handle the case where we've reached the
> end of the pattern by introducing a new check like so:
>
>     while (1) {
>         if (*r1 == '\n')
>             return *r2 ? -1 : 0;
>         if (!*r2)
>             if (strict && *r1 != '/')        /* <- here */
>                 return 1;
>             return start ? 1 : -1;
>         if (*r1 != *r2)
>             return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : +1;
>         r1++;
>         r2++;
>     }
>
> (eliding out the rest of cmp_record_to_refname()). Equivalently, we
> could teach refs/packed-backend::populate_excluded_jump_list() to append
> a trailing '/' if one does not already exist, forcing an exclude pattern
> like "refs/heads/ba" to only match "refs/heads/ba/abc" and so forth.
>
> But since the same problem exists in reftable, we can fix both at once
> by performing this pre-processing step one layer up in refs.c at the
> common entrypoint for the two, which is 'refs_ref_iterator_begin()'.
>
> Since that solution is both the simplest and only requires modification
> in one spot, let's normalize exclude patterns so that they end with a
> trailing slash. This causes us to unify the behavior between all three
> backends.

:-)

> There is some minor test fallout in the "overlapping excluded regions"
> test, which happens to use 'refs/ba' as an exclude pattern, and expects
> references under the "refs/heads/bar/*" and "refs/heads/baz/*"
> hierarchies to be excluded from the results.
>
> But that test fallout is expected, because the test was codifying the
> buggy behavior to begin with, and should have never been written that
> way. Split that into its own test (since the range is no longer
> overlapping under the stricter interpretation of --exclude patterns
> presented here). Create a new test which does have overlapping
> regions by using a refs/heads/bar/4/... hierarchy and excluding both
> "refs/heads/bar" and "refs/heads/bar/4".

Always nice to see tests corrected.

> Reported-by: SURA <surak8806@xxxxxxxxx>
> Helped-by: Jeff King <peff@xxxxxxxx>
> Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
> ---
>  refs.c                  |  6 +++++-
>  t/t1419-exclude-refs.sh | 16 ++++++++++++++--
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 17d3840aff..2d9a1b51f4 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1708,7 +1708,11 @@ struct ref_iterator *refs_ref_iterator_begin(
>                         if (!len)
>                                 continue;
>
> -                       strvec_push(&normalized_exclude_patterns, pattern);
> +                       if (pattern[len - 1] == '/')
> +                               strvec_push(&normalized_exclude_patterns, pattern);
> +                       else
> +                               strvec_pushf(&normalized_exclude_patterns, "%s/",
> +                                            pattern);

Doesn't this mean that if the user requested to exclude
"refs/heads/bar" and "refs/heads/bar" exists, that we won't exclude it
because it doesn't have a trailing slash?


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux