Re: [PATCH v2 2/2] refs.c: stop matching non-directory prefixes in exclude patterns

On Fri, Mar 07, 2025 at 09:31:17AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
>
> > I think you've swapped things around a bit by accident. The problem is
> > that the patterns were being matched too loosely by the underlying
> > backends, which had the consequence that the backends marked too many
> > refs as excluded.
>
> OK, I agree it is confusing.  As a selection mechanism for refs to
> be shown or processed, exclusion should be "we omit it because we
> clearly know this one should not be in the final result, but we may
> pass questionable ones, relying on our caller to have the final
> say".  As a selection mechanism for refs to be excluded, the logic
> should be the other way around, so false positive and false negative
> are going to be swapped.  We want the exclusion at the lower layer
> to only say "this ref clearly matches with given exclusion pattern",
> but we used to claim matches for refs that shouldn't match.
>
> OK.  Thanks for straightening me out.

Yes, Patrick is exactly right here. Thanks, Patrick, for beating me to
the punch ;-).
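
To make the distinction concrete, here is a minimal sketch in Python (not
git's actual C code) of the difference between a loose string-prefix match
and a match that only accepts directory prefixes, as the patch subject
describes. The function names loose_match() and matches_exclude() are
hypothetical, and the sketch assumes exclude patterns are plain prefixes
without glob characters.

```python
def loose_match(refname: str, pattern: str) -> bool:
    """Too-loose check: any string prefix counts as a match."""
    return refname.startswith(pattern)


def matches_exclude(refname: str, pattern: str) -> bool:
    """Stricter check: match only the pattern itself or refs below it.

    A pattern like "refs/heads/foo" is treated as a directory prefix, so it
    matches "refs/heads/foo" and "refs/heads/foo/bar" but not
    "refs/heads/foobar".
    """
    prefix = pattern.rstrip("/")
    return refname == prefix or refname.startswith(prefix + "/")


if __name__ == "__main__":
    for ref in ("refs/heads/foo", "refs/heads/foo/bar", "refs/heads/foobar"):
        print(ref, loose_match(ref, "refs/heads/foo"),
              matches_exclude(ref, "refs/heads/foo"))
```

With the loose check, "refs/heads/foobar" would be marked as excluded even
though no caller asked for that, which is the "too many refs" case Patrick
describes above.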

> > What makes me feel a bit uneasy is that for the "files" backend the
> > optimization depends on the packed state, which is quite awkward overall
> > as our tests may not uncover issues only because we didn't pack refs. I
> > don't really see a way to address this potential test gap generically
> > though.
>
> True.  An obvious optimization for "files" _might_ be to lazily walk
> the directory hierarchy and skip recursive readdir when a directory
> clearly matches the given exclusion pattern, but the result of such
> an optimization (in other words, what would seep through the sieve)
> to be filtered out at the upper layer would be different from what
> the "packed-refs" backend does for its optimization, and they would
> be different for reftable or any other future backends.

I had considered doing this back when I wrote 59c35fac54
(refs/packed-backend.c: implement jump lists to avoid excluded
pattern(s), 2023-07-10).

But I decided against it for a couple of reasons. First, it's a little
more complicated than the packed backend's implementation, since we have
to consider the additional context of what layer of the $GIT_DIR/refs
directory we're in to construct the full prefix in order to even perform
the match.

The second reason was that we should never have so many loose references
sitting around that this optimization would even matter. If a repository
does end up in that state, it should run "git pack-refs --all" to take
advantage of the packed backend's optimization.
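
For illustration only, the lazy walk discussed above might look roughly like
the following Python sketch. It is not how the "files" backend is
implemented; walk_loose_refs() and is_excluded() are hypothetical names, and
the sketch assumes plain directory-prefix patterns. It shows the extra
bookkeeping mentioned earlier: the full ref name has to be rebuilt from the
current directory level before any match can be attempted.

```python
import os


def is_excluded(refname, patterns):
    """True if refname equals a pattern or lives below it as a directory."""
    for pattern in patterns:
        prefix = pattern.rstrip("/")
        if refname == prefix or refname.startswith(prefix + "/"):
            return True
    return False


def walk_loose_refs(refs_dir, patterns, prefix="refs"):
    """Yield loose ref names, pruning directories that are clearly excluded.

    Only whole directories are skipped. Individual files are passed through
    unfiltered, leaving the final decision to the caller.
    """
    for entry in sorted(os.scandir(refs_dir), key=lambda e: e.name):
        # Rebuild the full ref name from the directory level we are in.
        refname = prefix + "/" + entry.name
        if entry.is_dir():
            if is_excluded(refname, patterns):
                continue  # never descend, so no recursive readdir happens
            yield from walk_loose_refs(entry.path, patterns, refname)
        else:
            yield refname
```

Note that what this sketch lets through differs from what a packed-refs jump
list would let through, which is the point Junio makes in the quoted text.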

> But I think that is the nature of lower-level optimization---each
> backend takes advantage of intimately knowing how it organizes the
> underlying data, and how they can omit without looking into a bulk
> of the section of data deeply would be different.

Yep.

Thanks,
Taylor



