Junio C Hamano <gitster@xxxxxxxxx> writes: > > Lidong Yan <yldhome2d2@xxxxxxxxx> writes: > >> git won't use bloom filter for multiple pathspec, which makes the command > > Let's get the terminology straight. A pathspec consists of one or > more pathspec elements (or pathspec items). Thanks for the clarification. I will be more precise with the terminology in v2. > Also, "git won't" is overly general. The series title shares the > same issue ("given multiple pathspec" does not even hint that this > is about revision traversal---you are not making filter used with > pathspec with more than one element in other code paths). > > Perhaps like: > > The revision traversal limited by pathspec has optimization when > the pathspec has only one element, it does not use any pathspec > magic (other than literal), and there is no wildcard. > > While it is much harder to lift the latter two limitations, > supporting a pathspec with multiple elements is relatively easy. > Just make sure we hash each of them separately and ask the bloom > filter about them, and if we see none of them can possibly be > affected by the commit, we can skip without tree comparison. > > or something along that line? > What you wrote makes perfect sense to me, I’ll just copy and paste those paragraphs into my cover letter. And the title would be "bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal" > Can we have a set of real tests to make sure that the updated filter > code still identifies commits that touch the files without false > negatives? False positives are OK as we will follow them with real > tree comparison to determine what exactly got changed, but false > negatives are absolute no-no. > > Testing to see that the filter code path is activated is much less > interesting than the filter code path still functions correctly with > these changes presented here. I have a feeling that with the > changes to the test in this series, you wouldn't even find a bug > where you simply added subpaths for all pathspec elements into a > single array and use the original "bloom has to say 'possibly yes' > to all array elements" logic (which would incorrectly require that > both file1 and file2 must be modified). I assume that t4216/test_bloom_filters_used has already verified that using bloom filters with multiple pathspec elements produces the same results as when bloom filters are not used. But I would love to add more test cases to check no false negative happened. Thanks, Lidong