Lidong Yan <yldhome2d2@xxxxxxxxx> writes: > git won't use bloom filter for multiple pathspec, which makes the command Let's get the terminology straight. A pathspec consists of one or more pathspec elements (or pathspec items). Also, "git won't" is overly general. The series title shares the same issue ("given multiple pathspec" does not even hint that this is about revision traversal---you are not making filter used with pathspec with more than one element in other code paths). Perhaps like: The revision traversal limited by pathspec has optimization when the pathspec has only one element, it does not use any pathspec magic (other than literal), and there is no wildcard. While it is much harder to lift the latter two limitations, supporting a pathspec with multiple elements is relatively easy. Just make sure we hash each of them separately and ask the bloom filter about them, and if we see none of them can possibly be affected by the commit, we can skip without tree comparison. or something along that line? > git log -- file1 file2 > significantly slower than > git log -- file1 && git log -- file2 > > This issue is raised by Kai Koponen at > https://lore.kernel.org/git/CADYQcGqaMC=4jgbmnF9Q11oC11jfrqyvH8EuiRRHytpMXd4wYA@xxxxxxxxxxxxxx/ > > To fix this, revs->bloom_keys[] needs to become an array of bloom_keys[], > one for each literal pathspec element. For convenience, first commit > creates a new struct bloom_keyvec to hold all bloom keys for a single > pathspec. The second commit add for loop to check if any pathspec's keyvec > is contained in a commit's bloom filter, along with code that initialize > destory and test multiple pathspec bloom keyvecs. It is nice to outline an approach to the solution one day, and see an almost exact implementation of it appear on the list a few days later. I wish all the development would go like this ;-) > bloom.c | 47 +++++++++++++++++ > bloom.h | 14 +++++ > revision.c | 121 ++++++++++++++++++++++++------------------- > revision.h | 5 +- > t/t4216-log-bloom.sh | 10 ++-- Can we have a set of real tests to make sure that the updated filter code still identifies commits that touch the files without false negatives? False positives are OK as we will follow them with real tree comparison to determine what exactly got changed, but false negatives are absolute no-no. Testing to see that the filter code path is activated is much less interesting than the filter code path still functions correctly with these changes presented here. I have a feeling that with the changes to the test in this series, you wouldn't even find a bug where you simply added subpaths for all pathspec elements into a single array and use the original "bloom has to say 'possibly yes' to all array elements" logic (which would incorrectly require that both file1 and file2 must be modified). Thanks.