Junio C Hamano <gitster@xxxxxxxxx> writes:
>
> Kai Koponen <kaikoponen@xxxxxxxxxx> writes:
>
>> I see, more of a perf FR than a bug then.
>> I don't have much expertise here, but on the surface of it, it doesn't
>> seem to me like there would be any reason the algorithm couldn't check
>> each path's bloom filter in turn while searching, other than that this
>> would be a large and annoying change.
>
> It looks like that the necessary changes are probably fairly well
> isolated to two functions, i.e., prepare_to_use_bloom_filter() and
> forbid_bloom_filters(). Right now, for a pathspec that has one
> element "dir/file", the code uses two bloom keys for "dir" and
> "dir/file", but if we have "dir1/file1" as well, then it does look
> like a matter of using two more (and the bloom_keys[] array is
> designed to be variable length).

I believe the issue here is that revs->bloom_keys[] represents an AND
condition, whereas what we actually want is an OR.

In Kai's example, we are trying to identify commits that modified
either src/Make.dist or src/clean.bash. However, if we simply add
src, src/Make.dist, and src/clean.bash to bloom_keys[], we end up
filtering for commits that modified all of these, rather than any of
them. A commit that touched only src/Make.dist would then be skipped,
because the key for src/clean.bash is absent from its filter.

> But those who have more intimate knowledge in the area than I do may
> point out what is missing in my "it looks like" gut feeling.
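
To make the AND-versus-OR point about bloom_keys[] concrete, here is a
toy model in Python. It is only a sketch and not git's code: ToyBloom,
keys_for_path(), maybe_changed_and() and maybe_changed_or() are
hypothetical names, the hashing is not what git uses, and I am assuming
(as in the "dir" and "dir/file" description quoted above) that a path
contributes one key per leading directory plus one for the full path.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter for illustration; NOT git's format or hash."""

    def __init__(self, nbits=1024, nhashes=7):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = 0  # bit set stored in one Python integer

    def _positions(self, key):
        # Derive nhashes bit positions from salted SHA-256 digests.
        for i in range(self.nhashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def maybe_contains(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def keys_for_path(path):
    """Assumed key scheme: 'src/Make.dist' -> ['src', 'src/Make.dist']."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def filter_for_commit(changed_paths):
    """Build the toy filter for a commit that changed the given paths."""
    bloom = ToyBloom()
    for path in changed_paths:
        for key in keys_for_path(path):
            bloom.add(key)
    return bloom


def maybe_changed_and(bloom, pathspecs):
    """One flat key list where every key must match (the AND reading)."""
    flat = [key for spec in pathspecs for key in keys_for_path(spec)]
    return all(bloom.maybe_contains(key) for key in flat)


def maybe_changed_or(bloom, pathspecs):
    """Keys grouped per pathspec: AND inside a group, OR across groups."""
    return any(
        all(bloom.maybe_contains(key) for key in keys_for_path(spec))
        for spec in pathspecs
    )


if __name__ == "__main__":
    specs = ["src/Make.dist", "src/clean.bash"]
    only_make = filter_for_commit(["src/Make.dist"])
    print("AND:", maybe_changed_and(only_make, specs))
    print("OR: ", maybe_changed_or(only_make, specs))
```

For a commit that touched only src/Make.dist, the OR variant always
reports a possible match, while the AND variant rejects the commit
unless the filter happens to give a false positive for
src/clean.bash. In this model the fix is to keep the keys grouped per
pathspec element instead of flattening them into one list; whether
that maps cleanly onto revs->bloom_keys[] is for someone who knows
the code better to say.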