[PATCH 0/2] bloom: use bloom filter given multiple pathspec

Git does not use Bloom filters when more than one pathspec is given, which
makes the command
  git log -- file1 file2
significantly slower than
  git log -- file1 && git log -- file2

This issue was raised by Kai Koponen at
  https://lore.kernel.org/git/CADYQcGqaMC=4jgbmnF9Q11oC11jfrqyvH8EuiRRHytpMXd4wYA@xxxxxxxxxxxxxx/

To fix this, revs->bloom_keys[] needs to become an array of bloom_keys[],
one for each literal pathspec element. For convenience, the first commit
creates a new struct bloom_keyvec to hold all bloom keys for a single
pathspec. The second commit adds a for loop that checks whether any
pathspec's keyvec is contained in a commit's bloom filter, along with code
that initializes, destroys, and tests multiple-pathspec bloom keyvecs.
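
The idea can be illustrated with a small standalone sketch. It is not the
code from the patches: every toy_* name, the 64-bit filter, the FNV-1a hash,
and the fixed-size key array are assumptions made for illustration only. It
also assumes that a keyvec holds one key for the full path plus one for each
leading directory. A commit may touch a pathspec only if all keys of that
pathspec's keyvec are in the commit's filter, and with several pathspecs the
commit is kept as soon as any one keyvec matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for illustration; NOT git's real definitions. */
#define TOY_MAX_KEYS 8

struct toy_filter { uint64_t bits; };   /* one commit's changed-path filter */

struct toy_keyvec {                     /* all keys for a single pathspec */
	size_t count;
	uint32_t hash[TOY_MAX_KEYS];
};

/* FNV-1a, used here only because it is short; git uses a different hash. */
static uint32_t toy_hash(const char *s, size_t len)
{
	uint32_t h = 2166136261u;
	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 16777619u;
	}
	return h;
}

/* One key for the full path, then one for each leading directory. */
static void toy_keyvec_init(struct toy_keyvec *v, const char *path)
{
	size_t len = strlen(path);

	v->count = 0;
	v->hash[v->count++] = toy_hash(path, len);
	for (size_t i = len; i-- > 0; ) {
		if (path[i] == '/') {
			assert(v->count < TOY_MAX_KEYS);
			v->hash[v->count++] = toy_hash(path, i);
		}
	}
}

/* Record a changed path in a commit's filter. */
static void toy_filter_add_vec(struct toy_filter *f, const struct toy_keyvec *v)
{
	for (size_t i = 0; i < v->count; i++)
		f->bits |= (uint64_t)1 << (v->hash[i] % 64);
}

/* A keyvec is "contained" only if every one of its keys is in the filter. */
static int toy_filter_contains_vec(const struct toy_filter *f,
				   const struct toy_keyvec *v)
{
	for (size_t i = 0; i < v->count; i++)
		if (!((f->bits >> (v->hash[i] % 64)) & 1))
			return 0;
	return 1;
}

/* With several pathspecs, any single matching keyvec keeps the commit. */
static int toy_maybe_changed(const struct toy_filter *f,
			     const struct toy_keyvec *vecs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (toy_filter_contains_vec(f, &vecs[i]))
			return 1;
	return 0;
}
```

A "no" answer from toy_maybe_changed() is definite, so the commit can be
skipped without diffing trees. A "yes" answer may be a false positive and
still has to be confirmed by the regular diff.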

With this change, testing on Kai's example shows that
  git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash src/Make.dist
runs as fast as
  git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/Make.dist && \
  git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash

Lidong Yan (2):
  bloom: replace struct bloom_key * with struct bloom_keyvec
  bloom: enable multiple pathspec bloom keys

 bloom.c              |  47 +++++++++++++++++
 bloom.h              |  14 +++++
 revision.c           | 121 ++++++++++++++++++++++++-------------------
 revision.h           |   5 +-
 t/t4216-log-bloom.sh |  10 ++--
 5 files changed, 137 insertions(+), 60 deletions(-)

-- 
2.50.0.108.g6ae0c543ae




