Re: [PATCH 2/2] bloom: enable multiple pathspec bloom keys

Junio C Hamano <gitster@xxxxxxxxx> writes:
> 
> Before concluding so, we may want to double check how Bloom filters
> are built on case insensitive systems, though.  If we normalize the
> string by downcasing before murmuring the string, the resulting
> Bloom filter may have more false positives for those who want to
> (ab)use it to optimize case sensitive queries (without affecting
> correctness), but case insensitive queries would be helped.  I do
> not think we support (or want to support) a repository that spans
> across two filesystems with different case sensitivity, so those who
> worked on our changed-path Bloom filter subsystem may have already
> placed such an optimization, based on the case sensitivity recorded
> in the repository (core.ignorecase).

In bloom.c:get_or_compute_bloom_filter(), the computation of a bloom filter
looks like:
    diff_tree_oid(c's parent or NULL, &c->object.oid, "", &diffopt);
    diffcore_std(&diffopt);
    struct hashmap path_hashmap;

    for (path : diff_queue_diff) {
        Add all parts of path to path_hashmap;
    }

    for_each(path_hashmap) {
        Add path to filter
    }

None of these steps checks core.ignoreCase, so I believe the Bloom filter we
build in the commit-graph is case-sensitive.
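
As a toy illustration of why this matters, here is a minimal Python sketch
of the steps above. It is not Git's code: ToyBloom, path_prefixes, and
build_filter are hypothetical names, SHA-256 stands in for the murmur3
hashing Git uses, and the bit and hash counts are arbitrary. The fold_case
flag models the hypothetical "downcase before hashing" variant from the
quoted text.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter. NOT Git's on-disk format or its hash function."""

    def __init__(self, nbits=1024, nhashes=7, fold_case=False):
        self.nbits = nbits
        self.nhashes = nhashes
        self.fold_case = fold_case  # hypothetical: downcase keys before hashing
        self.bits = 0

    def _positions(self, key):
        if self.fold_case:
            key = key.lower()
        for seed in range(self.nhashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def maybe_contains(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def path_prefixes(path):
    """'a/b/c' -> ['a/b/c', 'a/b', 'a'] ("add all parts of path")."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def build_filter(changed_paths, fold_case=False):
    keys = set()  # plays the role of path_hashmap: deduplicate first
    for path in changed_paths:
        keys.update(path_prefixes(path))
    bloom = ToyBloom(fold_case=fold_case)
    for key in keys:
        bloom.add(key)
    return bloom


if __name__ == "__main__":
    changed = ["README.md", "llvm/docs/index.rst"]
    exact = build_filter(changed)
    folded = build_filter(changed, fold_case=True)
    print(exact.maybe_contains("README.md"))   # exact-case key was added
    print(exact.maybe_contains("rEADME.md"))   # different bytes, different bits
    print(folded.maybe_contains("rEADME.md"))  # folds to the same key
```

With keys hashed as-is, a query for "rEADME.md" hashes to unrelated bits, so
a filter built this way cannot answer a case-insensitive query and the
caller has to fall back to diffing trees. The folded variant would answer
it, at the cost of extra false positives for case-sensitive queries.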

To test this assumption, and since I happen to be a Mac user (where
core.ignoreCase is true by default), I ran the following commands in the
llvm-project repository:

$ git commit-graph write --split --reachable --changed-paths
$ time git log -5 -t -- README.md > /dev/null
real	0m0.089s
user	0m0.067s
sys	0m0.021s
$ time git log -5 -t -- ':(icase)README.md' > /dev/null
real	0m0.281s
user	0m0.239s
sys	0m0.041s
$ time git log -5 -t -- 'rEADME.md' > /dev/null
real	0m0.458s
user	0m0.394s
sys	0m0.061s

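The ':(icase)' and differently-cased queries took roughly 3x and 5x as long
as the exact-case one, so I think this shows that the changed-path Bloom
filter does not optimize icase pathspec items on a case-insensitive file
system.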



