Re: Perf bug: rev-list w/ 2+ paths relatively slow with commit-graph

Kai Koponen <kaikoponen@xxxxxxxxxx> writes:

> Reproduce steps:
> ```
> git clone https://github.com/golang/go.git
> cd go
> git config core.commitGraph true
> git commit-graph write --split --reachable --changed-paths  # Without this, all calls equally slow (~1s)
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash > /dev/null  # ~90ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/Make.dist > /dev/null  # ~100ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash src/Make.dist > /dev/null  # ~650ms
> ```
>
> The rev-list call with multiple paths takes over 3x longer than the
> sum of individual calls to it for the same files.
>
> Expectation: rev-list with multiple paths should take <= the sum of
> the time it takes to call it with each path individually (ideally <,
> since with the count limit it should be able to early-exit and search
> less commits for either path).
>
> Also reproduces without the -10 arg, or with a lower count (double
> instead of triple w/ -1), but these results are perhaps most
> surprising with a count present.

I asked 

    How does "git log -- path" use the changed-paths bloom filter
    stored in the commit-graph file?

to https://deepwiki.com/git/git (there is a text field at the bottom
of the page), and an early part of its answer explains why in a
fairly convincing way ;-)

    When you run git log -- path, Git first prepares to use bloom
    filters in the prepare_to_use_bloom_filter function. This function:

     1. Validates the pathspec - It calls forbid_bloom_filters to check
        if bloom filters can be used (revision.c:674-686). Bloom filters
        are disabled for wildcards, multiple paths, or complex pathspec
        magic.

     ...

In short, the changed-path filter is used only when following a
pathspec with a single element that is not a wildcard.  So the
observed result is (unfortunately) quite expected.
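
To make the rule concrete, here is a rough shell sketch of the
condition described above.  It is only an illustration and not git's
code: the helper name bloom_filter_usable is made up, and it treats
any leading ':' as pathspec magic, which is a simplification of what
the quoted answer calls "complex pathspec magic".

```shell
# Hypothetical helper (not part of git): succeeds only when the
# given pathspec elements look eligible for the changed-path filter
# according to the rule quoted above.
bloom_filter_usable () {
	# Exactly one pathspec element is required; zero or 2+ elements
	# mean the filter is not consulted.
	[ "$#" -eq 1 ] || return 1

	case "$1" in
	*\**|*\?*|*\[*)
		# Contains a glob character, i.e. a wildcard pathspec.
		return 1 ;;
	:*)
		# Assumption: treat every ':'-prefixed element as magic.
		return 1 ;;
	esac
	return 0
}
```

With this sketch, "bloom_filter_usable src/clean.bash" succeeds,
while "bloom_filter_usable src/clean.bash src/Make.dist" fails, which
mirrors the fast and slow cases in the report.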




