[RFC PATCH 0/2] fetch --prune performance problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Phil Hord <phil.hord@xxxxxxxxx>

`git fetch --prune` runs in O(N^2) time normally. This happens because the code
iterates over each ref to be pruned to display its status. In a repo with
174,000 refs, where I was pruning 15,000 refs, the current code made 2.6 billion
calls to strcmp and consumed 470 seconds of CPU. After this change, the same
operation completes in under 1 second.

The loop looks like this:

    for p in prune_refs { for ref in all_refs { if p == ref { ... }}}

That loop runs only to check for and report newly dangling refs. A workaround to
avoid this slowness is to run with `-q` to bypass this check.

There is similar check/report functionality in `git remote prune`, but it uses a
more efficient method to check for dangling refs. prune_refs is first sorted, so
it can be searched in O(logN), so this loop is O(N*logN).

    for ref in all_refs { if ref in prune_refs { ... }}

My patch fixes this for fetch, but it affects the command's output order.
Currently the results look like this:

     - [deleted]     (none) -> origin/bar
       (origin/bar has become dangling)
     - [deleted]     (none) -> origin/baz
     - [deleted]     (none) -> origin/foo
       (origin/foo has become dangling)
     - [deleted]     (none) -> origin/frotz

After my change, the order will change so the danglers are reported at the end.

     - [deleted]     (none) -> origin/bar
     - [deleted]     (none) -> origin/baz
     - [deleted]     (none) -> origin/foo
     - [deleted]     (none) -> origin/frotz
       (origin/bar has become dangling)
       (origin/foo has become dangling)

The latter format is close to how `git remote prune` works, but the formatting
is a bit different. I can coerce my change into something that preserves the
original order, but it will be quite a bit messier.

Q: Does anyone care enough about the command output ordering that they think
   it's worth the extra code complexity?

Phil Hord (2):
  fetch-prune: optimize dangling-ref reporting
  refs: remove old refs_warn_dangling_symref

 builtin/fetch.c | 16 ++++++++--------
 refs.c          | 17 +----------------
 2 files changed, 9 insertions(+), 24 deletions(-)

-- 
2.50.0.1.gf2ab606906.dirty





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux