Use prio_queue to improve worst-case performance at the cost of slightly worse best-case performance. Then add and use prio_queue_replace() to recover that loss. Changes since v2: - Mention that a prio_queue improves performance for merge-heavy histories in the commit message. - Add the new perf script to Meson build file. - Mention which kind of history we are aiming for and show its shape in a comment in the perf script. - Remove unnecessary quotes and use single quotes for the perf test code. - Rename the variable delete_pending to get_pending to align it with the concrete function prio_queue_get() instead of referring to the abstract concept of deletion, to improve readability. commit: convert pop_most_recent_commit() to prio_queue prio-queue: add prio_queue_replace() commit: use prio_queue_replace() in pop_most_recent_commit() commit.c | 14 ++++-- commit.h | 8 ++-- fetch-pack.c | 13 +++--- object-name.c | 10 ++--- prio-queue.c | 45 ++++++++++++++------ prio-queue.h | 8 ++++ t/meson.build | 1 + t/perf/p1501-rev-parse-oneline.sh | 71 +++++++++++++++++++++++++++++++ t/unit-tests/u-prio-queue.c | 23 ++++++++++ walker.c | 11 +++-- 10 files changed, 170 insertions(+), 34 deletions(-) create mode 100755 t/perf/p1501-rev-parse-oneline.sh Range-diff against v1: 1: d9c0d13801 ! 1: 5bff885d0f commit: convert pop_most_recent_commit() to prio_queue @@ Metadata ## Commit message ## commit: convert pop_most_recent_commit() to prio_queue - pop_most_recent_commit() calls commit_list_insert_by_date(), which and - is itself called in a loop, which can lead to quadratic complexity. - Replace the commit_list with a prio_queue to ensure logarithmic worst - case complexity and convert all three users. + pop_most_recent_commit() calls commit_list_insert_by_date() for parent + commits, which is itself called in a loop. This can lead to quadratic + complexity if there are many merges. Replace the commit_list with a + prio_queue to ensure logarithmic worst case complexity and convert all + three users. Add a performance test that exercises one of them using a pathological history that consists of 50% merges and 50% root commits to demonstrate @@ object-name.c: static enum get_oid_result get_oid_with_context_1(struct reposito free_commit_list(list); + ## t/meson.build ## +@@ t/meson.build: benchmarks = [ + 'perf/p1450-fsck.sh', + 'perf/p1451-fsck-skip-list.sh', + 'perf/p1500-graph-walks.sh', ++ 'perf/p1501-rev-parse-oneline.sh', + 'perf/p2000-sparse-operations.sh', + 'perf/p3400-rebase.sh', + 'perf/p3404-rebase-interactive.sh', + ## t/perf/p1501-rev-parse-oneline.sh (new) ## @@ +#!/bin/sh @@ t/perf/p1501-rev-parse-oneline.sh (new) + +test_perf_fresh_repo + ++# ++# Creates lots of merges to make history traversal costly. In ++# particular it creates 2^($max_level-1)-1 2-way merges on top of ++# 2^($max_level-1) root commits. E.g., the commit history looks like ++# this for a $max_level of 3: ++# ++# _1_ ++# / \ ++# 2 3 ++# / \ / \ ++# 4 5 6 7 ++# ++# The numbers are the fast-import marks, which also are the commit ++# messages. 1 is the HEAD commit and a merge, 2 and 3 are also merges, ++# 4-7 are the root commits. ++# +build_history () { + local max_level="$1" && + local level="${2:-1}" && @@ t/perf/p1501-rev-parse-oneline.sh (new) + sed -n -e "s/^.* //p" -e "q" <commits >needle +' + -+test_perf "rev-parse ':/$(cat needle)'" " -+ git rev-parse ':/$(cat needle)' >actual -+" ++test_perf "rev-parse :/$(cat needle)" ' ++ git rev-parse :/$(cat needle) >actual ++' + +test_expect_success 'verify result' ' + test_cmp expect actual 2: a4011d850c = 2: a2e57ca443 prio-queue: add prio_queue_replace() 3: dc535e8b64 ! 3: 1cbea38524 commit: use prio_queue_replace() in pop_most_recent_commit() @@ commit.c: void commit_list_sort_by_date(struct commit_list **list) { - struct commit *ret = prio_queue_get(queue); + struct commit *ret = prio_queue_peek(queue); -+ int delete_pending = 1; ++ int get_pending = 1; struct commit_list *parents = ret->parents; while (parents) { @@ commit.c: void commit_list_sort_by_date(struct commit_list **list) if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) { commit->object.flags |= mark; - prio_queue_put(queue, commit); -+ if (delete_pending) ++ if (get_pending) + prio_queue_replace(queue, commit); + else + prio_queue_put(queue, commit); -+ delete_pending = 0; ++ get_pending = 0; } parents = parents->next; } -+ if (delete_pending) ++ if (get_pending) + prio_queue_get(queue); return ret; } -- 2.50.1