On 25/07/15 04:51PM, René Scharfe wrote: > pop_most_recent_commit() calls commit_list_insert_by_date(), which and Did you mean? s/which and/which/ > is itself called in a loop, which can lead to quadratic complexity. > Replace the commit_list with a prio_queue to ensure logarithmic worst > case complexity and convert all three users. If I understand correctly, `pop_most_recent_commit()` removes the most recent commit from a list of commits sorted by date and then inserts each of the removed commit's parents into the list while maintaining date order. Iterating through `struct commit_list` every time to find where to insert each parent parent leads to quadratic complexity. For repositories with many merge commits, this could scale poorly. > Add a performance test that exercises one of them using a pathological > history that consists of 50% merges and 50% root commits to demonstrate > the speedup: > > Test v2.50.1 HEAD > ---------------------------------------------------------------------- > 1501.2: rev-parse ':/65535' 2.48(2.47+0.00) 0.20(0.19+0.00) -91.9% > > Alas, sane histories don't benefit from the conversion much, and > traversing Git's own history takes a 1% performance hit on my machine: As "normal" repositories don't benefit here, it might be nice to more explicitly mention the the types of repositories that do benefit. > $ hyperfine -w3 -L git ./git_2.50.1,./git '{git} rev-parse :/^Initial.revision' > Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision > Time (mean ± σ): 1.071 s ± 0.004 s [User: 1.052 s, System: 0.017 s] > Range (min … max): 1.067 s … 1.078 s 10 runs > > Benchmark 2: ./git rev-parse :/^Initial.revision > Time (mean ± σ): 1.079 s ± 0.003 s [User: 1.060 s, System: 0.017 s] > Range (min … max): 1.074 s … 1.083 s 10 runs > > Summary > ./git_2.50.1 rev-parse :/^Initial.revision ran > 1.01 ± 0.00 times faster than ./git rev-parse :/^Initial.revision > > Signed-off-by: René Scharfe <l.s.r@xxxxxx> > --- > commit.c | 7 ++-- > commit.h | 8 ++--- > fetch-pack.c | 13 +++++--- > object-name.c | 10 +++--- > t/perf/p1501-rev-parse-oneline.sh | 55 +++++++++++++++++++++++++++++++ > walker.c | 11 ++++--- > 6 files changed, 83 insertions(+), 21 deletions(-) > create mode 100755 t/perf/p1501-rev-parse-oneline.sh > > diff --git a/commit.c b/commit.c > index e915b2b9a1..0200759aaa 100644 > --- a/commit.c > +++ b/commit.c > @@ -31,6 +31,7 @@ > #include "parse.h" > #include "object-file.h" > #include "object-file-convert.h" > +#include "prio-queue.h" > > static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **); > > @@ -738,17 +739,17 @@ void commit_list_sort_by_date(struct commit_list **list) > commit_list_sort(list, commit_list_compare_by_date); > } > > -struct commit *pop_most_recent_commit(struct commit_list **list, > +struct commit *pop_most_recent_commit(struct prio_queue *queue, > unsigned int mark) > { > - struct commit *ret = pop_commit(list); > + struct commit *ret = prio_queue_get(queue); > struct commit_list *parents = ret->parents; > > while (parents) { > struct commit *commit = parents->item; > if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) { > commit->object.flags |= mark; > - commit_list_insert_by_date(commit, list); > + prio_queue_put(queue, commit); > } > parents = parents->next; > } > diff --git a/commit.h b/commit.h > index 70c870dae4..9630c076d6 100644 > --- a/commit.h > +++ b/commit.h > @@ -201,10 +201,10 @@ const char *repo_logmsg_reencode(struct repository *r, > > const char *skip_blank_lines(const char *msg); > > -/** Removes the first commit from a list sorted by date, and adds all > - * of its parents. > - **/ > -struct commit *pop_most_recent_commit(struct commit_list **list, > +struct prio_queue; > + > +/* Removes the first commit from a prio_queue and adds its parents. */ > +struct commit *pop_most_recent_commit(struct prio_queue *queue, > unsigned int mark); Previously, `pop_most_recent_commit()` would ensure commits inserted in the list were done it date order. Now this depends on how the caller has configured the `struct prio_queue`. This is fine though as previously the caller was required to ensure the list was sorted to begin with otherwise it wouldn't work properly. -Justin