Re: [PATCH 1/3] commit: convert pop_most_recent_commit() to prio_queue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25/07/15 04:51PM, René Scharfe wrote:
> pop_most_recent_commit() calls commit_list_insert_by_date(), which and

Did you mean?

s/which and/which/

> is itself called in a loop, which can lead to quadratic complexity.
> Replace the commit_list with a prio_queue to ensure logarithmic worst
> case complexity and convert all three users.

If I understand correctly, `pop_most_recent_commit()` removes the most
recent commit from a list of commits sorted by date and then inserts
each of the removed commit's parents into the list while maintaining
date order. Iterating through `struct commit_list` every time to find
where to insert each parent parent leads to quadratic complexity. For
repositories with many merge commits, this could scale poorly.

> Add a performance test that exercises one of them using a pathological
> history that consists of 50% merges and 50% root commits to demonstrate
> the speedup:
> 
>    Test                          v2.50.1           HEAD
>    ----------------------------------------------------------------------
>    1501.2: rev-parse ':/65535'   2.48(2.47+0.00)   0.20(0.19+0.00) -91.9%
> 
> Alas, sane histories don't benefit from the conversion much, and
> traversing Git's own history takes a 1% performance hit on my machine:

As "normal" repositories don't benefit here, it might be nice to more
explicitly mention the the types of repositories that do benefit.

>    $ hyperfine -w3 -L git ./git_2.50.1,./git '{git} rev-parse :/^Initial.revision'
>    Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision
>      Time (mean ± σ):      1.071 s ±  0.004 s    [User: 1.052 s, System: 0.017 s]
>      Range (min … max):    1.067 s …  1.078 s    10 runs
> 
>    Benchmark 2: ./git rev-parse :/^Initial.revision
>      Time (mean ± σ):      1.079 s ±  0.003 s    [User: 1.060 s, System: 0.017 s]
>      Range (min … max):    1.074 s …  1.083 s    10 runs
> 
>    Summary
>      ./git_2.50.1 rev-parse :/^Initial.revision ran
>        1.01 ± 0.00 times faster than ./git rev-parse :/^Initial.revision
> 
> Signed-off-by: René Scharfe <l.s.r@xxxxxx>
> ---
>  commit.c                          |  7 ++--
>  commit.h                          |  8 ++---
>  fetch-pack.c                      | 13 +++++---
>  object-name.c                     | 10 +++---
>  t/perf/p1501-rev-parse-oneline.sh | 55 +++++++++++++++++++++++++++++++
>  walker.c                          | 11 ++++---
>  6 files changed, 83 insertions(+), 21 deletions(-)
>  create mode 100755 t/perf/p1501-rev-parse-oneline.sh
> 
> diff --git a/commit.c b/commit.c
> index e915b2b9a1..0200759aaa 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -31,6 +31,7 @@
>  #include "parse.h"
>  #include "object-file.h"
>  #include "object-file-convert.h"
> +#include "prio-queue.h"
>  
>  static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
>  
> @@ -738,17 +739,17 @@ void commit_list_sort_by_date(struct commit_list **list)
>  	commit_list_sort(list, commit_list_compare_by_date);
>  }
>  
> -struct commit *pop_most_recent_commit(struct commit_list **list,
> +struct commit *pop_most_recent_commit(struct prio_queue *queue,
>  				      unsigned int mark)
>  {
> -	struct commit *ret = pop_commit(list);
> +	struct commit *ret = prio_queue_get(queue);
>  	struct commit_list *parents = ret->parents;
>  
>  	while (parents) {
>  		struct commit *commit = parents->item;
>  		if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) {
>  			commit->object.flags |= mark;
> -			commit_list_insert_by_date(commit, list);
> +			prio_queue_put(queue, commit);
>  		}
>  		parents = parents->next;
>  	}
> diff --git a/commit.h b/commit.h
> index 70c870dae4..9630c076d6 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -201,10 +201,10 @@ const char *repo_logmsg_reencode(struct repository *r,
>  
>  const char *skip_blank_lines(const char *msg);
>  
> -/** Removes the first commit from a list sorted by date, and adds all
> - * of its parents.
> - **/
> -struct commit *pop_most_recent_commit(struct commit_list **list,
> +struct prio_queue;
> +
> +/* Removes the first commit from a prio_queue and adds its parents. */
> +struct commit *pop_most_recent_commit(struct prio_queue *queue,
>  				      unsigned int mark);

Previously, `pop_most_recent_commit()` would ensure commits inserted in
the list were done it date order. Now this depends on how the caller has
configured the `struct prio_queue`. This is fine though as previously
the caller was required to ensure the list was sorted to begin with
otherwise it wouldn't work properly.

-Justin




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux