Re: [PATCH] range-diff: add configurable memory limit for cost matrix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Paulo Casaretto via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: pcasaretto <paulo.casaretto@xxxxxxxxxxx>

<administrivia>

It is usual to see a less human readable name embedded in the commit
object than the mail header when a mail comes from GGG.  

Just in case you want to be known to this community as "Paulo
Casaretto", not "pcasaretto", I thought I'd point it out that you
may want to redo the commit.  I do not mind what name you like to
use, as long as it is identifiable, and From: identity matches the
identity you add your Signed-off-by: with.

</administrivia>

> Acked-by: Johannes Schindelin johannes.schindelin@xxxxxx

It is unusual to lack <> around e-mail address here.

> Signed-off-by: pcasaretto <paulo.casaretto@xxxxxxxxxxx>
> ---
>     range-diff: add configurable memory limit for cost matrix

> +static int parse_max_memory(const struct option *opt, const char *arg, int unset)
> +{
> +	size_t *max_memory = opt->value;
> +	uintmax_t val;
> +
> +	if (unset) {
> +		return 0;
> +	}

No unnecessary {braces} around a single statement, please.

> +	if (!git_parse_unsigned(arg, &val, SIZE_MAX))
> +		return error(_("invalid max-memory value: %s"), arg);
> +
> +	*max_memory = (size_t)val;
> +	return 0;
> +}

> @@ -33,17 +51,21 @@ int cmd_range_diff(int argc,
>  		OPT_INTEGER(0, "creation-factor",
>  			    &range_diff_opts.creation_factor,
>  			    N_("percentage by which creation is weighted")),
> +		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> +				  N_("style"), N_("passed to 'git log'"), 0),
> +		OPT_BOOL(0, "left-only", &left_only,
> +			 N_("only emit output related to the first range")),
> +		OPT_CALLBACK(0, "max-memory", &range_diff_opts.max_memory,
> +			     N_("size"),
> +			     N_("maximum memory for cost matrix (default 4G)"),
> +			     parse_max_memory),
>  		OPT_BOOL(0, "no-dual-color", &simple_color,
>  			    N_("use simple diff colors")),
>  		OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
>  				  N_("notes"), N_("passed to 'git log'"),
>  				  PARSE_OPT_OPTARG),
> -		OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> -				  N_("style"), N_("passed to 'git log'"), 0),
>  		OPT_PASSTHRU_ARGV(0, "remerge-diff", &diff_merges_arg, NULL,
>  				  N_("passed to 'git log'"), PARSE_OPT_NOARG),
> -		OPT_BOOL(0, "left-only", &left_only,
> -			 N_("only emit output related to the first range")),
>  		OPT_BOOL(0, "right-only", &right_only,
>  			 N_("only emit output related to the second range")),
>  		OPT_END()

This seems to mix unrelated changes.  Please don't.

Or if the reordering of options do have a reason to exist in _this_
commit, please justify it in your proposed log message.  Even if
there were a good reason for reordering existing options, I strongly
suspect that the change would want to be done in a separate,
preparatory-clean-up commit (i.e., making this topic a two-patch
series), because it has nothing to do with preventing inefficient
cost matrix computation from consuming too much memory, which _is_
the theme of this commit.

> diff --git a/range-diff.c b/range-diff.c
> index 8a2dcbee322..6e9b6b115e5 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -21,6 +21,7 @@
>  #include "apply.h"
>  #include "revision.h"
>  
> +

Unrelated, unexplained, and unnecessary change snuck in?  Please
proof-read the patch yourself before sending.

> @@ -287,8 +288,8 @@ static void find_exact_matches(struct string_list *a, struct string_list *b)
>  }
>  
>  static int diffsize_consume(void *data,
> -			     char *line UNUSED,
> -			     unsigned long len UNUSED)
> +			    char *line UNUSED,
> +			    unsigned long len UNUSED)

What is this change about???

>  static void get_correspondences(struct string_list *a, struct string_list *b,
> -				int creation_factor)
> +				int creation_factor, size_t max_memory)
>  {
>  	int n = a->nr + b->nr;
>  	int *cost, c, *a2b, *b2a;
>  	int i, j;
> -
> -	ALLOC_ARRAY(cost, st_mult(n, n));
> +	size_t cost_size = st_mult(n, n);
> +	size_t cost_bytes = st_mult(sizeof(int), cost_size);
> +	if (cost_bytes >= max_memory) {
> +		struct strbuf cost_str = STRBUF_INIT;
> +		struct strbuf max_str = STRBUF_INIT;
> +		strbuf_humanise_bytes(&cost_str, cost_bytes);
> +		strbuf_humanise_bytes(&max_str, max_memory);
> +		die(_("range-diff: unable to compute the range-diff, since it "
> +		      "exceeds the maximum memory for the cost matrix: %s "
> +		      "(%"PRIuMAX" bytes) needed, %s (%"PRIuMAX" bytes) available"),
> +		    cost_str.buf, (uintmax_t)cost_bytes, max_str.buf, (uintmax_t)max_memory);
> +	}
> +	ALLOC_ARRAY(cost, cost_size);

Nicely done.

> @@ -351,7 +363,8 @@ static void get_correspondences(struct string_list *a, struct string_list *b,
>  		}
>  
>  		c = a_util->matching < 0 ?
> -			a_util->diffsize * creation_factor / 100 : COST_MAX;
> +			    a_util->diffsize * creation_factor / 100 :
> +			    COST_MAX;
>  		for (j = b->nr; j < n; j++)
>  			cost[i + n * j] = c;
>  	}

There seem to be other unrelated changes indentation-only changes
mixed in to the changes to this file, not just this one.

As a style fix, 

		c = a_util->matching < 0
		  ? a_util->diffsize * creation_factor / 100
		  : COST_MAX;

would be easier to follow and read, but please do not do such a
cosmetic clean-up in the same patch.  Do them in a separate
preliminary clean-up patch before the "real work".

> @@ -591,7 +605,8 @@ int show_range_diff(const char *range1, const char *range2,
>  	if (!res) {
>  		find_exact_matches(&branch1, &branch2);
>  		get_correspondences(&branch1, &branch2,
> -				    range_diff_opts->creation_factor);
> +				    range_diff_opts->creation_factor,
> +				    range_diff_opts->max_memory);
>  		output(&branch1, &branch2, range_diff_opts);
>  	}

OK.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux