"Paulo Casaretto via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: pcasaretto <paulo.casaretto@xxxxxxxxxxx> <administrivia> It is usual to see a less human readable name embedded in the commit object than the mail header when a mail comes from GGG. Just in case you want to be known to this community as "Paulo Casaretto", not "pcasaretto", I thought I'd point it out that you may want to redo the commit. I do not mind what name you like to use, as long as it is identifiable, and From: identity matches the identity you add your Signed-off-by: with. </administrivia> > Acked-by: Johannes Schindelin johannes.schindelin@xxxxxx It is unusual to lack <> around e-mail address here. > Signed-off-by: pcasaretto <paulo.casaretto@xxxxxxxxxxx> > --- > range-diff: add configurable memory limit for cost matrix > +static int parse_max_memory(const struct option *opt, const char *arg, int unset) > +{ > + size_t *max_memory = opt->value; > + uintmax_t val; > + > + if (unset) { > + return 0; > + } No unnecessary {braces} around a single statement, please. > + if (!git_parse_unsigned(arg, &val, SIZE_MAX)) > + return error(_("invalid max-memory value: %s"), arg); > + > + *max_memory = (size_t)val; > + return 0; > +} > @@ -33,17 +51,21 @@ int cmd_range_diff(int argc, > OPT_INTEGER(0, "creation-factor", > &range_diff_opts.creation_factor, > N_("percentage by which creation is weighted")), > + OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg, > + N_("style"), N_("passed to 'git log'"), 0), > + OPT_BOOL(0, "left-only", &left_only, > + N_("only emit output related to the first range")), > + OPT_CALLBACK(0, "max-memory", &range_diff_opts.max_memory, > + N_("size"), > + N_("maximum memory for cost matrix (default 4G)"), > + parse_max_memory), > OPT_BOOL(0, "no-dual-color", &simple_color, > N_("use simple diff colors")), > OPT_PASSTHRU_ARGV(0, "notes", &other_arg, > N_("notes"), N_("passed to 'git log'"), > PARSE_OPT_OPTARG), > - OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg, > - N_("style"), N_("passed to 'git log'"), 0), > OPT_PASSTHRU_ARGV(0, "remerge-diff", &diff_merges_arg, NULL, > N_("passed to 'git log'"), PARSE_OPT_NOARG), > - OPT_BOOL(0, "left-only", &left_only, > - N_("only emit output related to the first range")), > OPT_BOOL(0, "right-only", &right_only, > N_("only emit output related to the second range")), > OPT_END() This seems to mix unrelated changes. Please don't. Or if the reordering of options do have a reason to exist in _this_ commit, please justify it in your proposed log message. Even if there were a good reason for reordering existing options, I strongly suspect that the change would want to be done in a separate, preparatory-clean-up commit (i.e., making this topic a two-patch series), because it has nothing to do with preventing inefficient cost matrix computation from consuming too much memory, which _is_ the theme of this commit. > diff --git a/range-diff.c b/range-diff.c > index 8a2dcbee322..6e9b6b115e5 100644 > --- a/range-diff.c > +++ b/range-diff.c > @@ -21,6 +21,7 @@ > #include "apply.h" > #include "revision.h" > > + Unrelated, unexplained, and unnecessary change snuck in? Please proof-read the patch yourself before sending. > @@ -287,8 +288,8 @@ static void find_exact_matches(struct string_list *a, struct string_list *b) > } > > static int diffsize_consume(void *data, > - char *line UNUSED, > - unsigned long len UNUSED) > + char *line UNUSED, > + unsigned long len UNUSED) What is this change about??? > static void get_correspondences(struct string_list *a, struct string_list *b, > - int creation_factor) > + int creation_factor, size_t max_memory) > { > int n = a->nr + b->nr; > int *cost, c, *a2b, *b2a; > int i, j; > - > - ALLOC_ARRAY(cost, st_mult(n, n)); > + size_t cost_size = st_mult(n, n); > + size_t cost_bytes = st_mult(sizeof(int), cost_size); > + if (cost_bytes >= max_memory) { > + struct strbuf cost_str = STRBUF_INIT; > + struct strbuf max_str = STRBUF_INIT; > + strbuf_humanise_bytes(&cost_str, cost_bytes); > + strbuf_humanise_bytes(&max_str, max_memory); > + die(_("range-diff: unable to compute the range-diff, since it " > + "exceeds the maximum memory for the cost matrix: %s " > + "(%"PRIuMAX" bytes) needed, %s (%"PRIuMAX" bytes) available"), > + cost_str.buf, (uintmax_t)cost_bytes, max_str.buf, (uintmax_t)max_memory); > + } > + ALLOC_ARRAY(cost, cost_size); Nicely done. > @@ -351,7 +363,8 @@ static void get_correspondences(struct string_list *a, struct string_list *b, > } > > c = a_util->matching < 0 ? > - a_util->diffsize * creation_factor / 100 : COST_MAX; > + a_util->diffsize * creation_factor / 100 : > + COST_MAX; > for (j = b->nr; j < n; j++) > cost[i + n * j] = c; > } There seem to be other unrelated changes indentation-only changes mixed in to the changes to this file, not just this one. As a style fix, c = a_util->matching < 0 ? a_util->diffsize * creation_factor / 100 : COST_MAX; would be easier to follow and read, but please do not do such a cosmetic clean-up in the same patch. Do them in a separate preliminary clean-up patch before the "real work". > @@ -591,7 +605,8 @@ int show_range_diff(const char *range1, const char *range2, > if (!res) { > find_exact_matches(&branch1, &branch2); > get_correspondences(&branch1, &branch2, > - range_diff_opts->creation_factor); > + range_diff_opts->creation_factor, > + range_diff_opts->max_memory); > output(&branch1, &branch2, range_diff_opts); > } OK.