Junio C Hamano <gitster@xxxxxxxxx> writes:

> But I think the refactoring of diff_flush() codepath would may
> involve some new mode (perhaps DIFF_FORMAT_DRYRUN or something) that
>
>  (1) does not produce any output, like DIFF_FORMAT_NO_OUTPUT, so
>      that we do not need to play with /dev/null like Peff's
>      illustration.
>
>  (2) knows that the caller is only interested in each path having
>      any change worth reporting, so that it can short-circuit once a
>      change is found for each path.
>
> So, just before you want to decide showing name or name-status,
> you'd do this extra diff_flush() that is run only to learn if each
> path has changes (with various "ignore" criteria) in the dry-run
> mode, and it can do as much short-cut as it needs to.

I’m proposing to add a .diff_optimize field to struct diff_options
that supports three modes: DIFF_OPT_NONE, DIFF_OPT_DRY_RUN, and
DIFF_OPT_BUFFER. The appropriate value would be determined before
calling diff_flush(), potentially in repo_diff_setup().

DIFF_OPT_NONE would be the code Peff provided. DIFF_OPT_DRY_RUN would
optimize for --quiet, --name-only, --name-status, etc., so that we can
return early as soon as we find any change. DIFF_OPT_BUFFER would
first emit the changes and the context around them into a buffer (so
there would be a map from file pair to change buffer); operations
that run after the buffer is built would then use the buffer instead
of calling xdl_diff().

However, I’m concerned that DIFF_OPT_BUFFER could lead to high memory
usage in Git, and I’m not entirely sure this trade-off is justified.

Thanks,
Lidong
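
For illustration only, the mode selection described above could be sketched roughly as follows. Nothing here exists in Git today: the enum values follow the names proposed in this message, while struct sketch_options, its fields, and pick_diff_optimize() are hypothetical stand-ins for whatever repo_diff_setup() would actually inspect. The rule for choosing DIFF_OPT_BUFFER (more than one later pass needing the diff content) is an assumption, not something this thread has settled.

```c
#include <stdbool.h>

/* Proposed modes; the names come from this message, not from Git itself. */
enum diff_optimize {
	DIFF_OPT_NONE,    /* no shortcut: run the full diff as today */
	DIFF_OPT_DRY_RUN, /* no output; stop at the first change per path */
	DIFF_OPT_BUFFER   /* cache changes per file pair for later reuse */
};

/*
 * Hypothetical stand-in for the few bits of struct diff_options that
 * would drive the decision. The real struct looks nothing like this.
 */
struct sketch_options {
	bool only_needs_changed_flag; /* e.g. --quiet, --name-only, --name-status */
	int content_passes;           /* assumed: later passes needing diff content */
	enum diff_optimize diff_optimize;
};

/*
 * Pick a mode before diff_flush() would run. Assumption: buffering only
 * pays off when the same diff content would otherwise be computed more
 * than once; a pure "did anything change?" query can use the dry run.
 */
static enum diff_optimize pick_diff_optimize(const struct sketch_options *o)
{
	if (o->content_passes > 1)
		return DIFF_OPT_BUFFER;
	if (o->only_needs_changed_flag)
		return DIFF_OPT_DRY_RUN;
	return DIFF_OPT_NONE;
}
```

A caller would store the result in the .diff_optimize field once, so that the flush path can branch on it without re-deriving the decision for every file pair. The memory concern raised above applies only to the DIFF_OPT_BUFFER branch.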