On Thu, Feb 13, 2025 at 10:38:51PM +0100, René Scharfe wrote: > Am 12.02.25 um 08:05 schrieb Patrick Steinhardt: > > And this is fixed by immediately processing all commits that we > > currently have in the list. As `clear_commit_marks_1()` only processes > > immediate children of the handed-in commit we know that it will have > > processed the first parent, and `list` will contain remaining commits, > > if any. > > clear_commit_marks_1() processes the whole ancestral chain of first > parents down to the root or first clean ancestor. > > > We also end up adding grandchildren to the list, so this change > > essentially converts the algorithm from depth-first to breadth-first. > > It's still depth-first, but all second and higher numbered parents added > to the list are cleaned before starting to clean the next commit from > the input list. So we go from "clean straight down for each input > commit first and clean their side branches later" to "clean all > ancestors for each input commit". Ah, fair. > > I bet we can construct cases where this will perform _worse_ than the > > current algorithm, e.g. when you have branch thickets where every commit > > is a merge: But I assume that for the most common cases this might be an > > improvement indeed. > > I won't bet, but I'd like to see such a case. Can't imagine one. > > > The question to me is: does this actually matter in the real world? It > > would be nice to maybe get some numbers that demonstrate the improvement > > in a repository. > > Well, the maximum list length for clear_commit_marks_many() calls with > nr > 1 in the test suite goes from 12 in t6600 to 4 with the patch. Not > that exciting. The question to me is: Why pile up parents in the list > when we can clean them earlier with no downside? Or is there any? If it really is without downsides then yes, it's a nice improvement. I was mostly wondering whether you're fixing something where the old way of doing things performs _significantly_ worse and where the change could lead to a user-visible improvement. Like, requiring us to store orders of magnitudes less commits at the same time. Patrick