On Fri, Aug 1, 2025 at 1:31 AM Patrick Steinhardt <ps@xxxxxx> wrote: > > On Tue, Jul 22, 2025 at 03:23:11PM +0000, Elijah Newren via GitGitGadget wrote: > > What a massive commit message. It almost felt like a blog post rather > than a commit message, but I certainly don't mind the additional > context. Yeah...it certainly is. I was spinning my wheels for a few weeks because of * the number of items needed to trigger the issue * the misleading/buggy testcases we had, now addressed earlier in this series * the number of other nearby bugs that also existed * the fact that "relevant renames" sometimes tricks you into thinking you're testing a rename testcase when you're not and when I started explaining it, I realized there was lots of assumed background needed to understand the bug that I'm not sure others on the list would have. Besides, I was worried that _I_ would forget all these details in 6 months, so I wanted it all spelled out. Thanks for being understanding of the lengthy tome this commit message became. And for reading all of it! > > From: Elijah Newren <newren@xxxxxxxxx> > > > > At GitHub, we've got a real-world repository that has been triggering > > failures of the form: > > > > git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed. > > > > which comes from the line: > > > > VERIFY_CI(newinfo); > > > > Unfortunately, this one has been quite complex to unravel, and is a > > bit complex to explain. So, I'm going to carefully try to explain each > > relevant piece needed to understand the fix, then carefully build up > > from a simple testcase to some of the relevant testcases. > > > > == New special case we need to consider == > > > > Rename pairs in the diffcore machinery connect the source path of a > > rename with the destination path of a rename. Since we have rename > > pairs to consider on both sides of history since the merge base, > > merging has to consider a few special cases of possible overlap: > > > > A) two rename pairs having the same target path > > B) two rename pairs having the same source path > > C) the source path of one rename pair being the target path of a > > different rename pair > > So basically file A get's moved somewhere else and then replaced by a > different file B? Yup. > > > Some of these came up often enough that we gave them names: > > A) a rename/rename(2to1) conflict (looks similar to an add/add conflict) > > B) a rename/rename(1to2) conflict, which represents the same path being > > renamed differently on the two sides of history > > C) not yet named > > > > merge-ort is well-prepared to handle cases (A) and (B), as was > > merge-recursive (which was merge-ort's predecessor). Case (C) was > > briefly considered during the years of merge-recursive maintenance, > > but the full extent of support it got was a few FIXME/TODO comments > > littered around the code highlighting some of the places that would > > probably need to be fixed to support it. When I wrote merge-ort I > > ignored case (C) entirely, since I believed that case (C) was only > > possible if we were to support break detection during merges. Not > > only had break detection never been supported by any merge algorithm, > > I thought break detection wasn't worth the effort to support in a > > merge algorithm. However, it turns out that case (C) can be triggered > > without break detection, if there's enough moving pieces. > > > > Before I dive into how to trigger case (C) with directory renames plus > > other renames, it might be helpful to use a simpler example with break > > detection first. And before we get to that it may help to explain > > some more basics of handling renames in the merge algorithm. So, let > > me first backup and provide a quick refresher on on each of > > s/on on/on/ Thanks. > [snip] > > == Directory rename detection == > > > > If one side of history renames directory D/ -> E/, and the other side of > > history adds new files to E/, then directory rename detection notices > > Did you mean to say "D/" here? Yes, thanks. > [snip] > > == Testcases 8+ == > > > > Another bonus bug, found via understanding our final solution (and the > > failure of our first attempted solution)! > > s/solution/solutions/ as there are multiple attempted solutions that > were discarded? Yeah, I typed up this commit message and then found more issues, and inserted them earlier. I'll fix up the wording; thanks. > > diff --git a/merge-ort.c b/merge-ort.c > > index feb06720c7e1..f1ecccee940b 100644 > > --- a/merge-ort.c > > +++ b/merge-ort.c > > @@ -2313,14 +2313,20 @@ static char *apply_dir_rename(struct strmap_entry *rename_info, > > return strbuf_detach(&new_path, NULL); > > } > > > > -static int path_in_way(struct strmap *paths, const char *path, unsigned side_mask) > > +static int path_in_way(struct strmap *paths, > > + const char *path, > > + unsigned side_mask, > > + struct diff_filepair *p) > > { > > struct merged_info *mi = strmap_get(paths, path); > > struct conflict_info *ci; > > if (!mi) > > return 0; > > INITIALIZE_CI(ci, mi); > > - return mi->clean || (side_mask & (ci->filemask | ci->dirmask)); > > + return mi->clean || (side_mask & (ci->filemask | ci->dirmask)) > > + // See testcases 12n, 12p, 12q for more details on this next condition > > This should use `/* */`-style comments. Yep, will fix. > > > + || ((ci->filemask & 0x01) && > > + strcmp(p->one->path, path)); > > So if we have a stage 1 index entry and the path is the same due to a > transitive rename we can say that the path is not in the way? Right if A -> A, then we know that the original A and the new A do in fact have related contents and thus the old A is not in the way of the new A. I would have rather just checked for a stage 1 index entry and said that the presence of such a thing means there's a file in the way (that's what I originally did), but the rename-to-self case is special and is why the strcmp condition is there.