Re: [PATCH 6/6] merge-ort: fix directory rename on top of source of other rename/delete

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 1, 2025 at 1:31 AM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Tue, Jul 22, 2025 at 03:23:11PM +0000, Elijah Newren via GitGitGadget wrote:
>
> What a massive commit message. It almost felt like a blog post rather
> than a commit message, but I certainly don't mind the additional
> context.

Yeah...it certainly is.  I was spinning my wheels for a few weeks because of
  * the number of items needed to trigger the issue
  * the misleading/buggy testcases we had, now addressed earlier in this series
  * the number of other nearby bugs that also existed
  * the fact that "relevant renames" sometimes tricks you into
thinking you're testing a rename testcase when you're not
and when I started explaining it, I realized there was lots of assumed
background needed to understand the bug that I'm not sure others on
the list would have.

Besides, I was worried that _I_ would forget all these details in 6
months, so I wanted it all spelled out.

Thanks for being understanding of the lengthy tome this commit message
became.  And for reading all of it!

> > From: Elijah Newren <newren@xxxxxxxxx>
> >
> > At GitHub, we've got a real-world repository that has been triggering
> > failures of the form:
> >
> >     git: merge-ort.c:3007: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
> >
> > which comes from the line:
> >
> >     VERIFY_CI(newinfo);
> >
> > Unfortunately, this one has been quite complex to unravel, and is a
> > bit complex to explain.  So, I'm going to carefully try to explain each
> > relevant piece needed to understand the fix, then carefully build up
> > from a simple testcase to some of the relevant testcases.
> >
> > == New special case we need to consider ==
> >
> > Rename pairs in the diffcore machinery connect the source path of a
> > rename with the destination path of a rename.  Since we have rename
> > pairs to consider on both sides of history since the merge base,
> > merging has to consider a few special cases of possible overlap:
> >
> >   A) two rename pairs having the same target path
> >   B) two rename pairs having the same source path
> >   C) the source path of one rename pair being the target path of a
> >      different rename pair
>
> So basically file A get's moved somewhere else and then replaced by a
> different file B?

Yup.

>
> > Some of these came up often enough that we gave them names:
> >   A) a rename/rename(2to1) conflict (looks similar to an add/add conflict)
> >   B) a rename/rename(1to2) conflict, which represents the same path being
> >      renamed differently on the two sides of history
> >   C) not yet named
> >
> > merge-ort is well-prepared to handle cases (A) and (B), as was
> > merge-recursive (which was merge-ort's predecessor).  Case (C) was
> > briefly considered during the years of merge-recursive maintenance,
> > but the full extent of support it got was a few FIXME/TODO comments
> > littered around the code highlighting some of the places that would
> > probably need to be fixed to support it.  When I wrote merge-ort I
> > ignored case (C) entirely, since I believed that case (C) was only
> > possible if we were to support break detection during merges.  Not
> > only had break detection never been supported by any merge algorithm,
> > I thought break detection wasn't worth the effort to support in a
> > merge algorithm.  However, it turns out that case (C) can be triggered
> > without break detection, if there's enough moving pieces.
> >
> > Before I dive into how to trigger case (C) with directory renames plus
> > other renames, it might be helpful to use a simpler example with break
> > detection first.  And before we get to that it may help to explain
> > some more basics of handling renames in the merge algorithm.  So, let
> > me first backup and provide a quick refresher on on each of
>
> s/on on/on/

Thanks.

> [snip]
> > == Directory rename detection ==
> >
> > If one side of history renames directory D/ -> E/, and the other side of
> > history adds new files to E/, then directory rename detection notices
>
> Did you mean to say "D/" here?

Yes, thanks.

> [snip]
> > == Testcases 8+ ==
> >
> > Another bonus bug, found via understanding our final solution (and the
> > failure of our first attempted solution)!
>
> s/solution/solutions/ as there are multiple attempted solutions that
> were discarded?

Yeah, I typed up this commit message and then found more issues, and
inserted them earlier.  I'll fix up the wording; thanks.

> > diff --git a/merge-ort.c b/merge-ort.c
> > index feb06720c7e1..f1ecccee940b 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -2313,14 +2313,20 @@ static char *apply_dir_rename(struct strmap_entry *rename_info,
> >       return strbuf_detach(&new_path, NULL);
> >  }
> >
> > -static int path_in_way(struct strmap *paths, const char *path, unsigned side_mask)
> > +static int path_in_way(struct strmap *paths,
> > +                    const char *path,
> > +                    unsigned side_mask,
> > +                    struct diff_filepair *p)
> >  {
> >       struct merged_info *mi = strmap_get(paths, path);
> >       struct conflict_info *ci;
> >       if (!mi)
> >               return 0;
> >       INITIALIZE_CI(ci, mi);
> > -     return mi->clean || (side_mask & (ci->filemask | ci->dirmask));
> > +     return mi->clean || (side_mask & (ci->filemask | ci->dirmask))
> > +       // See testcases 12n, 12p, 12q for more details on this next condition
>
> This should use `/* */`-style comments.

Yep, will fix.

>
> > +                      || ((ci->filemask & 0x01) &&
> > +                          strcmp(p->one->path, path));
>
> So if we have a stage 1 index entry and the path is the same due to a
> transitive rename we can say that the path is not in the way?

Right if A -> A, then we know that the original A and the new A do in
fact have related contents and thus the old A is not in the way of the
new A.  I would have rather just checked for a stage 1 index entry and
said that the presence of such a thing means there's a file in the way
(that's what I originally did), but the rename-to-self case is special
and is why the strcmp condition is there.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux