Re: [PATCH 5/6] merge-ort: fix incorrect file handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 1, 2025 at 1:31 AM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Tue, Jul 22, 2025 at 03:23:10PM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@xxxxxxxxx>
> >
> > We have multiple bugs here -- accidental silent file deletion,
> > accidental silent file retention for files that should be deleted,
> > and incorrect number of entries left in the index.
> >
> > The series merged at commit d3b88be1b450 (Merge branch
> > 'en/merge-dir-rename-corner-case-fix', 2021-07-16) introduced testcase
> > 12i-12k in t6423 which checked for rename-to-self cases, and fixed bugs
> > that merge-ort and merge-recursive had with these testcases.  At the
> > time, I noted that merge-ort had one bug for these cases, while
> > merge-recursive had two.  It turns out that merge-ort did in fact have
> > another bug, but the "relevant renames" optimizations were masking it.
> > If we modify testcase 12i from t6423 to modify the file in the commit
> > that renames it (but only modify it enough that it can still be detected
> > as a rename), then we can trigger silent deletion of the file.
> >
> > Tweak testcase 12i slightly to make the file in question have more than
> > one line in it, but which doesn't change how it operates.
>
> Hm, the second part of this sentence doesn't quite parse for me. Do you
> mean to say that 12i is basically left intact except that you change the
> contents of one line?

Yeah, sometimes my repeated editing of text leaves things not so
clear.  You are correct that I meant leaving the testcase intact other
than changing the initial contents of one file (though I changed the
contents so it had multiple lines, not just giving it a different
single line).  I'll reword it.

> > Make this
> > change to otherwise minimize the changes between this testcase and a new
> > one that we want to add.  Then duplicate the testcase as 12i2, changing
> > it so that it adds a single line to the file in question when it is
> > renamed, as a testcase for this bug.
>
> Okay.
>
> > Further, commit 98a1a00d5301 (t6423: add a testcase causing a failed
> > assertion in process_renames, 2025-03-06), fixed an issue with
> > rename-to-self but added a new testcase, 12n, that only checked for
> > whether the merge ran to completion.  A few commits ago, we modified
> > this test to check for the number of entries in the index -- but noted
> > that the number was wrong.  And we also noted a
> > silently-keep-instead-of-delete bug at the same time in the new testcase
> > 12n2.
> >
> > Fix to merge-ort to prevent multiple bugs with rename-to-self cases:
> >   * silent deletion of file expected to be kept (t6423 testcase 12i2)
> >   * silent retention of file expected to be removed (t6423 testcase 12n2)
> >   * wrong number of extries left in the index (t6423 testcase 12n)
>
> I think it would have been nice to also go a bit more in depth for what
> the bug actually was and how it's fixed. You do add a comment, but that
> only adds a single sentence of context.

Would something like this help:

...all of these issues come up because in a rename-to-self case, when
we have a rename A->B, both A and B name the same file.  The code in
process_renames() assumes A & B are different, and tries to move the
higher order stages and file contents so that they are associated just
with the new path, but the assumptions of A & B being different can
cause A to be deleted when it's not supposed to be or mark B as
resolved and kept in place when it's supposed to be deleted.  Since A
& B are already the same path in the rename-to-self case, we can
simply skip the steps in process_renames() for such files.

> > Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> > ---
> >  merge-ort.c                         | 11 +++++
> >  t/t6423-merge-rename-directories.sh | 69 +++++++++++++++++++++++++++--
> >  2 files changed, 77 insertions(+), 3 deletions(-)
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 9b9d82ed10f7..feb06720c7e1 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -2873,6 +2873,17 @@ static int process_renames(struct merge_options *opt,
> >                       newinfo = new_ent->value;
> >               }
> >
> > +             /*
> > +              * Directory renames can result in rename-to-self, which we
> > +              * want to skip so we don't mark oldpath for deletion.
> > +              *
> > +              * Note that we can avoid strcmp here because of prior
> > +              * diligence in apply_directory_rename_modifications() to
> > +              * ensure we reused existing paths from opt->priv->paths.
> > +              */
> > +             if (oldpath == newpath)
> > +                     continue;
>
> Makes me wonder whether the additional brittleness is worth the saved
> `strcmp()` comparison. But on the other hand we do have tests now that
> would break if the memory allocation patterns ever changed, so that's
> reassuring.

There's no brittleness here; one of the many optimizations in
merge-ort.c was to intern *all* pathnames in struct
merge_options_internal's "paths" member; any code that needs to
generate/compute a filename that may be part of the merge must check
if that path already exists in opt->priv->paths, and if so use that
pointer instead so that all filename comparisons can be done with
cheap pointer comparisons.  See the big comment near the top of
merge_options_internal.  Nearly all such
string-equality-via-pointer-equality checks were introduced about the
same time, and in other functions, which makes this one kind of an
outlier.  I figured whoever reviewed this patch or ran across this in
the code might get surprised by the pointer comparison, so I tried to
add a comment to address any questions.  Looks like I wasn't thorough
enough (and the first paragraph of the comment pre-dated my finding
other bugs this patch fixed, which makes it slightly confusing), so
I'll try to see if I can improve it for v2.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux