Re: [PATCH 5/6] merge-ort: fix incorrect file handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 4, 2025 at 9:39 PM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Mon, Aug 04, 2025 at 03:08:50PM -0700, Elijah Newren wrote:
> > On Fri, Aug 1, 2025 at 1:31 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> > > On Tue, Jul 22, 2025 at 03:23:10PM +0000, Elijah Newren via GitGitGadget wrote:
> > > > Further, commit 98a1a00d5301 (t6423: add a testcase causing a failed
> > > > assertion in process_renames, 2025-03-06), fixed an issue with
> > > > rename-to-self but added a new testcase, 12n, that only checked for
> > > > whether the merge ran to completion.  A few commits ago, we modified
> > > > this test to check for the number of entries in the index -- but noted
> > > > that the number was wrong.  And we also noted a
> > > > silently-keep-instead-of-delete bug at the same time in the new testcase
> > > > 12n2.
> > > >
> > > > Fix to merge-ort to prevent multiple bugs with rename-to-self cases:
> > > >   * silent deletion of file expected to be kept (t6423 testcase 12i2)
> > > >   * silent retention of file expected to be removed (t6423 testcase 12n2)
> > > >   * wrong number of extries left in the index (t6423 testcase 12n)
> > >
> > > I think it would have been nice to also go a bit more in depth for what
> > > the bug actually was and how it's fixed. You do add a comment, but that
> > > only adds a single sentence of context.
> >
> > Would something like this help:
> >
> > ...all of these issues come up because in a rename-to-self case, when
> > we have a rename A->B, both A and B name the same file.  The code in
> > process_renames() assumes A & B are different, and tries to move the
> > higher order stages and file contents so that they are associated just
> > with the new path, but the assumptions of A & B being different can
> > cause A to be deleted when it's not supposed to be or mark B as
> > resolved and kept in place when it's supposed to be deleted.  Since A
> > & B are already the same path in the rename-to-self case, we can
> > simply skip the steps in process_renames() for such files.
>
> Yes, it would!

Great, I'll include it in the next round, which I'll send out soon.

> > > > Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> > > > ---
> > > >  merge-ort.c                         | 11 +++++
> > > >  t/t6423-merge-rename-directories.sh | 69 +++++++++++++++++++++++++++--
> > > >  2 files changed, 77 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/merge-ort.c b/merge-ort.c
> > > > index 9b9d82ed10f7..feb06720c7e1 100644
> > > > --- a/merge-ort.c
> > > > +++ b/merge-ort.c
> > > > @@ -2873,6 +2873,17 @@ static int process_renames(struct merge_options *opt,
> > > >                       newinfo = new_ent->value;
> > > >               }
> > > >
> > > > +             /*
> > > > +              * Directory renames can result in rename-to-self, which we
> > > > +              * want to skip so we don't mark oldpath for deletion.
> > > > +              *
> > > > +              * Note that we can avoid strcmp here because of prior
> > > > +              * diligence in apply_directory_rename_modifications() to
> > > > +              * ensure we reused existing paths from opt->priv->paths.
> > > > +              */
> > > > +             if (oldpath == newpath)
> > > > +                     continue;
> > >
> > > Makes me wonder whether the additional brittleness is worth the saved
> > > `strcmp()` comparison. But on the other hand we do have tests now that
> > > would break if the memory allocation patterns ever changed, so that's
> > > reassuring.
> >
> > There's no brittleness here; one of the many optimizations in
> > merge-ort.c was to intern *all* pathnames in struct
> > merge_options_internal's "paths" member; any code that needs to
> > generate/compute a filename that may be part of the merge must check
> > if that path already exists in opt->priv->paths, and if so use that
> > pointer instead so that all filename comparisons can be done with
> > cheap pointer comparisons.  See the big comment near the top of
> > merge_options_internal.  Nearly all such
> > string-equality-via-pointer-equality checks were introduced about the
> > same time, and in other functions, which makes this one kind of an
> > outlier.  I figured whoever reviewed this patch or ran across this in
> > the code might get surprised by the pointer comparison, so I tried to
> > add a comment to address any questions.  Looks like I wasn't thorough
> > enough (and the first paragraph of the comment pre-dated my finding
> > other bugs this patch fixed, which makes it slightly confusing), so
> > I'll try to see if I can improve it for v2.
>
> I think the current version is good enough -- it feels brittle to me,
> but I don't see a strong reason to change it.

Okay.  Note that if there was a strong reason to change it, the logic
to do so would mean that the pointer-comparison-instead-of-strcmp
optimization used on pathnames in several other places of merge-ort
would also need to change, because they are all based on the same
interning-of-pathname-strings to allow that optimization to be safe.
I did notice that the merge_options_internal comment on the paths
member could perhaps be improved slightly; I'll include that too.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux