RE: [PATCH 1/2] ceph/mds_client: transfer CEPH_CAP_PIN when updating r_parent on mismatch

On Wed, 2025-09-03 at 11:14 +0300, Alex Markuze wrote:
> These patches apply to the testing branch. They update the r_parent race fix.
> 
> commit a69ac54928a45ad66b6ba84f9bd4be2fd0f9518e
> Author: Alex Markuze <amarkuze@xxxxxxxxxx>
> Date:   Tue Aug 12 09:57:39 2025 +0000
> 
>     ceph: fix race condition where r_parent becomes stale before sending message
> 
>     When the parent directory's i_rwsem is not locked, req->r_parent may become
>     stale due to concurrent operations (e.g. rename) between dentry lookup and
>     message creation. Validate that r_parent matches the encoded parent inode
>     and update to the correct inode if a mismatch is detected.
> 
>     Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
>     Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
>     Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> 
> commit 7128e41a490709c759fde32898eade197acd0978
> Author: Alex Markuze <amarkuze@xxxxxxxxxx>
> Date:   Tue Aug 12 09:57:38 2025 +0000
> 
>     ceph: fix race condition validating r_parent before applying state
> 
>     Add validation to ensure the cached parent directory inode matches the
>     directory info in MDS replies. This prevents client-side race conditions
>     where concurrent operations (e.g. rename) cause r_parent to become stale
>     between request initiation and reply processing, which could lead to
>     applying state changes to incorrect directory inodes.
> 
>     Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
>     Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
>     Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> 

OK. Thanks for explaining this.

> On Tue, Sep 2, 2025 at 9:42 PM Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx> wrote:
> > 
> > On Mon, 2025-09-01 at 15:14 +0000, Alex Markuze wrote:
> > > When the parent directory lock is not held, req->r_parent can become stale between dentry lookup and request encoding.
> > > The client updates r_parent to the correct inode based on the encoded path, but previously did not adjust CEPH_CAP_PIN references.
> > > 
> > > Release the pin from the old parent and acquire it for the new parent when switching r_parent, ensuring reference accounting stays balanced and avoiding leaks or underflows later in ceph_mdsc_release_request().
> > > 
> > 

I think it makes sense to briefly explain what CEPH_CAP_PIN is
responsible for and why the pin must be moved from the old parent
to the new one.

> > 
> > > Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
> > > ---
> > >  fs/ceph/mds_client.c | 11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > > index ce0c129f4651..4e5926f36e8d 100644
> > > --- a/fs/ceph/mds_client.c
> > > +++ b/fs/ceph/mds_client.c
> > > @@ -3053,12 +3053,19 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
> > >        */
> > >       if (!parent_locked && req->r_parent && path_info1.vino.ino &&
> > >           ceph_ino(req->r_parent) != path_info1.vino.ino) {
> > > +             struct inode *old_parent = req->r_parent;
> > >               struct inode *correct_dir = ceph_get_inode(mdsc->fsc->sb, path_info1.vino, NULL);
> > >               if (!IS_ERR(correct_dir)) {
> > >                       WARN_ONCE(1, "ceph: r_parent mismatch (had %llx wanted %llx) - updating\n",
> > > -                               ceph_ino(req->r_parent), path_info1.vino.ino);
> > > -                     iput(req->r_parent);
> > > +                               ceph_ino(old_parent), path_info1.vino.ino);
> > > +                     /*
> > > +                      * Transfer CEPH_CAP_PIN from the old parent to the new one.
> > > +                      * The pin was taken earlier in ceph_mdsc_submit_request().
> > > +                      */
> > > +                     ceph_put_cap_refs(ceph_inode(old_parent), CEPH_CAP_PIN);
> > > +                     iput(old_parent);
> > >                       req->r_parent = correct_dir;
> > > +                     ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > >               }
> > >       }
> > > 
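
To illustrate the accounting the commit message describes, here is a
minimal userspace sketch, not kernel code. Every toy_* name is
hypothetical and only models two counters per inode: the inode
reference dropped by iput() and the CEPH_CAP_PIN reference. It assumes
the pin is taken once at submit time and dropped once at release time,
as the patch comment and commit message state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with two independent counters. */
struct toy_inode {
	unsigned long ino;
	int i_count;	/* models the inode reference dropped by iput() */
	int pin_refs;	/* models the CEPH_CAP_PIN reference count */
};

/* Hypothetical stand-in for the request; only r_parent is modelled. */
struct toy_request {
	struct toy_inode *r_parent;
};

static void toy_iget(struct toy_inode *in) { in->i_count++; }

static void toy_iput(struct toy_inode *in)
{
	assert(in->i_count > 0);	/* catch inode ref underflow */
	in->i_count--;
}

static void toy_get_pin(struct toy_inode *in) { in->pin_refs++; }

static void toy_put_pin(struct toy_inode *in)
{
	assert(in->pin_refs > 0);	/* catch pin underflow */
	in->pin_refs--;
}

/* Models submit time: the request holds one inode ref and one pin. */
static void toy_submit(struct toy_request *req, struct toy_inode *parent)
{
	toy_iget(parent);
	req->r_parent = parent;
	toy_get_pin(parent);
}

/* Models the mismatch path, in the same order as the patch. */
static void toy_switch_parent(struct toy_request *req,
			      struct toy_inode *correct_dir)
{
	struct toy_inode *old_parent = req->r_parent;

	toy_iget(correct_dir);		/* ref returned by the inode lookup */
	toy_put_pin(old_parent);	/* release the pin from the old parent */
	toy_iput(old_parent);
	req->r_parent = correct_dir;
	toy_get_pin(req->r_parent);	/* acquire the pin on the new parent */
}

/* Models release time: drops exactly one pin and one inode ref. */
static void toy_release(struct toy_request *req)
{
	if (req->r_parent) {
		toy_put_pin(req->r_parent);
		toy_iput(req->r_parent);
		req->r_parent = NULL;
	}
}
```

In this model, dropping the two pin calls from toy_switch_parent()
leaves the old parent with a pin that is never released and makes
toy_release() trip the underflow assertion on the new parent, which
is the leak/underflow pair the commit message mentions.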

The patch looks good, but I think the commit message could be more
informative.

Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>

Thanks,
Slava.



