On Wed, 2025-09-03 at 11:14 +0300, Alex Markuze wrote:
> These patches apply to the testing branch. They update the r_parent race fix.
>
> commit a69ac54928a45ad66b6ba84f9bd4be2fd0f9518e
> Author: Alex Markuze <amarkuze@xxxxxxxxxx>
> Date:   Tue Aug 12 09:57:39 2025 +0000
>
>     ceph: fix race condition where r_parent becomes stale before sending message
>
>     When the parent directory's i_rwsem is not locked, req->r_parent may become
>     stale due to concurrent operations (e.g. rename) between dentry lookup and
>     message creation. Validate that r_parent matches the encoded parent inode
>     and update to the correct inode if a mismatch is detected.
>
>     Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
>     Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
>     Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
>
> commit 7128e41a490709c759fde32898eade197acd0978
> Author: Alex Markuze <amarkuze@xxxxxxxxxx>
> Date:   Tue Aug 12 09:57:38 2025 +0000
>
>     ceph: fix race condition validating r_parent before applying state
>
>     Add validation to ensure the cached parent directory inode matches the
>     directory info in MDS replies. This prevents client-side race conditions
>     where concurrent operations (e.g. rename) cause r_parent to become stale
>     between request initiation and reply processing, which could lead to
>     applying state changes to incorrect directory inodes.
>
>     Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
>     Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
>     Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
>

OK. Thanks for explaining this.

> On Tue, Sep 2, 2025 at 9:42 PM Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx> wrote:
> >
> > On Mon, 2025-09-01 at 15:14 +0000, Alex Markuze wrote:
> > > When the parent directory lock is not held, req->r_parent can become stale between dentry lookup and request encoding.
> > > The client updates r_parent to the correct inode based on the encoded path, but previously did not adjust CEPH_CAP_PIN references.
> > >
> > > Release the pin from the old parent and acquire it for the new parent when switching r_parent, ensuring reference accounting stays balanced and avoiding leaks or underflows later in ceph_mdsc_release_request().
> > >
> >
> > I think that it makes sense to explain in brief what is the responsibility of CEPH_CAP_PIN and why it is important to move the CEPH_CAP_PIN from the old parent to the new one.
> >
> > > Signed-off-by: Alex Markuze <amarkuze@xxxxxxxxxx>
> > > ---
> > >  fs/ceph/mds_client.c | 11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > > index ce0c129f4651..4e5926f36e8d 100644
> > > --- a/fs/ceph/mds_client.c
> > > +++ b/fs/ceph/mds_client.c
> > > @@ -3053,12 +3053,19 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
> > >  	 */
> > >  	if (!parent_locked && req->r_parent && path_info1.vino.ino &&
> > >  	    ceph_ino(req->r_parent) != path_info1.vino.ino) {
> > > +		struct inode *old_parent = req->r_parent;
> > >  		struct inode *correct_dir = ceph_get_inode(mdsc->fsc->sb, path_info1.vino, NULL);
> > >  		if (!IS_ERR(correct_dir)) {
> > >  			WARN_ONCE(1, "ceph: r_parent mismatch (had %llx wanted %llx) - updating\n",
> > > -				  ceph_ino(req->r_parent), path_info1.vino.ino);
> > > -			iput(req->r_parent);
> > > +				  ceph_ino(old_parent), path_info1.vino.ino);
> > > +			/*
> > > +			 * Transfer CEPH_CAP_PIN from the old parent to the new one.
> > > +			 * The pin was taken earlier in ceph_mdsc_submit_request().
> > > +			 */
> > > +			ceph_put_cap_refs(ceph_inode(old_parent), CEPH_CAP_PIN);
> > > +			iput(old_parent);
> > >  			req->r_parent = correct_dir;
> > > +			ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > >  		}
> > >  	}
> > >

The patch looks good. But the commit message can be improved to be more informative, from my point of view.

Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>

Thanks,
Slava.
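
For readers following the reference accounting discussed in the patch, here is a minimal userspace sketch of why the pin has to move together with r_parent. It is only a toy model: struct toy_inode, struct toy_request and every toy_* function are hypothetical names invented for illustration, and the pin is treated as a plain counter. It assumes, as the patch comment and commit message state, that one pin is taken on r_parent at submit time and one is dropped on whatever r_parent points to at release time. It says nothing about what CEPH_CAP_PIN means to the MDS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
	unsigned long long ino;
	int i_count;	/* models the inode reference dropped by iput() */
	int pin_refs;	/* models the CEPH_CAP_PIN reference count */
};

/* Hypothetical stand-in for the request; only the parent pointer is modelled. */
struct toy_request {
	struct toy_inode *r_parent;
};

static void toy_get_pin(struct toy_inode *in)
{
	in->pin_refs++;
}

static void toy_put_pin(struct toy_inode *in)
{
	assert(in->pin_refs > 0);	/* trips on the underflow case */
	in->pin_refs--;
}

static void toy_iput(struct toy_inode *in)
{
	assert(in->i_count > 0);
	in->i_count--;
}

/* Models submission: the request holds one inode ref and one pin on r_parent. */
static void toy_submit(struct toy_request *req, struct toy_inode *parent)
{
	parent->i_count++;
	req->r_parent = parent;
	toy_get_pin(parent);
}

/*
 * Models the r_parent switch. The caller passes new_parent with one inode
 * reference already held, the way a lookup would hand it back.
 */
static void toy_switch_parent(struct toy_request *req,
			      struct toy_inode *new_parent)
{
	struct toy_inode *old_parent = req->r_parent;

	toy_put_pin(old_parent);	/* drop the pin taken at submit time */
	toy_iput(old_parent);		/* drop the request's inode ref */
	req->r_parent = new_parent;
	toy_get_pin(req->r_parent);	/* the new parent now carries the pin */
}

/* Models release: drop the pin and the inode ref on the current r_parent. */
static void toy_release(struct toy_request *req)
{
	toy_put_pin(req->r_parent);
	toy_iput(req->r_parent);
	req->r_parent = NULL;
}
```

In this model, if toy_switch_parent() skipped the two pin calls, the old parent would keep a pin that nobody drops (a leak), and toy_release() would hit the assertion in toy_put_pin() on the new parent (an underflow). With the transfer in place, submit, switch and release leave both counters at zero on both inodes.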