On Tue, 2025-02-11 at 19:01 +0000, Al Viro wrote:
> On Tue, Feb 11, 2025 at 06:01:21PM +0000, Viacheslav Dubeyko wrote:
>
> > After some considerations, I believe we can follow such simple logic.
> > Correct me if I will be wrong here. The ceph_lookup() method's responsibility is
> > to "look up a single dir entry". It sounds for me that if we have positive
> > dentry,
> > then it doesn't make sense to call the ceph_lookup(). And if ceph_lookup() has
> > been
> > called for the positive dentry, then something wrong is happening.
>
> VFS never calls it that way; ceph itself might, if ceph_handle_notrace_create()
> is called with positive dentry.
>
> > But all this logic is not about negative dentry, it's about local check
> > before sending request to MDS server. So, I think we need to change the logic
> > in likewise way:
> >
> > if (<we can check locally>) {
> > <do check locally>
> > if (-ENOENT)
> > return NULL;
> > } else {
> > <send request to MDS server>
> > }
> >
> > Am I right here? :) Let me change the logic in this way and to test it.
>
> First of all, returning NULL does *not* mean "it's negative"; d_add(dentry, NULL)
> does that. What would "we can check locally" be? AFAICS, the checks in
> __ceph_dir_is_complete() and near it are doing just that...
>
Currently, we have:
if (d_really_is_negative(dentry)) {
<execute logic>
}
My point is simply to delete the d_really_is_negative() check and execute
this logic without the check.
> The really unpleasant question is whether ceph_handle_notrace_create() could
> end up feeding an already-positive dentry to direct call of ceph_lookup()...
We have ceph_handle_notrace_create() call in several methods:
(1) ceph_mknod()
(2) ceph_symlink()
(3) ceph_mkdir()
(4) ceph_atomic_open()
Every time we create object at first and, then, we call
ceph_handle_notrace_create() if creation was successful. We have such pattern:
req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_MKNOD, USE_AUTH_MDS);
<skipped>
err = ceph_mdsc_do_request(mdsc, dir, req);
if (!err && !req->r_reply_info.head->is_dentry)
err = ceph_handle_notrace_create(dir, dentry);
And ceph_lookup() has such logic:
if (d_really_is_negative(dentry)) {
<execute logic>
if (-ENOENT)
return NULL;
}
req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS);
<skipped>
err = ceph_mdsc_do_request(mdsc, NULL, req);
So, we have two different type of requests here:
(1) ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_MKNOD, USE_AUTH_MDS)
(2) ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS)
The first request creates an object on MDS side and second one checks that this
object exists on MDS side by lookup. I assume that first request simply reports
success or failure of object creation. And only second one can extract metadata
from MDS side.
Thanks,
Slava.