Re: [PATCH 1/6] NFSD: add the ability to enable use of RWF_DONTCACHE for all IO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2025-06-11 at 15:18 -0400, Mike Snitzer wrote:
> On Wed, Jun 11, 2025 at 10:31:20AM -0400, Chuck Lever wrote:
> > On 6/10/25 4:57 PM, Mike Snitzer wrote:
> > > Add 'enable-dontcache' to NFSD's debugfs interface so that: Any data
> > > read or written by NFSD will either not be cached (thanks to O_DIRECT)
> > > or will be removed from the page cache upon completion (DONTCACHE).
> > 
> > I thought we were going to do two switches: One for reads and one for
> > writes? I could be misremembering.
> 
> We did discuss the possibility of doing that.  Still can-do if that's
> what you'd prefer.
>  

Having them as separate controls in debugfs is fine for
experimentation's sake, but I imagine we'll need to be all-in one way
or the other with a real interface.

I think if we can crack the problem of receiving WRITE payloads into an
already-aligned buffer, then that becomes much more feasible. I think
that's a solveable problem.

> > After all, you are describing two different facilities here: a form of
> > direct I/O for READs, and RWF_DONTCACHE for WRITEs (I think?).
> 
> My thinking was NFSD doesn't need to provide faithful pure
> RWF_DONTCACHE if it really doesn't make sense.  But the "dontcache"
> name can be (ab)used by NFSD to define it how it sees fit (O_DIRECT
> doesn't cache so it seems fair).  What I arrived at with this patchset
> is how I described in my cover letter:
> 
> When 'enable-dontcache' is used:
> - all READs will use O_DIRECT (both DIO-aligned and misaligned)
> - all DIO-aligned WRITEs will use O_DIRECT (useful for SUNRPC RDMA)
> - misaligned WRITEs currently continue to use normal buffered IO
> 
> But we reserve the right to iterate on the implementation details as
> we see fit.  Still using the umbrella of 'dontcache'.
>
> > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > index 7d94fae1dee8..bba3e6f4f56b 100644
> > > --- a/fs/nfsd/vfs.c
> > > +++ b/fs/nfsd/vfs.c
> > > @@ -49,6 +49,7 @@
> > >  #define NFSDDBG_FACILITY		NFSDDBG_FILEOP
> > >  
> > >  bool nfsd_disable_splice_read __read_mostly;
> > > +bool nfsd_enable_dontcache __read_mostly;
> > >  
> > >  /**
> > >   * nfserrno - Map Linux errnos to NFS errnos
> > > @@ -1086,6 +1087,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > >  	unsigned long v, total;
> > >  	struct iov_iter iter;
> > >  	loff_t ppos = offset;
> > > +	rwf_t flags = 0;
> > >  	ssize_t host_err;
> > >  	size_t len;
> > >  
> > > @@ -1103,7 +1105,11 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > >  
> > >  	trace_nfsd_read_vector(rqstp, fhp, offset, *count);
> > >  	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count);
> > > -	host_err = vfs_iter_read(file, &iter, &ppos, 0);
> > > +
> > > +	if (nfsd_enable_dontcache)
> > > +		flags |= RWF_DONTCACHE;
> > 
> > Two things:
> > 
> > - Maybe NFSD should record whether the file system is DONTCACHE-enabled
> > in @fhp or in the export it is associated with, and then check that
> > setting here before asserting RWF_DONTCACHE
> 
> Sure, that'd be safer than allowing RWF_DONTCACHE to be tried only to
> get EOPNOTSUPP because the underlying filesystem doesn't enable
> support.
> 
> Could follow what I did with nfsd_file only storing the dio_*
> alignment data retrieved from statx IFF 'enable-dontcache' was enabled
> at the time the nfsd_file was opened.
> 
> By adding check for FOP_DONTCACHE being set in underlying filesystem.
> But as-is, we're not actually using RWF_DONTCACHE in the final form of
> what I've provided in this patchset.  So can easily circle back to
> adding this if/when we do decide to use RWF_DONTCACHE.
> 
> > - I thought we were going with O_DIRECT for READs.
> 
> Yes, this is just an intermediate patch that goes away in later
> patches.  I was more focused on minimal patch to get the
> 'enable-dontcache' debugfs interface in place and tweaking it to its
> ultimate form in later patch.
> 
> I put in place a more general framework that can evolve... it being
> more free-form (e.g. "don't worry your pretty head about the
> implementation details, we'll worry for you").
> 
> Causes some reviewer angst I suppose, so I can just fold patches to do
> away with unused intermediate state.
> 
> Mike

-- 
Jeff Layton <jlayton@xxxxxxxxxx>





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux