Re: [PATCH v1 2/8] iomap: add IOMAP_IN_MEM iomap type

On Tue, Jun 10, 2025 at 9:04 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 10, 2025 at 01:13:09PM -0700, Joanne Koong wrote:
> > > synchronous ones.  And if the file system fragmented the folio so badly
> > > that we'll now need to do more than two reads we're still at least
> > > pipelining it, although that should basically never happen with modern
> > > file systems.
> >
> > If the filesystem wants granular folio reads, it can also just do that
> > itself by calling an iomap helper (eg what iomap_adjust_read_range()
> > is doing right now) in its ->read_folio() implementation, correct?
>
> Well, nothing tells ->read_folio how much to read.  But having a new

Not a great idea, but theoretically we could stash that info (offset
and len) in the iomap_folio_state struct that folio->private points
to. I don't think that runs into synchronization issues, since it
would be set and cleared while the file lock is held for that read.
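
To make the stashing idea concrete, here is a rough userspace model,
not kernel code. Every name in it (mock_folio, mock_folio_state,
read_off, read_len and the mock_* helpers) is made up for illustration
and is not a field or function of the real iomap_folio_state. It
assumes the caller already holds whatever lock serializes the read:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; NOT the kernel's folio or iomap_folio_state. */
struct mock_folio_state {
	size_t read_off;	/* hypothetical: byte offset within the folio */
	size_t read_len;	/* hypothetical: bytes to read, 0 means "unset" */
};

struct mock_folio {
	size_t size;		/* folio size in bytes */
	void *private;		/* models folio->private */
};

/* Record the range the upcoming read should cover. */
static void mock_stash_read_range(struct mock_folio *folio,
				  size_t off, size_t len)
{
	struct mock_folio_state *ifs = folio->private;

	assert(ifs && len > 0 && off + len <= folio->size);
	ifs->read_off = off;
	ifs->read_len = len;
}

/* What a read implementation would consult; defaults to the whole folio. */
static void mock_get_read_range(const struct mock_folio *folio,
				size_t *off, size_t *len)
{
	const struct mock_folio_state *ifs = folio->private;

	if (ifs && ifs->read_len) {
		*off = ifs->read_off;
		*len = ifs->read_len;
	} else {
		*off = 0;
		*len = folio->size;
	}
}

/* Drop the stashed range once the read has been issued. */
static void mock_clear_read_range(struct mock_folio *folio)
{
	struct mock_folio_state *ifs = folio->private;

	if (ifs) {
		ifs->read_off = 0;
		ifs->read_len = 0;
	}
}
```

The set / get / clear sequence only works if nothing else can observe
the state in between, which is the synchronization assumption stated
above.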

But regardless, I think we still need a new variant of read_folio.
If a non-block-io iomap user wants to use iomap_read_folio() /
iomap_readahead() for the granular uptodate parsing logic that's in
there, it'll need to provide a method for reading a partial folio. I
initially wasn't planning to have fuse use iomap_read_folio() /
iomap_readahead(), but I realized there are some cases where fuse will
find them useful, so I'm planning to add that in.
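
Roughly what I have in mind, as a userspace model rather than a
proposed API: the generic code walks per-block uptodate state and asks
a filesystem-supplied callback to read only the ranges that are
missing. All names here (mock_blk_folio, mock_read_range_fn,
mock_read_folio_partial, mock_count_bytes) and the fixed 8 blocks per
folio are assumptions for the sketch, not existing iomap interfaces:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_BLOCKS_PER_FOLIO 8

/* Hypothetical partial-read hook: read [off, off + len) of the folio. */
typedef int (*mock_read_range_fn)(void *ctx, size_t off, size_t len);

struct mock_blk_folio {
	size_t block_size;
	bool uptodate[MOCK_BLOCKS_PER_FOLIO];	/* per-block uptodate state */
};

/*
 * Issue one read per contiguous run of non-uptodate blocks.
 * Returns the number of reads issued, or the callback's non-zero error.
 */
static int mock_read_folio_partial(struct mock_blk_folio *folio,
				   mock_read_range_fn read_range, void *ctx)
{
	size_t i = 0;
	int calls = 0;

	assert(folio && read_range);
	while (i < MOCK_BLOCKS_PER_FOLIO) {
		size_t start, j;
		int err;

		if (folio->uptodate[i]) {
			i++;
			continue;
		}
		/* Extend the run while blocks are still not uptodate. */
		start = i;
		while (i < MOCK_BLOCKS_PER_FOLIO && !folio->uptodate[i])
			i++;

		err = read_range(ctx, start * folio->block_size,
				 (i - start) * folio->block_size);
		if (err)
			return err;
		for (j = start; j < i; j++)
			folio->uptodate[j] = true;
		calls++;
	}
	return calls;
}

/* Example callback: only tallies how many bytes were requested. */
static int mock_count_bytes(void *ctx, size_t off, size_t len)
{
	(void)off;
	*(size_t *)ctx += len;
	return 0;
}
```

For fuse the callback would be where the request to the server gets
sent, so already-uptodate blocks never cost a round trip.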

> variant of read_folio that allows partial reads might still be nicer
> than a iomap_folio_op.  Let me draft that and see if willy or other mm
> folks choke on it :)

writeback_folio() is also a VM-level concept, so by that same logic,
should writeback_folio() also be an address space operation?

A more general question I've been trying to figure out: is the vision
that iomap is going to be the de facto generic library that all/most
filesystems will be using in the future? If so, then it makes sense to
me to add this to the address space operations, but if not, then I
don't see the objection to having the folio callbacks be embedded in
iomap_folio_op.

>
> > For fuse at least, we definitely want granular reads, since reads may
> > be extremely expensive (eg it may be a network fetch) and there's
> > non-trivial memcpy overhead incurred with fuse needing to memcpy read
> > buffer data from userspace back to the kernel.
>
> Ok, with that the plain ->read_folio variant is not going to fly.
>
> > > +               folio_lock(folio);
> > > +               if (unlikely(folio->mapping != inode->i_mapping))
> > > +                       return 1;
> > > +               if (unlikely(!iomap_validate(iter)))
> > > +                       return 1;
> >
> > Does this now basically mean that every caller that uses iomap for
> > writes will have to implement ->iomap_valid and up the sequence
> > counter anytime there's a write or truncate, in case the folio changes
> > during the lock drop? Or were we already supposed to be doing this?
>
> Not any more than before.  It is still optional, but you still
> very much want it to protect against races updating the mapping.
>
Okay, thanks. I think I'll need to add this in for fuse then. I'll
look at this some more.
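
For my own notes, my understanding of the sequence-counter pattern,
written as a userspace model. The names (mock_inode, mock_mapping,
map_seq, cookie and the mock_* helpers) are placeholders and this is
not the actual fuse or iomap code; a real version would also need
proper locking or atomics around the counter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct mock_inode {
	uint64_t map_seq;	/* bumped whenever the mapping may change */
};

struct mock_mapping {
	uint64_t cookie;	/* map_seq sampled when the mapping was built */
};

/* Sample the counter at the point the mapping is handed out. */
static void mock_map_begin(const struct mock_inode *inode,
			   struct mock_mapping *map)
{
	assert(inode && map);
	map->cookie = inode->map_seq;
}

/* Called on anything that can change the mapping, eg write or truncate. */
static void mock_mapping_changed(struct mock_inode *inode)
{
	inode->map_seq++;
}

/* After re-taking the folio lock: is the cached mapping still current? */
static bool mock_map_valid(const struct mock_inode *inode,
			   const struct mock_mapping *map)
{
	return map->cookie == inode->map_seq;
}
```

If the check fails, the caller would drop the stale mapping and look
it up again before touching the folio.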




