Re: [PATCH v2 13/16] iomap: move read/readahead logic out of CONFIG_BLOCK guard

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 11, 2025 at 9:11 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> On 2025/9/12 09:09, Gao Xiang wrote:
> >
> >
> > On 2025/9/12 08:06, Gao Xiang wrote:
> >>
> >>
> >> On 2025/9/12 03:45, Joanne Koong wrote:
> >>> On Thu, Sep 11, 2025 at 8:29 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >>>> But if FUSE or some other fs later needs to request L2P information
> >>>> in their .iomap_begin() and need to send L2P requests to userspace
> >>>> daemon to confirm where to get the physical data (maybe somewhat
> >>>> like Darrick's work but I don't have extra time to dig into that
> >>>> either) rather than just something totally bypass iomap-L2P logic
> >>>> as above, then I'm not sure the current `iomap_iter->private` is
> >>>> quite seperate to `struct iomap_read_folio_ctx->private`, it seems
> >>>
> >>> If in the future this case arises, the L2P mapping info is accessible
> >>> by the read callback in the current design. `.read_folio_range()`
> >>> passes the iomap iter to the filesystem and they can access
> >>> iter->private to get the L2P mapping data they need.
> >>
> >> The question is what exposes to `iter->private` then, take
> >> an example:
> >>
> >> ```
> >> struct file *file;
> >> ```
> >>
> >> your .read_folio_range() needs `file->private_data` to get
> >> `struct fuse_file` so `file` is kept into
> >> `struct iomap_read_folio_ctx`.
> >>
> >> If `file->private_data` will be used for `.iomap_begin()`
> >> as well, what's your proposal then?
> >>
> >> Duplicate the same `file` pointer in both
> >> `struct iomap_read_folio_ctx` and `iter->private` context?
> >
> > It's just an not-so-appropriate example because
> > `struct file *` and `struct fuse_file *` are widely used
> > in the (buffer/direct) read/write flow but Darrick's work
> > doesn't use `file` in .iomap_{begin/end}.
> >
> > But you may find out `file` pointer is already used for
> > both FUSE buffer write and your proposal, e.g.
> >
> > buffer write:
> >   /*
> >    * Use iomap so that we can do granular uptodate reads
> >    * and granular dirty tracking for large folios.
> >    */
> >   written = iomap_file_buffered_write(iocb, from,
> >                                       &fuse_iomap_ops,
> >                                       &fuse_iomap_write_ops,
> >                                       file);
>
> And your buffer write per-fs context seems just use
> `iter->private` entirely instead to keep `file`.
>

I don’t think the iomap buffered writes interface is good to use as a
model. I looked a bit at some of the other iomap file operations and I
think we should just pass operation-specific data through an
operation-specific context for those too, eg for buffered writes and
dio modifying the interface from

ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter
*from, const struct iomap_ops *ops, const struct iomap_write_ops
*write_ops, void *private);
ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, const
struct iomap_ops *ops, const struct iomap_dio_ops *dops, unsigned int
dio_flags, void *private, size_t done_before);

to something like

ssize_t iomap_file_buffered_write(const struct iomap_ops *ops, struct
iomap_write_folio_ctx *ctx);
ssize_t iomap_dio_rw(const struct iomap_ops *ops, struct iomap_dio_ctx *ctx);

There’s one filesystem besides fuse that uses “iter->private” and
that’s for xfs zoned inodes (xfs_zoned_buffered_write_iomap_begin()),
which passes the  struct xfs_zone_alloc_ctx*  through iter->private,
and it's used afaict for tracking block reservations. imo that's what
iter->private should be used for, to track the more high level
metadata stuff and then the lower-level details that are
operation-specific go through the ctx->data fields. That seems the
cleanest design to me. I think we should rename the iter->private
field to something like "iter->metadata" to make that delineation more
clear.  I'm not sure what the iomap maintainers think, but that is my
opinion.

I think if in the future there is a case/feature which needs something
previously in one of the operation-specific ctxes, it seems fine to me
to have both iter->private and ctx->data point to the same thing.


Thanks,
Joanne

> >
> >
> > I just try to say if there is a case/feature which needs
> > something previously in `struct iomap_read_folio_ctx` to
> > be available in .iomap_{begin,end} too, you have to either:
> >   - duplicate this in `iter->private` as well;
> >   - move this to `iter->private` entirely.
> >
> > The problem is that both `iter->private` and
> > `struct iomap_read_folio_ctx` are filesystem-specific,
> > I can only see there is no clear boundary to leave something
> > in which one.  It seems just like an artificial choice.
> >
> > Thanks,
> > Gao Xiang
> >
> >>
> >>
> >>>
> >>>> both needs fs-specific extra contexts for the same I/O flow.
> >>>>
> >>>> I think the reason why `struct iomap_read_folio_ctx->private` is
> >>>> introduced is basically previous iomap filesystems are all
> >>>> bio-based, and they shares `bio` concept in common but
> >>>> `iter->private` was not designed for this usage.
> >>>>
> >>>> But fuse `struct iomap_read_folio_ctx` and
> >>>> `struct fuse_fill_read_data` are too FUSE-specific, I cannot
> >>>> see it could be shared by other filesystems in the near future,
> >>>> which is much like a single-filesystem specific concept, and
> >>>> unlike to `bio` at all.
> >>>
> >>> Currently fuse is the only non-block-based filesystem using iomap but
> >>> I don't see why there wouldn't be more in the future. For example,
> >>> while looking at some of the netfs code, a lot of the core
> >>> functionality looks the same between that and iomap and I think it
> >>> might be a good idea to have netfs in the future use iomap's interface
> >>> so that it can get the large folio dirty/uptodate tracking stuff and
> >>> any other large folio stuff like more granular writeback stats
> >>> accounting for free.
> >>
> >> I think you need to ask David on this idea, I've told him to
> >> switch fscache to use iomap in 2022 before netfs is fully out [1],
> >> but I don't see it will happen.
> >>
> >> [1] https://lore.kernel.org/linux-fsdevel/YfivxC9S52FlyKoL@B-P7TQMD6M-0146/
> >>
> >> Thanks,
> >> Gao Xiang
> >
>





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux