Re: [PATCH v1 13/16] iomap: add a private arg for read and readahead

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2025/9/5 23:21, Darrick J. Wong wrote:
On Fri, Sep 05, 2025 at 10:21:19AM +0800, Gao Xiang wrote:


On 2025/9/5 07:29, Joanne Koong wrote:
On Tue, Sep 2, 2025 at 6:55 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:


...



    int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops,
-             const struct iomap_read_ops *read_ops)
+             const struct iomap_read_ops *read_ops, void *private)
    {
         struct iomap_iter iter = {
                 .inode          = folio->mapping->host,
                 .pos            = folio_pos(folio),
                 .len            = folio_size(folio),
+             .private        = private,
         };

Will this whole work be landed for v6.18?

If not, may I ask if this patch can be shifted advance in this
patchset for applying separately (I tried but no luck).

Because I also need some similar approach for EROFS iomap page
cache sharing feature since EROFS uncompressed I/Os go through
iomap and extra information needs a proper way to pass down to
iomap_{begin,end} with extra pointer `.private` too.

Hi Gao,

I'm not sure whether this will be landed for v6.18 but I'm happy to
shift this patch to the beginning of the patchset for applying
separately.

Yeah, thanks.  At least this common patch can be potentially applied
easily (e.g. form a common commit id for both features if really
needed) since other iomap/FUSE patches are not dependency of our new
feature and shouldn't be coupled with our development branch later.


Hi Gao,

I'll be dropping this patch in v2 since all the iomap read stuff is
going to go through a struct ctx arg instead of through iter->private.
Sorry this won't help your use case, but looking forward to seeing your patches.

Hi Joanne,

Thanks for your reminder.  Okay, I will check your v2 to know how
you change then.

Also, one thing I really think it's helpful for our use cases is
converting .iomap_begin() at least to pass struct iomap_iter *
directly rather than (inode, pos, len, flags, iomap, srcmap)
since:
   - .iomap_{begin,end}() are introduced before iomap_iter()
     and struct iomap_iter but those callbacks are basically
     now passed down some fields of `struct iomap_iter` now;

   - struct iomap_iter->private then can contain a per-request
     context so that .iomap_begin() can leverage too;

   - There are already too many arguments for .iomap_begin(),
     pass down struct iomap_iter directly could avoid adding
     another `private` argument to .iomap_begin()..

Therefore, I do wonder if this change (.iomap_begin() passes
struct iomap_iter *) is a good idea for the iomap folks, in
addition that filesystems can specify `struct iomap_iter->private`
as in this patch.  Since this change is necessary to make our
page cache sharing feature efficient, I will continue working on
this soon.

 From a source code perspective, I like the idea of cleaning up the
function signature to pass fewer things to ->iomap_begin.  I suspect
that we could simplify it to:

	int (*iomap_begin)(const struct iomap_iter *iter,
			   struct iomap *iomap,
			   struct iomap *srcmap);

Hi Darrick,

Thanks for your reply and sorry for my late reply due to another
internal stuff.

It sounds better to me since `const` annonation may have some more
aggressive compiler optimization.


That way we preserve the notion that the ->iomap_begin functions aren't
allowed to change the iterator contents except for the two iomaps.

That said, the nice thing about passing so many parameters is that it
probably leads to less pointer chasing in the implementation functions.
I wonder if that makes any difference because extent mapping lookups
likely involve a lot more pointer chasing anyway.  Another benefit is
that since the parameters aren't const, each implementation can (re)use
those variables if they need to.

Ok, but I'm not sure, in principle, even without `const` annotation,
users can make local variables in their callbacks to avoid extra pointer
chasing.

But anyway, that is not what's my own original intention: we need
another `->private` ctx to pass among all-iomap_begin() because it
holds extra information, I think passing in `const struct iomap_iter *`
is cleaner than adding a new argument.

I will post this after the whole new feature work is finished.


I think you could simplify iomap_end too:

	int (*iomap_end)(const struct iomap_iter *iter,
			 loff_t pos, u64 length,
			 size_t written);

and make ->iomap_end implementations extract iter->flags and iter->iomap
themselves if they want to.  I don't like how xfs_iomap.c abuses
container_of to extract the iter from the iomap pointer.

Yeah, make sense.


(But not enough to have written patches fixing any of this. :P)

Another thing I want to discuss (but it's less important for our
recent features) is the whole callback hook model of iomap.

Basically the current model does mean if any filesystem doesn't
fulfill the iomap standard flow, it has to add some customized
callback hook somewhere to modify the code flow then (or introduce
a new special flag and move their specific logic into iomap/
itself even other fses may not need this), but the hook way will
cause increased indirect calls for them, currently we have
`struct iomap_ops`, `struct iomap_writeback_ops` and
`struct iomap_dio_ops`, if some another filesystem (when converting
buffer I/Os for example or adding {pre,post}-processing ) have
specified timing, it needs to add new hooks then.

I do wonder if it's possible to convert iomap to get rid of the
indirect-call model by just providing helper kAPIs instead,
take .read_folio / .fiemap for example e.g.

    xxxfs_read_folio:
       loop iomap_iter
         xxxfs_iomap_begin();
	iomap_readpage_bio_advance(); [ or if a fs is non-bio
              based, spliting more low-level helpers for them. ]
         xxxfs_iomap_end();

    xxxfs_fiemap():
       iomap_fiemap_begin
       loop iomap_iter
         xxxfs_iomap_begin();
         iomap_readpage_fiemap_advance()
         xxxfs_iomap_end();
       iomap_fiemap_end
So that each fs can use those helpers flexibly instead of diging
into adding various new indirect call hooks or moving customized
logic into iomap/ itself.

Yes, it's quite possible to push the iomap iteration control down into
the filesystems to avoid the indirect calls.  That might make things
faster, though I have no idea what sort of performance impact that will
have.

It's not from the performance impact perspective, but for the
flexibility.  Adding new hooks everywhere doesn't smell good
at least on my side (because iomap tends to cover the whole I/O
lifetime, which is by design different from page cache callbacks
for example).


I don't have a specific example  because currently we don't have
direct issue against standard iomap flow on our uncompressed
path, but after a quick glance of other potential users who try
to convert their buffer I/Os to iomap, I had such impression in
my head for a while.

OTOH making it easier for non-disk filesystems to use iomap but supply
their own IO mechanism (transformed bios, memcpy, etc) makes a far more
compelling argument for doing this painful(?) treewide change IMO.

Yeah, I can see it's rather a painful treewide change, but if more
hooks are added, it will be more painful than now.

Anyway, it's just my random thought but it doesn't have real impact
to our work and much less important to me.

Thanks,
Gao Xiang


--D

Thanks,
Gao Xiang



Thanks,
Joanne

Thanks,
Gao Xiang


Thanks,
Joanne

Thanks,
Gao Xiang








[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux