Re: [PATCH v1 4/8] iomap: add writepages support for IOMAP_IN_MEM iomaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jun 8, 2025 at 10:32 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jun 06, 2025 at 04:37:59PM -0700, Joanne Koong wrote:
> > This allows IOMAP_IN_MEM iomaps to use iomap_writepages() for handling
> > writeback. This lets IOMAP_IN_MEM iomaps use some of the internal
> > features in iomaps such as granular dirty tracking for large folios.
> >
> > This introduces a new iomap_writeback_ops callback, writeback_folio(),
> > callers may pass in which hands off folio writeback logic to the caller
> > for writing back dirty pages instead of relying on mapping blocks.
> >
> > This exposes two apis, iomap_start_folio_write() and
> > iomap_finish_folio_write(), which callers may find useful in their
> > writeback_folio() callback implementation.
>
> It might also be worth stating what you don't use.  One big thing
> that springs to mind is ioends.  Which are really useful if you
> need more than one request to handle a folio, something that is
> pretty common in network file systems.  I guess you don't need
> that for fuse?

ioends are used in fuse for cleaning up state. fuse implements a
->submit_ioend() callback in fuse_iomap_submit_ioend() (in patch 7/8).

For fuse, there can be multiple requests handling a folio. I'm using
the ifs->write_bytes_pending to keep track of how many requests for
the folio there are, so that we know when to end writeback after the
last request on that folio has completed.

>
> > +     if (wpc->iomap.type == IOMAP_IN_MEM) {
> > +             if (wpc->ops->submit_ioend)
> > +                     error = wpc->ops->submit_ioend(wpc, error);
> > +             return error;
> > +     }
>
> Given that the patch that moved things around already wrapped the
> error propagation to the bio into a helpr, how does this differ
> from the main path in the function now?
>

If we don't add this special casing for IOMAP_IN_MEM here, then in
this function it'll hit the "if (!wpc->ioend)" case right in the
beginning and return error without calling the ->submit_ioend()
callback. For non IOMAP_IN_MEM types, ->submit_ioend() should only get
called if wpc->ioend is set.

> > +     /*
> > +      * If error is non-zero, it means that we have a situation where some part of
> > +      * the submission process has failed after we've marked pages for writeback.
> > +      * We cannot cancel ioend directly in that case, so call the bio end I/O handler
> > +      * with the error status here to run the normal I/O completion handler to clear
> > +      * the writeback bit and let the file system process the errors.
> > +      */
>
> Please add the comment in a separate preparation patch.

This isn't a new comment, it's just moved from the function
description to here. I'll put this comment move into the patch that
moves all the bio stuff.
>
> > +             if (wpc->ops->writeback_folio) {
> > +                     WARN_ON_ONCE(wpc->ops->map_blocks);
> > +                     error = wpc->ops->writeback_folio(wpc, folio, inode,
> > +                                                       offset_in_folio(folio, pos),
> > +                                                       rlen);
> > +             } else {
> > +                     WARN_ON_ONCE(wpc->iomap.type == IOMAP_IN_MEM);
> > +                     error = iomap_writepage_map_blocks(wpc, wbc, folio,
> > +                                                        inode, pos, end_pos,
> > +                                                        rlen, &count);
> > +             }
>
> So instead of having two entirely different methods, can we
> refactor the block based code to also use
> ->writeback_folio?
>
> Basically move all of the code inside the do { } while loop after
> the call into ->map_blocks into a helper, and then let the caller
> loop and also directly discard the folio if needed.  I can give that
> a spin if you want.
>

Sounds great, I like this idea a lot. I'll make this change for v2.

> Note that writeback_folio is misnamed, as it doesn't write back an
> entire folio, but just a dirty range.

I'll rename this to writeback_folio_dirty_range().

>
> >       } else {
> > -             if (!count)
> > +             /*
> > +              * If wpc->ops->writeback_folio is set, then it is responsible
> > +              * for ending the writeback itself.
> > +              */
> > +             if (!count && !wpc->ops->writeback_folio)
> >                       folio_end_writeback(folio);
>
> This fails to explain why writeback_folio does the unlocking itself.

writeback_folio needs to do the unlocking itself because the writeback
may be done asynchronously (as in the case for fuse). I'll add that as
a comment in v2.

> I also don't see how that would work in case of multiple dirty ranges.

For multiple dirty ranges in the same folio, the caller uses
ifs->writes_bytes_pending (through the
iomap_{start/finish}_folio_write() apis) to track when writeback is
finally finished on the folio and folio_end_write() may be called.

>
> >       }
> >       mapping_set_error(inode->i_mapping, error);
> > @@ -1693,3 +1713,25 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
> >       return iomap_submit_ioend(wpc, error);
> >  }
> >  EXPORT_SYMBOL_GPL(iomap_writepages);
> > +
> > +void iomap_start_folio_write(struct inode *inode, struct folio *folio, size_t len)
> > +{
> > +     struct iomap_folio_state *ifs = folio->private;
> > +
> > +     WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
> > +     if (ifs)
> > +             atomic_add(len, &ifs->write_bytes_pending);
> > +}
> > +EXPORT_SYMBOL_GPL(iomap_start_folio_write);
> > +
> > +void iomap_finish_folio_write(struct inode *inode, struct folio *folio, size_t len)
> > +{
> > +     struct iomap_folio_state *ifs = folio->private;
> > +
> > +     WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
> > +     WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) <= 0);
> > +
> > +     if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
> > +             folio_end_writeback(folio);
>
> Please also use these helpers in the block based code.

Will do. Thanks for your reviews on this.
>





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux