On Sun, Jun 8, 2025 at 10:32 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jun 06, 2025 at 04:37:59PM -0700, Joanne Koong wrote:
> > This allows IOMAP_IN_MEM iomaps to use iomap_writepages() for handling
> > writeback. This lets IOMAP_IN_MEM iomaps use some of the internal
> > features in iomaps such as granular dirty tracking for large folios.
> >
> > This introduces a new iomap_writeback_ops callback, writeback_folio(),
> > which callers may pass in to hand off folio writeback logic to the
> > caller for writing back dirty pages instead of relying on mapping
> > blocks.
> >
> > This exposes two APIs, iomap_start_folio_write() and
> > iomap_finish_folio_write(), which callers may find useful in their
> > writeback_folio() callback implementation.
>
> It might also be worth stating what you don't use. One big thing
> that springs to mind is ioends. Which are really useful if you
> need more than one request to handle a folio, something that is
> pretty common in network file systems. I guess you don't need
> that for fuse?

ioends are used in fuse for cleaning up state. fuse implements a
->submit_ioend() callback in fuse_iomap_submit_ioend() (in patch 7/8).

For fuse, there can be multiple requests handling a folio. I'm using
ifs->write_bytes_pending to keep track of how many requests there are
for the folio, so that we know when to end writeback after the last
request on that folio has completed.

> > +	if (wpc->iomap.type == IOMAP_IN_MEM) {
> > +		if (wpc->ops->submit_ioend)
> > +			error = wpc->ops->submit_ioend(wpc, error);
> > +		return error;
> > +	}
>
> Given that the patch that moved things around already wrapped the
> error propagation to the bio into a helper, how does this differ
> from the main path in the function now?

If we don't add this special casing for IOMAP_IN_MEM here, then this
function will hit the "if (!wpc->ioend)" case right at the beginning and
return error without calling the ->submit_ioend() callback.
For non-IOMAP_IN_MEM types, ->submit_ioend() should only get called if
wpc->ioend is set.

> > +	/*
> > +	 * If error is non-zero, it means that we have a situation where some part of
> > +	 * the submission process has failed after we've marked pages for writeback.
> > +	 * We cannot cancel ioend directly in that case, so call the bio end I/O handler
> > +	 * with the error status here to run the normal I/O completion handler to clear
> > +	 * the writeback bit and let the file system process the errors.
> > +	 */
>
> Please add the comment in a separate preparation patch.

This isn't a new comment, it's just moved from the function description
to here. I'll put this comment move into the patch that moves all the
bio stuff.

> > +	if (wpc->ops->writeback_folio) {
> > +		WARN_ON_ONCE(wpc->ops->map_blocks);
> > +		error = wpc->ops->writeback_folio(wpc, folio, inode,
> > +						  offset_in_folio(folio, pos),
> > +						  rlen);
> > +	} else {
> > +		WARN_ON_ONCE(wpc->iomap.type == IOMAP_IN_MEM);
> > +		error = iomap_writepage_map_blocks(wpc, wbc, folio,
> > +						   inode, pos, end_pos,
> > +						   rlen, &count);
> > +	}
>
> So instead of having two entirely different methods, can we
> refactor the block based code to also use ->writeback_folio?
>
> Basically move all of the code inside the do { } while loop after
> the call into ->map_blocks into a helper, and then let the caller
> loop and also directly discard the folio if needed. I can give that
> a spin if you want.

Sounds great, I like this idea a lot. I'll make this change for v2.

> Note that writeback_folio is misnamed, as it doesn't write back an
> entire folio, but just a dirty range.

I'll rename this to writeback_folio_dirty_range().

> >  	} else {
> > -		if (!count)
> > +		/*
> > +		 * If wpc->ops->writeback_folio is set, then it is responsible
> > +		 * for ending the writeback itself.
> > +		 */
> > +		if (!count && !wpc->ops->writeback_folio)
> >  			folio_end_writeback(folio);
>
> This fails to explain why writeback_folio does the unlocking itself.

writeback_folio needs to do the unlocking itself because the writeback
may be done asynchronously (as is the case for fuse). I'll add that as a
comment in v2.

> I also don't see how that would work in case of multiple dirty ranges.

For multiple dirty ranges in the same folio, the caller uses
ifs->write_bytes_pending (through the iomap_{start/finish}_folio_write()
APIs) to track when writeback is finally finished on the folio and
folio_end_writeback() may be called.

> >  	}
> >  	mapping_set_error(inode->i_mapping, error);
> > @@ -1693,3 +1713,25 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
> >  	return iomap_submit_ioend(wpc, error);
> >  }
> >  EXPORT_SYMBOL_GPL(iomap_writepages);
> > +
> > +void iomap_start_folio_write(struct inode *inode, struct folio *folio, size_t len)
> > +{
> > +	struct iomap_folio_state *ifs = folio->private;
> > +
> > +	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
> > +	if (ifs)
> > +		atomic_add(len, &ifs->write_bytes_pending);
> > +}
> > +EXPORT_SYMBOL_GPL(iomap_start_folio_write);
> > +
> > +void iomap_finish_folio_write(struct inode *inode, struct folio *folio, size_t len)
> > +{
> > +	struct iomap_folio_state *ifs = folio->private;
> > +
> > +	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
> > +	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) <= 0);
> > +
> > +	if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
> > +		folio_end_writeback(folio);
>
> Please also use these helpers in the block based code.

Will do. Thanks for your reviews on this.