On Tue, Sep 16, 2025 at 4:50 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> Non-block-based filesystems will be using iomap read/readahead. If they
> handle reading in ranges asynchronously and fulfill those read requests
> on an ongoing basis (instead of all together at the end), then there is
> the possibility that the read on the folio may be prematurely ended if
> earlier async requests complete before the later ones have been issued.
>
> For example if there is a large folio and a readahead request for 16
> pages in that folio, if doing readahead on those 16 pages is split into
> 4 async requests and the first request is sent off and then completed
> before we have sent off the second request, then when the first request
> calls iomap_finish_folio_read(), ifs->read_bytes_pending would be 0,
> which would end the read and unlock the folio prematurely.
>
> To mitigate this, a "bias" is added to ifs->read_bytes_pending before
> the first range is forwarded to the caller and removed after the last
> range has been forwarded.
>
> iomap writeback does this with their async requests as well to prevent
> prematurely ending writeback.
>
> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> ---
>  fs/iomap/buffered-io.c | 55 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 47 insertions(+), 8 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 561378f2b9bb..667a49cb5ae5 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -420,6 +420,38 @@ const struct iomap_read_ops iomap_bio_read_ops = {
>  };
>  EXPORT_SYMBOL_GPL(iomap_bio_read_ops);
>
> +/*
> + * Add a bias to ifs->read_bytes_pending to prevent the read on the folio from
> + * being ended prematurely.
> + *
> + * Otherwise, if the ranges are read asynchronously and read requests are
> + * fulfilled on an ongoing basis, there is the possibility that the read on the
> + * folio may be prematurely ended if earlier async requests complete before the
> + * later ones have been issued.
> + */
> +static void iomap_read_add_bias(struct folio *folio)
> +{
> +	iomap_start_folio_read(folio, 1);
> +}
> +
> +static void iomap_read_remove_bias(struct folio *folio, bool *cur_folio_owned)
> +{
> +	struct iomap_folio_state *ifs = folio->private;
> +	bool finished, uptodate;
> +
> +	if (ifs) {
> +		spin_lock_irq(&ifs->state_lock);
> +		ifs->read_bytes_pending -= 1;
> +		finished = !ifs->read_bytes_pending;
> +		if (finished)
> +			uptodate = ifs_is_fully_uptodate(folio, ifs);
> +		spin_unlock_irq(&ifs->state_lock);
> +		if (finished)
> +			folio_end_read(folio, uptodate);
> +		*cur_folio_owned = true;
> +	}
> +}
> +
>  static int iomap_read_folio_iter(struct iomap_iter *iter,
>  		struct iomap_read_folio_ctx *ctx, bool *cur_folio_owned)
>  {
> @@ -429,7 +461,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>  	struct folio *folio = ctx->cur_folio;
>  	size_t poff, plen;
>  	loff_t delta;
> -	int ret;
> +	int ret = 0;
>
>  	if (iomap->type == IOMAP_INLINE) {
>  		ret = iomap_read_inline_data(iter, folio);
> @@ -441,6 +473,8 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>  	/* zero post-eof blocks as the page may be mapped */
>  	ifs_alloc(iter->inode, folio, iter->flags);
>
> +	iomap_read_add_bias(folio);

Same here, it's not guaranteed that the whole folio is parsed here
because the current iomap mapping may only have part of the folio
mapped. The bias needs to be added before the first iomap_iter() call
and removed after all iomap_iter() calls are complete. I'll make this
change for v4.

> +
>  	length = min_t(loff_t, length,
>  			folio_size(folio) - offset_in_folio(folio, pos));
>  	while (length) {
> @@ -448,16 +482,18 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>  				&plen);
>
>  		delta = pos - iter->pos;
> -		if (WARN_ON_ONCE(delta + plen > length))
> -			return -EIO;
> +		if (WARN_ON_ONCE(delta + plen > length)) {
> +			ret = -EIO;
> +			break;
> +		}
>  		length -= delta + plen;
>
>  		ret = iomap_iter_advance(iter, &delta);
>  		if (ret)
> -			return ret;
> +			break;
>
>  		if (plen == 0)
> -			return 0;
> +			break;
>
>  		if (iomap_block_needs_zeroing(iter, pos)) {
>  			folio_zero_range(folio, poff, plen);
> @@ -466,16 +502,19 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>  			*cur_folio_owned = true;
>  			ret = ctx->ops->read_folio_range(iter, ctx, plen);
>  			if (ret)
> -				return ret;
> +				break;
>  		}
>
>  		delta = plen;
>  		ret = iomap_iter_advance(iter, &delta);
>  		if (ret)
> -			return ret;
> +			break;
>  		pos = iter->pos;
>  	}
> -	return 0;
> +
> +	iomap_read_remove_bias(folio, cur_folio_owned);
> +
> +	return ret;
>  }
>
>  int iomap_read_folio(const struct iomap_ops *ops,
> --
> 2.47.3
>
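
For reference, a userspace sketch of why the bias has to span every
iomap_iter() call for the folio rather than a single
iomap_read_folio_iter() call. This is a toy model, not iomap code and
not the v4 patch: the toy_* names, the bias_mode enum, the layout of
two mappings with two ranges each, and the 4096-byte range size are all
assumptions made up for illustration. It models the worst case where
every async range completes before the next one is issued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: where the bias is taken relative to the mapping loop. */
enum bias_mode { BIAS_NONE, BIAS_PER_MAPPING, BIAS_PER_FOLIO };

/* Hypothetical stand-in for a folio plus its iomap_folio_state. */
struct toy_folio {
	size_t read_bytes_pending;	/* mirrors ifs->read_bytes_pending */
	size_t bytes_done;		/* data bytes completed so far */
	size_t done_at_first_end;	/* bytes_done when the read first ended */
	int end_read_calls;		/* times the read was ended */
};

/* Account for len more bytes in flight. */
static void toy_start_read(struct toy_folio *f, size_t len)
{
	f->read_bytes_pending += len;
}

/* Drop len bytes; end the read once nothing is pending. */
static void toy_finish_read(struct toy_folio *f, size_t len)
{
	assert(f->read_bytes_pending >= len);
	f->read_bytes_pending -= len;
	if (f->read_bytes_pending == 0) {
		if (f->end_read_calls == 0)
			f->done_at_first_end = f->bytes_done;
		f->end_read_calls++;
	}
}

/* Issue one range that completes before the next one is issued. */
static void toy_read_range(struct toy_folio *f, size_t len)
{
	toy_start_read(f, len);
	f->bytes_done += len;
	toy_finish_read(f, len);
}

/* Read a folio covered by 2 mappings, each split into 2 ranges. */
static struct toy_folio toy_read_folio(enum bias_mode mode)
{
	struct toy_folio f = { 0 };
	int m, r;

	if (mode == BIAS_PER_FOLIO)
		toy_start_read(&f, 1);		/* bias before first mapping */
	for (m = 0; m < 2; m++) {
		if (mode == BIAS_PER_MAPPING)
			toy_start_read(&f, 1);	/* bias inside one mapping */
		for (r = 0; r < 2; r++)
			toy_read_range(&f, 4096);
		if (mode == BIAS_PER_MAPPING)
			toy_finish_read(&f, 1);
	}
	if (mode == BIAS_PER_FOLIO)
		toy_finish_read(&f, 1);		/* drop bias after last mapping */
	return f;
}
```

In this model, BIAS_PER_MAPPING still ends the read early: the count
hits zero once the first mapping's bias is dropped, when only half of
the folio's bytes have completed. BIAS_PER_FOLIO ends the read exactly
once, after all bytes are done.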