On Wed, Sep 3, 2025 at 5:14 AM Brian Foster <bfoster@xxxxxxxxxx> wrote: > > On Tue, Sep 02, 2025 at 02:11:35PM -0700, Joanne Koong wrote: > > On Tue, Sep 2, 2025 at 8:04 AM Brian Foster <bfoster@xxxxxxxxxx> wrote: > > > > > > An iomap op iteration should not be considered successful if > > > ->iomap_end() fails. Most ->iomap_end() callbacks do not return > > > errors, and for those that do we return the error to the caller, but > > > this is still not sufficient in some corner cases. > > > > > > For example, if a DAX write to a shared iomap fails at ->iomap_end() > > > on XFS, this means the remap of shared blocks from the COW fork to > > > the data fork has possibly failed. In turn this means that just > > > written data may not be accessible in the file. dax_iomap_rw() > > > returns partial success over a returned error code and the operation > > > has already advanced iter.pos by the time ->iomap_end() is called. > > > This means that dax_iomap_rw() can return more bytes processed than > > > have been completed successfully, including partial success instead > > > of an error code if the first iteration happens to fail. > > > > > > To address this problem, first tweak the ->iomap_end() error > > > handling logic to run regardless of whether the current iteration > > > advanced the iter. Next, revert pos in the error handling path. Add > > > a new helper to undo the changes from iomap_iter_advance(). It is > > > static to start since the only initial user is in iomap_iter.c. > > > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > > --- > > > fs/iomap/iter.c | 20 +++++++++++++++++++- > > > 1 file changed, 19 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c > > > index 7cc4599b9c9b..69c993fe51fa 100644 > > > --- a/fs/iomap/iter.c > > > +++ b/fs/iomap/iter.c > > > @@ -27,6 +27,22 @@ int iomap_iter_advance(struct iomap_iter *iter, u64 *count) > > > return 0; > > > } > > > > > > +/** > > > + * iomap_iter_revert - revert the iterator position > > > + * @iter: iteration structure > > > + * @count: number of bytes to revert > > > + * > > > + * Revert the iterator position by the specified number of bytes, undoing > > > + * the effect of a previous iomap_iter_advance() call. The count must not > > > + * exceed the amount previously advanced in the current iter. > > > + */ > > > +static void iomap_iter_revert(struct iomap_iter *iter, u64 count) > > > +{ > > > + count = min_t(u64, iter->pos - iter->iter_start_pos, count); > > > + iter->pos -= count; > > > + iter->len += count; > > > +} > > > + > > > static inline void iomap_iter_done(struct iomap_iter *iter) > > > { > > > WARN_ON_ONCE(iter->iomap.offset > iter->pos); > > > @@ -80,8 +96,10 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops) > > > iomap_length_trim(iter, iter->iter_start_pos, > > > olen), > > > advanced, iter->flags, &iter->iomap); > > > - if (ret < 0 && !advanced && !iter->status) > > > + if (ret < 0 && !iter->status) { > > > + iomap_iter_revert(iter, advanced); > > > return ret; > > > + } > > > > Should iomap_iter_revert() also be called in the "if (iter->status < > > 0)" case a few lines below? I think otherwise, that leads to the same > > problem in dax_iomap_rw() you pointed out in the commit message. > > > > My thinking was that I wanted to try for the invariant that the > operation/iteration is responsible to set the iter appropriately in the > event that it returns an error in iter.status. I.e., either not advance > or revert if appropriate. > > This is more consistent with how the iter is advanced and I suspect will > help prevent potential whack a mole issues with inconsistent > expectations for error handling at the iomap_iter() level. I actually > had iomap_iter_revert() non-static originally, but changed it since I > didn't spot anywhere it needed to be called as of yet. I could have > certainly missed something though. Did you have a particular sequence in > mind, or were just thinking in general? Thanks for explaining your thought process. That reasoning makes sense to me. Originally I thought the dax_iomap_rw() sequence needed a iomap_iter_revert() but looking at it again, I'm realizing now that that function is intended to return successfully even if the writes in further iterations fail. Thanks, Joanne > > FWIW, I suspect there's a reasonable argument for doing the same for > ->iomap_end() and make the callback responsible for reverting if > necessary. I went the way in this patch just because it seemed more > simple given the limited scope, but that may not always be the case > and/or may just be cleaner. I can take a closer look at that if there > are stronger opinions..? Thanks for the feedback. > > > returns partial success over a returned error code and the operation > > > has already advanced iter.pos by the time ->iomap_end() is called. > > > This means that dax_iomap_rw() can return more bytes processed than > > > have been completed successfully, including partial success instead > > > of an error code if the first iteration happens to fail. > > Brian > > > Thanks, > > Joanne > > > } > > > > > > /* detect old return semantics where this would advance */ > > > -- > > > 2.51.0 > > > > > > > > >