Re: [PATCH] iomap: don't lose folio dropbehind state for overwrites

On Tue, May 27, 2025 at 09:43:42AM -0600, Jens Axboe wrote:
> DONTCACHE I/O must have the completion punted to a workqueue, just like
> what is done for unwritten extents, as the completion needs task context
> to perform the invalidation of the folio(s). However, if writeback is
> started off filemap_fdatawrite_range() off generic_sync() and it's an
> overwrite, then the DONTCACHE marking gets lost as iomap_add_to_ioend()
> doesn't look at the folio being added and no further state is passed down
> to help it know that this is a dropbehind/DONTCACHE write.
> 
> Check if the folio being added is marked as dropbehind, and set
> IOMAP_IOEND_DONTCACHE if that is the case. Then XFS can factor this into
> the decision making of completion context in xfs_submit_ioend().
> Additionally include this ioend flag in the NOMERGE flags, to avoid
> mixing it with unrelated IO.
> 
> This fixes extra page cache being instantiated when the write performed
> is an overwrite, rather than newly instantiated blocks.
> 
> Fixes: b2cd5ae693a3 ("iomap: make buffered writes work with RWF_DONTCACHE")
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> 
> ---
> 
> Found this one while testing the unrelated issue of invalidation being a
> bit broken before 6.15 release. We need this to ensure that overwrites
> also prune correctly, just like unwritten extents currently do.

I wondered about the stack traces showing DONTCACHE writeback
completion being handled from irq context[*] when I read the -fsdevel
thread about broken DONTCACHE functionality yesterday.

[*] second trace in the failure reported in this comment:

https://lore.kernel.org/linux-fsdevel/432302ad-aa95-44f4-8728-77e61cc1f20c@xxxxxxxxx/

> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 233abf598f65..3729391a18f3 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1691,6 +1691,8 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>  		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
>  	if (wpc->iomap.flags & IOMAP_F_SHARED)
>  		ioend_flags |= IOMAP_IOEND_SHARED;
> +	if (folio_test_dropbehind(folio))
> +		ioend_flags |= IOMAP_IOEND_DONTCACHE;
>  	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
>  		ioend_flags |= IOMAP_IOEND_BOUNDARY;
>  
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 26a04a783489..1b7a006402ea 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -436,6 +436,9 @@ xfs_map_blocks(
>  	return 0;
>  }
>  
> +#define IOEND_WQ_FLAGS	(IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED | \
> +			 IOMAP_IOEND_DONTCACHE)
> +
>  static int
>  xfs_submit_ioend(
>  	struct iomap_writepage_ctx *wpc,
> @@ -460,8 +463,7 @@ xfs_submit_ioend(
>  	memalloc_nofs_restore(nofs_flag);
>  
>  	/* send ioends that might require a transaction to the completion wq */
> -	if (xfs_ioend_is_append(ioend) ||
> -	    (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED)))
> +	if (xfs_ioend_is_append(ioend) || ioend->io_flags & IOEND_WQ_FLAGS)
>  		ioend->io_bio.bi_end_io = xfs_end_bio;
>  
>  	if (status)

IMO, this would be cleaner as a helper so that individual cases can
be commented correctly, as page cache invalidation does not actually
require a transaction...

Something like:

static bool
xfs_ioend_needs_wq_completion(
	struct iomap_ioend	*ioend)
{
	/* Changing inode size requires a transaction. */
	if (xfs_ioend_is_append(ioend))
		return true;

	/* Extent manipulation requires a transaction. */
	if (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED))
		return true;

	/* Page cache invalidation cannot be done in irq context. */
	if (ioend->io_flags & IOMAP_IOEND_DONTCACHE)
		return true;

	return false;
}
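
To show how the three cases map onto a completion context, here is a minimal userspace sketch of the same decision. It is not kernel code: the flag values, struct mock_ioend, the extends_isize field (standing in for xfs_ioend_is_append()) and the mock_* function names are all assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag bits; the values are illustrative, not the kernel's. */
#define IOMAP_IOEND_SHARED	(1U << 0)
#define IOMAP_IOEND_UNWRITTEN	(1U << 1)
#define IOMAP_IOEND_DONTCACHE	(1U << 2)

/* Hypothetical mock holding only what the decision needs. */
struct mock_ioend {
	uint32_t	io_flags;
	bool		extends_isize;	/* stands in for xfs_ioend_is_append() */
};

enum completion_ctx {
	COMPLETE_IN_IRQ,	/* default bio completion, no task context */
	COMPLETE_IN_WQ,		/* punted to a workqueue, task context */
};

static bool
mock_ioend_needs_wq_completion(
	const struct mock_ioend	*ioend)
{
	/* Changing inode size requires a transaction. */
	if (ioend->extends_isize)
		return true;

	/* Extent manipulation requires a transaction. */
	if (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED))
		return true;

	/* Page cache invalidation cannot be done in irq context. */
	if (ioend->io_flags & IOMAP_IOEND_DONTCACHE)
		return true;

	return false;
}

/* Mirrors the single test the submit path would be left with. */
static enum completion_ctx
mock_pick_completion(
	const struct mock_ioend	*ioend)
{
	if (mock_ioend_needs_wq_completion(ioend))
		return COMPLETE_IN_WQ;
	return COMPLETE_IN_IRQ;
}
```

In the real xfs_submit_ioend() the equivalent would presumably be a single "if (xfs_ioend_needs_wq_completion(ioend))" guarding the bi_end_io assignment, with a plain DONTCACHE overwrite now taking the workqueue path.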

Otherwise seems fine.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



