Re: [PATCH] iomap: don't lose folio dropbehind state for overwrites

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/27/25 3:12 PM, Dave Chinner wrote:
> On Tue, May 27, 2025 at 09:43:42AM -0600, Jens Axboe wrote:
>> DONTCACHE I/O must have the completion punted to a workqueue, just like
>> what is done for unwritten extents, as the completion needs task context
>> to perform the invalidation of the folio(s). However, if writeback is
>> started off filemap_fdatawrite_range() off generic_sync() and it's an
>> overwrite, then the DONTCACHE marking gets lost as iomap_add_to_ioend()
>> don't look at the folio being added and no further state is passed down
>> to help it know that this is a dropbehind/DONTCACHE write.
>>
>> Check if the folio being added is marked as dropbehind, and set
>> IOMAP_IOEND_DONTCACHE if that is the case. Then XFS can factor this into
>> the decision making of completion context in xfs_submit_ioend().
>> Additionally include this ioend flag in the NOMERGE flags, to avoid
>> mixing it with unrelated IO.
>>
>> This fixes extra page cache being instantiated when the write performed
>> is an overwrite, rather than newly instantiated blocks.
>>
>> Fixes: b2cd5ae693a3 ("iomap: make buffered writes work with RWF_DONTCACHE")
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>
>> ---
>>
>> Found this one while testing the unrelated issue of invalidation being a
>> bit broken before 6.15 release. We need this to ensure that overwrites
>> also prune correctly, just like unwritten extents currently do.
> 
> I wondered about the stack traces showing DONTCACHE writeback
> completion being handled from irq context[*] when I read the -fsdevel
> thread about broken DONTCACHE functionality yesterday.
> 
> [*] second trace in the failure reported in this comment:
>
> https://lore.kernel.org/linux-fsdevel/432302ad-aa95-44f4-8728-77e61cc1f20c@xxxxxxxxx/

Indeed, though that could've been a "normal" write and not a DONTCACHE
one. But with the bug being fixed by this one, both would've gone that
path...

 
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index 233abf598f65..3729391a18f3 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -1691,6 +1691,8 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>>  		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
>>  	if (wpc->iomap.flags & IOMAP_F_SHARED)
>>  		ioend_flags |= IOMAP_IOEND_SHARED;
>> +	if (folio_test_dropbehind(folio))
>> +		ioend_flags |= IOMAP_IOEND_DONTCACHE;
>>  	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
>>  		ioend_flags |= IOMAP_IOEND_BOUNDARY;
>>  
>> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
>> index 26a04a783489..1b7a006402ea 100644
>> --- a/fs/xfs/xfs_aops.c
>> +++ b/fs/xfs/xfs_aops.c
>> @@ -436,6 +436,9 @@ xfs_map_blocks(
>>  	return 0;
>>  }
>>  
>> +#define IOEND_WQ_FLAGS	(IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED | \
>> +			 IOMAP_IOEND_DONTCACHE)
>> +
>>  static int
>>  xfs_submit_ioend(
>>  	struct iomap_writepage_ctx *wpc,
>> @@ -460,8 +463,7 @@ xfs_submit_ioend(
>>  	memalloc_nofs_restore(nofs_flag);
>>  
>>  	/* send ioends that might require a transaction to the completion wq */
>> -	if (xfs_ioend_is_append(ioend) ||
>> -	    (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED)))
>> +	if (xfs_ioend_is_append(ioend) || ioend->io_flags & IOEND_WQ_FLAGS)
>>  		ioend->io_bio.bi_end_io = xfs_end_bio;
>>  
>>  	if (status)
> 
> IMO, this would be cleaner as a helper so that individual cases can
> be commented correctly, as page cache invalidation does not actually
> require a transaction...
> 
> Something like:
> 
> static bool
> xfs_ioend_needs_wq_completion(
> 	struct xfs_ioend	*ioend)
> {
> 	/* Changing inode size requires a transaction. */
> 	if (xfs_ioend_is_append(ioend))
> 		return true;
> 
> 	/* Extent manipulation requires a transaction. */
> 	if (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED))
> 		return true;
> 
> 	/* Page cache invalidation cannot be done in irq context. */
> 	if (ioend->io_flags & IOMAP_IOEND_DONTCACHE)
> 		return true;
> 
> 	return false;
> }
> 
> Otherwise seems fine.

Yeah I like that, gets rid of the need to add the mask as well. I'll
spin a v2 and add the helper.

-- 
Jens Axboe




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux