Re: [PATCH v6 10/13] xfs: iomap COW-based atomic write support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18/03/2025 08:32, Christoph Hellwig wrote:
On Tue, Mar 18, 2025 at 08:22:53AM +0000, John Garry wrote:
Is xfs_reflink_allocate_cow even the right helper to use?  We know we
absolutely want a a COW fork extent, we know there can't be delalloc
extent to convert as we flushed dirty data, so most of the logic in it
is pointless.

Well xfs_reflink_allocate_cow gives us what we want when we set some flag
(XFS_REFLINK_FORCE_COW).

Are you hinting at a dedicated helper? Note that
xfs_reflink_fill_cow_hole() also handles the XFS_REFLINK_FORCE_COW flag.

We might not even need much of a helper.  This all comes from my
experience with the zoned code that always writes out of place as well.
I initially tried to reuse the existing iomap_begin methods and all
these helpers, but in the end it turned out to much, much simpler to
just open code the logic.  Now the atomic cow code will be a little
more complex in some aspect (unwritten extents, speculative
preallocations), but also simpler in others (no need to support buffered
I/O including zeroing and sub-block writes that require the ugly
srcmap based read-modify-write), but I suspect the overall trade-off
will be similar.

After all the high-level logic for the atomic COW iomap_begin really
should just be:

  - take the ilock
  - check the COW fork if there is an extent for the start of the range.
  - If yes:
      - if the extent is unwritten, convert the part overlapping with
        the range to written

I was wondering when it could be written, and then I figured it could be if we had this sort of scenario:
- initially we have a mixed of shared and unshared file, like:

mnt/file:
EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
  0: [0..7]:          192..199          0 (192..199)           8 100000
  1: [8..15]:         32840..32847      0 (32840..32847)       8 000000
  2: [16..20479]:     208..20671        0 (208..20671)     20464 100000
FLAG Values:
   0100000 Shared extent
   0010000 Unwritten preallocated extent
   0001000 Doesn't begin on stripe unit
   0000100 Doesn't end   on stripe unit
   0000010 Doesn't begin on stripe width
   0000001 Doesn't end   on stripe width

- try atomic write of size 32K @ offset 0
- we first try HW atomics method and call xfs_direct_write_iomap_begin() -> xfs_reflink_allocate_cow() - CoW fork mapping is no good for atomics (too short), so try CoW atomic method - in CoW atomic method, we find that pre-alloc'ed CoW fork extent (which was converted to written already)

      - return the extent
  - If no:
      - allocate a new extent covering the range in the COW fork

Doing this without going down multiple levels of helpers designed for
an entirely different use case will probably simplify things
significantly.

Please suggest any further modifications to the following attempt. I have XFS_REFLINK_FORCE_COW still being passed to xfs_reflink_fill_cow_hole(), but xfs_reflink_fill_cow_hole() is quite a large function and I am not sure if I want to duplicate lots of it.

static int
xfs_atomic_write_cow_iomap_begin(
	struct inode		*inode,
	loff_t			offset,
	loff_t			length,
	unsigned		flags,
	struct iomap		*iomap,
	struct iomap		*srcmap)
{
	ASSERT(flags & IOMAP_WRITE);
	ASSERT(flags & IOMAP_DIRECT);

	struct xfs_inode	*ip = XFS_I(inode);
	struct xfs_mount	*mp = ip->i_mount;
	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
	xfs_fileoff_t		count_fsb = end_fsb - offset_fsb;
	struct xfs_bmbt_irec	imap = {
		.br_startoff = offset_fsb,
		.br_startblock = HOLESTARTBLOCK,
		.br_blockcount = end_fsb - offset_fsb,
		.br_state = XFS_EXT_UNWRITTEN,
	};
	struct xfs_bmbt_irec	cmap;
	bool			shared = false;
	unsigned int		lockmode = XFS_ILOCK_EXCL;
	u64			seq;
	struct xfs_iext_cursor	icur;

	if (xfs_is_shutdown(mp))
		return -EIO;

	if (!xfs_has_reflink(mp))
		return -EINVAL;

	if (!ip->i_cowfp) {
		ASSERT(!xfs_is_reflink_inode(ip));
		xfs_ifork_init_cow(ip);
	}

	error = xfs_ilock_for_iomap(ip, flags, &lockmode);
	if (error)
		return error;

if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap) && cmap.br_startoff <= offset_fsb) {
		if (cmap.br_state == XFS_EXT_UNWRITTEN) {
			xfs_trim_extent(&cmap, offset_fsb, count_fsb);
			
error = xfs_reflink_convert_unwritten(ip, &imap, &cmap, XFS_REFLINK_CONVERT_UNWRITTEN);
			if (error)
				goto out_unlock;
		}
	} else {
		error = xfs_reflink_fill_cow_hole(ip, &imap, &cmap, &shared,
				&lockmode, XFS_REFLINK_CONVERT_UNWRITTEN |
				XFS_REFLINK_FORCE_COW | XFS_REFLINK_ALLOC_EXTSZALIGN);
		if (error)
			goto out_unlock;
	}

finish:
	... // as before
}

xfs_reflink_convert_unwritten() and others are private to reflink.c, so this function should also prob live there.

thanks





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux