On Tue, Apr 08, 2025 at 10:42:08AM +0000, John Garry wrote:
> Now that CoW-based atomic writes are supported, update the max size of an
> atomic write for the data device.
>
> The limit of a CoW-based atomic write will be the limit of the number of
> logitems which can fit into a single transaction.

I still think this is the wrong way to define the maximum
size of a COW-based atomic write because it is going to change from
filesystem to filesystem and that variability in supported maximum
length will be exposed to userspace...

i.e. Maximum supported atomic write size really should be defined as
a well documented fixed size (e.g. 16MB). Then the transaction
reservation sizes needed to perform that conversion can be
calculated directly from that maximum size and optimised directly
for the conversion operation that atomic writes need to perform.

.....

> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b2dd0c0bf509..42b2b7540507 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -615,6 +615,28 @@ xfs_init_mount_workqueues(
>  	return -ENOMEM;
>  }
>
> +unsigned int
> +xfs_atomic_write_logitems(
> +	struct xfs_mount	*mp)
> +{
> +	unsigned int		efi = xfs_efi_item_overhead(1);
> +	unsigned int		rui = xfs_rui_item_overhead(1);
> +	unsigned int		cui = xfs_cui_item_overhead(1);
> +	unsigned int		bui = xfs_bui_item_overhead(1);
> +	unsigned int		logres = M_RES(mp)->tr_write.tr_logres;
> +
> +	/*
> +	 * Maximum overhead to complete an atomic write ioend in software:
> +	 * remove data fork extent + remove cow fork extent +
> +	 * map extent into data fork
> +	 */
> +	unsigned int		atomic_logitems =
> +		(bui + cui + rui + efi) + (cui + rui) + (bui + rui);

This seems wrong. Unmap from the data fork only logs a (bui + cui)
pair, we don't log a RUI or an EFI until the transaction that
processes the BUI or CUI actually frees an extent from the BMBT or
removes a block from the refcount btree.

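For anyone following the arithmetic, here is a minimal userspace sketch of the calculation in the quoted function. It is not kernel code. The intent sizes and log reservation values are made-up placeholders, and struct intent_sizes, patch_per_extent_overhead(), rounddown_pow2() and max_atomic_extents() are hypothetical stand-ins, not XFS interfaces. It only illustrates that the resulting limit moves with the log reservation, which is the variability objected to above.

```c
#include <assert.h>

/* Hypothetical per-intent log overheads in bytes; NOT real XFS values. */
struct intent_sizes {
	unsigned int efi;
	unsigned int rui;
	unsigned int cui;
	unsigned int bui;
};

/* Per-extent sum exactly as written in the quoted patch. */
static unsigned int patch_per_extent_overhead(const struct intent_sizes *s)
{
	return (s->bui + s->cui + s->rui + s->efi) +	/* remove data fork extent */
	       (s->cui + s->rui) +			/* remove cow fork extent */
	       (s->bui + s->rui);			/* map extent into data fork */
}

/* Userspace stand-in for the kernel's rounddown_pow_of_two(); n must be nonzero. */
static unsigned int rounddown_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Mirror of "rounddown_pow_of_two(logres / (2 * atomic_logitems))".
 * The divisor is kept as a parameter because its meaning is the open
 * question in the review. Returns 0 if not even one extent fits.
 */
static unsigned int max_atomic_extents(unsigned int logres,
				       unsigned int per_extent,
				       unsigned int divisor)
{
	unsigned int n;

	assert(per_extent != 0 && divisor != 0);
	n = logres / (divisor * per_extent);
	return n ? rounddown_pow2(n) : 0;
}
```

With placeholder sizes of efi = 32, rui = 48, cui = 32 and bui = 48, patch_per_extent_overhead() returns 336. max_atomic_extents(200000, 336, 2) then gives 256, while max_atomic_extents(400000, 336, 2) gives 512: the same code yields a different limit purely because the reservation differs.
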
We also need to be able to relog all the intents and everything that
was modified, so we effectively have at least one
xfs_allocfree_block_count() reservation needed here as well. Even
finishing an invalidation BUI can result in BMBT block allocation
occurring if the operation splits an existing extent record and the
insert of the new record causes a BMBT block split....

> +
> +	/* atomic write limits are always a power-of-2 */
> +	return rounddown_pow_of_two(logres / (2 * atomic_logitems));

What is the magic 2 in that division?

> +}

Also this function does not belong in xfs_super.c - that file is for
interfacing with the VFS layer. Calculating log reservation
constants at mount time is done in xfs_trans_resv.c - I suspect most
of the code in this patch should probably be moved there and run
from xfs_trans_resv_calc()...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx