On Mon, May 05, 2025 at 09:42:27AM +0200, Carlos Maiolino wrote: > On Thu, May 01, 2025 at 05:25:16PM +1000, Dave Chinner wrote: > > On Wed, Apr 30, 2025 at 09:37:35PM -0700, Darrick J. Wong wrote: > > > On Thu, May 01, 2025 at 09:27:24AM +1000, Dave Chinner wrote: > > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > > > When running fstrim immediately after mounting a V4 filesystem, > > > > the fstrim fails to trim all the free space in the filesystem. It > > > > only trims the first extent in the by-size free space tree in each > > > > AG and then returns. If a second fstrim is then run, it runs > > > > correctly and the entire free space in the filesystem is iterated > > > > and discarded correctly. > > > > > > > > The problem lies in the setup of the trim cursor - it assumes that > > > > pag->pagf_longest is valid without either reading the AGF first or > > > > checking if xfs_perag_initialised_agf(pag) is true or not. > > > > > > > > As a result, when a filesystem is mounted without reading the AGF > > > > (e.g. a clean mount on a v4 filesystem) and the first operation is a > > > > fstrim call, pag->pagf_longest is zero and so the free extent search > > > > starts at the wrong end of the by-size btree and exits after > > > > discarding the first record in the tree. > > > > > > > > Fix this by deferring the initialisation of tcur->count to after > > > > we have locked the AGF and guaranteed that the perag is properly > > > > initialised. We trigger this on tcur->count == 0 after locking the > > > > AGF, as this will only occur on the first call to > > > > xfs_trim_gather_extents() for each AG. If we need to iterate, > > > > tcur->count will be set to the length of the record we need to > > > > restart at, so we can use this to ensure we only sample a valid > > > > pag->pagf_longest value for the iteration. > > > > > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > Makes sense to me. Please add the following trailers on merge: > > > > > > Cc: <stable@xxxxxxxxxxxxxxx> # v6.10 > > > Fixes: b0ffe661fab4b9 ("xfs: fix performance problems when fstrimming a subset of a fragmented AG") > > > > Those tags are incorrect. The regression was introduced by commit > > 89cfa899608f ("xfs: reduce AGF hold times during fstrim operations") > > a few releases before that change.... > > This sounds right, introduced in v6.6. > > Darrick, I'll add a stable #v6.6 tag then. Ok thanks. --D > > > > > Reviewed-by: "Darrick J. Wong" <djwong@xxxxxxxxxx> > > > > Thanks. > > > > -Dave. > > -- > > Dave Chinner > > david@xxxxxxxxxxxxx > > >