Re: [PATCH 2/7] generic/427: try to ensure there's some free space before we do the aio test

On Wed, Jul 30, 2025 at 07:18:48AM -0700, Christoph Hellwig wrote:
> On Tue, Jul 29, 2025 at 01:08:46PM -0700, Darrick J. Wong wrote:
> > The pwrite failure comes from the aio-dio-eof-race.c program because the
> > filesystem ran out of space.  There are no speculative posteof
> > preallocations on a zoned filesystem, so let's skip this test on those
> > setups.
> 
> Did it run out of space because it is overwriting and we need a new
> allocation (I've not actually seen this fail in my zoned testing,
> that's why I'm asking)?  If so it really should be using the new
> _require_inplace_writes Filipe just sent to the list.

I took a deeper look into what's going on here, and I think the
intermittent ENOSPC failures are caused by:

1. First we write to every byte in the 256M zoned rt device so that
   0x55 gets written to the disk.
2. Then we delete the huge file we created.
3. The zoned garbage collector doesn't run.
4. aio-dio-eof-race starts up and initiates an aiodio at pos 0.
5. xfs_file_dio_write_zoned calls xfs_zoned_write_space_reserve
6. xfs_zoned_space_reserve tries to decrement 64k from XC_FREE_RTEXTENTS
   but gets ENOSPC.
7. We didn't pass XFS_ZR_GREEDY, so we error out.
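
The sequence above can be modelled in userspace with a toy counter.  This
is only a sketch: toy_fs, toy_reserve, toy_delete and toy_gc are
hypothetical names, not kernel functions, and the model assumes (as steps
2-3 suggest) that space freed by the delete is not reservable again until
GC has run.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_fs {
	long free_blocks;        /* stands in for XC_FREE_RTEXTENTS */
	long reclaimable_blocks; /* freed by the delete, not yet seen by GC */
};

/* Steps 5-7: take blocks from the free counter, or fail with -ENOSPC. */
int toy_reserve(struct toy_fs *fs, long count)
{
	assert(count > 0);
	if (fs->free_blocks < count)
		return -ENOSPC;
	fs->free_blocks -= count;
	return 0;
}

/* Step 2: deleting the file only marks its blocks as reclaimable. */
void toy_delete(struct toy_fs *fs, long count)
{
	fs->reclaimable_blocks += count;
}

/* What step 3 skips: GC turns reclaimable blocks back into free ones. */
void toy_gc(struct toy_fs *fs)
{
	fs->free_blocks += fs->reclaimable_blocks;
	fs->reclaimable_blocks = 0;
}
```

Filling the device with toy_reserve, calling toy_delete, and then
reserving again fails with -ENOSPC in this model until toy_gc is called,
which matches the observation that the failure depends on whether zonegc
got to run first.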

If I make the test sleep until I see zonegc do some work before starting
aio-dio-eof-race, the problem goes away.  I'm not sure what the proper
solution is, but maybe it's adding a wake_up to the gc process and
waiting for it?

diff --git a/fs/xfs/xfs_zone_space_resv.c b/fs/xfs/xfs_zone_space_resv.c
index 1313c55b8cbe51..dfd0384f8e3931 100644
--- a/fs/xfs/xfs_zone_space_resv.c
+++ b/fs/xfs/xfs_zone_space_resv.c
@@ -223,15 +223,25 @@ xfs_zoned_space_reserve(
        unsigned int                    flags,
        struct xfs_zone_alloc_ctx       *ac)
 {
+       int                             tries = 5;
        int                             error;
 
        ASSERT(ac->reserved_blocks == 0);
        ASSERT(ac->open_zone == NULL);
 
+again:
        error = xfs_dec_freecounter(mp, XC_FREE_RTEXTENTS, count_fsb,
                        flags & XFS_ZR_RESERVED);
        if (error == -ENOSPC && (flags & XFS_ZR_GREEDY) && count_fsb > 1)
                error = xfs_zoned_reserve_extents_greedy(mp, &count_fsb, flags);
+       if (error == -ENOSPC && !(flags & XFS_ZR_GREEDY) && --tries) {
+               struct xfs_zone_info    *zi = mp->m_zone_info;
+
+               xfs_err(mp, "OI ZONEGC %d", tries);
+               wake_up_process(zi->zi_gc_thread);
+               udelay(100);
+               goto again;
+       }
        if (error)
                return error;
 
This fugly patch makes the test failures go away.  On my system we
rarely go below "OI ZONEGC 2" after 100x runs.
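
The control flow of the hack, stripped of the kernel specifics, is a
bounded retry that kicks the reclaimer after each -ENOSPC.  A minimal
userspace sketch follows; reserve_with_retry, fake_pool, fake_reserve and
fake_kick are made-up names for illustration, and the callbacks merely
stand in for xfs_dec_freecounter and the GC wake-up.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callbacks: one reservation attempt, one reclaimer kick. */
typedef int (*reserve_fn)(void *ctx);
typedef void (*kick_fn)(void *ctx);

/*
 * Attempt the reservation at most max_tries times.  After each -ENOSPC
 * that still has attempts left, kick the reclaimer and try again.
 * Returns 0 on success, otherwise the last error seen.
 */
int reserve_with_retry(reserve_fn reserve, kick_fn kick, void *ctx,
		       int max_tries)
{
	int tries = max_tries;
	int error;

	assert(reserve && kick);
	for (;;) {
		error = reserve(ctx);
		if (error != -ENOSPC || --tries <= 0)
			return error;
		kick(ctx);
	}
}

/* Hypothetical pool whose reclaimer needs several kicks to free space. */
struct fake_pool {
	int free_blocks;    /* blocks that can be reserved right now */
	int pending_blocks; /* blocks the reclaimer will eventually release */
	int kicks_needed;   /* kicks required before the release happens */
	int kicks_seen;     /* kicks received so far */
	int want;           /* size of each reservation attempt */
};

int fake_reserve(void *ctx)
{
	struct fake_pool *p = ctx;

	if (p->free_blocks < p->want)
		return -ENOSPC;
	p->free_blocks -= p->want;
	return 0;
}

void fake_kick(void *ctx)
{
	struct fake_pool *p = ctx;

	if (++p->kicks_seen >= p->kicks_needed) {
		p->free_blocks += p->pending_blocks;
		p->pending_blocks = 0;
	}
}
```

With max_tries = 5 this makes up to five attempts and at most four kicks,
the same shape as the "tries = 5" loop in the patch.  Like the patch, it
polls instead of waiting for the reclaimer to report progress, so it
shares the same weakness: it only works if GC frees space within the
retry budget.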

> If not we need to figure out what this depends on instead of adding
> random xfs-specific hacks to common code.

<nod> I saw the "this tests speculative posteof preallocations" and
thought that didn't sound like an interesting test on a zoned fs. ;)

--D



