Re: [RFC PATCH] xfs: add mount option for zone gc pressure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 19/03/2025 10:11, Dave Chinner wrote:
> On Wed, Mar 19, 2025 at 08:19:19AM +0000, Hans Holmberg wrote:
>> Presently we start garbage collection late - when we start running
>> out of free zones to backfill max_open_zones. This is a reasonable
>> default as it minimizes write amplification. The longer we wait,
>> the more blocks are invalidated and reclaim cost less in terms
>> of blocks to relocate.
>>
>> Starting this late however introduces a risk of GC being outcompeted
>> by user writes. If GC can't keep up, user writes will be forced to
>> wait for free zones with high tail latencies as a result.
>>
>> This is not a problem under normal circumstances, but if fragmentation
>> is bad and user write pressure is high (multiple full-throttle
>> writers) we will "bottom out" of free zones.
>>
>> To mitigate this, introduce a gc_pressure mount option that lets the
>> user specify a percentage of how much of the unused space that gc
>> should keep available for writing. A high value will reclaim more of
>> the space occupied by unused blocks, creating a larger buffer against
>> write bursts.
>>
>> This comes at a cost as write amplification is increased. To
>> illustrate this using a sample workload, setting gc_pressure to 60%
>> avoids high (500ms) max latencies while increasing write amplification
>> by 15%.
> 
> It seems to me that this is runtime workload dependent, and so maybe
> a tunable variable in /sys/fs/xfs/<dev>/.... might suit better?
> 
> That way it can be controlled by a userspace agent as the filesystem
> fills and empties rather than being fixed at mount time and never
> really being optimal for a changing workload...

Yes, that would make sense.

> >> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
>> index 799b84220ebb..af595024de00 100644
>> --- a/fs/xfs/xfs_mount.h
>> +++ b/fs/xfs/xfs_mount.h
>> @@ -229,6 +229,7 @@ typedef struct xfs_mount {
>>  	bool			m_finobt_nores; /* no per-AG finobt resv. */
>>  	bool			m_update_sb;	/* sb needs update in mount */
>>  	unsigned int		m_max_open_zones;
>> +	unsigned int		m_gc_pressure;
> 
> This is not explicitly initialised somewhere. If the magic "mount
> gets zeroed on allocation" value of zero it gets means this feature
> is turned off, there needs to be a comment somewhere explaining why
> it is turned completely off rather than having a default of, say,
> 5% like we have for low space allocation thresholds in various
> other lowspace allocation and reclaim algorithms....


Right, it is not at all obvious why 0 is a good default.
I'll add a comment in the next rev.

>> +
>> +	free = xfs_estimate_freecounter(mp, XC_FREE_RTEXTENTS);
>> +	if (available < div_s64(free * mp->m_gc_pressure, 100))
> 
> mult_frac(free, mp->m_gc_pressure, 100) to avoid overflow.
> 
> Also, this is really a free space threshold, not a dynamic
> "pressure" measurement...
> 

Yeah, naming is hard. I can't come up with a better name off
the bat, but I'll give it some thought.

Thanks,
Hans




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux