Re: temporary hung tasks on XFS since updating to 6.6.92


 




> On 17. Jun 2025, at 07:44, Christian Theune <ct@xxxxxxxxxxxxxxx> wrote:
> 
> 
> 
>> On 16. Jun 2025, at 14:15, Carlos Maiolino <cem@xxxxxxxxxx> wrote:
>> 
>> On Mon, Jun 16, 2025 at 12:09:21PM +0200, Christian Theune wrote:
>> 
>>> 
>>> # xfs_info /tmp/
>>> meta-data=/dev/vdb1              isize=512    agcount=8, agsize=229376 blks
>>>        =                       sectsz=512   attr=2, projid32bit=1
>>>        =                       crc=1        finobt=1, sparse=1, rmapbt=0
>>>        =                       reflink=0    bigtime=0 inobtcount=0 nrext64=0
>>>        =                       exchange=0
>>> data     =                       bsize=4096   blocks=1833979, imaxpct=25
>>>        =                       sunit=1024   swidth=1024 blks
>>> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
>>> log      =internal log           bsize=4096   blocks=2560, version=2
>>>        =                       sectsz=512   sunit=8 blks, lazy-count=1
>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>> 
>> This is worrisome. Your journal size is 10 MiB; this can easily keep stalling IO
>> while it waits for log space to be freed, and depending on the nature of the machine
>> this can be triggered easily. I'm curious though how you made this FS, because 2560
>> is below the minimal log size that xfsprogs has allowed since (/me goes looking
>> into the git log) 2022, xfsprogs 5.15.
>> 
>> FWIW, one of the reasons the minimum journal log size has been increased is the
>> latency/stalls that happen when waiting for free log space, which is exactly
>> the symptom you've been seeing.
>> 
>> I'd suggest you check the xfsprogs commit below if you want more details,
>> but if this is one of the filesystems where you see the stalls, this might very
>> well be the cause:
> 
> Interesting catch! I’ll double-check this against our fleet and the affected machines and will dive into the traffic patterns of the specific underlying devices.
> 
> This filesystem is used for /tmp and is created fresh after a “cold boot” from our hypervisor. It could be that a number of VMs have not seen a cold boot for a couple of years, even though they get kernel upgrades via warm reboots quite regularly. We’re in the process of changing the /tmp filesystem creation to happen fresh during initrd so that the VM-internal xfsprogs more closely matches the guest kernel.

I’ve checked the log sizes. A number of machines with very long uptimes have this outdated 10 MiB size. Many machines with shorter uptimes have larger sizes (several hundred megabytes). Checking our codebase, we let xfsprogs do its thing and don’t fiddle with the defaults.
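
For reference, here’s roughly how I’m pulling the numbers across the fleet; the journal size is just the log block count times the block size, e.g. 2560 * 4096 bytes = 10 MiB. This is only a sketch and the mount point is just an example:

    # print the internal log line from xfs_info and compute the journal size in MiB
    xfs_info /tmp | awk '/^log / {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^bsize=/)  split($i, b, "=")
            if ($i ~ /^blocks=/) split($i, n, "=")
        }
        printf "log blocks=%d bsize=%d -> %.1f MiB\n", n[2], b[2], n[2] * b[2] / 1048576
    }'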

The log sizes of the affected machines weren’t all 10 MiB, though: even machines with larger log sizes were affected.

I’ll follow up, as promised, with further analysis of whether IO starvation from the underlying storage may have occurred.
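
On the storage side I’m planning to correlate the hung-task timestamps with device latency on the backing device of /tmp, along these lines (device name and interval are just placeholders for whatever the VM actually uses):

    # sample extended device statistics with timestamps every 5 seconds;
    # sustained high r_await/w_await or queue sizes around the stall
    # windows would point at starvation in the underlying storage
    iostat -xmt 5 /dev/vdb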

Christian

-- 
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick





