Re: [PATCH] generic/738: warn & lazy-umount if thaw hangs on buggy XFS

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Mon, 9 Jun 2025 08:30:13 -0700

On Fri, Jun 06, 2025 at 09:51:25AM +0800, Zorro Lang wrote:
> On Fri, Jun 06, 2025 at 09:24:42AM +0800, Li Chen wrote:
> > Hi Zorro,
> > 
> > On Thu, 05 Jun 2025 23:49:47 +0800,
> > Zorro Lang wrote:
> > > 
> > > On Sun, Jun 01, 2025 at 03:00:59PM +0800, Li Chen wrote:
> > > > From: Li Chen <chenl311@xxxxxxxxxxxxxxx>
> > > > 
> > > > If `xfs_freeze -u` enters D state (because of the freeze-reclaim deadlock),
> > > > the test never finishes and the harness stalls.
> > > > Run the thaw in the background, wait 10 s, and if it is still alive:
> > > > 
> > > >   * emit a warning plus the fixing commit
> > > >       ab23a7768739  “xfs: per-cpu deferred inode inactivation queues”
> > > >   * `umount -l` the scratch FS so the rest of xfstests can proceed
> > > >   * skip any `wait` that would block on the hung tasks.
> > > > 
> > > > Fixed kernels behave as before; broken ones no longer wedge the run.
> > > > 
> > > > The hung task call trace is shown below:
> > > > [   20.535519]       Not tainted 5.14.0-rc4+ #27
> > > > [   20.537855] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [   20.539420] task:738             state:D stack:14544 pid: 7124 ppid:   753 flags:0x00004002
> > > > [   20.540892] Call Trace:
> > > > [   20.541424]  __schedule+0x22d/0x6c0
> > > > [   20.542128]  schedule+0x3f/0xa0
> > > > [   20.542751]  percpu_rwsem_wait+0x100/0x130
> > > > [   20.543516]  ? percpu_free_rwsem+0x30/0x30
> > > > [   20.544259]  __percpu_down_read+0x44/0x50
> > > > [   20.545002]  xfs_trans_alloc+0x19a/0x1f0
> > > > [   20.545747]  xfs_free_eofblocks+0x47/0x100
> > > > [   20.546519]  xfs_inode_mark_reclaimable+0x115/0x160
> > > > [   20.547398]  destroy_inode+0x36/0x70
> > > > [   20.548077]  prune_icache_sb+0x79/0xb0
> > > > [   20.548789]  super_cache_scan+0x159/0x1e0
> > > > [   20.549536]  shrink_slab.constprop.0+0x1b1/0x370
> > > > [   20.550363]  drop_slab_node+0x1d/0x40
> > > > [   20.551041]  drop_slab+0x30/0x70
> > > > [   20.551600]  drop_caches_sysctl_handler+0x6b/0x80
> > > > [   20.552311]  proc_sys_call_handler+0x12b/0x250
> > > > [   20.552931]  new_sync_write+0x117/0x1b0
> > > > [   20.553462]  vfs_write+0x1bd/0x250
> > > > [   20.553914]  ksys_write+0x5a/0xd0
> > > > [   20.554381]  do_syscall_64+0x3b/0x90
> > > > [   20.554854]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [   20.555481] RIP: 0033:0x7f90928d3300
> > > > [   20.555946] RSP: 002b:00007ffc2b50b998 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> > > > [   20.556853] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f90928d3300
> > > > [   20.557686] RDX: 0000000000000002 RSI: 000055a5d6c47750 RDI: 0000000000000001
> > > > [   20.558524] RBP: 000055a5d6c47750 R08: 0000000000000007 R09: 0000000000000073
> > > > [   20.559335] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
> > > > [   20.560154] R13: 00007f90929ae760 R14: 0000000000000002 R15: 00007f90929a99e0
> > > > 
> > > > localhost login: [   30.773559] INFO: task 738:7124 blocked for more than 20 seconds.
> > > > [   30.775236]       Not tainted 5.14.0-rc4+ #27
> > > > [   30.777449] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [   30.779729] task:738             state:D stack:14544 pid: 7124 ppid:   753 flags:0x00004002
> > > > [   30.781267] Call Trace:
> > > > [   30.781850]  __schedule+0x22d/0x6c0
> > > > [   30.782618]  schedule+0x3f/0xa0
> > > > [   30.783297]  percpu_rwsem_wait+0x100/0x130
> > > > [   30.784110]  ? percpu_free_rwsem+0x30/0x30
> > > > [   30.785085]  __percpu_down_read+0x44/0x50
> > > > [   30.786071]  xfs_trans_alloc+0x19a/0x1f0
> > > > [   30.786877]  xfs_free_eofblocks+0x47/0x100
> > > > [   30.787727]  xfs_inode_mark_reclaimable+0x115/0x160
> > > > [   30.788708]  destroy_inode+0x36/0x70
> > > > [   30.789395]  prune_icache_sb+0x79/0xb0
> > > > [   30.790056]  super_cache_scan+0x159/0x1e0
> > > > [   30.790712]  shrink_slab.constprop.0+0x1b1/0x370
> > > > [   30.791381]  drop_slab_node+0x1d/0x40
> > > > [   30.791924]  drop_slab+0x30/0x70
> > > > [   30.792469]  drop_caches_sysctl_handler+0x6b/0x80
> > > > [   30.793328]  proc_sys_call_handler+0x12b/0x250
> > > > [   30.793948]  new_sync_write+0x117/0x1b0
> > > > [   30.794471]  vfs_write+0x1bd/0x250
> > > > [   30.794941]  ksys_write+0x5a/0xd0
> > > > [   30.795414]  do_syscall_64+0x3b/0x90
> > > > [   30.795928]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [   30.796595] RIP: 0033:0x7f90928d3300
> > > > [   30.797090] RSP: 002b:00007ffc2b50b998 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> > > > [   30.798033] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f90928d3300
> > > > [   30.798852] RDX: 0000000000000002 RSI: 000055a5d6c47750 RDI: 0000000000000001
> > > > [   30.799703] RBP: 000055a5d6c47750 R08: 0000000000000007 R09: 0000000000000073
> > > > [   30.800833] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
> > > > [   30.801764] R13: 00007f90929ae760 R14: 0000000000000002 R15: 00007f90929a99e0
> > > > [   30.802628] INFO: task xfs_io:7130 blocked for more than 10 seconds.
> > > > [   30.803421]       Not tainted 5.14.0-rc4+ #27
> > > > [   30.803985] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [   30.804979] task:xfs_io          state:D stack:13712 pid: 7130 ppid:  7127 flags:0x00000002
> > > > [   30.806013] Call Trace:
> > > > [   30.806399]  __schedule+0x22d/0x6c0
> > > > [   30.806867]  schedule+0x3f/0xa0
> > > > [   30.807334]  rwsem_down_write_slowpath+0x1d8/0x510
> > > > [   30.808018]  thaw_super+0xd/0x20
> > > > [   30.808748]  __x64_sys_ioctl+0x5d/0xb0
> > > > [   30.809292]  do_syscall_64+0x3b/0x90
> > > > [   30.809797]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > [   30.810454] RIP: 0033:0x7ff1b48c5d1b
> > > > [   30.810943] RSP: 002b:00007fff0bf88ac0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > > > [   30.811874] RAX: ffffffffffffffda RBX: 000055b93ae5fc40 RCX: 00007ff1b48c5d1b
> > > > [   30.812743] RDX: 00007fff0bf88b2c RSI: ffffffffc0045878 RDI: 0000000000000003
> > > > [   30.813583] RBP: 000055b93ae60fe0 R08: 0000000000000000 R09: 0000000000000000
> > > > [   30.814497] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > > > [   30.815413] R13: 000055b93a3a94e9 R14: 0000000000000000 R15: 000055b93ae61150
> > > > ---
> > > >  tests/generic/738 | 20 ++++++++++++++++++--
> > > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/tests/generic/738 b/tests/generic/738
> > > > index 6f1ea7f8..9a90eefa 100755
> > > > --- a/tests/generic/738
> > > > +++ b/tests/generic/738
> > > > @@ -11,8 +11,24 @@ _begin_fstest auto quick freeze
> > > >  
> > > >  _cleanup()
> > > >  {
> > > > -	xfs_freeze -u $SCRATCH_MNT 2>/dev/null
> > > > -	wait
> > > > +    # Thaw may deadlock on unfixed XFS kernels.  Run it in the background,
> > > > +    # wait a few seconds, then decide whether it is stuck.
> > > > +    xfs_freeze -u $SCRATCH_MNT 2>/dev/null &
> > > > +    _thaw_pid=$!
> > > > +
> > > > +    sleep 8
> > > > +
> > > > +    if [ -e "/proc/$_thaw_pid" ]; then
> > > > +            # still running → stuck in D-state
> > > > +            if [ "$FSTYP" = "xfs" ]; then
> > > > +                    echo "generic/738: known XFS freeze-reclaim deadlock; " \
> > > > +                         "fixed by kernel commit ab23a7768739 " \
> > > > +                         '"xfs: per-cpu deferred inode inactivation queues"' \
> > > 
> > > If you want to mark a known fix, you can add the line below to this test case:
> > > 
> > > _fixed_by_kernel_commit ab23a7768739 \
> > > 	"xfs: per-cpu deferred inode inactivation queues"
> > 
> > I have already tried that, but the test never gets a chance to print the
> > fixing commit because it has already hung inside xfs_freeze. That's why I
> > changed it to run this command in the background and then sleep.
> 
> At least someone can find this message (and some comments, if you like) when
> they check the test case source code :)
> 
> # ...
> [ "$FSTYP" = "xfs" ] && _fixed_by_kernel_commit ab23a7768739 \
> 	"xfs: per-cpu deferred inode inactivation queues"
> 
> > 
> > > 
> > > But as for this patch, I don't think we should do this just to work around
> > > a bug. If it blocks your testing on some downstream system, you can skip
> > > this test. CC the xfs list if you need more review of this xfs bug.
> > 
> > Without this patch, users cannot easily tell the cause of the hang from
> > stdout/stderr. I have already bisected and confirmed that this patch
> > resolves the issue.
> 
> CC xfs list to confirm that.

Yes, it would be useful to tie this test to related bugfixes.  Please
use the appropriate _fixed_by_* helpers to make it easier to grep for
those sorts of things.
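
For example, once a test carries one of those helper calls, finding every
test tied to a given fix becomes a simple grep.  A rough sketch only: the
function name list_tests_for_commit and the default tests/ directory are
made up for illustration and are not part of fstests.

```shell
#!/bin/sh
# Hypothetical helper, not part of fstests: list every file under a test
# directory that names the given commit id on a _fixed_by_* line.
list_tests_for_commit()
{
	commit="$1"
	dir="${2:-tests}"	# assumed default: run from the top of the tree

	# -r: recurse into subdirectories, -l: print only matching file names.
	# The pattern matches any _fixed_by_* helper that is followed, on the
	# same line, by the commit id.
	grep -rl -e "_fixed_by_.*${commit}" "$dir" | sort
}
```

With the annotation Zorro suggested in place, something like
"list_tests_for_commit ab23a7768739" run from the top of the tree should
list tests/generic/738.  A bare echo of the commit id in the test output
would not be found this way, which is the reason to prefer the helpers.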

--D

> > 
> > Regards,
> > Li
> > 