Re: [PATCHv6 6/7] block: protect wbt_lat_usec using q->elevator_lock

Oliver Sang <oliver.sang@xxxxxxxxx> · Tue, 25 Mar 2025 15:08:20 +0800

hi, Nilay,

On Tue, Mar 18, 2025 at 07:13:20PM +0530, Nilay Shroff wrote:
> 
> 
> On 3/17/25 7:10 PM, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed "INFO:task_blocked_for_more_than#seconds" on:
> > 
> > commit: f35c9ef2ba17842de59176b29df32999803bd9fa ("[PATCHv6 6/7] block: protect wbt_lat_usec using q->elevator_lock")
> > url: https://github.com/intel-lab-lkp/linux/commits/Nilay-Shroff/block-acquire-q-limits_lock-while-reading-sysfs-attributes/20250304-182738
> > base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
> > patch link: https://lore.kernel.org/all/20250304102551.2533767-7-nilay@xxxxxxxxxxxxx/
> > patch subject: [PATCHv6 6/7] block: protect wbt_lat_usec using q->elevator_lock
> > 
> > in testcase: fio-basic
> > version: fio-x86_64-3.38-1_20250308
> > with following parameters:
> > 
> > 	runtime: 300s
> > 	disk: 1HDD
> > 	fs: btrfs
> > 	nr_task: 100%
> > 	test_size: 128G
> > 	rw: randwrite
> > 	bs: 4M
> > 	ioengine: posixaio
> > 	cpufreq_governor: performance
> > 
> > 
> > 
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> > | Closes: https://lore.kernel.org/oe-lkp/202503171650.cc082b66-lkp@xxxxxxxxx
> > 
> > 
> > [  991.017071][  T472] INFO: task umount:12320 blocked for more than 491 seconds.
> > [  991.024314][  T472]       Tainted: G        W          6.14.0-rc5-00192-gf35c9ef2ba17 #1
> > [  991.032414][  T472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  991.040948][  T472] task:umount          state:D stack:0     pid:12320 tgid:12320 ppid:12317  task_flags:0x400100 flags:0x00004002
> > [  991.052695][  T472] Call Trace:
> > [  991.055849][  T472]  <TASK>
> > [ 991.058658][ T472] __schedule (kernel/sched/core.c:5378 kernel/sched/core.c:6765) 
> > [ 991.062856][ T472] schedule (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/linux/thread_info.h:192 include/linux/thread_info.h:208 include/linux/sched.h:2149 kernel/sched/core.c:6844 kernel/sched/core.c:6857) 
> > [ 991.066706][ T472] wb_wait_for_completion (fs/fs-writeback.c:216 fs/fs-writeback.c:213) 
> > [ 991.071773][ T472] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:383) 
> > [ 991.077702][ T472] __writeback_inodes_sb_nr (fs/fs-writeback.c:2736) 
> > [ 991.082936][ T472] sync_filesystem (fs/sync.c:55 fs/sync.c:30) 
> > [ 991.087390][ T472] generic_shutdown_super (fs/super.c:622) 
> > [ 991.092538][ T472] kill_anon_super (fs/super.c:434 fs/super.c:1238) 
> > [ 991.096991][ T472] btrfs_kill_super (fs/btrfs/super.c:2101) btrfs 
> > [ 991.102280][ T472] deactivate_locked_super (fs/super.c:473) 
> > [ 991.107678][ T472] cleanup_mnt (fs/namespace.c:281 fs/namespace.c:1414) 
> > [ 991.112082][ T472] task_work_run (kernel/task_work.c:227 (discriminator 1)) 
> > [ 991.116534][ T472] syscall_exit_to_user_mode (include/linux/resume_user_mode.h:50 kernel/entry/common.c:114 include/linux/entry-common.h:329 kernel/entry/common.c:207 kernel/entry/common.c:218) 
> > [ 991.122197][ T472] do_syscall_64 (arch/x86/entry/common.c:102) 
> > [ 991.126731][ T472] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > [ 991.131430][ T472] ? __count_memcg_events (mm/memcontrol.c:583 mm/memcontrol.c:857) 
> > [ 991.136738][ T472] ? handle_mm_fault (arch/x86/include/asm/irqflags.h:154 include/linux/memcontrol.h:970 include/linux/memcontrol.h:993 include/linux/memcontrol.h:1000 mm/memory.c:6077 mm/memory.c:6238) 
> > [ 991.141606][ T472] ? do_user_addr_fault (include/linux/mm.h:743 arch/x86/mm/fault.c:1339) 
> > [ 991.146823][ T472] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
> > [ 991.151517][ T472] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
> > [ 991.156203][ T472] ? clear_bhb_loop (arch/x86/entry/entry_64.S:1538) 
> > [ 991.160881][ T472] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
> > [  991.166777][  T472] RIP: 0033:0x7f415ea2aa77
> > [  991.171197][  T472] RSP: 002b:00007ffe0db2fd98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> > [  991.179611][  T472] RAX: 0000000000000000 RBX: 000055cc64b55048 RCX: 00007f415ea2aa77
> > [  991.187586][  T472] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055cc64b55160
> > [  991.195555][  T472] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000073
> > [  991.203514][  T472] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f415eb65264
> > [  991.211476][  T472] R13: 000055cc64b55160 R14: 0000000000000000 R15: 000055cc64b54f30
> > [  991.219431][  T472]  </TASK>
> > [ 1008.358661][T12320] BTRFS info (device sda1): last unmount of filesystem 8b972718-96ad-4a66-b549-8be29321e91a
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20250317/202503171650.cc082b66-lkp@xxxxxxxxx
> > 
> I attempted to reproduce the above issue multiple times using the provided
> reproducer but was unable to do so. However, during further investigation,
> I discovered a lockdep warning about a circular locking dependency. The patch in
> question introduces q->elevator_lock to protect writes to the sysfs attribute
> wbt_lat_usec and the cgroup attribute io.cost.qos. However, writes to these
> attributes also acquire q->rq_qos_mutex, which may lead to the lock
> ordering issue reported by lockdep. Unfortunately, blktests doesn't have any
> testcase that exercises writes to these attributes. I think we should have one,
> so I will submit a blktest.
> 
> The lockdep warning reports an incorrect locking order between q->elevator_lock
> and q->rq_qos_mutex, which might be the cause of the reported symptom. Notably,
> I saw that the LKP test case did not have lockdep enabled; had it been enabled,
> this issue might have been flagged much earlier rather than surfacing later
> while unmounting the file system.
> 
> Anyway, we have to fix the circular locking dependency between q->elevator_lock
> and q->rq_qos_mutex. I will prepare a patch to address this and submit it upstream, 
> tagging you in the commit.
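
For anyone following the thread, here is a toy sketch of the kind of
lock-order inversion that lockdep reports. It is not the kernel's lockdep
implementation and does not model the real call paths. The class name
LockOrderTracker and its record() method are hypothetical, made up for this
illustration; only the two lock names come from the discussion above. The
idea: remember every "A was held while taking B" pair, and flag a new pair
if the opposite order is already reachable.

```python
from collections import defaultdict


class LockOrderTracker:
    """Toy lock-order checker (hypothetical, not the kernel's lockdep)."""

    def __init__(self):
        # held lock -> set of locks that were taken while it was held
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search over the recorded ordering edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def record(self, held, taken):
        """Record taking `taken` while holding `held`.

        Returns True if this ordering closes a cycle with an earlier one,
        i.e. the two orderings together could deadlock.
        """
        inverted = self._reachable(taken, held)
        self.edges[held].add(taken)
        return inverted


tracker = LockOrderTracker()
# One path takes elevator_lock first, then rq_qos_mutex.
first = tracker.record("elevator_lock", "rq_qos_mutex")
# Another path takes the same two locks in the opposite order.
second = tracker.record("rq_qos_mutex", "elevator_lock")
```

The first ordering is recorded without complaint; the second one closes the
cycle and is flagged, even though no deadlock has actually happened yet.
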
> 
> On another note, if you're able to recreate this issue, then whenever it manifests,
> would you please run the command below and collect the dmesg output:
> # echo w > /proc/sysrq-trigger

Sorry for the late reply.

Collecting the above would need some changes in our automated flow. Instead of
collecting this information, we noticed you already have a fix patch set at
https://lore.kernel.org/all/20250319105518.468941-1-nilay@xxxxxxxxxxxxx/
We applied it on top of the previous patch set
(https://lore.kernel.org/all/20250304102551.2533767-1-nilay@xxxxxxxxxxxxx/)

and confirmed the issue is gone. Thanks!

Tested-by: kernel test robot <oliver.sang@xxxxxxxxx>

> 
> Thanks,
> --Nilay
>