Hi,

On Mon, Jul 14, 2025 at 09:03:25PM +0800, Baokun Li wrote:
> While traversing the list, holding a spin_lock prevents load_buddy, making
> direct use of ext4_try_lock_group impossible. This can lead to a bouncing
> scenario where spin_is_locked(grp_A) succeeds, but ext4_try_lock_group()
> fails, forcing the list traversal to repeatedly restart from grp_A.
>

This patch causes crashes for pretty much every architecture when
running unit tests as part of booting. Example (from x86_64) as well
as bisect log attached below.

Guenter

---
...
[ 9.353832] # Subtest: test_new_blocks_simple
[ 9.366711] BUG: kernel NULL pointer dereference, address: 0000000000000014
[ 9.366931] #PF: supervisor read access in kernel mode
[ 9.366993] #PF: error_code(0x0000) - not-present page
[ 9.367165] PGD 0 P4D 0
[ 9.367305] Oops: Oops: 0000 [#1] SMP PTI
[ 9.367686] CPU: 0 UID: 0 PID: 217 Comm: kunit_try_catch Tainted: G N 6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary)
[ 9.367846] Tainted: [N]=TEST
[ 9.367891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9.368063] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.368374] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.368581] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[ 9.368659] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.368732] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[ 9.368802] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[ 9.368872] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[ 9.368941] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[ 9.369042] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.369127] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.369194] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.369324] Call Trace:
[ 9.369440] <TASK>
[ 9.369637] mbt_kunit_exit+0x47/0xf0
[ 9.369745] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 9.369813] kunit_try_run_case_cleanup+0x2f/0x40
[ 9.369865] kunit_generic_run_threadfn_adapter+0x1c/0x40
[ 9.369922] kthread+0x10b/0x230
[ 9.369965] ? __pfx_kthread+0x10/0x10
[ 9.370013] ret_from_fork+0x165/0x1b0
[ 9.370057] ? __pfx_kthread+0x10/0x10
[ 9.370099] ret_from_fork_asm+0x1a/0x30
[ 9.370188] </TASK>
[ 9.370250] Modules linked in:
[ 9.370428] CR2: 0000000000000014
[ 9.370657] ---[ end trace 0000000000000000 ]---
[ 9.370791] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.370847] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.370996] RSP: 0000:ffffb33b8041fe40 EFLAGS: 00010286
[ 9.371050] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.371112] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9a319e36
[ 9.371174] RBP: ffff8b89c3502400 R08: 0000000000000001 R09: 0000000000000000
[ 9.371235] R10: 0000000000000001 R11: 0000000000000120 R12: ffff8b89c2f49160
[ 9.371297] R13: ffff8b89c2f49158 R14: ffff8b89c2f24000 R15: ffff8b89c2f24000
[ 9.371358] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.371428] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.371484] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.371598] note: kunit_try_catch[217] exited with irqs disabled
[ 9.371861] # test_new_blocks_simple: try faulted: last line seen fs/ext4/mballoc-test.c:452
[ 9.372123] # test_new_blocks_simple: internal error occurred during test case cleanup: -4
[ 9.372440] not ok 1 block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64
[ 9.375702] BUG: kernel NULL pointer dereference, address: 0000000000000014
[ 9.375782] #PF: supervisor read access in kernel mode
[ 9.375832] #PF: error_code(0x0000) - not-present page
[ 9.375881] PGD 0 P4D 0
[ 9.375919] Oops: Oops: 0000 [#2] SMP PTI
[ 9.375966] CPU: 0 UID: 0 PID: 219 Comm: kunit_try_catch Tainted: G D N 6.16.0-rc7-next-20250722 #1 PREEMPT(voluntary)
[ 9.376085] Tainted: [D]=DIE, [N]=TEST
[ 9.376123] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9.376220] RIP: 0010:ext4_mb_release+0x26e/0x510
[ 9.376275] Code: 28 4a cb ff e8 03 5a cf ff 31 db 48 8d 3c 9b 48 83 c3 01 48 c1 e7 04 48 03 bd 60 05 00 00 e8 c9 a6 48 01 48 8b 85 68 03 00 00 <0f> b6 40 14 83 c0 02 39 d8 7f d6 48 8b bd 60 05 00 00 31 db e8 d9
[ 9.376425] RSP: 0000:ffffb33b803f7e40 EFLAGS: 00010286
[ 9.376482] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 9.376546] RDX: 0000000002000008 RSI: ffffffff9a319e36 RDI: ffffffff9a319e36
[ 9.376608] RBP: ffff8b89c352a400 R08: 0000000000000000 R09: 0000000000000000
[ 9.376669] R10: 0000000000000000 R11: 0000000058d996d7 R12: ffff8b89c2f49cc0
[ 9.376730] R13: ffff8b89c2f49cb8 R14: ffff8b89c3524000 R15: ffff8b89c3524000
[ 9.376792] FS: 0000000000000000(0000) GS:ffff8b8a3381a000(0000) knlGS:0000000000000000
[ 9.376861] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.376913] CR2: 0000000000000014 CR3: 0000000009a9c000 CR4: 00000000003506f0
[ 9.376975] Call Trace:
[ 9.377004] <TASK>
[ 9.377040] mbt_kunit_exit+0x47/0xf0
[ 9.377089] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 9.377150] kunit_try_run_case_cleanup+0x2f/0x40
[ 9.377207] kunit_generic_run_threadfn_adapter+0x1c/0x40
[ 9.377266] kthread+0x10b/0x230
[ 9.377308] ? __pfx_kthread+0x10/0x10
[ 9.377353] ret_from_fork+0x165/0x1b0
[ 9.377397] ? __pfx_kthread+0x10/0x10
[ 9.377439] ret_from_fork_asm+0x1a/0x30
[ 9.377505] </TASK>
[ 9.377531] Modules linked in:
[ 9.377571] CR2: 0000000000000014
[ 9.377609] ---[ end trace 0000000000000000 ]---

---
Bisect log:

# bad: [a933d3dc1968fcfb0ab72879ec304b1971ed1b9a] Add linux-next specific files for 20250723
# good: [89be9a83ccf1f88522317ce02f854f30d6115c41] Linux 6.16-rc7
git bisect start 'HEAD' 'v6.16-rc7'
# bad: [a56f8f8967ad980d45049973561b89dcd9e37e5d] Merge branch 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect bad a56f8f8967ad980d45049973561b89dcd9e37e5d
# bad: [f6a8dede4030970707e9bae5b3ae76f60df4b75a] Merge branch 'fs-next' of linux-next
git bisect bad f6a8dede4030970707e9bae5b3ae76f60df4b75a
# good: [b863560c5a26fbcf164f5759c98bb5e72e26848d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
git bisect good b863560c5a26fbcf164f5759c98bb5e72e26848d
# bad: [690056682cc4de56d8de794bc06a3c04bc7f624b] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs.git
git bisect bad 690056682cc4de56d8de794bc06a3c04bc7f624b
# good: [fea76c3eb7455d1e941fba6fdd89ab41ab7797c8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
git bisect good fea76c3eb7455d1e941fba6fdd89ab41ab7797c8
# bad: [714a183e8cf1cc1ddddb3318de1694a33f49c694] Merge branch 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
git bisect bad 714a183e8cf1cc1ddddb3318de1694a33f49c694
# good: [5fb60c0365c4dad347e4958f78976cb733d903f2] f2fs: Pass a folio to __has_merged_page()
git bisect good 5fb60c0365c4dad347e4958f78976cb733d903f2
# bad: [a8a47fa84cc2168b2b3bd645c2c0918eed994fc0] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr
git bisect bad a8a47fa84cc2168b2b3bd645c2c0918eed994fc0
# good: [a35454ecf8a320c49954fdcdae0e8d3323067632] ext4: use memcpy() instead of strcpy()
git bisect good a35454ecf8a320c49954fdcdae0e8d3323067632
# good: [3772fe7b4225f21a1bfe63e4a338702cc3c153de] ext4: convert sbi->s_mb_free_pending to atomic_t
git bisect good 3772fe7b4225f21a1bfe63e4a338702cc3c153de
# good: [12a5b877c314778ddf9a5c603eeb1803a514ab58] ext4: factor out ext4_mb_might_prefetch()
git bisect good 12a5b877c314778ddf9a5c603eeb1803a514ab58
# bad: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
git bisect bad 458bfb991155c2e8ba51861d1ef3c81c5a0846f9
# good: [6e0275f6e713f55dd3fc23be317ec11f8db1766d] ext4: factor out ext4_mb_scan_group()
git bisect good 6e0275f6e713f55dd3fc23be317ec11f8db1766d
# first bad commit: [458bfb991155c2e8ba51861d1ef3c81c5a0846f9] ext4: convert free groups order lists to xarrays
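
---
Reading aid for the oops above (a sketch, not part of the original report):
the bytes marked <0f> b6 40 14 in the Code: line appear to decode as
movzbl 0x14(%rax),%eax, and the register dump shows RAX = 0, which would
explain the fault address 0x14 in CR2. The preceding bytes
48 8b 85 68 03 00 00 look like a load of RAX from offset 0x368 of the
structure pointed to by RBP, so that pointer was presumably NULL at this
point; which field that is cannot be told from the trace alone. The helper
name decode_movzx_disp8 below is hypothetical, and it handles only this one
instruction encoding.

```python
# Minimal sketch: decode the faulting instruction from the oops "Code:" line.
# Assumption: only the encoding seen in this trace is handled
# (0f b6 /r, i.e. movzx r32, r/m8, with mod=01 and no SIB byte).

CODE_TAIL = "<0f> b6 40 14 83 c0 02"  # bytes from the oops, starting at the faulting RIP
RAX = 0x0                             # RAX from the register dump in the oops

# Register names indexed by the ModRM r/m field (no REX prefix on this instruction).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_movzx_disp8(code_tail):
    """Return (base_register, displacement) for 'movzx r32, byte [base+disp8]'."""
    raw = bytes(int(tok.strip("<>"), 16) for tok in code_tail.split())
    if raw[0:2] != b"\x0f\xb6":
        raise ValueError("not a movzx r32, r/m8 opcode")
    modrm = raw[2]
    mod, rm = modrm >> 6, modrm & 0x7
    if mod != 1 or rm == 4:
        raise ValueError("only mod=01 without a SIB byte is handled here")
    # disp8 is a signed byte.
    disp = raw[3] if raw[3] < 0x80 else raw[3] - 0x100
    return REG64[rm], disp


base, disp = decode_movzx_disp8(CODE_TAIL)
fault_addr = RAX + disp
print(f"movzx eax, byte [{base}+{disp:#x}] -> faulting address {fault_addr:#018x}")
```

The computed address can be compared against the CR2 value in the log.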