On Fri, Mar 28, 2025 at 02:48:00AM -0700, Luis Chamberlain wrote:
> On Thu, Mar 27, 2025 at 09:21:30PM -0700, Luis Chamberlain wrote:
> > Would the extra ref check added via commit 060913999d7a9e50 ("mm:
> > migrate: support poisoned recover from migrate folio") make the removal
> > of the spin lock safe now given all the buffers are locked from the
> > folio? This survives some basic sanity checks on my end with
> > generic/750 against ext4 and also filling a drive at the same time with
> > fio. I have a feeling is we are not sure, do we have a reproducer for
> > the issue reported through ebdf4de5642fb6 ("mm: migrate: fix reference
> > check race between __find_get_block() and migration")? I suspect the
> > answer is no.

Sebastian, David, is there a reason CONFIG_DEBUG_ATOMIC_SLEEP=y won't
trigger an atomic sleeping-context warning when cond_resched() is used?
Syzbot and 0-day had ways to reproduce a kernel warning under these
conditions, but this config didn't trigger it and required an explicit
might_sleep():

CONFIG_PREEMPT_BUILD=y
CONFIG_ARCH_HAS_PREEMPT_LAZY=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
# CONFIG_PREEMPT_LAZY is not set
# CONFIG_PREEMPT_RT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_DYNAMIC=y
CONFIG_PREEMPT_RCU=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_PREEMPTIRQ_TRACEPOINTS=y
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set

Are there some preemption configs under which cond_resched() won't
trigger a kernel splat where expected? The only thing I can think of is
that perhaps some preempt configs don't imply a sleep. If true, instead
of adding might_sleep() to one piece of code (in this case
folio_mc_copy()), I wonder if just adding it to cond_resched() may be
useful.

Note that the issue in question wouldn't trigger at all with ext4; that
some reports suggest it happened with btrfs (0-day, with LTP) or with
another test from syzbot was just coincidence on any filesystem. The
only way to really reproduce this was by triggering compaction on the
block device cache, which we now hit as we're enabling large folios for
it. We've narrowed that down to a simple reproducer of running

  dd if=/dev/zero of=/dev/vde bs=1024M count=1024

and adding the might_sleep() on folio_mc_copy().

Then as for the issue we're analyzing: now that I'm back home, I think
it's important to highlight that generic/750 seems able to reproduce
the original issue reported by commit ebdf4de5642fb6 ("mm: migrate: fix
reference check race between __find_get_block() and migration"), and
that it takes about 3 hours to do so. This requires reverting that
commit, which added the spin lock:

Mar 28 03:36:37 extra-ext4-4k unknown: run fstests generic/750 at 2025-03-28 03:36:37
<-- snip -->
Mar 28 05:57:09 extra-ext4-4k kernel: EXT4-fs error (device loop5): ext4_get_first_dir_block:3538: inode #5174: comm fsstress: directory missing '.'

Jan, can you confirm whether the symptoms match the original report? It
would be good to see if the generic/764 test I am proposing [0] can
reproduce that corruption faster than 3 hours. If we have a reproducer,
we can work on evaluating a fix both for the older ext4 issue reported
by commit ebdf4de5642fb6 and for removing the spin lock from page
migration to support large folios.
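For reference, below is roughly the shape of the busy check that the
spin lock guards today. This is a sketch from memory rather than a copy
of __buffer_migrate_folio(), and the helper name folio_buffers_busy()
is made up, so treat it as illustrative only:

/*
 * Illustrative sketch only, not the actual migration code. The
 * i_private_lock here is what ebdf4de5642fb6 added (as private_lock
 * back then) so __find_get_block() cannot grab a new reference on a
 * buffer head between this check and the migration of the folio.
 */
static bool folio_buffers_busy(struct address_space *mapping,
			       struct folio *folio)
{
	struct buffer_head *bh, *head;
	bool busy = false;

	spin_lock(&mapping->i_private_lock);
	bh = head = folio_buffers(folio);
	do {
		if (atomic_read(&bh->b_count)) {
			busy = true;
			break;
		}
		bh = bh->b_this_page;
	} while (bh != head);
	spin_unlock(&mapping->i_private_lock);

	return busy;
}

The race closed by ebdf4de5642fb6 was __find_get_block() taking a
reference on one of these buffer heads right after the check passed;
the question above is whether the extra ref check from 060913999d7a9e50
plus the locked buffers now make that window impossible, so the lock
could go.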
And lastly, can __find_get_block() avoid running while page migration
is in progress? Do we have semantics from a filesystem perspective to
prevent filesystem work from going on while page migration of a folio
is happening in atomic context? If not, do we need them?

[0] https://lore.kernel.org/all/20250326185101.2237319-1-mcgrof@xxxxxxxxxx/T/#u

  Luis