On Sun, Mar 30, 2025 at 01:04:02PM +0100, Matthew Wilcox wrote:
> On Sat, Mar 29, 2025 at 11:47:30PM -0700, Luis Chamberlain wrote:
> > However tracing shows that folio_mc_copy() *isn't* being called
> > as often as we'd expect from buffer_migrate_folio_norefs() path
> > as we're likely bailing early now thanks to the check added by commit
> > 060913999d7a ("mm: migrate: support poisoned recover from migrate
> > folio").
>
> Umm. You're saying that most folios we try to migrate have extra refs?
> That seems unexpected; does it indicate a bug in 060913999d7a?

I've debugged this further: the migration does succeed, and I don't see
any failures due to the new refcount check added by 060913999d7a.

I've added stats in an out-of-tree patch [0] in case folks find this
useful; I could submit it. The point is that even if you use dd against
a large block device you won't always end up trying to migrate large
folios *right away*, even if you trigger folio migration through
compaction, especially if you use a large bs on dd such as bs=1M. Using
a size matching the logical block size more closely will trigger large
folio migration much faster.

Example of the stats:

  # cat /sys/kernel/debug/mm/migrate/bh/stats
  [buffer_migrate_folio]
  calls 9874
  success 9854
  fails 20
  [buffer_migrate_folio_norefs]
  calls 3694
  success 1651
  fails 2043
  no-head-success 532
  no-head-fails 0
  invalid 2040
  valid 1119
  valid-success 1119
  valid-fails 0

Success ratios:

  buffer_migrate_folio:        99% success (9854/9874)
  buffer_migrate_folio_norefs: 44% success (1651/3694)

> > +++ b/mm/migrate.c
> > @@ -751,6 +751,8 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
> >  {
> >  	int rc, expected_count = folio_expected_refs(mapping, src);
> >
> > +	might_sleep();
>
> We deliberately don't sleep when the folio is only a single page.
> So this needs to be:
>
> 	might_sleep_if(folio_test_large(folio));

That does reduce the scope of our test coverage, but sure.
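As a quick sanity check on the success ratios above, they can be recomputed
from the raw debugfs counters. This snippet is purely illustrative (the
numbers are hard-coded from the stats dump above; it is not part of the
patch):

```python
# Recompute the success ratios from the debugfs counters quoted above.
# (success, calls) pairs taken verbatim from the stats dump.
stats = {
    "buffer_migrate_folio": (9854, 9874),
    "buffer_migrate_folio_norefs": (1651, 3694),
}

for name, (success, calls) in stats.items():
    # Integer division truncates, matching the quoted percentages.
    pct = success * 100 // calls
    print(f"{name}: {pct}% success ({success}/{calls})")
```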
[0] https://lore.kernel.org/all/20250331061306.4073352-1-mcgrof@xxxxxxxxxx/

  Luis