On 7/4/25 10:26, Harry Yoo wrote:
> On Wed, Jul 02, 2025 at 09:30:30AM +0200, Vlastimil Babka wrote:
>> +CC xfs and few more
>>
>> On 7/2/25 3:41 AM, Tetsuo Handa wrote:
>> > On 2025/07/02 0:01, Zi Yan wrote:
>> >>> __alloc_frozen_pages_noprof+0x319/0x370 mm/page_alloc.c:4972
>> >>> alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2419
>> >>> alloc_slab_page mm/slub.c:2451 [inline]
>> >>> allocate_slab+0xe2/0x3b0 mm/slub.c:2627
>> >>> new_slab mm/slub.c:2673 [inline]
>> >>
>> >> new_slab() allows __GFP_NOFAIL, since GFP_RECLAIM_MASK has it.
>> >> In allocate_slab(), the first allocation without __GFP_NOFAIL
>> >> failed, the retry used __GFP_NOFAIL but kmem_cache order
>> >> was greater than 1, which led to the warning above.
>> >>
>> >> Maybe allocate_slab() should just fail when kmem_cache
>> >> order is too big and first trial fails? I am no expert,
>> >> so add Vlastimil for help.
>>
>> Thanks Zi. Slab shouldn't fail with __GFP_NOFAIL, that would only lead
>> to subsystems like xfs to reintroduce their own forever retrying
>> wrappers again. I think it's going the best it can for the fallback
>> attempt by using the minimum order, so the warning will never happen due
>> to the calculated optimal order being too large, but only if the
>> kmalloc()/kmem_cache_alloc() requested/object size is too large itself.
>
> Right. The warning would trigger only if the object size is bigger
> than 8k (PAGE_SIZE * 2).
>
>> Hm but perhaps enabling slab_debug can inflate it over the threshold, is
>> it the case here?
>
> CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 panic_on_warn=1"
>
> CONFIG_SLUB_DEBUG=y
> # CONFIG_SLUB_DEBUG_ON is not set
>
> It seems no slab_debug is involved here.
>
> I downloaded the config and built the kernel, and
> sizeof(struct xfs_mount) is 4480 bytes. It should have allocated using
> order 1?

So it should be the kmalloc-8k cache, its min order should be
get_order(8k) thus 1. If the object was larger than 8k it would be a
large kmalloc anyway and also trigger the __GFP_NOFAIL warning but with
a different stacktrace.

> Not sure why the min order was greater than 1?
> Not sure what I'm missing...

The only sane explanation is that slab debugging is enabled but not via
CONFIG_CMDLINE but via options passed to the qemu execution? But I don't
see those, nor the full dmesg (that would report them) in the syzbot
dashboard.

Hm or actually it might be kasan_cache_create() bumping our size when
called from calculate_sizes(). KASAN seems enabled...

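To make the arithmetic above concrete, here is a rough userspace sketch.
It is not the slub code: get_order_sketch() and kmalloc_class_sketch()
are made-up stand-ins, it assumes 4k pages and power-of-two kmalloc size
classes, and the 64 bytes of per-object metadata is a hypothetical number
chosen only to show the effect, not what KASAN actually adds.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12			/* assumption: 4k pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Stand-in for get_order(): the smallest order such that
 * (PAGE_SIZE << order) >= size.
 */
static unsigned int get_order_sketch(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

/*
 * Round up to the next power of two, which is how the larger kmalloc
 * size classes are spaced (the 96 and 192 byte classes are ignored).
 */
static size_t kmalloc_class_sketch(size_t size)
{
	size_t class_size = 8;

	while (class_size < size)
		class_size <<= 1;
	return class_size;
}
```

With these, kmalloc_class_sketch(4480) is 8192 and
get_order_sketch(8192) is 1, so no warning. Once anything pushes the
per-object size past 8192, e.g. get_order_sketch(8192 + 64), the minimum
order becomes 2 and the order > 1 check for __GFP_NOFAIL would fire.
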
>> I think in that rare case we could convert such
>> fallback allocations to large kmalloc to avoid adding the debugging
>> overhead - we can't easily create an individual slab page without the
>> debugging layout for a kmalloc cache with debugging enabled.
>
> Yeah that can be doable when the size is exactly 8k or very close to 8k.
>