On Wed, Jul 02, 2025 at 09:30:30AM +0200, Vlastimil Babka wrote:
> +CC xfs and few more
>
> On 7/2/25 3:41 AM, Tetsuo Handa wrote:
> > On 2025/07/02 0:01, Zi Yan wrote:
> >>> __alloc_frozen_pages_noprof+0x319/0x370 mm/page_alloc.c:4972
> >>> alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2419
> >>> alloc_slab_page mm/slub.c:2451 [inline]
> >>> allocate_slab+0xe2/0x3b0 mm/slub.c:2627
> >>> new_slab mm/slub.c:2673 [inline]
> >>
> >> new_slab() allows __GFP_NOFAIL, since GFP_RECLAIM_MASK has it.
> >> In allocate_slab(), the first allocation without __GFP_NOFAIL
> >> failed, the retry used __GFP_NOFAIL but kmem_cache order
> >> was greater than 1, which led to the warning above.
> >>
> >> Maybe allocate_slab() should just fail when kmem_cache
> >> order is too big and first trial fails? I am no expert,
> >> so add Vlastimil for help.
>
> Thanks Zi. Slab shouldn't fail with __GFP_NOFAIL, that would only lead
> to subsystems like xfs to reintroduce their own forever retrying
> wrappers again. I think it's going the best it can for the fallback
> attempt by using the minimum order, so the warning will never happen due
> to the calculated optimal order being too large, but only if the
> kmalloc()/kmem_cache_alloc() requested/object size is too large itself.

Right. The warning would trigger only if the object size is bigger than
8k (PAGE_SIZE * 2).

> Hm but perhaps enabling slab_debug can inflate it over the threshold, is
> it the case here?
CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 panic_on_warn=1"
CONFIG_SLUB_DEBUG=y
# CONFIG_SLUB_DEBUG_ON is not set

It seems no slab_debug is involved here.

I downloaded the config and built the kernel, and
sizeof(struct xfs_mount) is 4480 bytes. It should have allocated using
order 1? Not sure why the min order was greater than 1?
Not sure what I'm missing...

> I think in that rare case we could convert such
> fallback allocations to large kmalloc to avoid adding the debugging
> overhead - we can't easily create an individual slab page without the
> debugging layout for a kmalloc cache with debugging enabled.

Yeah that can be doable when the size is exactly 8k or very close to 8k.

-- 
Cheers,
Harry / Hyeonggon
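
For reference, the order arithmetic discussed above can be sketched in
plain userspace C. This is only an illustration, not kernel code:
sketch_min_order() and sketch_nofail_would_warn() are made-up names,
4 KiB pages are assumed, and any slab metadata or debug padding added to
the object is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, so order 1 is 8 KiB (PAGE_SIZE * 2). */
#define SKETCH_PAGE_SIZE 4096UL

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int sketch_min_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/*
 * Models the condition discussed in this thread: a __GFP_NOFAIL
 * allocation warns when the requested order is greater than 1.
 */
static int sketch_nofail_would_warn(size_t size)
{
	return sketch_min_order(size) > 1;
}
```

Under these assumptions a 4480-byte object fits in order 1 and would not
warn, while anything above 8192 bytes needs order 2 and would.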