Re: [syzbot] [mm?] WARNING in xfs_init_fs_context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/8/25 00:10, Dave Chinner wrote:
> On Wed, Jul 02, 2025 at 09:30:30AM +0200, Vlastimil Babka wrote:
>> On 7/2/25 3:41 AM, Tetsuo Handa wrote:
>> > By the way, why is xfs_init_fs_context() using __GFP_NOFAIL ?
>> > 
>> > 	mp = kzalloc(sizeof(struct xfs_mount), GFP_KERNEL | __GFP_NOFAIL);
>> > 	if (!mp)
>> > 		return -ENOMEM;
>> > 
>> > This looks an allocation attempt which can fail safely.
> 
> It's irrelevant - it shouldn't fail regardless of __GFP_NOFAIL being
> specified.

If you mean the "too small to fail" behavior then it's generally true,
except in some corner cases like being an oom victim, in which case the
allocation can fail - the userspace process is doomed anyway. But a (small)
kernel allocation not handling NULL would still need __GFP_NOFAIL to prevent
that corner case.

>> Indeed. Dave Chinner's commit f078d4ea82760 ("xfs: convert kmem_alloc()
>> to kmalloc()") dropped the xfs wrapper. This allocation didn't use
>> KM_MAYFAIL so it got __GFP_NOFAIL. The commit mentions this high-order
>> nofail issue for another allocation site that had to use xlog_kvmalloc().
> 
> I don't see how high-order allocation behaviour is relevant here.
> 
> Pahole says the struct xfs_mount is 4224 bytes in length. It is an
> order-1 allocation and if we've fragmented memory so badly that slab
> can't allocate an order-1 page then *lots* of other stuff is going
> to be stalling. (e.g. slab pages for inodes are typically order-3,
> same as the kmalloc-8kk slab).

Elsewhere in this thread we figured it out since I wrote the quoted reply.
4224 bytes means kmalloc-8k where the fallback allocation (the one that
passes on the __GFP_NOFAIL) order is 1 normally. But due to KASAN enabled
its metadata means the per-object size goes above 8k and thus the fallback
order will be 2. It's a corner case that wasn't anticipated and existed for
years without known reports. We'll need to deal with it somehow.

> Note that the size of the structure is largely because of the
> embedded cpumask for inodegc:
> 
> 	struct cpumask             m_inodegc_cpumask;    /*  3104  1024 */
> 
> This should probably be pulled out into a dynamically allocated
> inodegc specific structure. Then the struct xfs_mount is only a
> order-0 allocation and should never fail, regardless of
> __GFP_NOFAIL being specified or not.
> 
>> I think either this allocation really can fail as the code (return
>> -ENOMEM) suggests and thus can drop __GFP_NOFAIL, or it can use
>> kvmalloc() - I think the wrapper for that can be removed now too after
>> the discussion in [1] resulted in commit 46459154f997 ("mm: kvmalloc:
>> make kmalloc fast path real fast path").
> 
> I know about that - I have patches that I'm testing that replace
> xlog_kvmalloc() with kvmalloc calls.

Great, thanks!

> -Dave.





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux