On Wed, Apr 02, 2025 at 06:24:10PM +0100, Matthew Wilcox wrote: > On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote: > > On Wed 02-04-25 22:32:14, Dave Chinner wrote: > > > > > > >+ /* > > > > > > >+ * Use vmalloc if the count is too large to avoid costly high-order page > > > > > > >+ * allocations. > > > > > > >+ */ > > > > > > >+ if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) > > > > > > >+ kbuf = kvzalloc(count + 1, GFP_KERNEL); > > > > > > > > > > > > Why not move this check into kvmalloc family? > > > > > > > > > > Hmm should this check really be in kvmalloc family? > > > > > > > > Modifying the existing kvmalloc functions risks performance regressions. > > > > Could we instead introduce a new variant like vkmalloc() (favoring > > > > vmalloc over kmalloc) or kvmalloc_costless()? > > > > > > We should fix kvmalloc() instead of continuing to force > > > subsystems to work around the limitations of kvmalloc(). > > > > Agreed! > > > > > Have a look at xlog_kvmalloc() in XFS. It implements a basic > > > fast-fail, no retry high order kmalloc before it falls back to > > > vmalloc by turning off direct reclaim for the kmalloc() call. > > > Hence if the there isn't a high-order page on the free lists ready > > > to allocate, it falls back to vmalloc() immediately. > > ... but if vmalloc fails, it goes around again! This is exactly why > we don't want filesystems implementing workarounds for MM problems. > What a mess. > > > if (size > PAGE_SIZE) { > > flags |= __GFP_NOWARN; > > > > if (!(flags & __GFP_RETRY_MAYFAIL)) > > flags |= __GFP_NORETRY; > > + else > > + flags &= ~__GFP_DIRECT_RECLAIM; > > I think it might be better to do this: > > flags |= __GFP_NOWARN; > > if (!(flags & __GFP_RETRY_MAYFAIL)) > flags |= __GFP_NORETRY; > + else if (size > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) > + flags &= ~__GFP_DIRECT_RECLAIM; The above seems more appropriate then the Michal's bigger hammer. In addition I think Vlastimil has a very good point about the kswapd reclaim for such cases (the patch explicitly complains about kcompactd cpu usage). > > I think it's entirely appropriate for a call to kvmalloc() to do > direct reclaim if it's asking for, say, 16KiB and we don't have any of > those available. Better than exacerbating the fragmentation problem by > allocating 4x4KiB pages, each from different groupings.