Filesystems differ in whether they need scratch space for decompression: some read the compressed data into the page cache and decompress in place, while others read the compressed data into scratch pages and decompress into the page cache. They also differ in whether decompression requires a vmap() of all the memory allocated, or whether the decompression routines can handle doing kmap_local() on individual pages.

So, my proposal is that filesystems tell the page cache that their minimum folio size is the compression block size. That seems to be around 64k, so not an unreasonable minimum allocation size. That removes all the extra code in filesystems to allocate extra memory in the page cache, and it means we don't attempt to track dirtiness at sub-folio granularity (there's no point; we have to write back the entire compressed block at once).

We also get a single virtually contiguous block ... if you're willing to ditch HIGHMEM support. Or there's a proposal to introduce a vmap_file() which would give us a virtually contiguous chunk of memory (and could be trivially turned into a no-op for the case of trying to vmap a single large folio).
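As a rough sketch of what that would look like at inode setup time (this assumes a mapping_set_folio_min_order()-style helper for raising a mapping's minimum folio order; the example_fs_* names and the 64k constant are placeholders, not any existing filesystem's code):

/*
 * Sketch only: example_fs_* and EXAMPLE_COMPRESS_BLOCK_SIZE are made up;
 * the point is that the filesystem declares its compression block size as
 * the minimum folio order for the mapping.
 */
#include <linux/pagemap.h>
#include <linux/sizes.h>
#include <linux/log2.h>

#define EXAMPLE_COMPRESS_BLOCK_SIZE	SZ_64K	/* hypothetical 64k compression block */

static void example_fs_set_mapping_order(struct inode *inode)
{
	unsigned int min_order = ilog2(EXAMPLE_COMPRESS_BLOCK_SIZE >> PAGE_SHIFT);

	/*
	 * Tell the page cache never to allocate a folio smaller than the
	 * compression block, so reads and writeback always operate on whole
	 * compressed blocks and no scratch allocation is needed.
	 */
	mapping_set_folio_min_order(inode->i_mapping, min_order);
}

/*
 * Without HIGHMEM (or with a vmap_file() that is a no-op for a single
 * large folio), the whole compressed block is then virtually contiguous:
 */
static void *example_fs_block_address(struct folio *folio)
{
	return folio_address(folio);	/* only valid when the folio is not in highmem */
}

The decompression routine can then be handed one contiguous buffer per compressed block instead of iterating kmap_local() over individual pages.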