> -----Original Message-----
> From: Barry Song <21cnbao@xxxxxxxxx>
> Sent: Monday, August 25, 2025 10:17 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> hannes@xxxxxxxxxxx; yosry.ahmed@xxxxxxxxx; nphamcs@xxxxxxxxx;
> chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx;
> ryan.roberts@xxxxxxx; ying.huang@xxxxxxxxxxxxxxxxx;
> akpm@linux-foundation.org; senozhatsky@xxxxxxxxxxxx;
> linux-crypto@xxxxxxxxxxxxxxx; herbert@xxxxxxxxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx;
> ebiggers@xxxxxxxxxx; surenb@xxxxxxxxxx;
> Accardi, Kristen C <kristen.c.accardi@xxxxxxxxx>;
> Gomes, Vinicius <vinicius.gomes@xxxxxxxxx>;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>;
> Gopal, Vinodh <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v11 22/24] mm: zswap: Allocate pool batching resources
> if the compressor supports batching.
>
> [...]
>
> > > > > > +	/*
> > > > > > +	 * Set the unit of compress batching for large folios, for quick
> > > > > > +	 * retrieval in the zswap_compress() fast path:
> > > > > > +	 * If the compressor is sequential (@pool->compr_batch_size is 1),
> > > > > > +	 * large folios will be compressed in batches of ZSWAP_MAX_BATCH_SIZE
> > > > > > +	 * pages, where each page in the batch is compressed sequentially.
> > > > > > +	 * We see better performance by processing the folio in batches of
> > > > > > +	 * ZSWAP_MAX_BATCH_SIZE, due to cache locality of working set
> > > > > > +	 * structures.
> > > > > > +	 */
> > > > > > +	pool->batch_size = (pool->compr_batch_size > 1) ?
> > > > > > +			pool->compr_batch_size : ZSWAP_MAX_BATCH_SIZE;
> > > > > > +
> > > > > > 	zswap_pool_debug("created", pool);
> > > > > >
> > > > > > 	return pool;
> > > > > >
> > > > >
> > > > > It's hard to follow -- you add batch_size and compr_batch_size in this
> > > > > patch, but only use them in another. Could we merge the related changes
> > > > > into one patch instead of splitting them into several that don't work
> > > > > independently?
> > > >
> > > > Hi Barry,
> > > >
> > > > Thanks for reviewing the code and for your comments! Sure, I can merge
> > > > this patch with the next one. I was trying to keep the changes modularized
> > > > to a) zswap_cpu_comp_prepare(), b) zswap_store() and c) zswap_compress()
> > > > so the changes are broken into smaller parts, but I can see how this can
> > > > make the changes appear disjointed.
> > > >
> > > > One thing though: the commit logs for each of the patches will
> > > > also probably need to be merged, since I have tried to explain the
> > > > changes in detail.
> > >
> > > It's fine to merge the changelog and present the story as a whole.
> >
> > Sure.
> >
> > > Do we really need both pool->batch_size and pool->compr_batch_size?
> > > I assume pool->batch_size = pool->compr_batch_size if HW supports
> > > batch; otherwise pool->compr_batch_size = 1.
> >
> > Actually not exactly. We have found value in compressing in batches of
> > ZSWAP_MAX_BATCH_SIZE even for software compressors. Latency benefits
> > from cache locality of working-set data. Hence the approach that we have
> > settled on is pool->batch_size = ZSWAP_MAX_BATCH_SIZE if the compressor
> > does not support batching (i.e., if pool->compr_batch_size is 1).
> > If it does, then pool->batch_size = pool->compr_batch_size.
>
> I understand that even without a hardware batch, you can still
> have some software batching that excludes compression.
>
> However, based on the code below, it looks like
> pool->compr_batch_size is almost always either equal to
> pool->batch_size or 1:
>
> +	pool->compr_batch_size = min(ZSWAP_MAX_BATCH_SIZE,
> +				     crypto_acomp_batch_size(acomp_ctx->acomp));

I would like to explain some of the considerations behind this approach:

1) The compression algorithm gets to decide an optimal batch size. For a
   hardware accelerator such as IAA, this value could be different from
   ZSWAP_MAX_BATCH_SIZE.

2) ZSWAP_MAX_BATCH_SIZE acts as an upper limit on the number of acomp_ctx
   per-CPU resources that will be allocated in zswap_cpu_comp_prepare(),
   as per Yosry's suggestion. This limits the memory overhead for batching
   algorithms.

3) If a batching algorithm works with a batch size "X", where
   1 < X < ZSWAP_MAX_BATCH_SIZE, two things need to happen:

   a) We want to allocate only "X" per-CPU resources.

   b) We want to process the folio in batches of "X", not
      ZSWAP_MAX_BATCH_SIZE, to get the benefit of hardware parallelism.
      This is the compression step size you also mention.

In particular, we cannot treat batch_size as ZSWAP_MAX_BATCH_SIZE and send
a batch of ZSWAP_MAX_BATCH_SIZE pages to zswap_compress() in this case.
For example, what if the compress step size is 6, but the new
zswap_store_pages() introduced in patch 23 sends 8 pages to
zswap_compress() because ZSWAP_MAX_BATCH_SIZE is set to 8? The code in
zswap_compress() could get quite messy, which would impact latency.
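
To make this concrete, here is a rough sketch of how the two fields are
meant to be used (illustrative only; simplified, hypothetical function
names and signatures, not the actual patch code): the store path walks
the folio in units of pool->batch_size, and the compress path issues the
crypto requests in steps of pool->compr_batch_size.

/*
 * Illustrative sketch only; not the actual patch code. The store path
 * consumes the folio in units of pool->batch_size, while compression
 * is issued in steps of pool->compr_batch_size.
 */
static bool zswap_store_folio_sketch(struct folio *folio,
				     struct zswap_pool *pool)
{
	long nr_pages = folio_nr_pages(folio);
	long start;

	for (start = 0; start < nr_pages; start += pool->batch_size) {
		long nr = min_t(long, pool->batch_size, nr_pages - start);

		/* Compresses and stores pages [start, start + nr). */
		if (!zswap_store_pages(folio, start, nr, pool))
			return false;
	}

	return true;
}

static bool zswap_compress_sketch(struct page **pages, unsigned int nr,
				  struct zswap_pool *pool)
{
	unsigned int i, step;

	for (i = 0; i < nr; i += step) {
		step = min_t(unsigned int, pool->compr_batch_size, nr - i);

		/*
		 * Set up the src/dst SG lists for pages [i, i + step)
		 * and issue one crypto_acomp_compress() call:
		 * step == nr for a batching compressor (batch_size ==
		 * compr_batch_size), and step == 1 for a sequential
		 * compressor.
		 */
	}

	return true;
}

With this split, the loop structure is the same for software and
hardware compressors; only the step size of the inner crypto call
changes, which is what keeps zswap_compress() unified.
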
>
> +	pool->batch_size = (pool->compr_batch_size > 1) ?
> +			pool->compr_batch_size : ZSWAP_MAX_BATCH_SIZE;
>
> It seems one of these two variables may be redundant. For instance, no
> matter if pool->compr_batch_size > 1, could we always treat batch_size
> as ZSWAP_MAX_BATCH_SIZE? If we remove pool->batch_size, could we just
> use pool->compr_batch_size as the step size for compression no matter
> if pool->compr_batch_size > 1?

To further explain the rationale for keeping these two distinct: we
statically compute the compress step size by querying the algorithm in
zswap_cpu_comp_prepare(), after the acomp_ctx->acomp has been created,
and store it in pool->compr_batch_size. Next, in zswap_pool_create(), we
do a one-time computation to determine whether pool->batch_size should
align with a batching acomp's compr_batch_size (compr_batch_size > 1),
or, for a non-batching compressor, be ZSWAP_MAX_BATCH_SIZE. This enables
further code simplification/unification in zswap_compress(), and quick
retrieval of the number of available acomp_ctx batching resources when
setting up the SG lists for each call to crypto_acomp_compress(). In
other words, memoization for latency gains and unified code paths.

I hope this clarifies things further. Thanks again, these are all good
questions.

Best regards,
Kanchana

>
> for example:
> pool->compr_batch_size = min(ZSWAP_MAX_BATCH_SIZE,
>			      crypto_acomp_batch_size(acomp_ctx->acomp));
>
> Then batch the zswap store using ZSWAP_MAX_BATCH_SIZE, but set the
> compression step to pool->compr_batch_size.
>
> Thanks
> Barry