On Wed, Jun 4, 2025 at 11:00 AM Pratyush Yadav <pratyush@xxxxxxxxxx> wrote:
>
> On Thu, May 15 2025, Pasha Tatashin wrote:
>
> > From: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> >
> > Allow users of KHO to cancel the previous preservation by adding the
> > necessary interfaces to unpreserve folio.
> >
> > Signed-off-by: Changyuan Lyu <changyuanl@xxxxxxxxxx>
> > Co-developed-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > ---
> >  include/linux/kexec_handover.h | 12 +++++
> >  kernel/kexec_handover.c        | 84 ++++++++++++++++++++++++++++------
> >  2 files changed, 83 insertions(+), 13 deletions(-)
> >
[...]
> > diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
> > index 8ff561e36a87..eb305e7e6129 100644
> > --- a/kernel/kexec_handover.c
> > +++ b/kernel/kexec_handover.c
> > @@ -101,26 +101,33 @@ static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz)
> >  	return elm;
> >  }
> >
> > -static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn,
> > -			     unsigned long end_pfn)
> > +static void __kho_unpreserve_order(struct kho_mem_track *track, unsigned long pfn,
> > +				   unsigned int order)
> >  {
> >  	struct kho_mem_phys_bits *bits;
> >  	struct kho_mem_phys *physxa;
> > +	const unsigned long pfn_high = pfn >> order;
> >
> > -	while (pfn < end_pfn) {
> > -		const unsigned int order =
> > -			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
> > -		const unsigned long pfn_high = pfn >> order;
> > +	physxa = xa_load(&track->orders, order);
> > +	if (!physxa)
> > +		return;
> >
> > -		physxa = xa_load(&track->orders, order);
> > -		if (!physxa)
> > -			continue;
> > +	bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS);
> > +	if (!bits)
> > +		return;
> >
> > -		bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS);
> > -		if (!bits)
> > -			continue;
> > +	clear_bit(pfn_high % PRESERVE_BITS, bits->preserve);
> > +}
> >
> > -		clear_bit(pfn_high % PRESERVE_BITS, bits->preserve);
> > +static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn,
> > +			     unsigned long end_pfn)
> > +{
> > +	unsigned int order;
> > +
> > +	while (pfn < end_pfn) {
> > +		order = min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
>
> This is fragile. If the preserve call spans say 4 PFNs, then it gets
> preserved as an order 2 allocation, but if the PFNs are unpreserved
> one-by-one, __kho_unpreserve_order() will unpreserve from the order 0
> xarray, which will end up doing nothing, leaking those pages.
>
> It should either look through all orders to find the PFN, or at least
> have a requirement in the API that the same phys and size combination as
> the preserve call must be given to unpreserve.

Thank you Pratyush, this is an excellent point. I will add to the
comments of these functions that it is a requirement to unpreserve
exactly the memory that was preserved; subsections are not allowed.
I do not think this is needed, but in the future, if a use case arises,
we can relax this requirement.

Pasha