On 6/23/25 20:41, Alexey Kardashevskiy wrote: > On 21/6/25 05:20, Tom Lendacky wrote: >> On 6/17/25 04:43, Alexey Kardashevskiy wrote: >>> The __snp_alloc_firmware_pages() helper allocates pages in the firmware >>> state (alloc + rmpupdate). In case of failed rmpupdate, it tries >>> reclaiming pages with already changed state. This requires calling >>> the PSP firmware and since there is sev_cmd_mutex to guard such calls, >>> the helper takes a "locked" parameter so specify if the lock needs to >>> be held. >>> >>> Most calls happen from snp_alloc_firmware_page() which executes without >>> the lock. However >>> >>> commit 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation >>> when SNP is enabled") >>> >>> switched sev_fw_alloc() from alloc_pages() (which does not call the >>> PSP) to >>> __snp_alloc_firmware_pages() (which does) but did not account for the fact >>> that sev_fw_alloc() is called from __sev_platform_init_locked() >>> (via __sev_platform_init_handle_tmr()) and executes with the lock held. >>> >>> Add a "locked" parameter to __snp_alloc_firmware_pages(). >>> Make sev_fw_alloc() use the new parameter to prevent potential deadlock in >>> rmp_mark_pages_firmware() if rmpupdate() failed. >> >> Would it make sense to add the locked parameter to sev_fw_alloc(), too? > > That would be another patch then, this one is a fix ;) > > and I'd probably just ditch both snp_alloc_firmware_page() and > sev_fw_alloc(), rename __snp_alloc_firmware_pages() to > snp_alloc_firmware_page() and just use this one everywhere. Nobody needs > page struct anyway, and the locking will be clear everywhere. Also do the > same for snp_free_firmware_page(). > > It is just that snp_alloc_firmware_page() and snp_free_firmware_page() are > EXPORT_SYMBOL_GPL, > >> Right now there is only one caller of sev_fw_alloc(), but in the future, >> if some other path should call sev_fw_alloc() and that path doesn't have >> the lock, then we'll miss taking it. > > I'd rather just ditch sev_fw_alloc(), does not look very useful. Thanks, Ok, I'm good with this patch for the fix. Can you also provide a follow up patch(es) to address what you've discussed? For the fix: Reviewed-by: Tom Lendacky <thomas.lendacky@xxxxxxx> Thanks, Tom > > > >> Thanks, >> Tom >> >>> >>> Fixes: 24512afa4336 ("crypto: ccp: Handle the legacy TMR allocation >>> when SNP is enabled") >>> Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxx> >>> --- >>> drivers/crypto/ccp/sev-dev.c | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c >>> index 3451bada884e..16a11d5efe46 100644 >>> --- a/drivers/crypto/ccp/sev-dev.c >>> +++ b/drivers/crypto/ccp/sev-dev.c >>> @@ -434,7 +434,7 @@ static int rmp_mark_pages_firmware(unsigned long >>> paddr, unsigned int npages, boo >>> return rc; >>> } >>> -static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int >>> order) >>> +static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int >>> order, bool locked) >>> { >>> unsigned long npages = 1ul << order, paddr; >>> struct sev_device *sev; >>> @@ -453,7 +453,7 @@ static struct page >>> *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order) >>> return page; >>> paddr = __pa((unsigned long)page_address(page)); >>> - if (rmp_mark_pages_firmware(paddr, npages, false)) >>> + if (rmp_mark_pages_firmware(paddr, npages, locked)) >>> return NULL; >>> return page; >>> @@ -463,7 +463,7 @@ void *snp_alloc_firmware_page(gfp_t gfp_mask) >>> { >>> struct page *page; >>> - page = __snp_alloc_firmware_pages(gfp_mask, 0); >>> + page = __snp_alloc_firmware_pages(gfp_mask, 0, false); >>> return page ? page_address(page) : NULL; >>> } >>> @@ -498,7 +498,7 @@ static void *sev_fw_alloc(unsigned long len) >>> { >>> struct page *page; >>> - page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len)); >>> + page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len), true); >>> if (!page) >>> return NULL; >>> >