From: Ashish Kalra <ashish.kalra@xxxxxxx>

When a crash is triggered, the kernel attempts to shut down SEV-SNP
using the SNP_SHUTDOWN_EX command. If active SNP VMs are present,
SNP_SHUTDOWN_EX fails, as firmware checks all encryption-capable ASIDs
to ensure none are in use and that a DF_FLUSH is not required.

This causes the kdump kernel to boot with IOMMU SNP enforcement still
enabled, and the IOMMU completion wait buffers (CWBs), command buffers,
device tables and event buffer registers remain locked and exclusive to
the previous kernel. Attempts to allocate and use new buffers in the
kdump kernel fail, as the hardware ignores writes to the locked MMIO
registers (per AMD IOMMU spec Section 2.12.2.1).

As a result, the kdump kernel cannot initialize the IOMMU or enable IRQ
remapping, which is required for proper operation. This results in
repeated "Completion-Wait loop timed out" errors and a second kernel
panic:

  "Kernel panic - not syncing: timer doesn't work through
  Interrupt-remapped IO-APIC"

The following MMIO registers are locked and ignore writes after a failed
SNP shutdown:

  Device Table Base Address Register
  Command Buffer Base Address Register
  Event Buffer Base Address Register
  Completion Store Base Register/Exclusion Base Register
  Completion Store Limit Register/Exclusion Range Limit Register

Instead of allocating new buffers, re-use the previous kernel's pages
for completion wait buffers, command buffers, event buffers and device
tables, and operate with the already enabled SNP configuration and
existing data structures.

This approach is now used for kdump boot regardless of whether SNP is
enabled during kdump.

The fix enables successful crashkernel/kdump operation on SNP hosts even
when SNP_SHUTDOWN_EX fails.
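The failure mode and the reuse approach described above can be
illustrated with a small user-space model. This is only a sketch, not
the driver code from this series: struct mmio_reg, mmio_write(),
mmio_read(), setup_buffer() and BASE_ADDR_MASK are hypothetical names
introduced for illustration, and the register layout (base address in
bits 51:12) is an assumption made for the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one IOMMU base-address MMIO register. */
struct mmio_reg {
	uint64_t value;
	bool locked;	/* models a register locked by SNP enforcement */
};

/* Assumed layout for this model: bits 51:12 hold the physical base. */
#define BASE_ADDR_MASK 0x000ffffffffff000ULL

/* A locked register silently ignores writes. */
void mmio_write(struct mmio_reg *reg, uint64_t val)
{
	assert(reg != NULL);
	if (!reg->locked)
		reg->value = val;
}

uint64_t mmio_read(const struct mmio_reg *reg)
{
	assert(reg != NULL);
	return reg->value;
}

/*
 * Return the physical address of the buffer the kernel should use.
 *
 * Normal boot: program the register with a newly allocated buffer.
 * kdump boot:  do not write the register; read back the address the
 *              previous kernel programmed and reuse that buffer.
 */
uint64_t setup_buffer(struct mmio_reg *reg, bool kdump, uint64_t new_paddr)
{
	if (kdump)
		return mmio_read(reg) & BASE_ADDR_MASK;

	mmio_write(reg, new_paddr & BASE_ADDR_MASK);
	return mmio_read(reg) & BASE_ADDR_MASK;
}
```

In this model, writing a new address to a locked register leaves the
old address in place, so a kernel that keeps using its new buffer would
disagree with the hardware. Taking the address from the register in the
kdump case keeps the two consistent. In the kernel, the address read
back would still have to be mapped before use.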
Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")

v3:
- Moving to AMD IOMMU driver fix so that there is no need to do
  SNP_DECOMMISSION during panic() and kdump kernel boot will be more
  agnostic to whether or not SNP_SHUTDOWN is done properly (or even
  done at all), i.e., even with active SNP guests. Fixing
  crashkernel/kdump boot with IOMMU SNP/RMP enforcement still enabled
  prior to kdump boot by reusing the pages of the previous kernel for
  IOMMU completion wait buffers, command buffer and device table and
  memremap them during kdump boot.
- Rebased on linux-next.
- Split the original patch into smaller patches and prepare separate
  patches for adding the iommu_memremap() helper and remapping/unmapping
  of IOMMU buffers for kdump, reusing the device table for kdump, and
  skipping the enabling of IOMMU buffers for kdump.
- Add new functions for remapping/unmapping IOMMU buffers and call
  them from alloc_iommu_buffers/free_iommu_buffers in case of kdump
  boot, else call the existing alloc/free variants of CWB, command and
  event buffers.
- Skip SNP INIT in case of kdump boot.
- The final patch skips enabling the IOMMU command buffer and event
  buffer for kdump boot, which fixes kdump on SNP host.
- Add comment that completion wait buffers are only re-used when SNP is
  enabled.

Ashish Kalra (4):
  iommu/amd: Add support to remap/unmap IOMMU buffers for kdump
  iommu/amd: Reuse device table for kdump
  crypto: ccp: Skip SNP INIT for kdump boot
  iommu/amd: Fix host kdump support for SNP

 drivers/crypto/ccp/sev-dev.c        |   8 +
 drivers/iommu/amd/amd_iommu_types.h |   5 +
 drivers/iommu/amd/init.c            | 288 +++++++++++++++++++---------
 drivers/iommu/amd/iommu.c           |   2 +-
 4 files changed, 212 insertions(+), 91 deletions(-)

--
2.34.1