On Thu, Aug 21, 2025, Alejandro Jimenez wrote:
> On 8/21/25 7:42 AM, Maciej S. Szmigiero wrote:
> > On 21.08.2025 10:18, Naveen N Rao wrote:
> > > > Yes, this breaks real guests when AVIC is enabled.
> > > > Specifically, the one OS that sometimes needs different handling and its
> > > > name begins with letter 'W'.
> > >
> > > Indeed, Linux does not use TPR AFAIK.
>
> I believe it does,

Heh, yes, Linux technically "uses" the TPR in that it does a one-time write to
it.  But what Naveen really meant is that Linux doesn't actively use TPR to
manage what IRQs are masked/allowed, whereas Windows heavily uses TPR to do
exactly that.

Specifically, what matters is that Linux doesn't use TPR to _mask_ IRQs, and so
clobbering it to '0' on migration is largely benign.

> during the local APIC initialization. When Maciej
> determined the root cause of this issue, I was wondering why we have not
> seen it earlier in Linux. I found that Linux takes a defensive approach and
> drains all pending interrupts during lapic initialization. Essentially, for
> each CPU, Linux will:
>
> - temporarily disable the Local APIC (via Spurious Int Vector Reg)
> - set the TPR to accept all "regular" interrupts i.e. tpr=0x10
> - drain all pending interrupts in ISR and/or IRR
> - attempt the above draining step a max of 512 times
> - then re-enable APIC and continue initialization
>
> The relevant code is in setup_local_APIC()
> https://elixir.bootlin.com/linux/v6.16/source/arch/x86/kernel/apic/apic.c#L1533-L1545
>
> So without Maciej's proposed change, other OSs that are not as resilient
> could also be affected by this issue.
>
> Alejandro
>
> > > - Naveen
> > >
> >
> > Thanks,
> > Maciej
> >
>
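
To illustrate why clobbering TPR to '0' matters for a guest that masks IRQs via
TPR but is largely benign for one that doesn't, here is a rough toy model in C.
It is only a sketch, not kernel or KVM code: tpr_allows(), prio_class() and the
example TPR/vector values are made up for illustration, and it assumes no
interrupt is in service, so the processor priority simply equals the TPR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Priority class of a vector or of the TPR: bits 7:4. */
static inline uint8_t prio_class(uint8_t v)
{
	return v >> 4;
}

/*
 * Toy model: a pending vector is deliverable only if its priority class is
 * strictly above the TPR's class.  Assumes nothing is in service, i.e. the
 * processor priority equals the TPR.
 */
static bool tpr_allows(uint8_t tpr, uint8_t vector)
{
	return prio_class(vector) > prio_class(tpr);
}

/* Hypothetical walk-through of the cases discussed in this thread. */
static void tpr_model_examples(void)
{
	/* tpr=0x10: all "regular" vectors (0x20 and up) are accepted. */
	assert(tpr_allows(0x10, 0x20));
	assert(!tpr_allows(0x10, 0x1f));

	/* A guest masking with an arbitrary TPR of 0xd0 blocks vector 0x51... */
	assert(!tpr_allows(0xd0, 0x51));
	/* ...but the same vector gets through once TPR is clobbered to 0. */
	assert(tpr_allows(0x00, 0x51));
}
```

In this model, a guest that parks TPR at 0x10 and never raises it loses almost
nothing when the value is reset, while a guest that relies on a raised TPR can
take an IRQ it believed was masked.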