Khushit Shah <khushit.shah@xxxxxxxxxxx> writes:

>> On 8 Sep 2025, at 5:12 PM, Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
>>
>> ...
>> Also, I've just recalled I fixed (well, 'workarounded') an issue similar
>> to yours a while ago in QEMU:
>>
>> commit 958a01dab8e02fc49f4fd619fad8c82a1108afdb
>> Author: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> Date:   Tue Apr 2 10:02:15 2019 +0200
>>
>>     ioapic: allow buggy guests mishandling level-triggered interrupts to make progress
>>
>> maybe something has changed and it doesn't work anymore?
>
> This is really interesting, we are facing a very similar issue, but the interrupt storm only occurs when using split-irqchip.
> Using kernel-irqchip, we do not even see consecutive level triggered interrupts of the same vector. From the logs it is
> clear that somehow with kernel-irqchip, L1 passes the interrupt to L2 to service, but with split-irqchip, L1 EOI’s without
> servicing the interrupt. As it is working properly on kernel-irqchip, we can’t really point it as an Hyper-V issue. AFAIK,
> kernel-irqchip setting should be transparent to the guest, can you think of anything that can change this?

The problem I fixed back then was also only visible with split
irqchip. The reason was:

"""
in-kernel IOAPIC implementation has commit 184564efae4d ("kvm: ioapic:
conditionally delay irq delivery duringeoi broadcast")
"""

so even though the guest cannot really distinguish between in-kernel and
split irqchips, small differences in the implementation can make a big
difference in the observed behavior. If we re-assert an improperly
handled level-triggered interrupt too fast, the guest is not able to
make much progress, but if we let it execute for even the tiniest
fraction of time, forward progress happens.

I don't know exactly what happens in this particular case, but I'd
suggest you try to artificially delay re-asserting level-triggered
interrupts and see what happens.

-- 
Vitaly
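
P.S. To make the "delay the re-assert" idea concrete, here is a minimal,
self-contained sketch of the logic. This is not the QEMU or KVM code: the
struct, the function names, and both constants are hypothetical and only
illustrate the technique. It counts EOIs that arrive while the line is still
asserted, and once a small threshold is reached it defers the next delivery
by a few ticks instead of re-injecting immediately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tuning values, chosen for illustration only. */
#define STORM_THRESHOLD      3  /* successive EOIs with the line still high */
#define REASSERT_DELAY_TICKS 5  /* how long to hold back the re-delivery */

/* Hypothetical model of one level-triggered IOAPIC pin. */
struct level_irq {
    bool line_asserted;      /* device still holds the line high */
    bool in_service;         /* delivered, waiting for the guest's EOI */
    unsigned successive;     /* EOIs in a row that found the line asserted */
    bool delay_pending;      /* a deferred re-delivery is scheduled */
    uint64_t reassert_at;    /* tick at which the deferred delivery fires */
    unsigned long delivered; /* total deliveries, for observation only */
};

static void deliver(struct level_irq *irq)
{
    irq->in_service = true;
    irq->delivered++;
}

/*
 * Guest EOI at time 'now'. Returns true if the interrupt was re-delivered
 * immediately, false if nothing was delivered (line low, or delivery
 * deferred because the guest looks stuck in an interrupt storm).
 */
static bool irq_eoi(struct level_irq *irq, uint64_t now)
{
    irq->in_service = false;

    if (!irq->line_asserted) {
        irq->successive = 0;  /* the guest serviced the device properly */
        return false;
    }

    if (++irq->successive >= STORM_THRESHOLD) {
        /* Give the guest a little time to run before re-asserting. */
        irq->successive = 0;
        irq->delay_pending = true;
        irq->reassert_at = now + REASSERT_DELAY_TICKS;
        return false;
    }

    deliver(irq);
    return true;
}

/* Timer callback: fire the deferred delivery once its deadline has passed. */
static bool irq_tick(struct level_irq *irq, uint64_t now)
{
    if (!irq->delay_pending || now < irq->reassert_at)
        return false;

    irq->delay_pending = false;
    if (!irq->line_asserted)
        return false;  /* the line was lowered in the meantime */

    deliver(irq);
    return true;
}
```

With these values, the first two EOIs on a still-asserted line are re-injected
right away, the third one is held back, and the interrupt is only delivered
again once irq_tick() is called at or after the deadline. In a real experiment
the threshold and delay would need tuning.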