On 5/30/25 01:08, Sean Christopherson wrote:
On Thu, May 29, 2025, Kai Huang wrote:
On Thu, 2025-05-29 at 07:31 -0700, Sean Christopherson wrote:
On Thu, May 29, 2025, Kai Huang wrote:
On Thu, 2025-05-29 at 23:55 +1200, Kai Huang wrote:
Do they only support userspace IRQ chip, or not support any IRQ chip at all?
The former, only userspace I/O APIC (and associated devices), though some VM
shapes, e.g. TDX, don't provide an I/O APIC or PIC.
Thanks for the info.
Just wondering what's the benefit of using userspace IRQCHIP instead of
emulating in the kernel?
Reduced kernel attack surface (this was especially true years ago, before KVM's
I/O APIC emulation was well-tested) and more flexibility (e.g. shipping userspace
changes is typically easier than shipping new kernels). I'm pretty sure there's
one more big one that I'm blanking on at the moment.
Feature-wise, the big one is support for IRQ remapping which is not
implemented in the in-kernel IOAPIC.
Forgot to ask:
Since this new Kconfig option is not only for IOAPIC but also includes PIC and
PIT, is CONFIG_KVM_IRQCHIP a better name?
I much prefer IOAPIC, because IRQCHIP is far too ambiguous and confusing, e.g.
just look at KVM's internal APIs, where these:
irqchip_in_kernel()
irqchip_kernel()
are not equivalent. In practice, no modern guest kernel is going to utilize the
PIC, and the PIT isn't an IRQ chip, i.e. isn't strictly covered by IRQCHIP either.
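To illustrate why the two helpers are not equivalent, here is a minimal, self-contained sketch. It assumes the usual three x86 irqchip modes (none, split, full in-kernel); the enum, the `struct vm` type and the helper bodies are simplified stand-ins for this example, not KVM's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the irqchip modes discussed in this thread. */
enum irqchip_mode {
	IRQCHIP_NONE,	/* no in-kernel irqchip; local APIC emulated in userspace too */
	IRQCHIP_SPLIT,	/* local APIC in the kernel; I/O APIC and PIC in userspace */
	IRQCHIP_KERNEL,	/* local APIC, I/O APIC and PIC all in the kernel */
};

/* Simplified VM descriptor, for illustration only. */
struct vm {
	enum irqchip_mode mode;
};

/* True whenever *any* irqchip piece (at least the local APIC) is in-kernel. */
static inline bool irqchip_in_kernel(const struct vm *vm)
{
	return vm->mode != IRQCHIP_NONE;
}

/* True only for the split model (userspace I/O APIC and PIC). */
static inline bool irqchip_split(const struct vm *vm)
{
	return vm->mode == IRQCHIP_SPLIT;
}

/* True only when the I/O APIC and PIC are also emulated in the kernel. */
static inline bool irqchip_kernel(const struct vm *vm)
{
	return vm->mode == IRQCHIP_KERNEL;
}
```

Under these assumptions, a split-irqchip VM satisfies irqchip_in_kernel() but not irqchip_kernel(), which is the ambiguity behind the objection to "IRQCHIP" as a name.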
Right.
Maybe it is worth further having dedicated Kconfig options for PIC, PIT and IOAPIC?
Nah. PIC and I/O APIC can't be split (without new uAPI and non-trivial complexity),
and I highly doubt there is any use case that would want an in-kernel I/O APIC
with a userspace PIT. I.e. in practice, the three almost always come as a group;
either a setup wants all, or a setup wants none.
Without "almost", even. I think it's okay to make it CONFIG_KVM_IOAPIC;
it's not super accurate, but there's no single word that conveys "IOAPIC,
PIC and PIT".
Btw, I also find irqchip_in_kernel() and irqchip_kernel() confusing. In fact, I am
not sure of the value of having irqchip_in_kernel() at all. Modern guests should
always have an in-kernel APIC. I am wondering whether we can get rid of it
completely (the logic being that it will always be true), or have a Kconfig
option to only build it when the user truly wants it.
irqchip_kernel() can be renamed to irqchip_full().
For better or worse, an in-kernel local APIC is still optional. I do hope/want
to make it mandatory, but that's not a small ABI change.
I am pretty sure that some users (was it DOSBox? or maybe even gVisor?)
would break.
Paolo