On 3/21/2025 7:57 PM, Thomas Gleixner wrote: > On Wed, Feb 26 2025 at 14:35, Neeraj Upadhyay wrote: >> Add update_vector callback to set/clear ALLOWED_IRR field in >> the APIC backing page. The ALLOWED_IRR field indicates the >> interrupt vectors which the guest allows the hypervisor to >> send (typically for emulated devices). Interrupt vectors used >> exclusively by the guest itself (like IPI vectors) should not >> be allowed to be injected into the guest for security reasons. >> The update_vector callback is invoked from APIC vector domain >> whenever a vector is allocated, freed or moved. > > Your changelog tells a lot about the WHAT. Please read and follow the > documentation, which describes how a change log should be structured. > > https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog > Ok >> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c >> index 72fa4bb78f0a..e0c9505e05f8 100644 >> --- a/arch/x86/kernel/apic/vector.c >> +++ b/arch/x86/kernel/apic/vector.c >> @@ -174,6 +174,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec, >> apicd->prev_cpu = apicd->cpu; >> WARN_ON_ONCE(apicd->cpu == newcpu); >> } else { >> + if (apic->update_vector) >> + apic->update_vector(apicd->cpu, apicd->vector, false); >> irq_matrix_free(vector_matrix, apicd->cpu, apicd->vector, >> managed); >> } >> @@ -183,6 +185,8 @@ static void apic_update_vector(struct irq_data *irqd, unsigned int newvec, >> apicd->cpu = newcpu; >> BUG_ON(!IS_ERR_OR_NULL(per_cpu(vector_irq, newcpu)[newvec])); >> per_cpu(vector_irq, newcpu)[newvec] = desc; >> + if (apic->update_vector) >> + apic->update_vector(apicd->cpu, apicd->vector, true); > > A trivial > > static inline void apic_update_vector(....) > { > if (apic->update_vector) > .... > } > > would be too easy to read and add not enough line count, right? > Yes. >> static void vector_assign_managed_shutdown(struct irq_data *irqd) >> @@ -528,11 +532,15 @@ static bool vector_configure_legacy(unsigned int virq, struct irq_data *irqd, >> if (irqd_is_activated(irqd)) { >> trace_vector_setup(virq, true, 0); >> apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu); >> + if (apic->update_vector) >> + apic->update_vector(apicd->cpu, apicd->vector, true); >> } else { >> /* Release the vector */ >> apicd->can_reserve = true; >> irqd_set_can_reserve(irqd); >> clear_irq_vector(irqd); >> + if (apic->update_vector) >> + apic->update_vector(apicd->cpu, apicd->vector, false); >> realloc = true; > > This is as incomplete as it gets. None of the other code paths which > invoke clear_irq_vector() nor those which invoke free_moved_vector() are > mopping up the leftovers in the backing page. > > And no, you don't sprinkle this nonsense all over the call sites. There > is only a very limited number of functions which are involed in setting > up and tearing down a vector. Doing this at the call sites is a > guarantee for missing out as you demonstrated. > This is the part where I was looking for guidance. As ALLOWED_IRR (which tells if Hypervisor is allowed to inject a vector to guest vCPU) is per CPU, intent was to call it at places where vector's CPU affinity changes. I surely have missed cleaning up ALLOWED_IRR on previously affined CPU. I will follow your suggestion to do it during setup/teardown of vector (need to figure out those functions) and configure it for all CPUs in those functions. >> +#define VEC_POS(v) ((v) & (32 - 1)) >> +#define REG_POS(v) (((v) >> 5) << 4) > > This is unreadable, undocumented and incomprehensible garbage. > I will update it. >> static DEFINE_PER_CPU(void *, apic_backing_page); >> >> struct apic_id_node { >> @@ -192,6 +195,22 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in >> __send_IPI_mask(mask, vector, APIC_DEST_ALLBUT); >> } >> >> +static void x2apic_savic_update_vector(unsigned int cpu, unsigned int vector, bool set) >> +{ >> + void *backing_page; >> + unsigned long *reg; >> + int reg_off; >> + >> + backing_page = per_cpu(apic_backing_page, cpu); >> + reg_off = SAVIC_ALLOWED_IRR_OFFSET + REG_POS(vector); >> + reg = (unsigned long *)((char *)backing_page + reg_off); >> + >> + if (set) >> + test_and_set_bit(VEC_POS(vector), reg); >> + else >> + test_and_clear_bit(VEC_POS(vector), reg); >> +} > > What's the test_and_ for if you ignore the return value anyway? > To not set it again if it already set. I will switch to set_bit/clear_bit() as test_and_ is not necessary. > Als I have no idea what SAVIC_ALLOWED_IRR_OFFSET means. Whether it's > something from the datashit or a made up thing does not matter. It's > patently non-informative. > Ok, I had tried to give some details in the cover letter. These APIC regs are at offset APIC_IRR(n) + 4 and are used by guest to configure the interrupt vectors which can be injected by Hypervisor to Guest. > Again: > > struct apic_page { > union { > u32 regs[NR_APIC_REGS]; > u8 bytes[PAGE_SIZE]; > }; > }; > > struct apic_page *ap = this_cpu_ptr(apic_page); > unsigned long *sirr; > > /* > * apic_page.regs[SAVIC_ALLOWED_IRR_OFFSET...] is an array of > * consecutive 32-bit registers, which represents a vector bitmap. > */ > sirr = (unsigned long *) &ap->regs[SAVIC_ALLOWED_IRR_OFFSET]; > if (set) > set_bit(sirr, vector); > else > clear_bit(sirr, vector); > > See how code suddenly becomes self explaining, obvious and > comprehensible? > Yes, thank you! - Neeraj > Thanks, > > tglx