Re: [PATCH v2 00/16] Fix incorrect iommu_groups with PCIe ACS


 





On 8/6/2025 10:41 AM, Baolu Lu wrote:
On 8/6/25 10:22, Ethan Zhao wrote:
On 8/5/2025 10:43 PM, Jason Gunthorpe wrote:
On Tue, Aug 05, 2025 at 10:41:03PM +0800, Ethan Zhao wrote:

My understanding is that the iommu code has no logic yet to handle the
egress control vector configuration case,

We don't support it at all. If some FW leaves it configured then it
will work at the PCI level but Linux has no awareness of what it is
doing.

Arguably Linux should disable it on boot, but we don't..
Linux tools like setpci can access raw PCIe configuration data,
including the ACS control bits. That is troublesome.
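To make the setpci point concrete, here is a rough sketch of how a userspace tool can locate the ACS extended capability in a raw config-space dump and read the ACS Control register. The register offsets and IDs follow the PCIe extended capability layout; the function names (find_ext_cap, read_acs_ctrl) are only illustrative and are not from the patch series.

```python
import struct

PCI_EXT_CAP_START = 0x100    # extended capabilities begin at this offset
PCI_EXT_CAP_ID_ACS = 0x000D  # Access Control Services extended capability ID
PCI_ACS_CTRL = 0x06          # ACS Control register offset within the capability
PCI_ACS_EC = 0x20            # P2P Egress Control Enable bit in ACS Control


def find_ext_cap(cfg, cap_id):
    """Return the offset of an extended capability in cfg, or 0 if absent."""
    pos = PCI_EXT_CAP_START
    seen = set()  # guards against a looping capability list
    while pos >= PCI_EXT_CAP_START and pos + 4 <= len(cfg) and pos not in seen:
        seen.add(pos)
        (header,) = struct.unpack_from("<I", cfg, pos)
        if header in (0, 0xFFFFFFFF):
            break
        if header & 0xFFFF == cap_id:
            return pos
        pos = (header >> 20) & 0xFFC  # next-capability pointer, dword aligned
    return 0


def read_acs_ctrl(cfg):
    """Return the 16-bit ACS Control value, or None if there is no ACS capability."""
    pos = find_ext_cap(cfg, PCI_EXT_CAP_ID_ACS)
    if not pos or pos + 8 > len(cfg):
        return None
    (ctrl,) = struct.unpack_from("<H", cfg, pos + PCI_ACS_CTRL)
    return ctrl
```

On Linux the cfg bytes could come from reading /sys/bus/pci/devices/<BDF>/config as root (unprivileged reads only return the first 64 bytes). Testing the result against PCI_ACS_EC shows whether firmware left egress control enabled.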

Any change to ACS after boot is "not supported" - iommu groups are one
time only using boot config only. If someone wants to customize ACS
they need to use the new config_acs kernel parameter.
That would leave ACS as a boot-time-only configuration. Linux never
stops tools from accessing (writing) hardware directly, even though it
could. Would it be better to have a configurable interception policy in
the kernel for such hardware access, like what a hypervisor does for
MSRs etc.?

A root user could even clear the BME or MSE bits of a device's PCIe
configuration space, even if the device is already bound to a driver and
operating normally. I don't think there's a mechanism to prevent that
PCI tools such as setpci access PCIe device configuration space via the
sysfs interface, whose default permissions give root read/write access.
That is one point where root's access could be controlled.

PCIe device configuration space is mapped into the CPU address space via
ECAM by calling ioremap to set up the CPU page table. The PTE has
permission control bits for read/write/cache etc. This is another point
of control.
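As a small illustration of the ECAM mapping, each function gets a 4 KiB window at a fixed offset from the ECAM base of its segment. The helper name ecam_offset is made up for this sketch; the base address itself comes from the platform (the ACPI MCFG table) and is not computed here.

```python
def ecam_offset(bus, dev, fn, reg=0):
    """Byte offset of a config register from the ECAM base address."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8 and 0 <= reg < 4096):
        raise ValueError("bus/dev/fn/reg out of range")
    # 4 KiB per function, 8 functions per device, 32 devices per bus
    return (bus << 20) | (dev << 15) | (fn << 12) | reg
```

Since every function occupies exactly one 4 KiB page, page-granular PTE permissions line up with per-function config space.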

Legacy PCI device configuration space is accessed via 0xCF8/0xCFC I/O
port operations, which is another point to intercept.
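For reference, a sketch of the value written to the 0xCF8 address port before the data is read or written through 0xCFC. The helper name cf8_address is hypothetical; the bit layout is the conventional PCI configuration mechanism #1 encoding, which only reaches the first 256 bytes of config space.

```python
def cf8_address(bus, dev, fn, reg):
    """CONFIG_ADDRESS value for legacy PCI config access (mechanism #1)."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8 and 0 <= reg < 256):
        raise ValueError("bus/dev/fn/reg out of range")
    # bit 31 is the enable bit; the register offset is dword aligned
    return 0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC)
```

An interception point on these ports would decode the same fields to decide which device and register an access targets. Note that the ACS capability lives in extended config space (offset 0x100 and above), which this mechanism cannot address.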

To prevent a device from doing DMA to configuration space, the IOMMU
page table PTEs could likewise be set up to control the access.

from happening, besides permission enforcement. I believe that the same
applies to the ACS control.
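To illustrate the BME/MSE point: both bits sit in the 16-bit Command register at offset 0x04 of the standard config header, so a single config write is enough to clear them. The helper names below are invented for this sketch, and it only works on an in-memory copy of config space, not on hardware.

```python
import struct

PCI_COMMAND = 0x04          # Command register offset in the config header
PCI_COMMAND_MEMORY = 0x2    # Memory Space Enable (MSE)
PCI_COMMAND_MASTER = 0x4    # Bus Master Enable (BME)


def decode_command(cfg):
    """Report the MSE and BME state from a config-space dump."""
    (cmd,) = struct.unpack_from("<H", cfg, PCI_COMMAND)
    return {
        "MSE": bool(cmd & PCI_COMMAND_MEMORY),
        "BME": bool(cmd & PCI_COMMAND_MASTER),
    }


def clear_bits(cmd, mask):
    """Return the Command value with the given bits cleared."""
    return cmd & ~mask & 0xFFFF
```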


The static groups were created according to
FW DRDB tables,

?? iommu_groups have nothing to do with FW tables.
Sorry, typo: ACPI DRHD table.

Same answer, AFAIK FW tables have no effect on iommu_groups
My understanding is that FW tables are part of the description of the
device topology and the IOMMU-device relationship. Did I really
misunderstand something?

The ACPI/DMAR table describes the platform's IOMMU topology, not the
device topology, which is described by the PCI bus. So, the firmware
table doesn't impact the iommu_group.

As I remember, the DRHD table lists the IOMMUs and the devices that
belong to them, but the kernel still needs to traverse the PCIe topology
to build the iommu_groups.
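To see the grouping the kernel actually ended up with, the result is exported under /sys/kernel/iommu_groups. A minimal sketch that lists it follows; the function name list_iommu_groups and the root parameter are only for illustration (root is overridable so the code can be tried on a fake tree).

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device names in it."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # no IOMMU enabled, or not a Linux sysfs tree
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups
```

Comparing this output before and after changing the ACS boot configuration shows how the groups are affected.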


Thanks,
Ethan
Thanks,
baolu




