> -----Original Message-----
> From: Tian, Kevin <kevin.tian@xxxxxxxxx>
> Sent: Wednesday, March 12, 2025 2:53 AM
> To: Wathsala Wathawana Vithanage <wathsala.vithanage@xxxxxxx>; Alex
> Williamson <alex.williamson@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; nd
> <nd@xxxxxxx>; Philipp Stanner <pstanner@xxxxxxxxxx>; Yunxiang Li
> <Yunxiang.Li@xxxxxxx>; Dr. David Alan Gilbert <linux@xxxxxxxxxxx>; Ankit
> Agrawal <ankita@xxxxxxxxxx>; open list:VFIO DRIVER <kvm@xxxxxxxxxxxxxxx>;
> Dhruv Tripathi <Dhruv.Tripathi@xxxxxxx>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@xxxxxxx>; Jeremy Linton <Jeremy.Linton@xxxxxxx>
> Subject: RE: [RFC PATCH] vfio/pci: add PCIe TPH to device feature ioctl
>
> > From: Wathsala Wathawana Vithanage <wathsala.vithanage@xxxxxxx>
> > Sent: Wednesday, March 5, 2025 2:11 PM
> >
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Tuesday, March 4, 2025 7:24 PM
> > > To: Wathsala Wathawana Vithanage <wathsala.vithanage@xxxxxxx>
> > > Cc: Jason Gunthorpe <jgg@xxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; nd
> > > <nd@xxxxxxx>; Kevin Tian <kevin.tian@xxxxxxxxx>; Philipp Stanner
> > > <pstanner@xxxxxxxxxx>; Yunxiang Li <Yunxiang.Li@xxxxxxx>; Dr. David Alan
> > > Gilbert <linux@xxxxxxxxxxx>; Ankit Agrawal <ankita@xxxxxxxxxx>; open list:VFIO
> > > DRIVER <kvm@xxxxxxxxxxxxxxx>
> > > Subject: Re: [RFC PATCH] vfio/pci: add PCIe TPH to device feature ioctl
> > >
> > > On Tue, 4 Mar 2025 22:38:16 +0000
> > > Wathsala Wathawana Vithanage <wathsala.vithanage@xxxxxxx> wrote:
> > >
> > > > > > Linux v6.13 introduced the PCIe TLP Processing Hints (TPH) feature for
> > > > > > direct cache injection. As described in the relevant patch set [1],
> > > > > > direct cache injection in supported hardware allows optimal platform
> > > > > > resource utilization for specific requests on the PCIe bus. This feature
> > > > > > is currently available only for kernel device drivers.
> > > > > > However, user space applications, especially those whose performance
> > > > > > is sensitive to the latency of inbound writes as seen by a CPU core,
> > > > > > may benefit from using this information (e.g., DPDK cache stashing
> > > > > > RFC [2] or an HPC application running in a VM).
> > > > > >
> > > > > > This patch enables configuring of TPH from the user space via
> > > > > > VFIO_DEVICE_FEATURE IOCTL. It provides an interface to user space
> > > > > > drivers and VMMs to enable/disable the TPH feature on PCIe devices and
> > > > > > set steering tags in MSI-X or steering-tag table entries using
> > > > > > VFIO_DEVICE_FEATURE_SET flag or read steering tags from the kernel using
> > > > > > VFIO_DEVICE_FEATURE_GET to operate in device-specific mode.
> > > > > >
> > > > > What level of protection do we expect to have here? Is it OK for
> > > > > userspace to make up any old tag value or is there some security
> > > > > concern with that?
> > > > >
> > > > Shouldn't be allowed from within a container.
> > > > A hypervisor should have its own STs and map them to platform STs for
> > > > the cores the VM is pinned to and verify any old ST is not written to the
> > > > device MSI-X, ST table or device specific locations.
> > > >
> > > And how exactly are we mediating device specific steering tags when we
> > > don't know where/how they're written to the device. An API that
> > > returns a valid ST to userspace doesn't provide any guarantees relative
> > > to what userspace later writes. MSI-X tables are also writable by
> > >
> > By not enabling TPH in device-specific mode, hypervisors can ensure that
> > setting an ST in a device-specific location (like queue contexts) will have no
> > effect. VMs should also not be allowed to enable TPH. I believe this could
> > be enforced by trapping (causing VM exits) on MSI-X/ST table writes.
>
> Probably we should not allow device-specific mode unless the user is
> capable of CAP_SYS_RAWIO?
> It allows a user to pollute caches on

Sounds plausible.

> CPUs which its processes are not affined to, hence could easily break
> SLAs which CSPs try to achieve...
>
> Interrupt vector mode sounds safer as it only needs to provide an
> enable/disable cmd to the user and it's the kernel VFIO driver
> managing the steering table, e.g. also in irq affinity handler.
>
> >
> > Having said that, regardless of this proposal or the availability of kernel
> > TPH support, a VFIO driver could enable TPH and set an arbitrary ST on the
> > MSI-X/ST table or a device-specific location on supported platforms. If the
> > driver doesn't have a list of valid STs, it can enumerate 8- or 16-bit STs and
> > measure access latencies to determine valid ones.
> >
>
> PCI capabilities are managed by the kernel VFIO driver. So w/o this
> patch no userspace driver can enable TPH to try that trick?

Yes, it's possible. It's just a matter of setting the right bits in the PCI
config space to enable TPH on the device.

Thanks

--wathsala