Jason Gunthorpe wrote: > On Tue, Aug 05, 2025 at 10:37:01AM +0530, Aneesh Kumar K.V wrote: > > > To me it is an unfortunate PCI specification wrinkle that writing to the > > > command register drops the device from RUN to ERROR. So you can LOCK > > > without setting BME, but then no DMA. > > > > This is only w.r.t clearing BME isn't ? > > > > According to section 11.2.6 DSM Tracking and Handling of Locked TDI Configurations > > > > Clearing any of the following bits causes the TDI hosted > > by the Function to transition to ERROR: > > > > • Memory Space Enable > > • Bus Master Enable > > Oh that's nice, yeah! That is useful, but an unmodified PCI driver is going to make separate calls to pci_set_master() and pci_enable_device() so it should still be the case that those need to be trapped out of the concern that writing back zero for a read-modify-write also trips the error state on some device that fails the Robustness Principle. I guess we could wait to solve that problem until the encountering the first device that trips ERROR when writing zero to an already zeroed bit. > > Which implies the flow described in the cover-letter where driver enable the BME works? > > However clearing BME may be problematic? I did have a FIXME!!/comment in [1] > > > > vfio_pci_core_close_device(): > > > > #if 0 > > /* > > * destroy vdevice which involves tsm unbind before we disable pci disable > > * A MSE/BME clear will transition the device to error state. > > */ > > if (core_vdev->iommufd_device) > > iommufd_device_tombstone_vdevice(core_vdev->iommufd_device); > > #endif > > > > vfio_pci_core_disable(vdev); > > Here is where I feel the VMM should be trapping this and NOPing it, or > failing that the guest PCI Core should NOP it. At this point (vfio shutdown path) the VMM is committed stopping guest operations with the device. So ok not to not NOP in this specific path, right? > With the ideal version being the TSM and VMM would be able to block > the iommu as a functional stand in for BME. The TSM block for BME is the LOCKED or ERROR state. That would be in conflict with the proposal that the device stays in the RUN state on guest driver unbind. I feel like either the device stays in RUN state and BME leaks, or the device is returned to LOCKED on driver unbind. Otherwise a functional stand-in for BME that also keeps the device in RUN state feels like a TSM feature request for a "RUN but BLOCKED" state.