Otherwise, if you, say, have a TDISP capable mlx5 device and boot up
the cVM on a compromised host, the host can probably completely hack
your cVM by exploiting the mlx5 driver's total trust in the HW
interface while running in T=0 mode.

You must attest it and switch to T=1 before binding any driver if you
care about mitigating this risk.

> With the driver in control there would need to be something like a
> usermodehelper to notify userspace that the device is in the locked
> state and to go ahead and run the attestation while the driver waits*.

It doesn't make sense to require modification to all existing drivers
in Linux! The starting point must have the core code do this sequence
for every driver. Once that is working we can talk about whether other
flows are needed.

> > step 4: Load the driver again.
> > echo ${DEVICE} > /sys/bus/pci/drivers_probe
>
> TIL drivers_probe
>
> Maybe want to recommend:
>
> echo ${DEVICE} > /sys/bus/pci/drivers/${DRIVER}/bind
>
> ...to users just in case there are multiple drivers loaded for the
> device for the "shared" vs "private" case?

Generic userspace will have a hard time knowing what the driver names
are.. The drivers_probe option looks good to me as the default.

I'm not sure how generic code can handle "multiple drivers".. Most
devices will be able to work just fine in T=0 mode with bounce buffers,
so we should generally not encourage people to make completely
different drivers for T=0/T=1 mode.

I think what is needed is some way for userspace to trigger the
"locking configuration" you mentioned. That may need a special driver,
but ONLY if userspace is sequencing the device to T=1 mode. Not sure
how to make that generic, but I think so long as userspace is
explicitly controlling driver binding we can punt on that solution to
the userspace project :)

The real nastiness is RAS - what do you do when the device falls out
of RUN? The kernel driver should pretty much explode.
But lots of people would like the kernel driver to stay alive while we
somehow FLR, re-attest and "resume" the kernel driver without allowing
any T=0 risks. For instance you can keep your netdev and just see a
lot of lost packets while the driver thrashes.

But I think we can start with the idea that such RAS failures have to
reload the driver too, and work on improvements from there.
Realistically few drivers have the sort of RAS features to consume
this anyhow, and maybe we introduce some "enhanced" driver mode to opt
into down the road.

Jason
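
For reference, the userspace-driven sequence discussed above (unbind,
attest and move to T=1, then reprobe) could be sketched roughly as
below. This is only an illustration under stated assumptions:
attest_and_lock is a hypothetical placeholder, since no interface for
TDISP attestation or locking is defined in this thread, and SYSFS and
bind_after_attest are names made up for the sketch. Only the
driver/unbind and drivers_probe sysfs files are existing kernel
interfaces.

```shell
#!/bin/sh
# Sketch only. SYSFS can be overridden to dry-run against a scratch directory.
SYSFS="${SYSFS:-/sys}"

# HYPOTHETICAL placeholder: a real implementation would lock the device,
# verify its attestation evidence, and move it to T=1. No such interface
# is defined here, so the default refuses and the driver stays unbound.
attest_and_lock() {
    echo "attest_and_lock: no attestation flow defined for $1" >&2
    return 1
}

# Unbind whatever driver holds the device, attest it, and only then ask
# the PCI core to probe for a driver again.
bind_after_attest() {
    dev="$1"

    # Detach the current driver, if any, so nothing touches the device
    # while it is still in T=0.
    if [ -e "$SYSFS/bus/pci/devices/$dev/driver/unbind" ]; then
        echo "$dev" > "$SYSFS/bus/pci/devices/$dev/driver/unbind"
    fi

    # Do not bind anything unless attestation succeeded.
    attest_and_lock "$dev" || return 1

    # Let the core pick the matching driver; no driver name needed.
    echo "$dev" > "$SYSFS/bus/pci/drivers_probe"
}
```

Using drivers_probe rather than a per-driver bind file keeps the
script independent of driver names, which is the reason it looks like
the better default above.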