Alexey Kardashevskiy wrote: > > > On 4/6/25 08:26, Dan Williams wrote: > > Alexey Kardashevskiy wrote: > > [..] > >>>>> +static int pci_tsm_accept(struct pci_dev *pdev) > >>>>> +{ > >>>>> + struct pci_tsm_pf0 *tsm = to_pci_tsm_pf0(pdev->tsm); > >>>>> + int rc; > >>>>> + > >>>>> + struct mutex *lock __free(tsm_ops_unlock) = tsm_ops_lock(tsm); > >>>> > >>>> Add an empty line. > >>> > >>> I think we, as a community, are still figuring out the coding-style > >>> around scope-based cleanup declarations, but I would argue, no empty > >>> line required after mid-function variable declarations. Now, in this > >>> case it is arguably not "mid-function", but all the other occurrences of > >>> tsm_ops_lock() are checking the result on the immediate next line. > >> > >> Do not really care as much :) > > > > Hey, what's kernel development without little side-arguments about > > whitespace? Will leave it alone for now. > > :) > > >>>>> + if (!lock) > >>>>> + return -EINTR; > >>>>> + > >>>>> + if (tsm->state < PCI_TSM_CONNECT) > >>>>> + return -ENXIO; > >>>>> + if (tsm->state >= PCI_TSM_ACCEPT) > >>>>> + return 0; > >>>>> + > >>>>> + /* > >>>>> + * "Accept" transitions a device to the run state, it is only suitable > >>>>> + * to make that transition from a known DMA-idle (no active mappings) > >>>>> + * state. The "driver detached" state is a coarse way to assert that > >>>>> + * requirement. > >>>> > >>>> And then the userspace will modprobe the driver, which will enable BME > >>>> and MSE which in turn will render the ERROR state, what is the plan > >>>> here? > >>> > >>> Right, so the notifier proposal [1] gave me pause because of perceived > >>> complexity and my general reluctance to rely on the magic of notifiers > >>> when an explicit sequence is easier to read/maintain. The proposal is > >>> that drivers switch to TDISP aware setup helpers that understand that > >>> BME and MSE were handled at LOCK. Becauase it is not just > >>> pci_enable_device() / pci_set_master() awareness that is needed but also > >>> pci_disable_device() / pci_clear_master() flows that need consideration > >>> for avoiding/handling the TDISP ERROR state. > >>> > >>> I.e. support for TDISP-unaware drivers is not a goal. > >> > >> So your plan it to modify driver to switch to the secure mode on the > >> go? (not arguing, just asking for now) > > > > Effectively, yes. In the non-TDISP case the driver handles the MSE+BME > > transition, in the TDISP case the driver also effectively handles the > > same as BME+MSE are superseded by the LOCKED state. > > > > So TVM userspace is responsible for marking the device "accepted" and the > > driver checks that state before enabling the device (LOCKED -> RUN). > > > > This also allows for kernel debug overrides of the acceptance policy, > > because, in the end, the Linux kernel trusts drivers. If the TVM owner > > loads a driver that ignores the "accepted" bit, that is the owner's > > prerogative. If the TVM owner does not trust a driver there are multiple > > knobs under the TVM's control to mitigate that mistrust. > > > >> The alternative is - let TSM do the attestation and acceptance and > >> then "modprobe tdispawaredriver tdisp=on" and change the driver to > >> assume that BME and MSE are already enabled. > > > > My heartburn with that is that there is an indefinite amount of time > > whereby a device is MSE + BME active without any driver to deal with the > > consequences. > > (out of curiosity) AMD can block DMA until the guest decides it is ready and enabled IOMMU for the device, cannot TDX do the same? Circling back to this as I go to refresh this series... I am not worried about the host protecting itself, I am worried about diverging from the model where the device is expected to not be active while a driver is detached. If the device is enabled to, for example, trigger platform errors (platform self protection from unauthorized DMA / interrupts), that is a difference from the non-CC case. So this is more about symmetry of behavior with the typical non-CC case where PCI devices are not issuing or accepting bus cycles while a driver is detached. > And what is the consequence of MSE being enabled? I think the problem is reversed with MSE, you *want* errors to happen when the driver is detached which is what happens in the non-CC case. > It is in the guest's best interest to avoid touching MMIO before > things are set up. p2p DMA? Yes. > > For example, what if the device needs some form of reset / > > re-initialization to quiet an engine or silence an interrupt that > > immediately starts firing upon the LOCKED -> RUN transition. > > The OS will have to ignore such interrupts, what is a problem with it? I don't know, but I do know that it would a bug to fix in the non-CC case. I.e. it is a problem that need not be introduced by maintaining "driver is in control of MSE+BME activation" (LOCKED->RUN transition). > > Userspace > > is not in a good position to make judgements about the state of the > > device outside of the Interface Report. > > > >>> There are still details to work out like supporting drivers that want to > >>> stay loaded over the UNLOCKED->LOCKED->RUN transitions, and whether the > >>> "accept" UAPI triggers entry into "RUN" or still requires a driver to > >>> perform that. > > > > Yes, I now think entry into "RUN" needs to be a driver triggered event > > to maintain parity with the safety of the non-TDISP case. > > > > [..] > >>>>> @@ -135,6 +141,8 @@ struct pci_tsm_guest_req_info { > >>>>> * @bind: establish a secure binding with the TVM > >>>>> * @unbind: teardown the secure binding > >>>>> * @guest_req: handle the vendor specific requests from TVM when bound > >>>>> + * @accept: TVM-only operation to confirm that system policy allows > >>>>> + * device to access private memory and be mapped with private mmio. > >>>>> * > >>>>> * @probe and @remove run in pci_tsm_rwsem held for write context. All > >>>>> * other ops run under the @pdev->tsm->lock mutex and pci_tsm_rwsem held > >>>>> @@ -150,6 +158,7 @@ struct pci_tsm_ops { > >>>>> void (*unbind)(struct pci_tdi *tdi); > >>>>> int (*guest_req)(struct pci_dev *pdev, > >>>>> struct pci_tsm_guest_req_info *info); > >>>>> + int (*accept)(struct pci_dev *pdev); > >>>> > >>>> > >>>> When I posted my v1, I got several comments to not put host and guest > >>>> callbacks together which makes sense (as only really "connect" and > >>>> "status" are shared, and I am not sure I like dual use of "connect") > >>>> and so did I in v2 and yet you are pushing for one struct for all? > >>> > >>> Frankly, I missed that feedback and was focused on how to simply extend > >>> PCI to understand TSM semantics. > >> > >> That was literally you (and I think someone else mentioned it too) ;) > >> > >> https://lore.kernel.org/all/66d7a10a4d621_3975294ac@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/ > > > > Ugh, yes, it seems that joke: "debugging is a murder mystery where you > > find out you were the killer the whole time." can also be true for patch > > review.>>> "Lets not mix HV and VM hooks in the same ops without good reason" and > >> I do not see a good reason here yet. > >> > >> More to the point, the host and guest have very little in common to > >> have one ops struct for both and then deal with questions "do we > >> execute the code related to PF0 in the guest", etc. > > > > Now that is a problem independent of the ops unification question. The > > 'struct pci_tsm_pf0' data-type should not be used for guest devices. I > > will rework that to be a separate data-structure, but still keep > > 'pci_tsm_ops' unified since the signatures are identical. > > > >> My life definitely got easier with 2 separate structures and my split > >> to virt/coco/...(tsm-host.c|tsm-guest.c|tsm.c) + pci/tsm.c. > > > > Here is the reason my thinking evolved from that comment. A primary goal > > of drivers/pci/tsm.c is to give one "Device Security" lifetime model to > > the PCI core. That means TSM driver discovery (host or guest) lights up > > TEE I/O capabilities in the PCI topology. That supports "pci_tsm_ops + > > mode flag" vs separate registration mechanisms for different ops. > > > > I also am not perceiving the need for guest-specific ops beyond > > ->accept(), as part of what drove my reaction to that RFC proposal was > > the quantity of proposed ops. > > > But this is all the guest will ever need, why allow possibility of > (not) dealing with IDE/DOE in the guest? We will end up with > "host-connect" and "guest-connect" when talking about this, having 2 > types of bind (VFIO bind and TDI bind) is already confusing people > whom I tell about this TSM business. And a global pointer, why... :( > > "tvm_mode == !!tsm_ops->accept" - this kind of knob should really be > compile-time imho. > > Is it going to be one TSM driver for TDX host and guest, sharing > measurable amount of code? I am definitely missing a bigger picture > here. Thanks, FWIW between this and Aneesh's comments about the different ops needed for guest vs host side I am going to separate them in the next version.