Jonathan Cameron wrote: > On Thu, 17 Jul 2025 11:33:52 -0700 > Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > > > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > > Environment (TEE) Device Interface Security Protocol (TDISP). This > > protocol definition builds upon Component Measurement and Authentication > > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > > assigning devices (PCI physical or virtual function) to a confidential > > VM such that the assigned device is enabled to access guest private > > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > > COVE, or ARM CCA. > > > > The "TSM" (TEE Security Manager) is a concept in the TDISP specification > > of an agent that mediates between a "DSM" (Device Security Manager) and > > system software in both a VMM and a confidential VM. A VMM uses TSM ABIs > > to setup link security and assign devices. A confidential VM uses TSM > > ABIs to transition an assigned device into the TDISP "RUN" state and > > validate its configuration. From a Linux perspective the TSM abstracts > > many of the details of TDISP, IDE, and CMA. Some of those details leak > > through at times, but for the most part TDISP is an internal > > implementation detail of the TSM. > > > > CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory > > to pci-sysfs. Consider that the TSM driver may itself be a PCI driver. > > Userspace can watch for the arrival of a "TSM" device, > > /sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has > > initialized TSM services. > > > > The operations that can be executed against a PCI device are split into > > 2 mutually exclusive operation sets, "Link" and "Security" (struct > > pci_tsm_{link,security}_ops). The "Link" operations manage physical link > > security properties and communication with the device's Device Security > > Manager firmware. These are the host side operations in TDISP. The > > "Security" operations coordinate the security state of the assigned > > virtual device (TDI). These are the guest side operations in TDISP. Only > > link management operations are defined at this stage and placeholders > > provided for the security operations. > > > > The locking allows for multiple devices to be executing commands > > simultaneously, one outstanding command per-device and an rwsem > > synchronizes the implementation relative to TSM > > registration/unregistration events. > > > > Thanks to Wu Hao for his work on an early draft of this support. > > > > Cc: Lukas Wunner <lukas@xxxxxxxxx> > > Cc: Samuel Ortiz <sameo@xxxxxxxxxxxx> > > Cc: Alexey Kardashevskiy <aik@xxxxxxx> > > Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > Co-developed-by: Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx> > > Signed-off-by: Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx> > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> > Various things inline. > > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > new file mode 100644 > > index 000000000000..0784cc436dd3 > > --- /dev/null > > +++ b/drivers/pci/tsm.c > > @@ -0,0 +1,554 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * TEE Security Manager for the TEE Device Interface Security Protocol > > + * (TDISP, PCIe r6.1 sec 11) > > + * > > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > > + */ > > > +static void tsm_remove(struct pci_tsm *tsm) > > +{ > > + struct pci_dev *pdev; > > + > > + if (!tsm) > > You protect against this in the DEFINE_FREE() so probably safe > to assume it is always set if we get here. It is safe, but I would rather not require reading other code to understand the expectation that some callers may unconditionally call this routine. > > + return; > > + > > + pdev = tsm->pdev; > > + tsm->ops->remove(tsm); > > + pdev->tsm = NULL; > > +} > > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > > + > > +static int call_cb_put(struct pci_dev *pdev, void *data, > > Is this combination worth while? I don't like the 'and' aspect of it > and it only saves a few lines... > > vs > if (pdev) { > rc = cb(pdev, data); > pci_dev_put(pdev); > if (rc) > return; > } I think it is worth it, but an even better option is to just let scope-based cleanup handle it by moving the variable inside the loop declaration. > > > + int (*cb)(struct pci_dev *pdev, void *data)) > > +{ > > + int rc; > > + > > + if (!pdev) > > + return 0; > > + rc = cb(pdev, data); > > + pci_dev_put(pdev); > > + return rc; > > +} > > + > > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > > + int (*cb)(struct pci_dev *pdev, void *data), > > + void *data) > > +{ > > + struct pci_dev *fn; > > + int i; > > + > > + /* walk virtual functions */ > > + for (i = 0; i < pci_num_vf(pdev); i++) { > > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > > + pci_iov_virtfn_bus(pdev, i), > > + pci_iov_virtfn_devfn(pdev, i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk subordinate physical functions */ > > + for (i = 1; i < 8; i++) { > > + fn = pci_get_slot(pdev->bus, > > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk downstream devices */ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > > spaces rather than tabs... Fixed. > > + return; > > + > > + if (!is_dsm(pdev)) > > + return; > > + > > + pci_walk_bus(pdev->subordinate, cb, data); > > +} > > + > > +static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev, > > + int (*cb)(struct pci_dev *pdev, > > + void *data), > > + void *data) > > +{ > > + struct pci_dev *fn; > > + int i; > > + > > + /* reverse walk virtual functions */ > > + for (i = pci_num_vf(pdev) - 1; i >= 0; i--) { > > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > > + pci_iov_virtfn_bus(pdev, i), > > + pci_iov_virtfn_devfn(pdev, i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > While it probably doesn't matter can we make this strict reverse by doing > the physical functions first? I prefer not to think about whether it matters. Actually, me too, and in that case I also want the downstream devices to be done in strict reverse order. So, I do not have a use in mind where this matters, but changed the order to physical->virtual->downstream and downstream->virtual->physical for the reverse case. Oh, this is missing the potential case of SR-IOV on multiple physical functions of the device, so added that too. > > > + /* reverse walk subordinate physical functions */ > > + for (i = 7; i >= 1; i--) { > > + fn = pci_get_slot(pdev->bus, > > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* reverse walk downstream devices */ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > > + return; > > + > > + if (!is_dsm(pdev)) > > + return; > > + > > + pci_walk_bus_reverse(pdev->subordinate, cb, data); > > Likewise, can we do this before the rest. Fixed. > > > +} > > > +/* > > + * Find the PCI Device instance that serves as the Device Security > > + * Manger (DSM) for @pdev. Note that no additional reference is held for > > + * the resulting device because @pdev always has a longer registered > > + * lifetime than its DSM by virtue of being a child of or identical to > > + * its DSM. > > + */ > > +static struct pci_dev *find_dsm_dev(struct pci_dev *pdev) > > +{ > > + struct pci_dev *uport_pf0; > > + > > + if (is_pci_tsm_pf0(pdev)) > > + return pdev; > > + > > + struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev); > > + if (!pf0) > > + return NULL; > > + > > + if (is_dsm(pf0)) > > + return pf0; > > > Unusual for a find command to not hold the device reference on the device > it returns. Maybe just call that out in the comment. It is, "Note that no additional reference..." > > + > > + /* > > + * For cases where a switch may be hosting TDISP services on > > + * behalf of downstream devices, check the first usptream port > > + * relative to this endpoint. > > + */ > Odd alignment. Space rather than tab. Hmm, clang-format does not fixup tabs vs spaces in block comments, fixed. > > > > + if (!pdev->dev.parent || !pdev->dev.parent->parent) > > + return NULL; > > + > > + uport_pf0 = to_pci_dev(pdev->dev.parent->parent); > > + if (is_dsm(uport_pf0)) > > + return uport_pf0; > > + return NULL; > > +} > > > > +/** > > + * pci_tsm_pf0_constructor() - common 'struct pci_tsm_pf0' initialization > > + * @pdev: Physical Function 0 PCI device (as indicated by is_pci_tsm_pf0()) > > + * @tsm: context to initialize > > ops missing. Run kernel-doc or do W=1 build to catch these. TIL I do not need need to create a Documentation file to reference this file to get kdoc build warnings. Fixed. > > > + */ > > +int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm, > > + const struct pci_tsm_ops *ops) > > +{ > > + struct tsm_dev *tsm_dev = ops->owner; > > + > > + mutex_init(&tsm->lock); > Might as well do devm_mutex_init() Hmm, no, this is running out of the driver bind lifetime of @pdev. > > + tsm->doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_PCI_SIG, > > + PCI_DOE_PROTO_CMA); > > + if (!tsm->doe_mb) { > > + pci_warn(pdev, "TSM init failure, no CMA mailbox\n"); > > + return -ENODEV; > > + } > > + > > + if (tsm_pci_group(tsm_dev)) > > + sysfs_merge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev)); > > + > > + return pci_tsm_constructor(pdev, &tsm->tsm, ops); > > +} > > +EXPORT_SYMBOL_GPL(pci_tsm_pf0_constructor); > > > diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c > > index 1f53b9049e2d..093824dc68dd 100644 > > --- a/drivers/virt/coco/tsm-core.c > > +++ b/drivers/virt/coco/tsm-core.c > > > +/* > > + * Caller responsible for ensuring it does not race tsm_dev > > + * unregistration. > Wrap is a bit early. unregistration fits on the line above. True, fixed up editor settings to automate this. > > + */ > > +struct tsm_dev *find_tsm_dev(int id) > > +{ > > + guard(rcu)(); > > + return idr_find(&tsm_idr, id); > > +} > > > @@ -44,6 +76,29 @@ static struct tsm_dev *alloc_tsm_dev(struct device *parent, > > return no_free_ptr(tsm_dev); > > } > > > > +static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev, > > + struct pci_tsm_ops *pci_ops) > > +{ > > + int rc; > > + > > + if (!pci_ops) > > + return tsm_dev; > > + > > + pci_ops->owner = tsm_dev; > > + tsm_dev->pci_ops = pci_ops; > > + rc = pci_tsm_register(tsm_dev); > > + if (rc) { > > + dev_err(tsm_dev->dev.parent, > > + "PCI/TSM registration failure: %d\n", rc); > > + device_unregister(&tsm_dev->dev); > > As below. I'm fairly sure this device_unregister is nothing to do with > what this function is doing, so having it buried in here is less easy > to follow than pushing it up a layer. I prefer a short function with an early exit and no scope based unwind for this. > > + return ERR_PTR(rc); > > + } > > + > > + /* Notify TSM userspace that PCI/TSM operations are now possible */ > > + kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE); > > + return tsm_dev; > > +} > > + > > static void put_tsm_dev(struct tsm_dev *tsm_dev) > > { > > if (!IS_ERR_OR_NULL(tsm_dev)) > > @@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *, > > if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T)) > > > > struct tsm_dev *tsm_register(struct device *parent, > > - const struct attribute_group **groups) > > + const struct attribute_group **groups, > > + struct pci_tsm_ops *pci_ops) > > { > > struct tsm_dev *tsm_dev __free(put_tsm_dev) = > > alloc_tsm_dev(parent, groups); > > @@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent, > > if (rc) > > return ERR_PTR(rc); > > > > - return no_free_ptr(tsm_dev); > > + return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops); > > Having a function call that either succeeds or cleans up something it > never did on error is odd. devm_add_action_or_reset() is the same pattern. Do the action, or otherwise take care of cleaning up. The action in this case is pci_tsm_register() and the reset is cleaning up the device_add(). > The or_reset hints at that oddity but to me is not enough. If you want > to use __free magic in here maybe hand off the tsm_dev on succesful > device registration. > > struct tsm_dev *registered_tsm_dev __free(unregister_tsm_dev) = > no_free_ptr(tsm_dev); That does not look like an improvement to me. The moment device_add() succeeds the cleanup shifts from put_device() to device_unregister(). I considered wrapping device_add(), but I think it is awkard to redeclare the same variable again vs being able to walk all instances of @tsm_dev in the function. [..] > > + * struct pci_tsm_ops - manage confidential links and security state > > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > > + * Provide a secure session transport for TDISP state management > > + * (typically bare metal physical function operations). > > + * @sec_ops: Lock, unlock, and interrogate the security state of the > > + * function via the platform TSM (typically virtual function > > + * operations). > > + * @owner: Back reference to the TSM device that owns this instance. > > + * > > + * This operations are mutually exclusive either a tsm_dev instance > > + * manages phyiscal link properties or it manages function security > > + * states like TDISP lock/unlock. > > + */ > > +struct pci_tsm_ops { > > + /* > Likewise though I'm not sure if kernel-doc deals with struct groups. It does not. > > +/** > > + * struct pci_tsm_pf0 - Physical Function 0 TDISP link context > > + * @tsm: generic core "tsm" context > > + * @lock: protect @state vs pci_tsm_ops invocation > > What is @state referring to? @state was removed with v4 of the PCI/TSM series. The kernel-doc for 'struct pci_tsm_pf0' is now: /** * struct pci_tsm_pf0 - Physical Function 0 TDISP link context * @tsm: generic core "tsm" context * @lock: mutual exclustion for pci_tsm_ops invocation * @doe_mb: PCIe Data Object Exchange mailbox */ struct pci_tsm_pf0 { struct pci_tsm tsm; struct mutex lock; struct pci_doe_mb *doe_mb; };