On 21/5/25 19:28, Xu Yilun wrote:
On Tue, May 20, 2025 at 01:12:12PM -0700, Dan Williams wrote:
Alexey Kardashevskiy wrote:
[..]
+int pci_tsm_bind(struct pci_dev *pdev, struct kvm *kvm, u64 tdi_id)
+{
+ struct pci_tdi *tdi;
+
+ if (!kvm)
+ return -EINVAL;
Does not belong here, if the caller failed to get the kvm pointer from
an fd, then that caller should handle it.
Sure.
[..]
+static int __pci_tsm_unbind(struct pci_dev *pdev)
+{
+ struct pci_tdi *tdi;
+
+ lockdep_assert_held(&pci_tsm_rwsem);
+
+ if (!pdev->tsm)
+ return -EINVAL;
Nothing checks for these errors.
True this function signature should probably drop the error code
altogether and become a void helper. I.e. it should be impossible for a
bound device to not have a reference, or not be in the right state.
+
+ struct pci_dev *pf0_dev __free(pci_dev_put) = tsm_pf0_get(pdev);
+ if (!pf0_dev)
+ return -EINVAL;
+
+ struct mutex *lock __free(tdi_ops_unlock) = tdi_ops_lock(pf0_dev);
+ if (IS_ERR(lock))
+ return PTR_ERR(lock);
+
+ tdi = pdev->tsm->tdi;
+ if (!tdi)
+ return 0;
+
+ tsm_ops->unbind(tdi);
+ pdev->tsm->tdi = NULL;
+
+ return 0;
+}
+
+int pci_tsm_unbind(struct pci_dev *pdev)
+{
+ struct rw_semaphore *lock __free(tsm_read_unlock) = tsm_read_lock();
+ if (!lock)
+ return -EINTR;
+
+ return __pci_tsm_unbind(pdev);
+}
+EXPORT_SYMBOL_GPL(pci_tsm_unbind);
+
+/**
+ * pci_tsm_guest_req - VFIO/IOMMUFD helper to handle guest requests
+ * @pdev: @pdev representing a bound tdi
+ * @info: envelope for the request
+ *
+ * Expected flow is guest low-level TSM driver initiates a guest request
+ * like "transition TDISP state to RUN", "fetch report" via a
+ * technology specific guest-host-interface and KVM exit reason. KVM
+ * posts to userspace (e.g. QEMU) that holds the host-to-guest RID
+ * mapping.
+ */
+int pci_tsm_guest_req(struct pci_dev *pdev, struct pci_tsm_guest_req_info *info)
+{
+ struct pci_tdi *tdi;
+ int rc;
+
+ lockdep_assert_held_read(&pci_tsm_rwsem);
+
+ if (!pdev->tsm)
+ return -ENODEV;
+
+ struct pci_dev *pf0_dev __free(pci_dev_put) = tsm_pf0_get(pdev);
+ if (!pf0_dev)
+ return -EINVAL;
+
+ struct mutex *lock __free(tdi_ops_unlock) = tdi_ops_lock(pf0_dev);
+ if (IS_ERR(lock))
+ return -ENODEV;
+
+ tdi = pdev->tsm->tdi;
+ if (!tdi)
+ return -ENODEV;
+
+ rc = tsm_ops->guest_req(pdev, info);
+ if (rc)
+ return -EIO;
return rc.
Agree.
[..]
@@ -86,12 +101,40 @@ static inline bool is_pci_tsm_pf0(struct pci_dev *pdev)
return PCI_FUNC(pdev->devfn) == 0;
}
+enum pci_tsm_guest_req_type {
+ PCI_TSM_GUEST_REQ_TDXC,
Will Intel ever need more types here?
I doubt it as this is routing to a TDX vs TIO vs ... blob handler. It is
unfortunate that we need this indirection (i.e. missing
standardization), but it is in the same line-of-thought as
configfs-tsm-report providing a transport along with a
technology-specific "provider" identifier for parsing the blob.
+ * struct pci_tsm_guest_req_info - parameter for pci_tsm_ops.guest_req()
+ * @type: identify the format of the following blobs
+ * @type_info: extra input/output info, e.g. firmware error code
Call it "fw_ret"?
Sure.
This field is intended for out-of-blob values, like fw_ret. But fw_ret
is specified in GHCB and is vendor specific. Other vendors may also
have different values of this kind.
So I intend to gather these out-of-blob values in type_info, like:
enum pci_tsm_guest_req_type {
PCI_TSM_GUEST_REQ_TDXC,
PCI_TSM_GUEST_REQ_SEV_SNP,
};
The pci_tsm_ops hooks already know what they are - SEV or TDX.
/* SEV SNP guest request type info */
struct pci_tsm_guest_req_sev_snp {
s32 fw_err;
};
Since IOMMUFD has the userspace entry, maybe these definitions should be
moved to include/uapi/linux/iommufd.h.
In pci-tsm.h, just define:
struct pci_tsm_guest_req_info {
u32 type;
void __user *type_info;
size_t type_info_len;
void __user *req;
size_t req_len;
void __user *resp;
size_t resp_len;
};
BTW: TDX Connect has no out-of-blob value, so should set type_info_len = 0
No TDX Connect fw error handling on the host OS whatsoever, always return to the guest? oookay, do not use it but the fw response is still a generic thing. Whatever is specific to AMD can be packed into req/resp and QEMU/guest will handle those.
[..]
@@ -102,6 +145,11 @@ struct pci_tsm_ops {
void (*remove)(struct pci_tsm *tsm);
int (*connect)(struct pci_dev *pdev);
void (*disconnect)(struct pci_dev *pdev);
+ struct pci_tdi *(*bind)(struct pci_dev *pdev, struct pci_dev *pf0_dev,
+ struct kvm *kvm, u64 tdi_id);
p0_dev is not needed here, we should be able to calculate it from pdev.
The pci_tsm core needs to hold the lock before calling this routine. At
that point might as well pass the already looked up device rather than
require the low-level TSM driver to repeat that work.
tdi_id is 32bit.
@Yilun, I saw that you had it as 64-bit in one location, was that
unintentional.
Intel is also switching to gBDF as tdi_id, so yes 32bit is good.
Note that INTERFACE_ID is Reserved to be 12-bytes, but today FUNCTION_ID
is indeed only 4-bytes. I will fix this up unless some arch speaks up
and says they need to pass a larger id around.
Yes.
yeah, 4 or 12 bytes make sense but not 8. Thanks,
Thanks,
Yilun
Should return an error code. Thanks,
Lets make it an ERR_PTR() because the low-level provider likely needs to
allocate more than just a 'struct pci_tdi' on bind.
--
Alexey