[PATCH v2 02/16] PCI: Add pci_bus_isolation()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Prepare to move the calculation of the bus P2P isolation out of the iommu
code and into the PCI core. This allows using the faster list iteration
under the pci_bus_sem, and the code has a kinship with the logic in
pci_for_each_dma_alias().

Bus isolation is the concept that drives the iommu_groups for the purposes
of VFIO. Stated simply, if device A can send traffic to device B then they
must be in the same group.

Only PCIe provides isolation. The multi-drop electrical topology in
classical PCI allows any bus member to claim the transaction.

In PCIe isolation comes out of ACS. If a PCIe Switch and Root Complex has
ACS flags that prevent peer to peer traffic and funnel all operations to
the IOMMU then devices can be isolated.

If a multi-function devices has ACS flags preventing self loop back then
that device is isolated, though pci_bus_isolation() does not deal with
devices.

As a property of a bus, there are several positive cases:

 - The point to point "bus" on a physical PCIe link is isolated if the
   bridge/root device has something preventing self-access to its own
   MMIO.

 - A Root Port is usually isolated

 - A PCIe switch can be isolated if all it's Down Stream Ports have good
   ACS flags

pci_bus_isolation() implements these rules and returns an enum indicating
the level of isolation the bus has, with five possibilies:

 PCIE_ISOLATED: Traffic on this PCIE bus can not do any P2P.

 PCIE_SWITCH_DSP_NON_ISOLATED: The bus is the internal bus of a PCIE
     switch and the USP is isolated but the DSPs are not.

 PCIE_NON_ISOLATED: The PCIe bus has no isolation between the bridge or
     any downstream devices.

 PCI_BUS_NON_ISOLATED: It is a PCI/PCI-X but the bridge is PCIe, has no
     aliases and is isolated.

 PCI_BRIDGE_NON_ISOLATED: It is a PCI/PCI-X bus and has no isolation, the
     bridge is part of the group.

The calculation is done per-bus, so it is possible for a transactions from
a PCI device to travel through different bus isolation types on its way
upstream. PCIE_SWITCH_DSP_NON_ISOLATED/PCI_BUS_NON_ISOLATED and
PCIE_NON_ISOLATED/PCI_BRIDGE_NON_ISOLATED are the same for the purposes of
creating iommu groups. The distinction between PCIe and PCI allows for
easier understanding and debugging as to why the groups are chosen.

For the iommu groups if all busses on the upstream path are PCIE_ISOLATED
then the end device has a chance to have a single-device iommu_group. Once
any non-isolated bus segment is found that bus segment will have an
iommu_group that captures all downstream devices, and sometimes the
upstream bridge.

pci_bus_isolation() is principally about isolation, but there is an
overlap with grouping requirements for legacy PCI aliasing. For purely
legacy PCI environments pci_bus_isolation() returns
PCI_BRIDGE_NON_ISOLATED for everything and all devices within a hierarchy
are in one group. No need to worry about bridge aliasing.

Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
---
 drivers/pci/search.c | 164 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h  |  31 ++++++++
 2 files changed, 195 insertions(+)

diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 53840634fbfc2b..a13fad53e44df9 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -113,6 +113,170 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
 	return ret;
 }
 
+static enum pci_bus_isolation pcie_switch_isolated(struct pci_bus *bus)
+{
+	struct pci_dev *pdev;
+
+	/*
+	 * Within a PCIe switch we have an interior bus that has the Upstream
+	 * port as the bridge and a set of Downstream port bridging to the
+	 * egress ports.
+	 *
+	 * Each DSP has an ACS setting which controls where its traffic is
+	 * permitted to go. Any DSP with a permissive ACS setting can send
+	 * traffic flowing upstream back downstream through another DSP.
+	 *
+	 * Thus any non-permissive DSP spoils the whole bus.
+	 */
+	guard(rwsem_read)(&pci_bus_sem);
+	list_for_each_entry(pdev, &bus->devices, bus_list) {
+		/* Don't understand what this is, be conservative */
+		if (!pci_is_pcie(pdev) ||
+		    pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM ||
+		    pdev->dma_alias_mask)
+			return PCIE_NON_ISOLATED;
+
+		if (!pci_acs_enabled(pdev, PCI_ACS_ISOLATED))
+			return PCIE_SWITCH_DSP_NON_ISOLATED;
+	}
+	return PCIE_ISOLATED;
+}
+
+static bool pci_has_mmio(struct pci_dev *pdev)
+{
+	unsigned int i;
+
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
+		struct resource *res = pci_resource_n(pdev, i);
+
+		if (resource_size(res) && resource_type(res) == IORESOURCE_MEM)
+			return true;
+	}
+	return false;
+}
+
+/**
+ * pci_bus_isolated - Determine how isolated connected devices are
+ * @bus: The bus to check
+ *
+ * Isolation is the ability of devices to talk to each other. Full isolation
+ * means that a device can only communicate with the IOMMU and can not do peer
+ * to peer within the fabric.
+ *
+ * We consider isolation on a bus by bus basis. If the bus will permit a
+ * transaction originated downstream to complete on anything other than the
+ * IOMMU then the bus is not isolated.
+ *
+ * Non-isolation includes all the downstream devices on this bus, and it may
+ * include the upstream bridge or port that is creating this bus.
+ *
+ * The various cases are returned in an enum.
+ *
+ * Broadly speaking this function evaluates the ACS settings in a PCI switch to
+ * determine if a PCI switch is configured to have full isolation.
+ *
+ * Old PCI/PCI-X busses cannot have isolation due to their physical properties,
+ * but they do have some aliasing properties that effect group creation.
+ *
+ * pci_bus_isolated() does not consider loopback internal to devices, like
+ * multi-function devices performing a self-loopback. The caller must check
+ * this separately. It does not considering alasing within the bus.
+ *
+ * It does not currently support the ACS P2P Egress Control Vector, Linux does
+ * not yet have any way to enable this feature. EC will create subsets of the
+ * bus that are isolated from other subsets.
+ */
+enum pci_bus_isolation pci_bus_isolated(struct pci_bus *bus)
+{
+	struct pci_dev *bridge = bus->self;
+	int type;
+
+	/*
+	 * This bus was created by pci_register_host_bridge(). The spec provides
+	 * no way to tell what kind of bus this is, for PCIe we expect this to
+	 * be internal to the root complex and not covered by any spec behavior.
+	 * Linux has historically been optimistic about this bus and treated it
+	 * as isolating. Given that the behavior of the root complex and the ACS
+	 * behavior of RCiEP's is explicitly not specified we hope that the
+	 * implementation is directing everything that reaches the root bus to
+	 * the IOMMU.
+	 */
+	if (pci_is_root_bus(bus))
+		return PCIE_ISOLATED;
+
+	/*
+	 * bus->self is only NULL for SRIOV VFs, it represents a "virtual" bus
+	 * within Linux to hold any bus numbers consumed by VF RIDs. Caller must
+	 * use pci_physfn() to get the bus for calling this function.
+	 */
+	if (WARN_ON(!bridge))
+		return PCI_BRIDGE_NON_ISOLATED;
+
+	/*
+	 * The bridge is not a PCIe bridge therefore this bus is PCI/PCI-X.
+	 *
+	 * PCI does not have anything like ACS. Any down stream device can bus
+	 * master an address that any other downstream device can claim. No
+	 * isolation is possible.
+	 */
+	if (!pci_is_pcie(bridge)) {
+		if (bridge->dev_flags & PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS)
+			type = PCI_EXP_TYPE_PCI_BRIDGE;
+		else
+			return PCI_BRIDGE_NON_ISOLATED;
+	} else {
+		type = pci_pcie_type(bridge);
+	}
+
+	switch (type) {
+	/*
+	 * Since PCIe links are point to point root and downstream ports are
+	 * isolated if their own MMIO cannot be reached.
+	 */
+	case PCI_EXP_TYPE_ROOT_PORT:
+	case PCI_EXP_TYPE_DOWNSTREAM:
+		if (!pci_acs_enabled(bridge, PCI_ACS_ISOLATED))
+			return PCIE_NON_ISOLATED;
+		return PCIE_ISOLATED;
+
+	/*
+	 * bus is the interior bus of a PCI-E switch where ACS rules apply.
+	 */
+	case PCI_EXP_TYPE_UPSTREAM:
+		return pcie_switch_isolated(bus);
+
+	/*
+	 * PCIe to PCI/PCI-X - this bus is PCI.
+	 */
+	case PCI_EXP_TYPE_PCI_BRIDGE:
+		/*
+		 * A PCIe express bridge will use the subordinate bus number
+		 * with a 0 devfn as the RID in some cases. This causes all
+		 * subordinate devfns to alias with 0, which is the same
+		 * grouping as PCI_BUS_NON_ISOLATED. The RID of the bridge
+		 * itself is only used by the bridge.
+		 *
+		 * However, if the bridge has MMIO then we will assume the MMIO
+		 * is not isolated due to no ACS controls on this bridge type.
+		 */
+		if (pci_has_mmio(bridge))
+			return PCI_BRIDGE_NON_ISOLATED;
+		return PCI_BUS_NON_ISOLATED;
+
+	/*
+	 * PCI/PCI-X to PCIe - this bus is PCIe. We already know there must be a
+	 * PCI bus upstream of this bus, so just return non-isolated. If
+	 * upstream is PCI-X the PCIe RID should be preserved, but for PCI the
+	 * RID will be lost.
+	 */
+	case PCI_EXP_TYPE_PCIE_BRIDGE:
+		return PCI_BRIDGE_NON_ISOLATED;
+
+	default:
+		return PCI_BRIDGE_NON_ISOLATED;
+	}
+}
+
 static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr)
 {
 	struct pci_bus *child;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f39238..0b1e28dcf9187d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -834,6 +834,32 @@ struct pci_dynids {
 	struct list_head	list;	/* For IDs added at runtime */
 };
 
+enum pci_bus_isolation {
+	/*
+	 * The bus is off a root port and the root port has isolated ACS flags
+	 * or the bus is part of a PCIe switch and the switch has isolated ACS
+	 * flags.
+	 */
+	PCIE_ISOLATED,
+	/*
+	 * The switch's DSP's are not isolated from each other but are isolated
+	 * from the USP.
+	 */
+	PCIE_SWITCH_DSP_NON_ISOLATED,
+	/* The above and the USP's MMIO is not isolated. */
+	PCIE_NON_ISOLATED,
+	/*
+	 * A PCI/PCI-X bus, no isolation. This is like
+	 * PCIE_SWITCH_DSP_NON_ISOLATED in that the upstream bridge is isolated
+	 * from the bus. The bus itself may also have a shared alias of devfn=0.
+	 */
+	PCI_BUS_NON_ISOLATED,
+	/*
+	 * The above and the bridge's MMIO is not isolated and the bridge's RID
+	 * may be an alias.
+	 */
+	PCI_BRIDGE_NON_ISOLATED,
+};
 
 /*
  * PCI Error Recovery System (PCI-ERS).  If a PCI device driver provides
@@ -1222,6 +1248,8 @@ struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
 struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from);
 struct pci_dev *pci_get_base_class(unsigned int class, struct pci_dev *from);
 
+enum pci_bus_isolation pci_bus_isolated(struct pci_bus *bus);
+
 int pci_dev_present(const struct pci_device_id *ids);
 
 int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn,
@@ -2035,6 +2063,9 @@ static inline struct pci_dev *pci_get_base_class(unsigned int class,
 						 struct pci_dev *from)
 { return NULL; }
 
+static inline enum pci_bus_isolation pci_bus_isolated(struct pci_bus *bus)
+{ return PCIE_NON_ISOLATED; }
+
 static inline int pci_dev_present(const struct pci_device_id *ids)
 { return 0; }
 
-- 
2.43.0





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux