On Mon, 1 Sep 2025, Steve Oswald wrote: > I've added the dmesg output fordyndbg="file drivers/pci/*. I wasn't > sure if I added it with the escaped quotes correctly. > https://gist.github.com/stepeos/cd060c7d66ab195f51ab4d5675b4e4af/raw/9cf5fc3a8c4f13588a33d61865f804f85e50470a/dmesg_linux_6.11.0_dyndbg.log Hi again, I think the patch below should solve your issue. If you manage to get it tested and are confident enough it fixed your issue, you may provide your Tested-by tag so I can include it into the official submission of the fix. From 1c8ef31c6ac6616869b447a473e879b233df62db Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@xxxxxxxxxxxxxxx> Date: Wed, 3 Sep 2025 15:53:48 +0300 Subject: [PATCH 1/1] PCI: Prevent shrinking bridge window from its required size MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit pci_bridge_distribute_available_resources() -> ... -> adjust_bridge_window() is called in between __pci_bus_size_bridges() and assigning the resources. Since the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") adjust_bridge_window() can also shrink the bridge window to force it to a smaller size. The shrunken size, however, conflicts what __pci_bus_size_bridges() -> pbus_size_mem() calculated as the required bridge window size. By shrinking the resource size, adjust_bridge_window() prevents rest of the resource fitting algorithm working as intended. Resource fitting logic is expecting assignment failures when bridge windows need resizing, but there are cases where failures are no longer happening after the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary"). The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") justifies the change by the extra reservation made due to hpmemsize parameter, however, the kernel code contradicts with that statement. (For simplicity, finer-grained hpmmiosize and hpmmiopref parameters that can be used to the same effect as hpmemsize are ignored in this description.) pbus_size_mem() calls calculate_memsize() twice. First with add_size=0 to find out the minimal required resource size. The second call occurs with add_size=hpmemsize (effectively) but the result does not directly affect the resource size only resulting in an entry on the realloc_head list (a.k.a. add_list). Yet, adjust_bridge_window() directly changes the resource size which does not include what is reserved due to hpmemsize. Also, if the required size for the bridge window exceeds hpmemsize, the parameter does not have any effect even on the second size calculcation made by pbus_size_mem(); from calculate_memsize(): size = max(size, add_size) + children_add_size; The commit ae4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()") that precedes the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") is also related to causing this problem. Its changelog explicitly states adjust_bridge_window() wants to "guarantee" allocation success. Guaranteed allocations, however, are incompatible with how the other parts of the resource fitting algorithm work. The given justification fails to explain why guaranteed allocations at this stage are required nor why forcing window to a smaller value than what was calculated by pbus_size_mem() is correct. The the change might have worked by chance in some test scenario, too small bridge window does not "guarantee" success from the point of view of the endpoint device resource assignments. No issue is mentioned within the changelog so it's unclear if the change was made to fix some observed issue nor and what that issue was. The unwanted shrinking of a bridge window occurs, e.g., when a device with large BARs such as eGPU is attached using Thunderbolt and the Root Port holds less than enough resource space for the eGPU. The GPU resources are in order of GBs and the default hotplug allocation is mere 2M (DEFAULT_HOTPLUG_MMIO_PREF_SIZE). The problem is illustrated by this log (filtered to the relevant content only): pci 0000:00:07.0: PCI bridge to [bus 03-2c] pci 0000:00:07.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref] pci 0000:03:00.0: PCI bridge to [bus 00] pci 0000:03:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] pci 0000:03:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring pci 0000:03:00.0: PCI bridge to [bus 04-2c] pcieport 0000:00:07.0: Assigned bridge window [mem 0x6000000000-0x601bffffff 64bit pref] to [bus 03-2c] cannot fit 0xc00000000 required for 0000:03:00.0 bridging to [bus 04-2c] pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] to [bus 04-2c] add_size 100000 add_align 100000 pcieport 0000:00:07.0: distributing available resources pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] shrunken by 0x00000007e4400000 pci 0000:03:00.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref]: assigned The initial size of the Root Port's window is 448MB (0x601bffffff - 0x6000000000). __pci_bus_size_bridges() -> pbus_size_mem() calculates the required size to be 32772 MB (0x10003fffff - 0x800000000) which would fit the eGPU resources. adjust_bridge_window() then shrinks the bridge window down to what is guaranteed to fit into the Root Port's bridge window. The bridge window for 03:00.0 is also eliminated from the add_list (a.k.a. realloc_head) list by adjust_bridge_window(). After adjustment, the resources are assigned and as the bridge window for 03:00.0 is assigned successfully, no failure is recorded. Without a failure, no attempt to resize the window of the Root Port is required. The end result is eGPU not having large enough resources to work. The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") also claims nested bridge windows are sized the same, which is false. pbus_size_mem() calculates the size for the parent bridge window by summing all the downstream resources so the resource fitting calculates larger bridge window for the parent to accomodate the childen. That is, hpmemsize does not result the same size for the case where there are nested bridge windows. In order to fix the most immediate problem, don't shrink the resource size as hpmemsize had nothing to do with it. When considering add_size, only reduce it up to what is added due to hpmemsize (if required size is larger than hpmemsize, the parameter has no impact, see calculate_memsize()). This is not exactly a revert of the commits e4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()") and 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") as shrinking still remains in place but is implemented differently, and the end result behaves very differently. It is possible that those two commits fixed some other issue that is not described with enough detail in the changelog and undoing parts of them results in another regression due to behavioral change. Nonetheless, as described above, the solution by those two commits was flawed and the issue, if one exists, should be solved in a way that is compatible with the rest of the resource fitting algorithm instead of working against it. Besides shrinking, the case where adjust_bridge_window() expands the bridge window is likely somewhat wrong as well because it removes the entry from add_list (a.k.a. realloc_head), but it is less damaging as that only impacts optional resources and may have no impact if expanding by hpmemsize is larger than what add_size was. Fixing it is left as further work. Reported-by: Steve Oswald <stevepeter.oswald@xxxxxxxxx> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> --- drivers/pci/setup-bus.c | 42 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index 23082bc0ca37..4dd618bc4196 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -1828,6 +1828,7 @@ static void adjust_bridge_window(struct pci_dev *bridge, struct resource *res, resource_size_t new_size) { resource_size_t add_size, size = resource_size(res); + struct pci_dev_resource *dev_res; if (res->parent) return; @@ -1840,9 +1841,46 @@ static void adjust_bridge_window(struct pci_dev *bridge, struct resource *res, pci_dbg(bridge, "bridge window %pR extended by %pa\n", res, &add_size); } else if (new_size < size) { + int idx = pci_resource_num(bridge, res); + + /* + * hpio/mmio/mmioprefsize hasn't been included at all? See the + * add_size param at the callsites of calculate_memsize(). + */ + if (!add_list) + return; + + /* Only shrink if the hotplug extra relates to window size. */ + switch (idx) { + case PCI_BRIDGE_IO_WINDOW: + if (size > pci_hotplug_io_size) + return; + break; + case PCI_BRIDGE_MEM_WINDOW: + if (size > pci_hotplug_mmio_size) + return; + break; + case PCI_BRIDGE_PREF_MEM_WINDOW: + if (size > pci_hotplug_mmio_pref_size) + return; + break; + default: + break; + } + + dev_res = res_to_dev_res(add_list, res); add_size = size - new_size; - pci_dbg(bridge, "bridge window %pR shrunken by %pa\n", res, - &add_size); + if (add_size < dev_res->add_size) { + dev_res->add_size -= add_size; + pci_dbg(bridge, "bridge window %pR optional size shrunken by %pa\n", + res, &add_size); + } else { + pci_dbg(bridge, "bridge window %pR optional size removed\n", + res); + remove_from_list(add_list, res); + } + return; + } else { return; } base-commit: f6d41443f54856ceece0d5b584f47f681513bde4 -- 2.39.5