On Mon, 2 Jun 2025, Tudor Ambarus wrote: > On 6/2/25 4:08 PM, Ilpo Järvinen wrote: > >>> I think I figured out more about the reason. It's not related to that > >>> bridge window resource. > >>> > >>> pbus_size_mem() will add also that ROM resource into realloc_head > >>> as it is considered (intentionally) optional after the optional change > >>> (as per "tudor: 2:" line). And that resource is never assigned because > > cut > > >>> pdev_sort_resources() didn't pick it up into the head list. The next > >>> question is why the ROM resource isn't in the head list. > >>> > >> It seems the ROM resource is skipped at: > >> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux- > >> next.git/tree/drivers/pci/setup-bus.c#n175 > >> > >> tudor: pdev_sort_resources: ROM [??? 0x00000000 flags 0x0] resource > >> skipped due to !(r->flags) || r->parent > > I don't see the device in this print, hope it is for the same device. > > > > In any case, I don't understand what reset resource's flags in between > > pbus_size_mem() and pdev_sort_resources(), or alternative, why type > > checking in pbus_size_mem() matches if flags == 0 at that point. > > > > Those two functions should work on the same resources, if one skips > > something, the other should too. Disparity between them can cause issues, > > but despite reading the code multiple times, I couldn't figure out how > > that disparity occurs (except for the !pdev_resources_assignable() case). > > cut > > > It is of interest to know why the same resource is treated differently. > > So what were the resource flags, type* args when it's processed by > > pbus_size_mem()? If resource's flags are zero at that point but it matches > > This is the full output: https://termbin.com/mn1x > for the following prints: https://termbin.com/q57h > > It seems ROM resource is of type 2 at pbus_size_mem() time. > > > one of the types, that would be a bug. > > I'll give another try tomorrow. Thanks, Those are not the same device, so not the same resource either: [ 16.262745][ T1113] pci 0001:01:00.0: tudor: 2: pbus_size_mem: ROM [mem 0x00000000-0x0000ffff pref] list empty? 0 [ 16.267736][ T1113] pcieport 0001:00:00.0: tudor: pdev_sort_resources: ROM [??? 0x00000000 flags 0x0] resource skipped due to !(r->flags) || r->parent 0001:01:00.0 vs 0001:00:00.0 And the resources for 0001:01:00.0 were never processed by __pci_bus_assign_resources(). But __pci_bus_assign_resources() should recurse to subordinate busses. And, it seems this boils down to the inconsistency I noticed earlier: [ 16.253464][ T1113] pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint include/linux/pci_ids.h:#define PCI_CLASS_NOT_DEFINED 0x0000 pdev_resources_assignable() checks if class == PCI_CLASS_NOT_DEFINED and pdev_sort_resources() bails out without processing those resources. So please test if this patch solves your problem: From 00e440f505a79568cf5203dce265488cb3f66941 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@xxxxxxxxxxxxxxx> Date: Tue, 3 Jun 2025 11:07:38 +0300 Subject: [PATCH 1/1] PCI: Fix pdev_resources_assignable() disparity MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit pdev_sort_resources() uses pdev_resources_assignable() helper to decide if device's resources cannot be assigned. pbus_size_mem(), on the other hand, does not do the same check. This could lead into a situation where a resource ends up on realloc_head list but is not on the head list, which is turn prevents emptying the resource from the realloc_head list in __assign_resources_sorted(). A non-empty realloc_head is unacceptable because it triggers an internal sanity check as show in this log with a device that has class 0 (PCI_CLASS_NOT_DEFINED): pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint pci 0001:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit] pci 0001:01:00.0: ROM [mem 0x00000000-0x0000ffff pref] pci 0001:01:00.0: enabling Extended Tags pci 0001:01:00.0: PME# supported from D0 D3hot D3cold pci 0001:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0001:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link) pcieport 0001:00:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 01-ff] add_size 100000 add_align 100000 pcieport 0001:00:00.0: bridge window [mem 0x40000000-0x401fffff]: assigned ------------[ cut here ]------------ kernel BUG at drivers/pci/setup-bus.c:2532! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP ... Call trace: pci_assign_unassigned_bus_resources+0x110/0x114 (P) pci_rescan_bus+0x28/0x48 Use pdev_resources_assignable() also within pbus_size_mem() to skip processing of non-assignable resources which removes the disparity in between what resources pdev_sort_resources() and pbus_size_mem() consider. As non-assignable resources are no longer processed, they are not added to the realloc_head list, thus the sanity check no longer triggers. This disparity problem is very old but only now became apparent after the commit 2499f5348431 ("PCI: Rework optional resource handling") that made the ROM resources optional when calculating bridge window sizes which required adding the resource to the realloc_head list. Previously, bridge windows were just sized larger than necessary. Fixes: 2499f5348431 ("PCI: Rework optional resource handling") Cc: stable@xxxxxxxxxxxxxxx Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> --- drivers/pci/setup-bus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index 54d6f4fa3ce1..da084251df43 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -1187,6 +1187,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask, resource_size_t r_size; if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) || + !pdev_resources_assignable(dev) || ((r->flags & mask) != type && (r->flags & mask) != type2 && (r->flags & mask) != type3)) base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8 -- 2.39.5