On Wed, Jul 16, 2025 at 09:27:55PM +0530, Manivannan Sadhasivam wrote: > On Tue, Jul 15, 2025 at 02:21:47PM GMT, Brian Norris wrote: > > OTOH, I also see that part of my change is not really doing quite what I > > thought it was -- so far, I think there may be some kind of resource > > leak (kobj ref), since I'm not seeing pci_release_host_bridge_dev() > > called when I think it should be. If I perform cleanup in > > pci_free_host_bridge() instead, then I do indeed see > > of_platform_device_destroy() tear things down the way I expect. > > > > Oh, that's bad! Which controller it is? I played with making the pcie-qcom > driver modular and I unloaded/loaded multiple times, but never saw any > refcount warning (I really hope if there was any leak, it would've tripped over > during insmod). I'm still trying to tease this apart, and I'm not sure when I'll have plenty of time to get further on this. I'm also primarily using a non-upstream DWC-based driver, which isn't really ready to be published. I also have some systems that use drivers/pci/controller/pcie-rockchip-host.c and are fully upstream-supported, so I'll see if I can replicate my observations there. But I think there are at least two problems: (1) I'm adding code to bridge->dev.release(). release() is only called when the device's refcount drops to zero. And child devices hold a refcount on their parent (the bridge). So, I have a circular refcount, if there were any pwrctrl children present. I think this is easily solved by moving the child destruction to pci_free_host_bridge() instead. (2) Even after resolving 1, I'm seeing pci_free_host_bridge() exit with a bridge->dev.kboj.kref refcount of 1 in some cases. I don't yet have an explanation of that one. IIUC, this kind of error would be considered a leak, but crucially, I also don't think it would produce any kind of refcount warning or other error. It's "just" a device that has been removed (a la, device_del()), but still has some client holding a reference count (i.e., not enough put_device()). Brian