On 15. 07. 25, 23:36, Timothy Pearson wrote:
In cases where the root of a nested PCIe bridge configuration is
unplugged, the pnv_php driver would leak the allocated IRQ resources for
the child bridges' hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all it's IRQ
resources before calling pci_hp_remove_devices.
Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot, instead of
pnv_php_disable_irq. This is required since pnv_php_disable_irq will now
be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.
The abridged kernel panic that occurs without this patch is as follows:
WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
Call Trace:
msi_device_data_release+0x34/0x9c (unreliable)
release_nodes+0x64/0x13c
devres_release_all+0xc0/0x140
device_del+0x2d4/0x46c
pci_destroy_dev+0x5c/0x194
pci_hp_remove_devices+0x90/0x128
pci_hp_remove_devices+0x44/0x128
pnv_php_disable_slot+0x54/0xd4
power_write_file+0xf8/0x18c
pci_slot_attr_store+0x40/0x5c
sysfs_kf_write+0x64/0x78
kernfs_fop_write_iter+0x1b0/0x290
vfs_write+0x3bc/0x50c
ksys_write+0x84/0x140
system_call_exception+0x124/0x230
system_call_vectored_common+0x15c/0x2ec
Signed-off-by: Shawn Anastasio <sanastasio@xxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Timothy Pearson <tpearson@xxxxxxxxxxxxxxxxxxxxx>
---
drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++-------
1 file changed, 75 insertions(+), 19 deletions(-)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 573a41869c15..aec0a6d594ac 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
...
@@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
This is preceded by:
php_slot = kzalloc(sizeof(*php_slot), GFP_KERNEL);
Ie. php_slot is zeroed.
return NULL;
}
+ /* Allocate workqueue for this slot's interrupt handling */
+ php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+ if (!php_slot->wq) {
+ SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
I believe this introduced a (unlikely) NULL ptr dereference.
+ kfree(php_slot->name);
+ kfree(php_slot);
+ return NULL;
+ }
+
if (dn->child && PCI_DN(dn->child))
php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
else
This continues:
php_slot->pdev = bus->self;
php_slot->bus = bus;
And SLOT_WARN() is defined as:
#define SLOT_WARN(sl, x...) \
((sl)->pdev ? pci_warn((sl)->pdev, x) :
dev_warn(&(sl)->bus->dev, x))
The else branch is alkays taken in the 'if' above, which still
dereferences NULLed (sl)->bus here.
@@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
u16 sts, ctrl;
int ret;
- /* Allocate workqueue */
- php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
- if (!php_slot->wq) {
- SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
Here, php_slot used to have both ->pdev and ->bus assigned at this point.
- pnv_php_disable_irq(php_slot, true);
- return;
- }
-
Right?
thanks,
--
js
suse labs