Re: [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 15. 07. 25, 23:36, Timothy Pearson wrote:
In cases where the root of a nested PCIe bridge configuration is
unplugged, the pnv_php driver would leak the allocated IRQ resources for
the child bridges' hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all it's IRQ
resources before calling pci_hp_remove_devices.

Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot, instead of
pnv_php_disable_irq. This is required since pnv_php_disable_irq will now
be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.

The abridged kernel panic that occurs without this patch is as follows:

   WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
   CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
   Call Trace:
    msi_device_data_release+0x34/0x9c (unreliable)
    release_nodes+0x64/0x13c
    devres_release_all+0xc0/0x140
    device_del+0x2d4/0x46c
    pci_destroy_dev+0x5c/0x194
    pci_hp_remove_devices+0x90/0x128
    pci_hp_remove_devices+0x44/0x128
    pnv_php_disable_slot+0x54/0xd4
    power_write_file+0xf8/0x18c
    pci_slot_attr_store+0x40/0x5c
    sysfs_kf_write+0x64/0x78
    kernfs_fop_write_iter+0x1b0/0x290
    vfs_write+0x3bc/0x50c
    ksys_write+0x84/0x140
    system_call_exception+0x124/0x230
    system_call_vectored_common+0x15c/0x2ec

Signed-off-by: Shawn Anastasio <sanastasio@xxxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Timothy Pearson <tpearson@xxxxxxxxxxxxxxxxxxxxx>
---
  drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++-------
  1 file changed, 75 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 573a41869c15..aec0a6d594ac 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
...
@@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)

This is preceded by:
        php_slot = kzalloc(sizeof(*php_slot), GFP_KERNEL);

Ie. php_slot is zeroed.

  		return NULL;
  	}
+ /* Allocate workqueue for this slot's interrupt handling */
+	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+	if (!php_slot->wq) {
+		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");

I believe this introduced a (unlikely) NULL ptr dereference.

+		kfree(php_slot->name);
+		kfree(php_slot);
+		return NULL;
+	}
+
  	if (dn->child && PCI_DN(dn->child))
  		php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
  	else

This continues:
        php_slot->pdev                  = bus->self;
        php_slot->bus                   = bus;


And SLOT_WARN() is defined as:
#define SLOT_WARN(sl, x...) \
((sl)->pdev ? pci_warn((sl)->pdev, x) : dev_warn(&(sl)->bus->dev, x))

The else branch is alkays taken in the 'if' above, which still dereferences NULLed (sl)->bus here.

@@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
  	u16 sts, ctrl;
  	int ret;
- /* Allocate workqueue */
-	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
-	if (!php_slot->wq) {
-		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");

Here, php_slot used to have both ->pdev and ->bus assigned at this point.

-		pnv_php_disable_irq(php_slot, true);
-		return;
-	}
-

Right?

thanks,
--
js
suse labs





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux