Re: [PATCH] PCI/THP: Fix hang due to incorrect guard lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Just realized this patch was missing maintainer email address and mailing list for reviews.  

Bjorn, 

Apologies. Can you please queue this up in the PCI tree for inclusion in 6.16-rc6. 


> On Jul 8, 2025, at 15:25, himanshu.madhani@xxxxxxxxxx wrote:
> 
> From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> 
> Fix system hang due to incorrect mutex lock placement.
> 
> following stack trace will be seen on system boot
> 
> [  525.664681] task:systemd-shutdow state:D stack:0     pid:1     tgid:1     ppid:0      task_flags:0x400100 flags:0x00004002
> [  525.796878] Call Trace:
> [  525.826116]  <TASK>
> [  525.851195]  __schedule+0x2d1/0x730
> [  525.892917]  schedule+0x27/0x80
> [  525.930478]  schedule_preempt_disabled+0x15/0x30
> [  525.985718]  __mutex_lock.constprop.0+0x4be/0x8a0
> [  526.041993]  msi_domain_get_virq+0xcc/0x110
> [  526.092031]  pci_msix_write_tph_tag+0x3c/0x100
> [  526.145186]  pcie_tph_set_st_entry+0x125/0x1d0
> [  526.198346]  bnxt_irq_affinity_release+0x35/0x50 [bnxt_en]
> [  526.264015]  irq_set_affinity_notifier+0xe0/0x130
> [  526.320291]  bnxt_free_irq+0x6e/0x110 [bnxt_en]
> [  526.374507]  __bnxt_close_nic.isra.0+0x1eb/0x220 [bnxt_en]
> [  526.440175]  bnxt_close+0x3a/0x100 [bnxt_en]
> [  526.491264]  __dev_close_many+0xae/0x220
> [  526.538179]  dev_close_many+0xc2/0x1b0
> [  526.583014]  netif_close+0x9d/0xd0
> [  526.623693]  bnxt_shutdown+0xb1/0xe0 [bnxt_en]
> [  526.676874]  pci_device_shutdown+0x35/0x70
> [  526.725871]  device_shutdown+0x118/0x1a0
> [  526.772788]  kernel_restart+0x3a/0x70
> [  526.816588]  __do_sys_reboot+0x150/0x250
> [  526.863504]  do_syscall_64+0x84/0x940
> [  526.907300]  ? __put_user_8+0xd/0x20
> [  526.950059]  ? rseq_ip_fixup+0x90/0x1e0
> [  526.995937]  ? task_mm_cid_work+0x1ad/0x220
> [  527.045971]  ? __rseq_handle_notify_resume+0x35/0x90
> [  527.105367]  ? arch_exit_to_user_mode_prepare.isra.0+0x98/0xb0
> [  527.175166]  ? do_syscall_64+0xba/0x940
> [  527.221040]  ? do_filp_open+0xd7/0x1a0
> [  527.265882]  ? alloc_fd+0xba/0x110
> [  527.306556]  ? do_sys_openat2+0xa4/0xf0
> [  527.352434]  ? __x64_sys_openat+0x54/0xb0
> [  527.400389]  ? arch_exit_to_user_mode_prepare.isra.0+0x9/0xb0
> [  527.469150]  ? do_syscall_64+0xba/0x940
> [  527.515023]  ? do_user_addr_fault+0x221/0x690
> [  527.567141]  ? clear_bhb_loop+0x30/0x80
> [  527.613017]  ? clear_bhb_loop+0x30/0x80
> [  527.658895]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  527.719332] RIP: 0033:0x7fc3ec504777
> [  527.762091] RSP: 002b:00007ffecd62c4f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9
> [  527.852685] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3ec504777
> [  527.938085] RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
> [  528.023485] RBP: 00007ffecd62c700 R08: 0000000000000000 R09: 00007ffecd62b8e0
> [  528.108878] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffecd62c568
> [  528.194273] R13: 00007ffecd62c548 R14: 00007ffecd62c568 R15: 0000000000000000
> [  528.279672]  </TASK>
> 
> Fixes: d5124a9957b2 ("PCI/MSI: Provide a sane mechanism for TPH")
> Fixes: 71296eae5887 ("PCI/TPH: Replace the broken MSI-X control word update")
> Reported-by: Jorge Lopez <jorge.jo.lopez@xxxxxxxxxx>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Tested-by: Jorge Lopez <jorge.jo.lopez@xxxxxxxxxx>
> Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> ---
> drivers/pci/msi/msi.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 6ede55a7c5e6..d686488f4111 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -934,10 +934,12 @@ int pci_msix_write_tph_tag(struct pci_dev *pdev, unsigned int index, u16 tag)
> if (!pdev->msix_enabled)
> return -ENXIO;
> 
> - guard(msi_descs_lock)(&pdev->dev);
> virq = msi_get_virq(&pdev->dev, index);
> if (!virq)
> return -ENXIO;
> +
> + guard(msi_descs_lock)(&pdev->dev);
> +
> /*
> * This is a horrible hack, but short of implementing a PCI
> * specific interrupt chip callback and a huge pile of
> -- 
> 2.41.0.rc2
> 



-- 
Himanshu Madhani	Oracle Linux Engineering






[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux