Re: [PATCH v2] PCI/AER: Consolidate CXL, ACPI GHES and native AER reporting paths

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 25 Apr 2025 16:12:26 +0200
Karolina Stolarek <karolina.stolarek@xxxxxxxxxx> wrote:

> On 25/04/2025 15:14, Jonathan Cameron wrote:
> > On Fri, 25 Apr 2025 12:32:10 +0200
> > Karolina Stolarek <karolina.stolarek@xxxxxxxxxx> wrote:  
> >> 
> >> It's possible that some of the nuances of this escaped me. I decided to
> >> pick up the series, as I saw "PCI Express bus error injection via GHES"
> >> script and thought it might be useful.  
> > 
> > With Mauro's series you can inject (on ARM64 virt) any CPER record you
> > like.  That doesn't synchronize the wider state of the system though
> > so may not exercise everything (PCI registers etc not updated as it
> > is only injecting the record).  Mostly it just works, as remarkably
> > few error handlers actually take the state of the components on which
> > the error is reported into account.  
> 
> OK, that means even if we manage to inject a PCIe error, AER wouldn't be 
> able to look up the Source ID and other values it needs to report an 
> error, which is not quite the solution I was looking for.

Isn't the source ID in the CPER record? (Device ID field) or do
you mean something else?

> 
> > The aim is specifically to allow exercising FW first error handling
> > paths because it's a pain to get real systems that have firmware to inject
> > the full range of what the kernel etc need to handle.  
> 
> Does this include PCIe errors? If so, that probably doesn't make sense 
> to try to test my patch on an actual system?

Ideally test it on a real system as well, but indeed the intent is to
allow testing of PCI errors on emulation.

> 
> > x86 support for emulated injection is a work in progress (more of a mess wrt
> > to the different ways the event signaling is handled than it is on arm64).
> > 
> > I did have an earlier version of that work wired up to the same
> > hooks as the native CXL error injection but I dropped it from my QEMU
> > CXL staging tree for now as it was a pain to rebase whilst Mauro was rapidly
> > revising the infrastructure.  I'll bring it back when I get time.  
> 
> I understand, I saw some of your series while looking for ways to test 
> my patch. Thank you very much for your work. As you can see, there are 
> people actually looking forward to it :)

Great!  I'll try and get back to wiring it all up again sometime soon.

Jonathan

> 
> 
> All the best,
> Karolina
> 
> > 
> > Jonathan
> >   
> >>  
> >>> Unfortunately there are some typos in the spec (FIRMWARE_FIRST,
> >>> FIRMWAREFIRST in 18.4), so it's a little hard to find all the
> >>> references.  
> >>
> >> Thanks for the pointers, I'll take a look.
> >>  
> >>> It's a long shot, but I added Yijun as a Dell contact that who might
> >>> have a pointer to someone who could possibly test GHES logging on a
> >>> Dell box with and without your patch so we could have a concrete
> >>> comparison of the dmesg log differences.  
> >>
> >> Thank you very much. Let's see, maybe we'll get lucky :)
> >>
> >> All the best,
> >> Karolina
> >>  
> >>>      
> >>>>> If you can't produce actual logs for comparison, I think we can take
> >>>>> info from a sample log somebody has posted and synthesize what the
> >>>>> changes would be after this patch.  
> >>>>
> >>>> I also found some logs at some point, mostly from 2021 and 2023, but I felt
> >>>> bad about mocking up the messages and tried to produce actual logs. If I
> >>>> can't find a way to get this working in two weeks, I'll revisit this idea.
> >>>>
> >>>> All the best,
> >>>> Karolina
> >>>>
> >>>> -------------------------------------------------------------
> >>>> [1] - https://lore.kernel.org/lkml/76824dfc6bb5dd23a9f04607a907ac4ccf7cb147.1740653898.git.mchehab+huawei@xxxxxxxxxx/  
> >>
> >>  
> >   
> 
> 





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux