[+to Yijun @Dell in case there's some testing opportunity, thread at
https://lore.kernel.org/r/81c040d54209627de2d8b150822636b415834c7f.1742900213.git.karolina.stolarek@xxxxxxxxxx]

On Thu, Apr 24, 2025 at 11:01:11AM +0200, Karolina Stolarek wrote:
> On 23/04/2025 22:31, Bjorn Helgaas wrote:
> > On Wed, Apr 23, 2025 at 03:52:27PM +0200, Karolina Stolarek wrote:
> > >
> > > I wasn't able to produce logs for the CXL path (that is, Restricted CXL
> > > Device, as CXL1.1 devices not supported by the driver due to a missing
> > > functionality; confirmed by Terry) and faced issues when trying to inject
> > > errors via GHES. Is the lack of logs a blocker for this patch? I tested
> > > other CXL scenarios and my changes didn't cause regression, as far as I
> > > know.
> >
> > Yes, I do think we need to say something about the output format
> > changes.
>
> I understand.
>
> > I assume you're trying GHES on machines that actually do
> > firmware-first error handling, right? I found several GHES logs by
> > searching the web for "APEI Generic Hardware Error Source" "PCIe
> > error". The majority were Dell boxes.
>
> The only way to inject GHES errors I'm aware of is Mauro's patch for
> qemu[1], so I went down the virtualization path. As for working with the
> actual hardware, I'd need to ask around and learn more about the platform.

I'd be surprised if the qemu firmware supports firmware-first
handling, so I wouldn't expect to be able to exercise this path that
way. I think there are some bits in HEST and similar tables that tell
us about this, e.g., ACPI r6.5, sec 18.3.2.4. Unfortunately there are
some typos in the spec (FIRMWARE_FIRST, FIRMWAREFIRST in 18.4), so
it's a little hard to find all the references.

It's a long shot, but I added Yijun as a Dell contact who might
have a pointer to someone who could possibly test GHES logging on a
Dell box with and without your patch so we could have a concrete
comparison of the dmesg log differences.

> > If you can't produce actual logs for comparison, I think we can take
> > info from a sample log somebody has posted and synthesize what the
> > changes would be after this patch.
>
> I also found some logs at some point, mostly from 2021 and 2023, but I felt
> bad about mocking up the messages and tried to produce actual logs. If I
> can't find a way to get this working in two weeks, I'll revisit this idea.
>
> All the best,
> Karolina
>
> -------------------------------------------------------------
> [1] - https://lore.kernel.org/lkml/76824dfc6bb5dd23a9f04607a907ac4ccf7cb147.1740653898.git.mchehab+huawei@xxxxxxxxxx/