https://bugzilla.kernel.org/show_bug.cgi?id=220069

--- Comment #12 from Michał Pecio (michal.pecio@xxxxxxxxx) ---
(In reply to Claudio Wunder from comment #10)
> > I think you said you have more of those logs, is the above always appearing a
> > few seconds before "hc died"? It seems related to the 8-3 device, a VIA USB
> > 3.0 hub.
>
> For the sample of two items I have so far, it appears that these are showing
> up. Note that on both the original regression and the current "apparent" one
> (if we can even call it a regression?), these errors above are happening. I
> will need to wait to see the next crash also happens to have said logs;

Wait, this is important. If you were seeing "Abort failed to stop command ring: -110" instead of "xHCI host not responding to stop endpoint command" before 6.13.7, then it is at least possible, if not likely, that you were already running into a different problem than the one fixed in 6.13.7.

And it gets doubly suspicious if you also saw "ERROR unknown event type <some number>" a few seconds before "HC died". Do you still have those logs by any chance?

As Mathias Nyman explained, the known 6.13 issue was a simple driver bug: commands were written incorrectly, chips correctly ignored them, and the driver incorrectly pronounced them dead.

Mathias further suggests that this or a similar bug may still somehow exist in your kernel, and that command abort fails because the chip believes there are no pending commands. That is possible, but unlikely, because command abort is not supposed to fail like that. So if you ever see a command abort timeout, either the abort code is buggy (and it looks like no one has touched that part in ages) or the chip is buggy in one way or another.
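If it helps with digging through old logs, the timing question above (do the suspicious messages show up shortly before "HC died"?) can be checked mechanically. Below is a minimal Python sketch, not a polished tool. It assumes a saved kernel log whose lines carry the usual dmesg "[seconds.microseconds]" prefix. The function names, the 30 second window and the choice of substrings (taken from the messages quoted in this thread and matched case-insensitively) are my own illustrative assumptions.

```python
import re

# dmesg lines normally start with a "[seconds.microseconds]" timestamp.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Substrings of the messages quoted in this thread (assumption: matching
# them case-insensitively is good enough to find the relevant lines).
PRECURSORS = (
    "unknown event type",
    "abort failed to stop command ring",
    "not responding to stop endpoint command",
)
DEATH = "hc died"


def parse(lines):
    """Yield (timestamp_seconds, message) for lines that carry a timestamp."""
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            yield float(match.group(1)), match.group(2)


def precursors_before_death(lines, window=30.0):
    """For each "HC died" line, collect precursor messages logged at most
    `window` seconds earlier. Returns a list of (timestamp, message, hits),
    where hits is a list of (timestamp, message) tuples."""
    recent = []  # precursor lines seen so far
    report = []
    for ts, msg in parse(lines):
        lower = msg.lower()
        if DEATH in lower:
            # The 0 <= check also skips entries from an earlier boot,
            # where timestamps would be larger than the current one.
            hits = [(t, m) for t, m in recent if 0 <= ts - t <= window]
            report.append((ts, msg, hits))
        elif any(p in lower for p in PRECURSORS):
            recent.append((ts, msg))
    return report
```

To use it, pass any iterable of log lines, for example an open file object for a saved dmesg dump, and print the returned tuples. An empty hits list for some "HC died" entry would mean none of the listed messages preceded that crash within the window.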
It would be sad if this turned out to be a regression due to the commits initially suspected back in February:
https://bugzilla.kernel.org/show_bug.cgi?id=219824#c5

These are present in all 6.12 and higher releases from this year, so the only supported kernels without them are the old LTS series. Not sure if you have the means of testing those for a few weeks on the same HW, userspace and workload?

I could also suggest some stress tests which exercise this code (and the USB controller). I found webcams and USB serial dongles to be particularly suitable, do you have any such devices at hand?