Re: Problem hanging Bulk IN, with USB 3.x, perhaps due to wMaxPacketSize = 1024 and wMaxBurst = 6 (OUT) and 2 (IN), tested and reproduceable with i.MX8MP and Raspberry Pi Compute Module 5 (CM5)

Hi Daniele,

many thanks for your reply!

I can only partly open pages on https://www.spinics.net; they often time out.

Have I understood correctly that this is a known bug that has not been fixed since 2020?

And would enabling qmux/qmimux work as a workaround?

Best regards,

Martin



On 17.08.2025 at 17:22, Daniele Palmas wrote:
Hello Martin,

On Sun, 17 Aug 2025 at 17:09, Martin Maurer
<martin.maurer@xxxxxxxxx> wrote:
Hello Michał, hello Mathias, hello all,

many thanks for your answers!

I tried to reproduce it on an AMD Linux PC, but unfortunately
could not (the setup there is a bit different, though).

So I went back to the Raspberry Pi Compute Module 5, which is where I
mainly have the radio module (Quectel RM520N-GL) connected via USB3,
and set up a Wifi access point on it. All data and all connections from
the Wifi access point are routed directly via wwan0 to the radio module.

This is currently the easiest setup in which I can reproduce the error,
usually within a few seconds.

My knowledge of the Linux kernel and USB is unfortunately not yet
sufficient to analyze and fix this myself.

But with the help of ChatGPT-5 I created a usbmon trace and an xhci
kernel trace, both recorded in parallel:

https://www.file-upload.net/en/download-15523936/usbmon_bus5_20250817-150158.log.html

https://www.file-upload.net/en/download-15523937/xhci_20250817-150158.trace.html

This was the last output my ping showed in a shell:

64 bytes from 8.8.8.8: icmp_seq=2323 ttl=112 time=26.0 ms
64 bytes from 8.8.8.8: icmp_seq=2324 ttl=112 time=25.0 ms
64 bytes from 8.8.8.8: icmp_seq=2325 ttl=112 time=29.1 ms
64 bytes from 8.8.8.8: icmp_seq=2326 ttl=112 time=37.8 ms

In parallel I generated more data traffic, but ping is where I first see
that the IP data connection is no longer stable.

According to ChatGPT-5 the following places contain errors:

*** USBMON ***

In your usbmon_bus5_20250817-150158.log:

First -71 (EPROTO) on the QMI Bulk-IN (Bi:5:005:14): line 2161,
timestamp 493245744

2161: ffffff8003c8cb40 493245744 C Bi:5:005:14 -71 0

Just before that, there’s a -75 (EOVERFLOW) on the same IN EP, which is
often the first sign of trouble: line 2159, timestamp 493245221

I did not have the chance to look at the usbmon traces so I'm not sure
that this is really the same scenario, but you could take a look at
the whole thread at
https://www.spinics.net/lists/netdev/msg635944.html

If it is the same issue, then basically you should not face it if you
set up the data connection with QMAP.

Regards,
Daniele

2159: ffffff8003c8cd80 493245221 C Bi:5:005:14 -75 1024 = ...

So the sequence is: several good completions → EOVERFLOW (-75) → then a
stream of EPROTO (-71) errors on Bi:5:005:14, which kills further ping
replies after your last good seq (2326).
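
The sequence described above can be checked directly against a usbmon text log. The following is only a minimal sketch: it assumes the plain usbmon text format (URB tag, timestamp, event type, address, status, length), without the leading "2161:" line-number prefix shown above, and the names Completion, parse_completions and failed_completions are made up for this illustration. The errno names are the Linux values already mentioned in this thread.

```python
from typing import Iterable, Iterator, NamedTuple

# Linux errno values named in this thread (hard-coded on purpose so the
# sketch does not depend on the errno numbering of the machine running it).
ERRNO_NAMES = {71: "EPROTO", 75: "EOVERFLOW"}


class Completion(NamedTuple):
    timestamp: int
    address: str   # e.g. "Bi:5:005:14" = Bulk IN, bus 5, device 5, endpoint 14
    status: int    # 0 on success, negative errno on failure
    length: int


def parse_completions(lines: Iterable[str]) -> Iterator[Completion]:
    """Yield the completion ('C') events found in usbmon text lines."""
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or fields[2] != "C":
            continue
        try:
            timestamp = int(fields[1])
            # Interrupt/isochronous URBs use "status:interval[...]";
            # only the part before the first colon is the status.
            status = int(fields[4].split(":")[0])
            length = int(fields[5])
        except ValueError:
            continue  # not a line this sketch understands, skip it
        yield Completion(timestamp, fields[3], status, length)


def failed_completions(lines: Iterable[str], address: str) -> list:
    """Return all non-zero-status completions for one endpoint address."""
    return [c for c in parse_completions(lines)
            if c.address == address and c.status != 0]


# The two completion lines quoted in this mail (data words omitted).
sample = [
    "ffffff8003c8cd80 493245221 C Bi:5:005:14 -75 1024",
    "ffffff8003c8cb40 493245744 C Bi:5:005:14 -71 0",
]

failures = failed_completions(sample, "Bi:5:005:14")
for c in failures:
    name = ERRNO_NAMES.get(-c.status, "unknown")
    print(f"{c.timestamp} {c.address} status {c.status} ({name}) len {c.length}")
```

Run against the full log file instead of the two sample lines, this would list every failed completion on the Bulk IN endpoint in order, which makes it easy to see whether EOVERFLOW really precedes the first EPROTO.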


*** XHCI TRACE ***

I found the first failure in your xHCI trace.

First error line: line 8216

Timestamp: 758267.000115

Event: xhci_handle_event … type 'Transfer Event' … 'Error' … slot 1 ep
29 … len 1472

Why ep 29? In xHCI, the endpoint context index is ep_index = 2 *
ep_number + (direction), where direction is 0=OUT, 1=IN.
So for Bulk IN ep 14: 2*14+1 = 29 → that’s your IN 0x8E pipe.
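
As a small worked example of that mapping, here is a sketch of the arithmetic. It assumes that the 14 in Bi:5:005:14 is the decimal endpoint number as usbmon prints it; the function names are made up for this illustration and are not kernel functions.

```python
def xhci_ep_context_index(ep_number: int, is_in: bool) -> int:
    """xHCI endpoint ID: 2 * endpoint number + direction (0=OUT, 1=IN)."""
    if not 0 <= ep_number <= 15:
        raise ValueError("USB endpoint numbers range from 0 to 15")
    if ep_number == 0:
        return 1  # the default control endpoint is bidirectional
    return 2 * ep_number + (1 if is_in else 0)


def endpoint_address(ep_number: int, is_in: bool) -> int:
    """bEndpointAddress as it appears in descriptors: bit 7 set means IN."""
    if not 0 <= ep_number <= 15:
        raise ValueError("USB endpoint numbers range from 0 to 15")
    return ep_number | (0x80 if is_in else 0x00)


# Bulk IN endpoint 14 from the usbmon trace.
print(xhci_ep_context_index(14, is_in=True))   # 29
print(hex(endpoint_address(14, is_in=True)))   # 0x8e
```

So "ep 29" in the xHCI trace and "Bi:5:005:14" in the usbmon trace refer to the same endpoint, whose descriptor address would be 0x8E.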

Right after that line you can see the driver react:

xhci_handle_transfer … length 1472 … (the failed TD)

xhci_queue_command: Reset Endpoint Command … ep 29 (host tries to recover)

xhci_handle_event: … 'Command Completion Event' (reset completes)

But from this point on, completions for that IN EP correspond to usbmon
-71 (EPROTO) — matching what you saw.


Does this give a clue, where it could be coming from?

It is 100% reproducible within a few seconds on the Raspberry Pi Compute
Module 5 (and I see the same behaviour with a different kernel on the i.MX8MP).

Could it be a hardware problem? I have already tried different radio
modules (all Qualcomm, X62/X65 and X72/X75), different cables (all the
same length, all from the same source), and different eval boards for
the M.2 radio modules (but from the same source).


Can you give me a hint on what to try next?


ChatGPT-5 suggests trying to disable LPM for USB3. Could this be a
next step, or is it something else?
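
One way to try this, as far as I understand the kernel documentation, would be the usbcore.quirks boot parameter, where the flag "k" stands for USB_QUIRK_NO_LPM. The sketch below only builds the parameter string. The vendor and product IDs in it are placeholders, not the real IDs of the radio module (those would have to be read from lsusb), and whether disabling LPM helps with this problem at all is untested.

```python
def usbcore_quirks_param(vendor_id: int, product_id: int, flags: str = "k") -> str:
    """Build a 'usbcore.quirks=VID:PID:Flags' kernel command line entry.

    Flag 'k' is USB_QUIRK_NO_LPM according to the kernel-parameters
    documentation (device can't handle Link Power Management).
    """
    for value in (vendor_id, product_id):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("USB vendor/product IDs are 16-bit values")
    return f"usbcore.quirks={vendor_id:04x}:{product_id:04x}:{flags}"


# Placeholder IDs for illustration only; take the real ones from lsusb.
print(usbcore_quirks_param(0x1234, 0x5678))
```

The resulting string would be appended to the kernel command line (on the Raspberry Pi that should be cmdline.txt), followed by a reboot.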


Many thanks for your help!

Best regards,

Martin
