On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
> So far, with tracing, this is where I'm:
>
> o Mainline + Oleg's optimization reverted:
>
>     ...
>     kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55
>     kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55
>     kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7
>     kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7
>     repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector
>     repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work
>     repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write
>     repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write
>     repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0
>     repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
>     kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector
>     kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work
>     kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up!
>     ... (syzbot reproducer continues)
>
> o Mainline:
>
>     kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7
>     kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7
>     kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55
>     kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55
>     repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector
>     repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work
>     repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write
>     repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write
>     repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0
>     repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS!
>     ... (syzbot reproducer hangs)
>
> There is a third "kworker/u1030" component that never gets woken up for
> reasons currently unknown to me with Oleg's optimization. I'll keep
> digging.
>

Thanks for the update.

It is unclear to me if you checked, so I'm going to have to ask just in
case: when there is a hang, is there *anyone* stuck in pipe code (and if
so, where)?

You can get the kernel to print stacks for all threads with sysrq:
echo t > /proc/sysrq-trigger

-- 
Mateusz Guzik <mjguzik gmail.com>
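
A sysrq-t dump on a machine with this many CPUs is long, so it may help to filter it down to the tasks that have a pipe function on their stack. The following is only a rough sketch, not something from the thread: `pipe_stacks` is a hypothetical helper name, and it assumes the dump marks each task with a header line containing `task:` and prints frames as `symbol+0xoff/0xlen`, which may differ between kernel versions.

```shell
# Hypothetical helper (not from the thread): read a sysrq-t dump on stdin
# and print only the stack frames that sit in a pipe_* function, each group
# preceded by the header line of the task it belongs to.
# Assumption: task headers contain "task:" and frames look like
# "pipe_read+0x3c/0x4a0"; adjust the patterns if your kernel prints otherwise.
pipe_stacks() {
    awk '
        /task:/ { task = $0; shown = 0 }       # remember the latest task header
        /pipe_[a-z_]+[+]0x/ {                  # a frame inside some pipe_* function
            if (!shown && task != "") { print task; shown = 1 }
            print
        }
    '
}
```

Run as root after the hang, something like `echo t > /proc/sysrq-trigger; dmesg | pipe_stacks` should then list the candidates. If the output is empty, check that the kernel log buffer was large enough to hold the whole dump before concluding that nobody is stuck in pipe code.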