First of all, let me remind that I know nothing about 9p or netfs ;) And I am not sure that my patch is the right solution. I am not even sure we need the fix, according to syzbot testing the problem goes away with the fixes from David https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes but I didn't even try to read them, this is not my area. Now, I'll try to answer some of your questions, but I can be easily wrong. On 03/29, asmadeus@xxxxxxxxxxxxx wrote: > > Right, so your patch sounds better than Prateek's initial blowing > up workaround, but it's a bit weird anyway so let me recap: > - that syz repro has this unnatural pattern where the replies are all > written before the requests are sent Yes, > - 9p_read_work() (read worker) has an optimization that if there is no > in fly request then there obviously must be nothing to read (9p is 100% > client initiated, there's no way the server should send something > first), so at this point the reader task is idle Yes. But note that it does kernel_read() -> pipe_read() before it becomes idle. See below. > - p9_fd_request() (sending a new request) has another optimization that > only checks for tx: at this point if another request was already in > flight then the rx task should have a poll going on for rx, and if there > were no in flight request yet then there should be no point in checking > for rx, so p9_fd_request() only kick in the tx worker if there is room > to send Can't comment, but > - at this point I don't really get the logic that'll wake the rx thread > up either... p9_pollwake() will trigger p9_poll_workfn() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Yes, but where this p9_pollwake() can come from? see below. > - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up > the writer if the pipe is still full"), that 'if there is room to send' > check started failing and tx thread doesn't start? Again, I can be easily wrong, but no. With or without the optimization above, it doesn't make sense to start the tx thread when the pipe is full, p9_fd_poll() can't report EPOLLOUT. Lets recall that the idle read worker did kernel_read() -> pipe_read(). Before this optimization, pipe_read() did the unnecessary wake_up_interruptible_sync_poll(&pipe->wr_wait); when the pipe was full before the reading _and_ is still full after the reading. This wakeup calls p9_pollwake() which kicks p9_poll_workfn(). p9_poll_workfn() calls p9_poll_mux(). p9_poll_mux() does n = p9_fd_poll(). "n & EPOLLOUT" is false, exactly because this wakeup was unnecessary, so p9_poll_mux() won't do schedule_work(&m->wq), this is fine, But, "n & EPOLLIN" is true, so p9_poll_mux() does schedule_work(&m->rq) and wakes the rx thread. p9_read_work() is called again. It reads more data and (I guess) notices some problem and does p9_conn_cancel(EIO). This no longer happens after the optimization. So in some sense the p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this case) restores the old behaviour. But again, again, quite possibly I completely misread this (nontrivial) code. Oleg.