On 03/25, Dominique Martinet wrote:
>
> Thanks for the traces.
>
> w/ revert
> K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530:
> > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll
> > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll
> > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7
>
> new behavior
> > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll
> > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll
> > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512)
>
> For me the problem isn't so much that this gets ERESTARTSYS but that it
> nevers gets to read the 7 bytes that are available?

Yes...

OK, let's first recall what the commit aaec5a95d59615523 ("pipe_read: don't
wake up the writer if the pipe is still full") does. It simply removes the
unnecessary/spurious wakeups when the writer can't add more data to the pipe.

See the "stupid test-case" in
https://lore.kernel.org/all/20250120144338.GC7432@xxxxxxxxxx/
In particular this note:

	As you can see, without this patch pipe_read() wakes the writer up
	4095 times for no reason, the writer burns a bit of CPU and blocks
	again after wakeup until the last read(fd[0], &c, 1).

In this test-case the writer sleeps in pipe_write(), but the same is true
for the task sleeping in poll( { .fd = pipe_fd, .events = POLLOUT}, ...).

Now, after some grepping I have found

	static void p9_conn_create(struct p9_client *client)
	{
		...
		init_poll_funcptr(&m->pt, p9_pollwait);

		n = p9_fd_poll(client, &m->pt, NULL);
		...
	}

So, iiuc, in this case p9_fd_poll(&m->pt /* != NULL */) -> p9_pollwait()
paths will add the "dummy" pwait->wait entries with ->func = p9_pollwake
to pipe_inode_info.rd_wait and pipe_inode_info.wr_wait.

Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the
ret from vfs_poll(ts->rd), but I assume this is correct.

This means that every time pipe_read() does wake_up(&pipe->wr_wait)
p9_pollwake() is called. This function kicks p9_poll_workfn() which calls
p9_poll_mux() which calls p9_fd_poll() again with pt == NULL. In this case
the conditional vfs_poll(ts->wr) looks more understandable...

So. Without the commit above, p9_poll_mux()->p9_fd_poll() can be called
much more often and, in particular, can report the "additional" EPOLLIN.

Can this somehow explain the problem?

Oleg.