Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter

Hello all,

On 3/25/2025 8:28 PM, Dominique Martinet wrote:
Thanks for the traces.

K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530:
w/ revert
    kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_poll rd poll
    kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_request wr poll
    kworker/100:1-1803    [100] .....   286.618823: p9_read_work: Data read wait 7

new behavior
            repro-4076    [031] .....    95.011394: p9_fd_poll: p9_fd_poll rd poll
            repro-4076    [031] .....    95.011394: p9_fd_poll: p9_fd_request wr poll
            repro-4076    [031] .....    99.731970: p9_client_rpc: Wait event killable (-512)

For me the problem isn't so much that this gets ERESTARTSYS but that it
never gets to read the 7 bytes that are available?

I'm back after getting distracted for a bit. So here it goes:

On the mainline, the signal is actually a SIGKILL sent by the parent
thread of the reproducer after a timeout:

           repro-4084    [112] d..2.   233.654264: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=0
           repro-4084    [112] d..1.   233.654272: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=1
           repro-4085    [125] .....   233.654304: p9_client_rpc: Wait event killable (-512) (1) (0) (118)
           repro-4085    [125] .....   233.654308: p9_client_rpc: Flushed (-512)
           repro-4085    [125] .....   233.654312: p9_client_rpc: Final error (-512)
           repro-4085    [125] .....   233.654313: p9_client_write: p9_client_rpc done
           repro-4085    [125] .....   233.654313: p9_client_write: p9_client_rpc error (-512)
           repro-4085    [125] .....   233.654315: v9fs_issue_write: Issue write done 2 err(-512)

Specifically, this part of the reproducer:

    for (;;) {
      sleep_ms(10);
      if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
        break;
      if (current_time_ms() - start < 5000)
        continue;
      kill_and_wait(pid, &status); /* <------- here */
      break;
    }

Now for the 7 bytes that were not read: if you look at the traces,
every time there is a valid trans_mod->request(), there is a kworker
wakeup:

           repro-4102    [100] .....   286.618772: p9_client_write: p9_client_rpc
           repro-4102    [100] .....   286.618780: p9_fd_request: p9_fd_request
           repro-4102    [100] .....   286.618781: p9_fd_request: p9_fd_request EPOLL
           repro-4102    [100] .....   286.618781: p9_fd_request: p9_fd_request schedule work  <----- this
   kworker/100:1-1803    [100] .....   286.618784: p9_write_work: Data write wait 32770
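
For reference, the tail of p9_fd_request() that emits the "schedule work"
trace above looks roughly like this (paraphrased from net/9p/trans_fd.c,
so the exact lines may differ slightly):

	if (test_and_clear_bit(Wpending, &m->wsched))
		n = EPOLLOUT;
	else
		n = p9_fd_poll(m->client, NULL, NULL);

	/* the write worker is only kicked when poll reports EPOLLOUT */
	if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched))
		schedule_work(&m->wq);

	return 0;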

However, for that last 7-byte read, there is in fact no wakeup:

           repro-4102    [100] .....   286.618805: p9_client_write: p9_client_rpc
           repro-4102    [100] .....   286.618817: p9_fd_request: p9_fd_request
           repro-4102    [100] .....   286.618818: p9_fd_request: p9_fd_request EPOLL
           repro-4102    [100] .....   286.618818: p9_fd_poll: p9_fd_poll rd poll
           repro-4102    [100] .....   286.618818: p9_fd_poll: p9_fd_request wr poll  <--- No wakeup after
   kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_poll rd poll
   kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_request wr poll
   kworker/100:1-1803    [100] .....   286.618823: p9_read_work: Data read wait 7
   kworker/100:1-1803    [100] .....   286.618825: p9_read_work: Data read 7

The kworker would spuriously wake up, do something, realize it wasn't
supposed to read it, and then disconnect the connection, waking up all
waiters.
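
The teardown it does is (roughly, paraphrasing p9_conn_cancel() in
net/9p/trans_fd.c) to mark the connection as broken and error out every
queued request, waking any waiters still sleeping on them:

	m->err = err;
	/* every req on req_list / unsent_req_list is moved to cancel_list */
	...
	list_for_each_entry_safe(req, rtmp, &cancel_list, req_list) {
		list_del(&req->req_list);
		if (!req->t_err)
			req->t_err = err;
		/* wakes whoever sleeps on req->wq in p9_client_rpc() */
		p9_client_cb(m->client, req, REQ_STATUS_ERROR);
	}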

But if we haven't sent a request, why are we waiting at all?
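
(The wait in question is the one in p9_client_rpc(); roughly, from
net/9p/client.c, so again paraphrased:

	err = c->trans_mod->request(c, req);
	if (err < 0) {
		/* write won't happen */
		...
		goto recalc_sigpending;
	}

	/* Wait for the response */
	err = wait_event_killable(req->wq,
				  READ_ONCE(req->status) >= REQ_STATUS_RCVD);

p9_fd_request() returns 0 even when poll did not report EPOLLOUT and no
work was scheduled, so the caller ends up in that killable wait with
nothing making progress on the other side.)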


If the repro has already written the bytes in both cases then there's no
reason to wait 5 seconds here...

OTOH syzbot 9p code is silly and might have been depending on something
that's not supposed to work e.g. they might be missing a flush or
equivalent for all I know (I still haven't looked at the repro)


I believe that if c->trans_mod->request(c, req) does not dispatch the
request to the worker for RPC, the caller must not wait. So here goes
something that got the benchmark going in my case; it might be totally
wrong, but it is worth a look:

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 196060dc6138..148533c993f1 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -687,7 +687,11 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
 	else
 		n = p9_fd_poll(m->client, NULL, NULL);
 
-	if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched))
+	/* Request was not sent */
+	if (!(n & EPOLLOUT))
+		return -EIO;
+
+	if (!test_and_set_bit(Wworksched, &m->wsched))
 		schedule_work(&m->wq);
 
 	return 0;
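
(If I'm reading p9_client_rpc() right, returning an error from
trans_mod->request() makes the caller bail out before the killable wait
instead of sleeping on a request that was never handed to the write
worker, at the cost of marking the client Disconnected.)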




