On Sun, 9 Mar 2025 18:02:55 +0100 Oleg Nesterov
>
> Well. Prateek has already provide the lengthy/thorough explanation,
> but let me add anyway...
>
lengthy != correct

> On 03/08, Hillf Danton wrote:
> > On Fri, 7 Mar 2025 13:34:43 +0100 Oleg Nesterov <oleg@xxxxxxxxxx>
> > > On 03/07, Oleg Nesterov wrote:
> > > > On 03/07, Hillf Danton wrote:
> > > > > On Fri, 7 Mar 2025 11:54:56 +0530 K Prateek Nayak <kprateek.nayak@xxxxxxx>
> > > > > >> step-03
> > > > > >> task-118766 new reader
> > > > > >> makes pipe empty
> > > > > >
> > > > > >Reader seeing a pipe full should wake up a writer allowing 118768 to
> > > > > >wakeup again and fill the pipe. Am I missing something?
> > > > > >
> > > > > Good catch, but that wakeup was cut off [2,3]
> > >
> > > Please note that "that wakeup" was _not_ removed by the patch below.
> > >
> > After another look, you did cut it.
>
> I still don't think so.
>
> > Link: https://lore.kernel.org/all/20250209150718.GA17013@xxxxxxxxxx/
> ...
> > --- a/fs/pipe.c
> > +++ b/fs/pipe.c
> > @@ -360,29 +360,9 @@ anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
> >  			break;
> >  		}
> >  		mutex_unlock(&pipe->mutex);
> > -
> >  		/*
> >  		 * We only get here if we didn't actually read anything.
> >  		 *
> > -		 * However, we could have seen (and removed) a zero-sized
> > -		 * pipe buffer, and might have made space in the buffers
> > -		 * that way.
> > -		 *
> > -		 * You can't make zero-sized pipe buffers by doing an empty
> > -		 * write (not even in packet mode), but they can happen if
> > -		 * the writer gets an EFAULT when trying to fill a buffer
> > -		 * that already got allocated and inserted in the buffer
> > -		 * array.
> > -		 *
> > -		 * So we still need to wake up any pending writers in the
> > -		 * _very_ unlikely case that the pipe was full, but we got
> > -		 * no data.
> > -		 */
> > -		if (unlikely(wake_writer))
> > -			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
> > -		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
> > -
> > -		/*
> >  		 * But because we didn't read anything, at this point we can
> >  		 * just return directly with -ERESTARTSYS if we're interrupted,
> >  		 * since we've done any required wakeups and there's no need
> > @@ -391,7 +371,6 @@ anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
> >  		if (wait_event_interruptible_exclusive(pipe->rd_wait, pipe_readable(pipe)) < 0)
> >  			return -ERESTARTSYS;
> >
> > -		wake_writer = false;
> >  		wake_next_reader = true;
> >  		mutex_lock(&pipe->mutex);
> >  	}
>
> Please note that in this particular case (hackbench testing)
> pipe_write() -> copy_page_from_iter() never fails. So wake_writer is
> never true before pipe_reader() calls wait_event(pipe->rd_wait).
>
Given never and the BUG_ON below, you accidentally prove that Prateek's
comment is false, no?

> So (again, in this particular case) we could apply the patch below
> on top of Linus's tree.
>
> So, with or without these changes, the writer should be woken up at
> step-03 in your scenario.
>
Fine, before checking my scenario once more, feel free to pinpoint the
line number where writer is woken up, with the change below applied.

> Oleg.
> ---
>
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -360,27 +360,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
>  		}
>  		mutex_unlock(&pipe->mutex);
>
> -		/*
> -		 * We only get here if we didn't actually read anything.
> -		 *
> -		 * However, we could have seen (and removed) a zero-sized
> -		 * pipe buffer, and might have made space in the buffers
> -		 * that way.
> -		 *
> -		 * You can't make zero-sized pipe buffers by doing an empty
> -		 * write (not even in packet mode), but they can happen if
> -		 * the writer gets an EFAULT when trying to fill a buffer
> -		 * that already got allocated and inserted in the buffer
> -		 * array.
> -		 *
> -		 * So we still need to wake up any pending writers in the
> -		 * _very_ unlikely case that the pipe was full, but we got
> -		 * no data.
> -		 */
> -		if (unlikely(wake_writer))
> -			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
> -		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
> -
> +		BUG_ON(wake_writer);
>  		/*
>  		 * But because we didn't read anything, at this point we can
>  		 * just return directly with -ERESTARTSYS if we're interrupted,
>  		 * since we've done any required wakeups and there's no need
>