On Tue, Mar 25 2025, Bernd Schubert wrote: > On 3/25/25 22:38, Luis Henriques wrote: >> Hi Bernd! >> >> On Tue, Mar 25 2025, Bernd Schubert wrote: >> >>> task-A (application) might be in request_wait_answer and >>> try to remove the request when it has FR_PENDING set. >>> >>> task-B (a fuse-server io-uring task) might handle this >>> request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when >>> fetching the next request and accessed the req from >>> the pending list in fuse_uring_ent_assign_req(). >>> That code path was not protected by fiq->lock and so >>> might race with task-A. >>> >>> For scaling reasons we better don't use fiq->lock, but >>> add a handler to remove canceled requests from the queue. >>> >>> This also removes usage of fiq->lock from >>> fuse_uring_add_req_to_ring_ent() altogether, as it was >>> there just to protect against this race and incomplete. >>> >>> Also added is a comment why FR_PENDING is not cleared. > > Hi Luis, > > thanks for your review! > >> >> At first, this patch looked OK to me. However, after looking closer, I'm >> not sure if this doesn't break fuse_abort_conn(). Because that function >> assumes it is safe to walk through all the requests using fiq->lock, it >> could race against fuse_uring_remove_pending_req(), which uses queue->lock >> instead. Am I missing something (quite likely!), or does fuse_abort_conn() >> also needs to be modified? > > I don't think there is an issue with abort > > fuse_abort_conn() > spin_lock(&fiq->lock); > list_for_each_entry(req, &fiq->pending, list) > ... > spin_unlock(&fiq->lock); > > ... > > fuse_uring_abort(fc); > > Iterating fiq->pending will not handle any uring request, as these are > in per queue lists. The per uring queues are then handled in > fuse_uring_abort(). > > I.e. I don't think this commit changes anything for abort. Yeah, you're right. Thanks for looking, Bernd. >> >> [ Another scenario that is not problematic, but that could become messy in >> the future is if we want to add support for the FUSE_NOTIFY_RESEND >> operation through uring. Obviously, that's not an issue right now, but >> this patch probably will make it harder to add that support. ] > > Oh yeah, this needs to be fixed. Though I don't think that this patch > changes much. We need to iterate through the per fpq and apply the > same logic? Right, I agree this patch doesn't change anything here. And I guess I also misunderstood the problem here as well -- I thought this would be an issue when adding support for iouring, but in fact it is already a problem. The ideal solution would be to implement NOTIFY_RESEND over iouring, but that would be a bit more evolving. Cheers, -- Luís