On 3/25/25 22:38, Luis Henriques wrote: > Hi Bernd! > > On Tue, Mar 25 2025, Bernd Schubert wrote: > >> task-A (application) might be in request_wait_answer and >> try to remove the request when it has FR_PENDING set. >> >> task-B (a fuse-server io-uring task) might handle this >> request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when >> fetching the next request and accessed the req from >> the pending list in fuse_uring_ent_assign_req(). >> That code path was not protected by fiq->lock and so >> might race with task-A. >> >> For scaling reasons we better don't use fiq->lock, but >> add a handler to remove canceled requests from the queue. >> >> This also removes usage of fiq->lock from >> fuse_uring_add_req_to_ring_ent() altogether, as it was >> there just to protect against this race and incomplete. >> >> Also added is a comment why FR_PENDING is not cleared. Hi Luis, thanks for your review! > > At first, this patch looked OK to me. However, after looking closer, I'm > not sure if this doesn't break fuse_abort_conn(). Because that function > assumes it is safe to walk through all the requests using fiq->lock, it > could race against fuse_uring_remove_pending_req(), which uses queue->lock > instead. Am I missing something (quite likely!), or does fuse_abort_conn() > also needs to be modified? I don't think there is an issue with abort fuse_abort_conn() spin_lock(&fiq->lock); list_for_each_entry(req, &fiq->pending, list) ... spin_unlock(&fiq->lock); ... fuse_uring_abort(fc); Iterating fiq->pending will not handle any uring request, as these are in per queue lists. The per uring queues are then handled in fuse_uring_abort(). I.e. I don't think this commit changes anything for abort. > > [ Another scenario that is not problematic, but that could become messy in > the future is if we want to add support for the FUSE_NOTIFY_RESEND > operation through uring. Obviously, that's not an issue right now, but > this patch probably will make it harder to add that support. ] Oh yeah, this needs to be fixed. Though I don't think that this patch changes much. We need to iterate through the per fpq and apply the same logic? Thanks, Bernd