Re: [PATCH v2] fuse: {io-uring} Fix a possible req cancellation race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 25 2025, Bernd Schubert wrote:

> On 3/25/25 22:38, Luis Henriques wrote:
>> Hi Bernd!
>> 
>> On Tue, Mar 25 2025, Bernd Schubert wrote:
>> 
>>> task-A (application) might be in request_wait_answer and
>>> try to remove the request when it has FR_PENDING set.
>>>
>>> task-B (a fuse-server io-uring task) might handle this
>>> request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when
>>> fetching the next request and accessed the req from
>>> the pending list in fuse_uring_ent_assign_req().
>>> That code path was not protected by fiq->lock and so
>>> might race with task-A.
>>>
>>> For scaling reasons we better don't use fiq->lock, but
>>> add a handler to remove canceled requests from the queue.
>>>
>>> This also removes usage of fiq->lock from
>>> fuse_uring_add_req_to_ring_ent() altogether, as it was
>>> there just to protect against this race and incomplete.
>>>
>>> Also added is a comment why FR_PENDING is not cleared.
>
> Hi Luis,
>
> thanks for your review!
>
>> 
>> At first, this patch looked OK to me.  However, after looking closer, I'm
>> not sure if this doesn't break fuse_abort_conn().  Because that function
>> assumes it is safe to walk through all the requests using fiq->lock, it
>> could race against fuse_uring_remove_pending_req(), which uses queue->lock
>> instead.  Am I missing something (quite likely!), or does fuse_abort_conn()
>> also needs to be modified?
>
> I don't think there is an issue with abort
>
> fuse_abort_conn()
>    spin_lock(&fiq->lock);
>    list_for_each_entry(req, &fiq->pending, list)
>    ...
>    spin_unlock(&fiq->lock);
>
>    ...
>
>    fuse_uring_abort(fc);
>
> Iterating fiq->pending will not handle any uring request, as these are
> in per queue lists. The per uring queues are then handled in
> fuse_uring_abort().
>
> I.e. I don't think this commit changes anything for abort.

Yeah, you're right.  Thanks for looking, Bernd.

>> 
>> [ Another scenario that is not problematic, but that could become messy in
>>   the future is if we want to add support for the FUSE_NOTIFY_RESEND
>>   operation through uring.  Obviously, that's not an issue right now, but
>>   this patch probably will make it harder to add that support. ]
>
> Oh yeah, this needs to be fixed. Though I don't think that this patch
> changes much. We need to iterate through the per fpq and apply the
> same logic?

Right, I agree this patch doesn't change anything here.  And I guess I
also misunderstood the problem here as well -- I thought this would be an
issue when adding support for iouring, but in fact it is already a
problem.  The ideal solution would be to implement NOTIFY_RESEND over
iouring, but that would be a bit more evolving.

Cheers,
-- 
Luís





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux