On Tue, Jul 22, 2025 at 2:58 PM Bernd Schubert <bschubert@xxxxxxx> wrote:
>
> Currently, FUSE io-uring requires all queues to be registered before
> becoming ready, which can result in too much memory usage.
>
> This patch introduces a static queue mapping system that allows FUSE
> io-uring to operate with a reduced number of registered queues by:
>
> 1. Adding a queue_mapping array to track which registered queue each
>    CPU should use
> 2. Replacing the is_ring_ready() check with immediate queue mapping
>    once any queues are registered
> 3. Implementing fuse_uring_map_queues() to create CPU-to-queue mappings
>    that prefer NUMA-local queues when available
> 4. Updating fuse_uring_get_queue() to use the static mapping instead
>    of direct CPU-to-queue correspondence
>
> The mapping prioritizes NUMA locality by first attempting to map CPUs
> to queues on the same NUMA node, falling back to any available
> registered queue if no local queue exists.

Do we need a static queue map, or does it suffice to just overload a
queue on the local node if we're not able to find an "ideal" queue for
the request? It seems to me that if we default to that behavior, we
get the advantages the static queue map is trying to provide (e.g.
marking the ring as ready as soon as the first queue is registered and
finding a last-resort queue for the request) without the overhead.

Thanks,
Joanne
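
To make the comparison concrete, here is a rough userspace sketch of the two
options. It is not the patch's code: struct cpu_info, pick_queue() and
map_queues() are hypothetical names, and it assumes one potential queue per
CPU with the queue index equal to the CPU number. pick_queue() is the
per-request lookup (own queue, then any registered queue on the same NUMA
node, then any registered queue at all); map_queues() just caches that result
per CPU, which is roughly what a static map would buy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_QUEUE (-1)

/* Hypothetical per-CPU slot; queue index == CPU number in this sketch. */
struct cpu_info {
	int numa_node;   /* NUMA node this CPU belongs to */
	bool registered; /* true once the queue for this CPU is registered */
};

/*
 * Per-request lookup: return the queue to use for @cpu, or NO_QUEUE if
 * nothing is registered yet. Preference order: the CPU's own queue, the
 * first registered queue on the same NUMA node, then the first registered
 * queue on any node.
 */
static int pick_queue(const struct cpu_info *cpus, int nr_cpus, int cpu)
{
	int fallback = NO_QUEUE;

	assert(cpus != NULL && cpu >= 0 && cpu < nr_cpus);

	if (cpus[cpu].registered)
		return cpu;

	for (int q = 0; q < nr_cpus; q++) {
		if (!cpus[q].registered)
			continue;
		if (cpus[q].numa_node == cpus[cpu].numa_node)
			return q; /* NUMA-local queue, overload it */
		if (fallback == NO_QUEUE)
			fallback = q; /* remember a last-resort queue */
	}
	return fallback;
}

/*
 * Static variant: cache pick_queue() for every CPU in @queue_mapping.
 * Returns true if at least one queue is registered, i.e. the ring could
 * be treated as ready.
 */
static bool map_queues(const struct cpu_info *cpus, int nr_cpus,
		       int *queue_mapping)
{
	bool ready = false;

	assert(cpus != NULL && queue_mapping != NULL);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		queue_mapping[cpu] = pick_queue(cpus, nr_cpus, cpu);
		if (queue_mapping[cpu] != NO_QUEUE)
			ready = true;
	}
	return ready;
}
```

With only a handful of registered queues, the scan in pick_queue() is short,
so the question is whether caching it per CPU is worth the extra state and
the remapping whenever a queue is registered.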