The plug mechanism merges block I/O requests (bios) to reduce the
frequency of I/O submission and thereby improve throughput. It can greatly
reduce disk seek overhead on HDDs and plays a key role in optimizing I/O
speed. However, as storage devices have become faster, high-performance
SSDs combined with asynchronous processing mechanisms such as io_uring now
achieve very fast I/O processing, and the delay introduced by flow control
and bio merging may reduce throughput to a certain extent.

After testing, I found that plug adds overhead for highly concurrent
random I/O and 128K sequential I/O on SSDs. It still provides some benefit
for small-block (4K) sequential I/O, which is of course the scenario best
suited to merging, but the current plug does not distinguish between
different usage scenarios.

I have made aggressive modifications to the kernel code to disable the
plug mechanism during I/O submission. The following are the performance
differences after disabling only merging and after completely disabling
plug (merging and flow control):

------------------------------------------------------------------------------------
PCIe Gen4 SSD
16GB Mem
Seq 128K
Random 4K
cmd:
taskset -c 0 ./t/io_uring -b 131072 -d128 -c32 -s32 -R0 -p1 -F1 -B1 -n1 -r5 /dev/nvme0n1
taskset -c 0 ./t/io_uring -b 4096 -d128 -c32 -s32 -R1 -p1 -F1 -B1 -n1 -r5 /dev/nvme0n1
data unit: IOPS
------------------------------------------------------------------------------------
             Enable plug     disable merge     disable plug
Seq IO       50100           50133             50125
Random IO    821K            824K              836K            -1.83%
------------------------------------------------------------------------------------

I used a higher-speed device (PCIe Gen5 server and PCIe Gen5 SSD) to
verify the hypothesis and found that the gap widened further.
------------------------------------------------------------------------------------
             Enable plug     disable merge     disable plug
Seq IO       88938           89832             89869
Random IO    1.02M           1.022M            1.06M           -3.92%
------------------------------------------------------------------------------------

In the current kernel, there is a flag mask (REQ_NOMERGE_FLAGS) that
controls whether I/O operations can be merged. However, whether to plug is
decided solely by whether batch submission is in use
(state->need_plug = max_ios > 2;). I'm wondering whether this criterion is
still appropriate for high-speed SSDs.

So the discussion points are:
- Will plugs gradually disappear as hardware devices develop?
- Is it reasonable to make flow control an optional configuration? Or
  could we change the criteria for determining when to apply plug?
- Are there other thoughts about plug that we can discuss now?

Thanks,
Xue He