> `iostat -dxm 5` output during the fio run on both kernels will give
> us some indication of the differences in IO patterns, queue depths,
> etc.

iostat files attached.

fedora 42

[root@localhost ~]# fio --name=test --rw=read --bs=256k --filename=/mnt/testfile --direct=1 --numjobs=1 --iodepth=64 --exitall --group_reporting --ioengine=libaio --runtime=30 --time_based
test: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=libaio, iodepth=64
fio-3.39-44-g19d9
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=43.6GiB/s][r=179k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=18826: Wed May  7 13:44:38 2025
  read: IOPS=178k, BW=43.4GiB/s (46.7GB/s)(1303GiB/30001msec)
    slat (usec): min=3, max=267, avg= 5.29, stdev= 1.62
    clat (usec): min=147, max=2549, avg=354.18, stdev=28.87
     lat (usec): min=150, max=2657, avg=359.47, stdev=29.15

rocky 9.5

[root@localhost ~]# fio --name=test --rw=read --bs=256k --filename=/mnt/testfile --direct=1 --numjobs=1 --iodepth=64 --exitall --group_reporting --ioengine=libaio --runtime=30 --time_based
test: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=libaio, iodepth=64
fio-3.39-44-g19d9
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=98.3GiB/s][r=403k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=10500: Wed May  7 15:16:39 2025
  read: IOPS=403k, BW=98.4GiB/s (106GB/s)(2951GiB/30001msec)
    slat (nsec): min=1101, max=156185, avg=2087.89, stdev=1415.57
    clat (usec): min=82, max=951, avg=156.56, stdev=20.19
     lat (usec): min=83, max=1078, avg=158.65, stdev=20.25

> Silly question: if you use DM to create the same RAID 0 array
> with a dm table such as:
>
> 0 75011629056 striped 12 1024 /dev/nvme7n1 0 /dev/nvme0n1 0 .... /dev/nvme12n1 0
>
> to create a similar 38TB raid 0 array, do you see the same perf
> degradation?

Will check that tomorrow.

Anton

Wed, May 7, 2025 at 00:46, Dave Chinner <david@xxxxxxxxxxxxx>:
>
> On Tue, May 06, 2025 at 02:03:37PM +0300, Anton Gavriliuk wrote:
> > > So is this MD chunk size related? i.e. what is the chunk size
> > > the MD device? Is it smaller than the IO size (256kB) or larger?
> > > Does the regression go away if the chunk size matches the IO size,
> > > or if the IO size vs chunk size relationship is reversed?
> >
> > According to the output below, the chunk size is 512K,
>
> Ok.
>
> `iostat -dxm 5` output during the fio run on both kernels will give
> us some indication of the differences in IO patterns, queue depths,
> etc.
>
> Silly question: if you use DM to create the same RAID 0 array
> with a dm table such as:
>
> 0 75011629056 striped 12 1024 /dev/nvme7n1 0 /dev/nvme0n1 0 .... /dev/nvme12n1 0
>
> to create a similar 38TB raid 0 array, do you see the same perf
> degradation?
>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
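[For readers following along: the dm-striped table quoted above could be generated and loaded with `dmsetup` roughly as follows. This is a hedged sketch, not a command from the thread; the device list here is a hypothetical two-disk stand-in (extend it to all 12 NVMe devices), while the sector count and the 1024-sector (512KiB) chunk come from the table in the email.]

```shell
# Sketch: build a dm-striped table matching the 12-disk MD RAID 0 setup.
# Device names below are placeholders; adjust to your system.
SECTORS=75011629056                   # total array size in 512-byte sectors (~38TB)
CHUNK=1024                            # 512KiB stripe chunk, expressed in sectors
DEVICES="/dev/nvme0n1 /dev/nvme1n1"   # extend to the full 12-device list

NDEV=$(echo $DEVICES | wc -w)
TABLE="0 $SECTORS striped $NDEV $CHUNK"
for d in $DEVICES; do
  TABLE="$TABLE $d 0"                 # each member device starts at offset 0
done

echo "$TABLE"
# To actually create the device-mapper target (requires root):
#   echo "$TABLE" | dmsetup create dmtest0
```

The resulting `/dev/mapper/dmtest0` device could then be fed the same fio job as above to compare against the MD array.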
Attachment: rocky_95_iostat_dxm_5 (binary data)
Attachment: fedora_42_iostat_dxm_5 (binary data)