Hi,

I'm just posting this here as a warning to other users: Kioxia KCD61LUL7T68 NVMe devices have terrible discard performance. (All devices report firmware 8001.)

I have multiple of these Kioxia SSDs in a Ceph cluster, and I normally configure bdev_async_discard_threads and bdev_enable_discard on all SSDs.
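For reference, this is roughly how I set those options (real Ceph option names; the thread count below is just an example, not a tuned recommendation):

    ceph config set osd bdev_enable_discard true
    ceph config set osd bdev_async_discard_threads 1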
I have a script which makes new rbd snapshots and clears old ones. It has plenty of sleep commands to make the whole process easier on the cluster, but these SSDs just cannot handle it.
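Roughly, the script has this shape (pool/image names, retention count and sleep times here are made up for the example):

    #!/bin/bash
    # Rotate snapshots for one RBD image: create a new snapshot, then
    # remove the oldest ones, with sleeps so the cleanup is spread out.
    POOL=rbd
    IMAGE=vm-disk-1
    SNAP="backup-$(date +%Y%m%d-%H%M)"
    KEEP=7

    rbd snap create "${POOL}/${IMAGE}@${SNAP}"
    sleep 60

    # Remove everything beyond the newest $KEEP snapshots, one at a time.
    for OLD in $(rbd snap ls "${POOL}/${IMAGE}" --format json \
                 | jq -r '.[].name' | sort | head -n -"${KEEP}"); do
        rbd snap rm "${POOL}/${IMAGE}@${OLD}"
        sleep 300
    done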
Example iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.6%    0.0%   27.0%   10.4%    0.0%   17.0%

     r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz Device
 1482.00     55.9M   208.00  12.3%    0.22    38.6k nvme1n1
   13.00    220.0k     0.00   0.0%    0.15    16.9k nvme2n1

     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz Device
 7333.00     92.6M 15303.00  67.6%    0.15    12.9k nvme1n1
 1470.00     13.2M   294.00  16.7%    0.10     9.2k nvme2n1

     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz Device
  807.00      3.8M     0.00   0.0%    2.27     4.8k nvme1n1
  924.00      7.5M     0.00   0.0%    2.08     8.4k nvme2n1

The NVMes start to choke around 700-800 discard IOPS. These discard op sizes are probably terrible for the SSDs, but as far as I know there is no way to fix that without just disabling discard in BlueStore.
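If anyone wants to compare, the discard limits the device advertises can be read from sysfs (device name taken from the iostat output above, just an example):

    cat /sys/block/nvme1n1/queue/discard_granularity
    cat /sys/block/nvme1n1/queue/discard_max_bytes
    cat /sys/block/nvme1n1/queue/discard_max_hw_bytes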
In this case the SSDs were about half full. I've used multiple brands of SSDs with Ceph with discard enabled and I've never seen results this bad.
Best regards
Adam Prycki