Hi,

I'm just posting this here as a warning to other users: Kioxia KCD61LUL7T68 NVMe devices have terrible discard performance. (All devices report firmware 8001.)

I have multiple of these Kioxia SSDs in a Ceph cluster, and I normally configure bdev_async_discard_threads and bdev_enable_discard on all SSDs.
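For reference, this is roughly how I set those options (real Ceph option names; the thread count below is just an example, not a tuned recommendation):

    ceph config set osd bdev_enable_discard true
    ceph config set osd bdev_async_discard_threads 1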
I have a script which makes new rbd snapshots and clears old ones. It has plenty of sleep commands to make the whole process easier on the cluster, but these SSDs just cannot handle it.
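Roughly, the script has this shape (pool/image names, retention count and sleep times here are made up for the example):

    #!/bin/bash
    # Rotate snapshots for one RBD image: create a new snapshot, then
    # remove the oldest ones, with sleeps so the cleanup is spread out.
    POOL=rbd
    IMAGE=vm-disk-1
    SNAP="backup-$(date +%Y%m%d-%H%M)"
    KEEP=7

    rbd snap create "${POOL}/${IMAGE}@${SNAP}"
    sleep 60

    # Remove everything beyond the newest $KEEP snapshots, one at a time.
    for OLD in $(rbd snap ls "${POOL}/${IMAGE}" --format json \
                 | jq -r '.[].name' | sort | head -n -"${KEEP}"); do
        rbd snap rm "${POOL}/${IMAGE}@${OLD}"
        sleep 300
    done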
Example iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.6%    0.0%   27.0%   10.4%    0.0%   17.0%

     r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz Device
 1482.00     55.9M   208.00  12.3%    0.22    38.6k nvme1n1
   13.00    220.0k     0.00   0.0%    0.15    16.9k nvme2n1

     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz Device
 7333.00     92.6M 15303.00  67.6%    0.15    12.9k nvme1n1
 1470.00     13.2M   294.00  16.7%    0.10     9.2k nvme2n1

     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz Device
  807.00      3.8M     0.00   0.0%    2.27     4.8k nvme1n1
  924.00      7.5M     0.00   0.0%    2.08     8.4k nvme2n1

The NVMes start to choke around 700-800 discard IOPS. These discard op sizes are probably terrible for the SSDs, but as far as I know there is no way to fix that without just disabling discard in BlueStore.
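If anyone wants to compare, the discard limits the device advertises can be read from sysfs (device name taken from the iostat output above, just an example):

    cat /sys/block/nvme1n1/queue/discard_granularity
    cat /sys/block/nvme1n1/queue/discard_max_bytes
    cat /sys/block/nvme1n1/queue/discard_max_hw_bytes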
In this case the SSDs were about half full. I've used multiple brands of SSDs with Ceph with discard enabled and I've never seen results this bad.
Best regards
Adam Prycki