Why does object recovery take much longer than the outage that caused it?

Niklas Hambüchen <mail@xxxxxx> · Fri, 19 Sep 2025 13:23:54 +0200

I noticed that on my clusters, even a short 5-minute network outage or a single-host reboot can cause

    pgs:     5586988/366684639 objects misplaced (1.524%)

which at the speed of

    recovery: 2.2 GiB/s, 676 objects/s

can take hours to recover.
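
As a rough sanity check of that estimate, here is a minimal Python sketch that uses only the figures quoted above. It assumes the recovery rate stays constant for the whole run, which is a simplification, and the variable names are mine, not anything from Ceph.

```python
# Figures copied from the status output quoted above.
misplaced_objects = 5_586_988
total_objects = 366_684_639
recovery_objects_per_s = 676
recovery_gib_per_s = 2.2

# Assumption: the recovery rate stays constant until recovery finishes.
seconds = misplaced_objects / recovery_objects_per_s
hours = seconds / 3600

# Share of all objects that are misplaced, as a percentage.
percent_misplaced = 100 * misplaced_objects / total_objects

# Average size of a recovered object implied by the two reported rates.
mib_per_object = recovery_gib_per_s * 1024 / recovery_objects_per_s

print(f"misplaced: {percent_misplaced:.3f}%")
print(f"estimated recovery time: {seconds:.0f} s (~{hours:.1f} h)")
print(f"implied average object size: {mib_per_object:.1f} MiB")
```

Under that assumption the misplaced objects need roughly 2.3 hours to move, for an outage of about 5 minutes.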

I don't understand how this can be. If a host is down for such a short time, how can rebalancing take this long?

I'm using Ceph 19.2.2 on HDDs, with SSDs as the BlueStore "db" device.
Is it perhaps that new files are written linearly to the HDDs (fast), while recovery seeks around on the HDDs in random order (slow)?

In any case, this asymmetry is quite annoying.
Can anything be done about it?

Thanks!
Niklas