Hi guys.
I've been browsing the net in search of a relatively clear
"howto" but failed to find one. Instead there are many,
sometimes conflicting, notes/thoughts on how to deal with
this or similar situations.
I have a 3-node containerized cluster which lost an OSD - it
crashed; there is nothing wrong with the node and nothing
wrong with the disk, but never mind that.
Is there a howto which covers a containerized environment?
One example I followed is:
https://docs.redhat.com/en/documentation/red_hat_ceph_storage/1.2.3/html/red_hat_ceph_administration_guide/setting_unsetting_overrides
but it is not clear to me what to do with the "broken" containers.
I've got to this point:
-> $ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
-1         0.68359  root default
-3               0      host podster1
-7         0.34180      host podster2
 2    hdd  0.04880          osd.2         up   1.00000  1.00000
 4    hdd  0.29300          osd.4         up   1.00000  1.00000
-5         0.34180      host podster3
 1    hdd  0.04880          osd.1         up   1.00000  1.00000
 5    hdd  0.29300          osd.5         up   1.00000  1.00000
yet:
-> $ ceph orch ps --daemon-type=osd
NAME   HOST                PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID
osd.0  podster1.mine.priv         error          7m ago     3w   -        4096M    <unknown>  <unknown>     <unknown>
osd.1  podster3.mine.priv         running (25h)  7m ago     3w   942M     4096M    19.2.3     aade1b12b8e6  d71051ea79dc
osd.2  podster2.mine.priv         running (6d)   7m ago     3w   1192M    4096M    19.2.3     aade1b12b8e6  e8d05142a73a
osd.3  podster1.mine.priv         error          7m ago     2w   -        4096M    <unknown>  <unknown>     <unknown>
osd.4  podster2.mine.priv         running (6d)   7m ago     2w   3293M    4096M    19.2.3     aade1b12b8e6  6116277f69d1
osd.5  podster3.mine.priv         running (25h)  7m ago     2w   2963M    4096M    19.2.3     aade1b12b8e6  d671bf73cc01
What would be the next steps needed to complete such a
removal and re-use/re-creation of the OSD(s)?
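From the scattered notes I did find, my best guess at the cephadm
flow is the rough sketch below. /dev/sdX is only a placeholder for
whatever device podster1 actually uses, and I'm not sure whether the
"osd rm" step still applies given that osd.0/osd.3 no longer show up
in the CRUSH tree - corrections welcome:

# drain/remove the OSDs via the orchestrator (if they are still in the cluster map)
ceph orch osd rm 0 3 --force
ceph orch osd rm status

# if only the stale daemon/container entries are left, remove those instead
ceph orch daemon rm osd.0 --force
ceph orch daemon rm osd.3 --force

# wipe the underlying devices on podster1 so they can be reused
ceph orch device zap podster1.mine.priv /dev/sdX --force

# re-create the OSDs, either explicitly ...
ceph orch daemon add osd podster1.mine.priv:/dev/sdX
# ... or by letting an OSD service spec pick up the clean devices
ceph orch apply osd --all-available-devices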
p.s. This is a 'lab' setup so I'm not worried, but it'd be
great to complete this process in a healthy manner.
many thanks, L.