Hi,
after upgrading one of our Ceph clusters from 18.2.7 to 19.2.3, some OSDs
fail to start. For these OSDs, the db devices were moved manually months
ago from a partition to an LVM volume.
OSD log shows:
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 set uid:gid to 167:167 (ceph:ceph)
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 ceph version 19.2.3 (c92aebb279828e9c3c1f5d24613efca272649e62) squid (stable), process ceph-osd, pid 7
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 pidfile_write: ignore empty --pid-file
2025-09-04T11:38:22.055+0000 7fec1bbc4740  1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open path /var/lib/ceph/osd/ceph-256/block
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open stat got: (1) Operation not permitted
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-256: (2) No such file or directory
The links to block and block.db within the OSD path get deleted after each
startup attempt; recreating them manually does not help.
"ceph-bluestore-tool fsck --path ..." shows no errors as long as the links
to block and block.db have been recreated first.
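For reference, recreating the links and running fsck looks roughly like
this (a sketch; the block LV path is taken from the ceph-volume lvm list
output below, and I am assuming the db LV ceph-blockdb-01/osd-db-01
mentioned below shows up as /dev/ceph-blockdb-01/osd-db-01):

  # recreate the block symlink (LV path from the lvm list output below)
  ln -snf /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0 /var/lib/ceph/osd/ceph-256/block
  # recreate the block.db symlink pointing at the LV the db was migrated to
  ln -snf /dev/ceph-blockdb-01/osd-db-01 /var/lib/ceph/osd/ceph-256/block.db
  # make sure the links themselves are owned by ceph:ceph
  chown -h ceph:ceph /var/lib/ceph/osd/ceph-256/block /var/lib/ceph/osd/ceph-256/block.db
  # fsck then reports no errors
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-256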
Running "ceph-volume activate --osd-id 256" manually within a cephadm
shell fails with the following error:
--> Failed to activate via LVM: could not find db with uuid
6d676bcd-1f3c-e740-8fdf-6a5156605a3f
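For completeness, the activation attempt looks roughly like this (a sketch;
"cephadm shell --name osd.256" is just one way to get a shell with the
OSD's data and config mounted):

  # enter a cephadm shell for this OSD
  cephadm shell --name osd.256
  # inside the shell, try to activate the OSD again
  ceph-volume activate --osd-id 256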
ceph-volume lvm list still shows the outdated db uuid and db device:
===== osd.256 ======

  [block]       /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0

      block device              /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
      block uuid                GRd7Zo-dXdx-23Wf-507d-dHPw-6UOA-oaf7Qy
      cephx lockbox secret
      cluster fsid              1f3b3198-08b1-418c-a279-7050a2eb1ce3
      cluster name              ceph
      crush device class        None
      db device                 /dev/sdai1
      db uuid                   6d676bcd-1f3c-e740-8fdf-6a5156605a3f
      encrypted                 0
      osd fsid                  93811afc-17a3-4458-8e00-506eb9c92cb0
      osd id                    256
      type                      block
      vdo                       0
      devices                   /dev/sdc

  [db]          /dev/sdai1

      PARTUUID                  6d676bcd-1f3c-e740-8fdf-6a5156605a3f
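As far as I understand, ceph-volume lvm list reads these values from the
LVM tags stored on the block LV, so the stale db device/uuid should also be
visible directly with something like:

  # show the ceph.* tags on the block LV (where ceph-volume gets db device/uuid from)
  lvs -o lv_name,vg_name,lv_tags ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0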
The db was migrated from the partition /dev/sdai1 to the LVM volume
ceph-blockdb-01/osd-db-01 on /dev/sdaa months ago and has been running fine
with Ceph 18.2 since then. The migration was done manually with
"ceph-bluestore-tool bluefs-bdev-migrate" (ceph-volume lvm migrate failed,
though).
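The migration step was along these lines (a sketch of the
bluefs-bdev-migrate form used, not the exact invocation; LV path as
described above):

  # OSD stopped; move BlueFS db data from the old partition to the new LV
  ceph-bluestore-tool bluefs-bdev-migrate \
      --path /var/lib/ceph/osd/ceph-256 \
      --devs-source /var/lib/ceph/osd/ceph-256/block.db \
      --dev-target /dev/ceph-blockdb-01/osd-db-01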
Is there any way to fix this?
Best,
Sönke
--
Sönke Schippmann
Universität Bremen
Dezernat 8 - IT Service Center
Referat 82 Serverbetrieb
Office address:
Universität Bremen
Dez. 8-Bi, SFG 1390
Enrique-Schmidt-Str. 7
28359 Bremen
E-Mail: schippmann@xxxxxxxxxxxxx
Tel: +49 421 218-61327
Fax: +49 421 218-98-61327
http://www.uni-bremen.de/zfn/
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx