Ceph OSD with migrated db device fails after upgrade to 19.2.3 from 18.2.7

Hi,

After upgrading one of our Ceph clusters from 18.2.7 to 19.2.3, some OSDs fail to start. For these OSDs, the db devices were moved manually from a partition to an LVM volume months ago.

OSD log shows:

2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 set uid:gid to 167:167 (ceph:ceph)
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 ceph version 19.2.3 (c92aebb279828e9c3c1f5d24613efca272649e62) squid (stable), process ceph-osd, pid 7
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 pidfile_write: ignore empty --pid-file
2025-09-04T11:38:22.055+0000 7fec1bbc4740  1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open path /var/lib/ceph/osd/ceph-256/block
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open stat got: (1) Operation not permitted
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-256: (2) No such file or directory

The block and block.db symlinks within the OSD path get deleted after each startup attempt. Recreating the links manually does not help.
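
To be explicit, the re-linking was roughly the following (the block LV path is taken from the "ceph-volume lvm list" output below, the db LV from the migration described further down; the exact commands may have differed slightly):

  cd /var/lib/ceph/osd/ceph-256
  ln -s /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0 block
  ln -s /dev/ceph-blockdb-01/osd-db-01 block.db
  chown -h ceph:ceph block block.db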

"ceph-bluestore-tool fsck --path ..." shows no errors once the links to block and block.db are recreated.
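
For reference, that check was of this form, with the data path taken from the OSD log above:

  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-256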

Running "ceph-volume activate --osd-id 256" manually within cephadm shell fails with the follwing  error:

--> Failed to activate via LVM: could not find db with uuid 6d676bcd-1f3c-e740-8fdf-6a5156605a3f
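
For completeness, that was run in a cephadm shell on the OSD host, i.e. roughly equivalent to:

  cephadm shell -- ceph-volume activate --osd-id 256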


"ceph-volume lvm list" still shows the old db device and db uuid:

===== osd.256 ======

  [block]       /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0

      block device              /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
      block uuid                GRd7Zo-dXdx-23Wf-507d-dHPw-6UOA-oaf7Qy
      cephx lockbox secret
      cluster fsid              1f3b3198-08b1-418c-a279-7050a2eb1ce3
      cluster name              ceph
      crush device class        None
      db device                 /dev/sdai1
      db uuid                   6d676bcd-1f3c-e740-8fdf-6a5156605a3f
      encrypted                 0
      osd fsid                  93811afc-17a3-4458-8e00-506eb9c92cb0
      osd id                    256
      type                      block
      vdo                       0
      devices                   /dev/sdc

  [db]          /dev/sdai1

      PARTUUID                  6d676bcd-1f3c-e740-8fdf-6a5156605a3f


The db was migrated from the partition /dev/sdai1 to the LVM volume ceph-blockdb-01/osd-db-01 on /dev/sdaa months ago and had been running fine with Ceph 18.2. The migration was done manually with "ceph-bluestore-tool bluefs-bdev-migrate" ("ceph-volume lvm migrate" failed, though).
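
That invocation was along these lines (paths reconstructed from the values above, so treat it as a sketch rather than the literal command that was run):

  ceph-bluestore-tool bluefs-bdev-migrate \
      --path /var/lib/ceph/osd/ceph-256 \
      --devs-source /var/lib/ceph/osd/ceph-256/block.db \
      --dev-target /dev/ceph-blockdb-01/osd-db-01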


Is there any way to fix this?

Best,
Sönke


--
Sönke Schippmann

Universität Bremen
Dezernat 8 - IT Service Center
Referat 82 Serverbetrieb

Office address:
Universität Bremen
Dez. 8-Bi, SFG 1390
Enrique-Schmidt-Str. 7
28359 Bremen

E-Mail: schippmann@xxxxxxxxxxxxx
Tel:    +49 421 218-61327
Fax:    +49 421 218-98-61327

http://www.uni-bremen.de/zfn/