Re: Ceph OSD with migrated db device fails after upgrade to 19.2.3 from 18.2.7

Hi,

thank you, guys.

Correcting the LV tags on the block LV and setting the corresponding tags on the db volume did the job:

lvchange --deltag "ceph.db_device=/dev/sdai1" ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
lvchange --deltag "ceph.db_uuid=6d676bcd-1f3c-e740-8fdf-6a5156605a3f" ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0

lvchange --addtag "ceph.db_device=/dev/ceph-blockdb-01/osd-db-01" ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
lvchange --addtag "ceph.db_uuid=LVKxCf-teCI-1k6G-NamB-7A8o-l0h7-ORHHgx" ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0


lvchange --addtag "ceph.block_device=/dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.block_uuid=GRd7Zo-dXdx-23Wf-507d-dHPw-6UOA-oaf7Qy" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.cephx_lockbox_secret=" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.cluster_fsid=1f3b3198-08b1-418c-a279-7050a2eb1ce3" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.cluster_name=ceph" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.crush_device_class=" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.db_device=/dev/ceph-blockdb-01/osd-db-01" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.db_uuid=LVKxCf-teCI-1k6G-NamB-7A8o-l0h7-ORHHgx" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.encrypted=0" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.osd_fsid=93811afc-17a3-4458-8e00-506eb9c92cb0" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.osd_id=256" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.type=db" ceph-blockdb-01/osd-db-01
lvchange --addtag "ceph.vdo=0" ceph-blockdb-01/osd-db-01
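
For the record, the result can be cross-checked before restarting the OSD. This is just a sketch using the VG/LV names from this host; adjust to your own setup:

# verify the tags now present on both LVs, then check what ceph-volume reports
lvs -o lv_name,vg_name,lv_tags ceph-blockdb-01/osd-db-01
lvs -o lv_name,vg_name,lv_tags ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
cephadm shell -- ceph-volume lvm list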


Best,
Sönke




On 04.09.25 at 19:17, Igor Fedotov wrote:
Hi Soenke,

Migrating the DB volume with ceph-bluestore-tool was the wrong step. It does not set up the LV tags on the underlying volumes, which prevents proper detection of the OSD devices after a reboot.

These tags have to be set manually with "lvchange --addtag". The DB tags are largely the same as the ones on the block device, but some additional tuning is still required.

Unfortunately, AFAIK there is no complete how-to available. One of Eugene's links covers the topic only partly, so you should rather use an existing, valid OSD deployment as a reference.
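
For example, dumping the tags of a known-good OSD gives a reasonable template for the missing db tags (a sketch with plain LVM and ceph-volume commands):

# list LV tags of all LVs on a host with a healthy OSD
lvs -o lv_name,vg_name,lv_tags --separator ' | '
# or compare against what ceph-volume reports for that OSD
ceph-volume lvm list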


Thanks,

Igor

On 9/4/2025 3:55 PM, Soenke Schippmann wrote:
Hi,

after upgrading one of our Ceph clusters from 18.2.7 to 19.2.3, some OSDs fail to start. For these OSDs, the db devices were moved manually from a partition to an LVM volume months ago.

OSD log shows:

2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 set uid:gid to 167:167 (ceph:ceph)
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 ceph version 19.2.3 (c92aebb279828e9c3c1f5d24613efca272649e62) squid (stable), process ceph-osd, pid 7
2025-09-04T11:38:22.055+0000 7fec1bbc4740  0 pidfile_write: ignore empty --pid-file
2025-09-04T11:38:22.055+0000 7fec1bbc4740  1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open path /var/lib/ceph/osd/ceph-256/block
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 bdev(0x556f24a77400 /var/lib/ceph/osd/ceph-256/block) open stat got: (1) Operation not permitted
2025-09-04T11:38:22.055+0000 7fec1bbc4740 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-256: (2) No such file or directory

Links to block and block.db within the osd path get deleted after each startup attempt. Recreating the links manually does not help.

ceph-bluestore-tool fsck --path ... shows no errors if the links to block and block.db are recreated.
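
For completeness, recreating the links and running fsck looked roughly like this (a sketch; paths as seen inside the OSD container, LV names as described further down):

# recreate the symlinks that keep disappearing, then fsck the OSD
ln -snf /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0 /var/lib/ceph/osd/ceph-256/block
ln -snf /dev/ceph-blockdb-01/osd-db-01 /var/lib/ceph/osd/ceph-256/block.db
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-256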

Running "ceph-volume activate --osd-id 256" manually within a cephadm shell fails with the following error:

--> Failed to activate via LVM: could not find db with uuid 6d676bcd-1f3c-e740-8fdf-6a5156605a3f


ceph-volume lvm list shows the outdated db uuid and db device:

===== osd.256 ======

  [block] /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0

      block device              /dev/ceph-402b182b-c5bd-416d-bde6-668e772b0c4c/osd-block-93811afc-17a3-4458-8e00-506eb9c92cb0
      block uuid                GRd7Zo-dXdx-23Wf-507d-dHPw-6UOA-oaf7Qy
      cephx lockbox secret
      cluster fsid              1f3b3198-08b1-418c-a279-7050a2eb1ce3
      cluster name              ceph
      crush device class        None
      db device                 /dev/sdai1
      db uuid                   6d676bcd-1f3c-e740-8fdf-6a5156605a3f
      encrypted                 0
      osd fsid                  93811afc-17a3-4458-8e00-506eb9c92cb0
      osd id                    256
      type                      block
      vdo                       0
      devices                   /dev/sdc

  [db]          /dev/sdai1

      PARTUUID                  6d676bcd-1f3c-e740-8fdf-6a5156605a3f


The db was migrated from partition /dev/sdai1 to the LVM volume ceph-blockdb-01/osd-db-01 on /dev/sdaa months ago and had been running fine with Ceph 18.2. The migration was done manually using "ceph-bluestore-tool bluefs-bdev-migrate" (ceph-volume lvm migrate failed, though).
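
The invocation was roughly the following (reconstructed for illustration; the exact command line was not preserved here):

# reconstructed sketch, paths as on this host
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-256 \
    --devs-source /var/lib/ceph/osd/ceph-256/block.db \
    --dev-target /dev/ceph-blockdb-01/osd-db-01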


Is there any way to fix this?

Best,
Sönke



--
Sönke Schippmann

Universität Bremen
Dezernat 8 - IT Service Center
Referat 82 Serverbetrieb

Büroanschrift:
Universität Bremen
Dez. 8-Bi, SFG 1390
Enrique-Schmidt-Str. 7
28359 Bremen

E-Mail: schippmann@xxxxxxxxxxxxx
Tel:    +49 421 218-61327
Fax:    +49 421 218-98-61327

http://www.uni-bremen.de/zfn/
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



