Re: [v19.2.3] All OSDs are not created with a managed spec

Hi Gilles,

Did you check whether ceph-volume recognized all of your HDD and SSD devices?

https://docs.ceph.com/en/squid/cephadm/services/osd/#list-devices

Simply check the output of the following command:


> ceph orch device ls
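
If some devices show up as unavailable there, the wide output also lists the reject reasons, and ceph-volume's own inventory (run via cephadm on one of the hosts) can be compared against it. Just a suggestion, using standard cephadm commands:

> ceph orch device ls --wide
> cephadm ceph-volume inventory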


Regards


On 8/18/25 12:11, Gilles Mocellin wrote:
On 2025-08-18 11:47, Gilles Mocellin wrote:
On 2025-08-18 11:30, Gilles Mocellin wrote:
Hi Cephers,

I'm building a new Squid cluster with cephadm on Ubuntu 24.04.
After expanding my cluster in the Dashboard (adding my 7 hosts),
I chose the throughput_optimized profile, which creates a generic spec for hybrid HDD/SSD:

service_type: osd
service_id: throughput_optimized
service_name: osd.throughput_optimized
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  encrypted: true
  filter_logic: AND
  objectstore: bluestore

The cluster is for a lab environment; on each of the 7 nodes I have 17 SAS 1.2TB HDDs and 1 enterprise SAS 400GB SSD. On my first try, only 28 OSDs were created (out of the 119); the others appeared as down but wouldn't start, and I didn't find any systemd units created for them on the hosts. The VGs and LVs were created, though: there are 17 LVs on the SSD for the WAL/DB of the 17 HDDs (yes, small: 29GB each).
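
To see exactly what ceph-volume prepared on a node (VG/LV layout and tags per OSD), I can run this on each host; it's a standard cephadm command:

cephadm ceph-volume lvm list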

On my second try, it created 72 OSDs, but then it stopped and never tried to continue or re-create the down OSDs.

I couldn't find them again, but I think I saw some OSD creation timeouts in the logs...

What can I do to have my missing OSDs created?
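
One cleanup path I'm considering, assuming the stuck OSDs can simply be purged and their devices zapped so the orchestrator redeploys them (osd.6 and the device path below are only examples):

ceph osd purge 6 --yes-i-really-mean-it
ceph orch device zap fidcl-lyo1-sto-sds-lab-01 /dev/sdX --force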


Some additional information:

ceph -s
  cluster:
    id:     3ebf83bf-7927-11f0-9f3a-246e96bd90a4
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum fidcl-lyo1-sto-sds-lab-01,fidcl-lyo1-sto-sds-lab-02,fidcl-lyo1-sto-sds-lab-03,fidcl-lyo1-sto-sds-lab-04,fidcl-lyo1-sto-sds-lab-05 (age 3d)
    mgr: fidcl-lyo1-sto-sds-lab-01.ovbjpb(active, since 3d), standbys: fidcl-lyo1-sto-sds-lab-02.nqdhpl, fidcl-lyo1-sto-sds-lab-03.cizytz
    osd: 119 osds: 72 up (since 3d), 89 in (since 41m)

  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 769 KiB
    usage:   1.5 TiB used, 79 TiB / 80 TiB avail
    pgs:     1 active+clean


One of the "missing" (not fully created) OSDs is present, but not found:

ceph osd find 6
{
    "osd": 6,
    "addrs": {
        "addrvec": []
    },
    "osd_fsid": "f8745284-8026-4713-8329-cd57cb6842f7",
    "crush_location": {}
}

ceph osd info 6
osd.6 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0)   autoout,exists,new f8745284-8026-4713-8329-cd57cb6842f7

The missing OSDs are not shown by the device list command:
ceph device ls | grep osd.6
is empty...
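
The orchestrator's view might show whether cephadm ever deployed daemons for them; for example:

ceph orch ps --daemon-type osd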


Also, the MGR reports this for every OSD that was not fully created:
Aug 18 10:09:41 fidcl-lyo1-sto-sds-lab-01 ceph-mgr[4186078]: mgr get_metadata_python Requested missing service osd.99
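
If it's useful, the cephadm module's own log should show why the deployments stalled; it can be read with the standard command:

ceph log last cephadm
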
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



