Re: failing to enable disk failure prediction

Jens Galsgaard <jens@xxxxxxxxxxxxx> · Tue, 6 May 2025 17:34:46 +0000

Hi Frédéric.

I didn't see the link before.

I am using this image: quay.io/ceph/ceph@sha256:1607a746adb9332f71b42e98768e8a16ed96e71c1449794fcece9f6ada16b140

From inside the mgr container I can see that it is built on CentOS Stream 9.

The GUI says: 18.2.6 (ff498e17d264a1a4d588c361cbce9cc65afa2327) reef (stable)

The system was installed with cephadm.

Venlig hilsen - Mit freundlichen Grüßen - Kind Regards,
Jens Galsgaard

-----Original message-----
From: Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx>
Sent: Friday, 25 April 2025 09.01
To: Jens Galsgaard <jens@xxxxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re: Re: failing to enable disk failure prediction

Hi Jens,

I suppose you've seen this [1].

sklearn was added to the quay.io Ceph container image as a package installed from Kefu's third-party repo, which is added to the image as /etc/yum.repos.d/_copr:copr.fedorainfracloud.org:tchaikov:python-scikit-learn.repo.

Can you share which container image you're using on Debian Bookworm, and where it's pulled from? 'podman ps' or 'docker ps' should tell you that. If the Ceph container image you're using is based on Debian, it could be that the Python scikit-learn package was not built or installed inside the Debian-based container image.
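
The reported error names 'sklearn.svm.classes' rather than 'sklearn' itself, so it may help to check each level separately. Below is a minimal sketch, assuming a Python 3 interpreter is available inside the mgr container; the helper names module_available and report are made up for illustration and are not part of Ceph.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the import system.

    find_spec imports parent packages of a dotted name, but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package (e.g. 'sklearn');
        # ValueError covers an already-imported module without a spec.
        return False


def report(names):
    """Map each module name to whether it can be located."""
    return {name: module_available(name) for name in names}


if __name__ == "__main__":
    # Module names taken from the error message in this thread.
    for name, ok in report(["sklearn", "sklearn.svm", "sklearn.svm.classes"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If 'sklearn' is reported as found but 'sklearn.svm.classes' is not, the package is present in the image and only that submodule is unavailable, which would be worth mentioning when reporting the issue.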

Regards,
Frédéric.

[1] https://github.com/ceph/ceph-container/pull/1821

----- On 21 Apr 25, at 19:07, Jens Galsgaard jens@xxxxxxxxxxxxx wrote:

> Upgraded to 18.2.6 today and the module is still missing from the MGR container.
> 
> Is this the right place to write about this or is there a better channel?
> 
> Venlig hilsen - Mit freundlichen Grüßen - Kind Regards, Jens Galsgaard
> 
> Gitservice.dk
> Mob: +45 28864340
> 
> 
> -----Original message-----
> From: Jens Galsgaard <jens@xxxxxxxxxxxxx>
> Sent: Monday, 14 April 2025 08.59
> To: ceph-users@xxxxxxx
> Subject: failing to enable disk failure prediction
> 
> Hello,
> 
> I’ve a cluster built with cephadm running on Debian 12/Bookworm.
> Ceph 18.2.5.
> 
> I want to enable disk failure prediction and run this command:
> 
> ceph mgr module enable diskprediction_local
> 
> Then the cluster goes into ERROR state and the logs show:
> 
> 2025-04-14T08:55:37.073118+0200 mgr.host01.eqvsde [ERR] Unhandled exception from module 'diskprediction_local' while running on mgr.host01.eqvsde: No module named 'sklearn.svm.classes'
> 2025-04-14T08:55:38.388289+0200 mon.host03 [ERR] Health check failed: Module 'diskprediction_local' has failed: No module named 'sklearn.svm.classes' (MGR_MODULE_ERROR)
> 
> How can I add sklearn to the container, as it is obviously missing?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an 
> email to ceph-users-leave@xxxxxxx 