2 MDSs behind on trimming on my Ceph Cluster since the upgrade from 18.2.6 (reef) to 19.2.2 (squid)

Dear all,

 

I have the following issue on my Ceph cluster: since the upgrade (done with cephadm) from 18.2.6 to 19.2.2, two MDSs are repeatedly reported as behind on trimming.

 

Here are some cluster logs:

 

8/5/25 09:00 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
8/5/25 08:50 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
8/5/25 08:40 AM [WRN] mds.cephfs.node2.isqjza(mds.0): Behind on trimming (326/128) max_segments: 128, num_segments: 326
8/5/25 08:40 AM [WRN] mds.cephfs.node1.ojmpnk(mds.0): Behind on trimming (326/128) max_segments: 128, num_segments: 326
8/5/25 08:40 AM [WRN] [WRN] MDS_TRIM: 2 MDSs behind on trimming
8/5/25 08:40 AM [WRN] Health detail: HEALTH_WARN 2 MDSs behind on trimming
8/5/25 08:33 AM [WRN] Health check update: 2 MDSs behind on trimming (MDS_TRIM)
8/5/25 08:33 AM [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
8/5/25 08:30 AM [INF] overall HEALTH_OK
8/5/25 08:22 AM [INF] Cluster is now healthy
8/5/25 08:22 AM [INF] Health check cleared: MDS_TRIM (was: 1 MDSs behind on trimming)
8/5/25 08:22 AM [INF] MDS health message cleared (mds.?): Behind on trimming (525/128)
8/5/25 08:22 AM [WRN] Health check update: 1 MDSs behind on trimming (MDS_TRIM)
8/5/25 08:22 AM [INF] MDS health message cleared (mds.?): Behind on trimming (525/128)
8/5/25 08:20 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
8/5/25 08:10 AM [WRN] mds.cephfs.node2.isqjza(mds.0): Behind on trimming (332/128) max_segments: 128, num_segments: 332
8/5/25 08:10 AM [WRN] mds.cephfs.node1.ojmpnk(mds.0): Behind on trimming (332/128) max_segments: 128, num_segments: 332
8/5/25 08:10 AM [WRN] [WRN] MDS_TRIM: 2 MDSs behind on trimming
8/5/25 08:10 AM [WRN] Health detail: HEALTH_WARN 2 MDSs behind on trimming
8/5/25 08:03 AM [WRN] Health check update: 2 MDSs behind on trimming (MDS_TRIM)
8/5/25 08:03 AM [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
8/5/25 08:00 AM [INF] overall HEALTH_OK
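In case it helps, the live segment count can also be polled straight from the MDS perf counters, to see whether the journal keeps growing or just trims slowly (a sketch; I am assuming squid still exposes the current segment/event counts as "seg"/"ev" in the mds_log section, and that the tell interface accepts perf dump as in recent releases):

# ceph tell mds.cephfs.node1.ojmpnk perf dump mds_log   # "seg" = current segments, "ev" = current events (assumed counter names)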

 

# ceph fs status
cephfs - 50 clients
======
RANK      STATE                 MDS              ACTIVITY     DNS    INOS   DIRS   CAPS
 0        active      cephfs.node1.ojmpnk  Reqs:   10 /s   305k   294k  91.8k   6818
 0-s  standby-replay  cephfs.node2.isqjza  Evts:    0 /s   551k   243k  90.6k      0
      POOL         TYPE     USED  AVAIL
cephfs_metadata  metadata  2630M  2413G
  cephfs_data      data    12.7T  3620G
      STANDBY MDS
cephfs.node3.vdicdn
MDS version: ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)

 

# ceph versions
{
    "mon": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 3
    },
    "mgr": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 2
    },
    "osd": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 18
    },
    "mds": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 3
    },
    "rgw": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 6
    },
    "overall": {
        "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 32
    }
}

 

# ceph orch ps --daemon-type mds
NAME                     HOST       PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mds.cephfs.node1.ojmpnk  rke-sh1-1         running (18h)     4m ago  19M    1709M        -  19.2.2   4892a7ef541b  8dd8db30a1de
mds.cephfs.node2.isqjza  rke-sh1-2         running (18h)     2m ago   3y    1720M        -  19.2.2   4892a7ef541b  7b9d5b692764
mds.cephfs.node3.vdicdn  rke-sh1-3         running (18h)   108s ago  18M    27.9M        -  19.2.2   4892a7ef541b  d2de22a15e18

 

root@node1:~# ceph config show-with-defaults mds.cephfs.rke-sh1-3.vdicdn | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
mds_cache_memory_limit     4294967296  default
mds_cache_trim_decay_rate  1.000000    default
mds_cache_trim_threshold   262144      default
mds_recall_max_caps        30000       default
mds_recall_max_decay_rate  1.500000    default

root@node2:~# ceph config show-with-defaults mds.cephfs.rke-sh1-2.isqjza | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
mds_cache_memory_limit     4294967296  default
mds_cache_trim_decay_rate  1.000000    default
mds_cache_trim_threshold   262144      default
mds_recall_max_caps        30000       default
mds_recall_max_decay_rate  1.500000    default

root@node3:~# ceph config show-with-defaults mds.cephfs.rke-sh1-1.ojmpnk | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
mds_cache_memory_limit     4294967296  default
mds_cache_trim_decay_rate  1.000000    default
mds_cache_trim_threshold   262144      default
mds_recall_max_caps        30000       default
mds_recall_max_decay_rate  1.500000    default
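One thing I notice is that the options above all govern cache trimming and cap recall, while the MDS_TRIM warning counts journal segments against mds_log_max_segments (the 128 in the warning), so the journal-side options may be the ones to look at (sketch; mds_log_trim_threshold is an option I believe exists in recent releases, please correct me if not):

# ceph config get mds mds_log_max_segments    # default 128, matches "max_segments: 128" in the warnings
# ceph config get mds mds_log_trim_threshold  # assumed present in squid, throttles journal trimming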

 

# ceph mds stat
cephfs:1 {0=cephfs.node1.ojmpnk=up:active} 1 up:standby-replay 1 up:standby

 

Do you have an idea of what could be happening? Should I increase mds_cache_trim_decay_rate?
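If tuning is the way to go, I suppose it would look something like this (illustrative values only, nothing applied yet):

# ceph config set mds mds_cache_trim_decay_rate 0.5   # lower should let the MDS trim its cache faster, as I understand it
# ceph config set mds mds_log_max_segments 256        # or raise the journal segment threshold behind the warning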

 

I saw the following issue, which is maybe related: Bug #66948 ('"mon.a (mon.0) 326 : cluster [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)" in cluster log' - CephFS - Ceph), and the corresponding backport PR #60838 ("squid: mds: trim mdlog when segments exceed threshold and trim was idle" by vshankar, ceph/ceph on GitHub).
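If it is that bug, I guess a stopgap until a fix lands would be to bounce the active MDS so the rank restarts and replays/trims its journal, e.g. (disruptive, so only as a last resort):

# ceph mds fail cephfs.node1.ojmpnk   # standby-replay should take over rank 0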

 

Thanks for the help 😊

 

Best Regards, Edouard Fazenda.

 


Swiss Cloud Provider

Edouard Fazenda
Technical Support

Chemin du Curé-Desclouds, 2
CH-1226 Thonex
+41 22 869 04 40

www.csti.ch
