Hi Igor,

Thank you so very much for responding so quickly. Interestingly, I don't remember setting these values, but I did see a global-level override of 0.8 on one and 0.2 on another, so I removed the global overrides and am rebooting the server to see what happens. I should know soon enough how things look, and I'll report back.

What I don't understand is why I was able to upgrade this cluster over the past 4-5 years from 14 --> 15 --> 16 --> 17 --> 18.2.4 without issue, yet going from 18.2.4 --> 18.2.6 leaves me dead in the water.
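For anyone else who hits this, clearing the overrides comes down to something like the sketch below, assuming they live in the cluster config database rather than in ceph.conf; the option names are the ones Igor lists further down, and which of them actually carry the 0.8/0.2 is whatever ceph config dump reports. If the 0.8 and 0.2 sat on the meta and kv ratios, the 0.04 kv_onode default pushes the sum to 1.04, just over the 1.0 limit Igor describes.

  # run wherever the ceph CLI and admin keyring are available (e.g. an admin/cephadm shell)
  # list any cache-ratio overrides and the level they are set at
  ceph config dump | grep bluestore_cache

  # what a given OSD would pick up (osd.3 is one of the failing ones in the log)
  ceph config get osd.3 bluestore_cache_meta_ratio
  ceph config get osd.3 bluestore_cache_kv_ratio
  ceph config get osd.3 bluestore_cache_kv_onode_ratio

  # drop whichever global overrides the dump shows so the defaults apply again
  ceph config rm global bluestore_cache_meta_ratio
  ceph config rm global bluestore_cache_kv_ratio

The OSDs read these ratios at startup (that is where _set_cache_sizes fails in the log), so the next start attempt, or the reboot, should be enough for them to pick up the defaults.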
Thanks,
Marco

On Tue, Apr 29, 2025 at 1:18 PM Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

> Hi Marco,
>
> the following log line (unfortunately it was cut off) sheds some light:
>
> "
> Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 -1 bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes bluestore_cache_meta_>
> "
>
> Likely it says that the sum of the bluestore_cache_meta_ratio + bluestore_cache_kv_ratio + bluestore_cache_kv_onode_ratio config parameters exceeds 1.0.
>
> So the parameters have to be tuned so that the sum is less than or equal to 1.0.
>
> Default settings are:
>
> bluestore_cache_meta_ratio = 0.45
> bluestore_cache_kv_ratio = 0.45
> bluestore_cache_kv_onode_ratio = 0.04
>
> Thanks,
>
> Igor
>
>
> On 29.04.2025 13:36, Marco Pizzolo wrote:
> > Hello Everyone,
> >
> > I'm upgrading from 18.2.4 to 18.2.6, and I have a 4-node cluster with 8 NVMes per node. Each NVMe is split into 2 OSDs. The upgrade went through the mgr, mon, and crash daemons and began upgrading the OSDs.
> >
> > The OSDs it was upgrading were not coming back online.
> >
> > I tried rebooting, and no luck.
> >
> > journalctl -xe shows the following:
> >
> > ░░ The unit docker-02cb79ef9a657cdaa26b781966aa6d2f1d5e54cdc9efa6c5ff1f0e98c3a866e4.scope has successfully entered the 'dead' state.
> > Apr 29 06:24:09 prdhcistonode01 dockerd[2967]: time="2025-04-29T06:24:09.282073583-04:00" level=info msg="ignoring event" container=76c56ddd668015de0022bfa2527060e64a9513>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282129114-04:00" level=info msg="shim disconnected" id=76c56ddd668015de0022bfa2527060e64a95137>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282219664-04:00" level=warning msg="cleaning up after shim disconnected" id=76c56ddd668015de00>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.282242484-04:00" level=info msg="cleaning up dead shim"
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 0 osd.3:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) open path /var/lib/cep>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292047607-04:00" level=warning msg="cleanup warnings time=\"2025-04-29T06:24:09-04:00\" level=>
> > Apr 29 06:24:09 prdhcistonode01 dockerd[2967]: time="2025-04-29T06:24:09.292163618-04:00" level=info msg="ignoring event" container=02cb79ef9a657cdaa26b781966aa6d2f1d5e54>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292216428-04:00" level=info msg="shim disconnected" id=02cb79ef9a657cdaa26b781966aa6d2f1d5e54c>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292277279-04:00" level=warning msg="cleaning up after shim disconnected" id=02cb79ef9a657cdaa2>
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.292291949-04:00" level=info msg="cleaning up dead shim"
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) open size 640122932428>
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 -1 bluestore(/var/lib/ceph/osd/ceph-3) _set_cache_sizes bluestore_cache_meta_>
> > Apr 29 06:24:09 prdhcistonode01 bash[23886]: debug 2025-04-29T10:24:09.287+0000 7f6961ae9740 1 bdev(0x56046b4c8000 /var/lib/ceph/osd/ceph-3/block) close
> > Apr 29 06:24:09 prdhcistonode01 containerd[2797]: time="2025-04-29T06:24:09.303385220-04:00" level=warning msg="cleanup warnings time=\"2025-04-29T06:24:09-04:00\" level=>
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 0 osd.0:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> > Apr 29 06:24:09 prdhcistonode01 bash[23144]: debug 2025-04-29T10:24:09.307+0000 7f12f08c5740 -1 osd.15 0 OSD:init: unable to mount object store
> > Apr 29 06:24:09 prdhcistonode01 bash[23144]: debug 2025-04-29T10:24:09.307+0000 7f12f08c5740 -1 ** ERROR: osd init failed: (22) Invalid argument
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/cep>
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) open size 640122932428>
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 -1 bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes bluestore_cache_meta_>
> > Apr 29 06:24:09 prdhcistonode01 bash[24158]: debug 2025-04-29T10:24:09.307+0000 7f2c10403740 1 bdev(0x55d5e45f0000 /var/lib/ceph/osd/ceph-0/block) close
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 mClockScheduler: set_osd_capacity_params_from_config: osd_bandwidth_cost_p>
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 0 osd.8:0.OSDShard using op scheduler mclock_scheduler, cutoff=196
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/cep>
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) open size 640122932428>
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 -1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes bluestore_cache_meta_>
> > Apr 29 06:24:09 prdhcistonode01 bash[24328]: debug 2025-04-29T10:24:09.363+0000 7f30b83b1740 1 bdev(0x555f40688000 /var/lib/ceph/osd/ceph-8/block) close
> > Apr 29 06:24:09 prdhcistonode01 systemd[1]: ceph-fbc38f5c-a3a6-11ea-805c-3b954db9ce7a@osd.12.service: Main process exited, code=exited, status=1/FAILURE
> >
> > Any help you can offer would be greatly appreciated. This is running in docker:
> >
> > Client: Docker Engine - Community
> >  Version:       24.0.7
> >  API version:   1.43
> >  Go version:    go1.20.10
> >  Git commit:    afdd53b
> >  Built:         Thu Oct 26 09:08:01 2023
> >  OS/Arch:       linux/amd64
> >  Context:       default
> >
> > Server: Docker Engine - Community
> >  Engine:
> >   Version:      24.0.7
> >   API version:  1.43 (minimum version 1.12)
> >   Go version:   go1.20.10
> >   Git commit:   311b9ff
> >   Built:        Thu Oct 26 09:08:01 2023
> >   OS/Arch:      linux/amd64
> >   Experimental: false
> >  containerd:
> >   Version:      1.6.25
> >   GitCommit:    d8f198a4ed8892c764191ef7b3b06d8a2eeb5c7f
> >  runc:
> >   Version:      1.1.10
> >   GitCommit:    v1.1.10-0-g18a0cb0
> >  docker-init:
> >   Version:      0.19.0
> >   GitCommit:    de40ad0
> >
> > Thanks,
> > Marco
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx