Re: Patching Ceph cluster

Nope, it was really broken in 17.2.7. When RHEL 10 becomes available, I will look into this part again :)

> On 25-04-2025 07:22 CEST, Lukasz Borek <lukasz@xxxxxxxxxxxx> wrote:
> 
> 
> > For upgrading the OS we have something similar, but exiting maintenance mode is broken (with 17.2.7) :(
> > I need to check the tracker for similar issues, and if I can't find anything, I will create a ticket.
> With 18.2.2, the first maintenance-exit command threw an exception for some reason. In my patching script I execute the commands in a loop, and the second attempt usually works (see the sketch after the log below).
> 
> exit maint 1/3
> Error EINVAL: Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/mgr_module.py", line 1809, in _handle_command
>     return self.handle_command(inbuf, cmd)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 183, in handle_command
>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>   File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
>     return self.func(mgr, **kwargs)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 119, in <lambda>
>     wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs) # noqa: E731
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 108, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 778, in _host_maintenance_exit
>     raise_if_exception(completion)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 237, in raise_if_exception
>     e = pickle.loads(c.serialized_exception)
> TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'
> 
> exit maint 2/3
> Ceph cluster f3e63d9e-2f4c-11ef-87a2-0f1170f55ed5 on cephbackup-osd1 has exited maintenance mode
> exit maint 3/3
> Error EINVAL: Host cephbackup-osd1 is not in maintenance mode
> Fri Apr 25 07:17:58 CEST 2025 cluster state is HEALTH_WARN
> Fri Apr 25 07:18:02 CEST 2025 cluster state is HEALTH_WARN
> [...]
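> 
> In outline, the relevant part of the script looks like this (a sketch; the retry count and timings are arbitrary, the host name matches the log above):
> 
>     for i in 1 2 3; do
>         echo "exit maint $i/3"
>         # fire the exit command; the 2nd attempt usually succeeds
>         ceph orch host maintenance exit cephbackup-osd1
>     done
>     # then poll until the cluster is healthy again
>     while ! ceph health | grep -q HEALTH_OK; do
>         echo "$(date) cluster state is $(ceph health)"
>         sleep 4
>     done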
> 
> On Thu, 13 Jun 2024 at 22:07, Sake Ceph <ceph@xxxxxxxxxxx> wrote:
> > 
> >  
> >  For upgrading the OS we have something similar, but exiting maintenance mode is broken (with 17.2.7) :(
> >  I need to check the tracker for similar issues, and if I can't find anything, I will create a ticket.
> >  
> >  Kind regards, 
> >  Sake 
> >  
> >  > On 12-06-2024 19:02 CEST, Daniel Brown <daniel.h.brown@thermify.cloud> wrote:
> >  > 
> >  > 
> >  > I have two Ansible roles, one for enter, one for exit. There are likely better ways to do this, and I'll not be surprised if someone here lets me know. They're using orch commands via the cephadm shell. I'm using Ansible for other configuration management in my environment as well, including setting up clients of the Ceph cluster.
> >  > 
> >  > 
> >  > Below are excerpts from main.yml in the "tasks" directory of the enter/exit roles. The host I'm running Ansible from is one of my Ceph servers; I've limited which processes run there, though, so it's in the cluster but not equal to the others.
> >  > 
> >  > 
> >  > —————
> >  > Enter
> >  > —————
> >  > 
> >  > - name: Ceph Maintenance Mode Enter
> >  >   shell:
> >  >     cmd: 'cephadm shell ceph orch host maintenance enter {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }} --force --yes-i-really-mean-it'
> >  >   become: true
> >  > 
> >  > 
> >  > 
> >  > —————
> >  > Exit
> >  > ————— 
> >  > 
> >  > 
> >  > - name: Ceph Maintenance Mode Exit
> >  >   shell:
> >  >     cmd: 'cephadm shell ceph orch host maintenance exit {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >  >   become: true
> >  >   connection: local
> >  > 
> >  > 
> >  > - name: Wait for Ceph to be available
> >  >   ansible.builtin.wait_for:
> >  >     delay: 60
> >  >     host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >  >     port: 9100
> >  >   connection: local
> >  > 
> >  > 
> >  > > On Jun 12, 2024, at 11:28 AM, Michael Worsham <mworsham@xxxxxxxxxxxxxxxxxx> wrote:
> >  > > 
> >  > > Interesting. How do you set this "maintenance mode"? If you have a series of documented steps that you follow and could provide them as an example, that would be beneficial for my efforts.
> >  > > 
> >  > > We are in the process of standing up both a dev-test environment consisting of 3 Ceph servers (strictly for testing purposes) and a new production environment consisting of 20+ Ceph servers.
> >  > > 
> >  > > We are using Ubuntu 22.04.
> >  > > 
> >  > > -- Michael
> >  > > From: Daniel Brown <daniel.h.brown@thermify.cloud>
> >  > > Sent: Wednesday, June 12, 2024 9:18 AM
> >  > > To: Anthony D'Atri <anthony.datri@xxxxxxxxx>
> >  > > Cc: Michael Worsham <mworsham@xxxxxxxxxxxxxxxxxx>; ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> >  > > Subject: Re:  Patching Ceph cluster
> >  > > 
> >  > > 
> >  > > There's also a maintenance mode that you can set for each server as you're doing updates, so that the cluster doesn't try to move data off the affected OSDs while the server being updated is offline or down. I've worked some on automating this with Ansible, but have found my process (and/or my cluster) still requires some manual intervention while it's running to get things done cleanly.
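> >  > > 
> >  > > The underlying orchestrator commands are roughly these (a sketch; the hostname is a placeholder):
> >  > > 
> >  > >     cephadm shell ceph orch host maintenance enter ceph-node-1 --force
> >  > >     # ... patch the OS and reboot the node ...
> >  > >     cephadm shell ceph orch host maintenance exit ceph-node-1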
> >  > > 
> >  > > 
> >  > > 
> >  > > > On Jun 12, 2024, at 8:49 AM, Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
> >  > > >
> >  > > > Do you mean patching the OS?
> >  > > >
> >  > > > If so, easy -- one node at a time, then after it comes back up, wait until all PGs are active+clean and the mon quorum is complete before proceeding.
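> >  > > >
> >  > > > A minimal wait loop for that step could look like this (a sketch; it just blocks until overall health is OK again):
> >  > > >
> >  > > >     # poll until the cluster reports HEALTH_OK before moving on
> >  > > >     until ceph health | grep -q HEALTH_OK; do
> >  > > >         sleep 30
> >  > > >     done
> >  > > >
> >  > > > If the cluster normally sits in HEALTH_WARN, checking that ceph pg stat reports everything active+clean is a tighter test.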
> >  > > >
> >  > > >
> >  > > >
> >  > > >> On Jun 12, 2024, at 07:56, Michael Worsham <mworsham@xxxxxxxxxxxxxxxxxx> wrote:
> >  > > >>
> >  > > >> What is the proper way to patch a Ceph cluster and reboot the servers in said cluster, if a reboot is necessary for said updates? And is it possible to automate it via Ansible?
> >  > > 
> >  > 
> > 
> 
> 
> --
> 
> Łukasz Borek
> lukasz@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



