Re: [RFC v2 04/16] luo: luo_core: Live Update Orchestrator

Pasha Tatashin <pasha.tatashin@xxxxxxxxxx> · Sat, 7 Jun 2025 13:11:14 -0400

> > + * Based on the outcome of the notification process:
> > + * - If luo_do_freeze_calls() returns 0 (all callbacks succeeded), the state
> > + * is set to %LIVEUPDATE_STATE_FROZEN using luo_set_state(), indicating
> > + * readiness for the imminent kexec.
> > + * - If luo_do_freeze_calls() returns a negative error code (a callback
> > + * failed), the state is reverted to %LIVEUPDATE_STATE_NORMAL using
> > + * luo_set_state() to cancel the live update attempt.
>
> Would we end up with a more robust serialization in subsystems or
> filesystems if we do not allow freeze to fail? Then they would be forced
> to ensure they have everything in order by the time the system goes into
> prepared state, and only need to make small adjustments in the freeze
> callback.
>

The reboot syscall is allowed to fail. Since freeze happens once we
leave userspace, it is the only chance left to conduct proper
verification that serialization assumptions have been maintained. For
example, if, after the prepare phase, some mutations are not allowed
for preserved resources (such as DMA re-mappings, etc.), the freeze
phase is the only place where we can perform this verification and
return an error to the user. So, while I agree that allowing
cancellation only from the prepared state could simplify the state
machine, I think it is important to keep the ability to fail in the
freeze phase as well.
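
To make the flow concrete, here is a minimal userspace sketch of the
freeze step described in the quoted comment. It is only an illustration
under stated assumptions, not the code from the patch: struct
preserved_res, its mutated_after_prepare flag, res_freeze(), the
luo_freeze() wrapper, and the argument list of luo_do_freeze_calls()
are hypothetical. Only the state names and the success/failure
transitions are taken from the comment above.

```c
#include <errno.h>
#include <stdbool.h>

/* State names follow the quoted comment; this is a userspace model. */
enum liveupdate_state {
	LIVEUPDATE_STATE_NORMAL,
	LIVEUPDATE_STATE_PREPARED,
	LIVEUPDATE_STATE_FROZEN,
};

/* Hypothetical preserved resource, e.g. one with a DMA mapping. */
struct preserved_res {
	bool mutated_after_prepare;	/* set if remapped after prepare */
};

static enum liveupdate_state luo_state = LIVEUPDATE_STATE_PREPARED;

static void luo_set_state(enum liveupdate_state state)
{
	luo_state = state;
}

/*
 * Hypothetical freeze callback: the last point at which a subsystem
 * can check that its serialization assumptions still hold.
 */
static int res_freeze(const struct preserved_res *res)
{
	return res->mutated_after_prepare ? -EBUSY : 0;
}

/* Stand-in for luo_do_freeze_calls(): stops at the first failure. */
static int luo_do_freeze_calls(const struct preserved_res *res, int nr)
{
	for (int i = 0; i < nr; i++) {
		int ret = res_freeze(&res[i]);

		if (ret)
			return ret;
	}
	return 0;
}

/* Hypothetical wrapper: FROZEN on success, back to NORMAL on failure. */
static int luo_freeze(const struct preserved_res *res, int nr)
{
	int ret = luo_do_freeze_calls(res, nr);

	luo_set_state(ret ? LIVEUPDATE_STATE_NORMAL : LIVEUPDATE_STATE_FROZEN);
	return ret;
}
```

In this model the negative return value from luo_freeze() is what the
failing reboot syscall would hand back to the user.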

Pasha