> > +config LIVEUPDATE > > + bool "Live Update Orchestrator" > > + depends on KEXEC_HANDOVER > > + help > > + Enable the Live Update Orchestrator. Live Update is a mechanism, > > + typically based on kexec, that allows the kernel to be updated > > + while keeping selected devices operational across the transition. > > + These devices are intended to be reclaimed by the new kernel and > > + re-attached to their original workload without requiring a device > > + reset. > > + > > + This functionality depends on specific support within device drivers > > + and related kernel subsystems. > > This is not clear if the ability to reattach a device to the new kernel or > the entire live update functionality depends on specific support with > drivers. > > Probably better phrase it as > > Ability to handover a device from old to new kernel depends ... Updated > > > + > > + This feature is primarily used in cloud environments to quickly > > + update the kernel hypervisor with minimal disruption to the > > + running virtual machines. > > I wouldn't put it into Kconfig. If anything I'd make it > > This feature primarily targets virtual machine hosts to quickly ... Ok > > + * The core of LUO is a state machine that tracks the progress of a live update, > > + * along with a callback API that allows other kernel subsystems to participate > > + * in the process. Example subsystems that can hook into LUO include: kvm, > > + * iommu, interrupts, vfio, participating filesystems, and mm. > > Please spell out memory management. Done. > > > + * LUO uses KHO to transfer memory state from the current Kernel to the next > > A link to KHO docs would have been nice, but I'm not sure kernel-doc can do > that nicely. Added a link, a simple path to rst, is apparently correctly converted to a link by sphinx. > > > + * Kernel. > > Why capital 'K'? :) Fixed. > > > + * The LUO state machine ensures that operations are performed in the correct > > + * sequence and provides a mechanism to track and recover from potential > > + * failures, and select devices and subsystems that should participate in > > + * live update sequence. > > + */ > > + > > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > > + > > +#include <linux/err.h> > > +#include <linux/kobject.h> > > +#include <linux/liveupdate.h> > > +#include <linux/rwsem.h> > > +#include <linux/string.h> > > +#include "luo_internal.h" > > + > > +static DECLARE_RWSEM(luo_state_rwsem); > > + > > +enum liveupdate_state luo_state; > > static? Fixed > Hmm, luo_state is initialized to 0 (NORMAL) which means we always start > from NORMAL, although the second kernel is not in the normal state until > the handover is complete. Maybe we need an initial "unknown" state until > some of luo code starts running and would set an actual known state? Added: LIVEUPDATE_STATE_UNDEFINED that exists only before LUO is initialized during boot. > > +const char *const luo_state_str[] = { > > + [LIVEUPDATE_STATE_NORMAL] = "normal", > > + [LIVEUPDATE_STATE_PREPARED] = "prepared", > > + [LIVEUPDATE_STATE_FROZEN] = "frozen", > > + [LIVEUPDATE_STATE_UPDATED] = "updated", > > +}; > > + > > +bool luo_enabled; > > static? Fixed. > > > +static int __init early_liveupdate_param(char *buf) > > +{ > > + return kstrtobool(buf, &luo_enabled); > > +} > > +early_param("liveupdate", early_liveupdate_param); > > + > > +/* Return true if the current state is equal to the provided state */ > > +static inline bool is_current_luo_state(enum liveupdate_state expected_state) > > +{ > > + return READ_ONCE(luo_state) == expected_state; > > +} > > + > > +static void __luo_set_state(enum liveupdate_state state) > > +{ > > + WRITE_ONCE(luo_state, state); > > +} > > + > > +static inline void luo_set_state(enum liveupdate_state state) > > +{ > > + pr_info("Switched from [%s] to [%s] state\n", > > + LUO_STATE_STR, luo_state_str[state]); > > Maybe LUO_CURRENT_STATE_STR? Done > > + __luo_set_state(state); > > +} > > + > > +static int luo_do_freeze_calls(void) > > +{ > > + return 0; > > +} > > + > > +static void luo_do_finish_calls(void) > > +{ > > +} > > + > > +int luo_prepare(void) > > +{ > > + return 0; > > +} > > + > > +/** > > + * luo_freeze() - Initiate the final freeze notification phase for live update. > > + * > > + * Attempts to transition the live update orchestrator state from > > + * %LIVEUPDATE_STATE_PREPARED to %LIVEUPDATE_STATE_FROZEN. This function is > > + * typically called just before the actual reboot system call (e.g., kexec) > > + * is invoked, either directly by the orchestration tool or potentially from > > + * within the reboot syscall path itself. > > + * > > + * Based on the outcome of the notification process: > > + * - If luo_do_freeze_calls() returns 0 (all callbacks succeeded), the state > > + * is set to %LIVEUPDATE_STATE_FROZEN using luo_set_state(), indicating > > + * readiness for the imminent kexec. > > + * - If luo_do_freeze_calls() returns a negative error code (a callback > > + * failed), the state is reverted to %LIVEUPDATE_STATE_NORMAL using > > + * luo_set_state() to cancel the live update attempt. > > The kernel-doc comments are mostly for users of a function and describe how > it should be used rather how it is implemented. SGTM, cleaned-up. > I don't think it's important to mention return values of > luo_do_freeze_calls() here. The important things are whether registered > subsystems succeeded to freeze or not and the state changes. > I'd also mention that if a subsystem fails to freeze, everything is > canceled. Added > > +/** > > + * luo_finish - Finalize the live update process in the new kernel. > > + * > > + * This function is called after a successful live update reboot into a new > > + * kernel, once the new kernel is ready to transition to the normal operational > > + * state. It signals the completion of the live update sequence to subsystems. > > + * > > + * It first attempts to acquire the write lock for the orchestrator state. > > + * > > + * Then, it checks if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state. > > + * If not, it logs a warning and returns ``-EINVAL``. > > + * > > + * If the state is correct, it triggers the ``LIVEUPDATE_FINISH`` notifier > > Here too, you describe what the function does rather how it should be used Fixed > > > + * chain. Note that the return value of the notifier is intentionally ignored as > > + * finish callbacks must not fail. Finally, the orchestrator state is > > And what should happen if there was an error in a finish callback? Scream, warn, panic, we cannot allow running a system past liveupdate, if some state was not properly passed from the previous kernel to the current kernel. This may result in catastrophic memory leaks. > > +static int __init luo_startup(void) > > +{ > > + __luo_set_state(LIVEUPDATE_STATE_NORMAL); > > + > > + return 0; > > +} > > +early_initcall(luo_startup); > > This means that the second kernel starts with luo_state == NORMAL, then > at early_initcall transitions to NORMAL again and later is set to UPDATED, > doesn't it? In the next patch, in this function we transition to UPDATED. So, technically, we go from NORMAL to UPDATED. However, I added UNDEFINED state so, in this function we either go from UNDEFINED to UPDATED or UNDEFINED to NORMAL. > > + * @return true if the system is in the ``LIVEUPDATE_STATE_NORMAL`` state, > > + * false otherwise. > > + */ > > +bool liveupdate_state_normal(void) > > +{ > > + return is_current_luo_state(LIVEUPDATE_STATE_NORMAL); > > +} > > +EXPORT_SYMBOL_GPL(liveupdate_state_normal); > > Won't liveupdate_get_state() do? Yeah, we can simply return state, and let caller to compare. However, I think, caller is only interested if this is normal state or if live update is in progress. I will keep them, and also added liveupdate_get_state(). > > + > > +/** > > + * liveupdate_enabled - Check if the live update feature is enabled. > > + * > > + * This function returns the state of the live update feature flag, which > > + * can be controlled via the ``liveupdate`` kernel command-line parameter. > > + * > > + * @return true if live update is enabled, false otherwise. > > + */ > > +bool liveupdate_enabled(void) > > +{ > > + return luo_enabled; > > +} > > +EXPORT_SYMBOL_GPL(liveupdate_enabled); > > diff --git a/drivers/misc/liveupdate/luo_internal.h b/drivers/misc/liveupdate/luo_internal.h > > new file mode 100644 > > index 000000000000..34e73fb0318c > > --- /dev/null > > +++ b/drivers/misc/liveupdate/luo_internal.h > > @@ -0,0 +1,26 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > + > > +/* > > + * Copyright (c) 2025, Google LLC. > > + * Pasha Tatashin <pasha.tatashin@xxxxxxxxxx> > > + */ > > + > > +#ifndef _LINUX_LUO_INTERNAL_H > > +#define _LINUX_LUO_INTERNAL_H > > + > > +int luo_cancel(void); > > +int luo_prepare(void); > > +int luo_freeze(void); > > +int luo_finish(void); > > + > > +void luo_state_read_enter(void); > > +void luo_state_read_exit(void); > > + > > +extern const char *const luo_state_str[]; > > + > > +/* Get the current state as a string */ > > +#define LUO_STATE_STR luo_state_str[READ_ONCE(luo_state)] > > IIUC you need the macro to have LUO_STATE_STR available in all files in > liveupdate/ but without exposing luo_state. > > I think that we can do a function call to get that string, will make things > nicer IMHO. Done. > > > + > > +extern enum liveupdate_state luo_state; > > + > > +#endif /* _LINUX_LUO_INTERNAL_H */ > > diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h > > new file mode 100644 > > index 000000000000..c2740da70958 > > --- /dev/null > > +++ b/include/linux/liveupdate.h > > @@ -0,0 +1,131 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > + > > +/* > > + * Copyright (c) 2025, Google LLC. > > + * Pasha Tatashin <pasha.tatashin@xxxxxxxxxx> > > + */ > > +#ifndef _LINUX_LIVEUPDATE_H > > +#define _LINUX_LIVEUPDATE_H > > + > > +#include <linux/bug.h> > > +#include <linux/types.h> > > +#include <linux/list.h> > > + > > +/** > > + * enum liveupdate_event - Events that trigger live update callbacks. > > + * @LIVEUPDATE_PREPARE: PREPARE should happens *before* the blackout window. > > should happen or happens ;-) Done > > > + * Subsystems should prepare for an upcoming reboot by > > + * serializing their states. However, it must be considered > > It's not only about state serialization, it's also about adjusting > operational mode so that state that was serialized won't be changed or at > least the changes from PREPARE to FREEZE would be accounted somehow. By serialization, I mean is to save their state, but I agree, the devices and resources are also should be in a limited state where the serialized data should not be altered between prepare and freeze (i.e. no memfd resizing, no new DMA mappings, etc). Thank you for your comments. Pasha