> > This is not safe, the memory might be DMA or owned by a sensetive > > process, and if we proceed liveupdate reboot without properly handling > > memory, we can get corruptions, and memory leaks. Therefore, during > > liveupdate boot if there are exceptions, we should panic. > > I don't get how it would result in memory leaks or corruptions, since > KHO would have marked that memory as preserved, and the new kernel won't > touch it until someone restores it. > > So it can at most lead to loss of data, and in that case, userspace can > very well decide if it can live with that loss or not. > > Or are you assuming here that even data in KHO is broken? In that case, > it would probably be a good idea to panic early. A broken LUO format is a catastrophic failure. It's unclear at this point in boot whether the problem lies with KHO, LUO itself, or mismatched interface assumptions between kernel versions. Regardless, falling back to a cold reboot is the safest course of action, rather than attempting to boot into a potentially broken environment. Since VMs or any preserved userspace won't survive, the additional delay of a full reboot should not significantly worsen the impact. > > [...] > >> > + } > >> > + > >> > + luo_file = kmalloc(sizeof(*luo_file), > >> > + GFP_KERNEL | __GFP_NOFAIL); > >> > + luo_file->fs = fs; > >> > + luo_file->file = NULL; > >> > + memcpy(&luo_file->private_data, data_ptr, sizeof(u64)); > >> > >> Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and > >> then simply do luo_file->private_data = (u64)*data_ptr ? > > > > Because FDT alignment is 4 bytes, we can't simply assign it. > > Hmm, good catch. Didn't think of that. > > > > >> Because if the previous kernel wrote more than a u64 in data, then > >> something is broken and we should catch that error anyway. > >> > >> > + luo_file->reclaimed = false; > >> > + mutex_init(&luo_file->mutex); > >> > + luo_file->state = LIVEUPDATE_STATE_UPDATED; > >> > + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file, > >> > + GFP_KERNEL | __GFP_NOFAIL)); > >> > [...] > >> > +struct liveupdate_filesystem { > >> > + int (*prepare)(struct file *file, void *arg, u64 *data); > >> > + int (*freeze)(struct file *file, void *arg, u64 *data); > >> > + void (*cancel)(struct file *file, void *arg, u64 data); > >> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed); > >> > + int (*retrieve)(void *arg, u64 data, struct file **file); > >> > + bool (*can_preserve)(struct file *file, void *arg); > >> > + const char *compatible; > >> > + void *arg; > >> > >> What is the use for this arg? I would expect one file type/system to > >> register one set of handlers. So they can keep their arg in a global in > >> their code. I don't see why a per-filesystem arg is needed. > > > > I think, arg is useful in case we support a subsystem is registered > > multiple times with some differences: i.e. based on mount point, or > > file types handling. Let's keep it for now, but if needed, we can > > remove that in future revisions. > > > >> What I do think is needed is a per-file arg. Each callback gets 'data', > >> which is the serialized data, but there is no place to store runtime > >> state, like some flags or serialization metadata. Sure, you could make > >> place for it somewhere in the inode, but I think it would be a lot > >> cleaner to be able to store it in struct luo_file. > >> > >> So perhaps rename private_data in struct luo_file to say > >> serialized_data, and have a field called "private" that filesystems can > >> use for their runtime state? > > > > I am not against this, but let's make this change when it is actually > > needed by a registered filesystem. > > Okay, fair enough. > > -- > Regards, > Pratyush Yadav