On Thu, Sep 04, 2025 at 04:08:47PM +0200, Rafael J. Wysocki wrote: > On Wed, Sep 3, 2025 at 11:54 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > > > On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote: > > > I personally think that it would be reasonable to simply preserve > > > device states in error paths unless they have been changed already > > > before the error (or suspend abort due to a wakeup signal). > > > > The problem is complicated by the interaction between runtime-PM states > > and system-sleep states. In the case, we've been considering, B changes > > from runtime-suspended to runtime-suspended + system-suspended. > > Therefore the error path is allowed to modify B's state. > > Yes, it is, but retaining the B's state in an error path is also fine > so long as no changes have been made to it so far. > > If B was runtime-suspended to start with and none of the suspend > callbacks invoked for it so far has done anything to it, then it is de > facto still runtime-suspended and its state need not be changed in an > error path. > > > > By this rule, B would be left in runtime suspend if it were still in > > > runtime suspend when the error (or suspend abort in general) occurred > > > and then it doesn't matter what happens to A. > > > > More fully, B would be changed from runtime-suspended + system-suspended > > back to simply runtime-suspended. Unfortunately, none of the PM > > callbacks in the kernel are defined to make this change -- at least, not > > without some cooperation from the driver. > > Except when runtime-suspended + system-suspended is effectively the > same as runtime-suspended. > > Say this is not the case and say that the device is runtime-suspended > to start with. Then the "suspend" callback has two choices: either > (1) it can runtime-resume the device before doing anything to it, > which will also cause the device's parent and suppliers to > runtime-resume, or (2) it can update the device's state without > resuming it. > > If it chooses (1), then "resume" is straightforward. If it chooses > (2), "resume" may just reverse the changes made by "suspend" and > declare that the device is runtime-suspended. And if it really really > wants to resume the device then, why not call runtime_resume() on it? That's what I meant by needing "cooperation from the driver". The driver's ->resume callback needs to do this check to see which pathway to follow: (1) or (2). I bet that right now almost none of the drivers in the kernel do anything like that. I know that the USB drivers always follow (1) during ->resume, even if they followed (2) during suspend. What do the PCI drivers do? > > > The PM core can do something like that for the drivers opting in for > > > runtime PM integration assistance, so to speak. That is, drivers that > > > point their ->suspend() and ->resume() callbacks to > > > pm_runtime_force_suspend() and pm_runtime_force_resume(), > > > respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time > > > which is now feasible). Otherwise, it is hard to say what the > > > expectations of the driver are and some code between the driver and > > > the PM core may be involved (say, the PCI bus type). > > > > Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer. > > > > But there still should be some way the PM core can make resumes easier > > for drivers that don't set the flag. Something like: If the device is > > in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on > > the device's parent (and anything else the device depends on) before > > invoking ->resume. > > Say that ->resume() does nothing to the device (because it is > runtime-suspended and there's no need to resume it). Why would the > core resume the parent etc then? You're right. I'm just trying to figure out a way to fix this problem in general without the need for updating every driver in the kernel. Maybe that's not possible. :-( Alan Stern