On Thu, Sep 4, 2025 at 4:19 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > On Thu, Sep 04, 2025 at 04:08:47PM +0200, Rafael J. Wysocki wrote: > > On Wed, Sep 3, 2025 at 11:54 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > > > > > On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote: > > > > I personally think that it would be reasonable to simply preserve > > > > device states in error paths unless they have been changed already > > > > before the error (or suspend abort due to a wakeup signal). > > > > > > The problem is complicated by the interaction between runtime-PM states > > > and system-sleep states. In the case, we've been considering, B changes > > > from runtime-suspended to runtime-suspended + system-suspended. > > > Therefore the error path is allowed to modify B's state. > > > > Yes, it is, but retaining the B's state in an error path is also fine > > so long as no changes have been made to it so far. > > > > If B was runtime-suspended to start with and none of the suspend > > callbacks invoked for it so far has done anything to it, then it is de > > facto still runtime-suspended and its state need not be changed in an > > error path. > > > > > > By this rule, B would be left in runtime suspend if it were still in > > > > runtime suspend when the error (or suspend abort in general) occurred > > > > and then it doesn't matter what happens to A. > > > > > > More fully, B would be changed from runtime-suspended + system-suspended > > > back to simply runtime-suspended. Unfortunately, none of the PM > > > callbacks in the kernel are defined to make this change -- at least, not > > > without some cooperation from the driver. > > > > Except when runtime-suspended + system-suspended is effectively the > > same as runtime-suspended. > > > > Say this is not the case and say that the device is runtime-suspended > > to start with. Then the "suspend" callback has two choices: either > > (1) it can runtime-resume the device before doing anything to it, > > which will also cause the device's parent and suppliers to > > runtime-resume, or (2) it can update the device's state without > > resuming it. > > > > If it chooses (1), then "resume" is straightforward. If it chooses > > (2), "resume" may just reverse the changes made by "suspend" and > > declare that the device is runtime-suspended. And if it really really > > wants to resume the device then, why not call runtime_resume() on it? > > That's what I meant by needing "cooperation from the driver". The > driver's ->resume callback needs to do this check to see which pathway > to follow: (1) or (2). > > I bet that right now almost none of the drivers in the kernel do > anything like that. I know that the USB drivers always follow (1) > during ->resume, even if they followed (2) during suspend. What do the > PCI drivers do? If they don't set DPM_FLAG_SMART_SUSPEND, the PCI bus type suspend callback will runtime-resume their devices. Calling runtime_resume() in a suspend callback (for the "suspend" phase) is a popular pattern because it was recommended once upon a time. > > > > The PM core can do something like that for the drivers opting in for > > > > runtime PM integration assistance, so to speak. That is, drivers that > > > > point their ->suspend() and ->resume() callbacks to > > > > pm_runtime_force_suspend() and pm_runtime_force_resume(), > > > > respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time > > > > which is now feasible). Otherwise, it is hard to say what the > > > > expectations of the driver are and some code between the driver and > > > > the PM core may be involved (say, the PCI bus type). > > > > > > Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer. > > > > > > But there still should be some way the PM core can make resumes easier > > > for drivers that don't set the flag. Something like: If the device is > > > in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on > > > the device's parent (and anything else the device depends on) before > > > invoking ->resume. > > > > Say that ->resume() does nothing to the device (because it is > > runtime-suspended and there's no need to resume it). Why would the > > core resume the parent etc then? > > You're right. I'm just trying to figure out a way to fix this problem > in general without the need for updating every driver in the kernel. > Maybe that's not possible. :-( Fortunately, in many cases runtime-suspended + system-suspended == runtime-suspended. Many drivers use pm_runtime_force_suspend/resume() as their suspend/resume callbacks. There are also drivers without runtime PM support. It is not all lost I think.