On Thu, 22 May 2025 at 16:08, Claudiu Beznea <claudiu.beznea@xxxxxxxxx> wrote: > > Hi, Ulf, > > On 22.05.2025 14:53, Ulf Hansson wrote: > > On Thu, 22 May 2025 at 11:48, Claudiu Beznea <claudiu.beznea@xxxxxxxxx> wrote: > >> > >> Hi, Ulf, > >> > >> On 21.05.2025 17:57, Dmitry Torokhov wrote: > >>> On Wed, May 21, 2025 at 02:37:08PM +0200, Ulf Hansson wrote: > >>>> On Wed, 21 May 2025 at 07:41, Claudiu Beznea <claudiu.beznea@xxxxxxxxx> wrote: > >>>>> > >>>>> Hi, Ulf, > >>>>> > >>>>> On 20.05.2025 15:09, Ulf Hansson wrote: > >>>>>> For example, even if the order is made correctly, suppose a driver's > >>>>>> ->remove() callback completes by turning off the resources for its > >>>>>> device and leaves runtime PM enabled, as it relies on devres to do it > >>>>>> some point later. Beyond this point, nothing would prevent userspace > >>>>>> for runtime resuming/suspending the device via sysfs. > >>>>> > >>>>> If I'm not wrong, that can't happen? The driver_sysfs_remove() is called > >>>>> before device_remove() (which calls the driver remove) is called, this > >>>>> being the call path: > >>>>> > >>>>> device_driver_detach() -> > >>>>> device_release_driver_internal() -> > >>>>> __device_release_driver() -> > >>>>> driver_sysfs_remove() > >>>>> // ... > >>>>> device_remove() > >>>>> > >>>>> And the driver_sysfs_remove() calls in the end __kernfs_remove() which > >>>>> looks to me like the place that actually drops the entries from sysfs, this > >>>>> being a call path for it: > >>>>> > >>>>> driver_sysfs_remove() -> > >>>>> sysfs_remove_link() -> > >>>>> kernfs_remove_by_name() -> > >>>>> kernfs_remove_by_name_ns() -> > >>>>> __kernfs_remove() -> > >>>>> > >>>>> activating the following line in __kernfs_remove(): > >>>>> > >>>>> pr_debug("kernfs %s: removing\n", kernfs_rcu_name(kn)); > >>>>> > >>>>> leads to the following prints when unbinding the watchdog device from its > >>>>> watchdog driver (attached to platform bus) on my board: > >>>>> https://p.fr33tux.org/935252 > >>>> > >>>> Indeed this is a very good point you make! I completely overlooked > >>>> this fact, thanks a lot for clarifying this! > >>>> > >>>> However, my main point still stands. > >>>> > >>>> In the end, there is nothing preventing rpm_suspend|resume|idle() in > >>>> drivers/base/power/runtime.c from running (don't forget runtime PM is > >>>> asynchronous too) for the device in question. This could lead to that > >>>> a ->runtime_suspend|resume|idle() callback becomes executed at any > >>>> point in time, as long as we haven't called pm_runtime_disable() for > >>>> the device. > >>> > >>> So exactly the same may happen if you enter driver->remove() and > >>> something calls runtime API before pm_runtime_disable() is called. > >>> The driver has (as they should be doing currently) be prepared for this. > >> > >> I took the time and tried to do a comparison of the current solutions > >> (describing the bad and good things I see), trying to understand your > >> concerns with regards to RPM suspend|resume|idle while unbinding a device > >> from its driver. > >> > >> I see the following cases: > >> > >> Case 1/ the current approach when devm_pm_runtime_enable() is used in > >> driver's ->probe() with the current code base: > >> > >> - right after driver ->remove() finish its execution clocks are detached > >> from the PM domain, through dev_pm_domain_detach() call in > >> platform_remove() > >> > >> - any subsequent RPM resume|suspend|idle will lead to failure if the driver > >> specific RPM APIs access directly registers and counts on PM domain to > >> enable/disable the clocks > >> > >> - at this point, if the IRQs are shared (but not only) and devm requested > >> the driver's IRQ handler can still be called asynchronously; driver > >> should be prepared for such events and should be written to work for such > >> scenarios; but as the clocks are not in the PM domain anymore and RPM is > >> still enabled at this point, if the driver don't run runtime suspend on > >> probe (and runtime resume/suspend on runtime), I think (because I haven't > >> investigated this yet) it can't rely on pm_runtime_active()/ > >> pm_runtime_suspended() checks in interrupt handlers > >> and can't decide if it can interrogate registers or not; interrogating > >> should lead to failure at this stage as the clocks are disabled; drivers > >> should work in such scenario and the CONFIG_DEBUG_SHIRQ is a way to check > >> they can; I previously debugged a similar issue on drivers/net/ethernet/ > >> renesas/ravb driver where using devm_pm_runtime_enable() in probe and > >> pm_runtime_suspended() checks in IRQ handlers was the way to make this > >> scenario happy; at that time I wasn't able to find that > >> dev_pm_domain_detach() have the impact discussed in this thread > >> > >> Case 2/ What is proposed in this patch: devm_pm_runtime_enable() used + > >> open devres group after dev_pm_domain_attach() (in probe) and close the > >> devres group before dev_pm_domain_attach() (in remove): > >> > >> - right after the driver ->remove() is executed only the driver allocated > >> devres resources are freed; this happens before dev_pm_domain_deattach() > >> is called, though the proposed devres_release_group() call in this patch > >> > >> - while doing this, driver can still get async RPM suspend|resume|idle > >> requests; is like the execution is in the driver ->remove() > >> but the pm_runtime_disable() hasn't been called yet > >> > >> - as the runtime PM is enabled in driver's ->probe() mostly after the HW is > >> prepared to take requests and all the other devm resources are allocated, > >> the RPM disable is going to be among the first things to be called by the > >> devres_release_group() > >> > >> - then, after RPM disable, all the devres resources allocated only in the > >> driver's ->probe() are cleaned up in reverse order, just like > >> device_unbind_cleanup() -> devres_release_all() call in > >> __device_release_driver() is doing, but limited only to the resources > >> allocated by the driver itself; I personally see this like manually > >> allocating and freeing resources in the driver itself w/o relying on > >> devres > >> > >> - then it comes the turn of dev_pm_domain_detach() call in > >> platform_remove(): at the time dev_pm_domain_detach() is executed the > >> runtime PM is disabled and all the devres resources allocated by driver > >> are freed as well > >> > >> - after the dev_pm_domain_detach() is executed all the driver resources > >> are cleaned up, the driver can't get IRQs as it's handler was already > >> unregistered, no other user can execute rpm suspend|resume|idle > >> as the RPM is disabled at this time > >> > >> Case 3/ devm_pm_runtime_enabled() dropped and replaced by manual cleanup: > >> - the driver code is going be complicated, difficult to maintain and error > >> prone > > > > Yes, the driver's code would become slightly more complicated, but > > more importantly it would be correct. > > > > To me it sounds like the driver's ->remove() callback could do this: > > > > pm_runtime_get_sync() > > pm_runtime_disable() > > pm_runtime_put_noidle() > > In my case it was just pm_runtime_disable() at the end of driver ->remove() > and pm_runtime_active() checks in IRQ handlers which didn't worked after > driver ->remove() finished execution due to disable_depth being incremented > in pm_runtime_disable(). The IRQs were devm requested. > > The solution found at the that time was to use devm_pm_runtime_enable() in > probe and pm_runtime_suspended() calls in IRQ handlers. > > > > > In this way, the driver will runtime resume its device, allowing > > devres to drop/turn-off resources in the order we want. > > Except for the > > clocks, as those would be turned off via dev_pm_domain_detach() before > > the IRQ handler is freed (via devres), right? > > > > To avoid getting the IRQ handler to be called when it can't access > > registers, we could do one of the below: > > *) Look for a condition in the IRQ handler and bail-out when we know > > we should not manage IRQs. Is using pm_runtime_enabled() sufficient, > > you think? Otherwise we need a driver specific flag, which should be > > set in ->remove(). > > *) Don't use devm* when registering the IRQ handler. > > That's true. > > > > > Yes, both options further contribute to making the driver code > > slightly more complicated, but if you want to solve the problem sooner > > than later, I think this is what you need to do. Yet, I think there is > > another option too, see below. > > > >> > >> I may have missed considering things when describing the case 2 (which is > >> what is proposed by this patch) as I don't have the full picture behind the > >> dev_pm_domain_detach() call in platform bus remove. If so, please correct me. > > > > The dev_pm_domain_attach|detach() calls in bus level code > > (probe/remove) were added there a long time ago, way before devres was > > being used like today. > > > > Currently we also have devm_pm_domain_attach_list(), which is used > > when devices have multiple PM domains to attach too. This is *not* > > called by bus-level code, but by the driver themselves. For these > > cases, we would not encounter the problems you have been facing with > > clocks/IRQ-handler, I think - because the devres order is maintained > > for PM domains too. > > > > That said, I think adding a devm_pm_domain_attach() interface would > > make perfect sense. Then we can try to replace > > dev_pm_domain_attach|detach() in bus level code, with just a call to > > devm_pm_domain_attach(). In this way, we should preserve the > > expectation for drivers around devres for PM domains. Even if it would > > change the behaviour for some drivers, it still sounds like the > > correct thing to do in my opinion. > > This looks good to me, as well. I did prototype it on my side and tested on > all my failure cases and it works. That's great! I am happy to help review, if/when you decide to post it. Kind regards Uffe