On Mon, Jun 16, 2025 at 11:37 AM Claudiu Beznea <claudiu.beznea@xxxxxxxxx> wrote: > > Hi, Rafael, > > On 13.06.2025 13:02, Rafael J. Wysocki wrote: > > On Fri, Jun 13, 2025 at 9:39 AM Claudiu Beznea <claudiu.beznea@xxxxxxxxx> wrote: > >> > >> Hi, Rafael, > >> > >> On 09.06.2025 22:59, Rafael J. Wysocki wrote: > >>> On Sat, Jun 7, 2025 at 3:06 PM Jonathan Cameron <jic23@xxxxxxxxxx> wrote: > >>>> > >>>> On Fri, 6 Jun 2025 22:01:52 +0200 > >>>> "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote: > >>>> > >>>> Hi Rafael, > >>>> > >>>>> On Fri, Jun 6, 2025 at 8:55 PM Dmitry Torokhov > >>>>> <dmitry.torokhov@xxxxxxxxx> wrote: > >>>>>> > >>>>>> On Fri, Jun 06, 2025 at 06:00:34PM +0200, Rafael J. Wysocki wrote: > >>>>>>> On Fri, Jun 6, 2025 at 1:18 PM Claudiu <claudiu.beznea@xxxxxxxxx> wrote: > >>>>>>>> > >>>>>>>> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx> > >>>>>>>> > >>>>>>>> The dev_pm_domain_attach() function is typically used in bus code alongside > >>>>>>>> dev_pm_domain_detach(), often following patterns like: > >>>>>>>> > >>>>>>>> static int bus_probe(struct device *_dev) > >>>>>>>> { > >>>>>>>> struct bus_driver *drv = to_bus_driver(dev->driver); > >>>>>>>> struct bus_device *dev = to_bus_device(_dev); > >>>>>>>> int ret; > >>>>>>>> > >>>>>>>> // ... > >>>>>>>> > >>>>>>>> ret = dev_pm_domain_attach(_dev, true); > >>>>>>>> if (ret) > >>>>>>>> return ret; > >>>>>>>> > >>>>>>>> if (drv->probe) > >>>>>>>> ret = drv->probe(dev); > >>>>>>>> > >>>>>>>> // ... > >>>>>>>> } > >>>>>>>> > >>>>>>>> static void bus_remove(struct device *_dev) > >>>>>>>> { > >>>>>>>> struct bus_driver *drv = to_bus_driver(dev->driver); > >>>>>>>> struct bus_device *dev = to_bus_device(_dev); > >>>>>>>> > >>>>>>>> if (drv->remove) > >>>>>>>> drv->remove(dev); > >>>>>>>> dev_pm_domain_detach(_dev); > >>>>>>>> } > >>>>>>>> > >>>>>>>> When the driver's probe function uses devres-managed resources that depend > >>>>>>>> on the power domain state, those resources are released later during > >>>>>>>> device_unbind_cleanup(). > >>>>>>>> > >>>>>>>> Releasing devres-managed resources that depend on the power domain state > >>>>>>>> after detaching the device from its PM domain can cause failures. > >>>>>>>> > >>>>>>>> For example, if the driver uses devm_pm_runtime_enable() in its probe > >>>>>>>> function, and the device's clocks are managed by the PM domain, then > >>>>>>>> during removal the runtime PM is disabled in device_unbind_cleanup() after > >>>>>>>> the clocks have been removed from the PM domain. It may happen that the > >>>>>>>> devm_pm_runtime_enable() action causes the device to be runtime-resumed. > >>>>>>> > >>>>>>> Don't use devm_pm_runtime_enable() then. > >>>>>> > >>>>>> What about other devm_ APIs? Are you suggesting that platform drivers > >>>>>> should not be using devm_clk*(), devm_regulator_*(), > >>>>>> devm_request_*_irq() and devm_add_action_or_reset()? Because again, > >>>>>> dev_pm_domain_detach() that is called by platform bus_remove() may shut > >>>>>> off the device too early, before cleanup code has a chance to execute > >>>>>> proper cleanup. > >>>>>> > >>>>>> The issue is not limited to runtime PM. > >>>>>> > >>>>>>> > >>>>>>>> If the driver specific runtime PM APIs access registers directly, this > >>>>>>>> will lead to accessing device registers without clocks being enabled. > >>>>>>>> Similar issues may occur with other devres actions that access device > >>>>>>>> registers. > >>>>>>>> > >>>>>>>> Add devm_pm_domain_attach(). When replacing the dev_pm_domain_attach() and > >>>>>>>> dev_pm_domain_detach() in bus probe and bus remove, it ensures that the > >>>>>>>> device is detached from its PM domain in device_unbind_cleanup(), only > >>>>>>>> after all driver's devres-managed resources have been release. > >>>>>>>> > >>>>>>>> For flexibility, the implemented devm_pm_domain_attach() has 2 state > >>>>>>>> arguments, one for the domain state on attach, one for the domain state on > >>>>>>>> detach. > >>>>>>> > >>>>>>> dev_pm_domain_attach() is not part driver API and I'm not convinced at > >>>>>> > >>>>>> Is the concern that devm_pm_domain_attach() will be [ab]used by drivers? > >>>>> > >>>>> Yes, among other things. > >>>> > >>>> Maybe naming could make abuse at least obvious to spot? e.g. > >>>> pm_domain_attach_with_devm_release() > >>> > >>> If I'm not mistaken, it is not even necessary to use devres for this. > >>> > >>> You might as well add a dev_pm_domain_detach() call to > >>> device_unbind_cleanup() after devres_release_all(). There is a slight > >>> complication related to the second argument of it, but I suppose that > >>> this can be determined at the attach time and stored in a new device > >>> PM flag, or similar. > >>> > >> > >> I looked into this solution. I've tested it for all my failure cases and > >> went good. > > > > OK > > > >>> Note that dev->pm_domain is expected to be cleared by ->detach(), so > >>> this should not cause the domain to be detached twice in a row from > >>> the same device, but that needs to be double-checked. > >> > >> The genpd_dev_pm_detach() calls genpd_remove_device() -> > >> dev_pm_domain_set(dev, NULL) which sets the dev->pm_domain = NULL. I can't > >> find any other detach function in the current code base. > > > > There is also acpi_dev_pm_detach() which can be somewhat hard to find, > > but it calls dev_pm_domain_set(dev, NULL) either. > > > >> The code I've tested for this solution is this one: > >> > >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c > >> index b526e0e0f52d..5e9750d007b4 100644 > >> --- a/drivers/base/dd.c > >> +++ b/drivers/base/dd.c > >> @@ -25,6 +25,7 @@ > >> #include <linux/kthread.h> > >> #include <linux/wait.h> > >> #include <linux/async.h> > >> +#include <linux/pm_domain.h> > >> #include <linux/pm_runtime.h> > >> #include <linux/pinctrl/devinfo.h> > >> #include <linux/slab.h> > >> @@ -552,8 +553,11 @@ static void device_unbind_cleanup(struct device *dev) > >> dev->dma_range_map = NULL; > >> device_set_driver(dev, NULL); > >> dev_set_drvdata(dev, NULL); > >> - if (dev->pm_domain && dev->pm_domain->dismiss) > >> - dev->pm_domain->dismiss(dev); > >> + if (dev->pm_domain) { > >> + if (dev->pm_domain->dismiss) > >> + dev->pm_domain->dismiss(dev); > >> + dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off); > > > > I would do the "detach" before the "dismiss" to retain the current ordering. > > I applied on my local development branch all your suggestions except this > one because genpd_dev_pm_detach() as well as acpi_dev_pm_detach() set > dev->pm_domain = NULL. > > Due to this I would call first ->dismiss() then ->detach(), as initially > proposed. Please let me know if you consider it otherwise. This is a matter of adding one more dev->pm_domain check AFAICS, but OK.