Hi, Rafael, On 09.06.2025 22:59, Rafael J. Wysocki wrote: > On Sat, Jun 7, 2025 at 3:06 PM Jonathan Cameron <jic23@xxxxxxxxxx> wrote: >> >> On Fri, 6 Jun 2025 22:01:52 +0200 >> "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote: >> >> Hi Rafael, >> >>> On Fri, Jun 6, 2025 at 8:55 PM Dmitry Torokhov >>> <dmitry.torokhov@xxxxxxxxx> wrote: >>>> >>>> On Fri, Jun 06, 2025 at 06:00:34PM +0200, Rafael J. Wysocki wrote: >>>>> On Fri, Jun 6, 2025 at 1:18 PM Claudiu <claudiu.beznea@xxxxxxxxx> wrote: >>>>>> >>>>>> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx> >>>>>> >>>>>> The dev_pm_domain_attach() function is typically used in bus code alongside >>>>>> dev_pm_domain_detach(), often following patterns like: >>>>>> >>>>>> static int bus_probe(struct device *_dev) >>>>>> { >>>>>> struct bus_driver *drv = to_bus_driver(dev->driver); >>>>>> struct bus_device *dev = to_bus_device(_dev); >>>>>> int ret; >>>>>> >>>>>> // ... >>>>>> >>>>>> ret = dev_pm_domain_attach(_dev, true); >>>>>> if (ret) >>>>>> return ret; >>>>>> >>>>>> if (drv->probe) >>>>>> ret = drv->probe(dev); >>>>>> >>>>>> // ... >>>>>> } >>>>>> >>>>>> static void bus_remove(struct device *_dev) >>>>>> { >>>>>> struct bus_driver *drv = to_bus_driver(dev->driver); >>>>>> struct bus_device *dev = to_bus_device(_dev); >>>>>> >>>>>> if (drv->remove) >>>>>> drv->remove(dev); >>>>>> dev_pm_domain_detach(_dev); >>>>>> } >>>>>> >>>>>> When the driver's probe function uses devres-managed resources that depend >>>>>> on the power domain state, those resources are released later during >>>>>> device_unbind_cleanup(). >>>>>> >>>>>> Releasing devres-managed resources that depend on the power domain state >>>>>> after detaching the device from its PM domain can cause failures. >>>>>> >>>>>> For example, if the driver uses devm_pm_runtime_enable() in its probe >>>>>> function, and the device's clocks are managed by the PM domain, then >>>>>> during removal the runtime PM is disabled in device_unbind_cleanup() after >>>>>> the clocks have been removed from the PM domain. It may happen that the >>>>>> devm_pm_runtime_enable() action causes the device to be runtime-resumed. >>>>> >>>>> Don't use devm_pm_runtime_enable() then. >>>> >>>> What about other devm_ APIs? Are you suggesting that platform drivers >>>> should not be using devm_clk*(), devm_regulator_*(), >>>> devm_request_*_irq() and devm_add_action_or_reset()? Because again, >>>> dev_pm_domain_detach() that is called by platform bus_remove() may shut >>>> off the device too early, before cleanup code has a chance to execute >>>> proper cleanup. >>>> >>>> The issue is not limited to runtime PM. >>>> >>>>> >>>>>> If the driver specific runtime PM APIs access registers directly, this >>>>>> will lead to accessing device registers without clocks being enabled. >>>>>> Similar issues may occur with other devres actions that access device >>>>>> registers. >>>>>> >>>>>> Add devm_pm_domain_attach(). When replacing the dev_pm_domain_attach() and >>>>>> dev_pm_domain_detach() in bus probe and bus remove, it ensures that the >>>>>> device is detached from its PM domain in device_unbind_cleanup(), only >>>>>> after all driver's devres-managed resources have been release. >>>>>> >>>>>> For flexibility, the implemented devm_pm_domain_attach() has 2 state >>>>>> arguments, one for the domain state on attach, one for the domain state on >>>>>> detach. >>>>> >>>>> dev_pm_domain_attach() is not part driver API and I'm not convinced at >>>> >>>> Is the concern that devm_pm_domain_attach() will be [ab]used by drivers? >>> >>> Yes, among other things. >> >> Maybe naming could make abuse at least obvious to spot? e.g. >> pm_domain_attach_with_devm_release() > > If I'm not mistaken, it is not even necessary to use devres for this. > > You might as well add a dev_pm_domain_detach() call to > device_unbind_cleanup() after devres_release_all(). There is a slight > complication related to the second argument of it, but I suppose that > this can be determined at the attach time and stored in a new device > PM flag, or similar. > I looked into this solution. I've tested it for all my failure cases and went good. > Note that dev->pm_domain is expected to be cleared by ->detach(), so > this should not cause the domain to be detached twice in a row from > the same device, but that needs to be double-checked. The genpd_dev_pm_detach() calls genpd_remove_device() -> dev_pm_domain_set(dev, NULL) which sets the dev->pm_domain = NULL. I can't find any other detach function in the current code base. The code I've tested for this solution is this one: diff --git a/drivers/base/dd.c b/drivers/base/dd.c index b526e0e0f52d..5e9750d007b4 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -25,6 +25,7 @@ #include <linux/kthread.h> #include <linux/wait.h> #include <linux/async.h> +#include <linux/pm_domain.h> #include <linux/pm_runtime.h> #include <linux/pinctrl/devinfo.h> #include <linux/slab.h> @@ -552,8 +553,11 @@ static void device_unbind_cleanup(struct device *dev) dev->dma_range_map = NULL; device_set_driver(dev, NULL); dev_set_drvdata(dev, NULL); - if (dev->pm_domain && dev->pm_domain->dismiss) - dev->pm_domain->dismiss(dev); + if (dev->pm_domain) { + if (dev->pm_domain->dismiss) + dev->pm_domain->dismiss(dev); + dev_pm_domain_detach(dev, dev->pm_domain->detach_power_off); + } pm_runtime_reinit(dev); dev_pm_set_driver_flags(dev, 0); } diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 075ec1d1b73a..2459be6aecf4 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -1400,11 +1400,8 @@ static int platform_probe(struct device *_dev) if (ret) goto out; - if (drv->probe) { + if (drv->probe) ret = drv->probe(dev); - if (ret) - dev_pm_domain_detach(_dev, true); - } out: if (drv->prevent_deferred_probe && ret == -EPROBE_DEFER) { @@ -1422,7 +1419,6 @@ static void platform_remove(struct device *_dev) if (drv->remove) drv->remove(dev); - dev_pm_domain_detach(_dev, true); } static void platform_shutdown(struct device *_dev) diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c index 781968a128ff..4bd1e3c7f401 100644 --- a/drivers/base/power/common.c +++ b/drivers/base/power/common.c @@ -111,6 +111,9 @@ int dev_pm_domain_attach(struct device *dev, bool power_on) if (!ret) ret = genpd_dev_pm_attach(dev); + if (dev->pm_domain) + dev->pm_domain->detach_power_off = power_on; + return ret < 0 ? ret : 0; } EXPORT_SYMBOL_GPL(dev_pm_domain_attach); diff --git a/include/linux/pm.h b/include/linux/pm.h index f0bd8fbae4f2..12e97e09e85c 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -748,6 +748,7 @@ struct dev_pm_domain { void (*sync)(struct device *dev); void (*dismiss)(struct device *dev); int (*set_performance_state)(struct device *dev, unsigned int state); + bool detach_power_off; }; Rafael, Ulf, Dmitry, Jonathan, all, Could you please let me know how do you consider this approach? Thank you, Claudiu