On Tue, Apr 1, 2025 at 8:25 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Tue, Apr 01, 2025 at 09:02:14PM +0530, Anshuman Gupta wrote: > > Implement _DSM method 10 and _DSM Method 11 as per PCI firmware specs > > section 4.6.10 Rev 3.3. > > Thanks for splitting this into two patches. But I think now this only > implements function 10 (0x0a), so this sentence needs to be updated. > > I would write this consistently as "0x0a" or "0Ah" to match the spec > description. I don't think the spec ever uses "10". > > > Note that this implementation assumes only a single device below the > > Downstream Port will request for Aux Power Limit under a given > > Root Port because it does not track and aggregate requests > > from all child devices below the Downstream Port as required > > by Section 4.6.10 Rev 3.3. Why is it regarded as compliant, then? Request aggregation is a known problem that has been addressed for a couple of times (at least) in the kernel. Why is it too hard to address it here? > > One possible mitigation would be only allowing only first PCIe > > Non-Bridge Endpoint Function 0 driver to call_DSM method 10. What about topologies where there is a switch with multiple downstream ports below the Root Port? > > > > Signed-off-by: Varun Gupta <varun.gupta@xxxxxxxxx> > > Signed-off-by: Badal Nilawar <badal.nilawar@xxxxxxxxx> > > Signed-off-by: Anshuman Gupta <anshuman.gupta@xxxxxxxxx> > > --- > > drivers/pci/pci-acpi.c | 78 ++++++++++++++++++++++++++++++++++++++++ > > include/linux/pci-acpi.h | 6 ++++ > > 2 files changed, 84 insertions(+) > > > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c > > index af370628e583..ebd49e43457e 100644 > > --- a/drivers/pci/pci-acpi.c > > +++ b/drivers/pci/pci-acpi.c > > @@ -1421,6 +1421,84 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev, > > ACPI_FREE(obj); > > } > > > > +/** > > + * pci_acpi_request_d3cold_aux_power - Request D3cold aux power via ACPI DSM > > + * @dev: PCI device instance > > + * @requested_power: Requested auxiliary power in milliwatts How's the caller supposed to find out what value to use here? > > + * > > + * This function sends a request to the host BIOS via ACPI _DSM Function 10 > > + * to grant the required D3Cold Auxiliary power for the specified PCI device. > > + * It evaluates the _DSM (Device Specific Method) to request the Auxiliary > > + * power and handles the response accordingly. > > + * > > + * This function shall be only called by 1st non-bridge Endpoint driver > > + * on Function 0. For a Multi-Function Device, the driver for Function 0 is > > + * required to report an aggregate power requirement covering all > > + * functions contained within the device. > > + * > > + * Return: Returns 0 on success and errno on failure. > > Write all this in imperative mood, e.g., > > Request auxiliary power while device is in D3cold ... > > Return 0 on success ... > > > + */ > > +int pci_acpi_request_d3cold_aux_power(struct pci_dev *dev, u32 requested_power) > > +{ > > + union acpi_object in_obj = { > > + .integer.type = ACPI_TYPE_INTEGER, > > + .integer.value = requested_power, > > + }; > > + > > + union acpi_object *out_obj; > > + acpi_handle handle; > > + int result, ret = -EINVAL; > > + > > + if (!dev || !ACPI_HANDLE(&dev->dev)) > > + return -EINVAL; > > + > > + handle = ACPI_HANDLE(&dev->dev); Well, ACPI_HANDLE() is not a simple macro, so I'd rather avoid using it twice in a row for the same device. What about if (!dev) return -EINVAL; handle = ACPI_HANDLE(&dev->dev); if (!handle) return -EINVAL; > > This needs an acpi_check_dsm() call to find out whether the platform > supports DSM_PCI_D3COLD_AUX_POWER_LIMIT. Absolutely. > We have several _DSM calls that *should* do this, but unfortunately > they don't do it yet, so they're not good examples to copy. Right. > > + out_obj = acpi_evaluate_dsm_typed(handle, > > + &pci_acpi_dsm_guid, > > + 4, > > + DSM_PCI_D3COLD_AUX_POWER_LIMIT, > > + &in_obj, > > + ACPI_TYPE_INTEGER); > > + if (!out_obj) > > + return -EINVAL; > > + > > + result = out_obj->integer.value; > > + > > + switch (result) { > > + case 0x0: > > + dev_dbg(&dev->dev, "D3cold Aux Power %umW request denied\n", > > + requested_power); > > Use pci_dbg(dev), pci_info(dev), etc. > > > + break; > > + case 0x1: > > + dev_info(&dev->dev, "D3cold Aux Power request granted: %umW\n", > > + requested_power); > > + ret = 0; > > + break; > > + case 0x2: > > + dev_info(&dev->dev, "D3cold Aux Power: Main power won't be removed\n"); > > + ret = -EBUSY; > > + break; > > + default: > > + if (result >= 0x11 && result <= 0x1F) { > > + int retry_interval = result & 0xF; > > + > > + dev_warn(&dev->dev, "D3cold Aux Power request needs retry." > > + "Interval: %d seconds.\n", retry_interval); > > + msleep(retry_interval * 1000); > > It doesn't seem right to me to do this sleep internally because it > means this interface might take up to 15 seconds to return, and the > caller has no idea what is happening during that time. I entirely agree here. If this is going to happen during system suspend, sleeping for seconds is a total no-go. > This seems like it should be done in the driver, which can decide > *whether* it wants to sleep, and if it does sleep, it may give a nice > indication to the user. So during a system suspend in progress, this is rather hard to achieve. I'd say that if the request is not granted right away, it is a failure and we don't use D3cold. > Of course, that would mean returning some kind of retry interval > information to the caller. > > > + ret = -EAGAIN; > > + } else { > > + dev_err(&dev->dev, "D3cold Aux Power: Reserved or " > > + "unsupported response: 0x%x.\n", result); > > Drop periods at end of messages. > > > + } > > + break; > > + } > > + > > + ACPI_FREE(out_obj); > > + return ret; > > +} > > +EXPORT_SYMBOL(pci_acpi_request_d3cold_aux_power);