On Thu, Apr 3, 2025 at 9:57 AM Gupta, Anshuman <anshuman.gupta@xxxxxxxxx> wrote: > > > -----Original Message----- > > From: Bjorn Helgaas <helgaas@xxxxxxxxxx> > > Sent: Wednesday, April 2, 2025 1:44 AM > > To: Gupta, Anshuman <anshuman.gupta@xxxxxxxxx> > > Cc: intel-xe@xxxxxxxxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux- > > pci@xxxxxxxxxxxxxxx; rafael@xxxxxxxxxx; lenb@xxxxxxxxxx; > > bhelgaas@xxxxxxxxxx; ilpo.jarvinen@xxxxxxxxxxxxxxx; De Marchi, Lucas > > <lucas.demarchi@xxxxxxxxx>; Vivi, Rodrigo <rodrigo.vivi@xxxxxxxxx>; Nilawar, > > Badal <badal.nilawar@xxxxxxxxx>; Gupta, Varun <varun.gupta@xxxxxxxxx>; > > ville.syrjala@xxxxxxxxxxxxxxx; Shankar, Uma <uma.shankar@xxxxxxxxx> > > Subject: Re: [PATCH 03/12] PCI/ACPI: Add aux power grant notifier > > > > On Tue, Apr 01, 2025 at 09:02:16PM +0530, Anshuman Gupta wrote: > > > Adding a notifier to notify all PCIe child devices about the block > > > main power removal. It is needed because theoretically multiple PCIe > > > Endpoint devices on same Root Port can request for AUX power and _DSM > > > method can return with 80000000h signifies that the hierarchy > > > connected via the slot cannot support main power removal when in > > > D3Cold. > > > > I wish the spec used different language here. "D3cold" *means* "main power > > is removed" (PCIe r6.0, sec 5.3.1.4.2), so it doesn't make sense to say that > > "the slot cannot support main power removal when in D3cold". If a device is > > in D3cold, its main power has been removed by definition. > > > > I suppose the spec is trying to say if the driver has called this _DSM with > > 80000000h, it means the platform must not remove main power from the > > device while the system is in S0? Is that what you think it means? > Thanks for review. > what I understand here, S0 term essentially means here PCIe Runtime PM or s2idle > (both refers system is S0 state). AFAIU during both Runtime PM and s2ilde path > Root Port can transition to D3Cold if it support _PR3. > I observed Root Port either have D0 or D3Cold state. > /sys/bus/pci0/devices/$root_port_bdf/firmware_node/real_power_state > > Driver can disable the D3cold by pci_d3cold_disable(), so I wonder there is no use > case driver calling _DSM 0xa with 80000000h. > PCIe specs shall be simplified by removal of value 80000000h from _DSM 0xa Arg. > > > > The 2h return value description says it "indicates that the platform will not > > remove main power from the slot while the system is in S0," > > so I guess that must be it. > > > > In this series, pci_acpi_aux_power_setup() only supplies a 16-bit > > aux_pwr_limit value, so the driver cannot *request* that the platform not > > remove main power. > Yeah but that value Xe driver received from GPU firmware. > > > > So I guess the only scenario where the _DSM returns 80000000h is when the > > platform itself or other devices prevent the removal of main power. And the > > point of the notifier is to tell devices that their main power will never be > > removed while the system is in S0. Is that right? > Yes the notifier will safeguard against the use case if some other PCIe device > calls _DSM 0xa with Arg3 Value of 80000000h or its request return with 0x2h. > > > > > This may disrupt functionality of other child device. > > > > What sort of disruption could happen? I would think that if the _DSM returns > > 80000000h, it just means the device will not have main power removed, so it > > won't see that power state transition. The only "disruption" would be that > > we're using more power. > > Let's say during Xe Driver initialization BIOS firmware grants the Xe driver > Aux power request successfully. Slightly OT, but if you do it at init time, you also need to do it after hibernation because the original ini-time request may not survive it. > Xe driver will prepare to transition D3Cold state with VRAM Self Refresh. > VRAM Self Refresh relies on vram shall derive power from Aux power. > But later at any point if some other PCIe device under same root port > Calls _DSM either with 080000000h or _DSM returns with 02h, will > Block the main power removal. As I said in the previous message, this cannot happen if only one driver at a time is allowed to request the Aux power on a given Root Port. If multiple drivers are allowed to do it, aggregation of requests is necessary. > But Xe driver is unaware of it can still > program the VRAM Self Refresh and that make VRAM to derive power > from Aux and that will disrupt the GPU VRAM as there is no switch to Aux This is a rather unfortunate and counter-intuitive design decision IMV. If the main power is still there, I'd expect the VRAM Self Refresh to use it instead of the Aux power.