Hi,

On Tue, May 27, 2025 at 10:22 AM Shashank Balaji
<shashank.mahadasyam@xxxxxxxx> wrote:
>
> Hi Rafael,
>
> On Fri, May 23, 2025 at 09:06:04PM +0200, Rafael J. Wysocki wrote:
> > On Fri, May 23, 2025 at 6:25 AM Shashank Balaji
> > <shashank.mahadasyam@xxxxxxxx> wrote:
> > > ...
> > > Consider the following on a Raptor Lake machine:
> > > ...
> > >
> > > 3. Same as above, except with strictuserspace governor, which is a
> > > custom kernel module which is exactly the same as the userspace
> > > governor, except it has the CPUFREQ_GOV_STRICT_TARGET flag set:
> > >
> > > # echo strictuserspace > cpufreq/policy0/scaling_governor
> > > # x86_energy_perf_policy -c 0 2>&1 | grep REQ
> > > cpu0: HWP_REQ: min 26 max 26 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
> > > pkg0: HWP_REQ_PKG: min 1 max 255 des 0 epp 128 window 0x0 (0*10^0us)
> > > # echo 3000000 > cpufreq/policy0/scaling_setspeed
> > > # x86_energy_perf_policy -c 0 2>&1 | grep REQ
> > > cpu0: HWP_REQ: min 39 max 39 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
> > > pkg0: HWP_REQ_PKG: min 1 max 255 des 0 epp 128 window 0x0 (0*10^0us)
> > >
> > > With the strict flag set, intel_pstate honours this by setting
> > > the min and max freq same.
> > >
> > > desired_perf is always 0 in the above cases. The strict flag check is done in
> > > intel_cpufreq_update_pstate, which sets max_pstate to target_pstate if policy
> > > has strict target, and cpu->max_perf_ratio otherwise.
> > >
> > > As Russell and Rafael have noted, CPU frequency is subject to hardware
> > > coordination and optimizations. While I get that, shouldn't software try
> > > its best with whatever interface it has available? If a user sets the
> > > userspace governor, that's because they want to have manual control over
> > > CPU frequency, for whatever reason. The kernel should honor this by
> > > setting the min and max freq in HWP_REQUEST equal. The current behaviour
> > > explicitly lets the hardware choose higher frequencies.
> >
> > Well, the userspace governor ends up calling the same function,
> > intel_cpufreq_target(), as other cpufreq governors except for
> > schedutil. This function needs to work for all of them and for some
> > of them setting HWP_MIN_PERF to the same value as HWP_MAX_PERF would
> > be too strict. HWP_DESIRED_PERF can be set to the same value as
> > HWP_MIN_PERF, though (please see the attached patch).
> >
> > > Since Russell pointed out that the "actual freq >= target freq" can be
> > > achieved by leaving intel_pstate active and setting scaling_{min,max}_freq
> > > instead (for some reason this slipped my mind), I now think the strict target
> > > flag should be added to the userspace governor, leaving the documentation as
> > > is. Maybe a warning like "you may want to set this exact frequency, but it's
> > > subject to hardware coordination, so beware" can be added.
> >
> > If you expect the userspace governor to set the frequency exactly
> > (module HW coordination), that's the only way to make it do so without
> > potentially affecting the other governors.
>
> I don't mean to say that intel_cpufreq_target() should be modified. I'm
> suggesting that the CPUFREQ_GOV_STRICT_TARGET flag be added to the
> userspace governor. That'll ensure that HWP_MIN_PERF and
> HWP_MAX_PERF are set to the target frequency. intel_cpufreq_target()
> already correctly deals with the strict target flag. To test this, I
> registered a custom governor, same as the userspace governor, except
> with the strict target flag set. Please see case 3 above.
>
> If this flag is added to the userspace governor, then whatever the
> documentation says right now will actually be true. No need to modify
> the documentation then.

So please submit a patch to set CPUFREQ_GOV_STRICT_TARGET in the
userspace governor.

Thanks!
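
For readers following along, here is a rough user-space model of the
min/max selection discussed in this thread. It is not the kernel code:
struct hwp_req and model_hwp_req() are hypothetical names made up for
illustration. It assumes that the minimum follows the target (in line
with the "actual freq >= target freq" behaviour mentioned above), that
the maximum is pinned to the target only when the governor has the
strict target flag, and that desired stays 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the min/max/des fields of HWP_REQUEST. */
struct hwp_req {
	int min;
	int max;
	int des;
};

/*
 * Illustrative model only (assumption, not the driver's code):
 *  - min is set to the target,
 *  - max is the target if the governor has a strict target,
 *    and max_perf_ratio otherwise,
 *  - des is left at 0.
 */
static struct hwp_req model_hwp_req(int target_pstate, int max_perf_ratio,
				    bool strict_target)
{
	struct hwp_req req = {
		.min = target_pstate,
		.max = strict_target ? target_pstate : max_perf_ratio,
		.des = 0,
	};

	return req;
}
```

With a target of 39 and the strict flag set, the model yields min 39,
max 39, des 0, which matches the HWP_REQ line in case 3. Without the
flag, max stays at whatever max_perf_ratio is, so the hardware remains
free to pick a higher frequency.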