Hi Rafael, On Fri, May 23, 2025 at 09:06:04PM +0200, Rafael J. Wysocki wrote: > On Fri, May 23, 2025 at 6:25 AM Shashank Balaji > <shashank.mahadasyam@xxxxxxxx> wrote: > > ... > > Consider the following on a Raptor Lake machine: > > ... > > > > 3. Same as above, except with strictuserspace governor, which is a > > custom kernel module which is exactly the same as the userspace > > governor, except it has the CPUFREQ_GOV_STRICT_TARGET flag set: > > > > # echo strictuserspace > cpufreq/policy0/scaling_governor > > # x86_energy_perf_policy -c 0 2>&1 | grep REQ > > cpu0: HWP_REQ: min 26 max 26 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0 > > pkg0: HWP_REQ_PKG: min 1 max 255 des 0 epp 128 window 0x0 (0*10^0us) > > # echo 3000000 > cpufreq/policy0/scaling_setspeed > > # x86_energy_perf_policy -c 0 2>&1 | grep REQ > > cpu0: HWP_REQ: min 39 max 39 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0 > > pkg0: HWP_REQ_PKG: min 1 max 255 des 0 epp 128 window 0x0 (0*10^0us) > > > > With the strict flag set, intel_pstate honours this by setting > > the min and max freq same. > > > > desired_perf is always 0 in the above cases. The strict flag check is done in > > intel_cpufreq_update_pstate, which sets max_pstate to target_pstate if policy > > has strict target, and cpu->max_perf_ratio otherwise. > > > > As Russell and Rafael have noted, CPU frequency is subject to hardware > > coordination and optimizations. While I get that, shouldn't software try > > its best with whatever interface it has available? If a user sets the > > userspace governor, that's because they want to have manual control over > > CPU frequency, for whatever reason. The kernel should honor this by > > setting the min and max freq in HWP_REQUEST equal. The current behaviour > > explicitly lets the hardware choose higher frequencies. > > Well, the userspace governor ends up calling the same function, > intel_cpufreq_target(), as other cpufreq governors except for > schedutil. This function needs to work for all of them and for some > of them setting HWP_MIN_PERF to the same value as HWP_MAX_PERF would > be too strict. HWP_DESIRED_PERF can be set to the same value as > HWP_MIN_PERF, though (please see the attached patch). > > > Since Russell pointed out that the "actual freq >= target freq" can be > > achieved by leaving intel_pstate active and setting scaling_{min,max}_freq > > instead (for some reason this slipped my mind), I now think the strict target > > flag should be added to the userspace governor, leaving the documentation as > > is. Maybe a warning like "you may want to set this exact frequency, but it's > > subject to hardware coordination, so beware" can be added. > > If you expect the userspace governor to set the frequency exactly > (module HW coordination), that's the only way to make it do so without > potentially affecting the other governors. I don't mean to say that intel_cpufreq_target() should be modified. I'm suggesting that the CPUFREQ_GOV_STRICT_TARGET flag be added to the userspace governor. That'll ensure that HWP_MIN_PERF and HWP_MAX_PERF are set to the target frequency. intel_cpufreq_target() already correctly deals with the strict target flag. To test this, I registered a custom governor, same as the userspace governor, except with the strict target flag set. Please see case 3 above. If this flag is added to the userspace governor, then whatever the documentation says right now will actually be true. No need to modify the documentation then. Regards, Shashank