Hello,

These patches introduce a mechanism for limiting device frequencies via
PM QoS when latency-sensitive threads block on IO.

Storage devices (like UFS) use the "devfreq_monitor" mechanism to
automatically scale frequency based on IO workload. However, because of
the hysteresis in IO workload detection, IO requests can end up being
processed at low frequency until the monitor notices the load and
scales up.

Original devfreq_monitor frequency scaling timeline:

|--- latency-intensive process working ------------
|****+**+**|***+++*++*|*++++++++*|**+++++***|*++++++++*|
|                           |- high load and scale up frequency
|-------- low frequency ----|-- high frequency ---|

([+] have IO request [*] nothing to do)

The patches provided here introduce a mechanism for the block layer to
add constraints on device frequency through the PM QoS framework, with
configurable sysfs knobs per block device. With the following
configuration on my test system:

  /sys/block/sda/dev_freq_timeout_ms = 30

the constraint is removed if there is no block IO for 30ms.

Enhanced frequency scaling timeline:

|--- latency-intensive process working ------------
|****+**+**|***+++*++*|*++++++++*|**+++++***|*++++++++*|
|- add device frequency PM QoS constraints----------
      |- scale up frequency
|-low-|------------ high frequency ---------------|

Here are my example system details:
- SoC: Qualcomm Snapdragon (1+3+4 core cluster)
- Storage: UFS 4.1
- Fio Version: 3.9
- Global fio config:
    --rw=randread --bs=64k --iodepth=1 \
    --numjobs=5 --time_based --runtime=10 \
    --ioengine=libaio --hipri --cpus_allowed=3
    (job1~5 startdelay = [0s, 20s, 40s, 60s, 80s])
- Local fio config:
    - Test case 1: --rate=10ms
    - Test case 2: --rate=0ms

Running the same fio tests as above with enhanced frequency scaling
enabled/disabled, I get:

Test case 1:
  enabled:  clat (usec): min=141, max=872, avg=550
  disabled: clat (usec): min=210, max=899, avg=635

Test case 2:
  enabled:  BW=388.6(MB/s)
  disabled: BW=378.2(MB/s)

So the intermittent workload test (case 1) shows >10% latency improvement.
The continuous workload test (case 2) shows about 3% bandwidth
improvement (388.6 vs. 378.2 MB/s). This mechanism delivers greater
performance gains under intermittent workloads than under continuous
workloads.

Any thoughts about the patches and the approach taken?

Wang Jianzheng (3):
  block/genhd: add sysfs knobs for the device frequency PM QoS
  block: add support for device frequency PM QoS tuning
  scsi: ufs: core: Add support for frequency PM QoS tuning

 block/blk-mq.c            | 58 +++++++++++++++++++++++++++++++++++++++
 block/genhd.c             | 23 ++++++++++++++++
 drivers/ufs/core/ufshcd.c | 44 +++++++++++++++++++++++++++++
 include/linux/blkdev.h    | 11 ++++++++
 include/linux/pm_qos.h    |  6 ++++
 5 files changed, 142 insertions(+)

-- 
2.34.1