[PATCH 0/3] block: device frequency PM QoS tuning

Hello,

These patches introduce a mechanism for limiting device frequencies via
PM QoS when latency-sensitive threads block on IO. Storage devices (like
UFS) use the "devfreq_monitor" mechanism to automatically scale
frequency based on IO workload. However, because of the hysteresis in
IO workload detection, IO requests can end up being processed at low
frequency.
 
Original devfreq_monitor frequency scaling timeline:
       |--- latency-intensive process working ------------
  |****+**+**|***+++*++*|*++++++++*|**+++++***|*++++++++*|
       |                           |- high load and scale up frequency
       |-------- low frequency ----|-- high frequency ---|
([+] IO request pending   [*] idle)
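
To make the hysteresis concrete, here is a toy userspace model of the
timeline above. It is only a sketch and not the actual devfreq governor:
the two thresholds are hypothetical values picked so that the result
matches the diagram, and each window's frequency is chosen from the load
seen in the previous window.

```python
# Toy model of window-based frequency scaling; NOT the real devfreq governor.
TIMELINE = "|****+**+**|***+++*++*|*++++++++*|**+++++***|*++++++++*|"
UP_THRESHOLD = 0.7    # hypothetical: scale up if the last window was busier
DOWN_THRESHOLD = 0.3  # hypothetical: scale down if the last window was idler


def window_loads(timeline):
    """Return the fraction of '+' (busy) slots in each '|'-delimited window."""
    windows = [w for w in timeline.split("|") if w]
    return [w.count("+") / len(w) for w in windows]


def frequencies(loads):
    """Pick a frequency for each window from the load of the previous one."""
    freq, result = "low", []
    for load in [0.0] + loads[:-1]:  # window N only sees window N-1's load
        if load > UP_THRESHOLD:
            freq = "high"
        elif load < DOWN_THRESHOLD:
            freq = "low"
        result.append(freq)
    return result


loads = window_loads(TIMELINE)
print(loads)               # [0.2, 0.5, 0.8, 0.5, 0.8]
print(frequencies(loads))  # ['low', 'low', 'low', 'high', 'high']
```

In this model the busy third window is still served at low frequency,
because the decision to scale up is only taken after that window's load
has been measured.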

Now, the patches provided here introduce a mechanism for the block layer
to add constraints on device frequency through the PM QoS framework,
with configurable sysfs knobs per block device. I used the following
configuration on my test system:

  /sys/block/sda/dev_freq_timeout_ms = 30

This constraint is removed if there is no block IO for 30ms.
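
As a rough illustration of the timeout semantics (a userspace sketch,
not the kernel code in the patches), the function below computes when
the constraint would be held for a list of IO timestamps. It assumes the
constraint is added at the first IO and that every later IO re-arms the
timeout; the function name and the sample timestamps are hypothetical.

```python
TIMEOUT_MS = 30  # value written to dev_freq_timeout_ms in the example above


def constraint_intervals(io_times_ms, timeout_ms=TIMEOUT_MS):
    """Return (start, end) intervals during which the constraint is held.

    The constraint is taken at the first IO and dropped once no IO has
    been seen for timeout_ms; every further IO pushes the deadline back.
    """
    intervals = []
    start = deadline = None
    for t in sorted(io_times_ms):
        if start is not None and t >= deadline:
            intervals.append((start, deadline))  # timed out before this IO
            start = None
        if start is None:
            start = t                            # first IO: add constraint
        deadline = t + timeout_ms                # re-arm the timeout
    if start is not None:
        intervals.append((start, deadline))
    return intervals


# Hypothetical IO times in milliseconds.
print(constraint_intervals([0, 10, 25, 100, 120]))  # [(0, 55), (100, 150)]
```

With these sample times the constraint is dropped at 55ms (30ms after
the IO at 25ms) and taken again when IO resumes at 100ms.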

Enhanced frequency scaling timeline:
       |--- latency-intensive process working ------------
  |****+**+**|***+++*++*|*++++++++*|**+++++***|*++++++++*|
       |- add device frequency PM QoS constraints---------
             |- scale up frequency
       |-low-|------------ high frequency ---------------|

Here are my test system details:
  - SoC: Qualcomm Snapdragon (1+3+4 core cluster)
  - Storage: UFS 4.1
  - Fio Version: 3.9
  - Global fio config:
           --rw=randread --bs=64k --iodepth=1 \
           --numjobs=5 --time_based --runtime=10 \
           --ioengine=libaio --hipri --cpus_allowed=3
           (job1~5 startdelay = [0s, 20s, 40s, 60s, 80s])
  - Local fio config:
      -Test case 1:
           --rate=10ms
      -Test case 2:
           --rate=0ms

Running the fio tests described above with enhanced frequency scaling
enabled/disabled, I get:

  Test case 1:
     enabled: 	clat (usec): min=141, max=872, avg=550
     disabled:	clat (usec): min=210, max=899, avg=635
  Test case 2:
     enabled: 	BW=388.6(MB/s)
     disabled:	BW=378.2(MB/s)

So the intermittent workload test (case 1) shows a >10% improvement in
average latency (550 vs. 635 usec). The continuous workload test
(case 2) shows about a 3% bandwidth improvement (388.6 vs. 378.2 MB/s).
This mechanism delivers greater performance gains under intermittent
workloads than under continuous workloads.
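
For reference, the relative gains can be recomputed from the figures
quoted above; this is plain arithmetic on those numbers, and the helper
name is only for illustration.

```python
def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


# Test case 1: average completion latency in usec (lower is better).
latency_gain = -percent_change(550, 635)
# Test case 2: bandwidth in MB/s (higher is better).
bandwidth_gain = percent_change(388.6, 378.2)

print(f"latency: {latency_gain:.1f}%  bandwidth: {bandwidth_gain:.1f}%")
```

This gives roughly 13% for average latency and a bit under 3% for
bandwidth.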

Any thoughts about the patches and the approach taken?

Wang Jianzheng (3):
  block/genhd: add sysfs knobs for the device frequency PM QoS
  block: add support for device frequency PM QoS tuning
  scsi: ufs: core: Add support for frequency PM QoS tuning

 block/blk-mq.c            | 58 +++++++++++++++++++++++++++++++++++++++
 block/genhd.c             | 23 ++++++++++++++++
 drivers/ufs/core/ufshcd.c | 44 +++++++++++++++++++++++++++++
 include/linux/blkdev.h    | 11 ++++++++
 include/linux/pm_qos.h    |  6 ++++
 5 files changed, 142 insertions(+)

-- 
2.34.1




