[PATCH] blk-wbt: Fix io starvation in wbt_rqw_done()

Recently, we encountered the following hung task:

INFO: task kworker/11:2:2981147 blocked for more than 6266 seconds
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/11:2    D    0 2981147      2 0x80004000
Workqueue: cgroup_destroy css_free_rwork_fn
Call Trace:
 __schedule+0x934/0xe10
 schedule+0x40/0xb0
 wb_wait_for_completion+0x52/0x80
 ? finish_wait+0x80/0x80
 mem_cgroup_css_free+0x3a/0x1b0
 css_free_rwork_fn+0x42/0x380
 process_one_work+0x1a2/0x360
 worker_thread+0x30/0x390
 ? create_worker+0x1a0/0x1a0
 kthread+0x110/0x130
 ? __kthread_cancel_work+0x40/0x40
 ret_from_fork+0x1f/0x30

The writeback thread is throttled by wbt again and again, while the
writes of another thread proceed without trouble at the same time.
After debugging, I believe the cause is as follows.

When thread A is blocked by wbt, I/O issued by thread B uses a deeper
queue depth (rwb->rq_depth.max_depth) because it satisfies
wb_recent_wait(). Thread B's I/O is therefore issued smoothly, and the
inflight I/O count tracked by wbt stays relatively high.

However, when an I/O completes, the inflight count of wbt is still
high, so the condition "limit - inflight >= rwb->wb_background / 2"
in wbt_rqw_done() is not satisfied and thread A's I/O is not woken up.

State captured on the affected machine:

>>> rwb.rq_depth.max_depth
(unsigned int)48
>>> rqw.inflight.counter.value_()
44
>>> rqw.inflight.counter.value_()
35
>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
(unsigned long)3
>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
(unsigned long)2
>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
(unsigned long)20
>>> prog['jiffies'] - rwb.rqos.q.backing_dev_info.last_bdp_sleep
(unsigned long)12

cat wb_normal
24
cat wb_background
12
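
As a worked check of the wake-up condition against the values above,
here is a minimal user-space sketch. would_wake() is a hypothetical
helper, not a kernel function; it models only the quoted condition
"limit - inflight >= rwb->wb_background / 2" and ignores every other
check in wbt_rqw_done().

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not kernel code): evaluates only the condition
 * "limit - inflight >= wb_background / 2" quoted in this message.
 */
static bool would_wake(int limit, int inflight, int wb_background)
{
	int diff = limit - inflight;	/* signed, so it can go negative */

	return diff >= wb_background / 2;
}
```

With limit = wb_normal = 24 and wb_background = 12, the threshold is 6.
would_wake(24, 44, 12) and would_wake(24, 35, 12) are both false, since
the difference is -20 and -11. With limit = max_depth = 48,
would_wake(48, 35, 12) is true (13 >= 6), while would_wake(48, 44, 12)
is still false (4 < 6).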

To fix this, use max_depth as the limit in wbt_rqw_done() when
wb_recent_wait() is true. This also makes wbt_rqw_done() and
get_limit() handle wb_recent_wait() consistently, which is more
reasonable.
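
The effect on the limit can be sketched in user space as follows. This
is a simplified model of the branches visible in the hunk below, not
the kernel function itself: struct wbt_limits, done_limit() and the
"patched" flag are hypothetical names introduced for illustration, and
the branch that precedes the write-cache check in wbt_rqw_done() is
not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two rq_wb fields used here. */
struct wbt_limits {
	int max_depth;	/* models rwb->rq_depth.max_depth */
	int wb_normal;	/* models rwb->wb_normal */
};

/*
 * Hypothetical model of the limit chosen on completion.
 * "patched" selects the behaviour with or without this change.
 */
static int done_limit(const struct wbt_limits *l, bool write_cache,
		      bool recent_wait, bool patched)
{
	if (write_cache && !recent_wait)
		return 0;
	if (patched && recent_wait)
		return l->max_depth;	/* branch added by this patch */
	return l->wb_normal;
}
```

For max_depth = 48 and wb_normal = 24 with a recent wait, the model
returns 24 without the patch and 48 with it. Without a recent wait the
result is the same in both cases.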

Signed-off-by: Julian Sun <sunjunchao@xxxxxxxxxxxxx>
Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism")
---
 block/blk-wbt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index a50d4cd55f41..d6a2782d442f 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -210,6 +210,8 @@ static void wbt_rqw_done(struct rq_wb *rwb, struct rq_wait *rqw,
 	else if (blk_queue_write_cache(rwb->rqos.disk->queue) &&
 		 !wb_recent_wait(rwb))
 		limit = 0;
+	else if (wb_recent_wait(rwb))
+		limit = rwb->rq_depth.max_depth;
 	else
 		limit = rwb->wb_normal;
 
-- 
2.20.1




