Re: 10x I/O await times in 6.12

Hi,

On 2025/04/22 9:39, Keith Busch wrote:
On Tue, Apr 22, 2025 at 09:28:02AM +0800, Yu Kuai wrote:
Hi,

On 2025/04/21 23:22, Keith Busch wrote:
On Mon, Apr 21, 2025 at 09:53:10AM +0100, Matt Fleming wrote:
Hey there,

We're moving to 6.12 at Cloudflare and noticed that write await times
in iostat are 10x what they were in 6.6. After a bit of bpftracing
(script to find all plug times above 10ms below), it seems like this
is an accounting error caused by the plug->cur_ktime optimisation
rather than anything more material.

It appears as though a task can enter __submit_bio() with ->plug set
and a very stale cur_ktime value on the order of milliseconds. Is this
expected behaviour? It looks like it leads to inaccurate I/O times.

There are places with a block plug that call cond_resched(), which
doesn't invalidate the plug's cached ktime. You could end up with a
stale ktime if your process is scheduled out.

This is wrong; being scheduled out will clear the cached ktime. You can
check this easily, since there are not many callers that clear the ktime.

Huh? cond_resched() calls __schedule() directly via
preempt_schedule_common(), which most certainly does not clear the
plug's time.

Yes, this is the preempt case, where plugged IO is not issued; this
behaviour is already known. I thought you meant the normal case, like
you said below. :(

The timestamp is only invalidated from schedule() or
rt_mutex_post_schedule(). You can check it ... "easily".
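
To make the two paths concrete, here is a toy userspace model of the
behaviour described above. It is only a sketch and not kernel code:
FakeClock, Task, time_get_ns() and staleness_ms() are hypothetical names
made up for illustration, and the only assumption is the one stated in
the thread, namely that the schedule() path clears the cached timestamp
while the cond_resched() preempt path leaves it alone.

```python
class FakeClock:
    """Hand-advanced monotonic clock in nanoseconds (hypothetical helper)."""

    def __init__(self, start_ns=1_000_000_000):
        self.now_ns = start_ns

    def advance_ms(self, ms):
        self.now_ns += ms * 1_000_000


class Task:
    """Toy task holding a plug with a cached timestamp (not kernel code)."""

    def __init__(self, clock):
        self.clock = clock
        self.plugged = False
        self.cur_ktime = 0  # 0 means "nothing cached"

    def start_plug(self):
        self.plugged = True
        self.cur_ktime = 0

    def time_get_ns(self):
        # While plugged, read the clock once and reuse the cached value.
        if not self.plugged:
            return self.clock.now_ns
        if self.cur_ktime == 0:
            self.cur_ktime = self.clock.now_ns
        return self.cur_ktime

    def schedule(self, off_cpu_ms):
        # Assumed behaviour: this path invalidates the cached timestamp.
        self.cur_ktime = 0
        self.clock.advance_ms(off_cpu_ms)

    def cond_resched(self, off_cpu_ms):
        # Assumed behaviour: the preempt path leaves the cache untouched.
        self.clock.advance_ms(off_cpu_ms)


def staleness_ms(path, off_cpu_ms):
    """How far the plug's timestamp lags the clock after going off-CPU."""
    clock = FakeClock()
    task = Task(clock)
    task.start_plug()
    task.time_get_ns()               # prime the cache
    getattr(task, path)(off_cpu_ms)  # task is off-CPU for a while
    return (clock.now_ns - task.time_get_ns()) // 1_000_000


print(staleness_ms("schedule", 20))      # 0
print(staleness_ms("cond_resched", 20))  # 20
```

In this model, any timestamp taken after the preempt path is as stale as
the time spent off-CPU, which matches the millisecond-scale staleness
reported at the top of the thread.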

So either preemption taking a long time, or generating lots of bios to
plug taking a long time, can result in larger iostat IO latency. I still
think delaying the setting of the request start_time until
blk_mq_flush_plug_list() might be a reasonable fix.
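
A rough illustration of the accounting difference, as a sketch only: the
function await_ms() and its policy names are hypothetical, the numbers
are made up, and it assumes the plug's timestamp is cached at the first
submission and never invalidated before the flush.

```python
def await_ms(submit_ms, flush_ms, service_ms, policy):
    """Per-request await in ms under a given start_time policy (toy model).

    submit_ms:  times at which bios were added to the plug
    flush_ms:   time at which the plug list is flushed to the device
    service_ms: device service time after the flush
    """
    cached_ms = submit_ms[0]  # assumption: cache primed by first submission
    completion_ms = flush_ms + service_ms
    starts = {
        "cached": [cached_ms] * len(submit_ms),    # stale plug timestamp
        "at_submit": list(submit_ms),              # uncached clock read
        "at_flush": [flush_ms] * len(submit_ms),   # stamped at flush time
    }[policy]
    return [completion_ms - start for start in starts]


submits = [0, 5, 12]  # three bios plugged over 12 ms
print(await_ms(submits, 12, 1, "cached"))     # [13, 13, 13]
print(await_ms(submits, 12, 1, "at_submit"))  # [13, 8, 1]
print(await_ms(submits, 12, 1, "at_flush"))   # [1, 1, 1]
```

With the cached timestamp every request is charged the whole plug
duration, while stamping at flush time charges only the time after the
IO is actually issued.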

Thanks,
Kuai
