From: Yu Kuai <yukuai3@xxxxxxxxxx>

We have a workload of random 4k-128k reads on an HDD. From iostat we
observed that the average request size is 256k+ and bandwidth is
100MB/s+; this is because readahead wastes a lot of disk bandwidth.
Hence we disabled readahead, and performance from the user side is
indeed much better (2x+). However, from iostat we observed that the
request size is now just 4k and bandwidth is only around 40MB/s.

We then ran a simple dd test and found that if readahead is disabled,
page_cache_sync_ra() forces the read to be done one page at a time.
This really doesn't make sense, because we can just issue a request of
the user-requested size to the disk.

Fix this problem by removing the one-page-at-a-time limit from
page_cache_sync_ra(); this way the random read workload can get better
performance with readahead disabled.

PS: I'm not sure if I missed anything, so this version is an RFC.

Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
 mm/readahead.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 20d36d6b055e..1df85ccba575 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -561,13 +561,21 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	 * Even if readahead is disabled, issue this request as readahead
 	 * as we'll need it to satisfy the requested range. The forced
 	 * readahead will do the right thing and limit the read to just the
-	 * requested range, which we'll set to 1 page for this case.
+	 * requested range.
 	 */
-	if (!ra->ra_pages || blk_cgroup_congested()) {
+	if (blk_cgroup_congested()) {
 		if (!ractl->file)
 			return;
+		/*
+		 * If the cgroup is congested, ensure to do at least 1 page of
+		 * readahead to make progress on the read.
+		 */
 		req_count = 1;
 		do_forced_ra = true;
+	} else if (!ra->ra_pages) {
+		if (!ractl->file)
+			return;
+		do_forced_ra = true;
 	}
 
 	/* be dumb */
-- 
2.39.2
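
[Editor's note: for reference, below is a minimal userspace sketch of
the kind of random 4k-128k read workload described above. It is not
the reporter's actual test; the file path, iteration count, and RNG
seed are illustrative assumptions. Run it against a large file on the
HDD with readahead disabled (e.g. the device's read_ahead_kb set to 0)
and compare request size and bandwidth in iostat before and after the
patch.]

/*
 * Sketch of a random 4k-128k read workload. Assumes the target file
 * is large (at least a few thousand pages); path defaults to the
 * hypothetical /mnt/testfile.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/mnt/testfile";
	struct stat st;
	char *buf;
	long i;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0 || fstat(fd, &st) < 0) {
		perror(path);
		return 1;
	}

	buf = malloc(128 * 1024);
	srand(42);
	for (i = 0; i < 100000; i++) {
		/* Read 4k-128k from a random page-aligned offset. */
		size_t len = (size_t)(1 + rand() % 32) * 4096;
		off_t off = (off_t)(rand() % (st.st_size / 4096)) * 4096;

		if (pread(fd, buf, len, off) < 0) {
			perror("pread");
			return 1;
		}
	}
	free(buf);
	close(fd);
	return 0;
}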