On Wed 30-07-25 10:03:50, Yu Kuai wrote:
> On 2025/07/29 18:16, Jan Kara wrote:
> > On Tue 29-07-25 11:19:05, Yu Kuai wrote:
> > > From: Yu Kuai <yukuai3@xxxxxxxxxx>
> > >
> > > Currently elevators record an internal 'async_depth' to throttle
> > > asynchronous requests, and they all calculate shallow_depth based on
> > > sb->shift, on the assumption that 1 << sb->shift is the number of
> > > available tags in each word.
> > >
> > > However, the last word may contain fewer tags, see __map_depth:
> > >
> > > 	if (index == sb->map_nr - 1)
> > > 		return sb->depth - (index << sb->shift);
> > >
> > > As a consequence, if the last word is used, more tags can be obtained
> > > than expected. For example, assume nr_requests=256 so that there are
> > > four 64-bit words; in the worst case, if the user then sets
> > > nr_requests=32, the first word becomes the last (and only) word with
> > > 32 tags, yet async_depth is still calculated from the full word size
> > > of 64 bits, which is wrong.
> > >
> > > On the other hand, due to cgroup QoS, bfq may want to allow only one
> > > request to be allocated; however, setting shallow_depth=1 will still
> > > allow one request per word, i.e. as many requests as there are words.
> > >
> > > Fix these problems by making shallow_depth apply to the whole sbitmap
> > > instead of to each word, and change kyber, mq-deadline and bfq to
> > > follow this.
> > >
> > > Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> >
> > I agree with these problems but AFAIU this implementation of shallow
> > depth has been done for a reason. Omar can chime in here as the
> > original author, or perhaps Jens, but the idea of the current shallow
> > depth implementation is that each sbitmap user, regardless of the used
> > shallow depth, has a chance to allocate from each sbitmap word, which
> > evenly distributes pressure among the available sbitmap words. With the
> > implementation you've chosen there will be higher pressure (and thus
> > contention) on words with low indices.
>
> Yes, this makes sense. However, considering that shallow depth is only
> used by elevators, this higher pressure should be negligible for
> deadline and bfq. As for kyber, this might be a problem.

I agree that for bfq the overhead should be in the noise. For mq-deadline
it might be measurable but I'm not overly concerned.

> > So I think it would be good to fix the issues with shallow depth for a
> > small number of sbitmap words (because that's where these buggy corner
> > cases may matter in practice) but I believe the logic which constrains
> > the number of used bits from each *word* when shallow_depth is
> > specified should be kept. It might make sense to change the API so that
> > shallow_depth is indeed specified relative to the total size of the
> > bitmap, not to the size of a word (because the current convention
> > confuses practically everybody I've met and is a constant source of
> > bugs), if it can be made to perform well.
>
> Do you think it would be OK to add a new shallow depth API that uses
> the total size, and convert bfq and deadline to use it?

I think having two APIs will be even more confusing than the current
state. But as I wrote, I think you can have an API that specifies shallow
depth relative to the total size and in sbitmap_queue_get_shallow() do:

	shallow_per_word = (shallow_depth << sb->shift) / sb->depth;
	rounding_index = shallow_depth - shallow_per_word * sb->map_nr;

and allow depth shallow_per_word + 1 if the current index < rounding_index
and exactly shallow_per_word if the current index >= rounding_index. This
will still evenly distribute the shallow depth among words and I don't
think the additional overhead of a few arithmetic operations will be
visible.
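
For illustration, a minimal sketch of that calculation (this assumes
sb->depth == sb->map_nr << sb->shift, i.e. the last word is full, and the
helper name is made up):

	/*
	 * Hypothetical helper sketching the scheme above: shallow_depth
	 * is relative to the whole sbitmap and is split evenly among the
	 * words, with the remainder spread over the lowest-index words.
	 * Assumes sb->depth == sb->map_nr << sb->shift (no short last
	 * word), otherwise the rounding_index arithmetic can underflow.
	 */
	static unsigned int shallow_word_depth(struct sbitmap *sb,
					       unsigned int shallow_depth,
					       unsigned int index)
	{
		unsigned int shallow_per_word =
			(shallow_depth << sb->shift) / sb->depth;
		unsigned int rounding_index =
			shallow_depth - shallow_per_word * sb->map_nr;

		/* The first rounding_index words get one extra tag each. */
		return shallow_per_word + (index < rounding_index ? 1 : 0);
	}

E.g. with sb->depth == 256 (four 64-bit words) and shallow_depth == 103,
words 0-2 would each allow 26 tags and word 3 would allow 25, summing to
exactly 103.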
								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR