On 5/6/25 10:17 AM, Jason Gunthorpe wrote:
> On Tue, May 06, 2025 at 10:13:00AM -0400, Chuck Lever wrote:
>> On 5/6/25 9:55 AM, Jason Gunthorpe wrote:
>>> On Tue, May 06, 2025 at 06:40:25AM -0700, Christoph Hellwig wrote:
>>>> On Tue, May 06, 2025 at 10:17:22AM -0300, Jason Gunthorpe wrote:
>>>>> On Tue, May 06, 2025 at 06:08:59AM -0700, Christoph Hellwig wrote:
>>>>>> On Mon, Apr 28, 2025 at 03:36:49PM -0400, cel@xxxxxxxxxx wrote:
>>>>>>> qp_attr.cap.max_rdma_ctxs. The QP's actual Send Queue length is on
>>>>>>> the order of the sum of qp_attr.cap.max_send_wr and a factor times
>>>>>>> qp_attr.cap.max_rdma_ctxs. The factor can be up to three, depending
>>>>>>> on whether MR operations are required before RDMA Reads.
>>>>>>>
>>>>>>> This limit is not visible to RDMA consumers via dev->attrs. When the
>>>>>>> limit is surpassed, QP creation fails with -ENOMEM. For example:
>>>>>>
>>>>>> Can we find a way to expose this limit from the HCA drivers and the
>>>>>> RDMA core?
>>>>>
>>>>> Shouldn't it be max_qp_wr?
>>>>
>>>> Does that allow for arbitrary combination of different WRs?
>>>
>>> I think it is supposed to be the maximum QP WR depth you can create..
>>>
>>> A QP shouldn't behave differently depending on the WR operation, each
>>> one takes one WR entry.
>>>
>>> Chuck do you know differently?
>>
>> qp_attr.cap.max_rdma_ctxs reserves a number of SQEs over and above
>> qp_attr.cap.max_send_wr. The sum of those two cannot exceed max_qp_wr,
>> of course.
>
> Yes
>
>> But there is a multiplier, due to whether the device wants a
>> registration and invalidation WR in addition to each RDMA Read WR.
>
> Yes, but both of these are in the rdma rw layer
>
>> Further, in drivers/infiniband/hw/mlx5/qp.c :: calc_sq_size
>>
>>         wq_size = roundup_pow_of_two(attr->cap.max_send_wr * wqe_size);
>>         qp->sq.wqe_cnt = wq_size / MLX5_SEND_WQE_BB;
>>         if (qp->sq.wqe_cnt > (1 << MLX5_CAP_GEN(dev->mdev,
>>             log_max_qp_sz))) {
>
> And this log_max_qp_sz should be used to derive attr.max_qp_wr
>
>> In this patch I'm trying to include the reg/inv multiplier in the
>> calculation, but that doesn't seem to be enough to make "accept"
>> reliable, IMO due to this extra calculation in calc_sq_size().
>
> Did ib_create_qp get called with more than max_qp_wr ?

The request was for, like, 9300 SQEs. max_qp_wr is 32K on my systems.

> Or is max_qp_wr not working?

--
Chuck Lever
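
To make the arithmetic in the calc_sq_size() excerpt quoted above concrete,
here is a rough userspace sketch, not kernel code. It assumes
MLX5_SEND_WQE_BB is 64 bytes and that log_max_qp_sz is 15 (which would
match a max_qp_wr of 32K). The per-WQE sizes used below are hypothetical,
and sq_wqe_cnt() and sq_fits() are illustrative names that do not exist in
the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of MLX5_SEND_WQE_BB: bytes per send WQE basic block. */
#define SEND_WQE_BB 64u

/* Userspace stand-in for the kernel's roundup_pow_of_two(). */
static uint64_t roundup_pow_of_two_u64(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Mirrors the first two quoted lines: the byte size of the queue is
 * rounded up to a power of two, then counted in basic blocks.
 * wqe_size is a hypothetical per-WQE size in bytes.
 */
static uint64_t sq_wqe_cnt(uint32_t max_send_wr, uint32_t wqe_size)
{
	uint64_t wq_size;

	assert(max_send_wr > 0 && wqe_size > 0);
	wq_size = roundup_pow_of_two_u64((uint64_t)max_send_wr * wqe_size);
	return wq_size / SEND_WQE_BB;
}

/* True when the count does not exceed 1 << log_max_qp_sz. */
static bool sq_fits(uint32_t max_send_wr, uint32_t wqe_size,
		    unsigned int log_max_qp_sz)
{
	return sq_wqe_cnt(max_send_wr, wqe_size) <= (1ull << log_max_qp_sz);
}
```

Under these assumptions, sq_wqe_cnt(9300, 256) is 65536, which is larger
than 1 << 15 = 32768, so the quoted comparison would be true even though
9300 is well under 32K. With a hypothetical 192-byte WQE the count is
exactly 32768 and the comparison would be false.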