On Tue, Jun 17, 2025 at 05:04:55PM -0700, Tantilov, Emil S wrote:
> 
> 
> On 5/16/2025 7:58 AM, Larysa Zaremba wrote:
> > From: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
> > 
> > Support for initializing and configuring controlqs, and for managing
> > their transactions, was introduced in libie. As part of that work, most
> > of the existing controlq structures were renamed and modified. Use those
> > APIs in idpf and make all the necessary changes.
> > 
> > Previously, for sending and receiving virtchnl messages, the controlq
> > code did a memcpy to copy the buffer info passed by the send function
> > into the controlq-specific buffers. There was no restriction against
> > using automatic (stack) memory in that case. The new implementation in
> > libie removes the copying of the send buffer info and instead DMA-maps
> > the send buffer itself. To accommodate this, use dynamically allocated
> > memory for the larger send buffers. For smaller ones (<= 128 bytes),
> > libie can still copy them into the pre-allocated message memory.
> > 
> > On receive, idpf gets a page pool buffer allocated by libie, and care
> > must be taken to release it in idpf after use.
> > 
> > The changes are fairly trivial and localized, with a notable exception
> > being the consolidation of idpf_vc_xn_shutdown and idpf_deinit_dflt_mbx
> > under the latter name. This has some additional consequences that are
> > addressed in the following patches.
> 
> There is an issue with this approach that impacts the ability of the
> driver to force a reset. See below ...
> 
> > This refactoring introduces roughly an additional 40KB of module
> > storage on systems that only run idpf, so idpf + libie_cp + libie_pci
> > takes about 7% more storage than idpf alone before the refactoring.
> > 
> > We now pre-allocate small TX buffers, which does increase memory usage
> > but reduces the need to allocate.
> > This results in an additional 256 * 128B of memory permanently in use,
> > increasing the worst-case memory usage by 32KB, but our ctlq RX buffers
> > need to be of size 4096B anyway (not changed by the patchset), so this
> > is hardly noticeable.
> > 
> > As for the timings, the fact that we are mostly limited by the HW
> > response time, which is far from instant, is not changed by this
> > refactor.
> > 
> > Reviewed-by: Michal Kubiak <michal.kubiak@xxxxxxxxx>
> > Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
> > Co-developed-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>
> > ---
> >  drivers/net/ethernet/intel/idpf/Kconfig         |    2 +-
> >  drivers/net/ethernet/intel/idpf/Makefile        |    2 -
> >  drivers/net/ethernet/intel/idpf/idpf.h          |   27 +-
> >  .../net/ethernet/intel/idpf/idpf_controlq.c     |  624 -------
> >  .../net/ethernet/intel/idpf/idpf_controlq.h     |  130 --
> >  .../ethernet/intel/idpf/idpf_controlq_api.h     |  177 --
> >  .../ethernet/intel/idpf/idpf_controlq_setup.c   |  171 --
> >  drivers/net/ethernet/intel/idpf/idpf_dev.c      |   54 +-
> >  .../net/ethernet/intel/idpf/idpf_ethtool.c      |   37 +-
> >  drivers/net/ethernet/intel/idpf/idpf_lib.c      |   44 +-
> >  drivers/net/ethernet/intel/idpf/idpf_main.c     |    4 -
> >  drivers/net/ethernet/intel/idpf/idpf_mem.h      |   20 -
> >  drivers/net/ethernet/intel/idpf/idpf_txrx.h     |    2 +-
> >  drivers/net/ethernet/intel/idpf/idpf_vf_dev.c   |   60 +-
> >  .../net/ethernet/intel/idpf/idpf_virtchnl.c     | 1617 ++++++-----
> >  .../net/ethernet/intel/idpf/idpf_virtchnl.h     |   90 +-
> >  .../ethernet/intel/idpf/idpf_virtchnl_ptp.c     |  204 +--
> >  17 files changed, 765 insertions(+), 2500 deletions(-)
> >  delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq.c
> >  delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq.h
> >  delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq_api.h
> >  delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq_setup.c
> >  delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_mem.h
> 
> <snip>
> 
> > diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > index 68330b884967..500bff1091d9 100644
> > --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > @@ -1190,6 +1190,7 @@ void idpf_statistics_task(struct work_struct *work)
> >   */
> >  void idpf_mbx_task(struct work_struct *work)
> >  {
> > +	struct libie_ctlq_xn_recv_params xn_params = {};
> >  	struct idpf_adapter *adapter;
> >  
> >  	adapter = container_of(work, struct idpf_adapter, mbx_task.work);
> > 
> > @@ -1200,7 +1201,11 @@ void idpf_mbx_task(struct work_struct *work)
> >  		queue_delayed_work(adapter->mbx_wq, &adapter->mbx_task,
> >  				   msecs_to_jiffies(300));
> >  
> > -	idpf_recv_mb_msg(adapter, adapter->hw.arq);
> > +	xn_params.xnm = adapter->xn_init_params.xnm;
> > +	xn_params.ctlq = adapter->arq;
> > +	xn_params.ctlq_msg_handler = idpf_recv_event_msg;
> > +
> > +	libie_ctlq_xn_recv(&xn_params);
> >  }
> >  
> >  /**
> > @@ -1757,7 +1762,6 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter)
> >  		idpf_vc_core_deinit(adapter);
> >  		if (!is_reset)
> 
> Since one of the checks in idpf_is_reset_detected() is !adapter->arq,
> this will never be possible through the event task. I think we may be
> able to remove this check altogether, but as-is this patch introduces
> large delays in the Tx hang recovery and, depending on the cause, may
> not recover at all.
> >  			reg_ops->trigger_reset(adapter, IDPF_HR_FUNC_RESET);
> > -		idpf_deinit_dflt_mbx(adapter);
> >  	} else {
> >  		dev_err(dev, "Unhandled hard reset cause\n");
> >  		err = -EBADRQC;
> > 
> > @@ -1825,7 +1829,7 @@ void idpf_vc_event_task(struct work_struct *work)
> >  		return;
> >  
> >  func_reset:
> > -	idpf_vc_xn_shutdown(adapter->vcxn_mngr);
> > +	idpf_deinit_dflt_mbx(adapter);
> 
> This is not a straightforward swap: whereas previously we just discarded
> messages, knowing that we cannot communicate with the CP during a reset,
> this goes much further, as it dismantles the MBX resources, and as a
> result the check `if (!is_reset)` in idpf_init_hard_reset() will never
> be true.
> 

Thanks for finding this!

Given that the problem seems to relate only to idpf_vc_event_task() in the
case of IDPF_HR_FUNC_RESET, and the following call sequence:

	idpf_deinit_dflt_mbx(adapter);
	set_bit(IDPF_HR_RESET_IN_PROG, adapter->flags);
	idpf_init_hard_reset(adapter);
		netif_carrier_off();
		netif_tx_disable();
		is_reset = idpf_is_reset_detected(adapter);
		idpf_set_vport_state(adapter);
		idpf_vc_core_deinit(adapter);
			idpf_deinit_dflt_mbx(adapter);
			...
		...

I think it is safe to remove idpf_deinit_dflt_mbx() from
idpf_vc_event_task(), given that no mailbox communication is attempted
between it and idpf_vc_core_deinit(). Anything going on in parallel should
also not suffer from having the mailbox available just a little bit
longer.

What do you think?

> <snip>
> 
> Thanks,
> Emil
> 