> -----Original Message-----
> From: Shradha Gupta <shradhagupta@xxxxxxxxxxxxxxxxxxx>
> Sent: Thursday, May 8, 2025 5:29 AM
> To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> Cc: linux-hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Dexuan Cui
> <decui@xxxxxxxxxxxxx>; stephen@xxxxxxxxxxxxxxxxxx; KY Srinivasan
> <kys@xxxxxxxxxxxxx>; Paul Rosswurm <paulros@xxxxxxxxxxxxx>;
> olaf@xxxxxxxxx; vkuznets@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> wei.liu@xxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx;
> pabeni@xxxxxxxxxx; leon@xxxxxxxxxx; Long Li <longli@xxxxxxxxxxxxx>;
> ssengar@xxxxxxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
> daniel@xxxxxxxxxxxxx; john.fastabend@xxxxxxxxx; bpf@xxxxxxxxxxxxxxx;
> ast@xxxxxxxxxx; hawk@xxxxxxxxxx; tglx@xxxxxxxxxxxxx;
> andrew+netdev@xxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH net-next] net: mana: Add handler for hardware
> servicing events
>
> On Wed, May 07, 2025 at 08:58:39AM -0700, Haiyang Zhang wrote:
> > To collaborate with hardware servicing events, upon receiving the special
> > EQE notification from the HW channel, remove the devices on this bus.
> > Then, after a waiting period based on the device specs, rescan the parent
> > bus to recover the devices.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > ---
> >  .../net/ethernet/microsoft/mana/gdma_main.c   | 61 +++++++++++++++++++
> >  include/net/mana/gdma.h                       |  5 +-
> >  2 files changed, 65 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 4ffaf7588885..aa2ccf4d0ec6 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -352,11 +352,52 @@ void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
> >  }
> >  EXPORT_SYMBOL_NS(mana_gd_ring_cq, "NET_MANA");
> >
> > +#define MANA_SERVICE_PERIOD 10
> > +
> > +struct mana_serv_work {
> > +	struct work_struct serv_work;
> > +	struct pci_dev *pdev;
> > +};
> > +
> > +static void mana_serv_func(struct work_struct *w)
> > +{
> > +	struct mana_serv_work *mns_wk = container_of(w, struct mana_serv_work, serv_work);
> > +	struct pci_dev *pdev = mns_wk->pdev;
> > +	struct pci_bus *bus, *parent;
> > +
> > +	if (!pdev)
> > +		goto out;
> > +
> > +	bus = pdev->bus;
> > +	if (!bus) {
> > +		dev_err(&pdev->dev, "MANA service: no bus\n");
> > +		goto out;
> > +	}
> > +
> > +	parent = bus->parent;
> > +	if (!parent) {
> > +		dev_err(&pdev->dev, "MANA service: no parent bus\n");
> > +		goto out;
> > +	}
> > +
> > +	pci_stop_and_remove_bus_device_locked(bus->self);
> > +
> > +	msleep(MANA_SERVICE_PERIOD * 1000);
> > +
> > +	pci_lock_rescan_remove();
> > +	pci_rescan_bus(parent);
> > +	pci_unlock_rescan_remove();
> > +
> > +out:
> > +	kfree(mns_wk);
>
> Shouldn't gc->in_service be set to false again?
>
> > +}
> > +
> >  static void mana_gd_process_eqe(struct gdma_queue *eq)
> >  {
> >  	u32 head = eq->head % (eq->queue_size / GDMA_EQE_SIZE);
> >  	struct gdma_context *gc = eq->gdma_dev->gdma_context;
> >  	struct gdma_eqe *eq_eqe_ptr = eq->queue_mem_ptr;
> > +	struct mana_serv_work *mns_wk;
> >  	union gdma_eqe_info eqe_info;
> >  	enum gdma_eqe_type type;
> >  	struct gdma_event event;
> > @@ -400,6 +441,26 @@ static void mana_gd_process_eqe(struct gdma_queue *eq)
> >  		eq->eq.callback(eq->eq.context, eq, &event);
> >  		break;
> >
> > +	case GDMA_EQE_HWC_FPGA_RECONFIG:
> > +	case GDMA_EQE_HWC_SOCMANA_CRASH:
>
> Maybe we could also add a log (dev_dbg) to indicate whether the
> servicing is for an FPGA reconfig or a socmana crash.

Thanks, I will add this.

- Haiyang
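
For reference, the logging suggested above could look roughly like the
sketch below. It is a userspace illustration only, not the patch itself:
the enum values are placeholders (the real definitions are in
include/net/mana/gdma.h and are not quoted in this thread),
mana_serv_type_name() and mana_serv_format_log() are hypothetical helpers,
and snprintf() stands in for dev_dbg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Placeholder values for illustration. The real definitions are in
 * include/net/mana/gdma.h and are not quoted in this thread.
 */
enum gdma_eqe_type {
	GDMA_EQE_HWC_FPGA_RECONFIG = 1,
	GDMA_EQE_HWC_SOCMANA_CRASH = 2,
};

/* Hypothetical helper: map a servicing event type to a readable name. */
static const char *mana_serv_type_name(enum gdma_eqe_type type)
{
	switch (type) {
	case GDMA_EQE_HWC_FPGA_RECONFIG:
		return "FPGA reconfig";
	case GDMA_EQE_HWC_SOCMANA_CRASH:
		return "socmana crash";
	default:
		return "unknown";
	}
}

/*
 * Hypothetical helper: build the message that a dev_dbg() call in the
 * shared case block could emit. Returns the result of snprintf().
 */
static int mana_serv_format_log(char *buf, size_t len, enum gdma_eqe_type type)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "MANA service: %s (type %d)",
			mana_serv_type_name(type), (int)type);
}
```

In the driver, the equivalent would presumably be a single dev_dbg() call
at the top of the shared case block, so that both event types are reported
before the servicing work is queued.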