On Mon, Jul 07, 2025 at 03:37:43PM +0800, Yibo Dong wrote: > On Fri, Jul 04, 2025 at 08:25:12PM +0200, Andrew Lunn wrote: > > > +/** > > > + * mucse_fw_send_cmd_wait - Send cmd req and wait for response > > > + * @hw: Pointer to the HW structure > > > + * @req: Pointer to the cmd req structure > > > + * @reply: Pointer to the fw reply structure > > > + * > > > + * mucse_fw_send_cmd_wait sends req to pf-cm3 mailbox and wait > > > + * reply from fw. > > > + * > > > + * Returns 0 on success, negative on failure > > > + **/ > > > +static int mucse_fw_send_cmd_wait(struct mucse_hw *hw, > > > + struct mbx_fw_cmd_req *req, > > > + struct mbx_fw_cmd_reply *reply) > > > +{ > > > + int err; > > > + int retry_cnt = 3; > > > + > > > + if (!hw || !req || !reply || !hw->mbx.ops.read_posted) > > > > Can this happen? > > > > If this is not supposed to happen, it is better the driver opps, so > > you get a stack trace and find where the driver is broken. > > > Yes, it is not supposed to happen. So, you means I should remove this > check in order to get opps when this condition happen? You should remove all defensive code. Let is explode with an Opps, so you can find your bugs. > > > + return -EINVAL; > > > + > > > + /* if pcie off, nothing todo */ > > > + if (pci_channel_offline(hw->pdev)) > > > + return -EIO; > > > > What can cause it to go offline? Is this to do with PCIe hotplug? > > > Yes, I try to get a PCIe hotplug condition by 'pci_channel_offline'. > If that happens, driver should never do bar-read/bar-write, so return > here. I don't know PCI hotplug too well, but i assume the driver core will call the .release function. Can this function be called as part of release? What actually happens on the PCI bus when you try to access a device which no longer exists? How have you tested this? Do you have the ability to do a hot{un}plug? > > > + if (mutex_lock_interruptible(&hw->mbx.lock)) > > > + return -EAGAIN; > > > > mutex_lock_interruptable() returns -EINTR, which is what you should > > return, not -EAGAIN. > > > Got it, I should return '-EINTR' here. No, you should return whatever mutex_lock_interruptable() returns. Whenever you call a function which returns an error code, you should pass that error code up the call stack. Never replace one error code with another. > > > + if (reply->error_code) > > > + return -reply->error_code; > > > > The mbox is using linux error codes? > > > It is used only between driver and fw, yay be just samply like this: > 0 -- no error > not 0 -- error > So, it is not using linux error codes. Your functions should always use linux/POSIX error codes. So if your firmware says an error has happened, turn it into a linux/POSIX error code. EINVAL, TIMEDOUT, EIO, whatever makes the most sense. Andrew