> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Tuesday, August 12, 2025 6:59 AM
>
> There is a need to attach a PCI device that's under a reset to temporally
> the blocked domain (i.e. detach it from its previously attached domain),
> and then to reattach it back to its previous domain (i.e. detach it from
> the blocked domain) after reset.
>
> During the reset stage, there can be races from other attach/detachment.
> To solve this, a per-gdev reset flag will be introduced so that all the
> attach functions will bypass the driver-level attach_dev callbacks, but
> only update the group->domain pointer. The reset recovery procedure will
> attach directly to the cached pointer so things will be back to normal.
>
> On the other hand, the iommu_get_domain_for_dev() API always returns the
> group->domain pointer, and several IOMMMU drivers call this API in their
> attach_dev callback functions to get the currently attached domain for a
> device, which will be broken for the recovery case mentioned above:
>  1. core asks the driver to attach dev from blocked to group->domain
>  2. driver attaches dev from group->domain to group->domain

Does the 2nd bullet imply that a driver may skip the operation by noting
that the old domain is the same as the new one?

>
> So, iommu_get_domain_for_dev() should check the gdev flag and return the
> blocked domain if the flag is set. But the caller of this API could hold
> the group->mutex already or not, making it difficult to add the lock.
>
> Introduce a new iommu_get_domain_for_dev_locked() helper to be used by
> those drivers in a context that is already under the protection of the
> group->mutex, e.g. those attach_dev callback functions. And roll out the
> new helper to all the existing IOMMU drivers.

iommu_get_domain_for_dev() is also called outside of attach_dev callback
functions, e.g. in malidp_get_pgsize_bitmap(), and the returned info derived
from the attached domain might be saved in static structures, e.g.:

	ms->mmu_prefetch_pgsize = malidp_get_pgsize_bitmap(mp);

Would that cause weird issues when the blocking domain is returned, even
though one may not expect a reset to happen at that point?

> +/* Caller can be any general/external function that isn't an IOMMU callback */
>  struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)

s/can/must/ ?
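
To make my question about the recovery case concrete, here is a small
userspace model of how I read the two bullets quoted earlier. It is only a
sketch: every struct and function name below is hypothetical and none of it
is kernel code, and it assumes a driver that short-circuits attach_dev when
the old and new domains match, which is exactly the part I am asking about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's real structures. */
struct domain { const char *name; };

struct group {
	struct domain *domain;   /* models group->domain */
	struct domain *blocked;  /* models the blocked domain */
	bool resetting;          /* models the per-gdev reset flag */
};

struct dev_state {
	struct group *group;
	struct domain *hw;       /* what the "hardware" is really attached to */
};

/* Models the existing behavior: always report group->domain. */
static struct domain *get_domain_unaware(struct dev_state *d)
{
	return d->group->domain;
}

/* Models the proposed locked helper: report blocked while in reset. */
static struct domain *get_domain_reset_aware(struct dev_state *d)
{
	return d->group->resetting ? d->group->blocked : d->group->domain;
}

typedef struct domain *(*get_domain_fn)(struct dev_state *);

/*
 * Assumed driver behavior: skip programming when old == new.
 * Returns true only if the "hardware" was actually reprogrammed.
 */
static bool attach_dev(struct dev_state *d, struct domain *new_dom,
		       get_domain_fn get)
{
	struct domain *old = get(d);

	if (old == new_dom)
		return false;
	d->hw = new_dom;
	return true;
}

/* Recovery after reset: attach from blocked back to group->domain. */
static bool recover(struct dev_state *d, get_domain_fn get)
{
	bool programmed;

	assert(d->group->resetting);
	programmed = attach_dev(d, d->group->domain, get);
	d->group->resetting = false;
	return programmed;
}
```

In this model, calling recover() with get_domain_unaware leaves d->hw on the
blocked domain because the driver sees old == new, while calling it with
get_domain_reset_aware moves d->hw back to group->domain. If real drivers do
not skip on old == new, the failure mode would presumably look different.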