Hi Mostafa, On 8/20/25 5:20 PM, Mostafa Saleh wrote: > Hi Eric, > > On Tue, Aug 19, 2025 at 11:58:32AM +0200, Eric Auger wrote: >> Hi Mostafa, >> >> On 8/18/25 7:33 PM, Mostafa Saleh wrote: >>> On Mon, Aug 18, 2025 at 10:52:42AM -0600, Alex Williamson wrote: >>>> On Fri, 15 Aug 2025 16:59:37 +0000 >>>> Mostafa Saleh <smostafa@xxxxxxxxxx> wrote: >>>> >>>>> Hi Alex, >>>>> >>>>> On Wed, Aug 06, 2025 at 11:03:12AM -0600, Alex Williamson wrote: >>>>>> vfio-platform hasn't had a meaningful contribution in years. In-tree >>>>>> hardware support is predominantly only for devices which are long since >>>>>> e-waste. QEMU support for platform devices is slated for removal in >>>>>> QEMU-10.2. Eric Auger presented on the future of the vfio-platform >>>>>> driver and difficulties supporting new devices at KVM Forum 2024, >>>>>> gaining some support for removal, some disagreement, but garnering no >>>>>> new hardware support, leaving the driver in a state where it cannot >>>>>> be tested. >>>>>> >>>>>> Mark as obsolete and subject to removal. >>>>> Recently(this year) in Android, we enabled VFIO-platform for protected KVM, >>>>> and it’s supported in our VMM (CrosVM) [1]. >>>>> CrosVM support is different from Qemu, as it doesn't require any device >>>>> specific logic in the VMM, however, it relies on loading a device tree >>>>> template in runtime (with “compatiable” string...) and it will just >>>>> override regs, irqs.. So it doesn’t need device knowledge (at least for now) >>>>> Similarly, the kernel doesn’t need reset drivers as the hypervisor handles that. >>>> I think what we attempt to achieve in vfio is repeatability and data >>>> integrity independent of the hypervisor. IOW, if we 'kill -9' the >>>> hypervisor process, the kernel can bring the device back to a default >>>> state where the device isn't wedged or leaking information through the >>>> device to the next use case. If the hypervisor wants to support >>>> enhanced resets on top of that, that's great, but I think it becomes >>>> difficult to argue that vfio-platform itself holds up its end of the >>>> bargain if we're really trusting the hypervisor to handle these aspects. >>> Sorry I was not clear, we only use that in Android for ARM64 and pKVM, >>> where the hypervisor in this context means the code running in EL2 which >>> is more privileged than the kernel, so it should be trusted. >>> However, as I mentioned that code is not upstream yet, so it's a valid >>> concern that the kernel still needs a reset driver. >>> >>>>> Unfortunately, there is no upstream support at the moment, we are making >>>>> some -slow- progress on that [2][3] >>>>> >>>>> If it helps, I have access to HW that can run that and I can review/test >>>>> changes, until upstream support lands; if you are open to keeping VFIO-platform. >>>>> Or I can look into adding support for existing upstream HW(with platforms I am >>>>> familiar with as Pixel-6) >>>> Ultimately I'll lean on Eric to make the call. I know he's concerned >>>> about testing, but he raised that and various other concerns whether >>>> platform device really have a future with vfio nearly a year ago and >>>> nothing has changed. Currently it requires a module option opt-in to >>>> enable devices that the kernel doesn't know how to reset. Is that >>>> sufficient or should use of such a device taint the kernel? If any >>>> device beyond the few e-waste devices that we know how to reset taint >>>> the kernel, should this support really even be in the kernel? Thanks, >>> I think with the way it’s supported at the moment we need the kernel >>> to ensure that reset happens. >> Effectively my main concern is I cannot test vfio-platform anymore. We >> had some CVEs also impacting the vfio platform code base and it is a >> major issue not being able to test. That's why I was obliged, last year, >> to resume the integration of a new device (the tegra234 mgbe), nobody >> seemed to be really interested in and this work could not be upstreamed >> due to lack of traction and its hacky nature. >> >> You did not really comment on which kind of devices were currently >> integrated. Are they within the original scope of vfio (with DMA >> capabilities and protected by an IOMMU)? Last discussion we had in >> https://lore.kernel.org/all/ZvvLpLUZnj-Z_tEs@xxxxxxxxxx/ led to the >> conclusion that maybe VFIO was not the best suited framework. > At the moment, Android device assignement only supports DMA capable > devices which are behind an IOMMU, and we use VFIO-platform for that, > most of our use cases are accelerators. > > In that thread, I was looking into adding support for simpler devices > (such as sensors) but as discussed that won’t be done through > VFIO-platform. > > Ignoring Android, as I mentioned, I can work on adding support for > existing upstream platforms (preferably ARM64, that I can get access to) > such as Pixel-6, which should make it easier to test. > > Also, we have some interest on adding new features such as run-time > power management. OK fair enough. If Alex agrees then we can wait for those efforts. Also I think it would make sense to formalize the way you reset the devices (I understand the hyp does that under the hood). > >> In case we keep the driver in, I think we need to get a garantee that >> you or someone else at Google commits to review and test potential >> changes with a perspective to take over its maintenance. > I can’t make guarantees on behalf of Google, but I can contribute in > reviewing/testing/maintenance of the driver as far as I am able to. > If you want, you can add me as reviewer to the driver. I understand. I think the usual way then is for you to send a patch to update the Maintainers file. Thanks Eric > > Thanks, > Mostafa > > >> Thanks >> >> Eric >> >>> But maybe instead of having that specific reset handler for VFIO, we >>> can rely on the “shutdown” method already existing in "platform_driver"? >>> I believe that should put the device in a state where it can be re-probed >>> safely. Although not all devices implement that but it seems more generic >>> and scalable. >>> >>> Thanks, >>> Mostafa >>> >>>> Alex >>>>