Hi Eric,

On Tue, Aug 19, 2025 at 11:58:32AM +0200, Eric Auger wrote:
> Hi Mostafa,
>
> On 8/18/25 7:33 PM, Mostafa Saleh wrote:
> > On Mon, Aug 18, 2025 at 10:52:42AM -0600, Alex Williamson wrote:
> >> On Fri, 15 Aug 2025 16:59:37 +0000
> >> Mostafa Saleh <smostafa@xxxxxxxxxx> wrote:
> >>
> >>> Hi Alex,
> >>>
> >>> On Wed, Aug 06, 2025 at 11:03:12AM -0600, Alex Williamson wrote:
> >>>> vfio-platform hasn't had a meaningful contribution in years. In-tree
> >>>> hardware support is predominantly only for devices which are long since
> >>>> e-waste. QEMU support for platform devices is slated for removal in
> >>>> QEMU-10.2. Eric Auger presented on the future of the vfio-platform
> >>>> driver and difficulties supporting new devices at KVM Forum 2024,
> >>>> gaining some support for removal, some disagreement, but garnering no
> >>>> new hardware support, leaving the driver in a state where it cannot
> >>>> be tested.
> >>>>
> >>>> Mark as obsolete and subject to removal.
> >>> Recently (this year) in Android, we enabled VFIO-platform for protected KVM,
> >>> and it’s supported in our VMM (CrosVM) [1].
> >>> CrosVM support is different from Qemu, as it doesn't require any device
> >>> specific logic in the VMM, however, it relies on loading a device tree
> >>> template in runtime (with “compatible” string...) and it will just
> >>> override regs, irqs.. So it doesn’t need device knowledge (at least for now)
> >>> Similarly, the kernel doesn’t need reset drivers as the hypervisor handles that.
> >> I think what we attempt to achieve in vfio is repeatability and data
> >> integrity independent of the hypervisor. IOW, if we 'kill -9' the
> >> hypervisor process, the kernel can bring the device back to a default
> >> state where the device isn't wedged or leaking information through the
> >> device to the next use case.
> >> If the hypervisor wants to support
> >> enhanced resets on top of that, that's great, but I think it becomes
> >> difficult to argue that vfio-platform itself holds up its end of the
> >> bargain if we're really trusting the hypervisor to handle these aspects.
> > Sorry I was not clear, we only use that in Android for ARM64 and pKVM,
> > where the hypervisor in this context means the code running in EL2 which
> > is more privileged than the kernel, so it should be trusted.
> > However, as I mentioned that code is not upstream yet, so it's a valid
> > concern that the kernel still needs a reset driver.
> >
> >>> Unfortunately, there is no upstream support at the moment, we are making
> >>> some -slow- progress on that [2][3]
> >>>
> >>> If it helps, I have access to HW that can run that and I can review/test
> >>> changes, until upstream support lands; if you are open to keeping VFIO-platform.
> >>> Or I can look into adding support for existing upstream HW (with platforms I am
> >>> familiar with, such as Pixel-6)
> >> Ultimately I'll lean on Eric to make the call. I know he's concerned
> >> about testing, but he raised that and various other concerns whether
> >> platform device really have a future with vfio nearly a year ago and
> >> nothing has changed. Currently it requires a module option opt-in to
> >> enable devices that the kernel doesn't know how to reset. Is that
> >> sufficient or should use of such a device taint the kernel? If any
> >> device beyond the few e-waste devices that we know how to reset taint
> >> the kernel, should this support really even be in the kernel? Thanks,
> > I think with the way it’s supported at the moment we need the kernel
> > to ensure that reset happens.
>
> Effectively my main concern is I cannot test vfio-platform anymore. We
> had some CVEs also impacting the vfio platform code base and it is a
> major issue not being able to test.
> That's why I was obliged, last year,
> to resume the integration of a new device (the tegra234 mgbe), nobody
> seemed to be really interested in and this work could not be upstreamed
> due to lack of traction and its hacky nature.
>
> You did not really comment on which kind of devices were currently
> integrated. Are they within the original scope of vfio (with DMA
> capabilities and protected by an IOMMU)? Last discussion we had in
> https://lore.kernel.org/all/ZvvLpLUZnj-Z_tEs@xxxxxxxxxx/ led to the
> conclusion that maybe VFIO was not the best suited framework.

At the moment, Android device assignment only supports DMA-capable
devices which are behind an IOMMU, and we use VFIO-platform for that.
Most of our use cases are accelerators.

In that thread, I was looking into adding support for simpler devices
(such as sensors), but as discussed that won’t be done through
VFIO-platform.

Ignoring Android, as I mentioned, I can work on adding support for
existing upstream platforms (preferably ARM64, that I can get access to)
such as Pixel-6, which should make it easier to test.

Also, we have some interest in adding new features such as run-time
power management.

>
> In case we keep the driver in, I think we need to get a guarantee that
> you or someone else at Google commits to review and test potential
> changes with a perspective to take over its maintenance.

I can’t make guarantees on behalf of Google, but I can contribute to
reviewing/testing/maintenance of the driver as far as I am able to.
If you want, you can add me as reviewer to the driver.

Thanks,
Mostafa

>
> Thanks
>
> Eric
>
> >
> > But maybe instead of having that specific reset handler for VFIO, we
> > can rely on the “shutdown” method already existing in "platform_driver"?
> > I believe that should put the device in a state where it can be re-probed
> > safely. Although not all devices implement that but it seems more generic
> > and scalable.
> >
> > Thanks,
> > Mostafa
> >
> >> Alex
> >>
>