QEMU Inter-VM Shared Memory (ivshmem) is designed to share a memory region
between guest and host. The host creates a file and passes it to QEMU, which
presents it to the guest via PCI BAR#2. The guest userspace can map
/sys/bus/pci/devices/0000:01:02.3/resource2(_wc) to use the region without
having a guest driver for the device at all.

The problem with this is that, since it is a PCI resource, the PCI sysfs
reasonably enforces:
- no caching when mapped via "resourceN" (PTE::PCD on x86) or
- write-combining when mapped via "resourceN_wc" (PTE::PWT on x86).

As a result, host writes are seen by the guest immediately (as the region
is just a mapped file) but it takes quite some time for the host to see
non-cached guest writes.

Add a quirk to always map ivshmem's BAR2 as cacheable (==write-back) as
ivshmem is backed by RAM anyway. (Re)use the already defined but unused
IORESOURCE_CACHEABLE flag.

This does not affect other ways of mapping a PCI BAR; a driver can use
memremap() for this functionality.

Signed-off-by: Alexey Kardashevskiy <aik@xxxxxxx>
---

What is this IORESOURCE_CACHEABLE for actually?

Anyway, the alternatives are:

1. add a new node in sysfs - "resourceN_wb" - for mapping as writeback
but this requires changing existing (and likely old) userspace tools;

2. fix the kernel to strictly follow /proc/mtrr (now it is rather
a recommendation) but Documentation/arch/x86/mtrr.rst says it is replaced
with PAT which does not seem to allow overriding caching for specific
devices (==MMIO ranges).
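
For reference, the guest userspace mapping described in the commit message
can be sketched roughly as below. This is only an illustration, not code from
any existing tool: map_bar() and unmap_bar() are hypothetical helper names,
and the sketch assumes the sysfs "resourceN" file reports the BAR size via
fstat() and accepts a MAP_SHARED mapping at offset 0.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a whole PCI BAR exposed through sysfs (e.g. ".../resource2") into
 * the calling process. Hypothetical helper: returns the mapping and stores
 * its length in *len, or returns NULL on any failure.
 */
static void *map_bar(const char *path, size_t *len)
{
	struct stat st;
	void *p;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	/* Assumption: the file size equals the size of the BAR. */
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
		 MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	if (p == MAP_FAILED)
		return NULL;

	*len = (size_t)st.st_size;
	return p;
}

/* Drop a mapping created by map_bar(); returns 0 on success. */
static int unmap_bar(void *p, size_t len)
{
	return munmap(p, len);
}
```

A caller in the guest would pass the path from the commit message, for
example map_bar("/sys/bus/pci/devices/0000:01:02.3/resource2", &len), and
then read and write the returned region directly. Which caching attribute
the resulting PTEs get is decided by the kernel, which is what this patch
changes for ivshmem's BAR2.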
---
 drivers/pci/mmap.c   | 6 +++++-
 drivers/pci/quirks.c | 8 ++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/mmap.c b/drivers/pci/mmap.c
index 8da3347a95c4..8495bee08fae 100644
--- a/drivers/pci/mmap.c
+++ b/drivers/pci/mmap.c
@@ -35,6 +35,6 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
 	if (write_combine)
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-	else
+	else if (!(pci_resource_flags(pdev, bar) & IORESOURCE_CACHEABLE))
 		vma->vm_page_prot = pgprot_device(vma->vm_page_prot);
 
 	if (mmap_state == pci_mmap_io) {
@@ -46,6 +46,11 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
 
 	vma->vm_ops = &pci_phys_vm_ops;
 
+	if (pci_resource_flags(pdev, bar) & IORESOURCE_CACHEABLE)
+		return remap_pfn_range_notrack(vma, vma->vm_start, vma->vm_pgoff,
+					       vma->vm_end - vma->vm_start,
+					       vma->vm_page_prot);
+
 	return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 				  vma->vm_end - vma->vm_start,
 				  vma->vm_page_prot);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d7f4ee634263..858869ec6612 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6335,3 +6335,11 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
 #endif
+
+static void pci_ivshmem_writeback(struct pci_dev *dev)
+{
+	struct resource *r = &dev->resource[2];
+
+	r->flags |= IORESOURCE_CACHEABLE;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT_QUMRANET, 0x1110, pci_ivshmem_writeback);
-- 
2.49.0