Re: [PATCH] PCI/VGA: Look at all PCI display devices in VGA arbiter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 12, 2025 at 10:12:14PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@xxxxxxx>
> 
> On an A+N mobile system the APU is not being selected by some desktop
> environments for any rendering tasks. This is because the neither GPU
> is being treated as "boot_vga" but that is what some environments use
> to select a GPU [1].

What is "A+N" and "APU"?

I didn't quite follow the second sentence.  I guess you're saying some
userspace environments use the "boot_vga" sysfs file to select a GPU?
And on this A+N system, neither device has the file?

> The VGA arbiter driver only looks at devices that report as PCI display
> VGA class devices. Neither GPU on the system is a display VGA class
> device:
> 
> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1)
> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1)
> 
> So neither device gets the boot_vga sysfs file. The vga_is_boot_device()
> function already has some handling for integrated GPUs by looking at the
> ACPI HID entries, so if all PCI display class devices are looked at it
> actually can detect properly with this device ordering.  However if there
> is a different ordering it could flag the wrong device.
> 
> Modify the VGA arbiter code and matching sysfs file entries to examine all
> PCI display class devices. After every device is added to the arbiter list
> make a pass on all devices and explicitly check whether one is integrated.
> 
> This will cause all GPUs to gain a `boot_vga` file, but the correct device
> (APU in this case) will now show `1` and the incorrect device shows `0`.
> Userspace then picks the right device as well.

What determines whether the APU is the "correct" device?

> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1]
> Acked-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> ---
> RFC->v1:
>  * Add tag
>  * Drop unintended whitespace change
>  * Use vgaarb_dbg instead of vgaarb_info
> ---
>  drivers/pci/pci-sysfs.c |  2 +-
>  drivers/pci/vgaarb.c    | 48 +++++++++++++++++++++++++++--------------
>  include/linux/pci.h     | 15 +++++++++++++
>  3 files changed, 48 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 268c69daa4d57..c314ee1b3f9ac 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
>  	struct device *dev = kobj_to_dev(kobj);
>  	struct pci_dev *pdev = to_pci_dev(dev);
>  
> -	if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev))
> +	if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev))
>  		return a->mode;
>  
>  	return 0;
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 78748e8d2dbae..0eb1274d52169 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -140,6 +140,7 @@ void vga_set_default_device(struct pci_dev *pdev)
>  	if (vga_default == pdev)
>  		return;
>  
> +	vgaarb_dbg(&pdev->dev, "selecting as VGA boot device\n");

I guess this is essentially a move of the vgaarb_info() message from
vga_arbiter_add_pci_device() to here?  If so, I think I would preserve
the _info() level.  Including non-VGA devices is fairly subtle and I
don't think we need to discard potentially useful information about
what we're doing.

>  	pci_dev_put(vga_default);
>  	vga_default = pci_dev_get(pdev);
>  }
> @@ -751,6 +752,31 @@ static void vga_arbiter_check_bridge_sharing(struct vga_device *vgadev)
>  		vgaarb_info(&vgadev->pdev->dev, "no bridge control possible\n");
>  }
>  
> +static
> +void vga_arbiter_select_default_device(void)

Signature on one line.

> +{
> +	struct pci_dev *candidate = vga_default_device();
> +	struct vga_device *vgadev;
> +
> +	list_for_each_entry(vgadev, &vga_list, list) {
> +		if (vga_is_boot_device(vgadev)) {
> +			/* check if one is an integrated GPU, use that if so */
> +			if (candidate) {
> +				if (vga_arb_integrated_gpu(&candidate->dev))
> +					break;
> +				if (vga_arb_integrated_gpu(&vgadev->pdev->dev)) {
> +					candidate = vgadev->pdev;
> +					break;
> +				}
> +			} else
> +				candidate = vgadev->pdev;
> +		}
> +	}
> +
> +	if (candidate)
> +		vga_set_default_device(candidate);
> +}

It looks like this is related to the integrated GPU code in
vga_is_boot_device().  Can this be done by updating the logic there,
so it's more clear what change this patch makes?

It seems like this patch would change the selection in a way that
makes some of the vga_is_boot_device() comments obsolete.  E.g., "We
select the default VGA device in this order"?  Or "we use the *last*
[integrated GPU] because that was the previous behavior"?

The end of vga_is_boot_device() mentions non-legacy (non-VGA) devices,
but I don't remember now how we ever got there because
vga_arb_device_init() and pci_notify() only call
vga_arbiter_add_pci_device() for VGA devices.

>  /*
>   * Currently, we assume that the "initial" setup of the system is not sane,
>   * that is, we come up with conflicting devices and let the arbiter's
> @@ -816,23 +842,17 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
>  		bus = bus->parent;
>  	}
>  
> -	if (vga_is_boot_device(vgadev)) {
> -		vgaarb_info(&pdev->dev, "setting as boot VGA device%s\n",
> -			    vga_default_device() ?
> -			    " (overriding previous)" : "");
> -		vga_set_default_device(pdev);
> -	}
> -
>  	vga_arbiter_check_bridge_sharing(vgadev);
>  
>  	/* Add to the list */
>  	list_add_tail(&vgadev->list, &vga_list);
>  	vga_count++;
> -	vgaarb_info(&pdev->dev, "VGA device added: decodes=%s,owns=%s,locks=%s\n",
> +	vgaarb_dbg(&pdev->dev, "VGA device added: decodes=%s,owns=%s,locks=%s\n",

Looks like an unrelated change.

>  		vga_iostate_to_str(vgadev->decodes),
>  		vga_iostate_to_str(vgadev->owns),
>  		vga_iostate_to_str(vgadev->locks));
>  
> +	vga_arbiter_select_default_device();
>  	spin_unlock_irqrestore(&vga_lock, flags);
>  	return true;
>  fail:
> @@ -1499,8 +1519,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>  
>  	vgaarb_dbg(dev, "%s\n", __func__);
>  
> -	/* Only deal with VGA class devices */
> -	if (!pci_is_vga(pdev))
> +	/* Only deal with display devices */
> +	if (!pci_is_display(pdev))
>  		return 0;

Are there any other pci_is_vga() users that might want
pci_is_display() instead?  virtio_gpu_pci_quirk()?

>  	/*
> @@ -1548,12 +1568,8 @@ static int __init vga_arb_device_init(void)
>  
>  	/* Add all VGA class PCI devices by default */
>  	pdev = NULL;
> -	while ((pdev =
> -		pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> -			       PCI_ANY_ID, pdev)) != NULL) {
> -		if (pci_is_vga(pdev))
> -			vga_arbiter_add_pci_device(pdev);
> -	}
> +	while ((pdev = pci_get_base_class(PCI_BASE_CLASS_DISPLAY, pdev)))
> +		vga_arbiter_add_pci_device(pdev);

I guess pci_get_base_class(PCI_BASE_CLASS_DISPLAY) is sort of a source
code optimization and this one-line change would be equivalent?

  -		if (pci_is_vga(pdev))
  +		if (pci_is_display(pdev))
  			vga_arbiter_add_pci_device(pdev);

If so, I think I prefer the one-liner because then everything in this
file would use pci_is_display() and we wouldn't have to figure out the
equivalent-ness of pci_get_base_class(PCI_BASE_CLASS_DISPLAY).

>  	pr_info("loaded\n");
>  	return rc;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 05e68f35f3923..e77754e43c629 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -744,6 +744,21 @@ static inline bool pci_is_vga(struct pci_dev *pdev)
>  	return false;
>  }
>  
> +/**
> + * pci_is_display - Check if a PCI device is a display controller
> + * @pdev: Pointer to the PCI device structure
> + *
> + * This function determines whether the given PCI device corresponds
> + * to a display controller. Display controllers are typically used
> + * for graphical output and are identified based on their class code.
> + *
> + * Return: true if the PCI device is a display controller, false otherwise.
> + */
> +static inline bool pci_is_display(struct pci_dev *pdev)
> +{
> +	return (pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY;
> +}

Could use in vga_switcheroo_client_probe_defer(), IS_GFX_DEVICE(),
vfio_pci_is_intel_display(), i915_gfx_present(), get_bound_vga().
Arguable whether it's worth changing them, but I guess it's nice to
test for the same thing the same way.

Bjorn




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux