On 8/13/25 4:50 PM, Danilo Krummrich wrote: > On Thu Aug 14, 2025 at 1:28 AM CEST, John Hubbard wrote: >> NovaCore 0000:c1:00.0: GPU instance built >> NovaCore 0000:c1:00.1: Probe Nova Core GPU driver. >> NovaCore 0000:c1:00.1: enabling device (0000 -> 0002) >> NovaCore 0000:c1:00.1: probe with driver NovaCore failed with error -22 >> ... >> Bad IO access at port 0x0 () >> WARNING: CPU: 26 PID: 748 at lib/iomap.c:45 pci_iounmap+0x3f/0x50 >> ... >> <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x2c/0x70 [nova_core] >> devres_release_all+0xa8/0xf0 >> really_probe+0x30f/0x420 >> __driver_probe_device+0x77/0xf0 >> driver_probe_device+0x22/0x1b0 >> __driver_attach+0x118/0x250 >> bus_for_each_dev+0x105/0x130 >> bus_add_driver+0x163/0x2a0 >> driver_register+0x5d/0xf0 >> init_module+0x6d/0x1000 [nova_core] >> do_one_initcall+0xde/0x380 >> do_init_module+0x60/0x250 >> >> ...and then: >> BUG: kernel NULL pointer dereference, address: 0000000000000538 >> RIP: 0010:pci_release_region+0x10/0x60 >> ... >> <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x36/0x70 [nova_core] >> devres_release_all+0xa8/0xf0 >> really_probe+0x30f/0x420 >> __driver_probe_device+0x77/0xf0 >> driver_probe_device+0x22/0x1b0 >> __driver_attach+0x118/0x250 >> bus_for_each_dev+0x105/0x130 >> bus_add_driver+0x163/0x2a0 >> driver_register+0x5d/0xf0 >> init_module+0x6d/0x1000 [nova_core] >> do_one_initcall+0xde/0x380 >> do_init_module+0x60/0x250 > > This is caused by a bug in Devres, which I already fixed in [1]. Nice timing! btw, I scanned through rust-for-linux emails, and wasn't immediately able to connect it ("fix leak...") with this crash. You might want to add a little note to that commit description, to mention that it also prevents a kernel crash in some cases? > > With the patch in [1] nova-core should gracefully fail probing for the > non-supported device classes as expected. Beautiful. > > However, I think we still want to filter by PCI class, so the patch is fine in > general. :) > > Few comments below. > > [1] https://lore.kernel.org/lkml/20250812130928.11075-1-dakr@xxxxxxxxxx/ >> >> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx> >> --- >> drivers/gpu/nova-core/driver.rs | 13 +++++++++++++ >> rust/kernel/pci.rs | 6 ++++++ >> 2 files changed, 19 insertions(+) >> >> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs >> index 274989ea1fb4..4e0e6f5338e9 100644 >> --- a/drivers/gpu/nova-core/driver.rs >> +++ b/drivers/gpu/nova-core/driver.rs >> @@ -31,6 +31,19 @@ impl pci::Driver for NovaCore { >> fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>> { >> dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n"); >> >> + let class_code = pdev.class(); >> + >> + if class_code != bindings::PCI_CLASS_DISPLAY_VGA >> + && class_code != bindings::PCI_CLASS_DISPLAY_3D > > I think it would be nice if we could provide a Rust enum for PCI classes, such > that this could be pci::Class::DISPLAY_VGA instead. > > Of course the same is true for PCI (sub)vendor, (sub)device IDs. > OK, will do. >> + { >> + dev_dbg!( >> + pdev.as_ref(), >> + "Skipping non-display NVIDIA device with class 0x{:04x}\n", >> + class_code >> + ); >> + return Err(kernel::error::code::ENODEV); > > With the prelude included you should be able to use ENODEV directly. Yes. I should have noticed that, sorry. > >> + } >> + >> pdev.enable_device_mem()?; >> pdev.set_master(); >> >> diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs > > Please split the PCI part up into a separate patch. OK, will do. Thanks for the review! thanks, -- John Hubbard > >> index 887ee611b553..b6416fe7bdfd 100644 >> --- a/rust/kernel/pci.rs >> +++ b/rust/kernel/pci.rs >> @@ -399,6 +399,12 @@ pub fn device_id(&self) -> u16 { >> unsafe { (*self.as_raw()).device } >> } >> >> + /// Returns the PCI class code (class and subclass). >> + pub fn class(&self) -> u32 { >> + // SAFETY: `self.as_raw` is a valid pointer to a `struct pci_dev`. >> + unsafe { (*self.as_raw()).class >> 8 } >> + } >> + >> /// Returns the size of the given PCI bar resource. >> pub fn resource_len(&self, bar: u32) -> Result<bindings::resource_size_t> { >> if !Bar::index_is_valid(bar) { >> >> base-commit: dfc0f6373094dd88e1eaf76c44f2ff01b65db851 >> -- >> 2.50.1 >