On 11.07.25 12:08, Simona Vetter wrote: > On Fri, Jul 11, 2025 at 11:35:17AM +0200, Thomas Zimmermann wrote: >> This reverts commit 5307dce878d4126e1b375587318955bd019c3741. >> >> We're going to revert the dma-buf handle back to separating dma_buf >> and import_attach->dmabuf in struct drm_gem_object. Hence revert this >> fix for it. > > I think we should add my reasons from the private thread here why I think > this is conceptually wrong: > > handle_count is an uapi reference, and should have nothing to do with the > lifetime and consistency of the underlying gem_bo. The problem is that we tied the lifetime of the DMA-buf reference to the handle count and I think that is not 100% clean. The reason why that was done is to break the circle dependency GEM obj -> DMA-buf -> GEM obj, but what potentially should actually happen is that we distinct between a structure reference and an use count. E.g. similar to what is done with mm_struct and mmgrab()/mmdrop() vs mmget()/mmput(). > And for imported bo the > link to the dma-buf really should be invariant, and hence > drm_gem_object_get/put() enough. The fact that this patch seems to have > helped at least in some cases indicates that our assumption that we can > replace gem_bo->import_attach.dmabuf with gem_bo->dma_buf was wrong, > because pretty obviously the latter can become NULL while the gem_bo is > still alive. Which means this was conceptually wrong and at best helped > hide a race condition somewhere. > > This means that unlike the claim in the reverted commit that 1a148af06000 > ("drm/gem-shmem: Use dma_buf from GEM object instance") started triggering > an existing condition the much more likely explanation is that it > introduced the regression itself. And hence we need to revert this entire > chain of commits. The existing condition is still a problem I think. We ran into issues with that multiple times already. Just imagine the following scenario: 1. GEM obj is exported, DMA-buf file descriptor created 2. GEM obj is used in a FB. 3. GEM obj is closed, handle_count goes from 1->0, DMA-buf reference is dropped, but file descriptor remains open, obj->dma_buf set to NULL 4. Userspace calls DRM_IOCTL_MODE_GETFB2, handle count goes 0->1 again, but obj->dma_buf is still NULL! 5. GEM obj is exported again, second DMA-buf is created. The first time I stumbled over that it took me a week to figure out why we can have two DMA-bufs for the same GEM obj. Especially you can trigger the "WARN_ON(obj->dma_buf != dma_buf);" in drm_gem_prime_fd_to_handle() with this. For my particular use case it was just a broken unit test, but it allows userspace to mess up the kernel objects quite a bit and that is really not good. Regards, Christian. > > I'll also add all the Fixes: lines as needed when merging these to > drm-fixes, since some of the patches reverted in this series have landed > in 6.15 already. > > I plan to merge them all to drm-fixes once intel-gfx-ci has approved it > all. > > Thanks, Sima > >> Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx> >> --- >> drivers/gpu/drm/drm_gem.c | 44 ++------------------ >> drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 ++++--- >> drivers/gpu/drm/drm_internal.h | 2 - >> 3 files changed, 11 insertions(+), 51 deletions(-) >> >> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c >> index 3a99e4a5d303..db44c40e307f 100644 >> --- a/drivers/gpu/drm/drm_gem.c >> +++ b/drivers/gpu/drm/drm_gem.c >> @@ -213,35 +213,6 @@ void drm_gem_private_object_fini(struct drm_gem_object *obj) >> } >> EXPORT_SYMBOL(drm_gem_private_object_fini); >> >> -static void drm_gem_object_handle_get(struct drm_gem_object *obj) >> -{ >> - struct drm_device *dev = obj->dev; >> - >> - drm_WARN_ON(dev, !mutex_is_locked(&dev->object_name_lock)); >> - >> - if (obj->handle_count++ == 0) >> - drm_gem_object_get(obj); >> -} >> - >> -/** >> - * drm_gem_object_handle_get_unlocked - acquire reference on user-space handles >> - * @obj: GEM object >> - * >> - * Acquires a reference on the GEM buffer object's handle. Required >> - * to keep the GEM object alive. Call drm_gem_object_handle_put_unlocked() >> - * to release the reference. >> - */ >> -void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj) >> -{ >> - struct drm_device *dev = obj->dev; >> - >> - guard(mutex)(&dev->object_name_lock); >> - >> - drm_WARN_ON(dev, !obj->handle_count); /* first ref taken in create-tail helper */ >> - drm_gem_object_handle_get(obj); >> -} >> -EXPORT_SYMBOL(drm_gem_object_handle_get_unlocked); >> - >> /** >> * drm_gem_object_handle_free - release resources bound to userspace handles >> * @obj: GEM object to clean up. >> @@ -272,14 +243,8 @@ static void drm_gem_object_exported_dma_buf_free(struct drm_gem_object *obj) >> } >> } >> >> -/** >> - * drm_gem_object_handle_put_unlocked - releases reference on user-space handles >> - * @obj: GEM object >> - * >> - * Releases a reference on the GEM buffer object's handle. Possibly releases >> - * the GEM buffer object and associated dma-buf objects. >> - */ >> -void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj) >> +static void >> +drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj) >> { >> struct drm_device *dev = obj->dev; >> bool final = false; >> @@ -304,7 +269,6 @@ void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj) >> if (final) >> drm_gem_object_put(obj); >> } >> -EXPORT_SYMBOL(drm_gem_object_handle_put_unlocked); >> >> /* >> * Called at device or object close to release the file's >> @@ -429,8 +393,8 @@ drm_gem_handle_create_tail(struct drm_file *file_priv, >> int ret; >> >> WARN_ON(!mutex_is_locked(&dev->object_name_lock)); >> - >> - drm_gem_object_handle_get(obj); >> + if (obj->handle_count++ == 0) >> + drm_gem_object_get(obj); >> >> /* >> * Get the user-visible handle using idr. Preload and perform >> diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c >> index c60d0044d036..618ce725cd75 100644 >> --- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c >> +++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c >> @@ -100,7 +100,7 @@ void drm_gem_fb_destroy(struct drm_framebuffer *fb) >> unsigned int i; >> >> for (i = 0; i < fb->format->num_planes; i++) >> - drm_gem_object_handle_put_unlocked(fb->obj[i]); >> + drm_gem_object_put(fb->obj[i]); >> >> drm_framebuffer_cleanup(fb); >> kfree(fb); >> @@ -183,10 +183,8 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev, >> if (!objs[i]) { >> drm_dbg_kms(dev, "Failed to lookup GEM object\n"); >> ret = -ENOENT; >> - goto err_gem_object_handle_put_unlocked; >> + goto err_gem_object_put; >> } >> - drm_gem_object_handle_get_unlocked(objs[i]); >> - drm_gem_object_put(objs[i]); >> >> min_size = (height - 1) * mode_cmd->pitches[i] >> + drm_format_info_min_pitch(info, i, width) >> @@ -196,22 +194,22 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev, >> drm_dbg_kms(dev, >> "GEM object size (%zu) smaller than minimum size (%u) for plane %d\n", >> objs[i]->size, min_size, i); >> - drm_gem_object_handle_put_unlocked(objs[i]); >> + drm_gem_object_put(objs[i]); >> ret = -EINVAL; >> - goto err_gem_object_handle_put_unlocked; >> + goto err_gem_object_put; >> } >> } >> >> ret = drm_gem_fb_init(dev, fb, mode_cmd, objs, i, funcs); >> if (ret) >> - goto err_gem_object_handle_put_unlocked; >> + goto err_gem_object_put; >> >> return 0; >> >> -err_gem_object_handle_put_unlocked: >> +err_gem_object_put: >> while (i > 0) { >> --i; >> - drm_gem_object_handle_put_unlocked(objs[i]); >> + drm_gem_object_put(objs[i]); >> } >> return ret; >> } >> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h >> index f921cc73f8b8..9078504e789c 100644 >> --- a/drivers/gpu/drm/drm_internal.h >> +++ b/drivers/gpu/drm/drm_internal.h >> @@ -161,8 +161,6 @@ void drm_sysfs_lease_event(struct drm_device *dev); >> >> /* drm_gem.c */ >> int drm_gem_init(struct drm_device *dev); >> -void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj); >> -void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj); >> int drm_gem_handle_create_tail(struct drm_file *file_priv, >> struct drm_gem_object *obj, >> u32 *handlep); >> -- >> 2.50.0 >> >