On Sat, Jun 14, 2025 at 08:03:20AM -0700, Rob Clark wrote:
> On Sat, Jun 14, 2025 at 3:39 AM Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
> >
> > On Fri, Jun 13, 2025 at 04:57:03PM -0700, Rob Clark wrote:
> > > For UNMAP/REMAP steps we could be needing to lock objects that are not
> > > explicitly listed in the VM_BIND ioctl in order to tear down unmapped
> > > VAs.  These helpers handle locking/preparing the needed objects.
> >
> > Yes, that's a common use-case. I think drivers typically iterate through
> > their drm_gpuva_ops to lock those objects.
> >
> > I had a look at your link [1] and it seems that you keep a list of ops as
> > well by calling vm_op_enqueue() with a new struct msm_vm_op from the
> > callbacks.
> >
> > Please note that for exactly this case there is the op_alloc callback in
> > struct drm_gpuvm_ops, such that you can allocate a custom op type (i.e.
> > struct msm_vm_op) that embeds a struct drm_gpuva_op.
>
> I did use drm_gpuvm_sm_xyz_ops_create() in an earlier iteration of my
> VM_BIND series, but it wasn't quite what I was after.  I wanted to
> apply the VM updates immediately to avoid issues with a later
> map/unmap overlapping an earlier map, which
> drm_gpuvm_sm_xyz_ops_create() doesn't really handle.  I'm not even
> sure why this isn't a problem for other drivers unless userspace is
> providing some guarantees.

The drm_gpuva_ops are usually used in a pattern like this:

	vm_bind {
		for_each_vm_bind_operation {
			drm_gpuva_for_each_op {
				// modify drm_gpuvm's interval tree
				// pre-allocate memory
				// lock and prepare objects
			}
		}

		drm_sched_entity_push_job();
	}

	run_job {
		for_each_vm_bind_operation {
			drm_gpuva_for_each_op {
				// modify page tables
			}
		}
	}

	free_job {
		for_each_vm_bind_operation {
			drm_gpuva_for_each_op {
				// free page table structures, if any
				// free unused pre-allocated memory
			}
		}
	}

What did you do instead to get map/unmap overlapping? Even more
interesting, what are you doing now?
> Once I realized I only wanted to defer the
> application of the pgtable changes, but keep all the
> locking/allocation/etc in the synchronous part of the ioctl,
> vm_op_enqueue() was the natural solution.

But vm_op_enqueue() creates exactly this list of operations you would get
from drm_gpuvm_sm_{map,unmap}_ops_create(), just manually, no?

<snip>

> > > Note that these functions do not strictly require the VM changes to be
> > > applied before the next drm_gpuvm_sm_map_lock()/_unmap_lock() call.  In
> > > the case that VM changes from an earlier drm_gpuvm_sm_map()/_unmap()
> > > call result in a differing sequence of steps when the VM changes are
> > > actually applied, it will be the same set of GEM objects involved, so
> > > the locking is still correct.
> >
> > I'm not sure about this part, how can we be sure that's the case?
>
> I could be not imaginative enough here, so it is certainly worth a
> second opinion.  And why I explicitly called it out in the commit msg.
> But my reasoning is that any new op in the second pass that actually
> applies the VM updates which results from overlapping with a previous
> update in the current VM_BIND will only involve GEM objects from that
> earlier update, which are already locked.

Yeah, it's probably fine, since, as you say, the only additional object
can be the req_obj from the previous iteration.