[no subject]

**Date** **Thread**

> Pending cpuhp callbacks for this CPU look to be blocked while we sleep,
> at the very least.
> Since this only happens during the probing phase, maybe that's not such
> a big deal.

> Is it likely that some late CPUs might be left offline indefinitely?

Offlined and never come back is certainly something that can happen.

> If so, we might end up doing futile work here forever.

It may never probe all the MSC? Yes, that can certainly happen.
But the work only happens when CPUs come online, which is already a major serialising
event. There is no cost to run in this 'not done yet' state forever. It's not retrying on
a timer or something like that.

>> +	list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
>> +		if (!cpumask_test_cpu(cpu, &msc->accessibility))
>> +			continue;
>> +
>> +		mutex_lock(&msc->probe_lock);
>> +		if (!msc->probed)
>> +			err = mpam_msc_hw_probe(msc);
>> +		mutex_unlock(&msc->probe_lock);
>> +
>> +		if (!err)
>> +			new_device_probed = true;
>> +		else
>> +			break; // mpam_broken

> What's the effect of returning a non-zero value to the CPU hotplug
> callback dispatcher here?

I think the dynamically allocated ones can fail without any ill effects, it'll print a
message but nothing else will happen. In this case the callbacks are synchronous with the
attempt to register them, so it propagates the error back there.

> Do we want to tear anything down if MPAM is unusable?

It will keep trying, whereas it could pack up shop completely.

Looks like that '// mpam_broken' is where I intended to schedule the work, but the code
doesn't exist this early in the series. I'll pull bits of that earlier.

>> +	}
>> +	mutex_unlock(&mpam_list_lock);
>> +
>> +	if (new_device_probed && !err)
>> +		schedule_work(&mpam_enable_work);
>> +
>> +	return err;
>> +}
>> +
>> +static int mpam_cpu_offline(unsigned int cpu)
>> +{
>> +	return 0;
>> +}
>> +
>> +static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
>> +					  int (*offline)(unsigned int offline))
>> +{
>> +	mutex_lock(&mpam_cpuhp_state_lock);
>> +	if (mpam_cpuhp_state) {
>> +		cpuhp_remove_state(mpam_cpuhp_state);
>> +		mpam_cpuhp_state = 0;
>> +	}
>> +
>> +	mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mpam:online",
>> +					     online, offline);
>> +	if (mpam_cpuhp_state <= 0) {
>> +		pr_err("Failed to register cpuhp callbacks");

> Should an error code be returned to the caller if this fails?

It can fail asynchronously - so any error handling for this is only solves part of the
problem, a failure here means at least one of the callbacks ran and returned an error.
If we schedule the disable call in the callback then that will take care of tearing the
whole thing down.
The cpuhp callbacks don't get registered until all the driver has found all the MSC that
firmware described, so there is no race with driver:probing an MSC after the disable call
got scheduled which tears the whole thing down.

>> +		mpam_cpuhp_state = 0;
>> +	}
>> +	mutex_unlock(&mpam_cpuhp_state_lock);
>>  }
>>  
>>  static int mpam_dt_count_msc(void)
>> @@ -772,7 +875,7 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
>>  	}
>>  
>>  	if (!err && fw_num_msc == mpam_num_msc)
>> -		mpam_discovery_complete();
>> +		mpam_register_cpuhp_callbacks(&mpam_discovery_cpu_online, NULL);
> 
> Abandon probing the MSC if this fails?

Any error returned here is most likely to be from mpam_discovery_cpu_online(), but that
can also happen asynchronously. Scheduling mpam_disable() is a better approach as it
covers the asynchronous case too.

> (However, the next phase of probing hangs off CPU hotplug, so it just
> won't happen if the callbacks can't be registered -- but it looks like
> MPAM may be left in a half-probed state.  I'm not entirely convinced
> that this matters if the MPAM driver is not unloadable anyway...)

I'm not at all worried about failing to register the cpuhp callbacks. The space needed for
that is pre-allocated, if there isn't enough space, you get a splat from the cpuhp core -
and the callbacks never run. No additional/unnecessary work happens - sure the memory
didn't get free'd, but the WARN() from cpuhp_reserve_state() should be enough to debug this.

> Nit: redundant &
> 
> (You don't have it in the similar call in mpam_enable_once().)

Done,

>>  	if (err && msc)
>>  		mpam_msc_drv_remove(pdev);
>> @@ -795,6 +898,41 @@ static struct platform_driver mpam_msc_driver = {
>>  	.remove = mpam_msc_drv_remove,
>>  };
>>  
>> +static void mpam_enable_once(void)
>> +{
>> +	mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
> 
> Should it be fatal if this fails?

As above. Once case doesn't matter - and any handling here is incomplete.

>> +
>> +	pr_info("MPAM enabled\n");
>> +}
>> +
>> +/*
>> + * Enable mpam once all devices have been probed.
>> + * Scheduled by mpam_discovery_cpu_online() once all devices have been created.
>> + * Also scheduled when new devices are probed when new CPUs come online.
>> + */
>> +void mpam_enable(struct work_struct *work)
>> +{
>> +	static atomic_t once;
> 
> Nit: possibly unnecessary atomic_t?  This is slow-path code, and we
> already have to take mpam_list_lock.  Harmless, though.

mpam_enable_once() can't be called under the lock because of the ordering with cpuhp. This
just ended up being cleaner. Too much is stuffed under that lock already!

>> +	struct mpam_msc *msc;
>> +	bool all_devices_probed = true;
>> +
>> +	/* Have we probed all the hw devices? */
>> +	mutex_lock(&mpam_list_lock);
>> +	list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
>> +		mutex_lock(&msc->probe_lock);
>> +		if (!msc->probed)
>> +			all_devices_probed = false;
>> +		mutex_unlock(&msc->probe_lock);
>> +
>> +		if (!all_devices_probed)
>> +			break;
> 
> WARN()?

Expected condition...

> We counted the MSCs in via the mpam_discovery_cpu_online(), so I think
> we shouldn't get in here if some failed to probe?

No we didn't!
We counted them in mpam_msc_drv_probe() before registering the cpuhp callbacks.
The cpuhp callbacks run asynchronously, each one schedules mpam_enable() iff it probed a
new device. mpam_enable() then has to check if all the devices had been probed.

This is done so that mpam_discovery_cpu_online() only has to take the msc->probe_lock for
MSC that it can access - instead of all of them. That was to avoid having something that
serialises CPUs coming online ... but mpam_discovery_cpu_online() is taking the list_lock
due to an incomplete switch to SRCU. I'll fix that.

>> +	}
>> +	mutex_unlock(&mpam_list_lock);
>> +
>> +	if (all_devices_probed && !atomic_fetch_inc(&once))
>> +		mpam_enable_once();
>> +}

Thanks,

James