On 09/05/2025 17:04, Catalin Marinas wrote: > On Fri, May 09, 2025 at 02:49:05PM +0100, Will Deacon wrote: >> On Tue, May 06, 2025 at 03:52:59PM +0100, Ryan Roberts wrote: >>> On 06/05/2025 15:25, Will Deacon wrote: >>>> This penalises large homogeneous systems and it feels unnecessary given >>>> that we have the ability to check this per-CPU. Can you use >>>> ARM64_CPUCAP_BOOT_CPU_FEATURE instead of ARM64_CPUCAP_SYSTEM_FEATURE >>>> to solve this? >>> >>> We are trying to solve for the case where the boot CPU has BBML2 but a secondary >>> CPU doesn't. (e.g. hetrogeneous system where boot CPU is big and secondary is >>> little and does not advertise the feature. I can't remember if we proved there >>> are real systems with this config - I have vague recollection that we did but my >>> memory is poor...). >>> >>> My understanding is that for ARM64_CPUCAP_BOOT_CPU_FEATURE, "If the boot CPU >>> has enabled this feature already, then every late CPU must have it". So that >>> would exclude any secondary CPUs without BBML2 from coming online? >> >> Damn, yes, you're right. However, it still feels horribly hacky to iterate >> over the online CPUs in has_bbml2_noabort() -- the cpufeature framework >> has the ability to query features locally and we should be able to use >> that. We're going to want that should the architecture eventually decide >> on something like BBML3 for this. >> >> What we have with BBML2_NOABORT seems similar to an hwcap in that we only >> support the capability if all CPUs have it (rejecting late CPUs without it >> in that case) but we can live without it if not all of the early CPUs >> have it. Unlikely hwcaps, though, we shouldn't be advertising this to >> userspace and we can't derive the capability solely from the sanitised >> system registers. >> >> I wonder if we could treat it like an erratum in some way instead? That >> is, invert things so that CPUs which _don't_ have BBML2_NOABORT are >> considered to have a "BBM_CONFLICT_ABORT" erratum (which we obviously >> wouldn't shout about). Then we should be able to say: >> >> - If any of the early CPUs don't have BBML2_NOABORT, then the erratum >> would be enabled and we wouln't elide BBM. >> >> - If a late CPU doesn't have BBML2_NOABORT then it can't come online >> if the erratum isn't already enabled. >> >> Does that work? If not, then perhaps the cpufeature/cpuerrata code needs >> some surgery for this. > > Ah, I should have read this thread in order. I think we can treat this > as BBML2_NOABORT available as default based on ID regs and use the > allow/deny-list as an erratum. > Just to make sure I've understood all this, I think what you are both saying is we can create a single capability called ARM64_HAS_NO_BBML2_NOABORT of type ARM64_CPUCAP_LOCAL_CPU_ERRATUM. Each CPU will then check it has BBML2 and is in the MIDR allow list; If any of those conditions are not met, the CPU is considered to have ARM64_HAS_NO_BBML2_NOABORT. Then we have this helper: static inline bool system_supports_bbml2_noabort(void) { return system_capabilities_finalized() && !alternative_has_cap_unlikely(ARM64_HAS_NO_BBML2_NOABORT); } system_capabilities_finalized() is there to ensure an early call to this helper returns false (i.e. the safe value before we have evaluated on all CPUs). Because ARM64_HAS_NO_BBML2_NOABORT is inverted it would otherwise return true prior to finalization. I don't believe we need any second (SYSTEM or BOOT) feature. This is sufficient on its own? Thanks, Ryan