On Thu, Apr 03, 2025 at 09:14:43AM +0100, Ryan Roberts wrote:
> On 25/03/2025 05:36, Mikołaj Lenczewski wrote:
> > diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> > index 55107d27d3f8..77ed03b30b72 100644
> > --- a/arch/arm64/mm/contpte.c
> > +++ b/arch/arm64/mm/contpte.c
> > @@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
> >  			pte = pte_mkyoung(pte);
> >  	}
> >
> > -	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> > +	if (!system_supports_bbml2_noabort())
> > +		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> >
> >  	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
>
> Despite all the conversation we had about completely eliding the TLBI for the
> BBML2 case, I've continued to be a bit uneasy about it. I had another chat with
> Alex C and we concluded that it is safe, but there could be conceivable
> implementations where it is not performant. Alex suggested doing a TLBI without
> the DSB and I think that's a good idea. So after the __set_ptes(), I suggest
> adding:
>
> 	if (system_supports_bbml2_noabort())
> 		__flush_tlb_range_nosync(mm, start_addr, addr, PAGE_SIZE,
> 					 true, 3);
>
> That will issue the TLBI but won't wait for it to complete. So it should be very
> fast. We are guaranteed correctness immediately. We are guaranteed performance
> after the next DSB (worst-case; next context switch).
>
> Thanks,
> Ryan

Hi Ryan,

Sure, perfectly happy to add that on. Will respin and add a note about
this behaviour to the source code and to the patch / cover letter.

-- 
Kind regards,
Mikołaj Lenczewski