>+static int tdx_pamt_add(atomic_t *pamt_refcount, unsigned long hpa,
>+			 struct list_head *pamt_pages)
>+{
>+	u64 err;
>+
>+	hpa = ALIGN_DOWN(hpa, SZ_2M);

Nit: it is better to use SZ_2M or PMD_SIZE consistently. E.g., patch 2
uses PMD_SIZE:

+atomic_t *tdx_get_pamt_refcount(unsigned long hpa)
+{
+	return &pamt_refcounts[hpa / PMD_SIZE];
+}
+EXPORT_SYMBOL_GPL(tdx_get_pamt_refcount);

>+
>+	spin_lock(&pamt_lock);
>+
>+	/* Lost race to other tdx_pamt_add() */
>+	if (atomic_read(pamt_refcount) != 0) {
>+		atomic_inc(pamt_refcount);
>+		spin_unlock(&pamt_lock);
>+		tdx_free_pamt_pages(pamt_pages);
>+		return 0;
>+	}
>+
>+	err = tdh_phymem_pamt_add(hpa | TDX_PS_2M, pamt_pages);
>+
>+	if (err)
>+		tdx_free_pamt_pages(pamt_pages);
>+
>+	/*
>+	 * tdx_hpa_range_not_free() is true if current task won race
>+	 * against tdx_pamt_put().
>+	 */
>+	if (err && !tdx_hpa_range_not_free(err)) {
>+		spin_unlock(&pamt_lock);
>+		pr_tdx_error(TDH_PHYMEM_PAMT_ADD, err);
>+		return -EIO;
>+	}

IIUC, this chunk is needed because tdx_pamt_put() decreases the refcount
without holding the pamt_lock. Why not move that decrease inside the
lock?

I also suggest that all accesses to pamt_refcount be performed with the
pamt_lock held, which would make the code much clearer. It's similar to
how kvm_usage_count is managed: transitions from 0 to 1 or 1 to 0
require extra work, while all other cases simply increment or decrement
the refcount.
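
To illustrate what I mean, here is a rough sketch of the get/put pair
with every pamt_refcount access done under pamt_lock. It is only a
sketch, not a drop-in replacement: tdh_phymem_pamt_remove(),
TDH_PHYMEM_PAMT_REMOVE and the shape of tdx_pamt_put() are my guesses
since the put side is not quoted here, and I used PMD_SIZE per the nit
above. With everything serialized by the lock, the counter could even be
a plain int; I kept atomic_t only to match the quoted signature.

static int tdx_pamt_add(atomic_t *pamt_refcount, unsigned long hpa,
			struct list_head *pamt_pages)
{
	u64 err;
	int ret = 0;

	hpa = ALIGN_DOWN(hpa, PMD_SIZE);

	spin_lock(&pamt_lock);

	/* Not the first user of this 2M range: just take another reference. */
	if (atomic_read(pamt_refcount)) {
		atomic_inc(pamt_refcount);
		goto out_free;
	}

	/* 0 -> 1 transition: back the 2M range with the PAMT pages. */
	err = tdh_phymem_pamt_add(hpa | TDX_PS_2M, pamt_pages);
	if (err) {
		pr_tdx_error(TDH_PHYMEM_PAMT_ADD, err);
		ret = -EIO;
		goto out_free;
	}
	atomic_set(pamt_refcount, 1);

	spin_unlock(&pamt_lock);
	return 0;

out_free:
	spin_unlock(&pamt_lock);
	/* The pre-allocated pages were not consumed by the TDX module. */
	tdx_free_pamt_pages(pamt_pages);
	return ret;
}

static void tdx_pamt_put(atomic_t *pamt_refcount, unsigned long hpa)
{
	LIST_HEAD(pamt_pages);
	u64 err;

	hpa = ALIGN_DOWN(hpa, PMD_SIZE);

	spin_lock(&pamt_lock);

	/* Not the last user: just drop the reference. */
	if (!atomic_dec_and_test(pamt_refcount)) {
		spin_unlock(&pamt_lock);
		return;
	}

	/* 1 -> 0 transition: reclaim the PAMT pages while holding the lock. */
	err = tdh_phymem_pamt_remove(hpa | TDX_PS_2M, &pamt_pages);
	spin_unlock(&pamt_lock);

	if (err)
		pr_tdx_error(TDH_PHYMEM_PAMT_REMOVE, err);
	else
		tdx_free_pamt_pages(&pamt_pages);
}

With the decrement inside the lock, TDH.PHYMEM.PAMT.ADD can no longer
race against a concurrent PAMT remove for the same range, so the
tdx_hpa_range_not_free() special case goes away entirely.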