Re: [PATCH 2/7] mm: shmem: avoid setting error on splited entries in shmem_set_folio_swapin_error()

Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> · Sat, 7 Jun 2025 14:20:01 +0800

On 2025/6/6 06:10, Kemeng Shi wrote:
When large entry is splited, the first entry splited from large entry
retains the same entry value and index as original large entry but it's
order is reduced. In shmem_set_folio_swapin_error(), if large entry is
splited before xa_cmpxchg_irq(), we may replace the first splited entry
with error entry while using the size of original large entry for release
operations. This could lead to a WARN_ON(i_blocks) due to incorrect
nr_pages used by shmem_recalc_inode() and could lead to used after free
due to incorrect nr_pages used by swap_free_nr().

I wonder if you have actually triggered this issue? When a large swap 
entry is split, it means the folio is already at order 0, so why would 
the size of the original large entry be used for release operations? Or 
is there another race condition?

Skip setting error if entry spliiting is detected to fix the issue. The
bad entry will be replaced with error entry anyway as we will still get
IO error when we swap in the bad entry at next time.

Fixes: 12885cbe88ddf ("mm: shmem: split large entry if the swapin folio is not large")
Signed-off-by: Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>
---
  mm/shmem.c | 21 +++++++++++++++------
  1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e27d19867e03..f1062910a4de 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2127,16 +2127,25 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
  	struct address_space *mapping = inode->i_mapping;
  	swp_entry_t swapin_error;
  	void *old;
-	int nr_pages;
+	int nr_pages = folio_nr_pages(folio);
+	int order;
  
  	swapin_error = make_poisoned_swp_entry();
-	old = xa_cmpxchg_irq(&mapping->i_pages, index,
-			     swp_to_radix_entry(swap),
-			     swp_to_radix_entry(swapin_error), 0);
-	if (old != swp_to_radix_entry(swap))
+	xa_lock_irq(&mapping->i_pages);
+	order = xa_get_order(&mapping->i_pages, index);
+	if (nr_pages != (1 << order)) {
+		xa_unlock_irq(&mapping->i_pages);
  		return;
+	}
+	old = __xa_cmpxchg(&mapping->i_pages, index,
+			   swp_to_radix_entry(swap),
+			   swp_to_radix_entry(swapin_error), 0);
+	if (old != swp_to_radix_entry(swap)) {
+		xa_unlock_irq(&mapping->i_pages);
+		return;
+	}
+	xa_unlock_irq(&mapping->i_pages);
  
-	nr_pages = folio_nr_pages(folio);
  	folio_wait_writeback(folio);
  	if (!skip_swapcache)
  		delete_from_swap_cache(folio);