Re: [PATCH 2/7] mm: shmem: avoid setting error on splited entries in shmem_set_folio_swapin_error()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2025/6/6 06:10, Kemeng Shi wrote:
When large entry is splited, the first entry splited from large entry
retains the same entry value and index as original large entry but it's
order is reduced. In shmem_set_folio_swapin_error(), if large entry is
splited before xa_cmpxchg_irq(), we may replace the first splited entry
with error entry while using the size of original large entry for release
operations. This could lead to a WARN_ON(i_blocks) due to incorrect
nr_pages used by shmem_recalc_inode() and could lead to used after free
due to incorrect nr_pages used by swap_free_nr().

I wonder if you have actually triggered this issue? When a large swap entry is split, it means the folio is already at order 0, so why would the size of the original large entry be used for release operations? Or is there another race condition?

Skip setting error if entry spliiting is detected to fix the issue. The
bad entry will be replaced with error entry anyway as we will still get
IO error when we swap in the bad entry at next time.

Fixes: 12885cbe88ddf ("mm: shmem: split large entry if the swapin folio is not large")
Signed-off-by: Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>
---
  mm/shmem.c | 21 +++++++++++++++------
  1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e27d19867e03..f1062910a4de 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2127,16 +2127,25 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
  	struct address_space *mapping = inode->i_mapping;
  	swp_entry_t swapin_error;
  	void *old;
-	int nr_pages;
+	int nr_pages = folio_nr_pages(folio);
+	int order;
swapin_error = make_poisoned_swp_entry();
-	old = xa_cmpxchg_irq(&mapping->i_pages, index,
-			     swp_to_radix_entry(swap),
-			     swp_to_radix_entry(swapin_error), 0);
-	if (old != swp_to_radix_entry(swap))
+	xa_lock_irq(&mapping->i_pages);
+	order = xa_get_order(&mapping->i_pages, index);
+	if (nr_pages != (1 << order)) {
+		xa_unlock_irq(&mapping->i_pages);
  		return;
+	}
+	old = __xa_cmpxchg(&mapping->i_pages, index,
+			   swp_to_radix_entry(swap),
+			   swp_to_radix_entry(swapin_error), 0);
+	if (old != swp_to_radix_entry(swap)) {
+		xa_unlock_irq(&mapping->i_pages);
+		return;
+	}
+	xa_unlock_irq(&mapping->i_pages);
- nr_pages = folio_nr_pages(folio);
  	folio_wait_writeback(folio);
  	if (!skip_swapcache)
  		delete_from_swap_cache(folio);




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux