endless loop in truncate_inode_pages_range()

Hi all,
	I found a task stalled in truncate_inode_pages_range() in our production environment. The kernel version is stable 6.6.66.

	Its call trace is:


	[<0>] find_get_entries+0x74/0x270
	[<0>] truncate_inode_pages_range+0x312/0x550
	[<0>] truncate_pagecache+0x48/0x70
	[<0>] truncate_setsize+0x27/0x60
	[<0>] xfs_setattr_size+0xf5/0x3d0 [xfs]
	[<0>] xfs_vn_setattr_size+0x49/0x90 [xfs]
	[<0>] xfs_vn_setattr+0x7e/0x120 [xfs]
	[<0>] notify_change+0x1ee/0x4d0
	[<0>] do_truncate+0x98/0xf0
	[<0>] do_open+0x329/0x470
	[<0>] path_openat+0x135/0x2d0
	[<0>] do_filp_open+0xaf/0x170
	[<0>] do_sys_openat2+0xb3/0xe0
	[<0>] __x64_sys_openat+0x55/0xa0
	[<0>] x64_sys_call+0x16e4/0x2210
	[<0>] do_syscall_64+0x56/0x90
	[<0>] entry_SYSCALL_64_after_hwframe+0x78/0xe2


	The relevant loop in truncate_inode_pages_range() is:

	...
	index = start;
	while (index < end) {
		cond_resched();
		if (!find_get_entries(mapping, &index, end - 1, &fbatch,
				indices)) {
			/* If all gone from start onwards, we're done */
			if (index == start)
				break;
			/* Otherwise restart to make sure all gone */
			index = start;
			continue;
		}

		for (i = 0; i < folio_batch_count(&fbatch); i++) {
			struct folio *folio = fbatch.folios[i];

			/* We rely upon deletion not changing page->index */

			if (xa_is_value(folio))
				continue;

			folio_lock(folio);
			VM_BUG_ON_FOLIO(!folio_contains(folio, indices[i]), folio);
			folio_wait_writeback(folio);
			truncate_inode_folio(mapping, folio);
			folio_unlock(folio);
		}
		truncate_folio_batch_exceptionals(mapping, &fbatch, indices);
		folio_batch_release(&fbatch);
	}
	...
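
	Note the restart path at the top of the loop: when find_get_entries() returns nothing but
	index != start, the loop resets index to start and rescans. If the inner batch loop can never
	actually remove the entries it finds, that restart repeats forever. Below is a minimal userspace
	model of this behavior; the values are hypothetical (modelled on the vmcore below) and the
	escape hatch exists only so the demo terminates:

	#include <stdbool.h>
	#include <stdio.h>

	#define START       7467136UL		/* hypothetical truncation start */
	#define END         (~0UL)		/* truncate to EOF: "infinite" end */
	#define STALE_FIRST 7467136UL		/* stale entries seen in the vmcore */
	#define STALE_LAST  7467199UL

	/* Model of find_get_entries(): the stale entries are always found
	 * because nothing ever removes them from the mapping. */
	static bool find_entries(unsigned long *index, unsigned long last)
	{
		if (*index <= STALE_LAST && STALE_FIRST <= last) {
			*index = STALE_LAST + 1;	/* advance past the batch */
			return true;
		}
		return false;
	}

	int main(void)
	{
		unsigned long index = START, restarts = 0;

		while (index < END) {
			if (!find_entries(&index, END - 1)) {
				if (index == START)
					break;		/* all gone: done */
				index = START;		/* restart to make sure */
				if (++restarts == 3) {	/* demo-only escape hatch */
					printf("restarted %lu times; the real loop never exits\n",
					       restarts);
					return 1;
				}
				continue;
			}
			/* The batch loop would run here, but it never removes the entries. */
		}
		puts("done");
		return 0;
	}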

	From the vmcore, we found 64 entries in the mapping that all point to the same folio, and that
	folio looks invalid: its mapping field (0xfffffa999cf87190) falls in the vmemmap range, like the
	folio addresses themselves, instead of pointing back at this address_space, and its index
	(18446636480414094976 == 0xffff9e24fbd81680) looks like a kernel pointer rather than a file
	offset. Because folio->mapping no longer matches, truncation never removes these entries from
	the mapping, so every time the index falls back to start, find_get_entries() finds them again
	and the task spins forever in truncate_inode_pages_range().
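
	For reference, truncate_inode_folio() in mm/truncate.c opens with exactly such a mapping
	check (quoted from 6.6; the error-path comment is mine):

	int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
	{
		if (folio->mapping != mapping)
			return -EIO;	/* stale folio: the xarray entry stays behind */

		truncate_cleanup_folio(folio);
		filemap_remove_folio(folio);
		return 0;
	}

	The caller in truncate_inode_pages_range() ignores the return value, so a folio whose mapping
	no longer matches is silently skipped on every pass.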

	Since truncate_inode_pages_range() runs with inode->i_rwsem held, any other operation that
	needs this lock on the inode blocks behind the spinning task.
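
	For example (illustrative snippet, hypothetical path), a second task doing a truncating open
	of the same file hangs right away:

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		/* While the stuck task spins in truncate_inode_pages_range()
		 * holding i_rwsem, this open() blocks on inode_lock() in
		 * do_truncate(), waiting for the same lock. */
		int fd = open("/mnt/xfs/victim-file", O_WRONLY | O_TRUNC);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		puts("never reached while the truncate loop is spinning");
		close(fd);
		return 0;
	}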
	
crash> tree -t xarray -r address_space.i_pages 0xffff9e1176d23eb0  -p

fffffa99adef6000
  index: 7467136  position: root/28/31/2/0
fffffa99adef6000
  index: 7467137  position: root/28/31/2/1
fffffa99adef6000
  index: 7467138  position: root/28/31/2/2
fffffa99adef6000
  index: 7467139  position: root/28/31/2/3
fffffa99adef6000
  index: 7467140  position: root/28/31/2/4
fffffa99adef6000
  index: 7467141  position: root/28/31/2/5
fffffa99adef6000

...

fffffa99adef6000
  index: 7467190  position: root/28/31/2/54
fffffa99adef6000
  index: 7467191  position: root/28/31/2/55
fffffa99adef6000
  index: 7467192  position: root/28/31/2/56
fffffa99adef6000
  index: 7467193  position: root/28/31/2/57
fffffa99adef6000
  index: 7467194  position: root/28/31/2/58
fffffa99adef6000
  index: 7467196  position: root/28/31/2/60
fffffa99adef6000
  index: 7467197  position: root/28/31/2/61
fffffa99adef6000
  index: 7467198  position: root/28/31/2/62
fffffa99adef6000
  index: 7467199  position: root/28/31/2/63


struct folio {
  {
    {
      flags = 24769796876798016,
      {
        lru = {
          next = 0xffff9dfa8024cf00,
          prev = 0xfffffa9997482210
        },
        {
          __filler = 0xffff9dfa8024cf00,
          mlock_count = 2538086928
        }
      },
      mapping = 0xfffffa999cf87190,
      index = 18446636480414094976,
      {
        private = 0x2a001e,
        swap = {
          val = 2752542
        }
      },
      _mapcount = {
        counter = -1
      },
      _refcount = {
        counter = 1
      },
      memcg_data = 0
    },
    page = {
      flags = 24769796876798016,
      {
        {
          {
            lru = {
              next = 0xffff9dfa8024cf00,
              prev = 0xfffffa9997482210
            },
            {
              __filler = 0xffff9dfa8024cf00,
              mlock_count = 2538086928
            },
            buddy_list = {
              next = 0xffff9dfa8024cf00,
              prev = 0xfffffa9997482210
            },
            pcp_list = {
              next = 0xffff9dfa8024cf00,
              prev = 0xfffffa9997482210
            }
          },
          mapping = 0xfffffa999cf87190,
          {
            index = 18446636480414094976,
            share = 18446636480414094976
          },
.....


	Any thoughts about this?

Thanks!
