On Apr 21, 2025 / 18:18, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> With the new large sector size support, it's now the case that
> set_blocksize can change i_blksize and the folio order in a manner that
> conflicts with a concurrent reader and causes a kernel crash.
>
> Specifically, let's say that udev-worker calls libblkid to detect the
> labels on a block device. The read call can create an order-0 folio to
> read the first 4096 bytes from the disk. But then udev is preempted.
>
> Next, someone tries to mount an 8k-sectorsize filesystem from the same
> block device. The filesystem calls set_blksize, which sets i_blksize to
> 8192 and the minimum folio order to 1.
>
> Now udev resumes, still holding the order-0 folio it allocated. It then
> tries to schedule a read bio and do_mpage_readahead tries to create
> bufferheads for the folio. Unfortunately, blocks_per_folio == 0 because
> the page size is 4096 but the blocksize is 8192 so no bufferheads are
> attached and the bh walk never sets bdev. We then submit the bio with a
> NULL block device and crash.
>
> Therefore, truncate the page cache after flushing but before updating
> i_blksize. However, that's not enough -- we also need to lock out file
> IO and page faults during the update. Take both the i_rwsem and the
> invalidate_lock in exclusive mode for invalidations, and in shared mode
> for read/write operations.
>
> I don't know if this is the correct fix, but xfs/259 found it.
>
> Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>

Thanks. I confirmed that this patch avoids the hang reproduced by the new
blktests test case [1]. I also ran the whole blktests suite and observed no
regressions.

Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>

[1] https://lore.kernel.org/linux-block/20250418075431.1851353-1-shinichiro.kawasaki@xxxxxxx/
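
For readers following the commit message, the arithmetic behind the crash can be sketched in userspace C. This is only an illustrative model under stated assumptions, not the kernel code: struct fake_bdev, blocks_per_folio() and walk_blocks() are hypothetical names invented here, and the loop is a much simplified stand-in for the per-block walk the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's block device structure. */
struct fake_bdev {
	int id;
};

/* Number of whole blocks that fit in a folio of the given size. */
static unsigned int blocks_per_folio(size_t folio_size, unsigned int blkbits)
{
	assert(blkbits < 8 * sizeof(size_t));
	return (unsigned int)(folio_size >> blkbits);
}

/*
 * Simplified model of the per-block walk: bdev is only assigned inside
 * the loop, so a folio smaller than one block leaves it NULL.
 */
static const struct fake_bdev *walk_blocks(const struct fake_bdev *dev,
					   size_t folio_size,
					   unsigned int blkbits)
{
	const struct fake_bdev *bdev = NULL;
	unsigned int nr = blocks_per_folio(folio_size, blkbits);

	for (unsigned int i = 0; i < nr; i++)
		bdev = dev;	/* assumed: real code derives this per block */

	return bdev;
}
```

With a 4096-byte folio and an 8192-byte block size (blkbits == 13), blocks_per_folio() returns 0 and walk_blocks() returns NULL, which corresponds to the NULL block device in the description. With an 8192-byte folio the loop runs once and the device is set.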