Re: [PATCH 1/2] block: fix race between set_blocksize and read paths

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Fri, 18 Apr 2025 10:56:55 -0700

On Fri, Apr 18, 2025 at 09:02:34AM -0700, Darrick J. Wong wrote:
> On Fri, Apr 18, 2025 at 08:54:58AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > With the new large sector size support, it's now the case that
> > set_blocksize can change i_blksize and the folio order in a manner that
> > conflicts with a concurrent reader and causes a kernel crash.
> > 
> > Specifically, let's say that udev-worker calls libblkid to detect the
> > labels on a block device.  The read call can create an order-0 folio to
> > read the first 4096 bytes from the disk.  But then udev is preempted.
> > 
> > Next, someone tries to mount an 8k-sectorsize filesystem from the same
> > block device.  The filesystem calls set_blocksize, which sets i_blksize to
> > 8192 and the minimum folio order to 1.
> > 
> > Now udev resumes, still holding the order-0 folio it allocated.  It then
> > tries to schedule a read bio and do_mpage_readpage tries to create
> > bufferheads for the folio.  Unfortunately, blocks_per_folio == 0 because
> > the page size is 4096 but the blocksize is 8192 so no bufferheads are
> > attached and the bh walk never sets bdev.  We then submit the bio with a
> > NULL block device and crash.
> > 
> > Therefore, truncate the page cache after flushing but before updating
> > i_blksize.  However, that's not enough -- we also need to lock out file
> > IO and page faults during the update.  Take both the i_rwsem and the
> > invalidate_lock in exclusive mode for invalidations, and in shared mode
> > for read/write operations.
> > 
> > I don't know if this is the correct fix, but xfs/259 found it.
> > 
> > Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> 
> I think this could also have the tag:
> Fixes: 3c20917120ce61 ("block/bdev: enable large folio support for large logical block sizes")
> 
> Not sure anyone cares about that for a fix for 6.15-rc1 though.

It's a fix, so I'd prefer this goes to v6.15-rcX for sure.
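
For anyone following the arithmetic in the quoted description, here is a
minimal userspace sketch of why the stale order-0 folio ends up with zero
blocks. The helper name and its parameters are made up for illustration and
this is not the kernel code; it only assumes the block count is derived as
the folio size shifted right by the block size bits, with 4096-byte pages
as in the example above.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not a kernel function).
 *
 * folio_order: log2 of the number of pages in the folio
 * page_shift:  log2 of the page size (12 for 4096-byte pages)
 * blkbits:     log2 of the block size (13 for 8192-byte blocks)
 */
static unsigned int blocks_per_folio_sketch(unsigned int folio_order,
					    unsigned int page_shift,
					    unsigned int blkbits)
{
	size_t folio_size = (size_t)1 << (page_shift + folio_order);

	/* A folio smaller than one block shifts down to zero here. */
	return (unsigned int)(folio_size >> blkbits);
}
```

With 4k pages and an 8k block size, blocks_per_folio_sketch(0, 12, 13) is 0,
which is the case where no bufferheads get attached. An order-1 folio,
blocks_per_folio_sketch(1, 12, 13), gives 1.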

  Luis