Re: [PATCH RFC] fs/buffer: fix use-after-free when call bh_read() helper

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Mon, 11 Aug 2025 16:21:26 +0100

On Mon, Aug 11, 2025 at 10:18:30PM +0800, Ye Bin wrote:
> From: Ye Bin <yebin10@xxxxxxxxxx>
> 
> There's issue as follows:
> BUG: KASAN: stack-out-of-bounds in end_buffer_read_sync+0xe3/0x110
> Read of size 8 at addr ffffc9000168f7f8 by task swapper/3/0
> CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Not tainted 6.16.0-862.14.0.6.x86_64
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> Call Trace:
>  <IRQ>
>  dump_stack_lvl+0x55/0x70
>  print_address_description.constprop.0+0x2c/0x390
>  print_report+0xb4/0x270
>  kasan_report+0xb8/0xf0
>  end_buffer_read_sync+0xe3/0x110
>  end_bio_bh_io_sync+0x56/0x80
>  blk_update_request+0x30a/0x720
>  scsi_end_request+0x51/0x2b0
>  scsi_io_completion+0xe3/0x480
>  ? scsi_device_unbusy+0x11e/0x160
>  blk_complete_reqs+0x7b/0x90
>  handle_softirqs+0xef/0x370
>  irq_exit_rcu+0xa5/0xd0
>  sysvec_apic_timer_interrupt+0x6e/0x90
>  </IRQ>
> 
>  Above issue happens when do ntfs3 filesystem mount, issue may happens
>  as follows:
>            mount                            IRQ
> ntfs_fill_super
>   read_cache_page
>     do_read_cache_folio
>       filemap_read_folio
>         mpage_read_folio
> 	 do_mpage_readpage
> 	  ntfs_get_block_vbo
> 	   bh_read
> 	     submit_bh
> 	     wait_on_buffer(bh);
> 	                            blk_complete_reqs
> 				     scsi_io_completion
> 				      scsi_end_request
> 				       blk_update_request
> 				        end_bio_bh_io_sync
> 					 end_buffer_read_sync
> 					  __end_buffer_read_notouch
> 					   unlock_buffer
> 
>             wait_on_buffer(bh);--> return will return to caller
> 
> 					  put_bh
> 					    --> trigger stack-out-of-bounds
> In the mpage_read_folio() function, the stack variable 'map_bh' is
> passed to ntfs_get_block_vbo(). Once unlock_buffer() unlocks and
> wait_on_buffer() returns to continue processing, the stack variable
> is likely to be reclaimed. Consequently, during the end_buffer_read_sync()
> process, calling put_bh() may result in stack overrun.

All good to here.

> If it is not a stack variable, since the reference count of the
> buffer_head is released after unlocking, it cannot be released during
> drop_buffers. This poses a risk of buffer_head leakage.
> To solve above issue first call put_bh() before unlock_buffer. This
> should be safe because during the release, discard_buffer() will call
> lock_buffer().

I find this part of the explanation hard to follow and I thought there
was a mistake here.  However after tracing through what would happen in
gfs2_metapath_ra() and __ext4_sb_bread_gfp(), I don't see a problem.

So here's a replacement paragraph:

If the bh is not allocated on the stack, it belongs to a folio.  Freeing a
buffer head which belongs to a folio is done by drop_buffers() which
will fail to free buffers which are still locked.  So it is safe to call
put_bh() before __end_buffer_read_notouch().

Reviewed-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>

I'll also note that we have a bunch of weird corruptions with ntfs3
and this might explain them.  Also this isn't really an ntfs3 bug,
but ntfs3 might be the only in-tree filesystem which happens to use
map_bh like this.  It's bad for performance to do it this way, but
if all you're trying to do is get a working filesystem, this is a
simple way to do things.

> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
> ---
>  fs/buffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index ead4dc85debd..6a8752f7bbed 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -157,8 +157,8 @@ static void __end_buffer_read_notouch(struct buffer_head *bh, int uptodate)
>   */
>  void end_buffer_read_sync(struct buffer_head *bh, int uptodate)
>  {
> -	__end_buffer_read_notouch(bh, uptodate);
>  	put_bh(bh);
> +	__end_buffer_read_notouch(bh, uptodate);
>  }
>  EXPORT_SYMBOL(end_buffer_read_sync);
>  
> -- 
> 2.34.1
> 
>