Re: [PATCH v2 2/2] fuse: fix fuseblk i_blkbits for iomap partial writes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 14, 2025 at 11:25 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> On Wed, Aug 13, 2025 at 03:35:21PM -0700, Joanne Koong wrote:
> > On regular fuse filesystems, i_blkbits is set to PAGE_SHIFT which means
> > any iomap partial writes will mark the entire folio as uptodate. However
> > fuseblk filesystems work differently and allow the blocksize to be less
> > than the page size. As such, this may lead to data corruption if fuseblk
> > sets its blocksize to less than the page size, uses the writeback cache,
> > and does a partial write, then a read and the read happens before the
> > write has undergone writeback, since the folio will not be marked
> > uptodate from the partial write so the read will read in the entire
> > folio from disk, which will overwrite the partial write.
> >
> > The long-term solution for this, which will also be needed for fuse to
> > enable large folios with the writeback cache on, is to have fuse also
> > use iomap for folio reads, but until that is done, the cleanest
> > workaround is to use the page size for fuseblk's internal kernel inode
> > blksize/blkbits values while maintaining current behavior for stat().
> >
> > This was verified using ntfs-3g:
> > $ sudo mkfs.ntfs -f -c 512 /dev/vdd1
> > $ sudo ntfs-3g /dev/vdd1 ~/fuseblk
> > $ stat ~/fuseblk/hi.txt
> > IO Block: 512
> >
> > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > Fixes: a4c9ab1d4975 ("fuse: use iomap for buffered writes")
> > ---
> >  fs/fuse/dir.c    |  2 +-
> >  fs/fuse/fuse_i.h |  8 ++++++++
> >  fs/fuse/inode.c  | 13 ++++++++++++-
> >  3 files changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index db44d05c8d02..a6aa16422c30 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -975,6 +975,14 @@ struct fuse_conn {
> >               /* Request timeout (in jiffies). 0 = no timeout */
> >               unsigned int req_timeout;
> >       } timeout;
> > +
> > +     /*
> > +      * This is a workaround until fuse uses iomap for reads.
> > +      * For fuseblk servers, this represents the blocksize passed in at
> > +      * mount time and for regular fuse servers, this is equivalent to
> > +      * inode->i_blkbits.
> > +      */
> > +     unsigned char blkbits;
>
> uint8_t, since the value is an integer quantity, not a character.

Ohh interesting, I copied this directly from the sb->s_blocksize_bits
type, which in the super_block struct gets defined as an unsigned
char. Your point makes sense to me though, I'll make this change and
send out v3!
>
> >  };
> >
> >  /*
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index 3bfd83469d9f..7ddfd2b3cc9c 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -292,7 +292,7 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
> >       if (attr->blksize)
> >               fi->cached_i_blkbits = ilog2(attr->blksize);
> >       else
> > -             fi->cached_i_blkbits = inode->i_sb->s_blocksize_bits;
> > +             fi->cached_i_blkbits = fc->blkbits;
> >
> >       /*
> >        * Don't set the sticky bit in i_mode, unless we want the VFS
> > @@ -1810,10 +1810,21 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> >               err = -EINVAL;
> >               if (!sb_set_blocksize(sb, ctx->blksize))
> >                       goto err;
> > +             /*
> > +              * This is a workaround until fuse hooks into iomap for reads.
> > +              * Use PAGE_SIZE for the blocksize else if the writeback cache
> > +              * is enabled, buffered writes go through iomap and a read may
> > +              * overwrite partially written data if blocksize < PAGE_SIZE
> > +              */
> > +             fc->blkbits = sb->s_blocksize_bits;
> > +             if (ctx->blksize != PAGE_SIZE &&
> > +                 !sb_set_blocksize(sb, PAGE_SIZE))
> > +                     goto err;
> >  #endif
> >       } else {
> >               sb->s_blocksize = PAGE_SIZE;
> >               sb->s_blocksize_bits = PAGE_SHIFT;
> > +             fc->blkbits = sb->s_blocksize_bits;
> >       }
>
> Heh. :)
> Reviewed-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
>

Thanks for reviewing this patchset!

> (How long until readahead? ;))

Haha I have readahead in the same patchset as read, planning to send
that out in 2 weeks after I'm back from PTO :D

>
> --D
>
> >       sb->s_subtype = ctx->subtype;
> > --
> > 2.47.3
> >
> >





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux