On Thu, May 29, 2025 at 12:25 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> On Thu, May 29, 2025 at 10:50:01AM +0800, Yafang Shao wrote:
> > Hello,
> >
> > Recently, we encountered data loss when using XFS on an HDD with bad
> > blocks. After investigation, we determined that the issue was related
> > to writeback errors. The details are as follows:
> >
> > 1. Process-A writes data to a file using buffered I/O and completes
> > without errors.
> > 2. However, during the writeback of the dirtied pagecache pages, an
> > I/O error occurs, causing the data to fail to reach the disk.
> > 3. Later, the pagecache pages may be reclaimed due to memory pressure,
> > since they are already clean pages.
> > 4. When Process-B reads the same file, it retrieves zeroed data from
> > the bad blocks, as the original data was never successfully written
> > (IOMAP_UNWRITTEN).
> >
> > We reviewed the related discussion [0] and confirmed that this is a
> > known writeback error issue. While using fsync() after buffered
> > write() could mitigate the problem, this approach is impractical for
> > our services.
> >
> > Instead, we propose introducing configurable options to notify users
> > of writeback errors immediately and prevent further operations on
> > affected files or disks. Possible solutions include:
> >
> > - Option A: Immediately shut down the filesystem upon writeback errors.
> > - Option B: Mark the affected file as inaccessible if a writeback error occurs.
> >
> > These options could be controlled via mount options or sysfs
> > configurations. Both solutions would be preferable to silently
> > returning corrupted data, as they ensure users are aware of disk
> > issues and can take corrective action.
> >
> > Any suggestions ?
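
For reference, the fsync() mitigation mentioned above looks roughly like
the sketch below. It is illustrative only and not code from our
services: write_and_sync() is a hypothetical helper name, and the sketch
assumes the caller can afford a synchronous flush after each write,
which is exactly the cost that makes this impractical for us. A
successful write() only means the data reached the pagecache; the
writeback error, typically EIO, is reported by fsync().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from this thread): write len bytes to fd,
 * then fsync() so that a writeback failure is returned to the caller
 * instead of going unnoticed.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_and_sync(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	/* write() may be short or interrupted, so loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		p += n;
		len -= (size_t)n;
	}

	/*
	 * At this point the data is only known to be in the pagecache.
	 * fsync() forces writeback and is where an I/O error surfaces.
	 */
	if (fsync(fd) < 0)
		return -errno;

	return 0;
}
```

A caller that gets a negative return still holds its own copy of the
data and can retry, write elsewhere, or raise an alert, rather than
finding zeroes on a later read.
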
>
> Option C: report all those write errors (direct and buffered) to a
> daemon and let it figure out what it wants to do:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring_2025-05-21
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring-rust_2025-05-21
>
> Yes this is a long term option since it involves adding upcalls from the
> pagecache/vfs into the filesystem and out through even more XFS code,
> which has to go through its usual rigorous reviews.
>
> But if there's interest then I could move up the timeline on submitting
> those since I wasn't going to do much with any of that until 2026.

This would be very helpful. While it might take some time, it's better
to address it now than never. Please proceed with this when you have
availability.

--
Regards
Yafang