Subject: Reproducible XFS Filesystem Builds for VMs
To: linux-xfs@xxxxxxxxxxxxxxx

Dear XFS Maintainers and Community,

I am a Software Engineer at Chainguard working on reproducible builds for VMs. While we have successfully implemented reproducible disk images with EFI+EXT4 partitions, I've been unable to replicate this for XFS filesystems.

Current Approach:

The EFI+EXT4 images are built reproducibly using the following methods:

- For FAT32 partitions: `mkfs.vfat --invariant -i $EFI_UUID` with `$SOURCE_DATE_EPOCH` set, populated via mtools
- For EXT4 partitions: `mkfs.ext4 -E hash_seed=$EXT4_HASH_SEED -U $ROOTFS_UUID` with `$SOURCE_DATE_EPOCH` set, plus `-d /path/to/rootfs.tar.gz` to populate the filesystem

XFS Challenges:

For XFS, I've attempted to create reproducible filesystems by passing an extensive set of parameters:

```
mkfs.xfs \
  -b size=4096 \
  -d agcount=4 \
  -d noalign \
  -i attr=2 \
  -i projid32bit=1 \
  -i size=512 \
  -l size=67108864 \
  -l su=4096 \
  -l version=2 \
  -m crc=1 \
  -m finobt=1 \
  -m uuid=$ROOTFS_UUID \
  -n size=16384 \
  -n version=2 $root_partition
```

I've tried to specify as many options as possible so that mkfs.xfs has no geometry or feature decisions left to make at runtime. Unfortunately, this does not produce reproducible results: two disk images built with the same command still differ.
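
To narrow down where two images diverge, a block-level comparison can help. The following is only a minimal sketch, not part of the build tooling described here: `diff_blocks` is a hypothetical helper name, and the 4096-byte default simply mirrors the `-b size=4096` option above. It relies on `cmp -l`, which prints the 1-based byte offset of every differing byte.

```sh
#!/bin/sh
# diff_blocks IMG1 IMG2 [BLOCK_SIZE]
# Hypothetical helper: print the 0-based number of every block in which
# the two images differ, one block number per line.
diff_blocks() {
    img1=$1
    img2=$2
    bs=${3:-4096}   # assumed to match the filesystem block size
    # cmp -l emits "<1-based offset> <octal byte> <octal byte>" per differing
    # byte, in ascending offset order; awk collapses those into block numbers.
    cmp -l "$img1" "$img2" 2>/dev/null |
        awk -v bs="$bs" '
            BEGIN { last = -1 }
            {
                b = int(($1 - 1) / bs)
                if (b != last) print b
                last = b
            }'
}
```

Running `diff_blocks disk1.img disk2.img` on two builds lists the blocks that changed. Dividing a block number by the `agsize` value that mkfs.xfs reports gives a rough idea of which allocation group the difference falls in.
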

I've made progress with empty filesystems by combining `libfaketime`, to enforce `$SOURCE_DATE_EPOCH`, with a custom library that overrides libc's `getrandom()`:

```
~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
~$ mkfs.xfs \
     -b size=4096 \
     -d agcount=4 \
     -d noalign \
     -i attr=2 \
     -i projid32bit=1 \
     -i size=512 \
     -l size=67108864 \
     -l su=4096 \
     -l version=2 \
     -m crc=1 \
     -m finobt=1 \
     -m uuid=$ROOTFS_UUID \
     -n size=16384 \
     -n version=2 disk1.img
~$ mkfs.xfs \
     -b size=4096 \
     -d agcount=4 \
     -d noalign \
     -i attr=2 \
     -i projid32bit=1 \
     -i size=512 \
     -l size=67108864 \
     -l su=4096 \
     -l version=2 \
     -m crc=1 \
     -m finobt=1 \
     -m uuid=$ROOTFS_UUID \
     -n size=16384 \
     -n version=2 disk2.img
~$ md5sum disk*
c68c202163dcb862762fc01970f6c8b4  disk1.img
c68c202163dcb862762fc01970f6c8b4  disk2.img
```

This approach works for empty filesystems. However, when the filesystem is populated by mounting it and untarring an archive, the resulting metadata differs between builds, even after using `xfs_repair -L` to reset most of it. The primary difference appears to be in the allocation group metadata, which is optimized at runtime.

Question:

EXT4 addresses this issue with the `-d` flag, which allows populating the filesystem from an archive or directory without mounting it. Is there similar functionality available for XFS, or is there interest in developing a method for generating reproducible XFS root filesystems?

I'm asking because we would like to use XFS as the filesystem for the final product.

Thank you for your time and expertise. Any guidance would be greatly appreciated.

Regards,
L.