Reproducible XFS Filesystems Builds for VMs

Subject: Reproducible XFS Filesystems Builds for VMs
linux-xfs@xxxxxxxxxxxxxxx

Dear XFS Maintainers and Community,

I am a Software Engineer at Chainguard working on reproducible builds for VMs.

While we have successfully implemented reproducible disk images with
EFI+EXT4 partitions, I’ve been unable to replicate this for XFS
filesystems.

Current Approach:

The EFI+EXT4 images are made reproducible using the following methods:

- For FAT32 partitions: `mkfs.vfat --invariant -i $EFI_UUID` with
`$SOURCE_DATE_EPOCH` set, populated via mtools
- For EXT4 partitions:
`mkfs.ext4 -E hash_seed=$EXT4_HASH_SEED -U $ROOTFS_UUID` with
`$SOURCE_DATE_EPOCH` set, plus `-d /path/to/rootfs.tar.gz` to populate it

XFS Challenges:

For XFS, I've attempted to create reproducible filesystems using
extensive parameters:

```
mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 $root_partition
```

I've tried to specify as many options as possible to avoid
nondeterministic decisions at runtime.

Unfortunately, this does not produce reproducible results across
different disk images.

I've made progress with empty filesystems by combining `libfaketime`,
to enforce `$SOURCE_DATE_EPOCH`, with a custom library that overrides
libc's `getrandom()`:

```
~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 disk1.img
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 disk2.img
~$ md5sum disk*
c68c202163dcb862762fc01970f6c8b4  disk1.img
c68c202163dcb862762fc01970f6c8b4  disk2.img
```
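
The two-run comparison above can be wrapped in a small helper so that the same check is applied to any build step. This is only a minimal sketch: `same_build` is a hypothetical name, and it assumes the builder takes the output path as its last argument.

```shell
#!/bin/sh
# same_build BUILDER [ARGS...]
# Runs "BUILDER ARGS... <output-file>" twice and compares the two outputs.
# Returns 0 if they are byte-identical, 1 if they differ, 2 if a build failed.
# (Hypothetical helper, not part of xfsprogs or any other package.)
same_build() {
    out_a=$(mktemp) || return 2
    out_b=$(mktemp) || { rm -f "$out_a"; return 2; }
    if "$@" "$out_a" && "$@" "$out_b"; then
        cmp -s "$out_a" "$out_b"   # silent byte-for-byte comparison
        rc=$?
    else
        rc=2
    fi
    rm -f "$out_a" "$out_b"
    return "$rc"
}
```

For the XFS case the builder would be a script (for example a hypothetical `./build_xfs.sh`) that sizes the image file and runs the `mkfs.xfs` command shown above on the path it receives.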

This approach works for empty filesystems, but when the filesystem is
populated by mounting it and untarring an archive, different metadata
is generated, even after using `xfs_repair -L` to reset most metadata.

The primary difference appears to be in the allocation group metadata,
which is optimized at runtime.
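
One way to narrow down where two populated images diverge is to list the filesystem blocks that differ. The following is a minimal sketch, assuming the 4096-byte block size used above; `diff_blocks` is a hypothetical helper name.

```shell
#!/bin/sh
# diff_blocks IMG_A IMG_B [BLOCK_SIZE]
# Prints the 0-based number of every block that differs between two images.
# (Hypothetical helper; BLOCK_SIZE defaults to 4096 to match -b size=4096.)
diff_blocks() {
    bs="${3:-4096}"
    # cmp -l prints one line per differing byte:
    #   <1-based byte offset> <octal value in A> <octal value in B>
    cmp -l "$1" "$2" 2>/dev/null |
        awk -v bs="$bs" '{
            blk = int(($1 - 1) / bs)        # convert byte offset to block number
            if (!(blk in seen)) {           # report each block only once
                seen[blk] = 1
                print blk
            }
        }'
}
```

Dividing a reported block number by the `agsize` value that `mkfs.xfs` prints (in blocks) gives the index of the allocation group the block belongs to.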

Question:

For EXT4, `mkfs.ext4` addresses this issue with the `-d` flag, which
allows populating the filesystem from an archive or directory without
mounting it.
Is there similar functionality available for XFS, or is there interest
in developing a method for generating reproducible XFS root
filesystems?

I'm asking this because we'd be interested in using XFS as a filesystem for the
final product.

Thank you for your time and expertise. Any guidance would be greatly
appreciated.

Regards,
L.




