Re: [RFC] fs: add ioctl to query protection info capabilities

On 5/29/2025 8:32 AM, Martin K. Petersen wrote:
> 
> Hi Anuj!
> 
> Thanks for working on this!
> 
Hi Martin,
Thanks for the feedback!

>> 4. tuple_size: size (in bytes) of the protection information tuple.
>> 6. pi_offset: offset of protection info within the tuple.
> 
> I find this a little confusing. The T10 PI tuple is <guard, app, ref>.
> 
> I acknowledge things currently are a bit muddy in the block layer since
> tuple_size has been transmogrified to hold the NVMe metadata size.
> 
> But for a new user-visible interface I think we should make the
> terminology clear. The tuple is the PI and not the rest of the metadata.
> 
> So I think you'd want:
> 
> 4. metadata_size: size (in bytes) of the metadata associated with each interval.
> 6. pi_offset: offset of protection information tuple within the metadata.
> 

Yes, this representation looks better. Will make this change.

>> +#define	FILE_PI_CAP_INTEGRITY		(1 << 0)
>> +#define	FILE_PI_CAP_REFTAG		(1 << 1)
> 
> You'll also need to have corresponding uapi defines for:
> 
> enum blk_integrity_checksum {
>          BLK_INTEGRITY_CSUM_NONE         = 0,
>          BLK_INTEGRITY_CSUM_IP           = 1,
>          BLK_INTEGRITY_CSUM_CRC          = 2,
>          BLK_INTEGRITY_CSUM_CRC64        = 3,
> } __packed ;
>

Right, I'll add these definitions to the UAPI.
>> +
>> +/*
>> + * struct fs_pi_cap - protection information(PI) capability descriptor
>> + * @flags:			Bitmask of capability flags
>> + * @interval:		Number of bytes of data per PI tuple
>> + * @csum_type:		Checksum type
>> + * @tuple_size:		Size in bytes of the PI tuple
>> + * @tag_size:		Size of the tag area within the tuple
>> + * @pi_offset:		Offset in bytes of the PI metadata within the tuple
>> + * @rsvd:			Reserved for future use
> 
> See above for distinction between metadata and PI tuple. The question is
> whether we need to report the size of those two separately (both in
> kernel and in this structure). Otherwise how do we know how big the PI
> tuple is? Or do we infer that from the csum_type?
> 

The block layer currently infers this by looking at the csum_type (e.g.,
in blk_integrity_generate). I assumed userspace could do the same, so I
didn't expose a separate pi_tuple_size field. Do you see this
differently?

As you mentioned, the other option would be to report the PI tuple size
explicitly in both the kernel and in the uapi struct.
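
For illustration, inferring the tuple size from csum_type in userspace
could look roughly like the sketch below. The helper name
pi_tuple_size_from_csum() is hypothetical and not part of the patch. The
sizes assume the existing kernel layouts: 8 bytes for struct t10_pi_tuple
(16-bit guard, 16-bit app tag, 32-bit ref tag) and 16 bytes for struct
crc64_pi_tuple (64-bit guard, 16-bit app tag, 48-bit ref tag).

```c
#include <stddef.h>
#include <stdint.h>

/* Checksum types as proposed for the UAPI in this thread. */
#define FS_PI_CSUM_NONE		0
#define FS_PI_CSUM_IP		1
#define FS_PI_CSUM_CRC		2
#define FS_PI_CSUM_CRC64	3

/*
 * Hypothetical helper (not part of the patch): return the PI tuple size
 * in bytes implied by csum_type, or 0 when there is no PI or the type
 * is unknown.
 */
static size_t pi_tuple_size_from_csum(uint8_t csum_type)
{
	switch (csum_type) {
	case FS_PI_CSUM_IP:
	case FS_PI_CSUM_CRC:
		return 8;	/* guard(2) + app(2) + ref(4) */
	case FS_PI_CSUM_CRC64:
		return 16;	/* guard(8) + app(2) + ref(6) */
	default:
		return 0;
	}
}
```

The drawback of this approach is that every consumer has to carry the
same table and update it when a new checksum type is added.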

> Also, for the extended NVMe PI types we'd probably need to know the size
> of the ref tag and the storage tag.
>

Makes sense, I will introduce ref_tag_size and storage_tag_size in the
UAPI struct to account for this.
I did a respin based on your feedback here [1]. If this looks good to
you, I'll roll out a v2.

Thanks,
Anuj Gupta

[1]

[PATCH] fs: add ioctl to query protection info capabilities

Add a new ioctl, FS_IOC_GETPICAP, to query protection info (PI)
capabilities. This ioctl returns information about the file's integrity
profile. This is useful for userspace applications to understand a file's
end-to-end data protection support and configure the I/O accordingly.

For now this interface is only supported by block devices. However, the
design and placement of this ioctl in generic FS ioctl space allows us
to extend it to work over files as well. This may be useful when
filesystems start supporting PI-aware layouts.

A new structure struct fs_pi_cap is introduced, which contains the
following fields:
1. flags: bitmask of capability flags.
2. interval: the data block interval (in bytes) for which the protection
information is generated.
3. csum_type: type of checksum used.
4. metadata_size: size (in bytes) of the metadata associated with each
interval.
5. tag_size: size (in bytes) of tag information.
6. pi_offset: offset of protection information tuple within the
metadata.
7. ref_tag_size: size in bytes of the reference tag.
8. storage_tag_size: size in bytes of the storage tag.
9. rsvd: reserved for future use.
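
As a rough illustration of how interval, metadata_size and pi_offset
relate, the sketch below computes the metadata buffer length for an I/O
and the location of one PI tuple inside it. The struct pi_layout type and
both helper names are hypothetical. The sketch assumes that the metadata
for consecutive intervals is packed back to back in a separate buffer,
which this patch does not itself specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the capability fields used for the arithmetic. */
struct pi_layout {
	uint16_t interval;	/* data bytes covered by one metadata chunk */
	uint8_t  metadata_size;	/* metadata bytes per interval */
	uint8_t  pi_offset;	/* offset of the PI tuple within that chunk */
};

/*
 * Total metadata bytes needed for io_len bytes of data.
 * Returns 0 if the layout is unusable or io_len is not a whole
 * number of intervals.
 */
static size_t metadata_len(const struct pi_layout *l, size_t io_len)
{
	if (l->interval == 0 || io_len % l->interval != 0)
		return 0;
	return (io_len / l->interval) * l->metadata_size;
}

/* Byte offset of the PI tuple for interval number idx (0-based). */
static size_t pi_tuple_offset(const struct pi_layout *l, size_t idx)
{
	return idx * l->metadata_size + l->pi_offset;
}
```

For example, with interval = 4096, metadata_size = 64 and pi_offset = 56,
a 16 KiB I/O needs 4 * 64 = 256 bytes of metadata, and the PI tuple for
the third interval starts at byte 2 * 64 + 56 = 184.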

The internal logic to fetch the capability is encapsulated in a helper
function blk_get_pi_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP if
CONFIG_BLK_DEV_INTEGRITY is not enabled.
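
A minimal userspace sketch of calling the ioctl follows. Since the
definitions are only proposed here, the example mirrors them locally. The
field name metadata_size follows the field list above, and the helper
names query_pi_cap() and has_pi() are hypothetical. Treat it as an
illustration of the intended usage under those assumptions rather than a
reference.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Local mirror of the proposed UAPI layout (assumption, see above). */
struct fs_pi_cap {
	uint32_t flags;
	uint16_t interval;
	uint8_t  csum_type;
	uint8_t  metadata_size;
	uint8_t  tag_size;
	uint8_t  pi_offset;
	uint8_t  ref_tag_size;
	uint8_t  storage_tag_size;
	uint8_t  rsvd[4];
};

#define FILE_PI_CAP_INTEGRITY	(1 << 0)
#define FILE_PI_CAP_REFTAG	(1 << 1)
#define FS_IOC_GETPICAP		_IOR('f', 3, struct fs_pi_cap)

/* Hypothetical wrapper: 0 on success, negative errno on failure. */
static int query_pi_cap(int fd, struct fs_pi_cap *cap)
{
	if (ioctl(fd, FS_IOC_GETPICAP, cap) < 0)
		return -errno;
	return 0;
}

/* Hypothetical check: non-zero if the integrity capability flag is set. */
static int has_pi(const struct fs_pi_cap *cap)
{
	return (cap->flags & FILE_PI_CAP_INTEGRITY) != 0;
}
```

A caller would open the block device, call query_pi_cap() on the
descriptor, and fall back to plain I/O when the call fails or has_pi()
returns 0. With this patch, a device without an integrity profile
returns success with all fields zeroed.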

Signed-off-by: Anuj Gupta <anuj20.g@xxxxxxxxxxx>
Signed-off-by: Kanchan Joshi <joshi.k@xxxxxxxxxxx>
---
  block/blk-integrity.c         | 38 +++++++++++++++++++++++++++++++++++
  block/ioctl.c                 |  3 +++
  include/linux/blk-integrity.h |  6 ++++++
  include/uapi/linux/fs.h       | 36 +++++++++++++++++++++++++++++++++
  4 files changed, 83 insertions(+)

diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index a1678f0a9f81..9bd2888a85ce 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -13,6 +13,7 @@
  #include <linux/scatterlist.h>
  #include <linux/export.h>
  #include <linux/slab.h>
+#include <linux/t10-pi.h>

  #include "blk.h"

@@ -54,6 +55,43 @@ int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
  	return segments;
  }

+int blk_get_pi_cap(struct block_device *bdev, struct fs_pi_cap __user *argp)
+{
+	struct blk_integrity *bi = blk_get_integrity(bdev->bd_disk);
+	struct fs_pi_cap pi_cap = {};
+
+	if (!bi)
+		goto out;
+
+	if (bi->flags & BLK_INTEGRITY_DEVICE_CAPABLE)
+		pi_cap.flags |= FILE_PI_CAP_INTEGRITY;
+	if (bi->flags & BLK_INTEGRITY_REF_TAG)
+		pi_cap.flags |= FILE_PI_CAP_REFTAG;
+	pi_cap.csum_type = bi->csum_type;
+	pi_cap.tuple_size = bi->tuple_size;
+	pi_cap.tag_size = bi->tag_size;
+	pi_cap.interval = 1 << bi->interval_exp;
+	pi_cap.pi_offset = bi->pi_offset;
+	switch (bi->csum_type) {
+	case BLK_INTEGRITY_CSUM_CRC64:
+		pi_cap.ref_tag_size = sizeof_field(struct crc64_pi_tuple,
+						   ref_tag);
+		break;
+	case BLK_INTEGRITY_CSUM_CRC:
+	case BLK_INTEGRITY_CSUM_IP:
+		pi_cap.ref_tag_size = sizeof_field(struct t10_pi_tuple,
+						   ref_tag);
+		break;
+	default:
+		break;
+	}
+
+out:
+	if (copy_to_user(argp, &pi_cap, sizeof(struct fs_pi_cap)))
+		return -EFAULT;
+	return 0;
+}
+
  /**
   * blk_rq_map_integrity_sg - Map integrity metadata into a scatterlist
   * @rq:		request to map
diff --git a/block/ioctl.c b/block/ioctl.c
index e472cc1030c6..53b35bf3e6fa 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -13,6 +13,7 @@
  #include <linux/uaccess.h>
  #include <linux/pagemap.h>
  #include <linux/io_uring/cmd.h>
+#include <linux/blk-integrity.h>
  #include <uapi/linux/blkdev.h>
  #include "blk.h"
  #include "blk-crypto-internal.h"
@@ -643,6 +644,8 @@ static int blkdev_common_ioctl(struct block_device *bdev, blk_mode_t mode,
  		return blkdev_pr_preempt(bdev, mode, argp, true);
  	case IOC_PR_CLEAR:
  		return blkdev_pr_clear(bdev, mode, argp);
+	case FS_IOC_GETPICAP:
+		return blk_get_pi_cap(bdev, argp);
  	default:
  		return -ENOIOCTLCMD;
  	}
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index c7eae0bfb013..6118a0c28605 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -29,6 +29,7 @@ int blk_rq_map_integrity_sg(struct request *, struct scatterlist *);
  int blk_rq_count_integrity_sg(struct request_queue *, struct bio *);
  int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
  			      ssize_t bytes);
+int blk_get_pi_cap(struct block_device *bdev, struct fs_pi_cap __user *argp);

  static inline bool
  blk_integrity_queue_supports_integrity(struct request_queue *q)
@@ -92,6 +93,11 @@ static inline struct bio_vec rq_integrity_vec(struct request *rq)
  				 rq->bio->bi_integrity->bip_iter);
  }
  #else /* CONFIG_BLK_DEV_INTEGRITY */
+static inline int blk_get_pi_cap(struct block_device *bdev,
+				 struct fs_pi_cap __user *argp)
+{
+	return -EOPNOTSUPP;
+}
  static inline int blk_rq_count_integrity_sg(struct request_queue *q,
  					    struct bio *b)
  {
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index e762e1af650c..c70584b09bed 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -91,6 +91,40 @@ struct fs_sysfs_path {
  	__u8			name[128];
  };

+/* Protection info capability flags */
+#define	FILE_PI_CAP_INTEGRITY		(1 << 0)
+#define	FILE_PI_CAP_REFTAG		(1 << 1)
+
+/* Checksum types for Protection Information */
+#define FS_PI_CSUM_NONE			0
+#define FS_PI_CSUM_IP			1
+#define FS_PI_CSUM_CRC			2
+#define FS_PI_CSUM_CRC64		3
+
+/*
+ * struct fs_pi_cap - protection information(PI) capability descriptor
+ * @flags:			Bitmask of capability flags
+ * @interval:			Number of bytes of data per PI tuple
+ * @csum_type:			Checksum type
+ * @metadata_size:		Size in bytes of the metadata associated with each interval
+ * @tag_size:			Size of the tag area within the tuple
+ * @pi_offset:			Offset of protection information tuple within the metadata
+ * @ref_tag_size:		Size in bytes of the reference tag
+ * @storage_tag_size:		Size in bytes of the storage tag
+ * @rsvd:			Reserved for future use
+ */
+struct fs_pi_cap {
+	__u32	flags;
+	__u16	interval;
+	__u8	csum_type;
+	__u8	tuple_size;
+	__u8	tag_size;
+	__u8	pi_offset;
+	__u8	ref_tag_size;
+	__u8	storage_tag_size;
+	__u8	rsvd[4];
+};
+
  /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
  #define FILE_DEDUPE_RANGE_SAME		0
  #define FILE_DEDUPE_RANGE_DIFFERS	1
@@ -247,6 +281,8 @@ struct fsxattr {
   * also /sys/kernel/debug/ for filesystems with debugfs exports
   */
  #define FS_IOC_GETFSSYSFSPATH		_IOR(0x15, 1, struct fs_sysfs_path)
+/* Get protection info capability details */
+#define FS_IOC_GETPICAP			_IOR('f', 3, struct fs_pi_cap)

  /*
   * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
-- 
2.25.1



