> On Apr 30, 2025, at 9:34 PM, Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote: > > !-------------------------------------------------------------------| > CAUTION: External Email > > |-------------------------------------------------------------------! > > On 05/01, Jon Kohler wrote: >> >> >>> On Apr 30, 2025, at 5:09 PM, Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote: >>> >>> !-------------------------------------------------------------------| >>> CAUTION: External Email >>> >>> |-------------------------------------------------------------------! >>> >>> On 04/30, Jon Kohler wrote: >>>> Introduce new XDP helpers: >>>> - xdp_headlen: Similar to skb_headlen >>>> - xdp_headroom: Similar to skb_headroom >>>> - xdp_metadata_len: Similar to skb_metadata_len >>>> >>>> Integrate these helpers into tap, tun, and XDP implementation to start. >>>> >>>> No functional changes introduced. >>>> >>>> Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx> >>>> --- >>>> v1->v2: Integrate feedback from Willem >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_project_netdevbpf_patch_20250430182921.1704021-2D1-2Djon-40nutanix.com_&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=NGPRGGo37mQiSXgHKm5rCQ&m=9pdxzQszX_M0K3gEPeYOyMZZYSkRR8IMvxslS8320Eoctk58y-ELCdZ5iaryF2GH&s=J-ILB7E9VQ_plo0hyjEtzGzjy6G0_o4MMMmmE_z8vvc&e= >>>> >>>> drivers/net/tap.c | 6 +++--- >>>> drivers/net/tun.c | 12 +++++------ >>>> include/net/xdp.h | 54 +++++++++++++++++++++++++++++++++++++++++++---- >>>> net/core/xdp.c | 12 +++++------ >>>> 4 files changed, 65 insertions(+), 19 deletions(-) >>>> >>>> diff --git a/drivers/net/tap.c b/drivers/net/tap.c >>>> index d4ece538f1b2..a62fbca4b08f 100644 >>>> --- a/drivers/net/tap.c >>>> +++ b/drivers/net/tap.c >>>> @@ -1048,7 +1048,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) >>>> struct sk_buff *skb; >>>> int err, depth; >>>> >>>> - if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { >>>> + if (unlikely(xdp_headlen(xdp) < ETH_HLEN)) { >>>> err = -EINVAL; >>>> goto err; >>>> } >>>> @@ -1062,8 +1062,8 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) >>>> goto err; >>>> } >>>> >>>> - skb_reserve(skb, xdp->data - xdp->data_hard_start); >>>> - skb_put(skb, xdp->data_end - xdp->data); >>>> + skb_reserve(skb, xdp_headroom(xdp)); >>>> + skb_put(skb, xdp_headlen(xdp)); >>>> >>>> skb_set_network_header(skb, ETH_HLEN); >>>> skb_reset_mac_header(skb); >>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c >>>> index 7babd1e9a378..4c47eed71986 100644 >>>> --- a/drivers/net/tun.c >>>> +++ b/drivers/net/tun.c >>>> @@ -1567,7 +1567,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog, >>>> dev_core_stats_rx_dropped_inc(tun->dev); >>>> return err; >>>> } >>>> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data); >>>> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp)); >>>> break; >>>> case XDP_TX: >>>> err = tun_xdp_tx(tun->dev, xdp); >>>> @@ -1575,7 +1575,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog, >>>> dev_core_stats_rx_dropped_inc(tun->dev); >>>> return err; >>>> } >>>> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data); >>>> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp)); >>>> break; >>>> case XDP_PASS: >>>> break; >>>> @@ -2355,7 +2355,7 @@ static int tun_xdp_one(struct tun_struct *tun, >>>> struct xdp_buff *xdp, int *flush, >>>> struct tun_page *tpage) >>>> { >>>> - unsigned int datasize = xdp->data_end - xdp->data; >>>> + unsigned int datasize = xdp_headlen(xdp); >>>> struct tun_xdp_hdr *hdr = xdp->data_hard_start; >>>> struct virtio_net_hdr *gso = &hdr->gso; >>>> struct bpf_prog *xdp_prog; >>>> @@ -2415,14 +2415,14 @@ static int tun_xdp_one(struct tun_struct *tun, >>>> goto out; >>>> } >>>> >>>> - skb_reserve(skb, xdp->data - xdp->data_hard_start); >>>> - skb_put(skb, xdp->data_end - xdp->data); >>>> + skb_reserve(skb, xdp_headroom(xdp)); >>>> + skb_put(skb, xdp_headlen(xdp)); >>>> >>>> /* The externally provided xdp_buff may have no metadata support, which >>>> * is marked by xdp->data_meta being xdp->data + 1. This will lead to a >>>> * metasize of -1 and is the reason why the condition checks for > 0. >>>> */ >>>> - metasize = xdp->data - xdp->data_meta; >>>> + metasize = xdp_metadata_len(xdp); >>>> if (metasize > 0) >>>> skb_metadata_set(skb, metasize); >>>> >>>> diff --git a/include/net/xdp.h b/include/net/xdp.h >>>> index 48efacbaa35d..044345b18305 100644 >>>> --- a/include/net/xdp.h >>>> +++ b/include/net/xdp.h >>>> @@ -151,10 +151,56 @@ xdp_get_shared_info_from_buff(const struct xdp_buff *xdp) >>>> return (struct skb_shared_info *)xdp_data_hard_end(xdp); >>>> } >>>> >>>> +/** >>>> + * xdp_headlen - Calculate the length of the data in an XDP buffer >>>> + * @xdp: Pointer to the XDP buffer structure >>>> + * >>>> + * Compute the length of the data contained in the XDP buffer. Does not >>>> + * include frags, use xdp_get_buff_len() for that instead. >>>> + * >>>> + * Analogous to skb_headlen(). >>>> + * >>>> + * Return: The length of the data in the XDP buffer in bytes. >>>> + */ >>>> +static inline unsigned int xdp_headlen(const struct xdp_buff *xdp) >>>> +{ >>>> + return xdp->data_end - xdp->data; >>>> +} >>>> + >>>> +/** >>>> + * xdp_headroom - Calculate the headroom available in an XDP buffer >>>> + * @xdp: Pointer to the XDP buffer structure >>>> + * >>>> + * Compute the headroom in an XDP buffer. >>>> + * >>>> + * Analogous to the skb_headroom(). >>>> + * >>>> + * Return: The size of the headroom in bytes. >>>> + */ >>>> +static inline unsigned int xdp_headroom(const struct xdp_buff *xdp) >>>> +{ >>>> + return xdp->data - xdp->data_hard_start; >>>> +} >>>> + >>>> +/** >>>> + * xdp_metadata_len - Calculate the length of metadata in an XDP buffer >>>> + * @xdp: Pointer to the XDP buffer structure >>>> + * >>>> + * Compute the length of the metadata region in an XDP buffer. >>>> + * >>>> + * Analogous to skb_metadata_len(). >>>> + * >>>> + * Return: The length of the metadata in bytes. >>>> + */ >>>> +static inline unsigned int xdp_metadata_len(const struct xdp_buff *xdp) >>> >>> I believe this has to return int, not unsigned int. There are places >>> where we do data_meta = data + 1, and the callers check whether >>> the result is signed or sunsigned. >> >> The existing SKB APIs are the exact same return type (I copied them 1:1), >> but I have a feeling that we’re never intending these values to either A) be >> negative and/or B) wrap in strange ways? >> >>> >>>> +{ >>>> + return xdp->data - xdp->data_meta; >>>> +} >>>> + >>>> static __always_inline unsigned int >>>> xdp_get_buff_len(const struct xdp_buff *xdp) >>>> { >>>> - unsigned int len = xdp->data_end - xdp->data; >>>> + unsigned int len = xdp_headlen(xdp); >>>> const struct skb_shared_info *sinfo; >>>> >>>> if (likely(!xdp_buff_has_frags(xdp))) >>>> @@ -364,8 +410,8 @@ int xdp_update_frame_from_buff(const struct xdp_buff *xdp, >>>> int metasize, headroom; >> >> Said another way, perhaps this should be unsigned? >> >>>> >>>> /* Assure headroom is available for storing info */ >>>> - headroom = xdp->data - xdp->data_hard_start; >>>> - metasize = xdp->data - xdp->data_meta; >>>> + headroom = xdp_headroom(xdp); >>>> + metasize = xdp_metadata_len(xdp); >>>> metasize = metasize > 0 ? metasize : 0; >>> >>> ^^ like here >> >> Look across the tree, seems like more are unsigned than signed > > The ones that are unsigned are either calling xdp_data_meta_unsupported > explicitly (and it does > to check for this condition, not signed math) > or are running in the drivers that are guaranteed to have metadata > support (and, hence, always have data_meta <= data). > >> These ones use unsigned: >> xdp_convert_zc_to_xdp_frame > > This uses xdp_data_meta_unsupported > >> veth_xdp_rcv_skb >> xsk_construct_skb >> bnxt_copy_xdp >> i40e_build_skb >> i40e_construct_skb_zc >> ice_build_skb (this is u8) >> ice_construct_skb_zc >> igb_build_skb >> igb_construct_skb_zc >> igc_build_skb >> igc_construct_skb >> igc_construct_skb_zc >> ixgbe_build_skb >> ixgbe_construct_skb_zc >> ixgbevf_build_skb >> mvneta_swbm_build_skb >> mlx5e_xsk_construct_skb >> mana_build_skb >> stmmac_construct_skb_zc >> bpf_prog_run_generic_xdp > > These run in the drivers that support metadata (data_meta <= data) > >> xdp_get_metalen > > This uses xdp_data_meta_unsupported > >> These ones are regular int: >> xdp_build_skb_from_buff >> xdp_build_skb_from_zc >> xdp_update_frame_from_buff >> tun_xdp_one >> build_skb_from_xdp_buff > > These can be called from the drivers that support and don't support > the metadata, so have to (correctly) use int. > >> Perhaps a separate patch to convert the regulars to unsigned, >> thoughts? > > Take a look at xdp_set_data_meta_invalid and xdp_data_meta_unsupported. > There are cases where xdp->data - xdp->data_meta is -1 (and the callers > check for this condition), we can't use unsigned unconditionally > (unless we use xdp_data_meta_unsupported). Ah! Good catch, and thank you for helping me to understand that, I appreciate it. About to turn in for the evening, will wait for any more comments and I’m happy to send out a v3. One thought is that I stumbled upon xdp_get_metalen in filter.c. I wonder it would make sense to pirate that logic and move it into xdp.h? That might be a simply solution here that would allow us to keep unsigned like SKB API? Happy to take feedback either way. Cheers, Jon