Re: [PATCH 2/2 libnftnl v2] tunnel: add support to geneve options

Fernando Fernandez Mancera <fmancera@xxxxxxx> · Wed, 28 May 2025 11:27:40 +0200

On 5/28/25 2:34 AM, Florian Westphal wrote:
Fernando Fernandez Mancera <fmancera@xxxxxxx> wrote:

Hi Fernando

Thanks for working on this, I got inquiries as to nft_tunnel.c
and how to make use of this stuff...

diff --git a/include/libnftnl/object.h b/include/libnftnl/object.h
index 9930355..14a42cd 100644
--- a/include/libnftnl/object.h
+++ b/include/libnftnl/object.h
@@ -117,15 +117,19 @@ enum {
  	NFTNL_OBJ_TUNNEL_ERSPAN_V1_INDEX,
  	NFTNL_OBJ_TUNNEL_ERSPAN_V2_HWID,
  	NFTNL_OBJ_TUNNEL_ERSPAN_V2_DIR,
+	NFTNL_OBJ_TUNNEL_GENEVE_OPTS,

If every flavour gets its own flag in the tunnel namespace we'll run
out of u64 in no time.

AFAICS these are mutually exclusive, e.g.
NFTNL_OBJ_TUNNEL_ERSPAN_V1_INDEX and NFTNL_OBJ_TUNNEL_VXLAN_GBP cannot
be active at the same time.

Is there a way to re-use the internal flag namespace depending on the tunnel
subtype?

Or to have distinct tunnel object types?

object -> tunnel -> {vxlan, erspan, ...} ?


IMHO, this could be done by providing libnftnl its own object tunnel 
API. Although that would require to expose more functions, but tunnel 
object could have its own flags field.

In addition, I am afraid that ERSPAN and VXLAN enum fields have been 
exposed in a released version so removing them would be a breaking 
change. I am not sure whether that is acceptable or not.

As-is, how is this API supposed to be used?  The internal union seems to
be asking for trouble later, when e.g. 'getting' NFTNL_OBJ_TUNNEL_GENEVE_OPTS
on something that was instantiated as vxlan tunnel and fields aliasing to
unexpected values.

Perhaps the first use of any of the NFTNL_OBJ_TUNNEL_ERSPAN_V1_INDEX
etc values in a setter should interally "lock" the object to the given
subtype?

That might allow to NOT use ->flags for those enum values and instead
keep track of them via overlapping bits.

We'd need some internal 'enum nft_obj_tunnel_type' that marks which
part of the union is useable/instantiated so we can reject requests
to set bits that are not available for the specific tunnel type.


Yes, that would help but isn't that the reason why the flags field is 
there? I mean, currently the same problem would exist with the different 
object variants e.g the union of "struct nftnl_obj_*". When trying to 
get the attribute we always check that the flag is set.

  	switch (type) {
  	case NFTNL_OBJ_TUNNEL_ID:
@@ -72,6 +73,15 @@ nftnl_obj_tunnel_set(struct nftnl_obj *e, uint16_t type,
  	case NFTNL_OBJ_TUNNEL_ERSPAN_V2_DIR:
  		memcpy(&tun->u.tun_erspan.u.v2.dir, data, data_len);
  		break;
+	case NFTNL_OBJ_TUNNEL_GENEVE_OPTS:
+		geneve = malloc(sizeof(struct nftnl_obj_tunnel_geneve));

No null check.  Applies to a few other spots too.


Ah right. Yes, will fix that. Thanks.

+		memcpy(geneve, data, data_len);

Hmm, this looks like the API leaks internal data layout from nftables to
libnftnl and vice versa?  IMO thats a non-starter, sorry.


I agree that exposing the list abstraction to the user is problematic 
and shouldn't be done. As you mentioned below, I couldn't come up with a 
better API on libnftnl. Thinking about it, we could expose only the 
relevant fields by putting them in a isolated struct.

I see that options are essentially unlimited values, so perhaps nftables
should build the netlink blob(s) directly, similar to nftnl_udata()?

Pablo, any better idea?

I don't like exposing the struct and also the list abstraction getting
exposed to libnftnl users.

As-is, there is:

json/bison -> struct tunnel_geneve -> nftnl_obj_set_data(&obj, NFTNL_OBJ_TUNNEL_GENEVE_OPTS, &blob);
  -> nftnl_obj_tunnel_build -> mnl_attr_put_u16( ...

I think it will be better if we make this internal to nftables and have
pass-through for the entire geneve blob,i.e.

json/bison -> struct tunnel_geneve ->
  (re)alloc buffer: mnl_nlmsg_put_header(), mnl_attr_put_u16 -> etc.
  nftnl_obj_set_data(obj, NFTNL_OBJ_TUNNEL_GENEVE_OPTS, blob) ->
  nftnl_obj_tunnel_build -> memcpy into NFTA_TUNNEL_KEY_OPTS nest
  container.


I am fine with this, although just as a note I chose the linked list 
path mainly to make it more flexible for users. AFAICS, libnftnl users 
usually do not need to build the netlink blob themselves.

So, the problem in essence is the shared binary structure. If I do not 
expose the list abstraction, copying the allocated struct would still be 
problematic, right?

Thank you for all the comments, Florian.