[Last-Call] draft-ietf-core-groupcomm-bis-14 ietf last call Tsvart review

Magnus Westerlund via Datatracker <noreply@xxxxxxxx> · Tue, 29 Jul 2025 04:21:25 -0700

Document: draft-ietf-core-groupcomm-bis
Title: Group Communication for the Constrained Application Protocol (CoAP)
Reviewer: Magnus Westerlund
Review result: Ready with Issues

This document has been reviewed as part of the transport area review team's
ongoing effort to review key IETF documents. These comments were written
primarily for the transport area directors, but are copied to the document's
authors and WG to allow them to address any issues raised and also to the IETF
discussion list for information.

When done at the time of IETF Last Call, the authors should consider this
review as part of the last-call comments they receive. Please always CC
tsv-art@xxxxxxxx if you reply to or forward this review.

First of all I will note that this publication is moving a specification from
Experimental to Proposed Standard. A higher bar for robustness in congestion
control and for mitigation of security issues is therefore, to me, reasonable
and to be expected.

General:
So what multicast models are really supported by this specification? To my
reading it appears focused on Any Source Multicast, and Source Specific
Multicast (RFC4607) does not work as the unicast responses are sent to the
source address of the multicast packet containing the CoAP request. I think it
would be good to be clearer on these limitations. To me it is unclear whether a
forward proxy could act as the interface to, and the specific source for, an
SSM group. Some clarification on this would be good.

I think there are several areas where this document takes far too much of a
toolbox approach rather than stating what is normative to implement and use
for particular use cases and deployments. This is especially true of the
security mitigations. There are a lot of things mentioned in the security
considerations that I think should be normatively required to be supported and
implemented to ensure secure deployment of groupcomm. Examples of this include
mitigation of amplification attacks. Also, the use of NoSec mode should be
stated early in the draft, basically as an applicability statement for this
mode. To my understanding NoSec is extremely limited and that should be made
clear up front.

Section 3.1.3
Two different types of request repetition are proposed, but it is unclear how a
client would determine whether the loss happened on the client-to-server path
or on the server-to-client path. Shouldn't at least one of the methods be the
recommended one, such as reusing the same Token value and Message ID so that
the repetition can be suppressed on the server side?
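
To illustrate the server-side suppression mentioned above, here is a minimal
sketch in Python. It assumes a server that remembers each (source endpoint,
Message ID) pair for a fixed lifetime. The class name, the method name, and the
145-second default (the RFC 7252 default for NON_LIFETIME) are illustrative
assumptions, not text from the draft.

```python
import time


class DuplicateFilter:
    """Hypothetical helper: remembers (source, Message ID) pairs for a while."""

    def __init__(self, lifetime_s=145.0, clock=time.monotonic):
        # lifetime_s is an assumption; 145 s matches the RFC 7252 default
        # for NON_LIFETIME, but a deployment may configure something else.
        self._lifetime_s = lifetime_s
        self._clock = clock
        self._seen = {}  # (source, message_id) -> expiry time

    def is_duplicate(self, source, message_id):
        now = self._clock()
        # Forget entries whose lifetime has passed.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        key = (source, message_id)
        if key in self._seen:
            return True  # repeated request: the server can suppress it
        self._seen[key] = now + self._lifetime_s
        return False
```

A repetition that reuses the Message ID is caught by this filter, whereas a
repetition sent with a fresh Message ID is processed as a new request.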

Section 3.6:
I guess the general principle described is the simplest that gives some
back-off. However, I think more guidance is needed on both the variables used
in computing the lb_leisure value and the default values PROBING_RATE and
DEFAULT_LEISURE.

It is a bit unfortunate that the protocol setup makes it difficult for any
endpoint to judge the server group size, so it needs to be configured. Response
size (S) could be estimated based on sent responses. The data transfer rate
given to groupcomm responses (R) appears to need configuration. Also, which
level of load factor is deemed acceptable in relation to link speeds? Someone
configuring a whole system needs to take into account all the different
application groups here.
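
To show how strongly the result depends on these configured inputs, here is a
minimal sketch in Python, assuming the lower bound from RFC 7252 Section 8.2,
lb_Leisure = S * G / R. The function and parameter names are illustrative and
do not come from the draft.

```python
def lb_leisure(response_size_bytes, group_size, rate_bytes_per_s):
    """Lower bound for Leisure in seconds: S * G / R (RFC 7252 Section 8.2)."""
    if rate_bytes_per_s <= 0:
        raise ValueError("target data transfer rate R must be positive")
    return response_size_bytes * group_size / rate_bytes_per_s


# S = 100 bytes, G = 100 servers, R = 1000 bytes/s gives 10.0 seconds.
print(lb_leisure(100, 100, 1000))

# The bound scales linearly with the group size estimate G, so a
# misconfigured G translates directly into a wrong back-off window.
for g in (10, 100, 1000):
    print(g, lb_leisure(100, g, 1000))
```

With S and R fixed, an underestimate of G by a factor of ten shrinks the
back-off window by the same factor, which is why guidance on choosing these
values matters.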

I think there is also room to discuss the issue of client request rates.
I assume that there are both deployments where the client is not also a server
and thus (I guess) is not part of the multicast group, and deployments where it
is a member. If the client is a multicast member it could actually attempt to
judge the current request rate and based on that determine if it can send the
request. Unfortunately an individual client struggles to determine whether the
multicast requests or the unicast responses lead to overload somewhere.

Section 3.7:
Rate limiting observation notifications. Per RFC 7641 there are some
recommendations for congestion control of Observe notifications. From what I
can determine from reviewing RFC 7641 Section 4.5.1 and RFC 7252 Section 4.7,
the number of outstanding or non-confirmable notifications is NSTART, and for
the latter NSTART will be 1. The issue I see with this is that with groupcomm
a client can request notifications from all servers in the application group.
If the resource that is observed is something that will be synchronized across
a large group of servers, a feedback implosion can be generated. Thus my
question is whether the sending of observation notifications, when the request
has been received over groupcomm, should default to using a random backoff time
to attempt to smear out the notifications over time.
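
The random backoff asked about above could look like the following minimal
Python sketch. It assumes each server independently draws a uniform delay
before sending a notification for an observation that was registered through
a group request. The function name and the max_delay_s parameter are
hypothetical, and the draft does not define such a value.

```python
import random


def notification_delay(max_delay_s, rng=random):
    """Pick a uniform random delay in [0, max_delay_s] seconds.

    max_delay_s is a hypothetical configured bound; it could, for example,
    be derived from a leisure-style estimate of group size and link rate.
    """
    if max_delay_s < 0:
        raise ValueError("max_delay_s must be non-negative")
    return rng.uniform(0.0, max_delay_s)


# Each server draws its own delay, so notifications triggered by the same
# resource change are spread over the window instead of sent all at once.
delays = [notification_delay(5.0) for _ in range(3)]
```

The trade-off is added notification latency in exchange for a lower peak load
at the client.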

I believe that there is an overload attack possible using this mechanism. An
attacker sends an Observe request for a resource that exists on a large number
of servers in the application group. This request has a spoofed source address,
namely that of the target of the attack. When wanting to overload said target,
the attacker uses groupcomm to send all servers a request that changes the
resource being observed, thus triggering a notification storm towards the
target from all of the servers. I think to protect from this attack there are
two levels of defense. The first is to require authorization to request
observation. That prevents parties outside the security group from triggering
the attack. But, unless there is a hard binding between the IP+port and the
security credentials of the endpoint, this will not prevent an insider from
setting up this attack. For the latter I think one would need strict address
binding, source address validation, or a return routability check. Most of this
attack is discussed in the Security Considerations, but the insider attacker
appears to be mostly ignored. Secondly, I think the risk of triggering
notifications on many servers was not really discussed, but maybe I missed the
text.

So why doesn't this section discuss the Echo Option, which is first discussed
in Section 6.2.3, and why are the requirements first stated in Section 6.3
rather than earlier as part of Section 3?

Section 5.3:

I think this section and the discussion on proxies in this document should be
clearer that yes, proxies do exist, but this document will not provide the
details needed to actually use them. Implementing them, especially in
combination with security, will require the draft-ietf-core-groupcomm-proxy
document. Especially as this document has a MUST-support requirement for OSCORE
groupcomm, I think there is an imbalance here when it comes to being able to
implement it. I think you either make groupcomm-proxy normative and a MUST to
implement, or you scope the support for proxies differently in this document
so that it is possible to later use groupcomm-proxy as the specification to
follow as soon as proxies and groupcomm are relevant.

Section 6.3:
[I-D.irtf-t2trg-amplification-attacks] is likely not an informative reference
but a normative one, as it becomes crucial to understanding the attacks
discussed here.

"Thus, consistent with Section 7 of [RFC7641], a server in a CoAP group MUST
strictly limit the number of notifications it sends between receiving
acknowledgements that confirm the actual interest of the client in continuing
the observation." So what is this limitation that one should implement?

To conclude, I think a lot of aspects have been considered, but the actual
recommendations and default or limit values should be more clearly specified.
