Re: BCP 78 policy / copyright / Generative AI / LLM .. is there a FAQ?

Michael De Roover <ietf@xxxxxxxxxxxx> · Tue, 19 Aug 2025 07:42:25 +0200

Hi Job,
> I personally would recommend working groups against adoption of AI generated text (mostly because of the potential for issues related to intellectual property). IMHO, handwritten originals are the way to go when using the IETF publication venue! :)

> I recently spotted someone who contemplated submitting a 33,000+ word LLM-generated I-D for review to a working group.

Wholeheartedly agreed, especially in terms of the textual content itself. The case of 30k+ words of slop is a good example.

> Where can I point newcomers on this topic?

I do find I-Ds somewhat difficult to write. For that, getting an LLM’s help proved useful at some point. It is a double-edged sword though, even if just for formatting: mistakes are very easily made, and time is easily wasted if you don’t keep the LLM on a leash.

XML is a good format to write RFCs in, though I do find myself being more familiar with Markdown. Having a skeletal template would be nice. Maybe xml2rfc could be made to generate one.
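
To illustrate what such a skeletal template might look like, here is a minimal sketch in Python, assuming the RFCXML v3 vocabulary (rfc, front, middle, back). The function name build_skeleton, the draft name, title, and author are hypothetical placeholders, and this is plain standard-library code, not an existing xml2rfc feature. The output has not been validated against xml2rfc.

```python
import xml.etree.ElementTree as ET


def build_skeleton(doc_name, title, author):
    """Return a bare-bones RFCXML v3 document as a string.

    Hypothetical helper for illustration only; the attribute values
    below are assumptions and may need adjusting for a real draft.
    """
    rfc = ET.Element("rfc", {
        "version": "3",
        "category": "info",       # assumed; pick the category you need
        "docName": doc_name,
        "ipr": "trust200902",
        "submissionType": "IETF",
    })

    # Front matter: title, author, date, abstract.
    front = ET.SubElement(rfc, "front")
    ET.SubElement(front, "title").text = title
    ET.SubElement(front, "author", {"fullname": author})
    ET.SubElement(front, "date")
    abstract = ET.SubElement(front, "abstract")
    ET.SubElement(abstract, "t").text = "TODO: abstract."

    # Body: a single placeholder section.
    middle = ET.SubElement(rfc, "middle")
    section = ET.SubElement(middle, "section", {"anchor": "introduction"})
    ET.SubElement(section, "name").text = "Introduction"
    ET.SubElement(section, "t").text = "TODO: introduction."

    # Back matter (references, appendices) left empty.
    ET.SubElement(rfc, "back")

    ET.indent(rfc)  # pretty-print; requires Python 3.9+
    return ET.tostring(rfc, encoding="unicode")


if __name__ == "__main__":
    print(build_skeleton("draft-example-skeleton-00",
                         "Example Title", "A. N. Author"))
```

Running the script prints the skeleton to standard output, so it can be redirected into a file and filled in by hand from there.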

I’m not sure to what extent this would apply to the RFC Editor resources, but LLM crawlers have also exhausted storage on my git.nixmagic.com instance twice, because they tried to download everything there was to get on that instance. PDF downloads may be similarly affected.

Cloudflare has launched a service that targets such crawlers, which the IETF may want to consider if this is or becomes an issue. Datatracker already seems to be fronted by Cloudflare too.
https://blog.cloudflare.com/ai-labyrinth/

Kind regards,
Michael De Roover

Mail: ietf@xxxxxxxxxxxx
Web: michael.de.roover.eu.org

On 18 Aug 2025, at 14:51, Job Snijders <job@xxxxxxxxxxxxx> wrote:

Dear all,

The IETF is committed to making all text available with a permissive
license and with appropriate attribution. A fantastic objective I
wholeheartedly support. As I understand things, generative AI/LLMs, by
their nature, are likely unable to provide the necessary assurances that
the generated material is compatible with the provisions of BCP 78
(RFC 5378) or that the original authors are properly attributed.

I can see that AI tooling is helping some folks with study & analysis
(which is great for them!). However, I am not sure that submitting an
internet-draft to the IETF that was (partially) generated using AI would
be a wise path to follow: I recently spotted someone who contemplated
submitting a 33,000+ word LLM-generated I-D for review to a working
group. The machine certainly was not a subject matter expert, but at
first glance it all looked legit. If such submissions were to happen,
they'd have the potential to take up a lot of time and resources.

I personally would recommend working groups against adoption of AI
generated text (mostly because of the potential for issues related to
intellectual property). IMHO, handwritten originals are the way to go
when using the IETF publication venue! :)

The FreeBSD project recently added some clarifications,
result: https://reviews.freebsd.org/differential/changeset/?ref=1420532
discussion: https://reviews.freebsd.org/D50650?id=156417

Other entities also provided documentation on the topic:
https://www.linuxfoundation.org/legal/generative-ai
https://www.apache.org/legal/generative-tooling.html

Is it documented somewhere for IETF newcomers that internet-draft
submissions should not contain LLM/AI generated text? I imagine that
similar clarifications for the IETF context along the lines of "text
_about_ AI is fine, but text generated by AI has legal implications"
would be very helpful.

Where can I point newcomers on this topic?

Kind regards,

Job