Hi Job,
> I personally would recommend working groups against adoption of AI generated text (mostly because of the potential for issues related to intellectual property). IMHO, handwritten originals are the way to go when using the IETF publication venue! :)
> I recently spotted someone who contemplated submitting a 33,000+ word LLM-generated I-D for review to a working group.
Wholeheartedly agreed, especially in terms of the textual content itself. The context of 30k+ words of slop is a good example.
> Where can I point newcomers on this topic?
I do find I-D documents somewhat difficult to write. For that, getting an LLM’s help proved useful at some point. It is a double-edged sword though, even if just for formatting. Mistakes are very easily made, and time is easily wasted if you don’t keep the LLM on a leash.
XML is a good format to write RFCs in, though I am more familiar with Markdown. Having a skeletal template would be nice. Maybe xml2rfc could be made to generate one.
I’m not sure to what extent this would apply to the RFC Editor resources, but LLM crawlers have also exhausted storage on my git.nixmagic.com instance twice by trying to download everything available on that instance. PDF downloads may be similarly affected.
Cloudflare has launched a service that targets such crawlers (https://blog.cloudflare.com/ai-labyrinth/), which the IETF may want to consider if this is or becomes an issue. Datatracker already seems to be fronted by Cloudflare.
Kind regards,
Michael De Roover
Mail: ietf@xxxxxxxxxxxx
Web: michael.de.roover.eu.org

On 18 Aug 2025, at 14:51, Job Snijders <job@xxxxxxxxxxxxx> wrote:
Dear all,
The IETF is committed to making all text available with a permissive license and with appropriate attribution. A fantastic objective I wholeheartedly support. As I understand things, generative AI/LLMs, by their nature, are likely unable to provide the necessary assurances that the generated material is compatible with the provisions of BCP 78 (RFC 5378) or that the original authors are properly attributed.
I can see that AI tooling is helping some folks with study & analysis (which is great for them!). However, I am not sure that submitting an internet-draft to the IETF that was (partially) generated using AI would be a wise path to follow. I recently spotted someone who contemplated submitting a 33,000+ word LLM-generated I-D for review to a working group. The machine certainly was not a subject matter expert, but at first glance it all looked legit. If such submissions were to happen, they would have the potential to take up a lot of time and resources.
I personally would recommend working groups against adoption of AI generated text (mostly because of the potential for issues related to intellectual property). IMHO, handwritten originals are the way to go when using the IETF publication venue! :)
The FreeBSD project recently added some clarifications:
Result: https://reviews.freebsd.org/differential/changeset/?ref=1420532
Discussion: https://reviews.freebsd.org/D50650?id=156417
Other entities also provided documentation on the topic:
https://www.linuxfoundation.org/legal/generative-ai
https://www.apache.org/legal/generative-tooling.html
Is it documented somewhere for IETF newcomers that internet-draft submissions should not contain LLM/AI generated text? I imagine that similar clarifications for the IETF context along the lines of "text _about_ AI is fine, but text generated by AI has legal implications" would be very helpful.
Where can I point newcomers on this topic?
Kind regards,
Job