On Fri, Jul 11, 2025 at 2:29 PM brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2025-07-11 at 20:57:03, Carlo Marcelo Arenas Belón wrote:
> > except that it would be incorrect, as language tags are defined in RFC5646
> > and are larger than that.
> >
> > most importantly, deriving language tags from locales provides some very
> > useful tags when including the characters after the _, because zh_CN and
> > zh_HK use completely different scripts, for example.
>
> Yes, that's true. You have some private use and some irregular tags and
> you also have some tags that include scripts or country codes.
>
> For instance, Swahili can be written in Latin or Arabic script. As I
> understand it, the Arabic script form is older and less common these
> days, so if I learned Swahili (which I would like to), then I might only
> learn the Latin script variant in a course. I would need to specify
> that script in the language code to be sure that I was presented with
> content in a form that I could read and understand. Similar concerns
> exist with the variants of Serbo-Croatian: some are written in Latin
> scripts, some in Cyrillic, and some in both, and it's not guaranteed
> that all speakers understand all forms.
>
> And then there's pt-PT and pt-BR, which are not always mutually
> intelligible. Most free software I've seen ships these as separate
> translations.
>
> I don't want to implement language tag parsing here since we don't need
> to do that. I would like to do the simple thing to prevent commonly
> used locales that don't represent actual language tags from being
> included and not overengineer this design

I think that your design of filtering C and POSIX accomplishes that,
even if it might seem like hardcoding those two values is a little dirty.
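
To make the idea concrete, here is a rough standalone sketch of that kind of filtering. It is not the code from the patch or from git itself: the helper name `locale_to_tag` is hypothetical, and treating "C.UTF-8" the same as "C" is an assumption on my part. It rejects C and POSIX, drops the codeset and modifier, and swaps `_` for `-`, without attempting any RFC 5646 parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not git's actual code: turn a POSIX locale name
 * such as "pt_BR.UTF-8" into a language-tag-like string such as "pt-BR".
 * Returns 1 and fills `out` on success; returns 0 when the locale names
 * no language ("C", "POSIX", empty) or the result does not fit in `out`.
 */
static int locale_to_tag(const char *locale, char *out, size_t outlen)
{
	size_t len, i;

	assert(out && outlen > 0);
	out[0] = '\0';

	if (!locale || !*locale)
		return 0;

	/* Keep only the part before the codeset (".UTF-8") or modifier ("@euro"). */
	len = strcspn(locale, ".@");

	/* "C" and "POSIX" are valid locales but do not represent a language. */
	if ((len == 1 && !strncmp(locale, "C", 1)) ||
	    (len == 5 && !strncmp(locale, "POSIX", 5)))
		return 0;

	if (len == 0 || len >= outlen)
		return 0;

	/* Locales use '_' where language tags use '-'; nothing else is validated. */
	for (i = 0; i < len; i++)
		out[i] = locale[i] == '_' ? '-' : locale[i];
	out[len] = '\0';
	return 1;
}
```

With this sketch, "zh_HK" becomes "zh-HK" and "pt_BR.UTF-8" becomes "pt-BR", while "C" and "POSIX" are skipped. Note that dropping the modifier also discards script hints such as "@latin", which a more careful conversion might want to keep.
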

Moving the logic (including the filtering, which already happens in the
`!NO_GETTEXT` code path) adds several chances to modernize and clean up the
code, though, which will be beneficial (e.g., using a strvec or even a
hashtable to process the candidates, and improving validation and tests).

Carlo

CC Yi EungJun at a hopefully working email address, with a link to the thread:
https://lore.kernel.org/git/20250710221641.857081-1-sandals@xxxxxxxxxxxxxxxxxxxx/

> --
> brian m. carlson (they/them)
> Toronto, Ontario, CA