Re: [PATCH 1/1] http: don't send C or POSIX in Accept-Language

On 7/10/25 6:16 PM, brian m. carlson wrote:
> The LANGUAGE environment variable is not specified by POSIX, but a
> variety of programs using GNU gettext accept it.  The Linux manpages
> state that it can contain a colon-separated list of locales.
> 
> However, not all locales are valid as languages.  The C and POSIX
> locales, for instance, are not languages and are not registered with
> IANA, nor are they a part of ISO 639.  In fact, "C" is too short to
> match the ABNF production for a language, which must be at least two
> characters in length.
> 
> Nonetheless, many users provide these values in the LANGUAGE environment
> variable for unknown reasons and if they do, we do not want to send a
> malformed Accept-Language header to the server.  If there are no other
> valid language tags, then send no header; otherwise, send only the valid
> tags, ignoring "C" and "POSIX" wherever they may appear, as well as any
> variants (such as the "C.UTF-8" locale found on some Linux systems).


Better docs -- the gettext manpages suck:
https://www.gnu.org/software/gettext/manual/html_node/Locale-Names.html
https://www.gnu.org/software/gettext/manual/html_node/The-LANGUAGE-variable.html


At minimum, this commit message needs revising: gettext was adopted into
POSIX 2024 (Issue 8), so LANGUAGE is now specified by POSIX.


LANGUAGE is respected by the gettext utility and functions, of course:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/gettext.html#tag_20_54_08
https://pubs.opengroup.org/onlinepubs/9799919799/functions/gettext.html

$LANGUAGE docs can be found at

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02


"""
The value of LANGUAGE shall be a list of locale names separated by a
<colon> (':') character. If LANGUAGE is set to a non-empty string, each
locale name shall be tried in the specified order and if a messages
object is found, it shall be used for translation. If a locale name has
the format language[_territory][.codeset][@modifier], additional
searches of locale names without .codeset (if present), without
_territory (if present), and without @modifier (if present) may be performed
"""

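To make the quoted rule concrete, here is a minimal sketch in Python of how a
LANGUAGE value expands into a search list. The helper names are hypothetical
(not taken from gettext or git), and the order in which .codeset, _territory,
and @modifier are dropped is an assumption: the quoted text only says such
additional searches "may be performed".

```python
import re

# Hypothetical helpers for illustration; not taken from gettext or git.
_LOCALE_RE = re.compile(
    r"(?P<language>[^_.@/]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?"
)


def _join(lang, terr=None, code=None, mod=None):
    """Reassemble language[_territory][.codeset][@modifier]."""
    return (
        lang
        + (f"_{terr}" if terr else "")
        + (f".{code}" if code else "")
        + (f"@{mod}" if mod else "")
    )


def language_candidates(value):
    """Return the locale names to try, in order, for a LANGUAGE value."""
    candidates = []
    for name in value.split(":"):
        if not name:
            continue  # skip empty entries such as "de::fr"
        m = _LOCALE_RE.fullmatch(name)
        if m is None:
            variants = [name]  # not in the quoted format; try it verbatim
        else:
            lang, terr, code, mod = m.group(
                "language", "territory", "codeset", "modifier"
            )
            # Assumed fallback order: drop .codeset, then _territory,
            # then @modifier.
            variants = [
                _join(lang, terr, code, mod),
                _join(lang, terr, None, mod),
                _join(lang, None, None, mod),
                _join(lang),
            ]
        for v in variants:
            if v not in candidates:
                candidates.append(v)
    return candidates
```

With this sketch, "de_AT.UTF-8@euro:fr" expands to de_AT.UTF-8@euro,
de_AT@euro, de@euro, de, fr, in that order.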

And, for locale name values,

"""
If the locale value is "C" or "POSIX", the POSIX locale shall be used
and the standard utilities behave in accordance with the rules in 7.2
POSIX Locale for the associated category.

If the locale value begins with a <slash>, it shall be interpreted as
the pathname of a file that was created in the output format used by the
localedef utility; see OUTPUT FILES under localedef. Referencing such a
pathname shall result in that locale being used for the indicated category.

[XSI] [Option Start] If the locale value has the form:

language[_territory][.codeset]

it refers to an implementation-provided locale, where settings of
language, territory, and codeset are implementation-defined.

LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
LC_TIME are defined to accept an additional field @modifier, which
allows the user to select a specific instance of localization data
within a single category (for example, for selecting the dictionary as
opposed to the character ordering of data). The syntax for these
environment variables is thus defined as:

[language[_territory][.codeset][@modifier]]
"""

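The three cases in the quoted text (the POSIX locale, a pathname, and an
implementation-provided name) can be told apart with a few string
operations. The following is only a sketch: the LocaleName type and
classify_locale function are hypothetical names, and no validation of the
individual fields is attempted.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Hypothetical container for a classified locale value."""
    kind: str  # "posix", "path", or "named"
    language: Optional[str] = None
    territory: Optional[str] = None
    codeset: Optional[str] = None
    modifier: Optional[str] = None


def classify_locale(value):
    """Classify a locale value according to the three quoted cases."""
    if value in ("C", "POSIX"):
        return LocaleName("posix")
    if value.startswith("/"):
        return LocaleName("path")  # pathname of a localedef output file
    # language[_territory][.codeset][@modifier], peeled off right to left
    rest, _, modifier = value.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return LocaleName(
        "named",
        language or None,
        territory or None,
        codeset or None,
        modifier or None,
    )
```

Note that under this reading "C.UTF-8" is not the POSIX locale by exact
match; it parses as a named locale whose language field is "C" and whose
codeset is "UTF-8", so a consumer has to check the language field as well.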

Your tests and code are probably broken: they appear to normalize almost
none of the standard grammar into valid Accept-Language entries. Of
course, "surely nobody actually does that" (except when they do!), but
the grammar is relatively simple, so getting the "shape" correct seems
like a good idea.

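For illustration, getting the "shape" right could look roughly like the
Python sketch below. It is not the patch's code (git is written in C) and
the function name is hypothetical. Assumptions: .codeset and @modifier are
simply dropped, pathnames are skipped, the language check is a simplified
reading of the RFC 5646 primary language subtag (2 to 8 letters), the
territory is kept only if it looks like a region subtag, and q-values are
omitted.

```python
import re

# Simplified checks, assumed for illustration only.
_LANGUAGE = re.compile(r"[A-Za-z]{2,8}")
_REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")


def accept_language(value):
    """Turn a LANGUAGE value into an Accept-Language value, or None."""
    tags = []
    for name in value.split(":"):
        if not name or name.startswith("/"):
            continue  # empty entry or pathname: nothing to send
        # Drop @modifier first, then .codeset.
        name = name.split("@", 1)[0].split(".", 1)[0]
        language, _, territory = name.partition("_")
        if language in ("C", "POSIX"):
            continue  # locales, not languages (covers "C.UTF-8" too)
        if not _LANGUAGE.fullmatch(language):
            continue
        tag = language.lower()
        if territory and _REGION.fullmatch(territory):
            tag += "-" + territory.upper()
        if tag not in tags:
            tags.append(tag)
    # No valid tags left means no header should be sent at all.
    return ", ".join(tags) if tags else None
```

Under these assumptions, "C.UTF-8:de_AT.UTF-8@euro:POSIX:fr" becomes
"de-AT, fr", and "C:POSIX" yields no header. A fuller version might map
modifiers such as @latin to a script subtag instead of discarding them.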

-- 
Eli Schwartz


