On 25/07/10 10:16PM, brian m. carlson wrote:
> The LANGUAGE environment variable is not specified by POSIX, but a
> variety of programs using GNU gettext accept it.  The Linux manpages
> state that it can contain a colon-separated list of locales.
>
> However, not all locales are valid as languages.  The C and POSIX
> locales, for instance, are not languages and are not registered with
> IANA, nor are they a part of ISO 639.  In fact, "C" is too short to
> match the ABNF production for a language, which must be at least two
> characters in length.
>
> Nonetheless, many users provide these values in the LANGUAGE environment
> variable for unknown reasons and if they do, we do not want to send a
> malformed Accept-Language header to the server.  If there are no other
> valid language tags, then send no header; otherwise, send only the valid
> tags, ignoring "C" and "POSIX" wherever they may appear, as well as any
> variants (such as the "C.UTF-8" locale found on some Linux systems).

Ok so the languages returned by `get_preferred_languages()` are used to
write the Accept-Language header when making requests. Looking at
`get_preferred_languages()` when NO_GETTEXT is defined, we already
filter out "C" and "POSIX". So doing this for the LANGUAGE environment
variable when writing the header also makes sense.

> We do not reject all possible invalid language tags since doing so
> would require bundling a copy of the IANA database and would risk poor
> behavior in the face of uncommon languages or values that are not
> registered but meet the production for private use or other restricted
> interchange.  However, these two values are widely used in the LANGUAGE
> header, are well-known and widely used non-language locales, and have
> been seen in the wild on the server side.
>
> Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
> ---
>  http.c                     |  8 ++++++++
>  t/t5541-http-push-smart.sh | 18 ++++++++++++++++++
>  2 files changed, 26 insertions(+)
>
> diff --git a/http.c b/http.c
> index d88e79fbde..a96df4fcdb 100644
> --- a/http.c
> +++ b/http.c
> @@ -2022,6 +2022,14 @@ static void write_accept_language(struct strbuf *buf)
>  			s++;
>  
>  		if (tag.len) {
> +			/*
> +			 * These are not valid languages: do not send them to
> +			 * the server.
> +			 */
> +			if (!strcmp(tag.buf, "C") || !strcmp(tag.buf, "POSIX")) {
> +				strbuf_reset(&tag);
> +				continue;
> +			}
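
For illustration, the filtering described in the commit message could be
sketched as a standalone function like the one below. This is not the
code in http.c: `filter_language_tags()` and its fixed-buffer interface
are hypothetical, and the sketch assumes that the codeset and modifier
parts (".UTF-8", "@euro") are dropped before the comparison, which is
what would let "C.UTF-8" be caught by a plain strcmp() against "C".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of http.c: copy the usable language tags
 * from a colon-separated LANGUAGE value into out, separated by ", ".
 * Returns the number of tags kept; out is "" when nothing is kept.
 */
static int filter_language_tags(const char *language, char *out, size_t outsz)
{
	const char *s = language;
	size_t used = 0;
	int kept = 0;

	if (!outsz)
		return 0;
	out[0] = '\0';

	while (*s) {
		char tag[64];
		size_t len = 0;

		/*
		 * Collect the tag up to ".codeset", "@modifier" or ":",
		 * mapping "_" to "-". Overlong tags are truncated.
		 */
		while (*s && *s != ':' && *s != '.' && *s != '@') {
			if (len + 1 < sizeof(tag))
				tag[len++] = (*s == '_') ? '-' : *s;
			s++;
		}
		tag[len] = '\0';

		/* Skip the rest of this entry and its separator. */
		while (*s && *s != ':')
			s++;
		if (*s == ':')
			s++;

		/* "C" and "POSIX" are locales, not languages: drop them. */
		if (!len || !strcmp(tag, "C") || !strcmp(tag, "POSIX"))
			continue;

		/* Room for an optional ", ", the tag and the NUL. */
		if (used + (kept ? 2 : 0) + len + 1 > outsz)
			break;
		if (kept) {
			memcpy(out + used, ", ", 2);
			used += 2;
		}
		memcpy(out + used, tag, len);
		used += len;
		out[used] = '\0';
		kept++;
	}
	return kept;
}
```

With LANGUAGE set to "C.UTF-8:en_US.UTF-8:POSIX:de" the sketch keeps
"en-US" and "de"; with "C:POSIX" it keeps nothing, which corresponds to
the "send no header" case in the commit message.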