Re: (RFC 3986) Clarification around interpreting literal plus in URLs (U+002B)

Dear Julian (and Lloyd),

Thanks for your replies. So it sounds like, on the server side, I should know whether my server will be called via an HTML form submission or via directly generated URIs. Conversely, when I am the one generating the URI, I should know which of the two the receiving server expects.

So it seems that, for practical purposes, the best approach is trial and error, or ideally to encode spaces as `%20` to be most compatible.
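
To illustrate why `%20` is the safer choice, here is a minimal sketch using Python's standard `urllib.parse` (the language is chosen only for illustration, and the helper names `decode_as_uri` and `decode_as_form` are hypothetical). It decodes the same query value under both interpretations discussed in this thread:

```python
from urllib.parse import unquote, unquote_plus


def decode_as_uri(component: str) -> str:
    """Plain percent-decoding: a literal "+" is left untouched."""
    return unquote(component)


def decode_as_form(component: str) -> str:
    """Form-style (application/x-www-form-urlencoded) decoding:
    "+" becomes a space, then percent-escapes are decoded."""
    return unquote_plus(component)


if __name__ == "__main__":
    for raw in ("A+B", "A%20B", "A%2BB"):
        print(f"{raw}: uri={decode_as_uri(raw)!r} form={decode_as_form(raw)!r}")
```

Only the bare `+` is ambiguous: `A%20B` yields "A B" and `A%2BB` yields "A+B" under both decoders, whereas `A+B` yields "A+B" under plain percent-decoding and "A B" under form decoding.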

Regards,

Raghu Saxena

On 3/15/25 2:24 PM, Julian Reschke wrote:
On 14.03.2025 11:30, Raghu Saxena wrote:
Dear IETF Community,

I was writing to seek clarification around URIs & HTTP, specifically how
to handle "plus" ("+") symbols in them.

For instance, if my server receives a request for `GET /page?key=A+B` ,
where the bytes over the wire are literally [0x412B42], should I
interpret it as literally that byte sequence, or decode the "+" to a
space, thereby interpreting the bytes as [0x412042]?

It depends.

From my reading of RFC 3986, it seems that "+" is a reserved character,
but it's not clear how it is to be interpreted / decoded.

From the pov of the URI specification, "+" is not special and does not
need to be encoded (ABNF: query -> pchar -> sub-delims).

The question behind the question is how should spaces be encoded to be
URL safe; it seems "%20" is the recommended approach, however some
languages (such as Golang[0]) implement query-escaping where the spaces
(0x20) are replaced by a literal "+" (0x2B). This causes problems with
some libraries, which treat this as a literal "plus" (0x2B) and then
return unexpected results.

What you observe is a layering issue.

Typically, query parameters are generated by HTML form submissions, and
those use their own encoding on top of what is needed in URIs. This
format indeed maps "+" to a space. So a library should distinguish
between encoding into URIs, and encoding into query parameters used in
HTML forms (FWIW, that encoding is also used in POST payloads).

See https://url.spec.whatwg.org/#application/x-www-form-urlencoded.
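
The two encoding layers can be sketched on the generating side as follows. This is only an illustration using Python's standard `urllib.parse`, not something taken from the thread; the helper name `encode_query` and the sample parameters are hypothetical:

```python
from urllib.parse import quote, quote_plus, urlencode


def encode_query(params: dict, form_style: bool) -> str:
    """Serialize params as a query string.

    form_style=True  -> HTML form encoding: space becomes "+"
    form_style=False -> plain percent-encoding: space becomes "%20"
    In both cases a literal "+" in the data is escaped as "%2B".
    """
    quote_via = quote_plus if form_style else quote
    return urlencode(params, quote_via=quote_via)


if __name__ == "__main__":
    params = {"key": "A B", "op": "1+1"}
    print(encode_query(params, form_style=True))   # key=A+B&op=1%2B1
    print(encode_query(params, form_style=False))  # key=A%20B&op=1%2B1
```

In this sketch the two styles differ only in how a space is written; a plus sign that is part of the data is sent as `%2B` either way, so it survives whichever decoding the receiver applies.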

Best regards, Julian


