On 14.03.2025 11:30, Raghu Saxena wrote:
Dear IETF Community, I was writing to seek clarification around URIs & HTTP, specifically how to handle "plus" ("+") symbols in them. For instance, if my server receives a request for `GET /page?key=A+B` , where the bytes over the wire are literally [0x412B42], should I interpret it as literally that byte sequence, or decode the "+" to a space, thereby interpreting the bytes as [0x412042]?
It depends.
From my reading of RFC3986, it seems that "+" is a reserved character, but it's not clear how it is to be interpreted / decoded.
From the point of view of the URI specification, "+" has no special meaning in the query component and does not need to be percent-encoded (ABNF: query -> pchar -> sub-delims).
The question behind the question is how spaces should be encoded to be URL-safe; it seems "%20" is the recommended approach, however some languages (such as Golang[0]) implement query-escaping where spaces (0x20) are replaced by a literal "+" (0x2B). This causes problems with some libraries, which then treat this as a literal "plus" (0x2B) and return unexpected results.
What you observe is a layering issue. Typically, query parameters are generated by HTML form submissions, and those use their own encoding on top of what is needed in URIs. This format indeed maps "+" to a space. So a library should distinguish between encoding into URIs and encoding into query parameters used in HTML forms (FWIW, that encoding is also used in POST payloads). See <https://url.spec.whatwg.org/#application/x-www-form-urlencoded>.

Best regards, Julian
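
To make the layering distinction above concrete, here is a minimal sketch using Python's standard urllib.parse module. Python is chosen only for illustration (it is not mentioned in the thread), and the sample values "A B" and "key=A+B" are taken from or modelled on the question. It is not a statement of how any particular server or library must behave.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, parse_qs

# URI-level percent-encoding: a space becomes %20.
uri_encoded = quote("A B")

# Form-level encoding (application/x-www-form-urlencoded): a space becomes "+".
form_encoded = quote_plus("A B")

# A literal plus sign has to be percent-encoded to survive form decoding.
literal_plus = quote_plus("A+B")

# URI-level decoding leaves "+" untouched; form-level decoding maps it to a space.
uri_decoded = unquote("A+B")
form_decoded = unquote_plus("A+B")

# parse_qs applies the form-level rules to a whole query string.
params = parse_qs("key=A+B")
escaped_params = parse_qs("key=A%2BB")

print(uri_encoded, form_encoded, literal_plus)
print(repr(uri_decoded), repr(form_decoded))
print(params, escaped_params)
```

Under these assumptions, a receiver that knows the query string came from a form-style encoder would use the form-level decoder, while one that treats the query as an opaque URI component would leave "+" as is. A sender that needs an unambiguous literal plus can send "%2B".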