URL Encoding Explained: Percent-Encoding in HTTP Requests

Learn why special characters break URLs, how percent-encoding works at the byte level, the critical difference between encodeURI and encodeURIComponent, and common encoding bugs in real applications.

URLs look simple from the outside — a string of text pointing to a resource. But under the hood, they follow a strict grammar that only allows a specific set of characters. The moment you try to pass a space, an ampersand, or a non-ASCII character in a URL without encoding it, things break in ways that can be difficult to debug. Percent-encoding (commonly called URL encoding) is the mechanism that makes arbitrary data safe to embed in a URL.

You can encode and decode URLs instantly with the BrowseryTools URL Encoder/Decoder — free, no sign-up, everything stays in your browser.

Why Special Characters Break URLs

The URL specification (RFC 3986) reserves certain characters for structural purposes. The ? separates the path from the query string. The & separates query parameters from each other. The # marks a fragment identifier. The / separates path segments. If your data contains any of these characters, a URL parser cannot tell the difference between your data and the URL structure itself.

Consider a search query for rock & roll. Naively constructing the URL gives:

/search?q=rock & roll
               ^ ^
               | └── looks like a new parameter begins here
               └── this & splits q from a phantom second parameter

The parser reads q=rock (with a trailing space) as the first parameter, then treats everything after the & as a second parameter named " roll" (with a leading space) and an empty value. Neither matches what was intended. The correct URL is /search?q=rock%20%26%20roll, where each space becomes %20 and the ampersand becomes %26.
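
The misparse is easy to reproduce. The following is a minimal sketch, assuming a runtime that provides the standard URLSearchParams API (a browser or a recent Node.js); the variable names are illustrative only.

```javascript
// Parse the unencoded query string: the raw & splits the value in two.
const naiveParams = new URLSearchParams("q=rock & roll");
const naiveEntries = [...naiveParams.entries()];
console.log(naiveEntries);
// → [ [ "q", "rock " ], [ " roll", "" ] ]

// Parse the percent-encoded version: the value survives intact.
const encodedParams = new URLSearchParams("q=rock%20%26%20roll");
console.log(encodedParams.get("q"));
// → "rock & roll"
```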

What Percent-Encoding Actually Does

Percent-encoding converts a byte to a three-character sequence: a literal percent sign followed by two hexadecimal digits representing the byte's value. RFC 3986 recommends uppercase digits when encoding, although decoders treat %2f and %2F as equivalent. The space character (ASCII byte 32, hex 0x20) becomes %20. The at-sign (@, ASCII 64, hex 0x40) becomes %40. The rule is:

percent-encode(byte) = "%" + byte.toString(16).toUpperCase().padStart(2, "0")

Examples:
  space  (0x20) → %20
  @      (0x40) → %40
  [      (0x5B) → %5B
  €      (UTF-8: 0xE2 0x82 0xAC) → %E2%82%AC

For multi-byte Unicode characters (anything outside ASCII), the character is first encoded to UTF-8 bytes, and then each byte is percent-encoded. The euro sign € is three UTF-8 bytes, so it becomes three percent-encoded sequences: %E2%82%AC.
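
The two steps (UTF-8 first, then one escape per byte) can be sketched as follows. This assumes the standard TextEncoder global is available; percentEncodeByte and percentEncodeAll are hypothetical helper names, and the second one deliberately encodes every byte, including characters that would normally be left alone, to make the mechanism visible.

```javascript
// Encode a single byte (0-255) as %XX with uppercase hex digits.
function percentEncodeByte(byte) {
  return "%" + byte.toString(16).toUpperCase().padStart(2, "0");
}

// Convert text to UTF-8 bytes, then percent-encode every byte.
// Real encoders skip unreserved characters; this sketch does not.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, percentEncodeByte).join("");
}

console.log(percentEncodeAll(" ")); // → "%20"
console.log(percentEncodeAll("@")); // → "%40"
console.log(percentEncodeAll("€")); // → "%E2%82%AC"
```

For characters that actually need escaping, the result agrees with the built-in encodeURIComponent, which performs the same UTF-8 step internally.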

Safe Characters vs Reserved Characters

Not every character needs encoding. RFC 3986 defines two sets that are safe to use as-is:

Unreserved characters — A–Z, a–z, 0–9, hyphen, underscore, period, tilde. These carry no special meaning and never need encoding.
Reserved characters — : / ? # [ ] @ ! $ & ' ( ) * + , ; =. These ARE safe in their structural positions but must be encoded when they appear as data values.

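Everything else must always be encoded: spaces, non-ASCII characters, control characters, and the remaining ASCII punctuation such as ", <, >, \, ^, `, {, |, and }. A literal % must also be encoded as %25, because it introduces every escape sequence.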
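
JavaScript's encodeURIComponent leaves five reserved characters unescaped (! ' ( ) *). When a value should contain nothing but unreserved characters and escapes, a common pattern is to post-process its output. The function name below is hypothetical, and this is a sketch rather than a complete RFC 3986 implementation.

```javascript
// Encode a data value so that only unreserved characters
// (A-Z a-z 0-9 - _ . ~) remain unescaped.
function encodeStrictComponent(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeStrictComponent("it's (fun)!*"));
// → "it%27s%20%28fun%29%21%2A"
console.log(encodeStrictComponent("A-z_0.9~"));
// → "A-z_0.9~"
```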

encodeURI vs encodeURIComponent: The Critical Difference

JavaScript ships with two built-in encoding functions, and confusing them is one of the most common URL-encoding bugs in web applications.

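encodeURI() is designed to encode a complete URL. It leaves the delimiters that give a URL its structure alone (: / ? # & = + @ and a few others), because encoding them would change what the URL means. One exception is worth knowing: it does encode [ and ], even though RFC 3986 lists them as reserved. You would use it if you have a complete URL that might contain spaces or Unicode but has a valid structure: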

encodeURI("https://example.com/search?q=hello world&lang=en")
// → "https://example.com/search?q=hello%20world&lang=en"
//   ✓ space encoded, but & and ? left intact

encodeURIComponent() is designed to encode a single value: a query parameter value, a path segment, anything that needs to be treated as pure data. It encodes reserved characters too, including &, =, ?, and /. Among the reserved set it leaves only ! ' ( ) and * untouched:

encodeURIComponent("rock & roll")
// → "rock%20%26%20roll"
//   ✓ & encoded — safe to use as a query parameter value

encodeURIComponent("https://example.com/page")
// → "https%3A%2F%2Fexample.com%2Fpage"
//   ✓ colons and slashes encoded — safe as a redirect_uri value

The rule of thumb: when constructing a URL, use encodeURIComponent() on each individual parameter value, never on the full URL. Use encodeURI() only on a complete, not-yet-encoded URL that you want to normalize; it encodes % as well, so running it over a URL that already contains escapes double-encodes them. In modern code, prefer the URL and URLSearchParams APIs over manual encoding, since they encode each component automatically as you set it.
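
As an illustration of that advice, here is a minimal sketch that builds a URL with the standard URL and URLSearchParams APIs and, separately, encodes a single path segment by hand. buildSearchUrl and the example.com addresses are placeholders, not part of any real service.

```javascript
// Build a URL from a base and a plain object of raw (unencoded) values.
function buildSearchUrl(base, params) {
  const url = new URL(base);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value); // encoding happens here
  }
  return url.toString();
}

const searchUrl = buildSearchUrl("https://example.com/search", {
  q: "rock & roll",
  lang: "en",
});
console.log(searchUrl);
// → "https://example.com/search?q=rock+%26+roll&lang=en"

// A path segment is a single value, so encodeURIComponent applies.
const fileUrl = "https://example.com/files/" + encodeURIComponent("a/b c.txt");
console.log(fileUrl);
// → "https://example.com/files/a%2Fb%20c.txt"
```

Note that searchParams writes spaces as +, which is the form-encoding convention rather than %20.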

Query String Encoding Pitfalls

Several subtle bugs come up repeatedly when encoding query strings. The + sign deserves special mention: in the application/x-www-form-urlencoded format (the format HTML forms submit in), a space is encoded as + rather than %20. This is a legacy convention that predates RFC 3986. If your frontend sends %20 and your backend decodes with form-encoding rules, it works fine. But if the frontend sends + and the backend decodes with RFC 3986 rules, the + stays a literal plus sign instead of becoming a space. The reverse also holds: a literal plus in your data must be sent as %2B, or a form-style decoder will turn it into a space.

// URLSearchParams uses application/x-www-form-urlencoded (+ for spaces)
new URLSearchParams({ q: "rock & roll" }).toString()
// → "q=rock+%26+roll"

// encodeURIComponent uses RFC 3986 (%20 for spaces)
"q=" + encodeURIComponent("rock & roll")
// → "q=rock%20%26%20roll"

// Both are valid — just be consistent on both ends
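
The mismatch shows up on the decoding side. The sketch below contrasts decodeURIComponent, which leaves + alone, with a form-style decoder; decodeFormComponent is a hypothetical helper written for this example.

```javascript
// decodeURIComponent follows percent-decoding only: + stays a plus sign.
const rfcDecoded = decodeURIComponent("rock+%26+roll");
console.log(rfcDecoded); // → "rock+&+roll"

// Form-style decoding: treat + as a space, then percent-decode.
function decodeFormComponent(value) {
  return decodeURIComponent(value.replace(/\+/g, "%20"));
}
console.log(decodeFormComponent("rock+%26+roll")); // → "rock & roll"

// A literal plus must travel as %2B, or form decoding turns it into a space.
const lostPlus = new URLSearchParams("q=1+1").get("q");
const keptPlus = new URLSearchParams("q=1%2B1").get("q");
console.log(lostPlus); // → "1 1"
console.log(keptPlus); // → "1+1"
```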

How Form Data Gets URL-Encoded

When an HTML form submits with method="GET", the browser serializes the form fields into a query string using application/x-www-form-urlencoded. Each field name and value is encoded (spaces as +, special characters as %XX), and fields are joined with &. For method="POST" forms without an enctype attribute, the same encoding is used but the data goes in the request body instead of the URL.

This is also the format fetch() uses when you pass a URLSearchParams object as the body, and it is what most server-side frameworks automatically decode when reading form submissions.
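
A rough sketch of the same serialization in script form follows. serializeForm and submitForm are hypothetical names, the endpoint is whatever your application uses, and submitForm is only defined here, not called.

```javascript
// Serialize fields the way a form would: name=value pairs joined with &.
function serializeForm(fields) {
  return new URLSearchParams(fields).toString();
}

// POST the same data as a request body. With a URLSearchParams body,
// fetch sets Content-Type to application/x-www-form-urlencoded itself.
function submitForm(endpoint, fields) {
  return fetch(endpoint, {
    method: "POST",
    body: new URLSearchParams(fields),
  });
}

const bodyText = serializeForm({ name: "Ada Lovelace", note: "a&b=c" });
console.log(bodyText);
// → "name=Ada+Lovelace&note=a%26b%3Dc"
```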

Base64 in URLs

Standard Base64 uses + and /, both of which have special meanings in URLs, and pads its output with =, which is also reserved. When Base64-encoded data needs to appear in a URL (a common pattern for tokens, image data, or cryptographic signatures), use the Base64URL variant instead. It replaces + with - and / with _, and the trailing = padding is usually dropped, producing a string that is safe in any URL position without further encoding. JWTs use this unpadded format for all three segments: header, payload, and signature.
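
The conversion is a character substitution on top of ordinary Base64. Here is a minimal sketch, assuming the btoa and atob globals (available in browsers and recent Node.js); the two function names are made up for this example, and padding is dropped on encode and restored on decode.

```javascript
// Encode bytes as unpadded Base64URL.
function bytesToBase64Url(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Decode Base64URL (padded or not) back to bytes.
function base64UrlToBytes(text) {
  const base64 = text.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, "=");
  return Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
}

console.log(bytesToBase64Url(new Uint8Array([0xfb, 0xff, 0xfe])));
// → "-__-"   (standard Base64 would give "+//+")
console.log(bytesToBase64Url(new Uint8Array([1, 2])));
// → "AQI"    (standard Base64 would give "AQI=")
```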

Real-World Encoding Bugs

A few bug patterns that come up in production applications:

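Double-encoding — encoding an already-encoded URL. %20 becomes %2520 because % itself gets encoded to %25. You cannot reliably tell from a string alone whether it has already been encoded (a value may legitimately contain %20 as data), so keep values raw and encode exactly once, at the point where the URL is assembled.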
Missing encodeURIComponent on redirect_uri — OAuth flows pass a redirect_uri as a query parameter. If it contains a ? or & and is not encoded, the auth server parses those characters as part of the outer URL structure, breaking the redirect.
Non-UTF-8 encoding — older systems or misconfigured servers sometimes percent-encode strings using ISO-8859-1 instead of UTF-8. The byte sequence for é differs between the two: %E9 in ISO-8859-1 versus %C3%A9 in UTF-8, and a strict UTF-8 decoder rejects the former. Always enforce UTF-8 consistently on both sides.
Logging raw URLs — logging a URL that contains encoded user data may produce misleading logs if your log viewer decodes percent-sequences automatically, hiding what was actually sent on the wire.
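
Three of these patterns can be reproduced in a few lines. This is a sketch with placeholder hosts (auth.example.com, app.example.com) and a hypothetical buildAuthorizeUrl helper; a real OAuth request carries more parameters than redirect_uri.

```javascript
// Double-encoding: the second pass encodes the % from the first.
const once = encodeURIComponent("hello world");
const twice = encodeURIComponent(once);
console.log(once);  // → "hello%20world"
console.log(twice); // → "hello%2520world"

// redirect_uri: let searchParams encode the nested URL as one value.
function buildAuthorizeUrl(authEndpoint, redirectUri) {
  const url = new URL(authEndpoint);
  url.searchParams.set("redirect_uri", redirectUri);
  return url.toString();
}
const redirectUri = "https://app.example.com/cb?next=/home&tab=1";
const authorizeUrl = buildAuthorizeUrl("https://auth.example.com/authorize", redirectUri);
const outerParams = new URL(authorizeUrl).searchParams;
console.log([...outerParams.keys()]); // → [ "redirect_uri" ]

// Non-UTF-8 input: the ISO-8859-1 escape for é is not valid UTF-8.
const utf8Form = encodeURIComponent("é"); // "%C3%A9"
let latin1Rejected = false;
try {
  decodeURIComponent("%E9");
} catch (err) {
  latin1Rejected = err instanceof URIError;
}
console.log(utf8Form, latin1Rejected); // → "%C3%A9" true
```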

Encode and Decode URLs Instantly

Whether you are debugging an OAuth redirect, constructing a query string by hand, inspecting a malformed API request, or just trying to understand what a percent-encoded URL actually contains — the BrowseryTools URL Encoder/Decoder handles it instantly. Paste your string, choose encode or decode, and see the result immediately. No server calls, no sign-up.
