What Is Base64 Encoding and When Should You Use It

If you work with APIs, email systems, or web development, you have encountered Base64 even if you did not recognise it. Those long strings of letters and numbers that look like gibberish at the start of an email attachment, a data: URL in CSS, or the middle segment of a JWT token? That is Base64. It is one of the oldest and most quietly load-bearing pieces of internet plumbing, and almost every piece of software you use leans on it somewhere.

A short history of Base64

Base64 is part of a family called "radix-64" or "printable encodings," whose job is to represent arbitrary bytes using only the small alphabet of characters that a text-based system is guaranteed to pass through unchanged. The earliest widely-used member is uuencode, written by Mary Ann Horton at UC Berkeley around 1980 to ship binary files over Usenet and email when those systems would corrupt anything above 7-bit ASCII.

The Base64 alphabet itself was first standardised in RFC 989 (1987) for Privacy-Enhanced Mail (PEM), an early attempt at signed and encrypted email. PEM died, but its encoding scheme survived and was canonised in RFC 1421 (1993) and then in the MIME specification (RFC 1521 and 1522 in 1993, revised to RFCs 2045-2049 in 1996). MIME made Base64 the default way to attach binary files to email, and from there the encoding spread to nearly every text-only transport on the internet.

The IETF first consolidated the scattered Base64 definitions in RFC 3548 (2003), which was superseded in 2006 by RFC 4648, a single document defining Base64, Base32 and Base16. Section 5 of RFC 4648 defines the URL-safe variant, which swaps the two non-URL-friendly characters (+ and /) for - and _. JSON Web Tokens (RFC 7519, 2015) standardised on URL-safe Base64 with the padding stripped. Today, email attachments, PEM-encoded certificates, data: URLs, and JWTs all depend on Base64.

How Base64 works: the math

Base64 takes three input bytes (24 bits) and rewrites them as four output characters (6 bits each), using a 64-symbol alphabet. The mapping is fixed:

| Index range | Characters |
| --- | --- |
| 0-25 | A-Z |
| 26-51 | a-z |
| 52-61 | 0-9 |
| 62 | + (standard) or - (URL-safe) |
| 63 | / (standard) or _ (URL-safe) |

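So Hello (5 bytes) becomes SGVsbG8=: the first three bytes, Hel, map to SGVs, and the remaining two bytes, lo, map to bG8 plus one = of padding.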
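The grouping can be traced by hand. The following is a minimal sketch, not a production encoder: `ALPHABET` and `encode_by_hand` are names invented for this illustration, and the result is checked against Python's standard `base64` module.

```python
import base64

# The standard alphabet from the table above: index 0 is "A", index 63 is "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_by_hand(data: bytes) -> str:
    """Encode bytes as standard Base64, one 3-byte group at a time."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        missing = 3 - len(group)  # 0, 1 or 2 bytes short in the last group
        # Pack the group into 24 bits, zero-filling any missing bytes.
        bits = int.from_bytes(group + b"\x00" * missing, "big")
        # Slice the 24 bits into four 6-bit indexes, most significant first.
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that hold only filler bits are replaced by "=".
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)

print(encode_by_hand(b"Hello"))  # SGVsbG8=
assert encode_by_hand(b"Hello") == base64.b64encode(b"Hello").decode("ascii")
```

In real code you would call `base64.b64encode` directly; the hand-written loop only makes the 24-bit grouping visible.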

The padded output is always a multiple of 4 characters. If the input length modulo 3 is 1, you get two = padding characters; if it is 2, you get one =; if it is 0, no padding. Padding is sometimes stripped (notably in JWTs and in URL fragments), and decoders for those formats must accept unpadded input; stricter decoders need the = characters restored first.
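A short sketch of the padding rule, using Python's standard `base64` module. The helper name `restore_padding` is an assumption made for this example; it re-adds the = characters that a sender may have stripped.

```python
import base64

def restore_padding(text: str) -> str:
    """Append the "=" characters needed to reach a multiple of 4."""
    return text + "=" * (-len(text) % 4)

for raw in (b"a", b"ab", b"abc"):
    encoded = base64.b64encode(raw).decode("ascii")
    stripped = encoded.rstrip("=")
    # Python's b64decode raises an error on missing padding,
    # so the padding is restored before decoding.
    assert base64.b64decode(restore_padding(stripped)) == raw
    print(len(raw) % 3, encoded)

# Prints:
# 1 YQ==
# 2 YWI=
# 0 YWJj
```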

The 33 % size overhead comes from this 3-to-4 expansion: every 3 bytes of input become 4 characters of output, an increase of one third. There is no way to reduce it without changing the alphabet (Base85 / Ascii85 expands by only 25 % using 85 printable characters, at the cost of a more complex encoder).
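The arithmetic is easy to check. This sketch assumes padded output and no line breaks; `encoded_length` is a hypothetical helper name, and each prediction is compared with what Python's `base64` module actually produces.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    # Every started 3-byte group costs 4 output characters.
    return 4 * ((n_bytes + 2) // 3)

for n in (1, 3, 1_000, 1_000_000):
    # bytes(n) is n zero bytes; the content does not affect the length.
    assert encoded_length(n) == len(base64.b64encode(bytes(n)))
    print(n, encoded_length(n), f"{encoded_length(n) / n - 1:.1%}")
```

For very short inputs the padding dominates (1 byte becomes 4 characters), and the ratio settles at one third as the input grows.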

Common use cases

Email attachments. SMTP, the protocol that carries virtually all email between servers, was designed in 1982 (RFC 821) for 7-bit ASCII. Binary attachments (an image, a PDF, a ZIP) are therefore typically Base64-encoded by your mail client before transmission and decoded by the recipient's. The MIME headers in an email tell the recipient which parts are Base64 and which are plain text.

Data URLs in HTML and CSS. A data:image/png;base64,iVBORw0KGgo... URL embeds a binary file directly in the document. Useful for small icons under 1-2 KB where the saved HTTP request outweighs the 33 % size overhead and the loss of caching.
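As an illustration, a data URL can be assembled in a few lines. The function name `to_data_url` is an assumption for this sketch, and the 8-byte PNG file signature stands in for a real image.

```python
import base64

def to_data_url(data: bytes, mime_type: str) -> str:
    """Build a data: URL that embeds the bytes as standard Base64."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# The 8-byte signature that opens every PNG file, used here as stand-in data.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature, "image/png"))
# data:image/png;base64,iVBORw0KGgo=
```

This is why Base64-encoded PNGs are recognisable at a glance: they all start with iVBORw0KGgo.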

API payloads. When a JSON or XML API needs to accept a binary value (a file upload, a signature, a profile picture), the standard pattern is to Base64-encode the bytes and ship them as a string field. The receiver decodes them on the server side. This is how OpenAI's API accepts image input (as a Base64 data URL) and how many cloud functions accept binary input.
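A minimal round trip might look like the sketch below. The field names `filename` and `content_base64` and both function names are hypothetical, chosen for this example rather than taken from any particular API.

```python
import base64
import json

def build_request(filename: str, content: bytes) -> str:
    """Client side: wrap binary content in a JSON body."""
    payload = {
        "filename": filename,
        "content_base64": base64.b64encode(content).decode("ascii"),
    }
    return json.dumps(payload)

def read_request(body: str) -> tuple[str, bytes]:
    """Server side: recover the original bytes from the JSON body."""
    payload = json.loads(body)
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently skipping them.
    content = base64.b64decode(payload["content_base64"], validate=True)
    return payload["filename"], content

body = build_request("pixel.bin", b"\x00\xff\x10")
print(body)
assert read_request(body) == ("pixel.bin", b"\x00\xff\x10")
```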

HTTP Basic Authentication. The Authorization: Basic <token> header carries a Base64-encoded username:password pair (RFC 7617). This is encoding, not encryption: anyone who sees the header can recover the password. Basic Auth should therefore only be used over HTTPS.
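The sketch below builds and reads such a header, which also shows how little protection the encoding offers. The function names are assumptions for this example; the credentials are the sample pair used in RFC 7617.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic Auth."""
    pair = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")

def read_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover the username and password; no key or secret is needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Auth header")
    # Split on the first colon only: the password may itself contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password

header = basic_auth_header("Aladdin", "open sesame")
print(header)                   # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(read_basic_auth(header))  # ('Aladdin', 'open sesame')
```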

Certificates and keys. PEM files (-----BEGIN CERTIFICATE----- ... -----END CERTIFICATE-----) wrap a Base64-encoded blob, usually DER-encoded ASN.1 bytes. Most TLS certificates, SSH private key files, and code-signing certificates stored as text are Base64 inside a PEM-style envelope.
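The envelope itself is simple enough to sketch. The helpers `der_to_pem` and `pem_to_der` are names made up for this illustration, the 64-character line width follows common PEM practice, and the input bytes are placeholder data, not a real certificate.

```python
import base64

def der_to_pem(der: bytes, label: str = "CERTIFICATE") -> str:
    """Wrap binary data in a PEM envelope with 64-character lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    body = "\n".join(lines)
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"

def pem_to_der(pem: str) -> bytes:
    """Strip the BEGIN/END lines and decode the Base64 body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    return base64.b64decode(body)

placeholder = bytes(range(100))  # stand-in bytes, not valid DER
pem = der_to_pem(placeholder)
print(pem)
assert pem_to_der(pem) == placeholder
```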

JWT tokens. A JWT is three URL-safe-Base64 segments separated by dots: <header>.<payload>.<signature>. The Base64 encoding lets a JWT travel safely in headers, URLs, and cookies.
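To see what is inside a token, the first two segments can be decoded directly. This sketch deliberately does not verify the signature, so it is suitable for inspection only; the helper names and the demo token (with a fake signature segment) are assumptions for this example.

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 with the trailing padding stripped, as JWTs use it."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def peek_jwt(token: str) -> tuple[dict, dict]:
    """Decode header and payload WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))

def compact_json(value: dict) -> bytes:
    """Serialise without spaces so the encoded form is predictable."""
    return json.dumps(value, separators=(",", ":")).encode("utf-8")

header = b64url_encode(compact_json({"alg": "HS256", "typ": "JWT"}))
payload = b64url_encode(compact_json({"sub": "1234567890"}))
token = f"{header}.{payload}.not-a-real-signature"

print(header)           # eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
print(peek_jwt(token))
```

Because the payload is only encoded, never put anything in a JWT that the holder of the token should not be able to read.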

How to encode and decode

  1. Choose encode or decode: select the direction of conversion.
  2. Paste text or upload a file: enter text directly or drag and drop a file (up to 5 MB for browser-side encoding).
  3. Pick the variant: standard Base64 for email and certificates, URL-safe for JWT and URL fragments. The tool defaults to standard.
  4. Copy the result: the output updates instantly. Copy it to your clipboard, or use the download button for long outputs.
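If you would rather script the same choices, a rough Python equivalent follows. This is a sketch, not the tool's implementation: `convert`, `direction`, and `url_safe` are names invented here, mirroring the direction and variant choices in the steps above.

```python
import base64

def convert(data: bytes, direction: str = "encode", url_safe: bool = False) -> bytes:
    """Encode or decode bytes using standard or URL-safe Base64."""
    if direction == "encode":
        return base64.urlsafe_b64encode(data) if url_safe else base64.b64encode(data)
    if direction == "decode":
        return base64.urlsafe_b64decode(data) if url_safe else base64.b64decode(data)
    raise ValueError("direction must be 'encode' or 'decode'")

print(convert(b"Hello"))               # b'SGVsbG8='
print(convert(b"SGVsbG8=", "decode"))  # b'Hello'
```

For a file, read it in binary mode first (for example with `pathlib.Path(...).read_bytes()`) and pass the bytes in.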

Variants of Base64

Several Base64-like encodings exist for specific situations:

| Variant | Differences | Where it is used |
| --- | --- | --- |
| Standard (RFC 4648 §4) | A-Z, a-z, 0-9, +, /, = padding | Email (MIME), PEM, generic binary-to-text |
| URL-safe (RFC 4648 §5) | + becomes -, / becomes _ | JWT, URL fragments, filenames |
| MIME (RFC 2045) | Line breaks every 76 chars | Email body, mail headers (with =?utf-8?B?...?=) |
| crypt(3) / htpasswd | Different alphabet (./0-9A-Za-z) | Old Unix password hashes (DES-based) |
| Base64Url no-padding | URL-safe without trailing = | JWT (per RFC 7515) |
| Base32 (RFC 4648 §6) | 32-char alphabet, case-insensitive | TOTP secrets, Onion addresses |
| Base58 | 58-char alphabet (no 0, O, I, l) | Bitcoin addresses, IPFS CIDs |
| Ascii85 / Base85 | 85-char alphabet, 25 % overhead | PDF, PostScript |

Most of the time you want either standard or URL-safe Base64. The others come up in specific protocols.
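The differences between the common variants are easiest to see side by side. The sketch below uses only Python's standard `base64` module; the two input bytes are chosen so that both special characters show up in the output.

```python
import base64

data = b"\xfb\xff"  # encodes to indexes 62 and 63, the two variant-specific symbols

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
no_padding = url_safe.rstrip("=")
base32 = base64.b32encode(data).decode("ascii")

print(standard)    # +/8=
print(url_safe)    # -_8=
print(no_padding)  # -_8
print(base32)

# MIME flavour: encodebytes ends a line after every 76 output characters.
mime_lines = base64.encodebytes(bytes(100)).decode("ascii").splitlines()
print([len(line) for line in mime_lines])  # [76, 60]
```

Decoding with the wrong variant is a common source of errors: a standard decoder will not accept - and _ as valid symbols, so it helps to know which variant a protocol specifies before decoding.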

When to use Base64

Use it when:

Do not use it when:

Common pitfalls

Alternatives and adjacent encodings

Base64 is the default, not the only option. The right choice depends on the channel and the size budget.

| Encoding | Overhead | Strength | Best for |
| --- | --- | --- | --- |
| Hex (Base16) | 100 % | Trivial to read, every byte is two chars | Debug output, short identifiers, color codes |
| Base32 (RFC 4648) | 60 % | Case-insensitive, no look-alike characters | TOTP secrets, Onion addresses, voice dictation |
| Base64 standard | 33 % | Universal, every language has it | Email, PEM, generic transport |
| Base64 URL-safe | 33 % | URL- and filename-safe | JWT, URL fragments |
| Base58 | ~37 % | No 0/O/I/l confusion, no special chars | Bitcoin addresses, IPFS CIDs |
| Ascii85 / Base85 | 25 % | Denser than Base64 | PDF, PostScript |
| Base91 | ~22 % | Even denser, more complex | Rare, niche compression contexts |
| Multipart upload | 0 % | Native binary transport over HTTP | File uploads (browsers do this for you) |
| gzip + Base64 | varies | Sometimes smaller than raw Base64 | Pre-compressed payloads |

For most everyday work, the answer is Base64 (standard or URL-safe). For binary file uploads over HTTP, the right answer is usually multipart/form-data, which does not encode at all.
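The overhead figures for the encodings that ship with Python can be reproduced with a short measurement. This is a sketch under one assumption: the 60-byte sample is chosen because it divides evenly into every encoding's group size, so no padding distorts the ratios.

```python
import base64
import binascii

data = bytes(range(1, 61))  # 60 bytes: a multiple of 3, 4 and 5

encoders = {
    "Hex (Base16)": binascii.hexlify,
    "Base32": base64.b32encode,
    "Base64": base64.b64encode,
    "Ascii85": base64.a85encode,
}

# Overhead = encoded size relative to the raw size, minus one.
overhead = {name: len(encode(data)) / len(data) - 1 for name, encode in encoders.items()}

for name, value in overhead.items():
    print(f"{name:<14}{value:.0%}")
```

Base58 and Base91 are not in the standard library, so they are left out of this measurement.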

Privacy and the encoder

The Base64 encoder and decoder run entirely in your browser. The text or file you input is processed by JavaScript on your device, the result is rendered to the page, and nothing is sent to a server. Nothing is logged, nothing is stored after you navigate away, and no analytics tag sees the content. For things you might Base64-encode (PEM certificates, private keys, JWT payloads from production systems, draft API requests with real customer data), that local-only flow is the right default. The whole tool can run offline once the page is loaded, which you can verify by switching off your network and re-encoding the same input.

Frequently Asked Questions

Does Base64 encryption protect my data?

No. Base64 is encoding, not encryption. Anyone can decode a Base64 string in milliseconds, so it provides zero security. If you need to protect data, use actual encryption (AES, RSA, or higher-level tools like GnuPG and age).

Why does Base64 make files larger?

Base64 encoding increases data size by approximately 33%. Three bytes of binary data become four Base64 characters. This overhead is the trade-off for being able to transmit binary data safely as text through systems that may strip or mangle non-printable bytes.

Can I encode files, not just text?

Yes. Any file (images, PDFs, audio) can be encoded to Base64. This is commonly used for embedding small images directly in HTML or CSS as data URLs, and for shipping certificates and keys as PEM text.

When should I NOT use Base64?

Do not use it for large files. A 1 MB image becomes 1.33 MB as Base64 text, and the browser cannot cache it separately. For anything over a few KB, serving the file normally is more efficient.

What is the difference between standard Base64 and URL-safe Base64?

Standard Base64 (RFC 4648 section 4) uses the characters A-Z, a-z, 0-9, +, / with = padding. URL-safe Base64 (RFC 4648 section 5) swaps + for - and / for _ so the string is safe to drop into a URL or a filename without percent-encoding. JWT tokens use the URL-safe variant.

Why does Base64 sometimes have one or two = signs at the end?

The = is padding. Base64 encodes input in 3-byte groups; if the input length is not a multiple of 3, the last group is padded with zero bits and one or two = characters mark the missing bytes. One = means one missing byte, two = means two missing bytes.