Free Character Map

Browse Unicode characters by category, search by name or code point, and copy to clipboard.

No data leaves your device

How to Use

  1. Click a category tab to view characters in that group.
  2. Click any character to view its details and copy options.
  3. Use the search box to find characters by name (e.g., "heart") or hex code (e.g., "2665").
  4. Click Copy Character to copy the selected character to your clipboard.
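
For readers curious how a search like the one in step 3 might work, here is a minimal Python sketch. It is not this tool's actual implementation (which runs in the browser); the function names find_by_hex and search_by_name are made up for illustration, and it relies on the Unicode name table that ships with Python's standard unicodedata module.

```python
import unicodedata


def find_by_hex(hex_code):
    """Return the character for a hex code point such as "2665" or "U+2665"."""
    cleaned = hex_code.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))


def search_by_name(query, code_points):
    """Return the characters in code_points whose Unicode name contains query."""
    needle = query.upper()
    matches = []
    for cp in code_points:
        # The second argument is the fallback for code points without a name.
        name = unicodedata.name(chr(cp), "")
        if needle in name:
            matches.append(chr(cp))
    return matches


# Hypothetical search scope: the Miscellaneous Symbols block.
MISC_SYMBOLS = range(0x2600, 0x2700)

print(find_by_hex("2665"))
print(search_by_name("heart", MISC_SYMBOLS))
```

Scanning a small range keeps the sketch fast; a real search would more likely use a prebuilt index of names.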

Frequently Asked Questions

What is a Unicode code point?

A Unicode code point is a unique number assigned to every character in the Unicode standard. It is written in hexadecimal format, often prefixed with U+ (e.g., U+2665 for ♥).

What is an HTML entity?

An HTML entity is a special code that represents a character in HTML. For example, &hearts; represents ♥. Entities are useful when you cannot directly type a character.

What is CSS code?

CSS code uses backslash (\) notation to insert a character by its Unicode code point in stylesheets. For example, .heart::before { content: "\2665"; } inserts ♥.

A short history of Unicode

Before Unicode, every region had its own incompatible character encoding: ASCII for English; the ISO 8859 family for European and Middle Eastern languages (8859-1 Latin-1, 8859-5 Cyrillic, 8859-6 Arabic); Windows code pages 1252 / 1251 / 1253–1258; and multibyte sets for East Asian languages (Shift-JIS for Japanese, Big5 for Traditional Chinese, GB2312 for Simplified Chinese, EUC-KR for Korean). Mismatched encodings produced garbled text known by the Japanese term mojibake (文字化け, "character transformation"): opening a Japanese page in the wrong encoding gave you rows of question marks or random Latin-1 letters.

The work began in 1987, when Joe Becker at Xerox, together with Lee Collins and Mark Davis at Apple, started investigating a single universal character set that could replace the patchwork. Becker's August 1988 draft proposal, "Unicode 88," explained: "the name 'Unicode' is intended to suggest a unique, unified, universal encoding." The Unicode Consortium was incorporated in January 1991 and shipped Unicode 1.0 in October that year with about 7,100 characters across 24 scripts.

As of Unicode 17.0 (released 9 September 2025), the standard contains 159,801 characters across 172 scripts. The code space has room for 1,112,064 valid code points (17 planes of 65,536 each, minus the 2,048 surrogates), so Unicode has assigned roughly 14% of its possible space and has decades of headroom. Major milestones: Unicode 6.0 (2010) was the first version to formally encode emoji (722 of them, taken from the Japanese carriers); Unicode 17.0 added four new scripts (Sidetic, Tolong Siki, Beria Erfe, Tai Yo) and pushed the total CJK ideograph count over 100,000.

Code points, planes, and encodings

A code point is just a number, written in hexadecimal with a U+ prefix, like U+2665 for ♥. Code points are grouped into 17 planes of 65,536 code points each. Almost everything you've ever read lives on Plane 0, the Basic Multilingual Plane (BMP, U+0000 to U+FFFF). Plane 1 (the Supplementary Multilingual Plane) holds historical scripts (Linear B, Egyptian hieroglyphs, Cuneiform), musical notation, and most emoji, including the skin-tone modifiers. Planes 2 and 3 are CJK ideograph extensions. Planes 4–13 are unassigned, reserved for the future. Plane 14 carries tag characters and the supplementary variation selectors. Planes 15 and 16 are private-use areas where fonts and apps assign their own meanings.
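
The plane arithmetic can be checked in a few lines of Python. This is only a sketch; the helper names plane_of and u_plus are invented for illustration. The character count is the Unicode 17.0 figure quoted earlier on this page.

```python
def plane_of(code_point):
    """Each plane holds 0x10000 (65,536) code points, so the plane is cp >> 16."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    return code_point >> 16


def u_plus(ch):
    """Format a single character as U+XXXX (at least four hex digits)."""
    return f"U+{ord(ch):04X}"


total_code_points = 17 * 65_536                 # all 17 planes
surrogates = 0xDFFF - 0xD800 + 1                # range reserved for UTF-16
valid_code_points = total_code_points - surrogates
assigned_share = 159_801 / valid_code_points    # Unicode 17.0 character count

for ch in "A\u2665\U0001F600":                  # A, heart suit, grinning face
    print(ch, u_plus(ch), "plane", plane_of(ord(ch)))
print(valid_code_points, f"{assigned_share:.1%}")
```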

An encoding is how that number gets stored as bytes. Unicode defines three: UTF-8 (one to four bytes per code point), UTF-16 (two or four bytes), and UTF-32 (always four bytes).
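
To make the gap between a code point and its stored bytes concrete, the sketch below encodes a few characters as UTF-8, UTF-16, and UTF-32 using Python's built-in codecs. The helper name encoded_sizes is made up for this example.

```python
def encoded_sizes(ch):
    """Byte length of one character in UTF-8, UTF-16, and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-be" codecs skip the byte order mark that the plain
        # "utf-16" and "utf-32" codecs prepend.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


for ch in "A\u2665\U0001F600":      # A, heart suit, grinning face
    print(f"U+{ord(ch):04X}", encoded_sizes(ch), ch.encode("utf-8").hex())
```

ASCII stays at one byte in UTF-8, a BMP symbol like ♥ takes three, and anything outside the BMP takes four bytes in all three encodings.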

The 25 invisible whitespace characters

Unicode formally tags exactly 25 characters with the White_Space=yes property: the regular space (U+0020); the control characters tab, line feed, vertical tab, form feed, and carriage return (U+0009–U+000D); next line (U+0085); the no-break space (U+00A0, the famous one that looks identical to a regular space but won't break across lines); the Ogham space mark (U+1680); the typographic widths in U+2000–U+200A; the line / paragraph separators (U+2028 / U+2029); the narrow no-break space common in French typography (U+202F); the medium mathematical space (U+205F); and the full-width ideographic space (U+3000) used in CJK text.

Several characters look invisible but are not classified as whitespace and behave differently from a regular space:

These invisible characters are routinely the cause of "why won't this string match?" debugging sessions. Paste any character into a character map's search and it will tell you the actual code point, so you can confirm whether you're looking at a smart quote masquerading as a straight one, or an NBSP where a regular space should be.
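
The same check can be scripted. The sketch below lists the code point and name of every character in a string and flags whether it carries White_Space=yes. The set is hard-coded from the 25 code points with that property, because Python's str.isspace() follows slightly different rules. The function name inspect_text is an invention for this example.

```python
import unicodedata

# The 25 code points with White_Space=yes, written out by hand.
WHITE_SPACE = (
    set(range(0x0009, 0x000E))            # tab, LF, VT, FF, CR
    | {0x0020, 0x0085, 0x00A0, 0x1680}
    | set(range(0x2000, 0x200B))          # U+2000 through U+200A
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
)


def inspect_text(text):
    """Return (code point label, name, is White_Space) for each character."""
    rows = []
    for ch in text:
        label = f"U+{ord(ch):04X}"
        # Control characters have no name, hence the fallback.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((label, name, ord(ch) in WHITE_SPACE))
    return rows


# An NBSP and a zero-width space hidden between ordinary letters.
sample = "a\u00A0b\u200Bc"
for row in inspect_text(sample):
    print(row)
```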

Useful character ranges

Block                     Range              Examples
Latin-1 Supplement        U+0080–U+00FF      à ñ ü © ® £ ¥ § ° ± ¶
Greek and Coptic          U+0370–U+03FF      α β γ π Σ Ω
Cyrillic                  U+0400–U+04FF      Russian / Ukrainian / Bulgarian etc.
General Punctuation       U+2000–U+206F      — – … “ ” ‘ ’ • † NNBSP ZWSP
Currency Symbols          U+20A0–U+20CF      € ₩ ₹ ₽ ₿
Letterlike Symbols        U+2100–U+214F      ™ ℠ № ℃ ℉ ℗
Arrows                    U+2190–U+21FF      ← → ↑ ↓ ↔ ⇒ ⇐
Mathematical Operators    U+2200–U+22FF      ∑ ∫ ∞ √ ≠ ≤ ≥ ∂ ∇ ∈ ∪ ∩
Box Drawing               U+2500–U+257F      ─ │ ┌ ┐ └ ┘ ├ ┤ ┬ ┴ ┼ ═ ║ ╔ ╗
Math Alphanumerics        U+1D400–U+1D7FF    "Fancy text" generators (𝓗𝓮𝓵𝓵𝓸) draw from here.
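
As an illustration of how a "fancy text" generator can draw from the Math Alphanumerics block, here is a small Python sketch that shifts ASCII letters onto the mathematical bold script letters, which start at U+1D4D0 (capitals) and U+1D4EA (lowercase). The function names are hypothetical, and real generators may well work differently.

```python
BOLD_SCRIPT_CAPITAL_A = 0x1D4D0   # MATHEMATICAL BOLD SCRIPT CAPITAL A
BOLD_SCRIPT_SMALL_A = 0x1D4EA     # MATHEMATICAL BOLD SCRIPT SMALL A


def to_bold_script(text):
    """Map A-Z and a-z onto the bold script letters; leave the rest alone."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_SCRIPT_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_SCRIPT_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def block_characters(first, last):
    """Return every character from first to last (inclusive) as one string."""
    return "".join(chr(cp) for cp in range(first, last + 1))


print(to_bold_script("Hello"))
print(block_characters(0x2190, 0x2199))   # the first ten Arrows
```

The result is made of different characters, not styled ASCII, so search and screen readers may not treat it as the original word.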

Special characters in everyday writing

The "I just need to type one symbol" use case, quick reference of what this tool exists to deliver in two clicks:

When you'd reach for a character map

Security: the homograph problem

Many Unicode characters look identical across scripts. The Cyrillic lowercase "а" (U+0430) is visually indistinguishable from the Latin "a" (U+0061). Attackers register internationalised domain names that look like legitimate ones (for instance an "apple.com" with a Cyrillic а in place of the Latin a) and use them for phishing. A 2017 attack used the lookalike domain adoḅe.com, with the dotted-below ḅ (U+1E05) in place of b, to deliver malware. Modern browsers mitigate this with restrictive script-mixing rules, falling back to the ASCII Punycode form (xn--…) when a domain mixes scripts; Safari is particularly conservative. The same lookalike property that makes Unicode rich for human writing makes it dangerous in domains, and a character map is one way to confirm the actual code point of every character at a glance.
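
A rough version of the script-mixing check can be sketched in Python. The heuristic below is an assumption made for illustration, not what any browser does: it treats the first word of each letter's Unicode name (LATIN, CYRILLIC, GREEK, ...) as its script. The function names are invented; the "idna" codec is part of Python's standard library.

```python
import unicodedata


def scripts_used(label):
    """Approximate each letter's script by the first word of its Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed(label):
    """True when the label draws letters from more than one script."""
    return len(scripts_used(label)) > 1


spoof = "\u0430pple"              # Cyrillic а followed by Latin "pple"
print(scripts_used(spoof))        # holds both CYRILLIC and LATIN
print(looks_mixed("apple"))       # all Latin, so not mixed
print(spoof.encode("idna"))       # the ASCII form, beginning with xn--
```

A check like this would not flag adoḅe.com, because ḅ is itself a Latin letter; catching that kind of lookalike needs a confusables table rather than a script comparison.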

More questions

What's the difference between a character and a glyph?

A character is the abstract unit Unicode encodes: the letter A, regardless of typeface. A glyph is the specific drawing of that character in a particular font: A in Helvetica, A in Garamond, and A in Comic Sans are all the same character but three different glyphs. Unicode encodes characters; fonts ship glyphs.

Why did Unicode 1.0 have about 7,000 characters when 17.0 has about 160,000?

Unicode 1.0 covered 24 scripts, most of the world's living writing systems then in regular computing use. The growth since has come from three places: hugely expanding CJK ideograph coverage (pulling in historical Chinese characters and rare regional variants; Extension J added 4,298 in version 17.0 alone), formally encoding historical scripts (Linear B, Cuneiform, Egyptian hieroglyphs, Phoenician), and standardising emoji from 2010 onward.

What's an HTML entity?

A way to encode a character inside HTML using a special escape syntax. There are named entities for common characters (&copy; for ©, &mdash; for —) and numeric entities for any code point (&#9829; or &#x2665; for ♥). They're useful when typing the character directly is awkward, say in source code with mixed encodings, or in a system that strips non-ASCII.
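
Both directions are easy to try in Python. The sketch below builds the decimal and hexadecimal numeric entities for a character and decodes named entities with the standard html module. The helper name numeric_entities is made up for this example.

```python
import html


def numeric_entities(ch):
    """Return the decimal and hexadecimal numeric entities for one character."""
    cp = ord(ch)
    return (f"&#{cp};", f"&#x{cp:X};")


print(numeric_entities("\u2665"))                  # the heart suit
print(html.unescape("&hearts; &copy; &#x2665;"))   # decodes all three forms
```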

What about CSS escapes?

CSS uses backslash plus the hex code point: .heart::before { content: "\2665"; } inserts ♥. Useful inside ::before / ::after generated content, in CSS counter styles, and in any place where the source file's encoding can't be relied on.
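
Generating the escape is a one-liner. The Python sketch below (with hypothetical helper names) produces the short form used above, plus an optional six-digit form. The padded form is the safer choice when the next character in the string is itself a hex digit, since CSS would otherwise read it as part of the escape.

```python
def css_escape(ch, pad=False):
    """Backslash plus the hex code point; pad=True gives six hex digits."""
    cp = ord(ch)
    digits = f"{cp:06X}" if pad else f"{cp:X}"
    return "\\" + digits


def css_content_rule(selector, ch):
    """Build a ::before rule that inserts ch as generated content."""
    return f'{selector}::before {{ content: "{css_escape(ch)}"; }}'


print(css_escape("\u2665"))
print(css_escape("\u2665", pad=True))
print(css_content_rule(".heart", "\u2665"))
```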

Does anything get sent to a server?

No. The character data is bundled with the page; the search and category filtering run locally in JavaScript; Copy uses the browser's Clipboard API. Nothing leaves your device, and the page works offline once it's loaded.
