HTML Entity Encoder Online

Convert special characters in your text into HTML entities so your content displays correctly and doesn’t break HTML markup.

HTML Entity Encoder — Encode Special Characters for Safe HTML

Encode special characters into HTML entities instantly with the XConvert HTML Entity Encoder. Paste any text containing characters like <, >, &, ", and ' and get properly encoded output that renders safely in HTML documents. This free, client-side tool runs entirely in your browser — no data is sent to any server.

Displaying user-generated content, code snippets, or special characters in HTML requires encoding certain characters as HTML entities. Without proper encoding, a < character could be interpreted as the start of an HTML tag, an & could begin an unintended entity reference, and a " could break an attribute value. The XConvert HTML Entity Encoder converts these dangerous characters into their safe entity equivalents, preventing rendering errors and cross-site scripting (XSS) vulnerabilities.

How to Encode HTML Entities with XConvert (4 Steps)

Open the HTML Entity Encoder — Navigate to the XConvert HTML Entity Encoder in any modern browser. No account or installation is needed.
Paste Your Text — Enter or paste the text containing special characters into the input field. This can be plain text, code snippets, user-generated content, or any string that needs to be safely embedded in HTML.
Choose Encoding Options — Select the encoding mode: encode only the essential HTML characters (<, >, &, ", '), encode all non-ASCII characters as well, or encode using named entities where available (e.g., &amp; vs. &#38;).
Copy the Encoded Output — The encoded text appears instantly in the output panel. Copy it and paste it into your HTML source code, template, or CMS.

The encoder handles all Unicode characters, converting them to either named HTML entities (like &amp;, &lt;, &copy;) or numeric character references (like &#169;, &#xA9;) depending on your preference.
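The encoding modes described above can be sketched in a few lines of Python. This is an illustration only, not the XConvert implementation: the function name encode_entities and its parameters are hypothetical, and the named-entity lookup uses the standard library's codepoint2name table, which covers only the older HTML 4 names.

```python
import html
from html.entities import codepoint2name  # HTML 4 names only, e.g. 169 -> "copy"


def encode_entities(text: str, non_ascii: bool = False, prefer_named: bool = True) -> str:
    """Encode the five essential characters and, optionally, all non-ASCII ones."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in "<>&\"'":
            # html.escape yields &lt; &gt; &amp; &quot; and &#x27;
            out.append(html.escape(ch, quote=True))
        elif non_ascii and cp > 127:
            # Use a named entity when one is known, else a decimal reference.
            name = codepoint2name.get(cp) if prefer_named else None
            out.append(f"&{name};" if name else f"&#{cp};")
        else:
            out.append(ch)
    return "".join(out)


print(encode_entities('<a href="x">Tom & Jerry</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
print(encode_entities("© 中", non_ascii=True))
# &copy; &#20013;
print(encode_entities("©", non_ascii=True, prefer_named=False))
# &#169;
```

With the default arguments, non-ASCII characters pass through unchanged, which matches the "essential characters only" mode.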

What Are HTML Entities?

HTML entities are special sequences of characters that represent reserved or special characters in HTML. They begin with an ampersand (&) and end with a semicolon (;). HTML entities exist because certain characters have special meaning in HTML syntax — the < and > characters define tags, the & character begins entity references, and the " character delimits attribute values.

There are three types of HTML entities:

Named entities use a human-readable name: &amp; for &, &lt; for <, &gt; for >, &quot; for ", and &apos; for '. HTML defines more than 2,000 named entities for common symbols, including &copy; (©), &euro; (€), &mdash; (—), and &nbsp; (non-breaking space).
Decimal numeric references use the character's Unicode code point in decimal: &#38; for &, &#60; for <, &#169; for ©.
Hexadecimal numeric references use the code point in hexadecimal: &#x26; for &, &#x3C; for <, &#xA9; for ©.

All three forms are valid HTML and produce the same rendered output. Named entities are more readable in source code, while numeric references can represent any Unicode character, including those without named entities.
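To see that the three forms are interchangeable, the short Python sketch below builds each form for one character and decodes them again. The helper name reference_forms is hypothetical, and the named form relies on the standard library's HTML 4 table, so it is None for characters that table does not cover.

```python
import html
from html.entities import codepoint2name


def reference_forms(ch: str) -> dict:
    """Return the named, decimal, and hexadecimal reference for one character."""
    cp = ord(ch)
    name = codepoint2name.get(cp)
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{cp};",
        "hex": f"&#x{cp:X};",
    }


forms = reference_forms("©")
print(forms)
# {'named': '&copy;', 'decimal': '&#169;', 'hex': '&#xA9;'}

# All three forms decode back to the same character.
print({html.unescape(ref) for ref in forms.values()})
# {'©'}
```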

Comparison Table

Feature	XConvert Encoder	Manual Encoding	Server-Side Libraries
Client-side processing	✅ Yes	✅ Yes	❌ Server-side
No data transmission	✅ Yes	✅ Yes	❌ Sent to server
Named entity support	✅ Full set	⚠️ Common only	✅ Full set
Numeric reference support	✅ Decimal + hex	⚠️ Manual	✅ Yes
Unicode support	✅ Full Unicode	⚠️ Limited	✅ Full Unicode
Selective encoding	✅ Configurable	❌ Manual	✅ Configurable
Batch processing	✅ Yes	❌ Tedious	✅ Yes
No installation	✅ Yes	✅ Yes	❌ Requires setup
XSS prevention	✅ Encodes HTML-significant characters	⚠️ Error-prone	✅ Yes
Cost	Free	Free	Free

Common Use Cases

Preventing Cross-Site Scripting (XSS) — The most critical use of HTML encoding is preventing XSS attacks. When user-generated content is displayed on a web page without encoding, an attacker can inject malicious <script> tags. Encoding converts <script> to &lt;script&gt;, which renders as visible text instead of executing as code.
Displaying Code Snippets in HTML — When showing HTML, XML, or any code that contains angle brackets in a web page, the code must be entity-encoded to prevent the browser from interpreting it as markup. The encoder converts <div class="example"> to &lt;div class=&quot;example&quot;&gt; so it displays correctly.
Embedding Special Characters in HTML Attributes — Attribute values enclosed in double quotes will break if the value itself contains a double quote. Encoding " as &quot; prevents this. Similarly, encoding & as &amp; in URLs within href attributes prevents the browser from interpreting & as the start of an entity reference.
Email Template Development — HTML email clients have inconsistent entity support. Encoding special characters as numeric references ensures they render correctly across Gmail, Outlook, Apple Mail, and other clients that may not support all named entities.
CMS and Blog Content — When writing content for content management systems, special characters like ©, ™, —, and curly quotes may not survive the CMS's processing pipeline. Encoding them as entities ensures they display correctly regardless of the CMS's character handling.
Internationalization and Multilingual Content — Non-ASCII characters from languages like Chinese, Arabic, or Cyrillic can be encoded as numeric HTML entities to ensure they display correctly even if the page's character encoding is misconfigured. While UTF-8 has largely solved this problem, entity encoding provides an additional safety net.
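The first three use cases can be tried with Python's standard html.escape function. This is a sketch for experimentation rather than a description of how XConvert works internally. The sample strings are made up.

```python
import html

# User-supplied text that would run as script if inserted unencoded.
comment = '<script>alert("xss")</script>'
safe_text = html.escape(comment)
print(safe_text)
# &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;

# A value placed inside a double-quoted attribute: quotes and ampersands
# must both be encoded so the attribute is not cut short.
title = 'She said "hi" & left'
tag = f'<span title="{html.escape(title, quote=True)}">note</span>'
print(tag)
# <span title="She said &quot;hi&quot; &amp; left">note</span>
```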

Technical Details of HTML Entity Encoding

HTML entity encoding is defined by the HTML specification (currently HTML Living Standard maintained by WHATWG) and the XML specification. The five characters that must always be encoded in HTML content are: < (less-than, &lt;), > (greater-than, &gt;), & (ampersand, &amp;), " (double quote, &quot;), and ' (apostrophe, &#39; or &apos;). These five characters are the minimum set required to prevent HTML parsing errors and security vulnerabilities.

The HTML specification defines 2,231 named character references as of HTML5. These cover Latin characters with diacritics, Greek and Cyrillic letters, mathematical symbols, arrows, box-drawing characters, and many other Unicode symbols. Named references are case-sensitive: &Amp; is not the same as &amp;. The XConvert encoder uses the correct casing for all named entities and falls back to numeric references for characters without named entities.
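Case sensitivity is easy to check against the table of HTML5 names that ships with Python's standard library (html.entities.html5). The snippet below is only a way to explore that table. It says nothing about which table XConvert itself uses. Keys in this table include the trailing semicolon.

```python
import html
from html.entities import html5

print(len(html5))          # size of the name table in this Python build

print("amp;" in html5)     # True: the usual lowercase name
print("AMP;" in html5)     # True: an all-uppercase variant is also defined
print("Amp;" in html5)     # False: mixed case is not a defined name

# An undefined name is left untouched when decoding.
print(html.unescape("&amp; &Amp;"))
# & &Amp;
```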

Numeric character references can represent any Unicode code point from U+0000 to U+10FFFF, though certain ranges are prohibited in HTML (such as the surrogate pair range U+D800 to U+DFFF and most C0 control characters). The encoder validates code points and only produces references for valid, displayable characters. For characters outside the Basic Multilingual Plane (above U+FFFF), the encoder produces a single numeric reference using the full code point value, not a surrogate pair — this is the correct behavior for HTML, which differs from JavaScript's internal UTF-16 representation.
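A minimal Python sketch of this behavior follows. The function name numeric_reference is hypothetical, and the validation is deliberately simplified: it rejects only surrogates and C0 control characters other than tab, line feed, and form feed, which is an assumption for illustration and not the full set of rules in the HTML specification.

```python
def numeric_reference(ch: str, hexadecimal: bool = False) -> str:
    """Build one numeric character reference from a single code point."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be referenced")
    if cp < 0x20 and ch not in "\t\n\f":
        raise ValueError("C0 control characters are not allowed")
    return f"&#x{cp:X};" if hexadecimal else f"&#{cp};"


emoji = "\U0001F600"  # the grinning face emoji, U+1F600
print(numeric_reference(emoji))                    # &#128512;
print(numeric_reference(emoji, hexadecimal=True))  # &#x1F600;

# UTF-16 stores the same character as the surrogate pair D83D DE00,
# but the HTML reference uses the single code point, not the pair.
print(emoji.encode("utf-16-be").hex())             # d83dde00
```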

Tips for Best Results

Encode the minimum necessary — For most use cases, encoding only the five essential characters (<, >, &, ", ') is sufficient. Over-encoding makes the HTML source harder to read without providing additional security or compatibility benefits.
Always encode user input — Any content that originates from user input must be encoded before being inserted into HTML. This is the primary defense against XSS attacks. Never trust user input, even if it has been validated on the client side.
Use named entities for readability — When hand-editing HTML, named entities like &amp; and &lt; are easier to read and maintain than numeric references like &#38; and &#60;. Use named entities when they are available.
Use numeric references for obscure characters — For characters without named entities, or when targeting systems with limited named entity support (like some email clients), use decimal or hexadecimal numeric references.
Encode before inserting, not after — Always encode text before inserting it into the HTML document. Encoding after insertion risks missing some content or double-encoding already-encoded entities.
Pair with the decoder for round-trip verification — After encoding, use the HTML Entity Decoder to verify that the encoded text decodes back to the original. This catches double-encoding errors and ensures correctness.
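The round-trip check from the last tip can also be scripted. The sketch below uses Python's standard html.escape and html.unescape in place of the XConvert encoder and decoder, which is an assumption made purely for illustration. The helper name round_trips is hypothetical.

```python
import html


def round_trips(original: str) -> bool:
    """Encode, decode, and confirm the result equals the original text."""
    encoded = html.escape(original, quote=True)
    return html.unescape(encoded) == original


print(round_trips('<p class="x">Tom & Jerry\'s</p>'))  # True
print(round_trips("already encoded: &amp;"))           # True
```

The second call shows why encoding raw text exactly once is safe: text that happens to contain an entity is encoded to &amp;amp; and decodes back to the original &amp;.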

Frequently Asked Questions

What characters must be encoded in HTML?

At minimum, five characters must be encoded: < (&lt;), > (&gt;), & (&amp;), " (&quot;), and ' (&#39; or &apos;). These characters have special meaning in HTML syntax and will cause parsing errors or security vulnerabilities if left unencoded.

What is the difference between named and numeric HTML entities?

Named entities use a human-readable name (e.g., &amp;), while numeric entities use the character's Unicode code point in decimal (&#38;) or hexadecimal (&#x26;). Both produce the same rendered output. Named entities are more readable; numeric entities can represent any Unicode character.

Does HTML entity encoding prevent XSS attacks?

Yes, when applied correctly. Encoding user input before inserting it into HTML content converts potentially dangerous characters like < and > into harmless entity references. However, encoding alone is not sufficient for all contexts — content inserted into JavaScript, CSS, or URL attributes requires context-specific encoding.
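As a rough illustration of context-specific encoding, the Python sketch below applies a different standard-library encoder to the same value for each context. The URL, the parameter names, and the extra replacement of < in the JavaScript literal are assumptions chosen for the example. A real application would normally rely on its template engine's context-aware escaping.

```python
import html
import json
from urllib.parse import quote

user = 'a&b "<x>"'

# HTML text content: entity-encode.
html_text = html.escape(user)
print(html_text)   # a&amp;b &quot;&lt;x&gt;&quot;

# URL query value: percent-encode first, then entity-encode the whole URL
# before placing it in an href attribute.
url = "https://example.com/search?q=" + quote(user, safe="") + "&lang=en"
href_attr = html.escape(url, quote=True)
print(href_attr)   # https://example.com/search?q=a%26b%20%22%3Cx%3E%22&amp;lang=en

# JavaScript string literal: JSON-encode, and escape "<" so the value
# cannot close a surrounding script element.
js_literal = json.dumps(user).replace("<", "\\u003c")
print(js_literal)  # "a&b \"\u003cx>\""
```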

What is double encoding and how do I avoid it?

Double encoding occurs when already-encoded entities are encoded again, turning &amp; into &amp;amp;. This causes the literal text &amp; to appear on the page instead of &. To avoid it, encode raw text only once, and never re-encode text that has already been processed. The HTML Entity Decoder can help you detect double-encoded content.
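The effect is easy to reproduce in Python with the standard html module. The detection helper below, looks_double_encoded, is a hypothetical heuristic: it flags text that still changes after being decoded once, so it will also flag text that is meant to display a literal entity such as &amp;lt;.

```python
import html

raw = "Tom & Jerry"
once = html.escape(raw)
twice = html.escape(once)
print(once)    # Tom &amp; Jerry
print(twice)   # Tom &amp;amp; Jerry


def looks_double_encoded(encoded: str) -> bool:
    """Heuristic: after one decode, the text still contains decodable entities."""
    decoded = html.unescape(encoded)
    return html.unescape(decoded) != decoded


print(looks_double_encoded(once))    # False
print(looks_double_encoded(twice))   # True
```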

Should I encode all non-ASCII characters?

Not necessarily. If your HTML document uses UTF-8 encoding (which is the modern standard), non-ASCII characters like ©, é, and 中 can be included directly without encoding. Entity encoding non-ASCII characters is only necessary when the document's character encoding is uncertain or when targeting systems with limited Unicode support.

What is &nbsp; and when should I use it?

&nbsp; is the named entity for a non-breaking space (Unicode U+00A0). Unlike a regular space, it prevents the browser from breaking a line at that point and is not collapsed with adjacent spaces. Use it to keep words together (e.g., 100&nbsp;km) or to create visible whitespace in HTML.

Can I encode an entire HTML document?

You can, but you should not. Encoding an entire HTML document would convert all the tags into visible text, destroying the document's structure. Only encode the text content that is inserted into HTML — not the HTML markup itself.

How does the encoder handle emoji and special Unicode characters?

Emoji and other characters outside the Basic Multilingual Plane (above U+FFFF) are encoded as numeric character references using their full Unicode code point. For example, the 😀 emoji (U+1F600) becomes &#128512; or &#x1F600;.

Is my text sent to any server during encoding?

No. The XConvert HTML Entity Encoder runs entirely in your browser using client-side JavaScript. Your text is processed locally and never transmitted to any server, making it safe for sensitive content.

Can I decode HTML entities back to plain text?

Yes. Use the HTML Entity Decoder to convert HTML entities back to their original characters. This is useful for reading encoded source code, debugging entity issues, or processing HTML content that needs to be displayed as plain text.

Related XConvert Tools: HTML Entity Decoder · URL Encoder/Decoder · Base64 Encoder/Decoder · JSON Formatter · JWT Decoder