Free XML Formatter & Minifier
Paste XML to format, beautify, or minify it instantly.
How to Use
- Paste your XML into the Input box.
- Click Format to beautify or Minify to compress.
- Copy or Download the result.
Frequently Asked Questions
What happens if my XML has errors?
The tool validates your XML using the browser's built-in DOMParser. If there are syntax errors, they are displayed in a red error box above the output.
Does this support CDATA, comments, and processing instructions?
Yes. The formatter preserves all XML node types including CDATA sections, comments, and processing instructions.
Is there a size limit?
There is no hard limit; it depends on your browser's memory. XML files up to several MB typically format instantly.
A Practical Tour of XML
XML 1.0 became a W3C Recommendation on 10 February 1998, edited by Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen, with a working group chaired by Sun's Jon Bosak. Tim Bray's launch quote captured the design intent: "XML is extensible, internationalized, robust, simple, and built for the Web." The current canonical version is the Fifth Edition, published 26 November 2008, edited by Bray, Paoli, Sperberg-McQueen, Eve Maler and François Yergeau. XML descends directly from SGML (ISO 8879:1986), a much larger and much harder-to-implement standard; XML dropped most of SGML's seldom-used features while keeping the document model intact.
Where XML Still Lives in 2026
JSON has dominated REST API payloads for over a decade, but XML remains entrenched anywhere schema rigour, document semantics, or established standards lock it in. Knowing where you'll meet it on a given day is half the value of a good formatter:
- Office document formats: Microsoft's .docx / .xlsx / .pptx are zip archives full of XML parts, standardised as ECMA-376 and ISO/IEC 29500. OpenDocument's .odt / .ods / .odp (ISO/IEC 26300) follow the same pattern. EPUB ebooks are also XML-packaged.
- Web vocabularies: SVG (vector graphics), MathML (mathematical notation), Atom feeds, RSS 2.0, and the sitemap.xml protocol Google and Bing parse for crawl scheduling.
- SOAP and enterprise messaging: banking, telecom, insurance, and government back-ends still expose SOAP endpoints, often behind a REST facade. Industry standards on top of XML include FpML for derivatives, XBRL for SEC filings, ACORD for insurance, and ISO 20022 for payment messaging.
- Build and config files: Maven's pom.xml, legacy Spring beans, every Android resource (AndroidManifest.xml, res/values/strings.xml, res/layout/*.xml), and Apple's Info.plist in its XML variant.
- Vertical-domain interchange: KML for Google Earth, GPX for GPS traces, MusicXML for sheet music (4.0 published 2021), XLIFF for localisation (2.1 standardised as ISO 21720 in July 2024), HL7 v3 / CDA for clinical documents.
Well-Formed vs Valid: They Are Not the Same Thing
XML uses two different conformance levels and they are easy to confuse:
- Well-formed means the document obeys XML's grammar: exactly one root element, all tags balanced and properly nested, attribute values quoted, entity references closed with a semicolon, no unescaped < or & in text content.
- Valid means a well-formed document additionally conforms to a declared DTD, XSD (XML Schema 1.1), RELAX NG, or Schematron schema: the right elements in the right places, with attribute values of the right types, in the right cardinalities.
This formatter only checks well-formedness. The browser's built-in DOMParser reports the first parse error it hits via a parsererror element, which the tool surfaces in the red error box. Validation against a schema needs a different tool (Saxon for XSD, libxml2 with xmllint --schema, the W3C validator service, etc.).
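The same "first error only" behaviour is easy to reproduce outside the browser. The sketch below is an illustration in Python, not the tool's own code: it uses the standard library's expat bindings as a stand-in for DOMParser, and the function name first_parse_error is made up for this example.

```python
from xml.parsers import expat

def first_parse_error(xml_text):
    """Return None if xml_text is well-formed, else a description of the first error."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        # lineno is 1-based, offset is the 0-based column reported by expat.
        reason = expat.ErrorString(err.code)
        return f"line {err.lineno}, column {err.offset}: {reason}"
    return None

print(first_parse_error("<a><b>ok</b></a>"))  # None
print(first_parse_error("<a><b>oops</a>"))    # a message about a mismatched tag
```

Like the browser, this stops at the first problem, so a document with several errors has to be fixed and re-checked one error at a time. It says nothing about validity against a schema.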
The Five Predefined Entity References
The W3C XML 1.0 specification defines five predefined entities in §4.6 and notes in §4.1 that "well-formed documents need not declare any of the following entities": amp, lt, gt, apos, quot. The trailing semicolon is mandatory: XML, unlike some HTML usage, will never accept &amp without a closing ;.
| Entity | Character | Where it's required |
|---|---|---|
| &lt; | < | Always in element content (it would otherwise begin a tag) |
| &amp; | & | Always (it would otherwise begin an entity reference) |
| &gt; | > | Required inside the sequence ]]> in content; recommended elsewhere for symmetry |
| &apos; | ' | Inside attribute values delimited with single quotes |
| &quot; | " | Inside attribute values delimited with double quotes |
CDATA Sections, Comments, and Processing Instructions
Three special syntactic features that anyone formatting XML eventually encounters:
- CDATA sections: <![CDATA[ … ]]> blocks let you embed arbitrary text without escaping < and &. The only sequence you cannot put inside a CDATA section is the literal closing delimiter ]]>. Useful for embedding code samples, regex patterns, or HTML fragments inside XML documentation.
- Comments: <!-- … -->. The string -- is illegal inside a comment.
- Processing instructions: <?target …?>. The XML declaration itself is technically not a PI, but most tooling treats it the same way: <?xml version="1.0" encoding="UTF-8"?> is the recommended first line.
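To see how a DOM parser exposes these three constructs, here is a small Python sketch using the standard library's xml.dom.minidom. It is only an analogy for what a browser DOM does; the SAMPLE document, the KINDS table, and the describe_children helper are invented for this example.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical sample: one CDATA section, one comment, one processing instruction.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<doc><![CDATA[if (a < b) {}]]><!-- note --><?render fast?></doc>"
)

KINDS = {
    Node.CDATA_SECTION_NODE: "cdata",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "pi",
}

def describe_children(xml_text):
    """List (kind, data) for each child node of the root element."""
    root = minidom.parseString(xml_text).documentElement
    return [(KINDS.get(n.nodeType, "other"), n.data) for n in root.childNodes]

for kind, data in describe_children(SAMPLE):
    print(kind, repr(data))
```

Each construct arrives as its own node type, with the CDATA content unescaped. The XML declaration does not appear as a node at all, which matches the point that it is not a true processing instruction.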
Namespaces
XML's namespace mechanism is what lets multiple vocabularies coexist in a single document: Atom plus a custom extension, SOAP plus WS-Security headers, or OOXML's main document part referencing relationships, drawings, and pictures from sister namespaces. The syntax is xmlns="…" for a default namespace and xmlns:prefix="…" for a prefixed one, and the formatter preserves both unchanged. Namespace URIs are identifiers, not URLs; they don't have to resolve to anything.
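A short Python sketch of how a namespace-aware parser sees default and prefixed namespaces, using the standard library's ElementTree. The feed content, the urn:example:ext extension namespace, and the read_feed helper are made up for illustration; only the Atom namespace URI is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: Atom as the default namespace plus an invented extension.
FEED = (
    '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:ext="urn:example:ext">'
    "<title>Demo</title><ext:rating>5</ext:rating></feed>"
)

# Prefixes used in queries are local to this mapping; only the URIs must match.
NS = {"atom": "http://www.w3.org/2005/Atom", "x": "urn:example:ext"}

def read_feed(xml_text):
    """Return the root tag plus the title and rating text."""
    root = ET.fromstring(xml_text)
    title = root.find("atom:title", NS)
    rating = root.find("x:rating", NS)
    return root.tag, title.text, rating.text

print(read_feed(FEED))
```

The root tag comes back as {http://www.w3.org/2005/Atom}feed: the parser identifies elements by URI plus local name, never fetches the URI, and does not care that the query uses the prefix x where the document used ext.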
Two Famous XML Security Pitfalls
The Billion Laughs attack. A small XML file with recursively expanding entities can balloon to billions of characters in the parser's memory:
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!-- … seven more layers later, the document expands to 10^9 lols -->
Modern parsers cap entity expansion to defeat this.
The XXE (XML External Entity) attack exploits a parser that resolves external entities to read local files (<!ENTITY xxe SYSTEM "file:///etc/passwd">) or to trigger SSRF requests from the server. OWASP's XXE prevention cheat sheet is unambiguous: "the safest way to prevent XXE is always to disable DTDs (External Entities) completely." The DOMParser in mainstream browsers gets much of the same protection by construction: it does not fetch external entities, and it runs inside the page's sandbox with no access to the local file system or to a server's internal network. That is why this client-side formatter is reasonably safe to feed untrusted XML.
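On a server, where the parser does have file and network access, the OWASP advice can be applied literally by refusing any document that carries a DOCTYPE. The following is a minimal Python sketch using the standard library's expat bindings. It is not what the browser tool does; the names DoctypeForbidden and parse_without_dtd are hypothetical.

```python
from xml.parsers import expat

class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""

def parse_without_dtd(xml_text):
    """Check well-formedness, rejecting any DOCTYPE before entities are declared."""
    def refuse_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser = expat.ParserCreate()
    # expat calls this handler as soon as it sees <!DOCTYPE ...>.
    parser.StartDoctypeDeclHandler = refuse_doctype
    parser.Parse(xml_text, True)
    return True

print(parse_without_dtd("<r>fine</r>"))  # True
try:
    parse_without_dtd('<!DOCTYPE r [<!ENTITY lol "lol">]><r>&lol;</r>')
except DoctypeForbidden as err:
    print("rejected:", err)
```

Because the exception fires at the DOCTYPE itself, neither Billion Laughs nor XXE payloads get as far as declaring an entity. The cost is that legitimate documents with a DTD are rejected too.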
Pretty-Printing vs Minifying
- Format / Beautify indents nested elements, places open and close tags on their own lines, and produces output that is much easier to read but typically 30–50% larger than the minified equivalent. Use this when debugging, reviewing diffs, or learning an unfamiliar schema.
- Minify removes all whitespace between elements while preserving whitespace inside text content. The XML specification has the parser pass all whitespace through to the application, so dropping it is the formatter's decision; xml:space="preserve" marks regions where it must be kept. Use this for production payloads and for any case where bytes on the wire matter.
The xml:space attribute is your escape hatch when whitespace genuinely matters: verbatim source code embedded in documentation, for example. Set xml:space="preserve" on an ancestor element to signal that every space and newline in its descendants should be kept. The attribute expresses intent; it is up to the processing application, a formatter included, to honour it.
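The two modes can be sketched in a few lines of Python with the standard library's xml.dom.minidom. This is an illustration of the idea under simplifying assumptions, not the tool's serializer: the helper names are invented, and any whitespace-only text node next to an element is treated as layout.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

XML_WHITESPACE = " \t\r\n"

def strip_layout_whitespace(node, preserve=False):
    """Remove whitespace-only text nodes that sit between elements."""
    if node.nodeType == Node.ELEMENT_NODE:
        space = node.getAttribute("xml:space")
        if space == "preserve":
            preserve = True
        elif space == "default":
            preserve = False
    has_element_child = any(
        c.nodeType == Node.ELEMENT_NODE for c in node.childNodes
    )
    for child in list(node.childNodes):
        is_blank_text = (
            child.nodeType == Node.TEXT_NODE
            and child.data.strip(XML_WHITESPACE) == ""
        )
        if is_blank_text and has_element_child and not preserve:
            node.removeChild(child)
        else:
            strip_layout_whitespace(child, preserve)

def minify(xml_text):
    doc = minidom.parseString(xml_text)
    strip_layout_whitespace(doc)
    return doc.documentElement.toxml()

def beautify(xml_text, indent="  "):
    doc = minidom.parseString(xml_text)
    strip_layout_whitespace(doc)  # avoid stacking new indentation on old
    return doc.documentElement.toprettyxml(indent=indent)

source = "<a>\n  <b>hi there</b>\n  <c/>\n</a>"
print(minify(source))
print(beautify(source))
```

One caveat with this simplification: in mixed content such as `<p><b>big</b> <i>world</i></p>`, the single space between the two inline elements is also whitespace-only and would be dropped, so prose-heavy documents need xml:space="preserve" or a smarter rule.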
Common XML Errors a Formatter Catches
- Unescaped & in text content. A naked ampersand is always invalid; use &amp;.
- Mismatched or unclosed tags. The most common parse error. Every <tag> needs a matching </tag> (or use the self-closing form <tag/>).
- Multiple root elements. An XML document must have exactly one outermost element. If you have two siblings at the top level, wrap them in a parent.
- Encoding mismatch. A <?xml version="1.0" encoding="UTF-8"?> declaration must match the actual byte encoding of the file. A UTF-16 BOM with a UTF-8 declaration is the classic version of this bug.
- Reserved characters in attribute values. <tag attr="a<b"> is invalid even though < looks harmless inside quotes; write a&lt;b instead.
- Stray BOM in front of the XML declaration. Some text editors silently insert a UTF-8 BOM that confuses strict parsers.
- Mixed line endings inside xml:space="preserve" regions. Inconsistent CR / LF / CRLF can produce visible whitespace artefacts when round-tripping through different platforms.
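Two of the errors above, the bare & and the < inside an attribute value, disappear if values are escaped before they are written into a document. A minimal Python sketch using the standard library's xml.sax.saxutils; the helper names escape_text and escape_attr are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; the two quote entities are opt-in.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}

def escape_text(value):
    """Escape a string for use as element text content."""
    return escape(value)

def escape_attr(value):
    """Escape a string for use inside a quoted attribute value."""
    return escape(value, QUOTE_ENTITIES)

print(escape_text("a < b && c"))       # a &lt; b &amp;&amp; c
print(escape_attr('say "hi" & wave'))  # say &quot;hi&quot; &amp; wave
print(unescape("a &lt; b &amp;&amp; c"))  # a < b && c
```

The ampersand is replaced first, so the entity references produced for the other characters are not escaped a second time.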
More Frequently Asked Questions
Why does my XML format produce no output?
Most often because the input is not well-formed. The error box above the output shows the first parse error the browser's DOMParser hits, usually a missing or mismatched tag, an unescaped &, or a missing root element. Fix the error and re-run.
Is my XML uploaded to a server?
No. Formatting and minification both run inside the browser's built-in DOMParser and a small JavaScript serializer. Your XML never leaves the page, which is important for SOAP payloads, configuration files, and anything else that may contain credentials, internal URLs, or sensitive customer data.
Can the tool validate against an XSD or DTD schema?
No. Schema validation requires loading the schema file and resolving its references, which is a different problem from the well-formedness check the browser performs. For XSD validation, use Saxon or xmllint --schema at the command line, or the W3C XML Schema validator service.
Is XML still relevant in 2026, or should I just use JSON?
It depends on what you're doing. For new REST APIs, JSON is almost always the right pick. But XML is still the default for office documents (.docx, .xlsx), enterprise messaging (SOAP, financial standards), Android resources, EPUB, RSS / Atom, SVG, and most regulated-industry interchanges. Knowing how to read, format, and validate XML is still a baseline skill; it just isn't every day's first tool the way JSON is.
What does "preserve all node types" mean for the formatter?
CDATA sections, comments, and processing instructions are all kept exactly as they appear in the input; the formatter only changes whitespace between elements. So a <![CDATA[ if (a < b) { … } ]]> block round-trips byte-for-byte even if its content contains < characters that would otherwise need escaping.