XML Validator
Validate XML against schema
XML validation checks whether a document is well-formed (syntactically correct) and, optionally, valid (conforming to a schema). Well-formedness is required; validity is optional.
What is XML Validation?
XML validation is the process of checking an XML document against two levels of correctness: well-formedness and validity.
Well-formedness is the baseline requirement. A well-formed XML document follows the syntax rules defined by the XML 1.0 specification: every opening tag has a matching closing tag, elements are properly nested, attribute values are quoted, and the document has exactly one root element. If a document is not well-formed, XML parsers will reject it outright.
Validity is an optional, stricter check. A valid XML document is well-formed and conforms to a schema — a formal definition of which elements, attributes, and data types are allowed and in what order. Schemas are defined using XSD (XML Schema Definition), DTD (Document Type Definition), or RelaxNG.
Well-Formed vs Valid
Understanding the distinction between these two levels is fundamental to working with XML:
| Check | Scope | Required? | Defined By |
|---|---|---|---|
| Well-formed | Syntax rules | Always | XML 1.0 Specification |
| Valid | Structure + data types | Optional | XSD, DTD, or RelaxNG schema |
A document can be well-formed without being valid. For example, a well-formed XML file might have the correct syntax but contain an element that the schema does not allow. However, a document cannot be valid without being well-formed — syntax correctness is a prerequisite.
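As a minimal sketch of the first level of checking, the snippet below uses Python's standard `xml.etree.ElementTree` parser to test well-formedness only. The helper name `is_well_formed` is an illustrative choice, not part of any library, and this parser does not perform schema validation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # The parser stops at the first syntax error it encounters.
        return False


print(is_well_formed("<a><b></b></a>"))  # properly nested
print(is_well_formed("<a><b></a></b>"))  # overlapping tags
```

A document that passes this check may still be invalid against a schema; validity requires a separate, schema-aware validator.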
Well-Formedness Rules
The XML 1.0 specification requires:
- Single root element: The document must have exactly one top-level element that wraps all content.
- Matching tags: Every opening tag (`<element>`) must have a corresponding closing tag (`</element>`), or be self-closing (`<element />`).
- Proper nesting: Elements must be nested correctly: `<a><b></b></a>` is well-formed, `<a><b></a></b>` is not.
- Quoted attributes: All attribute values must be enclosed in single or double quotes (`id="1"` or `id='1'`).
- Escaped special characters: `<` and `&` must always be escaped in text content (`&lt;`, `&amp;`). `>`, `'`, and `"` have their own predefined entities (`&gt;`, `&apos;`, `&quot;`), which are needed in some contexts, such as a quote character inside an attribute value delimited by the same quote.
- Valid XML declaration: If present, the `<?xml ?>` declaration must be the very first thing in the document.
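When XML is assembled from strings, the escaping and quoting rules above can be delegated to the standard library. The sketch below uses `xml.sax.saxutils.escape` and `quoteattr`; the helper `make_text_element` is a hypothetical name introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def make_text_element(tag: str, text: str) -> str:
    """Build <tag>text</tag>, escaping &, < and > in the text content."""
    return f"<{tag}>{escape(text)}</{tag}>"


snippet = make_text_element("text", "Tom & Jerry <3")
print(snippet)  # <text>Tom &amp; Jerry &lt;3</text>

# The parser turns the entities back into the original characters.
print(ET.fromstring(snippet).text)  # Tom & Jerry <3

# quoteattr adds the surrounding quotes and escapes the attribute value.
print(f"<item id={quoteattr('5')}/>")  # <item id="5"/>
```

In practice, building documents through a tree API or serializer avoids manual escaping altogether; this sketch only shows what the rules amount to.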
Common XML Errors
| Error | Example | Fix |
|---|---|---|
| Mismatched tags | <name>Alice</Name> | XML is case-sensitive: <name> must close with </name> |
| Unescaped ampersand | <text>Tom & Jerry</text> | Use &amp;: <text>Tom &amp; Jerry</text> |
| Missing quotes on attribute | <item id=5> | Quote the value: <item id="5"> |
| Multiple root elements | <a/><b/> | Wrap in a single root: <root><a/><b/></root> |
| Unclosed tag | <br> | Self-close: <br /> (XML is not HTML) |
| Invalid characters | Control characters (0x00-0x08) | Remove them, or encode the data (for example as Base64); XML 1.0 forbids these characters even as character references |
The most frequent mistake for developers coming from HTML is forgetting that XML is case-sensitive and strict. In HTML, <BR>, <br>, and <Br> are all the same. In XML, they are three different elements.
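The errors in the table above can be reproduced with Python's standard parser, which reports the line and column of the first problem it finds. This is a sketch: the sample strings are small stand-ins for the table rows, `first_error` is a hypothetical helper, and the exact message wording comes from the underlying expat parser and may differ in other tools.

```python
import xml.etree.ElementTree as ET

# Small stand-ins for the error categories listed in the table.
SAMPLES = {
    "mismatched tags": "<name>Alice</Name>",
    "unescaped ampersand": "<text>Tom & Jerry</text>",
    "unquoted attribute": "<item id=5></item>",
    "multiple roots": "<a/><b/>",
    "unclosed tag": "<p><br></p>",
    "control character": "<a>\x01</a>",
}


def first_error(xml_text: str):
    """Return (line, column, message) for the first syntax error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        line, column = exc.position
        return line, column, str(exc)
    return None


for label, sample in SAMPLES.items():
    print(f"{label}: {first_error(sample)}")
```

Because parsing stops at the first well-formedness error, a document with several mistakes has to be fixed and re-checked one error at a time.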
XSD Validation
XSD (XML Schema Definition) is the most widely used schema language for XML. An XSD file defines:
- Which elements and attributes are allowed
- The data type of each element (string, integer, date, decimal, etc.)
- The cardinality (how many times an element can appear)
- The order in which child elements must occur
- Default and fixed values
- Enumeration constraints (allowed values)
When an XML document is validated against an XSD, the validator checks every element and attribute against the schema rules. If any rule is violated, the validator reports the specific error with its location in the document.
XSD validation is critical in enterprise systems where XML messages must conform to strict standards — such as ISO 20022 for financial messaging, HL7 for healthcare, or UBL for electronic invoicing.
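Python's standard library has no XSD validator, so the sketch below assumes the third-party `lxml` package is installed. The `order` schema is a hypothetical example written for this illustration; it exercises an enumeration constraint, element order, and cardinality. The helper name `schema_errors` is likewise an assumption.

```python
# Hypothetical schema: <status> must be "open" or "shipped",
# followed by one to three <item> elements, in that order.
ORDER_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="status">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="open"/>
              <xs:enumeration value="shipped"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name="item" type="xs:string" minOccurs="1" maxOccurs="3"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def schema_errors(xml_bytes: bytes, xsd_bytes: bytes) -> list:
    """Return a list of 'line N: message' strings; empty if the document is valid."""
    # Imported here so the module still loads where lxml is not installed.
    from lxml import etree

    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    document = etree.fromstring(xml_bytes)  # raises XMLSyntaxError if not well-formed
    if schema.validate(document):
        return []
    return [f"line {entry.line}: {entry.message}" for entry in schema.error_log]
```

Calling `schema_errors(b"<order><status>lost</status><item>pen</item></order>", ORDER_XSD)` should return at least one message, since `lost` is not among the enumerated values; the exact wording depends on the libxml2 version behind `lxml`.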
Common Use Cases
- API message validation: SOAP web services validate incoming and outgoing XML messages against WSDL/XSD schemas to ensure contract compliance
- Financial messaging: ISO 20022 (pacs, pain, camt) messages are validated against published XSD schemas before submission to payment networks
- Configuration files: Application servers and frameworks (Tomcat, Spring, JBoss) validate XML configuration files against their DTDs or XSDs at startup
- Data import/export: ETL pipelines validate XML data feeds against schemas before processing to catch format errors early
- Document standards: SVG, XHTML, RSS, Atom, and EPUB documents can be validated against their respective schemas to ensure compliance
- Regulatory compliance: Government agencies and industry bodies require XML submissions (tax filings, healthcare records, customs declarations) to pass schema validation
Try These Examples
A well-formed XML document with a proper XML declaration, a single root element (<book>), correctly nested child elements, and matching opening/closing tags.
<?xml version="1.0" encoding="UTF-8"?>
<book>
<title>The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<year>1925</year>
</book>
Not well-formed because the closing tag </autor> does not match the opening tag <author> (typo: 'autor' instead of 'author'). XML parsers reject this document at the first well-formedness error.
<?xml version="1.0"?>
<book>
<title>The Great Gatsby</title>
<author>F. Scott Fitzgerald</autor>
</book>
A well-formed XML document with a matching XSD schema. The schema defines a <product> element containing <name> (string), <price> (decimal), and <quantity> (integer) in strict sequence. The XML conforms to all type and structure constraints.
<?xml version="1.0" encoding="UTF-8"?>
<product>
<name>Laptop Pro 15</name>
<price>1299.99</price>
<quantity>50</quantity>
</product>
The XML is well-formed (correct syntax), but violates the XSD schema in three ways: <price> contains 'not-a-number' instead of a decimal value, <color> is not defined in the schema, and the required <quantity> element is missing.
<?xml version="1.0" encoding="UTF-8"?>
<product>
<name>Laptop Pro 15</name>
<price>not-a-number</price>
<color>Silver</color>
</product>
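The product schema is described above but not shown. The sketch below is a hypothetical reconstruction based only on that description (`name` as string, `price` as decimal, `quantity` as integer, in sequence), checked with the third-party `lxml` package, which is assumed to be installed. The helper name `check_product` is an illustrative choice.

```python
# Hypothetical reconstruction of the schema described in the examples.
PRODUCT_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="price" type="xs:decimal"/>
        <xs:element name="quantity" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

VALID_PRODUCT = b"""<?xml version="1.0" encoding="UTF-8"?>
<product>
  <name>Laptop Pro 15</name>
  <price>1299.99</price>
  <quantity>50</quantity>
</product>"""

INVALID_PRODUCT = b"""<?xml version="1.0" encoding="UTF-8"?>
<product>
  <name>Laptop Pro 15</name>
  <price>not-a-number</price>
  <color>Silver</color>
</product>"""


def check_product(xml_bytes: bytes):
    """Return (is_valid, messages) for a product document."""
    # Imported here so the module still loads where lxml is not installed.
    from lxml import etree

    schema = etree.XMLSchema(etree.fromstring(PRODUCT_XSD))
    try:
        schema.assertValid(etree.fromstring(xml_bytes))
    except etree.DocumentInvalid:
        return False, [entry.message for entry in schema.error_log]
    return True, []
```

`check_product(VALID_PRODUCT)` should report the document as valid, while `check_product(INVALID_PRODUCT)` should return `False` with one or more messages. A validator may fold the unexpected `<color>` and the missing `<quantity>` into a single message, so the number of reported errors need not match the number of violations listed above.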