XML with Wrong Encoding Declaration

Download a free XML file whose XML declaration says UTF-8 but whose content is actually encoded as ISO-8859-1. This kind of encoding mismatch is common in XML exported from legacy systems, older databases, and Windows applications. Use the file to test how your XML parser detects and handles the mismatch.

What Is Broken

The XML declaration says encoding="UTF-8", but the file contains bytes that are valid ISO-8859-1 and invalid as UTF-8. For example, the accented characters é, ü, and ñ are stored as the single bytes 0xE9, 0xFC, and 0xF1, whereas UTF-8 encodes each of them as two bytes (0xC3 0xA9, 0xC3 0xBC, and 0xC3 0xB1).
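The byte-level difference can be checked directly. The following is a minimal sketch using only Python's built-in codecs; the helper names `is_valid_utf8` and `byte_report` are illustrative and not part of any library.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def byte_report(ch: str) -> dict:
    """Compare the ISO-8859-1 and UTF-8 encodings of one character."""
    latin1 = ch.encode("iso-8859-1")
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "iso_8859_1": latin1.hex(),
        "utf_8": ch.encode("utf-8").hex(),
        # A lone ISO-8859-1 byte >= 0x80 is never a complete UTF-8 sequence.
        "iso_bytes_valid_as_utf8": is_valid_utf8(latin1),
    }


if __name__ == "__main__":
    for ch in "\u00e9\u00fc\u00f1":  # é, ü, ñ
        print(byte_report(ch))
```

Each report shows a one-byte ISO-8859-1 form, a two-byte UTF-8 form, and `False` for the validity check, which is exactly the condition a strict parser trips over.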

Broken Example

<?xml version="1.0" encoding="UTF-8"?>
<contacts>
  <person>
    <name>José García</name>          <!-- é is 0xE9 and í is 0xED in ISO-8859-1 -->
    <city>Zürich</city>               <!-- ü is 0xFC in ISO-8859-1 -->
    <note>Señor García's file</note>  <!-- ñ is 0xF1 in ISO-8859-1 -->
  </person>
</contacts>

Why It Matters

Encoding mismatches cause mojibake (garbled text), parsing failures, and data corruption. They are especially common when migrating data between systems with different default encodings.
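To make the corruption concrete, here is a small sketch of what happens when bytes are decoded with the wrong codec. It uses only Python's built-in `bytes.decode`; the function name `lenient_decode` is hypothetical and stands in for any tool that substitutes a replacement character instead of failing.

```python
def lenient_decode(data: bytes) -> str:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# Case 1: ISO-8859-1 bytes read as UTF-8 (the situation in this sample file).
original = "Z\u00fcrich"                  # Zürich
stored = original.encode("iso-8859-1")    # b'Z\xfcrich'
recovered = lenient_decode(stored)        # 'Z\ufffdrich': the ü is lost

# Case 2: the reverse mismatch, UTF-8 bytes read as ISO-8859-1.
# Every byte is valid ISO-8859-1, so nothing fails, but the text is garbled.
mojibake = "Jos\u00e9".encode("utf-8").decode("iso-8859-1")  # 'Jos\u00c3\u00a9'

# Once U+FFFD has been written back out, the original byte cannot be recovered.
assert recovered != original
```

The first case silently destroys data, while the second produces the familiar two-character garbage in place of each accented letter.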

Expected Parser / Validator Behavior

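The XML 1.0 specification treats byte sequences that are not legal in the declared encoding as a fatal error, so a strict parser should reject the document with an encoding error at the first invalid byte. Lenient parsers may instead try to auto-detect the actual encoding. Well-behaved applications should report the mismatch and suggest re-encoding the file.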
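As one illustration of strict behavior, the sketch below feeds a cut-down version of the sample to Python's standard `xml.etree.ElementTree` parser. The helper `try_parse` is an illustrative name, and the exact error message depends on the underlying parser, so the code only checks that an error occurs.

```python
import xml.etree.ElementTree as ET

DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>\n'
BODY = "<contacts><person><name>Jos\u00e9 Garc\u00eda</name></person></contacts>"

# Declared as UTF-8, but the body bytes are ISO-8859-1 (the broken file).
broken = DECLARATION + BODY.encode("iso-8859-1")
# Same document with the body genuinely encoded as UTF-8 (the valid file).
valid = DECLARATION + BODY.encode("utf-8")


def try_parse(data: bytes):
    """Return (root, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(data), None
    except ET.ParseError as exc:
        return None, str(exc)


if __name__ == "__main__":
    for label, data in (("broken", broken), ("valid", valid)):
        root, error = try_parse(data)
        print(label, "rejected" if root is None else "parsed", error or "")
```

The broken bytes are rejected with a `ParseError`, while the UTF-8 version parses normally. Other parsers may word the error differently or recover instead.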

Frequently Asked Questions

What is an encoding mismatch?

The XML declaration claims one encoding (UTF-8), but the bytes in the file are actually in a different encoding (ISO-8859-1). A parser that trusts the declaration will either reject the file or misinterpret the affected byte sequences.

How do I fix encoding issues?

Either re-encode the file to match the declaration (convert to actual UTF-8), or update the declaration to match the actual encoding (change to encoding="ISO-8859-1").
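Both fixes can be sketched in a few lines of Python. This assumes you already know the real encoding is ISO-8859-1 (files from Windows applications may need a different codec) and that the declaration is written exactly as `encoding="UTF-8"`; the function names `reencode_to_utf8` and `relabel_declaration` are hypothetical.

```python
import xml.etree.ElementTree as ET

# A cut-down version of the broken file: UTF-8 declared, ISO-8859-1 bytes.
broken = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    + "<contacts><person><city>Z\u00fcrich</city></person></contacts>".encode("iso-8859-1")
)


def reencode_to_utf8(data: bytes, actual: str = "iso-8859-1") -> bytes:
    """Fix 1: keep the declaration and convert the bytes to real UTF-8."""
    return data.decode(actual).encode("utf-8")


def relabel_declaration(data: bytes, actual: str = "ISO-8859-1") -> bytes:
    """Fix 2: keep the bytes and make the declaration name the real encoding."""
    # Replaces only the first occurrence, which is the XML declaration.
    return data.replace(
        b'encoding="UTF-8"', f'encoding="{actual}"'.encode("ascii"), 1
    )


# Either fix yields a document that parses to the intended text.
for fixed in (reencode_to_utf8(broken), relabel_declaration(broken)):
    assert ET.fromstring(fixed).findtext("person/city") == "Z\u00fcrich"
```

Re-encoding to UTF-8 is usually the better long-term choice because downstream tools can then rely on the declaration, while relabeling is the smaller change when the bytes must stay untouched.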