FileExamples
CSV.csv · Invalid

CSV with UTF-8 BOM

Download a free CSV file that starts with a UTF-8 byte order mark (BOM: EF BB BF). Microsoft Excel adds this invisible 3-byte prefix when saving CSV files as UTF-8. Many parsers fail to detect it, causing the first column header to be read incorrectly (e.g., '\xEFname' instead of 'name'). Use this file to verify your CSV parser correctly strips or handles the BOM.

What Is Broken

The file begins with the bytes EF BB BF (UTF-8 BOM) before the first header field. The CSV content itself is valid, but the BOM prefix causes many parsers to misread the first field name or treat it as an unknown character.

Broken Example

[EF BB BF]name,email,age,city
Alice,alice@example.com,29,New York
Bob,bob@example.com,34,London

Why It Matters

This is extremely common in enterprise environments where CSV files are generated by Excel. If your application imports CSV files from users, you will encounter BOM-prefixed files. Failing to handle them causes data import failures, broken column mapping, and silent data corruption.

Expected Parser / Validator Behavior

BOM-aware parsers should detect and strip the BOM prefix transparently. The first column should be read as 'name' not '\uFEFFname'. Parsers that don't handle BOM should at minimum report the issue clearly.

Related Validators & Tools

Valid Sample Files

Frequently Asked Questions

What is a UTF-8 BOM?

A byte order mark (BOM) is a 3-byte sequence (EF BB BF) at the start of a file that indicates UTF-8 encoding. It is invisible in most editors but can break parsers.

Why does Excel add a BOM?

Excel adds the BOM to indicate that the CSV file uses UTF-8 encoding, helping other Microsoft applications detect the encoding. However, many non-Microsoft tools don't expect it.