You've got data. You need to move it somewhere. The format you choose will determine how painful that is — and how painful it'll be six months later when the next system in the chain expects something different. JSON, CSV, and XML each have genuine strengths, and each has blind spots that will bite you if you use the wrong one.
JSON: the format that won the web
JSON emerged from the JavaScript community in the early 2000s as a simpler alternative to XML, and it rapidly won the web. The reason is simple: it maps directly to the data structures developers already use in every language — objects (or dicts), arrays, strings, numbers, booleans, and null. There's no translation layer. Parse the string, get usable data.
{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "active": true,
      "score": 98.5,
      "tags": ["admin", "verified"]
    },
    {
      "id": 2,
      "name": "Bob",
      "active": false,
      "score": 72.1,
      "tags": ["user"]
    }
  ]
}

JSON's real strength is nested, hierarchical data. A user with multiple addresses, each with a city and postcode and country, is completely natural in JSON and completely unworkable in CSV. That's why JSON dominates REST APIs, configuration files, NoSQL databases like MongoDB and Firestore, and pretty much anything browser-based.
- Best for: REST APIs, nested or hierarchical data, configuration files, NoSQL document storage.
- Strengths: Human-readable, native to JavaScript, handles complex nested structures, widely supported in all languages.
- Weaknesses: Verbose for large flat datasets, no support for comments, no schema enforcement without additional tooling (JSON Schema).
- File size: Moderate — more verbose than CSV for flat data, more compact than XML.
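To make the "parse the string, get usable data" point concrete, here is a minimal sketch using Python's standard `json` module. The payload is a trimmed copy of the example above; the variable names are illustrative only.

```python
import json

# Trimmed copy of the example payload, as it might arrive over the wire.
raw = '{"users": [{"id": 1, "name": "Alice", "active": true, "score": 98.5, "tags": ["admin", "verified"]}]}'

data = json.loads(raw)
first = data["users"][0]

# JSON types arrive as native Python types, with no per-field conversion.
print(type(first["id"]).__name__)      # int
print(type(first["active"]).__name__)  # bool
print(first["tags"][1])                # verified

# Serialising and parsing again gives back an equal structure.
round_tripped = json.loads(json.dumps(data))
```

Most mainstream languages offer an equivalent one-call parser, which is a large part of why JSON feels frictionless.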
CSV: ugly, flawed, and universally supported
CSV is almost laughably simple: rows of values, separated by commas. No schema. No types. No nesting. RFC 4180 documents the common format, but it is an informational memo rather than a binding standard, and implementations still disagree on edge cases like embedded newlines and quoting. It's a mess. It's also the one format that virtually every spreadsheet, database, and data analysis tool can open without configuration.
id,name,active,score
1,Alice,true,98.5
2,Bob,false,72.1
3,"O'Brien, Carol",true,88.0

For flat, tabular data (user lists, product catalogs, analytics exports, financial reports), CSV is almost always the right choice. Your non-technical colleagues can open it in Excel. Your DBA can import it with a single COPY command. Your data scientist can load it into pandas. Nothing else has that kind of reach.
- Best for: Tabular/flat data, spreadsheet exchange, database imports/exports, data analysis pipelines, large datasets.
- Strengths: Extremely compact, universally supported, easy to read in any text editor, opens natively in Excel/Sheets.
- Weaknesses: No native support for nested data, no type information (everything is a string by default), no standard for handling special characters or encoding (RFC 4180 is widely but not universally followed).
- File size: Smallest of the three for flat data — no schema overhead.
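Two of the points above are easy to see in code: a proper CSV parser handles the quoted comma in the third row, and every value comes back as a string. This is a small sketch using Python's standard `csv` module on the sample data, not a recommendation of any particular loader.

```python
import csv
import io

raw = """id,name,active,score
1,Alice,true,98.5
2,Bob,false,72.1
3,"O'Brien, Carol",true,88.0
"""

# DictReader uses the header row as keys for each record.
rows = list(csv.DictReader(io.StringIO(raw)))

# The quoted field keeps its embedded comma intact.
print(rows[2]["name"])         # O'Brien, Carol

# No type information survives: numbers and booleans are plain strings.
print(repr(rows[0]["score"]))  # '98.5'
print(repr(rows[0]["active"])) # 'true'
```

Splitting lines on `,` by hand would break on the third row, which is why a real parser is worth using even for "simple" CSV.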
XML: still here, still necessary in the right places
XML's reputation is mixed. Web developers abandoned it after JSON took over, but it never really went away. It's the format behind Word documents (.docx), SVG images, RSS/Atom feeds, Android manifests, Maven build configs, and a large share of the enterprise software built between 1998 and 2010. If you're integrating with a bank, hospital system, or government agency, you will encounter XML.
<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="1">
    <name>Alice</name>
    <active>true</active>
    <score>98.5</score>
    <tags>
      <tag>admin</tag>
      <tag>verified</tag>
    </tags>
  </user>
  <user id="2">
    <name>Bob</name>
    <active>false</active>
    <score>72.1</score>
    <tags>
      <tag>user</tag>
    </tags>
  </user>
</users>

The verbosity is the tradeoff for real advantages: XSD schemas for rigorous validation, XSLT for document transformation, namespaces for combining data from multiple sources, and mature tooling for all of it. These features are overkill for a REST API payload, but they matter when you're signing contracts about the exact format of data between systems.
- Best for: Enterprise integrations, document-centric formats, configurations requiring strict schema validation, SOAP APIs, legacy system interop.
- Strengths: Strong schema validation (XSD), namespace support, XSLT transformations, mixed content, mature tooling.
- Weaknesses: Very verbose (high file size), slower to parse than JSON or CSV, harder to read and write by hand.
- File size: Largest of the three — element tags add significant overhead.
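For comparison with the JSON and CSV cases, here is a sketch of reading the sample document with Python's standard `xml.etree.ElementTree`. Without a schema, element text and attributes are plain strings, so the conversions below are written by hand; the field-to-type mapping is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Bytes, as the document would arrive from a file or socket.
raw = b"""<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="1">
    <name>Alice</name>
    <active>true</active>
    <score>98.5</score>
    <tags><tag>admin</tag><tag>verified</tag></tags>
  </user>
  <user id="2">
    <name>Bob</name>
    <active>false</active>
    <score>72.1</score>
    <tags><tag>user</tag></tags>
  </user>
</users>
"""

root = ET.fromstring(raw)

users = []
for user in root.findall("user"):
    users.append({
        "id": int(user.get("id")),                     # attribute value
        "name": user.findtext("name"),                 # child element text
        "active": user.findtext("active") == "true",   # manual boolean parse
        "score": float(user.findtext("score")),
        "tags": [tag.text for tag in user.findall("tags/tag")],
    })

print(users[0]["tags"])  # ['admin', 'verified']
```

The extra navigation code, compared with a single `json.loads` call, is a fair picture of the day-to-day cost of XML when you don't need its validation features.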
Quick Comparison Table
| Feature | JSON | CSV | XML |
| --- | --- | --- | --- |
| Nested data | Native | Not supported | Supported via element nesting |
| Type support | String, number, boolean, null, array, object | Strings only; types are inferred by the reader | Strings; types via XSD schema |
| Comments | None (use JSON5 or a workaround) | None | `<!-- comment -->` supported |
| Schema validation | JSON Schema (external tool) | None built-in | DTD (defined in the XML spec) and XSD (separate W3C standard, widely supported) |
| Human readability | Good | Excellent for flat data | Poor (very verbose) |
| Parse speed | Fast | Fastest | Slowest |
| File size | Medium | Smallest | Largest |
Converting Between Formats
Converting flat JSON to CSV is straightforward — each object in the array becomes a row, each key becomes a column. The headache is nested data. You have two options: serialize nested objects as JSON strings in a single cell (ugly but lossless), or flatten them with dot-notation column names like `address.city` (readable but lossy for arrays).
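The two options can be combined. The sketch below flattens nested objects into dot-notation columns and serializes arrays as JSON strings in a single cell. The `flatten` helper and the sample records are hypothetical, written for this illustration rather than taken from any particular converter.

```python
import csv
import io
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts into dot-notation keys; lists become JSON strings."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)  # lossless, but opaque to spreadsheets
        else:
            flat[name] = value
    return flat

# Hypothetical records with a nested object and an array.
records = [
    {"id": 1, "name": "Alice", "address": {"city": "Leeds", "postcode": "LS1 4AP"}, "tags": ["admin", "verified"]},
    {"id": 2, "name": "Bob", "address": {"city": "York"}, "tags": ["user"]},
]

rows = [flatten(record) for record in records]

# Union of all keys in first-seen order, so sparse records still line up.
columns = list(dict.fromkeys(key for row in rows for key in row))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=columns, restval="")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Missing nested fields (Bob has no postcode here) become empty cells, which is one more place where the round trip back to JSON needs a decision: empty string, null, or absent key.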
CSV to JSON conversion requires type inference decisions: should `"true"` become a boolean? Should `"42"` become a number? Should `"07052"` stay a string (it's a zip code, the leading zero matters)? Good converters let you specify types per column; naive ones coerce everything and break your zip codes. Converting to XML requires escaping reserved characters: `<` and `&` must always be entity-encoded, `>` is conventionally encoded as well, and `'` and `"` need encoding only inside attribute values delimited by that same quote character.
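A minimal sketch of the per-column approach, plus XML text escaping with the standard library's `xml.sax.saxutils.escape`. The `COLUMN_TYPES` mapping, the `csv_to_records` helper, and the sample row are assumptions made for this example; any column not listed is deliberately left as a string.

```python
import csv
import io
import json
from xml.sax.saxutils import escape

# Hypothetical per-column converters. "zip" is absent on purpose, so it stays a string.
COLUMN_TYPES = {
    "id": int,
    "active": lambda s: s.lower() == "true",
    "score": float,
}

def csv_to_records(text, column_types):
    """Parse CSV text, converting only the columns that have an explicit type."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {col: column_types.get(col, str)(val) for col, val in row.items()}
        for row in reader
    ]

raw = "id,name,zip,active,score\n1,Alice,07052,true,98.5\n"
records = csv_to_records(raw, COLUMN_TYPES)
print(json.dumps(records))
# [{"id": 1, "name": "Alice", "zip": "07052", "active": true, "score": 98.5}]

# For XML element text, escape() encodes &, < and > and leaves quotes alone.
text = escape('Say "hi" & <wave>')
print(text)  # Say "hi" &amp; &lt;wave&gt;
```

Making the type mapping explicit means an unexpected new column degrades to a string instead of being silently coerced. For attribute values, `xml.sax.saxutils.quoteattr` handles the quoting as well.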
Rule of thumb: use JSON for APIs and config, CSV for tabular data and spreadsheets, XML when you're integrating with a system that demands it or when you need XSD schema validation. Default to JSON — switch to CSV when your users are going to open it in Excel.