What are the three types of data?

The three fundamental types are structured (organised in tables with rows and columns), semi-structured (self-describing formats such as JSON or XML, with some organisation but no rigid schema), and unstructured (no predefined format, such as images, free-text documents, and audio).

What is structured data?

Structured data follows a fixed, predefined schema, like a database table or spreadsheet. Every record has the same fields, making it easy to query with SQL and govern with standard tools.

What is unstructured data?

Unstructured data has no predefined format. It includes images, videos, PDFs, free-text message bodies, and audio files. Extracting value requires AI techniques such as Named Entity Recognition for text or computer vision for images and video.

Where is semi-structured data stored?

Semi-structured data is typically stored in NoSQL databases (MongoDB, Cassandra) or data lakes. Common formats include JSON, XML, and YAML.

Why does data type matter for data quality?

Each data type calls for a different approach to quality and governance. Structured data can be validated with schema rules; semi-structured data needs its schema evolution managed; unstructured data relies on AI-assisted tagging and metadata management.

Structured vs Unstructured Data: 3 Types of Data Explained

Every data strategy, governance framework, and analytics architecture must begin with understanding the type of data being managed. Not all data is the same, and treating it as though it is leads to costly infrastructure decisions and persistent gaps in governance.

Figure: Spectrum from structured data through semi-structured to unstructured data, with examples.

1. Structured Data

Structured data is information organised in a predefined, consistent format, typically rows and columns within a relational database or spreadsheet.

Common examples: Relational databases, spreadsheets, CSV files, ERP/CRM exports.

Characteristic	Description
Format	Stable, predefined schema
Queryability	High, fully addressable with SQL
Processing	Automated and scalable
Governance	Straightforward
Storage	Relational databases (RDBMS)
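
To make the table above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen only for illustration. It shows the two properties described above: every record has the same fields, and the data is fully addressable with SQL.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " country TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (id, name, country) VALUES (?, ?, ?)",
    [(1, "Ada", "UK"), (2, "Grace", "US"), (3, "Linus", "FI")],
)


def customers_in(country):
    """Return the names of customers in a country, ordered by id."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE country = ? ORDER BY id", (country,)
    )
    return [name for (name,) in rows]


def insert_is_rejected(row):
    """Return True if the schema refuses a row, e.g. a missing required field."""
    try:
        conn.execute(
            "INSERT INTO customers (id, name, country) VALUES (?, ?, ?)", row
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling customers_in("US") returns the matching names, while insert_is_rejected((4, None, "DE")) reports that the database refused a row with no name. The schema itself acts as the first line of validation, which is one reason governance is comparatively straightforward for this type.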

2. Semi-Structured Data

Figure: Comparison of structured CSV vs semi-structured JSON data.

Semi-structured data has some organisational markers but no rigid schema. JSON and XML are the canonical examples. The data carries its own descriptive tags or markers, but the structure can vary from record to record.

Common examples: JSON documents, REST API responses, XML files, emails, social media posts.

Characteristic	Description
Format	Flexible, structure varies per record
Queryability	Moderate; requires a path or query language such as JSONPath or XQuery
Governance	More complex, schema evolution must be managed
Storage	NoSQL databases, data lakes
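
As a rough illustration of structure that varies from record to record, the sketch below parses a small JSON array with Python's standard json module. The records and the get_path helper are hypothetical, written for this example rather than taken from any library. Each record names its own fields, but no two records have the same set, so reading code has to tolerate missing keys.

```python
import json

# Hypothetical records: each carries its own field names, and the set of
# fields differs from one record to the next.
raw = """
[
  {"id": 1, "name": "Ada", "contact": {"email": "ada@example.com"}},
  {"id": 2, "name": "Grace", "tags": ["vip"]},
  {"id": 3, "contact": {"phone": "555-0100"}}
]
"""

records = json.loads(raw)


def get_path(record, path, default=None):
    """Follow a list of keys into nested dicts; return default if any is missing."""
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Only the first record has an email; the others yield None instead of failing.
emails = [get_path(r, ["contact", "email"]) for r in records]

# The set of top-level fields is different for every record.
field_sets = [sorted(r.keys()) for r in records]
```

A path language such as JSONPath plays the same role as get_path here, with richer syntax. The point of the sketch is only that the query has to cope with absent fields, which a fixed relational schema would rule out up front.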

3. Unstructured Data

Unstructured data lacks any predefined schema. It is commonly described as the fastest-growing type and is the most challenging to manage. Named Entity Recognition (NER) is one key technology for extracting value from unstructured text.

Common examples: Images, video, audio, PDFs, Word files, chat logs.

Characteristic	Description
Format	None, unknown until examined
Queryability	Low, requires AI/ML or specialised parsing
Processing	Complex; requires techniques such as NLP or computer vision
Governance	Highly complex, metadata must be applied externally
Storage	Object storage (S3, Azure Blob), data lakes
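
The governance row above says metadata must be applied externally. One way to picture that is a small catalogue record built around a blob of bytes, sketched below with Python's standard hashlib and mimetypes modules. The describe_blob function and the owner and classification fields are assumptions made for this example, not part of any particular catalogue product.

```python
import hashlib
import mimetypes


def describe_blob(filename, content):
    """Build a metadata record for an opaque blob of bytes.

    The blob itself carries no schema, so everything here is derived
    from the outside: the name, the size, and a checksum.
    """
    guessed_type, _ = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
        # Guessed from the file extension only; the content is not inspected.
        "content_type": guessed_type or "application/octet-stream",
        # Hypothetical governance fields a catalogue might require.
        "owner": None,
        "classification": "unreviewed",
    }


record = describe_blob("contract.pdf", b"%PDF-1.7 example bytes")
```

Nothing in this record says what the document is about. Filling in fields like classification is where NLP or computer vision would come in, and that step is deliberately left out of the sketch.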

Side-by-Side Comparison

Dimension	Structured	Semi-Structured	Unstructured
Schema	Fixed, predefined	Flexible, partial	None
Examples	SQL tables, CSV	JSON, XML, Email	Images, PDFs, Video
Query tool	SQL	JSONPath, XQuery	AI/ML, NLP
Processing complexity	Low	Medium	High
Governance complexity	Low	Medium	High
Primary storage	RDBMS	NoSQL, Data Lake	Object Store, Data Lake

Strategic Implications for Data Architecture

Understanding data types is not an academic exercise. It directly drives architecture and investment decisions:

Storage architecture decisions: structured data fits RDBMS, semi-structured needs NoSQL or data lakes, and unstructured demands object storage with metadata layers
Governance strategy needs: each type requires different validation, cataloguing, and quality approaches
AI and ML readiness: converting unstructured data into usable features is often the most expensive part of any AI initiative
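
The governance point above can be sketched in a few lines of Python. The rule sets and field names below are hypothetical. The sketch contrasts a strict check suited to structured rows, where every field must be present and correctly typed, with a lenient check suited to semi-structured records, where only a small required core is enforced.

```python
# Hypothetical rule sets; field names and types are illustrative only.
STRICT_SCHEMA = {"id": int, "name": str, "country": str}
REQUIRED_KEYS = {"id"}


def validate_structured(row):
    """Return a list of problems; an empty list means the row matches exactly."""
    problems = []
    for field in sorted(set(STRICT_SCHEMA) - set(row)):
        problems.append(f"missing field: {field}")
    for field in sorted(set(row) - set(STRICT_SCHEMA)):
        problems.append(f"unexpected field: {field}")
    for field, expected in STRICT_SCHEMA.items():
        if field in row and not isinstance(row[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    return problems


def validate_semi_structured(record):
    """Insist only on a required core; optional keys may be present or absent."""
    return [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(record))]
```

For example, validate_structured({"id": "1", "name": "Ada"}) reports both the missing country and the wrongly typed id, whereas validate_semi_structured({"id": 2, "tags": ["vip"]}) accepts the record despite its extra field. Unstructured data has no fields to check in this way, so its quality controls operate on externally applied metadata instead.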

Key Takeaway

Structured, semi-structured, and unstructured data are distinct types with different management requirements. A mature data strategy accounts for all three, with appropriate tooling, governance, and architecture for each.