PostgreSQL for Primary Storage
Context
Choosing a database for storing raw messages, extracted frames, and channel metadata. The system needs to handle millions of records with complex queries for narrative analysis.
Decision
Use PostgreSQL with JSONB columns for flexible schema alongside structured relational tables for core entities.
Alternatives Considered
MongoDB
Pros
- Flexible schema for varying message formats
- Native JSON storage
Cons
- Weaker support for complex analytical queries
- Less mature full-text search
Elasticsearch
Pros
- Excellent full-text search
- Good for time-series data
Cons
- Not ideal as primary data store
- Higher operational complexity
ClickHouse
Pros
- Excellent for analytical queries
- Column-oriented storage
Cons
- Not great for transactional writes
- Less flexible for schema evolution
Reasoning
PostgreSQL provides the best balance of relational integrity for structured data (channels, frames, clusters) and flexibility via JSONB for raw message storage. Its mature ecosystem, pg_trgm for text search, and strong community support make it reliable for a research project.