ADR-001: SQLite + EFS Unified Data Layer
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-03-11 |
| Decision Makers | CA (lead), PO, MEE, IE |
| Supersedes | N/A |
Context
xOps BC1 needs a data layer for three workloads:
- Open WebUI metadata: user sessions, chat history, settings (<50 users)
- CrewAI Knowledge: vector embeddings for RAG (ChromaDB, built-in)
- FinOps evidence: JSON/CSV reports and FOCUS 1.2+ data
Production runs on ECS Fargate (ephemeral containers). Data must survive task restarts.
FinOps analytics beyond roughly 1 TB of scan volume will outgrow file-based reporting. S3 Tables (Apache Iceberg) provides columnar analytics with time travel and ACID transactions at a $0.004/GB scan cost. It is planned as a Module Addition at BC2+, not as a replacement for this data layer.
Decision
Use SQLite (Open WebUI default) + ChromaDB (CrewAI Knowledge built-in) + EFS ($6/mo persistent POSIX filesystem) as the unified data layer.
EFS provides the POSIX filesystem that SQLite requires and persists across Fargate task restarts. The same volume holds the ChromaDB vector data, so one $6/mo EFS volume serves all three workloads.
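A minimal sketch of how one mounted volume could serve the three workloads, and of data outliving the process that wrote it. The subdirectory names, the `webui.db` file name, and the helper functions are hypothetical; the ADR does not prescribe a layout. A temporary directory stands in for the EFS mount point so the sketch runs anywhere.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical directory names; the ADR does not specify a layout.
SUBDIRS = {
    "webui": "open-webui",        # SQLite file for Open WebUI metadata
    "vectors": "chroma",          # ChromaDB persistence directory
    "finops": "finops-evidence",  # JSON/CSV reports
}

def prepare_data_layer(mount):
    """Create one directory per workload under the shared mount."""
    root = Path(mount)
    paths = {name: root / sub for name, sub in SUBDIRS.items()}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths

def record_setting(db_file, key, value):
    """Open the SQLite file, write one row, close (one task's lifetime)."""
    conn = sqlite3.connect(db_file)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS settings "
                "(key TEXT PRIMARY KEY, value TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO settings VALUES (?, ?)", (key, value)
            )
    finally:
        conn.close()

def read_setting(db_file, key):
    """Reopen the file, as a replacement task would after a restart."""
    conn = sqlite3.connect(db_file)
    try:
        row = conn.execute(
            "SELECT value FROM settings WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()

# A temporary directory stands in for the EFS mount point.
with tempfile.TemporaryDirectory() as mount:
    paths = prepare_data_layer(mount)
    db_file = paths["webui"] / "webui.db"
    record_setting(db_file, "theme", "dark")   # first "task" writes
    restored = read_setting(db_file, "theme")  # later "task" reads it back
```

Because the database is an ordinary file, persistence depends only on the directory surviving the container, which is the role EFS plays in this decision.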
Consequences
Gains
- $6/mo vs $43/mo Aurora minimum (86% cost reduction at data layer)
- Zero configuration — SQLite and ChromaDB are framework defaults
- Zero external database operations (no patching, scaling, or connection pooling)
- Identical local and production behavior (SQLite on both)
Losses
- Write concurrency limited: SQLite permits one writer at a time; WAL mode lets readers (~50 at BC1) proceed alongside that writer, but concurrent writes are serialized
- No SQL analytics at scale: Complex FinOps queries on >1TB need S3 Tables (Iceberg), not file-based processing
- EFS latency: roughly 1-2 ms per write, compared with sub-millisecond writes on local NVMe (acceptable, not equivalent)
- No cross-service SQL+vector queries: ChromaDB and SQLite are separate — no unified query layer
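The write-concurrency limit can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the `chat` table and the `second_writer_blocked` helper are hypothetical, the database lives in a temporary directory rather than on EFS, and the default rollback journal is used. It shows that a second connection cannot start a write transaction while the first one holds the write lock.

```python
import os
import sqlite3
import tempfile

def second_writer_blocked(db_path):
    """Return True if a second connection is refused the write lock."""
    # isolation_level=None puts the connections in autocommit mode so that
    # transactions are controlled explicitly with BEGIN / ROLLBACK.
    first = sqlite3.connect(db_path, isolation_level=None)
    # timeout is the busy timeout in seconds: how long to wait for the lock.
    second = sqlite3.connect(db_path, isolation_level=None, timeout=0.1)
    try:
        first.execute(
            "CREATE TABLE IF NOT EXISTS chat (id INTEGER PRIMARY KEY, body TEXT)"
        )
        first.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
        first.execute("INSERT INTO chat (body) VALUES ('from first writer')")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer asks for it too
        except sqlite3.OperationalError:
            return True  # "database is locked" after the busy timeout
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()

with tempfile.TemporaryDirectory() as tmp:
    blocked = second_writer_blocked(os.path.join(tmp, "demo.db"))
```

In practice a longer busy timeout lets short writes queue rather than fail, which is why the limit is tolerable at fewer than 50 users. Whether WAL mode behaves reliably on an NFS-backed mount such as EFS is worth verifying separately, since SQLite's documentation cautions against WAL on network filesystems.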
Upgrade Paths
| Workload | Trigger | Upgrade To | Classification |
|---|---|---|---|
| Relational data | >50 concurrent writes | RDS PostgreSQL | Data Migration |
| Vector search | Cross-system SQL+vector | pgvector on RDS | Config Change |
| FinOps analytics | Scan volume >1TB | S3 Tables (Iceberg) | Module Addition |
| Full-text search | Dedicated search workload | OpenSearch Serverless | Module Addition |
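SQLite → RDS is a data migration: export with sqlite3 .dump, adapt the SQLite-specific SQL, and load it with psql (pg_restore reads only pg_dump archives; a tool such as pgloader is an alternative). Expect brief downtime. Budget 1-2 sprint days.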
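The export half of that migration can be sketched with Python's `sqlite3` module, whose `Connection.iterdump()` yields SQL text comparable to the shell's `.dump` command. The `export_statements` helper and the `chat` table are hypothetical, and the load step into PostgreSQL is not shown. The output is plain SQL in SQLite's dialect, so some statements may need adapting before PostgreSQL accepts them.

```python
import sqlite3

def export_statements(conn):
    """Return the database contents as a list of SQL statements."""
    return list(conn.iterdump())

# In-memory database standing in for the Open WebUI SQLite file.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE chat (id INTEGER PRIMARY KEY, body TEXT)")
source.execute("INSERT INTO chat (body) VALUES ('hello')")
source.commit()

statements = export_statements(source)
source.close()

# The dump is wrapped in a single transaction and contains the schema
# followed by one INSERT per row.
inserts = [s for s in statements if s.startswith("INSERT INTO")]
```

Writing `statements` to a file gives a script that can be reviewed and edited before it is loaded into the target database.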
S3 Tables (Iceberg) is a module addition — file-based FinOps evidence continues working; S3 Tables adds columnar analytics alongside it, not replacing it.
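The triggers in the Upgrade Paths table can be encoded as a small check, for example in a periodic review script. This is only a sketch: the `Metrics` fields and the `recommended_upgrades` function are hypothetical names, and how each metric would be measured is left open.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metrics:
    """Hypothetical observations; field names are not from the ADR."""
    concurrent_writes: int
    finops_scan_tb: float
    needs_sql_vector_queries: bool
    needs_dedicated_search: bool

def recommended_upgrades(m):
    """Map observed metrics to (target, classification) pairs from the table."""
    upgrades = []
    if m.concurrent_writes > 50:
        upgrades.append(("RDS PostgreSQL", "Data Migration"))
    if m.needs_sql_vector_queries:
        upgrades.append(("pgvector on RDS", "Config Change"))
    if m.finops_scan_tb > 1.0:
        upgrades.append(("S3 Tables (Iceberg)", "Module Addition"))
    if m.needs_dedicated_search:
        upgrades.append(("OpenSearch Serverless", "Module Addition"))
    return upgrades

# BC1-sized load: no trigger fires, so the SQLite + EFS layer stays as is.
bc1 = Metrics(concurrent_writes=5, finops_scan_tb=0.02,
              needs_sql_vector_queries=False, needs_dedicated_search=False)
bc1_upgrades = recommended_upgrades(bc1)

# Larger load: the relational and FinOps triggers fire independently.
grown = Metrics(concurrent_writes=80, finops_scan_tb=1.5,
                needs_sql_vector_queries=False, needs_dedicated_search=False)
grown_upgrades = recommended_upgrades(grown)
```

Each trigger is evaluated on its own, which mirrors the table: a workload can be upgraded without touching the others.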
Alternatives Considered
| Alternative | Cost | Verdict | When to Reconsider |
|---|---|---|---|
| RDS PostgreSQL | $20/mo | Ops overhead unjustified at <50 users | >50 concurrent writes |
| Aurora Serverless v2 | $43/mo min | 0.5 ACU minimum = 7x EFS cost | >200 users + multi-AZ requirement |
| pgvector | +$20/mo on RDS | ChromaDB handles BC1 vector workload | Cross-system SQL+vector search |
| S3 Tables (Iceberg) | $5/mo | File-based FinOps sufficient at BC1; excellent for analytics at scale ($0.004/GB scan, time-travel, ACID) | FinOps scan volume >1TB |
| OpenSearch Serverless | $345/mo | 2 OCU minimum = 58x EFS cost | Full-text search at scale |
| Valkey / ElastiCache | $15/mo | ALB sticky sessions sufficient | Pub/sub or cross-service cache |
| Qdrant OSS | $0 (self-hosted) | ChromaDB sufficient; Qdrant better for dedicated vector workloads | Dedicated vector search at BC2+ |
Source: xops.jsx LAYERS[id=4].whyNot[]
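The cost comparisons in this ADR are simple ratios against the $6/mo EFS baseline. The sketch below recomputes them from the monthly figures in the table; the function names are illustrative, and pgvector (an add-on to RDS) and Qdrant ($0 self-hosted) are left out because they have no standalone monthly price to compare.

```python
EFS_MONTHLY = 6.0  # USD per month, from the Decision section

# Monthly figures taken from the Alternatives Considered table.
ALTERNATIVES = {
    "RDS PostgreSQL": 20.0,
    "Aurora Serverless v2": 43.0,
    "S3 Tables (Iceberg)": 5.0,
    "OpenSearch Serverless": 345.0,
    "Valkey / ElastiCache": 15.0,
}

def cost_multiple(alternative_monthly):
    """How many times the EFS cost the alternative is."""
    return alternative_monthly / EFS_MONTHLY

def saving_percent(alternative_monthly):
    """Percentage saved by paying for EFS instead of the alternative."""
    return (alternative_monthly - EFS_MONTHLY) / alternative_monthly * 100

multiples = {name: round(cost_multiple(cost), 1)
             for name, cost in ALTERNATIVES.items()}
aurora_saving = round(saving_percent(ALTERNATIVES["Aurora Serverless v2"]), 1)

for name, multiple in multiples.items():
    print(f"{name}: {multiple}x EFS cost")
print(f"Saving vs Aurora minimum: {aurora_saving}%")
```

Aurora comes out at about 7.2x and OpenSearch Serverless at 57.5x the EFS cost, in line with the rounded 7x and 58x in the table; the saving against the Aurora minimum is about 86%.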
Agent Scores
| Agent | Score | Key Rationale |
|---|---|---|
| PO | 96% | $6/mo data layer removes cost objection |
| CA | 98% | EFS POSIX for SQLite is architecturally correct; upgrade paths documented |
| MEE | 95% | ChromaDB via CrewAI Knowledge = zero-config RAG |
| IE | 97% | EFS + Fargate = standard AWS pattern |