
ADR-001: SQLite + EFS Unified Data Layer

| Field | Value |
| --- | --- |
| Status | Accepted |
| Date | 2026-03-11 |
| Decision Makers | CA (lead), PO, MEE, IE |
| Supersedes | N/A |

Context

xOps BC1 needs a data layer for three workloads:

  1. Open WebUI metadata: user sessions, chat history, settings (<50 users)
  2. CrewAI Knowledge: vector embeddings for RAG (ChromaDB, built-in)
  3. FinOps evidence: JSON/CSV reports and FOCUS 1.2+ data

Production runs on ECS Fargate (ephemeral containers). Data must survive task restarts.

FinOps analytics at scale (>1TB scan volume) will outgrow file-based reporting — S3 Tables (Apache Iceberg) provides columnar analytics with time-travel and ACID transactions at $0.004/GB scan cost. This is a Module Addition at BC2+, not a data layer replacement.

Decision

Use SQLite (Open WebUI default) + ChromaDB (CrewAI Knowledge built-in) + EFS ($6/mo persistent POSIX filesystem) as the unified data layer.

EFS provides the POSIX filesystem that SQLite requires, persists across Fargate task restarts, and stores ChromaDB vector data. One $6/mo EFS volume serves all three workloads.
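A minimal sketch of the Fargate-side wiring under this decision — the filesystem ID, access point ID, and mount path below are illustrative placeholders, not values from this ADR. The task definition declares one EFS volume and mounts it at a fixed path that SQLite, ChromaDB, and the FinOps evidence files all share:

```json
{
  "volumes": [
    {
      "name": "data",
      "efsVolumeConfiguration": {
        "fileSystemId": "fs-0123456789abcdef0",
        "transitEncryption": "ENABLED",
        "authorizationConfig": { "accessPointId": "fsap-0123456789abcdef0" }
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "open-webui",
      "mountPoints": [
        { "sourceVolume": "data", "containerPath": "/data" }
      ]
    }
  ]
}
```

Because the volume persists independently of the task, a restarted Fargate task remounts the same files and resumes where the previous task left off.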

Consequences

Gains

  • $6/mo vs $43/mo Aurora minimum (86% cost reduction at the data layer)
  • Zero configuration — SQLite and ChromaDB are framework defaults
  • Zero external database operations (no patching, scaling, or connection pooling)
  • Identical local and production behavior (SQLite on both)

Losses

  • Write concurrency limited: SQLite in WAL mode serves many concurrent readers, but only one writer at a time
  • No SQL analytics at scale: Complex FinOps queries on >1TB need S3 Tables (Iceberg), not file-based processing
  • EFS latency: 1-2 ms per write vs sub-millisecond local NVMe (acceptable, not equivalent)
  • No cross-service SQL+vector queries: ChromaDB and SQLite are separate — no unified query layer

Upgrade Paths

| Workload | Trigger | Upgrade To | Classification |
| --- | --- | --- | --- |
| Relational data | >50 concurrent writes | RDS PostgreSQL | Data Migration |
| Vector search | Cross-system SQL+vector | pgvector on RDS | Config Change |
| FinOps analytics | Scan volume >1TB | S3 Tables (Iceberg) | Module Addition |
| Full-text search | Dedicated search workload | OpenSearch Serverless | Module Addition |

SQLite → RDS is a data migration: export with sqlite3 .dump, translate the schema to Postgres types, and import with psql (pg_restore only reads pg_dump archives), with brief downtime. Budget 1-2 sprint days.

S3 Tables (Iceberg) is a module addition — file-based FinOps evidence continues working; S3 Tables adds columnar analytics alongside it, not replacing it.

Alternatives Considered

| Alternative | Cost | Verdict | When to Reconsider |
| --- | --- | --- | --- |
| RDS PostgreSQL | $20/mo | Ops overhead unjustified at <50 users | >50 concurrent writes |
| Aurora Serverless v2 | $43/mo min | 0.5 ACU minimum = 7x EFS cost | >200 users + multi-AZ requirement |
| pgvector | +$20/mo on RDS | ChromaDB handles BC1 vector workload | Cross-system SQL+vector search |
| S3 Tables (Iceberg) | $5/mo | File-based FinOps sufficient at BC1; excellent for analytics at scale ($0.004/GB scan, time-travel, ACID) | FinOps scan volume >1TB |
| OpenSearch Serverless | $345/mo | 2 OCU minimum = 58x EFS cost | Full-text search at scale |
| Valkey / ElastiCache | $15/mo | ALB sticky sessions sufficient | Pub/sub or cross-service cache |
| Qdrant OSS | $0 (self-hosted) | ChromaDB sufficient; Qdrant better for dedicated vector workloads | Dedicated vector search at BC2+ |

Source: xops.jsx LAYERS[id=4].whyNot[]

Agent Scores

| Agent | Score | Key Rationale |
| --- | --- | --- |
| PO | 96% | $6/mo data layer removes cost objection |
| CA | 98% | EFS POSIX for SQLite is architecturally correct; upgrade paths documented |
| MEE | 95% | ChromaDB via CrewAI Knowledge = zero-config RAG |
| IE | 97% | EFS + Fargate = standard AWS pattern |