SYSTEM_ARCHITECT_GUIDE

category: "⚙️ Operations/Admin" version: "v1.3.0" status: "✅" date: "22.12.2025" audience: "System architects, principal engineers, CTOs"

⚙️ ThemisDB System Architect Guide

Comprehensive architecture guide for enterprise-scale deployment.

📋 Table of Contents

📋 Overview
✨ Features
🚀 Quick Start
📖 Architecture & Design
💡 Best Practices
🔧 Troubleshooting
📚 Further Resources
📝 Changelog

📋 Overview

ThemisDB provides enterprise-grade horizontal sharding with Raft consensus and Google Spanner-inspired TrueTime for strong consistency across distributed deployments.

Target Audience: System architects, principal engineers, CTOs

Version: 1.3.0
Last Updated: December 2025

✨ Features

🔀 Horizontal Sharding - Raft + TrueTime (only open-source implementation)
📈 Billion-Scale - Migration playbooks from hyperscalers
💰 46% Cost Savings - vs. AWS complete stack
🌍 Distributed Transactions - SAGA pattern support
🔐 Enterprise Security - mTLS, RBAC, encryption

🚀 Quick Start

Key Topics:

Horizontal sharding with Raft + TrueTime (only open-source implementation)
Migration playbooks: AWS/GCP/Azure → ThemisDB
46% cost savings vs. AWS complete stack
Capacity planning for billion-scale deployments
Distributed transactions with SAGA pattern

Horizontal Sharding Architecture

Raft Consensus Overview

┌──────────────────────────────────────────────────────────┐
│                    ThemisDB Cluster                      │
│                                                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
│  │  Leader     │  │  Follower   │  │  Follower   │    │
│  │  Shard 1    │  │  Shard 1    │  │  Shard 1    │    │
│  │  (Write)    │──│  (Read)     │──│  (Read)     │    │
│  └─────────────┘  └─────────────┘  └─────────────┘    │
│         │                                               │
│    Raft Log                                             │
│    WAL Replication                                      │
│    TrueTime Coordination                                │
└──────────────────────────────────────────────────────────┘

Key Components:

Raft Consensus: Leader election, log replication, strong consistency
TrueTime: Distributed timestamps with uncertainty intervals
WAL: Write-ahead logging for durability
Circuit Breaker: Prevents cascade failures

TrueTime Implementation

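#include <chrono>
#include <cstdint>
#include <thread>

uint64_t current_time();  // Assumed helper: local clock reading in the same unit as the bounds

// TrueTime timestamp with uncertainty interval
struct TrueTimeStamp {
    uint64_t earliest;  // Lower bound
    uint64_t latest;    // Upper bound
    // Midpoint of the interval, computed without overflowing uint64_t
    uint64_t now() const { return earliest + (latest - earliest) / 2; }
    uint64_t uncertainty() const { return latest - earliest; }
};

// Commit wait: block until the upper bound of ts has been reached
void wait_for_safe_time(const TrueTimeStamp& ts) {
    while (current_time() < ts.latest) {
        std::this_thread::sleep_for(std::chrono::microseconds(10));
    }
}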

Benefits:

External consistency (linearizability)
Snapshot isolation without locks
Distributed transactions across shards
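
The interval semantics behind these benefits can be illustrated with a small sketch. It assumes a hypothetical `TTInterval` type and a fixed clock-uncertainty bound `EPSILON_US`; neither name comes from ThemisDB, and the 4 ms value is only a placeholder.

```python
from dataclasses import dataclass

EPSILON_US = 4_000  # assumed clock uncertainty bound in microseconds (illustrative)


@dataclass(frozen=True)
class TTInterval:
    earliest: int  # lower bound, microseconds
    latest: int    # upper bound, microseconds


def tt_now(local_clock_us: int, epsilon_us: int = EPSILON_US) -> TTInterval:
    """Wrap a local clock reading in an uncertainty interval."""
    return TTInterval(local_clock_us - epsilon_us, local_clock_us + epsilon_us)


def definitely_before(a: TTInterval, b: TTInterval) -> bool:
    """True only if a ended before b began, so the order is unambiguous."""
    return a.latest < b.earliest


def commit_wait_us(commit_ts: TTInterval, local_clock_us: int,
                   epsilon_us: int = EPSILON_US) -> int:
    """Microseconds to wait until commit_ts.latest is guaranteed to be in the past."""
    earliest_now = local_clock_us - epsilon_us
    return max(0, commit_ts.latest - earliest_now + 1)
```

Two events can be ordered safely only when their intervals do not overlap, which is why a commit waits roughly twice the uncertainty bound before it becomes visible.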

Sharding Strategy & Design

Hash-Based Sharding

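#include <cstdint>
#include <string>

// Assumed to be provided elsewhere: 64-bit MurmurHash3 and the current shard count
uint64_t murmur3_hash(const std::string& key);
extern uint64_t num_shards;

// Modulo hashing: simple, but changing num_shards remaps most keys
uint64_t compute_shard(const std::string& key) {
    uint64_t hash = murmur3_hash(key);
    return hash % num_shards;
}

// With 150 virtual nodes per shard for better distribution
constexpr uint64_t kVnodesPerShard = 150;

uint64_t compute_shard_virtual(const std::string& key) {
    uint64_t hash = murmur3_hash(key);
    uint64_t vnode = hash % (num_shards * kVnodesPerShard);
    return vnode / kVnodesPerShard;  // Map virtual node to physical shard
}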
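
The functions above assign shards by modulo arithmetic. A ring-based variant of consistent hashing with virtual nodes might look like the following sketch. The `HashRing` class and the vnode label format are hypothetical, and SHA-256 stands in for MurmurHash3 only because it ships with the standard library.

```python
import bisect
import hashlib


def _hash64(value: str) -> int:
    """Stable 64-bit hash (SHA-256 prefix; a stand-in for MurmurHash3)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with a configurable number of virtual nodes per shard."""

    def __init__(self, shard_ids, vnodes_per_shard=150):
        points = []
        for shard in shard_ids:
            for v in range(vnodes_per_shard):
                points.append((_hash64(f"shard-{shard}#vnode-{v}"), shard))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._owners = [s for _, s in points]

    def shard_for(self, key: str) -> int:
        """Return the shard owning the first ring point at or after hash(key)."""
        idx = bisect.bisect_left(self._hashes, _hash64(key))
        return self._owners[idx % len(self._owners)]
```

With a ring, adding a shard moves only the keys that land on the new shard's points; keys never move between the existing shards.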

Range-Based Sharding

// Shard by key ranges (e.g., user IDs, timestamps)
struct ShardRange {
    std::string start_key;
    std::string end_key;
    uint64_t shard_id;
};

std::vector<ShardRange> shard_map = {
    {"0000", "2999", 0},
    {"3000", "5999", 1},
    {"6000", "9999", 2}
};
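
Routing a key against such a range map is a binary search over the sorted start keys. The sketch below mirrors the three ranges above; `find_shard` is a hypothetical helper, and it assumes fixed-width keys compared lexicographically with inclusive end keys.

```python
import bisect
from typing import NamedTuple, Optional


class ShardRange(NamedTuple):
    start_key: str
    end_key: str  # treated as inclusive in this sketch
    shard_id: int


SHARD_MAP = [
    ShardRange("0000", "2999", 0),
    ShardRange("3000", "5999", 1),
    ShardRange("6000", "9999", 2),
]
_STARTS = [r.start_key for r in SHARD_MAP]


def find_shard(key: str) -> Optional[int]:
    """Binary-search the sorted range list; None if no range covers the key."""
    idx = bisect.bisect_right(_STARTS, key) - 1
    if idx < 0:
        return None
    candidate = SHARD_MAP[idx]
    return candidate.shard_id if key <= candidate.end_key else None
```

Keys that fall outside every range return `None`, so a caller can reject the write or trigger a range split instead of silently misrouting it.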

Rebalancing Strategy

// Dynamic shard rebalancing
void rebalance_shards() {
    // 1. Detect imbalance (>20% size difference)
    auto stats = collect_shard_statistics();
    
    // 2. Create rebalancing plan
    auto plan = create_rebalancing_plan(stats);
    
    // 3. Execute with joint consensus (Raft membership change)
    execute_rebalancing(plan);
}
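
Steps 1 and 2 can be sketched in a few lines. This is an illustration only: it assumes the 20% threshold compares the largest shard with the smallest one, and `is_imbalanced` and `plan_moves` are hypothetical names rather than ThemisDB functions.

```python
def is_imbalanced(shard_sizes: dict, threshold: float = 0.20) -> bool:
    """True if the largest shard is more than `threshold` above the smallest."""
    smallest = min(shard_sizes.values())
    largest = max(shard_sizes.values())
    if smallest == 0:
        return largest > 0
    return (largest - smallest) / smallest > threshold


def plan_moves(shard_sizes: dict) -> list:
    """Greedy plan: (source, target, amount) moves that bring shards to the mean."""
    mean = sum(shard_sizes.values()) / len(shard_sizes)
    surplus = {s: size - mean for s, size in shard_sizes.items() if size > mean}
    deficit = {s: mean - size for s, size in shard_sizes.items() if size < mean}
    moves = []
    for src in sorted(surplus):
        for dst in sorted(deficit):
            amount = min(surplus[src], deficit[dst])
            if amount > 0:
                moves.append((src, dst, amount))
                surplus[src] -= amount
                deficit[dst] -= amount
    return moves
```

A real plan would move whole key ranges or virtual nodes rather than arbitrary byte counts, and would throttle the transfer to protect foreground traffic.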

Distributed Transactions

SAGA Pattern Implementation

SAGATransaction saga;

// Lambdas capture by reference ([&]) so they can use the surrounding
// services and order data, which must outlive saga.execute().

// Step 1: Reserve inventory
saga.addStep(
    [&]() {
        return inventory.reserve(productId, quantity);
    },
    [&]() {
        inventory.unreserve(productId, quantity);  // Compensation
    }
);

// Step 2: Charge payment
saga.addStep(
    [&]() {
        return payment.charge(userId, amount);
    },
    [&]() {
        payment.refund(userId, amount);  // Compensation
    }
);

// Step 3: Create order
saga.addStep(
    [&]() {
        return orders.create(order);
    },
    [&]() {
        orders.remove(orderId);  // Compensation (`delete` is a reserved word in C++)
    }
);

// Execute with automatic rollback on failure
saga.execute();
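
How "automatic rollback" can work is easiest to see in a stripped-down executor. The sketch below is not ThemisDB's `SAGATransaction`; the `Saga` and `SagaFailed` names are hypothetical, and it assumes compensations themselves do not fail.

```python
from typing import Callable, List, Tuple


class SagaFailed(Exception):
    """Raised after all completed steps have been compensated."""


class Saga:
    def __init__(self) -> None:
        self._steps: List[Tuple[Callable[[], None], Callable[[], None]]] = []

    def add_step(self, action: Callable[[], None],
                 compensation: Callable[[], None]) -> None:
        self._steps.append((action, compensation))

    def execute(self) -> None:
        completed: List[Callable[[], None]] = []
        for action, compensation in self._steps:
            try:
                action()
            except Exception as exc:
                # Undo already-completed steps, newest first.
                for undo in reversed(completed):
                    undo()
                raise SagaFailed(str(exc)) from exc
            completed.append(compensation)
```

If the third step fails, the payment is refunded before the inventory is released. Production implementations also persist saga state and retry compensations, so that a crash in the middle of a rollback does not leave steps half undone.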

Advantages over 2PC:

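No blocking coordinator: participants do not hold locks while waiting for a global commit decision
Better availability: each step commits locally, so one slow shard does not stall the others
Trade-off: consistency is eventual, and failed steps are undone by compensation rather than by atomic rollback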

Migration from Hyperscalers

AWS → ThemisDB Migration Playbook

Current AWS Stack:

RDS PostgreSQL: $2,000/month
DynamoDB: $1,500/month
SageMaker: $3,000/month
Total: $6,500/month = $78K/year

ThemisDB Equivalent:

ThemisDB Self-Hosted: $1,500/month (3 × c6i.8xlarge + storage)
vLLM Co-Located: $0/month (uses same infrastructure)
Total: $1,500/month = $18K/year
Savings: $60K/year (77% reduction)
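
The arithmetic behind these totals is easy to re-run with your own numbers. The sketch below uses only the monthly figures listed above; the variable names are illustrative.

```python
# Monthly costs in USD, taken from the two lists above.
aws_monthly = {"RDS PostgreSQL": 2_000, "DynamoDB": 1_500, "SageMaker": 3_000}
themisdb_monthly = {"ThemisDB Self-Hosted": 1_500, "vLLM Co-Located": 0}

aws_yearly = sum(aws_monthly.values()) * 12
themisdb_yearly = sum(themisdb_monthly.values()) * 12
savings = aws_yearly - themisdb_yearly
reduction_pct = 100 * savings / aws_yearly

print(f"AWS: ${aws_yearly:,}/year")
print(f"ThemisDB: ${themisdb_yearly:,}/year")
print(f"Savings: ${savings:,}/year ({reduction_pct:.0f}% reduction)")
```

Note that this covers infrastructure only; operations effort appears separately in the TCO comparison.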

Migration Steps:

# Phase 1: Setup ThemisDB (Week 1-2)
1. Provision infrastructure (EC2 instances, EBS volumes)
2. Deploy ThemisDB cluster (3 nodes)
3. Configure backups to S3
4. Set up monitoring (CloudWatch → OpenTelemetry)

# Phase 2: Data Migration (Week 3-4)
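1. Export data from RDS/DynamoDB
   aws rds start-export-task \
     --export-task-identifier <task-id> \
     --source-arn <arn-of-snapshot-123> \
     --s3-bucket-name migration-bucket \
     --iam-role-arn <export-role-arn> \
     --kms-key-id <kms-key-id>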
   
2. Transform to ThemisDB format
   themisdb-migrator transform --source rds --target themisdb
   
3. Bulk load into ThemisDB
   themisdb-admin bulk-load --source s3://migration-bucket/

# Phase 3: Dual-Write (Week 5-6)
1. Update application to write to both RDS and ThemisDB
2. Validate data consistency
3. Monitor performance and errors

# Phase 4: Read Migration (Week 7-8)
1. Gradually shift reads to ThemisDB (10% → 50% → 100%)
2. Monitor latency and errors
3. Validate business metrics

# Phase 5: Cutover (Week 9-10)
1. Disable writes to RDS/DynamoDB
2. Final data reconciliation
3. Decommission AWS services
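
The gradual read shift in Phase 4 is commonly done with deterministic bucketing, so that a given key always reads from the same backend at a given rollout percentage. A minimal sketch, assuming a hypothetical `read_backend` helper in the application's data-access layer:

```python
import zlib


def read_backend(request_key: str, themisdb_percent: int) -> str:
    """Route a stable slice of keys to ThemisDB; the rest stay on the legacy store."""
    if not 0 <= themisdb_percent <= 100:
        raise ValueError("themisdb_percent must be between 0 and 100")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(request_key.encode("utf-8")) % 100
    return "themisdb" if bucket < themisdb_percent else "legacy"
```

Because the bucket depends only on the key, raising the percentage from 10 to 50 to 100 only ever moves keys toward ThemisDB, which keeps latency and error comparisons between the two groups meaningful.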

Service Mapping:

AWS Service	ThemisDB Equivalent	Migration Complexity
RDS PostgreSQL	Multi-Model Relational	Medium (schema mapping)
DynamoDB	Document Model	Low (direct mapping)
Aurora	Multi-Model + Sharding	Medium (sharding setup)
SageMaker	vLLM Co-Location	Medium (model deployment)
Pinecone (third-party, often used alongside AWS)	FAISS Advanced	Low (API compatible)
Timestream	Hypertables	Low (time-series native)

GCP → ThemisDB Migration

Service Mapping:

GCP Service	ThemisDB Equivalent
Cloud SQL	Multi-Model Relational
Firestore	Document Model
Cloud Spanner	Sharding + TrueTime (compatible!)
Vertex AI	vLLM Co-Location
BigQuery	Arrow Parquet Export → BigQuery

Azure → ThemisDB Migration

Service Mapping:

Azure Service	ThemisDB Equivalent
Azure SQL Database	Multi-Model Relational
Cosmos DB	Document + Graph
Azure ML	vLLM Co-Location

Capacity Planning

Sizing Guidelines

Small Deployment (< 1TB data, < 10K QPS):

- 3 nodes (for HA)
- 16 vCPUs, 64 GB RAM per node
- 1 TB SSD per node
- Cost: ~$1,500/month (self-hosted AWS)

Medium Deployment (1-10 TB data, 10K-100K QPS):

- 5 nodes
- 32 vCPUs, 128 GB RAM per node
- 2 TB NVMe SSD per node
- Cost: ~$5,000/month (self-hosted AWS)

Large Deployment (10-100 TB data, 100K-1M QPS):

- 10 nodes
- 64 vCPUs, 256 GB RAM per node
- 10 TB NVMe SSD per node
- Cost: ~$20,000/month (self-hosted AWS)
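
These tiers can be encoded as a small lookup for use in planning scripts. The sketch below copies the node counts and per-node specs from the guidelines above; the `Tier` type and `pick_tier` function are hypothetical, and upper bounds are treated as exclusive so that exactly 1 TB falls into the medium tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_data_tb: float
    max_qps: int
    nodes: int
    vcpus_per_node: int
    ram_gb_per_node: int
    ssd_tb_per_node: int


TIERS = [
    Tier("small", 1, 10_000, 3, 16, 64, 1),
    Tier("medium", 10, 100_000, 5, 32, 128, 2),
    Tier("large", 100, 1_000_000, 10, 64, 256, 10),
]


def pick_tier(data_tb: float, peak_qps: int) -> Tier:
    """Return the smallest tier whose data and QPS limits both exceed the workload."""
    for tier in TIERS:
        if data_tb < tier.max_data_tb and peak_qps < tier.max_qps:
            return tier
    raise ValueError("workload exceeds the largest documented tier")
```

Whichever dimension is larger decides the tier: a 0.5 TB dataset at 50K QPS lands in the medium tier even though its data volume alone would fit the small one.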

Storage Calculation

# Billion vectors storage calculation
vectors = 1_000_000_000
dimensions = 1536
bytes_per_float = 4

# Flat storage
flat_storage = vectors * dimensions * bytes_per_float
print(f"Flat: {flat_storage / 1e12:.1f} TB")  # 6.1 TB

# IVF+PQ compression
compression_ratio = 100  # 100x with PQ
compressed_storage = flat_storage / compression_ratio
print(f"IVF+PQ: {compressed_storage / 1e9:.1f} GB")  # 61.4 GB

Scalability Patterns

Horizontal Scaling

# Add new shard to cluster
themisdb-admin shard add \
  --node themisdb-4:8530 \
  --rebalance-strategy gradual

# Monitor rebalancing progress
themisdb-admin shard status

Read Replicas

# Add read replica
themisdb-admin replica add \
  --primary themisdb-1:8530 \
  --replica themisdb-5:8530 \
  --lag-threshold 100ms

Cost Optimization

3-Year TCO Comparison

Solution	Infrastructure	Operations	Total (3Y)	vs ThemisDB
ThemisDB	$54K	$113K	$167K	Baseline
AWS Stack	$234K	$65K	$299K	+79%
PostgreSQL	$156K	$76K	$232K	+39%
MongoDB	$180K	$75K	$255K	+53%
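
The "Total" and "vs ThemisDB" columns follow directly from the infrastructure and operations figures, as this short check shows. Only the numbers from the table are used; the variable names are illustrative.

```python
# (infrastructure, operations) in thousands of USD over three years, from the table.
tco_3y = {
    "ThemisDB": (54, 113),
    "AWS Stack": (234, 65),
    "PostgreSQL": (156, 76),
    "MongoDB": (180, 75),
}

totals = {name: infra + ops for name, (infra, ops) in tco_3y.items()}
baseline = totals["ThemisDB"]
premium_pct = {
    name: round(100 * (total - baseline) / baseline)
    for name, total in totals.items()
}

for name, total in totals.items():
    print(f"{name:<11} ${total}K  {premium_pct[name]:+d}%")
```

The percentages are relative to the ThemisDB total. Expressed the other way round, the same totals imply a saving of about 44% relative to the AWS stack ($132K of $299K).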

Cost Savings Breakdown:

Infrastructure: 46-77% (self-hosted vs managed)
Embedding Cache: 70-90% API cost reduction
vLLM Co-Location: Share GPU infrastructure
Zero egress costs: Save 10-20% on data transfer

Reference Architectures

RAG Platform Architecture

┌──────────────────────────────────────────────────────────────┐
│                      Application Layer                       │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐           │
│  │  Web App   │  │  Mobile    │  │  API       │           │
│  └────────────┘  └────────────┘  └────────────┘           │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                   ThemisDB + vLLM                           │
│  ┌────────────────────────────────────────────────────┐    │
│  │  ThemisDB (50 cores, 200 GB RAM, 30% GPU)        │    │
│  │  - Vector Index (FAISS IVF+PQ, 1B vectors)       │    │
│  │  - Embedding Cache (70-90% cost savings)         │    │
│  │  - Hybrid Search (BM25 + Vector RRF)             │    │
│  │  - Document Store                                  │    │
│  └────────────────────────────────────────────────────┘    │
│  ┌────────────────────────────────────────────────────┐    │
│  │  vLLM (14 cores, 56 GB RAM, 70% GPU)             │    │
│  │  - LLaMA 2 70B / GPT-3.5 equivalent              │    │
│  │  - Low-priority CUDA streams                       │    │
│  └────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

Performance:

Query latency: 50-200ms (p99)
Embedding cache hit rate: 80%+
vLLM inference: 1-2ms (GPU) vs 5-10ms (CPU fallback)
Throughput: 10K+ queries/sec
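
The hybrid search box in the diagram combines BM25 and vector results with Reciprocal Rank Fusion (RRF). A minimal sketch of the standard RRF formula follows; the function name and the constant `k=60` (a commonly used default) are assumptions, not ThemisDB settings.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
print(rrf_fuse([bm25_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

RRF uses only ranks, so BM25 scores and vector distances never need to be normalized onto a common scale.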

IoT Time-Series Platform

┌──────────────────────────────────────────────────────────────┐
│                    IoT Devices (100K+)                       │
│     Sensors → Edge Gateways → Data Ingestion                │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                      ThemisDB Cluster                        │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Hypertables (TimescaleDB compatibility)           │   │
│  │  - 1-day chunks (Column Families)                   │   │
│  │  - TTL-based retention (30 days)                    │   │
│  │  - ZSTD compression (5x storage reduction)          │   │
│  └─────────────────────────────────────────────────────┘   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Time-Series Aggregates (Arrow SIMD)               │   │
│  │  - Resample (1-second → 1-minute)                  │   │
│  │  - Rolling windows (5-minute, 1-hour)              │   │
│  │  - Percentiles (P50, P95, P99)                     │   │
│  │  - 5-10x faster than SQL                           │   │
│  └─────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

Performance:

Ingestion rate: 1M+ points/sec
Query latency (aggregates): 10-100ms
Storage efficiency: ~80% size reduction (5x with ZSTD)
Retention: Automatic with TTL
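
The resample and percentile aggregates named in the diagram reduce to simple bucketing. The pure-Python sketch below illustrates the logic only; it is not the Arrow SIMD implementation, and `resample_1m` and `percentile` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean, quantiles


def resample_1m(points):
    """Average (epoch_seconds, value) samples into 1-minute buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % 60].append(value)
    return {start: mean(vals) for start, vals in sorted(buckets.items())}


def percentile(values, p):
    """p-th percentile (1..99) using the stdlib's inclusive method."""
    return quantiles(values, n=100, method="inclusive")[p - 1]
```

Bucket keys are the start of each minute in epoch seconds, so the output can be joined directly against chunk boundaries or fed into a rolling-window step.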

Security Architecture

Zero-Trust Model

┌──────────────────────────────────────────────────────────────┐
│                        Internet                              │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                    WAF + DDoS Protection                     │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                  Load Balancer (mTLS)                        │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                   ThemisDB Cluster                           │
│  ┌────────────────────────────────────────────────────┐    │
│  │  Security Layer                                     │    │
│  │  - mTLS (client certificate validation)            │    │
│  │  - RBAC (role-based access control)                │    │
│  │  - Audit logging (all operations)                  │    │
│  │  - Encryption at rest (AES-256)                    │    │
│  │  - Signed requests (RSA-SHA256)                    │    │
│  └────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

Summary

Architect's Checklist:

✅ Design sharding strategy (hash vs range)
✅ Plan capacity for 3-5 year growth
✅ Calculate TCO vs hyperscaler alternatives
✅ Define migration strategy (strangler pattern)
✅ Implement security in depth (mTLS, RBAC, encryption)
✅ Design for HA and DR (RTO/RPO targets)
✅ Monitor and optimize costs continuously

Key Benefits:

46% cost savings vs AWS complete stack
Billion-scale vector search capability
Strong consistency with TrueTime
Zero vendor lock-in (self-hosted, multi-cloud)
Production-ready (9.3/10 audit rating)
