Skip to content

SYSTEM_ARCHITECT_GUIDE

GitHub Actions edited this page Jan 2, 2026 · 1 revision

category: "⚙️ Operations/Admin" version: "v1.3.0" status: "✅" date: "22.12.2025" audience: "System architects, principal engineers, CTOs"

⚙️ ThemisDB System Architect Guide

Comprehensive architecture guide for enterprise-scale deployment.

📋 Inhaltsverzeichnis


📋 Übersicht

ThemisDB provides enterprise-grade horizontal sharding with Raft consensus and Google Spanner-inspired TrueTime for strong consistency across distributed deployments.

Target Audience: System architects, principal engineers, CTOs

Version: 1.3.0
Last Updated: December 2025


✨ Features

  • 🔀 Horizontal Sharding - Raft + TrueTime (only open-source implementation)
  • 📈 Billion-Scale - Migration playbooks from hyperscalers
  • 💰 46% Cost Savings - vs. AWS complete stack
  • 🌍 Distributed Transactions - SAGA pattern support
  • 🔐 Enterprise Security - mTLS, RBAC, encryption

🚀 Quick Start

Key Topics:

  • Horizontal sharding with Raft + TrueTime (only open-source implementation)
  • Migration playbooks: AWS/GCP/Azure → ThemisDB
  • 46% cost savings vs. AWS complete stack
  • Capacity planning for billion-scale deployments
  • Distributed transactions with SAGA pattern

Executive Summary

  1. Horizontal Sharding Architecture
  2. Sharding Strategy & Design
  3. Distributed Transactions
  4. Consistency Models
  5. Multi-Shard Queries
  6. Migration from Hyperscalers
  7. Capacity Planning
  8. Scalability Patterns
  9. Disaster Recovery Architecture
  10. Security Architecture
  11. Cost Optimization
  12. Reference Architectures

Horizontal Sharding Architecture

Raft Consensus Overview

┌──────────────────────────────────────────────────────────┐
│                    ThemisDB Cluster                      │
│                                                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
│  │  Leader     │  │  Follower   │  │  Follower   │    │
│  │  Shard 1    │  │  Shard 1    │  │  Shard 1    │    │
│  │  (Write)    │──│  (Read)     │──│  (Read)     │    │
│  └─────────────┘  └─────────────┘  └─────────────┘    │
│         │                                               │
│    Raft Log                                             │
│    WAL Replication                                      │
│    TrueTime Coordination                                │
└──────────────────────────────────────────────────────────┘

Key Components:

  • Raft Consensus: Leader election, log replication, strong consistency
  • TrueTime: Distributed timestamps with uncertainty intervals
  • WAL: Write-ahead logging for durability
  • Circuit Breaker: Prevents cascade failures

TrueTime Implementation

// TrueTime timestamp with uncertainty interval
struct TrueTimeStamp {
    uint64_t earliest;  // Lower bound
    uint64_t latest;    // Upper bound
    uint64_t now() const { return (earliest + latest) / 2; }
    uint64_t uncertainty() const { return latest - earliest; }
};

// Wait for uncertainty to pass
void wait_for_safe_time(const TrueTimeStamp& ts) {
    while (current_time() < ts.latest) {
        std::this_thread::sleep_for(std::chrono::microseconds(10));
    }
}

Benefits:

  • External consistency (linearizability)
  • Snapshot isolation without locks
  • Distributed transactions across shards

Sharding Strategy & Design

Hash-Based Sharding

// Consistent hashing with virtual nodes
uint64_t compute_shard(const std::string& key) {
    uint64_t hash = murmur3_hash(key);
    return hash % num_shards;
}

// With 150 virtual nodes per shard for better distribution
uint64_t compute_shard_virtual(const std::string& key) {
    uint64_t hash = murmur3_hash(key);
    uint64_t vnode = hash % (num_shards * 150);
    return vnode / 150;  // Map to physical shard
}

Range-Based Sharding

// Shard by key ranges (e.g., user IDs, timestamps)
struct ShardRange {
    std::string start_key;
    std::string end_key;
    uint64_t shard_id;
};

std::vector<ShardRange> shard_map = {
    {"0000", "2999", 0},
    {"3000", "5999", 1},
    {"6000", "9999", 2}
};

Rebalancing Strategy

// Dynamic shard rebalancing
void rebalance_shards() {
    // 1. Detect imbalance (>20% size difference)
    auto stats = collect_shard_statistics();
    
    // 2. Create rebalancing plan
    auto plan = create_rebalancing_plan(stats);
    
    // 3. Execute with joint consensus (Raft membership change)
    execute_rebalancing(plan);
}

Distributed Transactions

SAGA Pattern Implementation

SAGATransaction saga;

// Step 1: Reserve inventory
saga.addStep(
    []() { 
        return inventory.reserve(productId, quantity);
    },
    []() { 
        inventory.unreserve(productId, quantity);  // Compensation
    }
);

// Step 2: Charge payment
saga.addStep(
    []() { 
        return payment.charge(userId, amount);
    },
    []() { 
        payment.refund(userId, amount);  // Compensation
    }
);

// Step 3: Create order
saga.addStep(
    []() { 
        return orders.create(order);
    },
    []() { 
        orders.delete(orderId);  // Compensation
    }
);

// Execute with automatic rollback on failure
saga.execute();

Advantages over 2PC:

  • No blocking coordinator
  • Better availability
  • Eventual consistency with compensation

Migration from Hyperscalers

AWS → ThemisDB Migration Playbook

Current AWS Stack:

  • RDS PostgreSQL: $2,000/month
  • DynamoDB: $1,500/month
  • SageMaker: $3,000/month
  • Total: $6,500/month = $78K/year

ThemisDB Equivalent:

  • ThemisDB Self-Hosted: $1,500/month (3 × c6i.8xlarge + storage)
  • vLLM Co-Located: $0/month (uses same infrastructure)
  • Total: $1,500/month = $18K/year
  • Savings: $60K/year (77% reduction)

Migration Steps:

# Phase 1: Setup ThemisDB (Week 1-2)
1. Provision infrastructure (EC2 instances, EBS volumes)
2. Deploy ThemisDB cluster (3 nodes)
3. Configure backups to S3
4. Set up monitoring (CloudWatch → OpenTelemetry)

# Phase 2: Data Migration (Week 3-4)
1. Export data from RDS/DynamoDB
   aws rds export-db-snapshot-to-s3 --snapshot-id snapshot-123
   
2. Transform to ThemisDB format
   themisdb-migrator transform --source rds --target themisdb
   
3. Bulk load into ThemisDB
   themisdb-admin bulk-load --source s3://migration-bucket/

# Phase 3: Dual-Write (Week 5-6)
1. Update application to write to both RDS and ThemisDB
2. Validate data consistency
3. Monitor performance and errors

# Phase 4: Read Migration (Week 7-8)
1. Gradually shift reads to ThemisDB (10% → 50% → 100%)
2. Monitor latency and errors
3. Validate business metrics

# Phase 5: Cutover (Week 9-10)
1. Disable writes to RDS/DynamoDB
2. Final data reconciliation
3. Decommission AWS services

Service Mapping:

AWS Service ThemisDB Equivalent Migration Complexity
RDS PostgreSQL Multi-Model Relational Medium (schema mapping)
DynamoDB Document Model Low (direct mapping)
Aurora Multi-Model + Sharding Medium (sharding setup)
SageMaker vLLM Co-Location Medium (model deployment)
Pinecone FAISS Advanced Low (API compatible)
Timestream Hypertables Low (time-series native)

GCP → ThemisDB Migration

Service Mapping:

GCP Service ThemisDB Equivalent
Cloud SQL Multi-Model Relational
Firestore Document Model
Cloud Spanner Sharding + TrueTime (compatible!)
Vertex AI vLLM Co-Location
BigQuery Arrow Parquet Export → BigQuery

Azure → ThemisDB Migration

Service Mapping:

Azure Service ThemisDB Equivalent
Azure SQL Database Multi-Model Relational
Cosmos DB Document + Graph
Azure ML vLLM Co-Location

Capacity Planning

Sizing Guidelines

Small Deployment (< 1TB data, < 10K QPS):

- 3 nodes (for HA)
- 16 vCPUs, 64 GB RAM per node
- 1 TB SSD per node
- Cost: ~$1,500/month (self-hosted AWS)

Medium Deployment (1-10 TB data, 10K-100K QPS):

- 5 nodes
- 32 vCPUs, 128 GB RAM per node
- 2 TB NVMe SSD per node
- Cost: ~$5,000/month (self-hosted AWS)

Large Deployment (10-100 TB data, 100K-1M QPS):

- 10 nodes
- 64 vCPUs, 256 GB RAM per node
- 10 TB NVMe SSD per node
- Cost: ~$20,000/month (self-hosted AWS)

Storage Calculation

# Billion vectors storage calculation
vectors = 1_000_000_000
dimensions = 1536
bytes_per_float = 4

# Flat storage
flat_storage = vectors * dimensions * bytes_per_float
print(f"Flat: {flat_storage / 1e12:.1f} TB")  # 6.1 TB

# IVF+PQ compression
compression_ratio = 100  # 100x with PQ
compressed_storage = flat_storage / compression_ratio
print(f"IVF+PQ: {compressed_storage / 1e9:.1f} GB")  # 61 GB

Scalability Patterns

Horizontal Scaling

# Add new shard to cluster
themisdb-admin shard add \
  --node themisdb-4:8530 \
  --rebalance-strategy gradual

# Monitor rebalancing progress
themisdb-admin shard status

Read Replicas

# Add read replica
themisdb-admin replica add \
  --primary themisdb-1:8530 \
  --replica themisdb-5:8530 \
  --lag-threshold 100ms

Cost Optimization

3-Year TCO Comparison

Solution Infrastructure Operations Total (3Y) vs ThemisDB
ThemisDB $54K $113K $167K Baseline
AWS Stack $234K $65K $299K +79%
PostgreSQL $156K $76K $232K +39%
MongoDB $180K $75K $255K +53%

Cost Savings Breakdown:

  • Infrastructure: 46-77% (self-hosted vs managed)
  • Embedding Cache: 70-90% API cost reduction
  • vLLM Co-Location: Share GPU infrastructure
  • Zero egress costs: Save 10-20% on data transfer

Reference Architectures

RAG Platform Architecture

┌──────────────────────────────────────────────────────────────┐
│                      Application Layer                       │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐           │
│  │  Web App   │  │  Mobile    │  │  API       │           │
│  └────────────┘  └────────────┘  └────────────┘           │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                   ThemisDB + vLLM                           │
│  ┌────────────────────────────────────────────────────┐    │
│  │  ThemisDB (50 cores, 200 GB RAM, 30% GPU)        │    │
│  │  - Vector Index (FAISS IVF+PQ, 1B vectors)       │    │
│  │  - Embedding Cache (70-90% cost savings)         │    │
│  │  - Hybrid Search (BM25 + Vector RRF)             │    │
│  │  - Document Store                                  │    │
│  └────────────────────────────────────────────────────┘    │
│  ┌────────────────────────────────────────────────────┐    │
│  │  vLLM (14 cores, 56 GB RAM, 70% GPU)             │    │
│  │  - LLaMA 2 70B / GPT-3.5 equivalent              │    │
│  │  - Low-priority CUDA streams                       │    │
│  └────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

Performance:

  • Query latency: 50-200ms (p99)
  • Embedding cache hit rate: 80%+
  • vLLM inference: 1-2ms (GPU) vs 5-10ms (CPU fallback)
  • Throughput: 10K+ queries/sec

IoT Time-Series Platform

┌──────────────────────────────────────────────────────────────┐
│                    IoT Devices (100K+)                       │
│     Sensors → Edge Gateways → Data Ingestion                │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                      ThemisDB Cluster                        │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Hypertables (TimescaleDB compatibility)           │   │
│  │  - 1-day chunks (Column Families)                   │   │
│  │  - TTL-based retention (30 days)                    │   │
│  │  - ZSTD compression (5x storage reduction)          │   │
│  └─────────────────────────────────────────────────────┘   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Time-Series Aggregates (Arrow SIMD)               │   │
│  │  - Resample (1-second → 1-minute)                  │   │
│  │  - Rolling windows (5-minute, 1-hour)              │   │
│  │  - Percentiles (P50, P95, P99)                     │   │
│  │  - 5-10x faster than SQL                           │   │
│  └─────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

Performance:

  • Ingestion rate: 1M+ points/sec
  • Query latency (aggregates): 10-100ms
  • Storage efficiency: 80% compression
  • Retention: Automatic with TTL

Security Architecture

Zero-Trust Model

┌──────────────────────────────────────────────────────────────┐
│                        Internet                              │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                    WAF + DDoS Protection                     │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                  Load Balancer (mTLS)                        │
└──────────────────────────────────────────────────────────────┘
                           │
┌──────────────────────────────────────────────────────────────┐
│                   ThemisDB Cluster                           │
│  ┌────────────────────────────────────────────────────┐    │
│  │  Security Layer                                     │    │
│  │  - mTLS (client certificate validation)            │    │
│  │  - RBAC (role-based access control)                │    │
│  │  - Audit logging (all operations)                  │    │
│  │  - Encryption at rest (AES-256)                    │    │
│  │  - Signed requests (RSA-SHA256)                    │    │
│  └────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘

Summary

Architect's Checklist:

  • ✅ Design sharding strategy (hash vs range)
  • ✅ Plan capacity for 3-5 year growth
  • ✅ Calculate TCO vs hyperscaler alternatives
  • ✅ Define migration strategy (strangler pattern)
  • ✅ Implement security in depth (mTLS, RBAC, encryption)
  • ✅ Design for HA and DR (RTO/RPO targets)
  • ✅ Monitor and optimize costs continuously

Key Benefits:

  • 46% cost savings vs AWS complete stack
  • Billion-scale vector search capability
  • Strong consistency with TrueTime
  • Zero vendor lock-in (self-hosted, multi-cloud)
  • Production-ready (9.3/10 audit rating)

Next Steps:

ThemisDB Dokumentation

Version: 1.3.0 | Stand: Dezember 2025


📋 Schnellstart


🏗️ Architektur


🗄️ Basismodell


💾 Storage & MVCC


📇 Indexe & Statistiken


🔍 Query & AQL


💰 Caching


📦 Content Pipeline


🔎 Suche


⚡ Performance & Benchmarks


🏢 Enterprise Features


✅ Qualitätssicherung


🧮 Vektor & GNN


🌍 Geo Features


🛡️ Sicherheit & Governance

Authentication

Schlüsselverwaltung

Verschlüsselung

TLS & Certificates

PKI & Signatures

PII Detection

Vault & HSM

Audit & Compliance

Security Audits

Gap Analysis


🚀 Deployment & Betrieb

Docker

Observability

Change Data Capture

Operations


💻 Entwicklung

API Implementations

Changefeed

Security Development

Development Overviews


📄 Publikation & Ablage


🔧 Admin-Tools


🔌 APIs


📚 Client SDKs


📊 Implementierungs-Zusammenfassungen


📅 Planung & Reports


📖 Dokumentation


📝 Release Notes


📖 Styleguide & Glossar


🗺️ Roadmap & Changelog


💾 Source Code Documentation

Main Programs

Source Code Module


🗄️ Archive


🤝 Community & Support


Vollständige Dokumentation: https://makr-code.github.io/ThemisDB/

Clone this wiki locally