Skip to content

Latest commit

 

History

History
552 lines (414 loc) · 26.2 KB

File metadata and controls

552 lines (414 loc) · 26.2 KB

Concourse Architecture

This document provides a comprehensive overview of Concourse's system architecture, explaining how the various components work together to deliver a distributed database warehouse for transactions, search, and analytics across time.

Table of Contents

System Overview

Concourse is architected as a multi-layered system with clear separation between client drivers, the core server, storage engine, and extensible plugin framework.

┌─────────────────────────────────────────────────────────────────────────┐
│                           Client Layer                                   │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │
│  │  Java    │ │  Python  │ │   PHP    │ │   Ruby   │ │  CaSH    │       │
│  │  Driver  │ │  Driver  │ │  Driver  │ │  Driver  │ │  Shell   │       │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
└───────┼────────────┼────────────┼────────────┼────────────┼─────────────┘
        │            │            │            │            │
        └────────────┴────────────┴─────┬──────┴────────────┘
                                        │
┌───────────────────────────────────────┼─────────────────────────────────┐
│                          Concourse Server                                │
│  ┌────────────────────────────────────┼────────────────────────────┐    │
│  │              Communication Layer   │                             │    │
│  │  ┌─────────────────┐  ┌───────────┴───────────┐                 │    │
│  │  │   Thrift RPC    │  │     HTTP/REST API     │                 │    │
│  │  └────────┬────────┘  └───────────┬───────────┘                 │    │
│  └───────────┼───────────────────────┼─────────────────────────────┘    │
│              └───────────┬───────────┘                                   │
│  ┌───────────────────────┼─────────────────────────────────────────┐    │
│  │           ConcourseServer (Core API)                             │    │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │    │
│  │  │   Operations    │  │  Plugin Manager │  │  HTTP Server    │  │    │
│  │  └────────┬────────┘  └────────┬────────┘  └─────────────────┘  │    │
│  └───────────┼────────────────────┼────────────────────────────────┘    │
│              │                    │                                      │
│  ┌───────────┼────────────────────┼────────────────────────────────┐    │
│  │           │      Storage Engine│                                 │    │
│  │  ┌────────┴────────┐  ┌────────┴────────┐                       │    │
│  │  │     Engine      │  │   Environments  │                       │    │
│  │  └────────┬────────┘  └─────────────────┘                       │    │
│  │  ┌────────┴────────┐  ┌─────────────────┐                       │    │
│  │  │     Buffer      │  │    Database     │                       │    │
│  │  │ (Write-Optimized)│  │(Read-Optimized) │                       │    │
│  │  └─────────────────┘  └─────────────────┘                       │    │
│  └─────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘
                                        │
                                        │ IPC
                                        ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         Plugin JVMs (External)                           │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐          │
│  │    Plugin 1     │  │    Plugin 2     │  │    Plugin N     │          │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘          │
└─────────────────────────────────────────────────────────────────────────┘

Module Structure

Concourse is organized as a multi-module Gradle project. Each module serves a specific purpose:

Core Modules

Module Description
concourse-server The core database server containing the storage engine, transaction management, plugin system, and server APIs
concourse-driver-java The official Java client driver providing the Concourse API for connecting to and interacting with the server
concourse-plugin-core Core framework for developing Concourse plugins, including base classes and communication utilities
concourse-shell The CaSH (Concourse Shell) - an interactive Groovy-based shell for database operations
concourse-cli Command-line interface tools for server management and administration

Data Tools

Module Description
concourse-import Tools for importing data from various formats (CSV, JSON, etc.) into Concourse
concourse-export Tools for exporting data from Concourse to various formats

Other Language Drivers

Module Description
concourse-driver-python Python client driver
concourse-driver-php PHP client driver
concourse-driver-ruby Ruby client driver

Testing Modules

Module Description
concourse-unit-test-core Base classes and utilities for unit testing
concourse-ete-test-core End-to-end testing framework for running tests against real server instances
concourse-integration-tests Integration test suite
concourse-ete-tests End-to-end tests including cross-version and performance tests
concourse-plugin-core-tests Tests for the plugin framework
concourse-upgrade-tests Tests for upgrade scenarios between versions

Automation

Module Description
concourse-automation Tools for programmatic interaction with the codebase and release artifacts

Data Model

Concourse uses a document-graph data model that is both flexible and powerful:

Core Concepts

  • Record: A logical grouping of data about a single entity (person, place, thing). Each record has a unique 64-bit identifier and contains a collection of key/value pairs.

  • Key: A string attribute name that maps to one or more values within a record. Keys are schemaless - they don't need to be declared in advance.

  • Value: A dynamically-typed quantity associated with a key. Concourse natively supports: boolean, double, float, integer, long, string (UTF-8), and Link.

  • Link: A special value type that creates a reference from one record to another, enabling graph-like relationships.

Schema-Free Design

Concourse is schemaless, meaning:

  • No tables or collections to define
  • No schema migrations required
  • Keys can be added to any record at any time
  • Values are dynamically typed
  • The same key can hold different value types across records

Document-Graph Structure

Record 1                          Record 2
┌─────────────────────────┐      ┌─────────────────────────┐
│ name: "John Doe"        │      │ name: "Acme Corp"       │
│ age: 30                 │      │ industry: "Technology"  │
│ employer: @Link(2) ─────┼──────┼─▶                       │
│ skills: ["Java", "SQL"] │      │ employees: [@Link(1)]   │
└─────────────────────────┘      └─────────────────────────┘

Storage Engine Architecture

The storage engine is the heart of Concourse, designed for both high write throughput and efficient reads.

BufferedStore Pattern

Concourse uses a BufferedStore pattern that combines two complementary storage systems:

                    ┌─────────────────────────────────────┐
                    │              Engine                  │
                    │  ┌─────────┐       ┌─────────────┐  │
   Write ──────────▶│  │ Buffer  │──────▶│  Database   │  │
                    │  │ (Limbo) │       │  (Durable)  │  │
                    │  └─────────┘       └─────────────┘  │
                    │       │                   │         │
   Read ◀───────────│◀──────┴───────────────────┘         │
                    └─────────────────────────────────────┘

Engine

The Engine class is the primary coordinator for all storage operations:

  • Schedules concurrent CRUD operations
  • Manages ACID transactions
  • Versions writes and coordinates indexing
  • Provides global locking coordination via LockBroker

Buffer

The Buffer is a write-optimized, append-only storage system:

  • Accepts writes immediately with constant-time performance
  • Stores data in memory-mapped Pages
  • Provides durability through append-only log structure
  • Data is eventually transported to the Database for indexing

Database

The Database is a read-optimized, indexed storage system:

  • Stores data in immutable Segments
  • Maintains multiple index structures for efficient queries
  • Supports background compaction
  • Handles historical queries via version tracking

Segments

A Segment is an immutable storage unit containing three types of Chunks:

Chunk Type Purpose Record Type
TableChunk Primary data storage TableRecord - stores key/value pairs per record
IndexChunk Secondary indexes IndexRecord - inverted index mapping values to records
CorpusChunk Full-text search index CorpusRecord - search corpus with positional data
Segment
┌─────────────────────────────────────────────────────────┐
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │ TableChunk  │  │ IndexChunk  │  │  CorpusChunk    │  │
│  │             │  │             │  │                 │  │
│  │ Record Data │  │ Value→Recs  │  │ Term→Positions │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
│                                                         │
│  ┌─────────────┐  ┌─────────────────────────────────┐  │
│  │  Manifest   │  │         Bloom Filters           │  │
│  └─────────────┘  └─────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

Automatic Indexing

One of Concourse's key features is automatic indexing - every value is indexed without manual configuration.

Index Types

  1. Table Records: Primary storage mapping (record, key) → values
  2. Index Records: Secondary indexes mapping (key, value) → records
  3. Corpus Records: Full-text search indexes mapping (key, term, position) → records

Constant-Time Writes

Concourse guarantees constant-time writes through:

  1. Append-only Buffer: Writes are immediately appended to the Buffer
  2. Background Indexing: Full indexing happens asynchronously during transport
  3. SearchIndexer: Multi-threaded search indexing in the background

Search Indexing

The SearchIndexer handles asynchronous full-text indexing:

Write Operation
      │
      ▼
┌───────────┐     ┌──────────────┐     ┌──────────────┐
│   Buffer  │────▶│  Transporter │────▶│   Segment    │
└───────────┘     └──────────────┘     └──────┬───────┘
                                              │
                  ┌───────────────────────────┘
                  │
                  ▼
            ┌───────────┐
            │  Search   │──▶ Worker Thread 1
            │  Indexer  │──▶ Worker Thread 2
            │           │──▶ Worker Thread N
            └───────────┘

Data Transport

Data transport moves writes from the Buffer to the Database, where they become fully indexed.

Transport Strategies

Concourse supports two transport strategies:

Streaming Transporter (Legacy)

  • Processes writes incrementally in small batches
  • Transport operations compete with reads/writes for resources
  • Amortizes indexing cost across normal operations
  • Can cause "stop-the-world" situations under high load

Batch Transporter (Default)

  • Performs indexing entirely in the background
  • Reads and writes continue uninterrupted during indexing
  • Only brief critical section when merging indexed data
  • Dramatically improves system throughput
Batch Transport Flow
────────────────────

Buffer                    Background                  Database
┌─────────┐              ┌──────────────┐           ┌──────────┐
│ Page 1  │───────────▶  │ Create       │           │          │
│ Page 2  │              │ Segment +    │           │          │
│ Page 3  │              │ Index        │           │          │
└─────────┘              └──────┬───────┘           │          │
                                │                    │          │
                         ┌──────▼───────┐           │          │
                         │ Merge        │──────────▶│ Segments │
                         │ (Critical)   │           │          │
                         └──────────────┘           └──────────┘

Configuration

Transport behavior is configured in concourse.yaml:

# Simple configuration
transporter: batch  # or "streaming"

# Advanced configuration
transporter:
  type: batch
  num_threads: 2

Compaction

Compaction optimizes storage by merging adjacent Segments.

SimilarityCompactor

The default compaction strategy merges Segments with similar characteristics:

  • Reduces total number of Segments
  • Eliminates redundant revision data
  • Improves read performance by reducing I/O

Compaction Modes

  1. Incremental Compaction: Opportunistic, runs frequently but only if no conflicting work
  2. Full Compaction: Aggressive, blocks until it can process all Segments

Compaction Cycle

Segments: [S1] [S2] [S3] [S4] [S5]
              ╲   ╱
         Shift 1: Try compact S2+S3
                    │
                    ▼
Segments: [S1] [S2'] [S4] [S5]    (if successful)
               ╲    ╱
         Shift 2: Try compact S2'+S4
                    │
                    ▼
              ... continue ...

Version Control

Concourse automatically tracks all changes to data, enabling powerful time-travel queries.

Revision-Based Storage

Every write creates a new Revision rather than modifying existing data:

Key "name" in Record 1
────────────────────────
Version 1: ADD "John"     @ timestamp T1
Version 2: REMOVE "John"  @ timestamp T2
Version 3: ADD "Johnny"   @ timestamp T3

Historical Queries

Query data as it existed at any point in time:

// Get current value
get(key="name", record=1)

// Get value as of specific time
get(key="name", record=1, time=time("2023-01-01"))

// Find records matching criteria in the past
find("status = active", time=time("last month"))

Audit Trail

The audit() function provides a complete history of changes:

audit(record=1)
// Returns: Map<Timestamp, String> describing each revision

Revert Capability

Atomically revert data to a previous state:

revert(key="name", record=1, time=time("yesterday"))

Concurrency Model

Concourse uses sophisticated concurrency controls to ensure ACID compliance while maximizing throughput.

Just-in-Time (JIT) Locking

Rather than acquiring all locks upfront, Concourse uses JIT locking:

  • Locks are acquired only when needed
  • Reduces contention for non-conflicting operations
  • Enables higher concurrency

Lock Types

Lock Type Purpose
Token Locks Granular locks for specific (key, record) pairs
Range Locks Protect range queries from concurrent modifications
Transport Locks Coordinate Buffer-to-Database transport
Shared Locks Record-level coordination

LockBroker

The LockBroker is the central coordinator for all locks:

                    ┌─────────────────┐
                    │   LockBroker    │
                    │                 │
   Operation 1 ────▶│  Token Locks    │◀──── Operation 2
                    │  Range Locks    │
   Operation 3 ────▶│  Shared Locks   │◀──── Operation 4
                    └─────────────────┘

Atomic Operations and Transactions

  • AtomicOperation: Lightweight atomic operations within a single environment
  • Transaction: Full ACID transactions that can span multiple operations
// Atomic operation
stage
add("balance", 100, record1)
remove("balance", 50, record1)
commit

// Transaction with isolation
stage({
    // All operations here are isolated and atomic
    set("status", "complete", order)
    add("completed_orders", Link.to(order), customer)
})

Communication Layer

Thrift RPC (Primary)

The primary communication protocol uses Apache Thrift:

  • Defined in interface/concourse.thrift
  • Supports all client drivers
  • Binary protocol for efficiency

HTTP/REST API

RESTful API for web-based access:

  • JSON request/response format
  • RESTQL-compatible query syntax
  • Token-based authentication

Thrift Interface

The Thrift service definition includes:

service ConcourseService {
    void abort(AccessToken creds, TransactionToken transaction, string environment)
    long addKeyValue(string key, TObject value, AccessToken creds, ...)
    Map<Long, Boolean> addKeyValueRecords(string key, TObject value, list<i64> records, ...)
    // ... hundreds of methods
}

Plugin System

Concourse's plugin system enables extending functionality without modifying core code.

Plugin Architecture

Plugins run in separate JVMs and communicate via IPC:

┌─────────────────────────────────────────────────────────┐
│                    Concourse Server                      │
│  ┌─────────────────────────────────────────────────┐    │
│  │               Plugin Manager                     │    │
│  │  ┌──────────────┐  ┌──────────────────────┐     │    │
│  │  │   Registry   │  │  IPC Channels        │     │    │
│  │  └──────────────┘  │  - fromServer        │     │    │
│  │                    │  - fromPlugin        │     │    │
│  │                    └──────────────────────┘     │    │
│  └─────────────────────────────────────────────────┘    │
└───────────────────────────┬─────────────────────────────┘
                            │ IPC (Shared Memory)
        ┌───────────────────┼───────────────────┐
        │                   │                   │
        ▼                   ▼                   ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   Plugin A    │   │   Plugin B    │   │   Plugin C    │
│   (JVM 1)     │   │   (JVM 2)     │   │   (JVM 3)     │
└───────────────┘   └───────────────┘   └───────────────┘

Plugin Lifecycle

  1. Install: Bundle is extracted to the plugins directory
  2. Activate: Plugin classes are discovered (internal or external bootstrapping)
  3. Launch: Plugin JVM is started with IPC channels
  4. Run: Plugin processes requests and can call back to server
  5. Stop: Plugin JVM is terminated gracefully

Plugin Types

  • Standard Plugin: Extends Plugin base class
  • RealTimePlugin: Receives real-time data streams for live processing

Cross-Version Compatibility

Plugins can be compiled against different Java versions than the server:

  • External Bootstrapping: Discovers plugins via bytecode scanning without loading classes
  • Internal Bootstrapping: Uses Reflections for class discovery (requires version compatibility)

Plugin Configuration

plugins:
  bootstrapper: external  # or "internal"

# Per-plugin Java runtime
plugin_name:
  java_binary: /path/to/java

Further Reading