This guide covers strategies for scaling Pangolin to handle growing data volumes, user bases, and query loads.
When to Scale Up
- Single-tenant deployments with predictable growth
- Cost-effective for small to medium workloads
- Simpler operational model
Limits
- Hardware constraints (CPU, RAM, disk)
- Single point of failure
- Downtime required for upgrades
When to Scale Out
- Multi-tenant deployments
- Unpredictable or bursty workloads
- High availability requirements
- Large-scale deployments (100+ users, 1000+ tables)
Pangolin API servers are stateless, enabling easy horizontal scaling.
Load Balancing Strategy
┌──────────────┐
│ Load Balancer│
└──────┬───────┘
│
┌───┴────┬────────┬────────┐
│ │ │ │
┌──▼──┐ ┌──▼──┐ ┌──▼──┐ ┌──▼──┐
│ API │ │ API │ │ API │ │ API │
│ 1 │ │ 2 │ │ 3 │ │ 4 │
└──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘
│ │ │ │
└────────┴────────┴────────┘
│
┌──────▼──────┐
│ PostgreSQL │
└─────────────┘
Scaling Triggers
- CPU utilization > 70%
- Request latency p95 > 500ms
- Request queue depth > 100
Auto-Scaling Configuration (Kubernetes)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pangolin-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pangolin-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
Read Replicas
┌─────────────┐
│ Primary │ ◄── Writes
└──────┬──────┘
│ Replication
┌───┴────┬────────┐
│ │ │
┌──▼──┐ ┌──▼──┐ ┌──▼──┐
│Read │ │Read │ │Read │ ◄── Reads
│Rep 1│ │Rep 2│ │Rep 3│
└─────┘ └─────┘ └─────┘
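The topology above requires application-side routing: writes must go to the primary, while reads can be spread across replicas. A minimal sketch of that routing logic, assuming plain DSN strings handed to your database driver (class and host names here are illustrative, not Pangolin configuration):

```python
import itertools

class ReplicaRouter:
    """Route writes to the primary; round-robin reads over replicas."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary_dsn = primary_dsn
        self._replicas = itertools.cycle(replica_dsns)

    def dsn_for(self, statement):
        # Crude verb sniffing; a real router would inspect transaction state too.
        verb = statement.lstrip().split(None, 1)[0].upper()
        if verb in ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"):
            return self.primary_dsn
        return next(self._replicas)

router = ReplicaRouter(
    "host=primary dbname=pangolin",
    ["host=replica1 dbname=pangolin", "host=replica2 dbname=pangolin"],
)
print(router.dsn_for("SELECT * FROM assets"))      # first replica
print(router.dsn_for("INSERT INTO assets ..."))    # primary
```

Note that replicas lag the primary slightly, so read-your-writes flows should pin to the primary for the duration of the request.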
Connection Pooling
# PgBouncer configuration
[databases]
pangolin = host=localhost port=5432 dbname=pangolin
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
reserve_pool_size = 5
Partitioning Strategy
-- Partition audit logs by month
CREATE TABLE audit_logs (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMP NOT NULL,
    action VARCHAR(50),
    ...
) PARTITION BY RANGE (timestamp);

CREATE TABLE audit_logs_2025_01 PARTITION OF audit_logs
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
Sharding
// Shard by tenant_id for multi-tenancy
sh.shardCollection("pangolin.assets", { tenant_id: 1, catalog: 1 })
// Shard by catalog for single-tenant
sh.shardCollection("pangolin.assets", { catalog: 1, namespace: 1 })
Shared Infrastructure (Cost-Effective)
- All tenants share same API servers and database
- Logical isolation via tenant_id
- Suitable for: 100s-1000s of small tenants
Dedicated Infrastructure (High Isolation)
- Each tenant gets dedicated resources
- Physical isolation
- Suitable for: Enterprise customers, compliance requirements
Hybrid Approach (Recommended)
- Shared API servers
- Tenant-specific database schemas or databases
- Balance of cost and isolation
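One way to realize the hybrid model is a PostgreSQL schema per tenant, provisioned when the tenant is onboarded. A minimal sketch; the `tenant_` prefix, statement shapes, and table names are illustrative assumptions, not Pangolin's actual DDL:

```python
import re

def tenant_schema_ddl(tenant_id):
    """Generate per-tenant schema provisioning statements (hypothetical shapes)."""
    # Validate the id before interpolating it into DDL to avoid SQL injection.
    if not re.fullmatch(r"[a-z][a-z0-9_]{0,62}", tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    schema = f"tenant_{tenant_id}"
    return [
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
        f"CREATE TABLE IF NOT EXISTS {schema}.assets (LIKE public.assets INCLUDING ALL);",
        f"SET search_path TO {schema}, public;",
    ]

for stmt in tenant_schema_ddl("acme"):
    print(stmt)
```

Schema-per-tenant keeps the shared connection pool and backups of the shared model while giving each tenant its own namespace for quotas, migrations, and per-tenant exports.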
-- Monitor per-tenant resource usage
SELECT
    tenant_id,
    COUNT(*) AS table_count,
    SUM(size_bytes) AS total_size
FROM assets
GROUP BY tenant_id
ORDER BY total_size DESC;
Flat Structure (Small Scale)
catalog
├── namespace1
├── namespace2
└── namespace3
Hierarchical Structure (Large Scale)
catalog
├── department
│ ├── team1
│ │ ├── project_a
│ │ └── project_b
│ └── team2
└── shared
Best Practices
- Limit namespace depth to 3-4 levels
- Use consistent naming conventions
- Implement namespace quotas
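The three practices above can be enforced at namespace-creation time. A minimal validator sketch; the naming regex, depth limit, and quota default are assumptions to adapt to your conventions:

```python
import re

MAX_DEPTH = 4                               # matches the 3-4 level guidance above
NAME_RE = re.compile(r"[a-z][a-z0-9_]*$")   # assumed naming convention

def validate_namespace(path, existing_count=0, quota=100):
    """Return an error string if the dotted path breaks a rule, else None."""
    parts = path.split(".")
    if len(parts) > MAX_DEPTH:
        return f"too deep: {len(parts)} levels (max {MAX_DEPTH})"
    for part in parts:
        if not NAME_RE.match(part):
            return f"bad segment name: {part!r}"
    if existing_count >= quota:
        return f"quota exceeded: {existing_count}/{quota}"
    return None

print(validate_namespace("department.team1.project_a"))  # None (valid)
print(validate_namespace("a.b.c.d.e"))                   # too deep
```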
Anti-Pattern: Single Mega-Catalog
❌ production (10,000 tables)
Best Practice: Multiple Catalogs
✅ sales_analytics (500 tables)
✅ marketing_data (300 tables)
✅ finance_reports (200 tables)
✅ operations_metrics (400 tables)
Audit Log Archival
-- Archive logs older than 90 days
-- (run both statements in one transaction so rows are never lost mid-archive)
BEGIN;
INSERT INTO audit_logs_archive
SELECT * FROM audit_logs
WHERE timestamp < NOW() - INTERVAL '90 days';
DELETE FROM audit_logs
WHERE timestamp < NOW() - INTERVAL '90 days';
COMMIT;
Snapshot Cleanup
# Remove snapshots older than retention policy
# Implement via maintenance operations
pangolin-admin optimize-catalog --catalog production \
  --remove-old-snapshots --retention-days 30
Essential Indexes
-- Catalog operations
CREATE INDEX idx_assets_catalog_namespace ON assets(catalog_id, namespace);
CREATE INDEX idx_assets_tenant ON assets(tenant_id);
-- Search and discovery
CREATE INDEX idx_assets_name ON assets(name);
CREATE INDEX idx_assets_kind ON assets(kind);
-- Audit queries
CREATE INDEX idx_audit_tenant_time ON audit_logs(tenant_id, timestamp DESC);
CREATE INDEX idx_audit_action ON audit_logs(action);
Redis Configuration
# Cache frequently accessed metadata
cache:
  type: redis
  host: redis-cluster
  ttl: 300  # 5 minutes
  # Cache keys
  patterns:
    - "catalog:*:metadata"
    - "namespace:*:list"
    - "permissions:user:*"
Cache Invalidation
- Invalidate on write operations
- Use pub/sub for multi-instance coordination
- Implement cache warming for critical data
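The three rules above can be modeled with a small in-process stand-in for Redis pub/sub: a write publishes a key pattern, and every subscribed instance evicts matching entries from its local cache. This sketch captures only the coordination logic; with real Redis you would use PUBLISH/SUBSCRIBE channels instead of the in-memory bus shown here:

```python
import fnmatch

class InvalidationBus:
    """In-memory stand-in for a pub/sub channel shared by API instances."""

    def __init__(self):
        self._caches = []

    def subscribe(self, cache):
        self._caches.append(cache)

    def publish(self, pattern):
        # Every instance evicts keys matching the published glob pattern.
        for cache in self._caches:
            for key in [k for k in cache if fnmatch.fnmatch(k, pattern)]:
                del cache[key]

bus = InvalidationBus()
cache_a = {"catalog:sales:metadata": "...", "namespace:sales:list": "..."}
cache_b = {"catalog:sales:metadata": "..."}
bus.subscribe(cache_a)
bus.subscribe(cache_b)

bus.publish("catalog:sales:*")   # a write to the sales catalog
print(sorted(cache_a))           # ['namespace:sales:list']
print(sorted(cache_b))           # []
```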
# Cache expensive queries
# (assumes an application-level TTL cache decorator, e.g. backed by Redis)
@cache(ttl=600)  # 10 minutes
def get_catalog_summary(catalog_id):
    return {
        'table_count': count_tables(catalog_id),
        'total_size': calculate_size(catalog_id),
        'namespaces': list_namespaces(catalog_id),
    }
Prefix Distribution
# Anti-pattern: Sequential prefixes
s3://bucket/table-00001/
s3://bucket/table-00002/
s3://bucket/table-00003/
# Best practice: Hash-based prefixes
s3://bucket/a7f3/table-00001/
s3://bucket/b2e9/table-00002/
s3://bucket/c4d1/table-00003/
Request Rate Limits
- S3: 3,500 PUT/COPY/POST/DELETE, 5,500 GET/HEAD per prefix per second
- Use multiple prefixes for high-throughput workloads
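Hash-based prefixes like those above can be derived deterministically from the table name, so any writer computes the same key without coordination. A minimal sketch; the helper name, bucket, and 4-character prefix length are illustrative assumptions:

```python
import hashlib

def hashed_key(table_name, bucket="bucket"):
    """Derive a short hash prefix so objects spread across S3 key partitions."""
    # MD5 is fine here: we only need dispersion, not cryptographic strength.
    prefix = hashlib.md5(table_name.encode()).hexdigest()[:4]
    return f"s3://{bucket}/{prefix}/{table_name}/"

print(hashed_key("table-00001"))
print(hashed_key("table-00002"))
```

Because each distinct 4-hex prefix gets its own request-rate budget, this spreads a high-throughput workload across up to 65,536 prefixes instead of one.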
Multi-Region Deployment
Primary Region: us-east-1
├── Catalog metadata: RDS us-east-1
└── Table data: S3 us-east-1
Secondary Region: eu-west-1
├── Read replica: RDS eu-west-1
└── Replicated data: S3 eu-west-1
API Response Times
- List catalogs: < 100ms (p95)
- Get table metadata: < 50ms (p95)
- Create table: < 200ms (p95)
- Search: < 500ms (p95)
Throughput
- 1,000 requests/second per API instance
- 10,000 concurrent users per deployment
Scale Limits
- 100,000 tables per catalog
- 1,000 catalogs per deployment
- 10,000 users per tenant
-- Track growth trends
SELECT
    DATE_TRUNC('day', created_at) AS date,
    COUNT(*) AS new_tables,
    SUM(COUNT(*)) OVER (ORDER BY DATE_TRUNC('day', created_at)) AS cumulative
FROM assets
GROUP BY DATE_TRUNC('day', created_at)
ORDER BY date DESC
LIMIT 30;
Forecast Resource Needs
- Monitor growth rate (tables, users, queries)
- Project 6-12 months ahead
- Plan scaling actions proactively
Alerting Thresholds
- Database connections > 80% of pool
- Disk usage > 70%
- API latency p95 > 500ms
- Error rate > 1%
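The thresholds above translate directly into an alert-evaluation step in your monitoring pipeline. A minimal sketch; the metric names and percentage units are assumptions, not Pangolin's actual metric schema:

```python
# Threshold table mirroring the alerting list above.
THRESHOLDS = {
    "db_connection_pool_pct": 80,   # > 80% of pool in use
    "disk_usage_pct": 70,           # > 70% disk used
    "api_latency_p95_ms": 500,      # > 500 ms p95 latency
    "error_rate_pct": 1,            # > 1% errors
}

def check_alerts(metrics):
    """Return the names of metrics that breach their thresholds."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]

print(check_alerts({"disk_usage_pct": 82, "api_latency_p95_ms": 120}))
# ['disk_usage_pct']
```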
Right-Sizing
- Start small, scale based on actual usage
- Use auto-scaling to handle peaks
- Scale down during off-hours
Storage Tiering
# Move old data to cheaper storage
aws s3 mv s3://hot-bucket/old-data/ s3://archive-bucket/old-data/ \
  --recursive --storage-class GLACIER_IR
Shared Resources
- API servers: Shared across all tenants
- Database: Shared with logical isolation
- Monitoring: Centralized
Dedicated Resources (Premium Tier)
- Dedicated API instances
- Isolated database
- Priority support
Scaling Checklist
- Identify the bottleneck (CPU, memory, database, network)
- Review current resource utilization
- Analyze query patterns and hot spots
- Test scaling strategy in staging
- Monitor metrics continuously
- Validate health checks
- Test failover scenarios
- Verify data consistency
- Measure performance improvement
- Update capacity planning
- Document changes
- Review cost impact