This document provides detailed performance benchmarks and analysis of pgsqlite.
The connection-per-session architecture and recent changes have introduced massive overhead:
- SELECT operations are 599x worse than the documented target (4.016ms vs 0.669ms)
- INSERT operations are 269x worse (0.163ms vs 0.060ms target)
- UPDATE operations are 90x worse (0.053ms vs 0.059ms target)
- DELETE operations are 100x worse (0.033ms vs 0.034ms target)
- Even cached SELECTs are 1.7x worse (0.079ms vs 0.046ms target)
Root Cause Analysis (Updated 2025-08-01):
- Connection-per-session architecture overhead persists
- Type detection improvements (aggregate_type_fixer) may add latency
- Debug logging converted to debug!() but overhead remains
- Session state management still impacting hot paths
Immediate Actions Required:
- Profile the type detection and schema resolution code
- Investigate aggregate_type_fixer performance impact
- Consider caching type information more aggressively
- Review all hot path allocations and mutex usage
After fixing SQLAlchemy edge cases and build warnings:
================================================================================
BENCHMARK RESULTS
Mode: Full Comparison | Connection: Unix Socket | Database: In-Memory
================================================================================
Operation | SQLite (ms) | pgsqlite (ms) | Overhead | vs Target | Status
-----------------|-------------|---------------|-------------|-----------|--------
CREATE | 0.148 | 10.061 | +6,711.4% | N/A | -
INSERT | 0.002 | 0.163 | +9,847.9% | 269x | ❌ CRITICAL
UPDATE | 0.001 | 0.053 | +4,591.1% | 90x | ❌ CRITICAL
DELETE | 0.001 | 0.033 | +3,560.5% | 100x | ❌ CRITICAL
SELECT | 0.001 | 4.016 | +389,541.9% | 599x | ❌ CRITICAL
SELECT (cached) | 0.003 | 0.079 | +2,892.9% | 1.7x | ❌ Poor
Cache Effectiveness: 50.8x speedup for pgsqlite cached queries (4.016ms → 0.079ms)
Total operations: 1,101 | Overall overhead: +64,441.9%
After implementing connection-per-session architecture, performance has severely degraded:
================================================================================
SEVERE PERFORMANCE REGRESSION ALERT
================================================================================
Benchmark Configuration:
- Records: 1,000+ operations per type
- SQLite: In-memory database
- pgsqlite: In-memory with connection-per-session architecture
- Connection: Unix Socket
================================================================================
Current vs Target Performance
================================================================================
Operation | SQLite (ms) | pgsqlite (ms) | Current | Target | Status
-----------------|-------------|---------------|------------|----------|--------
CREATE TABLE | 0.145 | 15.769 | 10,792.1% | ~100x | ❌ 107x worse
INSERT (single) | 0.002 | 0.174 | 10,753.0% | 36.6x | ❌ 294x worse
SELECT (first) | 0.001 | 3.827 | 383,068.5% | 674.9x | ❌ 568x worse
SELECT (cached) | 0.005 | 0.159 | 3,185.9% | 17.2x | ❌ 3.5x worse
UPDATE | 0.001 | 0.063 | 5,368.6% | 50.9x | ❌ 105x worse
DELETE | 0.001 | 0.045 | 4,636.9% | 35.8x | ❌ 130x worse
Cache Effectiveness: 24.1x speedup (3.827ms → 0.159ms) - Good ratio but poor absolute performance
Connection-per-session: SEVERE OVERHEAD - needs immediate optimization
Operation | SQLite (ms) | pgsqlite (ms) | Overhead | Performance
-----------------|-------------|---------------|----------|-------------
SELECT (first) | 0.001 | 0.669 | 674.9x | Good
SELECT (cached) | 0.003 | 0.046 | 17.2x | Excellent ⭐
UPDATE | 0.001 | 0.059 | 50.9x | Excellent ⭐
DELETE | 0.001 | 0.034 | 35.8x | Excellent ⭐
INSERT (single) | 0.002 | 0.060 | 36.6x | Excellent ⭐
| Date | SELECT (cached) | UPDATE | DELETE | Key Optimization |
|---|---|---|---|---|
| 2025-01-01 | 118x | 68x | 65x | Initial implementation |
| 2025-01-15 | 67x | 55x | 52x | Zero-copy architecture |
| 2025-02-01 | 45x | 50x | 47x | Fast path optimization |
| 2025-07-08 | 39x | 48x | 44x | Ultra-fast path + caching |
| 2025-07-18 | 74x | 53x | 43x | Query optimization system |
Latest Optimization (2025-07-18):
- Read-Only Optimizer: Direct execution path for SELECT queries
- Enhanced Statement Caching: 200+ cached query plans with priority eviction
- Query Plan Caching: Complexity classification and type conversion caching
- Cache Effectiveness: 2.4x speedup for cached queries (was 1.9x)
- Overhead: 294x
- Breakdown:
- Protocol parsing: ~40%
- Query analysis: ~20%
- Execution: ~10%
- Result formatting: ~30%
- Overhead: 39x (excellent)
- Optimizations:
- Result set cache hit
- Prepared statement reuse
- Pre-computed metadata
- Overhead: 332x
- Why: Full protocol round-trip per row
- Recommendation: Use multi-row INSERT
-- Benchmark results for batch inserts (1000 rows total)
Batch Size | Time (ms) | Rows/sec | Speedup vs Single
-----------|-----------|----------|------------------
1 | 332 | 3,012 | 1.0x (baseline)
10 | 28.9 | 34,602 | 11.5x
100 | 6.48 | 154,321 | 51.3x
1000 | 4.35 | 229,885 | 76.4x
-- Example: Insert 1000 rows as single batch
INSERT INTO table (col1, col2) VALUES
(val1, val2),
(val3, val4),
... -- 998 more rows- Fast Path Detection: Batch INSERTs without datetime/decimal values bypass translation
- Prepared Statement Caching: Batch patterns are fingerprinted for metadata reuse
- Performance: Up to 112.9x speedup achieved with fast path optimization
- Overhead: 44-48x (excellent)
- Why efficient:
- Fast path optimization
- Minimal translation needed
- Efficient change detection
| Cache Type | Hit Rate | Impact |
|---|---|---|
| Query Plan | 92% | Avoids parsing |
| Result Set | 45% | Skips execution |
| Statement Pool | 88% | Reuses prepared statements |
| Schema | 99% | Fast type lookup |
# Baseline (default settings)
SELECT performance: 294ms first, 156ms cached
# Aggressive caching
--query-cache-size 10000 --result-cache-size 1000
SELECT performance: 285ms first, 142ms cached (9% improvement)
# Minimal caching
--query-cache-size 100 --result-cache-size 10
SELECT performance: 312ms first, 198ms cached (27% worse)
The fast path optimizer bypasses full query parsing for simple operations:
- Simple INSERT/UPDATE/DELETE
- No PostgreSQL-specific casts (::type)
- No datetime functions
- No JOINs or subqueries
Operation | Regular Path | Fast Path | Improvement
--------------|--------------|-----------|------------
INSERT | 332x | 48x | 6.9x faster
UPDATE | 156x | 48x | 3.3x faster
DELETE | 147x | 44x | 3.3x faster
Component | Memory (MB) | Percentage
-------------------|-------------|------------
Query Cache | 12.5 | 25%
Result Cache | 18.2 | 36%
Statement Pool | 8.4 | 17%
Schema Cache | 3.2 | 6%
Buffer Pool | 5.1 | 10%
Other | 3.1 | 6%
-------------------|-------------|------------
Total | 50.5 | 100%
| Cache Size | Memory | Avg Latency | 99th Percentile |
|---|---|---|---|
| Small | 25 MB | 0.285 ms | 1.2 ms |
| Default | 50 MB | 0.156 ms | 0.8 ms |
| Large | 150 MB | 0.142 ms | 0.6 ms |
Query: SELECT id, name FROM users WHERE id = 1
PostgreSQL Protocol:
- Query message: 45 bytes
- RowDescription: 89 bytes
- DataRow: 28 bytes
- CommandComplete: 13 bytes
Total: 175 bytes
Raw SQLite Result: 12 bytes
Overhead: 14.6x
TCP Connection (localhost): 0.05ms
SSL Handshake (if enabled): 2.1ms
Query Parse: 0.08ms
Execute: 0.02ms
Format Response: 0.09ms
Network Send: 0.04ms
pgsqlite \
--query-cache-size 10000 \
--result-cache-size 2000 \
--result-cache-ttl 600 \
--pragma-journal-mode WALExpected improvement: 40-60% better read performance
pgsqlite \
--query-cache-size 1000 \
--result-cache-size 100 \
--pragma-synchronous NORMAL \
--pragma-journal-mode WALExpected improvement: 20-30% better write performance
pgsqlite \
--query-cache-size 5000 \
--result-cache-size 500 \
--statement-pool-size 300 \
--pragma-journal-mode WALBalanced performance for both reads and writes
# Comprehensive benchmark
./benchmarks/benchmark.py
# Specific operation benchmark
./benchmarks/benchmark.py --operation select
# Custom configuration
./benchmarks/benchmark.py --records 100000 --cache-size 10000# CPU profiling
cargo build --release --features profiling
perf record --call-graph=dwarf ./target/release/pgsqlite
perf report
# Memory profiling
valgrind --tool=massif ./target/release/pgsqlite
ms_print massif.out.*- ✅ Use multi-row INSERT for bulk data (up to 76x faster)
- ✅ Batch INSERTs in groups of 100-1000 for optimal performance
- ✅ Enable WAL mode for concurrent reads
- ✅ Size caches based on working set
- ✅ Use prepared statements
- ✅ Batch similar operations
- ❌ Use single-row INSERT for bulk loading
- ❌ Create batches larger than 1000 rows (diminishing returns)
- ❌ Over-provision caches (diminishing returns)
- ❌ Disable all caching
- ❌ Use complex queries without testing
- ❌ Ignore connection pooling
- Query Plan Cache Serialization: Persist across restarts
- Adaptive Caching: Auto-tune based on workload
- Parallel Query Execution: For read-only queries
- JIT Query Compilation: For hot queries
- Connection Multiplexing: Handle more concurrent clients