This document outlines the design for a type-mapping system for a service that speaks the PostgreSQL wire protocol on top of a SQLite storage backend. The core challenge is ensuring consistent bidirectional type conversion between PostgreSQL and SQLite, especially when SQLite does not natively support the full range of PostgreSQL types.
- Preserve PostgreSQL type information when SQLite lacks native equivalents
- Enable correct wire protocol behavior for
CREATE,INSERT,SELECT,UPDATE, andDELETE - Avoid reliance on column names alone
- Enable inference of types for functions such as
COUNT,AVG,MAX, etc.
The following table maps supported PostgreSQL types to SQLite storage representations:
| PostgreSQL Type | SQLite Type | Custom Type | Notes |
|---|---|---|---|
| BOOLEAN | INTEGER | - | Stored as 0/1 |
| SMALLINT | INTEGER | - | Integer range |
| INTEGER | INTEGER | - | Default |
| BIGINT | INTEGER | - | 64-bit signed |
| REAL | TEXT | DECIMAL | 32-bit float stored as decimal for precision |
| DOUBLE PRECISION | TEXT | DECIMAL | 64-bit float stored as decimal for precision |
| NUMERIC/DECIMAL | TEXT | DECIMAL | Custom type with precision/scale constraints |
| CHAR/VARCHAR/TEXT | TEXT | - | Length constraints enforced for (n) variants |
| UUID | TEXT | - | Format validation in code |
| DATE | INTEGER | - | Days since Unix epoch (1970-01-01) |
| TIMESTAMP | INTEGER | - | Microseconds since Unix epoch |
| TIMESTAMPTZ | INTEGER | - | Microseconds since Unix epoch |
| TIME | INTEGER | - | Microseconds since midnight |
| TIMETZ | INTEGER | - | Microseconds since midnight |
| BYTEA | BLOB | - | Binary |
| JSON/JSONB | TEXT | - | Validated/serialized in code |
| ENUM | TEXT | ENUM | Full PostgreSQL ENUM support with CHECK constraints |
| ARRAY | TEXT | - | JSON string |
| SERIAL/BIGSERIAL | INTEGER | - | Use AUTOINCREMENT |
| MONEY | TEXT | - | Currency values with validation |
| INT4RANGE | TEXT | - | Integer ranges stored as strings |
| INT8RANGE | TEXT | - | Big integer ranges stored as strings |
| NUMRANGE | TEXT | - | Numeric ranges stored as strings |
| CIDR | TEXT | - | Network addresses with prefix |
| INET | TEXT | - | IP addresses with optional prefix |
| MACADDR | TEXT | - | 6-byte MAC addresses |
| MACADDR8 | TEXT | - | 8-byte MAC addresses |
| BIT | TEXT | - | Fixed-length bit strings |
| BIT VARYING | TEXT | - | Variable-length bit strings |
PGSQLite implements custom SQLite types using user-defined functions to provide better PostgreSQL compatibility:
The DECIMAL custom type provides arbitrary precision decimal arithmetic using the rust_decimal library. This ensures accurate calculations without floating-point errors.
Functions provided:
decimal_from_text(text)- Convert text to DECIMALdecimal_to_text(decimal)- Convert DECIMAL to textdecimal_add(a, b)- Additiondecimal_sub(a, b)- Subtractiondecimal_mul(a, b)- Multiplicationdecimal_div(a, b)- Divisiondecimal_sum()- Aggregate SUMdecimal_avg()- Aggregate AVGdecimal_min()- Aggregate MINdecimal_max()- Aggregate MAX
PostgreSQL ENUM types are fully supported with automatic CHECK constraint generation:
DDL Support:
CREATE TYPE name AS ENUM ('value1', 'value2', ...)- Create new ENUM typeALTER TYPE name ADD VALUE 'new_value' [BEFORE|AFTER 'existing_value']- Add enum valuesDROP TYPE name [IF EXISTS]- Drop ENUM type with dependency checking
Implementation:
- ENUM values are stored as TEXT in SQLite
- Trigger-based validation enforces valid values (enables ALTER TYPE ADD VALUE on existing tables)
- Metadata stored in
__pgsqlite_enum_typesand__pgsqlite_enum_valuestables - Usage tracking in
__pgsqlite_enum_usagefor dependency checking - Full system catalog integration (pg_type, pg_enum)
- Type casting support with both
::andCAST()syntax
All datetime types use INTEGER storage for perfect precision (no floating point errors):
Storage Formats:
- DATE: INTEGER days since Unix epoch (1970-01-01)
- TIME/TIMETZ: INTEGER microseconds since midnight (0-86,399,999,999)
- TIMESTAMP/TIMESTAMPTZ: INTEGER microseconds since Unix epoch
- INTERVAL: INTEGER microseconds duration
Benefits:
- Perfect microsecond precision (PostgreSQL's maximum precision)
- No floating-point rounding errors
- Efficient storage and indexing
- Fast arithmetic operations
- Consistent behavior across platforms
Conversion Functions:
to_timestamp(seconds)- Convert seconds to microseconds timestampNOW()- Returns current timestamp as microsecondsCURRENT_DATE- Returns current date as epoch daysEXTRACT(field FROM timestamp)- Returns integer valuesDATE_TRUNC(field, timestamp)- Returns truncated timestamp as microseconds
Performance Optimizations:
- Dedicated type converters with indices 6 (date), 7 (time), 8 (timestamp)
- Buffer-based formatting avoiding string allocations
- Fast-path execution for all datetime columns
- Optimized datetime formatting functions:
format_days_to_date_buf()- Direct buffer writing for datesformat_microseconds_to_time_buf()- Direct buffer writing for timesformat_microseconds_to_timestamp_buf()- Direct buffer writing for timestamps
The following PostgreSQL native types are not yet mapped to SQLite equivalents:
| PostgreSQL Type | Suggested SQLite Type | Notes |
|---|---|---|
| POINT | TEXT | Store as JSON: {"x": x, "y": y} |
| LINE | TEXT | Store as JSON: {"A": a, "B": b, "C": c} |
| LSEG | TEXT | Store as JSON: [{"x": x1, "y": y1}, {"x": x2, "y": y2}] |
| BOX | TEXT | Store as JSON: {"upper_right": {...}, "lower_left": {...}} |
| PATH | TEXT | Store as JSON array of points |
| POLYGON | TEXT | Store as JSON array of points |
| CIRCLE | TEXT | Store as JSON: {"center": {...}, "radius": r} |
| PostgreSQL Type | Suggested SQLite Type | Notes |
|---|---|---|
| TSVECTOR | TEXT | Store as serialized format |
| TSQUERY | TEXT | Store as serialized format |
| PostgreSQL Type | Suggested SQLite Type | Notes |
|---|---|---|
| INTERVAL | INTEGER | Microseconds duration |
| TSRANGE | TEXT | Store as JSON with timestamps |
| TSTZRANGE | TEXT | Store as JSON with timestamps |
| DATERANGE | TEXT | Store as JSON with dates |
| PostgreSQL Type | Suggested SQLite Type | Notes |
|---|---|---|
| XML | TEXT | Validate XML structure |
| PG_LSN | TEXT | Store as string |
| TXID_SNAPSHOT | TEXT | Store as string |
| Composite Types | TEXT | Store as JSON |
| Domain Types | (base type) | Map to underlying base type |
| OID Types | INTEGER/TEXT | Depends on specific OID type |
When receiving a CREATE TABLE statement via the PostgreSQL protocol, the original column types should be parsed and stored in a metadata registry. This registry serves as the source of truth for type mapping.
To persist type information across restarts, pgsqlite uses special metadata tables:
CREATE TABLE IF NOT EXISTS __pgsqlite_schema (
table_name TEXT NOT NULL,
column_name TEXT NOT NULL,
pg_type TEXT NOT NULL,
sqlite_type TEXT NOT NULL,
pg_oid INTEGER,
datetime_format TEXT,
timezone_offset INTEGER,
PRIMARY KEY (table_name, column_name)
);The datetime_format column stores format information for datetime types (e.g., 'ISO8601', 'UNIX_TIMESTAMP'), while timezone_offset stores timezone information for TIMETZ and TIMESTAMPTZ types.
CREATE TABLE IF NOT EXISTS __pgsqlite_enum_types (
type_name TEXT PRIMARY KEY,
type_oid INTEGER NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS __pgsqlite_enum_values (
type_name TEXT NOT NULL,
enum_value TEXT NOT NULL,
enum_label_oid INTEGER NOT NULL UNIQUE,
sort_order INTEGER NOT NULL,
PRIMARY KEY (type_name, enum_value),
FOREIGN KEY (type_name) REFERENCES __pgsqlite_enum_types(type_name)
);CREATE TABLE users (
id UUID PRIMARY KEY,
profile JSONB,
active BOOLEAN,
balance NUMERIC(10,2),
created_at TIMESTAMP,
birth_date DATE,
work_start TIME
);Would store the following in __pgsqlite_schema:
| table_name | column_name | pg_type | sqlite_type | pg_oid | type_modifier | datetime_format | timezone_offset |
|---|---|---|---|---|---|---|---|
| users | id | UUID | TEXT | 2950 | -1 | NULL | NULL |
| users | profile | JSONB | TEXT | 3802 | -1 | NULL | NULL |
| users | active | BOOLEAN | INTEGER | 16 | -1 | NULL | NULL |
| users | balance | NUMERIC | DECIMAL | 1700 | 655366 | NULL | NULL |
| users | created_at | TIMESTAMP | INTEGER | 1114 | -1 | UNIX_TIMESTAMP | NULL |
| users | birth_date | DATE | INTEGER | 1082 | -1 | UNIX_TIMESTAMP | NULL |
| users | work_start | TIME | INTEGER | 1083 | -1 | UNIX_TIMESTAMP | NULL |
SQLite's PRAGMA table_info(table) provides column types such as TEXT, INTEGER, etc. To reverse map correctly to PostgreSQL types:
- First consult the
__pgsqlite_schemaregistry. - If unavailable, fall back to
PRAGMA-based inference.
Example fallback:
PRAGMA table_info('users');When handling inserts or updates:
- Use type OIDs (PostgreSQL protocol) to map values
- Validate or convert values (e.g., parse UUIDs, validate JSON)
- Use registry to resolve expected Postgres type
When handling expressions like COUNT(*), AVG(col), etc., infer the result type using known input types from the registry:
| Function | Inferred PG Type |
|---|---|
| COUNT | BIGINT |
| SUM | Depends on input |
| AVG | DOUBLE |
| MAX/MIN | Same as column |
PGSQLite includes a migration system to evolve the internal schema:
- In-memory databases: Migrations run automatically on startup (always start fresh)
- File-based databases: Require explicit
--migrateflag to run migrations - Version tracking: Schema version stored in
__pgsqlite_metadatatable - Dependency management: Migrations specify dependencies on previous versions
- v1: Initial schema - Creates
__pgsqlite_schemaand metadata tables - v2: ENUM support - Adds enum types, values, and usage tracking tables
- v3: DateTime timezone support - Adds
datetime_formatandtimezone_offsetcolumns - v4: DateTime INTEGER storage - Converts all datetime types to INTEGER microseconds/days
- v5: PostgreSQL catalog tables - Creates pg_class, pg_namespace, pg_am, pg_type, pg_attribute views
- v6: VARCHAR/CHAR constraints - Adds type_modifier to __pgsqlite_schema, creates __pgsqlite_string_constraints table
- v7: NUMERIC/DECIMAL constraints - Creates __pgsqlite_numeric_constraints table for precision/scale validation
Because SQLite stores JSON/JSONB and UUID as TEXT, conversion must be performed in code:
let value: serde_json::Value = serde_json::from_str(&text_column)?;let uuid: Uuid = Uuid::parse_str(&text_column)?;PGSQLite enforces PostgreSQL-compatible constraints on certain types:
VARCHAR(n)enforces maximum character length (not byte length)CHAR(n)enforces exact length with blank padding- Stored in
__pgsqlite_string_constraintstable - Returns error code 22001 (string_data_right_truncation) on violation
NUMERIC(p,s)enforces precision (total digits) and scale (decimal places)- Stored in
__pgsqlite_numeric_constraintstable - Application-layer validation intercepts INSERT/UPDATE statements
- Returns error code 22003 (numeric_value_out_of_range) on violation
- Values are formatted to specified scale on retrieval (e.g., 123.4 → 123.40 for NUMERIC(10,2))
Constraints are encoded in the type_modifier column using PostgreSQL's format:
- VARCHAR(n):
n + 4(where 4 is VARHDRSZ) - NUMERIC(p,s):
((p << 16) | s) + 4