Version: 2.x (PHP 8.1+) | View Legacy Docs (PHP 7.0-8.0)
Complete guide to using TOON PHP for token-efficient data serialization in AI/LLM applications.
- Introduction
- Installation
- Quick Start
- Core Concepts
- API Reference
- Data Type Support
- Format Modes
- Advanced Usage
- Real-World Examples
- Performance & Benchmarks
- Error Handling
- Best Practices
TOON (Token-Oriented Object Notation) is a compact data format designed to reduce token consumption when sending structured data to Large Language Models (LLMs). It typically saves 30-60% of tokens compared to JSON (less for irregular data such as event logs) while remaining readable and decodable back to PHP.
Traditional JSON uses many structural characters ({, }, [, ], ") that consume tokens but add no semantic value. TOON eliminates this overhead:
// JSON: 82 tokens
{"users":[{"id":1,"name":"Alice","role":"admin"},{"id":2,"name":"Bob","role":"user"}]}
// TOON: 36 tokens (56% savings!)
users[2]{id,name,role}:
1,Alice,admin
2,Bob,user

composer require iamgerwin/toon-php

Requirements:
- PHP 8.1 or higher
- No other dependencies
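For scripts that run outside Composer, the PHP 8.1 requirement above can be checked at runtime. The following is a minimal sketch using only built-in functions; the helper name `meetsToonRequirement` is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper (not part of toon-php): true when the given PHP
// version string satisfies the documented minimum of 8.1.
function meetsToonRequirement(string $phpVersion = PHP_VERSION): bool
{
    return version_compare($phpVersion, '8.1.0', '>=');
}

if (!meetsToonRequirement()) {
    fwrite(STDERR, 'TOON PHP 2.x requires PHP 8.1 or higher; found ' . PHP_VERSION . PHP_EOL);
}
```

Composer normally enforces a package's declared PHP constraint at install time, so a guard like this is mainly useful in standalone scripts.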
Automatic Loading:
// Composer autoload includes all classes and helper functions
require 'vendor/autoload.php';
use iamgerwin\Toon\Toon;
// Helper functions are now available: toon(), toon_decode(), etc.

use iamgerwin\Toon\Toon;
// Encode data to TOON
$data = [
'name' => 'Alice',
'age' => 30,
'active' => true,
'balance' => 1250.50
];
$toon = Toon::encode($data);
echo $toon;
/* Output:
name: Alice
age: 30
active: true
balance: 1250.5
*/
// Decode back to PHP
$decoded = Toon::decode($toon);
// $decoded is identical to $data

// Shorter syntax using global helper
$toon = toon($data); // Same as Toon::encode()
$decoded = toon_decode($toon); // Same as Toon::decode()

Objects and associative arrays use colon syntax:
$data = ['name' => 'Alice', 'age' => 30];
// TOON:
// name: Alice
// age: 30

Simple arrays are written with length markers:
$data = [1, 2, 3, 4, 5];
// TOON: [5]: 1,2,3,4,5

Uniform arrays use tabular format for maximum efficiency:
$users = [
['id' => 1, 'name' => 'Alice'],
['id' => 2, 'name' => 'Bob']
];
// TOON:
// [2]{id,name}:
// 1,Alice
// 2,Bob

Nested objects and arrays are fully supported:
$data = [
'user' => [
'profile' => ['name' => 'Alice'],
'settings' => ['theme' => 'dark']
]
];
// TOON maintains hierarchy with indentation

The Toon class is the main facade for all TOON operations.
Toon::encode() encodes a PHP value to TOON format.
Parameters:
- $value: Any supported PHP value (primitives, arrays, objects, DateTime, enums)
- $options: Optional encoding configuration
Returns: TOON-encoded string
Throws: EncodingException if encoding fails
Examples:
// Basic encoding
$toon = Toon::encode(['name' => 'Alice']);
// With options
use iamgerwin\Toon\EncodeOptions;
$options = new EncodeOptions(
indent: 4,
preferTabular: true
);
$toon = Toon::encode($data, $options);

Toon::decode() decodes a TOON string back to PHP.
Parameters:
- $toon: TOON-formatted string
- $options: Optional decoding configuration
Returns: Decoded PHP value
Throws: DecodingException if decoding fails
Examples:
// Basic decoding
$data = Toon::decode($toonString);
// With options
use iamgerwin\Toon\DecodeOptions;
$options = DecodeOptions::lenient();
$data = Toon::decode($toonString, $options);

Toon::compact() encodes to compact format (minimal whitespace).
Parameters:
- $value: Value to encode
Returns: Compact TOON string
Example:
$compact = Toon::compact(['foo' => 'bar', 'items' => [1, 2, 3]]);
// Output: foo: bar\nitems[3]: 1,2,3

Toon::readable() encodes to readable format (4-space indentation).
Parameters:
- $value: Value to encode
Returns: Readable TOON string
Example:
$readable = Toon::readable(['foo' => 'bar']);
// Output:
// foo: bar

Toon::tabular() encodes arrays to tabular format.
Parameters:
- $value: Array with uniform structure
Returns: Tabular TOON string
Example:
$users = [
['id' => 1, 'name' => 'Alice'],
['id' => 2, 'name' => 'Bob']
];
$tabular = Toon::tabular($users);
// Output:
// [2]{id,name}:
// 1,Alice
// 2,Bob

Toon::compare() compares TOON and JSON token usage for the same value.
Parameters:
- $value: Value to compare
- $options: Optional encoding options
Returns: Array with comparison data
[
'toon' => string, // TOON-encoded version
'json' => string, // JSON-encoded version
'toon_tokens' => int, // Estimated TOON tokens
'json_tokens' => int, // Estimated JSON tokens
'savings_percent' => float // Percentage saved
]
Example:
$comparison = Toon::compare($data);
echo "Token savings: {$comparison['savings_percent']}%";
// Output: Token savings: 42.5%

Toon::estimateTokens() estimates the token count for a TOON string.
Parameters:
- $toon: TOON string
Returns: Estimated token count (uses ~4 characters per token)
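The 4-characters-per-token heuristic mentioned above can be sketched in plain PHP. This illustrates the heuristic only and is not the library's implementation; the function name `roughTokenEstimate` and the decision to round up are assumptions.

```php
<?php
// Rough token estimate: about 4 characters per token, rounded up.
// Hypothetical helper; Toon::estimateTokens() may count differently.
function roughTokenEstimate(string $text, int $charsPerToken = 4): int
{
    if ($charsPerToken < 1) {
        throw new InvalidArgumentException('charsPerToken must be at least 1');
    }
    // strlen() counts bytes, so each multibyte character counts as several.
    return (int) ceil(strlen($text) / $charsPerToken);
}

$toon = "users[2]{id,name,role}:\n1,Alice,admin\n2,Bob,user";
echo roughTokenEstimate($toon), PHP_EOL;
```

An estimate like this is only a budgeting aid; real tokenizers split text differently, so measured counts from the target model remain the reference.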
Example:
$tokens = Toon::estimateTokens($toonString);
$cost = ($tokens / 1000) * 0.03; // GPT-4 pricing
echo "Estimated cost: $$cost";

Global helper functions are provided for convenience.
toon(): alias for Toon::encode().
$toon = toon(['name' => 'Alice']);
toon_decode(): alias for Toon::decode().
$data = toon_decode($toonString);
toon_compact(): alias for Toon::compact().
$compact = toon_compact($data);
toon_readable(): alias for Toon::readable().
$readable = toon_readable($data);
toon_tabular(): alias for Toon::tabular().
$tabular = toon_tabular($users);
toon_compare(): alias for Toon::compare().
$comparison = toon_compare($data);
toon_estimate_tokens(): alias for Toon::estimateTokens().
$tokens = toon_estimate_tokens($toonString);

EncodeOptions configures how TOON encoding behaves.
new EncodeOptions(
int $indent = 2,
ToonDelimiter $delimiter = ToonDelimiter::COMMA,
bool $useLengthMarker = true,
bool $preferTabular = true,
bool $quoteStrings = false,
bool $sortKeys = false
)

| Property | Type | Default | Description |
|---|---|---|---|
| indent | int | 2 | Number of spaces for indentation (0 = compact) |
| delimiter | ToonDelimiter | COMMA | Array value delimiter (comma, tab, or pipe) |
| useLengthMarker | bool | true | Include array length markers like [5] |
| preferTabular | bool | true | Use tabular format for uniform arrays |
| quoteStrings | bool | false | Always quote string values |
| sortKeys | bool | false | Sort object keys alphabetically |
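The preferTabular option applies to "uniform" arrays. The sketch below shows one plausible reading of that term (every row has the same keys in the same order and holds only scalar or null values) and how such rows collapse into a header plus one line per row. Both function names are hypothetical, the code is written in plain PHP for illustration, and it is not taken from the library's source; quoting and escaping are left out.

```php
<?php
// Hypothetical helpers, written for illustration; not the library's code.

// True when $rows is a non-empty list of non-empty arrays that all share
// the same keys in the same order and contain only scalar or null values.
function isUniformList(array $rows): bool
{
    if ($rows === [] || !array_is_list($rows)) {
        return false;
    }
    if (!is_array($rows[0]) || $rows[0] === []) {
        return false;
    }
    $keys = array_keys($rows[0]);
    foreach ($rows as $row) {
        if (!is_array($row) || array_keys($row) !== $keys) {
            return false;
        }
        foreach ($row as $value) {
            if ($value !== null && !is_scalar($value)) {
                return false;
            }
        }
    }
    return true;
}

// Renders a uniform list as "[N]{k1,k2}:" followed by one row per line.
// $rowIndent is a parameter because row indentation depends on the
// encoder's indent setting.
function renderTabular(array $rows, string $rowIndent = ''): string
{
    if (!isUniformList($rows)) {
        throw new InvalidArgumentException('Rows are not uniform');
    }
    $lines = ['[' . count($rows) . ']{' . implode(',', array_keys($rows[0])) . '}:'];
    foreach ($rows as $row) {
        $cells = array_map(
            static fn ($v) => match (true) {
                $v === null => 'null',
                is_bool($v) => $v ? 'true' : 'false',
                default => (string) $v,
            },
            $row
        );
        $lines[] = $rowIndent . implode(',', $cells);
    }
    return implode("\n", $lines);
}

echo renderTabular([
    ['id' => 1, 'name' => 'Alice'],
    ['id' => 2, 'name' => 'Bob'],
]), PHP_EOL;
```

The saving comes from writing each key once in the header instead of once per row, which is why rows with differing keys cannot use this layout.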
// Compact mode (minimal whitespace)
$options = EncodeOptions::compact();
// Readable mode (4-space indentation)
$options = EncodeOptions::readable();
// Tabular mode (optimized for arrays)
$options = EncodeOptions::tabular();

use iamgerwin\Toon\{Toon, EncodeOptions, Enums\ToonDelimiter};
// Custom indentation
$options = new EncodeOptions(indent: 4);
$toon = Toon::encode($data, $options);
// Tab delimiters
$options = new EncodeOptions(delimiter: ToonDelimiter::TAB);
$toon = Toon::encode($data, $options);
// [3]: value1\tvalue2\tvalue3
// Pipe delimiters
$options = new EncodeOptions(delimiter: ToonDelimiter::PIPE);
$toon = Toon::encode($data, $options);
// [3]: value1|value2|value3
// Sorted keys
$options = new EncodeOptions(sortKeys: true);
$toon = Toon::encode(['z' => 1, 'a' => 2], $options);
// Output:
// a: 2
// z: 1
// No length markers
$options = new EncodeOptions(useLengthMarker: false);
$toon = Toon::encode([1, 2, 3], $options);
// Output: 1,2,3 (no [3]:)

DecodeOptions configures how TOON decoding behaves.
new DecodeOptions(
bool $strict = true,
bool $associative = true,
int $depth = 512
)

| Property | Type | Default | Description |
|---|---|---|---|
| strict | bool | true | Enable strict validation during decoding |
| associative | bool | true | Return objects as associative arrays |
| depth | int | 512 | Maximum nesting depth |
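To make the depth option concrete, here is a minimal sketch of how nesting depth might be measured and limited on an already-decoded PHP value. The helper names are hypothetical, and the counting convention (a scalar is depth 0, a flat array is depth 1) is an assumption; the library may count levels differently and checks depth while parsing rather than afterwards.

```php
<?php
// Hypothetical helper: nesting depth of a PHP value, where a scalar
// is depth 0 and a flat (or empty) array is depth 1.
function nestingDepth(mixed $value): int
{
    if (!is_array($value)) {
        return 0;
    }
    $deepest = 0;
    foreach ($value as $child) {
        $deepest = max($deepest, nestingDepth($child));
    }
    return 1 + $deepest;
}

// Throws when the value nests deeper than $limit levels.
function assertWithinDepth(mixed $value, int $limit = 512): void
{
    $depth = nestingDepth($value);
    if ($depth > $limit) {
        throw new RuntimeException("Nesting depth $depth exceeds limit $limit");
    }
}

assertWithinDepth(['user' => ['profile' => ['name' => 'Alice']]], 3);
```

A low limit is a cheap safeguard when decoding input from untrusted sources, since it bounds recursion before any application logic runs.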
// Strict mode (validation enabled)
$options = DecodeOptions::strict();
// Lenient mode (relaxed validation)
$options = DecodeOptions::lenient();

use iamgerwin\Toon\{Toon, DecodeOptions};
// Strict decoding (default)
$data = Toon::decode($toon, DecodeOptions::strict());
// Lenient decoding (forgiving)
$data = Toon::decode($toon, DecodeOptions::lenient());
// Return stdClass objects instead of arrays
$options = new DecodeOptions(associative: false);
$data = Toon::decode($toon, $options);
// $data contains stdClass objects
// Custom depth limit
$options = new DecodeOptions(depth: 100);
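$data = Toon::decode($toon, $options);

TOON PHP encodes most PHP data types. Scalars and arrays round-trip unchanged; DateTime and enum values decode as strings, INF and NAN are encoded as null, and resources are not supported.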
// Null
Toon::encode(null); // "null"
// Booleans
Toon::encode(true); // "true"
Toon::encode(false); // "false"
// Integers
Toon::encode(42); // "42"
Toon::encode(-100); // "-100"
// Floats
Toon::encode(3.14); // "3.14"
Toon::encode(1.5e10); // "15000000000"
// Special floats
Toon::encode(INF); // "null"
Toon::encode(NAN); // "null"
// Strings
Toon::encode('hello'); // "hello"
Toon::encode('hello world'); // "hello world"
// Strings with special characters (auto-quoted)
Toon::encode('contains:colon'); // "\"contains:colon\""
Toon::encode("line1\nline2"); // "\"line1\\nline2\""

// Simple arrays
Toon::encode([1, 2, 3]);
// [3]: 1,2,3
// Associative arrays
Toon::encode(['name' => 'Alice', 'age' => 30]);
// name: Alice
// age: 30
// Nested arrays
Toon::encode([
'users' => [
['id' => 1, 'name' => 'Alice'],
['id' => 2, 'name' => 'Bob']
]
]);
// users[2]{id,name}:
// 1,Alice
// 2,Bob
// Mixed arrays
Toon::encode([
'numbers' => [1, 2, 3],
'strings' => ['a', 'b', 'c']
]);

// stdClass objects
$obj = new stdClass();
$obj->name = 'Alice';
$obj->age = 30;
Toon::encode($obj);
// name: Alice
// age: 30
// Arrays cast to objects are encoded the same way
Toon::encode((object)['name' => 'Alice']);
// name: Alice

use DateTime;
use DateTimeImmutable;
// DateTime
$date = new DateTime('2024-01-15 10:30:00');
Toon::encode(['created' => $date]);
// created: 2024-01-15T10:30:00+00:00
// DateTimeImmutable
$date = new DateTimeImmutable('2024-01-15');
Toon::encode(['date' => $date]);
// date: 2024-01-15T00:00:00+00:00
// Decoding yields the ISO 8601 string, not a DateTime object
$toon = Toon::encode(['date' => new DateTime('2024-01-15')]);
$decoded = Toon::decode($toon);
// $decoded['date'] is a string: "2024-01-15T00:00:00+00:00"

// Backed enums
enum Status: string {
case Active = 'active';
case Pending = 'pending';
case Inactive = 'inactive';
}
Toon::encode(['status' => Status::Active]);
// status: active
// Unit enums (no backing value)
enum Color {
case Red;
case Green;
case Blue;
}
Toon::encode(['color' => Color::Red]);
// color: Red
// Decoding yields the backing value, not the enum case
$data = ['status' => Status::Active];
$toon = Toon::encode($data);
$decoded = Toon::decode($toon);
// $decoded['status'] is the string "active"

// Empty array
Toon::encode([]); // "[]"
// Empty object
Toon::encode(new stdClass()); // "{}"
// Empty string
Toon::encode(''); // (empty string)

TOON PHP supports four format modes: default, compact, readable, and tabular.
Default mode balances readability and compactness.
$data = [
'user' => ['name' => 'Alice', 'age' => 30],
'items' => [1, 2, 3]
];
$toon = Toon::encode($data);
/* Output:
user:
  name: Alice
  age: 30
items[3]: 1,2,3
*/

Compact mode removes unnecessary whitespace for maximum efficiency.
$compact = Toon::compact($data);
/* Output:
user:
name: Alice
age: 30
items[3]: 1,2,3
*/

Readable mode adds extra indentation for better human readability.
$readable = Toon::readable($data);
/* Output:
user:
    name: Alice
    age: 30
items[3]: 1,2,3
*/

Tabular mode is optimized for uniform arrays (best token efficiency).
$users = [
['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
['id' => 2, 'name' => 'Bob', 'role' => 'user'],
['id' => 3, 'name' => 'Charlie', 'role' => 'user']
];
$tabular = Toon::tabular($users);
/* Output:
[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
*/

Change array delimiters for specific use cases.
use iamgerwin\Toon\{Toon, EncodeOptions, Enums\ToonDelimiter};
$data = ['items' => [1, 2, 3, 4, 5]];
// Tab delimiter
$options = new EncodeOptions(delimiter: ToonDelimiter::TAB);
echo Toon::encode($data, $options);
// items[5]: 1\t2\t3\t4\t5
// Pipe delimiter
$options = new EncodeOptions(delimiter: ToonDelimiter::PIPE);
echo Toon::encode($data, $options);
// items[5]: 1|2|3|4|5

Sort object keys alphabetically for consistent output.
$data = [
'zebra' => 1,
'apple' => 2,
'mango' => 3
];
$options = new EncodeOptions(sortKeys: true);
echo Toon::encode($data, $options);
/* Output:
apple: 2
mango: 3
zebra: 1
*/

Always quote strings, even when not necessary.
$data = ['name' => 'Alice', 'city' => 'NYC'];
$options = new EncodeOptions(quoteStrings: true);
echo Toon::encode($data, $options);
/* Output:
name: "Alice"
city: "NYC"
*/

Handle malformed TOON with lenient parsing.
// Strict mode (default) - throws exception on errors
try {
$data = Toon::decode($malformedToon);
} catch (DecodingException $e) {
// Handle error
}
// Lenient mode - attempts to parse anyway
$data = Toon::decode($malformedToon, DecodeOptions::lenient());

Limit maximum nesting depth to prevent stack overflow.
$options = new DecodeOptions(depth: 50);
$data = Toon::decode($deeplyNestedToon, $options);
// Will throw exception if depth exceeds 50 levels

use iamgerwin\Toon\Toon;
$conversation = [
['role' => 'system', 'content' => 'You are a helpful assistant.'],
['role' => 'user', 'content' => 'What is TOON?'],
['role' => 'assistant', 'content' => 'TOON is a token-efficient format...'],
['role' => 'user', 'content' => 'How much can I save?']
];
// JSON: ~280 tokens = $0.0084 per API call
$json = json_encode($conversation);
// TOON: ~165 tokens = $0.00495 per API call (41% savings!)
$toon = Toon::tabular($conversation);
// For 100K API calls:
// JSON cost: $840
// TOON cost: $495
// Savings: $345 per 100K calls

$products = [
['id' => 1, 'name' => 'Laptop', 'price' => 999.99, 'stock' => 15],
['id' => 2, 'name' => 'Mouse', 'price' => 29.99, 'stock' => 50],
['id' => 3, 'name' => 'Keyboard', 'price' => 79.99, 'stock' => 30]
];
// JSON: 156 characters
$json = json_encode($products);
// TOON: 89 characters (43% savings!)
$toon = Toon::tabular($products);
/* Output:
[3]{id,name,price,stock}:
1,Laptop,999.99,15
2,Mouse,29.99,50
3,Keyboard,79.99,30
*/
// Decode back when needed
$decoded = Toon::decode($toon);

$profile = [
'id' => 12345,
'username' => 'alice_dev',
'email' => 'alice@example.com',
'profile' => [
'firstName' => 'Alice',
'lastName' => 'Johnson',
'bio' => 'Full-stack developer'
],
'settings' => [
'theme' => 'dark',
'notifications' => true,
'language' => 'en'
],
'metadata' => [
'created' => new DateTime('2024-01-15'),
'lastLogin' => new DateTime('2024-11-09')
]
];
$toon = Toon::encode($profile);
$comparison = Toon::compare($profile);
echo "Token savings: {$comparison['savings_percent']}%";
// Typical savings: 35-40%

// Typical API response
$response = [
'success' => true,
'data' => [
'users' => [
['id' => 1, 'name' => 'Alice', 'active' => true],
['id' => 2, 'name' => 'Bob', 'active' => false],
['id' => 3, 'name' => 'Charlie', 'active' => true]
],
'total' => 3,
'page' => 1
],
'meta' => [
'timestamp' => time(),
'version' => '1.0'
]
];
// Before sending to LLM
$compressed = Toon::encode($response);
// LLM processes the compact format
// Then decode on the other end
$original = Toon::decode($compressed);

$trainingData = [];
for ($i = 0; $i < 1000; $i++) {
$trainingData[] = [
'features' => [rand(1, 100), rand(1, 100), rand(1, 100)],
'label' => rand(0, 1)
];
}
// JSON: ~45,000 tokens
// TOON: ~22,000 tokens (51% savings!)
$compactData = Toon::tabular($trainingData);
// Massive cost savings when processing large datasets

Based on official TOON benchmarks:
| Use Case | JSON Tokens | TOON Tokens | Savings |
|---|---|---|---|
| E-commerce Orders | 3,245 | 2,170 | 33.1% |
| User Lists | 150 | 82 | 45.3% |
| Product Catalogs | 320 | 180 | 43.8% |
| Event Logs | 1,890 | 1,606 | 15.0% |
| Config Files | 2,456 | 1,687 | 31.3% |
| Conversation History | 280 | 165 | 41.1% |
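The Savings column follows from the two token columns as (JSON - TOON) / JSON. A small sketch that recomputes it for three rows of the table; the function name `savingsPercent` is illustrative and not part of the library.

```php
<?php
// Percentage of tokens saved relative to JSON, rounded to one decimal.
function savingsPercent(int $jsonTokens, int $toonTokens): float
{
    if ($jsonTokens <= 0) {
        throw new InvalidArgumentException('jsonTokens must be positive');
    }
    return round(($jsonTokens - $toonTokens) / $jsonTokens * 100, 1);
}

// Token counts copied from the table above.
$rows = [
    'E-commerce Orders' => [3245, 2170],
    'Event Logs' => [1890, 1606],
    'Conversation History' => [280, 165],
];
foreach ($rows as $useCase => [$json, $toon]) {
    printf("%s: %.1f%%\n", $useCase, savingsPercent($json, $toon));
}
```

The same function can be applied to the `json_tokens` and `toon_tokens` fields returned by Toon::compare() to check savings on your own payloads.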
Accuracy was tested across 209 questions on 4 LLMs:
| Format | Accuracy | Tokens Used |
|---|---|---|
| TOON | 73.9% | 2,744 |
| JSON (compact) | 70.7% | 3,081 |
| JSON (formatted) | 69.7% | 4,545 |
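TOON scores 3.2 percentage points higher than compact JSON on accuracy while using 10.9% fewer tokens; compared with formatted JSON it scores 4.2 points higher and uses 39.6% fewer tokens.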
At OpenAI's GPT-4 pricing ($0.03/1K input tokens, $0.06/1K output tokens):
// Calculate potential savings
$data = [/* your data */];
$comparison = Toon::compare($data);
$tokensSaved = $comparison['json_tokens'] - $comparison['toon_tokens'];
// For 1M API calls per year with 100-token JSON payloads (60 tokens as TOON)
$jsonCost = (100 * 1_000_000 / 1000) * 0.03; // $3,000
$toonCost = (60 * 1_000_000 / 1000) * 0.03; // $1,800 (40% savings)
echo "Annual savings: $" . number_format($jsonCost - $toonCost);
// Output: Annual savings: $1,200

- Encoding Speed: ~5-10μs for small objects, ~50-100μs for large datasets
- Decoding Speed: ~10-20μs for small objects, ~100-200μs for large datasets
- Memory Usage: Minimal overhead, similar to json_encode/decode
- Round-trip Accuracy: 100% for scalars and arrays; DateTime and enum values decode as strings
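Timing figures like those above vary with hardware and payload, so it can help to measure on your own machine. The following is a minimal timing sketch built on PHP's `hrtime()`; the helper name `averageMicroseconds` is hypothetical, and `json_encode()` is timed only as a baseline so the example runs without the library installed.

```php
<?php
// Hypothetical helper: average wall-clock time of $fn in microseconds.
function averageMicroseconds(callable $fn, int $iterations = 1000): float
{
    if ($iterations < 1) {
        throw new InvalidArgumentException('iterations must be at least 1');
    }
    $fn(); // warm-up call, not timed
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsedNs = hrtime(true) - $start;
    return $elapsedNs / $iterations / 1000;
}

$payload = ['id' => 1, 'name' => 'Alice', 'tags' => ['a', 'b', 'c']];
// Baseline measurement; wrap Toon::encode($payload) in a closure the
// same way to compare once the library is installed.
$avg = averageMicroseconds(static fn () => json_encode($payload));
printf("json_encode: %.2f us per call\n", $avg);
```

Averages over many iterations smooth out timer noise, but results are still indicative rather than exact.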
TOON PHP uses exceptions for error handling.
ToonException (base)
├── EncodingException (encoding errors)
└── DecodingException (decoding errors)
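Because both specific exceptions share a base type, a single catch on the base covers either failure. The sketch below demonstrates this with stand-in classes that mirror the tree; they are not the library's classes (the guide's import lines place the real ones under iamgerwin\Toon\Exceptions), and having the base extend RuntimeException is an assumption.

```php
<?php
// Stand-in classes that mirror the tree above. NOT the library's classes.
class DemoToonException extends RuntimeException {}
class DemoEncodingException extends DemoToonException {}
class DemoDecodingException extends DemoToonException {}

// Runs $operation and reports which kind of failure occurred, if any.
function classifyFailure(callable $operation): string
{
    try {
        $operation();
        return 'ok';
    } catch (DemoEncodingException $e) {
        // Most specific type first.
        return 'encoding: ' . $e->getMessage();
    } catch (DemoToonException $e) {
        // The base type also covers decoding errors and future subclasses.
        return 'toon: ' . $e->getMessage();
    }
}

echo classifyFailure(static function (): void {
    throw new DemoDecodingException('unexpected token');
}), PHP_EOL;
```

Ordering the catch blocks from most to least specific matters: PHP uses the first matching block, so a base-type catch placed first would hide the specific handlers.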
use iamgerwin\Toon\{Toon, Exceptions\EncodingException};
try {
$toon = Toon::encode($data);
} catch (EncodingException $e) {
echo "Encoding failed: " . $e->getMessage();
// Log error, use fallback, etc.
}

use iamgerwin\Toon\{Toon, Exceptions\DecodingException};
try {
$data = Toon::decode($toonString);
} catch (DecodingException $e) {
echo "Decoding failed: " . $e->getMessage();
// Handle malformed input
}

// Unsupported type
try {
$resource = fopen('file.txt', 'r');
Toon::encode($resource); // Resources not supported
} catch (EncodingException $e) {
echo $e->getMessage(); // "Unsupported type: resource"
}
// Malformed TOON
try {
Toon::decode("invalid:::toon::: format");
} catch (DecodingException $e) {
echo $e->getMessage(); // Error details
}
// Depth exceeded
try {
$options = new DecodeOptions(depth: 5);
Toon::decode($deeplyNested, $options);
} catch (DecodingException $e) {
echo "Nesting too deep";
}

// ✅ Good - uses tabular format
$users = [
['id' => 1, 'name' => 'Alice'],
['id' => 2, 'name' => 'Bob']
];
$toon = Toon::tabular($users);
// Maximum token efficiency
// ❌ Less explicit - default format (tabular only while preferTabular is true)
$toon = Toon::encode($users);

// For LLM consumption - use compact
$forLLM = Toon::compact($data);
// For human debugging - use readable
$forDebug = Toon::readable($data);
// For uniform datasets - use tabular
$forArrays = Toon::tabular($arrayData);

// Always measure actual savings for your use case
$comparison = Toon::compare($yourData);
if ($comparison['savings_percent'] < 20) {
// TOON might not be worth it for this data
// Consider using JSON instead
}

// Always wrap in try-catch for production
try {
$toon = Toon::encode($userInput);
} catch (EncodingException $e) {
// Fallback to JSON or handle error
$toon = json_encode($userInput);
Log::warning('TOON encoding failed, using JSON fallback');
}

// For data from external sources
$options = DecodeOptions::lenient();
$data = Toon::decode($externalInput, $options);

// For frequently sent data, cache the TOON encoding
$cacheKey = 'toon_' . md5(serialize($data));
$toon = Cache::remember($cacheKey, 3600, function() use ($data) {
return Toon::encode($data);
});

// Track token usage in production
$toon = Toon::encode($data);
$tokens = Toon::estimateTokens($toon);
Metrics::gauge('llm.tokens.input', $tokens);
Metrics::gauge('llm.cost.estimated', $tokens / 1000 * 0.03);

// In tests, ensure data survives round-trip
$original = ['name' => 'Alice', 'age' => 30];
$toon = Toon::encode($original);
$decoded = Toon::decode($toon);
assert($original === $decoded);

- GitHub Repository: https://github.com/iamgerwin/toon-php
- Issue Tracker: https://github.com/iamgerwin/toon-php/issues
- TOON Format Specification: https://github.com/toon-format/toon
- Packagist: https://packagist.org/packages/iamgerwin/toon-php
- Current Version: 2.x (PHP 8.1+)
- Legacy Version: 1.x (PHP 7.0-8.0)