This document outlines implementation options, trade-offs, and recommendations for building the agent-trajectories system.
- User Workflow
- Language Recommendation
- Language & Runtime Options
- Code Structure
- Storage Layer
- CLI Framework
- Integration Architecture
- Distribution & Packaging
- Testing Strategy
- Recommendations
What does using agent-trajectories look like day-to-day?
┌─────────────────────────────────────────────────────────────────────────┐
│ SOLO DEV WORKFLOW │
└─────────────────────────────────────────────────────────────────────────┘
┌──────────────┐
│ 1. START │ User begins work on a feature
└──────┬───────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ $ traj start "Add user authentication" │
│ ✓ Trajectory started: traj_a1b2c3 │
│ │
│ OR (with Claude Code hooks installed): │
│ Claude detects new work → auto-prompts "What are you working on?" │
└──────────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────┐
│ 2. WORK │ User works with Claude Code (or any tool)
└──────┬───────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ Claude Code session: │
│ > "Implement JWT auth with refresh tokens" │
│ │
│ [Hook auto-captures: tool calls, file changes, timestamps] │
│ │
│ Claude outputs (explicitly or prompted): │
│ [[TRAJECTORY:decision]] │
│ { "question": "Auth strategy", │
│ "chosen": "JWT with refresh tokens", │
│ "alternatives": ["sessions", "OAuth only"], │
│ "reasoning": "Stateless for horizontal scaling" } │
│ [[/TRAJECTORY]] │
└──────────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────┐
│ 3. COMPLETE │ User finishes the task
└──────┬───────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ $ traj complete │
│ │
│ ┌─ Retrospective ──────────────────────────────────────────────┐ │
│ │ Summary: Implemented JWT auth with refresh token rotation │ │
│ │ Challenges: Existing UserContext types were wrong │ │
│ │ Confidence: 0.85 │ │
│ │ Would do differently: Check types earlier │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ✓ Trajectory completed │
│ ✓ Exported to .trajectories/traj_a1b2c3.json │
│ ✓ Summary written to .trajectories/traj_a1b2c3.md │
└──────────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────┐
│ 4. USE │ Later: review, debug, learn
└──────┬───────┘
│
▼
┌──────────────────────────────────────────────────────────────────────┐
│ # In PR description (auto-generated): │
│ ## Trajectory Summary │
│ - **Approach:** JWT with refresh tokens for stateless scaling │
│ - **Key decisions:** 3 recorded │
│ - **Confidence:** 85% │
│ - [Full trajectory](./trajectories/traj_a1b2c3.md) │
│ │
│ # Months later, debugging: │
│ $ traj search "refresh token" │
│ Found 1 trajectory: │
│ traj_a1b2c3: "Add user authentication" (2024-01-15) │
│ │
│ $ traj show traj_a1b2c3 --decisions │
│ Decision 1: JWT over sessions (stateless scaling) │
│ Decision 2: HttpOnly cookies (XSS protection) │
│ Decision 3: 15min access / 7day refresh (security balance) │
└──────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ TEAM WORKFLOW │
└─────────────────────────────────────────────────────────────────────────┘
AGENT ALICE AGENT BOB
──────────── ─────────
│ │
▼ │
┌────────────────┐ │
│ traj start │ │
│ "Auth module" │ │
└───────┬────────┘ │
│ │
▼ │
┌────────────────┐ │
│ Working... │ │
│ Records: │ │
│ - decisions │ │
│ - tool calls │ │
└───────┬────────┘ │
│ │
│ ┌───────────────────────────┐ │
└─▶│ @Bob: Can you review the │───┘
│ token rotation logic? │
└───────────────────────────┘
│
▼
┌────────────────┐
│ Bob joins │
│ trajectory │
│ (new chapter) │
└───────┬────────┘
│
▼
┌────────────────┐
│ Reviews, adds │
│ own decisions │
└───────┬────────┘
│
┌────────────────────┘
▼
┌────────────────┐
│ traj complete │
│ │
│ Trajectory: │
│ - 2 agents │
│ - 3 chapters │
│ - 5 decisions │
└────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ LEARNING WORKFLOW │
└─────────────────────────────────────────────────────────────────────────┘
NEW TASK: "Add rate limiting to API"
┌─────────────────────────────────────────────────────────────────────┐
│ Agent (or human) queries workspace: │
│ │
│ $ traj suggest "rate limiting" │
│ │
│ ┌─ Relevant Trajectories ─────────────────────────────────────────┐│
│ │ traj_x1: "Auth rate limiting" (90% match) ││
│ │ └─ Gotcha: Redis connection pooling was tricky ││
│ │ traj_y2: "API throttling" (75% match) ││
│ └─────────────────────────────────────────────────────────────────┘│
│ │
│ ┌─ Related Decisions ─────────────────────────────────────────────┐│
│ │ "Use Redis for distributed state" (from traj_x1) ││
│ │ "Sliding window over fixed window" (from traj_y2) ││
│ └─────────────────────────────────────────────────────────────────┘│
│ │
│ ┌─ Suggested Patterns ────────────────────────────────────────────┐│
│ │ middleware-pattern.md: "How we add cross-cutting concerns" ││
│ └─────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────┘
Agent now has context BEFORE writing a single line of code.
# Lifecycle
traj start "Task description" # Start a new trajectory
traj status # Show active trajectory
traj complete # Complete with retrospective
traj abandon # Abandon without completing
# Recording
traj decision "What" --reasoning "Why" --alternatives "A,B,C"
traj chapter "New phase" # Start a new chapter
traj note "Observation" # Add a note
# Querying
traj list # List recent trajectories
traj show <id> # Show trajectory details
traj search "query" # Full-text search
traj suggest "task description" # Get suggestions for new work
# Export
traj export <id> --format md # Export as markdown
traj export <id> --format json # Export as JSON
traj export <id> --format timeline # Export as timeline viewRecommendation: TypeScript
After weighing all factors, TypeScript is the clear choice for agent-trajectories:
┌─────────────────────────────────────────────────────────────────────────┐
│ WHY TYPESCRIPT │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. ECOSYSTEM ALIGNMENT │
│ ├── Claude Code: TypeScript │
│ ├── agent-relay: TypeScript │
│ ├── claude-mem: TypeScript │
│ ├── Most MCP servers: TypeScript │
│ └── One language across the entire stack │
│ │
│ 2. CONTRIBUTOR POOL │
│ ├── TypeScript: ~15M developers │
│ ├── Rust: ~3M developers │
│ ├── Go: ~4M developers │
│ └── Lower barrier = more contributions │
│ │
│ 3. JSON-NATIVE │
│ ├── Trajectories ARE JSON │
│ ├── No serialization friction │
│ ├── Type inference from schemas │
│ └── Zod for runtime validation │
│ │
│ 4. DEVELOPMENT VELOCITY │
│ ├── Faster iteration │
│ ├── Hot reload during development │
│ ├── Rich IDE support │
│ └── Ship v1 faster │
│ │
│ 5. INTEGRATION SIMPLICITY │
│ ├── Claude Code hooks: direct integration │
│ ├── MCP server: same runtime │
│ ├── agent-relay: import directly │
│ └── No FFI boundaries │
│ │
└─────────────────────────────────────────────────────────────────────────┘
| Concern | Mitigation |
|---|---|
| Startup time (~150ms) | Acceptable for trajectory operations (not hot path). For CI/scripting, batch commands. Future: Bun binary. |
| Dependency bloat | Strict dependency policy. Core has <10 deps. Use bundleDependencies. |
| Runtime type safety | Zod for all external input. Schema validation on read/write. |
| Distribution | NPM primary. Optional: pkg/bun binaries for standalone. |
┌─────────────────────────────────────────────────────────────────────────┐
│ RECOMMENDED STACK │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ RUNTIME Node.js 20+ (LTS) │
│ └── Future: Bun for faster startup │
│ │
│ LANGUAGE TypeScript 5.3+ │
│ └── Strict mode, ESM-only │
│ │
│ VALIDATION Zod │
│ └── Runtime types from schema │
│ │
│ CLI Commander + Clack │
│ └── Simple parsing + beautiful prompts │
│ │
│ STORAGE better-sqlite3 (default) │
│ └── File fallback for zero-dep mode │
│ │
│ TESTING Vitest │
│ └── Fast, TypeScript-native │
│ │
│ BUILD tsup │
│ └── Zero-config bundling │
│ │
│ FORMATTING Biome │
│ └── Fast, replaces ESLint + Prettier │
│ │
└─────────────────────────────────────────────────────────────────────────┘
{
"dependencies": {
"commander": "^12.0.0", // CLI parsing
"@clack/prompts": "^0.7.0", // Interactive prompts
"better-sqlite3": "^11.0.0", // Storage (optional)
"zod": "^3.23.0" // Validation
},
"devDependencies": {
"typescript": "^5.4.0",
"vitest": "^2.0.0",
"tsup": "^8.0.0",
"@biomejs/biome": "^1.8.0"
}
}Total production deps: 4 (and SQLite is optional)
Pros:
- Ecosystem alignment with Claude Code, agent-relay, and most AI tooling
- Excellent async/await patterns for I/O-heavy operations
- Strong typing catches errors at compile time
- NPM distribution is well-understood
- Easy JSON handling (trajectories are JSON-native)
- Same language for CLI and potential web dashboard
Cons:
- Node.js startup time (~100-200ms) noticeable for frequent CLI calls
- Dependency bloat if not careful
- Runtime type checking requires additional tooling (Zod, io-ts)
Best for: Maximum ecosystem compatibility, rapid development
Pros:
- Near-instant startup time (<10ms)
- Single binary distribution, no runtime dependencies
- Memory safety without garbage collection
- Excellent CLI libraries (clap, dialoguer)
- Strong typing with algebraic data types
- Cross-compilation built-in
Cons:
- Steeper learning curve
- Slower iteration during development
- JSON handling more verbose than JS/TS
- Smaller pool of contributors
- Async Rust has complexity (tokio runtime)
Best for: Performance-critical, wide distribution, long-term maintenance
Pros:
- Fast startup, single binary
- Simple concurrency model
- Good CLI libraries (cobra, bubbletea)
- Easy cross-compilation
- Readable by most developers
Cons:
- Weaker type system (no generics until recently, no sum types)
- Error handling verbosity
- Less ecosystem overlap with AI/agent tooling
- JSON struct tags are tedious
Best for: Simple, reliable tooling with broad compatibility
Pros:
- Performance-critical paths in Rust
- TypeScript for integrations and scripting
- Best of both worlds
Cons:
- Build complexity (napi-rs, wasm-pack)
- Two codebases to maintain
- FFI boundary bugs
Best for: When performance AND ecosystem integration are both critical
| Factor | TypeScript | Rust | Go | Hybrid |
|---|---|---|---|---|
| Startup time | ~150ms | <10ms | <20ms | <20ms |
| Dev velocity | ★★★★★ | ★★☆☆☆ | ★★★★☆ | ★★☆☆☆ |
| Type safety | ★★★★☆ | ★★★★★ | ★★★☆☆ | ★★★★★ |
| Distribution | ★★★☆☆ | ★★★★★ | ★★★★★ | ★★★★☆ |
| Ecosystem fit | ★★★★★ | ★★★☆☆ | ★★★☆☆ | ★★★★☆ |
| Contributor pool | ★★★★★ | ★★★☆☆ | ★★★★☆ | ★★★☆☆ |
agent-trajectories/
├── packages/
│ ├── core/ # Types, schemas, validation
│ │ ├── src/
│ │ │ ├── types.ts # Trajectory, Chapter, Event types
│ │ │ ├── schema.ts # JSON Schema + Zod validation
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ ├── storage/ # Storage adapters
│ │ ├── src/
│ │ │ ├── interface.ts # StorageAdapter interface
│ │ │ ├── file.ts # File system storage
│ │ │ ├── sqlite.ts # SQLite storage
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ ├── capture/ # Event capture mechanisms
│ │ ├── src/
│ │ │ ├── hooks.ts # Claude Code hooks
│ │ │ ├── parser.ts # [[TRAJECTORY:*]] block parser
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ ├── adapters/ # Task system adapters
│ │ ├── src/
│ │ │ ├── interface.ts # TaskSourceAdapter interface
│ │ │ ├── beads.ts
│ │ │ ├── github.ts
│ │ │ ├── linear.ts
│ │ │ └── plain.ts
│ │ └── package.json
│ │
│ ├── workspace/ # Knowledge extraction
│ │ ├── src/
│ │ │ ├── decisions.ts
│ │ │ ├── patterns.ts
│ │ │ ├── extract.ts
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ ├── export/ # Export formats
│ │ ├── src/
│ │ │ ├── markdown.ts
│ │ │ ├── timeline.ts
│ │ │ ├── json.ts
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ └── cli/ # Command-line interface
│ ├── src/
│ │ ├── commands/
│ │ │ ├── start.ts
│ │ │ ├── status.ts
│ │ │ ├── complete.ts
│ │ │ ├── export.ts
│ │ │ └── search.ts
│ │ └── index.ts
│ └── package.json
│
├── pnpm-workspace.yaml
├── tsconfig.json
└── package.json
Pros:
- Clear separation of concerns
- Independent versioning possible
- Tree-shaking friendly
- Easy to test in isolation
Cons:
- Build complexity (need turborepo/nx/pnpm workspaces)
- More boilerplate
- Dependency management between packages
agent-trajectories/
├── src/
│ ├── core/
│ │ ├── types.ts
│ │ ├── schema.ts
│ │ └── index.ts
│ │
│ ├── storage/
│ │ ├── interface.ts
│ │ ├── file.ts
│ │ ├── sqlite.ts
│ │ └── index.ts
│ │
│ ├── capture/
│ │ ├── hooks.ts
│ │ ├── parser.ts
│ │ └── index.ts
│ │
│ ├── adapters/
│ │ ├── interface.ts
│ │ ├── beads.ts
│ │ ├── github.ts
│ │ ├── linear.ts
│ │ ├── plain.ts
│ │ └── index.ts
│ │
│ ├── workspace/
│ │ ├── decisions.ts
│ │ ├── patterns.ts
│ │ ├── extract.ts
│ │ └── index.ts
│ │
│ ├── export/
│ │ ├── markdown.ts
│ │ ├── timeline.ts
│ │ ├── json.ts
│ │ └── index.ts
│ │
│ ├── cli/
│ │ ├── commands/
│ │ └── index.ts
│ │
│ └── index.ts # Main entry point
│
├── package.json
└── tsconfig.json
Pros:
- Simpler build setup
- Easier to get started
- Single version
- Less boilerplate
Cons:
- All-or-nothing imports (unless careful with exports)
- Harder to split later
- Single package grows large over time
agent-trajectories/
├── src/
│ ├── core/ # Minimal core
│ │ ├── types.ts
│ │ ├── schema.ts
│ │ ├── plugin.ts # Plugin interface
│ │ └── registry.ts # Plugin registry
│ │
│ ├── cli/
│ └── index.ts
│
├── plugins/ # Official plugins
│ ├── storage-file/
│ ├── storage-sqlite/
│ ├── adapter-beads/
│ ├── adapter-github/
│ ├── export-markdown/
│ └── workspace/
│
└── package.json
Pros:
- Maximum extensibility
- Community can add adapters/exporters
- Core stays minimal
- Optional features don't bloat install
Cons:
- Plugin API design is hard to get right
- Discovery/loading complexity
- Version compatibility matrix
- More documentation needed
| Factor | Monorepo | Single Package | Plugin |
|---|---|---|---|
| Initial complexity | High | Low | Medium |
| Extensibility | Medium | Low | High |
| Bundle size control | ★★★★★ | ★★☆☆☆ | ★★★★★ |
| Dev experience | ★★★☆☆ | ★★★★★ | ★★★☆☆ |
| Long-term maintenance | ★★★★★ | ★★★☆☆ | ★★★★☆ |
.trajectories/
├── index.json # Quick lookup index
├── active/
│ └── traj_abc123.json
├── completed/
│ └── 2024-01/
│ └── traj_def456.json
└── archive/
Pros:
- Zero dependencies
- Git-friendly (version with code)
- Human-readable/editable
- Works everywhere
Cons:
- No efficient querying
- No full-text search
- Index can get out of sync
- Large repos = slow listing
Best for: Simple setups, single-agent use, git-centric workflows
// Schema
CREATE TABLE trajectories (
id TEXT PRIMARY KEY,
task_title TEXT NOT NULL,
task_source TEXT,
task_id TEXT,
status TEXT NOT NULL,
started_at INTEGER NOT NULL,
completed_at INTEGER,
data TEXT NOT NULL, -- Full JSON
created_at INTEGER,
updated_at INTEGER
);
CREATE VIRTUAL TABLE trajectories_fts USING fts5(
task_title,
summary,
decisions,
content='trajectories'
);Pros:
- Full-text search with FTS5
- Efficient queries
- ACID transactions
- Single file, portable
- Excellent library support (better-sqlite3, sql.js)
Cons:
- Binary file, harder to diff in git
- Requires SQLite dependency
- Migration management needed
- Concurrent write limitations
Best for: Local-first with search, single-user/single-machine
.trajectories/
├── trajectories.db # SQLite index + search
├── data/
│ ├── traj_abc123.json # Full trajectory data
│ └── traj_def456.json
└── exports/
└── traj_abc123.md # Generated markdown
Pros:
- Best of both: search + human-readable files
- Git can track JSON files
- SQLite handles indexing
- Files survive DB corruption
Cons:
- Sync complexity
- Two sources of truth
- More disk usage
Best for: Balance of usability and power
interface StorageAdapter {
// Trajectory CRUD
save(trajectory: Trajectory): Promise<void>;
get(id: string): Promise<Trajectory | null>;
list(query: TrajectoryQuery): Promise<TrajectorySummary[]>;
delete(id: string): Promise<void>;
// Search
search(text: string, options?: SearchOptions): Promise<TrajectorySummary[]>;
// Events (for streaming)
appendEvent(trajectoryId: string, event: TrajectoryEvent): Promise<void>;
// Lifecycle
initialize(): Promise<void>;
close(): Promise<void>;
}
// Implementations
class FileStorageAdapter implements StorageAdapter { }
class SQLiteStorageAdapter implements StorageAdapter { }
class PostgresStorageAdapter implements StorageAdapter { }
class S3StorageAdapter implements StorageAdapter { }Pros:
- User chooses what fits their needs
- Easy to add new backends
- Testing with in-memory adapter
- Cloud-ready when needed
Cons:
- Interface design requires foresight
- Lowest common denominator features
- More code to maintain
| Factor | File Only | SQLite | Hybrid | Pluggable |
|---|---|---|---|---|
| Setup complexity | None | Low | Medium | Medium |
| Search capability | ★☆☆☆☆ | ★★★★★ | ★★★★★ | Varies |
| Git-friendly | ★★★★★ | ★☆☆☆☆ | ★★★★☆ | Varies |
| Scalability | ★★☆☆☆ | ★★★★☆ | ★★★★☆ | ★★★★★ |
| Dependencies | None | 1 lib | 1 lib | Varies |
import { program } from 'commander';
program
.command('start <title>')
.option('--task <id>', 'Link to external task')
.option('--source <system>', 'Task system (beads, linear, github)')
.action(async (title, options) => {
// Implementation
});Pros:
- Most popular, well-documented
- Simple API
- Good TypeScript support
- Automatic help generation
Cons:
- Basic output formatting
- No built-in prompts/spinners
import { Command, Flags } from '@oclif/core';
export default class Start extends Command {
static flags = {
task: Flags.string({ char: 't', description: 'External task ID' }),
};
async run() {
const { flags } = await this.parse(Start);
// Implementation
}
}Pros:
- Enterprise-grade
- Plugin system built-in
- Excellent TypeScript support
- Auto-generated docs
- Built-in testing utilities
Cons:
- Heavy (many dependencies)
- Opinionated structure
- Learning curve
import yargs from 'yargs';
yargs
.command('start <title>', 'Start a new trajectory', (yargs) => {
return yargs.positional('title', { type: 'string' });
}, async (argv) => {
// Implementation
})
.parse();Pros:
- Flexible
- Good middleware support
- Widely used
Cons:
- TypeScript types can be tricky
- Less structured than oclif
import { intro, outro, text, select, spinner } from '@clack/prompts';
intro('Agent Trajectories');
const title = await text({
message: 'Trajectory title?',
placeholder: 'Implement feature X',
});
const s = spinner();
s.start('Creating trajectory...');
// ...
s.stop('Trajectory created!');
outro('Done!');Pros:
- Beautiful, modern UI
- Great DX
- Interactive prompts built-in
- Small bundle
Cons:
- Newer, less battle-tested
- Focused on interactive use
Use Commander for command parsing, Clack for interactive elements:
import { program } from 'commander';
import { intro, spinner, text } from '@clack/prompts';
program
.command('start [title]')
.action(async (title) => {
intro('New Trajectory');
const finalTitle = title ?? await text({
message: 'What are you working on?',
});
const s = spinner();
s.start('Creating trajectory...');
await createTrajectory(finalTitle);
s.stop('Trajectory created!');
});// hooks/trajectory.ts
export default {
name: 'trajectory-capture',
// Inject active trajectory context at session start
onSessionStart: async () => {
const active = await getActiveTrajectory();
if (active) {
return {
systemPrompt: `
Active trajectory: ${active.task.title}
Chapter: ${active.currentChapter?.title ?? 'Not started'}
Use [[TRAJECTORY:event]] blocks to record significant decisions.
`
};
}
},
// Prompt for retrospective at session end
onSessionEnd: async () => {
const active = await getActiveTrajectory();
if (active?.status === 'completing') {
return {
prompt: 'Please provide a retrospective for this trajectory.'
};
}
},
// Capture tool calls as events
onToolCall: async (tool, args, result) => {
await appendEvent({
type: 'tool_call',
content: `${tool}: ${summarize(args)}`,
raw: { tool, args, result }
});
}
};// When agent-relay is available, capture messages
import { RelayClient } from 'agent-relay';
const relay = new RelayClient();
relay.on('message', async (envelope) => {
if (envelope.metadata?.trajectoryId) {
await appendEvent({
type: envelope.from === currentAgent ? 'message_sent' : 'message_received',
content: envelope.payload.body,
agentName: envelope.from
});
}
});// Watch for beads state changes
import { BeadsClient } from 'beads';
const beads = new BeadsClient();
beads.on('taskStarted', async (task) => {
await startTrajectory({
title: task.title,
source: { system: 'beads', id: task.id }
});
});
beads.on('taskCompleted', async (taskId) => {
await promptRetrospective(taskId);
await completeTrajectory(taskId);
});Expose trajectories via Model Context Protocol:
// mcp/trajectory-server.ts
import { MCPServer } from '@anthropic/mcp';
const server = new MCPServer({
name: 'trajectory-server',
tools: {
'trajectory.start': {
description: 'Start a new trajectory',
parameters: { title: 'string', taskId: 'string?' },
handler: async ({ title, taskId }) => {
return await startTrajectory({ title, taskId });
}
},
'trajectory.decision': {
description: 'Record a decision point',
parameters: {
question: 'string',
chosen: 'string',
alternatives: 'string[]',
reasoning: 'string'
},
handler: async (decision) => {
return await recordDecision(decision);
}
},
'trajectory.search': {
description: 'Search past trajectories',
parameters: { query: 'string' },
handler: async ({ query }) => {
return await searchTrajectories(query);
}
}
},
resources: {
'trajectory://active': {
description: 'Current active trajectory',
handler: async () => {
return await getActiveTrajectory();
}
}
}
});npm install -g agent-trajectories
# or
npx agent-trajectories start "My task"Pros:
- Familiar to JS/TS developers
- Easy updates
- Can be used as library too
Cons:
- Requires Node.js
- Startup overhead
- node_modules size
# Build
npx pkg . --targets node18-linux-x64,node18-macos-x64,node18-win-x64
# Distribute
curl -fsSL https://agent-trajectories.dev/install.sh | shPros:
- No Node.js required
- Single file
- Faster startup (bundled)
Cons:
- Large binary (~50MB)
- Build complexity
- Platform-specific builds
bun build ./src/cli/index.ts --compile --outfile agent-trajectoriesPros:
- Fast builds
- Small binaries
- TypeScript native
- Fast startup
Cons:
- Newer runtime, less proven
- Some Node.js APIs differ
- Smaller community
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci && npm run build
ENTRYPOINT ["node", "dist/cli/index.js"]Pros:
- Consistent environment
- Easy for CI/CD integration
- No local install
Cons:
- Docker required
- Overhead for simple CLI
- Less convenient for daily use
| Method | Install Size | Startup | Portability | Updates |
|---|---|---|---|---|
| NPM | ~20MB | ~150ms | Node required | npm update |
| Binary (pkg) | ~50MB | ~80ms | Standalone | Redownload |
| Bun | ~30MB | ~30ms | Standalone | Redownload |
| Docker | ~150MB | ~500ms | Docker required | docker pull |
// core/schema.test.ts
import { validateTrajectory } from './schema';
describe('Trajectory Schema', () => {
it('validates a complete trajectory', () => {
const trajectory = {
id: 'traj_123',
version: 1,
task: { title: 'Test task' },
status: 'active',
startedAt: new Date().toISOString(),
agents: [],
chapters: []
};
expect(validateTrajectory(trajectory)).toEqual({ valid: true });
});
it('rejects trajectory without title', () => {
const trajectory = { id: 'traj_123', status: 'active' };
expect(validateTrajectory(trajectory).valid).toBe(false);
});
});// storage/sqlite.test.ts
import { SQLiteStorage } from './sqlite';
import { tmpdir } from 'os';
describe('SQLite Storage', () => {
let storage: SQLiteStorage;
beforeEach(async () => {
storage = new SQLiteStorage(`${tmpdir()}/test-${Date.now()}.db`);
await storage.initialize();
});
afterEach(async () => {
await storage.close();
});
it('saves and retrieves trajectories', async () => {
const trajectory = createTestTrajectory();
await storage.save(trajectory);
const retrieved = await storage.get(trajectory.id);
expect(retrieved).toEqual(trajectory);
});
it('searches by text', async () => {
await storage.save(createTestTrajectory({ title: 'Implement auth' }));
await storage.save(createTestTrajectory({ title: 'Fix database' }));
const results = await storage.search('auth');
expect(results).toHaveLength(1);
expect(results[0].task.title).toBe('Implement auth');
});
});// cli/e2e.test.ts
import { execSync } from 'child_process';
describe('CLI E2E', () => {
it('creates and completes a trajectory', () => {
// Start
const startOutput = execSync('npx agent-trajectories start "Test task"');
expect(startOutput.toString()).toContain('Trajectory created');
// Check status
const statusOutput = execSync('npx agent-trajectories status');
expect(statusOutput.toString()).toContain('Test task');
expect(statusOutput.toString()).toContain('active');
// Complete
const completeOutput = execSync('npx agent-trajectories complete');
expect(completeOutput.toString()).toContain('completed');
});
});| Framework | Speed | DX | TypeScript | Watch Mode |
|---|---|---|---|---|
| Vitest | ★★★★★ | ★★★★★ | Native | ★★★★★ |
| Jest | ★★★☆☆ | ★★★★☆ | Config needed | ★★★★☆ |
| Node test runner | ★★★★★ | ★★★☆☆ | Via tsx | ★★★☆☆ |
| Bun test | ★★★★★ | ★★★★☆ | Native | ★★★★☆ |
Recommendation: Vitest for best TypeScript DX and speed.
Based on the analysis above, here are my recommendations for different scenarios:
| Choice | Recommendation | Rationale |
|---|---|---|
| Language | TypeScript | Fastest development, ecosystem alignment |
| Structure | Single package | Less setup, iterate quickly |
| Storage | SQLite + Files hybrid | Good balance of features |
| CLI | Commander + Clack | Simple + beautiful |
| Distribution | NPM | Standard, easy |
| Testing | Vitest | Fast, great DX |
# Quick start
mkdir agent-trajectories && cd agent-trajectories
npm init -y
npm install typescript commander @clack/prompts better-sqlite3 zod
npm install -D vitest @types/node @types/better-sqlite3| Choice | Recommendation | Rationale |
|---|---|---|
| Language | TypeScript | Ecosystem, contributors |
| Structure | Monorepo (pnpm + turborepo) | Scalability, clean separation |
| Storage | Pluggable (SQLite default) | Flexibility |
| CLI | Oclif | Enterprise-grade, plugins |
| Distribution | NPM + pkg binaries | Both audiences |
| Testing | Vitest + Playwright | Unit + E2E |
| Choice | Recommendation | Rationale |
|---|---|---|
| Language | Rust | Speed, single binary |
| Structure | Cargo workspace | Rust standard |
| Storage | SQLite (rusqlite) | Native performance |
| CLI | clap + ratatui | Fast, beautiful TUI |
| Distribution | GitHub releases | Binary downloads |
| Testing | cargo test | Built-in |
| Choice | Recommendation | Rationale |
|---|---|---|
| Language | TypeScript | Contributor-friendly |
| Structure | Plugin architecture | Community extensions |
| Storage | Pluggable | Any backend |
| CLI | Oclif | Built-in plugins |
| Distribution | NPM (core + plugins) | Mix and match |
| Testing | Vitest | Standard |
This is the recommended path—a progressive architecture that works for a solo developer on day one but scales to enterprise teams without rewrites.
| Choice | Recommendation | Rationale |
|---|---|---|
| Language | TypeScript | Broadest contributor pool, ecosystem alignment |
| Structure | Single package → Monorepo | Start simple, split when needed |
| Storage | Pluggable with sensible defaults | File for solo, SQLite for power users, Postgres for teams |
| CLI | Commander + Clack | Simple, beautiful, scriptable |
| Distribution | NPM + optional binaries | Easy install, optional standalone |
| Testing | Vitest | Fast, modern, great DX |
┌─────────────────────────────────────────────────────────────────────────┐
│ PROGRESSIVE ENHANCEMENT │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ SOLO DEV (Day 1) │
│ ├── npm install agent-trajectories │
│ ├── File storage (.trajectories/ in repo) │
│ ├── CLI: start, status, complete, export │
│ └── Zero config, works immediately │
│ │
│ POWER USER (Week 2) │
│ ├── Enable SQLite for search │
│ ├── Claude Code hooks for auto-capture │
│ ├── Custom export templates │
│ └── Config file for preferences │
│ │
│ SMALL TEAM (Month 1) │
│ ├── Shared SQLite or PostgreSQL │
│ ├── Multiple agent coordination │
│ ├── Team patterns and decisions │
│ └── PR integration │
│ │
│ ENTERPRISE (Month 3+) │
│ ├── PostgreSQL + S3 for scale │
│ ├── SSO/auth integration │
│ ├── Audit logging │
│ ├── Custom adapters (Jira, ServiceNow, etc.) │
│ └── Dashboard/reporting │
│ │
└─────────────────────────────────────────────────────────────────────────┘
| Need | Solo Dev Solution | Enterprise Solution |
|---|---|---|
| Quick start | npx agent-trajectories init |
Same, then configure |
| Storage | Files in repo (git-friendly) | Postgres + S3 |
| Search | SQLite FTS (local) | Elasticsearch/Postgres FTS |
| Auth | None needed | SSO via config |
| Compliance | Git history IS the audit log | Dedicated audit tables |
| Integrations | Claude Code hooks | + Jira, Linear, custom |
// No config = sensible defaults for solo dev
// .trajectoriesrc.json for customization
// Solo dev (zero config)
{}
// Power user
{
"storage": "sqlite",
"hooks": { "claudeCode": true }
}
// Team
{
"storage": {
"type": "postgresql",
"url": "$DATABASE_URL"
},
"team": {
"sharedPatterns": true,
"requireRetrospective": true
}
}
// Enterprise
{
"storage": {
"type": "postgresql",
"url": "$DATABASE_URL",
"archive": { "type": "s3", "bucket": "trajectories-archive" }
},
"auth": {
"type": "oidc",
"issuer": "$AUTH_ISSUER"
},
"audit": { "enabled": true },
"adapters": ["jira", "servicenow"]
}How should trajectories be captured? This determines the v1 focus and user experience.
// .claude/hooks/trajectory.ts
export default {
onSessionStart: async () => {
// Auto-detect or prompt for trajectory context
return { systemPrompt: `Recording trajectory for: ${activeTask}` };
},
onToolCall: async (tool, args, result) => {
// Silently capture tool usage
await trajectory.appendEvent({ type: 'tool_call', tool, summary: ... });
},
onSessionEnd: async () => {
// Prompt for retrospective
return { prompt: 'Please summarize key decisions made...' };
}
};User experience:
- Install hook once, trajectories captured automatically
- Agent explicitly prompted for decisions/retrospectives
- Works passively in background
Pros:
- Lowest friction—just works
- Captures everything (no forgotten recordings)
- Integrates with existing Claude Code workflow
Cons:
- Coupled to Claude Code
- Less control over what's captured
- May capture noise
Best for: Claude Code users who want zero-effort capture
# User explicitly starts/stops trajectories
$ traj start "Implement auth"
Trajectory started: traj_abc123
# User works with any tool (Claude Code, Cursor, manual coding)
# User explicitly records decisions
$ traj decision "Chose JWT over sessions" --reasoning "Stateless scaling"
# User explicitly completes
$ traj complete
Please enter retrospective summary: ...User experience:
- Full control over what's recorded
- Works with ANY coding tool/workflow
- Trajectory is a deliberate artifact
Pros:
- Tool-agnostic (works with Cursor, Copilot, manual coding)
- User controls signal vs noise
- Simpler implementation (no hook infrastructure)
Cons:
- Friction—user must remember to record
- Incomplete trajectories likely
- More typing/commands
Best for: Users who want explicit control, multi-tool workflows
// Agent can directly read/write trajectories via MCP
const server = new MCPServer({
tools: {
'trajectory.start': { ... },
'trajectory.decision': { ... },
'trajectory.search': { ... }, // Query past trajectories
'trajectory.getSuggestions': { ... } // "How did we solve X before?"
},
resources: {
'trajectory://active': { ... },
'trajectory://patterns': { ... }
}
});User experience:
- Agent can autonomously record its own work
- Agent can query past trajectories for context
- Enables the "Agent Workspace" vision
Pros:
- Agents can learn from past trajectories
- Self-documenting agents
- Enables workspace/pattern features
Cons:
- More complex infrastructure
- Requires MCP-capable client
- Agent must be taught to use it
Best for: Advanced multi-agent setups, workspace features
Start with CLI as the foundation, add hooks and MCP as layers:
┌─────────────────────────────────────────────────────────────────┐
│ INTEGRATION LAYERS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MCP Server (Layer 3) ─── Agent-accessible queries/writes │
│ │ │
│ ▼ │
│ Claude Code Hooks (Layer 2) ─── Automatic capture │
│ │ │
│ ▼ │
│ CLI Core (Layer 1) ─── Explicit commands, always available │
│ │
└─────────────────────────────────────────────────────────────────┘
v1 scope options:
- Minimal: CLI only (4 commands: start, status, complete, export)
- Standard: CLI + Claude Code hooks
- Full: CLI + Hooks + MCP Server
Recommendation: Start with CLI + Hooks. MCP in v1.1.
-
Storage default?
- File-only (simplest, git-friendly)? ← Recommended for v1
- SQLite (searchable)?
- Both available from start?
-
Retrospective enforcement?
- Required to complete?
- Optional but prompted? ← Recommended
- Fully optional?
-
Scope of v1?
- Core capture + export only?
- Include workspace features?
- Include all adapters?
- Decide on core choices (language, structure, storage)
- Define v1 scope (what's in, what's deferred)
- Set up repository with chosen stack
- Implement core types and schema validation
- Build minimal CLI (start, status, complete, export)
- Add first integration (Claude Code hooks or Beads)
- Ship alpha for feedback
{
"dependencies": {
"commander": "^12.0.0",
"@clack/prompts": "^0.7.0",
"better-sqlite3": "^9.0.0",
"zod": "^3.22.0",
"chalk": "^5.3.0",
"date-fns": "^3.0.0"
},
"devDependencies": {
"typescript": "^5.3.0",
"vitest": "^1.0.0",
"@types/node": "^20.0.0",
"@types/better-sqlite3": "^7.0.0",
"tsup": "^8.0.0"
}
}[dependencies]
clap = { version = "4.4", features = ["derive"] }
rusqlite = { version = "0.30", features = ["bundled"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
tokio = { version = "1.0", features = ["full"] }