
Headless Server Architecture (Foundation) #5

@peterzan

Description

Overview

Extract the ILETP orchestration logic from the current Swift macOS application into a protocol-agnostic headless server that exposes REST and/or WebSocket APIs. This enables the core multi-agent coordination, trust scoring, and consensus mechanisms to be accessed by any client (web browsers, mobile apps, CLI tools, IDE extensions) while maintaining clean separation between backend logic and frontend presentation.

Status: Foundation / Core Infrastructure
Priority: High (enables all other implementation variants)
Blocks: Issue #2 (MCP variant), future protocol-specific implementations

Important Note: This issue is provided for community members who fork the repository and wish to build implementations. Any work completed here will not be merged back into this repository—it's intended for independent forks and derivative projects. That said, if you build something based on this issue, please let me know! I'd love to see what the community creates and may link to notable implementations.


Context

The current Swift application contains valuable ILETP orchestration logic (multi-model coordination, trust protocols, consensus mechanisms) tightly coupled to a macOS UI. This coupling limits:

  • Platform reach - macOS only, not accessible from web/mobile/Linux
  • Integration potential - Other apps can't embed ILETP functionality
  • UI experimentation - Changing UI requires touching orchestration code
  • Language flexibility - Locked into Swift/macOS ecosystem
  • Scalability - Can't distribute backend processing independently

A headless server architecture solves these limitations by creating a clean API boundary between ILETP's "brain" (orchestration, trust, consensus) and its "face" (any UI client).


Goals

  1. API-First Design - Create well-defined REST and/or WebSocket endpoints for all ILETP core functions
  2. Frontend Independence - Enable any client technology (React, Vue, SwiftUI, CLI, etc.) to use ILETP
  3. Embeddability - Allow other applications to integrate ILETP via HTTP/WebSocket calls
  4. Protocol Agnostic - Foundation that supports multiple communication protocols (REST, GraphQL, MCP, gRPC, etc.)
  5. Reference Implementation - Demonstrate how ILETP specifications translate into working server code

Technical Requirements

Core Server Infrastructure

  • Server Framework Selection

    • Choose appropriate framework for implementation language:
      • Python: FastAPI, Flask, or Django
      • Node.js: Express, Fastify, or NestJS
      • Rust: Axum, Actix-web, or Rocket
      • Go: Gin, Echo, or Chi
    • Support both HTTP REST and WebSocket connections
    • Production-ready error handling and logging
    • Health check and metrics endpoints
  • API Design

    • RESTful endpoints for stateless operations
    • WebSocket support for streaming/real-time updates
    • Versioned API (/v1/...) for future compatibility
    • OpenAPI/Swagger specification for documentation
    • Consistent request/response formats (JSON)
    • Proper HTTP status codes and error responses
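To make the API-design bullets concrete, here is a minimal, framework-free sketch of versioned routing with a consistent JSON envelope and proper status codes. It is illustrative only: the route table, error envelope shape, and `job-placeholder` id are assumptions, not part of the ILETP specs, and a real implementation would use one of the frameworks listed above.

```python
import json

API_VERSION = "v1"

def json_response(status: int, body: dict) -> tuple[int, str]:
    # Every endpoint returns (HTTP status, serialized JSON) so clients can
    # rely on one consistent envelope.
    return status, json.dumps(body)

def handle_health(_payload: dict) -> tuple[int, str]:
    return json_response(200, {"status": "ok"})

def handle_orchestrate(payload: dict) -> tuple[int, str]:
    query = payload.get("query", "")
    if not isinstance(query, str) or not query.strip():
        # Validation failures map to 422 with a structured error body.
        return json_response(422, {"error": {
            "code": "invalid_request",
            "message": "query must be a non-empty string"}})
    # A real server would enqueue the job here; the id below is a placeholder.
    return json_response(202, {"job_id": "job-placeholder", "status": "pending"})

# Versioned routing table: a future /v2/ adds entries without breaking /v1/.
ROUTES = {
    ("GET", "/health"): handle_health,
    ("POST", f"/{API_VERSION}/orchestrate"): handle_orchestrate,
}
```

The same envelope and versioning conventions carry over directly to FastAPI, Express, Axum, or Gin.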

ILETP Core Functionality (Backend)

  • Orchestration Engine (Spec 1)

    • POST /v1/orchestrate - Route query to multiple models
    • GET /v1/orchestrate/{job_id} - Check orchestration status
    • GET /v1/orchestrate/{job_id}/results - Retrieve results
    • Handle concurrent model API calls
    • Support both synchronous and asynchronous processing
    • Return structured responses with model attribution
  • Trust & Consensus Protocol (Spec 2)

    • POST /v1/consensus - Calculate trust scores from model responses
    • GET /v1/trust-score/{query_id} - Retrieve trust analysis
    • Return confidence levels, dissent analysis, contributing models
    • Support confidence-weighted aggregation
    • Provide transparency into consensus calculation
  • Dynamic Agent Orchestration (Spec 7)

    • Analyze query complexity automatically
    • Select optimal number/type of models based on requirements
    • Calculate and enforce diversity thresholds
    • Expose orchestration decisions via API for transparency
    • Support manual overrides (user specifies models)
  • Session & Conversation Management (Spec 4, 5, 6)

    • POST /v1/sessions - Create new conversation session
    • GET /v1/sessions/{session_id} - Retrieve session state
    • POST /v1/sessions/{session_id}/messages - Add message to session
    • GET /v1/sessions/{session_id}/history - Get conversation history
    • DELETE /v1/sessions/{session_id} - End session
    • Persistent session storage (database or file-based)
    • Context preservation across reconnections
    • Session expiration and cleanup
  • Asynchronous Task Handling (Spec 5)

    • POST /v1/tasks - Submit long-running task
    • GET /v1/tasks/{task_id} - Check task status
    • GET /v1/tasks/{task_id}/result - Retrieve completed result
    • WebSocket endpoint for task progress updates
    • Task queue implementation (in-memory or Redis/RabbitMQ)
    • Proper timeout and cancellation handling
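The orchestration and consensus requirements above can be sketched in a few lines. This is a hedged illustration, not the ILETP algorithm: `call_model` is a network-free stand-in, and the confidence-weighted pooling in `consensus` is one plausible aggregation scheme (identical answers pool their confidence; the trust score is the winning pool's share).

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ModelResponse:
    model: str
    text: str
    confidence: float  # 0.0-1.0, provider-reported or derived

async def call_model(model: str, query: str) -> ModelResponse:
    # Stand-in for a real provider call; real code would hit an LLM API here.
    await asyncio.sleep(0)
    return ModelResponse(model, f"{model} answer to: {query}", 0.8)

async def orchestrate(query: str, models: list[str]) -> list[ModelResponse]:
    # Fan out to every model concurrently; a failing provider becomes an
    # exception object rather than crashing the whole request.
    results = await asyncio.gather(
        *(call_model(m, query) for m in models), return_exceptions=True)
    return [r for r in results if isinstance(r, ModelResponse)]

def consensus(responses: list[ModelResponse]) -> dict:
    # Confidence-weighted aggregation with transparency: the response
    # reports the dissent share and which models contributed.
    total = sum(r.confidence for r in responses) or 1.0
    pools: dict[str, float] = {}
    for r in responses:
        pools[r.text] = pools.get(r.text, 0.0) + r.confidence
    answer, weight = max(pools.items(), key=lambda kv: kv[1])
    return {
        "answer": answer,
        "trust_score": weight / total,
        "dissent": 1.0 - weight / total,
        "contributing_models": [r.model for r in responses if r.text == answer],
    }
```

For example, two models answering "yes" at confidence 0.9 against one answering "no" at 0.6 yields a trust score of 1.8 / 2.4 = 0.75 for "yes".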

LLM Provider Integration

  • API Client Abstraction

    • Unified interface for different LLM providers
    • Support for Anthropic (Claude), OpenAI (GPT), Google (Gemini), Mistral, local models (Ollama)
    • Provider-specific authentication handling
    • Rate limiting and retry logic
    • Error handling and fallback strategies
  • Configuration Management

    • API keys stored securely (environment variables, vault, not hardcoded)
    • Provider endpoints configurable
    • Model selection and parameters configurable
    • Support for rotation across multiple API keys
    • Clear documentation on required credentials
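One way to satisfy both bullets above is an abstract provider interface plus environment-based credentials and retry with backoff. This is a sketch under stated assumptions: `LLMProvider`, `EchoProvider`, and the env-var names are hypothetical, and real subclasses would wrap the Anthropic, OpenAI, Gemini, or Ollama SDKs.

```python
import os
import random
import time
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Unified provider interface; concrete subclasses wrap vendor SDKs."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class EchoProvider(LLMProvider):
    # Network-free stand-in so the sketch runs anywhere.
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def load_api_key(env_var: str) -> str:
    # Credentials come from the environment, never from source code.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"missing credential: set {env_var}")
    return key

def complete_with_retry(provider: LLMProvider, prompt: str,
                        attempts: int = 3, base_delay: float = 0.5) -> str:
    # Exponential backoff with jitter for transient provider failures.
    for attempt in range(attempts):
        try:
            return provider.complete(prompt)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
    raise RuntimeError("unreachable")
```

Failing fast on a missing env var at startup gives the "clear documentation on required credentials" a runtime backstop.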

State Management

  • Session Storage

    • Choose storage backend (PostgreSQL, MongoDB, Redis, SQLite)
    • Conversation history persistence
    • Trust score caching
    • Audit trail storage
    • Session metadata (created time, last activity, participant models)
  • Stateless vs Stateful Decision

    • Option A (Stateless): Client sends full context with each request
      • Pros: Simpler, more scalable, easier to load balance
      • Cons: Higher bandwidth, client manages state
    • Option B (Stateful): Server maintains session state
      • Pros: Lower bandwidth, richer session features
      • Cons: Requires session store, more complex deployment
    • Recommendation: Start stateful, provide stateless option
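Following the "start stateful" recommendation, here is a minimal in-memory session store with TTL-based expiration. The class and method names are illustrative; the point is that the same interface can later be backed by Redis or PostgreSQL when moving to multi-instance deployments.

```python
import time
import uuid

class SessionStore:
    """In-memory session store for the stateful mode (single instance)."""
    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._sessions: dict[str, dict] = {}

    def create(self) -> str:
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = {"messages": [],
                                      "last_activity": time.monotonic()}
        return session_id

    def append(self, session_id: str, role: str, content: str) -> None:
        s = self._sessions[session_id]
        s["messages"].append({"role": role, "content": content})
        s["last_activity"] = time.monotonic()

    def history(self, session_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate stored state.
        return list(self._sessions[session_id]["messages"])

    def sweep(self) -> int:
        # Expiration and cleanup: drop sessions idle longer than the TTL.
        now = time.monotonic()
        stale = [sid for sid, s in self._sessions.items()
                 if now - s["last_activity"] > self.ttl]
        for sid in stale:
            del self._sessions[sid]
        return len(stale)
```

A stateless mode falls out naturally: skip the store and have the client send `history` back with each request.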

Security & Authentication

  • API Security

    • API key or JWT-based authentication
    • Rate limiting per user/API key
    • CORS configuration for web clients
    • Input validation and sanitization
    • Protection against common attacks (injection, XSS, etc.)
  • LLM API Key Protection

    • Never expose provider API keys to clients
    • Server-side only LLM API calls
    • Secure key storage (environment variables, secrets manager)
    • Key rotation support
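Two of the security bullets, API-key authentication and per-key rate limiting, can be sketched as below. The header name and token-bucket parameters are assumptions, not mandated by the specs; `hmac.compare_digest` is used because naive string comparison can leak matching key prefixes through response timing.

```python
import hmac
import time
from collections import defaultdict

def authenticate(headers: dict, expected_key: str) -> bool:
    # Constant-time comparison avoids a timing side channel on the key.
    return hmac.compare_digest(headers.get("X-API-Key", ""), expected_key)

class RateLimiter:
    """Token-bucket limiter keyed per API key."""
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self._buckets = defaultdict(lambda: (float(burst), time.monotonic()))

    def allow(self, api_key: str) -> bool:
        tokens, last = self._buckets[api_key]
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._buckets[api_key] = (tokens, now)
            return False
        self._buckets[api_key] = (tokens - 1.0, now)
        return True
```

Note this guards the server's own API keys; provider keys never appear in any request or response path at all.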

Developer Experience

  • Documentation

    • API reference (auto-generated from OpenAPI spec)
    • Quickstart guide (install → run → first API call)
    • Architecture overview diagram
    • Example requests/responses for each endpoint
    • Client SDK examples (curl, Python, JavaScript)
    • Deployment guide (local, Docker, cloud)
  • Testing

    • Unit tests for core orchestration logic
    • Integration tests with mock LLM providers
    • API endpoint tests (request/response validation)
    • Load testing for concurrent requests
    • Test coverage >80%
  • Development Tools

    • Docker/docker-compose for local development
    • Environment variable templates (.env.example)
    • Database migrations/seeding scripts
    • Hot reload for development
    • Logging and debugging utilities

Acceptance Criteria

Minimum Viable Server (MVP)

  • Server starts successfully and listens on configurable port
  • Health check endpoint responds (GET /health)
  • At least 3 core ILETP functions exposed as API endpoints:
    • Orchestrate query across multiple models
    • Calculate trust/consensus score
    • Manage conversation sessions
  • Successfully integrates with at least 2 LLM providers (e.g., Anthropic + OpenAI)
  • Returns properly formatted JSON responses with appropriate HTTP status codes
  • Basic error handling (graceful degradation when models fail)
  • API documentation (OpenAPI spec or equivalent)

Quality Standards

  • Security: API keys never exposed in responses or logs
  • Performance: Multi-model orchestration completes within reasonable time (<30s for 3 models)
  • Reliability: Handles provider failures gracefully (doesn't crash server)
  • Observability: Request/response logging, error tracking
  • Configuration: All secrets/credentials via environment variables
  • Documentation: README with setup, API reference, architecture explanation

Validation

  • Demonstrate working API calls via curl/Postman
  • Show multi-model orchestration returning trust scores
  • Prove session persistence (create session, disconnect, reconnect, resume)
  • Deploy to at least one environment (local, Docker, cloud)
  • Create simple web frontend that consumes the API (HTML/JS or framework)

Open Questions

These should be explored during implementation and documented:

Architecture Decisions

  1. Language/Framework: Which technology stack provides the best balance of:

    • Developer familiarity in open-source community
    • Performance for concurrent LLM API calls
    • Library ecosystem for LLM integrations
    • Deployment simplicity
  2. API Style: REST-only, WebSocket-only, or hybrid?

    • REST for request/response operations
    • WebSocket for streaming/real-time updates
    • Server-Sent Events (SSE) as an alternative to WebSockets?
  3. State Management: Stateless or stateful sessions?

    • What are the actual scaling implications?
    • Can we support both modes?
    • How does this affect deployment complexity?
  4. Database: Which storage backend for sessions/history?

    • SQL (PostgreSQL, MySQL) for structured data
    • NoSQL (MongoDB) for flexible schemas
    • Redis for high-performance caching
    • SQLite for simplicity in single-instance deployments
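For the single-instance SQLite option, a session/message schema might look like the sketch below. The table and column names are hypothetical, not taken from the ILETP specs; the same two-table shape maps directly onto PostgreSQL if a deployment outgrows SQLite.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: one row per session, one row per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id    TEXT PRIMARY KEY,
    created_at    REAL NOT NULL,
    last_activity REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(session_id),
    role       TEXT NOT NULL,
    content    TEXT NOT NULL
);
"""

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def new_session(conn: sqlite3.Connection) -> str:
    sid = str(uuid.uuid4())
    now = time.time()
    conn.execute("INSERT INTO sessions VALUES (?, ?, ?)", (sid, now, now))
    return sid

def add_message(conn: sqlite3.Connection, sid: str,
                role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (sid, role, content))
```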

Implementation Strategy

  1. Migration from Swift: Should we:

    • Port Swift logic directly to new language
    • Rewrite from scratch following ILETP specs
    • Use Swift code as reference but optimize for server use case
  2. Concurrency Model: How to handle concurrent LLM calls?

    • Async/await patterns
    • Thread pools
    • Message queues
    • What are the performance implications?
  3. Provider Abstraction: How generic should LLM provider interface be?

    • Support only major providers initially (Anthropic, OpenAI, Google)
    • Design for easy plugin of new providers
    • Handle provider-specific features (function calling, vision, etc.)

Deployment & Operations

  1. Containerization: Should we provide:

    • Docker image with all dependencies
    • Docker Compose for multi-container setup (server + database)
    • Kubernetes manifests for cloud deployment
  2. Horizontal Scaling: How to design for multiple server instances?

    • Stateless design for easy load balancing
    • Shared session store (Redis, database)
    • Message queue for task distribution
  3. Monitoring & Observability: What should be exposed?

    • Request metrics (latency, throughput, errors)
    • LLM provider metrics (calls, costs, failures)
    • Trust score distributions
    • Session statistics

Non-Goals (Out of Scope)

This Issue does NOT aim to:

  • ❌ Build a production-grade server ready for enterprise deployment (this is a reference implementation)
  • ❌ Implement all 10 ILETP specifications initially (start with core 3-4)
  • ❌ Create polished frontend UI (simple demo client is sufficient)
  • ❌ Support every LLM provider (start with 2-3 major ones)
  • ❌ Optimize for extreme performance (focus on correctness first)
  • ❌ Handle authentication/authorization beyond basic API keys
  • ❌ Implement advanced features like multi-tenancy, billing, etc.

Success Metrics

We'll consider this Issue successful if:

  1. Working Server: API server runs and handles ILETP orchestration requests
  2. Clean API Design: Well-documented endpoints that make sense to external developers
  3. Provider Independence: Easy to add new LLM providers without changing core code
  4. Client Proof: At least one working client (web, CLI, or mobile) consuming the API
  5. Foundation Quality: Solid architecture that supports future enhancements (MCP, GraphQL, etc.)
  6. Documentation: Other developers can fork, deploy, and use the server

Success is not measured by completeness—it's measured by whether this creates a solid foundation for ILETP's ecosystem.


Suggested Implementation Phases

Phase 1: Minimal API Server (Week 1-2)

  • Basic HTTP server with health check
  • Single endpoint: orchestrate query across 2 models
  • Return simple JSON response
  • No session persistence yet

Phase 2: Core ILETP Functions (Week 3-4)

  • Trust score calculation endpoint
  • Session management endpoints
  • Basic state persistence
  • Error handling

Phase 3: Polish & Documentation (Week 5-6)

  • OpenAPI specification
  • Example client (simple web UI)
  • Deployment guide
  • Testing suite

Phase 4: Optional Enhancements

  • Asynchronous task processing
  • WebSocket support for streaming
  • Additional LLM providers
  • Performance optimization

Related Issues


Resources

ILETP Documentation

Technical References

Framework Options


Labels

enhancement · architecture · foundation · help wanted · good first issue · documentation


Call for Contributions

Backend Developers: This is a great opportunity to:

  • Design API architecture from scratch
  • Work with cutting-edge LLM APIs
  • Build foundation for ILETP ecosystem
  • Influence protocol design decisions

New to ILETP? This Issue provides:

  • Clear acceptance criteria
  • Phased implementation plan
  • Extensive documentation requirements
  • Freedom to choose technology stack

Questions? Comment below or join discussions in community channels.


Maintainer Note: This headless server is the foundation for all protocol-specific implementations (MCP, GraphQL, gRPC, etc.). Quality and architecture matter more than speed of implementation. Take time to design the API thoughtfully—future variants will build on this foundation.

NOTE: Direction provided by me (Peter), documentation provided by Claude.
