Phase 2: Procedure replay engine -- skip reasoning for proven workflows #694

@AlexMikhalev

Parent Epic

Part of #692 (Operational Skill Store)

Summary

When learn query finds a matching CapturedProcedure with confidence > 0.8, offer to replay it instead of re-planning from scratch. This is the core token-saving mechanism -- proven workflows skip LLM reasoning entirely.
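
The gating rule above could be sketched as follows. This is a minimal illustration, not the actual implementation: the CapturedProcedure struct is reduced to two hypothetical fields, and replay_candidate and REPLAY_CONFIDENCE_THRESHOLD are names invented for this example.

```rust
/// Hypothetical, simplified view of a stored procedure. The real
/// CapturedProcedure type presumably carries steps, tags and counters too.
#[derive(Debug, Clone)]
struct CapturedProcedure {
    id: String,
    confidence: f64,
}

/// Threshold taken from the summary: replay is offered only above 0.8.
const REPLAY_CONFIDENCE_THRESHOLD: f64 = 0.8;

/// Returns the highest-confidence match that clears the threshold, if any.
/// Procedures at exactly 0.8 are excluded because the rule is strictly "> 0.8".
fn replay_candidate(matches: &[CapturedProcedure]) -> Option<&CapturedProcedure> {
    matches
        .iter()
        .filter(|p| p.confidence > REPLAY_CONFIDENCE_THRESHOLD)
        .max_by(|a, b| a.confidence.total_cmp(&b.confidence))
}
```

When no candidate is returned, the caller would fall back to normal planning.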

What Changes

New replay subcommand in terraphim_agent

terraphim-agent learn replay "deploy terraphim-llm-proxy config"

Behaviour:

  1. Match query against stored procedures (Aho-Corasick + tag matching)
  2. Display matched procedure with steps, confidence, and replay count
  3. Dry-run by default -- show steps, require confirmation
  4. Execute step-by-step with validation:
    • Check exit code against expected_exit_code
    • Optionally match output against expected_output_pattern
    • Respect per-step timeout_secs
  5. On divergence (unexpected exit code or output), halt and report
  6. Update procedure's success_count or failure_count after completion
  7. Recalculate confidence score

Integration point for ADF agents

Expose replay as a library function so terraphim_orchestrator agents can call it programmatically:

pub async fn replay_procedure(
    procedure_id: &str,
    dry_run: bool,
    on_step: impl Fn(&ProcedureStep, StepResult) -> ReplayDecision,
) -> Result<ReplayOutcome, ReplayError>;
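
To illustrate how the on_step callback and dry_run flag might interact, here is a simplified synchronous sketch. It is not the proposed API: it takes the steps directly instead of a procedure_id, is not async, returns the outcome without a Result, and accepts a hypothetical run closure so that execution can be mocked. The variants of ReplayDecision and ReplayOutcome and the fields of ProcedureStep and StepResult are assumptions made for this example.

```rust
#[derive(Debug, Clone)]
struct ProcedureStep {
    command: String,
    expected_exit_code: i32,
}

#[derive(Debug, Clone, PartialEq)]
struct StepResult {
    exit_code: i32,
}

/// Hypothetical variants: the caller may stop the replay after any step.
#[derive(Debug, PartialEq)]
enum ReplayDecision {
    Continue,
    Abort,
}

/// Hypothetical outcome shape.
#[derive(Debug, PartialEq)]
enum ReplayOutcome {
    DryRun { steps: usize },
    Completed { steps_run: usize },
    Halted { at_step: usize },
}

/// Replays `steps` in order. `run` executes one step (injectable for tests);
/// `on_step` observes every result and may abort the replay.
fn replay_steps(
    steps: &[ProcedureStep],
    dry_run: bool,
    mut run: impl FnMut(&ProcedureStep) -> StepResult,
    on_step: impl Fn(&ProcedureStep, StepResult) -> ReplayDecision,
) -> ReplayOutcome {
    if dry_run {
        // Dry run: report what would happen without executing anything.
        return ReplayOutcome::DryRun { steps: steps.len() };
    }
    for (index, step) in steps.iter().enumerate() {
        let result = run(step);
        let diverged = result.exit_code != step.expected_exit_code;
        let decision = on_step(step, result);
        if diverged || decision == ReplayDecision::Abort {
            return ReplayOutcome::Halted { at_step: index };
        }
    }
    ReplayOutcome::Completed { steps_run: steps.len() }
}
```

An orchestrator agent could pass a callback that logs each step and returns ReplayDecision::Abort when its own policy says to stop, independently of exit-code divergence.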

Affected Crates

  • terraphim_agent (new replay subcommand + library API)

Dependencies

Acceptance Criteria

  • learn replay matches and displays procedures
  • Dry-run mode shows steps without executing
  • Step-by-step execution with exit code validation
  • Halts on divergence with clear error message
  • Updates confidence score after each replay
  • Library API available for programmatic replay
  • Integration tests with mock commands
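
For the integration tests with mock commands, one possible building block is a runner that enforces a timeout using only the standard library. The sketch below assumes a POSIX sh is available and uses shell one-liners such as "exit 3" as mock commands; run_with_timeout is a hypothetical helper, not part of terraphim_agent.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `command` through `sh -c` and waits up to `timeout`.
/// Returns Ok(Some(code)) if the process exited in time, Ok(None) on timeout.
/// A process terminated by a signal is reported as exit code -1.
fn run_with_timeout(command: &str, timeout: Duration) -> io::Result<Option<i32>> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(command)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait does not block; it returns None while the child is running.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status.code().unwrap_or(-1)));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed process
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

A test can then check that "exit 3" yields Some(3) and that a long "sleep" with a short timeout yields None, without touching any real deployment command.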

Metadata

Labels

enhancement (New feature or request)
