feat(dashboard): surface sandbox stdout/stderr in task timeline #107

@Abernaughty

Summary

Repurposed from "terminal agent blades" — the per-agent SSE-filtered terminal tabs are redundant with the existing task timeline and agent sidebar. What's actually missing is sandbox output visibility: when the sandbox runs a script, the dashboard shows "Sandbox: 1 tests passed" but never displays what the script actually printed.

This was identified during trace analysis of the triforce task where QA failed the output for missing leading spaces — a problem that would have been immediately obvious if sandbox stdout were visible in the timeline.

Current State

  • SandboxResult.output_summary captures stdout (+ stderr on failure), already secret-scanned and truncated at 2000 chars
  • SandboxResult.errors contains extracted error messages (tracebacks, SyntaxError, etc.)
  • runner.py handles sandbox_validate but does not include output_summary in the SSE payload — only pass/fail counts
  • The data already exists end-to-end in GraphState; it just isn't rendered

Implementation

Backend (runner.py) — Tiny

In the sandbox_validate SSE event handler, include output content:

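# Forward sandbox output to the dashboard alongside the pass/fail counts.
sandbox_result = output.get("sandbox_result")
if sandbox_result:
    # output_summary is already secret-scanned and truncated at 2000 chars
    event_data["output_summary"] = sandbox_result.output_summary
    event_data["errors"] = sandbox_result.errors
    event_data["exit_code"] = sandbox_result.exit_code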

Dashboard (Svelte) — Small-Medium

  1. Collapsible <pre> block beneath sandbox timeline events when output_summary is non-empty
  2. Auto-expand on failure (exit_code !== 0) — user needs to see what went wrong without extra clicks
  3. Monospace rendering: DM Mono, dark background (#0a0c10), matching terminal panel styling
  4. Truncation — max 50 lines with "Show all (N lines)" toggle
  5. Error display: errors list rendered above stdout in red-tinted block (#ef4444 bg) when present
  6. Copy button — top-right of the output block
  7. UI-side redaction — regex pass matching secret patterns before rendering (Layer 4 of defense model, belt-and-suspenders with server-side scanning)
  8. Empty state — if output_summary is empty/whitespace, render nothing (no empty collapsible)
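The display rules above (items 2, 4, 7 and 8) are mostly pure logic and could be kept out of the Svelte component. The following is a minimal sketch, not the project's actual code: every function name is hypothetical, and the secret patterns are placeholder assumptions that would need to mirror whatever the server-side scanner matches.

```typescript
// Hypothetical helpers for the sandbox output block; all names are illustrative.

const MAX_VISIBLE_LINES = 50;

// Assumed secret shapes only. The real list should mirror the server-side scanner.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9]{20,}/g,          // API-key-like tokens
  /AKIA[0-9A-Z]{16}/g,             // AWS access key IDs
  /(password|token|secret)=\S+/gi, // key=value credentials
];

// Item 7: UI-side redaction pass, applied before the text reaches the <pre>.
function redactSecrets(text: string): string {
  return SECRET_PATTERNS.reduce((acc, re) => acc.replace(re, "[REDACTED]"), text);
}

interface OutputView {
  visible: string;    // text to render
  totalLines: number; // N in "Show all (N lines)"
  truncated: boolean; // whether the toggle should be shown
}

// Item 4: cap the rendered output at `max` lines unless the user asked for all.
function truncateOutput(text: string, showAll = false, max = MAX_VISIBLE_LINES): OutputView {
  // Ignore a single trailing newline so it does not count as an extra line.
  const lines = text.replace(/\n$/, "").split("\n");
  const truncated = !showAll && lines.length > max;
  return {
    visible: truncated ? lines.slice(0, max).join("\n") : text,
    totalLines: lines.length,
    truncated,
  };
}

// Item 8: render nothing when output_summary is empty or whitespace-only.
function hasOutput(summary: string | null | undefined): boolean {
  return !!summary && summary.trim().length > 0;
}

// Item 2: auto-expand only on a known non-zero exit code.
function shouldAutoExpand(exitCode: number | null | undefined): boolean {
  return exitCode != null && exitCode !== 0;
}
```

Keeping these as plain functions means the truncation and redaction rules can be unit-tested without mounting a component; the component would only call them and bind the results.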

Files to Modify

| File | Change | Effort |
|---|---|---|
| dev-suite/src/api/runner.py | Add output_summary, errors, exit_code to sandbox SSE event | Tiny |
| dashboard/src/lib/stores/tasks.svelte.ts | Store sandbox output fields from SSE | Tiny |
| dashboard/src/lib/components/ (timeline event component) | Collapsible output block, auto-expand, copy button | Small-Medium |

Test Plan

  • Unit: Mock SSE event with output_summary → verify store update
  • Manual: Run triforce task → confirm ASCII art visible in collapsed block under sandbox event
  • Manual: Run a task that crashes → confirm stderr/traceback auto-expands in red
  • Manual: Run a task with >50 lines output → confirm truncation + "Show all"
  • Edge: Empty stdout script → confirm no empty collapsible renders
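For the unit case in the list above, one possible shape is a pure function that maps the SSE payload to store state, exercised with a mock event. This is a sketch under assumptions: the payload field names follow the three keys added in runner.py, while `SandboxOutputState` and `applySandboxEvent` are hypothetical names, not the actual contents of tasks.svelte.ts.

```typescript
// Assumed payload shape, based on the three fields added to the SSE event.
interface SandboxEventData {
  output_summary?: string | null;
  errors?: string[] | null;
  exit_code?: number | null;
}

// Hypothetical slice of per-task state held by the dashboard store.
interface SandboxOutputState {
  outputSummary: string;
  errors: string[];
  exitCode: number | null;
}

// Normalizes missing fields so the timeline component never sees undefined.
function applySandboxEvent(data: SandboxEventData): SandboxOutputState {
  return {
    outputSummary: data.output_summary ?? "",
    errors: data.errors ?? [],
    exitCode: data.exit_code ?? null,
  };
}

// Mock event, as a unit test might construct it (leading spaces preserved).
const mockEvent: SandboxEventData = {
  output_summary: "  ▲\n ▲ ▲",
  errors: [],
  exit_code: 0,
};
const nextState = applySandboxEvent(mockEvent);
```

A test would then assert that `nextState.outputSummary` keeps its leading spaces untouched, since whitespace loss is exactly the failure the triforce trace exposed, and that an event with no sandbox fields yields empty defaults.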

Not in Scope

Effort

Small-Medium (mostly dashboard rendering, trivial backend addition)

Dependencies

None
