Supervisor agent: one-shot workflows by hanna-paasivirta · Pull Request #363 · OpenFn/apollo

hanna-paasivirta · 2026-01-29T11:53:19Z

Short Description

Adds a global agent — a supervisor-style orchestration layer that sits in front of workflow_chat and job_chat. It accepts a single unified payload from the frontend and intelligently routes requests to the right subagent(s), or escalates to a multi-step planner when the task requires coordination across both.

Current testing focus: One-shot workflow generation from scratch (when the user doesn't have an existing workflow). Other scenarios (editing existing workflows, job-level chat, multi-turn conversations) have not been tested end-to-end and may not work correctly.

To test use

pytest global_agent/tests/test_planner_multistep.py -v -s

Fixes #333 (done without this code being merged)
Fixes #398
Fixes #404

Implementation Details

Architecture

The global agent uses a two-tier dispatch model:

Router (router.py) — A fast, cheap Claude Haiku call classifies the user request into one of three destinations: workflow_agent, job_code_agent, or planner. Uses a constrained JSON generation trick (pre-filled assistant turn '{"destination": "') to force deterministic structured output. On any routing failure, defaults to planner.
Planner (planner.py) — A Claude Sonnet tool-calling loop (up to 20 iterations) that can orchestrate multiple subagent calls in sequence. The planner sees a redacted version of the workflow YAML (job bodies replaced with # [use inspect_job_code to view]) to keep context small, and maintains a live current_yaml state that gets stitched after each subagent call.
Direct routes — For simple requests (e.g., "edit this job's code"), the router bypasses the planner entirely and calls workflow_chat.main() or job_chat.main() directly as in-process Python function calls.

Key Design Decisions

YAML as shared state: The full workflow_yaml string is the single state carrier between turns and between agents. The planner mutates a local copy during its loop, stitching code in after each subagent call, and returns the final state as an attachment.
Stateless subagents: Both call_workflow_agent and call_job_agent always pass history: []. The planner encodes all necessary context in the message field of each tool call, treating subagents as stateless specialists.
Direct Python invocation: All subagent calls are synchronous in-process function calls, not HTTP requests.
Context management: The planner uses Anthropic's context-management beta to prune old tool uses in multi-turn conversations (triggers at 20, keeps 10), preserving search_documentation results.

New Files

File	Purpose
`global_agent/global_agent.py`	Entry point — validates payload, creates router, returns structured envelope
`global_agent/router.py`	Haiku-based routing + direct dispatch to workflow_chat/job_chat
`global_agent/planner.py`	Sonnet tool-calling loop for multi-step tasks
`global_agent/subagent_caller.py`	Thin wrappers around workflow_chat.main() and job_chat.main()
`global_agent/config.yaml`	LLM model configs (Haiku for router, Sonnet for planner)
`global_agent/prompts.yaml`	System prompts for router and planner
`global_agent/tools/tool_definitions.py`	Claude API tool schemas (search_documentation, call_workflow_agent, call_job_code_agent, inspect_job_code)
`global_agent/yaml_utils.py`	YAML parsing, job lookup, code stitching, body redaction
`global_agent/PAYLOAD_SPEC.md`	Public API contract documentation
`search_documentation/search_documentation.py`	Extracted doc search into standalone service (used by planner as a tool)

Changes to Existing Services

streaming_util.py: Simplified streaming utilities
util.py: Added sum_usage() for token aggregation across agents, search_documentation_tool() helper
load_adaptor_docs: Minor adjustments

Tests

Tests in global_agent/tests/ covering:

Planner multi-step scenarios
Planner-to-subagent clarification flows
End-to-end "good morning workflow" generation

Changes needed in Lightning

This is a new service with a new API. Changes are needed in Lightning if we want to experiment with this. See: OpenFn/lightning#4532

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

You can read more details in our Responsible AI Policy

josephjclark · 2026-02-03T10:29:51Z

Some requests!

A simple test that I (or someone else) can run which asks the same question to a) the existing workflow chat and b) the global assistant, and allow me to compare answers
A readout here of the approximate token overhead of of the global assistant. Like if the workflow service today costs 10k tokens per query, how many tokens for the same query in the global assistant? This before any optimisation - I want a baseline number that we can work with

hanna-paasivirta · 2026-02-04T17:05:01Z

At this stage, we've plugged in the workflow agent and can get a similar answer via the supervisor and the original standalone workflow_chat service. For a simple task that only requires one pass through the workflow agent, the token consumption increase in the agentic version is 2x input tokens and 3x output tokens (maybe an underestimate, given the basic prompt of the supervisor). This is our baseline cost without any optimisations.

Add lightweight router to global agent

josephjclark · 2026-03-16T15:26:02Z

TODOs:

Take the print_response_details test util and promote it to a main server util. When I'm debugging apollo responses, I want this to be available to me!
Later later later compile yaml and js code with the CLI to ensure it's valid
The planner does an ok job of taking the project.yaml and passing the right parts to the subagents. But if the planner is bypassed, and we call the workflow or job chat directly, the chat agent will likely perform too badly because it is given the whole workflow.yaml as an input, and not just the stuff it wants. There's no filter. We might want to add a filter. Ie in workflow chat, remove the job code. In job chat, extract the relevant step from the workflow yaml and just send that.

The weaknesses of this implementation right now are:

Anything unrelated to workflow or chat code generation may perform poorly
when going through the global endpoint, any "simple" workflow generation or job code generation is likely to perform poorly
The planner can be very very slow

Basically planning works well for the cases we've tested, but we don't know about other cases, and because of changes to the input, non planning steps won't well.

Ways forward:

Test carefully for a few days to understand the behaviour
Start implementing fixes for anticipated problems
Release without testing and let AI be AI
Limit the capability to strictly only do what workflow chat does today.

josephjclark · 2026-03-16T16:02:24Z

Here's the next steps:

Get a release ready of this global_assistant endpoint (fix signatures and tidy up)
Release!
Start testing and improving here(Including better system logging and user event logging)
Kick off lightning work for a minimal opt-in experimental integration (@hanna-paasivirta to kick off off with Product)

hanna-paasivirta · 2026-03-17T17:52:29Z

Structured outputs are deprecated now, so I reverted back to Sonnet 4.5 as some changes are needed across services.

hanna-paasivirta added 4 commits January 27, 2026 18:42

add basic global agent

37d420b

add subagents as separate tools

e7c35d5

split folders

78c2e98

add workflow chat without history building

60271ff

hanna-paasivirta added 3 commits February 3, 2026 16:11

add token accumulation

133aef9

simplify answer passing

dc9a3ba

Merge remote-tracking branch 'origin/main' into supervisor-agent

19c7512

hanna-paasivirta added 13 commits February 5, 2026 17:39

fix info passing

9da90f0

add caching

afe862e

trim history

297b81a

add router

4f860e3

Merge pull request #383 from OpenFn/lightweight-router

6242930

Add lightweight router to global agent

add tests and fix planner bugs

e08431f

add new payload

ef50f0e

improve planner flow

91daca6

combine yaml and job code

6865e7f

adjust tests

75c0928

give planner access to yaml

fdcd317

add info on yaml status for planner

0c7ee3b

adjust supervisor role

29d5bc7

adjust input payload

dbd089c

hanna-paasivirta changed the title ~~Supervisor agent~~ Supervisor agent: one-shot workflows Mar 16, 2026

hanna-paasivirta added 4 commits March 16, 2026 18:26

restore job_chat

5a0777d

restore workflow chat

b2ed040

add attachments

ad1e338

add concurrent job code calls to planner

64da909

hanna-paasivirta added 2 commits March 17, 2026 14:33

fix tool use amnesia

15d325e

add changeset

da1c1c9

hanna-paasivirta changed the base branch from main to release/next March 17, 2026 17:15

Merge branch 'release/next' into supervisor-agent

98e9196

hanna-paasivirta marked this pull request as ready for review March 17, 2026 17:19

hanna-paasivirta added 2 commits March 17, 2026 17:33

Merge branch 'release/next' into supervisor-agent

b180113

use central model management in global agent

0aa9383

hanna-paasivirta and others added 6 commits March 17, 2026 17:53

use sonnet 4-5

748e182

fix import

059d0ca

update poetry deps

be935c4

update deps and readme

d6a0d6c

global_agent -> global_chat

db5b6d5

versions

11aee55

josephjclark merged commit 8788793 into release/next Mar 19, 2026

josephjclark deleted the supervisor-agent branch March 19, 2026 15:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Supervisor agent: one-shot workflows#363

Supervisor agent: one-shot workflows#363
josephjclark merged 36 commits intorelease/nextfrom
supervisor-agent

hanna-paasivirta commented Jan 29, 2026 •

edited

Loading

Uh oh!

josephjclark commented Feb 3, 2026

Uh oh!

hanna-paasivirta commented Feb 4, 2026

Uh oh!

josephjclark commented Mar 16, 2026 •

edited

Loading

Uh oh!

josephjclark commented Mar 16, 2026 •

edited

Loading

Uh oh!

hanna-paasivirta commented Mar 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

hanna-paasivirta commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Short Description

Implementation Details

Architecture

Key Design Decisions

New Files

Changes to Existing Services

Tests

Changes needed in Lightning

AI Usage

Uh oh!

josephjclark commented Feb 3, 2026

Uh oh!

hanna-paasivirta commented Feb 4, 2026

Uh oh!

josephjclark commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

josephjclark commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hanna-paasivirta commented Mar 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

hanna-paasivirta commented Jan 29, 2026 •

edited

Loading

josephjclark commented Mar 16, 2026 •

edited

Loading

josephjclark commented Mar 16, 2026 •

edited

Loading