feat(ce-work): add Codex delegation mode by mvanhorn · Pull Request #328 · EveryInc/compound-engineering-plugin

mvanhorn · 2026-03-20T22:23:00Z

Summary

Adds an optional "Codex Delegation" section to ce:work, parallel to the existing Swarm Mode section. When the user says "use codex for this work", implementation tasks are delegated to codex exec while planning, review, and git operations stay in the current agent.

Why

Users on Max20 plans burn significant context on code implementation. Codex Pro plans include free compute that can handle the writing work. This keeps Claude focused on planning and review.

Changes

One file changed: plugins/compound-engineering/skills/ce-work/SKILL.md (+53 lines)

The new section follows the same structure as Swarm Mode:

- Decision table (when to use vs standard mode)
- Conversational trigger ("use codex for this work")
- 7-step workflow: availability check, credential extraction, prompt assembly, delegation via stdin pipe, diff review, commit, per-task error fallback

Technical details from a battle-tested implementation (open-source-contributor runs this pattern in production):

- command -v codex for health check (graceful fallback if not installed)
- echo "$PROMPT" | codex exec ... - to pipe long prompts via stdin (avoids shell quoting issues)
- model_reasoning_effort config key (verified from ~/.codex/config.toml)
- GH token extraction needed because sandbox blocks keychain access
- Per-task fallback (one Codex failure doesn't disable it for remaining tasks)
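For readers who want to see the health check and per-task fallback as logic rather than prose, here is a minimal Python sketch. It is not the SKILL.md text: `run_tasks`, `DelegateError`, and the `delegate` / `run_locally` callables are hypothetical names introduced for illustration, and `shutil.which` stands in for `command -v codex`.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional, Tuple


class DelegateError(Exception):
    """Raised by the (hypothetical) delegate callable when a Codex run fails."""


def run_tasks(
    tasks: Iterable[str],
    delegate: Callable[[str], str],
    run_locally: Callable[[str], str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Tuple[str, str]]:
    """Run each task via the delegate when possible, otherwise locally."""
    # Equivalent of `command -v codex`: if the CLI is missing, never delegate.
    codex_installed = which("codex") is not None
    results: Dict[str, Tuple[str, str]] = {}
    for task in tasks:
        if codex_installed:
            try:
                results[task] = ("codex", delegate(task))
                continue
            except DelegateError:
                # Per-task fallback: this task runs locally, but the next
                # task still gets a chance to delegate.
                pass
        results[task] = ("local", run_locally(task))
    return results
```

The fallback is scoped to the failing task only, which matches the "one Codex failure doesn't disable it for remaining tasks" behavior described above.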

Relation to PR #223

PR #223 (lfg-ext) replaces the entire work pipeline with external delegates in parallel worktrees. This PR adds a mode within the existing ce:work flow - same relationship as Swarm Mode has to standard mode. The two are complementary.

Testing

- bun run release:validate passes (29 agents, 45 skills, 1 MCP server - no count changes)
- Section is gated behind conversational trigger - standard ce:work is untouched
- Verified codex exec invocation pattern works in production via open-source-contributor commit 588cf99

This contribution was developed with AI assistance (Claude Code).

Add an optional "Codex Delegation" section to ce:work, parallel to the existing Swarm Mode section. When triggered by saying "use codex for this work", ce:work delegates implementation tasks to codex exec while keeping planning, review, and git operations in the current agent. Uses stdin piping for long prompts, per-task fallback on failure, and graceful degradation when Codex CLI is not installed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5b9194975

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

plugins/compound-engineering/skills/ce-work/SKILL.md

- Remove GH token injection from prompt (not needed for code writing, avoids credential exposure in model input) - Replace shell chaining (cd && echo | codex) with discrete steps using a temp file, compliant with AGENTS.md shell rules

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a70a23fd7b

plugins/compound-engineering/skills/ce-work/SKILL.md

tmchow · 2026-03-20T22:54:32Z

@mvanhorn love this. Is it possible to refactor into a work-beta skill for time being while we test it?

mvanhorn · 2026-03-20T22:55:39Z

@tmchow funny enough I had it in beta, but to simplify my PR I took it out. let me get it in there

tmchow · 2026-03-20T23:00:02Z

Another interesting thing I wonder is how we might add a config for this so it would use Claude vs codex by default for this phase. But it would have to be careful which coding env you're in and respect the flag. Just thinking out loud.

… and enforcement to ce:work - ce:plan-beta Phase 1.1b: detect Codex delegation intent from user signals - ce:plan: add Execution Posture step for Codex delegation detection - ce:work: recognize plan-level Execution target signals as delegation trigger - ce:work: add anti-rationalization enforcement to Codex Delegation Workflow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ction - Use mktemp for unique prompt files (prevents cross-task races) - Pass prompt as argument instead of stdin redirection (AGENTS.md compliance) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71bce5cc15

plugins/compound-engineering/skills/ce-work/SKILL.md

mvanhorn · 2026-03-20T23:48:20Z

> Another interesting thing I wonder is how we might add a config for this so it would use Claude vs codex by default for this phase. But it would have to be careful which coding env you're in and respect the flag. Just thinking out loud.

interesting! I've only used CE in CC.

tmchow · 2026-03-20T23:51:42Z

> interesting! I've only used CE in CC.

Even just in CC, it might make sense to have a setting that you configure with /setup so that Codex is used by default if you want. Otherwise it's on demand. Either way, it's probably fine as is, and we can add that later. Let me look at the skill work you've done.

mvanhorn · 2026-03-21T13:44:31Z

Sounds good - I'll keep it on-demand for now and we can add a /setup config option in a follow-up. Let me know what you think after looking at the skill.

tmchow · 2026-03-22T00:34:39Z

@mvanhorn some thoughts after spending time reviewing and thinking about this.

Environment Detection

The skill runs across harnesses (Claude Code, Codex, Gemini CLI, etc.). If someone runs ce:work from inside Codex, this will attempt to shell out to codex exec from within Codex's sandbox — likely failing silently or recursing. We need a guard that detects "am I already running inside Codex?" and falls back to standard mode with a clear message. Check for something like a CODEX_SANDBOX env var or equivalent before attempting delegation.
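A guard along these lines could look like the following Python sketch. The variable name `CODEX_SANDBOX` is taken from the suggestion above and is an assumption, not a verified Codex behavior; `inside_codex` and `choose_mode` are hypothetical helpers.

```python
import os
from typing import Mapping, Tuple

# Assumption: this variable marks a Codex sandbox. The review only suggests
# "CODEX_SANDBOX ... or equivalent", so treat the name as a placeholder.
SANDBOX_MARKERS = ("CODEX_SANDBOX",)


def inside_codex(env: Mapping[str, str]) -> bool:
    """Return True when any assumed sandbox marker is set and non-empty."""
    return any(env.get(name) for name in SANDBOX_MARKERS)


def choose_mode(
    delegation_requested: bool,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (mode, message); the message explains a fallback, if any."""
    if not delegation_requested:
        return ("standard", "")
    if inside_codex(env):
        # Avoid shelling out to codex exec from within Codex itself.
        return ("standard", "Already running inside Codex; using standard mode.")
    return ("delegate", "")
```

Passing the environment in as a mapping keeps the guard easy to exercise without touching the real process environment.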

Strategy Integration

The Codex delegation section is bolted on after Swarm Mode as a standalone workflow, but ce:work already has a strategy selection step (Phase 1 Step 4) with three modes: inline, serial subagents, parallel subagents. The delegation section doesn't explain how it interacts with these.

Key questions the skill needs to answer:

- If using subagent dispatch, does each subagent shell out to Codex for tagged tasks? (That's Claude Code → subagent → codex exec → GPT, three layers deep.)
- If using inline, does the Codex workflow replace the Phase 2 task loop for tagged tasks?
- The 7-step Codex workflow has its own review + commit steps that duplicate Phase 2's incremental commit logic. Two parallel "how to execute a task" code paths will drift.

Suggestion: wire delegation into the existing strategy selection as a task-level modifier rather than a separate section. The strategy (inline/serial/parallel) still applies, but the implementation step within each task delegates to Codex instead of executing directly.
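To make the "task-level modifier" idea concrete, here is a rough Python sketch, assuming each task carries a hypothetical `delegated` tag. The `Task` class and the three callables are illustrative names only; nothing here reflects the actual skill text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    name: str
    delegated: bool = False  # hypothetical tag, e.g. derived from the plan


def execute_task(
    task: Task,
    implement_directly: Callable[[Task], None],
    implement_via_delegate: Callable[[Task], None],
    review_and_commit: Callable[[Task], None],
) -> None:
    """Run one task; only the implementation step depends on the tag."""
    implement = implement_via_delegate if task.delegated else implement_directly
    implement(task)
    # Review and incremental commit stay on one shared path for every task,
    # so there is no second "how to execute a task" workflow to drift.
    review_and_commit(task)
```

Whichever strategy is selected (inline, serial, or parallel) would call the same `execute_task` for each task, so the strategy choice and the delegation choice stay independent.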

Plan Abstraction

The ce:plan and ce:plan-beta changes bake Execution target: codex directly into the planning abstraction. Plans should be tool-agnostic: today it's Codex, tomorrow it could be a different delegate. I wonder if we could use something like Execution target: external-delegate and resolve the specific tool at execution time based on config? This also sets up the /setup config option we discussed earlier.
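One way to picture execution-time resolution is the Python sketch below. The `DELEGATE_COMMANDS` registry and the `external_delegate` config key are hypothetical, and the Codex command omits whatever flags the real invocation uses.

```python
from typing import List, Mapping, Optional

# Hypothetical registry of delegate tools. Only the Codex entry reflects the
# stdin pattern described in this PR, with its flags left out.
DELEGATE_COMMANDS = {"codex": ["codex", "exec", "-"]}


def resolve_delegate(
    execution_target: str,
    config: Mapping[str, str],
) -> Optional[List[str]]:
    """Map a tool-agnostic plan marker to a concrete command, or None."""
    if execution_target != "external-delegate":
        return None
    # Hypothetical config key; defaults to Codex when nothing is configured.
    tool = config.get("external_delegate", "codex")
    command = DELEGATE_COMMANDS.get(tool)
    # Copy the list so callers cannot mutate the registry by accident.
    return list(command) if command is not None else None
```

With this shape, the plan only ever says `external-delegate`, and swapping the tool is a config change rather than a plan change.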

Hardcoded Model

You use -m "gpt-5.3-codex", but that will go stale. Should this either come from config (compound-engineering.local.md) or, as I think is better, be omitted entirely so that Codex's default model is used?

Mixed Attribution

Phase 4 requires attribution like Generated with [MODEL] via [HARNESS]. When half the tasks are written by Codex and half by Claude, what should the attribution say? Needs guidance for mixed-model work... or should we just omit it?

Error Threshold

"One failure does not disable Codex for remaining tasks" is fine for transient errors, but not for systemic issues (API down, auth expired, rate-limited). Consider a consecutive-failure threshold (e.g., 3 failures in a row → disable delegation for remaining tasks)?
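A consecutive-failure threshold of this kind is a small circuit breaker. The sketch below is one possible shape in Python; the class name `DelegationBreaker` is hypothetical, and the default of 3 simply mirrors the example in the comment above.

```python
class DelegationBreaker:
    """Disable delegation after N failures in a row (default 3)."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0
        self.disabled = False

    def record(self, success: bool) -> None:
        """Record the outcome of one delegated task."""
        if success:
            # A success resets the streak, so isolated transient errors
            # never add up to a trip.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            # Systemic problem suspected: stay disabled for remaining tasks.
            self.disabled = True

    def allow(self) -> bool:
        """Return True while delegation may still be attempted."""
        return not self.disabled
```

Usage would be to check `allow()` before each delegated task and call `record(...)` afterwards. In this sketch the breaker does not re-enable itself within a run.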

Beta Skill

Per our earlier discussion, refactor this into a ce:work-beta skill with disable-model-invocation: true so we can test it safely before promoting.

Already flagged by automated review (still open)

- $(cat $CODEX_PROMPT_FILE) expands into argv, not stdin. ARG_MAX risk on larger prompts. Pipe via stdin as described in the step text.
- Shell chaining/redirection in examples violates AGENTS.md rules (no &&, pipes, >/dev/null 2>&1 in skill shell examples).
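The argv-versus-stdin distinction can be shown without any shell syntax. The Python sketch below reads the prompt file and passes its contents on the child's stdin, so the prompt never appears in the argument list. The default command follows the `codex exec ... -` pattern from the PR description with the elided flags left out, and `run_delegate_with_prompt_file` is a hypothetical helper name.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence, Union


def run_delegate_with_prompt_file(
    prompt_file: Union[str, Path],
    command: Sequence[str] = ("codex", "exec", "-"),
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Send the prompt file's contents to the delegate on stdin."""
    prompt = Path(prompt_file).read_text(encoding="utf-8")
    # The prompt goes through `input=`, not into the argument list, so its
    # size is not constrained by the operating system's argv limits.
    return run(
        list(command),
        input=prompt,
        text=True,
        capture_output=True,
        check=False,
    )
```

Because no shell is involved, there is also no chaining, piping, or redirection syntax to reconcile with the AGENTS.md rules.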

…, env guard, error threshold - Rename "Codex Delegation" to "External Delegate Mode" for tool-agnostic plans - Replace `Execution target: codex` with `Execution target: external-delegate` in plans - Add environment guard to detect sandbox recursion (CODEX_SANDBOX, CODEX_SESSION_ID) - Wire delegation into existing strategy selection as task-level modifier - Remove hardcoded model flag - use delegate's default - Add consecutive failure threshold (3 failures disables delegation) - Add mixed-model attribution guidance for Phase 4 - Pipe prompt via stdin instead of argv expansion (ARG_MAX safety) - Remove shell chaining/redirection from examples (AGENTS.md compliance) - Add ce:work-beta skill wrapper with disable-model-invocation for safe testing - Remove credential embedding from prompt (pass auth via env vars) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 289d14658b

plugins/compound-engineering/skills/ce-work-beta/SKILL.md

mvanhorn · 2026-03-22T02:11:35Z

Took a stab at this - I like your thoughts. Heads up: the description for Compound Engineering on GH says it is a Claude Code skill.

Addressed all points in 289d146:

Environment Detection - Added sandbox guard that checks CODEX_SANDBOX, CODEX_SESSION_ID, and .git/ write access before attempting delegation. Falls back to standard mode with a message.

Strategy Integration - Rewrote the section to frame delegation as a task-level modifier on the existing strategy (inline/serial/parallel), not a parallel workflow. The strategy still applies - delegation just changes how individual tagged tasks execute.

Plan Abstraction - Replaced Execution target: codex with Execution target: external-delegate across ce:plan, ce:plan-beta, and ce:work. The specific tool resolves at execution time.

Hardcoded Model - Removed -m "gpt-5.3-codex" flag. Delegates use their own default model.

Mixed Attribution - Added guidance for single-delegate, single-standard, and mixed-model scenarios in Phase 4.

Error Threshold - Added 3-consecutive-failure circuit breaker that disables delegation for remaining tasks.

Beta Skill - Created ce:work-beta with disable-model-invocation: true.

Shell compliance + stdin - Removed all shell chaining/redirection from examples. Prompt goes via stdin, not argv. Credentials via env vars, not prompt text.

tmchow · 2026-03-22T03:03:00Z

@codex provide a final review, focused on the latest changes. also check if the repo and various READMEs need to be updated.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 289d14658b

chatgpt-codex-connector · 2026-03-22T03:07:37Z

plugins/compound-engineering/skills/ce-work-beta/SKILL.md

+This is the beta variant of `ce:work` that includes the External Delegate Mode feature for testing. Once stable, the delegation section will be promoted to `ce:work` and this beta skill will be removed.
+
+**All behavior is identical to `ce:work`** - this skill simply re-includes the full `ce:work` content with the External Delegate Mode section active.
+
+To use: run `/ce:work-beta [plan file]` instead of `/ce:work [plan file]`.


Replace the ce:work-beta stub with the full beta workflow

All of our loaders/targets only register and copy each skill’s own skills/<name>/SKILL.md (src/parsers/claude.ts, src/targets/codex.ts), so invoking /ce:work-beta will load only this 20-line wrapper. Because this file does not actually contain or link in the ce:work instructions in a supported way, anyone testing the beta command will miss the real execution workflow and the new delegation steps entirely.

@mvanhorn in order to make the beta skill work, you have to duplicate the entire skill which incorporates your changes. You can't just do what you did. Also by having an entire encapsulated skill, it makes it easier to promote the skill to stable.

chatgpt-codex-connector · 2026-03-22T03:07:37Z

plugins/compound-engineering/skills/ce-work/SKILL.md

+External delegation activates when any of these conditions are met:
+- The user says "use codex for this work", "delegate to codex", or "delegate mode"
+- A plan implementation unit contains `Execution target: external-delegate` in its Execution note (set by ce:plan-beta or ce:plan)


Keep delegate triggers out of the stable ce:work path

The beta rollout framework in docs/solutions/skill-design/beta-skills-framework.md only works if the stable skills stay unchanged while users opt into the -beta variant, but these trigger phrases now enable delegation from normal /ce:work sessions, and stable ce:plan also stamps Execution target: external-delegate into plans (plugins/compound-engineering/skills/ce-plan/SKILL.md:154-156). That means a regular /ce:plan -> /ce:work flow can hit the experimental delegate path without ever choosing /ce:work-beta, which defeats the safe side-by-side trial and can surprise users who merely mention Codex or token conservation.

@mvanhorn address this one too. see docs/solutions/skill-design/beta-skills-framework.md if you want to see how i've structured how beta skills work.

fix(ce-work): address Codex review feedback

a70a23f

mvanhorn and others added 2 commits March 20, 2026 16:15

fix(ce-work): unique temp file per Codex task and remove shell redire…

71bce5c
