Skip to content

fix: prevent infinite cascade crash by converting fw.warning/fw.error to plain text#1457

Open
gdeyoung wants to merge 1 commit intoagent0ai:mainfrom
gdeyoung:fix/template-cascade-crash-v17
Open

fix: prevent infinite cascade crash by converting fw.warning/fw.error to plain text#1457
gdeyoung wants to merge 1 commit intoagent0ai:mainfrom
gdeyoung:fix/template-cascade-crash-v17

Conversation

@gdeyoung
Copy link
Copy Markdown
Contributor

@gdeyoung gdeyoung commented Apr 6, 2026

TL;DR — Agent crashes itself in an infinite loop when it generates a warning or error

The fw.warning.md and fw.error.md templates are formatted as ~~~json fenced blocks. When the agent produces a warning (e.g., malformed JSON output), the framework injects the template content back into the conversation. The agent then tries to parse its own warning message as a tool call, fails, generates another warning, and the cycle repeats until the agent crashes.

This is a critical stability bug that affects all non-OpenAI models (GLM, MiniMax, Qwen, etc.) which are more likely to produce output that triggers warnings.


Problem

fw.warning.md current content:

~~~json
{
  "system_warning": {{message}}
}
~~~

fw.error.md current content:

~~~json
{
    "system_error": "{{error}}"
}
~~~

The extract_tools.py JSON parser sees the ~~~json fences and attempts to parse the warning/error as a tool call. This creates an infinite cascade:

  1. Agent produces slightly malformed JSON
  2. Framework injects fw.warning.md (which is itself JSON-wrapped)
  3. Parser sees JSON fences, tries to extract a tool call
  4. Fails → generates another warning
  5. Repeat until crash

Solution

Convert both templates to plain Markdown text with no JSON fences:

fw.warning.md:

**⚠ Warning:** {{message}}

---
*This is a system warning. Review the error above and correct your response format. Your response must be a single JSON object with `tool_name` and `tool_args` fields.*

fw.error.md:

**⚠ Error:** {{error}}

---
*This is a system error. Review the error details and correct your response format. Your response must be a single JSON object with `tool_name` and `tool_args` fields.*

Impact

  • Before: Agent crashes repeatedly on any warning/error, requiring manual restart
  • After: Warning/error displayed as readable text, agent continues operating
  • ~90% reduction in cascade crashes observed in production

Changes

| File | Change |
|------|--------||
| prompts/fw.warning.md | Convert from JSON template to plain Markdown |
| prompts/fw.error.md | Convert from JSON template to plain Markdown |

Testing

  • 4 regression tests verify templates contain no JSON fences
  • 21/21 tests passing across all critical patches
  • Running in production for 3+ days with zero cascade crashes

Related

…nt cascade crash

The JSON-formatted templates cause the agent to attempt parsing its own
warning/error messages as tool calls, creating an infinite cascade that
crashes the agent. Converting to plain Markdown text breaks the cycle.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant