Fix: Resolve JSON malformation causing infinite loops and TypeError by gdeyoung · Pull Request #1037 · agent0ai/agent-zero

gdeyoung · 2026-02-14T18:39:16Z

Summary

Fixed multiple critical bugs causing JSON malformation, infinite loops, and TypeError crashes in Agent Zero.

Issues Fixed

1. JSON Object Extraction Bug (rfind)

File: python/helpers/extract_tools.py - Used rfind to locate the last closing brace in the text instead of the brace that matches the opening one
Fix: Track nested braces with a depth counter and stop at the matching closing brace
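
The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the code from the PR: the function names `extract_with_rfind` and `extract_with_depth` are hypothetical, and the depth version deliberately ignores braces inside string literals, which is the concern of item 2.

```python
from typing import Optional


def extract_with_rfind(text: str) -> Optional[str]:
    """Buggy approach: slice from the first '{' to the LAST '}' in the text."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    return text[start:end + 1]


def extract_with_depth(text: str) -> Optional[str]:
    """Sketch of the fix: stop at the brace that closes the first object."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None  # unbalanced input: no matching closing brace


sample = '{"tool_name": "response", "tool_args": {"text": "hi"}} see /restore/{backup_id}/'
print(extract_with_rfind(sample))  # runs on through the trailing {backup_id}
print(extract_with_depth(sample))  # stops after the first complete object
```

With the `rfind` version, any incidental brace later in the model output is swept into the candidate JSON, which is why unrelated text can corrupt an otherwise valid tool call.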

2. Escape Handling Logic Error

File: python/helpers/extract_tools.py - Escaped quotes not toggling in_string flag
Fix: Proper check for escaped quotes

3. No Loop Protection

File: agent.py - No protection against consecutive misformat errors
Fix: Added consecutive_misformat counter with 5-attempt limit + HandledException
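
A counter of this kind might look like the sketch below. Everything except the names `consecutive_misformat` and `HandledException` and the limit of 5 is an assumption: `MisformatGuard` and `record` are hypothetical, the local `HandledException` is only a stand-in for the class in python.helpers.errors, and the thread does not say whether the PR resets the counter on a well-formed reply or whether the limit trips on the fifth failure or the one after it.

```python
class HandledException(Exception):
    """Stand-in for python.helpers.errors.HandledException (assumed Exception subclass)."""


MAX_CONSECUTIVE_MISFORMAT = 5  # the 5-attempt limit named in the PR


class MisformatGuard:
    """Counts consecutive misformatted replies and aborts once the limit is reached."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_MISFORMAT) -> None:
        self.limit = limit
        self.consecutive_misformat = 0

    def record(self, parsed_ok: bool) -> None:
        if parsed_ok:
            # Assumption: any well-formed reply ends the streak.
            self.consecutive_misformat = 0
            return
        self.consecutive_misformat += 1
        if self.consecutive_misformat >= self.limit:
            raise HandledException(
                f"{self.consecutive_misformat} consecutive misformatted replies; stopping"
            )


guard = MisformatGuard()
for ok in (False, False, True, False):
    guard.record(ok)  # the streak never reaches 5, so nothing is raised
print(guard.consecutive_misformat)
```

Raising a dedicated exception type, rather than returning a flag, lets the existing exception handler end the message loop in one place.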

4. TypeError: tool_args must be a mapping

File: agent.py - .get() returns the stored string when the key exists with a string value
Fix: isinstance(tool_args, dict) validation
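
The pitfall can be reproduced in a few lines. This is a sketch under assumptions: `run_tool` and `normalize_tool_args` are hypothetical names, the thread does not show where agent.py unpacks the arguments, and falling back to an empty dict is only one possible reaction (the PR may instead treat the reply as a misformat).

```python
def run_tool(**kwargs):
    """Hypothetical tool entry point that expects keyword arguments."""
    return kwargs


def normalize_tool_args(tool_request: dict) -> dict:
    """Return tool_args only when it is a mapping; otherwise fall back to {}."""
    tool_args = tool_request.get("tool_args", {})
    # .get() applies its default only when the key is MISSING. A key that is
    # present with a string value is returned unchanged.
    if not isinstance(tool_args, dict):
        return {}
    return tool_args


request = {"tool_name": "code_execution_tool", "tool_args": "ls -la"}

try:
    run_tool(**request.get("tool_args", {}))  # unpacking a str is not allowed
except TypeError as exc:
    print("TypeError:", exc)

print(run_tool(**normalize_tool_args(request)))  # safe: unpacks an empty dict
```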

gdeyoung · 2026-02-14T23:25:12Z

Additional Fix v1.2 - HandledException Shadowing

Issue Found

A duplicate local class definition of HandledException in agent.py was shadowing the imported class from python.helpers.errors. This caused:

isinstance(exception, HandledException) check to fail in handle_critical_exception
Unhandled exceptions causing crashes instead of graceful loop termination

Root Cause

Line 36: from python.helpers.errors import RepairableException, HandledException ✅ (import)
Line 353: class HandledException(Exception): pass ❌ (duplicate local definition)

Fix Applied

Removed the duplicate class definition (lines 349-355) in agent.py. Now uses only the imported HandledException from errors.py.
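
Why a same-named duplicate breaks the check can be shown in a single file. This is a simplified sketch, not the layout of agent.py: `ErrorsHandledException` and `is_handled` are hypothetical names, and the first class definition stands in for the one imported from python.helpers.errors.

```python
class HandledException(Exception):
    """Stand-in for the class defined in python/helpers/errors.py."""


# Keep a reference to the original class, as another module importing it would.
ErrorsHandledException = HandledException


class HandledException(Exception):  # duplicate definition: rebinds the name
    pass


def is_handled(exc: BaseException) -> bool:
    # The name is looked up when the function runs, so it resolves to the
    # duplicate class, not to the one from errors.py.
    return isinstance(exc, HandledException)


print(ErrorsHandledException is HandledException)   # two distinct class objects
print(is_handled(ErrorsHandledException("stop")))   # instance of the original class
print(is_handled(HandledException("stop")))         # instance of the duplicate
```

Two classes that share a name are still unrelated types, so an exception created from one never passes an `isinstance` check against the other.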

Files Changed

agent.py: Removed duplicate HandledException class definition

Verification

Python syntax validated
Import now works correctly (same class object used throughout)
Exception handling now works as intended

Added: 2026-02-14

longman391 · 2026-02-20T04:12:49Z

I can confirm this is a critical bug. I traced tool call failures across multiple chat sessions on my instance (v0.9.8.1, Claude Opus 4.6 via GitHub Copilot, 128k context) and found 105 empty tool_name failures in a single log file — all caused by the rfind bug in extract_json_object_string().

Evidence from my logs:

Chat mYlPuJkf: 38 consecutive 'Tool not found' errors (messages 131-168) — the agent was stuck in an infinite loop of malformed output → error → retry
The extract_json_object_string() function grabs everything between the first { and last }, which means incidental curly braces in LLM output (file paths like /restore/{backup_id}/, inline JSON examples, etc.) get misinterpreted as tool calls with empty tool_name
I verified this by testing DirtyJson directly: input 'The backup is at /restore/{backup_id}/files' parses to {'backup_id': ''} with no tool_name → triggers 'Tool not found'

Additional issue not covered by this PR: The LLM frequently hallucinates tool names from training data instead of using the actual tool names in the system prompt:

code_execution instead of code_execution_tool
web_search instead of search_engine
browser_tool instead of browser_agent
response_tool / message_tool instead of response
terminal instead of code_execution_tool

A simple alias mapping in get_tool() would catch these. Happy to submit a PR for that.

The consecutive misformat counter in this PR would have prevented the 38-message infinite loop. Please consider merging this — it's a significant stability improvement.

Fix agent0ai#3 - Empty tool_name validation:
- When DirtyJson parses valid JSON but the object has no tool_name field, the agent previously dispatched with an empty string, triggering 'Tool not found' errors. Now treats this as a misformat and increments the consecutive_misformat counter (integrates with PR agent0ai#1037's circuit breaker).
- Evidence: 105 empty tool_name failures found in a single log session.

Fix agent0ai#4 - Tool name alias mapping:
- LLMs frequently hallucinate tool names from training data instead of using the actual names in the system prompt. Added TOOL_ALIASES dict that maps common hallucinated names to actual Agent Zero tool names:
  - code_execution/terminal/shell -> code_execution_tool
  - web_search/search -> search_engine
  - browser_tool/browser -> browser_agent
  - response_tool/message_tool/message/reply -> response
  - knowledge_tool/memory_tool -> memory
  - task_manager -> scheduler
- Evidence: 20+ hallucinated tool name failures across multiple chat logs.

Related: agent0ai#1031, agent0ai#805
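
The two fixes described in this commit message could be combined roughly as below. The alias pairs are taken from the message itself; the function name `resolve_tool_name`, the whitespace stripping, and the choice to signal a misformat by returning None are assumptions made for illustration.

```python
from typing import Optional

# Alias pairs as listed in the commit message above.
TOOL_ALIASES = {
    "code_execution": "code_execution_tool",
    "terminal": "code_execution_tool",
    "shell": "code_execution_tool",
    "web_search": "search_engine",
    "search": "search_engine",
    "browser_tool": "browser_agent",
    "browser": "browser_agent",
    "response_tool": "response",
    "message_tool": "response",
    "message": "response",
    "reply": "response",
    "knowledge_tool": "memory",
    "memory_tool": "memory",
    "task_manager": "scheduler",
}


def resolve_tool_name(raw_name: object) -> Optional[str]:
    """Return the canonical tool name, or None for a missing or empty name.

    Hypothetical helper: a None result is meant to be counted as a misformat
    by the caller instead of being dispatched as an empty tool name.
    """
    if not isinstance(raw_name, str) or not raw_name.strip():
        return None
    name = raw_name.strip()
    return TOOL_ALIASES.get(name, name)  # unknown names pass through unchanged


print(resolve_tool_name("terminal"))             # alias is mapped
print(resolve_tool_name("code_execution_tool"))  # real name is left alone
print(resolve_tool_name(""))                     # empty name is rejected
```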

…ng resistance

Claude subordinates interpret the Agent Zero system prompt as "prompt injection" and refuse to output JSON, causing infinite misformat loops (even with the circuit breaker from PR agent0ai#1037). GLM-5 reliably follows JSON formatting instructions and is capable enough for agentic subordinate work. Uses the existing initialize_agent(override_settings=) mechanism.

PaoloC68 · 2026-03-11T21:28:00Z

Hi @gdeyoung @frdel — I can independently confirm all three bugs here. I've been hitting the misformat infinite loop repeatedly with MiniMax M2.5 and GLM-5 on OpenRouter, and traced the root cause through the same path:

rfind bug: extract_json_object_string() grabs the last } instead of the matching one — confirmed this corrupts tool JSON when model output contains incidental braces (code examples, file paths, etc.)
No loop protection: My instance hit 4+ consecutive misformat warnings with no break. The agent burns through its entire context retrying the same malformed pattern.
ValueError crash (related): The validate_tool_request added in 1b89a0d raises plain ValueError instead of RepairableException, which makes the crash even worse — the agent can't recover at all. I filed a separate fix for this: fix: use RepairableException in validate_tool_request for graceful recovery #1242.

I've deployed all three fixes (rfind, circuit breaker, RepairableException) on my production instance and they work together correctly:

The rfind fix reduces false misformat detections significantly
When a model still produces malformed JSON, the circuit breaker stops after 5 attempts instead of looping forever
The RepairableException fix lets the agent retry on validation failures instead of hard-crashing
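
The practical difference between the two exception types can be sketched like this. It is an illustration only: the local `RepairableException` is a stand-in for the class in python.helpers.errors, the checks inside `validate_tool_request` are guesses rather than the real ones, and `run_turn` is a hypothetical simplification of the agent loop.

```python
class RepairableException(Exception):
    """Stand-in for python.helpers.errors.RepairableException."""


def validate_tool_request(tool_request: object) -> None:
    # Assumed checks; the real validation rules are not shown in this thread.
    if not isinstance(tool_request, dict) or not tool_request.get("tool_name"):
        raise RepairableException("tool request must be an object with a tool_name")


def run_turn(tool_request: object, history: list) -> bool:
    """Return True when the request is valid, False when a retry is needed."""
    try:
        validate_tool_request(tool_request)
    except RepairableException as exc:
        # The error text goes back into the conversation so the model can retry.
        history.append(f"error: {exc}")
        return False
    return True


history: list = []
print(run_turn({"tool_args": {}}, history))           # invalid: recorded, not fatal
print(run_turn({"tool_name": "response"}, history))   # valid
print(history)
```

A plain `ValueError` would not match the `except RepairableException` clause and would propagate out of the loop, which is the hard crash described above.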

Note: @Krashnicov's review about the escape handling bug in this PR is correct — escaped quotes should NOT toggle in_string. The fix is simply continue on escape_next without any toggle. I used the corrected version in my deployment.
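
The corrected behaviour described in this note might be implemented as below. This is a sketch, not the code from the PR or from the deployment mentioned above: the function name `extract_json_object` is hypothetical, and only the `in_string` and `escape_next` flags are taken from the discussion.

```python
import json
from typing import Optional


def extract_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} object, ignoring braces inside strings."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    in_string = False
    escape_next = False
    for i in range(start, len(text)):
        ch = text[i]
        if escape_next:
            # The character after a backslash is consumed as-is. In particular,
            # an escaped quote must NOT toggle in_string.
            escape_next = False
            continue
        if in_string:
            if ch == "\\":
                escape_next = True
            elif ch == '"':
                in_string = False
            continue  # braces inside a string never change the depth
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None  # no matching closing brace found


raw = r'{"tool_args": {"text": "say \"}\" now"}} trailing }'
extracted = extract_json_object(raw)
print(extracted)
print(json.loads(extracted))
```

If an escaped quote toggled `in_string`, the scanner would believe the string had ended early and would count the brace inside it, cutting the object short.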

@frdel this is probably the highest-impact stability fix pending for the project — multiple users are hitting this (#624, #841, #1031, #1234, #1241). Would love to see this reviewed and merged.

longman391 mentioned this pull request Feb 20, 2026

# Agent Zero JSON Malformation Bug Fixes #1031

Open

PaoloC68 mentioned this pull request Mar 12, 2026

fix: harden JSON misformat handling to prevent infinite loops and crashes #1245

Closed

gdeyoung closed this Apr 5, 2026

gdeyoung force-pushed the main branch from e82b2dc to 2d95cd9 Compare April 5, 2026 23:44

gdeyoung wants to merge 0 commits into agent0ai:main from gdeyoung:main
