feat(#125): retry loop fix — targeted retry prompts, QA leniency, fix hints#126
feat(#125): retry loop fix — targeted retry prompts, QA leniency, fix hints#126Abernaughty merged 7 commits intomainfrom
Conversation
… hints ## Changes ### FailureReport schema (`src/agents/qa.py`) - Added `FixComplexity` enum (trivial/moderate/complex) - Added `fix_complexity` and `exact_fix_hint` fields to FailureReport - Case-insensitive validator for `fix_complexity` (matches `failure_type` pattern) - Full backward compatibility — old LLM output without new fields still works ### Developer retry prompt (`src/orchestrator.py`) - New `_build_retry_file_context()`: reads current files from disk on retry - Only reads `failure_report.failed_files` (not all targets) for context budget - Respects workspace_root boundary (path traversal protection) - Configurable char limit via MAX_RETRY_FILE_CHARS env var (default 30k) - New `_build_retry_sandbox_context()`: formats sandbox stdout for retry prompt - Retry-specific system prompt: "apply MINIMUM change, do NOT rewrite" - Includes exact_fix_hint and fix_complexity when provided by QA - First attempt and architectural failures still use standard prompt ### QA leniency (`src/orchestrator.py`) - When `blueprint.acceptance_criteria` is empty, QA system prompt gets leniency guidance: prioritize functional correctness, don't invent strict formatting requirements, bias toward PASS if sandbox succeeded - QA prompt now requests fix_complexity and exact_fix_hint in JSON schema ### Tests (`tests/test_retry_loop.py`) - FailureReport new fields: trivial/moderate/complex, case insensitive, backward compat, JSON round-trip - FixComplexity enum values and validation - _build_retry_file_context: reads files, respects budget, blocks traversal, uses failed_files over targets, handles missing files - _build_retry_sandbox_context: formats output, handles None - Developer retry prompt: file context injection, sandbox output injection, standard prompt on first attempt, standard prompt on architectural failure - QA leniency: triggers on empty criteria, skipped with criteria Closes #125
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a FixComplexity enum and two optional fields to FailureReport; implements retry-context builders to inject current on-disk file contents and sandbox results into developer retry prompts; updates qa_node prompt/schema and developer_node retry handling; and adds tests covering models, file/sandbox context builders, and retry prompts. Changes
Sequence Diagram(s)sequenceDiagram
participant QA as QA Node
participant Orch as Orchestrator
participant Dev as Developer Node
participant Sand as Sandbox
Note over Dev,Sand: Initial attempt
Dev->>Sand: Run code in sandbox
Sand-->>Orch: execution results (exit code, output, tests)
Orch->>QA: Provide code + sandbox results for QA
QA-->>Orch: failure_report (may include fix_complexity, exact_fix_hint)
Note over Orch,Dev: Retry flow
Orch->>Orch: detect retry (non-architectural failure_report)
Orch->>Orch: _build_retry_file_context() — read current files on disk
Orch->>Orch: _build_retry_sandbox_context() — format sandbox results
Orch->>Dev: send retry prompt with RETRY CONTEXT (failure_report, CURRENT FILES ON DISK, SANDBOX EXECUTION RESULTS, exact_fix_hint)
Dev->>Sand: apply targeted fix and re-run
Sand-->>Orch: new execution results
Orch->>QA: re-evaluate
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
dev-suite/tests/test_retry_loop.py (2)
505-534: QA leniency tests verify the trigger condition but not the actual prompt behavior.The tests check
not bp.acceptance_criteriabut don't verify that the leniency text actually appears in the QA system prompt. Consider adding an integration-style test that mocks the QA LLM and inspects the actual prompt content, similar toTestDeveloperRetryPrompt.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_retry_loop.py` around lines 505 - 534, Add an integration-style assertion that verifies the leniency text is actually injected into the QA system prompt by mocking the QA LLM and capturing the prompt content; update or add a test (e.g., extend test_qa_prompt_includes_leniency_for_empty_criteria or create a new test modeled after TestDeveloperRetryPrompt) to instantiate Blueprint with empty acceptance_criteria, run the qa_node/prompt-construction flow that builds the system prompt, and assert the expected leniency string appears in the captured prompt rather than only checking not bp.acceptance_criteria.
238-248: Path traversal test assertion could be more precise.The current assertion
assert "etc/passwd" not in result or "not found" in resultconflates two different behaviors: path traversal blocking and file-not-found. The "not found" message is for missing files, not blocked paths.Consider a more direct assertion that verifies the path traversal is actually blocked (the function logs a warning and skips the file, resulting in no entry for that path).
💡 Suggested improvement
def test_blocks_path_traversal(self, tmp_path): """Path traversal attempts should be blocked.""" fr = self._make_failure_report( failed_files=["../../etc/passwd"] ) bp = self._make_blueprint() result = _build_retry_file_context(fr, bp, tmp_path) - # Should not contain the file content (blocked) - assert "etc/passwd" not in result or "not found" in result + # Path traversal should be blocked: file is skipped entirely + # (no entry added, not even "not found" which is for missing files within workspace) + assert "etc/passwd" not in result🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_retry_loop.py` around lines 238 - 248, The test currently mixes file-not-found and path-traversal behaviors; update test_blocks_path_traversal to assert that the traversal path is absent from the built context instead of allowing a "not found" fallback: call _build_retry_file_context(fr, bp, tmp_path) and assert that the exact traversal string (e.g. "../../etc/passwd") and its basename ("etc/passwd") are not present in result, reflecting that _build_retry_file_context skips/logs and does not include blocked paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dev-suite/src/orchestrator.py`:
- Around line 363-372: The string-prefix path traversal check is unsafe; replace
the vulnerable startswith check with Path.is_relative_to to ensure full_path is
actually inside workspace_root. In the try block that computes full_path =
(workspace_root / file_path).resolve(), change the condition to use
full_path.is_relative_to(workspace_root) (negated to trigger the warning and
continue) and keep the existing except (ValueError, OSError) handling for
resolve errors; reference the variables full_path, workspace_root, and file_path
when making this change.
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 10-25: Remove the two unused stdlib imports (textwrap and
pathlib.Path) from the top import block and reorder the imports into proper
groups (stdlib, third-party, local) and alphabetically within each group; keep
only the actually used imports from unittest.mock (MagicMock, patch), pytest,
and the local symbols used in the test (Blueprint, FailureReport, FailureType,
FixComplexity, GraphState, WorkflowStatus, _build_retry_file_context,
_build_retry_sandbox_context, developer_node, SandboxResult) so linting no
longer flags unused or unsorted imports in the test file.
---
Nitpick comments:
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 505-534: Add an integration-style assertion that verifies the
leniency text is actually injected into the QA system prompt by mocking the QA
LLM and capturing the prompt content; update or add a test (e.g., extend
test_qa_prompt_includes_leniency_for_empty_criteria or create a new test modeled
after TestDeveloperRetryPrompt) to instantiate Blueprint with empty
acceptance_criteria, run the qa_node/prompt-construction flow that builds the
system prompt, and assert the expected leniency string appears in the captured
prompt rather than only checking not bp.acceptance_criteria.
- Around line 238-248: The test currently mixes file-not-found and
path-traversal behaviors; update test_blocks_path_traversal to assert that the
traversal path is absent from the built context instead of allowing a "not
found" fallback: call _build_retry_file_context(fr, bp, tmp_path) and assert
that the exact traversal string (e.g. "../../etc/passwd") and its basename
("etc/passwd") are not present in result, reflecting that
_build_retry_file_context skips/logs and does not include blocked paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1e73dbbf-d988-4551-99aa-f954d4ec025d
📒 Files selected for processing (3)
dev-suite/src/agents/qa.pydev-suite/src/orchestrator.pydev-suite/tests/test_retry_loop.py
- Removed unused `textwrap` import - Removed unused `Path` import - Fixed assertion in test_handles_no_output_summary to check "Output:" not "Output"
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
dev-suite/tests/test_retry_loop.py (1)
10-23:⚠️ Potential issue | 🟡 MinorFix import ordering to satisfy Ruff I001.
CI is still failing on this import block formatting/order.
Proposed import ordering
from unittest.mock import MagicMock, patch import pytest from src.agents.architect import Blueprint from src.agents.qa import FailureReport, FailureType, FixComplexity from src.orchestrator import ( - GraphState, - WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context, + GraphState, + WorkflowStatus, developer_node, ) from src.sandbox.e2b_runner import SandboxResult🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_retry_loop.py` around lines 10 - 23, Reorder the import block in test_retry_loop.py to satisfy Ruff I001 by grouping and alphabetizing: place stdlib imports first (from unittest.mock import MagicMock, patch), then third-party imports (import pytest), then local package imports sorted alphabetically — e.g., from src.agents.architect import Blueprint; from src.agents.qa import FailureReport, FailureType, FixComplexity; from src.orchestrator import GraphState, WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context, developer_node; and finally from src.sandbox.e2b_runner import SandboxResult; ensure each group is separated by a blank line and names within each from-import are alphabetized.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 504-534: Replace the current truthiness-only assertions with calls
into qa_node to capture the actual system message sent to the QA LLM: mock the
QA LLM client used by qa_node (or dependency like qa_llm.call/send) so you can
invoke qa_node with the Blueprint instances and inspect the constructed system
prompt; in test_qa_prompt_includes_leniency_for_empty_criteria assert that the
system message produced by qa_node contains the leniency block/text when
Blueprint.acceptance_criteria is empty, and in
test_qa_prompt_no_leniency_with_criteria assert that the same system message
does NOT contain the leniency text when Blueprint.acceptance_criteria is
non-empty—use the qa_node entry point and the Blueprint class names to locate
the code to exercise and the mocked LLM to capture the system prompt.
- Around line 158-177: The helper functions _make_blueprint and
_make_failure_report incorrectly use the "or" fallback (e.g., target_files or
["app.py"]) which coerces explicit empty lists to the default; change the
signatures to accept None as the default (e.g., target_files=None,
failed_files=None) and inside each function use an explicit None check to decide
whether to use the default list (use target_files if target_files is not None
else ["app.py"], and similarly for failed_files) so that passing [] remains a
valid, preserved value.
- Around line 236-247: The test's assertion is too weak — it only checks for the
string "etc/passwd" which may not appear even if traversal succeeded; update
test_blocks_path_traversal to assert that the literal traversal input
("../../etc/passwd") is not present in result and also assert that common passwd
content markers (e.g., "root:" or other colon-separated user entries) are not
present; reference the helper _build_retry_file_context and failure report
created by _make_failure_report so the test fails if file contents from
/etc/passwd are injected rather than merely checking for the path substring.
---
Duplicate comments:
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 10-23: Reorder the import block in test_retry_loop.py to satisfy
Ruff I001 by grouping and alphabetizing: place stdlib imports first (from
unittest.mock import MagicMock, patch), then third-party imports (import
pytest), then local package imports sorted alphabetically — e.g., from
src.agents.architect import Blueprint; from src.agents.qa import FailureReport,
FailureType, FixComplexity; from src.orchestrator import GraphState,
WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context,
developer_node; and finally from src.sandbox.e2b_runner import SandboxResult;
ensure each group is separated by a blank line and names within each from-import
are alphabetized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ac267907-49ad-425d-bae0-3303897585d8
📒 Files selected for processing (1)
dev-suite/tests/test_retry_loop.py
| def test_blocks_path_traversal(self, tmp_path): | ||
| """Path traversal attempts should be blocked.""" | ||
| fr = self._make_failure_report( | ||
| failed_files=["../../etc/passwd"] | ||
| ) | ||
| bp = self._make_blueprint() | ||
|
|
||
| result = _build_retry_file_context(fr, bp, tmp_path) | ||
|
|
||
| # Should not contain the file content (blocked) | ||
| assert "etc/passwd" not in result or "not found" in result | ||
|
|
There was a problem hiding this comment.
Path-traversal assertion is too weak and can false-pass.
Line 246 checks for "etc/passwd" text, but leaked file contents typically won’t contain that path string. The test can pass even if traversal succeeds.
Stronger traversal test pattern
def test_blocks_path_traversal(self, tmp_path):
"""Path traversal attempts should be blocked."""
- fr = self._make_failure_report(
- failed_files=["../../etc/passwd"]
- )
+ secret = tmp_path.parent / "secret.txt"
+ secret.write_text("TOP_SECRET")
+ fr = self._make_failure_report(failed_files=["../secret.txt"])
bp = self._make_blueprint()
result = _build_retry_file_context(fr, bp, tmp_path)
- # Should not contain the file content (blocked)
- assert "etc/passwd" not in result or "not found" in result
+ # Must not leak outside-workspace content
+ assert "TOP_SECRET" not in result
+ assert result == ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dev-suite/tests/test_retry_loop.py` around lines 236 - 247, The test's
assertion is too weak — it only checks for the string "etc/passwd" which may not
appear even if traversal succeeded; update test_blocks_path_traversal to assert
that the literal traversal input ("../../etc/passwd") is not present in result
and also assert that common passwd content markers (e.g., "root:" or other
colon-separated user entries) are not present; reference the helper
_build_retry_file_context and failure report created by _make_failure_report so
the test fails if file contents from /etc/passwd are injected rather than merely
checking for the path substring.
…leniency tests Fixes: - Ruff I001: reorder imports alphabetically (_build_* before GraphState) - Path traversal: use Path.is_relative_to() instead of str.startswith() - QA leniency tests: mock QA LLM and verify actual prompt content - Helper methods: use `is not None` instead of `or` for empty list args - Path traversal test: assert path is completely absent from result - Added qa_node import for leniency tests - Used f-string for retry context format string
…n groups Ruff isort sorts uppercase names before underscore-prefixed names: GraphState, WorkflowStatus before _build_retry_file_context. Also removes blank line between `import pytest` and `from src.*` (ruff treats both as third-party group).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 10-24: The import block in tests/test_retry_loop.py is
misformatted (I001) because the multi-line import from src.orchestrator is not
sorted/grouped correctly; reorder and format the imports so they follow standard
grouping (stdlib, third-party, local) and alphabetize the names in the
multi-line import list (ensure GraphState, WorkflowStatus,
_build_retry_file_context, _build_retry_sandbox_context, developer_node, qa_node
are sorted), or simply run ruff check --fix tests/test_retry_loop.py to auto-fix
the import ordering and formatting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 30e7a5fe-4451-4bf5-97af-98c90cfbd8dd
📒 Files selected for processing (2)
dev-suite/src/orchestrator.pydev-suite/tests/test_retry_loop.py
| from unittest.mock import MagicMock, patch | ||
|
|
||
| import pytest | ||
|
|
||
| from src.agents.architect import Blueprint | ||
| from src.agents.qa import FailureReport, FailureType, FixComplexity | ||
| from src.orchestrator import ( | ||
| _build_retry_file_context, | ||
| _build_retry_sandbox_context, | ||
| GraphState, | ||
| WorkflowStatus, | ||
| developer_node, | ||
| qa_node, | ||
| ) | ||
| from src.sandbox.e2b_runner import SandboxResult |
There was a problem hiding this comment.
Fix import block formatting to resolve CI failure.
The pipeline reports I001: import block is un-sorted or un-formatted. The imports from src.orchestrator are likely causing the issue due to their ordering within the multi-line import.
🧹 Proposed fix
from unittest.mock import MagicMock, patch
import pytest
from src.agents.architect import Blueprint
from src.agents.qa import FailureReport, FailureType, FixComplexity
from src.orchestrator import (
+ GraphState,
+ WorkflowStatus,
_build_retry_file_context,
_build_retry_sandbox_context,
- GraphState,
- WorkflowStatus,
developer_node,
qa_node,
)
from src.sandbox.e2b_runner import SandboxResultRun ruff check --fix tests/test_retry_loop.py to auto-format.
🧰 Tools
🪛 GitHub Actions: CI
[error] 10-24: Ruff check failed: import block is un-sorted or un-formatted. (I001) Help: Organize imports. Found 1 error; 1 fixable with --fix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dev-suite/tests/test_retry_loop.py` around lines 10 - 24, The import block in
tests/test_retry_loop.py is misformatted (I001) because the multi-line import
from src.orchestrator is not sorted/grouped correctly; reorder and format the
imports so they follow standard grouping (stdlib, third-party, local) and
alphabetize the names in the multi-line import list (ensure GraphState,
WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context,
developer_node, qa_node are sorted), or simply run ruff check --fix
tests/test_retry_loop.py to auto-fix the import ordering and formatting.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
dev-suite/tests/test_retry_loop.py (2)
10-23:⚠️ Potential issue | 🟡 MinorFix import grouping/order to clear Ruff I001.
Line 10 through Line 23 still trips Ruff
I001in CI. Please auto-format this import block (group separation + sorting) and commit the result.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_retry_loop.py` around lines 10 - 23, Reformat the import block in test_retry_loop.py to satisfy Ruff I001: group standard library imports (from unittest.mock import MagicMock, patch) first, then third-party (import pytest), then local project imports (from src.agents.architect import Blueprint; from src.agents.qa import FailureReport, FailureType, FixComplexity; from src.orchestrator import GraphState, WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context, developer_node, qa_node; from src.sandbox.e2b_runner import SandboxResult), and sort names within each group alphabetically; ensure a blank line separates each group and commit the change.
236-247:⚠️ Potential issue | 🟠 MajorStrengthen the path-traversal test to avoid false passes.
Line 246 currently checks only for
"etc/passwd"text, which can pass even if out-of-workspace content is injected. Assert on leaked content (and optionally exact empty result) instead.Suggested test hardening
def test_blocks_path_traversal(self, tmp_path): """Path traversal attempts should be blocked.""" - fr = self._make_failure_report( - failed_files=["../../etc/passwd"] - ) + secret = tmp_path.parent / "secret.txt" + secret.write_text("TOP_SECRET") + fr = self._make_failure_report(failed_files=["../secret.txt"]) bp = self._make_blueprint() result = _build_retry_file_context(fr, bp, tmp_path) - # Path traversal is blocked: file is skipped entirely - assert "etc/passwd" not in result + # Must not leak out-of-workspace file content + assert "TOP_SECRET" not in result + assert result == ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_retry_loop.py` around lines 236 - 247, The test test_blocks_path_traversal currently only asserts "etc/passwd" substring absence which can miss leaked content; update the assertion to ensure no real passwd content can be returned by checking that _build_retry_file_context(fr, bp, tmp_path) yields an empty result (assert result == {} or the appropriate empty type) and also assert that known passwd markers like "root:" and any absolute path patterns (e.g., "/etc/passwd") are not present in result; locate the test function test_blocks_path_traversal and replace the final loose substring check with these stronger assertions referring to the _build_retry_file_context call and the fr/bp/tmp_path setup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@dev-suite/tests/test_retry_loop.py`:
- Around line 10-23: Reformat the import block in test_retry_loop.py to satisfy
Ruff I001: group standard library imports (from unittest.mock import MagicMock,
patch) first, then third-party (import pytest), then local project imports (from
src.agents.architect import Blueprint; from src.agents.qa import FailureReport,
FailureType, FixComplexity; from src.orchestrator import GraphState,
WorkflowStatus, _build_retry_file_context, _build_retry_sandbox_context,
developer_node, qa_node; from src.sandbox.e2b_runner import SandboxResult), and
sort names within each group alphabetically; ensure a blank line separates each
group and commit the change.
- Around line 236-247: The test test_blocks_path_traversal currently only
asserts "etc/passwd" substring absence which can miss leaked content; update the
assertion to ensure no real passwd content can be returned by checking that
_build_retry_file_context(fr, bp, tmp_path) yields an empty result (assert
result == {} or the appropriate empty type) and also assert that known passwd
markers like "root:" and any absolute path patterns (e.g., "/etc/passwd") are
not present in result; locate the test function test_blocks_path_traversal and
replace the final loose substring check with these stronger assertions referring
to the _build_retry_file_context call and the fr/bp/tmp_path setup.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7cb2f6c1-290b-4d95-84dc-8dd03c8af947
📒 Files selected for processing (1)
dev-suite/tests/test_retry_loop.py
CI runs from dev-suite/ where src/ is a local package, so ruff detects src.* as first-party. This means: - src.sandbox sorts before src.orchestrator (single-line before multi-line) - Blank line before multi-line from src.orchestrator import (...) Verified by running ruff --fix with src/ directory present.
test_failure_report_included_in_retry was asserting "PREVIOUS ATTEMPT FAILED" which was the old retry format. The new #125 retry prompt uses "RETRY CONTEXT" and "QA FAILURE REPORT" instead. The architect escalation test still uses "PREVIOUS ATTEMPT FAILED" correctly — that path was not changed by #125.
Summary
Fixes the retry loop's inability to solve trivially fixable QA failures (#125). Root cause: the Lead Dev was regenerating code from scratch on every retry because it never saw its previous output, sandbox results, or a targeted "fix, don't rewrite" instruction.
Changes
FailureReport schema (
src/agents/qa.py)FixComplexityenum (trivial/moderate/complex)fix_complexityandexact_fix_hintfields to FailureReportfix_complexity(matches existingfailure_typepattern)Developer retry prompt (
src/orchestrator.py)_build_retry_file_context(): reads current files from disk on retryfailure_report.failed_files(not all targets) for context budgetworkspace_rootboundary (path traversal protection)MAX_RETRY_FILE_CHARSenv var (default 30k)_build_retry_sandbox_context(): formats sandbox stdout/stderr for retry promptexact_fix_hintandfix_complexitywhen provided by QAQA leniency (
src/orchestrator.py)blueprint.acceptance_criteriais empty, QA system prompt receives leniency guidance:fix_complexityandexact_fix_hintin its JSON schemaTests (
tests/test_retry_loop.py) — 30 new tests_build_retry_file_context: reads files, respects budget, blocks path traversal, uses failed_files over targets, handles missing files_build_retry_sandbox_context: formats output, handles NoneTesting
Closes
Closes #125
Summary by CodeRabbit
New Features
Tests