fix: repair orphaned toolUser in last message during session restore by konippi · Pull Request #2026 · strands-agents/sdk-python

konippi · 2026-04-01T04:50:13Z

Description

When an agent process is terminated during tool execution (e.g., runtime timeout in multi-agent setups), the session ends with an assistant message containing toolUse but no corresponding toolResult. On the next invocation, _fix_broken_tool_use restores the session but explicitly skips the last message, leaving the conversation in an invalid state that causes a ValidationException from the model provider.

This change removes the last-message exclusion so that orphaned toolUse at the end of the conversation is repaired during session restoration, the same way mid-conversation orphans are already handled.
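To make the intended behavior concrete, here is a minimal sketch of the repair logic described above. It is not the SDK's actual implementation: the helper names (`tool_use_ids`, `tool_result_ids`, `error_results`, `repair_orphaned_tool_use`) and the error text are assumptions made for illustration, and only the message shape (`toolUse` / `toolResult` blocks keyed by `toolUseId`) is taken from the examples in this thread.

```python
from typing import Any

Message = dict[str, Any]


def tool_use_ids(message: Message) -> list[str]:
    """Collect the toolUseId of every toolUse block in a message."""
    return [b["toolUse"]["toolUseId"] for b in message.get("content", []) if "toolUse" in b]


def tool_result_ids(message: Message) -> set[str]:
    """Collect the toolUseId of every toolResult block in a message."""
    return {b["toolResult"]["toolUseId"] for b in message.get("content", []) if "toolResult" in b}


def error_results(ids: list[str]) -> list[dict[str, Any]]:
    """Build error toolResult blocks (hypothetical wording) for the given ids."""
    return [
        {"toolResult": {"toolUseId": i, "status": "error", "content": [{"text": "Tool was interrupted."}]}}
        for i in ids
    ]


def repair_orphaned_tool_use(messages: list[Message]) -> list[Message]:
    """Repair orphaned toolUse blocks in place, including in the last message."""
    index = 0
    while index < len(messages):
        ids = tool_use_ids(messages[index])
        if ids and index + 1 < len(messages):
            # Mid-conversation: the next message should answer every toolUse.
            answered = tool_result_ids(messages[index + 1])
            missing = [i for i in ids if i not in answered]
            if missing and answered:
                messages[index + 1]["content"].extend(error_results(missing))
            elif missing:
                messages.insert(index + 1, {"role": "user", "content": error_results(missing)})
        elif ids:
            # Last message: nothing follows, so append the error results.
            messages.append({"role": "user", "content": error_results(ids)})
        index += 1
    return messages


history = [
    {"role": "user", "content": [{"text": "What is the weather?"}]},
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "t1", "name": "weather", "input": {}}}]},
]
repaired = repair_orphaned_tool_use(history)
```

In this sketch the final `elif ids` branch plays the role of the change in this PR: without it, a trailing `toolUse` would be left unanswered.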

Related Issues

resolves: #2025

related: [BUG] Session management fails to resume when previous session ended during tool execution #859, the original issue that led to _fix_broken_tool_use (the last-message case was deferred to Agent._convert_prompt_to_messages, which doesn't cover all code paths)

Documentation PR

N/A

Type of Change

Bug fix

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

beansandroasters · 2026-04-01T05:53:20Z

Hi, I stumbled upon this PR while investigating the same issue in my own project. Nice fix — the else branch approach makes sense. Just wanted to share a couple of things I noticed while reading through the code, in case they're helpful.

In-place mutation makes the old test vacuous

The original test_fix_broken_tool_use_ignores_last_message asserted:

fixed_messages = session_manager._fix_broken_tool_use(messages)
assert fixed_messages == messages

Since _fix_broken_tool_use mutates and returns the same list object, fixed_messages is messages is always True — meaning fixed_messages == messages passes regardless of whether the method appended anything or not. In other words, the old test would still pass even with this patch applied, so it was never actually guarding against the change.
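The aliasing problem can be shown in isolation. The snippet below is a sketch with a hypothetical stand-in function (`append_in_place`), not the session manager itself; it only assumes, as stated above, that the method mutates its argument and returns the same list object.

```python
import copy


def append_in_place(messages: list[dict]) -> list[dict]:
    """Hypothetical stand-in: mutates its argument and returns the same list."""
    messages.append({"role": "user", "content": [{"text": "appended"}]})
    return messages


messages = [{"role": "user", "content": [{"text": "hi"}]}]
snapshot = copy.deepcopy(messages)  # independent copy taken before the call
fixed = append_in_place(messages)

assert fixed is messages   # same object, so the next comparison cannot fail
assert fixed == messages   # vacuous: a list always equals itself
assert fixed != snapshot   # the snapshot exposes the mutation
assert len(snapshot) == 1 and len(fixed) == 2
```

Comparing against a snapshot taken before the call is what gives the equality check any power to fail.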

Your new assertions (len == 3, checking toolResult fields) do verify the fix correctly, so this isn't a problem in practice. But using copy.deepcopy(messages) before calling the method would make the intent more explicit:

import copy

original = copy.deepcopy(messages)
fixed_messages = session_manager._fix_broken_tool_use(messages)

assert len(original) == 2   # sanity: the snapshot is unaffected by the in-place fix
assert len(fixed_messages) == 3

The `else` branch also fixes an `insert()`-induced edge case — might be worth a test

While reproducing things locally, I found that the else branch implicitly covers another scenario that doesn't have test coverage: when insert() on L297 shifts a later toolUse into the last position mid-iteration.

# Two consecutive orphaned toolUse messages, no trailing message
messages = [
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "A", "name": "t", "input": {}}}]},
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "B", "name": "t", "input": {}}}]},
]

What happens in the loop:

index=0: toolUse(A) → insert(1, toolResult(A)) → list becomes [A, resultA, B] (len=3)
index=1: resultA → no toolUse → continue
index=2: toolUse(B) → index+1=3 < 3 is False → hits the new else branch → fixed ✓

Without the else branch, step 3 silently skips B. A test like this would pin that behavior:

import copy


def test_fix_broken_tool_use_consecutive_orphaned_not_skipped_by_insert(session_manager):
    """Ensure insert() during iteration doesn't cause a later toolUse to be skipped."""
    messages = [
        {"role": "assistant", "content": [
            {"toolUse": {"toolUseId": "A", "name": "tool", "input": {}}}
        ]},
        {"role": "assistant", "content": [
            {"toolUse": {"toolUseId": "B", "name": "tool", "input": {}}}
        ]},
    ]

    fixed = session_manager._fix_broken_tool_use(copy.deepcopy(messages))

    assert len(fixed) == 4  # A, resultA, B, resultB
    assert fixed[1]["content"][0]["toolResult"]["toolUseId"] == "A"
    assert fixed[3]["content"][0]["toolResult"]["toolUseId"] == "B"
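The loop trace above can also be reproduced without the SDK. The following is a simplified simulation, assuming the loop walks the list by index while inserting into it; `repair`, `has_block`, `error_result_for`, and the `repair_last` flag are hypothetical names, with `repair_last` standing in for the presence of the new else branch.

```python
import copy


def has_block(message: dict, key: str) -> bool:
    """True if any content block of the message carries the given key."""
    return any(key in block for block in message["content"])


def error_result_for(message: dict) -> dict:
    """Build a user message with an error toolResult per toolUse (hypothetical shape)."""
    ids = [b["toolUse"]["toolUseId"] for b in message["content"] if "toolUse" in b]
    return {
        "role": "user",
        "content": [
            {"toolResult": {"toolUseId": i, "status": "error", "content": [{"text": "interrupted"}]}}
            for i in ids
        ],
    }


def repair(messages: list[dict], repair_last: bool) -> list[dict]:
    """Simplified loop; repair_last toggles the last-message (else) branch."""
    index = 0
    while index < len(messages):
        message = messages[index]
        if has_block(message, "toolUse"):
            if index + 1 < len(messages):
                if not has_block(messages[index + 1], "toolResult"):
                    messages.insert(index + 1, error_result_for(message))
            elif repair_last:
                messages.append(error_result_for(message))
        index += 1
    return messages


orphans = [
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "A", "name": "t", "input": {}}}]},
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "B", "name": "t", "input": {}}}]},
]

without_else = repair(copy.deepcopy(orphans), repair_last=False)
with_else = repair(copy.deepcopy(orphans), repair_last=True)

assert len(without_else) == 3  # A, resultA, B: B stays orphaned
assert len(with_else) == 4     # A, resultA, B, resultB
```

Running both variants side by side shows that the skipped `B` is a consequence of the insert shifting it into the last position, not of anything specific to `B` itself.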

Anyway, neither of these is a blocker. Just sharing what I found. Thanks for the fix!

beansandroasters · 2026-04-01T05:58:20Z

Following up on my comment above — I went ahead and opened an issue and a PR for the test improvements:

Issue: Test: _fix_broken_tool_use tests don't guard against in-place mutation; missing coverage for insert-induced edge case #2028 — documents the two test quality issues (deepcopy + insert-induced edge case)
PR: test: improve _fix_broken_tool_use test coverage #2029 — adds the test fixes; the two new tests intentionally fail on current main and will pass once this PR (fix: repair orphaned toolUser in last message during session restore #2026) is merged

Happy to adjust if anything doesn't fit your project conventions.

konippi · 2026-04-01T08:42:57Z


Thanks for the thorough review and for catching both the deepcopy issue and the insert-induced edge case — really appreciate it!

github-actions · 2026-04-01T14:37:33Z

Assessment: Approve

Good fix for a real bug. The change correctly extends _fix_broken_tool_use to handle orphaned toolUse in the last message by appending a toolResult with error status. The test update accurately reflects the new behavior.

Minor Suggestion

The docstring for _fix_broken_tool_use (lines 256-262) documents non-existent parameters agent_id and removed_message_count. Since you're updating this docstring, consider removing these stale entries.

Thanks for the clear PR description and test coverage! 🎉

Unshure · 2026-04-07T20:25:56Z

Thanks for raising this pull request!

I'm still a bit confused how this issue happens in the first place. We intentionally don't repair the latest message in the session manager, and instead rely on the agent invocation to fix this: https://github.com/strands-agents/sdk-python/blob/main/src/strands/agent/agent.py#L986-L1001

Basically, during agent invocation, we check whether the latest message is a tool use, and if so, we put a toolResult message after it so that the conversation is valid. Then we append the user message, so we have a fully valid conversation.

The reason we do this is that if you resume a session whose tool use was interrupted, you can pass None into your agent and that tool will automatically execute, saving an LLM invocation. (This pull request goes into more detail: #1123)

By automatically adding a toolResult after an orphaned tool use in the session manager, you break this behavior. But ideally you shouldn't be getting this error in the first place. Can you help me understand how you get this error? Maybe a small reproducible example?
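The invocation-time repair described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code at the linked lines of agent.py: `messages_for_prompt` and the placeholder result text are hypothetical, and the `None` prompt path (where the pending tool is executed instead) is deliberately not modeled.

```python
from typing import Any

Message = dict[str, Any]


def messages_for_prompt(history: list[Message], prompt: str) -> list[Message]:
    """Return the messages to append for a text prompt (hypothetical helper)."""
    new_messages: list[Message] = []
    if history:
        # If the latest message holds toolUse blocks, answer them first
        # so the conversation stays valid for the model provider.
        pending = [
            b["toolUse"]["toolUseId"] for b in history[-1].get("content", []) if "toolUse" in b
        ]
        if pending:
            new_messages.append(
                {
                    "role": "user",
                    "content": [
                        {
                            "toolResult": {
                                "toolUseId": tool_use_id,
                                "status": "error",
                                "content": [{"text": "Tool was interrupted."}],
                            }
                        }
                        for tool_use_id in pending
                    ],
                }
            )
    # Then append the user's prompt as its own message.
    new_messages.append({"role": "user", "content": [{"text": prompt}]})
    return new_messages


interrupted = [
    {"role": "user", "content": [{"text": "What is the weather?"}]},
    {"role": "assistant", "content": [{"toolUse": {"toolUseId": "t1", "name": "weather", "input": {}}}]},
]
to_append = messages_for_prompt(interrupted, "Please continue.")
```

Because this repair runs only on the invocation path, any code path that sends the restored history to the model without going through it would still see the orphaned toolUse, which is the gap the PR description points at.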

fix: repair orphaned toolUser in last message during session restore

ab85a6c


This was referenced Apr 1, 2026

Test: _fix_broken_tool_use tests don't guard against in-place mutation; missing coverage for insert-induced edge case #2028

Open

test: improve _fix_broken_tool_use test coverage #2029

Open


docs: remove stale params from _fix_broken_tool_use docstring

34cce61


This was referenced Apr 7, 2026

fix: repair orphaned toolUse in the last session message during restore #2073

Closed

[BUG] _fix_broken_tool_use does not repair orphaned toolUse in the last message, causing session corruption on process termination #2025

Open

Unshure self-assigned this Apr 7, 2026

konippi wants to merge 2 commits into strands-agents:main from konippi:fix-orphaned-tooluse-last-message
