Skip to content

fix: improve todo tool reliability by reminding LLM of incomplete items#2048

Open
trungutt wants to merge 1 commit intodocker:mainfrom
trungutt:trungutt/improve-todo-completion-reliability
Open

fix: improve todo tool reliability by reminding LLM of incomplete items#2048
trungutt wants to merge 1 commit intodocker:mainfrom
trungutt:trungutt/improve-todo-completion-reliability

Conversation

@trungutt
Copy link
Contributor

@trungutt trungutt commented Mar 10, 2026

When using the todo toolset, agents would create a set of todos (e.g. 5 items), do the actual work, but then only call update_todos for a subset of them (e.g. 3 out of 5). The remaining items stayed "pending" permanently, making the todo sidebar look like the agent didn't finish its work — even though the underlying tasks were actually completed.

This is an LLM attention problem, not a backend logic bug. LLMs frequently create todos but fail to mark all of them as completed, leaving the sidebar in a partially-done state. This happens because the original "complete your todos" instruction is far back in the system prompt by the time the LLM finishes working — it's outside the LLM's effective attention window.

LLMs frequently create todos but fail to mark all of them as completed,
leaving the todo sidebar in a partially-done state. This happens because
the instruction to complete todos is far back in the system prompt by the
time the LLM finishes its work.

Add an incomplete-todo reminder to update_todos and list_todos tool
output so the LLM sees unfinished items directly in its immediate
context. Also strengthen the system instructions to emphasize that every
todo must be completed before responding.
@trungutt trungutt marked this pull request as ready for review March 10, 2026 18:16
@trungutt trungutt requested a review from a team as a code owner March 10, 2026 18:16
Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Assessment: 🟢 APPROVE

This PR successfully addresses the LLM attention problem by adding reminder strings that alert the agent about incomplete todos. The implementation is sound with proper control flow:

✅ Clear separation between completed/pending/in-progress states
✅ Correct conditional logic (reminders only shown when needed)
✅ Storage clearing happens at the right time (after all completed)
✅ Test updated to reflect new output format

The changes effectively solve the problem of LLMs forgetting to mark todos as completed by bringing incomplete items back into their attention window.

Minor optimization opportunities (informational only, not blocking):

  • Two small inefficiencies noted below could be addressed in a future cleanup

Posted via docker-agent PR review

// incompleteReminder returns a reminder string listing any non-completed todos,
// or an empty string if all are completed (or storage is empty).
func (h *todoHandler) incompleteReminder() string {
all := h.storage.All()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Minor inefficiency: storage.All() called twice

When incompleteReminder() is called, allCompleted() has already called h.storage.All() to check completion status. This results in iterating the storage twice. Consider refactoring to pass the slice or cache the result.

This doesn't affect correctness, just performance (minimal impact in practice given todo lists are typically small).

if todo.Status == "completed" {
completed++
}
total++
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Redundant counter: total is always len(all)

The total variable is incremented for every item but will always equal len(all). You can simplify this:

var completed int
for _, todo := range all {
    if todo.Status == "completed" {
        completed++
    }
}
fmt.Fprintf(&output, "Current todos (%d/%d completed):\n", completed, len(all))

This is a minor code clarity improvement with no functional impact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant