
perf: implement bounded conversation history and caching optimizations#1463

Open
Letteriello wants to merge 7 commits into eigent-ai:main from Letteriello:perf-caching

Conversation

@Letteriello

Summary

  • US-006: Add bounded conversation history with sliding window (100 messages)
  • US-007: Add toolkit caching for tool results and MCP connections
  • US-008/US-009: Add prompt caching with platform info and formatted prompts

Test plan

  • Verify conversation trimming works in frontend
  • Verify backend auto-trims before overflow
  • Test toolkit caching reduces initialization time
  • Verify prompt formatting uses cached values

🤖 Generated with Claude Code

Claude and others added 7 commits March 6, 2026 12:52
- US-006: Add bounded conversation history with sliding window (100 messages)
  - Frontend: trim messages in chatStore.addMessages()
  - Backend: add TaskLock.trim_conversation_history() method
  - Auto-trim before context overflow errors
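The US-006 sliding window can be sketched as follows. This is illustrative, not the PR's exact code: only `trim_conversation_history()` is named in the commit message; `add_entry` and the attribute names are assumptions.

```python
MAX_ENTRIES = 100  # sliding-window size from US-006

class TaskLock:
    """Minimal sketch of bounded conversation history."""

    def __init__(self, max_entries: int = MAX_ENTRIES):
        self.max_entries = max_entries
        self.conversation_history: list[dict] = []

    def add_entry(self, entry: dict) -> None:
        # Trim eagerly on every append so the history never exceeds the window.
        self.conversation_history.append(entry)
        self.trim_conversation_history()

    def trim_conversation_history(self) -> None:
        # Keep only the most recent max_entries messages.
        if len(self.conversation_history) > self.max_entries:
            self.conversation_history = self.conversation_history[-self.max_entries:]
```

Trimming on every append (rather than only on overflow errors) keeps the invariant simple and makes the "auto-trim before context overflow" behavior a no-op in the common case.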

- US-007: Add toolkit caching
  - Cache tool results per (toolkit_name, task_id)
  - Cache MCP toolkit connections for reuse
  - Add clear_toolkit_cache() and clear_mcp_cache() helpers

- US-008/US-009: Add prompt caching
  - Cache platform info via @lru_cache in utils.py
  - Cache formatted prompts to avoid recomputation
  - Update factory files to use format_prompt()
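A sketch of the `@lru_cache` approach described for `utils.py`. `get_platform_info` and the `{platform}` placeholder are illustrative names; the point is that platform details never change within a process, so they are computed once.

```python
import platform
from functools import lru_cache

@lru_cache(maxsize=1)
def get_platform_info() -> str:
    # Computed once per process; platform details don't change at runtime.
    return f"{platform.system()} {platform.release()} ({platform.machine()})"

@lru_cache(maxsize=128)
def format_prompt(template: str, platform_info: str) -> str:
    # Cached on the full (template, platform_info) tuple.
    return template.replace("{platform}", platform_info)
```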

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- US-010: Implement context compression via summarization
  - Add compress_conversation_history() method to TaskLock
  - Generate summary from older entries before trimming
  - Include summary in get_recent_context() output
  - Auto-compress when conversation exceeds 2x max_entries
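The 2x-threshold compression can be sketched as below. The summary generation is reduced to a placeholder here; the actual method builds a structured summary from the older entries.

```python
def compress_conversation_history(history: list[dict], max_entries: int = 100):
    """Summarize older entries and keep the recent window.
    Returns (summary_or_None, trimmed_history)."""
    if len(history) <= 2 * max_entries:
        # Auto-compress only once the history exceeds 2x max_entries.
        return None, history
    older, recent = history[:-max_entries], history[-max_entries:]
    summary = f"[Summary of {len(older)} earlier messages]"
    return summary, recent
```

The summary is then prepended in `get_recent_context()`, so older context survives trimming in condensed form.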

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log history length, summary presence, and summary length when building context
- Include last_task_summary in context when available
- Log final context length for debugging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract key actions from tool calls (file creation, reading, writing, etc.)
- Extract main topics from user messages
- Create structured summary with message count, actions, and topics
- Add more detailed logging for debugging
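An illustrative version of the structured-summary extraction. The entry schema (`role`/`tool`/`content` keys) is an assumption made for the sketch.

```python
def build_structured_summary(entries: list[dict]) -> str:
    """Condense entries into message count, tool actions, and user topics."""
    actions = [e["tool"] for e in entries if e.get("role") == "tool" and "tool" in e]
    topics = [e["content"][:40] for e in entries if e.get("role") == "user"]
    return (f"{len(entries)} messages | actions: {', '.join(actions) or 'none'}"
            f" | topics: {'; '.join(topics) or 'none'}")
```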

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log conversation entries count, summary presence and preview
- Log task result length for debugging context preservation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log when summary is included in context
- Log total context length and entries included

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test compression creates structured summary with actions/topics
- Test compression skips when history is small
- Test get_recent_context includes summary when available
- Test get_recent_context works without summary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Letteriello
Author

Code Review

I found 3 issues worth considering:

1. Cache key collision risk (Medium)

prompt.py uses template[:50] as the cache key, which can cause collisions if two different prompts share the same first 50 characters.

2. now_str não está na cache key (High)

now_str changes every hour but is not part of the cache key, so the cache can return prompts with a stale timestamp.

3. Toolkit cache por task_id (Medium)

The cache includes api_task_id in the key, so every task creates fresh entries and nothing is shared across tasks.
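A possible fix for issues 1 and 2 — keying on a hash of the full template plus now_str — could be sketched as (function and placeholder names are illustrative):

```python
import hashlib

_prompt_cache: dict[tuple[str, str], str] = {}

def cached_format_prompt(template: str, now_str: str) -> str:
    # Hash the full template (avoids template[:50] collisions) and include
    # now_str in the key so stale timestamps are never served from cache.
    key = (hashlib.sha256(template.encode()).hexdigest(), now_str)
    if key not in _prompt_cache:
        _prompt_cache[key] = template.replace("{now}", now_str)
    return _prompt_cache[key]
```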

Suggestions

Good implementation overall!
