
Feature/compaction #311

Closed

janspoerer wants to merge 37 commits into evalstate:main from janspoerer:feature/compaction

Conversation

@janspoerer (Contributor)

No description provided.

@janspoerer (Contributor Author)

Please note that Claude-Sonnet-3.5 works much better than Gemini-2.5-Flash.

@janspoerer (Contributor Author)

I think, though, that this is mostly an LLM-related behavior, but I'll need to dig deeper into this. Maybe it is not (only) that Gemini-2.5-Flash is just worse at tool use.

@janspoerer (Contributor Author)

@evalstate

Some integration tests have been failing since I merged the latest uv.lock into this branch.

It seems like the main change was that mcp was updated from 1.12.0 to 1.12.1.

All unit tests are still passing, and the agent runs normally, as far as I can tell.

Will try to find the root cause on the weekend.

@janspoerer (Contributor Author)

storlien wrote this in #286:

@janspoerer
I did make an ad-hoc solution to the error displayed earlier that fixed it for me in this branch, but I am currently checking out the branch you mentioned.
I'm trying to get it working with Gemini, but encountered an error that I can post later. I would be extremely grateful if you could implement this for OpenAI/Azure asap; that would be very beneficial to the project I'm working on!

  1. Thanks, that would be helpful. I have no clear idea where to look to fix this strange new error.
  2. Yes, please post the error; that would also be extremely helpful.
  3. Yes, sure, I'm working on OpenAI right now. I think I have no Azure API key, but I will see if I can get one.

@evalstate (Owner)

Hi @janspoerer, send me a DM on Discord if you need help with a key.

@storlien commented Jul 26, 2025

@janspoerer

This is the error I got when trying out gemini-2.5-pro.

TypeError: GoogleNativeAugmentedLLM._completion_orchestrator() got an unexpected keyword argument 'multipart_messages'

I had to add

context_truncation_or_summarization_mode = None,
context_truncation_or_summarization_length_limit = None,

in several places in the direct_decorators.py file first.

I haven't tried your latest changes yet. This occurred when your latest commit was "Made linter happy". I can try with your latest changes now.

Full traceback:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 84, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 132, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 132, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 132, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 81, in main
    await agent.interactive()
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 305, in interactive
    return await prompt.prompt_loop(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/interactive_prompt.py", line 215, in prompt_loop
    result = await send_func(user_input, agent)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 288, in send_wrapper
    result = await self.send(message, agent_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 95, in send
    return await self._agent(agent_name).send(message)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 228, in send
    response = await self.generate([prompt], None)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 659, in generate
    return await self._llm.generate(multipart_messages, request_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/augmented_llm.py", line 236, in generate
    assistant_response: PromptMessageMultipart = await self._apply_prompt_provider_specific(
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        multipart_messages, request_params
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_google_native.py", line 343, in _apply_prompt_provider_specific
    final_content, new_history_messages = await self._completion_orchestrator(
                                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        messages_for_turn=messages_for_turn,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
        multipart_messages=self.history.get(...)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
TypeError: GoogleNativeAugmentedLLM._completion_orchestrator() got an unexpected keyword argument 'multipart_messages'
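
For context on the workaround above: threading new keyword arguments through a decorator layer like direct_decorators.py might look roughly like the sketch below. The two parameter names come from this thread; the decorator structure and how the values are stored are assumptions, not the repository's actual code.

from functools import wraps

# Hypothetical decorator shape; only the truncation parameters are taken
# from this thread, everything else is illustrative.
def agent(name: str,
          context_truncation_or_summarization_mode: str | None = None,
          context_truncation_or_summarization_length_limit: int | None = None):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            return await func(*args, **kwargs)

        # Each decorator variant needs the same defaults; callers passing the
        # new kwargs to a variant without them hit "unexpected keyword argument".
        wrapper.truncation_mode = context_truncation_or_summarization_mode
        wrapper.truncation_limit = context_truncation_or_summarization_length_limit
        return wrapper

    return decorator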

@janspoerer (Contributor Author)

Thanks a lot.

Will have a look into this on Monday.

I have made progress on the OpenAI truncation as well. Will push the changes now.

@storlien

@janspoerer

Trying it out now.

I do get this error when I add "context_truncation_mode" and "context_truncation_length_limit" to the agent decorator.

@altinity.agent(
     ~~~~~~~~~~~~~~^
        name="Altinity_Executer",
        ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<7 lines>...
        context_truncation_length_limit=5000,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        model="openai.gpt-4.1-mini-2025-04-14")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: agent() got an unexpected keyword argument 'context_truncation_mode'

It seems you haven't added the parameters to the direct_decorators.py file. After adding the parameters in several locations in this file, I get this error instead:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 84, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 219, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 219, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 219, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 79, in main
    async with altinity.run() as agent:
               ~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/contextlib.py", line 214, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/fastagent.py", line 312, in run
    active_agents = await create_agents_in_dependency_order(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_factory.py", line 376, in create_agents_in_dependency_order
    basic_agents = await create_agents_by_type(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<9 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_factory.py", line 153, in create_agents_by_type
    await agent.attach_llm(
    ...<3 lines>...
    )
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 173, in attach_llm
    self._llm = llm_factory(
                ~~~~~~~~~~~^
        agent=self, request_params=effective_params, context=self._context, **additional_kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/model_factory.py", line 285, in factory
    llm: AugmentedLLMProtocol = llm_class(**llm_args)
                                ~~~~~~~~~^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_azure.py", line 32, in __init__
    super().__init__(provider=provider, *args, **kwargs)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_openai.py", line 86, in __init__
    self.client = self._initialize_client()
                  ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_azure.py", line 107, in _initialize_client
    if self.use_default_cred:
       ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AzureOpenAIAugmentedLLM' object has no attribute 'use_default_cred'
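
The AttributeError suggests an initialization-order problem: per the traceback, the base class's __init__ calls _initialize_client(), which reads use_default_cred before the Azure subclass has set it. A minimal sketch of that kind of fix follows; the stub base class and the attribute's source are assumptions, not the repository's actual code.

class OpenAIAugmentedLLM:  # stub standing in for the real base class
    def __init__(self, provider=None, **kwargs):
        self.client = self._initialize_client()

class AzureOpenAIAugmentedLLM(OpenAIAugmentedLLM):
    def __init__(self, provider=None, **kwargs):
        # Set the attribute *before* super().__init__, because the base
        # constructor calls _initialize_client(), which reads it.
        self.use_default_cred = kwargs.pop("use_default_cred", False)  # hypothetical source
        super().__init__(provider=provider, **kwargs)

    def _initialize_client(self):
        return "default-credential" if self.use_default_cred else "api-key"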

Some of my agents use Azure models. When I change all agents to OpenAI models, it starts up correctly.

I tried setting context_truncation_mode to a random string and didn't get an error message about it; perhaps there should be validation for that?
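
On the validation point: rejecting unknown modes up front could look like this sketch, assuming an Enum is acceptable to the project; the mode names mirror the values used in this thread and may not be exhaustive.

from enum import Enum

class ContextTruncationMode(str, Enum):
    REMOVE = "remove"
    SUMMARIZE = "summarize"

def validate_truncation_mode(mode):
    # Fail fast on typos instead of silently ignoring the setting.
    if mode is None:
        return None
    try:
        return ContextTruncationMode(mode)
    except ValueError:
        allowed = [m.value for m in ContextTruncationMode]
        raise ValueError(f"Unknown context_truncation_mode {mode!r}; expected one of {allowed}")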

Truncation doesn't seem to work for me with OpenAI. I've set context_truncation_mode="remove" and context_truncation_length_limit=5000.

[screenshot]

@storlien

Wait, I see that I should put the params in the RequestParams. Trying it out now.
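
A sketch of that placement: FastAgent and RequestParams are real fast-agent classes (import paths as used at the time), but the two truncation fields are this PR's additions and their names here follow this thread, so treat them as assumptions.

from mcp_agent.core.fastagent import FastAgent
from mcp_agent.core.request_params import RequestParams

altinity = FastAgent("Altinity")

@altinity.agent(
    name="Altinity_Executer",
    model="openai.gpt-4.1-mini-2025-04-14",
    request_params=RequestParams(
        context_truncation_mode="remove",       # field name per this thread
        context_truncation_length_limit=5000,   # limit value used above
    ),
)
async def main():
    async with altinity.run() as agent:
        await agent.interactive()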

@storlien

New error after setting the params in the RequestParams. The first message to the LLM works fine, then it crashes on the next one:

[screenshot]

Traceback:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 85, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 217, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 217, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 217, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 82, in main
    await agent.interactive()
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 305, in interactive
    return await prompt.prompt_loop(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/interactive_prompt.py", line 215, in prompt_loop
    result = await send_func(user_input, agent)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 288, in send_wrapper
    result = await self.send(message, agent_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 95, in send
    return await self._agent(agent_name).send(message)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 228, in send
    response = await self.generate([prompt], None)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 659, in generate
    return await self._llm.generate(multipart_messages, request_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/augmented_llm.py", line 252, in generate
    assistant_response: PromptMessageMultipart = await self._apply_prompt_provider_specific(
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        multipart_messages, request_params
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_openai.py", line 498, in _apply_prompt_provider_specific
    responses: List[ContentBlock] = await self._openai_completion(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
    )                                    ##
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_openai.py", line 361, in _openai_completion
    message["role"]
    ~~~~~~~^^^^^^^^
TypeError: 'ParsedChatCompletionMessage[NoneType]' object is not subscriptable
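
The last frame points at the likely cause: the OpenAI SDK returns pydantic message objects (here ParsedChatCompletionMessage), which support attribute access but not dict-style indexing. A sketch of normalizing both shapes, assuming the history can mix plain dicts with SDK objects:

def get_role(message):
    # Locally built history entries may be plain dicts, while SDK responses
    # are pydantic models; message["role"] only works for the former.
    if isinstance(message, dict):
        return message["role"]
    return message.role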

@janspoerer (Contributor Author)

I am not sure why GitHub displays the warning "This branch has conflicts that must be resolved."

I have merged all changes from the main branch locally into this feature branch.

Do you have an idea, @evalstate?

Other than that, given that the tests and linting pass here, I think this is ready for review.

@janspoerer (Contributor Author)

Ah, my bad: I had of course compared this against my forked main branch. Sorry, I will sync my fork.

@janspoerer (Contributor Author)

Alright, now the tests are being queued... All good :-)

@evalstate (Owner)

@janspoerer how are you running the tests? They work from the command line if API keys are set, but not otherwise. Are API keys strictly needed for these? If so, we can shift them to e2e. The test_truncate_if_required_no_truncation_needed unit test fails in both environments; it looks like it is truncating when it shouldn't be. Is that a config issue?
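
For reference, a sketch of what that unit test presumably asserts; the helper's name and signature, truncate_if_required(messages, limit), are hypothetical, and the stand-in implementation below exists only to make the sketch self-contained.

def truncate_if_required(messages, limit):
    # Stand-in for the real helper: only truncate when the estimated size
    # exceeds the limit.
    total = sum(len(m) for m in messages)
    return messages if total <= limit else messages[-1:]

def test_truncate_if_required_no_truncation_needed():
    # A history well under the limit must come back unchanged; the failure
    # described above means it is being shortened anyway.
    messages = ["short message one", "short message two"]
    assert truncate_if_required(messages, limit=5000) == messages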

@janspoerer (Contributor Author)

@evalstate

Sorry for the late reply!

Thanks for looking at this.

I run the tests using:

  • pytest tests/integration
  • pytest tests/unit

And now they are running.

I turned off the integration test that required Google API keys. This test is not needed.

@janspoerer (Contributor Author)

I will use the features from this PR next week in some long-running workflows.

The features will thus be somewhat battle-tested (rather than only covered by integration and unit tests).

@janspoerer (Contributor Author)

Update: Still battle-testing. :)

@evalstate (Owner)

Good stuff. I'm definitely going to need your help with the merge to 0.3.0...

@evalstate (Owner)

Hi @janspoerer, let me know how the testing went. I've been having a look at this to figure out the best way to proceed with merging with 0.3.0.

My current thought is to bring the e2e tests over as-is (as the top-level interface remains the same), and then manually merge the functionality (as I think post-refactor we can reduce some of the duplication you've had to deal with).

@janspoerer (Contributor Author)

Hi @janspoerer, let me know how the testing went. I've been having a look at this to figure out the best way to proceed with merging with 0.3.0.

My current thought is to bring the e2e tests over as-is (as the top-level interface remains the same), and then manually merge the functionality (as I think post-refactor we can reduce some of the duplication you've had to deal with).

Hi @evalstate,

Sorry for the late answer. I'm building a browser-use MCP server for work that I wanted to use to test this compaction feature, but the MCP took longer than expected, so my testing got delayed. So far, I have not found any bugs in the compaction feature.

Maybe to clarify:

  • You would keep the e2e tests as they currently are in the master branch and discard the test changes introduced in this compaction branch -> that is absolutely fine with me.
  • "Manually merge the functionality" -> that is not clear to me.

Can you elaborate on what you mean by "manually merging"? Thank you :)

The next step would thus be that I remove the e2e changes? Is there anything else you would like to see done here?

evalstate pushed a commit that referenced this pull request Dec 3, 2025
Resurrects and adapts functionality from PR #311 for the current architecture
where agents own history and LLM providers are stateless.

New compaction module (src/fast_agent/llm/compaction/):
- ContextCompactionMode enum: NONE, TRUNCATE, SUMMARIZE
- ContextCompaction class: main logic for compacting messages
- Token estimation utilities leveraging existing usage tracking

Integration:
- AgentConfig: added context_compaction_mode and context_compaction_limit
- LlmDecorator: added compact_history() method
- CLI: added /compact [truncate|summarize] command

The compaction feature allows users to:
- Manually compact history via /compact command
- Configure automatic compaction via agent config
- Choose between truncation (removing old messages) or summarization
  (using LLM to condense history)
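
Based purely on the commit description above, the mode enum and agent-level wiring would look roughly like this; the literal enum values, the default, and the exact module layout are assumptions.

from enum import Enum

class ContextCompactionMode(Enum):
    NONE = "none"
    TRUNCATE = "truncate"    # drop oldest messages once the limit is exceeded
    SUMMARIZE = "summarize"  # condense history via an extra LLM call

# Hypothetical use of the two new AgentConfig fields:
config = {
    "context_compaction_mode": ContextCompactionMode.TRUNCATE,
    "context_compaction_limit": 50_000,  # unit (tokens vs. characters) not stated above
}
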
@evalstate (Owner)

Thanks for this contribution, and sorry this didn't make it in; closing as superseded by the hooks skill.

evalstate closed this Apr 9, 2026
@janspoerer (Contributor Author)

All good, thank you for the support! I'm glad the functionality is available in some form!
