| title | Compaction |
|---|---|
| description | Background context management that keeps conversations flowing without blocking. |
LLM context windows are finite. A long conversation fills up. In OpenClaw, when context hits a threshold, the session freezes for 20 seconds while it summarizes and compacts. The user stares at a typing indicator wondering if the bot died. Spacebot never blocks.
The compactor is a programmatic monitor — not an LLM process. It watches a channel's context size (estimated token count) and triggers compaction workers in the background. The channel keeps responding to messages the entire time.
Every turn, after the channel's LLM call completes, the compactor checks context usage:
```
usage_ratio = estimated_tokens / context_window
```
Token estimation uses a chars / 4 heuristic across all message content (text, tool calls, tool results). It's intentionally rough — we only need to know "are we getting close?" not "exactly how many tokens." Overestimating is the safe direction.
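The heuristic can be sketched in plain Rust. The message type and field names here are illustrative, not Spacebot's actual structs; only the total character count matters:

```rust
/// Illustrative history entry. Spacebot's real entries also carry tool calls
/// and tool results; for estimation, everything collapses to character count.
struct Message {
    content: String,
}

/// chars / 4 heuristic. Rounding up keeps the estimate on the safe (high) side.
fn estimate_tokens(history: &[Message]) -> usize {
    let chars: usize = history.iter().map(|m| m.content.len()).sum();
    (chars + 3) / 4
}

fn usage_ratio(history: &[Message], context_window: usize) -> f64 {
    estimate_tokens(history) as f64 / context_window as f64
}
```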
Three tiers, configurable per agent:
| Threshold | Default | Action |
|---|---|---|
| Background | 80% | Compact oldest 30% of messages |
| Aggressive | 85% | Compact oldest 50% of messages |
| Emergency | 95% | Drop oldest 50%, no LLM |
```toml
[defaults.compaction]
background_threshold = 0.80
aggressive_threshold = 0.85
emergency_threshold = 0.95
```

Only one compaction runs at a time per channel. If context is already being compacted and a new threshold is hit, it's ignored until the current compaction finishes.
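The tier check itself is a simple comparison against the configured thresholds. A hypothetical sketch (the enum and function names are illustrative, not the actual `Compactor` API):

```rust
/// Which compaction tier, if any, the current usage ratio has crossed.
#[derive(Debug, PartialEq)]
enum Tier {
    None,
    Background, // compact oldest 30% in the background
    Aggressive, // compact oldest 50% in the background
    Emergency,  // drop oldest 50% synchronously, no LLM
}

struct Thresholds {
    background: f64,
    aggressive: f64,
    emergency: f64,
}

fn tier_for(usage: f64, t: &Thresholds) -> Tier {
    // Check the highest tier first so a usage of 0.96 maps to Emergency,
    // not merely Background.
    if usage >= t.emergency {
        Tier::Emergency
    } else if usage >= t.aggressive {
        Tier::Aggressive
    } else if usage >= t.background {
        Tier::Background
    } else {
        Tier::None
    }
}
```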
Background and aggressive compaction are the normal path. A compaction worker runs in a `tokio::spawn` task alongside the channel:
1. **Drain** — Write-lock the channel's history, remove the oldest N messages (30% for background, 50% for aggressive), and release the lock. The channel can immediately continue with the remaining history.
2. **Summarize** — Build a transcript from the removed messages and run a Rig agent with `prompts/en/compactor.md.j2` as the system prompt. The agent produces a condensed summary preserving key decisions, active topics, commitments, and emotional context. It discards greetings, tool call mechanics, and intermediate reasoning.
3. **Extract memories** — The compaction agent has access to the `memory_save` tool. While summarizing, it identifies facts, preferences, decisions, and observations worth keeping long-term and saves them directly to the memory store. These persist independently of the conversation.
4. **Inject summary** — Write-lock the history again and insert the summary at position 0 as `[Compaction Summary]: ...`. Release the lock. The channel sees this summary on its next turn.
The compaction agent runs with `max_turns(10)` — enough for the LLM to produce the summary and call `memory_save` a few times for extracted memories.
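The two lock-holding steps, drain and inject, can be sketched with a `std::sync::RwLock` standing in for the channel's history. This is a simplification of what lives in `src/agent/compactor.rs`; the summarize and memory-extraction steps are LLM calls and are elided:

```rust
use std::sync::RwLock;

/// Step 1: drain. Remove the oldest `fraction` of messages under a short
/// write lock and return them for summarization. The lock is released when
/// `guard` drops, so the channel keeps responding while the LLM summarizes.
fn drain_oldest(history: &RwLock<Vec<String>>, fraction: f64) -> Vec<String> {
    let mut guard = history.write().unwrap();
    let n = (guard.len() as f64 * fraction) as usize;
    guard.drain(..n).collect()
}

/// Step 4: inject. Insert the finished summary at position 0 so the channel
/// sees it at the top of its history on the next turn.
fn inject_summary(history: &RwLock<Vec<String>>, summary: &str) {
    let mut guard = history.write().unwrap();
    guard.insert(0, format!("[Compaction Summary]: {summary}"));
}
```

The key property is that the write lock is held only for the cheap vector operations, never across the LLM call.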
At 95% context usage, there's no time for an LLM call. Emergency truncation is synchronous:
- Write-lock the history
- Remove the oldest 50% of messages
- Insert a marker: `[System: N older messages were truncated due to context limits]`
- Release the lock
This should rarely fire. If it does, it means the background/aggressive compaction didn't keep up — either the thresholds are too high, or the conversation is extremely fast-paced.
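Under the same `RwLock` assumption as above, the emergency path is a synchronous drain plus a marker. A minimal sketch:

```rust
use std::sync::RwLock;

/// Emergency truncation: no LLM call and no background task. Drop the
/// oldest half under the write lock and leave a marker in its place.
fn emergency_truncate(history: &RwLock<Vec<String>>) {
    let mut guard = history.write().unwrap();
    let n = guard.len() / 2;
    guard.drain(..n);
    guard.insert(
        0,
        format!("[System: {n} older messages were truncated due to context limits]"),
    );
}
```

Because nothing here awaits, the whole operation completes in the same turn that crossed the 95% threshold.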
Compaction summaries accumulate at the top of the context window. A long-running conversation might have several:
```
[Compaction Summary]: Earlier today, discussed project architecture...
[Compaction Summary]: Moved on to auth implementation, decided on JWT...
[Recent conversation messages]
```
This gives the channel rolling awareness of what happened without carrying the full raw history. Each summary covers the messages it replaced.
The compaction agent receives a rendered transcript of the removed messages. User messages, assistant responses, tool calls, and tool results are all formatted as readable text. The agent's system prompt (`prompts/en/compactor.md.j2`) tells it to:
- **Preserve:** Key decisions, active topics, commitments, emotional context, active workers/tasks.
- **Discard:** Greetings, small talk, tool call details (results matter, not mechanics), intermediate reasoning, repeated information.
- **Extract as memories:** Facts, preferences, decisions, observations — anything that should outlive the conversation.
Thresholds are set in `config.toml` at the defaults level and can be overridden per agent:
```toml
[defaults.compaction]
background_threshold = 0.80
aggressive_threshold = 0.85
emergency_threshold = 0.95

# An agent with a smaller context window might want tighter thresholds
[[agents]]
id = "small-model-bot"

[agents.compaction]
background_threshold = 0.70
aggressive_threshold = 0.75
emergency_threshold = 0.90
```

The `context_window` setting (default 128,000 tokens) determines the denominator for the usage calculation. Set it to match your model's actual context window.
| Concern | OpenClaw | Spacebot |
|---|---|---|
| When it runs | Blocks the session | Background tokio task |
| User experience | Typing indicator, 20s freeze | No interruption |
| Summarization | Same session's LLM | Dedicated compaction worker |
| Memory extraction | Separate pass | Same LLM call as summarization |
| Raw transcript | Lost | Extracted as memories |
| Multiple summaries | One summary replaces all | Summaries stack chronologically |
| Emergency fallback | None (just hope it fits) | Hard truncation at 95% |
- `src/agent/compactor.rs` — The `Compactor` struct, threshold checking, token estimation, compaction worker spawning, emergency truncation
- `src/agent/channel.rs` — Channel owns a `Compactor`, calls `check_and_compact()` after each turn
- `prompts/en/compactor.md.j2` — System prompt for the compaction LLM