perf(cloud-agent): reduce idle DO wake-ups with 1-hour reaper interval #1917
Open
Introduce REAPER_IDLE_INTERVAL_MS (1 hour) so idle sessions don't wake the DO every 5 minutes for nothing. When an execution starts, the alarm is immediately rescheduled to REAPER_ACTIVE_INTERVAL_MS (2 min) to keep stale/hung execution detection responsive. Error fallback uses the conservative 5-min default rather than the 1-hour idle interval.
  // longer idle interval otherwise so we don't wake the DO every 5 min for nothing.
  // Wrapped in try/catch so a failure here never prevents rescheduling the alarm.
- let nextInterval = this.getReaperIntervalMs();
+ let nextInterval = REAPER_IDLE_INTERVAL_MS;
Contributor
WARNING: Hourly idle cadence delays idle wrapper shutdown
cleanupIdleKiloServer() still relies on this alarm to enforce KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT (15 minutes). After a short execution completes, the next active alarm will usually observe idleMs < idleTimeoutMs, and then this reschedules the following check for 1 hour later. That means an otherwise idle wrapper can stay alive for roughly 60 minutes instead of 15, increasing sandbox usage until the next wake-up.
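One way to picture the gap, and a possible way to close it, is to cap the idle interval by the time remaining until the idle timeout. This is only a sketch and not part of the PR: `nextIdleCheckMs` is a hypothetical helper, and the two constants simply restate the values quoted above (1 hour and 15 minutes).

```typescript
// Values restated from the PR and the comment above.
const REAPER_IDLE_INTERVAL_MS = 60 * 60 * 1000; // 1 hour
const KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT = 15 * 60 * 1000; // 15 minutes

// Hypothetical helper (not in the PR): given how long the wrapper has been
// idle, return the delay until the next alarm so that the idle timeout is
// checked on time instead of up to an hour late.
function nextIdleCheckMs(
  idleMs: number,
  idleTimeoutMs: number = KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT,
): number {
  const remainingMs = idleTimeoutMs - idleMs;
  if (remainingMs <= 0) {
    // Timeout already elapsed: cleanup is assumed to run in this alarm,
    // so the long idle cadence is fine afterwards.
    return REAPER_IDLE_INTERVAL_MS;
  }
  // Wake up when the timeout expires, but never later than the idle cadence.
  return Math.min(REAPER_IDLE_INTERVAL_MS, remainingMs);
}
```

With a wrapper that has been idle for 2 minutes, this schedules the next check 13 minutes out rather than 60, at the cost of one extra wake-up per completed execution.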
Contributor
Code Review Summary
Status: 1 Issue Found | Recommendation: Address before merge
Issue Details: WARNING
Other Observations (not in diff)
Issues found in unchanged code that cannot receive inline comments:
Files Reviewed (1 file)
Reviewed by gpt-5.4-2026-03-05 · 901,764 tokens
Summary
Reduces unnecessary Durable Object wake-ups by introducing a 1-hour idle reaper interval (REAPER_IDLE_INTERVAL_MS) when no execution is active, replacing the previous 5-minute default. When an execution starts, the alarm is immediately rescheduled to the 2-minute active interval so stale/hung execution detection remains responsive.
Key changes:
- REAPER_IDLE_INTERVAL_MS (1 hour) is used as the default interval after the alarm runs when no execution is active.
- When an execution starts, setAlarm is called with REAPER_ACTIVE_INTERVAL_MS (2 min) to close the gap where the idle alarm could be up to an hour away.
- The error fallback uses REAPER_INTERVAL_MS_DEFAULT (5 min) rather than the idle interval, so the reaper retries conservatively when state is unknown.
- JSDoc on getReaperIntervalMs() documents that the env override only affects the initial alarm scheduling.
Verification
- Typecheck (tsgo --noEmit for cloud-agent-next and dependencies, via pre-push hook)
- Format and lint (oxfmt and oxlint, via pre-push hook)
Visual Changes
N/A
Reviewer Notes
- The REAPER_INTERVAL_MS env var override now only controls the initial alarm via ensureAlarmScheduled(). Steady-state scheduling uses the new constants directly. This is documented in the added JSDoc.
- The error fallback uses the 5-minute default (REAPER_INTERVAL_MS_DEFAULT) rather than the 1-hour idle interval: if getActiveExecutionId() throws, we can't know whether an execution is active, so we err on the side of checking sooner.
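The interval selection described in this PR can be sketched roughly as follows. This is a simplified illustration, not the actual handler: `chooseNextIntervalMs` is a hypothetical helper, and `getActiveExecutionId` is assumed here to be a synchronous callback returning `string | null`, which may differ from the real method.

```typescript
// Interval values as stated in this PR description.
const REAPER_ACTIVE_INTERVAL_MS = 2 * 60 * 1000; // 2 min
const REAPER_INTERVAL_MS_DEFAULT = 5 * 60 * 1000; // 5 min
const REAPER_IDLE_INTERVAL_MS = 60 * 60 * 1000; // 1 hour

// Hypothetical helper: picks the delay before the next reaper alarm.
// The callback signature is an assumption made for this sketch.
function chooseNextIntervalMs(
  getActiveExecutionId: () => string | null,
): number {
  try {
    // Active execution: check often. Otherwise: use the long idle cadence.
    return getActiveExecutionId() !== null
      ? REAPER_ACTIVE_INTERVAL_MS
      : REAPER_IDLE_INTERVAL_MS;
  } catch {
    // State unknown, so retry sooner than the 1-hour idle interval.
    return REAPER_INTERVAL_MS_DEFAULT;
  }
}
```

Keeping the fallback at 5 minutes means a transient storage failure costs at most one extra wake-up instead of leaving a possibly hung execution unchecked for an hour.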