perf(cloud-agent): reduce idle DO wake-ups with 1-hour reaper interval #1917

Open
eshurakov wants to merge 1 commit into main from eshurakov/ripple-request

@eshurakov
Contributor

Summary

Reduces unnecessary Durable Object wake-ups by introducing a 1-hour idle reaper interval (REAPER_IDLE_INTERVAL_MS) when no execution is active, replacing the previous 5-minute default. When an execution starts, the alarm is immediately rescheduled to the 2-minute active interval so stale/hung execution detection remains responsive.

Key changes:

  • New REAPER_IDLE_INTERVAL_MS (1 hour) used as the default interval after alarm runs when no execution is active.
  • On execution start, setAlarm is called with REAPER_ACTIVE_INTERVAL_MS (2 min) to close the gap where the idle alarm could be up to an hour away.
  • Error fallback in the alarm handler uses REAPER_INTERVAL_MS_DEFAULT (5 min) rather than the idle interval, so the reaper retries conservatively when state is unknown.
  • Clarifying JSDoc on getReaperIntervalMs() documenting that the env override only affects the initial alarm scheduling.
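The scheduling rules listed above can be sketched as two small pure functions. This is a minimal sketch under stated assumptions, not the PR's code: the constant names and durations come from the summary, while `pickNextIntervalMs`, `alarmTimeOnExecutionStart`, and the synchronous callback standing in for `getActiveExecutionId()` are hypothetical and exist only to keep the example self-contained.

```typescript
// Durations come from the PR summary. The constant names match the PR,
// but everything else here is an illustrative sketch.
const REAPER_ACTIVE_INTERVAL_MS = 2 * 60 * 1000; // 2 minutes
const REAPER_INTERVAL_MS_DEFAULT = 5 * 60 * 1000; // 5 minutes
const REAPER_IDLE_INTERVAL_MS = 60 * 60 * 1000; // 1 hour

// Hypothetical helper: picks the delay before the next alarm.
// The callback stands in for the session's real active-execution lookup.
function pickNextIntervalMs(getActiveExecutionId: () => string | null): number {
  try {
    // Active execution: check again soon. Otherwise: use the long idle interval.
    return getActiveExecutionId() !== null
      ? REAPER_ACTIVE_INTERVAL_MS
      : REAPER_IDLE_INTERVAL_MS;
  } catch {
    // State is unknown, so fall back to the conservative 5-minute default.
    return REAPER_INTERVAL_MS_DEFAULT;
  }
}

// Hypothetical helper: the alarm time to set when an execution starts,
// so a pending idle alarm (up to an hour away) is pulled forward.
function alarmTimeOnExecutionStart(nowMs: number): number {
  return nowMs + REAPER_ACTIVE_INTERVAL_MS;
}
```

Keeping the decision in a pure function like this would make the three branches (active, idle, error) easy to unit-test without a Durable Object runtime.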

Verification

  • Code review performed in this session — identified and applied two improvements (error fallback safety, env override documentation)
  • Typecheck passed (tsgo --noEmit for cloud-agent-next and dependencies, via pre-push hook)
  • Lint passed (oxfmt and oxlint, via pre-push hook)
  • Additional verification (tests, manual checks, etc.)

Visual Changes

N/A

Reviewer Notes

  • The REAPER_INTERVAL_MS env var override now only controls the initial alarm via ensureAlarmScheduled(). Steady-state scheduling uses the new constants directly. This is documented in the added JSDoc.
  • The error catch block in the alarm handler intentionally falls back to 5 min (REAPER_INTERVAL_MS_DEFAULT) rather than the 1-hour idle interval — if getActiveExecutionId() throws, we can't know whether an execution is active, so we err on the side of checking sooner.
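To illustrate the first note, the sketch below separates the initial alarm (which honors the env override) from steady-state idle rescheduling (which does not). It is an assumption-laden sketch: the `ReaperEnv` shape, the parsing rule, and the helpers `initialAlarmTime` and `idleRescheduleTime` are hypothetical, and the real `getReaperIntervalMs()` may validate its input differently.

```typescript
const REAPER_INTERVAL_MS_DEFAULT = 5 * 60 * 1000; // 5 minutes
const REAPER_IDLE_INTERVAL_MS = 60 * 60 * 1000; // 1 hour

// Hypothetical env shape; the real binding type is not shown in the PR.
interface ReaperEnv {
  REAPER_INTERVAL_MS?: string;
}

// Assumed parsing rule: a positive integer number of milliseconds,
// otherwise the 5-minute default.
function getReaperIntervalMs(env: ReaperEnv): number {
  const parsed = Number.parseInt(env.REAPER_INTERVAL_MS ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : REAPER_INTERVAL_MS_DEFAULT;
}

// Initial scheduling: the only place the env override applies.
function initialAlarmTime(env: ReaperEnv, nowMs: number): number {
  return nowMs + getReaperIntervalMs(env);
}

// Steady state with no active execution: the constant is used directly,
// so the env override has no effect here.
function idleRescheduleTime(nowMs: number): number {
  return nowMs + REAPER_IDLE_INTERVAL_MS;
}
```

Under these assumptions, setting `REAPER_INTERVAL_MS` to `"30000"` shortens only the first alarm; every later idle reschedule still lands one hour out.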

Introduce REAPER_IDLE_INTERVAL_MS (1 hour) so idle sessions don't wake
the DO every 5 minutes for nothing.  When an execution starts, the alarm
is immediately rescheduled to REAPER_ACTIVE_INTERVAL_MS (2 min) to keep
stale/hung execution detection responsive.  Error fallback uses the
conservative 5-min default rather than the 1-hour idle interval.
// longer idle interval otherwise so we don't wake the DO every 5 min for nothing.
// Wrapped in try/catch so a failure here never prevents rescheduling the alarm.
-let nextInterval = this.getReaperIntervalMs();
+let nextInterval = REAPER_IDLE_INTERVAL_MS;

WARNING: Hourly idle cadence delays idle wrapper shutdown

cleanupIdleKiloServer() still relies on this alarm to enforce KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT (15 minutes). After a short execution completes, the next active alarm will usually observe idleMs < idleTimeoutMs, and then this reschedules the following check for 1 hour later. That means an otherwise idle wrapper can stay alive for roughly 60 minutes instead of 15, increasing sandbox usage until the next wake-up.
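The timing in this warning can be checked with a small model. This is a hedged sketch, not code from the repository: `idleMsAtCleanup` is a hypothetical function that assumes cleanup happens at the first alarm that observes `idleMs >= idleTimeoutMs`, and that each earlier alarm simply reschedules by a fixed interval.

```typescript
const KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT = 15 * 60 * 1000; // 15 minutes
const MINUTE_MS = 60 * 1000;

// Hypothetical model: given how long the wrapper has been idle when the first
// alarm fires, and the interval used to reschedule afterwards, return the idle
// time observed by the first alarm that is allowed to clean up.
function idleMsAtCleanup(
  firstAlarmIdleMs: number,
  rescheduleIntervalMs: number,
  idleTimeoutMs: number = KILO_SERVER_IDLE_TIMEOUT_MS_DEFAULT,
): number {
  if (rescheduleIntervalMs <= 0) {
    throw new Error("rescheduleIntervalMs must be positive");
  }
  let idleMs = firstAlarmIdleMs;
  // Each alarm that sees idleMs < idleTimeoutMs only reschedules.
  while (idleMs < idleTimeoutMs) {
    idleMs += rescheduleIntervalMs;
  }
  return idleMs;
}

// First active alarm fires 2 minutes after the execution ends.
const withHourlyIdle = idleMsAtCleanup(2 * MINUTE_MS, 60 * MINUTE_MS);
const withFiveMinute = idleMsAtCleanup(2 * MINUTE_MS, 5 * MINUTE_MS);
console.log(withHourlyIdle / MINUTE_MS, withFiveMinute / MINUTE_MS);
```

In this model, a wrapper that has been idle for 2 minutes at the first alarm is cleaned up after 62 minutes with the 1-hour idle interval, versus 17 minutes with the previous 5-minute interval, which matches the "roughly 60 minutes instead of 15" estimate in the comment.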

@kilo-code-bot
Contributor

kilo-code-bot bot commented Apr 2, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 1 |
| SUGGESTION | 0 |

Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| cloud-agent-next/src/persistence/CloudAgentSession.ts | 1362 | Idle sessions now wait up to an hour before cleanupIdleKiloServer() runs again, so wrappers can stay alive far past the intended 15-minute idle timeout. |

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

| File | Line | Issue |
| --- | --- | --- |
| N/A | N/A | None. |

Files Reviewed (1 file)
  • cloud-agent-next/src/persistence/CloudAgentSession.ts - 1 issue

Reviewed by gpt-5.4-2026-03-05 · 901,764 tokens
