Graceful WebSocket drain on SIGTERM for api_server/web commands #4876

@fparga

🔴 Required Information

Is your feature request related to a specific problem?

When deploying adk api_server or adk web on Cloud Run (or any container orchestrator), the platform sends SIGTERM on deployment, scale-down, or preemption. Uvicorn's default SIGTERM handler immediately closes WebSocket connections with close code 1012 (Service Restart), which drops active /run_live streaming sessions mid-conversation.

For agents using Gemini Live models (gemini-live-*-native-audio), every user interaction is a long-lived WebSocket connection. A routine deployment kills all active conversations instantly with no opportunity for the client to reconnect or the model to finish its response.

Describe the Solution You'd Like

The adk api_server and adk web commands should drain active WebSocket connections before shutting down:

  1. On SIGTERM, stop accepting new connections (close listeners).
  2. Wait for in-flight WebSocket connections to finish naturally, up to a configurable timeout.
  3. Then let uvicorn shut down cleanly.

This avoids dropped conversations during deployments, as long as each conversation finishes within the drain timeout: the orchestrator routes new connections to the new instance while the old one drains.
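
As a minimal illustration of steps 1 and 2, the sketch below uses a plain asyncio TCP echo server instead of uvicorn (an assumption made only to keep the example self-contained). It shows that closing a listener refuses new connections while an already established connection keeps working. The names demo_listener_close and echo are hypothetical and not part of ADK or uvicorn.

```python
import asyncio


async def demo_listener_close() -> tuple[bytes, bool]:
    """Close a listener and check that an established connection survives."""

    async def echo(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
        # Echo lines back until the client disconnects.
        while data := await reader.readline():
            writer.write(data)
            await writer.drain()
        writer.close()

    server = await asyncio.start_server(echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # An "in-flight" connection opened before shutdown begins. The round trip
    # makes sure the server has accepted it before the listener is closed.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello\n")
    await writer.drain()
    await reader.readline()

    # Step 1: stop accepting new connections.
    server.close()

    # The established connection still works.
    writer.write(b"still here\n")
    await writer.drain()
    reply = await reader.readline()

    # New connection attempts are refused.
    try:
        _, late_writer = await asyncio.open_connection("127.0.0.1", port)
        late_writer.close()
        refused = False
    except OSError:
        refused = True

    # Step 2: the client finishes on its own, then shutdown can complete.
    writer.close()
    await writer.wait_closed()
    await server.wait_closed()
    return reply, refused


if __name__ == "__main__":
    print(asyncio.run(demo_listener_close()))
```

The same close() call is what the proposal applies to each entry in uvicorn's server.servers, so the behavior for WebSocket listeners should be analogous.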

Impact on your work

We run live audio agents on Cloud Run. Every deployment drops all active calls. We currently maintain a wrapper around adk api_server solely to add graceful drain behavior. This is critical infrastructure for any production deployment of live agents. Without it, deployments during business hours are disruptive.

Willingness to contribute

Yes.


🟡 Recommended Information

Describe Alternatives You've Considered

We wrote a custom CLI wrapper that intercepts SIGTERM, closes the server listeners, polls uvicorn.Server.server_state.connections until all WebSocket connections finish (or a timeout expires), then sets server.should_exit = True. This works but requires maintaining a parallel entry point and duplicating upstream's server startup logic.

Proposed API / Implementation

Replace the current uvicorn.run() / server.serve() call in the api_server and web commands with a drain-aware server loop. Rough sketch:

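import asyncio
import signal
import time

import uvicorn


async def _serve_with_graceful_drain(server: uvicorn.Server, drain_timeout: float) -> None:
    loop = asyncio.get_running_loop()
    # NOTE: _serve() is a private uvicorn method and may change between
    # releases. It is called directly so that our own signal handlers are used.
    serve_task = asyncio.create_task(server._serve())

    shutdown_event = asyncio.Event()

    def _on_signal(sig: int) -> None:
        for srv in server.servers:
            srv.close()          # stop accepting new connections
        shutdown_event.set()

    # loop.add_signal_handler() is only available on Unix event loops.
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, _on_signal, sig)

    wait_task = asyncio.create_task(shutdown_event.wait())
    done, _ = await asyncio.wait(
        [serve_task, wait_task], return_when=asyncio.FIRST_COMPLETED
    )
    if serve_task in done:
        # The server exited on its own (e.g. startup failure) before any signal.
        wait_task.cancel()
        await serve_task         # re-raise any exception from the server
        return

    # Drain: wait for active connections to finish, up to drain_timeout seconds
    deadline = time.monotonic() + drain_timeout
    while server.server_state.connections and time.monotonic() < deadline:
        await asyncio.sleep(0.1)

    # Hand control back to uvicorn's normal shutdown path.
    server.should_exit = True
    await serve_task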

A --drain_timeout CLI flag (defaulting to something reasonable like 300s) would let users tune it. Cloud Run's request timeout is the natural upper bound.

Additional Context

  • Cloud Run keeps the instance alive for active requests after SIGTERM (up to the configured request timeout), so the drain window is already provided by the platform — the server just needs to use it instead of exiting immediately.
  • This affects any WebSocket-based command (api_server, web), not just live audio agents. SSE streaming sessions would also benefit.
  • Related: Interrupt Handling #2620 covers agent-level interrupt handling (cancelling agent tasks); this issue is about server-level connection draining.

Labels: web [Component] (this issue will be transferred to adk-web)