MCP servers stop working when resuming old sessions; orphaned MCP processes accumulate indefinitely #2279

@jayanthgowda-aurigo

Describe the bug

When switching between CLI sessions and later resuming an older session, MCP servers configured in ~/.copilot/mcp-config.json are no longer functional. The CLI logs "MCP client for <server> closed" but never restarts the servers automatically. Meanwhile, every new CLI launch spawns fresh MCP child processes, and the old ones from prior sessions are never cleaned up, leading to dozens of orphaned and zombie processes over time.

Affected version

GitHub Copilot CLI 1.0.11.

Steps to reproduce the behavior

  1. Configure 2+ MCP servers in ~/.copilot/mcp-config.json (e.g., @anthropic-ai/mcp-playwright and @azure-devops/mcp)
  2. Start a new CLI session → MCP servers work fine
  3. Switch to a different session (or close the terminal and open a new one)
  4. After some time (hours/days), switch back to the original session
  5. Try to use any MCP tool → fails with "cannot find server" error

Actual Behavior

  • MCP servers die silently when sessions are switched or terminals close
  • Log shows "MCP client for <name> closed", but no reconnection attempt is made
  • Old MCP node processes (spawned via cmd.exe /c npx -y @<package>) survive as orphans indefinitely
  • Over days, dozens of zombie processes accumulate (observed: 22 orphaned MCP node processes + 12 zombie copilot processes on a single machine after ~3 days of normal use)
  • The only recovery is /restart (which the user must discover on their own) or killing and restarting the CLI manually

Expected behavior

  • MCP servers should auto-restart when a session is resumed and the servers are no longer connected
  • When a CLI process exits, its child MCP server processes should be terminated
  • A resumed session should always have working MCP servers if they are configured

Additional context

Evidence from Log Analysis

# From ~/.copilot/logs — 62 MCP close events in a single log rotation:
# 48 × "MCP client for ado closed"
# 14 × "MCP client for playwright closed"
# 0 × any "MCP client reconnected" or "MCP client restarted" events

# Typical log sequence:
Session changed, reloading extensions
...
MCP client for playwright closed
MCP client for ado closed
# (no recovery attempt follows)
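
For anyone who wants to reproduce the counts above on their own logs, here is a minimal sketch of the tally. It assumes the close message has the exact form "MCP client for <name> closed" and treats "MCP client reconnected" / "MCP client restarted" as the recovery strings to look for (these are the strings searched for in this report, not confirmed CLI log messages). The function name `summarize_mcp_events` is made up for illustration.

```python
import re
from collections import Counter

# Matches the close message quoted in this report and captures the server name.
CLOSE_RE = re.compile(r"MCP client for (\S+) closed")
# Assumed recovery wording: none was found in the logs, so the real text is unknown.
RECOVERY_RE = re.compile(r"MCP client (?:reconnected|restarted)")


def summarize_mcp_events(lines):
    """Return (close counts per server, number of recovery events)."""
    closes = Counter()
    recoveries = 0
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            closes[match.group(1)] += 1
        elif RECOVERY_RE.search(line):
            recoveries += 1
    return closes, recoveries


if __name__ == "__main__":
    sample = [
        "Session changed, reloading extensions",
        "MCP client for playwright closed",
        "MCP client for ado closed",
    ]
    closes, recoveries = summarize_mcp_events(sample)
    print(dict(closes), "recoveries:", recoveries)
```

To run it against real logs, pass the lines of each file under ~/.copilot/logs to `summarize_mcp_events` and sum the results.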

Process Dump (at time of investigation)

# 22 orphaned MCP node.exe processes (some 64+ hours old)
# 12 zombie copilot processes (some 3+ days old)
# Each CLI session spawns: 1 copilot + 2 cmd.exe + 4 node.exe (npx wrapper + actual server, ×2 for 2 MCP configs)
# None are cleaned up on exit

Workarounds Found

  1. /restart — restarts the CLI process and re-spawns MCP servers for the current session. Works but must be done manually every time a session is resumed.
  2. Manual cleanup script — we wrote a PowerShell function that identifies orphaned copilot/MCP processes by comparing start times against the current session and kills them. This shouldn't be necessary.
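
The selection logic behind the cleanup script in workaround 2 can be sketched roughly as follows. This is not the PowerShell function we use, just a Python illustration of the start-time comparison. `ProcInfo`, `CANDIDATE_NAMES`, `find_stale`, and the 30-second grace window are all hypothetical, and the sketch only selects candidates; it does not kill anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ProcInfo:
    """Hypothetical snapshot of one process (pid, image name, start time)."""
    pid: int
    name: str
    started: datetime


# Assumed image names, based on the process tree described in this report.
CANDIDATE_NAMES = {"copilot.exe", "copilot.exe.old", "cmd.exe", "node.exe"}


def find_stale(procs, current_session_start, grace=timedelta(seconds=30)):
    """Select candidate processes that started before the current session.

    Anything started earlier than (session start - grace) is treated as a
    leftover from a previous session.
    """
    cutoff = current_session_start - grace
    return [
        p for p in procs
        if p.name.lower() in CANDIDATE_NAMES and p.started < cutoff
    ]


if __name__ == "__main__":
    session_start = datetime(2025, 1, 4, 12, 0, 0)
    snapshot = [
        ProcInfo(101, "node.exe", session_start - timedelta(hours=64)),
        ProcInfo(102, "node.exe", session_start + timedelta(seconds=5)),
        ProcInfo(103, "copilot.exe.old", session_start - timedelta(days=3)),
    ]
    for proc in find_stale(snapshot, session_start):
        print("stale:", proc.pid, proc.name)
```

Matching on image name alone would also catch unrelated node.exe or cmd.exe processes, so a real script should additionally check the command line or parent process before terminating anything.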

Root Cause Analysis

  1. No process cleanup on exit: When a CLI process exits (terminal closed, session switch), its child processes (cmd.exe → node.exe MCP servers) are not terminated. They become orphans.
  2. No auto-restart on resume: When a session is resumed and its MCP servers are dead, the CLI does not attempt to reconnect or restart them. It logs the closure but takes no corrective action.
  3. Process naming after auto-update: After a CLI auto-update, the running binary is renamed to copilot.exe.old, making process management harder (process name changes mid-session).

Suggested Fix

  1. Use a process group / job object: On Windows, create a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE so all child MCP processes are automatically terminated when the CLI exits.
  2. Auto-restart MCP on session resume: When switching to a session, check if MCP servers are alive. If not, restart them automatically (similar to what /restart does, but targeted at MCP only).
  3. Periodic health check: Ping MCP servers periodically and restart any that have died.
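
To make suggestions 2 and 3 concrete, here is a rough sketch of a "check on resume, restart if dead, stop on exit" wrapper. It is only an illustration under stated assumptions: `SupervisedServer` is a hypothetical class, not part of the Copilot CLI, and "alive" here only means the child process has not exited. A real health check would also need an MCP-level ping.

```python
import subprocess
import sys


class SupervisedServer:
    """Hypothetical wrapper around one MCP server child process."""

    def __init__(self, name, argv):
        self.name = name
        self.argv = argv
        self.proc = None
        self.restarts = 0

    def start(self):
        self.proc = subprocess.Popen(
            self.argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )

    def is_alive(self):
        # poll() returns None while the child is still running.
        return self.proc is not None and self.proc.poll() is None

    def ensure_running(self):
        """Start or restart the server if needed; return True if it was (re)started."""
        if self.is_alive():
            return False
        if self.proc is not None:
            self.restarts += 1  # replacing a child that has exited
        self.start()
        return True

    def stop(self, timeout=5.0):
        """Terminate the child on CLI exit, escalating to kill after a timeout."""
        if self.proc is None:
            return
        if self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        self.proc = None


if __name__ == "__main__":
    # Stand-in child process; a real server would be launched via npx.
    server = SupervisedServer(
        "demo", [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    server.ensure_running()   # would be called on session resume or on a timer
    print("alive:", server.is_alive())
    server.stop()             # would be called when the CLI exits
```

Calling `ensure_running()` on every session switch covers suggestion 2, and calling it on a timer covers suggestion 3. Note that `stop()` only terminates the direct child: with the cmd.exe → npx → node chain on Windows, the grandchildren can still survive, which is why the Job Object in suggestion 1 is still needed.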

Environment

  • Copilot CLI version: 1.0.10
  • OS: Windows 11 (Build 26200)
  • Shell: PowerShell 7.x (Windows Terminal)
  • MCP servers configured: @anthropic-ai/mcp-playwright, @azure-devops/mcp
  • Node.js: v22.x (npx-based MCP server launch)

Impact

High — MCP servers are a core feature for users who rely on external tool integration. The bug makes MCP unreliable for any workflow involving multiple sessions or long-running usage. Users must manually run /restart every time they resume a session, which is undiscoverable and disruptive.
