Skip to content

Fix Process.WaitForExit hang when grandchild process keeps redirected pipe open#125732

Draft
Copilot wants to merge 6 commits intomainfrom
copilot/add-theory-to-process-waiting-tests
Draft

Fix Process.WaitForExit hang when grandchild process keeps redirected pipe open#125732
Copilot wants to merge 6 commits intomainfrom
copilot/add-theory-to-process-waiting-tests

Conversation

Copy link
Contributor

Copilot AI commented Mar 18, 2026

Description

Process.WaitForExit() hangs indefinitely when stdout/stderr is redirected and a grandchild process inherits the pipe handle. The root cause: after the child process exits, AsyncStreamReader blocks forever on EOF.GetAwaiter().GetResult() because the grandchild still holds the write end of the pipe open.

Changes

  • AsyncStreamReader.cs — Added parameterless CancelDueToProcessExit() and CancelDueToProcessExitAsync() methods that use a private StreamDrainDefaultTimeoutMs constant (300ms) internally. These wait up to the default timeout for _readToBufferTask to complete, then cancel via _cts.Cancel() if it hasn't. Removed the now-unused EOF property.

  • Process.Unix.cs / Process.Windows.cs — Replaced _output?.EOF.GetAwaiter().GetResult() with CancelDueToProcessExit(). Now drains streams for all exited processes (not just Timeout.Infinite).

  • Process.cs — Updated WaitForExitAsync to use CancelDueToProcessExitAsync() instead of EOF.WaitAsync(cancellationToken).

  • ProcessWaitingTests.cs — Added [Theory] covering all four WaitForExit overloads: spawns a child with redirected output that uses CreateProcess to create a sleeping grandchild, asserts WaitForExit completes in <2s.

// Before: blocks forever if grandchild holds pipe open
_output?.EOF.GetAwaiter().GetResult();

// After: waits up to default timeout, then cancels the read
_output?.CancelDueToProcessExit();

💬 Send tasks to Copilot coding agent from Slack and Teams to turn conversations into code. Copilot posts an update in your thread when it's finished.

Copilot AI and others added 3 commits March 18, 2026 13:44
Add CancelDueToProcessExit and CancelDueToProcessExitAsync methods to
AsyncStreamReader that wait a timeout for the read task to finish and
cancel it if it doesn't. Replace all EOF.GetAwaiter().GetResult() and
EOF.WaitAsync() calls in WaitForExitCore (Unix/Windows) and
WaitForExitAsync with the new timeout-based methods. Use 300ms default
when no timeout is provided. Add Theory test reproducing the hang
scenario with a grandchild process that sleeps for 8 hours.

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
… grandchild process dispatch

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please address my feedback

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-diagnostics-process
See info in area-owners.md if you want to be subscribed.

@adamsitnik adamsitnik added the NO-REVIEW Experimental/testing PR, do NOT review it label Mar 18, 2026
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 18, 2026

@copilot please address my feedback

Addressed in a0f49f5 — the grandchild process now uses CreateProcess(SleepForEightHours) instead of manually constructing ProcessStartInfo. All 376 tests pass.

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
@adamsitnik
Copy link
Member

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

Copy link
Contributor Author

Copilot AI commented Mar 18, 2026

@copilot please invoke the code-review skill and post your analysis/results as a comment on this PR

🤖 Copilot Code Review — PR #125790

Holistic Assessment

Motivation: The problem is real and well-documented — WaitForExit() hangs indefinitely when a grandchild process inherits the pipe handle. This is a known pain point.

Approach: The fix introduces a timeout-based cancellation mechanism in AsyncStreamReader that breaks the blocking read loop after the child process has exited. The approach is sound — after the process exits, the streams are given a grace period to drain, then forcibly cancelled. This is the correct general pattern for this class of problem.

Summary: ⚠️ Needs Human Review. The code is functionally correct for the stated fix, but introduces broader behavioral changes that a maintainer should evaluate. Key concerns: (1) WaitForExit(int) now additionally waits for streams to drain (it didn't before), potentially doubling the effective timeout; (2) the cancellationToken in WaitForExitAsync is no longer forwarded to the stream drain phase (replaced by a fixed 300ms timeout); (3) the EOF property is now dead code. These are deliberate design choices but warrant explicit maintainer approval.


Detailed Findings

⚠️ Behavioral Change — WaitForExit(int) now waits for stream drain

Previously: WaitForExit(int milliseconds) with a finite timeout did NOT wait for streams at all (both Unix and Windows). The old code had if (exited && milliseconds == Timeout.Infinite) on Unix and if (milliseconds == Timeout.Infinite) on Windows — finite timeouts bypassed stream draining entirely.

Now: When the process exits within the timeout, streams are drained for up to the full milliseconds timeout again. So WaitForExit(5000) could take up to ~10 seconds in the worst case (5s waiting for process exit + 5s waiting for stream drain).

This is a breaking behavioral change — callers who relied on WaitForExit(int) returning promptly after process exit may now experience significantly longer waits. A maintainer should decide whether this is acceptable, or whether the stream drain should use the remaining time from the original timeout (i.e., timeout - elapsed).

Files: Process.Unix.cs:208-213, Process.Windows.cs:179-184

⚠️ Behavioral Change — WaitForExitAsync ignores cancellationToken for stream drain

Previously: WaitForExitAsync passed the cancellationToken to _output.EOF.WaitAsync(cancellationToken), allowing the caller to cancel the stream drain phase.

Now: The cancellationToken is only used for the main process-exit wait. Once the process exits, streams are drained with a fixed 300ms timeout regardless of cancellation. This is likely fine in practice (the process is already dead, and 300ms is short), but it's a subtle change in the cancellation contract. Note: if cancellationToken fires during the main wait, tcs.Task throws TaskCanceledException and the drain code at line 1563 is never reached — so this only affects the post-exit drain phase.

File: Process.cs:1570-1581

✅ Correctness — CancelDueToProcessExit exception handling is correct

Both review models raised concerns about task.GetAwaiter().GetResult() potentially throwing after _cts.Cancel(). This is a false positive: ReadBufferAsync() already catches both OperationCanceledException and IOException internally (lines 105-119), breaks out of the loop, and completes normally. After _cts.Cancel(), the task will complete successfully, not in a faulted state. The GetResult() call is safe.

File: AsyncStreamReader.cs:258-266

✅ Correctness — Windows uses _signaled instead of milliseconds == Timeout.Infinite

The Windows path now correctly checks _signaled (whether the process actually exited) instead of milliseconds == Timeout.Infinite. This is a genuine improvement — previously, the stream drain only happened for infinite waits. Now it happens whenever the process exits, regardless of the timeout mode. This is correct.

File: Process.Windows.cs:179

💡 Suggestion — EOF property is now dead code

The internal Task EOF property (line 256) is no longer used by any caller. All usages have been replaced by CancelDueToProcessExit / CancelDueToProcessExitAsync. Consider removing it to keep the API surface clean.

File: AsyncStreamReader.cs:256

💡 Suggestion — Task.Delay allocation in CancelDueToProcessExitAsync

Task.WhenAny(task, Task.Delay(milliseconds)) creates a new Task + timer allocation on every call. This is acceptable since WaitForExitAsync is not a hot path, but if desired, a CancellationTokenSource with timeout could be used instead: using var timeoutCts = new CancellationTokenSource(milliseconds); await task.WaitAsync(timeoutCts.Token).

File: AsyncStreamReader.cs:273

✅ Test Quality — Test is well-structured

The test creates a realistic scenario (child exits, grandchild keeps pipe open) and validates all four WaitForExit overloads. The 2-second assertion threshold provides enough margin above the 300ms default timeout. The catch (Exception) { } for killing the grandchild is pragmatic — the grandchild may have already been killed by the OS when its parent tree collapses.

File: ProcessWaitingTests.cs:675-729


Models consulted: Claude Sonnet 4.6 (primary), Gemini 2.5 Pro, GPT-4.1

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot address my feedback

if (exited && milliseconds == Timeout.Infinite) // if we have a hard timeout, we cannot wait for the streams
if (exited)
{
_output?.EOF.GetAwaiter().GetResult();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the EOF property, it's no longer used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in dae1fe0.

/// <summary>
/// Default timeout in milliseconds to wait for redirected streams to complete after the process exits.
/// </summary>
private const int StreamDrainDefaultTimeoutMs = 300;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move it to src/libraries/System.Diagnostics.Process/src/System/Diagnostics/AsyncStreamReader.cs, and make both CancelDueToProcessExit and CancelDueToProcessExitAsync parameterless and just always use the default timeout internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in dae1fe0. Moved StreamDrainDefaultTimeoutMs to AsyncStreamReader.cs and made both CancelDueToProcessExit and CancelDueToProcessExitAsync parameterless — all callers now use the default timeout internally.

…thods parameterless, remove EOF

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 18, 2026

@copilot address my feedback

Addressed both items in dae1fe0:

  • Removed the unused EOF property from AsyncStreamReader.cs
  • Moved StreamDrainDefaultTimeoutMs to AsyncStreamReader.cs and made CancelDueToProcessExit/CancelDueToProcessExitAsync parameterless

Copilot AI requested a review from adamsitnik March 18, 2026 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-System.Diagnostics.Process NO-REVIEW Experimental/testing PR, do NOT review it

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants