ChannelClosedException when server close connection right after releasing the last http2 stream #6258

Closed
huajiang-tubi wants to merge 3 commits into aws:master from adRise:fix-closed-channel-exception

@huajiang-tubi huajiang-tubi commented Jul 15, 2025

Motivation and Context

sequenceDiagram
    participant aws AS AWS Service
    participant channel AS Channel Event Loop
    participant pool AS HttpOrHttp2ChannelPool.eventLoop

    aws->>+channel: 1. LastHttpContent
    channel->>+channel: 2. ResponseHandler.finalizeResponse
    channel->>+aws: 3. Http2ResetFrame
    channel->>+channel: 4. HttpOrHttp2ChannelPool.release
    channel->>pool: 5. doInEventLoop(() -> release0(channel, promise))
    deactivate channel
    deactivate channel
    deactivate channel

    aws->>+channel: 6. Close The Connection
    channel->>+channel: 7. Http2ConnectionHandler.channelInactive
    channel->>+channel: 8. MultiplexedChannelRecord.closeAndExecuteOnChildChannels
    deactivate channel
    deactivate channel
    deactivate channel


    pool->>+pool: 9. HttpOrHttp2ChannelPool.release0(channel, promise)
    pool->>+channel: 10. doInEventLoop(() -> MultiplexedChannelRecord.closeAndReleaseChild)
    channel->>+channel: 11. childChannels.remove
    deactivate channel
    channel-->>-pool:
    deactivate pool
  1. The AWS service returns the final part of the response.

  2. ResponseHandler.finalizeResponse is invoked to complete processing of the response.

  3. The client sends an RST_STREAM frame to acknowledge the completion.

  4. The client then calls release on the channel pool. Because the channel pool consists of multiple layers, this call eventually reaches HttpOrHttp2ChannelPool.release.

  5. HttpOrHttp2ChannelPool.release needs to access the protocolImpl field, which is only safely accessed from the pool’s event loop. To ensure thread safety, it submits a task to perform the release within that event loop.

  6. Meanwhile, the server receives the reset frame and immediately closes the connection (I don't know why; that is a question for the service developers).

  7. This triggers channelInactive on the Http2ConnectionHandler.

  8. As a result, MultiplexedChannelRecord.closeAndExecuteOnChildChannels is called. It detects that there are still unreleased child channels because the task submitted in step 5 has not yet been executed. This leads to a ClosedChannelException being thrown and logged as an error.

  9. The release task finally runs on the pool's event loop, but by then it is too late.
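
The ordering in steps 4 to 9 can be replayed with two single-threaded executors standing in for the two event loops. This is only a minimal sketch under stated assumptions: `ReleaseRaceSketch`, the `childChannels` set and the event strings are hypothetical and are not SDK code, and a latch keeps the "pool loop" busy so that the delayed release is deterministic rather than timing-dependent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the race; none of these names are SDK classes. */
final class ReleaseRaceSketch {

    /** Replays steps 4 to 9 and returns the observed events in order. */
    static List<String> replay() {
        ExecutorService channelLoop = Executors.newSingleThreadExecutor();
        ExecutorService poolLoop = Executors.newSingleThreadExecutor();
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        Set<String> childChannels = Collections.synchronizedSet(new HashSet<>());
        childChannels.add("stream-1");
        CountDownLatch poolLoopBusy = new CountDownLatch(1);
        try {
            // Assumption: the pool loop is busy, so the release task queues behind this one.
            poolLoop.execute(() -> await(poolLoopBusy));

            // Steps 4-5: release is requested on the channel loop but only enqueued on the pool loop.
            channelLoop.submit(() -> {
                events.add("release requested");
                poolLoop.execute(() -> {
                    childChannels.remove("stream-1");
                    events.add("release executed");
                });
            }).get();

            // Steps 6-8: the connection goes inactive while the child is still registered.
            channelLoop.submit(() -> {
                if (!childChannels.isEmpty()) {
                    events.add("channelInactive saw unreleased child");
                }
            }).get();

            // Step 9: the pool loop finally gets to the queued release task.
            poolLoopBusy.countDown();
            poolLoop.submit(() -> { }).get();
            return new ArrayList<>(events);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            poolLoopBusy.countDown();
            channelLoop.shutdownNow();
            poolLoop.shutdownNow();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        replay().forEach(System.out::println);
    }
}
```

In this sketch the "channelInactive" check always runs between the release request and its execution, which is the window in which the ClosedChannelException is logged.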

This may also be the root cause of #2914.

Modifications

Declaring protocolImpl as volatile makes the value visible to every thread once it has been assigned within the pool's event loop, so it can be read safely from outside that event loop. The underlying BetterFixedChannelPool is thread-safe because its mutable state is managed exclusively within its dedicated event loop. For release operations, state updates are performed via a future listener after the underlying pool completes the release, which avoids the concurrency issue observed with HttpOrHttp2ChannelPool.
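
The idea can be illustrated with a minimal sketch, which is not the actual diff. `VolatileDelegateSketch`, its `Delegate` interface and the returned strings are hypothetical. The assumptions are that the field is written only on the event loop and that a caller on any thread may release directly once it observes a non-null value, falling back to the event-loop hop otherwise.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; this is not the SDK's HttpOrHttp2ChannelPool. */
final class VolatileDelegateSketch {

    /** Stand-in for the protocol-specific pool that the field points at. */
    interface Delegate {
        String release(String channel);
    }

    private final ExecutorService eventLoop = Executors.newSingleThreadExecutor();

    // Written only on the event loop; volatile publishes the write to other threads.
    private volatile Delegate protocolImpl;

    /** Assigns the delegate on the event loop and waits for the assignment. */
    void initialize() {
        try {
            eventLoop.submit(() -> {
                protocolImpl = channel -> "released " + channel;
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Releases directly when the delegate is visible, otherwise defers to the event loop. */
    String release(String channel) {
        Delegate impl = protocolImpl; // single volatile read, then use the local copy
        if (impl != null) {
            return "direct: " + impl.release(channel);
        }
        // Delegate not assigned yet: keep the original hop onto the event loop.
        eventLoop.execute(() -> { /* the deferred release would run here */ });
        return "deferred: " + channel;
    }

    void close() {
        eventLoop.shutdown();
    }

    public static void main(String[] args) {
        VolatileDelegateSketch pool = new VolatileDelegateSketch();
        System.out.println(pool.release("stream-1"));
        pool.initialize();
        System.out.println(pool.release("stream-1"));
        pool.close();
    }
}
```

Reading the field once into a local variable keeps the null check and the call consistent even if another thread assigns the field in between.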

Tests

Reproducing the issue consistently in a test environment is challenging, as it relies on the precise timing and order of event handler invocations.

We’ve been running the fix without the volatile keyword on protocolImpl in one of our production services for some time. It has been working well, as evidenced by a noticeable drop in error logs:
Screenshot 2025-07-16 at 21 15 01

The deployment with the volatile keyword added is now live. I’ll share an update on the results shortly.

Update:
Screenshot 2025-07-20 at 08 53 44

@huajiang-tubi huajiang-tubi requested a review from a team as a code owner July 15, 2025 06:26
@huajiang-tubi huajiang-tubi changed the title from "ChannelClosedException when server close connection right after the last http2 stream" to "ChannelClosedException when server close connection right after releasing the last http2 stream" Jul 15, 2025
dagnir (Contributor) commented Feb 11, 2026

Hi @huajiang-tubi! Thank you for the PR, and apologies for the delayed response.

I am a bit confused by the description of the issue, though. In your timing diagram, step 3 has the client sending an RST_STREAM to the service. However, the client only does this if the Subscriber cancels the Subscription

// For HTTP2 we send a RST_STREAM frame on cancel to stop the service from sending more data
if (ChannelAttributeKey.getProtocolNow(channelContext.channel()) == Protocol.HTTP2) {
    return new Http2ResetSendingSubscription(channelContext, subscription);
} else {
, which signals that the client does not want to read the rest of the response content; this does not normally happen. Likewise, the service doesn't normally send an RST_STREAM if the request is handled successfully (which seems to be the case in your timing diagram, given the LastHttpContent).

Are you perhaps describing two different scenarios, i.e. either the server or the client sends the RST_STREAM?
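
To make the cancel path concrete, here is a minimal sketch of a subscription wrapper that emits a reset only when cancel() is called. It is an illustration, not the SDK's Http2ResetSendingSubscription: it uses `java.util.concurrent.Flow` instead of Reactive Streams, `ResetOnCancelSketch` and the log strings are hypothetical, and the order of "reset, then cancel upstream" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

/** Hypothetical illustration of "send a reset only on cancel". */
final class ResetOnCancelSketch {

    /** Wraps a Subscription and runs a reset action before forwarding cancel(). */
    static final class ResetSendingSubscription implements Flow.Subscription {
        private final Flow.Subscription delegate;
        private final Runnable sendReset;

        ResetSendingSubscription(Flow.Subscription delegate, Runnable sendReset) {
            this.delegate = delegate;
            this.sendReset = sendReset;
        }

        @Override
        public void request(long n) {
            delegate.request(n); // normal demand never triggers a reset
        }

        @Override
        public void cancel() {
            sendReset.run();     // assumption: reset is written first
            delegate.cancel();
        }
    }

    /** Returns the actions recorded when a subscriber reads and optionally cancels early. */
    static List<String> demo(boolean cancelEarly) {
        List<String> log = new ArrayList<>();
        Flow.Subscription upstream = new Flow.Subscription() {
            @Override
            public void request(long n) {
                log.add("request " + n);
            }

            @Override
            public void cancel() {
                log.add("upstream cancelled");
            }
        };
        Flow.Subscription subscription =
                new ResetSendingSubscription(upstream, () -> log.add("RST_STREAM sent"));
        subscription.request(1);
        if (cancelEarly) {
            subscription.cancel();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

A subscriber that reads the response to completion without cancelling never reaches the reset branch in this sketch.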

@bhoradc bhoradc added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Feb 12, 2026
@github-actions

It looks like this PR has not been active for more than five days. In the absence of more information, we will be closing this PR soon. Please add a comment to prevent automatic closure, or if the PR is already closed please feel free to open a new one.

@github-actions github-actions Bot added closing-soon This issue will close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will close in 4 days unless further comments are made. labels Feb 21, 2026
@github-actions github-actions Bot closed this Feb 25, 2026