Fix concurrent sendMessage race in StdioClientTransport by AbbasNS · Pull Request #876 · modelcontextprotocol/java-sdk

AbbasNS · 2026-03-20T14:12:36Z

Problem

StdioClientTransport.sendMessage() uses tryEmitNext() on a unicast Reactor sink (Sinks.many().unicast().onBackpressureBuffer()). When two threads call sendMessage() concurrently on the same transport, one fails with "Failed to enqueue message".

Root cause

Reactor wraps the unicast sink in SinkManySerialized, which uses a CAS-based guard in tryEmitNext(). When two threads race on the CAS, the loser immediately gets FAIL_NON_SERIALIZED — the method does not retry. The current code checks only .isSuccess() and returns Mono.error for any failure, discarding the retryable FAIL_NON_SERIALIZED result.

The existing TODO at line 230-235 acknowledges this:

"we delegate the retry and the backpressure onto the caller. This might be enough for most cases."

It is not enough when an MCP server proxies concurrent tool calls to a downstream server via stdio — each call dispatches on a separate reactor thread and both call sendMessage() on the same transport.

Fix

Replace tryEmitNext with emitNext + Sinks.EmitFailureHandler.busyLooping(Duration.ofMillis(100)).

emitNext with busyLooping spin-retries on FAIL_NON_SERIALIZED until the competing thread finishes its CAS (microseconds in practice). The 100ms duration is just a generous upper bound for pathological cases like GC pauses. This is the approach recommended by Reactor's own emitNext Javadoc:

"It would be possible for an EmitFailureHandler to busy-loop and optimistically wait for the contention to disappear"

For non-retryable failures (FAIL_TERMINATED, FAIL_CANCELLED), emitNext throws EmissionException which is caught and converted to Mono.error, preserving the existing error behavior.

Test

StdioClientTransportConcurrencyTest reproduces the race with two threads sending through the same transport concurrently (20 repetitions):

Before fix: 19/20 fail
After fix: 20/20 pass

Closes #875

When two threads call sendMessage() concurrently on the same StdioClientTransport, the unicast sink's SinkManySerialized wrapper returns FAIL_NON_SERIALIZED via its CAS guard, causing "Failed to enqueue message". This test reproduces the race: 19/20 repetitions fail. Closes modelcontextprotocol#875

Replace tryEmitNext (fail-fast) with emitNext + busyLooping(100ms) in StdioClientTransport.sendMessage(). The unicast sink's SinkManySerialized wrapper returns FAIL_NON_SERIALIZED when two threads call tryEmitNext concurrently. busyLooping retries the CAS spin instead of immediately failing, making concurrent sends safe. The contention window is microseconds (single CAS operation), so the 100ms duration is just a generous upper bound for pathological cases like GC pauses. Before: 19/20 test repetitions fail After: 20/20 pass Closes modelcontextprotocol#875

AbbasNS added 2 commits March 20, 2026 15:11

AbbasNS mentioned this pull request Mar 20, 2026

MCP-374 Fix concurrent sendMessage race in ManagedStdioClientTransport SonarSource/sonarqube-mcp-server#290

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix concurrent sendMessage race in StdioClientTransport#876

Fix concurrent sendMessage race in StdioClientTransport#876
AbbasNS wants to merge 2 commits intomodelcontextprotocol:mainfrom
AbbasNS:as/fix-concurrent-sendmessage-race

AbbasNS commented Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AbbasNS commented Mar 20, 2026

Problem

Root cause

Fix

Test

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant