feat: add request cancellation and cleanup #167

dsp-ant · 2025-01-23T21:13:18Z

Summary

Add support for request cancellation via notifications
Implement periodic cleanup of in-flight requests
Track request state throughout lifecycle
Fixes Random error thrown on response #88

Test plan

Test request cancellation
Verify cleanup of completed requests
Test concurrent request handling

Depends on #166

jspahrsummers · 2025-01-27T16:12:57Z

src/mcp/shared/session.py

+    async def _cleanup_loop(self) -> None:
+        """Periodically clean up completed and cancelled requests."""
+        while True:
+            with anyio.move_on_after(self._cleanup_interval):
+                # Clean up completed requests
+                self._in_flight = {
+                    req_id: responder
+                    for req_id, responder in self._in_flight.items()
+                    if responder.in_flight
+                }
+                await anyio.sleep(self._cleanup_interval)


Why isn't this done on completion or cancellation instead? The idea of using a timer for this seems weird to me, and leaves a window (the length of the interval) of possibly unbounded memory growth.

https://anyio.readthedocs.io/en/stable/cancellation.html#finalization

jspahrsummers · 2025-01-27T16:14:23Z

src/mcp/shared/session.py

+                # Clean up completed requests
+                self._in_flight = {
+                    req_id: responder
+                    for req_id, responder in self._in_flight.items()
+                    if responder.in_flight
+                }


OTOH, this seems excessive to do on every new incoming message. Why can't we tie it to completion and cancellation?

EduPonz · 2025-01-29T15:36:55Z

This doesn't seem to solve #88 for me. I can reproduce triggering a call tool from claude with a call tool handle that simply waits for 200 s and returns a valid message. Claude receives the response but the server crashes

@server.call_tool()
async def handle_call_tool(
    name: str, arguments: dict | None,
) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
    await asyncio.sleep(200)
    response = {"msg": "success"}
    return [types.TextContent(type="text", text=f"{response}")]

This commit adds support for request cancellation and tracking of in-flight requests in the MCP protocol implementation. The key architectural changes are: 1. Request Lifecycle Management: - Added _in_flight dictionary to BaseSession to track active requests - Requests are tracked from receipt until completion/cancellation - Added proper cleanup via on_complete callback 2. Cancellation Support: - Added CancelledNotification handling in _receive_loop - Implemented cancel() method in RequestResponder - Uses anyio.CancelScope for robust cancellation - Sends error response on cancellation 3. Request Context: - Added request_ctx ContextVar for request context - Ensures proper cleanup after request handling - Maintains request state throughout lifecycle 4. Error Handling: - Improved error propagation for cancelled requests - Added proper cleanup of cancelled requests - Maintains consistency of in-flight tracking This change enables clients to cancel long-running requests and servers to properly clean up resources when requests are cancelled. Github-Issue:#88

EduPonz · 2025-02-04T06:17:13Z

Hi @dsp-ant, unfortunately, this still fails with the snippet I posted when using Claude. It seems that the important bit of the trace is this:

| pydantic_core._pydantic_core.ValidationError: 6 validation errors for ClientNotification
| CancelledNotification.method
|   Input should be 'notifications/cancelled' [type=literal_error, input_value='cancelled', input_type=str]
|     For further information visit https://errors.pydantic.dev/2.10/v/literal_error
| ProgressNotification.method
|   Input should be 'notifications/progress' [type=literal_error, input_value='cancelled', input_type=str]
|     For further information visit https://errors.pydantic.dev/2.10/v/literal_error
| ProgressNotification.params.progressToken
|   Field required [type=missing, input_value={'requestId': 19, 'reason... -2: Request timed out'}, input_type=dict]
|     For further information visit https://errors.pydantic.dev/2.10/v/missing
| ProgressNotification.params.progress
|   Field required [type=missing, input_value={'requestId': 19, 'reason... -2: Request timed out'}, input_type=dict]
|     For further information visit https://errors.pydantic.dev/2.10/v/missing
| InitializedNotification.method
|   Input should be 'notifications/initialized' [type=literal_error, input_value='cancelled', input_type=str]
|     For further information visit https://errors.pydantic.dev/2.10/v/literal_error
| RootsListChangedNotification.method
|   Input should be 'notifications/roots/list_changed' [type=literal_error, input_value='cancelled', input_type=str]
|     For further information visit https://errors.pydantic.dev/2.10/v/literal_error

From that is seems that the client (Claude) is sending the cancelled notification wrong (if that makes sense), sending cancelled instead of notifications/cancelled.

If the problem is indeed triggered from the client side (you'd need to clarify this to me, as I'm very much not an expert on MCP in general or this implementation in particular), then crashing the server does not seem the way to go, meaning the exception should be handled by the SDK. If on the other hand the problem has a server-side origin, we'd need to find were it comes from. I'm happy to test any advancements you have, I'm sorry I cannot be of more help ATM.

EduPonz · 2025-02-04T14:31:46Z

It seems that a73fc7d does the trick at least for me, thanks @dsp-ant!!

dsp-ant · 2025-02-04T14:36:04Z

@EduPonz Yes. After fixing actual cancellation and thanks to your perfect repro, I was able to pinpoint the problem. It is caused by validation failing for cancellation messages from Claude Desktop, since Claude Desktop relies on a very old SDK verseion that sends wrong cancellation. This showed that the SDK isnt handling the error gracefuly.

jspahrsummers

Broadly looks good. Do we need to update everything to require using RequestResponder as a context manager? e.g., MCP client code, docs, …?

And should it be an error to use RequestResponder without entering the context? We could track this with a flag.

jspahrsummers · 2025-02-04T14:43:24Z

tests/issues/test_88_random_error.py

+            # Long enough to ensure timeout
+            await anyio.sleep(0.2)


I'm quite allergic to actually sleeping in tests, due to the flakiness and slowness. Is there any other way we can exercise these? Would a timeout of 0 work? Or a timeout in the past vs. a timeout in the far future?

Let me fix this. This was Claude written. After a lot of back and forth, the right way is to use events and wait for them to trigger.

EduPonz · 2025-02-04T14:48:33Z

@EduPonz Yes. After fixing actual cancellation and thanks to your perfect repro, I was able to pinpoint the problem. It is caused by validation failing for cancellation messages from Claude Desktop, since Claude Desktop relies on a very old SDK verseion that sends wrong cancellation. This showed that the SDK isnt handling the error gracefuly.

Do you know were they keep their code or some place to open a ticket? I was looking for it but did not find much. It's strange that such a popular client is not up to spec...

dsp-ant linked an issue Jan 23, 2025 that may be closed by this pull request

Random error thrown on response #88

Closed

dsp-ant force-pushed the davidsp/88-v2 branch from 0c6ec6d to d50255e Compare January 24, 2025 09:52

dsp-ant force-pushed the davidsp/refactor branch from 0d7be0a to 0a93799 Compare January 24, 2025 09:52

Base automatically changed from davidsp/refactor to main January 24, 2025 09:54

dsp-ant requested review from jerome3o-anthropic, nick-merrill and jspahrsummers January 24, 2025 10:36

jspahrsummers requested changes Jan 27, 2025

View reviewed changes

dsp-ant added 2 commits February 3, 2025 14:03

tests for issue 88

888bdd3

dsp-ant force-pushed the davidsp/88-v2 branch from d50255e to 827e494 Compare February 3, 2025 20:57

dsp-ant requested a review from jspahrsummers February 3, 2025 20:58

jspahrsummers previously approved these changes Feb 4, 2025

View reviewed changes

fix: improve error handling and request cancellation for issue #88

08cfbe5

dsp-ant dismissed jspahrsummers’s stale review via 08cfbe5 February 4, 2025 18:20

dsp-ant force-pushed the davidsp/88-v2 branch from a73fc7d to 08cfbe5 Compare February 4, 2025 18:20

fix: enforce context manager usage for RequestResponder

733db0c

dsp-ant force-pushed the davidsp/88-v2 branch from b32e455 to 733db0c Compare February 4, 2025 20:13

dsp-ant merged commit c58adfe into main Feb 4, 2025
5 checks passed

dsp-ant deleted the davidsp/88-v2 branch February 4, 2025 20:23

donghao1393 mentioned this pull request Feb 5, 2025

fix: handle internal notifications during session cleanup #85

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add request cancellation and cleanup #167

feat: add request cancellation and cleanup #167

dsp-ant commented Jan 23, 2025 •

edited

Loading

jspahrsummers Jan 27, 2025

jspahrsummers Jan 27, 2025

jspahrsummers Jan 27, 2025

EduPonz commented Jan 29, 2025 •

edited

Loading

EduPonz commented Feb 4, 2025

EduPonz commented Feb 4, 2025

dsp-ant commented Feb 4, 2025

jspahrsummers left a comment

jspahrsummers Feb 4, 2025

dsp-ant Feb 4, 2025

EduPonz commented Feb 4, 2025

feat: add request cancellation and cleanup #167

feat: add request cancellation and cleanup #167

Conversation

dsp-ant commented Jan 23, 2025 • edited Loading

Summary

Test plan

jspahrsummers Jan 27, 2025

Choose a reason for hiding this comment

jspahrsummers Jan 27, 2025

Choose a reason for hiding this comment

jspahrsummers Jan 27, 2025

Choose a reason for hiding this comment

EduPonz commented Jan 29, 2025 • edited Loading

EduPonz commented Feb 4, 2025

EduPonz commented Feb 4, 2025

dsp-ant commented Feb 4, 2025

jspahrsummers left a comment

Choose a reason for hiding this comment

jspahrsummers Feb 4, 2025

Choose a reason for hiding this comment

dsp-ant Feb 4, 2025

Choose a reason for hiding this comment

EduPonz commented Feb 4, 2025

dsp-ant commented Jan 23, 2025 •

edited

Loading

EduPonz commented Jan 29, 2025 •

edited

Loading