@yannbonzom yannbonzom commented Nov 15, 2025

Adds an opt-in “code-mode” wrapper that fronts multiple downstream MCP servers via four meta-tools: three for discovery (list_mcp_servers, list_tool_names, get_tool_implementation) plus call_tool for invocation. Agents progressively disclose server → tool → schema, so they only load what they need.
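
To make the server → tool → schema flow concrete, here is a minimal in-memory sketch. It is not the wrapper's actual implementation: the registry, the `open_page` tool, its schema, and the camelCase function names are hypothetical stand-ins for the four meta-tools, and a real wrapper would forward calls to downstream MCP servers instead of local handlers.

```typescript
// Hypothetical in-memory stand-in for downstream MCP servers (illustration only).
interface ToolEntry {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => string;
}

interface ServerEntry {
  id: string;
  description: string;
  tools: ToolEntry[];
}

const servers: ServerEntry[] = [
  {
    id: "playwright",
    description: "Browser automation via Playwright",
    tools: [
      {
        name: "open_page", // illustrative tool name, not taken from Playwright MCP
        description: "Open a URL in the browser",
        inputSchema: {
          type: "object",
          properties: { url: { type: "string" } },
          required: ["url"],
        },
        handler: (args) => `opened ${String(args["url"])}`,
      },
    ],
  },
];

function findServer(serverId: string): ServerEntry {
  const server = servers.find((s) => s.id === serverId);
  if (!server) throw new Error(`Unknown server: ${serverId}`);
  return server;
}

function findTool(serverId: string, toolName: string): ToolEntry {
  const tool = findServer(serverId).tools.find((t) => t.name === toolName);
  if (!tool) throw new Error(`Unknown tool: ${serverId}/${toolName}`);
  return tool;
}

// Level 1: server summaries only.
function listMcpServers(): { id: string; description: string }[] {
  return servers.map(({ id, description }) => ({ id, description }));
}

// Level 2: tool names and short descriptions for one server, no schemas.
function listToolNames(serverId: string): { name: string; description: string }[] {
  return findServer(serverId).tools.map(({ name, description }) => ({ name, description }));
}

// Level 3: the only call that returns a full input schema.
function getToolImplementation(serverId: string, toolName: string) {
  const { name, description, inputSchema } = findTool(serverId, toolName);
  return { name, description, inputSchema };
}

// Invocation, once the agent knows the schema.
function callTool(serverId: string, toolName: string, args: Record<string, unknown>): string {
  return findTool(serverId, toolName).handler(args);
}

console.log(listMcpServers());
console.log(listToolNames("playwright"));
console.log(getToolImplementation("playwright", "open_page"));
console.log(callTool("playwright", "open_page", { url: "https://example.com" }));
```

Only the third call carries a schema, so the cost of a server's full tool definitions is paid per tool actually inspected rather than up front.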

Motivation and Context

Anthropic's recent article on code execution reports massive token savings over calling MCP servers directly. I was curious whether we could emulate that filesystem-based exploration (listing directories and reading files to execute scripts) with MCP meta-tools: one config stores many MCP servers, and the agent can list servers, list tools, and call tools.

With the wrapper in place, linking Playwright through the wrapper drops a simple “open a site” run from 17 k tokens to 13.7 k. Given Cursor’s 11.1 k base prompt, that is 5.9 k tokens of overhead without the wrapper versus 2.6 k with it: a 56 % reduction (about 3.3 k tokens) while still keeping all downstream tools available. The LLM does spend a couple of extra tool calls on discovery, but the workflow mirrors “explore files → read file,” so even with many servers the prompt stays slim.
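
The arithmetic behind those figures, written out as a small check. The variable names are mine; the numbers are the ones quoted above, in thousands of tokens.

```typescript
// Figures quoted in the PR description, in thousands of tokens.
const basePromptK = 11.1; // Cursor's base prompt
const withoutWrapperK = 17.0; // run with Playwright MCP linked directly
const withWrapperK = 13.7; // same run through the code-mode wrapper

// Overhead is whatever a run costs beyond the base prompt.
const overheadWithoutK = withoutWrapperK - basePromptK;
const overheadWithK = withWrapperK - basePromptK;

// Savings are measured against the overhead, not against the whole prompt.
const savedK = overheadWithoutK - overheadWithK;
const reductionPct = (savedK / overheadWithoutK) * 100;

console.log(`overhead without wrapper: ${overheadWithoutK.toFixed(1)} k`);
console.log(`overhead with wrapper: ${overheadWithK.toFixed(1)} k`);
console.log(`saved: ${savedK.toFixed(1)} k (${reductionPct.toFixed(0)} %)`);
```

Note that the 56 % applies to the MCP overhead only; measured against the full 17 k prompt the same 3.3 k saving is roughly 19 %.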

Please note: I am opening this PR to hear what you all think of this sort of setup. I did this fairly quickly, so there are likely cleaner approaches to accomplishing it. I would love to hear your thoughts and hunches, and whether building this out further might be worthwhile! Thanks for your thoughts :).

How Has This Been Tested?

  • npm run typecheck
  • npm run test
  • Manual end-to-end test in Cursor with the wrapper configured via code-config.mcp-servers.json, Playwright MCP linked to the local SDK, and the client invoking list_mcp_servers → list_tool_names(serverId=playwright) → get_tool_implementation → call_tool.

This is how I set it up:
In ~/.cursor/code-config.mcp-servers.json:

{
    "downstreams": [
        {
            "id": "playwright",
            "description": "Browser automation via Playwright",
            "command": "node",
            "args": ["/Users/yannbonzom/Desktop/projects/playwright-mcp/cli.js", "--headless", "--browser=chromium"]
        }
    ]
}
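
For reference, the shape of that file can be expressed as a TypeScript type with a small validator. This is a sketch inferred from the sample above, not code from the PR: `parseCodeModeConfig` is a hypothetical helper, and treating all four fields as required and ids as unique are my assumptions.

```typescript
// Shape inferred from the sample config; not taken from the PR's source.
interface DownstreamServer {
  id: string;
  description: string;
  command: string;
  args: string[];
}

interface CodeModeConfig {
  downstreams: DownstreamServer[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Hypothetical helper: parses the JSON text and rejects malformed entries.
function parseCodeModeConfig(raw: string): CodeModeConfig {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Config must be a JSON object");
  }
  const downstreams = (parsed as { downstreams?: unknown }).downstreams;
  if (!Array.isArray(downstreams)) {
    throw new Error('Config must contain a "downstreams" array');
  }

  const seenIds = new Set<string>();
  return {
    downstreams: downstreams.map((entry: unknown, index: number) => {
      const fields = (entry ?? {}) as Record<string, unknown>;
      const id = fields["id"];
      const description = fields["description"];
      const command = fields["command"];
      const args = fields["args"];
      if (
        typeof id !== "string" ||
        typeof description !== "string" ||
        typeof command !== "string" ||
        !isStringArray(args)
      ) {
        throw new Error(`downstreams[${index}] is missing a required field`);
      }
      // Assumption: ids must be unique, since serverId is used to address a server.
      if (seenIds.has(id)) {
        throw new Error(`Duplicate downstream id: ${id}`);
      }
      seenIds.add(id);
      return { id, description, command, args };
    }),
  };
}

// Usage with a shortened, illustrative version of the sample config.
const sampleConfig = parseCodeModeConfig(
  JSON.stringify({
    downstreams: [
      {
        id: "playwright",
        description: "Browser automation via Playwright",
        command: "node",
        args: ["cli.js", "--headless", "--browser=chromium"],
      },
    ],
  }),
);
console.log(sampleConfig.downstreams.map((server) => server.id));
```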

In ~/.cursor/mcp.json:

{
  "mcpServers": {
    "code-mode-mcp-servers": {
      "command": "npm",
      "args": [
        "--prefix",
        "/Users/yannbonzom/Desktop/projects/typescript-sdk",
        "run",
        "code-mode",
        "--",
        "--config",
        "/Users/yannbonzom/.cursor/code-config.mcp-servers.json"
      ]
    }
  }
}

Breaking Changes

No breaking changes. The wrapper is opt-in and runs as its own CLI (npm run code-mode -- --config ...). Existing SDK usage is untouched.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • The config file is named code-config.mcp-servers.json to underline that it lists multiple downstream MCP servers.
  • Meta-tools are hierarchical to keep token usage predictable: list_mcp_servers returns server summaries; list_tool_names requires serverId and returns just name + short description; get_tool_implementation is the only call that returns full schemas/stubs.
  • This gives agents a code-execution-like exploration experience without needing a real filesystem view, and lets users keep many MCP servers active simultaneously while still saving tokens. I realize there's the risk of too-many-MCP-servers (similar to the too-many-tools problem causing agents to get confused), so it will need experimentation to see if we encounter similar problems. My hunch is that it's a lot easier for an agent to discern across, say, [notion, playwright, jira] than having many similar-sounding tool names.
  • Sample chat to show how it's able to dynamically retrieve what it needs:
Screenshot 2025-11-15 at 5 21 20 PM
  • Sample chat showing how it explores its tools first and then decides which one to use:
Screenshot 2025-11-15 at 5 25 21 PM

pkg-pr-new bot commented Nov 15, 2025

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/sdk@1118

commit: 9d8a259

@yannbonzom yannbonzom marked this pull request as ready for review November 15, 2025 21:56
@yannbonzom yannbonzom requested a review from a team as a code owner November 15, 2025 21:56
@mattzcarey (Contributor) commented

Love this! Will leave it to the other team to decide whether it lives here yet. Codemode is awesome :)
