[FEATURE] Code Sandbox for Tool Execution #1676

@dbschmigelski

Description

Priority: Medium

Problem Statement

Even with semantic filtering, some agents need access to many capabilities. Exposing each capability as a separate tool adds context overhead and degrades tool selection accuracy. Published work on Cloudflare's Code Mode and Anthropic's code execution approach suggests that LLMs are better at writing code that calls an API than at selecting from large tool sets.

Proposed Solution

Provide a CodeSandbox interface that:

  • Accepts tool definitions and exposes them as a callable typed SDK within the sandbox
  • Executes agent-generated code in isolation
  • Returns results to the agent
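The three responsibilities above could be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the names `CodeSandbox`, `ExecutionResult`, `execute`, and the convention that generated code assigns its answer to `result` are all assumptions made for the example. Note that `exec()` with a trimmed namespace is not real isolation; an actual sandbox would need a separate process, container, or interpreter.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional


@dataclass
class ExecutionResult:
    """Outcome of one sandboxed run (hypothetical shape)."""
    value: Any = None
    error: Optional[str] = None


class CodeSandbox:
    """Illustrative only: exec() with a trimmed namespace is NOT real isolation."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]) -> None:
        # Expose the registered tools as attributes of a single `tools` object.
        self._sdk = SimpleNamespace(**tools)

    def execute(self, code: str) -> ExecutionResult:
        # The generated code sees only `tools` and an empty builtins table.
        namespace: Dict[str, Any] = {"__builtins__": {}, "tools": self._sdk}
        try:
            exec(code, namespace)
        except Exception as exc:
            # Report the failure to the agent instead of raising.
            return ExecutionResult(error=f"{type(exc).__name__}: {exc}")
        # Assumed convention: the snippet assigns its answer to `result`.
        return ExecutionResult(value=namespace.get("result"))


def read_file(path: str) -> str:
    """Stand-in tool; returns canned text instead of touching the disk."""
    return f"contents of {path}"


def word_count(text: str) -> int:
    """Stand-in tool; counts whitespace-separated words."""
    return len(text.split())


sandbox = CodeSandbox({"read_file": read_file, "word_count": word_count})
outcome = sandbox.execute("result = tools.word_count(tools.read_file('README.md'))")
```

Here two tools are composed in a single execution, and only the final value travels back to the agent.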

A single execute_code tool replaces dozens of individual tools. The agent accomplishes work in one code execution call instead of chaining N tool calls. Fewer round-trips mean faster results and less opportunity to drift off track.
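With only one tool in the tool list, the model still needs to know what it can call inside the sandbox. One possible approach, sketched here under assumptions, is to render the registered tools as typed stub signatures and embed that text in the execute_code tool description. The helper `render_sdk_stub` and the example tools `git_log` and `http_get` are hypothetical names used only for illustration.

```python
import inspect
from typing import Callable, Iterable


def render_sdk_stub(tools: Iterable[Callable]) -> str:
    """Render one signature line plus the docstring summary for each tool."""
    lines = []
    for fn in tools:
        doc = (inspect.getdoc(fn) or "").splitlines()
        summary = doc[0] if doc else ""
        lines.append(f"def {fn.__name__}{inspect.signature(fn)}: ...  # {summary}")
    return "\n".join(lines)


def git_log(limit: int = 10) -> list:
    """Return recent commit subjects."""
    return []  # placeholder body; a real tool would query the repository


def http_get(url: str) -> str:
    """Fetch a URL and return the body."""
    return ""  # placeholder body; a real tool would perform the request


# Text that could be placed in the description of the single execute_code tool.
stub = render_sdk_stub([git_log, http_get])
```

The stub costs roughly one line per tool, which is typically much smaller than a full JSON schema per tool.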

Security boundary: the sandbox exposes only the tool SDK, not raw system access. Tools themselves define the capability boundary.
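As one illustration of this boundary, generated code could be statically checked before it runs, so that it only references the exposed SDK object. The sketch below assumes the SDK is exposed under the name `tools`; `check_boundary`, `BoundaryViolation`, and `is_allowed` are hypothetical names. A check like this is deliberately strict and is a complement to process-level or container-level isolation, not a substitute for it.

```python
import ast


class BoundaryViolation(Exception):
    """Raised when generated code reaches outside the tool SDK (hypothetical)."""


def check_boundary(code: str, allowed_names: set) -> None:
    """Reject imports, dunder access, and names neither exposed nor assigned in the snippet."""
    tree = ast.parse(code)
    # Names the snippet assigns itself (for example `result`) are permitted.
    assigned = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise BoundaryViolation("imports are not allowed")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise BoundaryViolation(f"dunder attribute access: {node.attr}")
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in allowed_names and node.id not in assigned:
                raise BoundaryViolation(f"name not exposed: {node.id}")


def is_allowed(code: str) -> bool:
    """Return True only if the snippet parses and stays inside the boundary."""
    try:
        check_boundary(code, allowed_names={"tools"})
    except (BoundaryViolation, SyntaxError):
        return False
    return True
```

Under this check, a snippet such as `result = tools.read_file('a.txt')` passes, while `import os` or a bare `open(...)` call is rejected before execution.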

Use Case

  • Developer assistants needing file operations, git commands, HTTP requests, JSON parsing, etc.
  • Reducing context overhead from irrelevant tool definitions
  • Improving tool selection accuracy by narrowing the active set

Additional Context

Part of the Context Management epic, Track 2: Tool Context.

Metadata

Labels: enhancement (New feature or request)
