ToolScope solves a core scalability problem in tool-using agents: as the number of tools grows, LLMs get worse at selecting the right one.
Beyond degraded accuracy and reliability, stuffing a large tool catalog into the prompt consumes context budget and bloats the prompt. This is especially evident with small models.
ToolScope addresses this by filtering tools per prompt using semantic retrieval, the same way RAG filters context for text.
ToolScope does not change how the model interacts with tools, introduces no meta-tools, and creates no framework lock-in.
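The core idea can be sketched in plain Python. The toy embedder and helper names below are illustrative only, not ToolScope's API: embed the query and each tool description, rank by cosine similarity, keep the top-k.

```python
import math

def toy_embed(text):
    """Stand-in for a real embedding model: bag of character trigrams."""
    vec = {}
    t = text.lower()
    for i in range(len(t) - 2):
        tri = t[i:i + 3]
        vec[tri] = vec.get(tri, 0) + 1
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_tools(query, tools, k):
    """Rank tools by similarity between the query and each description; keep top-k."""
    q = toy_embed(query)
    ranked = sorted(tools, key=lambda t: cosine(q, toy_embed(t["description"])), reverse=True)
    return ranked[:k]

tools = [
    {"name": "jira_create_issue", "description": "Create a Jira issue"},
    {"name": "confluence_search", "description": "Search Confluence pages"},
    {"name": "weather_lookup", "description": "Get the current weather"},
]

print(retrieve_tools("Create a Jira ticket", tools, k=1))
```

A real deployment swaps `toy_embed` for an actual embedding model; the ranking logic stays the same.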
ToolScope is for you if:
- you have >20 tools, coming from MCP servers or registries
- you want to keep using standard agent frameworks
- you want predictable, debuggable behavior
- you don't want meta-tools
```shell
pip install toolscope
```

or from source:

```shell
pip install -e .
```

```python
import toolscope

class TinyEmbedder:
    def embed_texts(self, texts):
        return [[len(t) % 97, t.count("jira"), sum(map(ord, t)) % 101] for t in texts]

tools = [
    {"name": "jira_create_issue", "description": "Create a Jira issue", "inputSchema": {}},
    {"name": "confluence_search", "description": "Search Confluence pages", "inputSchema": {}},
]

filtered = toolscope.filter(
    messages=[{"role": "user", "content": "Create a Jira ticket"}],
    tools=tools,
    embedder=TinyEmbedder(),
    k=1,
)

print(filtered)  # same tools, fewer of them
```

or, with an embedding configuration (requires sentence-transformers):
```python
import toolscope

embedding_config = toolscope.EmbeddingConfig(
    provider="sentence-transformers",
    model="sentence-transformers/all-MiniLM-L6-v2",
    allow_download=False,
)

tools = [
    {"name": "jira_create_issue", "description": "Create a Jira issue", "inputSchema": {}},
    {"name": "confluence_search", "description": "Search Confluence pages", "inputSchema": {}},
]

filtered = toolscope.filter(
    messages=[{"role": "user", "content": "Create a Jira ticket"}],
    tools=tools,
    embedding=embedding_config,
    k=1,
)

print(filtered)  # same tools, fewer of them
```

You can also build a reusable index:

```python
idx = toolscope.index(
    tools,
    embedder=TinyEmbedder(),
)

filtered = idx.filter(
    messages="Create a Jira ticket",
    k=1,
)
```

Also check out the usage examples for LangChain, LangGraph and FastMCP.
ToolScope normalizes tools from many schemas into a canonical form:
- name
- description
- input schema
- tags
- fingerprint
Original tool objects are preserved and returned unchanged.
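To illustrate what those canonical fields amount to, here is a hypothetical sketch of a normalizer (not ToolScope's implementation) that maps an MCP-style tool and an OpenAI-style tool to the same canonical shape, with a content fingerprint for change detection:

```python
import hashlib
import json

def normalize(tool):
    """Sketch: map an MCP-style or OpenAI-style tool into one canonical shape."""
    if "function" in tool:  # OpenAI-style: {"type": "function", "function": {...}}
        fn = tool["function"]
        name, desc, schema = fn["name"], fn.get("description", ""), fn.get("parameters", {})
    else:  # MCP-style: flat {"name", "description", "inputSchema"}
        name, desc, schema = tool["name"], tool.get("description", ""), tool.get("inputSchema", {})
    canonical = {
        "name": name,
        "description": desc,
        "input_schema": schema,
        "tags": tool.get("tags", []),
    }
    # Fingerprint: stable hash over the canonical fields, so the same tool
    # arriving via different schemas hashes identically.
    canonical["fingerprint"] = hashlib.sha256(
        json.dumps(canonical, sort_keys=True).encode()
    ).hexdigest()[:16]
    return canonical

mcp_tool = {"name": "jira_create_issue", "description": "Create a Jira issue", "inputSchema": {}}
openai_tool = {
    "type": "function",
    "function": {"name": "jira_create_issue", "description": "Create a Jira issue", "parameters": {}},
}

print(normalize(mcp_tool)["name"])  # jira_create_issue
```

Because the fingerprint is computed over the canonical form, both input schemas above yield the same fingerprint.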
ToolScope supports pluggable embedding backends.
```python
class MyEmbedder:
    def embed_texts(self, texts): ...
```

```python
toolscope.EmbeddingConfig(
    provider="http",
    endpoint="http://localhost:8000/embed",
    model="my-embedding-model",
)
```

ToolScope never downloads models behind your back.
You control what text is embedded:
```python
toolscope.ToolTextConfig(
    use_name=True,
    use_description=True,
    use_schema=False,
    truncate=256,
)
```

Defaults (battle-tested):

- name + description only
- truncate to 256 chars
- no preprocessing
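Under those defaults, the text handed to the embedder for each tool can be pictured like this (a hypothetical sketch of the behavior, not the library's code):

```python
def tool_text(tool, use_name=True, use_description=True, use_schema=False, truncate=256):
    """Assemble the text that gets embedded for one tool."""
    parts = []
    if use_name:
        parts.append(tool["name"])
    if use_description:
        parts.append(tool.get("description", ""))
    if use_schema:
        parts.append(str(tool.get("inputSchema", {})))
    return " ".join(parts)[:truncate]

tool = {"name": "jira_create_issue", "description": "Create a Jira issue", "inputSchema": {}}
print(tool_text(tool))  # jira_create_issue Create a Jira issue
```

Excluding the schema by default keeps the embedded text short and focused on what the tool does rather than how it is called.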
Filter by tags:

```python
idx.filter(
    messages,
    allow_tags=["jira"],
    deny_tags=["dangerous"],
)
```

Reuse tools across turns when the query stays similar:
```python
toolscope.StickySessionConfig(
    enabled=True,
    similarity_threshold_reuse=0.95,
    similarity_threshold_refresh=0.8,
    sticky_keep=2,
)
```

This reduces latency and improves consistency.
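The two thresholds amount to comparing the new query's embedding against the previous turn's. The sketch below is one plausible reading of that logic, not ToolScope's implementation; in particular the middle "partial" band is an assumption based on the `sticky_keep` setting:

```python
def sticky_decision(similarity, reuse_at=0.95, refresh_at=0.8):
    """Decide what to do with the previous turn's tool set, given
    the cosine similarity between the old and new queries."""
    if similarity >= reuse_at:
        return "reuse"    # query barely changed: keep the previous tools as-is
    if similarity >= refresh_at:
        return "partial"  # related query: keep a few sticky tools, re-retrieve the rest
    return "refresh"      # new topic: retrieve from scratch

print(sticky_decision(0.97))  # reuse
print(sticky_decision(0.85))  # partial
print(sticky_decision(0.40))  # refresh
```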
Boost retrieval quality using a cross-encoder:
```python
toolscope.RerankingConfig(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    pool_size=20,
)
```

Not enabled by default; you opt in explicitly.
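Reranking is the standard two-stage retrieve-then-rerank pattern: fetch a wider pool cheaply, then re-score it with a stronger model. A generic sketch, where the word-overlap scorer is a stand-in for both the embedder and the cross-encoder:

```python
def rerank(query, tools, retrieve_score, rerank_score, pool_size=20, k=5):
    """Stage 1: cheap retrieval of pool_size candidates.
    Stage 2: precise re-scoring of just that pool."""
    pool = sorted(tools, key=lambda t: retrieve_score(query, t), reverse=True)[:pool_size]
    return sorted(pool, key=lambda t: rerank_score(query, t), reverse=True)[:k]

def overlap(query, tool):
    """Toy scorer: shared lowercase words between query and description."""
    q = set(query.lower().split())
    d = set(tool["description"].lower().split())
    return len(q & d)

tools = [
    {"name": "jira_create_issue", "description": "Create a Jira issue"},
    {"name": "confluence_search", "description": "Search Confluence pages"},
]
print(rerank("create a jira ticket", tools, overlap, overlap, pool_size=2, k=1))
```

The point of `pool_size` is cost control: the expensive cross-encoder only ever sees the pool, not the whole tool catalog.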
Inspect what ToolScope is doing:
```python
tools, trace = idx.filter_with_trace(messages)
print(trace)
```

Includes:
- candidate counts
- timings
- allow/deny decisions
- reranking effects
Fast, simple, zero dependencies:

```python
toolscope.MemoryBackend()
```

Persistent, scalable local vector DB:

```python
toolscope.MilvusLiteBackend(path="./toolscope.db")
```

ToolScope is backend-agnostic; more vector DBs can be added.
ToolScope integrates cleanly with popular agent stacks.
- full agent loops
- per-turn tool filtering
- middleware-based integration
```python
from toolscope.adapters.langchain import (
    ToolSelector,
    make_toolscope_tool_selection_middleware,
)
```

See: `examples/langchain/`
- drop-in MCP client wrapper
- supports multi-server clients
- reacts to tools/list_changed notifications
```python
from toolscope.adapters.fastmcp import ToolScopeFastMCPClient
```

See: `examples/fastmcp/`
ToolScope is actively developed.
Adapters:
- ✅ LangGraph
- ✅ FastMCP
- ⏳ Llama Stack
- ⏳ LlamaIndex
- ⏳ AutoGen
- ⏳ CrewAI
- ⏳ Haystack
Other features:
- additional vector DB backends
- more MCP normalizers
- richer observability sinks
This project is licensed under the Apache License 2.0.