Agentic Code Intelligence via Model Context Protocol
Nucleus is an MCP (Model Context Protocol) server that provides AI agents with deep, semantic understanding of codebases. It combines semantic search, keyword matching, and code structure analysis to answer both conceptual queries ("Where is the auth logic?") and precise navigation ("Who calls verify_token?").
- Hybrid Search: Semantic concept matching, exact keyword search, and structural analysis (who calls whom)
- Incremental Indexing: Only changed files are re-indexed
- Multi-Language Support: Rust, Python, TypeScript, C, C++, C#, Dart, Go, Java, Lua, and YARA
- Local-First: All data stored in `.nucleus/` within your project — no external services required
- Cognitive Memory: Agents persist learnings across sessions
- GPU Accelerated: Supports CUDA, DirectML, OpenVINO, and Metal (macOS)
- Download the latest release ZIP for your platform from the Releases page.
- Unpack the ZIP to your desired install location, e.g.:
```powershell
# Windows example
Expand-Archive -Path nucleus-server_win64.zip -DestinationPath "C:\Tools\nucleus"
```

macOS and Linux: Coming soon.
Run the setup script to install the inference runtime libraries into the same directory as the binary:
DirectML (Windows — works with any GPU)
```powershell
powershell -ExecutionPolicy Bypass -File scripts/setup-gpu.ps1 directml -OutputDir "C:\Tools\nucleus"
```

NVIDIA CUDA
```powershell
# Requires CUDA Toolkit 11.8+ and cuDNN 8.x
powershell -ExecutionPolicy Bypass -File scripts/setup-gpu.ps1 cuda -OutputDir "C:\Tools\nucleus"
```

Intel OpenVINO
```powershell
# For Intel CPUs, integrated/discrete GPUs (Arc), and NPUs
powershell -ExecutionPolicy Bypass -File scripts/setup-gpu.ps1 openvino -OutputDir "C:\Tools\nucleus"
```

Supported Intel hardware:
- CPU — Any Intel CPU (Core, Xeon)
- iGPU — Integrated graphics (11th gen+)
- dGPU — Discrete GPUs (Intel Arc A-series)
- NPU — Neural Processing Unit (Intel Core Ultra / Meteor Lake and newer)
CPU only
```powershell
powershell -ExecutionPolicy Bypass -File scripts/setup-gpu.ps1 cpu -OutputDir "C:\Tools\nucleus"
```

Replace `C:\Tools\nucleus` with the path where you unpacked the binary.
You can also override the temporary download directory:
```powershell
powershell -ExecutionPolicy Bypass -File scripts/setup-gpu.ps1 directml `
    -OutputDir "C:\Tools\nucleus" `
    -TempDir "D:\scratch"
```

macOS (Apple Silicon): No setup needed — GPU acceleration via Metal is automatic.
Run the server once in a console to download AI models and verify the setup:
```powershell
C:\Tools\nucleus\nucleus-server --root /path/to/your/project
```

On first run, embedding and reranking models are downloaded automatically (~2.3–3.3 GB depending on configuration — embedding model plus ~1.1 GB reranker). The console output shows model loading, GPU device selection, and indexing progress.
Models are cached in:
- Windows: `%LOCALAPPDATA%\fastembed\`
- macOS/Linux: `~/.cache/fastembed/`

Override with: `FASTEMBED_CACHE_PATH=/custom/path`
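As a sketch of how a launcher script might apply that override before starting the server (both paths below are illustrative, not defaults):

```python
import os

# Illustrative launcher snippet: set FASTEMBED_CACHE_PATH in the child
# environment so nucleus-server caches models under a custom directory.
env = dict(os.environ, FASTEMBED_CACHE_PATH="/data/model-cache")

# The server would then be spawned with this environment, e.g.:
# subprocess.run(["/opt/nucleus/nucleus-server", "--root", "/path/to/project"], env=env)
```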
```json
{
  "mcpServers": {
    "nucleus": {
      "command": "C:\\Tools\\nucleus\\nucleus-server",
      "args": []
    }
  }
}
```

```json
{
  "servers": {
    "nucleus": {
      "command": "C:\\Tools\\nucleus\\nucleus-server",
      "args": [],
      "type": "stdio"
    }
  }
}
```

```json
{
  "mcpServers": {
    "nucleus": {
      "command": "C:\\Tools\\nucleus\\nucleus-server",
      "args": []
    }
  }
}
```

```shell
claude mcp add nucleus C:\Tools\nucleus\nucleus-server
```

Replace `C:\Tools\nucleus\nucleus-server` with the actual path to your binary in all examples above.
By default, Nucleus communicates over stdio — each MCP client spawns its own server process, loading models into memory separately. In HTTP mode, a single Nucleus instance serves all clients. Models are loaded once, regardless of how many editors or agents connect.
```shell
nucleus-server --http
```

This starts an HTTP server on `http://127.0.0.1:4040/mcp` using the MCP Streamable HTTP transport. All clients connect to this one instance. You can work on multiple projects at the same time, and even open the same project from several windows or editors simultaneously — each project gets its own shared session (index, file watcher, vector store), so nothing gets duplicated or corrupted.
Point your MCP clients at the running server instead of launching a binary:
```json
{
  "mcpServers": {
    "nucleus": {
      "url": "http://127.0.0.1:4040/mcp"
    }
  }
}
```

CLI flags:
- `--http`: Enable HTTP/SSE transport (default: stdio)
- `--bind <addr>`: Bind address (default: `127.0.0.1`)
- `--port <port>`: Listen port (default: `4040`)
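Under the hood, the Streamable HTTP transport carries ordinary JSON-RPC 2.0 messages. Below is a minimal sketch of a `tools/list` request body a client would POST to the endpoint; the network call itself is left as a comment so the sketch stands alone:

```python
import json

# JSON-RPC 2.0 request listing the server's available tools.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
body = json.dumps(request).encode("utf-8")

# A real client would POST this to http://127.0.0.1:4040/mcp with
# Content-Type: application/json and
# Accept: application/json, text/event-stream.
```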
| Tool | Description |
|---|---|
| `search_code` | Semantic search across the codebase. Returns file-level results with matched symbols. |
| `search_symbols` | Search for symbols by name. Returns precise locations with line numbers. |
| `get_symbol` | Get full symbol definition by ID — signature, docstring, code, and relations. |
| `get_symbols` | Batch fetch multiple symbol definitions. |
| `get_usages` | Get all references to a symbol with locations and code snippets. |
| `resolve_symbol_at` | Resolve the reference at a given file and line to its definition. Returns `symbol_id` and location. |
| `find_similar_code` | Find code similar to a snippet. Use before writing new code to check for existing patterns. |
| `find_duplicate_code` | Detect near-duplicate code clusters across the codebase. Scores ≥0.95 are clones, 0.90–0.95 are similar. |
| `file_overview` | Get structural overview of all symbols in a file. |
| `class_overview` | Get API surface of a class/struct: methods, bases, traits — without bodies. |
| `get_implementors` | Get all types that implement a trait or interface. |
| `get_dependency_graph` | File-level dependency graph: who imports this file (inbound) and what it imports (outbound). |
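As an illustration, a `tools/call` request invoking `search_code` might look like the sketch below; the argument name `query` is an assumption, so check the schema the server actually reports via `tools/list`:

```python
import json

# Hypothetical tools/call payload for search_code; the "query" argument
# name is assumed, not taken from the server's declared schema.
call = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_code",
        "arguments": {"query": "where is the auth logic?"},
    },
}
body = json.dumps(call)
```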
| Tool | Description |
|---|---|
| `list_dir` | List contents of a directory within the indexed project. |
| `project_info` | Get project statistics: file counts, symbol counts, languages, index health. |
| `status` | Get indexing status and system health. |
| `reindex` | Trigger a full or incremental reindex. Use `{"force": true}` to rebuild from scratch. |
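The forced rebuild can be expressed as a `tools/call` sketch: `{"force": true}` comes straight from the tool description above, while the surrounding JSON-RPC framing follows the MCP protocol:

```python
import json

# tools/call payload triggering a full rebuild; {"force": true} is the
# documented argument for the reindex tool.
call = {
    "jsonrpc": "2.0",
    "id": 3,
    "method": "tools/call",
    "params": {"name": "reindex", "arguments": {"force": True}},
}
body = json.dumps(call)
```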
| Tool | Description |
|---|---|
| `cognitive_trigger` | Manage the memory session lifecycle: `start`, `end`, `problem_appeared`, `problem_solved`. |
| `read_memory` | Retrieve relevant memories using semantic search. |
| `write_memory` | Persist a new memory (code patterns, decisions, learnings). |
| `update_memory` | Amend an existing memory. |
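A hypothetical `write_memory` call could look like this sketch; the argument names (`content`, `tags`) are assumptions rather than the tool's documented schema:

```python
import json

# Hypothetical write_memory payload -- argument names are illustrative.
call = {
    "jsonrpc": "2.0",
    "id": 4,
    "method": "tools/call",
    "params": {
        "name": "write_memory",
        "arguments": {
            "content": "verify_token expects base64url-encoded JWTs",
            "tags": ["auth", "learning"],
        },
    },
}
body = json.dumps(call)
```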
Create `.nucleus/config.json` in your project root to customize behavior:
```json
{
  "embedding": {
    "model": "Qwen3",
    "max_length": 512,
    "batch_size": 32
  },
  "indexer": {
    "include": ["**/*.rs", "**/*.py", "**/*.ts", "**/*.js"],
    "exclude": ["target", "node_modules", ".git", "dist"],
    "max_file_size_bytes": 10485760
  }
}
```

| Model | Size | Description |
|---|---|---|
| `Qwen3` (default) | ~1.2 GB | Best for code search. Dense vectors only. |
| `BGEM3` | ~2.2 GB | Hybrid search with both dense and sparse vectors. |

To switch models, update `config.json` and run `reindex` with `force: true`.
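The switch can be scripted; this sketch rewrites the `embedding.model` field in `.nucleus/config.json` (a temporary directory stands in for the project root here), after which a forced reindex would rebuild the vector store:

```python
import json
import pathlib
import tempfile

# Stand-in project root; in practice this is your real project directory.
nucleus_dir = pathlib.Path(tempfile.mkdtemp()) / ".nucleus"
nucleus_dir.mkdir(parents=True)
cfg_path = nucleus_dir / "config.json"
cfg_path.write_text(json.dumps({"embedding": {"model": "Qwen3"}}))

# Flip the embedding model, then trigger `reindex` with force: true.
cfg = json.loads(cfg_path.read_text())
cfg["embedding"]["model"] = "BGEM3"
cfg_path.write_text(json.dumps(cfg, indent=2))
```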
| Variable | Description | Default |
|---|---|---|
| `NUCLEUS_EP` | Execution provider (Windows/Linux): `cuda`, `openvino`, `directml`, `cpu` | Auto-detect |
| `FASTEMBED_CACHE_PATH` | Model cache directory | OS default (see above) |
| `RUST_LOG` | Logging level: `error`, `warn`, `info`, `debug`, `trace` | `info` |
| Variable | Default | Description |
|---|---|---|
| `NUCLEUS_CUDA_MEM_LIMIT` | unlimited | GPU memory limit in bytes |
| `NUCLEUS_CUDA_CUDNN_CONV_ALGO` | `EXHAUSTIVE` | Algorithm search: `DEFAULT`, `HEURISTIC`, `EXHAUSTIVE` |
| `NUCLEUS_CUDA_ARENA_STRATEGY` | `SameAsRequested` | Memory strategy: `NextPowerOfTwo`, `SameAsRequested` |
| Variable | Default | Description |
|---|---|---|
| `NUCLEUS_OPENVINO_DEVICE` | `GPU` | Device: `GPU`, `GPU.0`, `NPU`, `CPU` |
| `NUCLEUS_OPENVINO_PRECISION` | `FP16` | Precision: `FP16`, `FP32` |
| `NUCLEUS_OPENVINO_STREAMS` | `8` | Parallel execution streams (1–255) |
| `NUCLEUS_OPENVINO_CACHE_DIR` | auto | Model cache directory |
After upgrading Nucleus, you may need to reindex:
```shell
# Delete existing index and restart the server
rm -rf .nucleus/

# Or trigger via MCP tool:
# reindex with force: true
```

When to reindex:
- After upgrading to a new Nucleus version
- After changing the embedding model
- If search results seem stale or incomplete