This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
glean is a Rust CLI and MCP server for smart code reading. It combines tree-sitter AST parsing, ripgrep search, and token-aware file viewing into a single tool for AI agents. ~6,000 lines of Rust.
cargo build # Debug build
cargo build --release # Release build (LTO, stripped)
cargo build --profile fast # Optimized but fast compile (for dev/benchmarking)
cargo test # Run all tests (inline #[cfg(test)] modules)
cargo fmt --check # Check formatting
cargo clippy -- -D warnings # Lint (CI enforces zero warnings)

All CLI queries go through lib.rs:run():
classify(query) → QueryType → dispatch
FilePath → read::read_file()
Glob → search::search_glob()
Symbol → search::search_symbol()
Content → search::search_content()
Fallthrough → try symbol, then content, then NotFound
Deterministic byte-pattern matching (no regex). Checks for glob metacharacters (* ? { [), path separators, dotfiles, numeric strings, and valid identifiers — in that order.
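The check order above can be sketched roughly as follows. This is a minimal illustration, not glean's actual classifier: the enum is a hypothetical mirror of QueryType, and which variant each check returns (numeric strings as Content, everything unmatched as Fallthrough) is an assumption.

```rust
/// Hypothetical mirror of the QueryType variants named in this document.
#[derive(Debug, PartialEq, Eq)]
pub enum QueryType {
    FilePath,
    Glob,
    Symbol,
    Content,
    Fallthrough,
}

/// Classify a query by scanning its bytes, with no regex involved.
/// The order of checks follows the text; the variant returned at each
/// step is an assumption made for illustration.
pub fn classify(query: &str) -> QueryType {
    let bytes = query.as_bytes();

    // 1. Glob metacharacters: * ? { [
    if bytes.iter().any(|&b| matches!(b, b'*' | b'?' | b'{' | b'[')) {
        return QueryType::Glob;
    }
    // 2. Path separators.
    if bytes.iter().any(|&b| matches!(b, b'/' | b'\\')) {
        return QueryType::FilePath;
    }
    // 3. Dotfiles such as `.gitignore`.
    if bytes.len() > 1 && bytes[0] == b'.' {
        return QueryType::FilePath;
    }
    // 4. Purely numeric strings (assumed here to be content searches).
    if !bytes.is_empty() && bytes.iter().all(u8::is_ascii_digit) {
        return QueryType::Content;
    }
    // 5. Valid identifiers.
    if is_identifier(bytes) {
        return QueryType::Symbol;
    }
    // Anything else is left to the fallthrough path (assumption).
    QueryType::Fallthrough
}

/// ASCII identifier: a letter or underscore, then letters, digits, or underscores.
fn is_identifier(bytes: &[u8]) -> bool {
    match bytes.split_first() {
        Some((&first, rest)) => {
            (first.is_ascii_alphabetic() || first == b'_')
                && rest.iter().all(|&b| b.is_ascii_alphanumeric() || b == b'_')
        }
        None => false,
    }
}
```

Because every check is a single pass over the bytes, the same input always produces the same variant, and the earlier checks take priority: `*.rs` is a glob even though it also contains a dot.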
Decision tree: directory → section param → empty → binary → generated → token estimate.
- ≤3500 tokens: full content with line numbers
- >3500 tokens: language-specific smart outline via read/outline/
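The final step of that decision tree, the token threshold, might look like the sketch below. The four-bytes-per-token estimate, the function names, and the line-number format are assumptions; only the 3500-token cutoff comes from the text.

```rust
/// Token budget at or below which the full file is shown (from the text above).
const FULL_VIEW_MAX_TOKENS: usize = 3500;

/// Reduced, hypothetical version of the ViewMode type described in this file.
#[derive(Debug, PartialEq, Eq)]
enum ViewMode {
    Full,
    Outline,
}

/// Rough token estimate: about four bytes per token (an assumption).
fn estimate_tokens(content: &str) -> usize {
    content.len().div_ceil(4)
}

/// Pick the view for a text file that passed the earlier checks.
fn choose_view(content: &str) -> ViewMode {
    if estimate_tokens(content) <= FULL_VIEW_MAX_TOKENS {
        ViewMode::Full
    } else {
        ViewMode::Outline
    }
}

/// Prefix each line with its 1-based line number (format is illustrative).
fn with_line_numbers(content: &str) -> String {
    content
        .lines()
        .enumerate()
        .map(|(i, line)| format!("{} {line}", i + 1))
        .collect::<Vec<_>>()
        .join("\n")
}
```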
Outline strategies by file type:
- code.rs: tree-sitter AST extraction (functions, classes, imports with line ranges)
- markdown.rs: heading hierarchy
- structured.rs: JSON/TOML/YAML top-level keys
- tabular.rs: CSV/TSV column headers
- fallback.rs: head + tail for logs and other text
- Symbol search (symbol.rs): tree-sitter definition detection + ripgrep usage search, run in parallel via rayon::join. Results merged and deduped.
- Content search (content.rs): ripgrep regex, supports /regex/ syntax.
- Callers (callers.rs): structural tree-sitter reverse matching with memchr SIMD pre-filtering.
- Callees (callees.rs): extracted at expand time from definition bodies.
- Ranking (rank.rs): definitions first, then by distance to context file and file age.
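The ranking order can be pictured as a sort on a three-part key. The sketch below is illustrative only: the Match struct is a reduced, hypothetical stand-in, the distance metric (unshared path components) is an assumption, and so is the choice to put more recently modified files first.

```rust
use std::path::Path;

/// Reduced, hypothetical stand-in for a search match.
#[derive(Debug, Clone)]
pub struct Match {
    pub path: String,
    pub is_definition: bool,
    /// Seconds since the file was last modified (assumed measure of file age).
    pub age_secs: u64,
}

/// Distance between two paths: the number of components they do not share.
fn path_distance(a: &Path, b: &Path) -> usize {
    let a: Vec<_> = a.components().collect();
    let b: Vec<_> = b.components().collect();
    let shared = a.iter().zip(&b).take_while(|(x, y)| x == y).count();
    (a.len() - shared) + (b.len() - shared)
}

/// Sort definitions first, then by closeness to the context file,
/// then by file age (newest first, by assumption).
pub fn rank(matches: &mut [Match], context: &Path) {
    matches.sort_by_key(|m| {
        (
            !m.is_definition, // false sorts before true, so definitions lead
            path_distance(Path::new(&m.path), context),
            m.age_secs,
        )
    });
}
```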
JSON-RPC 2.0 over stdio. Tools: glean_read, glean_search, glean_files, glean_edit (optional), glean_session. Stateful — maintains Session (tracks expanded definitions for dedup) and OutlineCache (mtime-invalidated, DashMap-backed).
Hash-anchored editing. glean_read emits line:hash| format where hash = FNV-1a truncated to 12 bits (3 hex chars). glean_edit validates hashes before applying edits — rejects if file changed since last read.
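A minimal sketch of the line anchor, assuming the 32-bit FNV-1a variant and truncation by keeping the low 12 bits. The text does not say which FNV width or which bits glean uses, so both are assumptions, as are the function names.

```rust
/// Standard 32-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u32 = 0x811c_9dc5;
const FNV_PRIME: u32 = 0x0100_0193;

/// 32-bit FNV-1a: XOR each byte in, then multiply by the prime.
fn fnv1a_32(bytes: &[u8]) -> u32 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Keep the low 12 bits and render them as 3 hex characters.
/// (Which 12 bits are kept is an assumption.)
fn line_hash(line: &str) -> String {
    format!("{:03x}", fnv1a_32(line.as_bytes()) & 0xfff)
}

/// Render a line in the `line:hash|` format described above.
fn anchor_line(line_no: usize, line: &str) -> String {
    format!("{line_no}:{}|{line}", line_hash(line))
}

/// An edit is accepted only if the line still hashes to the anchor
/// that was handed out at read time.
fn anchor_is_valid(current_line: &str, expected_hash: &str) -> bool {
    line_hash(current_line) == expected_hash
}
```

With only 12 bits there are 4096 possible anchors, so a collision on a changed line is possible but unlikely; the hash is a cheap staleness check, not a cryptographic guarantee.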
MCP mode tracks which definitions have been expanded. Re-expanding shows [shown earlier] instead of full body to save tokens.
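The dedup behaviour can be sketched with a set of already-expanded definition keys. The struct layout, the key format, and the method name are hypothetical; only the [shown earlier] placeholder comes from the text.

```rust
use std::collections::HashSet;

/// Hypothetical session state: the set of definitions already expanded.
#[derive(Default)]
pub struct Session {
    expanded: HashSet<String>,
}

impl Session {
    /// Return the full body the first time a definition is expanded,
    /// and a short placeholder on every later request.
    pub fn expand(&mut self, key: &str, body: &str) -> String {
        // `insert` returns true only if the key was not already present.
        if self.expanded.insert(key.to_owned()) {
            body.to_owned()
        } else {
            "[shown earlier]".to_owned()
        }
    }
}
```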
Outline cache keyed by (path, mtime). Uses DashMap entry API to avoid TOCTOU races. Stale entries (mtime changed) are never hit.
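To illustrate why keying on (path, mtime) makes stale entries unreachable, here is a single-threaded sketch that swaps DashMap for std's HashMap so it has no external dependencies. The type layout and method name are assumptions; the real cache is concurrent.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical single-threaded stand-in for the outline cache.
#[derive(Default)]
pub struct OutlineCache {
    entries: HashMap<(PathBuf, SystemTime), String>,
}

impl OutlineCache {
    /// Look up the outline for (path, mtime), computing it on a miss.
    /// The entry API does the lookup and the insert in one step, so there
    /// is no gap between "check" and "store".
    pub fn get_or_compute<F>(&mut self, path: &Path, mtime: SystemTime, compute: F) -> &str
    where
        F: FnOnce() -> String,
    {
        self.entries
            .entry((path.to_path_buf(), mtime))
            .or_insert_with(compute)
            .as_str()
    }
}
```

When a file is modified its mtime changes, so the lookup key changes and the old entry is simply never requested again. This sketch never evicts old entries; whether glean does is not stated here.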
- QueryType: classified query variant (FilePath, Glob, Symbol, Content, Fallthrough)
- Lang: supported programming languages (compiler enforces exhaustive matching)
- FileType: determines outline strategy (Code, Markdown, StructuredData, Tabular, Log, Other)
- ViewMode: what kind of output was produced (Full, Outline, Keys, Section, etc.)
- Match / SearchResult: search results with definition ranges and ranking metadata
- OutlineEntry / OutlineKind: structured outline tree
The project uses clippy::pedantic with specific allows listed in lib.rs. CI runs cargo clippy -- -D warnings.
To support a new language, add a variant to Lang in types.rs. The compiler will then flag every match that needs updating (classification, tree-sitter grammar init, outline extraction, definition detection).
ALWAYS write a failing test before fixing code issues.
Rust benchmark suite in benchmark/ tests against real repos (Gin, ripgrep, Alamofire, Zod). Run with cd benchmark && cargo build --release && ./target/release/bench run. See benchmark/README.md for methodology and results.