Code Intelligence Status Report

Last Updated: 2026-02-04

Purpose: Track all code intelligence features, implementation details, and language support.

Executive Summary

Category	Live	Partial	Missing	Total
Reference Analysis	7	0	0	7
Code Quality Markers	5	0	1	6
File Relationships	1	1	1	3
Auto-Triggers	0	0	1	1
Total	13	1	3	17

Recent Changes (v0.13.4+)

Hybrid LSP Fallback - When LSP returns 0 refs, automatically falls back to ripgrep text search to detect cross-package references
Cross-Package Detection - Catches references that LSP misses because of lazy imports or because the package is installed (site-packages) rather than opened from source
Enhanced Risk Assessment - Risk level upgraded based on text matches when LSP fails

How Code Intelligence Works

Relationship Flow

                              IMPORTS                           CALLS
                         ┌──────────────┐                 ┌──────────────┐
                         │              │                 │              │
                         ▼              │                 ▼              │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  module_a   │───▶│  module_b   │───▶│  function   │◀───│  caller_1   │
│             │    │             │    │             │    │             │
│ imports     │    │ imports     │    │ calls ────────▶  │  caller_2   │
│ module_b    │    │ module_c    │    │ helper()    │    │             │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                         │                   │
                         │ imported_by       │ called_by
                         ▼                   ▼
                   "Who imports me?"    "Who calls me?"

                         │                   │
                         │ imports           │ calling
                         ▼                   ▼
                   "What do I import?"  "What do I call?"

Metrics Explained

Metric	Direction	Question Answered	Example	Used For
called_by	← incoming	"Who calls this function?"	`get_user()` ← `login()`, `signup()`	Impact analysis before refactoring
calling	→ outgoing	"What does this function call?"	`login()` → `get_user()`, `validate()`	Understanding dependencies
imported_by	← incoming	"What files import this module?"	`utils.py` ← `api.py`, `cli.py`	Safe to modify? Who depends on me?
imports	→ outgoing	"What modules does this file import?"	`api.py` → `utils`, `models`, `config`	Dependency tracking
used_by	summary	"How widely used is this?"	`"3f 12r c:45"` = 3 files, 12 refs, complexity at the 45th percentile	Quick usage overview
refs	count	"Total reference count"	`12` references across codebase	Raw usage metric
unused	flag	"Is this barely used?"	`refs <= 2` → `unused: true`	Low-usage detection (see `#UNUSED`; `#DEADCODE` requires 0 external refs)
complexity	score	"How complex is this code?"	`c:45` = 45th percentile	Risk assessment
risk	level	"How risky to change?"	HIGH (11+ refs), MED (3-10), LOW (0-2)	Change impact

Hybrid LSP Fallback (NEW in v0.13.4)

When LSP returns 0 references for a symbol, Aurora automatically falls back to text search:

┌─────────────────────────────────────────────────────────────────────┐
│                    Hybrid Reference Detection                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  1. LSP Request                2. Fallback (if 0 refs)               │
│  ┌─────────────┐               ┌─────────────┐                       │
│  │ LSP Server  │──0 refs?──▶   │  ripgrep    │                       │
│  │ (jedi/ts)   │               │ -w symbol   │                       │
│  └─────────────┘               └─────────────┘                       │
│        │                              │                              │
│        ▼                              ▼                              │
│  ┌─────────────┐               ┌─────────────┐                       │
│  │  result =   │               │ text_matches│                       │
│  │  0 usages   │               │ text_files  │                       │
│  └─────────────┘               │ note:...    │                       │
│                                │ risk: adj.  │                       │
│                                └─────────────┘                       │
└─────────────────────────────────────────────────────────────────────┘

Why This Helps:

LSP tracks references within the analyzed workspace
Cross-package imports (e.g., `from aurora_soar import SOAROrchestrator`) may not resolve when the package is installed rather than opened from source
Text search catches these with ~85% accuracy

Response Fields (when fallback activates):

Field	Description	Example
`text_matches`	Total text occurrences	`12`
`text_files`	Files containing matches	`5`
`note`	Explanation of divergence	`"LSP found 0 refs but text search found 12 matches in 5 files - likely cross-package usage"`
`risk`	Adjusted risk level	`"medium"` (upgraded from `"low"`)
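
The fallback flow above can be sketched roughly as follows. This is a minimal illustration, not Aurora's actual code: `run_ripgrep`, `parse_rg_counts`, and `fallback_fields` are hypothetical names, and the rule that bumps `"low"` to `"medium"` on any text match is an assumption, since this report only says the risk is adjusted. The ripgrep flags used (`-w`, `-F`, `--count-matches`, `--with-filename`) are standard ones.

```python
import subprocess


def run_ripgrep(symbol: str, root: str) -> str:
    """Whole-word search for `symbol`; returns raw `path:count` lines.

    Requires ripgrep (`rg`) on PATH. Hypothetical helper, not Aurora's API.
    """
    proc = subprocess.run(
        ["rg", "-w", "-F", "--count-matches", "--with-filename", "--", symbol, root],
        capture_output=True,
        text=True,
        check=False,  # rg exits with 1 when nothing matches; that is not an error here
    )
    return proc.stdout


def parse_rg_counts(rg_output: str) -> dict:
    """Parse `path:count` lines into {path: count}."""
    counts = {}
    for line in rg_output.splitlines():
        # Split on the last colon so Windows drive letters survive.
        path, sep, count = line.rpartition(":")
        if sep and count.isdigit():
            counts[path] = int(count)
    return counts


def fallback_fields(rg_output: str) -> dict:
    """Fields merged into the response when LSP returned 0 refs."""
    counts = parse_rg_counts(rg_output)
    matches, files = sum(counts.values()), len(counts)
    if matches == 0:
        return {"risk": "low"}
    return {
        "text_matches": matches,
        "text_files": files,
        "note": (
            f"LSP found 0 refs but text search found {matches} matches "
            f"in {files} files - likely cross-package usage"
        ),
        "risk": "medium",  # assumption: any text match upgrades low to medium
    }
```

Keeping the parsing separate from the subprocess call means the response logic can be exercised with canned ripgrep output, for example `fallback_fields(run_ripgrep("SOAROrchestrator", "packages/"))` in real use.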

Reading the `used_by` Format

"3f 12r c:45"
 │   │    │
 │   │    └── complexity: 45th percentile (higher = more complex)
 │   └─────── refs: 12 total references across codebase
 └─────────── files: referenced in 3 different files
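
For consumers that need the numbers rather than the string, the format above can be parsed with a small regex. This is a sketch only: `parse_used_by` and `UsedBy` are hypothetical names, and treating the `c:` part as optional is an assumption based on the note elsewhere in this report that non-Python languages have no complexity score.

```python
import re
from typing import NamedTuple, Optional


class UsedBy(NamedTuple):
    files: int
    refs: int
    complexity: Optional[int]  # None when the summary carries no c: part


# "<files>f <refs>r" followed by an optional " c:<percentile>"
_USED_BY_RE = re.compile(r"^(\d+)f (\d+)r(?: c:(\d+))?$")


def parse_used_by(summary: str) -> UsedBy:
    """Split a used_by summary such as "3f 12r c:45" into its numeric parts."""
    match = _USED_BY_RE.match(summary.strip())
    if match is None:
        raise ValueError(f"unrecognized used_by summary: {summary!r}")
    files, refs, complexity = match.groups()
    return UsedBy(
        int(files),
        int(refs),
        int(complexity) if complexity is not None else None,
    )
```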

Risk Calculation

Risk Level	Criteria	Action
LOW	0-2 refs	Safe to change, minimal impact
MEDIUM	3-10 refs	Review callers before changing
HIGH	11+ refs	Careful refactoring needed, many dependents
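
The thresholds in this table translate directly into code. The function below is an illustrative sketch; `risk_level` is a hypothetical name and is not claimed to match the signature of Aurora's `_calculate_risk()`. It returns lowercase values to match the `risk` response field shown earlier.

```python
def risk_level(refs: int) -> str:
    """Map a reference count to a risk level using the table's thresholds."""
    if refs < 0:
        raise ValueError("refs must be non-negative")
    if refs <= 2:
        return "low"      # 0-2 refs
    if refs <= 10:
        return "medium"   # 3-10 refs
    return "high"         # 11+ refs
```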

Usage Markers

Marker	Condition	Meaning
`#DEADCODE`	0 external refs	Safe to remove
`#UNUSED`	refs ≤ 2	Low usage, consider removing
`#REFAC`	refs > 10	High usage, careful refactoring needed
`#COMPLEX`	complexity > 80%	High cyclomatic complexity
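
As a rough sketch, the conditions in this table could be evaluated as below. `usage_markers` is a hypothetical helper. It assumes the markers are not mutually exclusive (a symbol with 0 refs gets both `#DEADCODE` and `#UNUSED`), which this report does not state either way.

```python
from typing import List, Optional


def usage_markers(
    refs: int, external_refs: int, complexity_pct: Optional[int] = None
) -> List[str]:
    """Return every marker whose condition from the table holds."""
    markers = []
    if external_refs == 0:
        markers.append("#DEADCODE")
    if refs <= 2:
        markers.append("#UNUSED")
    if refs > 10:
        markers.append("#REFAC")
    # Complexity is only available where a score exists (Python today).
    if complexity_pct is not None and complexity_pct > 80:
        markers.append("#COMPLEX")
    return markers
```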

Language Support Matrix

Feature	Python	JS/TS	Go	Rust	Java	Ruby
LSP references	✅ Full	⚠️ Via multilspy	⚠️ Via multilspy	⚠️ Via multilspy	⚠️ Via multilspy	⚠️ Via multilspy
Deadcode (fast)	✅ Full	✅ ripgrep	✅ ripgrep	✅ ripgrep	✅ ripgrep	✅ ripgrep
Deadcode (accurate)	✅ Full	⚠️ Untested	⚠️ Untested	⚠️ Untested	⚠️ Untested	⚠️ Untested
Complexity	✅ tree-sitter	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl
Calling (outgoing)	✅ tree-sitter	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl
Import filtering	✅ Custom	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl	❌ Not impl
Risk calculation	✅ Full	⚠️ No complexity	⚠️ No complexity	⚠️ No complexity	⚠️ No complexity	⚠️ No complexity

Legend: ✅ Full support | ⚠️ Partial/Untested | ❌ Not implemented

Feature Implementation Details

Reference Analysis

Feature	Status	Implementation	Languages	Speed	Notes
used_by (usage count)	✅ LIVE	LSP `get_usage_summary()` via multilspy	Python (tested), others via multilspy	~1000ms/symbol	Returns files + refs count
called_by (incoming)	✅ LIVE	LSP `get_callers()` + import filtering	Python only (filter)	~1500ms/symbol	Filters import statements
calling (outgoing)	✅ LIVE	Tree-sitter AST parsing	Python only	~50ms/symbol	Filters built-ins, shows meaningful calls
references (raw)	✅ LIVE	LSP `request_references()`	All via multilspy	~800ms/symbol	Raw LSP, no filtering
hybrid_fallback	✅ LIVE	ripgrep text search when LSP=0	All languages	~100ms	Catches cross-package refs
definition	✅ LIVE (unused)	LSP `request_definition()`	All via multilspy	~200ms	Not exposed via MCP
hover	✅ LIVE (unused)	LSP `request_hover()`	All via multilspy	~200ms	Not exposed via MCP
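
To illustrate what the `calling` row describes (walk a function body, collect call targets, drop built-ins), here is a minimal sketch. Note the swap: Aurora uses tree-sitter, whereas this example uses Python's standard `ast` module so that it runs without extra dependencies. `outgoing_calls` is a hypothetical name.

```python
import ast
import builtins

_BUILTIN_NAMES = set(dir(builtins))


def outgoing_calls(source: str, function_name: str) -> list:
    """Sorted names called inside `function_name`, excluding built-ins."""
    for node in ast.walk(ast.parse(source)):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if not (is_def and node.name == function_name):
            continue
        names = set()
        for child in ast.walk(node):
            if not isinstance(child, ast.Call):
                continue
            func = child.func
            if isinstance(func, ast.Name):         # helper()
                names.add(func.id)
            elif isinstance(func, ast.Attribute):  # obj.method()
                names.add(func.attr)
        return sorted(names - _BUILTIN_NAMES)
    raise LookupError(f"function {function_name!r} not found")
```

Filtering by name is deliberately crude: a method that happens to share a name with a built-in (such as `format`) is dropped as well.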

Code Quality Markers

Feature	Status	Implementation	Languages	Speed	Notes
#DEADCODE (fast)	✅ LIVE	Batched ripgrep + within-file check	All (text search)	~2s/dir	85% accuracy
#DEADCODE (accurate)	✅ LIVE	LSP references per symbol	Python (tested)	~20s/dir	95%+ accuracy
#REFAC (high usage)	✅ LIVE	Usage count > 10 = "high" risk	Python (tested)	~1s/symbol	Part of `lsp impact`
#COMPLEX	✅ LIVE	Tree-sitter branch counting	Python only	<10ms/file	Shown as `c:95`
#UNUSED (low usage)	✅ LIVE	`unused: true` when refs <= 2	All (uses LSP count)	~1s/symbol	In mem_search + lsp check
#TYPE	❌ MISSING	Would need type checker	-	-	Language-specific
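
The branch-counting idea behind `#COMPLEX` can be sketched as follows. This is not Aurora's implementation: it substitutes the standard `ast` module for tree-sitter so the example is dependency-free, the set of branch nodes is an assumption, and it produces a raw count only, because this report does not describe how counts are converted to the percentile shown as `c:95`.

```python
import ast

# Assumed branch constructs; the real list lives in LanguageConfig.branch_types.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def branch_counts(source: str) -> dict:
    """Map each function name to the number of branch nodes inside it."""
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Branches in nested functions also count toward the outer one.
            counts[node.name] = sum(
                isinstance(child, BRANCH_NODES) for child in ast.walk(node)
            )
    return counts
```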

File Relationships

Feature	Status	Implementation	Languages	Speed	Notes
imports (outgoing)	⚠️ INDEXED	Tree-sitter `_extract_imports()`	Python only	<10ms/file	Not queryable via MCP
imported_by (incoming)	✅ LIVE	`lsp(action="imports")` via ripgrep	All languages	<1s	Query-time search
calls_files	❌ MISSING	Would derive from `calling`	Python (calling ready)	-	`calling` is ready; still needs a call-to-file mapping
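
A query-time `imported_by` lookup amounts to scanning files for import statements that name the module. The sketch below uses `pathlib` and `re` as a stand-in for the ripgrep search that Aurora performs; `find_importers` is a hypothetical name, and the pattern only handles the two simple Python forms (`import X`, `from X import ...`), not comma-separated or aliased imports.

```python
import re
from pathlib import Path


def find_importers(root: str, module: str) -> list:
    """Sorted relative paths of .py files under `root` that import `module`."""
    name = re.escape(module)
    pattern = re.compile(
        rf"^\s*(?:import\s+{name}\b|from\s+{name}(?:\.\w+)*\s+import\b)",
        re.MULTILINE,
    )
    root_path = Path(root)
    hits = []
    for path in root_path.rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if pattern.search(text):
            hits.append(path.relative_to(root_path).as_posix())
    return sorted(hits)
```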

Implementation Stack

Current Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         MCP Tools Layer                              │
│  lsp_tool.py              mem_search_tool.py                        │
│  - lsp(action, path)      - mem_search(query, limit)                │
└─────────────────────────────┬───────────────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────────────┐
│                         Analysis Layer                               │
│  aurora_lsp/analysis.py                                              │
│  - CodeAnalyzer.find_dead_code(accurate=False)                       │
│  - CodeAnalyzer.find_usages()                                        │
│  - CodeAnalyzer.get_callers()                                        │
│  - CodeAnalyzer.get_callees()         ← Python only (tree-sitter)    │
│  - _batched_ripgrep_search()          ← Language-agnostic            │
│  - _get_complexity()                  ← Python only (tree-sitter)    │
└─────────────────────────────┬───────────────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────────────┐
│                         LSP Client Layer                             │
│  aurora_lsp/client.py (multilspy wrapper)                            │
│  - request_references()                                              │
│  - request_document_symbols()                                        │
│  - request_definition()                                              │
│  Supported: Python, JS/TS, Go, Rust, Java, Ruby, C#, Dart, Kotlin   │
└─────────────────────────────┬───────────────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────────────┐
│                      Language Servers (via multilspy)                │
│  Python: jedi-language-server                                        │
│  JS/TS: typescript-language-server                                   │
│  Go: gopls                                                           │
│  Rust: rust-analyzer                                                 │
│  Java: jdtls                                                         │
│  Ruby: solargraph                                                    │
└─────────────────────────────────────────────────────────────────────┘

Component Breakdown

Component	Technology	Custom Code	Language Support
LSP client	multilspy library	Thin wrapper	10+ languages
Reference search	LSP protocol	None	All via LSP
Import filtering	Custom regex	`aurora_lsp/filters.py`	Multi-language patterns
Deadcode (fast)	ripgrep subprocess	`_batched_ripgrep_search()`	All languages
Deadcode (accurate)	LSP references	`find_usages()` loop	Python (tested)
Complexity	tree-sitter	`aurora_lsp/languages/`	Python only (config-based)
Risk calculation	Custom formula	`_calculate_risk()`	All (uses counts)
Entry point filter	Config patterns	`aurora_lsp/languages/python.py`	Python (extensible)
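
Import filtering, as used by `called_by`, can be pictured as dropping references whose source line is an import statement. The sketch below is an assumption about the approach, not the contents of `aurora_lsp/filters.py`: `drop_import_refs` and the `(file, line_number, line_text)` tuple shape are hypothetical.

```python
import re

# Python forms named in this report: `import X` and `from X import ...`
IMPORT_PATTERNS = [
    re.compile(r"^\s*import\s+"),
    re.compile(r"^\s*from\s+\S+\s+import\s+"),
]


def drop_import_refs(refs: list) -> list:
    """Keep (file, line_number, line_text) refs that are not import lines."""
    return [
        ref for ref in refs
        if not any(pattern.match(ref[2]) for pattern in IMPORT_PATTERNS)
    ]
```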

Language Abstraction Layer (NEW)

packages/lsp/src/aurora_lsp/languages/
  __init__.py       # Registry: get_config(), is_entry_point(), etc.
  base.py           # LanguageConfig dataclass
  python.py         # Python config (entry points, branch types, patterns)

LanguageConfig Fields

Field	Type	Purpose	Example (Python)
`name`	str	Language identifier	`"python"`
`extensions`	list[str]	File extensions	`[".py", ".pyi"]`
`tree_sitter_module`	str | None	Tree-sitter parser module	`"tree_sitter_python"`
`branch_types`	set[str]	AST nodes for complexity	`{"if_statement", "for_statement", ...}`
`entry_points`	set[str]	Skip in deadcode (exact)	`{"main", "cli", "app", "setup"}`
`entry_patterns`	set[str]	Skip in deadcode (glob)	`{"pytest_*", "test_*"}`
`entry_decorators`	set[str]	Decorator entry points	`{"@click.command", "@app.route"}`
`nested_patterns`	set[str]	Nested helper patterns	`{"wrapper", "inner", "on_*"}`
`import_patterns`	list[str]	Import regex patterns	`[r"^\s*import\s+", ...]`
`call_node_type`	str	AST node for calls	`"call"`
`function_def_types`	set[str]	AST nodes for defs	`{"function_definition", "class_definition"}`

Registry API

from aurora_lsp.languages import (
    get_config,                    # Get full LanguageConfig for file
    get_language,                  # Get language name for file
    get_complexity_branch_types,   # Get branch types for complexity calc
    get_call_node_type,            # Get AST node type for calls
    get_function_def_types,        # Get AST node types for function defs
    is_entry_point,                # Check if name is entry point
    is_nested_helper,              # Check if name is nested helper
    supported_extensions,          # Get all supported extensions
)

# Usage
config = get_config("foo.py")           # Returns PYTHON config
config = get_config("foo.js")           # Returns None (not yet supported)

is_entry_point("foo.py", "main")        # True
is_entry_point("foo.py", "my_func")     # False
is_entry_point("foo.py", "pytest_configure")  # True (matches pytest_*)

branch_types = get_complexity_branch_types("foo.py")  # {"if_statement", ...}

call_type = get_call_node_type("foo.py")             # "call"
def_types = get_function_def_types("foo.py")         # {"function_definition", ...}
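
To make the lookup behaviour concrete, here is a self-contained stand-in for a small part of the registry. It is a sketch under assumptions, not the real `aurora_lsp.languages` module: the dataclass carries only three of the documented fields, and glob matching via `fnmatch` is assumed from the "(glob)" description of `entry_patterns`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class LanguageConfig:
    """Reduced stand-in for the real LanguageConfig."""
    name: str
    entry_points: frozenset = frozenset()
    entry_patterns: frozenset = frozenset()


PYTHON = LanguageConfig(
    name="python",
    entry_points=frozenset({"main", "cli", "app", "setup"}),
    entry_patterns=frozenset({"pytest_*", "test_*"}),
)

LANGUAGES = {"python": PYTHON}
EXTENSION_MAP = {".py": "python", ".pyi": "python"}


def get_config(path: str) -> Optional[LanguageConfig]:
    """Look up the config by file extension; None if unsupported."""
    language = EXTENSION_MAP.get(Path(path).suffix)
    return LANGUAGES.get(language) if language else None


def is_entry_point(path: str, name: str) -> bool:
    """True if `name` is an exact entry point or matches a glob pattern."""
    config = get_config(path)
    if config is None:
        return False
    return name in config.entry_points or any(
        fnmatchcase(name, pattern) for pattern in config.entry_patterns
    )
```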

Adding a New Language

1. Create config file (languages/javascript.py):

from aurora_lsp.languages.base import LanguageConfig

JAVASCRIPT = LanguageConfig(
    name="javascript",
    extensions=[".js", ".jsx", ".mjs"],
    tree_sitter_module="tree_sitter_javascript",

    branch_types={
        "if_statement", "for_statement", "while_statement",
        "switch_statement", "ternary_expression", "catch_clause",
    },

    entry_points={"main", "default"},
    entry_patterns={"test_*", "spec_*"},
    entry_decorators=set(),  # JS doesn't use decorators the same way

    nested_patterns={"callback", "handler", "wrapper"},

    import_patterns=[
        r"^\s*import\s+",
        r"^\s*import\s*\{",
        r"^\s*(const|let|var)\s+.*=\s*require\(",
    ],
)

2. Register in __init__.py:

from aurora_lsp.languages.javascript import JAVASCRIPT

LANGUAGES["javascript"] = JAVASCRIPT
EXTENSION_MAP.update({
    ".js": "javascript",
    ".jsx": "javascript",
    ".mjs": "javascript",
})

3. Add dependency (if complexity needed):

# pyproject.toml
dependencies = [
    "tree-sitter-javascript>=0.20",
]

Scaling to Other Languages

What Would Be Needed

Feature	Current (Python)	To Add Language X
LSP references	✅ Works	✅ Already works (multilspy)
Deadcode (fast)	✅ Works	✅ Already works (ripgrep)
Deadcode (accurate)	✅ Works	⚠️ Needs testing with language X server
Complexity	tree-sitter-python	Need `tree-sitter-X` + parser code
Import filtering	Python regex	Need language-specific patterns
Entry point filter	Python patterns	Need language-specific patterns

Effort Estimate Per Language

Language	LSP	Deadcode	Complexity	Import Filter	Total
JavaScript/TS	✅ Ready	✅ Ready	2 days	1 day	3 days
Go	✅ Ready	✅ Ready	2 days	1 day	3 days
Rust	✅ Ready	✅ Ready	2 days	1 day	3 days
Java	✅ Ready	✅ Ready	2 days	2 days	4 days

MCP Tool Parameters

`lsp` Tool

lsp(
    action: "check" | "impact" | "deadcode" | "imports",
    path: str,           # File or directory
    line: int | None,    # Required for check/impact (1-indexed)
    accurate: bool,      # For deadcode: True=LSP refs (slow), False=ripgrep (fast)
)

Action	What It Does	Languages	Speed
`check`	Quick usage count before editing	Python (tested)	~1s
`impact`	Full analysis with top callers	Python (tested)	~2s
`deadcode`	Find all unused symbols	All (fast), Python (accurate)	2-20s
`imports`	Find files that import this module	All (ripgrep)	<1s
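
The parameter rules above (four actions, `line` required and 1-indexed for `check` and `impact`) can be expressed as a small validation step. This is an illustrative sketch: `validate_lsp_args` is a hypothetical helper, and the `accurate=False` default is an assumption taken from the `find_dead_code(accurate=False)` signature shown in the architecture diagram.

```python
from typing import Optional

ACTIONS = {"check", "impact", "deadcode", "imports"}
NEEDS_LINE = {"check", "impact"}


def validate_lsp_args(
    action: str,
    path: str,
    line: Optional[int] = None,
    accurate: bool = False,  # only meaningful for action="deadcode"
) -> None:
    """Raise ValueError if the argument combination is not allowed."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not path:
        raise ValueError("path is required")
    if action in NEEDS_LINE and (line is None or line < 1):
        raise ValueError(f"action {action!r} needs a 1-indexed line number")
```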

`mem_search` Tool

mem_search(
    query: str,          # Search query
    limit: int = 5,      # Max results
    enrich: bool = False # Add callers/callees/git
)

Output Field	Source	Languages
`type`	Indexed metadata	All
`file`	Indexed metadata	All
`name`	Indexed metadata	All
`lines`	Indexed metadata	All
`used_by`	LSP + tree-sitter	Python (full), others (no complexity)
`risk`	Calculated	Python (full), others (partial)
`score`	Hybrid retrieval	All

Performance Benchmarks

Deadcode Detection

Mode	Symbols	Time	Accuracy	Method
Fast (default)	50	2s	85%	Batched ripgrep + within-file check
Accurate	50	20s	95%+	LSP references per symbol

Reference Counting

Approach	Symbols	Time	Per Symbol
Ripgrep (batched)	50	0.1s	2ms
LSP references	50	15s	300ms

Complexity Calculation

Method	Files	Time	Per File
Tree-sitter (Python)	10	0.1s	10ms

Known Limitations

Python-Only Features

Complexity calculation - Uses tree-sitter-python
Import filtering - Python regex patterns (from X import, import X)
Entry point detection - Python patterns (main, pytest_*, decorators)

Cross-Package References (Mitigated)

LSP may miss references when:

Packages are installed (site-packages) rather than source
Imports are lazy or guarded (e.g. under if TYPE_CHECKING:)
Dynamic imports (importlib.import_module())

Mitigation (v0.13.4+): Hybrid fallback uses ripgrep text search when LSP returns 0 refs, catching ~85% of cross-package references. Response includes text_matches, text_files, and adjusted risk.

External Callers Not Detected

Both fast and accurate modes miss:

MCP tool calls (lsp_client.find_dead_code())
CLI entry points called via python -m
Framework callbacks (Flask routes, pytest fixtures)

LSP Limitations

jedi-language-server doesn't provide diagnostics (no linting)
Outgoing calls (calling) come from tree-sitter AST parsing rather than LSP, so they are Python only
Some language servers (e.g. Ruby/solargraph) are less reliable

Recommended Next Steps

Quick Wins (1 day each)

✅ ~~Add --accurate flag to deadcode~~ DONE
✅ ~~Add complexity to mem_search output~~ DONE
✅ ~~Add risk calculation~~ DONE
✅ ~~Add #UNUSED marker (usage <= 2)~~ DONE

Medium Term (3-5 days each)

Add JavaScript/TypeScript complexity (tree-sitter-typescript)
Add JS/TS import filtering patterns
✅ ~~Build imported_by reverse lookup~~ DONE - lsp(action="imports")
✅ ~~Build calling (outgoing calls)~~ DONE - tree-sitter AST parsing (Python)
Add pre-edit hook for related files

Long Term

Multi-language complexity support
Type checking integration per language
LSP warm-up / persistent daemon for faster cold start
