Commit 73d2241

Add release notes
Signed-off-by: Rahul Krishna <[email protected]>
1 parent 33f893b commit 73d2241

13 files changed: 294 additions and 76 deletions

RELEASE.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
+## 🎉 CodeAnalyzer Python v0.1.0 Release
+
+**Python Static Analysis Backend for CodeLLM DevKit (CLDK)**
+
+Initial release of **CodeAnalyzer Python**: A comprehensive static analysis tool designed specifically as the Python backend for the CodeLLM DevKit ecosystem. This tool provides deep code understanding capabilities through symbol table generation, with future support for call graph analysis and semantic analysis using industry-standard tools.
+
+### 🚀 Key Features
+
+#### **Symbol Table Generation**
+- **Complete AST Analysis**: Extracts classes, functions, variables, imports, and comments from Python source code
+- **Type Inference**: Leverages Jedi for intelligent type inference and symbol resolution
+- **Rich Metadata**: Captures cyclomatic complexity, parameter details, call sites, and code structure
+- **Comprehensive Coverage**: Supports modules, classes, functions, variables, imports, and docstrings
+
+#### **Smart Project Processing**
+- **Intelligent File Discovery**: Automatically excludes virtual environments, site-packages, and cache directories
+- **Progress Tracking**: Beautiful Rich-based progress bars with real-time feedback
+- **Error Resilience**: Continues processing on individual file failures with detailed error reporting
+- **Caching Support**: Efficient caching system with customizable cache directories
+
+#### **Modern CLI Interface**
+- **Rich Terminal UI**: Beautiful, colorful output with Rich integration
+- **Flexible Logging**: Multiple verbosity levels (`-v`, `-vv`, `-vvv`) with structured logging
+- **Multiple Output Formats**: JSON output to stdout or file
+- **Comprehensive Options**: Eager/lazy analysis, cache management, and output control
+
+### 🛠️ Technical Highlights
+
+#### **Built with Modern Python**
+- **Python 3.12+**: Leverages latest Python features and type hints
+- **uv Package Manager**: Fast, reliable dependency management
+- **Pydantic Models**: Type-safe data structures with validation
+- **Rich Progress Bars**: Non-blocking progress indication that preserves log output
+
+#### **Advanced Code Analysis**
+- **Jedi Integration**: Professional-grade code intelligence and type inference
+- **AST Processing**: Deep abstract syntax tree analysis
+- **Builder Pattern**: Fluent, type-safe object construction
+- **Comprehensive Schema**: Detailed Python code representation models
+
+#### **Production Ready**
+- **Error Handling**: Graceful failure handling with detailed logging
+- **Memory Efficient**: Processes large codebases without memory issues
+- **Configurable**: Extensive customization options for different use cases
+- **Well Tested**: Comprehensive test suite with CLI testing
+
+### 📋 Usage Examples
+
+**Basic Symbol Table Generation:**
+```bash
+uv run codeanalyzer --input ./my-python-project
+```
+
+**Save Results to File:**
+```bash
+uv run codeanalyzer --input ./project --output ./analysis-results
+```
+
+**Verbose Analysis with Custom Cache:**
+```bash
+uv run codeanalyzer --input ./project -vv --cache-dir ./custom-cache --eager
+```
+
+### 🔧 Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/codellm-devkit/codeanalyzer-python
+cd codeanalyzer-python
+
+# Install with uv
+uv sync --all-groups
+
+# Run analysis
+uv run codeanalyzer --input /path/to/your/project
+```
+
+### 🎯 What's Included
+
+#### **Core Modules**
+- **`SymbolTableBuilder`**: Main analysis engine with comprehensive Python code parsing
+- **`ProgressBar`**: Smart progress indication that respects logging levels
+- **`PySchema`**: Rich data models for representing Python code structures
+- **`AnalyzerCore`**: Central orchestration with caching and virtual environment support
+
+#### **Advanced Features**
+- **Virtual Environment Detection**: Automatic Python environment discovery and setup
+- **CodeQL Integration**: Foundation for future semantic analysis (in development)
+- **Extensible Architecture**: Modular design ready for additional analysis backends
+
+### 🔮 Future Roadmap
+
+#### **Planned Features**
+- **Call Graph Analysis** (`--analysis-level 2`): Complete function call relationship mapping
+- **CodeQL Semantic Analysis**: Advanced code pattern detection and vulnerability analysis
+- **WALA Integration**: Additional semantic analysis capabilities
+- **Performance Optimizations**: Parallel processing and incremental analysis
+
+### 🏗️ Architecture Improvements in v0.1.0
+
+#### **Logging System Overhaul**
+- **Replaced Loguru with Rich Logging**: Better terminal integration and formatting
+- **Centralized Logger**: Consistent logging across all modules
+- **Progress-Aware Logging**: Error messages don't interfere with progress bars
+
+#### **Progress Bar Enhancement**
+- **Rich Integration**: Beautiful, informative progress indication
+- **Logger-Aware**: Automatically disables when logging level is high
+- **Error Collection**: Batches error messages to display after progress completion
+
+#### **Dependency Management**
+- **Switched from tqdm to Rich**: Unified UI framework
+- **Cleaner Dependencies**: Removed redundant packages
+- **Better Error Handling**: More robust dependency resolution
+
+### 🧪 Quality Assurance
+
+#### **Testing Infrastructure**
+- **CLI Testing**: Comprehensive command-line interface validation
+- **Symbol Table Testing**: Verification of analysis accuracy
+- **Error Handling Tests**: Robust failure mode testing
+
+#### **Code Quality**
+- **Type Safety**: Full type hints with mypy compatibility
+- **Modern Python**: Leverages Python 3.12+ features
+- **Clean Architecture**: Modular, testable design patterns
+
+### 🎊 Perfect for CodeLLM DevKit
+
+This release establishes CodeAnalyzer Python as the foundational static analysis backend for the CodeLLM DevKit ecosystem, providing:
+
+- **Structured Code Representation**: Rich JSON output perfect for LLM consumption
+- **Comprehensive Metadata**: All the context needed for intelligent code understanding
+- **Extensible Design**: Ready to integrate with additional CLDK tools and workflows
+- **Production Scalability**: Handles enterprise-scale Python codebases efficiently
+
+### 📖 Documentation & Support
+
+- **Comprehensive README**: Detailed installation and usage instructions
+- **Rich CLI Help**: Built-in help system with examples
+- **Type-Safe APIs**: Full type hints for IDE integration
+- **Open Source**: Apache 2.0 license with community contributions welcome
+
+---
+
+*For issues, feature requests, or contributions, visit our [GitHub repository](https://github.com/codellm-devkit/codeanalyzer-python).*

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -17,9 +17,9 @@ dependencies = [
     "pandas>=2.3.0",
     "pydantic>=2.11.7",
     "requests>=2.32.4",
+    "rich>=14.0.0",
     "ruff>=0.12.2",
     "toml>=0.10.2",
-    "tqdm>=4.67.1",
     "typer>=0.16.0",
 ]

src/codeanalyzer/__main__.py

Lines changed: 9 additions & 38 deletions
@@ -1,37 +1,13 @@
 from contextlib import nullcontext
 import sys
-from loguru import logger
 import typer
 from typing import Optional, Annotated
 from pathlib import Path
-from codeanalyzer.core import AnalyzerCore
-import json
-
-
-def _setup_logger(level: str = "INFO") -> None:
-    """
-    Setup the logger with the specified level.
-
-    Args:
-        level (str): The logging level to set. Default is "INFO".
-    """
-    if __name__ != "__main__":
-        return  # Avoid reconfiguring logger if not running as a cli application
+from codeanalyzer.utils import _set_log_level

-    logger.remove()
-
-    if level == "OFF":
-        return  # If logging is turned off, we do not add any handlers.
-
-    logger.add(
-        sys.stderr,
-        format="{time:YYYY-MM-DD at HH:mm:ss} | {level} | {message}",
-        level=level,
-        colorize=True,
-    )
+from codeanalyzer.core import AnalyzerCore


-@logger.catch
 def main(
     input: Annotated[
         Path, typer.Option("-i", "--input", help="Path to the project root directory.")
@@ -51,40 +27,35 @@ def main(
         bool,
         typer.Option(
             "--eager/--lazy",
-            help="Enable eager or lazy analysis. Eager will rebuild the analysis cache at every run and lazy will use the cache if available. Defaults to lazy.",
+            help="Enable eager or lazy analysis. Defaults to lazy.",
         ),
     ] = False,
     cache_dir: Annotated[
         Optional[Path],
         typer.Option(
             "-c",
             "--cache-dir",
-            help="Directory to store analysis cache. If not specified, the cache will be stored in the current working directory under `.codeanalyzer`. Defaults to None.",
+            help="Directory to store analysis cache.",
         ),
     ] = None,
     clear_cache: Annotated[
         bool,
         typer.Option("--clear-cache/--keep-cache", help="Clear cache after analysis."),
     ] = True,
-    verbose: Annotated[
-        bool, typer.Option("-v/-q", "--verbose/--quiet", help="Enable verbose output.")
-    ] = True,
+    verbosity: Annotated[
+        int, typer.Option("-v", count=True, help="Increase verbosity: -v, -vv, -vvv")
+    ] = 0,
 ):
-    """Static Analysis on Python source code using Jedi, Asteroid, and Treesitter."""
-    if verbose:
-        _setup_logger("DEBUG")
-    else:
-        _setup_logger("OFF")
+    """Static Analysis on Python source code using Jedi, Astroid, and Treesitter."""
+    _set_log_level(verbosity)

     with AnalyzerCore(
         input, analysis_level, using_codeql, rebuild_analysis, cache_dir, clear_cache
     ) as analyzer:
         artifacts = analyzer.analyze()
-        # Default to printing the artifacts to stdout
         print_stream = sys.stdout
         stream_context = nullcontext(print_stream)

-        # If output is specified, redirect to file
         if output is not None:
             output.mkdir(parents=True, exist_ok=True)
             output_file = output / "analysis.json"
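
The hunk above ends before the artifacts are actually written, so the remaining steps are not visible in this commit view. As a rough illustration of the `nullcontext` pattern it sets up (stdout by default, `analysis.json` when an output directory is given), here is a minimal standalone sketch. The function name `emit_json` and the way the file is opened are assumptions for illustration, not code from the commit.

```python
import json
import sys
from contextlib import nullcontext
from pathlib import Path
from typing import Any, Optional


def emit_json(payload: Any, output: Optional[Path] = None) -> None:
    """Write payload as JSON to stdout, or to <output>/analysis.json."""
    # Wrapping stdout in nullcontext lets the `with` block below treat both
    # targets uniformly without closing stdout on exit.
    stream_context = nullcontext(sys.stdout)

    if output is not None:
        output.mkdir(parents=True, exist_ok=True)
        # Assumption: the file is opened for writing at this point; the
        # truncated hunk does not show how the commit does it.
        stream_context = (output / "analysis.json").open("w", encoding="utf-8")

    with stream_context as stream:
        json.dump(payload, stream, indent=2)
        stream.write("\n")
```

Calling `emit_json({"modules": 3})` prints to the terminal, while `emit_json({"modules": 3}, Path("./analysis-results"))` creates the directory if needed and writes the file instead.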

src/codeanalyzer/core.py

Lines changed: 1 addition & 18 deletions
@@ -5,7 +5,7 @@
 from pathlib import Path
 import sys
 from typing import Any, Dict, Union, Optional
-from loguru import logger
+from codeanalyzer.utils import logger

 from codeanalyzer.schema.py_schema import PyApplication, PyModule
 from codeanalyzer.semantic_analysis.codeql import CodeQLLoader
@@ -120,23 +120,6 @@ def __enter__(self) -> "AnalyzerCore":
             # Find python in the virtual environment
             venv_python = venv_path / "bin" / "python"

-            # Upgrade pip + install build backend dependencies
-            self._cmd_exec_helper(
-                [
-                    str(venv_python),
-                    "-m",
-                    "pip",
-                    "install",
-                    "--upgrade",
-                    "--editable",
-                    "pip",
-                    "build",
-                    "setuptools",
-                    "wheel",
-                ],
-                check=True,
-            )
-
             # Install the project itself (reads pyproject.toml)
             self._cmd_exec_helper(
                 [str(venv_python), "-m", "pip", "install", "-U", f"{self.project_dir}"],

src/codeanalyzer/schema/py_schema.py

Lines changed: 2 additions & 3 deletions
@@ -21,8 +21,7 @@
 """

 from pathlib import Path
-import sys
-from typing import Any, Dict, List, Optional, get_type_hints
+from typing import Any, Dict, List, Optional
 from typing_extensions import Literal
 from pydantic import BaseModel

@@ -287,7 +286,7 @@ class PyClassAttribute(BaseModel):
     """

     name: str
-    type: str = None
+    type: Optional[str] = None
     comments: List[PyComment] = []
     start_line: int = -1
     end_line: int = -1

src/codeanalyzer/semantic_analysis/codeql/codeql_loader.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 import requests
 import zipfile
 from pathlib import Path
-from loguru import logger
+from codeanalyzer.utils import logger
 from tqdm import tqdm


src/codeanalyzer/syntactic_analysis/symbol_table_builder.py

Lines changed: 23 additions & 10 deletions
@@ -4,9 +4,10 @@
 from typing import Dict, List, Optional
 import astor
 import jedi
-from loguru import logger
+from codeanalyzer.utils import logger
 from jedi.api.project import Project
 from jedi.api import Script
+from rich.progress import track
 from codeanalyzer.schema.py_schema import (
     PyCallable,
     PyCallableParameter,
@@ -22,6 +23,8 @@
 import ast
 from ast import AST, ClassDef

+from codeanalyzer.utils.progress_bar import ProgressBar
+

 class SymbolTableBuilder:
     """A class for building a symbol table for a Python project."""
@@ -875,16 +878,26 @@ def build(self) -> Dict[str, PyModule]:
         functions, and variables defined in those files.
         """
         symbol_table: Dict[str, PyModule] = {}
-        for directory in self.project_dir.iterdir():
-            if directory.is_dir() and directory.name.startswith("."):
-                continue
-            for py_file in directory.rglob("*.py"):
-                if py_file.name.startswith("__"):
-                    continue
+        # Get all Python files first to show accurate progress
+        py_files = [
+            py_file
+            for py_file in self.project_dir.rglob("*.py")
+            if "site-packages"
+            not in py_file.resolve().__str__()  # exclude site-packages
+            and ".venv"
+            not in py_file.resolve().__str__()  # exclude virtual environments
+            and ".codeanalyzer"
+            not in py_file.resolve().__str__()  # exclude internal cache directories
+        ]
+
+        with ProgressBar(len(py_files), "Building symbol table") as progress:
+            for py_file in py_files:
                 try:
-                    py_module: PyModule = self._module(py_file)
-                    symbol_table.update({py_file: py_module})
+                    py_module = self._module(py_file)
+                    symbol_table[str(py_file)] = py_module
                 except Exception as e:
                     logger.error(f"Failed to process {py_file}: {e}")
-                    continue
+                progress.advance()
+            progress.finish("✅ Symbol table generation complete.")
+
         return symbol_table
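
The new file-discovery filter in `build()` can be tried in isolation. The sketch below reproduces the same substring test on the resolved path using only the standard library. The names `discover_python_files` and `EXCLUDED_MARKERS` are hypothetical and do not appear in the commit.

```python
from pathlib import Path
from typing import List

# Markers copied from the hunk above; the constant name is hypothetical.
EXCLUDED_MARKERS = ("site-packages", ".venv", ".codeanalyzer")


def discover_python_files(project_dir: Path) -> List[Path]:
    """Return *.py files under project_dir whose resolved path has no excluded marker."""
    selected = []
    for py_file in project_dir.rglob("*.py"):
        resolved = str(py_file.resolve())
        # Substring test against the full resolved path, as in the diff.
        if not any(marker in resolved for marker in EXCLUDED_MARKERS):
            selected.append(py_file)
    return sorted(selected)
```

Two behaviours follow from the diff and are worth keeping in mind. Files whose names start with `__` (for example `__init__.py`) are no longer skipped, unlike in the removed loop. And because the test is a substring match on the whole resolved path, a project that itself lives under a directory whose name contains `.venv` or `site-packages` would have every file excluded.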

src/codeanalyzer/utils/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+from .logging import logger
+from .logging import _set_log_level
+from .progress_bar import ProgressBar
+
+__all__ = ["logger", "_set_log_level", "ProgressBar"]

src/codeanalyzer/utils/logging.py

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
from rich.console import Console
2+
from rich.logging import RichHandler
3+
import logging
4+
5+
# Set up base logger with RichHandler
6+
console = Console()
7+
handler = RichHandler(console=console, show_time=True, show_level=True, show_path=False)
8+
9+
logger = logging.getLogger("codeanalyzer")
10+
logger.setLevel(logging.ERROR) # Default level
11+
logger.addHandler(handler)
12+
logger.propagate = False # Prevent double logs
13+
14+
15+
def _set_log_level(verbosity: int) -> None:
16+
levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
17+
level = levels[min(verbosity, len(levels) - 1)]
18+
logger.setLevel(level)
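
To make the mapping from the `-v` count to a log level concrete, here is a small standalone sketch of the same clamped lookup using only the standard library. The names `LEVELS` and `level_for` are hypothetical stand-ins for the logic inside `_set_log_level`.

```python
import logging

# Same ordering as in the new logging module: each extra -v is one step more verbose.
LEVELS = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]


def level_for(verbosity: int) -> int:
    """Map a -v count to a logging level, clamping at DEBUG."""
    return LEVELS[min(verbosity, len(LEVELS) - 1)]


for count in range(5):
    print(count, logging.getLevelName(level_for(count)))
# 0 ERROR
# 1 WARNING
# 2 INFO
# 3 DEBUG
# 4 DEBUG
```

With no flag the logger stays at `ERROR`, so the default run is much quieter than before this commit, where `verbose` defaulted to `True` and selected `DEBUG`.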
