diff --git a/README.md b/README.md
index 60cd6e9..ce0c15d 100644
--- a/README.md
+++ b/README.md
@@ -6,451 +6,176 @@

 A Model Context Protocol (MCP) server for [GreptimeDB](https://github.com/GreptimeTeam/greptimedb) — an open-source, cloud-native, unified observability database.

-This server enables AI assistants to query and analyze GreptimeDB through SQL, TQL (PromQL-compatible), and RANGE queries. It includes security features like read-only enforcement and sensitive data masking, along with prompt templates for common analysis tasks.
+Enables AI assistants to query and analyze GreptimeDB using SQL, TQL (PromQL-compatible), and RANGE queries, with built-in security features like read-only enforcement and data masking.

-# Features
+## Quick Start

-## Resources
-- **list_resources** - List all tables in the database as browsable resources
-- **read_resource** - Read table data via `greptime:///data` URIs
+```bash
+# Install
+pip install greptimedb-mcp-server

-## Tools
+# Run (connects to localhost:4002 by default)
+greptimedb-mcp-server --host localhost --database public
+```
+
+For Claude Desktop, add this to your config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
+
+```json
+{
+  "mcpServers": {
+    "greptimedb": {
+      "command": "greptimedb-mcp-server",
+      "args": ["--host", "localhost", "--database", "public"]
+    }
+  }
+}
+```
+
+## Features
+
+### Tools

 | Tool | Description |
 |------|-------------|
 | `execute_sql` | Execute SQL queries with format (csv/json/markdown) and limit options |
-| `describe_table` | Get table schema including column names, types, and constraints |
-| `health_check` | Check database connection status and server version |
 | `execute_tql` | Execute TQL (PromQL-compatible) queries for time-series analysis |
 | `query_range` | Execute time-window aggregation queries with RANGE/ALIGN syntax |
+| `describe_table` | Get table schema including column names, types, and constraints |
 | `explain_query` | Analyze SQL or TQL query execution plans |
+| `health_check` | Check database connection status and server version |
+
+### Pipeline Management
+
+| Tool | Description |
+|------|-------------|
 | `list_pipelines` | List all pipelines or get details of a specific pipeline |
 | `create_pipeline` | Create a new pipeline with YAML configuration |
 | `dryrun_pipeline` | Test a pipeline with sample data without writing to database |
 | `delete_pipeline` | Delete a specific version of a pipeline |

-## Prompts
-
-MCP prompt system APIs:
-- **list_prompts** - List available prompt templates
-- **get_prompt** - Get a prompt template by name with argument substitution
-
-Available prompt templates:
-
-| Prompt | Description |
-|--------|-------------|
-| `pipeline_creator` | Generate GreptimeDB pipeline YAML configuration from log samples |
-| `log_pipeline` | Log analysis with full-text search and aggregation |
-| `metrics_analysis` | Comprehensive metrics analysis for monitoring data |
-| `promql_analysis` | PromQL-style queries using GreptimeDB TQL EVAL syntax |
-| `iot_monitoring` | IoT device monitoring with TAG/FIELD semantics and device aggregation |
-| `trace_analysis` | Distributed trace analysis for OpenTelemetry spans |
-| `table_operation` | Table diagnostics: schema, region health, storage analysis, and query optimization |
-
-### Using Prompts in Claude Desktop
-
-In Claude Desktop, MCP prompts need to be added manually to your conversation:
-
-1. Click the **+** button in the conversation input area
-2. Select **MCP Server**
-3. Choose **Prompt/References**
-4. Select the prompt you want to use (e.g., `pipeline_creator`)
-5. Fill in the required arguments
-
-Note: Prompts are not automatically available via `/` slash commands in Claude Desktop. You must add them through the UI as described above.
-
-### LLM Instructions
-
-Add this to your system prompt or custom instructions to help AI assistants use this MCP server effectively:
-
-```
-You have access to a GreptimeDB MCP server for querying and managing time-series data, logs, and metrics.
-
-## Available Tools
-- `execute_sql`: Run SQL queries (SELECT, SHOW, DESCRIBE only - read-only access)
-- `execute_tql`: Run PromQL-compatible time-series queries
-- `query_range`: Time-window aggregation with RANGE/ALIGN syntax
-- `describe_table`: Get table schema information
-- `health_check`: Check database connection status
-- `explain_query`: Analyze query execution plans
-
-### Pipeline Management
-- `list_pipelines`: View existing log pipelines
-- `create_pipeline`: Create/update pipeline with YAML config (same name creates new version)
-- `dryrun_pipeline`: Test pipeline with sample data without writing
-- `delete_pipeline`: Remove a pipeline version
-
-**Note**: All HTTP API calls (pipeline tools) require authentication. The MCP server handles auth automatically using configured credentials. When providing curl examples to users, always include `-u :`.
-
-## Available Prompts
-Use these prompts for specialized tasks:
-- `pipeline_creator`: Generate pipeline YAML from log samples - use when user provides log examples
-- `log_pipeline`: Log analysis with full-text search
-- `metrics_analysis`: Metrics monitoring and analysis
-- `promql_analysis`: PromQL-style queries
-- `iot_monitoring`: IoT device data analysis
-- `trace_analysis`: Distributed tracing analysis
-- `table_operation`: Table diagnostics and optimization
-
-## Workflow Tips
-1. For log pipeline creation: Get log sample → use `pipeline_creator` prompt → generate YAML → `create_pipeline` → `dryrun_pipeline` to verify
-2. For data analysis: `describe_table` first → understand schema → `execute_sql` or `execute_tql`
-3. For time-series: Prefer `query_range` for aggregations, `execute_tql` for PromQL patterns
-4. Always check `health_check` if queries fail unexpectedly
-```
-
-### Example: Creating a Pipeline
-
-Ask Claude to help create a pipeline by providing your log sample:
-
-```
-Help me create a GreptimeDB pipeline to parse this nginx log:
-127.0.0.1 - - [25/May/2024:20:16:37 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0..."
-```
-
-Claude will:
-1. Analyze your log format
-2. Generate a pipeline YAML configuration
-3. Create the pipeline using `create_pipeline` tool
-4. Test it with `dryrun_pipeline` tool
-
-## Security
-
-### Database User Configuration (Recommended)
-
-For production deployments, create a **read-only database user** for the MCP server. This provides defense-in-depth security at the database level.
-
-Configure a read-only user in GreptimeDB using [static user provider](https://docs.greptime.com/user-guide/deployments-administration/authentication/static/#permission-modes):
-
-```
-# User format: username:permission_mode=password
-mcp_readonly:readonly=your_secure_password
-```
-
-Permission modes:
-- `readonly` (or `ro`) - Can only query data (recommended for MCP server)
-- `writeonly` (or `wo`) - Can only write data
-- `readwrite` (or `rw`) - Full access (default)
-
-Then configure the MCP server to use this user:
-```bash
-GREPTIMEDB_USER=mcp_readonly
-GREPTIMEDB_PASSWORD=your_secure_password
-```
-
-### Application-Level Security Gate
-
-All queries also pass through a security gate that:
-- Blocks DDL/DML operations: DROP, DELETE, TRUNCATE, UPDATE, INSERT, ALTER, CREATE, GRANT, REVOKE
-- Blocks dynamic SQL execution: EXEC, EXECUTE, CALL
-- Blocks data modification: REPLACE INTO
-- Blocks file system access: LOAD, COPY, OUTFILE, LOAD_FILE, INTO DUMPFILE
-- Blocks encoded content bypass attempts: hex encoding (0x...), UNHEX(), CHAR()
-- Prevents multiple statement execution with dangerous operations
-- Allows read-only operations: SELECT, SHOW, DESCRIBE, TQL, EXPLAIN, UNION, INFORMATION_SCHEMA
+### Resources & Prompts

-## Data Masking
-Sensitive data in query results is automatically masked to protect privacy:
+- **Resources**: Browse tables via `greptime:///data` URIs
+- **Prompts**: Built-in templates for common tasks — `pipeline_creator`, `log_pipeline`, `metrics_analysis`, `promql_analysis`, `iot_monitoring`, `trace_analysis`, `table_operation`

-**Default masked column patterns:**
-- Authentication: `password`, `passwd`, `pwd`, `secret`, `token`, `api_key`, `access_key`, `private_key`, `credential`, `auth`
-- Financial: `credit_card`, `card_number`, `cvv`, `cvc`, `pin`, `bank_account`, `account_number`, `iban`, `swift`
-- Personal: `ssn`, `social_security`, `id_card`, `passport`
+For LLM integration and prompt usage, see [docs/llm-instructions.md](docs/llm-instructions.md).

-**Configuration:**
-```bash
-# Disable masking (default: true)
-GREPTIMEDB_MASK_ENABLED=false
-
-# Add custom patterns (comma-separated)
-GREPTIMEDB_MASK_PATTERNS=phone,address,email
-```
+## Configuration

-Masked values appear as `******` in all output formats (CSV, JSON, Markdown).
-
-# Installation
+### Environment Variables

 ```bash
-pip install greptimedb-mcp-server
-
-# Upgrade to latest version
-pip install -U greptimedb-mcp-server
-```
-
-After installation, run the server:
+GREPTIMEDB_HOST=localhost # Database host
+GREPTIMEDB_PORT=4002 # MySQL protocol port (default: 4002)
+GREPTIMEDB_USER=root # Database user
+GREPTIMEDB_PASSWORD= # Database password
+GREPTIMEDB_DATABASE=public # Database name
+GREPTIMEDB_TIMEZONE=UTC # Session timezone

-```bash
-# Using the command
-greptimedb-mcp-server --host localhost --port 4002 --database public
+# Optional
+GREPTIMEDB_HTTP_PORT=4000 # HTTP API port for pipeline management
+GREPTIMEDB_HTTP_PROTOCOL=http # HTTP protocol (http/https)
+GREPTIMEDB_POOL_SIZE=5 # Connection pool size
+GREPTIMEDB_MASK_ENABLED=true # Enable sensitive data masking
+GREPTIMEDB_MASK_PATTERNS= # Additional patterns (comma-separated)
+GREPTIMEDB_AUDIT_ENABLED=true # Enable audit logging

-# Or as a Python module
-python -m greptimedb_mcp_server.server
+# Transport (for HTTP server mode)
+GREPTIMEDB_TRANSPORT=stdio # stdio, sse, or streamable-http
+GREPTIMEDB_LISTEN_HOST=0.0.0.0 # HTTP server bind host
+GREPTIMEDB_LISTEN_PORT=8080 # HTTP server bind port
 ```

-# Configuration
-
-Set the following environment variables:
+### CLI Arguments

 ```bash
-GREPTIMEDB_HOST=localhost # Database host
-GREPTIMEDB_PORT=4002 # Optional: Database MySQL port (defaults to 4002)
-GREPTIMEDB_HTTP_PORT=4000 # Optional: HTTP API port for pipeline management (defaults to 4000)
-GREPTIMEDB_HTTP_PROTOCOL=http # Optional: HTTP protocol (http or https, defaults to http)
-GREPTIMEDB_USER=root
-GREPTIMEDB_PASSWORD=
-GREPTIMEDB_DATABASE=public
-GREPTIMEDB_TIMEZONE=UTC
-GREPTIMEDB_POOL_SIZE=5 # Optional: Connection pool size (defaults to 5)
-GREPTIMEDB_MASK_ENABLED=true # Optional: Enable data masking (defaults to true)
-GREPTIMEDB_MASK_PATTERNS= # Optional: Additional sensitive column patterns (comma-separated)
-
-# MCP Server Transport Options
-GREPTIMEDB_TRANSPORT=stdio # Optional: Transport mode (stdio, sse, streamable-http, defaults to stdio)
-GREPTIMEDB_LISTEN_HOST=0.0.0.0 # Optional: HTTP server bind host (defaults to 0.0.0.0)
-GREPTIMEDB_LISTEN_PORT=8080 # Optional: HTTP server bind port (defaults to 8080)
+greptimedb-mcp-server \
+  --host localhost \
+  --port 4002 \
+  --database public \
+  --user root \
+  --password "" \
+  --timezone UTC \
+  --pool-size 5 \
+  --mask-enabled true \
+  --transport stdio
 ```

-Or via command-line args:
-
-* `--host` the database host, `localhost` by default,
-* `--port` the database port, must be MySQL protocol port, `4002` by default,
-* `--http-port` the HTTP API port for pipeline management, `4000` by default,
-* `--http-protocol` the HTTP protocol for API calls (http or https), `http` by default,
-* `--user` the database username, empty by default,
-* `--password` the database password, empty by default,
-* `--database` the database name, `public` by default,
-* `--timezone` the session time zone, empty by default (using server default time zone),
-* `--pool-size` the connection pool size, `5` by default,
-* `--mask-enabled` enable data masking for sensitive columns, `true` by default,
-* `--mask-patterns` additional sensitive column patterns (comma-separated), empty by default,
-* `--transport` MCP transport mode (`stdio`, `sse`, `streamable-http`), `stdio` by default,
-* `--listen-host` HTTP server bind host (for sse/streamable-http), `0.0.0.0` by default,
-* `--listen-port` HTTP server bind port (for sse/streamable-http), `8080` by default.
-
-## HTTP Server Mode
+### HTTP Server Mode

-For containerized or Kubernetes deployments, you can run the MCP server in HTTP mode instead of stdio:
+For containerized or Kubernetes deployments:

 ```bash
-# Streamable HTTP mode (recommended for production)
+# Streamable HTTP (recommended for production)
 greptimedb-mcp-server --transport streamable-http --listen-port 8080

-# SSE mode (legacy, for older clients)
-greptimedb-mcp-server --transport sse --listen-host 0.0.0.0 --listen-port 3000
-
-# Via environment variables (for Docker/K8s)
-GREPTIMEDB_TRANSPORT=streamable-http \
-GREPTIMEDB_LISTEN_HOST=0.0.0.0 \
-GREPTIMEDB_LISTEN_PORT=8080 \
-greptimedb-mcp-server
-```
-
-**Transport modes:**
-- `stdio` (default): Standard input/output, for local CLI integration (e.g., Claude Desktop)
-- `streamable-http`: HTTP-based transport with SSE streaming, recommended for remote/production deployments
-- `sse`: Server-Sent Events transport (legacy, being deprecated in MCP spec)
-
-# Usage
-
-## Tool Examples
-
-### execute_sql
-Execute SQL queries with optional format and limit:
-```json
-{
-  "query": "SELECT * FROM metrics WHERE host = 'server1'",
-  "format": "json",
-  "limit": 100
-}
-```
-Formats: `csv` (default), `json`, `markdown`
-
-### execute_tql
-Execute PromQL-compatible time-series queries:
-```json
-{
-  "query": "rate(http_requests_total[5m])",
-  "start": "2024-01-01T00:00:00Z",
-  "end": "2024-01-01T01:00:00Z",
-  "step": "1m",
-  "lookback": "5m"
-}
-```
-
-### query_range
-Execute time-window aggregation queries:
-```json
-{
-  "table": "metrics",
-  "select": "ts, host, avg(cpu) RANGE '5m'",
-  "align": "1m",
-  "by": "host",
-  "where": "region = 'us-east'"
-}
+# SSE mode (legacy)
+greptimedb-mcp-server --transport sse --listen-port 3000
 ```

-### describe_table
-Get table schema information:
-```json
-{
-  "table": "metrics"
-}
-```
-
-### explain_query
-Analyze query execution plan:
-```json
-{
-  "query": "SELECT * FROM metrics",
-  "analyze": true
-}
-```
-
-### health_check
-Check database connection (no parameters required).
+## Security

-### Pipeline Management
+### Read-Only Database User (Recommended)

-#### list_pipelines
-List all pipelines or filter by name:
-```json
-{
-  "name": "my_pipeline"
-}
-```
-
-#### create_pipeline
-Create a new pipeline with YAML configuration:
-```json
-{
-  "name": "nginx_logs",
-  "pipeline": "version: 2\nprocessors:\n - dissect:\n fields:\n - message\n patterns:\n - '%{ip} - - [%{timestamp}] \"%{method} %{path}\"'\n ignore_missing: true\n - date:\n fields:\n - timestamp\n formats:\n - '%d/%b/%Y:%H:%M:%S %z'\n\ntransform:\n - fields:\n - ip\n type: string\n index: inverted\n - fields:\n - timestamp\n type: time\n index: timestamp"
-}
-```
+Create a read-only user in GreptimeDB using [static user provider](https://docs.greptime.com/user-guide/deployments-administration/authentication/static/#permission-modes):

-#### dryrun_pipeline
-Test a pipeline with sample data (no actual write):
-```json
-{
-  "pipeline_name": "nginx_logs",
-  "data": "{\"message\": \"127.0.0.1 - - [25/May/2024:20:16:37 +0000] \\\"GET /index.html\\\"\"}"
-}
 ```
-
-#### delete_pipeline
-Delete a specific version of a pipeline:
-```json
-{
-  "name": "nginx_logs",
-  "version": "2024-06-27 12:02:34.257312110"
-}
+mcp_readonly:readonly=your_secure_password
 ```

-## Claude Desktop Integration
+### Application-Level Security Gate

-Configure the MCP server in Claude Desktop's configuration file:
+All queries go through a security gate that:
+- **Blocks**: DROP, DELETE, TRUNCATE, UPDATE, INSERT, ALTER, CREATE, GRANT, REVOKE, EXEC, LOAD, COPY
+- **Blocks**: Encoded bypass attempts (hex, UNHEX, CHAR)
+- **Allows**: SELECT, SHOW, DESCRIBE, TQL, EXPLAIN, UNION

-#### MacOS
+### Data Masking

-Location: `~/Library/Application Support/Claude/claude_desktop_config.json`
+Sensitive columns are automatically masked (`******`) based on column name patterns:
+- Authentication: `password`, `secret`, `token`, `api_key`, `credential`
+- Financial: `credit_card`, `cvv`, `bank_account`
+- Personal: `ssn`, `id_card`, `passport`

-#### Windows
+Configure with `--mask-patterns phone,email` to add custom patterns.

-Location: `%APPDATA%/Claude/claude_desktop_config.json`
+### Audit Logging

-**Option 1: Using pip installed command (recommended)**
+All tool invocations are logged:

-```json
-{
-  "mcpServers": {
-    "greptimedb": {
-      "command": "greptimedb-mcp-server",
-      "args": [
-        "--host", "localhost",
-        "--port", "4002",
-        "--database", "public"
-      ]
-    }
-  }
-}
 ```
-
-**Option 2: Using uv with source directory**
-
-```json
-{
-  "mcpServers": {
-    "greptimedb": {
-      "command": "uv",
-      "args": [
-        "--directory",
-        "/path/to/greptimedb-mcp-server",
-        "run",
-        "-m",
-        "greptimedb_mcp_server.server"
-      ],
-      "env": {
-        "GREPTIMEDB_HOST": "localhost",
-        "GREPTIMEDB_PORT": "4002",
-        "GREPTIMEDB_USER": "root",
-        "GREPTIMEDB_PASSWORD": "",
-        "GREPTIMEDB_DATABASE": "public",
-        "GREPTIMEDB_TIMEZONE": "",
-        "GREPTIMEDB_POOL_SIZE": "5",
-        "GREPTIMEDB_HTTP_PORT": "4000",
-        "GREPTIMEDB_MASK_ENABLED": "true",
-        "GREPTIMEDB_MASK_PATTERNS": ""
-      }
-    }
-  }
-}
+2025-12-10 10:30:45 - greptimedb_mcp_server.audit - INFO - [AUDIT] execute_sql | query="SELECT * FROM cpu LIMIT 10" | success=True | duration_ms=45.2
 ```

-# License
-
-MIT License - see LICENSE.md file for details.
-
-# Contribute
-
-## Prerequisites
-- Python with `uv` package manager
-- GreptimeDB installation
-- MCP server dependencies
+Disable with `--audit-enabled false`.

 ## Development

-```
-# Clone the repository
+```bash
+# Clone and setup
 git clone https://github.com/GreptimeTeam/greptimedb-mcp-server.git
 cd greptimedb-mcp-server
-
-# Create virtual environment
-uv venv
-source venv/bin/activate # or `venv\Scripts\activate` on Windows
-
-# Install development dependencies
+uv venv && source .venv/bin/activate
 uv sync

 # Run tests
 pytest
-```

-Use [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector) for debugging:
+# Format & lint
+uv run black .
+uv run flake8 src

-```bash
-npx @modelcontextprotocol/inspector uv \
-  --directory \
-  /path/to/greptimedb-mcp-server \
-  run \
-  -m \
-  greptimedb_mcp_server.server
+# Debug with MCP Inspector
+npx @modelcontextprotocol/inspector uv --directory . run -m greptimedb_mcp_server.server
 ```

-# Acknowledgement
-This library's implementation was inspired by the following two repositories and incorporates their code, for which we express our gratitude:
+## License
+
+MIT License - see [LICENSE.md](LICENSE.md).

-* [ktanaka101/mcp-server-duckdb](https://github.com/ktanaka101/mcp-server-duckdb)
-* [designcomputer/mysql_mcp_server](https://github.com/designcomputer/mysql_mcp_server)
-* [mikeskarl/mcp-prompt-templates](https://github.com/mikeskarl/mcp-prompt-templates)
+## Acknowledgement

-Thanks!
+Inspired by:
+- [ktanaka101/mcp-server-duckdb](https://github.com/ktanaka101/mcp-server-duckdb)
+- [designcomputer/mysql_mcp_server](https://github.com/designcomputer/mysql_mcp_server)
+- [mikeskarl/mcp-prompt-templates](https://github.com/mikeskarl/mcp-prompt-templates)
diff --git a/docs/llm-instructions.md b/docs/llm-instructions.md
new file mode 100644
index 0000000..32ecbb8
--- /dev/null
+++ b/docs/llm-instructions.md
@@ -0,0 +1,68 @@
+# LLM Instructions for GreptimeDB MCP Server
+
+Add this to your system prompt to help AI assistants work with this MCP server.
+
+## System Prompt
+
+```
+You have access to a GreptimeDB MCP server for querying and managing time-series data, logs, and metrics.
+
+## Available Tools
+- `execute_sql`: Run SQL queries (SELECT, SHOW, DESCRIBE only - read-only access)
+- `execute_tql`: Run PromQL-compatible time-series queries
+- `query_range`: Time-window aggregation with RANGE/ALIGN syntax
+- `describe_table`: Get table schema information
+- `health_check`: Check database connection status
+- `explain_query`: Analyze query execution plans
+
+### Pipeline Management
+- `list_pipelines`: View existing log pipelines
+- `create_pipeline`: Create/update pipeline with YAML config (same name creates new version)
+- `dryrun_pipeline`: Test pipeline with sample data without writing
+- `delete_pipeline`: Remove a pipeline version
+
+**Note**: All HTTP API calls (pipeline tools) require authentication. The MCP server handles auth automatically using configured credentials. When providing curl examples to users, always include `-u :`.
+
+## Available Prompts
+Use these prompts for specialized tasks:
+- `pipeline_creator`: Generate pipeline YAML from log samples - use when user provides log examples
+- `log_pipeline`: Log analysis with full-text search
+- `metrics_analysis`: Metrics monitoring and analysis
+- `promql_analysis`: PromQL-style queries
+- `iot_monitoring`: IoT device data analysis
+- `trace_analysis`: Distributed tracing analysis
+- `table_operation`: Table diagnostics and optimization
+
+## Workflow Tips
+1. For log pipeline creation: Get log sample → use `pipeline_creator` prompt → generate YAML → `create_pipeline` → `dryrun_pipeline` to verify
+2. For data analysis: `describe_table` first → understand schema → `execute_sql` or `execute_tql`
+3. For time-series: Prefer `query_range` for aggregations, `execute_tql` for PromQL patterns
+4. Always check `health_check` if queries fail unexpectedly
+```
+
+## Using Prompts in Claude Desktop
+
+In Claude Desktop, you need to add MCP prompts manually:
+
+1. Click the **+** button in the conversation input area
+2. Select **MCP Server**
+3. Choose **Prompt/References**
+4. Select the prompt you want to use (e.g., `pipeline_creator`)
+5. Fill in the required arguments
+
+Note: Prompts are not automatically available via `/` slash commands in Claude Desktop. You must add them through the UI as described above.
+
+## Example: Creating a Pipeline
+
+Provide your log sample and ask Claude to create a pipeline:
+
+```
+Help me create a GreptimeDB pipeline to parse this nginx log:
+127.0.0.1 - - [25/May/2024:20:16:37 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0..."
+```
+
+Claude will:
+1. Analyze your log format
+2. Generate a pipeline YAML configuration
+3. Create the pipeline using `create_pipeline` tool
+4. Test it with `dryrun_pipeline` tool
diff --git a/pyproject.toml b/pyproject.toml
index de81c01..5aaa78c 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "greptimedb-mcp-server"
-version = "0.3.1"
+version = "0.4.0"
 description = "MCP server for GreptimeDB with SQL/TQL/PromQL support, sensitive data masking, and prompt templates for observability data analysis."
 readme = "README.md"
 license = {text = "MIT"}
diff --git a/src/greptimedb_mcp_server/__init__.py b/src/greptimedb_mcp_server/__init__.py
index b2aec5e..83d1449 100644
--- a/src/greptimedb_mcp_server/__init__.py
+++ b/src/greptimedb_mcp_server/__init__.py
@@ -1,3 +1,4 @@
+import asyncio
 import sys

 if "-m" not in sys.argv:
@@ -6,7 +7,12 @@

 def main():
     """Main entry point for the package."""
-    server.main()
+    try:
+        server.main()
+    except KeyboardInterrupt:
+        print("\nReceived Ctrl+C, shutting down...")
+    except asyncio.CancelledError:
+        print("\nServer shutdown complete.")


 # Expose important items at package level
diff --git a/src/greptimedb_mcp_server/config.py b/src/greptimedb_mcp_server/config.py
index 7935b87..036ca70 100644
--- a/src/greptimedb_mcp_server/config.py
+++ b/src/greptimedb_mcp_server/config.py
@@ -79,6 +79,11 @@ class Config:
     MCP HTTP server bind port (for sse/streamable-http transports)
     """

+    audit_enabled: bool
+    """
+    Enable audit logging for all tool calls
+    """
+
     @staticmethod
     def from_env_arguments() -> "Config":
         """
@@ -186,6 +191,13 @@ def from_env_arguments() -> "Config":
             default=int(os.getenv("GREPTIMEDB_LISTEN_PORT", "8080")),
         )

+        parser.add_argument(
+            "--audit-enabled",
+            type=lambda x: x.lower() not in ("false", "0", "no"),
+            help="Enable audit logging for all tool calls (default: true)",
+            default=os.getenv("GREPTIMEDB_AUDIT_ENABLED", "true"),
+        )
+
         args = parser.parse_args()
         return Config(
             host=args.host,
@@ -202,4 +214,5 @@ def from_env_arguments() -> "Config":
             transport=args.transport,
             listen_host=args.listen_host,
             listen_port=args.listen_port,
+            audit_enabled=args.audit_enabled,
         )
diff --git a/src/greptimedb_mcp_server/server.py b/src/greptimedb_mcp_server/server.py
index c57247f..e909ade 100644
--- a/src/greptimedb_mcp_server/server.py
+++ b/src/greptimedb_mcp_server/server.py
@@ -12,6 +12,7 @@
     validate_fill,
     validate_time_expression,
     format_tql_time_param,
+    audit_log,
 )

 import asyncio
@@ -857,11 +858,39 @@ def prompt_fn({arg_params}) -> str:
 _register_prompts()


+def _install_audit_hook():
+    """Install audit logging hook by wrapping tool manager's call_tool method."""
+    original_call_tool = mcp._tool_manager.call_tool
+
+    async def audited_call_tool(name, arguments, context=None, convert_result=False):
+        start_time = time.time()
+        try:
+            result = await original_call_tool(name, arguments, context, convert_result)
+            elapsed_ms = (time.time() - start_time) * 1000
+            audit_log(name, arguments, success=True, duration_ms=elapsed_ms)
+            return result
+        except Exception as e:
+            elapsed_ms = (time.time() - start_time) * 1000
+            audit_log(
+                name, arguments, success=False, duration_ms=elapsed_ms, error=str(e)
+            )
+            raise
+
+    mcp._tool_manager.call_tool = audited_call_tool
+
+
 def main():
     """Main entry point."""
     global _config
     _config = Config.from_env_arguments()

+    # Install audit logging hook if enabled
+    if _config.audit_enabled:
+        _install_audit_hook()
+        logger.info("Audit logging: enabled")
+    else:
+        logger.info("Audit logging: disabled")
+
     # Only configure HTTP server settings for non-stdio transports
     # to avoid overriding user's programmatic configuration
     if _config.transport != "stdio":
diff --git a/src/greptimedb_mcp_server/utils.py b/src/greptimedb_mcp_server/utils.py
index 685266e..b3d9e36 100644
--- a/src/greptimedb_mcp_server/utils.py
+++ b/src/greptimedb_mcp_server/utils.py
@@ -2,6 +2,7 @@
 import logging
 import yaml
 import os
+from typing import Any

 logger = logging.getLogger("greptimedb_mcp_server")

@@ -157,10 +158,56 @@ def validate_time_expression(value: str, name: str) -> str:
         raise ValueError(f"{name} is required")
     if ";" in value or "--" in value:
         raise ValueError(f"Invalid characters in {name}")
-    # Guard against malformed or injected strings with unbalanced quotes
     if value.count("'") % 2 != 0:
         raise ValueError(f"Unbalanced quotes in {name}")
     is_dangerous, reason = security_gate(value)
     if is_dangerous:
         raise ValueError(f"Dangerous pattern in {name}: {reason}")
     return value
+
+
+# Audit logging
+audit_logger = logging.getLogger("greptimedb_mcp_server.audit")
+
+
+def _truncate_value(v: Any, max_len: int = 200) -> str:
+    """Truncate a value to max_len characters."""
+    v_str = str(v)
+    if len(v_str) > max_len:
+        return v_str[:max_len] + "..."
+    return v_str
+
+
+def _format_audit_params(params: dict) -> str:
+    """Format parameters for audit log."""
+    if not params:
+        return ""
+    parts = []
+    for k, v in params.items():
+        parts.append(f'{k}="{_truncate_value(v)}"')
+    return " | ".join(parts)
+
+
+def audit_log(
+    tool: str,
+    params: dict,
+    success: bool,
+    duration_ms: float,
+    error: str | None = None,
+):
+    """Record audit log for tool invocation. Never raises exceptions."""
+    try:
+        parts = [f"[AUDIT] {tool}"]
+
+        params_str = _format_audit_params(params)
+        if params_str:
+            parts.append(params_str)
+
+        parts.append(f"success={success}")
+        if error:
+            parts.append(f'error="{_truncate_value(error)}"')
+        parts.append(f"duration_ms={duration_ms:.1f}")
+
+        audit_logger.info(" | ".join(parts))
+    except Exception:
+        pass
diff --git a/tests/test_server.py b/tests/test_server.py
index 56981b2..1ec7684 100644
--- a/tests/test_server.py
+++ b/tests/test_server.py
@@ -39,6 +39,7 @@ def setup_state():
         transport="stdio",
         listen_host="0.0.0.0",
         listen_port=8080,
+        audit_enabled=False,
     )
     # Set global config for get_config() calls
     server._config = config
diff --git a/tests/test_utils.py b/tests/test_utils.py
index 52903df..e272840 100644
--- a/tests/test_utils.py
+++ b/tests/test_utils.py
@@ -8,6 +8,9 @@
     is_sql_time_expression,
     format_tql_time_param,
     validate_time_expression,
+    _truncate_value,
+    _format_audit_params,
+    audit_log,
 )
 from greptimedb_mcp_server.formatter import format_results

@@ -523,3 +526,36 @@ def test_validate_time_expression_dangerous():
     with pytest.raises(ValueError) as excinfo:
         validate_time_expression("DELETE FROM users", "start")
     assert "Dangerous pattern" in str(excinfo.value)
+
+
+# Tests for audit logging functions
+
+
+def test_truncate_value():
+    """Test _truncate_value truncates long values"""
+    assert _truncate_value("short") == "short"
+    assert _truncate_value("a" * 201) == "a" * 200 + "..."
+
+
+def test_format_audit_params():
+    """Test _format_audit_params formats params correctly"""
+    assert _format_audit_params({}) == ""
+    assert _format_audit_params({"query": "SELECT 1"}) == 'query="SELECT 1"'
+
+
+def test_audit_log(caplog):
+    """Test audit_log records tool calls"""
+    import logging
+
+    with caplog.at_level(logging.INFO, logger="greptimedb_mcp_server.audit"):
+        audit_log("execute_sql", {"query": "SELECT 1"}, success=True, duration_ms=10.5)
+
+    assert len(caplog.records) == 1
+    msg = caplog.records[0].message
+    assert "[AUDIT] execute_sql" in msg
+    assert "success=True" in msg
+
+
+def test_audit_log_never_raises():
+    """Test audit_log never raises exceptions"""
+    audit_log(None, None, None, None, None)  # Should not raise
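
Review note: the tests added in this diff cover `audit_log` and its helpers, but not the wrapper installed by `_install_audit_hook`. The wrap-and-time pattern it uses can be exercised in isolation, roughly as sketched below. This is an illustration under stated assumptions, not the repository's code: `with_audit`, `fake_call_tool`, `_ListHandler`, `run_demo`, and the `audit_sketch` logger name are all hypothetical, and the real hook wraps the private `mcp._tool_manager.call_tool` with a four-argument signature rather than the two-argument stand-in used here.

```python
import asyncio
import logging
import time

# Hypothetical logger name for this sketch; the diff uses "greptimedb_mcp_server.audit".
audit_logger = logging.getLogger("audit_sketch")


def with_audit(call_tool):
    """Return an async wrapper that times each call and logs its outcome."""

    async def audited(name, arguments):
        start = time.perf_counter()
        try:
            result = await call_tool(name, arguments)
        except Exception as exc:
            elapsed_ms = (time.perf_counter() - start) * 1000
            audit_logger.info(
                f'[AUDIT] {name} | success=False | error="{exc}" '
                f"| duration_ms={elapsed_ms:.1f}"
            )
            raise  # the caller still sees the original failure
        elapsed_ms = (time.perf_counter() - start) * 1000
        audit_logger.info(
            f"[AUDIT] {name} | success=True | duration_ms={elapsed_ms:.1f}"
        )
        return result

    return audited


async def fake_call_tool(name, arguments):
    """Stand-in for a tool dispatcher: fails for the tool named 'boom'."""
    if name == "boom":
        raise ValueError("bad tool")
    return {"tool": name, "args": arguments}


class _ListHandler(logging.Handler):
    """Collect formatted log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def run_demo():
    """Run one successful and one failing call; return (result, log messages)."""
    handler = _ListHandler()
    audit_logger.addHandler(handler)
    audit_logger.setLevel(logging.INFO)
    audited = with_audit(fake_call_tool)
    try:
        result = asyncio.run(audited("execute_sql", {"query": "SELECT 1"}))
        try:
            asyncio.run(audited("boom", {}))
        except ValueError:
            pass  # expected: the wrapper logs the failure, then re-raises
    finally:
        audit_logger.removeHandler(handler)
    return result, handler.messages
```

Calling `run_demo()` yields the successful call's result plus two audit messages, one per call. The sketch uses `time.perf_counter()` rather than `time.time()` because a monotonic clock is generally the safer choice for measuring durations; whether that matters for the hook in this diff is a judgment call for the author.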