ccutils

Claude utilities for session transcripts, star schema analytics, data exploration, and more as new use cases come up in my day-to-day work.

Origin: This project began as a fork of Simon Willison's claude-code-transcripts. It has since diverged significantly, adding star schema analytics, a visual data explorer, a modular architecture, and a broader set of Claude utilities.

Installation

uv tool install -e .

Or run without installing:

uv run ccutils --help

Quick Start

# Interactive two-phase picker - select projects, then sessions
ccutils

# Convert a single session file (opens in browser)
ccutils session.jsonl

# Export to DuckDB for SQL analytics
ccutils --format duckdb -o ./archive

# Export with star schema (22 tables + 10 views)
ccutils --format duckdb-star -o ./analytics

# Launch visual data explorer
ccutils explore ./analytics/archive.duckdb

Commands

Command   Description
local     Interactive picker + single-file conversion -- default (no subcommand needed)
all       Batch convert all sessions (HTML archive, DuckDB, or JSON)
web       Import from Claude API (auto-detects credentials from macOS keychain)
explore   Launch Data Explorer web UI for star schema databases
import    Import Claude.ai account exports (Settings > Privacy > Export)
schema    Inspect JSON structure without exposing content (safe to share publicly)

local (default)

The default command. With no arguments, launches a two-phase interactive picker (projects then sessions). Pass a session file to convert it directly.

Thinking blocks and subagent sessions are included by default.

ccutils                                          # Interactive picker
ccutils session.jsonl                            # Convert file, open in browser
ccutils session.jsonl --format duckdb-star -o .  # Star schema from file
ccutils --format duckdb-star -o ./analytics      # Pick sessions, star schema
ccutils -p myproject                             # Filter by project name
ccutils --flat                                   # Legacy single-list mode
ccutils --no-thinking --no-subagents             # Exclude thinking/agents
ccutils --format duckdb-star --embed -o .        # With ColBERT embeddings

all

Batch convert every session. Agents and thinking blocks included by default.

ccutils all -o ./archive                          # HTML archive with search index
ccutils all --format duckdb-star -o ./analytics   # Star schema for all sessions
ccutils all --format duckdb-star --embed -o ./out # With ColBERT embeddings
ccutils all -j 4 --batch-size 20 -o ./archive     # Parallel processing
ccutils all --no-agents --no-thinking             # Exclude agents and thinking
ccutils all --dry-run                             # Preview without converting
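The -j and --batch-size flags follow a common batch-processing pattern: convert sessions with a pool of workers and commit once per batch. A rough illustration of that pattern (not ccutils' actual internals; `batches` and `convert` are hypothetical names):

```python
from concurrent.futures import ThreadPoolExecutor

def batches(items, size):
    """Split items into consecutive chunks of at most `size`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def convert(path):
    """Placeholder for per-session conversion work."""
    return f"converted {path}"

sessions = [f"session-{n}.jsonl" for n in range(45)]
with ThreadPoolExecutor(max_workers=4) as pool:   # -j 4
    for batch in batches(sessions, 20):           # --batch-size 20
        results = list(pool.map(convert, batch))  # commit once per batch
```

Committing per batch rather than per session keeps transaction overhead low while bounding how much work is lost if a batch fails.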

import

Import Claude.ai web conversation exports (the ZIP/directory from Settings > Privacy).

ccutils import ./my-claude-export --open         # HTML, opens in browser
ccutils import ./export --format duckdb -o data.duckdb  # DuckDB
ccutils import ./export --interactive            # Pick conversations
ccutils import ./export --list                   # List without converting

explore

Visual query builder for star schema DuckDB databases. Runs a local web server.

ccutils explore ./analytics/archive.duckdb

schema

Inspect JSON file structure without exposing content. Output is safe to share publicly or paste into AI assistants.

ccutils schema conversations.json
ccutils schema ./my-claude-export/               # Inspect all files in directory
ccutils schema ./export --json > schema.json     # Machine-readable output
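The idea behind structure-only inspection can be sketched in a few lines of Python: walk the JSON, keep keys and nesting, and replace every value with its type name. This is a simplified illustration of the concept, not ccutils' actual implementation (`schema_of` is a hypothetical name):

```python
import json

def schema_of(value):
    """Recursively replace values with type names, keeping keys and structure."""
    if isinstance(value, dict):
        return {k: schema_of(v) for k, v in value.items()}
    if isinstance(value, list):
        # Summarize a list by the schema of its first element (if any)
        return [schema_of(value[0])] if value else []
    return type(value).__name__

doc = json.loads('{"uuid": "abc", "messages": [{"role": "user", "text": "hi"}]}')
print(schema_of(doc))
# → {'uuid': 'str', 'messages': [{'role': 'str', 'text': 'str'}]}
```

Because only type names survive, the output reveals the shape of the export without leaking any conversation content.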

Export Formats

Schema type is auto-inferred from --format: duckdb-star and json-star use star schema, plain duckdb and json use simple.

HTML Transcripts

Clean, mobile-friendly HTML with pagination, commit timeline, tool stats, and full-text search.

ccutils -o ./transcript --open
ccutils all -o ./archive                    # Archive with master index and search

DuckDB Analytics

Simple Schema (4 tables)

ccutils --format duckdb -o ./archive

Tables: sessions, messages, tool_calls, thinking

Star Schema (22 tables + 10 views)

ccutils --format duckdb-star -o ./analytics

Dimensional model designed for analytics:

  • 6 dimensions: sessions (with heuristic classifications), projects, tools (with categories), models (with families), dates, times
  • 6 core facts: messages, tool calls (with duration tracking), session summaries (with inclusive agent metric rollup), file operations, errors (with type classification), tool chain steps
  • 5 granular tables: files (with language detection), session chains, content blocks, code blocks, entity mentions
  • 3 agent/bridge tables: agent delegations (with denormalized metrics), cross-session file tracking, task-agent mapping
  • 2 optional: ColBERT embeddings, tool input parameters
  • 10 semantic views: pre-joined views for common queries (includes project context and file tracking)

Heuristic Classification

The star schema ETL runs heuristic classification during ingestion with zero external dependencies -- no LLM, no API key needed. Results are stored on dim_session:

Classifier   Method                                                Values
Intent       Score-based keyword matching on first user message    bug_fix, feature, refactor, debug, test, docs, review, explore
Complexity   Points-based scoring from session metrics             trivial, simple, moderate, complex
Outcome      Inferred from last assistant message + error rate     success, failure, unknown
Domain       Inferred from file extensions touched                 web, backend, data, devops, docs, mixed, unknown
Error type   Classified from error message text (on fact_errors)   permission_denied, file_not_found, syntax_error, timeout, import_error, tool_error

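Score-based keyword matching amounts to counting keyword hits per intent and taking the highest scorer. A minimal sketch of the approach (the keyword lists, weights, and names below are illustrative guesses, not ccutils' actual classifier):

```python
# Hypothetical keyword lists -- the real classifier's vocabulary may differ.
INTENT_KEYWORDS = {
    "bug_fix": ["fix", "bug", "broken", "error"],
    "feature": ["add", "implement", "create", "support"],
    "refactor": ["refactor", "clean up", "reorganize"],
    "docs": ["document", "readme", "docstring"],
}

def classify_intent(first_user_message, default="explore"):
    """Score each intent by naive substring hits in the first user message."""
    text = first_user_message.lower()
    scores = {intent: sum(kw in text for kw in kws)
              for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(classify_intent("Fix the broken pagination bug"))  # → bug_fix
```

The zero-dependency trade-off is exactly this: no LLM call, so classification is instant and offline, at the cost of occasional misfires on ambiguous phrasing.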
-- What kinds of sessions do I have?
SELECT intent, complexity, COUNT(*) as sessions
FROM dim_session GROUP BY intent, complexity ORDER BY sessions DESC;

-- Tool usage by category
SELECT dt.tool_category, COUNT(*) as uses
FROM fact_tool_calls ftc
JOIN dim_tool dt ON ftc.tool_key = dt.tool_key
GROUP BY dt.tool_category ORDER BY uses DESC;

-- Am I more productive mornings or evenings?
SELECT dti.time_of_day, COUNT(*) as sessions, AVG(fss.total_messages) as avg_msgs
FROM fact_session_summary fss
JOIN dim_time dti ON fss.time_key = dti.time_key
GROUP BY dti.time_of_day;

-- Most-touched files across all sessions
SELECT df.file_path, SUM(bsf.write_count + bsf.edit_count) as modifications
FROM bridge_session_file bsf
JOIN dim_file df ON bsf.file_key = df.file_key
GROUP BY df.file_path ORDER BY modifications DESC LIMIT 20;

-- Catch up on a project (what was worked on recently)
SELECT first_user_message, last_assistant_message, intent, created_at
FROM semantic_project_context
WHERE project_name = 'my-project' LIMIT 5;

JSON Export

# Simple schema - single file
ccutils --format json -o ./sessions.json

# Star schema - directory structure (dimensions/ + facts/ + meta.json)
ccutils --format json-star -o ./star-export/

Common Options

# Output
-o, --output PATH          Output directory or file
--format FORMAT            html, duckdb, duckdb-star, json, json-star
--open                     Open result in browser

# Content (included by default -- use flags to exclude)
--no-thinking              Exclude thinking blocks
--no-subagents             Exclude related agent sessions (local)
--no-agents                Exclude agent-* session files (all)
--private                  Sanitize file paths for sharing

# Selection (local command)
--flat                     Flat single-list mode (skip project grouping)
--expand-chains            Show individual sessions in resumed chains
-p, --project TEXT         Filter by project name

# Embeddings (local and all commands, star schema only)
--embed [MODEL]            Run ColBERT embeddings (optionally specify model)

# Batch processing (all command)
-j, --jobs N               Parallel workers (default: 1)
--batch-size N             Sessions per transaction (default: 10)
--no-search-index          Skip search index generation

Documentation

Development

uv run pytest              # Run tests (~737 passing)
uv run ccutils --help      # Run development version
uv run pytest --cov=ccutils  # Coverage

License

Apache-2.0
