ccutils

Claude utilities for session transcripts, star schema analytics, data exploration, and more as new use cases come up in my day-to-day work.

Origin: This project began as a fork of Simon Willison's claude-code-transcripts. It has since diverged significantly, adding star schema analytics, a visual data explorer, a modular architecture, and a broader set of Claude utilities.

Installation

uv tool install -e .

Or run without installing:

uv run ccutils --help

Quick Start

# Interactive two-phase picker - select projects, then sessions
ccutils

# Convert a single session file (opens in browser)
ccutils session.jsonl

# Export to DuckDB for SQL analytics
ccutils --format duckdb -o ./archive

# Export with star schema (22 tables + 10 views)
ccutils --format duckdb-star -o ./analytics

# Launch visual data explorer
ccutils explore ./analytics/archive.duckdb

Commands

Command   Description
local     Interactive picker + single-file conversion -- default (no subcommand needed)
all       Batch convert all sessions (HTML archive, DuckDB, or JSON)
web       Import from Claude API (auto-detects credentials from macOS keychain)
explore   Launch Data Explorer web UI for star schema databases
import    Import Claude.ai account exports (Settings > Privacy > Export)
schema    Inspect JSON structure without exposing content (safe to share publicly)

local (default)

The default command. With no arguments, launches a two-phase interactive picker (projects then sessions). Pass a session file to convert it directly.

Thinking blocks and subagent sessions are included by default.

ccutils                                          # Interactive picker
ccutils session.jsonl                            # Convert file, open in browser
ccutils session.jsonl --format duckdb-star -o .  # Star schema from file
ccutils --format duckdb-star -o ./analytics      # Pick sessions, star schema
ccutils -p myproject                             # Filter by project name
ccutils --flat                                   # Legacy single-list mode
ccutils --no-thinking --no-subagents             # Exclude thinking/agents
ccutils --format duckdb-star --embed -o .        # With ColBERT embeddings

all

Batch convert every session. Agents and thinking blocks included by default.

ccutils all -o ./archive                          # HTML archive with search index
ccutils all --format duckdb-star -o ./analytics   # Star schema for all sessions
ccutils all --format duckdb-star --embed -o ./out # With ColBERT embeddings
ccutils all -j 4 --batch-size 20 -o ./archive     # Parallel processing
ccutils all --no-agents --no-thinking             # Exclude agents and thinking
ccutils all --dry-run                             # Preview without converting
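The -j and --batch-size flags follow a common batch-processing pattern: convert sessions with a pool of workers and commit once per batch. A rough illustration of that pattern (not ccutils' actual internals; `batches` and `convert` are hypothetical names):

```python
from concurrent.futures import ThreadPoolExecutor

def batches(items, size):
    """Split items into consecutive chunks of at most `size`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def convert(path):
    """Placeholder for per-session conversion work."""
    return f"converted {path}"

sessions = [f"session-{n}.jsonl" for n in range(45)]
with ThreadPoolExecutor(max_workers=4) as pool:   # -j 4
    for batch in batches(sessions, 20):           # --batch-size 20
        results = list(pool.map(convert, batch))  # commit once per batch
```

Committing per batch rather than per session keeps transaction overhead low while bounding how much work is lost if a batch fails.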

import

Import Claude.ai web conversation exports (the ZIP/directory from Settings > Privacy).

ccutils import ./my-claude-export --open         # HTML, opens in browser
ccutils import ./export --format duckdb -o data.duckdb  # DuckDB
ccutils import ./export --interactive            # Pick conversations
ccutils import ./export --list                   # List without converting

explore

Visual query builder for star schema DuckDB databases. Runs a local web server.

ccutils explore ./analytics/archive.duckdb

schema

Inspect JSON file structure without exposing content. Output is safe to share publicly or paste into AI assistants.

ccutils schema conversations.json
ccutils schema ./my-claude-export/               # Inspect all files in directory
ccutils schema ./export --json > schema.json     # Machine-readable output
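The idea behind structure-only inspection can be sketched in a few lines of Python: walk the JSON, keep keys and nesting, and replace every value with its type name. This is a simplified illustration of the concept, not ccutils' actual implementation (`schema_of` is a hypothetical name):

```python
import json

def schema_of(value):
    """Recursively replace values with type names, keeping keys and structure."""
    if isinstance(value, dict):
        return {k: schema_of(v) for k, v in value.items()}
    if isinstance(value, list):
        # Summarize a list by the schema of its first element (if any)
        return [schema_of(value[0])] if value else []
    return type(value).__name__

doc = json.loads('{"uuid": "abc", "messages": [{"role": "user", "text": "hi"}]}')
print(schema_of(doc))
# → {'uuid': 'str', 'messages': [{'role': 'str', 'text': 'str'}]}
```

Because only type names survive, the output reveals the shape of the export without leaking any conversation content.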

Export Formats

Schema type is auto-inferred from --format: duckdb-star and json-star use star schema, plain duckdb and json use simple.

HTML Transcripts

Clean, mobile-friendly HTML with pagination, commit timeline, tool stats, and full-text search.

ccutils -o ./transcript --open
ccutils all -o ./archive                    # Archive with master index and search

DuckDB Analytics

Simple Schema (4 tables)

ccutils --format duckdb -o ./archive

Tables: sessions, messages, tool_calls, thinking

Star Schema (22 tables + 10 views)

ccutils --format duckdb-star -o ./analytics

Dimensional model designed for analytics:

  • 6 dimensions: sessions (with heuristic classifications), projects, tools (with categories), models (with families), dates, times
  • 6 core facts: messages, tool calls (with duration tracking), session summaries (with inclusive agent metric rollup), file operations, errors (with type classification), tool chain steps
  • 5 granular tables: files (with language detection), session chains, content blocks, code blocks, entity mentions
  • 3 agent/bridge tables: agent delegations (with denormalized metrics), cross-session file tracking, task-agent mapping
  • 2 optional: ColBERT embeddings, tool input parameters
  • 10 semantic views: pre-joined views for common queries (includes project context and file tracking)

Heuristic Classification

The star schema ETL runs heuristic classification during ingestion with zero external dependencies -- no LLM, no API key needed. Results are stored on dim_session:

Classifier   Method                                                Values
Intent       Score-based keyword matching on first user message    bug_fix, feature, refactor, debug, test, docs, review, explore
Complexity   Points-based scoring from session metrics             trivial, simple, moderate, complex
Outcome      Inferred from last assistant message + error rate     success, failure, unknown
Domain       Inferred from file extensions touched                 web, backend, data, devops, docs, mixed, unknown
Error type   Classified from error message text (on fact_errors)   permission_denied, file_not_found, syntax_error, timeout, import_error, tool_error

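Score-based keyword matching amounts to counting keyword hits per intent and taking the highest scorer. A minimal sketch of the approach (the keyword lists, weights, and names below are illustrative guesses, not ccutils' actual classifier):

```python
# Hypothetical keyword lists -- the real classifier's vocabulary may differ.
INTENT_KEYWORDS = {
    "bug_fix": ["fix", "bug", "broken", "error"],
    "feature": ["add", "implement", "create", "support"],
    "refactor": ["refactor", "clean up", "reorganize"],
    "docs": ["document", "readme", "docstring"],
}

def classify_intent(first_user_message, default="explore"):
    """Score each intent by naive substring hits in the first user message."""
    text = first_user_message.lower()
    scores = {intent: sum(kw in text for kw in kws)
              for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(classify_intent("Fix the broken pagination bug"))  # → bug_fix
```

The zero-dependency trade-off is exactly this: no LLM call, so classification is instant and offline, at the cost of occasional misfires on ambiguous phrasing.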
-- What kinds of sessions do I have?
SELECT intent, complexity, COUNT(*) as sessions
FROM dim_session GROUP BY intent, complexity ORDER BY sessions DESC;

-- Tool usage by category
SELECT dt.tool_category, COUNT(*) as uses
FROM fact_tool_calls ftc
JOIN dim_tool dt ON ftc.tool_key = dt.tool_key
GROUP BY dt.tool_category ORDER BY uses DESC;

-- Am I more productive mornings or evenings?
SELECT dti.time_of_day, COUNT(*) as sessions, AVG(fss.total_messages) as avg_msgs
FROM fact_session_summary fss
JOIN dim_time dti ON fss.time_key = dti.time_key
GROUP BY dti.time_of_day;

-- Most-touched files across all sessions
SELECT df.file_path, SUM(bsf.write_count + bsf.edit_count) as modifications
FROM bridge_session_file bsf
JOIN dim_file df ON bsf.file_key = df.file_key
GROUP BY df.file_path ORDER BY modifications DESC LIMIT 20;

-- Catch up on a project (what was worked on recently)
SELECT first_user_message, last_assistant_message, intent, created_at
FROM semantic_project_context
WHERE project_name = 'my-project' LIMIT 5;

JSON Export

# Simple schema - single file
ccutils --format json -o ./sessions.json

# Star schema - directory structure (dimensions/ + facts/ + meta.json)
ccutils --format json-star -o ./star-export/

Common Options

# Output
-o, --output PATH          Output directory or file
--format FORMAT            html, duckdb, duckdb-star, json, json-star
--open                     Open result in browser

# Content (included by default -- use flags to exclude)
--no-thinking              Exclude thinking blocks
--no-subagents             Exclude related agent sessions (local)
--no-agents                Exclude agent-* session files (all)
--private                  Sanitize file paths for sharing

# Selection (local command)
--flat                     Flat single-list mode (skip project grouping)
--expand-chains            Show individual sessions in resumed chains
-p, --project TEXT         Filter by project name

# Embeddings (local and all commands, star schema only)
--embed [MODEL]            Run ColBERT embeddings (optionally specify model)

# Batch processing (all command)
-j, --jobs N               Parallel workers (default: 1)
--batch-size N             Sessions per transaction (default: 10)
--no-search-index          Skip search index generation

Documentation

Development

uv run pytest              # Run tests (~737 passing)
uv run ccutils --help      # Run development version
uv run pytest --cov=ccutils  # Coverage

License

Apache-2.0
