Feature: rtk reach — measure what fraction of Claude Code's input context RTK can actually touch
Context
rtk already provides three analytics views:
rtk gain — what RTK did save (from the tracking DB)
rtk discover — Bash commands that ran without RTK filtering, with token estimates
rtk session — per-session Bash adoption %: how many shell commands routed through RTK
All three stop at the Bash boundary. None answer the question that determines
whether RTK is worth installing for a given workflow:
Of all tool-result tokens entering my context, what fraction is even Bash
in the first place?
The hook only runs on the Bash tool. Built-in Read, Grep, Glob,
WebFetch, Task/Agent, Edit, MCP servers — none of those go through
the Bash hook, so RTK is structurally unable to reach them. On real
workloads, those tools often produce more tool-result content than Bash
does, and the user has no easy way to find out before installing.
Proposal
Add rtk reach — a new analytics subcommand that scans Claude Code session
JSONL files (the same source rtk discover and rtk session already use)
and reports a per-tool breakdown of tool-result content sizes, plus a Bash
sub-breakdown showing rtk-wrapped vs. rtk-coverable vs. other.
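To make the per-tool scan concrete, here is a minimal sketch. It is not the prototype's actual code. It assumes each JSONL line holds a `message.content` list in which `tool_use` blocks carry `id` and `name`, and `tool_result` blocks carry `tool_use_id` and `content`. The ~4 characters per token estimate and the helper names (`estimate_tokens`, `result_text`, `per_tool_breakdown`) are illustrative assumptions.

```python
import json
from collections import defaultdict


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. A heuristic, not a tokenizer.
    return len(text) // 4


def result_text(content) -> str:
    # Assumption: tool_result content is either a plain string or a list of
    # blocks; only {"type": "text"} blocks are counted in this sketch.
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "".join(
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        )
    return ""


def per_tool_breakdown(lines):
    """Return {tool_name: [calls, tokens]} for an iterable of JSONL lines."""
    names = {}  # tool_use id -> tool name, filled as tool_use blocks appear
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use":
                names[block.get("id")] = block.get("name", "unknown")
            elif block.get("type") == "tool_result":
                tool = names.get(block.get("tool_use_id"), "unknown")
                totals[tool][0] += 1
                totals[tool][1] += estimate_tokens(
                    result_text(block.get("content"))
                )
    return dict(totals)
```

Results are attributed to a tool by joining `tool_result.tool_use_id` back to the earlier `tool_use.id`, which is why non-Bash `tool_use` blocks have to be kept rather than filtered out.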
Headline numbers it would surface
Headline
────────────────────────────────────────────────────────────────
Bash share of tool results: 30.2%
RTK active on Bash output: 7.2%
RTK reachable on Bash output: 50.1%
RTK reachable on ALL tool results: 15.1%
The last line is the new one. It is the upper bound on the share of
tool-result tokens RTK can ever touch for the workflow being measured,
so actual savings can only be some fraction of it.
Sample full output
RTK Reach Analysis (all projects, last 7d)
================================================================
Sessions scanned: 6
Total tool-result tokens: ~200.6K
Billed input (non-cache): ~18.4K
Reach Headline
Bash share of tool results: 30.2%
RTK active on Bash output: 7.2%
RTK reachable on Bash output: 50.1%
RTK reachable on ALL tool results: 15.1%
By tool (tool_result content size)
Tool            Calls   Tokens   Share
Read              119    84.2K   42.0%
Bash              451    60.6K   30.2%
ExitPlanMode       30    22.6K   11.3%
Edit              325     9.8K    4.9%
Agent               8     9.6K    4.8%
... (truncated)
Bash breakdown
rtk-wrapped 13 calls ~4.4K tokens
rtk-coverable* 179 calls ~26.0K tokens
other 259 calls ~30.3K tokens
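As a sanity check, the headline percentages can be recomputed from the sample figures above. This sketch assumes "reachable" means rtk-wrapped plus rtk-coverable tokens; that reading is my inference from the numbers, not something the output states. The key names are illustrative only.

```python
# Token counts in thousands, copied from the sample output above.
total_tool_results = 200.6
bash_total = 60.6
rtk_wrapped = 4.4
rtk_coverable = 26.0


def pct(part: float, whole: float) -> float:
    # Percentage of `whole` represented by `part`; 0.0 when nothing was scanned.
    return 100.0 * part / whole if whole else 0.0


# Assumption: "reachable" = already wrapped + could be wrapped.
reachable = rtk_wrapped + rtk_coverable

headline = {
    "bash_share": pct(bash_total, total_tool_results),
    "rtk_active_on_bash": pct(rtk_wrapped, bash_total),
    "rtk_reachable_on_bash": pct(reachable, bash_total),
    "rtk_reachable_on_all": pct(reachable, total_tool_results),
}

for name, value in headline.items():
    print(f"{name}: {value:.1f}%")
```

Because the inputs are already rounded to one decimal, the recomputed values match the sample headline only to within about 0.1 percentage point.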
Why land it in rtk itself
- Closes the analytics loop. gain (what RTK saved) + discover (Bash
misses) + session (Bash adoption per session) + reach (what fraction
of context RTK can touch at all) gives users a complete picture.
- Sets honest expectations. New users often expect 80% savings and
get 10–20% on Read-heavy workflows. Surfacing the ceiling up front
improves trust.
- Cheap to ship. discover/provider.rs::ClaudeProvider already parses
the JSONL; only the Bash-only pre-filter inside extract_commands needs
to be relaxed, for example via a sibling extractor that keeps non-Bash
tool_use blocks. See rust-integration-sketch.md.
Working prototype
A standalone Python prototype that produces the output above is available
at: https://github.com/bkizzy/rtk-reach
Files:
rtk_reach.py — runnable today, no external deps
sample-output.txt — real output on a 7-day, 6-session scan
rust-integration-sketch.md — high-level shape of the Rust port
README.md — design rationale
Suggested API
Match rtk discover's flags for consistency:
rtk reach # current project, last 7 days
rtk reach --all # every project
rtk reach --since 30 # last 30 days
rtk reach -p finance # filter by path substring
rtk reach --json # JSON output
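The flag semantics above can be sketched with argparse, in Python to match the prototype. The real subcommand would be defined in rtk's Rust CLI, and the `dest` names and the 7-day default shown here are assumptions drawn from the comments above, not from rtk's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Illustrative only: mirrors the proposed flags, not rtk's real CLI code.
    parser = argparse.ArgumentParser(prog="rtk reach")
    parser.add_argument(
        "--all", action="store_true", dest="all_projects",
        help="scan every project instead of the current one",
    )
    parser.add_argument(
        "--since", type=int, default=7, metavar="DAYS",
        help="look back this many days (default: 7)",
    )
    parser.add_argument(
        "-p", dest="path_filter", metavar="SUBSTR",
        help="only include projects whose path contains SUBSTR",
    )
    parser.add_argument(
        "--json", action="store_true", dest="json_output",
        help="emit JSON instead of the text report",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```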
Estimated effort
~150–250 lines of Rust + tests. Reuses the existing
ClaudeProvider::extract_commands plumbing; adds a sibling extractor
(extract_tool_results) that keeps non-Bash tool_use blocks. Single PR,
no breaking changes. Sketch in rust-integration-sketch.md.
Open questions
- Should sidechain (subagent) calls be broken out by default or hidden
behind a flag?
- Should the per-Bash classification reuse
discover::registry::classify_command directly (authoritative — already
used by rtk session) or be duplicated for performance?
- Is there appetite for an rtk reach --watch mode that streams updated
numbers as sessions grow?
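On the classification question, the three-way Bash split could look roughly like this if it were duplicated rather than reused. The `COVERABLE` set is a hypothetical stand-in; the authoritative list is whatever discover::registry::classify_command recognises. The sketch only inspects the first word of a command, so pipelines and `&&` chains are classified by their first command.

```python
import shlex

# Hypothetical set for illustration only; not taken from rtk's registry.
COVERABLE = {"git", "cargo", "ls", "grep", "npm", "pytest"}


def classify_bash(command: str) -> str:
    """Return 'rtk-wrapped', 'rtk-coverable', or 'other' for a shell command."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "other"  # e.g. unbalanced quotes
    # Skip leading VAR=value environment assignments.
    while tokens:
        name, sep, _ = tokens[0].partition("=")
        if sep and name.isidentifier():
            tokens.pop(0)
        else:
            break
    if not tokens:
        return "other"
    head = tokens[0].rsplit("/", 1)[-1]  # strip any directory prefix
    if head == "rtk":
        return "rtk-wrapped"
    if head in COVERABLE:
        return "rtk-coverable"
    return "other"
```

Reusing the existing classifier avoids the drift a second list like this would invite, which is the main argument for the "authoritative" option.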
Happy to iterate on the design before opening a PR. If the maintainers
prefer extending rtk discover with a --all-tools flag instead, that
also works — the underlying parser change is the same.