Open
Description
Summary
Refactor link-draft.py to use a config file (link-draft.toml) as the single source of truth for mappings, and add a lightweight ML classifier for predicting tags and other metadata.
Design
Config file: link-draft.toml
Committed to repo. Training updates it in-place; user reviews git diff and commits what's correct.
# Auto-updated by: link-draft.py --train
# Last trained: 2026-02-03T15:30:00Z
[author_domains]
"simonwillison.net" = "Simon Willison" # seen: 12x
"astralcodexten.com" = "Scott Alexander" # seen: 8x
[domain_tags]
"x.com" = ["tweet"]
"github.com" = ["programming"]
"amazon.com" = ["book"]
[keyword_tags]
"\\bAI\\b" = ["ai"]
"\\bLLM\\b" = ["ai", "llm"]
[known_people]
names = ["Jeremy Wertheimer", "Shalev NessAiver"]
[rejected.author_domains]
# Entries here won't be re-suggested by training
Training pipeline
- Scans all existing link posts
- Learns patterns weighted by recency (recent edits override older patterns)
- Updates link-draft.toml in-place
- Trains sklearn classifier for tag prediction, saves to _ignore_link-draft-model.pkl
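The recency weighting above could be sketched as follows. This is only one possible reading: the issue does not say how posts are weighted, so the exponential decay, the 180-day half-life, and the post fields (`domain`, `author`, `date`) are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def learn_author_domains(posts, now, half_life_days=180.0):
    """Return {domain: author}, letting recent posts outweigh older ones.

    `posts` is a list of dicts with hypothetical keys "domain", "author"
    and "date" (timezone-aware datetime). The decay scheme is an assumption.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for post in posts:
        age_days = (now - post["date"]).total_seconds() / 86400
        # A post's weight halves every `half_life_days` days.
        weight = 0.5 ** (age_days / half_life_days)
        scores[post["domain"]][post["author"]] += weight

    # For each domain, keep the author with the highest total weight.
    return {
        domain: max(by_author, key=by_author.get)
        for domain, by_author in scores.items()
    }


if __name__ == "__main__":
    now = datetime(2026, 2, 1, tzinfo=timezone.utc)
    posts = [
        {"domain": "example.com", "author": "Old Name",
         "date": datetime(2023, 1, 1, tzinfo=timezone.utc)},
        {"domain": "example.com", "author": "New Name",
         "date": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    ]
    print(learn_author_domains(posts, now))
```

With this scheme a single recent edit can override several older posts, which matches the stated goal that recent edits override older patterns.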
Training flags
- Auto-train if link-draft.toml missing or older than newest post
- --skip-train to bypass auto-training
- --train to force retrain
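A minimal sketch of the flag handling and auto-train check, under a few assumptions: "older than newest post" is read as a file modification time comparison, the two flags are treated as mutually exclusive, and the helper names (`needs_training`, `build_parser`) are hypothetical.

```python
import argparse
from pathlib import Path


def needs_training(config_path, post_paths, force=False, skip=False):
    """Decide whether to retrain before drafting.

    Assumption: post age is taken from file mtimes, not front matter dates.
    """
    if force:  # --train
        return True
    if skip:  # --skip-train
        return False
    if not config_path.exists():
        return True
    newest_post = max((p.stat().st_mtime for p in post_paths), default=0.0)
    return config_path.stat().st_mtime < newest_post


def build_parser():
    parser = argparse.ArgumentParser(prog="link-draft.py")
    # Assumption: passing both flags at once is an error.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--train", action="store_true",
                       help="force a retrain")
    group.add_argument("--skip-train", action="store_true",
                       help="bypass auto-training")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    posts = sorted(Path(".").glob("*.md"))  # hypothetical post location
    print(needs_training(Path("link-draft.toml"), posts,
                         force=args.train, skip=args.skip_train))
```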
Draft generation flow
- Load static data from link-draft.toml (user overrides take precedence)
- Load ML model for fuzzy predictions
- Combine: static rules first, model fills gaps
Tasks
- Define link-draft.toml schema
- Extract hardcoded mappings from link-draft.py to config
- Add training pipeline to scan posts and update toml
- Add recency weighting (recent posts weighted higher)
- Train sklearn classifier (e.g., SGDClassifier or MultinomialNB) for tags
- Add --train, --skip-train flags
- Add auto-train logic (train if config older than newest post)
- Add [rejected] section support
- Update draft generation to use config + model
Notes
- Training should be fast (<5 seconds for ~324 posts)
- Model file is gitignored; static config is committed
- If the user deletes a suggestion and it keeps coming back, add it to the [rejected] section
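One way the [rejected] check could work during training is sketched below. The issue does not define the layout of [rejected.author_domains], so this assumes it mirrors [author_domains] (domain mapped to the rejected author name); `drop_rejected` is a hypothetical helper.

```python
def drop_rejected(suggested, rejected):
    """Remove suggestions the user has explicitly rejected.

    Assumption: both arguments map domain -> author, and only an exact
    (domain, author) match counts as rejected.
    """
    return {
        domain: author
        for domain, author in suggested.items()
        if rejected.get(domain) != author
    }


if __name__ == "__main__":
    suggested = {"example.com": "Wrong Name",
                 "simonwillison.net": "Simon Willison"}
    rejected = {"example.com": "Wrong Name"}
    print(drop_rejected(suggested, rejected))
```

Matching on the exact pair means a different author can still be suggested later for the same domain.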
Issue created by Claude Opus 4.5