This repository contains the Lamatic.ai GitHub Automation Bundle, a workflow system designed to bridge the gap between static documentation and active repository management.
It enables:
- 📚 Continuous documentation ingestion for LLM context
- ⚡ Instant GitHub issue triaging and labeling
The system operates through two primary workflows: documentation ingestion and issue triage.
The documentation ingestion pipeline converts unstructured documentation into machine-readable knowledge.
- Targets specified documentation URLs
- Filters out non-essential HTML (e.g., navbars, footers)
- Extracts only core technical content
- Splits large documents into smaller semantic chunks
- Improves retrieval accuracy and prevents context loss
- Uses embedding models (e.g., `text-embedding-3-small`)
- Converts text chunks into high-dimensional vectors
- Stores vectors in a Vector Database:
  - Lamatic Managed Storage
  - Pinecone
  - Milvus
- Enables semantic search for contextual understanding
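The filtering and chunking steps above can be sketched with the Python standard library alone. This is a minimal illustration, not Lamatic's implementation: the `SKIPPED_TAGS` set, the function names, and the `max_chars` and `overlap` parameters are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Tags treated as page chrome in this sketch (an assumption, not Lamatic's filter).
SKIPPED_TAGS = {"nav", "footer", "header", "aside", "script", "style"}


class MainContentExtractor(HTMLParser):
    """Collects text that is not nested inside any skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.parts.append(text)


def extract_main_text(html: str) -> str:
    """Return the kept text, one blank line between text runs."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.parts)


def chunk_paragraphs(text: str, max_chars: int = 200, overlap: int = 1) -> list[str]:
    """Greedily pack paragraphs into chunks of roughly max_chars.

    The last `overlap` paragraphs of a chunk are repeated at the start of the
    next one so context carries across the boundary. A chunk can exceed
    max_chars when the carried-over text or a single paragraph is long.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], []
    for para in paragraphs:
        if current and len("\n\n".join(current + [para])) > max_chars:
            chunks.append("\n\n".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current = current + [para]
    if current:
        chunks.append("\n\n".join(current))
    return chunks


page = (
    "<nav>Home | Docs</nav>"
    "<main><h1>Install</h1><p>Run the installer.</p><p>Then restart.</p></main>"
    "<footer>Example footer</footer>"
)
text = extract_main_text(page)
chunks = chunk_paragraphs(text, max_chars=30, overlap=1)
for chunk in chunks:
    print(repr(chunk))
```

Splitting on paragraph boundaries instead of fixed character offsets keeps sentences intact, and the overlap is one simple way to limit context loss between neighboring chunks. Inline tags inside a paragraph would split the text run in this sketch, so a production scraper would need more careful handling.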
The issue triage workflow automates real-time issue classification and labeling.
- Fires on every new GitHub issue (`new_issue`)
- Captures title, body, and metadata
- Sends issue content to an LLM
- Optionally retrieves relevant documentation context
- Assigns labels based on intent: `bug`, `feature-request`, `enhancement`, `wontfix`
- Applies labels via GitHub API using a POST request
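For reference, the labeling call can be sketched as a plain request against GitHub's REST endpoint for adding labels to an issue. The helper name `build_label_request` and the owner, repository, and issue number are placeholders for illustration; how Lamatic issues this request internally is not documented here.

```python
import json
import urllib.request


def build_label_request(owner, repo, issue_number, labels, token):
    """Build (but do not send) a POST that adds labels to an issue."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}/labels"
    body = json.dumps({"labels": labels}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own repository and token.
request = build_label_request("octo-org", "demo-repo", 42, ["bug"], token="<GITHUB_TOKEN>")
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```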
- Lamatic.ai account
- GitHub Personal Access Token (PAT) with `repo` scope
- Documentation URL to ingest
- Set `DOCS_URL` in the Scraper Node
- Configure Vector Database destination
- Run initial crawl to populate vector storage
- Add `GITHUB_TOKEN` as a secret
- Define classification rules in LLM prompt:
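As one hypothetical way to express such rules, the sketch below assembles a prompt from the four intent labels and a couple of examples. The wording, the few-shot examples, and the `build_classification_prompt` helper are illustrative assumptions, not the prompt shipped with the bundle.

```python
ALLOWED_LABELS = ["bug", "feature-request", "enhancement", "wontfix"]

# Hypothetical few-shot examples; replace with real issues from your repository.
FEW_SHOT_EXAMPLES = [
    ("App crashes when saving a file", "bug"),
    ("Please add a dark mode", "feature-request"),
]


def build_classification_prompt(title, body, docs_context=""):
    """Assemble a single-label classification prompt for one issue."""
    lines = [
        "Classify the GitHub issue below.",
        "Reply with exactly one label from: " + ", ".join(ALLOWED_LABELS) + ".",
        "",
        "Examples:",
    ]
    for example_title, label in FEW_SHOT_EXAMPLES:
        lines.append(f"- {example_title!r} -> {label}")
    if docs_context:
        # Retrieved documentation is optional, matching the workflow description.
        lines += ["", "Relevant documentation:", docs_context]
    lines += ["", f"Title: {title}", f"Body: {body}", "Label:"]
    return "\n".join(lines)


prompt = build_classification_prompt("Login fails", "500 error on submit")
print(prompt)
```

Constraining the reply to a fixed label list makes the output easier to validate before it reaches the GitHub API.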
In your GitHub repository:
- Go to: `Settings > Webhooks > Add Webhook`
- Payload URL: Provided by Lamatic Trigger Node
- Content Type: `application/json`
- Events: Select `Issues`
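With this configuration, GitHub sends a JSON payload with the `X-GitHub-Event: issues` header for every issue event, and newly created issues carry `"action": "opened"`. The Lamatic Trigger Node receives this for you; the sketch below only illustrates which fields a receiver would typically read. The function name and the sample payload are assumptions.

```python
import json


def extract_new_issue(event_name, raw_body):
    """Return the fields needed for triage, or None for events to ignore."""
    if event_name != "issues":
        return None
    payload = json.loads(raw_body)
    if payload.get("action") != "opened":
        return None  # edited, closed, labeled, etc. are skipped in this sketch
    issue = payload["issue"]
    return {
        "repo": payload["repository"]["full_name"],
        "number": issue["number"],
        "title": issue["title"],
        "body": issue.get("body") or "",  # body can be null for empty issues
    }


# Trimmed sample payload with placeholder values.
sample = json.dumps({
    "action": "opened",
    "issue": {"number": 7, "title": "Crash on start", "body": None},
    "repository": {"full_name": "octo-org/demo-repo"},
})
print(extract_new_issue("issues", sample))
```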
To refresh the LLM knowledge base:
- Trigger the Scraper Workflow from Lamatic Dashboard
- The system will:
  - Re-scrape documentation
  - Re-vectorize content
  - Upsert into Vector DB
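The refresh cycle relies on upsert semantics: writing a chunk under a stable ID overwrites the old vector instead of duplicating it. Here is a minimal in-memory sketch of that idea, followed by a cosine-similarity lookup. `InMemoryVectorStore`, `toy_embed`, and the `url#index` ID scheme are stand-ins invented for the example; a real setup would use an embedding model and one of the supported vector databases.

```python
import math

VOCAB = ["install", "token", "webhook"]


def toy_embed(text):
    """Stand-in for an embedding model: counts of a few known words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    def __init__(self):
        self.records = {}  # record_id -> (vector, text)

    def upsert(self, record_id, vector, text):
        """Insert or overwrite; returns True when the ID was new."""
        is_new = record_id not in self.records
        self.records[record_id] = (vector, text)
        return is_new

    def search(self, query_vector, top_k=1):
        scored = [(cosine_similarity(query_vector, vec), text)
                  for vec, text in self.records.values()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


def refresh(store, url, chunks, embed):
    """Re-vectorize chunks and upsert them under stable url#index IDs."""
    inserted = 0
    for index, chunk in enumerate(chunks):
        if store.upsert(f"{url}#{index}", embed(chunk), chunk):
            inserted += 1
    return inserted


store = InMemoryVectorStore()
docs = ["install the cli", "create a token"]
first_run = refresh(store, "https://example.com/docs", docs, toy_embed)
second_run = refresh(store, "https://example.com/docs", docs, toy_embed)
print(first_run, second_run, len(store.records))
print(store.search(toy_embed("token setup")))
```

Because IDs are derived from the source URL and chunk position, running the refresh twice leaves the store the same size. Note that this scheme does not remove chunks that disappear when a page shrinks; that cleanup would need separate handling.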
Once configured, the system runs automatically:
Example Flow:
- User creates an issue:
- Webhook triggers Lamatic
- Classifier analyzes issue
- Labels applied: `bug`, `high-priority`
- Verify scraper CSS selectors target main content correctly
- Ensure GitHub PAT is not rate-limited for high-volume repos
- Adjust LLM temperature
- Improve prompt with better examples
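For the rate-limit check, GitHub reports the remaining quota in the `x-ratelimit-remaining` response header and the reset time (UTC epoch seconds) in `x-ratelimit-reset`. A small sketch of how a client might decide whether to pause, assuming you have access to the response headers; the `seconds_to_wait` helper is hypothetical.

```python
def seconds_to_wait(headers, now):
    """Return how long to pause before the next GitHub API call."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(lowered.get("x-ratelimit-reset", now))
    return max(0.0, reset_at - now)


exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"}
print(seconds_to_wait(exhausted, now=1700000000.0))
```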
Want to extend this setup?
You can define a custom JSON schema for GitHub API labeling requests to improve consistency and validation.
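As a starting point, such a schema might look like the sketch below, paired with a small hand-written check so the example needs no third-party validator. The schema contents and the `validate_label_request` helper are assumptions; the label list simply collects the labels mentioned in this README.

```python
LABEL_REQUEST_SCHEMA = {
    "type": "object",
    "required": ["labels"],
    "additionalProperties": False,
    "properties": {
        "labels": {
            "type": "array",
            "minItems": 1,
            "uniqueItems": True,
            "items": {
                "type": "string",
                "enum": ["bug", "feature-request", "enhancement",
                         "wontfix", "high-priority"],
            },
        }
    },
}


def validate_label_request(payload):
    """Check a payload against the schema above; returns a list of errors."""
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    errors = []
    extra = set(payload) - set(LABEL_REQUEST_SCHEMA["properties"])
    if extra:
        errors.append(f"unexpected keys: {sorted(extra)}")
    labels = payload.get("labels")
    if not isinstance(labels, list) or not labels:
        errors.append("labels must be a non-empty array")
        return errors
    allowed = LABEL_REQUEST_SCHEMA["properties"]["labels"]["items"]["enum"]
    strings = [label for label in labels if isinstance(label, str)]
    if len(strings) != len(labels):
        errors.append("labels must all be strings")
    for label in strings:
        if label not in allowed:
            errors.append(f"unknown label: {label!r}")
    if len(set(strings)) != len(strings):
        errors.append("labels must be unique")
    return errors


print(validate_label_request({"labels": ["bug", "high-priority"]}))
print(validate_label_request({"labels": ["urgent"]}))
```

Validating the LLM output against a fixed enum before calling the API prevents a hallucinated label from being created in the repository.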
This bundle enables a self-updating, intelligent GitHub workflow by combining:
- Documentation-aware LLMs
- Semantic search via vector databases
- Real-time issue classification