llm-d is a Kubernetes-native, high-performance distributed LLM inference framework designed for fast time-to-value and competitive performance per dollar. Built on vLLM, Kubernetes, and Inference Gateway, llm-d offers modular solutions for distributed inference with features like KV-cache aware routing and disaggregated serving.
New to llm-d? Here's how to get started:
- Join our Slack 💬 → Get your invite and visit llm-d.slack.com
- Explore our code 📂 → GitHub Organization
- Join a meeting 📅 → Add calendar
- Pick your area 🎯 → Browse Special Interest Groups
- 📖 Documentation: llm-d.ai
- 🏗️ Architecture: Architecture docs
- 📖 Project Details: PROJECT.md
- 📦 Releases: GitHub Releases
- 📅 Upcoming Events: Upcoming Events
- 💬 Slack: llm-d Workspace - Daily conversations and Q&A
- 📂 GitHub: llm-d Organization - Code, issues, and discussions
- 📧 Google Group: llm-d-contributors - Architecture diagrams and updates
- 📚 Google Drive: Public Documentation - Meeting recordings and project docs
All meetings are open to the public! 🌟
- 📅 Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
- 🎯 SIG Meetings: Various times throughout the week - See SIG details for schedules
Join to participate, ask questions, or just listen and learn!
Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:
- Inference Scheduler - Intelligent request routing and load balancing
- Benchmarking - Performance testing and optimization
- PD-Disaggregation - Prefill/decode separation patterns
- KV-Disaggregation - KV caching and distributed storage
- Installation - Kubernetes integration and deployment
- Autoscaling - Traffic-aware autoscaling and resource management
- Observability - Monitoring, logging, and metrics
- 📅 Upcoming Events - Meetups, talks, and conferences
- 📝 Contributing Guidelines - Complete guide to contributing code, docs, and ideas
- 👥 Special Interest Groups (SIGs) - Join focused teams working on specific areas
- 🤝 Code of Conduct - Our community standards and values
- Read Guidelines: Review our Code of Conduct and contribution process
- Sign Commits: All commits require DCO sign-off (`git commit -s`)
- 🐛 Bug fixes and small features - Submit PRs directly to component repos
- 🚀 New features with APIs - Require project proposals
- 📚 Documentation - Help improve guides and examples
- 🧪 Testing & Benchmarking - Contribute to our test coverage
- 💡 Experimental features - Start in llm-d-incubation org
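The DCO sign-off mentioned in the first steps above can be sketched end to end with plain git; the repository path, identity, and commit message here are illustrative:

```shell
# Minimal sketch of DCO sign-off (identity and message are illustrative)
tmp=$(mktemp -d)
git init -q "$tmp"
# -s appends a "Signed-off-by: Name <email>" trailer, which DCO checks look for
git -C "$tmp" -c user.name="Ada Dev" -c user.email="ada@example.com" \
    commit --allow-empty -q -s -m "docs: fix typo"
git -C "$tmp" log -1 --format=%B   # prints the message plus the Signed-off-by trailer
```

If a commit was pushed without the trailer, it can usually be fixed up with `git commit --amend -s` and a force-push to the PR branch.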
- 🛡️ Security Policy - How to report vulnerabilities and security issues
- 📢 Security Announcements - Join the llm-d-security-announce group for emails about security and major API announcements.
Follow llm-d across social platforms for updates, discussions, and community highlights:
- 💼 LinkedIn: @llm-d
- 🐦 X (Twitter): @_llm_d_
- ☁️ Bluesky: @llm-d.ai
- 🤖 Reddit: r/llm_d
- 📺 YouTube: @llm-d-project
Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.
- 💬 Slack: Join our Slack workspace and mention @community-team for a quick response
- 🐛 GitHub Issues: Open an issue for bug reports, feature requests, or general questions
- 📧 Mailing List: llm-d-contributors for broader community discussions
License: Apache 2.0