A curated list of resources on open coding copilots and related tools.
- Awesome open coding-copilot and friends
- Table of Contents
- Plugins
- Models
- Benchmarks/Datasets
- Papers
- Performance Comparisons
- Related Open Source Projects
- Ethics and Challenges
- Contributing Guidelines
These plugins integrate AI-assisted coding capabilities into various development environments:
- copilot-clone - An open-source alternative to GitHub Copilot
- fauxpilot - An open-source alternative to GitHub Copilot server
- twinny - The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code
- claude-dev - Autonomous coding agent that runs in your IDE and can create and edit files, execute commands, and more, asking your permission at every step
These are some of the prominent models used for code generation and understanding:
- CodeT5 - Open-source model for code understanding and generation
- CodeGen - Large language model for program synthesis
- StarCoder - Large language model trained on source code
- CodeBERT - Pre-trained model for programming and natural languages
- GPT-Neo - Open-source alternative to GPT-3
- CodeParrot - Large language model trained on code
Resources for evaluating and training code models:
- CodeSearchNet - Dataset and benchmark for code search
- Stack Exchange Dataset - Instruction dataset based on Stack Exchange
- The Pile - Large-scale dataset including programming language data
- APPS - Benchmark for code generation
- HumanEval - Benchmark for evaluating language models on coding tasks
- CodeXGLUE - Benchmark dataset for code intelligence
- CoNaLa - Dataset for mapping natural language to code
- NL2Code - A dataset for natural language to code generation
- PY150 - Dataset for Python code completion
- CodeCompletionBenchmark - Google's benchmark for code completion
- CodeSearchNet Challenge - Multiple datasets for code search tasks
- AdvTest - Adversarial test set for natural-language code search, part of CodeXGLUE
Seminal and recent research papers in the field:
- Evaluating Large Language Models Trained on Code - OpenAI's paper on Codex
- PyMT5: Multi-mode Translation of Natural Language and Python Code with Transformers
- InCoder: A Generative Model for Code Infilling and Synthesis
- Competition-Level Code Generation with AlphaCode
- CodeXGLUE: A Benchmark Dataset and Open Challenge for Code Intelligence
- CodeSearchNet Challenge: Evaluating the State of Semantic Code Search
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
- StarCoder: may the source be with you!
- CodeParrot: A Large Language Model for Code Generation
- GPT-Neo: Large-Scale Language Models for Code Generation
- DeepCoder: Learning to Write Programs
- Code2Vec: Learning Distributed Representations of Code
Studies and articles comparing the performance of different AI coding assistants:
- Comparing Copilot, ChatGPT, and Human Developers - Research study on code generation performance
- Evaluating LLM Performance on Programming Tasks - Comprehensive evaluation of various models
Open-source projects that complement or enhance AI-assisted coding:
- LSP - Language Server Protocol for editor-agnostic tooling
- Tree-sitter - Parser generator tool and incremental parsing library
- SonarQube - Static code analysis tool that can be enhanced with AI capabilities
Discussions and resources on the ethical considerations and challenges in AI-assisted coding:
- The Ethics of AI-Assisted Coding - ACM article on ethical considerations
- Challenges in AI-Assisted Coding - Research paper on challenges and future directions
- Copyright and AI-Generated Code - Legal perspective on AI-generated code
We welcome contributions to this awesome list! Here's how you can contribute:
- Fork the repository
- Create a new branch for your additions
- Add your links and descriptions, following the existing format
- Ensure your additions are in alphabetical order within their respective sections
- Create a pull request with a clear description of your changes
Please make sure any resources you add are:
- Relevant to AI-assisted coding
- Of high quality and useful to the community
- Not duplicates of existing entries
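Before opening a pull request, the ordering and duplicate rules above can be checked mechanically. The following is a minimal sketch, not part of this repository: it assumes every entry follows the `- Name - Description` shape used in this list and that ordering is compared case-insensitively. The names `ENTRY`, `entry_names`, and `check_section` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumed entry shape (taken from this list's bullets): "- Name - Description".
# The name is everything up to the first " - " separator.
ENTRY = re.compile(r"^- (?P<name>.+?) - (?P<desc>.+)$")


def entry_names(lines):
    """Return the names of the lines that match the assumed entry shape."""
    names = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            names.append(match.group("name"))
    return names


def check_section(lines):
    """Return a list of problems found among one section's entries."""
    names = entry_names(lines)
    problems = []

    # Duplicates, compared case-insensitively.
    seen = set()
    for name in names:
        key = name.casefold()
        if key in seen:
            problems.append(f"duplicate entry: {name}")
        seen.add(key)

    # Alphabetical order: each entry must not sort before its predecessor.
    for prev, cur in zip(names, names[1:]):
        if cur.casefold() < prev.casefold():
            problems.append(f"out of order: {cur!r} should come before {prev!r}")

    return problems


if __name__ == "__main__":
    sample = [
        "- fauxpilot - An open-source alternative to GitHub Copilot server",
        "- copilot-clone - An open-source alternative to GitHub Copilot",
    ]
    for problem in check_section(sample):
        print(problem)
```

Run it on one section at a time, since alphabetical order only applies within a section. Entries that do not match the assumed shape (for example, paper titles without a description) are skipped rather than reported.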
To suggest an addition or change without preparing a pull request yourself, open an issue.