ChatR: R Expert Chatbot
R users and developers commonly use LLMs for coding, as LLMs are proficient at generating R code and answering general questions about R. However, general-purpose foundation models are less accurate when it comes to specifics, such as the precise API of a third-party package or best practices for contributing to R itself. A chatbot that has been customized to yield accurate information about the R ecosystem and the contribution process would be useful to every R programmer, and especially to those looking for an on-ramp to becoming a contributor to the core. To be maximally inclusive, the bot should also run on local hardware, even in the absence of a GPU, while still supporting commercial models.
Proprietary platforms like chat.openai.com offer a number of R-oriented chatbots; however, none of these are freely available, which strictly limits their reach and excludes many R users and potential contributors.
The closest prior work we are aware of is the Shiny Assistant, a freely available bot that has been customized to answer questions about Shiny and even generate entire Shiny applications.
There are also many R/LLM interfaces, which will likely be useful for implementing our customizations and for providing a demonstrative chat interface directly in the R and/or RStudio session.
The contributor will experiment with prompt engineering, tool calling and RAG-based approaches to customizing a chatbot to R programming and contribution tasks. The output will be an R package that encapsulates the customizations and provides an interface to the bot. The package will rely on existing packages for capabilities like communicating with the model, embedding a chat widget in a simple Shiny app and indexing R-related documentation.
Every R user would benefit from a more accurate R-oriented chatbot. The bot will also help new contributors learn how to work with the R codebase and collaborate with R core, and new contributors are critical for the longevity of the project.
- Evaluating mentor: Michael Lawrence [email protected]: Member of R core, experienced with LLM customization and former GSOC mentor.
- Assisting mentor: Gabriel Becker [email protected]: Expert R programmer, committed advocate for new contributors to R and former GSOC mentor.
Contributors, please do one or more of the following tests before contacting the mentors above.
- Easy: Install the Ollama software, pull the Q4_K_M quantization of llama3.2:3b-instruct and ask it a question that could be answered by reading Writing R Extensions.
- Medium: Use the ellmer package to perform the easy task above programmatically.
- Hard: Create an R package that depends on ellmer and provides a function that takes the name of a package and returns a character vector of the functions exported by that package, using only ellmer and llama3.2:3b from Ollama. The list of functions does not need to be correct (making it correct is the whole point of this project).
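As a rough starting point for the medium and hard tests, the shape of such a function might look like the sketch below. It is not a reference solution. It assumes a local Ollama server is running with the `llama3.2:3b` model already pulled and that ellmer is installed; the function names `parse_function_names()` and `list_exports_llm()`, the system prompt and the reply-parsing rules are all hypothetical choices made for illustration.

```r
# Hypothetical helper: turn a free-text model reply into a character vector
# of candidate function names, one per non-empty line.
parse_function_names <- function(reply) {
  lines <- trimws(strsplit(reply, "\n", fixed = TRUE)[[1]])
  # Drop leading list markers such as "-", "*", "1." or "2)".
  lines <- sub("^([-*]|[0-9]+[.)])[[:space:]]*", "", lines)
  # Drop backticks and trailing "()" that models often add around names.
  lines <- gsub("`|\\(\\)", "", lines)
  lines[nzchar(lines)]
}

# Hypothetical wrapper: ask a local Ollama model for a package's exports.
# Assumes Ollama is running locally and the model tag has been pulled.
list_exports_llm <- function(pkg, model = "llama3.2:3b") {
  chat <- ellmer::chat_ollama(
    system_prompt = paste(
      "You are an R expert.",
      "Answer with one function name per line and nothing else."
    ),
    model = model
  )
  reply <- chat$chat(
    paste0("List the functions exported by the R package '", pkg, "'.")
  )
  # The answer is whatever the model recalls; it is not checked against
  # the package's actual NAMESPACE.
  parse_function_names(reply)
}
```

Calling `list_exports_llm("stats")` would then return whatever names the model produces, which may well be incomplete or invented. Keeping the parsing step separate from the model call makes it possible to check the parsing on a fixed string without a running model.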
Contributors, please post a link to your test results here.
- Dev Goel, Test Solutions
- Afraaz Ali, GitHub Profile, Easy Test