R Community Explorer: Twitter
The R language has a very large, active, and diverse global community that is the core of its ecosystem. While many aspects of the R ecosystem are continuously expanding in popularity, the R language may have been adversely affected in language rankings because of underestimations of its use. The "R Community Explorer" developed during Google Summer of Code 2019 offers a data-driven exploration of the R ecosystem that helps current and potential users accurately assess growth trends.
As more sub-communities and diverse members are added to the R community, and newer tools developed, continuous monitoring and reporting of developments within the R ecosystem will help potential contributors get involved.
Tracking topics of interest and tweet trends among #rstats users on Twitter will facilitate engagement and an understanding of usage growth patterns, while giving insight into how active the global #rstats community is. For conferences like useR!, we think a Twitter dashboard would help attract sponsorship, as it gives potential sponsors an insight into how much chatter the conference gets on Twitter. Most events are virtual these days, and people often rely on social media channels to share important resources and keep community engagement alive. Smaller local/regional conferences, including satRdays, could also benefit from having a Twitter dashboard.
R Community Explorer began in 2019 as a GSoC project that focused on aggregating information about R User Groups, R-Ladies Chapters and R-GSoC projects. The static dashboards render this information via interactive visualizations and data widgets. However, R Community Explorer is still far from being a complete exploration of the R community. Many aspects of the R ecosystem, including Twitter, remain potentially valuable additions.
This shinydashboard by Garrick depends on Shiny. R Community Explorer is static and does not depend on Shiny at the moment. A Shiny solution requires more resources to serve a global audience than a static website.
- Twitter exploration of #rstats tweets via the Twitter API, based on the rtweet R package.
Some coding expectations here are: reading our archive of #rstats tweets, which is stored in folders, into an R session, and obtaining stats such as the frequency and average number of #rstats tweets per day/week/month, trending #rstats topics per day/week/month, the most interesting #rstats Twitter content, the most popular #rstats Twitter accounts, etc.
Visualize this data using flexdashboard or an HTML dashboard using JavaScript. Word counts/wordclouds are also a good idea. Consider using the crosstalk package to allow visualizing data for a range of time.
Please find some visualization ideas here: https://gadenbuie.shinyapps.io/tweet-conf-dash/
- Adapt the achievements in (1) above to serve a similar purpose for useR! 2021 tweets (explore #useR2021 tweets and produce a dashboard). For a conference such as this, which is planned to be global and community focused, a way to track/explore social interaction via Twitter during the conference could be helpful.
- Set up pipelines using CI/CD to facilitate the ETL of related Twitter data daily and hourly. Document and package the results of (2) above such that other useR! or related conferences could use the tool for future events.
- Set up a way to customize the CSS styling of the dashboard to serve different occasions.
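The per-day statistics mentioned in the first item could be sketched as follows for the JavaScript dashboard option. This is a minimal illustration, not the project's implementation: it assumes each archived tweet has been exported as an object with an ISO 8601 `created_at` timestamp and a `screen_name`, and the function names and sample data are hypothetical.

```javascript
// Sketch only: the `created_at` and `screen_name` field names are assumptions
// about how the tweet archive is exported; adjust them to the real schema.

// Count tweets per calendar day (UTC), returned as sorted [day, count] pairs.
function countTweetsPerDay(tweets) {
  const counts = new Map();
  for (const tweet of tweets) {
    const day = new Date(tweet.created_at).toISOString().slice(0, 10);
    counts.set(day, (counts.get(day) || 0) + 1);
  }
  return [...counts.entries()].sort(([a], [b]) => a.localeCompare(b));
}

// Average number of tweets per day, over days that have at least one tweet.
function averagePerDay(dailyCounts) {
  if (dailyCounts.length === 0) return 0;
  const total = dailyCounts.reduce((sum, [, n]) => sum + n, 0);
  return total / dailyCounts.length;
}

// Most active accounts: the top `limit` screen names by tweet count.
function topAccounts(tweets, limit = 10) {
  const counts = new Map();
  for (const { screen_name } of tweets) {
    counts.set(screen_name, (counts.get(screen_name) || 0) + 1);
  }
  return [...counts.entries()]
    .sort(([nameA, a], [nameB, b]) => b - a || nameA.localeCompare(nameB))
    .slice(0, limit);
}

// Hypothetical sample data for illustration.
const sampleTweets = [
  { created_at: "2021-03-01T09:15:00Z", screen_name: "alice" },
  { created_at: "2021-03-01T18:40:00Z", screen_name: "bob" },
  { created_at: "2021-03-02T07:05:00Z", screen_name: "alice" },
];
const dailyCounts = countTweetsPerDay(sampleTweets);
console.log(dailyCounts, averagePerDay(dailyCounts), topAccounts(sampleTweets, 2));
```

Weekly or monthly figures would follow the same pattern with a different grouping key, for example the first seven characters of the ISO date for months.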
R, JavaScript
The importance of this project lies in the fact that several key players in the R ecosystem would function better if they had insights on what R users are most interested in over time.
Players such as the R Foundation, R Consortium, and organizations that build their portfolio around R services could be better informed on key areas of most interest to R users.
Almost every user or organization interested in the R language would likely be interested in understanding the popularity of the R language over time. Decision-making would be guided based on the decline or rise in popularity of aspects of the R language ecosystem. The same could be applicable to R Foundation sponsors and R Consortium members. This project hopes to build the framework for monitoring the popularity of R-related content and understanding the interests of the global R community over time by tracking Twitter activity.
EVALUATING MENTOR: Ben Ubah ([email protected]) is the primary maintainer of R Community Explorer, an R package author and GCI mentor, and has prior GSoC experience. He is part of the global organizing committee (Technology) for useR! 2021.
SECOND MENTOR: Rick Pack ([email protected]) is a contributor to R Community Explorer, past mentor for this project and a data scientist at LabCorp.
BACKUP MENTOR: Gergely Daroczi ([email protected]) is the author of, e.g., the pander, botor and logger packages. He is a Director of Data Ops at System1, and the organizer of the Budapest Users of R Network, the first satRday conference, and the European R Users Meeting 2018.
- Design a way of downloading all tweets that contain the #rstats hashtag every day and stacking them day by day without losing tweets or having duplicate tweets. You may want to consider using a continuous integration tool like GitHub Actions to run a script daily that pulls the data and joins it to previously stored data.
- Create a simple demo flexdashboard.
- Make a simple plot using any of echarts.js, d3.js or plotly.js.
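For the first test, the "stack without duplicates" step could look roughly like the sketch below. It is only one possible approach, under the assumption that every tweet object carries a unique string ID in an `id_str` field (the name used in Twitter API v1.1 payloads; an archive written by rtweet may name the column differently). The `mergeTweets` function and the sample records are hypothetical.

```javascript
// Sketch only: assumes a unique string ID per tweet in `id_str`.
// Merge a freshly pulled batch into the stored archive, keeping one copy per ID.
function mergeTweets(stored, pulled) {
  const byId = new Map();
  for (const tweet of [...stored, ...pulled]) {
    // A later copy replaces an earlier one, so counts such as likes stay fresh,
    // while the tweet keeps the position at which its ID was first seen.
    byId.set(tweet.id_str, tweet);
  }
  return [...byId.values()];
}

// Hypothetical data: yesterday's archive and today's overlapping pull.
const storedTweets = [
  { id_str: "1001", text: "Hello #rstats", favorite_count: 0 },
  { id_str: "1002", text: "New package! #rstats", favorite_count: 1 },
];
const pulledTweets = [
  { id_str: "1002", text: "New package! #rstats", favorite_count: 5 },
  { id_str: "1003", text: "Plotting tips #rstats", favorite_count: 0 },
];
const mergedTweets = mergeTweets(storedTweets, pulledTweets);
console.log(mergedTweets.map((t) => t.id_str));
```

Letting each daily pull overlap the previous one and deduplicating by ID afterwards is one way to avoid losing tweets when a scheduled run is late or fails, at the cost of re-downloading some records.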
Students, please write your name here, and send a link to your solution via email to avoid plagiarism.
- Name - Aryan Patel - Test 1, Test 2, Test-3
- Name - João Vitor F. Cavalcante - Test solution
- Name - Meet Bhatnagar - Test 1A, Test 1B, Test 2, Test 3, Cumulative Test Results
- Name - Aris Zagakos - Test Solutions
- Name - Anushka Gupta - Test Solutions