Welcome! This repository contains a collection of data science projects built using Python (and a little R). These projects cover a range of topics from statistical modeling and visualization to machine learning and real-world datasets.
Whether you're just starting out in data science or looking to explore practical examples, this repo has something for you!
Here are some of the types of projects and topics you'll find:
- Machine Learning: Logistic regression, random forests, prediction intervals, and more.
- Statistics & Probability: Bayesian inference, Poisson distributions, ECDFs, and statistical significance.
- Visualization: Projects using Plotly, Datashader, and pair plots for insightful visual storytelling.
- Time Series & Features: Working with cyclical features, time features, and stock market data.
- Data Engineering & Automation: Web automation scripts, Slack interactions, and organizing large datasets.
- Geographic & Traffic Data: Analysis of NYC traffic data and geospatial visualization.
- Miscellaneous Explorations: Economics, learning habits, weight loss tracking, and more.
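To give a flavor of one topic from the list, here is a minimal sketch of cyclical feature encoding. It is an illustration only, not code taken from the notebooks; the helper names `encode_cyclical` and `circle_distance` are hypothetical. The idea is to map a periodic value such as the hour of day onto a circle so that hour 23 and hour 0 end up close together.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value (e.g. hour of day) to a (sin, cos) pair on the unit circle."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between the encoded versions of two periodic values."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


if __name__ == "__main__":
    # Hours 23 and 0 are one hour apart, just like hours 0 and 1,
    # so their encoded distances should match.
    print(circle_distance(23, 0, 24))
    print(circle_distance(0, 1, 24))
    # Hours 0 and 12 sit on opposite sides of the circle.
    print(circle_distance(0, 12, 24))
```

Using both sine and cosine matters: with only one of them, two different hours can map to the same value.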
Each folder in the repo contains code, Jupyter notebooks, and notes that explain the thinking behind the project.
I've written up many of these projects as articles on Towards Data Science, where I explain the concepts in more detail for a broader audience.
These projects span several years, from early experiments 7-8 years ago to more recent work. Some folders include notes like:
- "Working on plotting"
- "Finished prediction intervals notebook"
- "Formatted with Black"
- "Added table of contents"
This variety reflects an evolving learning journey through data science.
To run these notebooks:
- Clone the repository:
  git clone https://github.com/YOUR_USERNAME/data-analysis.git
- Install dependencies:
  pip install -r requirements.txt
- Launch Jupyter Notebook or JupyterLab:
  jupyter notebook
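If a notebook fails on import after installation, a quick check like the following sketch can show which packages are missing from the current environment. The helper `missing_packages` and the `ASSUMED_PACKAGES` list are hypothetical and not part of this repository; replace the list with the import names that correspond to the entries in requirements.txt (note that an import name can differ from the pip name).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list for illustration; adjust it to match requirements.txt.
ASSUMED_PACKAGES = ["numpy", "pandas", "notebook"]

if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

Run it with the same Python interpreter that Jupyter uses, since a different interpreter may see a different set of installed packages.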
If you have questions or want to connect, feel free to reach out on Twitter @koehrsen_will.
This project is licensed under the MIT License. See the LICENSE file for details.