Stroke Data Analytics 🧠💻

📋 Overview

This repository showcases the implementations for two interconnected coursework tasks 🎓. We dive deep into stroke data analytics using a simulated dataset of 172,000 anonymous patient records 📈. The focus? Analyzing cardiovascular risk factors like age 👴, hypertension 🩸, smoking 🚬, glucose levels 🍯, and lifestyle habits 🏃‍♂️ to help clinicians prevent fatalities ⚕️.

The dataset (data.csv in /shared/) packs 20 features such as age, hypertension, heart_disease, avg_glucose_level, bmi, smoking_status, stroke, and more 🌐. Both tasks champion ethical AI use 🤝, inclusivity in health tech ♿, and sustainability in data-driven healthcare 🌍.

📂 Task 1: Procedural Data Loading & Querying 🔍

Objectives: Build three modules sans high-level libraries (no Pandas/NumPy – pure file I/O! 🚫) using core Python basics.
- dataset_module.py: Loads CSV into a nested dictionary 📚.
- query_module.py: Crunches stats (mean, median, mode) for stroke queries, e.g., average age for smokers with hypertension 🧮; dietary habits by stroke outcome 🍎; persists outputs to CSV 💾.
- ui_module.py: Interactive text-based menu for queries, weaving in prior modules 🔗.
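
To make the dataset_module.py idea concrete, here is a minimal sketch of loading a CSV into a nested dictionary with nothing but file I/O. The function name load_dataset, the row-number keys, and the "no quoted commas inside fields" simplification are assumptions for illustration, not necessarily what the module in this repo does.

```python
def load_dataset(path):
    """Load a comma-separated file into {row_number: {column: value}}.

    Assumptions (hypothetical, for illustration): the first line is a
    header row and no field contains a quoted comma.
    """
    records = {}
    with open(path, "r", encoding="utf-8") as handle:
        # The first line holds the column names.
        header = handle.readline().strip().split(",")
        for row_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            values = line.split(",")
            # Pair each column name with the value in the same position.
            records[row_number] = dict(zip(header, values))
    return records
```

Every value stays a string here, so a query has to convert fields such as age to a number before doing arithmetic on them.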
Main Entry: Fire up task1/main.ipynb in Jupyter for the demo 🎬.
Key Learning: Iteration loops 🔄, string wizardry ✂️, and custom data structures for domain-specific software 🏆.
Extensions: Streamlit.
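
The kind of query Task 1 describes (for example, average age for smokers with hypertension) can be sketched in plain Python as shown here. This is a rough illustration under assumptions: the helper names, the "smokes" label, the "1" flag for hypertension, and the tiny sample records are all hypothetical and may differ from the real data.csv and query_module.py.

```python
def mean(values):
    """Arithmetic mean; raises ZeroDivisionError on an empty list."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted list (average of the two middles if even)."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def mode(values):
    """Most frequent value; on a tie, the value seen first wins."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    best = None
    for value, count in counts.items():
        if best is None or count > counts[best]:
            best = value
    return best


def ages_of_hypertensive_smokers(records):
    """Collect ages for records that smoke and have hypertension."""
    ages = []
    for record in records.values():
        # Field values are strings, as they would be after a raw file read.
        if record["smoking_status"] == "smokes" and record["hypertension"] == "1":
            ages.append(float(record["age"]))
    return ages


# Hypothetical sample records, not taken from the real dataset.
sample = {
    1: {"age": "67", "hypertension": "1", "smoking_status": "smokes"},
    2: {"age": "45", "hypertension": "0", "smoking_status": "smokes"},
    3: {"age": "59", "hypertension": "1", "smoking_status": "smokes"},
    4: {"age": "59", "hypertension": "1", "smoking_status": "never smoked"},
    5: {"age": "72", "hypertension": "1", "smoking_status": "smokes"},
}

ages = ages_of_hypertensive_smokers(sample)
print(mean(ages), median(ages), mode(ages))
```

The filter and the statistics are kept as separate functions so the same mean/median/mode helpers can be reused for any other query.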

📊 Task 2: OOP, EDA, & ML Predictions 🤖

Objectives: Refactor with OOP flair 🏗️; unleash EDA with libraries; craft predictive models.
- load2.py: OOP-savvy loading + cleaning 🧹.
- eda2.py: Tackles missing data 🔍, descriptive stats (mean, SD, skewness) 📉, visualizations (bar/pie/box/scatter plots via Matplotlib/Seaborn 🎨), class balancing (e.g., SMOTE for stroke skew ⚖️), and train-test split 🎯.
- ui2.py: Upgraded UI for EDA/ML insights 🖥️.
- ML Magic: Feature engineering (e.g., BMI buckets 📏); trains 3 classifiers (Naive Bayes, Random Forest & XGBoost) per target (chronic_stress 😰, physical_activity 🏋️, income_level 💰, stroke 🧠) using Scikit-learn and XGBoost. Evaluates via confusion matrices 🗺️ and precision/recall/accuracy 🎯; visualizes model comparisons 📈.
Main Entry: Launch task2/main2.ipynb for the full pipeline 🚀.
Key Learning: OOP encapsulation/inheritance 🛡️, ML ethics ⚖️, and performance deep-dives 📚.
Extensions: Simple Tkinter GUI for predictions 🎨.
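
Two ideas from Task 2 are easy to show without any libraries: bucketing BMI into categories, and reading accuracy, precision, and recall off a confusion matrix. The sketch below is illustrative only. The function names are hypothetical, the cut-offs are the common WHO adult BMI ranges (the notebooks may bucket differently), and the labels are made-up toy data rather than model output from this repo.

```python
def bmi_bucket(bmi):
    """Map a BMI value to a coarse category (common WHO adult cut-offs)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary target."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Derive accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: of the predicted positives, how many were right.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the actual positives, how many were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (1 = stroke, 0 = no stroke), invented for this example.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]

metrics = evaluate(y_true, y_pred)
print(bmi_bucket(27.4), metrics)
```

With a skewed target like stroke, recall on the positive class is usually the number to watch, since a model can score high accuracy simply by predicting "no stroke" for everyone.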

🛠️ Technologies Stack

Python 3.8+ 🐍; Jupyter Notebooks 📓.
Task 1: Core Python (file I/O 📁, dicts/lists 🗂️).
Task 2: Pandas 🐼, NumPy 🔢, Matplotlib/Seaborn 📊, Scikit-learn 🤖, XGBoost 🌲, Imbalanced-learn ⚖️.

🚀 How to Run (Step-by-Step) 🕹️

Clone the Repo: git clone https://github.com/Akshen22/Stroke-Data-Analytics.git && cd Stroke-Data-Analytics 📥.
Task 1 Demo: cd "Task 1" && jupyter notebook main.ipynb 🔄.
Task 2 Pipeline (from the repo root): cd "Task 2" && jupyter notebook main2.ipynb 🎯.
Reports & Insights: Flip through /task1/Report.pdf 📄 and /task2/PCP2_Akshen_Report.pdf for designs, pseudocode, and reflections 💭.

💭 Reflections & Takeaways 🌟

These projects sharpened my modular Python chops 🛠️, from gritty low-level data wrangling to slick ML deployment 🚀, and sparked critical thinking about healthcare biases (e.g., urban-rural stroke gaps 🏙️🌾). Hurdles? Task 1's manual stats grind, conquered with smart loops 🔄; Task 2's class imbalance, tackled with balancing techniques like SMOTE to make the models more robust 💪. On the professional front, it mirrors real data scientist work, stressing clean code 🧹 and ethical AI 🤝. Next time? Wire in real-time APIs for live alerts 📡.

📜 License

MIT License: fork away, just give a shout-out! 🎉

Author: Akshen Dhami (https://www.linkedin.com/in/akshen-dhami22).
