
# 🧠 Machine Learning Algorithms

### From Scratch & Scikit-Learn

A complete educational journey through classical Machine Learning, built from scratch, line by line.

> Understand the math. Build the code. Train the mind.

⭐ Star this repo • 🍴 Fork it • 📚 Documentation • 🚀 Get Started




## 🎯 Overview

Welcome to **Machine Learning Algorithms**, your comprehensive playground for mastering classical ML! This repository features hand-crafted implementations of every major algorithm, built from the ground up.

### 🌟 Why This Repository?

- **Learn by Building:** every algorithm implemented from scratch using pure NumPy
- **Compare & Contrast:** side-by-side comparisons with the industry-standard Scikit-Learn
- **Visual Learning:** plots and visualizations that bring the theory to life
- **Real-World Applications:** applied examples on diverse datasets
- **Educational Focus:** clear documentation, math explanations, and code comments

💡 Perfect for students, developers, data scientists, and AI enthusiasts who want to truly understand Machine Learning from first principles.


## ✨ Features

### 🔬 From-Scratch Implementations

- Pure Python & NumPy implementations
- Step-by-step algorithm breakdowns
- Mathematical intuition explained
- No black boxes: see every calculation
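As a taste of the from-scratch style, here is a minimal sketch (illustrative, not taken from the repo's notebooks) of ordinary least squares in pure NumPy, solving the normal equation `theta = (XᵀX)⁻¹ Xᵀy` with a stable least-squares solver:

```python
import numpy as np

# Illustrative from-scratch linear regression via the normal equation.
def fit_linear_regression(X, y):
    # Prepend a column of ones so the first parameter is the intercept
    Xb = np.c_[np.ones(len(X)), X]
    # lstsq solves the least-squares problem without forming an explicit inverse
    theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return theta  # [intercept, slope_1, ..., slope_n]

# y = 2x + 1 exactly, so the recovered parameters should be [1, 2]
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = fit_linear_regression(X, y)
print(np.round(theta, 4))
```

Every step is visible: building the design matrix, solving the system, reading off the coefficients.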

### 📈 Production Comparisons

- Scikit-Learn implementations
- Performance benchmarking
- Hyperparameter tuning examples
- Best practices demonstrated

### 🎨 Beautiful Visualizations

- Decision boundaries
- Loss-function curves
- Feature-importance plots
- Clustering visualizations

### 📚 Rich Documentation

- Jupyter Notebooks with explanations
- Code comments and docstrings
- The theory behind each algorithm
- Use-case examples

๐Ÿ—๏ธ Tech Stack

Category Technologies
Language Python
Core Libraries NumPy Pandas
ML Framework Scikit-Learn
Visualization Matplotlib Seaborn
Environment Jupyter
Version Control Git GitHub

## 📂 Repository Structure

```
Machine-learning-Algorithm/
│
├── 📁 Supervised Learning
│   ├── 📂 Regression
│   │   ├── LinearRegression/          # Simple & Multiple Linear Regression
│   │   ├── PolynomialRegression/      # Polynomial Regression
│   │   └── GradientDescent/           # Batch, Mini-Batch, Stochastic GD
│   │
│   ├── 📂 Classification
│   │   ├── LogisticRegression/        # Binary & Multi-class Classification
│   │   ├── KNN/                       # K-Nearest Neighbors
│   │   ├── NaiveBayes/                # Gaussian, Multinomial, Bernoulli
│   │   ├── SupportVectorMachines/     # SVM with Kernel Tricks
│   │   ├── DecisionTrees/             # CART Algorithm
│   │   └── NeuralNetworks/            # Perceptron & MLP
│   │
│   └── 📂 Ensemble Methods
│       ├── RandomForest/              # Random Forest Classifier & Regressor
│       ├── Bagging/                   # Bootstrap Aggregating
│       ├── AdaBoost/                  # Adaptive Boosting
│       ├── GradientBoosting/          # Gradient Boosting Machines
│       └── XGBoost/                   # Extreme Gradient Boosting
│
├── 📁 Unsupervised Learning
│   ├── 📂 Clustering
│   │   ├── K-Means-clustering/        # K-Means from Scratch
│   │   ├── HierarchicalClustering/    # Agglomerative & Divisive
│   │   └── DBSCAN/                    # Density-Based Clustering
│   │
│   └── 📂 Dimensionality Reduction
│       └── PCA/                       # Principal Component Analysis
│
├── 📁 DataSets/                       # Curated Real-World Datasets
│   ├── iris.csv                       # Classification Dataset
│   ├── heart.csv                      # Healthcare Dataset
│   ├── Social_Network_Ads.csv         # Marketing Dataset
│   ├── ipl-matches.csv                # Sports Analytics
│   ├── zomato.csv                     # Restaurant Data
│   └── student_clustering.csv         # Educational Data
│
├── 📁 Visualizations/                 # Plots & Charts
├── 📄 requirements.txt                # Python Dependencies
├── 📄 CONTRIBUTING.md                 # Contribution Guidelines
├── 📄 LICENSE                         # MIT License
└── 📄 README.md                       # You are here!
```

## 🚀 Quick Start

### 🔧 Prerequisites

- Python 3.9 or higher
- pip package manager
- Git

### 📥 Installation

**1️⃣ Clone the Repository**

```bash
git clone https://github.com/Nitin-Prata/Machine-learning-Algorithm.git
cd Machine-learning-Algorithm
```

**2️⃣ Create a Virtual Environment**

Windows (PowerShell):

```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```

macOS / Linux:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

**3️⃣ Install Dependencies**

```bash
pip install -r requirements.txt
```

**4️⃣ Launch Jupyter Notebook**

```bash
jupyter notebook
```

Your browser will open automatically at http://localhost:8888.

### 🎯 First Steps

1. Navigate to any algorithm folder (e.g., `LinearRegression/`)
2. Open the Jupyter Notebook (`.ipynb` file)
3. Run the cells sequentially to see the implementation
4. Experiment with parameters and datasets
5. Compare the scratch implementation with Scikit-Learn
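The comparison in step 5 might look like the following sketch (variable names are illustrative, not from the repo's notebooks): a from-scratch least-squares fit and Scikit-Learn's `LinearRegression` solve the same problem, so their coefficients should agree.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 3x + 5 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 0.1, size=50)

# From scratch: least squares on the design matrix [1, x]
Xb = np.c_[np.ones(len(X)), X]
theta = np.linalg.lstsq(Xb, y, rcond=None)[0]

# Industry standard: Scikit-Learn
model = LinearRegression().fit(X, y)

# Both routes minimize the same squared error, so the
# intercept and slope should match to numerical precision.
print("scratch:", theta)
print("sklearn:", model.intercept_, model.coef_)
```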

## 🧮 Algorithms Implemented

### 📊 Regression Algorithms (5)

| Algorithm | From Scratch | Scikit-Learn | Notebook |
|---|:---:|:---:|:---:|
| Linear Regression | ✅ | ✅ | 📓 |
| Multiple Linear Regression | ✅ | ✅ | 📓 |
| Polynomial Regression | ✅ | ✅ | 📓 |
| Ridge Regression | ✅ | ✅ | 📓 |
| Lasso Regression | ✅ | ✅ | 📓 |

### 🎯 Classification Algorithms (8)

| Algorithm | From Scratch | Scikit-Learn | Notebook |
|---|:---:|:---:|:---:|
| Logistic Regression | ✅ | ✅ | 📓 |
| K-Nearest Neighbors (KNN) | ✅ | ✅ | 📓 |
| Naive Bayes | ✅ | ✅ | 📓 |
| Support Vector Machine (SVM) | ✅ | ✅ | 📓 |
| Decision Trees | ✅ | ✅ | 📓 |
| Random Forest | ✅ | ✅ | 📓 |
| Neural Networks (MLP) | ✅ | ✅ | 📓 |
| Softmax Regression | ✅ | ✅ | 📓 |

### 🌳 Ensemble Methods (5)

| Algorithm | From Scratch | Scikit-Learn | Notebook |
|---|:---:|:---:|:---:|
| Random Forest | ✅ | ✅ | 📓 |
| Bagging | ✅ | ✅ | 📓 |
| AdaBoost | ✅ | ✅ | 📓 |
| Gradient Boosting | ✅ | ✅ | 📓 |
| XGBoost | ✅ | ✅ | 📓 |

### 🔍 Clustering Algorithms (3)

| Algorithm | From Scratch | Scikit-Learn | Notebook |
|---|:---:|:---:|:---:|
| K-Means Clustering | ✅ | ✅ | 📓 |
| Hierarchical Clustering | ✅ | ✅ | 📓 |
| DBSCAN | ✅ | ✅ | 📓 |

### 📉 Dimensionality Reduction (1)

| Algorithm | From Scratch | Scikit-Learn | Notebook |
|---|:---:|:---:|:---:|
| Principal Component Analysis (PCA) | ✅ | ✅ | 📓 |

### ⚡ Optimization Algorithms (3)

| Algorithm | From Scratch | Notebook |
|---|:---:|:---:|
| Batch Gradient Descent | ✅ | 📓 |
| Mini-Batch Gradient Descent | ✅ | 📓 |
| Stochastic Gradient Descent | ✅ | 📓 |
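The three gradient-descent variants differ only in how much data each update sees. A compact sketch (function and variable names are illustrative; noiseless data is used so all three converge to the same line):

```python
import numpy as np

# One function covers all three variants: batch_size=None uses the whole
# dataset (batch GD), a small batch gives mini-batch GD, and batch_size=1
# is stochastic GD.
def gradient_descent(X, y, lr=0.1, epochs=500, batch_size=None, seed=0):
    rng = np.random.default_rng(seed)
    Xb = np.c_[np.ones(len(X)), X]          # add intercept column
    theta = np.zeros(Xb.shape[1])
    n, bs = len(Xb), batch_size or len(Xb)
    for _ in range(epochs):
        idx = rng.permutation(n)            # reshuffle each epoch
        for start in range(0, n, bs):
            b = idx[start:start + bs]
            # Gradient of mean squared error on the current batch
            grad = 2 / len(b) * Xb[b].T @ (Xb[b] @ theta - y[b])
            theta -= lr * grad
    return theta

X = np.linspace(0, 1, 40).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0                     # exact line: intercept 1, slope 2

theta_batch = gradient_descent(X, y)                  # batch GD
theta_mini = gradient_descent(X, y, batch_size=8)     # mini-batch GD
theta_sgd = gradient_descent(X, y, batch_size=1)      # stochastic GD
print(theta_batch, theta_mini, theta_sgd)             # all near [1, 2]
```

Smaller batches mean noisier but cheaper updates; on noisy data the stochastic variant would oscillate around the optimum instead of settling exactly.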

## 📊 Algorithm Summary

| Category | Count | Implementation Status |
|---|:---:|---|
| Regression | 5 | ✅ Complete |
| Classification | 8 | ✅ Complete |
| Ensemble Methods | 5 | ✅ Complete |
| Clustering | 3 | ✅ Complete |
| Dimensionality Reduction | 1 | ✅ Complete |
| Optimization | 3 | ✅ Complete |
| **TOTAL** | **25** | ✅ Complete |

## 📊 Dataset Collection

All datasets are curated, cleaned, and ready to use in the `/DataSets` folder.

| Dataset | Rows | Features | Use Case | Domain |
|---|---:|---:|---|---|
| iris.csv | 150 | 4 | Multi-class Classification | Botany |
| heart.csv | 303 | 13 | Binary Classification | Healthcare |
| Social_Network_Ads.csv | 400 | 4 | Marketing Classification | Business |
| ipl-matches.csv | 756 | 18 | Regression & Analysis | Sports |
| zomato.csv | 9551 | 21 | Clustering | Food Industry |
| student_clustering.csv | 2000 | 7 | Clustering | Education |
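Any of these files can be loaded with Pandas in the usual way. The sketch below uses a tiny inline stand-in for `iris.csv` so it is self-contained; the column names follow the conventional iris schema and may differ from the actual file.

```python
import io
import pandas as pd

# Inline sample standing in for DataSets/iris.csv; in the repo you would
# call pd.read_csv("DataSets/iris.csv") instead.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""
df = pd.read_csv(io.StringIO(csv_text))

X = df.drop(columns="species").to_numpy()  # feature matrix, one row per flower
y = df["species"].to_numpy()               # class labels
print(df.shape)
```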

## 📸 Visualizations

### 🎨 Sample Outputs

Decision boundaries, loss curves, clustering plots, and feature-importance visualizations are included in each notebook.


## 🎓 Learning Path

### 🔰 Beginner Track (Weeks 1-4)

1. **Weeks 1-2: Linear & Logistic Regression**
   - Start with `LinearRegression/`
   - Move to `LogisticRegression/`
   - Understand cost functions and gradient descent
2. **Week 3: Classification Basics**
   - Explore `KNN/`
   - Study `NaiveBayes/`
   - Practice on the iris dataset
3. **Week 4: Tree-Based Methods**
   - Learn `DecisionTrees/`
   - Build intuition with visualizations

### 🚀 Intermediate Track (Weeks 5-8)

1. **Weeks 5-6: Advanced Classification**
   - Master `SupportVectorMachines/`
   - Understand kernel tricks
   - Implement `NeuralNetworks/`
2. **Week 7: Ensemble Methods**
   - Study `RandomForest/`
   - Compare with `Bagging/`
   - Understand bootstrap aggregating
3. **Week 8: Clustering**
   - Implement `K-Means-clustering/`
   - Explore `HierarchicalClustering/`
   - Try `DBSCAN/`
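The K-Means of Week 8 boils down to alternating two steps, assign and update, until the centroids stop moving. A minimal NumPy sketch (illustrative; the repo's notebook may differ in details):

```python
import numpy as np

# Minimal K-Means: alternate between assigning points to the nearest
# centroid and moving each centroid to the mean of its members.
def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: distance from every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: keep the old centroid if a cluster goes empty
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # converged
        centers = new_centers
    return centers, labels

# Two well-separated blobs; the centers should land near (0, 0) and (5, 5)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(5, 0.3, (30, 2))])
centers, labels = kmeans(X, k=2)
print(np.round(centers[np.argsort(centers[:, 0])], 1))
```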

### ⚡ Advanced Track (Weeks 9-12)

1. **Weeks 9-10: Boosting Algorithms**
   - Deep-dive into `AdaBoost/`
   - Master `GradientBoosting/`
   - Optimize with `XGBoost/`
2. **Week 11: Dimensionality Reduction**
   - Understand `PCA/`
   - Apply it to high-dimensional data
3. **Week 12: Optimization Techniques**
   - Compare gradient-descent variants
   - Implement custom optimizers
   - Tune hyperparameters
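Hyperparameter tuning in Week 12 is typically done with cross-validated grid search. A hedged sketch using Scikit-Learn's `GridSearchCV` on synthetic data (the parameter grid and dataset here are only examples):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data as a stand-in for a real dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Try several values of k with 5-fold cross-validation; GridSearchCV
# refits the best model on the full data afterwards.
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_)           # the winning number of neighbors
print(round(grid.best_score_, 3))  # its mean cross-validated accuracy
```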

## 🎯 Key Learning Outcomes

After completing this repository, you will be able to:

- ✅ Understand the mathematical foundations of ML algorithms
- ✅ Implement algorithms from scratch using NumPy
- ✅ Debug and optimize ML code effectively
- ✅ Compare custom implementations with Scikit-Learn
- ✅ Visualize model behavior and decision boundaries
- ✅ Apply algorithms to real-world datasets
- ✅ Choose the right algorithm for a given problem
- ✅ Tune hyperparameters for optimal performance

๐Ÿค Contributing

Contributions are always welcome! Here's how you can help:

๐ŸŒŸ Ways to Contribute

  • ๐Ÿ› Report bugs and issues
  • ๐Ÿ’ก Suggest new algorithms to implement
  • ๐Ÿ“ Improve documentation
  • ๐ŸŽจ Add visualizations
  • ๐Ÿ“Š Contribute new datasets
  • โœจ Optimize existing code
  • ๐Ÿงช Add unit tests

๐Ÿ“‹ Contribution Guidelines

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Please read CONTRIBUTING.md for detailed guidelines.


## 👨‍💻 Author

**Nitin Pratap Singh**

🎓 B.Tech in Computer Science (AI) | India 🇮🇳

💼 Machine Learning, AI Education & Open Source

> "Learn the math. Build the code. Train the mind." 🧠

GitHub • LinkedIn • Email


## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

```
MIT License

Copyright (c) 2024 Nitin Pratap Singh

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files...
```

โญ Support This Project

If you find this repository helpful, please consider:

Action Why?
โญ Star this repository Show appreciation & help others discover it
๐Ÿด Fork it Create your own version & experiment
๐Ÿ‘€ Watch Get notified of updates
๐Ÿ’ฌ Share Help the ML community learn
๐Ÿ› Report Issues Help improve the project
๐Ÿค Contribute Make it even better



๐Ÿ™ Acknowledgments

Special thanks to:

  • Andrew Ng for his legendary Machine Learning course that inspired this project
  • CampusX for their exceptional ML tutorials and educational content
  • Scikit-Learn team for the amazing library
  • NumPy contributors for the numerical computing foundation
  • The open-source community for inspiration
  • You for taking the time to explore this repository!

## 📚 Additional Resources

### 📖 Recommended Reading

### 🎥 Video Courses

### 🌐 Online Resources


## 💭 Final Thoughts

Machine Learning is not just about using libraries; it's about understanding the principles that make those libraries work. This repository bridges the gap between theory and practice, empowering you not just to use ML, but to truly understand it.

### 🚀 Start Your ML Journey Today!

Get Started • View Notebooks • Join Community

Made with ❤️ and ☕ by **Nitin Pratap Singh**

Happy Learning! 🎓🚀
