Skip to content

Performing a Principal component analysis on MNIST dataset

Notifications You must be signed in to change notification settings

AmiraBelkis/Mnist-PCA

Repository files navigation

Project Title: Image Compression using Principal Component Analysis (PCA) on the MNIST Dataset

Objective:

To apply Principal Component Analysis (PCA) on the MNIST handwritten digit dataset to effectively compress image size while preserving crucial information.

Methodology:

  1. Data Acquisition: Acquire the MNIST dataset, consisting of 60,000 grayscale images of handwritten digits (0-9). Each image is represented by a 28x28 pixel matrix, resulting in a total of 784 features per image.

  2. Data Preprocessing: Standardize the pixel intensities to ensure a mean of zero and a standard deviation of one for each feature. This facilitates PCA in identifying the most significant patterns in the data.

  3. Principal Component Analysis (PCA): Implement PCA on the standardized data, projecting it onto a lower-dimensional subspace while maintaining the maximum variance in the original dataset.

  4. Image Compression: Reconstruct the compressed images from the reduced set of principal components. Evaluate the trade-off between compression rate and image quality.

Technologies Utilized:

  • Python: Programming language for data manipulation and algorithm implementation.

  • MongoDB: NoSQL database for storing and retrieving the MNIST dataset.

  • Matplotlib: Python library for data visualization and generating plots.

  • NumPy: Numerical computing library for efficient array operations.

  • Pandas: Data analysis and manipulation library for processing and handling tabular data.

  • scikit-learn: Machine learning library, specifically utilizing the PCA module for dimensionality reduction.

  • Keras: Deep learning library for neural network modeling.

About

Performing a Principal component analysis on MNIST dataset

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published