In this repository I implemented matrix multiplication on Hadoop, and linear regression with gradient descent using Spark.

yashasvimatta/BigData-and-CloudComputing

BigData-and-CloudComputing

Overview

This repository contains two projects that use Apache Spark for large-scale data processing and machine learning. Each project shows how Spark distributes a computation-heavy task so that it scales to large datasets.

Projects

  1. Linear Regression and Gradient Descent

    • Implements linear regression using both the normal equation and gradient descent optimization techniques.
    • Utilizes Spark RDDs and Breeze for matrix operations and inversion.
    • Includes RMSE calculation for model performance evaluation.

  2. Matrix Multiplication using Spark

    • Demonstrates efficient matrix multiplication using Spark's distributed computation capabilities.
    • Handles large matrices that may not fit into the memory of a single machine, showcasing Spark's scalability.
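
To make the first project's description concrete, here is a minimal sketch of linear regression on a Spark RDD using both the normal equation and batch gradient descent, followed by an RMSE check. It is not taken from the repository: the object name `LinearRegressionSketch`, the `Point` type alias, the learning rate, the iteration count, and the toy data are all assumptions chosen for illustration.

```scala
import breeze.linalg.{DenseVector, inv}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical sketch, not the repository's code.
object LinearRegressionSketch {
  // One example: feature vector (first component 1.0 for the intercept) and label.
  type Point = (DenseVector[Double], Double)

  // Normal equation: w = (X^T X)^-1 X^T y.
  // X^T X and X^T y are sums over rows, so each is a map + reduce on the RDD;
  // the small d x d inversion is done locally with Breeze.
  def normalEquation(data: RDD[Point]): DenseVector[Double] = {
    val xtx = data.map { case (x, _) => x * x.t }.reduce(_ + _)
    val xty = data.map { case (x, y) => x * y }.reduce(_ + _)
    inv(xtx) * xty
  }

  // Batch gradient descent on the mean squared error.
  // alpha (learning rate) and iterations are assumed hyperparameters.
  def gradientDescent(data: RDD[Point], alpha: Double, iterations: Int): DenseVector[Double] = {
    val n = data.count().toDouble
    val dim = data.first()._1.length
    (1 to iterations).foldLeft(DenseVector.zeros[Double](dim)) { (w, _) =>
      // Sum of (prediction - label) * features over all examples.
      val gradient = data.map { case (x, y) => x * ((w dot x) - y) }.reduce(_ + _)
      w - gradient * (alpha / n)
    }
  }

  // Root mean squared error of the predictions w . x against the labels.
  def rmse(data: RDD[Point], w: DenseVector[Double]): Double =
    math.sqrt(data.map { case (x, y) => val e = (w dot x) - y; e * e }.mean())

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("LinearRegressionSketch").master("local[*]").getOrCreate()
    // Toy data generated from y = 1 + 2x, so both methods should recover weights near (1.0, 2.0).
    val data: RDD[Point] = spark.sparkContext
      .parallelize(0 to 4)
      .map(i => (DenseVector(1.0, i.toDouble), 1.0 + 2.0 * i))
      .cache()

    val wNormal = normalEquation(data)
    val wGd = gradientDescent(data, alpha = 0.1, iterations = 1000)
    println(s"normal equation: $wNormal, RMSE = ${rmse(data, wNormal)}")
    println(s"gradient descent: $wGd, RMSE = ${rmse(data, wGd)}")
    spark.stop()
  }
}
```

The normal equation needs one pass over the data plus a local inversion, which is practical only when the number of features is small. Gradient descent needs one pass per iteration but never builds or inverts the d x d matrix.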

Installation

The projects depend on the following software:

  • Apache Spark
  • Scala
  • Breeze (for numerical operations in the linear regression project)
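
If the projects are built with sbt, the dependencies above could be declared roughly as follows. This is a hypothetical `build.sbt`: the repository does not state its build tool or versions, so the project name and every version number below are assumptions that should be matched to the Spark installation actually in use.

```scala
// build.sbt (hypothetical; all versions are assumptions, not taken from the repository)
name := "bigdata-and-cloudcomputing-sketch"

// Spark 3.5.x is published for Scala 2.12 and 2.13.
scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  // "provided" because spark-submit supplies Spark at runtime.
  "org.apache.spark" %% "spark-core" % "3.5.1" % "provided",
  // Breeze is only needed by the linear regression project.
  "org.scalanlp" %% "breeze" % "2.1.0"
)
```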

Usage

Go to the individual project directories for detailed instructions on running each project:

  • Linear_Regression/: Contains the implementation and instructions for the linear regression and gradient descent project.
  • Matrix_Multiplication/: Contains the implementation and instructions for the matrix multiplication project.
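
As a rough idea of what the matrix multiplication project computes, the sketch below multiplies two matrices stored as `(row, column, value)` entries by joining on the shared index and summing the partial products. It is an assumed approach for illustration, not the code in `Matrix_Multiplication/`; the names `MatrixMultiplySketch`, `Entry`, and `multiply` are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical sketch, not the repository's code.
object MatrixMultiplySketch {
  // One matrix entry in coordinate form: (row, column, value).
  type Entry = (Int, Int, Double)

  // C(i, j) = sum over k of A(i, k) * B(k, j).
  def multiply(a: RDD[Entry], b: RDD[Entry]): RDD[((Int, Int), Double)] = {
    val aByCol = a.map { case (i, k, v) => (k, (i, v)) } // key A by its column index k
    val bByRow = b.map { case (k, j, v) => (k, (j, v)) } // key B by its row index k
    aByCol.join(bByRow)                                  // pair every A(i, k) with every B(k, j)
      .map { case (_, ((i, av), (j, bv))) => ((i, j), av * bv) }
      .reduceByKey(_ + _)                                // sum the partial products per output cell
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MatrixMultiplySketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext
    // A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], so A * B = [[19, 22], [43, 50]].
    val a = sc.parallelize(Seq((0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)))
    val b = sc.parallelize(Seq((0, 0, 5.0), (0, 1, 6.0), (1, 0, 7.0), (1, 1, 8.0)))
    multiply(a, b).collect().sortBy(_._1).foreach(println)
    spark.stop()
  }
}
```

Because each entry is a separate record, neither matrix has to fit in the memory of a single machine; the join and the `reduceByKey` are both shuffles that Spark spreads across the cluster.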

License

This repository is licensed under the MIT License. See the LICENSE file for more details.
