Big Data Experiments using Apache Spark
This module implements word count on a given dataset of words in three variants:
- Do a word count
- Do a double word count
- Find the frequency of single words taken from another given list
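The three variants above can be sketched in plain Python as follows. This is only an illustration of the counting logic, not the module's Spark code; in Spark the same steps would typically map onto RDD operations such as flatMap, map and reduceByKey. It assumes that words are separated by whitespace, that "double word count" means counting pairs of adjacent words, and that the third variant looks up each word of a second list in the counts. All function names are hypothetical.

```python
from collections import Counter


def word_count(lines):
    """Count how often each whitespace-separated word occurs."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def double_word_count(lines):
    """Count pairs of adjacent words within each line.

    Assumption: "double word count" means counting two-word sequences.
    """
    counts = Counter()
    for line in lines:
        words = line.split()
        # zip pairs every word with its right-hand neighbour
        counts.update(zip(words, words[1:]))
    return counts


def frequency_of(lines, wanted):
    """Return the count of each word in `wanted` (0 if it never occurs)."""
    counts = word_count(lines)
    return {word: counts[word] for word in wanted}


if __name__ == "__main__":
    sample = ["spark counts words", "spark counts pairs"]
    print(word_count(sample))
    print(double_word_count(sample))
    print(frequency_of(sample, ["spark", "hadoop"]))
```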
This module implements PageRank (tested on the WEX dataset) using two methods:
- Naive Apache Spark implementation
- GraphX Scala library
It also finds the list of the top hundred universities (the list of universities was scraped from 4icu.org).
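The naive method can be sketched in plain Python as an iterative rank update. This is a minimal illustration under stated assumptions, not the module's Spark or GraphX code: the graph is a small dictionary of outgoing links, the damping factor of 0.85 and the fixed iteration count are assumed defaults, and the function names and the `allowed` filter (standing in for the scraped university list) are hypothetical.

```python
def pagerank(links, iterations=10, damping=0.85):
    """Naive iterative PageRank over a dict {page: [pages it links to]}.

    Pages without outgoing links pass on no rank (a simplification).
    """
    # Include pages that only ever appear as link targets.
    pages = set(links) | {dest for dests in links.values() for dest in dests}
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        contribs = {page: 0.0 for page in pages}
        for page, dests in links.items():
            if not dests:
                continue
            # Each page splits its current rank evenly among its targets.
            share = ranks[page] / len(dests)
            for dest in dests:
                contribs[dest] += share
        ranks = {page: (1 - damping) + damping * total
                 for page, total in contribs.items()}
    return ranks


def top_pages(ranks, n=100, allowed=None):
    """Return the n highest-ranked (page, rank) pairs.

    `allowed` is a hypothetical filter, e.g. a set of university names.
    """
    items = [(page, rank) for page, rank in ranks.items()
             if allowed is None or page in allowed]
    return sorted(items, key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, rank in top_pages(pagerank(graph), n=2):
        print(page, round(rank, 4))
```

Restricting `top_pages` to a set of known names is one possible way to get from page ranks to a top-hundred list; how the module actually combines the two is not stated here.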