M1522.006300 Distributed Systems

This is the course project folder of Group 17 for M1522.006300 Distributed Systems.

Project Description

The goal of this project is to deploy and manage a prototype cloud cluster running batch-processing WordLetterCount applications. There are two WordLetterCount applications, implemented in different ways: one uses the Spark API; the other uses the WordCount API and a self-designed resource scheduler.
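As a rough illustration of the kind of computation a WordLetterCount job performs, here is a minimal single-machine sketch in plain Python. It is not the project's Spark or WordCount implementation. The function name `word_letter_count`, the case-insensitive counting, and the tokenization rule (words are runs of the letters a to z) are all assumptions made for this example; the actual requirements are defined in the project specification.

```python
from collections import Counter
import re


def word_letter_count(text):
    """Return (word_counts, letter_counts) for text, ignoring case.

    Hypothetical helper for illustration only; not part of the project code.
    """
    lowered = text.lower()
    # Assumed tokenization: a word is a maximal run of the letters a-z.
    words = re.findall(r"[a-z]+", lowered)
    word_counts = Counter(words)
    # Count every letter that appears inside the extracted words.
    letter_counts = Counter(ch for word in words for ch in word)
    return word_counts, letter_counts


if __name__ == "__main__":
    words, letters = word_letter_count("The cat and the hat")
    print(words.most_common(3))
    print(letters.most_common(3))
```

In a cluster setting the same two counts would be computed per input partition and then merged, which is the part the Spark API or a custom scheduler takes care of.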

Developer Tutorials

Refer to the docs folder for useful guides. The project specification is in Specification.md.

Refer to GCP guide for a detailed tutorial on how to configure, access and use your GCP clusters.

Our project ID is peaceful-fact-294309. You can use the web-based GCP Console dashboard to view our cluster, VMs, and Pods.

To-Dos

  • Deploy Google Dataproc on GKE (ref: Dataproc on Google Kubernetes Engine)
  • Install WordCount locally to test
  • Test WordCount on GKE
    • Deploy Hadoop on GKE
    • Tweak Hadoop deployment, integration with GCS