This repository is designed to quickly get you started with new Machine Learning projects on Google Cloud Platform. Slides: https://bit.ly/mlwithgcp
- preprocessing pipeline (with Apache Beam) that runs on Cloud Dataflow or locally
 - model training (with Tensorflow) that runs locally or on ML Engine
 - ready to deploy saved models to deploy on ML Engine
 - starter code to use the saved model on ML Engine
 
Note You will need a Linux or Mac environment with Python 2.7.x to install the dependencies [1]. Install the following dependencies:
You need to complete the following parts to run the code:
- preprocess.py pipeline with your own custom preprocess steps
 - model.py with your own model function according to the specification
 - config.py with your project-id and databuckets
 - upload data to your buckets, you can upload data/test.csv to test this code
 - (optionally) task.py with more custom training steps
 
You can run preprocess.py in the cloud using:
python preprocess.py --cloud
      
To improve efficiency you can also run the code locally on a sample of the dataset:
python preprocess.py
You can submit a ML Engine training job with:
gcloud ml-engine jobs submit training my_job \
                --module-name trainer.task \
                --staging-bucket gs://<staging_bucket> \
                --package-path trainer
                --runtime-version 1.10
Testing it locally:
gcloud ml-engine local train --package-path trainer \
                           --module-name trainer.task
To deploy your model to ML Engine
gcloud ml-engine models create MODEL_NAME --regions=REGION
gcloud ml-engine versions create VERSION --model=MODEL_NAME --origin=ORIGIN
To test the deployed model using python:
python predictions/predict.py
To test the deployed model with gcloud ml-engine predict command:
gcloud ml-engine predict --model MODEL_NAME --version VERSION --json-instances instances.json
We are working to add the following functionalities:
- hypertune
 - tensorflow-transform
 
[1] MLEngine-Boilerplate requires both Tensorflow as Apache Beam and currently Tensorflow on Windows only supports Python 3.5.x