This example uses the mnist model to identify a digit from an image. It steps through a simple DKube workflow.
Note: This example runs on DKube V3.x and above.
Before using DKube to experiment, train, and deploy, the resources must be set up.
- From the `Code` menu on the left, select `+ Add Code` with the following fields:
  - Name: `mnist` (or choose `<your-code-repo>`)
  - Code Source: `Git`
  - URL: `https://github.com/oneconvergence/dkube-examples.git`
  - Branch: `tensorflow`
- Leave the other fields in their current selection and select `Add Code`
- From the `Datasets` menu, select `+ Add Dataset` with the following fields:
  - Name: `mnist` (or choose `<your-dataset-repo>`)
  - Dataset source: `Other`
  - URL: `https://s3.amazonaws.com/img-datasets/mnist.pkl.gz`
- Leave the other fields in their current selection and select `Add Dataset`
- From the `Models` menu, select `+ Add Model` with the following fields:
  - Name: `mnist` (or choose `<your-model-repo>`)
- Leave the other fields in their current selection and select `Add Model`
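
To get a feel for the data behind the Dataset repo, the sketch below loads a gzipped pickle such as `mnist.pkl.gz`. It assumes the archive holds a `((x_train, y_train), (x_test, y_test))` tuple, which is how this file has commonly been distributed; verify the layout against your own copy. The function name `load_mnist_pickle` is illustrative and is not part of the example repo.

```python
import gzip
import pickle


def load_mnist_pickle(path):
    """Load a gzipped pickle assumed to hold ((x_train, y_train), (x_test, y_test))."""
    with gzip.open(path, "rb") as f:
        # encoding="bytes" lets Python 3 read pickles that were written by Python 2.
        (x_train, y_train), (x_test, y_test) = pickle.load(f, encoding="bytes")
    return (x_train, y_train), (x_test, y_test)
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.
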
JupyterLab can be used to experiment with your code.
- Ensure that all of the Repos above are in the `Ready` state
- From the `IDEs` menu, select `+ Add JupyterLab` with the following fields:
  - `Basic` tab
    - Name: `<your-IDE-name>` (your choice)
    - Code: `<your-code-repo>` (created during the Code Repo step)
    - Framework: `tensorflow`
    - Framework Version: `2.0.0`
    - Image: `ocdr/dkube-datascience-tf-cpu-multiuser:v2.0.0-17`
    - Note: The default TensorFlow image should fill in automatically, but ensure that it is correct
  - `Repos` tab
    - Inputs > Datasets: `<your-dataset-repo>` (created during the Dataset Repo step)
      - Mount Path: `/mnist`
- Leave the other fields in their current selection and select `Submit`
- Once the IDE is running and the JupyterLab icon on the right is active, select it to launch a JupyterLab window
- Navigate to `workspace/<your-code-repo>/mnist`
- Open `train.ipynb`
  - Select `Run All Cells` from the menu at the top
  - Change the `EPOCHS` variable in the 2nd cell to "5" and rerun all cells
  - You can view the difference in output at the bottom of the script

Note: You would normally develop your code in JupyterLab and, once you were satisfied, create a Python file from the `ipynb` file. In this example, a Python file is already ready for execution.

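
As a rough illustration of turning a notebook into a Python file, the sketch below collects the code cells of an `.ipynb` file into a single script using only the standard library. It relies on the notebook being JSON with a top-level `cells` list, as in notebook format version 4. The helper name `notebook_code` is illustrative, and the sketch does not translate IPython magics or shell escapes; the usual tool for this job is `jupyter nbconvert --to script`.

```python
import json


def notebook_code(ipynb_path):
    """Return the source of all code cells in a notebook, joined into one script."""
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks) + "\n"
```
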
Batch training runs can be used to create trained models.
- From the `Runs` menu, select `+ Run` > `Training` with the following fields:
  - `Basic` tab
    - Name: `<your-run-name>` (your choice)
    - Code: `<your-code-repo>` (created during the Code Repo step)
    - Framework: `tensorflow`
    - Framework Version: `2.0.0`
    - Image: `ocdr/dkube-datascience-tf-cpu-multiuser:v2.0.0-17`
    - Note: The default TensorFlow image should fill in automatically, but ensure that it is correct
    - Start-up Command: `python mnist/train.py`
  - `Repos` tab
    - Inputs > Datasets: `<your-dataset-repo>` (created during the Dataset Repo step)
      - Mount Path: `/mnist`
    - Outputs > Models: `<your-model-repo>` (created during the Model Repo step)
      - Mount Path: `/model`
    - Note: Ensure that you add the Model in the `Outputs` section, and not the `Inputs` section
- Leave the other fields in their current selection and select `Submit`
- Your Run will show up on the `Runs` menu screen
- Clone the Run by selecting its checkbox and choosing `Clone` from the top buttons
  - Leave the `Basic` and `Repos` tabs the same
  - On the `Configuration` tab
    - Select the `+` button next to `Environment Variables`
    - Key: `EPOCHS` (must be in upper case)
    - Value: `5`
  - Select `Submit`
- Wait for both Runs to reach the `complete` state
- From the `Runs` menu, select both Run checkboxes, then select the `Compare` button
- Scroll down and choose Y-Axis: `train_accuracy`
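
The `EPOCHS` environment variable only has an effect if the training script reads it. The sketch below shows one way a script might do so; it is not taken from `mnist/train.py`, and the helper name `read_epochs` and its default of 1 are assumptions made for illustration.

```python
import os


def read_epochs(default=1):
    """Return the epoch count from the EPOCHS environment variable."""
    # Environment variable names are case-sensitive on Linux,
    # so a key spelled "epochs" would not be found here.
    raw = os.environ.get("EPOCHS")
    if raw is None:
        return default
    return int(raw)
```

With the cloned Run configured as above, a script using this pattern would train for 5 epochs, while the original Run would fall back to the default.
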
- Go to https://github.com/oneconvergence/dkube-examples/tree/tensorflow/mnist/tuning.yaml
- Select `Raw`
- Right-click and choose `Save as...` to save the file as "tuning.yaml"
- From the `Runs` menu, select the first Run checkbox, then select `Clone`
  - Leave the `Basic` and `Repos` tabs the same
  - On the `Configuration` tab
    - Select `Upload Tuning Definition`
    - Choose the `tuning.yaml` file that you saved
  - Select `Submit`
- Wait for the Run to complete
- View the results by selecting the Katib icon on the right of the Run line
- objective: The metric that you want to optimize
  - The goal parameter is mandatory in the tuning.yaml file
- objectiveMetricName: Katib uses the objectiveMetricName and additionalMetricNames to monitor how the hyperparameters work with the model. Katib records the value of the best objectiveMetricName metric.
- parameters: The range of the hyperparameters or other parameters that you want to tune for your machine learning (ML) model
- parallelTrialCount: The maximum number of hyperparameter sets that Katib should train in parallel. The default value is 3.
- maxTrialCount: The maximum number of trials to run
- maxFailedTrialCount: The maximum number of failed trials before Katib should stop the experiment
- algorithm: The search algorithm used to find the best hyperparameters. The value must be one of the following:
  - random
  - bayesianoptimization
  - hyperband
  - cmaes
  - enas
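
The fields above can be pictured as a nested structure. The sketch below mirrors them in a Python dictionary and checks the two rules stated in the list: `goal` is mandatory, and the algorithm must be one of the five listed names. The nesting (`objective.goal`, `algorithm.algorithmName`, `feasibleSpace`) follows the usual Katib experiment layout and is assumed to match `tuning.yaml`. All values, the `learning_rate` parameter, and the `check_tuning` helper are hypothetical and are not copied from the example repo.

```python
ALLOWED_ALGORITHMS = {"random", "bayesianoptimization", "hyperband", "cmaes", "enas"}

# Hypothetical values for illustration only.
example_tuning = {
    "parallelTrialCount": 3,
    "maxTrialCount": 6,
    "maxFailedTrialCount": 3,
    "objective": {
        "type": "maximize",
        "goal": 0.99,
        "objectiveMetricName": "train_accuracy",
    },
    "algorithm": {"algorithmName": "random"},
    "parameters": [
        {
            "name": "learning_rate",  # hypothetical hyperparameter
            "parameterType": "double",
            "feasibleSpace": {"min": "0.01", "max": "0.05"},
        }
    ],
}


def check_tuning(spec):
    """Return a list of problems found in a tuning definition (empty if none)."""
    problems = []
    objective = spec.get("objective", {})
    if "goal" not in objective:
        problems.append("objective.goal is mandatory")
    if not objective.get("objectiveMetricName"):
        problems.append("objective.objectiveMetricName is missing")
    name = spec.get("algorithm", {}).get("algorithmName")
    if name not in ALLOWED_ALGORITHMS:
        problems.append(f"unsupported algorithm: {name!r}")
    if not spec.get("parameters"):
        problems.append("no parameters to tune")
    return problems
```

A check like this can catch a missing `goal` before the file is uploaded, rather than after the Run has been submitted.
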
After the best model is identified, it can be deployed for inference serving.
- From the `Models` menu, select `<your-model-repo>` (created during the Model Repo step)
- Choose the highest version of the Model
- Select the `Lineage` tab
  - This provides information on the inputs and outputs of the Model
- Select the `Metrics` tab
  - This provides the metrics associated with the Model
- Go back to the `Models` top menu, and reselect the Model
- Select the `Deploy` icon on the right of the newest Model
  - Name: `<your-deploy-name>` (your choice)
  - Deployment: `Production`
  - Deploy Using: `CPU`
  - Transformer: select the check box
    - Transformer Script: `mnist/transformer.py`
  - Leave the other fields in their current selection and select `Submit`
  - The deployed Model will appear on the `Deployments` menu screen
The training and deployment steps can be automated using Kubeflow Pipelines.
- Open the JupyterLab window
- Navigate to `workspace/<your-code-repo>/mnist`
- Open `pipeline.ipynb`
- If you chose the default value for all of your repos (`mnist`), then select `Run All Cells`
- If you chose different repo names
  - In the 2nd cell, labeled `User Variables`, modify the repo names with your chosen names
  - Select `Run All Cells` from the menu at the top
- From the `Pipelines` menu on the left
  - Select the `Runs` tab
  - Your new pipeline will be executing
  - Select the pipeline name to see its progress
- Create a browser tab and go to `https://<dkube_url>/inference`
- Paste the Endpoint URL from `Deployments`
- Copy the Auth token from `Developer settings` on the DKube page and paste it in
- Choose `mnist` for the model type
- Download `3.png` from the repo
- Click `Predict`

Note: The prediction may time out waiting for the pod to start. Select `wait` if prompted.
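
The inference page pairs an Endpoint URL with an Auth token. For readers who want to script a similar call, the sketch below builds (but does not send) a JSON POST request with the standard library. The `Bearer` authorization scheme, the JSON content type, and the payload shape are all assumptions, not documented DKube behavior; check what your deployment expects. The function name `build_predict_request` is illustrative.

```python
import json
import urllib.request


def build_predict_request(endpoint_url, auth_token, payload):
    """Build (but do not send) a JSON POST request for a serving endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {auth_token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. As with the web page, the first call may take a while if the serving pod is still starting.
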