Adding estimators for free #20

Open · wants to merge 1 commit into master
387 changes: 387 additions & 0 deletions extras/estimators-for-free.ipynb
@@ -0,0 +1,387 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Some things we get for free by using Estimators\n",
"\n",
"Estimators are a high level abstraction (Interface) that supports all the basic operations you need to support a ML model on top of TensorFlow.\n",
"\n",
"Estimators:\n",
"* provide a simple interface for users of canned model architectures: Training, evaluation, prediction, export for serving;\n",
"* provide a standard interface for model developers;\n",
"* drastically reduce the amount of user code required. This avoids bugs and speeds up development significantly;\n",
"* enable building production services against a standard interface;\n",
"* using experiments abstraction give you free data-parallelism.\n",
"\n",
"You can use an already implemented estimator (canned estimator) or implement your own (custom estimator).\n",
"\n",
"This tutorial is not focused on how to build your own estimator, we're using a custom estimator that implements a [CNN classifier for MNIST dataset](https://www.tensorflow.org/get_started/mnist/pros) but we're not going into details about how that's implemented.\n",
"\n",
"Here we're going to show how Estimators make your life easier, once you have a estimator model is very simple to make changes on your model, compare results and iterate over time.\n"
]
},
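{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of that standard interface (not part of the original tutorial; it assumes the custom `model_fn` and the input functions defined later in this notebook), every Estimator exposes the same few calls, so swapping architectures never changes the surrounding code:\n",
"\n",
"```python\n",
"# A sketch of the common Estimator surface, assuming model_fn,\n",
"# train_input_fn and test_input_fn from the cells below.\n",
"estimator = tf.estimator.Estimator(model_fn=model_fn,\n",
"                                   params={'learning_rate': 0.01},\n",
"                                   model_dir='output_dir/model1')\n",
"\n",
"estimator.train(input_fn=train_input_fn, steps=1000)    # training\n",
"metrics = estimator.evaluate(input_fn=test_input_fn)    # evaluation\n",
"preds = estimator.predict(input_fn=test_input_fn)       # prediction\n",
"```"
]
},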
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Having a look at the code and running the experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from __future__ import absolute_import\n",
"from __future__ import division\n",
"from __future__ import print_function\n",
"\n",
"# our model \n",
"import model as m\n",
"\n",
"# tensorflow\n",
"import tensorflow as tf \n",
"print(tf.__version__) # tested with tf v1.2\n",
"\n",
"from tensorflow.contrib import learn\n",
"from tensorflow.contrib.learn.python.learn import learn_runner\n",
"from tensorflow.python.estimator.inputs import numpy_io\n",
"\n",
"# MNIST data\n",
"from tensorflow.examples.tutorials.mnist import input_data\n",
"# Numpy\n",
"import numpy as np\n",
"\n",
"# Enable TensorFlow logs\n",
"tf.logging.set_verbosity(tf.logging.INFO)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Getting the data\n",
"\n",
"We're not going into details here"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Import the MNIST dataset\n",
"mnist = input_data.read_data_sets(\"/tmp/MNIST/\", one_hot=True)\n",
"\n",
"x_train = np.reshape(mnist.train.images, (-1, 28, 28, 1))\n",
"y_train = mnist.train.labels\n",
"x_test = np.reshape(mnist.test.images, (-1, 28, 28, 1))\n",
"y_test = mnist.test.labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining the model\n",
"\n",
"We're not going into details here"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# coding: utf-8\n",
"\n",
"'''A Custom Estimator using CNNS for MNIST using Keras.\n",
"\n",
"For reference:\n",
"\n",
"* https://www.tensorflow.org/extend/estimators.\n",
"* https://www.tensorflow.org/get_started/mnist/beginners.\n",
"'''\n",
"\n",
"# Define the model, using Keras\n",
"def model_fn(features, labels, mode, params):\n",
" # Input Layer\n",
" # Reshape X to 4-D tensor: [batch_size, width, height, channels]\n",
" # MNIST images are 28x28 pixels, and have one color channel\n",
" x = tf.reshape(features['x'], shape=[-1, 28, 28, 1])\n",
"\n",
" # Convolutional Layer #1\n",
" # Computes 32 features using a 5x5 filter with ReLU activation.\n",
" # Padding is added to preserve width and height.\n",
" # Input Tensor Shape: [batch_size, 28, 28, 1]\n",
" # Output Tensor Shape: [batch_size, 28, 28, 32]\n",
" conv1 = K.layers.Conv2D(32, (5, 5), activation='relu',\n",
" input_shape=(28, 28, 1))(x)\n",
"\n",
" # Pooling Layer #1\n",
" # First max pooling layer with a 2x2 filter and stride of 2\n",
" # Input Tensor Shape: [batch_size, 28, 28, 32]\n",
" # Output Tensor Shape: [batch_size, 14, 14, 32]\n",
" pool1 = K.layers.MaxPooling2D(pool_size=(2, 2),\n",
" strides=2,\n",
" padding='same')(conv1)\n",
"\n",
" # Convolutional Layer #2\n",
" # Computes 64 features using a 5x5 filter.\n",
" # Padding is added to preserve width and height.\n",
" # Input Tensor Shape: [batch_size, 14, 14, 32]\n",
" # Output Tensor Shape: [batch_size, 14, 14, 64]\n",
" conv2 = K.layers.Conv2D(64, (5, 5), activation='relu')(pool1)\n",
"\n",
" # Pooling Layer #2\n",
" # Second max pooling layer with a 2x2 filter and stride of 2.\n",
" # Input Tensor Shape: [batch_size, 14, 14, 64]\n",
" # Output Tensor Shape: [batch_size, 7, 7, 64]\n",
" pool2 = K.layers.MaxPooling2D(pool_size=(2, 2),\n",
" strides=2,\n",
" padding='same')(conv2)\n",
"\n",
" # Flatten tensor into a batch of vectors.\n",
" # Input Tensor Shape: [batch_size, 7, 7, 64]\n",
" # Output Tensor Shape: [batch_size, 7 * 7 * 64]\n",
" flat = K.layers.Flatten()(pool2)\n",
"\n",
" # Dense Layer\n",
" # Densely connected layer with 1024 neurons.\n",
" # Input Tensor Shape: [batch_size, 7 * 7 * 64]\n",
" # Output Tensor Shape: [batch_size, 1024]\n",
" dense = K.layers.Dense(1024, activation='relu')(flat)\n",
"\n",
" # Logits layer\n",
" # Input Tensor Shape: [batch_size, 1024]\n",
" # Output Tensor Shape: [batch_size, 10]\n",
" logits = K.layers.Dense(10, activation='softmax')(dense)\n",
"\n",
" predictions = {\n",
" 'classes': tf.argmax(input=logits, axis=1),\n",
" 'probabilities': tf.nn.softmax(logits)\n",
" }\n",
"\n",
" train_op = None\n",
" eval_metric_ops = None\n",
"\n",
" loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)\n",
"\n",
" if mode == tf.estimator.ModeKeys.TRAIN:\n",
" train_op = tf.contrib.layers.optimize_loss(\n",
" loss=loss,\n",
" global_step=tf.train.get_global_step(),\n",
" learning_rate=params['learning_rate'],\n",
" optimizer='Adam')\n",
"\n",
" if mode == tf.estimator.ModeKeys.EVAL:\n",
" eval_metric_ops = {\n",
" 'accuracy': tf.metrics.accuracy(\n",
" tf.argmax(input=logits, axis=1),\n",
" tf.argmax(input=labels, axis=1))\n",
" }\n",
"\n",
" return model_fn_lib.EstimatorSpec(mode=mode, train_op=train_op,\n",
" predictions=predictions,\n",
" loss=loss,\n",
" eval_metric_ops=eval_metric_ops)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining the input function\n",
"\n",
"To feed the data to the Estimator model we need to create an input function. This means that the estimator doesn't know about data files, it knows about input functions.\n",
"\n",
"You can learn more about input functions [here](https://www.tensorflow.org/get_started/input_fn)\n"
]
},
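{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the contract concrete, here is a hand-rolled sketch (not code from the original tutorial) of what an input function must return: a dict of feature tensors and a labels tensor. `numpy_input_fn` below builds an equivalent function for us:\n",
"\n",
"```python\n",
"# A hand-written input_fn, equivalent in spirit to numpy_input_fn below.\n",
"def my_train_input_fn():\n",
"    features = tf.constant(x_train)  # [55000, 28, 28, 1]\n",
"    labels = tf.constant(y_train)    # [55000, 10], one-hot\n",
"    # Queue-based shuffled batching (TF 1.x style); the Estimator starts\n",
"    # the queue runners for us during training.\n",
"    x_batch, y_batch = tf.train.shuffle_batch(\n",
"        [features, labels], batch_size=128, capacity=1000,\n",
"        min_after_dequeue=500, enqueue_many=True)\n",
"    # The only contract: return (features_dict, labels)\n",
"    return {'x': x_batch}, y_batch\n",
"```"
]
},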
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"BATCH_SIZE = 128\n",
"\n",
"x_train_dict = {'x': x_train}\n",
"train_input_fn = numpy_io.numpy_input_fn(x_train_dict, y_train, batch_size=BATCH_SIZE, \n",
" shuffle=True, num_epochs=None,\n",
" queue_capacity=1000, num_threads=4)\n",
"\n",
"x_test_dict = {'x': x_test}\n",
"test_input_fn = numpy_io.numpy_input_fn(x_test_dict, y_test, batch_size=BATCH_SIZE,\n",
" shuffle=False, num_epochs=1)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating an experiment\n",
"\n",
"An Experiment instance knows how to invoke training and eval loops in a sensible fashion for distributed training. More about it [here](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# parameters\n",
"LEARNING_RATE = 0.01\n",
"STEPS = 1000\n",
"\n",
"# create experiment\n",
"def generate_experiment_fn():\n",
" def _experiment_fn(run_config, hparams):\n",
" del hparams # unused, required by signature.\n",
" # create estimator\n",
" model_params = {\"learning_rate\": LEARNING_RATE}\n",
" estimator = tf.estimator.Estimator(model_fn=m.get_model(), \n",
" params=model_params,\n",
" config=run_config)\n",
"\n",
" train_input = train_input_fn\n",
" test_input = test_input_fn\n",
" \n",
" return tf.contrib.learn.Experiment(\n",
" estimator,\n",
" train_input_fn=train_input,\n",
" eval_input_fn=test_input,\n",
" train_steps=STEPS\n",
" )\n",
" return _experiment_fn"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### Run the experiment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"OUTPUT_DIR='output/model_1'\n",
"learn_runner.run(generate_experiment_fn(), run_config=tf.contrib.learn.RunConfig(model_dir=OUTPUT_DIR))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running a second time"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Okay, the model is definitely not good... But, check output_dir/model1, you'll see that this folder was created and that there are a lot of files there that were created automatically by TensorFlow! \n",
"\n",
"Most of these files are actually checkpoints, this means that **if we run the experiment again with the same model_dir it will just load the checkpoint and start from there instead of starting all over again!**\n",
"\n",
"This means that:\n",
"\n",
"- If we have a problem while training you can just restore from where you stopped instead of start all over again \n",
"- If we didn't train enough we can just continue to train\n",
"\n",
"**This is all true as long as you use the same model_dir!**\n",
"\n",
"So, let's run again the experiment for more 1000 steps to see if we can improve the accuracy. So, notice that the first step in this run will actually be the step 1001. So, we need to change the number of steps to 2000 (otherwhise the experiment will find the checkpoint and will think it already finished training)"
]
},
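{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before re-running, you can verify what the Estimator will resume from. This is just a sketch using `tf.train.latest_checkpoint`, which returns the newest checkpoint in a directory (or `None` for a fresh start):\n",
"\n",
"```python\n",
"# Inspect the checkpoint the next run will restore (None means fresh start).\n",
"latest = tf.train.latest_checkpoint(OUTPUT_DIR)\n",
"print('Will resume from:', latest)\n",
"```"
]
},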
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"STEPS = STEPS + 1000\n",
"learn_runner.run(generate_experiment_fn(), run_config=tf.contrib.learn.RunConfig(model_dir=OUTPUT_DIR))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tensorboard\n",
"\n",
"Another thing we get for free is TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it. When TensorBoard is fully configured, it looks like this:\n",
"\n",
"If you run: *tensorboard --logdir=output_dir/model1*\n",
"\n",
"You'll see that we get the graph and some scalars, also if you use an [embedding layer](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence) you'll get an [embedding visualization](https://www.tensorflow.org/get_started/embedding_viz) in tensorboard as well!\n",
"\n",
"So, we can make small changes and we'll have an easy (and totally for free) way to compare the models.\n",
"\n",
"Let's make these changes:\n",
"1. change the learning rate to 0.05 \n",
"2. change the OUTPUT_DIR to some path in output_dir/\n",
"\n",
"The 2. must be inside output_dir/ because we can run: *tensorboard --logdir=output_dir/* \n",
"And we'll get both models visualized at the same time in tensorboard.\n",
"\n",
"You'll notice that the model will start from step 1, because there's no existing checkpoint in this path."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"LEARNING_RATE = 0.05\n",
"OUTPUT_DIR = 'output_dir/model2'\n",
"learn_runner.run(generate_experiment_fn(), run_config=tf.contrib.learn.RunConfig(model_dir=OUTPUT_DIR))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}