diff --git a/mask_detection.ipynb b/mask_detection.ipynb index 46488ad..3eab5c5 100644 --- a/mask_detection.ipynb +++ b/mask_detection.ipynb @@ -6,15 +6,15 @@ "source": [ "# Mask detection demo\n", "The following example demonstrates an end-to-end data science workflow for building an image classifier model.
\n", - "The model is trained on dataset holding images of people with or without masks.
\n", - "The model is then deployed to a Nuclio function allowing users to send http request with an image and getting a respond\n", - "back with the probability of mask existence for the people in the picture.\n", + "The model is trained on a dataset containing images of people with or without masks.
\n", + "The model is then deployed to a Nuclio function that allows users to send an http request with an image and receive a response\n", + " with the probability that the person in the picture is wearing a mask.\n", "\n", - "## key technologies\n", - "- **Tensorflow-Keras** for training the model\n", - "- **Horovod** for running a distributed training\n", - "- **Nuclio** for creating an high performance serverless Serving function\n", - "- **MLRun** for orchestrating it all together" + "## Key Technologies\n", + "- **Tensorflow-Keras** to train the model\n", + "- **Horovod** to run distributed training\n", + "- **Nuclio** to create a high-performance serverless Serving function\n", + "- **MLRun** to orchestrate the process" ] }, { @@ -69,7 +69,7 @@ "source": [ "### Import the dataset\n", "[The MLRun functions marketplace](https://github.com/mlrun/functions/) (a.k.a. \"the MLRun functions hub\") is a \n", - "centralized location for open-source contributions of function components that are commonly used in machine-learning \n", + "centralized location for open-source contributions of function components that are commonly used in machine learning \n", "development.\n", "\n", "This step uses the [Open archive marketplace function](https://github.com/mlrun/functions/tree/master/open_archive) \n", @@ -91,9 +91,9 @@ "source": [ "**Running the function**\n", "\n", - "Note that we're setting the `local` param to `True` which means the function code will run locally inside the jupyter \n", - "kernel. We could as well set it to `False` (the default), then MLRun will automatically spawn a K8s Pod and run the \n", - "function inside it, allowing us to isolate the function running environment and using our k8s cluster resources.\n", + "Note that in this case we're setting the `local` param to `True` which means the function code will run locally inside the jupyter \n", + "kernel. 
Alternatively, we could set it to `False` (the default), in which case MLRun would automatically spawn a k8s pod and run the \n", + "function inside it, allowing us to isolate the function running environment and use our k8s cluster resources.\n", "\n", "MAKE THIS WORK - currently without local it fails because of a bug in the function code" ] }, { @@ -345,18 +345,18 @@ "source": [ "### Training the model\n", "The training code is basically the same as the code from [the kaggle project](https://www.kaggle.com/notadithyabhat/face-mask-detector/execution) \n", - "with slight changes to allow levaraging horovod for doing distributed training (done by following [horovod's usage instructions](https://github.com/horovod/horovod#usage)). \n", - "The original training code can be found in [here](training_code_original.py).\n", + "with slight changes to allow us to use horovod to do distributed training (which we can do by following [horovod's usage instructions](https://github.com/horovod/horovod#usage)). \n", + "The original training code can be found [here](training_code_original.py).\n", "\n", - "The changes done are roughly:\n", + "Roughly, the changes between the kaggle project and this one are:\n", "- Code at the beginning to extract job params from MLRun context\n", "- Code to initialize horovod and assign devices (CPU/GPU) before building the model\n", "- Code to replace the optimizer with the horovod optimizer\n", "- Code at the end to log the results to MLRun DB\n", "\n", - "In the code below we're starting to see **the value of MLRun** - in a matter of value change we can move our \n", + "In the code below we're starting to see **the value of MLRun** - with a simple value change we can move our \n", "**training** to be **distributed** across **several workers**, assign **GPUs** and more. 
No need to write dockerfiles, \n", - "no need to hassle with k8s yamls, MLRun does all of this for us.\n", + "no need to hassle with k8s yamls; MLRun does all of this for us.\n", "\n", "REPLACE IMAGE - using my image as I'm waiting for the images refactor PR to be merged - the PR will lower the image \n", "size + add a package (opencv) I'm missing for this demo" ] }, { @@ -1011,9 +1011,9 @@ "from the hub.\n", "The serving function kind is using [Nuclio](https://github.com/nuclio/nuclio/) - a high-performance serverless event \n", "and data processing platform.\n", - "With several lines of code we were able to take our model, **expose** it with an **http endpoint** and deploy it on an \n", - "**high performance** infrastructure that can easily scale to serve on **production scale** (hundreds of thousands of \n", - "requests per second)\n", + "With several lines of code we were able to take our model, **expose** it with an **HTTP endpoint** and deploy it on a \n", + "**high-performance** infrastructure that can easily scale up to serve at **production scale** with hundreds of thousands of \n", + "requests per second.\n", "\n", "CHANGE to actually make it use the hub function - currently hub function fits only classes (classification models) and \n", "not this one which just gives a number - probability" ] }, { @@ -1088,8 +1088,8 @@ "metadata": {}, "source": [ "### Using the model\n", - "Here we're demonstrating how to take an image, send it to our serving function API, and seeing the model prediction for \n", - "the probability the people in the image are wearing mask" + "Here we're demonstrating how to take an image, send it to our serving function API, and see the model's prediction of \n", + "the probability that the people in the image are wearing a mask." 
] }, { @@ -1182,17 +1182,17 @@ "metadata": {}, "source": [ "## Summary & conclusions:\n", - "- MLOps is a big thing should not be underestimated when planning a datascience project.\n", - "- Using the right frameworks, like MLRun, enables you to take an existing training code, and build a full data science \n", "project out of it. \n", - "In a matter of no more than 200 lines of code we were able to distribute our training, track our different trials, and\n", " deploy our final model on a production scale infrastructure.\n", + "- MLOps is a critical factor that should not be underestimated when planning a data science project.\n", + "- The right framework (like MLRun in this case) enables you to take existing training code and quickly build it into a full production-ready data science \n", "project. \n", + "- With just 200 lines of code we were able to distribute our training, track our different trials, and\n", " deploy our final model on production scale infrastructure.\n", "- MLRun has much more to offer than we can cover here - function versioning, feature store, \n", "automatic pipeline scheduling, model monitoring.\n", "\n", "\n", "### The author\n", - "Hedi Ingber is the main maintainer of MLRun, and working as a Backend engineer at Iguazio which offers the enterprise \n", + "Hedi Ingber is the main maintainer of MLRun. He's also a Backend Engineer at Iguazio, which offers the enterprise \n", "version of MLRun.\n", " " ] }, { @@ -1235,4 +1235,4 @@ }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +}
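
The Horovod changes listed in the training section (initialize horovod, wrap the optimizer, adjust the learning rate) can be sketched roughly as below. This is an illustrative sketch, not the demo's actual training code: the `configure_distributed` helper is hypothetical, and the `hvd` handle is passed in explicitly for clarity, though `hvd.init()`, `hvd.size()`, and `hvd.DistributedOptimizer` are the calls Horovod's usage guide describes.

```python
def configure_distributed(hvd, base_lr, make_optimizer):
    """Sketch of the Horovod-related training changes (hypothetical helper).

    hvd            -- the horovod.tensorflow.keras module (or a stand-in in tests)
    base_lr        -- the single-worker learning rate
    make_optimizer -- a callable that builds an optimizer from a learning rate
    """
    hvd.init()                         # one-time Horovod initialization
    scaled_lr = base_lr * hvd.size()   # scale LR linearly with worker count,
                                       # per Horovod's usage instructions
    # Wrap the plain optimizer so gradients are averaged across workers
    optimizer = hvd.DistributedOptimizer(make_optimizer(scaled_lr))
    return scaled_lr, optimizer
```

Device assignment (pinning each process to one GPU via `hvd.local_rank()`) would sit alongside the `init()` call; it is omitted here since it depends on the TF version in use.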
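
To make the "Using the model" step concrete, here is a minimal sketch of the client-side plumbing for calling the serving endpoint. The `image` and `mask_probability` field names, the 0.5 threshold, and the request/response shape are assumptions for illustration only — the deployed function's real schema may differ.

```python
import base64
import json

def build_request_body(image_bytes: bytes) -> str:
    # JSON-encode the image so it can be POSTed to the serving endpoint.
    # The "image" field name is an assumption -- match the deployed function.
    return json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})

def parse_response(body: str, threshold: float = 0.5) -> dict:
    # Assumed response shape: {"mask_probability": <float between 0 and 1>}
    prob = json.loads(body)["mask_probability"]
    return {"mask_probability": prob, "wearing_mask": prob >= threshold}

# A real call would then be roughly (hypothetical URL):
#   requests.post(serving_url, data=build_request_body(open("face.jpg", "rb").read()))
```

Base64-encoding keeps the image valid inside a JSON body; sending raw bytes directly in the POST body is also common for Nuclio functions, in which case `build_request_body` would be unnecessary.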