|
425 | 425 | "pip install autograd\n",
|
426 | 426 | "```\n",
|
427 | 427 | "\n",
|
428 | | - "or, if you installed your Python environment via Anaconda, you can run this command:\n", |
| 428 | + "Or, if you installed your Python environment via [Anaconda](https://www.anaconda.com/products/individual), you can run this command:\n", |
429 | 429 | "```\n",
|
430 | 430 | "conda install -c conda-forge autograd\n",
|
431 | 431 | "```\n",
|
|
526 | 526 | "##### Note:\n",
|
527 | 527 | "\n",
|
528 | 528 | "> The argument list of the function returned by `grad()` will be the same as the argument list of the loss function that we input to it.\n",
|
| 529 | + "> The default setting of `grad()` is to return the derivative(s) with respect to the first argument, in this case, the two-element array `params`. This gives us the two derivatives (with respect to $w$ and $b$) at once.\n", |
529 | 530 | "\n",
|
530 | 531 | "Let's now make a random starting guess for the two model parameters:"
|
531 | 532 | ]
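(Aside, not part of the diff: a minimal sketch of the `grad()` behavior the note above describes. The loss function, its name, and the toy data below are assumptions for illustration; the notebook's actual loss is defined outside this hunk.)

```python
from autograd import grad
import autograd.numpy as np   # autograd's thinly wrapped NumPy

# Hypothetical two-parameter loss: params is the array [w, b].
def loss(params, x, y):
    w, b = params
    pred = 1.0 / (1.0 + np.exp(-(w * x + b)))   # logistic model
    return np.mean((pred - y) ** 2)             # toy squared-error loss, stand-in only

gradient = grad(loss)   # same argument list as loss(params, x, y);
                        # differentiates w.r.t. the first argument by default

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0])
print(gradient(np.array([0.1, -0.2]), x, y))    # d(loss)/dw and d(loss)/db in one array
```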
|
|
536 | 537 | "metadata": {},
|
537 | 538 | "outputs": [],
|
538 | 539 | "source": [
|
| 540 | + "np.random.seed(0)\n", |
539 | 541 | "params = np.random.rand(2)\n",
|
540 | 542 | "print(params)"
|
541 | 543 | ]
|
|
560 | 562 | "cell_type": "markdown",
|
561 | 563 | "metadata": {},
|
562 | 564 | "source": [
|
563 | | - "Now we optimize! Notice that we set both a maximum number of iterations, and an exit criterion based on two successive residuals being very close. \n", |
| 565 | + "Now we optimize! Notice that in this optimization loop we chose to use a `while` statement instead of a `for` loop. \n", |
| 566 | + "We set both a maximum number of iterations, and an exit criterion based on the norm of the gradient being very small.\n", |
| 567 | + "Already you should be thinking about these choices we make:\n", |
| 568 | + "\n", |
| 569 | + "- The gradient is multiplied by a small step of $0.01$: this is called the _learning rate_. How do we choose this value?\n", |
| 570 | + "- The `while` loop exits when the norm of the gradient falls below $0.001$: how do we know if this is a good choice?\n", |
564 | 571 | "\n",
|
565 | 572 | "Finally, we plot the synthetic data we created above, and the logistic regression curve corresponding to the parameters found in the optimization loop."
|
566 | 573 | ]
|
|
571 | 578 | "metadata": {},
|
572 | 579 | "outputs": [],
|
573 | 580 | "source": [
|
574 | | - "max_iter = 3000\n", |
| 581 | + "max_iter = 5000\n", |
575 | 582 | "i = 0\n",
|
576 | | - "descent = np.ones(len(x_data))\n", |
| 583 | + "descent = np.ones(len(params))\n", |
577 | 584 | "\n",
|
578 | 585 | "while np.linalg.norm(descent) > 0.001 and i < max_iter:\n",
|
579 | 586 | "\n",
|
|