|
105 | 105 | "cell_type": "markdown",
|
106 | 106 | "metadata": {},
|
107 | 107 | "source": [
|
| 108 | + "Above, we chose the parameters $w$ and $b$ for the model, and used them to get the intermediate variable $z$ adding some random noise to make our synthetic data look more \"realistic.\"\n", |
| 109 | + "The call to `numpy.random.seed()` makes our added noise reproducible, i.e., you always get the same pseudo-random numbers from the subsequent call to `numpy.random.normal()`. \n", |
| 110 | + "(Let's not get side-tracked into a discussion about pseudo-random number generation, and leave that for another tutorial.)\n", |
| 111 | + "\n", |
108 | 112 | "We can apply a decision boundary now to assign the data to the two classes. Be sure to read the documentation of [`numpy.where()`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) to understand the code below, noting that after the logical condition we specify the values to assign if the condition is `True` or `False`."
|
109 | 113 | ]
|
110 | 114 | },
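
For readers following the diff without the full notebook, here is a minimal sketch of the synthetic-data steps described above. The seed, the range of `x_data`, and the noise scale are illustrative assumptions; only the true parameters $w=2$ and $b=1$ come from the notebook (they are checked at the end of the gradient-descent cell below).

```python
import numpy as np

np.random.seed(0)                   # hypothetical seed; makes the added noise reproducible
w, b = 2, 1                         # true parameters, matching the values checked later

x_data = np.linspace(-5, 5, 100)    # hypothetical range and number of points
z = w * x_data + b + np.random.normal(0, 1, size=x_data.shape)   # linear model plus noise

# decision boundary at z = 0: True -> class 1, False -> class 0
y_data = np.where(z >= 0, 1, 0)
```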
|
|
158 | 162 | "We'd like to work with a better loss function, that avoids this problem, and we build one below by integration. (For a more detailed discussion, we recommend Chapter 3 of Michael Nielsen's free ebook [2]).\n",
|
159 | 163 | "\n",
|
160 | 164 | "It's important to note also that our prediction model is a nonlinear function, composed with the linear model, and the square-error would lead to a non-convex loss function that can have local minima, and make gradient descent fail. \n",
|
161 |
| - "Here's an example posted on Stackoverflow in answer to this very [question](https://math.stackexchange.com/questions/2381724/logistic-regression-when-can-the-cost-function-be-non-convex). Consider just three data points, and a model with no intercept, $z = wx$: $(-1, 2), (-20, -1), (-5, 5)$. What would a square-error mean function look like? We can plot it using SymPy." |
| 165 | + "Here's an example posted on Stackoverflow in answer to this very [question](https://math.stackexchange.com/questions/2381724/logistic-regression-when-can-the-cost-function-be-non-convex). Consider just three data points, and a model with no intercept, $z = wx$: $(-1, 2), (-20, -1), (-5, 5)$. What would a square-error loss function look like? We can plot it using SymPy." |
162 | 166 | ]
|
163 | 167 | },
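
As a sketch of what that SymPy plot could look like (the plotting range for $w$ is an assumption):

```python
import sympy

w = sympy.Symbol('w', real=True)
points = [(-1, 2), (-20, -1), (-5, 5)]        # the three (x, y) pairs above

def logistic(z):
    return 1 / (1 + sympy.exp(-z))            # model with no intercept: z = w*x

# square-error loss summed over the three points
loss = sum((logistic(w * x) - y)**2 for x, y in points)

sympy.plot(loss, (w, -1, 1));                 # the curve reveals the non-convexity
```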
|
164 | 168 | {
|
|
379 | 383 | "cell_type": "markdown",
|
380 | 384 | "metadata": {},
|
381 | 385 | "source": [
|
382 |
| - "So far, we can still use SymPy to get derivatives of the loss function with respect to the parameters. But with more complicated models, finding symbolic derivatives will take a long time.\n", |
| 386 | + "So far, we can still use SymPy to get derivatives of the loss function with respect to the parameters. But with more complicated models, finding symbolic derivatives could take a long time.\n", |
383 | 387 | "\n",
|
384 | 388 | "Have a look at the derivative of the logistic loss with respect to the parameter $b$: "
|
385 | 389 | ]
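
The cell that follows in the notebook presumably computes this derivative with SymPy; a self-contained sketch of that computation for a single data point might look like this (the variable names are ours):

```python
import sympy

w, b, x, y = sympy.symbols('w b x y', real=True)

y_pred = 1 / (1 + sympy.exp(-(w*x + b)))                         # logistic model
loss = -(y*sympy.log(y_pred) + (1 - y)*sympy.log(1 - y_pred))    # logistic loss

print(sympy.simplify(sympy.diff(loss, b)))                       # derivative w.r.t. b
```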
|
|
397 | 401 | "cell_type": "markdown",
|
398 | 402 | "metadata": {},
|
399 | 403 | "source": [
|
400 |
| - "We can use symbolic differentiation but it can take a long time to compute for very complicated functions.\n", |
| 404 | + "Although we can use symbolic differentiation, we later need to convert the resulting expression to a Python function that can be called and evaluated at many data inputs. Maybe this is not the best approach.\n", |
401 | 405 | "\n",
|
402 | 406 | "There's a better way! It's called _automatic differentiation_: the idea is to algorithmically obtain derivatives of numeric functions written in computer code. \n",
|
403 | 407 | "It sounds like magic, but it can be done by a combination of the chain rule, symbolic rules of differentiation for elementary operations, and a numeric evaluation trace of the elementary derivatives. \n",
|
|
432 | 436 | "\n",
|
433 | 437 | "In addition, `autograd.numpy` is a wrapper to the NumPy library. This allows you to call your favorite NumPy methods with `autograd` keeping track of every operation so it can give you the derivative (via the chain rule).\n",
|
434 | 438 | "We ill import it using the alias (`as np`), consistent with the tutorials and documentation that you will find online.\n",
|
435 |
| - "Up to now in the _Engineering Computations_ series of modules, we have refrained from using the aliased form of the import statements, just to have more explicit and readable code. " |
| 439 | + "Up to now in the _Engineering Computations_ series of modules, we had refrained from using the aliased form of the import statements, just to have more explicit and readable code. " |
436 | 440 | ]
|
437 | 441 | },
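
As a quick illustration of that workflow (the function below is our own toy example, not from the notebook):

```python
import autograd.numpy as np    # wrapped NumPy: same API, but every operation is traced
from autograd import grad

def f(x):
    return np.tanh(x)**2       # any composition of wrapped NumPy operations works

df = grad(f)                   # df is a new function that evaluates the derivative of f
print(df(0.5))                 # derivative evaluated at x = 0.5
```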
|
438 | 442 | {
|
|
477 | 481 | " \n",
|
478 | 482 | "def logistic_model(params, x):\n",
|
479 | 483 | " '''A prediction model based on the logistic function composed with wx+b\n",
|
| 484 | + " Arguments:\n", |
480 | 485 | " params: array(w,b) of model parameters\n",
|
481 | 486 | " x : array of x data'''\n",
|
482 | 487 | " w = params[0]\n",
|
|
486 | 491 | " return y\n",
|
487 | 492 | "\n",
|
488 | 493 | "def log_loss(params, model, x, y):\n",
|
489 |
| - " '''The logistic loss function'''\n", |
| 494 | + " '''The logistic loss function\n", |
| 495 | + " Arguments:\n", |
| 496 | + " params: array(w,b) of model parameters\n", |
| 497 | + " model: the Python function for the logistic model\n", |
| 498 | + " x, y: arrays of input data to the model'''\n", |
490 | 499 | " y_pred = model(params, x)\n",
|
491 | 500 | " return -np.mean(y * np.log(y_pred) + (1-y) * np.log(1 - y_pred))"
|
492 | 501 | ]
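
The gradient-descent cell further down calls a function named `gradient`; in an `autograd` workflow it is typically built from the loss function with `grad`. The cell defining it is not part of this diff, so the line below is an assumption about how it is obtained:

```python
from autograd import grad

# differentiate log_loss with respect to its first argument, params (argnum=0 is the default)
gradient = grad(log_loss)
```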
|
|
562 | 571 | "metadata": {},
|
563 | 572 | "outputs": [],
|
564 | 573 | "source": [
|
565 |
| - "for i in range(3000):\n", |
| 574 | + "max_iter = 3000\n", |
| 575 | + "i = 0\n", |
| 576 | + "descent = np.ones(len(x_data))\n", |
| 577 | + "\n", |
| 578 | + "while np.linalg.norm(descent) > 0.001 and i < max_iter:\n", |
| 579 | + "\n", |
566 | 580 | "    descent = gradient(params, logistic_model, x_data, y_data)   # gradient of the loss w.r.t. params\n",
|
567 |
| - " oldparams = params\n", |
568 | 581 | "    params = params - descent * 0.01    # gradient-descent step with learning rate 0.01\n",
|
569 |
| - " residual = np.abs((params - oldparams) / oldparams)\n", |
570 |
| - " if np.all(residual < 1e-6):\n", |
571 |
| - " break\n", |
| 582 | + " i += 1\n", |
| 583 | + "\n", |
572 | 584 | "\n",
|
573 | 585 | "print(f'Optimized value of w is {params[0]:.3f} vs. true value: 2')\n",
|
574 | 586 | "print(f'Optimized value of b is {params[1]:.3f} vs. true value: 1')\n",
|
575 | 587 | "print(f'Exited after {i} iterations')\n",
|
576 |
| - "print(f'Residual is {residual}')\n", |
| 588 | + "\n", |
577 | 589 | "\n",
|
578 | 590 | "pyplot.scatter(x_data, y_data, alpha=0.4)\n",
|
579 | 591 | "pyplot.plot(x_data, logistic_model(params, x_data), '-r');"
|
|
694 | 706 | "name": "python",
|
695 | 707 | "nbconvert_exporter": "python",
|
696 | 708 | "pygments_lexer": "ipython3",
|
697 |
| - "version": "3.6.13" |
| 709 | + "version": "3.8.5" |
698 | 710 | }
|
699 | 711 | },
|
700 | 712 | "nbformat": 4,
|
|