|
425 | 425 | "pip install autograd\n",
|
426 | 426 | "```\n",
|
427 | 427 | "\n",
|
428 | | - "or, if you installed your Python environment via Anaconda, you can run this command:\n", |
| 428 | + "Or, if you installed your Python environment via [Anaconda](https://www.anaconda.com/products/individual), you can run this command:\n", |
429 | 429 | "```\n",
|
430 | 430 | "conda install -c conda-forge autograd\n",
|
431 | 431 | "```\n",
|
|
526 | 526 | "##### Note:\n",
|
527 | 527 | "\n",
|
528 | 528 | "> The argument list of the function returned by `grad()` will be the same as the argument list of the loss function that we input to it.\n",
|
| 529 | + "> The default setting of `grad()` is to return the derivative(s) with respect to the first argument, in this case, the two-element array `params`. This gives us the two derivatives (with respect to $w$ and $b$) at once.\n", |
529 | 530 | "\n",
|
530 | 531 | "Let's now make a random starting guess for the two model parameters:"
|
531 | 532 | ]
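(Aside, not part of the diff: a minimal sketch of the `grad()` behavior the note above describes. The loss function, its name, and the toy data below are assumptions for illustration; the notebook's actual loss is defined outside this hunk.)

```python
from autograd import grad
import autograd.numpy as np   # autograd's thinly wrapped NumPy

# Hypothetical two-parameter loss: params is the array [w, b].
def loss(params, x, y):
    w, b = params
    pred = 1.0 / (1.0 + np.exp(-(w * x + b)))   # logistic model
    return np.mean((pred - y) ** 2)             # toy squared-error loss, stand-in only

gradient = grad(loss)   # same argument list as loss(params, x, y);
                        # differentiates w.r.t. the first argument by default

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0])
print(gradient(np.array([0.1, -0.2]), x, y))    # d(loss)/dw and d(loss)/db in one array
```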
|
|
536 | 537 | "metadata": {},
|
537 | 538 | "outputs": [],
|
538 | 539 | "source": [
|
| 540 | + "np.random.seed(0)\n", |
539 | 541 | "params = np.random.rand(2)\n",
|
540 | 542 | "print(params)"
|
541 | 543 | ]
|
|
560 | 562 | "cell_type": "markdown",
|
561 | 563 | "metadata": {},
|
562 | 564 | "source": [
|
563 | | - "Now we optimize! Notice that we set both a maximum number of iterations, and an exit criterion based on two successive residuals being very close. \n", |
| 565 | + "Now we optimize! Notice that in this optimization loop we chose to use a `while` statement instead of a `for` loop. \n", |
| 566 | + "We set both a maximum number of iterations, and an exit criterion based on the norm of the gradient being very small.\n", |
| 567 | + "Already you should be thinking about these choices we make:\n", |
| 568 | + "\n", |
| 569 | + "- The gradient is multiplied by a small step of $0.01$: this is called the _learning rate_. How do we choose this value?\n", |
| 570 | + "- The `while` loop exits when the norm of the gradient falls below $0.001$: how do we know if this is a good choice?\n", |
564 | 571 | "\n",
|
565 | 572 | "Finally, we plot the synthetic data we created above, and the logistic regression curve corresponding to the parameters found in the optimization loop."
|
566 | 573 | ]
|
|
571 | 578 | "metadata": {},
|
572 | 579 | "outputs": [],
|
573 | 580 | "source": [
|
574 | | - "max_iter = 3000\n", |
| 581 | + "max_iter = 5000\n", |
575 | 582 | "i = 0\n",
|
576 | | - "descent = np.ones(len(x_data))\n", |
| 583 | + "descent = np.ones(len(params))\n", |
577 | 584 | "\n",
|
578 | 585 | "while np.linalg.norm(descent) > 0.001 and i < max_iter:\n",
|
579 | 586 | "\n",
|
|