diff --git a/NerobovaAS/LabDeepLearningNerobovaAS.html b/NerobovaAS/LabDeepLearningNerobovaAS.html new file mode 100644 index 0000000..770a9cb --- /dev/null +++ b/NerobovaAS/LabDeepLearningNerobovaAS.html @@ -0,0 +1,15477 @@ + + + + + +LabDeepLean + + + + + + + + + + + + + + + + + + + + + + + +
+
+

Lab assignment for the course "Deep Learning"

+
+
+
+ +
+ +
+
+
+

Reading the data

+
+
+
+
+

This lab uses the MNIST dataset. The dataset was downloaded in advance from [http://yann.lecun.com/exdb/mnist].
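The gzipped IDX files can be parsed with `struct` and NumPy. Below is a minimal sketch of such readers; the notebook's own helpers (`readBinFileMatrix`, `readBinFileLabel`) follow the same format, additionally resolving the path relative to the working directory.

```python
import gzip
import struct

import numpy as np

def read_idx_images(path):
    # IDX3 image file: magic, count, rows, cols as big-endian uint32,
    # followed by count*rows*cols unsigned bytes.
    with gzip.open(path, 'rb') as f:
        magic, size = struct.unpack(">II", f.read(8))
        nrows, ncols = struct.unpack(">II", f.read(8))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(size, nrows, ncols)

def read_idx_labels(path):
    # IDX1 label file: magic and count as big-endian uint32,
    # followed by count unsigned bytes.
    with gzip.open(path, 'rb') as f:
        magic, size = struct.unpack(">II", f.read(8))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(size)
```

Called on `train-images-idx3-ubyte.gz`, the first reader returns an array of shape `(60000, 28, 28)`.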

+ +
+
+
+
+ + +
+
+
+ +
+ +
+
+
+

Let us load the training and test data and print their shapes.

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

Let us check that the data was read correctly. To do so, we display the first digit and its class label.

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

Data normalization
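The normalization applied below flattens each 28×28 image into a 784-vector, scales pixel values into [0, 1], and one-hot encodes the 10 class labels; a sketch of the transformation:

```python
import numpy as np

def normalize(images, labels, num_classes=10):
    # Flatten (n, 28, 28) -> (n, 784) and scale bytes into [0, 1].
    x = images.reshape(images.shape[0], -1).astype('float32') / 255
    # One-hot encode integer labels: label 3 -> [0,0,0,1,0,0,0,0,0,0].
    y = np.eye(num_classes)[labels]
    return x, y
```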

+
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

Mathematical model

+
+
+
+
+

Consider a two-layer neural network

+ +
+
+
+
+

[Figure: architecture of the two-layer network]

+ +
+
+
+
+

where
- $ \ x_i$ - the set of input signals
- $ \ u_j$ - the network outputs
- $ \ v_s$ - the output signal of a hidden-layer neuron
- $ \ w_{si}^{(1)} $ and $ \ w_{js}^{(2)} $ - the weights of the synaptic connections

+ + +
+
+
+
+

The neuron model is described by the following equations:

+ +
+
+
+
+$$ \ u_k=\sum_{j=1}^{n}w_{k,j}x_j$$ +
+
+
+
+$$ \ y_k=\varphi(u_k+b_k)$$ +
+
+
+
+

where $\ x_j$ is an input signal, $\ w_{k,j} $ is the synaptic weight of signal $\ x_j$, $\ \varphi $ is the activation function, and $\ b_k$ is the bias.
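For a single neuron, the two equations amount to a weighted sum followed by an activation; a sketch with ReLU chosen as the example $\varphi$:

```python
import numpy as np

def neuron(x, w, b, phi=lambda u: np.maximum(u, 0)):
    # u_k = sum_j w_kj * x_j  (weighted sum of the input signals)
    u = np.dot(w, x)
    # y_k = phi(u_k + b_k)    (activation of the sum plus the bias)
    return phi(u + b)
```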

+ +
+
+
+
+

The backpropagation method defines the strategy for changing the network parameters $\ w $ during training using gradient-based optimization methods.

+ +
+
+
+
+

At each step, gradient methods refine the parameter values: $\ w(k+1) = w(k) + \eta p(w)$

+ +
+
+
+
+

where
- $ \ \eta $, $ \ 0 < \eta < 1 $ - the learning rate, i.e. the "speed" of movement toward the minimum of the function
- $ \ p(w) $ - a direction in the multidimensional parameter space of the neural network

+ + +
+
+
+
+

In the classical backpropagation method, the direction of movement coincides with the direction of the antigradient, $ \ p(w) = -\nabla E(w(k)) $, at the $k$-th iteration of the method.
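A toy illustration of this update rule with $p(w) = -\nabla E(w)$, using the assumed objective $E(w) = \|w\|^2/2$, for which $\nabla E(w) = w$:

```python
import numpy as np

def sgd_step(w, grad, eta=0.1):
    # w(k+1) = w(k) - eta * grad E(w(k))
    return w - eta * grad

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sgd_step(w, w)  # for E(w) = ||w||^2 / 2, grad E(w) = w
# after 100 steps w has contracted by 0.9**100 toward the minimizer w = 0
```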

+ +
+
+
+
+

The backpropagation method

+
+
+
+
+

1. Forward pass

+
+
+
+
+
  1. Compute the output signals of the neurons in all layers
  2. Compute the values of the derivatives of the activation functions at each layer of the network
+ +
+
+
+
+

The output signal of a hidden-layer neuron is described as follows:

+ +
+
+
+
+$$ \ v_s = \varphi ^{(1)} (\sum_{i=1}^{N}w_{si}^{(1)}x_i) $$ +
+
+
+
+

The signal of the $j$-th neuron of the output layer:

+ +
+
+
+
+$$ \ u_j = \varphi ^{(2)}(\sum_{s=0}^{K}w_{js}^{(2)}v_s) = \varphi ^{(2)}(\sum_{s=0}^{K}w_{js}^{(2)}\varphi ^{(1)} (\sum_{i=1}^{N}w_{si}^{(1)}x_i)), \quad j = 1,\dots,M $$
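Composed together, the forward pass is two matrix products with the activations in between; a self-contained sketch (bias terms omitted for brevity):

```python
import numpy as np

def forward(x, W1, W2):
    # hidden layer: v_s = phi1(sum_i w1_si * x_i), phi1 = ReLU
    v = np.maximum(W1 @ x, 0)
    # output layer: u_j = phi2(sum_s w2_js * v_s), phi2 = softmax
    g = W2 @ v
    e = np.exp(g - g.max())  # shift by the max for numerical stability
    return e / e.sum()
```

The output is a probability vector: non-negative entries summing to 1.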

+ +
+
+
+
+

ReLU (Rectified Linear Unit) is used as the activation function on the hidden layer.

+ +
+
+
+
+$$ \ \varphi(v) = \begin{cases} 0,v \leq 0 \\ v,v>0 \end{cases}$$ +
+
+
+
+

[Figure: plot of the ReLU activation function]

+ +
+
+
+
+

Its derivative is

+ +
+
+
+
+$$ \ \varphi(v)^{'} = \begin{cases} 0,v \leq 0 \\ 1,v>0 \end{cases}$$ +
+
+
+
+

The softmax function is used as the activation function on the output layer.

+ +
+
+
+
+$$\ \varphi(u_j) = \frac{e^{u_j} }{\sum_{i=1}^{M} e^{u_i}} $$ +
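These formulas translate directly into NumPy; the subtraction of the maximum in softmax is an added numerical-stability step, not part of the formula itself:

```python
import numpy as np

def relu(v):
    # phi(v) = 0 for v <= 0, v otherwise
    return np.maximum(v, 0)

def relu_derivative(v):
    # phi'(v) = 0 for v <= 0, 1 otherwise
    return (v > 0).astype(float)

def softmax(u):
    # phi(u_j) = exp(u_j) / sum_i exp(u_i), row-wise
    e = np.exp(u - np.max(u, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```

Note that the notebook's own `softmax` exponentiates the raw values without the shift, which can overflow for large inputs.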
+
+
+
+

2. Backward pass

+
+
+
+
+
  1. Compute the objective function $E$ and its gradient: $\ \frac{\partial E }{\partial w_{js}^{(2)}} $ and $\ \frac{\partial E }{\partial w_{si}^{(1)}}$
  2. Correct the weights: $\ w(k + 1) = w(k) - \eta \nabla E(w(k)) $
+ +
+
+
+
+

Cross-entropy (also called the logarithmic loss, or log loss) is used as the error function, i.e.

+ +
+
+
+
+$$\ E(w) = - \sum_{j=1}^{M}y_j \ln(u_j) $$ +
+
+
+
+

where $\ y_j$ is the expected output (the labels).
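A sketch of the cross-entropy loss, averaged over a batch the same way as the notebook's `crossEntropyLoss`:

```python
import numpy as np

def cross_entropy(y_true, u_pred):
    # E(w) = -sum_j y_j * ln(u_j), averaged over the rows (samples) of the batch
    return float(np.mean(-np.sum(y_true * np.log(u_pred), axis=1)))
```

For a one-hot target and a uniform two-class prediction the loss is $\ln 2 \approx 0.693$.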

+ +
+
+
+
+

The derivative of the objective function with respect to the weights can be derived as follows:

+ +
+
+
+
+

With respect to the weights of the second layer:

+ +
+
+
+
+$$\ \frac{\partial E }{\partial w_{js}^{(2)}} = - \sum_{j'=0}^{M} y_{j'} \frac{\partial \ln u_{j'}}{\partial w_{js}^{(2)}} = - \sum_{j'=0}^{M} y_{j'} \frac{1}{u_{j'}} \frac{\partial u_{j'}}{\partial w_{js}^{(2)}}$$ +
+
+
+
+$$ \ \frac{\partial u_{j'}}{\partial w_{js}^{(2)}} = \frac{\partial \varphi ^{(2)} (g_{j'})}{\partial g_{j}} \frac{\partial g_{j}}{\partial w_{js}^{(2)}} = \varphi ^{(2)} (g_{j'}) (\delta_{j,j'}-\varphi^{(2)}(g_j)) \frac{\partial g_j}{\partial w_{js}^{(2)}}$$ +
+
+
+
+$$ \ g_j = \sum_{s=0}^K w_{js}^{(2)}v_s; \quad \frac{\partial g_j}{\partial w_{js}^{(2)}} = v_s$$ +
+
+
+
+$$\ \frac{\partial E }{\partial w_{js}^{(2)}} = - \sum_{j'=0}^{M} y_{j'} (\delta_{j,j'}-\varphi^{(2)}(g_j))v_s = (\varphi^{(2)}(g_j) \sum_{j'=0}^{M} y_{j'} - y_j )v_s $$ +
+
+
+
+

from the condition $ \ \sum_{j=1}^M y_j = 1$ we obtain

+ +
+
+
+
+$$\ \frac{\partial E }{\partial w_{js}^{(2)}} = (\varphi^{(2)}(g_j) - y_j)v_s $$ +
+
+
+
+

With respect to the weights of the first layer

+ +
+
+
+
+$$\ \frac{\partial E }{\partial w_{si}^{(1)}} = - \sum_{j'=0}^{M} y_{j'} \frac{\partial \ln u_{j'}}{\partial w_{si}^{(1)}} = - \sum_{j'=0}^{M} y_{j'} \frac{1}{u_{j'}} \frac{\partial u_{j'}}{\partial w_{si}^{(1)}}$$ +
+
+
+
+$$ \ \frac{\partial u_{j'}}{\partial w_{si}^{(1)}} = \sum_{j=0}^{M} \frac{\partial \varphi ^{(2)} (g_{j'})}{\partial g_{j}} \frac{\partial g_{j}}{\partial w_{si}^{(1)}} = \sum_{j=0}^{M} \varphi^{(2)} (g_{j'}) (\delta_{j',j}-\varphi^{(2)}(g_j)) w_{js}^{(2)} {\varphi^{(1)}}'\Big(\sum_{i'=1}^{N}w_{si'}^{(1)}x_{i'}\Big) x_i$$ +
+
+
+
+$$\ \frac{\partial E }{\partial w_{si}^{(1)}} = \Big(\sum_{j'=0}^M (u_{j'}-y_{j'})w_{j's}^{(2)}\Big) {\varphi^{(1)}}'\Big(\sum_{i'=1}^{N}w_{si'}^{(1)}x_{i'}\Big) x_i $$ +
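The derived second-layer formula, $\partial E/\partial w_{js}^{(2)} = (\varphi^{(2)}(g_j) - y_j)v_s$, can be verified numerically with a central finite difference (a self-contained check on random data):

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.normal(size=4)             # hidden-layer outputs v_s
W2 = rng.normal(size=(3, 4))       # second-layer weights w_js
y = np.array([0.0, 1.0, 0.0])      # one-hot target

def loss(W):
    # E = -sum_j y_j ln(u_j) with u = softmax(W @ v)
    g = W @ v
    u = np.exp(g - g.max())
    u /= u.sum()
    return -np.sum(y * np.log(u))

g = W2 @ v
u = np.exp(g - g.max())
u /= u.sum()
analytic = np.outer(u - y, v)      # (u_j - y_j) * v_s

eps = 1e-6
numeric = np.zeros_like(W2)
for j in range(W2.shape[0]):
    for s in range(W2.shape[1]):
        Wp, Wm = W2.copy(), W2.copy()
        Wp[j, s] += eps
        Wm[j, s] -= eps
        numeric[j, s] = (loss(Wp) - loss(Wm)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-5)
```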
+
+
+
+

Software implementation

+
+
+
+ +
+ +
+
+
+

Let us create a NeuralNetwork class that implements the main methods of the neural network.

+ +
+
+
+ +
+ +
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+
+

"Поиграемся" с параметрами модели

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+
+

Control parameter set

+
+
+
+
+

Let us run our model with the control parameter set: batch size 64, learning rate 0.1, 300 hidden neurons, 20 epochs.

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+
+

PyTorch implementation

+
+
+
+
+

Let us check whether the classification accuracy achieved on the test data with the control parameter values is comparable to the accuracy delivered by standard deep learning tools.

+ +
+
+
+
+

To do this, we will use PyTorch.
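A minimal sketch of an equivalent PyTorch model; the exact cells below may differ, but the layer sizes follow the control configuration (784 → 300 → 10). Note that `nn.CrossEntropyLoss` expects raw logits and applies log-softmax internally:

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Flatten(),                # (n, 28, 28) -> (n, 784)
    nn.Linear(28 * 28, 300),     # first layer, 300 hidden neurons
    nn.ReLU(),
    nn.Linear(300, 10),          # second layer, 10 class logits
)
loss_fn = nn.CrossEntropyLoss()  # log-softmax + negative log-likelihood
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def train_step(images, labels):
    # one SGD step on a mini-batch
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```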

+ +
+
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+ + + + +
+ +
+
+ +
+ +
+ + + + + + + + + diff --git a/NerobovaAS/LabDeepLearningNerobovaAS.ipynb b/NerobovaAS/LabDeepLearningNerobovaAS.ipynb new file mode 100644 index 0000000..a6fbffb --- /dev/null +++ b/NerobovaAS/LabDeepLearningNerobovaAS.ipynb @@ -0,0 +1,1130 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5ef8beff", + "metadata": {}, + "source": [ + "# Лабораторная работа по курсу \"Глубокое обучение\"" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "c72322e3", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import struct\n", + "import os\n", + "import gzip\n", + "import time" + ] + }, + { + "cell_type": "markdown", + "id": "19f4dcba", + "metadata": {}, + "source": [ + "## Считывание данных" + ] + }, + { + "cell_type": "markdown", + "id": "a94ab362", + "metadata": {}, + "source": [ + "В данной работе используется набор данных MNIST. Данный набор был предварительно скачен с сайта [http://yann.lecun.com/exdb/mnist]. " + ] + }, + { + "cell_type": "markdown", + "id": "d97a19ce", + "metadata": {}, + "source": [ + "- train-images-idx3-ubyte.gz; train-labels-idx1-ubyte – обучающие данные и их метки. \n", + " - t10k-images-idx3-ubyte, t10k-labels-idx1-ubyte – тестовые данные и их метки. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "098ceb14", + "metadata": {}, + "outputs": [], + "source": [ + "# функция, возвращающая разархивирующий файл с картинками \n", + "def readBinFileMatrix(nameFile):\n", + " pathFiles = os.getcwd()\n", + " with gzip.open(pathFiles + '/' + nameFile,'rb') as f:\n", + " magic, size = struct.unpack(\">II\", f.read(8))\n", + " nrows, ncols = struct.unpack(\">II\", f.read(8))\n", + " data = np.frombuffer(f.read(), dtype=np.dtype(np.uint8).newbyteorder('>'))\n", + " data = data.reshape((size, nrows, ncols))\n", + " return np.array(data)\n", + "\n", + "# функция, возвращающая разархивирующий файл с метками \n", + "def readBinFileLabel(nameFile):\n", + " pathFiles = os.getcwd()\n", + " with gzip.open(pathFiles + '/' + nameFile,'rb') as f:\n", + " magic, size = struct.unpack(\">II\", f.read(8))\n", + " data = np.frombuffer(f.read(), dtype=np.dtype(np.uint8).newbyteorder('>'))\n", + " data = data.reshape((size,)) \n", + " return np.array(data)\n", + "\n", + "# функция, возвращающая тренировочные и тестовые данные\n", + "def loadData(nameData):\n", + " # nameFile = ['t10k-images.idx3-ubyte', 't10k-labels.idx1-ubyte', 'train-images.idx3-ubyte', 'train-labels.idx1-ubyte']\n", + " if nameData == 'Xtrain':\n", + " return readBinFileMatrix('train-images-idx3-ubyte.gz')\n", + " if nameData == 'Ytrain':\n", + " return readBinFileLabel('train-labels-idx1-ubyte.gz')\n", + " if nameData == 'Xtest':\n", + " return readBinFileMatrix('t10k-images-idx3-ubyte.gz')\n", + " if nameData == 'Ytest':\n", + " return readBinFileLabel('t10k-labels-idx1-ubyte.gz') " + ] + }, + { + "cell_type": "markdown", + "id": "a6c786b8", + "metadata": {}, + "source": [ + "Загрузим тренировочные и тестовые данные. Выведем на экран размерность данных." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "8a9de471", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Shape X_train = (60000, 28, 28), Shape y_train = (60000,)\n", + "Shape X_test = (10000, 28, 28), Shape y_test = (10000,)\n" + ] + } + ], + "source": [ + "train_x = loadData('Xtrain')\n", + "train_y= loadData('Ytrain')\n", + "print('Shape X_train = {}, Shape y_train = {}'.format(train_x.shape, train_y.shape))\n", + "\n", + "test_x = loadData('Xtest')\n", + "test_y = loadData('Ytest')\n", + "print('Shape X_test = {}, Shape y_test = {}'.format(test_x.shape, test_y.shape))" + ] + }, + { + "cell_type": "markdown", + "id": "02c0c140", + "metadata": {}, + "source": [ + "Проверм, что данные считались корректно. Для этого выведем на экран первую цифру и её метку класса." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "a45d55a8", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAN80lEQVR4nO3df6hcdXrH8c+ncf3DrBpTMYasNhuRWBWbLRqLSl2RrD9QNOqWDVgsBrN/GHChhEr6xyolEuqP0qAsuYu6sWyzLqgYZVkVo6ZFCF5j1JjU1YrdjV6SSozG+KtJnv5xT+Su3vnOzcyZOZP7vF9wmZnzzJnzcLife87Md879OiIEYPL7k6YbANAfhB1IgrADSRB2IAnCDiRxRD83ZpuP/oEeiwiPt7yrI7vtS22/aftt27d281oAesudjrPbniLpd5IWSNou6SVJiyJia2EdjuxAj/XiyD5f0tsR8U5EfCnpV5Ku6uL1APRQN2GfJekPYx5vr5b9EdtLbA/bHu5iWwC61M0HdOOdKnzjND0ihiQNSZzGA03q5si+XdJJYx5/R9L73bUDoFe6CftLkk61/V3bR0r6kaR19bQFoG4dn8ZHxD7bSyU9JWmKpAci4o3aOgNQq46H3jraGO/ZgZ7ryZdqABw+CDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUii4ymbcXiYMmVKsX7sscf2dPtLly5tWTvqqKOK686dO7dYv/nmm4v1u+66q2Vt0aJFxXU///zzYn3lypXF+u23316sN6GrsNt+V9IeSfsl7YuIs+toCkD96jiyXxQRH9TwOgB6iPfsQBLdhj0kPW37ZdtLxnuC7SW2h20Pd7ktAF3o9jT+/Ih43/YJkp6x/V8RsWHsE
yJiSNKQJNmOLrcHoENdHdkj4v3qdqekxyTNr6MpAPXrOOy2p9o++uB9ST+QtKWuxgDUq5vT+BmSHrN98HX+PSJ+W0tXk8zJJ59crB955JHF+nnnnVesX3DBBS1r06ZNK6577bXXFutN2r59e7G+atWqYn3hwoUta3v27Cmu++qrrxbrL7zwQrE+iDoOe0S8I+kvauwFQA8x9AYkQdiBJAg7kARhB5Ig7EASjujfl9om6zfo5s2bV6yvX7++WO/1ZaaD6sCBA8X6jTfeWKx/8sknHW97ZGSkWP/www+L9TfffLPjbfdaRHi85RzZgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJxtlrMH369GJ948aNxfqcOXPqbKdW7XrfvXt3sX7RRRe1rH355ZfFdbN+/6BbjLMDyRF2IAnCDiRB2IEkCDuQBGEHkiDsQBJM2VyDXbt2FevLli0r1q+44opi/ZVXXinW2/1L5ZLNmzcX6wsWLCjW9+7dW6yfccYZLWu33HJLcV3UiyM7kARhB5Ig7EAShB1IgrADSRB2IAnCDiTB9ewD4JhjjinW200vvHr16pa1xYsXF9e9/vrri/W1a9cW6xg8HV/PbvsB2zttbxmzbLrtZ2y/Vd0eV2ezAOo3kdP4X0i69GvLbpX0bEScKunZ6jGAAdY27BGxQdLXvw96laQ11f01kq6uuS8ANev0u/EzImJEkiJixPYJrZ5oe4mkJR1uB0BNen4hTEQMSRqS+IAOaFKnQ287bM+UpOp2Z30tAeiFTsO+TtIN1f0bJD1eTzsAeqXtabzttZK+L+l429sl/VTSSkm/tr1Y0u8l/bCXTU52H3/8cVfrf/TRRx2ve9NNNxXrDz/8cLHebo51DI62YY+IRS1KF9fcC4Ae4uuyQBKEHUiCsANJEHYgCcIOJMElrpPA1KlTW9aeeOKJ4roXXnhhsX7ZZZcV608//XSxjv5jymYgOcIOJEHYgSQIO5AEYQeSIOxAEoQdSIJx9knulFNOKdY3bdpUrO/evbtYf+6554r14eHhlrX77ruvuG4/fzcnE8bZgeQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJxtmTW7hwYbH+4IMPFutHH310x9tevnx5sf7QQw8V6yMjIx1vezJjnB1IjrADSRB2IAnCDiRB2IEkCDuQBGEHkmCcHUVnnnlmsX7PPfcU6xdf3Plkv6tXry7WV6xYUay/9957HW/7cNbxOLvtB2zvtL1lzLLbbL9ne3P1c3mdzQKo30RO438h6dJxlv9LRMyrfn5Tb1sA6tY27BGxQdKuPvQCoIe6+YBuqe3XqtP841o9yfYS28O2W/8zMgA912nYfybpFEnzJI1IurvVEyNiKCLOjoizO9wWgBp0FPaI2BER+yPigKSfS5pfb1sA6tZR2G3PHPNwoaQtrZ4LYDC0HWe3vVbS9yUdL2mHpJ9Wj+dJCknvSvpxRLS9uJhx9sln2rRpxfqVV17ZstbuWnl73OHir6xfv75YX7BgQbE+WbUaZz9iAisuGmfx/V13BKCv+LoskARhB5Ig7EAShB1IgrADSXCJKxrzxRdfFOtHHFEeLNq3b1+xfskll7SsPf/888V1D2f8K2kgOcIOJEHYgSQIO5AEYQeSIOxAEoQdSKLtVW/I7ayzzirWr7vuumL9nHPOaVlrN47eztatW4v1DRs2dPX6kw1HdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgnH2SW7u3LnF+tKlS4v1a665plg/8cQTD7mnidq/f3+xPjJS/u/lBw4cqLOdwx5HdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgnH2w0C7sexFi8abaHdUu3H02bNnd9JSLYaHh4v1FStWFOvr1q2rs51Jr+2R3fZJtp+zvc32G7ZvqZZPt/2M7beq2+N63y6ATk3kNH6fpL+PiD+X9FeSbrZ9uqRbJT0bEadKerZ6DGBAtQ17RIxExKbq/h5J2yTNknSVpDXV09ZIu
rpXTQLo3iG9Z7c9W9L3JG2UNCMiRqTRPwi2T2ixzhJJS7prE0C3Jhx229+W9Iikn0TEx/a4c8d9Q0QMSRqqXoOJHYGGTGjozfa3NBr0X0bEo9XiHbZnVvWZknb2pkUAdWh7ZPfoIfx+Sdsi4p4xpXWSbpC0srp9vCcdTgIzZswo1k8//fRi/d577y3WTzvttEPuqS4bN24s1u+8886WtccfL//KcIlqvSZyGn++pL+V9LrtzdWy5RoN+a9tL5b0e0k/7E2LAOrQNuwR8Z+SWr1Bv7jedgD0Cl+XBZIg7EAShB1IgrADSRB2IAkucZ2g6dOnt6ytXr26uO68efOK9Tlz5nTUUx1efPHFYv3uu+8u1p966qli/bPPPjvkntAbHNmBJAg7kARhB5Ig7EAShB1IgrADSRB2IIk04+znnntusb5s2bJiff78+S1rs2bN6qinunz66acta6tWrSque8cddxTre/fu7agnDB6O7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQRJpx9oULF3ZV78bWrVuL9SeffLJY37dvX7FeuuZ89+7dxXWRB0d2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUjCEVF+gn2SpIcknSjpgKShiPhX27dJuknS/1ZPXR4Rv2nzWuWNAehaRIw76/JEwj5T0syI2GT7aEkvS7pa0t9I+iQi7ppoE4Qd6L1WYZ/I/Owjkkaq+3tsb5PU7L9mAXDIDuk9u+3Zkr4naWO1aKnt12w/YPu4FusssT1se7irTgF0pe1p/FdPtL8t6QVJKyLiUdszJH0gKST9k0ZP9W9s8xqcxgM91vF7dkmy/S1JT0p6KiLuGac+W9KTEXFmm9ch7ECPtQp729N425Z0v6RtY4NefXB30EJJW7ptEkDvTOTT+Ask/Yek1zU69CZJyyUtkjRPo6fx70r6cfVhXum1OLIDPdbVaXxdCDvQex2fxgOYHAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJ9HvK5g8k/c+Yx8dXywbRoPY2qH1J9NapOnv7s1aFvl7P/o2N28MRcXZjDRQMam+D2pdEb53qV2+cxgNJEHYgiabDPtTw9ksGtbdB7Uuit071pbdG37MD6J+mj+wA+oSwA0k0Enbbl9p+0/bbtm9toodWbL9r+3Xbm5uen66aQ2+n7S1jlk23/Yztt6rbcefYa6i322y/V+27zbYvb6i3k2w/Z3ub7Tds31Itb3TfFfrqy37r+3t221Mk/U7SAknbJb0kaVFEbO1rIy3YflfS2RHR+BcwbP+1pE8kPXRwai3b/yxpV0SsrP5QHhcR/zAgvd2mQ5zGu0e9tZpm/O/U4L6rc/rzTjRxZJ8v6e2IeCcivpT0K0lXNdDHwIuIDZJ2fW3xVZLWVPfXaPSXpe9a9DYQImIkIjZV9/dIOjjNeKP7rtBXXzQR9lmS/jDm8XYN1nzvIelp2y/bXtJ0M+OYcXCarer2hIb7+bq203j309emGR+YfdfJ9OfdaiLs401NM0jjf+dHxF9KukzSzdXpKibmZ5JO0egcgCOS7m6ymWqa8Uck/SQiPm6yl7HG6asv+62JsG+XdNKYx9+R9H4DfYwrIt6vbndKekyjbzsGyY6DM+hWtzsb7ucrEbEjIvZHxAFJP1eD+66aZvwRSb+MiEerxY3vu/H66td+ayLsL0k61fZ3bR8p6UeS1jXQxzfYnlp9cCLbUyX9QIM3FfU6STdU92+Q9HiDvfyRQZnGu9U042p43zU+/XlE9P1H0uUa/UT+vyX9YxM9tOhrjqRXq583mu5N0lqNntb9n0bPiBZL+lNJz0p6q7qdPkC9/ZtGp/Z+TaPBmtlQbxdo9K3ha5I2Vz+XN73vCn31Zb/xdVkgCb5BByRB2IEkCDuQBGEHkiDsQBKEHUiCsANJ/D+f1mbtgJ8kQQAAAABJRU5ErkJggg==\n", + "text/plain": [ + 
"
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Label: 5\n" + ] + } + ], + "source": [ + "plt.imshow(train_x[0,:,:], cmap='gray')\n", + "plt.show()\n", + "print('Label: {}'.format(train_y[0]))" + ] + }, + { + "cell_type": "markdown", + "id": "51c96fdf", + "metadata": {}, + "source": [ + "## Нормировка данных" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "c240b3bb", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Shape X_train = (60000, 784), Shape y_train = (60000, 10)\n", + "Shape X_test = (10000, 784), Shape y_test = (10000, 10)\n" + ] + } + ], + "source": [ + "train_x = train_x.reshape(\n", + " train_x.shape[0], train_x.shape[1]*train_x.shape[2]).astype('float32')\n", + "train_x = train_x / 255 \n", + "train_y = np.eye(10)[train_y] \n", + "print('Shape X_train = {}, Shape y_train = {}'.format(train_x.shape, train_y.shape))\n", + "\n", + "# flatten 28x28 to 784x1 vectors, [60000, 784]\n", + "test_x = test_x.reshape(\n", + " test_x.shape[0], test_x.shape[1]*test_x.shape[2]).astype('float32')\n", + "test_x = test_x / 255 \n", + "test_y = np.eye(10)[test_y]\n", + "print('Shape X_test = {}, Shape y_test = {}'.format(test_x.shape, test_y.shape))" + ] + }, + { + "cell_type": "markdown", + "id": "6a9fa3d9", + "metadata": {}, + "source": [ + "## Математическая модель" + ] + }, + { + "cell_type": "markdown", + "id": "ea1e1674", + "metadata": {}, + "source": [ + "Рассмотрим двухслойную нейронную сеть" + ] + }, + { + "cell_type": "markdown", + "id": "6e83a26d", + "metadata": {}, + "source": [ + "[![2022-12-15_20-15-21.md.png](https://ic.wampi.ru/2022/12/18/2022-12-15_20-15-21.md.png)](https://wampi.ru/image/RHEMg2t)" + ] + }, + { + "cell_type": "markdown", + "id": "a5ffad8d", + "metadata": {}, + "source": [ + "где \n", + "- $ \\ x_i$ - множество входных сигналов \n", + "- $ \\ u_j$ - 
выход сети\n", + "- $ \\ v_s$ - выходной сигнал нейрона скрытого слоя\n", + "- $ \\ w_{si}^{(1)} $ и $ \\ w_{js}^{(2)} $ - веса синаптических связей" + ] + }, + { + "cell_type": "markdown", + "id": "229bd089", + "metadata": {}, + "source": [ + "Модель нейрона описывается следующими уравнениями:" + ] + }, + { + "cell_type": "markdown", + "id": "62f7ac66", + "metadata": {}, + "source": [ + "$$ \\ u_k=\\sum_{j=1}^{n}w_{k,j}x_j$$" + ] + }, + { + "cell_type": "markdown", + "id": "2980febc", + "metadata": {}, + "source": [ + "$$ \\ y_k=\\varphi(u_k+b_k)$$" + ] + }, + { + "cell_type": "markdown", + "id": "ab73fbfb", + "metadata": {}, + "source": [ + "где \n", + "$\\ x_j$ - входной сигнал, $\\ w_{k,j} $ - синаптический вес сигнала $\\ x_j$, $\\ \\varphi $ - функция активации, $\\ b_k$ - смещение." + ] + }, + { + "cell_type": "markdown", + "id": "a93603f9", + "metadata": {}, + "source": [ + "Метод обратного распространения ошибки определяет\n", + "стратегию изменения параметров сети $\\ 𝑤 $ в ходе обучения\n", + "с использованием градиентных методов оптимизации." + ] + }, + { + "cell_type": "markdown", + "id": "156ce735", + "metadata": {}, + "source": [ + "Градиентные методы на каждом шаге уточняют значения параметров: $\\ w(k+1) = w(k) + \\eta p(w)$" + ] + }, + { + "cell_type": "markdown", + "id": "ad73c87a", + "metadata": {}, + "source": [ + "где\n", + "- $ \\ \\eta $ , $ \\ 0 < \\eta < 1 $ – скорость обучения (learning rate) –«скорость» движения в направлении минимального значения функции\n", + "- $ \\ 𝑝(𝑤) $ – направление в многомерном пространстве параметров нейронной сети\n" + ] + }, + { + "cell_type": "markdown", + "id": "84e65f55", + "metadata": {}, + "source": [ + "В классическом методе обратного распространения ошибки направление движения совпадает с направлением антиградиента $ \\ 𝑝(𝑤) = −\\nabla 𝐸(𝑤(𝑘)) $ на 𝑘-ой итерации метода." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b5703b9f", + "metadata": {}, + "source": [ + "### Метод обратного распространения ошибки" + ] + }, + { + "cell_type": "markdown", + "id": "ec077725", + "metadata": {}, + "source": [ + "#### 1. Прямой проход" + ] + }, + { + "cell_type": "markdown", + "id": "ce502359", + "metadata": {}, + "source": [ + "1. Вычисление значений выходных сигналов нейронов всех слоев\n", + "2. Вычисление значений производных функций активации на каждом\n", + "слое сети" + ] + }, + { + "cell_type": "markdown", + "id": "6e673f7d", + "metadata": {}, + "source": [ + "Выходной сигнал нейрона скрытого слоя описывается следующим образом:" + ] + }, + { + "cell_type": "markdown", + "id": "9cd097df", + "metadata": {}, + "source": [ + "$$ \\ v_s = \\varphi ^{(1)} (\\sum_{i=1}^{N}w_{si}^{(1)}x_i) $$" + ] + }, + { + "cell_type": "markdown", + "id": "89b0d81a", + "metadata": {}, + "source": [ + "Сигнал 𝑗-ого нейрона выходного слоя:" + ] + }, + { + "cell_type": "markdown", + "id": "b90e5bd2", + "metadata": {}, + "source": [ + "$$ \\ u_j = \\varphi ^{(2)}(\\sum_{s=0}^{K}w_{js}^{(2)}v_s) = \\varphi ^{(2)}(\\sum_{s=0}^{K}w_{js}^{(2)}\\varphi ^{(1)} (\\sum_{i=0}^{N}w_{si}^{(1)}x_i)), \\ j = 1,M $$ " + ] + }, + { + "cell_type": "markdown", + "id": "e8c1a760", + "metadata": {}, + "source": [ + "В качестве функции активации на скрытом слое используется ReLU (Rectified Linear Unit)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "bc001917", + "metadata": {}, + "source": [ + "$$ \\ \\varphi(v) = \\begin{cases} 0,v \\leq 0 \\\\ v,v>0 \\end{cases}$$" + ] + }, + { + "cell_type": "markdown", + "id": "a7aa9105", + "metadata": {}, + "source": [ + "[![2022-12-18_11-51-48.png](https://im.wampi.ru/2022/12/18/2022-12-18_11-51-48.png)](https://wampi.ru/image/RHEMDAg)" + ] + }, + { + "cell_type": "markdown", + "id": "71c1946d", + "metadata": {}, + "source": [ + "Её производная есть " + ] + }, + { + "cell_type": "markdown", + "id": "9ac9bbc8", + "metadata": {}, + "source": [ + "$$ \\ \\varphi(v)^{'} = \\begin{cases} 0,v \\leq 0 \\\\ 1,v>0 \\end{cases}$$" + ] + }, + { + "cell_type": "markdown", + "id": "082c6fcc", + "metadata": {}, + "source": [ + "В качестве функции активациии на выходном слое используется функция softmax." + ] + }, + { + "cell_type": "markdown", + "id": "cacfcb65", + "metadata": {}, + "source": [ + "$$\\ \\varphi(u_j) = \\frac{e^{u_j} }{\\sum_{i=1}^{M} e^{u_i}} $$" + ] + }, + { + "cell_type": "markdown", + "id": "4a13abd4", + "metadata": {}, + "source": [ + "#### 2. Обратный проход" + ] + }, + { + "cell_type": "markdown", + "id": "98507bdb", + "metadata": {}, + "source": [ + "1. Вычисление целевой функции 𝐸 и ее градиента $\\ \\frac{\\partial E }{\\partial w_{js}^{(2)}} $ и $\\ \\frac{\\partial E }{\\partial w_{si}^{(1)}}$ \n", + "2. Коррекция весов $\\ 𝑤(𝑘 + 1) = 𝑤(𝑘) − \\eta \\nabla 𝐸(𝑤(𝑘)) $" + ] + }, + { + "cell_type": "markdown", + "id": "846bf981", + "metadata": {}, + "source": [ + "В качестве функции ошибки используется кросс-энтропия (или логарифмическая функция потерь – log loss), т.е. " + ] + }, + { + "cell_type": "markdown", + "id": "9ef3bbae", + "metadata": {}, + "source": [ + "$$\\ E(w) = - \\sum_{j=1}^{M}y_j ln(u_j) $$" + ] + }, + { + "cell_type": "markdown", + "id": "ecbd81bf", + "metadata": {}, + "source": [ + "где $\\ y_j$ ожидаемый выход (метки)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "d791dbec", + "metadata": {}, + "source": [ + "Производную целевой функции по весам можно вывести следующим образом:" + ] + }, + { + "cell_type": "markdown", + "id": "56f89295", + "metadata": {}, + "source": [ + "По весам второго слоя:" + ] + }, + { + "cell_type": "markdown", + "id": "ea29cc3e", + "metadata": {}, + "source": [ + "$$\\ \\frac{\\partial E }{\\partial w_{js}^{(2)}} = - \\sum_{j'=0}^{M} y_{j'} \\frac{\\partial \\ln u_{j'}}{\\partial w_{js}^{(2)}} = - \\sum_{j'=0}^{M} y_{j'} \\frac{1}{u_{j'}} \\frac{\\partial u_{j'}}{\\partial w_{js}^{(2)}}$$" + ] + }, + { + "cell_type": "markdown", + "id": "7c9ad94f", + "metadata": {}, + "source": [ + "$$ \\ \\frac{\\partial u_{j'}}{\\partial w_{js}^{(2)}} = \\frac{\\partial \\varphi ^{(2)} (g_{j'})}{\\partial g_{j}} \\frac{\\partial g_{j}}{\\partial w_{js}^{(2)}} = \\varphi (g_{j'}) (\\delta_{j,j'}-\\varphi(g_i)) \\frac{\\partial g_j}{\\partial w_{js}^{(2)}}$$" + ] + }, + { + "cell_type": "markdown", + "id": "8c36be67", + "metadata": {}, + "source": [ + "$$ \\ g_j = \\sum_{s=0}^K w_{ij}^{(2)}v_s; \\frac{\\partial g_j}{\\partial w_{js}^{(2)}} = v_s$$" + ] + }, + { + "cell_type": "markdown", + "id": "67b6fd71", + "metadata": {}, + "source": [ + "$$\\ \\frac{\\partial E }{\\partial w_{js}^{(2)}} = - \\sum_{j'=0}^{M} y_{j'} (\\delta_{j,j'}-\\varphi^{(2)}(g_i))v_s = (\\varphi^{(2)}(g_j) \\sum_{j'=0}^{M} y_{j'} - y_i )v_s $$" + ] + }, + { + "cell_type": "markdown", + "id": "13fd89db", + "metadata": {}, + "source": [ + "из условия $ \\ \\sum_{j=1}^M y_j = 1$ получаем" + ] + }, + { + "cell_type": "markdown", + "id": "03dcb5ed", + "metadata": {}, + "source": [ + "$$\\ \\frac{\\partial E }{\\partial w_{js}^{(2)}} = (\\partial^{(2)}(g_j) - y_j)v_s $$" + ] + }, + { + "cell_type": "markdown", + "id": "51237daf", + "metadata": {}, + "source": [ + "По весам первого слоя" + ] + }, + { + "cell_type": "markdown", + "id": "43dbdeb8", + "metadata": {}, + "source": [ + "$$\\ \\frac{\\partial 
E }{\\partial w_{si}^{(1)}} = - \\sum_{j'=0}^{M} y_{j'} \\frac{\\partial \\ln u_{j'}}{\\partial w_{si}^{(1)}} = - \\sum_{j'=0}^{M} y_{j'} \\frac{1}{u_{j'}} \\frac{\\partial u_{j'}}{\\partial w_{si}^{(1)}}$$" + ] + }, + { + "cell_type": "markdown", + "id": "d7151570", + "metadata": {}, + "source": [ + "$$ \\ \\frac{\\partial u_{j'}}{\\partial w_{si}^{(1)}} = \\frac{\\partial \\varphi ^{(2)} (g_{j'})}{\\partial g_{j}} \\frac{\\partial g_{j}}{\\partial w_{si}^{(1)}} = \\varphi^{(2)} (g_{j'}) (\\delta_{j',j}-\\varphi^{(2)}(g_i)) w_{js}^{(2)} \\frac{\\partial \\varphi_i^{(1)}}{\\partial w_{si}^{(1)}}x_i$$" + ] + }, + { + "cell_type": "markdown", + "id": "c254b446", + "metadata": {}, + "source": [ + "$$\\ \\frac{\\partial E }{\\partial w_{si}^{(1)}} = (\\sum_{j'=0}^M (y_{j'}-u_{j'})w_{j's}^{(2)})\\frac{\\partial \\varphi_i^{(1)}}{\\partial w_{si}^{(1)}}x_i $$" + ] + }, + { + "cell_type": "markdown", + "id": "eaff0904", + "metadata": {}, + "source": [ + "## Программная реализация" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "a5f1abe9", + "metadata": {}, + "outputs": [], + "source": [ + "# определение функции активации ReLU на скрытом слое\n", + "def ReLU(x):\n", + " return np.maximum(x, 0)\n", + "\n", + "# определение производной функции автивации ReLU\n", + "def ReLuDerivative(x):\n", + " x[x <= 0] = 0\n", + " x[x > 0] = 1\n", + " return x\n", + "\n", + "# определение функции активации softmax на выходном слое\n", + "def softmax(x):\n", + " exp = np.exp(x)\n", + " return exp / np.sum(exp, axis = 1, keepdims = True)\n", + "\n", + "# определение функции ошибки кросс-энтропия\n", + "def crossEntropyLoss(x1, x2):\n", + " return np.mean(-np.sum(x1 * np.log(x2), axis=1))\n", + "\n", + "# функция подсчета точности на тестовой или обучающей выборке \n", + "def accuracy(x1, x2):\n", + " return np.mean(np.argmax(x1, axis=1) == np.argmax(x2, axis=1))" + ] + }, + { + "cell_type": "markdown", + "id": "fc28e0ef", + "metadata": {}, + "source": [ + "Создадим 
класс NeuralNetwork, который будет содержать имплементацию основных методов нейронной сети" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "bc0a437b", + "metadata": {}, + "outputs": [], + "source": [ + "class NeuralNetwork(object):\n", + " # Объявление конструктора\n", + " def __init__(self, input_layer=28 * 28, hidden_layer=300, output_layer=10):\n", + " # инициализация\n", + " # количество нейронов входного слоя \n", + " self.input_layer = input_layer\n", + " # количество нейронов скрытого слоя\n", + " self.hidden_layer = hidden_layer\n", + " # количество нейронов выходного слоя\n", + " self.output_layer = output_layer\n", + " \n", + " # w1, w2 (веса синаптических связей) инициализируем нормальным распределением с дисперсией sqrt(input_layer)\n", + " # w1 и b1 массивы для хранения весов и смещений первого слоя\n", + " self.w1 = np.random.randn(input_layer, hidden_layer) / np.sqrt(input_layer)\n", + " self.b1 = np.zeros((1, hidden_layer))\n", + " # w2 и b2 массивы для хранения весов и смещений второго слоя\n", + " self.w2 = np.random.randn(hidden_layer, output_layer) / np.sqrt(hidden_layer)\n", + " self.b2 = np.zeros((1, output_layer))\n", + " \n", + " # прямой ход сети\n", + " # Возвращает выходной сигнал первого и второго слоя\n", + " def forward(self, x):\n", + " self.z1 = np.matmul(x, self.w1) + self.b1\n", + " self.a1 = ReLU(self.z1)\n", + " self.z2 = np.matmul(self.a1, self.w2) + self.b2\n", + " self.a2 = softmax(self.z2)\n", + "\n", + " # обратных ход сети\n", + " # вычисление градиента функции ошибки\n", + " # корректировка весов сети при помощи посчитанных градиентов\n", + " def backward(self, xTrain, yTrain, learningRate):\n", + " w1, b1, w2, b2 = self.w1, self.b1, self.w2, self.b2\n", + "\n", + " delta3 = (self.a2 - yTrain) / self.a2.shape[0]\n", + " dW2 = (self.a1.T).dot(delta3)\n", + " db2 = np.sum(delta3, axis=0, keepdims=True)\n", + "\n", + " delta2 = delta3.dot(self.w2.T) * ReLuDerivative(self.z1)\n", + " dW1 = 
np.dot(xTrain.T, delta2)\n", + " db1 = np.sum(delta2, axis=0, keepdims=True)\n", + "\n", + " self.w1 += -learningRate * dW1\n", + " self.b1 += -learningRate * db1\n", + " self.w2 += -learningRate * dW2\n", + " self.b2 += -learningRate * db2\n", + "\n", + " # пакетное обучение сети на epochs эпохах, скоростью обучения l_rate, размером пакета batch_size\n", + " def train(self, X_train, y_train, num_epochs=10, l_rate=0.1, batch_size=16):\n", + " print('Train ...')\n", + " all_time = time.time()\n", + " for epoch in range(num_epochs):\n", + " epoch_time_start = time.time()\n", + " iteration = 0\n", + " while iteration < len(X_train):\n", + " # берем часть данных, размером batch_size\n", + " X_train_batch = X_train[iteration:iteration+batch_size]\n", + " y_train_batch = y_train[iteration:iteration+batch_size]\n", + " \n", + " self.forward(X_train_batch)\n", + " self.backward(X_train_batch, y_train_batch, l_rate)\n", + "\n", + " iteration += batch_size\n", + " epoch_time = time.time()\n", + "\n", + " self.forward(X_train)\n", + " crossEntropyValue = crossEntropyLoss(y_train, self.a2)\n", + " accuracyValue = accuracy(y_train, self.a2)\n", + "\n", + " print('Epoch: {}; Time: {}; Loss: {}; Accuracy: {}'.format(epoch, epoch_time - epoch_time_start, crossEntropyValue, accuracyValue))\n", + " print(f\"Total train time: {(time.time()-all_time):.3f} sec\")\n", + "\n", + " # тестирование сети\n", + " # вывод метрик loss и accuracy\n", + " def test(self, X_test, y_test):\n", + " print('Test ...')\n", + " all_time = time.time()\n", + " self.forward(X_test)\n", + " crossEntropyValue = crossEntropyLoss(y_test, self.a2)\n", + " accuracyValue = accuracy(y_test, self.a2)\n", + "\n", + " print('Loss = {}; Accuracy = {}'.format(crossEntropyValue, accuracyValue))\n", + " print(f\"Total test time: {time.time() - all_time} sec\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "90ea5a14", + "metadata": {}, + "outputs": [], + "source": [ + "# Создание объекта 
разработанного класса\n", + "network = NeuralNetwork()" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "e7166b02", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train ...\n", + "Epoch: 0; Time: 18.00784158706665; Loss: 0.1349527058387903; Accuracy: 0.9576333333333333\n", + "Epoch: 1; Time: 16.74522376060486; Loss: 0.08370092845220185; Accuracy: 0.9739\n", + "Epoch: 2; Time: 16.751219034194946; Loss: 0.06145281115565548; Accuracy: 0.9806833333333334\n", + "Epoch: 3; Time: 16.82201313972473; Loss: 0.04785633851741321; Accuracy: 0.9845166666666667\n", + "Epoch: 4; Time: 16.820075750350952; Loss: 0.04104109750956039; Accuracy: 0.9866666666666667\n", + "Epoch: 5; Time: 16.89297604560852; Loss: 0.034057166605354766; Accuracy: 0.989\n", + "Epoch: 6; Time: 16.794084787368774; Loss: 0.02821324932708801; Accuracy: 0.9910333333333333\n", + "Epoch: 7; Time: 17.208982706069946; Loss: 0.02268514669964699; Accuracy: 0.9928\n", + "Epoch: 8; Time: 17.462302923202515; Loss: 0.01819413801111785; Accuracy: 0.9944166666666666\n", + "Epoch: 9; Time: 60.03344774246216; Loss: 0.01628618454131359; Accuracy: 0.9952833333333333\n", + "Total train time: 227.006 sec\n" + ] + } + ], + "source": [ + "# Обучение модели на 10 эпохах, скоростью обучения 0.1, размером пакета 16\n", + "network.train(train_x, train_y, num_epochs=10, l_rate=0.1, batch_size=16)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "5707fa28", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test ...\n", + "Loss = 0.0735054455477396; Accuracy = 0.9778\n", + "Total test time: 0.17752361297607422 sec\n" + ] + } + ], + "source": [ + "# тестирование модели\n", + "network.test(test_x, test_y)" + ] + }, + { + "cell_type": "markdown", + "id": "c85467be", + "metadata": {}, + "source": [ + "\"Поиграемся\" с параметрами модели" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + 
"id": "c15c04f2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train ...\n", + "Epoch: 0; Time: 5.090388536453247; Loss: 0.24294348513074265; Accuracy: 0.9280666666666667\n", + "Epoch: 1; Time: 5.972088098526001; Loss: 0.1715999161383804; Accuracy: 0.9502\n", + "Epoch: 2; Time: 5.992071628570557; Loss: 0.13298560913337765; Accuracy: 0.9616666666666667\n", + "Epoch: 3; Time: 5.941112518310547; Loss: 0.10770514035803948; Accuracy: 0.9691\n", + "Epoch: 4; Time: 5.963052988052368; Loss: 0.08956400645403255; Accuracy: 0.9745166666666667\n", + "Epoch: 5; Time: 6.005940914154053; Loss: 0.07586757175581064; Accuracy: 0.9786333333333334\n", + "Epoch: 6; Time: 5.945213794708252; Loss: 0.06548302043785972; Accuracy: 0.98185\n", + "Epoch: 7; Time: 5.985988616943359; Loss: 0.05745028924880232; Accuracy: 0.9843666666666666\n", + "Epoch: 8; Time: 6.00201416015625; Loss: 0.05105409641189437; Accuracy: 0.9864166666666667\n", + "Epoch: 9; Time: 5.976008176803589; Loss: 0.04579017932541774; Accuracy: 0.98805\n", + "Total train time: 71.740 sec\n", + "Test ...\n", + "Loss = 0.07396158594576653; Accuracy = 0.9772\n", + "Total test time: 0.26427483558654785 sec\n" + ] + } + ], + "source": [ + "network = NeuralNetwork()\n", + "# Обучение модели на 10 эпохах, скоростью обучения 0.1, размером пакета 64\n", + "network.train(train_x, train_y, num_epochs=10, l_rate=0.1, batch_size=64)\n", + "network.test(test_x, test_y)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "60ebc5ad", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train ...\n", + "Epoch: 0; Time: 1.059169054031372; Loss: 0.26034218842562806; Accuracy: 0.9227\n", + "Epoch: 1; Time: 1.4261975288391113; Loss: 0.18869846939711332; Accuracy: 0.9442\n", + "Epoch: 2; Time: 2.0904133319854736; Loss: 0.1485427282522527; Accuracy: 0.9565666666666667\n", + "Epoch: 3; Time: 1.3025224208831787; Loss: 
0.12263490128501105; Accuracy: 0.9644833333333334\n", + "Epoch: 4; Time: 2.1153440475463867; Loss: 0.10480436048740525; Accuracy: 0.9695666666666667\n", + "Epoch: 5; Time: 2.0734517574310303; Loss: 0.09198975938093959; Accuracy: 0.9730166666666666\n", + "Epoch: 6; Time: 2.1023786067962646; Loss: 0.08204000656390269; Accuracy: 0.9759666666666666\n", + "Epoch: 7; Time: 2.1023709774017334; Loss: 0.07364010494164555; Accuracy: 0.9785333333333334\n", + "Epoch: 8; Time: 2.06646990776062; Loss: 0.06706364376078995; Accuracy: 0.9800833333333333\n", + "Epoch: 9; Time: 2.1642138957977295; Loss: 0.06142786623959192; Accuracy: 0.9818166666666667\n", + "Total train time: 25.613 sec\n", + "Test ...\n", + "Loss = 0.08833317490485565; Accuracy = 0.9733\n", + "Total test time: 0.1356368064880371 sec\n" + ] + } + ], + "source": [ + "network = NeuralNetwork(input_layer=28 * 28, hidden_layer=100, output_layer=10)\n", + "# Обучение модели на 10 эпохах со скоростью обучения 0.1 и размером пакета 64\n", + "network.train(train_x, train_y, num_epochs=10, l_rate=0.1, batch_size=64)\n", + "network.test(test_x, test_y)" + ] + }, + { + "cell_type": "markdown", + "id": "4b620807", + "metadata": {}, + "source": [ + "## Контрольный набор данных" + ] + }, + { + "cell_type": "markdown", + "id": "f5e5df4d", + "metadata": {}, + "source": [ + "Запустим нашу модель на контрольном наборе параметров: размер пакета 64, скорость обучения 0.1, количество скрытых нейронов – 300, количество эпох – 20."
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "f7a966a9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Train ...\n", + "Epoch: 0; Time: 4.638623237609863; Loss: 0.2372415223735498; Accuracy: 0.9303333333333333\n", + "Epoch: 1; Time: 5.945101499557495; Loss: 0.16535430731543788; Accuracy: 0.95125\n", + "Epoch: 2; Time: 6.0797953605651855; Loss: 0.12740871082300592; Accuracy: 0.9631\n", + "Epoch: 3; Time: 6.143598318099976; Loss: 0.10327006829697495; Accuracy: 0.9708166666666667\n", + "Epoch: 4; Time: 6.080739259719849; Loss: 0.08650579076716439; Accuracy: 0.9756833333333333\n", + "Epoch: 5; Time: 6.071762800216675; Loss: 0.07396265731808521; Accuracy: 0.9793833333333334\n", + "Epoch: 6; Time: 5.919201374053955; Loss: 0.0645307308825114; Accuracy: 0.9819166666666667\n", + "Epoch: 7; Time: 5.920161724090576; Loss: 0.057073297857151636; Accuracy: 0.9843\n", + "Epoch: 8; Time: 5.926152944564819; Loss: 0.05108395936046026; Accuracy: 0.9859333333333333\n", + "Epoch: 9; Time: 5.907189130783081; Loss: 0.046115866895064916; Accuracy: 0.9874166666666667\n", + "Epoch: 10; Time: 6.098692417144775; Loss: 0.042079473095414516; Accuracy: 0.9886166666666667\n", + "Epoch: 11; Time: 5.923326253890991; Loss: 0.03861578290703333; Accuracy: 0.9896666666666667\n", + "Epoch: 12; Time: 5.94707179069519; Loss: 0.03542242858240878; Accuracy: 0.99045\n", + "Epoch: 13; Time: 6.02488899230957; Loss: 0.03261718990379848; Accuracy: 0.9914333333333334\n", + "Epoch: 14; Time: 5.887335777282715; Loss: 0.03011773525003235; Accuracy: 0.9925666666666667\n", + "Epoch: 15; Time: 5.890272617340088; Loss: 0.02795657889646554; Accuracy: 0.9932833333333333\n", + "Epoch: 16; Time: 5.932122230529785; Loss: 0.02598646160478246; Accuracy: 0.9939666666666667\n", + "Epoch: 17; Time: 5.932113170623779; Loss: 0.024148895523653575; Accuracy: 0.9945166666666667\n", + "Epoch: 18; Time: 6.116644620895386; Loss: 0.022516730664805102; Accuracy: 
0.9950666666666667\n", + "Epoch: 19; Time: 6.121629476547241; Loss: 0.020971720196515484; Accuracy: 0.9954833333333334\n", + "Total train time: 144.532 sec\n", + "Test ...\n", + "Loss = 0.06453382283579312; Accuracy = 0.9803\n", + "Total test time: 0.2722773551940918 sec\n" + ] + } + ], + "source": [ + "network = NeuralNetwork(input_layer=28 * 28, hidden_layer=300, output_layer=10)\n", + "network.train(train_x, train_y, num_epochs=20, l_rate=0.1, batch_size=64)\n", + "network.test(test_x, test_y)" + ] + }, + { + "cell_type": "markdown", + "id": "4568cbd1", + "metadata": {}, + "source": [ + "## Реализация PyTorch" + ] + }, + { + "cell_type": "markdown", + "id": "8a6ecb07", + "metadata": {}, + "source": [ + "Проверим, достигнута ли точность классификации на тестовых данных для контрольных значений параметров, сравнимая с точностью, которую выдают стандартные инструменты глубокого обучения." + ] + }, + { + "cell_type": "markdown", + "id": "906285f9", + "metadata": {}, + "source": [ + "Для этого воспользуемся PyTorch." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "4926270c", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "C:\\Users\\PC\\.conda\\envs\\deeplean\\lib\\site-packages\\tqdm\\auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", + " from .autonotebook import tqdm as notebook_tqdm\n" + ] + } + ], + "source": [ + "import torch\n", + "import torchvision\n", + "import torchvision.datasets\n", + "import torchvision.transforms\n", + "import torch.utils.data\n", + "from torch import nn\n", + "import torch.optim as optim\n", + "import os\n", + "from matplotlib import pyplot as plot" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "f42e9fe2", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "TorchNeuralNetwork(\n", + " (layer1): Linear(in_features=784, out_features=300, bias=True)\n", + " (relu): ReLU()\n", + " (layer2): Linear(in_features=300, out_features=10, bias=True)\n", + " (softmax): Softmax(dim=0)\n", + ")\n" + ] + } + ], + "source": [ + "input_size = 28*28\n", + "hidden_size = 300\n", + "output_size = 10\n", + "batch_size = 64\n", + "num_epochs = 20\n", + "l_rate = 0.1\n", + "\n", + "class TorchNeuralNetwork(nn.Module):\n", + " # Объявление конструктора\n", + " def __init__(self):\n", + " super().__init__()\n", + " self.layer1 = nn.Linear(input_size, hidden_size)\n", + " self.relu = nn.ReLU()\n", + " self.layer2 = nn.Linear(hidden_size, output_size)\n", + " # softmax с dim=0 нормирует по измерению пакета (для классов используется dim=1)\n", + " self.softmax = nn.Softmax(dim=0)\n", + " # Переопределение метода, вызываемого в процессе прямого прохода\n", + " def forward(self, x):\n", + " out = self.layer1(x)\n", + " out = self.relu(out)\n", + " out = self.layer2(out)\n", + " out = self.softmax(out)\n", + " return out\n", + "\n", + "# Создание объекта разработанного класса\n", + "nn_model = TorchNeuralNetwork()\n", + "print(nn_model)\n", + "\n", + "optimizer = optim.SGD(nn_model.parameters(), lr=l_rate)\n", + "# nn.CrossEntropyLoss ожидает на входе логиты: log-softmax применяется внутри самой функции ошибки\n", + "loss_func = nn.CrossEntropyLoss()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "b0e115d1", + "metadata": {}, + "outputs": [], + "source": [ + "def train(model, train_x, train_y, loss_func, optimizer, 
batch_size, num_epochs):\n", + " all_time = time.time()\n", + "\n", + " for epoch in range(num_epochs):\n", + " epoch_time_start = time.time()\n", + " iteration = 0\n", + " while iteration < len(train_x):\n", + " X_train_batch = torch.tensor(train_x[iteration:iteration+batch_size])\n", + " y_train_batch = torch.tensor(train_y[iteration:iteration+batch_size])\n", + " # Прямой проход\n", + " net_out = model(X_train_batch) # вычисление выхода сети\n", + " loss = loss_func(net_out, y_train_batch) # вычисление функции ошибки\n", + " # Обратный проход\n", + " optimizer.zero_grad() # обнуление всех вычисляемых градиентов\n", + " loss.backward() # вычисление градиента функции ошибки\n", + " optimizer.step() # обновление параметров модели\n", + "\n", + " iteration += batch_size\n", + " \n", + " epoch_time = time.time()\n", + " \n", + " accuracy_train = accuracy(model(torch.tensor(train_x)).detach().numpy(), train_y)\n", + " print('Epoch: {}; Time: {}, Loss: {}; Accuracy: {}'.format(epoch, epoch_time - epoch_time_start, loss, accuracy_train))\n", + " print(f\"Total train time: {(time.time()-all_time):.3f} sec\")\n", + "\n", + "def test(model, test_x, test_y):\n", + " print(\"Test ...\")\n", + " all_time = time.time()\n", + " accuracy_test = accuracy(model(torch.tensor(test_x)).detach().numpy(), test_y)\n", + " print('Accuracy test: {}'.format(accuracy_test))\n", + " print(f\"Total test time: {time.time() - all_time} sec\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "ba621bed", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Epoch: 0; Time: 2.380633592605591, Loss: 2.2841494977474213; Accuracy: 0.72665\n", + "Epoch: 1; Time: 3.4866771697998047, Loss: 2.147377446293831; Accuracy: 0.7202\n", + "Epoch: 2; Time: 3.3979172706604004, Loss: 2.0820818319916725; Accuracy: 0.7411666666666666\n", + "Epoch: 3; Time: 3.4019062519073486, Loss: 2.0629787296056747; Accuracy: 0.79005\n", + "Epoch: 4; Time: 
3.4307913780212402, Loss: 2.0555210188031197; Accuracy: 0.8026333333333333\n", + "Epoch: 5; Time: 3.3630142211914062, Loss: 2.0508678443729877; Accuracy: 0.8093\n", + "Epoch: 6; Time: 3.3919310569763184, Loss: 2.047722462564707; Accuracy: 0.8133833333333333\n", + "Epoch: 7; Time: 3.355032444000244, Loss: 2.0455083772540092; Accuracy: 0.8172833333333334\n", + "Epoch: 8; Time: 3.3819503784179688, Loss: 2.043867826461792; Accuracy: 0.8202\n", + "Epoch: 9; Time: 3.3580188751220703, Loss: 2.042600031942129; Accuracy: 0.82245\n", + "Epoch: 10; Time: 3.3789641857147217, Loss: 2.0416219867765903; Accuracy: 0.8239833333333333\n", + "Epoch: 11; Time: 3.3759748935699463, Loss: 2.0408297553658485; Accuracy: 0.82595\n", + "Epoch: 12; Time: 3.396920680999756, Loss: 2.040179491043091; Accuracy: 0.8273666666666667\n", + "Epoch: 13; Time: 3.4088833332061768, Loss: 2.039628654718399; Accuracy: 0.8284\n", + "Epoch: 14; Time: 3.442798376083374, Loss: 2.039162538945675; Accuracy: 0.82955\n", + "Epoch: 15; Time: 3.445783853530884, Loss: 2.0387645587325096; Accuracy: 0.8308\n", + "Epoch: 16; Time: 3.433814287185669, Loss: 2.0384192280471325; Accuracy: 0.8319166666666666\n", + "Epoch: 17; Time: 3.37896728515625, Loss: 2.038111340254545; Accuracy: 0.8328\n", + "Epoch: 18; Time: 3.3530333042144775, Loss: 2.0378432124853134; Accuracy: 0.8337333333333333\n", + "Epoch: 19; Time: 3.473710775375366, Loss: 2.037606045603752; Accuracy: 0.8347166666666667\n", + "Total train time: 75.773 sec\n" + ] + } + ], + "source": [ + "train(nn_model, train_x, train_y, loss_func, optimizer, 64, 20)" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "3d842352", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test ...\n", + "Accuracy test: 0.8347166666666667\n", + "Total test time: 0.3131601810455322 sec\n" + ] + } + ], + "source": [ + "test(nn_model, test_x, test_y)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": 
"a9db01e6", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.13" + }, + "vscode": { + "interpreter": { + "hash": "e22e4133a33ba01b41040227d186af91b166636dec3968d03ffaaac554692717" + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/NerobovaAS/README.md b/NerobovaAS/README.md new file mode 100644 index 0000000..7161a45 --- /dev/null +++ b/NerobovaAS/README.md @@ -0,0 +1,4 @@ +| Full name | Hidden neurons | Learning rate | Number of epochs | Time, s | Train accuracy | Test accuracy | +| --------------------------- | -------------- | ------------- | ---------------- | ------- | -------------- | ------------- | +| Неробова Анастасия Сергеевна | 300 | 0.1 | 20 | 144.532 | 0.995 | 0.98 | +