
# 🔥 Classic Deep Learning Models in JAX

💡 The latest framework test cases are under `./example/`.

Figure: results on MNIST. (a) Accuracy [96.80%] & loss vs. epochs for the MLP; (b) accuracy [98.24%] & loss vs. epochs for LeNet.

# Models Implemented in This Project

  • Linear Regression
    • Self-made Gaussian-noised samples of a function (a minimal pure-JAX training sketch follows this list).
  • Logistic Regression
    • Iris.
  • KNN
    • CIFAR-10.
  • MLP
    • MNIST.
    • CIFAR-10.
  • LeNet[1]
    • MNIST.
    • CIFAR-10.
  • LSTM
    • UCI HAR.
  • GRU[3]
    • UCI HAR.
  • Transformer[4]
    • WMT15. TODO
  • Neural ODE[5]
    • MNIST. TODO
  • VAE[7]
    • MNIST.
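
To give a flavor of the first item, here is a minimal pure-JAX sketch of linear regression on self-generated Gaussian-noised data. The data-generating function, hyperparameters, and training loop are illustrative assumptions, not the repo's actual script; it only shows how `jax.grad` and plain gradient descent fit together.

```python
import jax
import jax.numpy as jnp

# Illustrative data: y = 3x + 2 plus Gaussian noise (not the repo's exact setup).
key = jax.random.PRNGKey(0)
x_key, noise_key = jax.random.split(key)
x = jax.random.uniform(x_key, (100,), minval=-1.0, maxval=1.0)
y = 3.0 * x + 2.0 + 0.1 * jax.random.normal(noise_key, (100,))

def loss_fn(params, x, y):
    w, b = params
    pred = w * x + b
    return jnp.mean((pred - y) ** 2)  # mean squared error

params = (0.0, 0.0)  # (w, b)
lr = 0.1
for _ in range(500):
    grads = jax.grad(loss_fn)(params, x, y)           # gradients w.r.t. (w, b)
    params = tuple(p - lr * g for p, g in zip(params, grads))

print(params)  # should approach (3.0, 2.0)
```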

# Notebook Docs

Some small tests and notes written for debugging during the development of this project:

  • How to Use Mini-torch? A brief example doc. TODO
  • How to Use JAX Gradients: ideas about how I manage parameters in this framework.
  • Some JAX Tips: how to use JAX built-ins & JIT to optimize loops & matrix operations.
  • Kaiming Initialization[2] used in MLP & Conv, with the mathematical derivation (a short JAX sketch follows this list).
  • Difference between a Conv2d operation implemented with a Python loop and with jax.lax.
  • Dropout mechanism implementation, with notes on random seeds (PRNG keys) in JAX.
  • Runge-Kutta solver for the Neural ODE.
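
As a taste of the initialization notebook, below is a minimal sketch of Kaiming (He) normal initialization in plain JAX. It only illustrates the standard formula std = sqrt(2 / fan_in) for ReLU layers; the function name and shapes are illustrative and may differ from Mini-torch's actual initializer.

```python
import jax
import jax.numpy as jnp

def kaiming_normal(key, fan_in, fan_out):
    """He-normal init for a ReLU layer: W ~ N(0, 2 / fan_in).

    Illustrative sketch only; the framework's own initializer may differ.
    """
    std = jnp.sqrt(2.0 / fan_in)
    return jax.random.normal(key, (fan_in, fan_out)) * std

key = jax.random.PRNGKey(0)
w = kaiming_normal(key, fan_in=784, fan_out=256)
print(w.std())  # ≈ sqrt(2/784) ≈ 0.0505
```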

# Mini-torch

Overview of the framework:

  • nn
    • Model (base class for neural networks, like nn.Module in torch)
    • Conv
      • Conv1d, Conv2d, Conv3d
      • MaxPooling1d, MaxPooling2d, MaxPooling3d
      • BatchNorm TODO
    • RnnCell
      • Basic rnn kernel
      • LSTM kernel
      • GRU kernel
      • BiLSTM kernel
      • BiGRU kernel
      • Layer Norm TODO
    • FC
      • Dropout
      • Linear
  • Optimizer
    • Algorithms
      • Raw GD
      • Momentum
      • Nesterov(NAG)
      • AdaGrad
      • RMSProp
      • AdaDelta
      • Adam[6]
    • Mechanisms
      • LR Decay. TODO
      • Weight Decay. TODO
      • Freeze. TODO
  • Utils (a pure-JAX sketch of a few of these helpers follows this list)
    • sigmoid
    • one hot
    • softmax
    • cross_entropy_loss
    • mean_square_error
    • l1_regularization
    • l2_regularization
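
For orientation, here is a minimal sketch of how a few of the Utils helpers could look in plain jax.numpy. It follows the standard definitions of one-hot encoding, softmax, and cross-entropy; the actual Mini-torch signatures and implementations may differ.

```python
import jax.numpy as jnp

def one_hot(labels, num_classes):
    # labels: (N,) int array -> (N, num_classes) float array
    return (labels[:, None] == jnp.arange(num_classes)[None, :]).astype(jnp.float32)

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - jnp.max(logits, axis=-1, keepdims=True)
    e = jnp.exp(z)
    return e / jnp.sum(e, axis=-1, keepdims=True)

def cross_entropy_loss(logits, targets_one_hot):
    # Mean negative log-likelihood; assumes targets are already one-hot encoded.
    log_probs = jnp.log(softmax(logits) + 1e-9)
    return -jnp.mean(jnp.sum(targets_one_hot * log_probs, axis=-1))
```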

# Lines of Code

Last update: 2025.03.14.

     236 text files.
     135 unique files.                              
     138 files ignored.

github.com/AlDanial/cloc v 1.98  T=0.05 s (2810.5 files/s, 307803.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          33           1689           3297           3177
Jupyter Notebook                21              0           3947           1913
Text                             6              1              0            301
CSV                             68              0              0            203
Markdown                         5             40              0            198
TOML                             2              3              0             16
-------------------------------------------------------------------------------
SUM:                           135           1733           7244           5808
-------------------------------------------------------------------------------

# References

[1] LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., & Jackel, L. (1989). Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1(4), 541–551.
[2] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV) (pp. 1026–1034).
[3] Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the Difficulty of Training Recurrent Neural Networks. In Proceedings of the 30th International Conference on Machine Learning (ICML) (pp. 1310–1318).
[4] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. In Advances in Neural Information Processing Systems (NeurIPS).
[5] Chen, T., Rubanova, Y., Bettencourt, J., & Duvenaud, D. (2018). Neural Ordinary Differential Equations. In Advances in Neural Information Processing Systems (NeurIPS).
[6] Kingma, D. P., & Ba, J. (2015). Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations (ICLR).
[7] Kingma, D., & Welling, M. (2014). Auto-Encoding Variational Bayes. In International Conference on Learning Representations (ICLR).