This repository implements a fully connected neural network from first principles using only NumPy. The project emphasizes mathematical correctness, modular design, and interpretability, intentionally avoiding deep learning frameworks to expose the internal mechanics of neural network training.
The implementation covers forward propagation, backpropagation, optimization, and controlled experimental analysis. The main objectives are to:
- Implement a neural network without using TensorFlow or PyTorch
- Derive and code backpropagation using the chain rule
- Study training dynamics, convergence, and gradient flow
- Validate correctness through gradient checking and experiments
Network architecture:

Input (2) → Dense (64) + ReLU → Dense (32) + ReLU → Dense (1) + Sigmoid
- Loss: Binary Cross-Entropy
- Optimizer: Gradient Descent
- Initialization: He Initialization
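As a rough illustration of this setup, the forward pass and loss can be written in plain NumPy as follows. This is a minimal sketch of the layer sizes and choices above, not the repository's actual module or function names:

```python
import numpy as np

rng = np.random.default_rng(0)

def he_init(fan_in, fan_out):
    # He initialization: scale weights by sqrt(2 / fan_in) for ReLU layers
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)

# Layer sizes follow the architecture above: 2 -> 64 -> 32 -> 1
W1, b1 = he_init(2, 64), np.zeros(64)
W2, b2 = he_init(64, 32), np.zeros(32)
W3, b3 = he_init(32, 1), np.zeros(1)

relu = lambda z: np.maximum(0.0, z)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def forward(X):
    # Dense + ReLU, Dense + ReLU, Dense + Sigmoid
    a1 = relu(X @ W1 + b1)
    a2 = relu(a1 @ W2 + b2)
    return sigmoid(a2 @ W3 + b3)

def bce_loss(y_hat, y, eps=1e-12):
    # Binary cross-entropy, clipped for numerical stability
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

X = rng.standard_normal((8, 2))                      # toy batch: 8 samples, 2 features
y = rng.integers(0, 2, size=(8, 1)).astype(float)    # toy binary labels
print(bce_loss(forward(X), y))
```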
All experiments are conducted in notebooks/01_experiments.ipynb and summarized in experiments/.
- Learning rate: tested 0.001, 0.01, 0.05, and 0.1; observed a trade-off between convergence speed and stability (sweep pattern sketched below)
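The sweep itself is a simple loop over candidate rates. The sketch below uses a self-contained logistic-regression stand-in (assumed toy data, not the notebook's dataset) so it runs on its own; the same pattern applies to the full network in notebooks/01_experiments.ipynb:

```python
import numpy as np

# Toy, linearly separable data (illustration only)
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def train(lr, steps=200):
    w, b = np.zeros(2), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # sigmoid prediction
        grad_w = X.T @ (p - y) / len(y)            # BCE gradient w.r.t. weights
        grad_b = np.mean(p - y)                    # BCE gradient w.r.t. bias
        w -= lr * grad_w                           # gradient descent update
        b -= lr * grad_b
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ w + b))), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

for lr in (0.001, 0.01, 0.05, 0.1):
    print(f"lr={lr}: final loss={train(lr):.4f}")
```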
- Hidden-layer activation: compared ReLU and sigmoid; ReLU provides faster convergence and healthier gradient flow (illustrated below)
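The gradient-flow difference follows directly from the activation derivatives: the sigmoid derivative never exceeds 0.25 and vanishes for large inputs, while the ReLU derivative is exactly 1 on the active side. A small standalone illustration (helper names are assumptions, not the repository's):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 for positive inputs, 0 otherwise: the backward signal is not shrunk
    return (z > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    # Peaks at 0.25 and approaches 0 for large |z|, so deep stacks shrink gradients
    return s * (1.0 - s)

z = np.linspace(-6, 6, 7)
print(relu_grad(z))       # [0. 0. 0. 0. 1. 1. 1.]
print(sigmoid_grad(z))    # at most 0.25, near 0 at the tails
```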
- Weight initialization: compared Xavier and He initialization; He performs better with ReLU activations (formulas sketched below)
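The two schemes differ only in the variance of the randomly drawn weights. A sketch of the standard formulas (the repository's own helpers may be organized differently):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Xavier/Glorot: variance 1 / fan_in (a common variant uses 2 / (fan_in + fan_out))
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(1.0 / fan_in)

def he_init(fan_in, fan_out):
    # He: variance 2 / fan_in, compensating for ReLU zeroing roughly half the activations
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)
```

Because ReLU zeroes roughly half of the activations, the larger He variance keeps the signal scale stable across layers, which matches the observation above.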
All derivations used in this project are documented in:
docs/math_derivations.md
Covered topics:
- Forward propagation equations
- Backpropagation via chain rule
- Binary cross-entropy loss gradients
- Gradient descent update rules
- Numerical gradient checking (sketched below)
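For gradient checking, each analytic gradient is compared against a central-difference estimate. A minimal sketch of the idea on a toy loss (the function names and example loss are illustrative, not the repository's checker):

```python
import numpy as np

def numerical_grad(f, w, eps=1e-5):
    # Central-difference estimate of df/dw for every parameter
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus.flat[i] += eps
        w_minus.flat[i] -= eps
        grad.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return grad

# Toy example: loss(w) = 0.5 * ||w||^2, whose analytic gradient is w itself
w = np.array([0.3, -1.2, 2.0])
loss = lambda w: 0.5 * np.sum(w ** 2)
analytic = w
numeric = numerical_grad(loss, w)
rel_error = np.linalg.norm(analytic - numeric) / (np.linalg.norm(analytic) + np.linalg.norm(numeric))
print(rel_error)   # expected to be tiny (~1e-10 or smaller)
```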
Design principles:
- Modular architecture for clarity and extensibility
- Explicit implementation of every component
- Emphasis on correctness over abstraction
- Reproducible and debuggable design
Detailed design decisions are available in:
docs/design_notes.md
Key takeaways:
- Neural networks can be fully implemented using basic calculus and linear algebra
- Proper initialization and activation choice are critical for stable learning
- Gradient correctness must be verified, not assumed
- Framework abstractions should be built on strong fundamentals
Planned extensions:
- Softmax + categorical cross-entropy
- Momentum and Adam optimizers
- Regularization (L2, dropout)
- Deeper architectures and residual connections
Author: GiGi Koneti (Koneti Gireesh Kumar)
AI Engineer | Applied Machine Learning & Intelligent Systems
GitHub: https://github.com/GiGiKoneti