Teaching myself basic CUDA by building GPU-accelerated, tensor-based autodiff from the ground up, inspired by Andrej Karpathy's micrograd. No dependencies other than Python's standard library and CUDA.
To compile the CUDA kernels:

```bash
nvcc -shared -o liboperations.so micrograd_cuda/operations.cu -Xcompiler -fPIC
```
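Because the only dependencies are the standard library and CUDA, the Python side can talk to the compiled library through `ctypes`. The sketch below shows what such a binding could look like; the exported symbol `matrix_multiply_on_gpu` and its signature are illustrative assumptions, not the actual API defined in micrograd_cuda/operations.cu.

```python
import ctypes

# Load the shared library produced by the nvcc command above.
lib = ctypes.CDLL("./liboperations.so")

# Hypothetical exported wrapper; the real symbol names and signatures
# live in micrograd_cuda/operations.cu and may differ.
lib.matrix_multiply_on_gpu.argtypes = [
    ctypes.POINTER(ctypes.c_float),  # A, row-major, shape (m, k)
    ctypes.POINTER(ctypes.c_float),  # B, row-major, shape (k, n)
    ctypes.POINTER(ctypes.c_float),  # C, row-major, shape (m, n), written by the call
    ctypes.c_int,  # m
    ctypes.c_int,  # k
    ctypes.c_int,  # n
]
lib.matrix_multiply_on_gpu.restype = None

# Flat ctypes buffers standing in for 2x3 and 3x2 matrices (assumes the
# wrapper handles host-to-device copies internally).
m, k, n = 2, 3, 2
a = (ctypes.c_float * (m * k))(*[1, 2, 3, 4, 5, 6])
b = (ctypes.c_float * (k * n))(*[1, 0, 0, 1, 1, 1])
c = (ctypes.c_float * (m * n))()
lib.matrix_multiply_on_gpu(a, b, c, m, k, n)
print(list(c))
```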
Example usage, training a small MLP on random data:

```python
import random
import time

from micrograd_cuda.mlp import MLP
from micrograd_cuda.tensor import Tensor

# Model
model = MLP(100, [100, 100, 1])
epochs = 20
device = "cuda"  # switch to "cpu" to compare against the CPU path

# Data: 10 samples of 100 random ±1 features, transposed into batch columns
xs_batch = Tensor([[random.choice([-1, 1]) for _ in range(100)] for _ in range(10)]).T
ys_batch = Tensor([[random.choice([-1, 1])] for _ in range(10)]).T

# Move to device
model.to(device)
xs_batch.to(device)
ys_batch.to(device)

start = time.time()

for k in range(epochs):

    # Forward pass
    ypred = model(xs_batch)
    diff = ypred - ys_batch
    loss = (diff**2).sum()

    # Backward pass
    for p in model.parameters():
        p.zero_grad()
    loss.backward()

    # Update (plain SGD with learning rate 0.1)
    for p in model.parameters():
        p.data = (-0.1 * p.grad + p).data_copy()

print(f"Elapsed: {time.time() - start:.2f} sec")

loss.to("cpu")
print(loss.data)
```
The codebase is still a work in progress with some rough spots, especially around CUDA Tensor data manipulation and copying. Roadmap:
- Micrograd extension with basic 2D tensors and naïve matrix multiplication for MLP
- Batching
- CUDA kernel for matrix multiplication (a sketch of the technique follows this list)
- Less verbose code
- Error handling
- CUDA optimizations
- Support for >2D tensors
- Rust
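For reference, the naive matrix multiplication mentioned above maps one GPU thread to one output element. The kernel below is a generic sketch of that technique, assuming row-major float matrices; it is not necessarily the exact kernel in micrograd_cuda/operations.cu.

```cuda
// Naive row-major matmul: C (m x n) = A (m x k) * B (k x n).
// Each thread computes a single element of C; no tiling, no shared memory.
__global__ void matmul_kernel(const float *a, const float *b, float *c,
                              int m, int k, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < m && col < n) {
        float sum = 0.0f;
        for (int i = 0; i < k; i++) {
            sum += a[row * k + i] * b[i * n + col];
        }
        c[row * n + col] = sum;
    }
}

// Host-side launch: a 16x16 thread block grid tiled over the output matrix.
void matmul(const float *d_a, const float *d_b, float *d_c,
            int m, int k, int n) {
    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (m + block.y - 1) / block.y);
    matmul_kernel<<<grid, block>>>(d_a, d_b, d_c, m, k, n);
}
```

The "CUDA optimizations" item is where a kernel like this would give way to tiled, shared-memory variants that cut redundant global-memory reads.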
To run the tests:

```bash
python -m pytest
```
License: MIT