Example CIFAR 10 using Deep Layer Aggregation to be used on DeepSquare

deepsquare-io/cifar-10-example

CIFAR 10 Horovod Example

This example trains a Deep Layer Aggregation (DLA) model on the CIFAR-10 dataset, using Horovod for distributed training.

Installation with Pipenv

  1. Install OpenMPI if you want to run a distributed workload locally.

  2. Install Pipenv, a dependency management tool with a locking mechanism (similar to Anaconda).

  3. Clone this repository and run:

    export HOROVOD_WITH_PYTORCH=1
    export HOROVOD_WITH_MPI=1
    export HOROVOD_WITHOUT_GLOO=1
    
    # For a GPU build, also uncomment and set:
    # export HOROVOD_CUDA_HOME=/usr/local/cuda
    # export HOROVOD_GPU=CUDA
    pipenv install

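    This command creates a virtualenv based on the Pipfile and Pipfile.lock. The exported variables are read when Horovod is built: they enable PyTorch and MPI support and skip the Gloo backend.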
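
Before running `pipenv install`, it can help to confirm that the prerequisites from steps 1 and 2 are on the PATH. The following is a minimal sketch, not part of this repository; `missing_tools` is a hypothetical helper name, and the tool names `mpirun` and `pipenv` are assumed from the steps above.

```shell
# Print the name of every given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names assumed from the installation steps above.
missing="$(missing_tools mpirun pipenv)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found."
fi
```

If `mpirun` is reported missing, a single-process run may still work, but the distributed commands in the Usage section will not.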

Usage

With Docker

Prepare the directories:

mkdir -p "$(pwd)/data"
# Download CIFAR-10 dataset
curl -fsSL https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz -o "$(pwd)/data/cifar-10-python.tar.gz"
tar -C "$(pwd)/data/" -xvzf "$(pwd)/data/cifar-10-python.tar.gz"
mkdir -p "$(pwd)/checkpoint"
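
After extraction, the archive should have unpacked into a `cifar-10-batches-py` directory holding five training batches, one test batch and a metadata file. The sketch below checks for that layout before a long training run is started. `check_cifar_dir` is a hypothetical helper, not something this repository provides, and the file list assumes the standard CIFAR-10 Python archive.

```shell
# Verify that the standard CIFAR-10 batch files exist under <data dir>/cifar-10-batches-py.
# Prints "ok: <dir>" on success; reports the first missing file on stderr and returns 1.
check_cifar_dir() {
  dir="$1/cifar-10-batches-py"
  for f in data_batch_1 data_batch_2 data_batch_3 data_batch_4 data_batch_5 \
           test_batch batches.meta; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  echo "ok: $dir"
}

# Usage (run from the repository root):
#   check_cifar_dir "$(pwd)/data"
```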

Run the model:

docker run \
  --rm \
  -v "$(pwd)/data:/data" \
  -v "$(pwd)/checkpoint:/checkpoint" \
  -u 1000:1000 \
  --entrypoint /bin/sh \
  ghcr.io/deepsquare-io/cifar-10-example:latest \
  -c '\
  mpirun \
  -np 4 \
  /.venv/bin/python3 \
  /app/main.py \
  --no-cuda \
  --horovod \
  --checkpoint_in=/checkpoint/ckpt.pth \
  --checkpoint_out=/checkpoint/ckpt.pth \
  --dataset=/data
'
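
The command above runs the container as user `1000:1000`. That matches the first regular user on many Linux systems, but if the mounted `data` and `checkpoint` directories belong to a different user, writing the checkpoint may fail. A minimal sketch of deriving the value from the current user instead; `docker_user` is a hypothetical helper, and it is an assumption (not confirmed by this repository) that the image runs correctly under an arbitrary UID.

```shell
# Print the current user's UID and GID in the "uid:gid" form that `docker run -u` accepts.
docker_user() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

docker_user
```

The result could replace the literal value in the command above, as in `-u "$(docker_user)"`.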

With Pipenv

Prepare the directories:

mkdir -p "$(pwd)/data"
# Download CIFAR-10 dataset
curl -fsSL https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz -o "$(pwd)/data/cifar-10-python.tar.gz"
tar -C "$(pwd)/data/" -xvzf "$(pwd)/data/cifar-10-python.tar.gz"
mkdir -p "$(pwd)/checkpoint"

Run the model:

pipenv shell
mpirun \
  -np 4 \
  python3 \
  main.py \
  --no-cuda \
  --horovod \
  --checkpoint_in="$(pwd)/checkpoint/ckpt.pth" \
  --checkpoint_out="$(pwd)/checkpoint/ckpt.pth" \
  --dataset="$(pwd)/data"
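
The run above starts four processes with `-np 4`. On a machine with fewer CPU cores, Open MPI may refuse to start that many processes unless oversubscription is explicitly allowed. The following sketch caps the process count at the number of online CPUs; `pick_np` is a hypothetical helper, and reading the CPU count through `getconf _NPROCESSORS_ONLN` is an assumption that holds on most Linux and macOS systems.

```shell
# Print the requested process count, capped at the number of available CPUs.
# $1: requested count; $2 (optional): available CPUs, defaults to the online CPU count.
pick_np() {
  requested="$1"
  available="${2:-$(getconf _NPROCESSORS_ONLN)}"
  if [ "$requested" -gt "$available" ]; then
    echo "$available"
  else
    echo "$requested"
  fi
}
```

It could replace the literal count in the command above, as in `-np "$(pick_np 4)"`.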
