Caffe on YARN is a project that runs Caffe on YARN. It is based on Yahoo's CaffeOnSpark, rebased onto YARN by removing the Spark dependency. It is part of Deep Learning on Hadoop (HDL).
Note that the current project is a prototype with limitations and is still under development.
Figure 1. CaffeOnYARN Architecture
Figure 1 describes the system architecture of CaffeOnYARN. As in CaffeOnSpark, we launch Caffe engines on CPU devices inside YARN containers. Like CaffeOnSpark, CaffeOnYARN containers communicate with each other through an MPI allreduce-style interface over TCP/Ethernet or RDMA/InfiniBand.
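To make the allreduce-style exchange concrete, here is a toy, single-process illustration of the semantics only: every worker contributes a vector (for example, gradients), and every worker ends up with the same element-wise sum. This is not how CaffeOnYARN implements it, and no MPI, TCP, or RDMA is involved; `allreduce_sum` is a hypothetical helper name.

```shell
#!/bin/sh
# Toy illustration of allreduce (sum) semantics, not CaffeOnYARN code.
# Each argument is one worker's vector as a comma-separated list.
# Prints the element-wise sum that every worker would receive.
allreduce_sum() {
    printf '%s\n' "$@" | awk -F, '
        {
            # Accumulate each column across all workers.
            for (i = 1; i <= NF; i++) sum[i] += $i
            if (NF > n) n = NF
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s%s", sum[i], (i < n ? "," : "\n")
        }'
}
```

For example, `allreduce_sum "1,2,3" "4,5,6" "10,20,30"` combines three workers' vectors into one shared result.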
- Git clone ..
- Set environment variables
    export CAFFE_ON_YARN=$(pwd)/CaffeOnYARN
    export LD_LIBRARY_PATH=${CAFFE_ON_YARN}/caffe/lib
- Compile CaffeOnYARN
    cd <path_to_caffe>
    mvn clean install
- Run your Caffe application
    cd bin
    ydl-cf -jar <path_to_caffe-with-dependency_jar> \
        -conf <your_solver_prototxt> \
        -model <model_output_hdfs_path> \
        -num <container_num>
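The steps above can be wrapped in a small preflight script. The following is a minimal sketch, not something shipped with CaffeOnYARN: `check_caffe_env` and `build_ydl_cf_cmd` are hypothetical helper names, and the only `ydl-cf` options used are the four shown above. The second helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Preflight sketch for the steps above. Both helpers are hypothetical
# and are not part of CaffeOnYARN.

# Returns 0 when CAFFE_ON_YARN is set and its caffe/lib directory is
# listed in LD_LIBRARY_PATH; otherwise prints the problem to stderr.
check_caffe_env() {
    if [ -z "${CAFFE_ON_YARN:-}" ]; then
        echo "CAFFE_ON_YARN is not set" >&2
        return 1
    fi
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${CAFFE_ON_YARN}/caffe/lib:"*) return 0 ;;
    esac
    echo "LD_LIBRARY_PATH does not contain ${CAFFE_ON_YARN}/caffe/lib" >&2
    return 2
}

# Prints (does not run) a ydl-cf command line built from four values:
# jar path, solver prototxt, model output HDFS path, container count.
build_ydl_cf_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: build_ydl_cf_cmd <jar> <solver_prototxt> <model_hdfs_path> <container_num>" >&2
        return 1
    fi
    # Reject an empty or non-numeric container count.
    case "$4" in
        ''|*[!0-9]*)
            echo "container_num must be a positive integer: $4" >&2
            return 2 ;;
    esac
    if [ "$4" -le 0 ]; then
        echo "container_num must be a positive integer: $4" >&2
        return 2
    fi
    printf 'ydl-cf -jar %s -conf %s -model %s -num %s\n' "$1" "$2" "$3" "$4"
}
```

A typical use would be `check_caffe_env && build_ydl_cf_cmd <jar> <solver> <model_path> <num>`, followed by running the printed command from the `bin` directory. The printed line is meant for review and does not quote arguments, so paths containing spaces would need manual quoting.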