# Real-time face detector

This repository contains training scripts for a lightweight SSD-based face detector. The detector uses a MobileNetV2 backbone and a single SSD head with manually designed anchors. As a result, it has a computational complexity of 0.51 GMACs and 1.03 M parameters.

## Requirements

* Ubuntu 16.04
* Python 3.5 or 3.6 (3.6 is preferable)
* PyTorch 1.1

## Prerequisites

1. Download the mmdetection submodule: `git submodule update --init --recommend-shallow external/mmdetection`
2. Download the [WIDER Face](http://shuoyang1213.me/WIDERFACE/) dataset and unpack it to the `data` folder.
3. Annotations in the VOC format can be found in this [repo](https://github.com/sovrasov/wider-face-pascal-voc-annotations.git). Move the annotation files from the `WIDER_train_annotations` and `WIDER_val_annotations` folders to the `Annotations` folders inside the corresponding `WIDER_train` and `WIDER_val` directories. The annotation lists `val.txt` and `train.txt` should also be copied to `data/WIDERFace` from `WIDER_train_annotations` and `WIDER_val_annotations`. The resulting directory structure should look like this:

```
object_detection
├── tools
├── data
│   ├── WIDERFace
│   │   ├── WIDER_train
│   │   │   ├── 0--Parade
│   │   │   ├── ...
│   │   │   ├── Annotations
│   │   ├── WIDER_val
│   │   │   ├── 0--Parade
│   │   │   ├── ...
│   │   │   ├── Annotations
│   │   ├── val.txt
│   │   ├── train.txt
```
4. Create a virtual environment: `bash init_venv.sh`

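After the steps above, the layout can be sanity-checked with a short script like the following (a hypothetical helper, not part of this repository; the root path is an assumption based on the tree above):

```python
import os

# Paths expected by the directory tree above; the root is an assumption.
EXPECTED = [
    "WIDER_train/Annotations",
    "WIDER_val/Annotations",
    "val.txt",
    "train.txt",
]

def missing_paths(root="data/WIDERFace"):
    """Return the expected paths that do not exist under root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]

if __name__ == "__main__":
    missing = missing_paths()
    print("Layout OK" if not missing else "Missing: " + ", ".join(missing))
```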
## Training

1. Download the pre-trained MobileNetV2 weights `mobilenet_v2.pth.tar` from [https://github.com/tonylins/pytorch-mobilenet-v2](https://github.com/tonylins/pytorch-mobilenet-v2). Move the weights file to the `snapshots` folder.
2. To train the detector on a single GPU, run in a terminal:
```bash
python3 ../../external/mmdetection/tools/train.py \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py
```

## Validation

1. To dump detections, run in a terminal:
```bash
python3 ../../external/mmdetection/tools/test.py \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    --out result.pkl
```
2. Then run:
```bash
python3 ../../external/mmdetection/tools/voc_eval.py \
    result.pkl \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py
```
One should observe 0.305 AP on the validation set. For more detailed results and a comparison with vanilla SSD300, see `../../external/mmdetection/configs/wider_face/README.md`.

## Conversion to OpenVINO format

1. Convert the PyTorch model to the ONNX format by running the following in a terminal:
```bash
python3 tools/onnx_export.py \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    face_detector.onnx
```
This command produces `face_detector.onnx`.
2. Convert the ONNX model to the OpenVINO format with the Model Optimizer:
```bash
python3 <OpenVINO_INSTALL_DIR>/deployment_tools/model_optimizer/mo.py \
    --input_model face_detector.onnx \
    --scale 255 \
    --reverse_input_channels
```
This produces the model `face_detector.xml` and weights `face_detector.bin` in single-precision floating-point format (FP32). The obtained model expects a normalized image in planar BGR format.

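Since the model consumes a normalized planar BGR image, the input transform can be sketched as below (a NumPy-only illustration; the function and variable names are invented for this example). Note that `--scale 255` and `--reverse_input_channels` bake these steps into the IR itself, so this shows the equivalent manual preprocessing for the original ONNX model:

```python
import numpy as np

def to_planar_bgr_blob(frame_rgb):
    """HxWx3 RGB uint8 frame -> 1x3xHxW float32 blob in [0, 1], BGR order."""
    bgr = frame_rgb[:, :, ::-1]            # RGB -> BGR
    blob = bgr.astype(np.float32) / 255.0  # normalize to [0, 1]
    blob = blob.transpose(2, 0, 1)         # HWC -> CHW (planar)
    return blob[np.newaxis]                # add batch dimension

frame = np.zeros((300, 300, 3), dtype=np.uint8)
print(to_planar_bgr_blob(frame).shape)  # (1, 3, 300, 300)
```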
## Python demo

To run the demo, connect a webcam and execute the following command:
```bash
python3 tools/detection_live_demo.py \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py \
    <CHECKPOINT> \
    --cam_id 0
```

## Estimate theoretical computational complexity

To get per-layer computational complexity estimates, run the following command:
```bash
python3 tools/count_flops.py \
    ../../external/mmdetection/configs/wider_face/mobilenetv2_tiny_ssd300_wider_face.py
```

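For intuition about where figures like 0.51 GMACs come from, the MAC count of a single convolution layer can be computed by hand. The sketch below is a generic illustration (the layer shown is hypothetical, not taken from this detector):

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Multiply-accumulates for a 2D convolution: each output element
    costs (c_in / groups) * kernel^2 multiplications."""
    return (c_in // groups) * kernel * kernel * c_out * h_out * w_out

# Hypothetical layer: 3x3 conv, 32 -> 64 channels, 150x150 output map.
print(conv2d_macs(32, 64, 3, 150, 150) / 1e9, "GMACs")  # 0.41472 GMACs
```

Summing this quantity over every layer (plus the cheaper elementwise and pooling operations) is what `count_flops.py`-style tools do to report the total model complexity.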
## Fine-tuning

* The annotations have to be in the [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#data) format.
* The fine-tuning steps are the same as step 2 of training; just edit the initial checkpoint and data paths in the detector config.
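
mmdetection configs are plain Python files, so the edits mentioned above are ordinary assignments. A hedged sketch (the field names and paths below are illustrative and depend on the mmdetection version pinned in `external/mmdetection`):

```python
# Illustrative fragment of an mmdetection-style config; adjust to match
# the actual fields in mobilenetv2_tiny_ssd300_wider_face.py.
data_root = 'data/my_face_dataset/'   # VOC-format fine-tuning data
load_from = 'snapshots/<CHECKPOINT>'  # initial checkpoint to start from
resume_from = None
```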