Gluon CV Toolkit 0.3.0

Pre-release
@zhreshold zhreshold released this 16 Oct 22:08
· 545 commits to master since this release
b55232d

0.3 Release Note

Highlights

Added 5 new algorithms and updated 38 pre-trained models with improved accuracy

Comparison of 7 selected models

| Model | Metric | 0.2 | 0.3 | Reference |
|---|---|---|---|---|
| ResNet-50 | top-1 acc on ImageNet | 77.07% | 79.15% | 75.3% (Caffe impl) |
| ResNet-101 | top-1 acc on ImageNet | 78.81% | 80.51% | 76.4% (Caffe impl) |
| MobileNet 1.0 | top-1 acc on ImageNet | N/A | 73.28% | 70.9% (TensorFlow impl) |
| Faster-RCNN | mAP on COCO | N/A | 40.1% | 39.6% (Detectron) |
| Yolo-v3 | mAP on COCO | N/A | 37.0% | 33.0% (paper) |
| DeepLab-v3 | mIoU on VOC | N/A | 86.7% | 85.7% (paper) |
| Mask-RCNN | mask AP on COCO | N/A | 33.1% | 32.8% (Detectron) |
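
To make the comparison easier to read, the gap between each 0.3 result and its reference number can be computed directly from the table above. The snippet below is a minimal sketch that simply hard-codes the table values; the names `RESULTS` and `gains_over_reference` are illustrative and not part of GluonCV.

```python
# Values copied from the comparison table:
# (model, 0.3 result, reference result), all in percent.
RESULTS = [
    ("ResNet-50", 79.15, 75.3),
    ("ResNet-101", 80.51, 76.4),
    ("MobileNet 1.0", 73.28, 70.9),
    ("Faster-RCNN", 40.1, 39.6),
    ("Yolo-v3", 37.0, 33.0),
    ("DeepLab-v3", 86.7, 85.7),
    ("Mask-RCNN", 33.1, 32.8),
]


def gains_over_reference(results):
    """Return {model: gain over the reference in percentage points}."""
    return {name: round(ours - ref, 2) for name, ours, ref in results}


if __name__ == "__main__":
    for name, gain in gains_over_reference(RESULTS).items():
        print(f"{name:<14} +{gain:.2f} points")
```

Note that the metrics differ per row (top-1 accuracy, mAP, mIoU, mask AP), so the gains are only comparable within a row, not across rows.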

Interactive visualizations for pre-trained models

Available for image classification and for object detection.

Deploy without Python

All models are hybridizable, so they can be deployed without Python. See the tutorials for deploying these models in C++.
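
The Python side of that workflow might look like the sketch below, which is not taken from the tutorials. It assumes `mxnet >= 1.3.0` and `gluoncv` are installed and uses the standard Gluon calls `hybridize()` and `export()`. The model name `resnet50_v1b`, the file prefix, the 224x224 input shape, and the helper names `exported_files` and `export_for_deployment` are choices made for illustration.

```python
def exported_files(prefix, epoch=0):
    """File names that HybridBlock.export(prefix, epoch) writes to disk."""
    return [f"{prefix}-symbol.json", f"{prefix}-{epoch:04d}.params"]


def export_for_deployment(model_name="resnet50_v1b", prefix="resnet50_v1b"):
    """Hybridize a pre-trained model and export it as symbol + params files."""
    # Imported lazily so exported_files() works without MXNet installed.
    import mxnet as mx
    from gluoncv import model_zoo

    net = model_zoo.get_model(model_name, pretrained=True)
    net.hybridize()
    # One forward pass builds the cached graph that export() serializes.
    # The input shape here is an assumption for an ImageNet classifier.
    net(mx.nd.zeros((1, 3, 224, 224)))
    net.export(prefix)
    return exported_files(prefix)
```

The two exported files (a JSON graph definition and a binary parameter file) are what a non-Python front end such as the C++ API loads.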

New Models with Training Scripts

DenseNet, DarkNet, SqueezeNet for image classification

We now provide a broader range of model families suited to out-of-the-box use and to various research purposes.

YoloV3 for object detection

Significantly more accurate than the original paper. For example, we get 37.0% mAP on COCO versus the original paper's 33.0%. The techniques we used will be described in a paper to be released later.

Mask-RCNN for instance segmentation

Accuracy now matches Caffe2 Detectron without FPN, e.g. 38.3% box AP and 33.1% mask AP on COCO with ResNet50.

FPN support will come in future versions.

DeepLabV3 for semantic segmentation

Slightly more accurate than the original paper. For example, we get 86.7% mIoU on VOC versus the original paper's 85.7%.

WGAN

Reproduced WGAN with ResNet

Person Re-identification

Provides a baseline model that achieves a best rank-1 score of 93.1 on the Market1501 dataset.

Enhanced Models with Better Accuracy

Faster R-CNN

  • Improved Pascal VOC model accuracy: mAP rises to 78.3% from 77.9% in the previous version. VOC models with 80%+ mAP will be released with the tech paper.
  • Added models trained on the COCO dataset.
    • The ResNet50 model now achieves 37.0 mAP, outperforming Caffe2 Detectron without FPN (36.5 mAP).
    • The ResNet101 model achieves 40.1 mAP, outperforming Caffe2 Detectron with FPN (39.8 mAP).
  • FPN support will come in future versions.

ResNet, MobileNet, DarkNet, Inception for image classification

  • Significantly improved accuracy for some models. For example, ResNet50_v1b now reaches 78.3%, up from 77.07% in the previous version.
  • Added models trained with mixup and distillation. For example, ResNet50_v1d has 3 versions: ResNet50_v1d_distill (78.67%), ResNet50_v1d_mixup (79.16%), ResNet50_v1d_mixup_distill (79.29%).
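
For readers unfamiliar with mixup, the core idea is to train on a weighted blend of two examples and their labels. The following is a minimal pure-Python sketch of that blending step only; it is not the GluonCV training script, and the function names and the `alpha=0.2` default are assumptions chosen for illustration.

```python
import random


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing weight from a Beta(alpha, alpha) distribution."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x1, y1, x2, y2, lam):
    """Blend two examples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


if __name__ == "__main__":
    lam = sample_lambda()
    # Two toy "images" (flattened) with one-hot labels for classes 0 and 1.
    x, y = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam)
    print(lam, x, y)
```

Because the labels are blended with the same weight as the inputs, the mixed label still sums to 1 and can be used with a soft-label cross-entropy loss.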

Semantic Segmentation

  • Synchronized Batch Normalization training.
  • Added Cityscapes dataset and pretrained models.
  • Added training details for reproducing state-of-the-art results on Pascal VOC, and provided COCO pre-trained models for VOC.

Dependency

GluonCV 0.3.0 now depends on incubator-mxnet >= 1.3.0. Please update mxnet according to the installation guide to avoid compatibility issues.
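
One way to guard against an outdated installation is to compare the installed version string against the required minimum before importing GluonCV. The helper below is a small sketch using only the standard library; `version_tuple` and `meets_requirement` are hypothetical names, and treating a pre-release suffix such as `1.3.0b20180901` as `1.3.0` is a simplifying assumption.

```python
def version_tuple(version):
    """Parse the leading numeric parts of a version string.

    '1.3.0' -> (1, 3, 0); trailing non-digit suffixes are ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_requirement(installed, required="1.3.0"):
    """Return True if the installed version is at least the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.3' compares equal to '1.3.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

In practice this would be called as `meets_requirement(mxnet.__version__)` after `import mxnet`.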