$coco/
|–– images/
|–––– val2017/
|–––––– 000000134722.jpg
|–––––– 000000177015.jpg
|–––––– ...
|–– annotations/
|–––– captions_val2017.json
Step 1. Download the validation images from COCO 2017 Val Images and unzip them to coco/images/val2017.
Step 2. Download the 2017 Val annotations and place the captions file at coco/annotations/captions_val2017.json.
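As a quick sanity check of the layout above, the captions file can be loaded and grouped per image. This is a minimal sketch that assumes the standard COCO captions schema (top-level images and annotations lists, with image_id and caption on each annotation); the function name load_coco_captions and its coco_root argument are illustrative and not part of this repository.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_coco_captions(coco_root):
    """Return {image_path: [captions]} for the val2017 layout shown above.

    Hypothetical helper: assumes the standard COCO captions JSON schema.
    """
    coco_root = Path(coco_root)
    ann_file = coco_root / "annotations" / "captions_val2017.json"
    with open(ann_file, "r", encoding="utf-8") as f:
        data = json.load(f)

    # Map image ids to file names, then collect every caption per image.
    id_to_file = {img["id"]: img["file_name"] for img in data["images"]}
    captions = defaultdict(list)
    for ann in data["annotations"]:
        image_path = coco_root / "images" / "val2017" / id_to_file[ann["image_id"]]
        captions[str(image_path)].append(ann["caption"].strip())
    return dict(captions)
```

Calling load_coco_captions("coco") from the directory that contains coco/ should return one entry per annotated validation image.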
$flickr30k-images/
|–– 2217728745.jpg
|–– ...
|–– flickr30k_val.json
|–– flickr30k_test.json
Step 1. Download the Flickr30k dataset and unzip it under flickr30k-images/. All image and annotation files should then be structured as above.
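The layout above can be verified with a small script before running evaluation. The sketch below only checks that the two annotation files named in the tree exist and counts the .jpg files next to them; check_flickr30k_layout is a hypothetical helper, and it makes no assumption about the contents of the JSON files.

```python
from pathlib import Path

# Annotation files named in the directory tree above.
EXPECTED_ANNOTATIONS = ("flickr30k_val.json", "flickr30k_test.json")


def check_flickr30k_layout(root):
    """Report how many images are present and which annotation files are missing."""
    root = Path(root)
    missing = [name for name in EXPECTED_ANNOTATIONS if not (root / name).is_file()]
    num_images = sum(1 for _ in root.glob("*.jpg"))
    return {"num_images": num_images, "missing_annotations": missing}
```

An empty missing_annotations list and a non-zero num_images suggest the archive was unpacked in the expected place.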
$imagenet/
|–– data/
|–––– val_images/
|–––––– n01440764/
|–––––––– ILSVRC2012_val_00000293_n01440764.JPEG
|–––––––– ILSVRC2012_val_00017699_n01440764.JPEG
|–––––––– ...
|–––––– n01871265/
|–––––––– ILSVRC2012_val_00000067_n01871265.JPEG
|–––––––– ILSVRC2012_val_00017361_n01871265.JPEG
|–––––––– ...
Step 1. Download the validation data val_images.tar.gz from ILSVRC/imagenet-1k and extract it to imagenet/data/val_images.
You can download imagenet-1k/data/val_images.tar.gz manually or use this command:
huggingface-cli download ILSVRC/imagenet-1k --repo-type dataset --local-dir /directory/to/your/dataset/
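Once the archive is on disk, extraction can also be done from Python. This is a minimal sketch using the standard-library tarfile module; extract_val_images is a hypothetical helper, and whether the archive unpacks images directly or into a subfolder is not stated here, so inspect the result before continuing.

```python
import tarfile
from pathlib import Path


def extract_val_images(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir and list its top-level entries."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Use the safer "data" extraction filter when this Python version supports it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(target_dir, filter="data")
        else:
            tar.extractall(target_dir)
    return sorted(p.name for p in target_dir.iterdir())
```

For example, extract_val_images("/directory/to/your/dataset/data/val_images.tar.gz", "imagenet/data/val_images"), with the first path adjusted to wherever the download actually landed.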
Step 2. Set source_dir in imagenet_organize.py to the location of your val_images folder. Then run imagenet_organize.py to organize the images into the format above.
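For readers who want to see what this reorganization amounts to, here is a minimal sketch. It is not the repository's imagenet_organize.py; it assumes the extracted files sit flat in source_dir and that each file name ends with its WordNet ID, as in ILSVRC2012_val_00000293_n01440764.JPEG. The function name organize_by_synset is hypothetical.

```python
import shutil
from pathlib import Path


def organize_by_synset(source_dir):
    """Move each flat validation image into a subfolder named after its WordNet ID."""
    source_dir = Path(source_dir)
    moved = 0
    # sorted() materializes the file list before any file is moved.
    for image in sorted(source_dir.glob("*.JPEG")):
        # The WordNet ID is the last underscore-separated token of the file stem.
        wnid = image.stem.rsplit("_", 1)[-1]
        if not wnid.startswith("n"):
            continue  # skip files that do not follow the assumed naming scheme
        class_dir = source_dir / wnid
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(image), str(class_dir / image.name))
        moved += 1
    return moved
```

The return value is the number of files moved, which gives a quick check that nothing was skipped.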
Other classification datasets include ["food101", "cifar10", "cifar100", "sun397", "stanford_car", "aircraft", "dtd", "pets", "caltech101", "flowers"].
Please set an appropriate dataset_root in src/dataloaders/utils.py as the location for storing the classification datasets. torchvision.datasets will then automatically download the datasets into dataset_root during inference.
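The dataset names listed above appear to correspond to classes in torchvision.datasets. The mapping below is an assumption made for illustration, not taken from src/dataloaders/utils.py, and resolve_dataset_class is a hypothetical helper; the repository's own loader may map names differently.

```python
import importlib

# Assumed mapping from the dataset names above to torchvision.datasets class names.
TORCHVISION_CLASS_NAMES = {
    "food101": "Food101",
    "cifar10": "CIFAR10",
    "cifar100": "CIFAR100",
    "sun397": "SUN397",
    "stanford_car": "StanfordCars",
    "aircraft": "FGVCAircraft",
    "dtd": "DTD",
    "pets": "OxfordIIITPet",
    "caltech101": "Caltech101",
    "flowers": "Flowers102",
}


def resolve_dataset_class(name):
    """Look up the torchvision class for a dataset name (imports torchvision lazily)."""
    datasets = importlib.import_module("torchvision.datasets")
    return getattr(datasets, TORCHVISION_CLASS_NAMES[name])
```

Constructor arguments differ between these classes (for example, CIFAR10 takes train=False while Food101 takes split="test"), so check the torchvision documentation for each before instantiating one with root=dataset_root.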
We followed the evaluation scheme and config files provided by SCLIP as shown here.
Our segmentation configs include benchmarks with background ['cfg_voc21.py', 'cfg_context60.py', 'cfg_coco_object.py'] and without background ['cfg_voc20.py', 'cfg_city_scapes.py', 'cfg_context59.py', 'cfg_ade20k.py', 'cfg_coco_stuff164k.py'].
Please follow the dataset preparation instructions provided by SCLIP and mmsegmentation to download the following datasets: ["VOCdevkit/VOC2012", "VOCdevkit/VOC2010", "coco_stuff164k", "cityscapes", "ade"].
Then, change data_root in each segmentation config to match the dataset location. For example, this is the data_root for cfg_ade20k.py.
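Editing data_root by hand in eight config files is error-prone, so the change can be scripted. The sketch below assumes each config contains a top-level assignment of the form data_root = '...' on its own line, which is the usual mmsegmentation convention but is not confirmed here; set_data_root is a hypothetical helper, and the example path is a placeholder.

```python
import re
from pathlib import Path

# Matches a top-level "data_root = ..." assignment on a single line.
_DATA_ROOT_LINE = re.compile(r"^data_root\s*=.*$", flags=re.MULTILINE)


def set_data_root(config_path, new_root):
    """Rewrite the first top-level data_root assignment in a config file."""
    config_path = Path(config_path)
    text = config_path.read_text(encoding="utf-8")
    # A function replacement keeps backslashes in new_root from being read as escapes.
    new_text, count = _DATA_ROOT_LINE.subn(
        lambda m: f"data_root = {new_root!r}", text, count=1
    )
    if count == 0:
        raise ValueError(f"no top-level data_root assignment found in {config_path}")
    config_path.write_text(new_text, encoding="utf-8")
    return new_text
```

For instance, set_data_root("cfg_ade20k.py", "/path/to/ade") would point that config at a local copy of the dataset; the exact subfolder each config expects is defined by the SCLIP and mmsegmentation preparation instructions.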