DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation
This repository is the official implementation of "DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation".
[07/02/2024] Our DHR has been accepted to ECCV 2024. 🔥🔥🔥
[04/02/2024] Released initial commits.
Please cite our paper if the code is helpful to your research.
@inproceedings{jo2024dhr,
  title     = {DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation},
  author    = {Sanghyun Jo and Fei Pan and In-Jae Yu and Kyungsu Kim},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2024}
}
Weakly-supervised semantic segmentation (WSS) ensures high-quality segmentation with limited data and excels when its masks are employed as input seeds for large-scale vision models such as Segment Anything. However, WSS struggles with minor classes, which are overlooked in images containing multiple adjacent classes, a limitation that originates from the overfitting of traditional expansion methods such as Random Walk. We first address this by employing unsupervised and weakly-supervised feature maps instead of conventional methodologies, allowing for hierarchical mask enhancement. This method first distinguishes higher-level classes and subsequently separates their associated lower-level classes, ensuring that all classes, including minor ones, are correctly restored in the mask. Our approach, validated through extensive experimentation, significantly improves WSS across five benchmarks (VOC: 79.8%, COCO: 53.9%, Context: 49.0%, ADE: 32.9%, Stuff: 37.4%), reducing the gap with fully supervised methods by over 84% on the VOC validation set.
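For intuition only, below is a minimal sketch of the hierarchical idea, assuming per-pixel weakly-supervised class scores (e.g., CAMs) and unsupervised features are already available. The prototype-based intra-class step is a simplification for illustration and is not the optimal-transport-based procedure in our released code; all names and shapes are assumptions.

# Conceptual sketch only (NOT the released DHR implementation): a coarse
# inter-class assignment from weakly-supervised scores, followed by an
# intra-class re-assignment with unsupervised features so that minor classes
# are not absorbed by dominant ones.
import numpy as np

def hierarchical_rebalance(wss_scores, uss_features, image_classes):
    # wss_scores:    (C, H, W) weakly-supervised class scores (e.g., CAMs)
    # uss_features:  (D, H, W) unsupervised per-pixel features
    # image_classes: class indices present according to image-level labels
    C, H, W = wss_scores.shape
    coarse = wss_scores.argmax(axis=0).reshape(-1)              # inter-class step
    feats = uss_features.reshape(uss_features.shape[0], -1).T   # (H*W, D)
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)

    # Mean unsupervised feature of each coarse region acts as a class prototype.
    protos = np.stack([feats[coarse == c].mean(axis=0) if (coarse == c).any()
                       else np.zeros(feats.shape[1]) for c in image_classes])
    protos = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + 1e-8)

    # Intra-class step: every pixel goes to its most similar prototype,
    # restricted to the classes known to appear in the image.
    refined = np.asarray(image_classes)[(feats @ protos.T).argmax(axis=1)]
    return refined.reshape(H, W)

# Toy usage with random inputs (21 VOC classes incl. background, 8x8 map):
mask = hierarchical_rebalance(np.random.rand(21, 8, 8), np.random.rand(64, 8, 8), [0, 12, 15])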
Setting up this project involves installing dependencies and preparing the datasets. The code has been tested on Ubuntu 20.04 with NVIDIA GPUs and CUDA installed.
To install all dependencies, please run the following:
pip install -U "ray[default]"
pip install git+https://github.com/lucasb-eyer/pydensecrf.git
python3 -m pip install -r requirements.txt
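Optionally (our own sanity check, not part of the official scripts), confirm that the key dependencies import correctly:

# Quick import check after installation (our suggestion, not an official script).
import ray, torch
import pydensecrf.densecrf  # noqa: F401
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("ray", ray.__version__)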
Or, reproduce our results using Docker:
docker build -t dhr_pytorch:v1.13.1 .
docker run --gpus all -it --rm \
--shm-size 32G --volume="$(pwd):$(pwd)" --workdir="$(pwd)" \
dhr_pytorch:v1.13.1
Please download the following datasets: VOC, COCO, Context, ADE, and COCO-Stuff. Each dataset has a different directory structure, so we modify the directory structures of all datasets for a consistent implementation.
Download PASCAL VOC 2012 dataset from our [Google Drive].
Download MS COCO 2014 dataset from our [Google Drive].
Download Pascal Context dataset from our [Google Drive].
Download ADE 2016 dataset from our [Google Drive].
Download COCO-Stuff dataset from our [Google Drive].
Download [all results] and [the reproduced project] for a fair comparison with existing WSS methods.
Create a directory for each dataset (e.g., "../VOC2012/") and place the datasets so that they match the following directory structure (a small layout-check sketch follows the tree).
../                              # parent directory
├── ./                           # current (project) directory
│   ├── core/                    # (dir.) implementation of our DHR (e.g., OT)
│   ├── tools/                   # (dir.) helper functions
│   ├── experiments/             # (dir.) checkpoints and WSS masks
│   ├── README.md                # instruction for a reproduction
│   └── ... some python files ...
│
├── WSS/                         # WSS masks across all training and testing datasets
│   ├── VOC2012/
│   │   ├── RSEPM/
│   │   ├── MARS/
│   │   └── DHR/
│   ├── COCO2014/
│   │   └── DHR/
│   ├── PascalContext/
│   │   └── DHR/
│   ├── ADE2016/
│   │   └── DHR/
│   └── COCO-Stuff/
│       └── DHR/
│
├── GroundingDINO_Ferret_SAM/    # reproduced project for Grounding DINO and Ferret with SAM
│   ├── core/                    # (dir.) implementation details
│   ├── tools/                   # (dir.) helper functions
│   ├── weights/                 # (dir.) checkpoints of Grounding DINO and Ferret
│   ├── README.md                # instruction for implementing Grounding DINO and Ferret
│   └── ... some python files ...
│
├── OVSeg/                       # SAM-based outputs of Grounding DINO and Ferret for a fair comparison
│   ├── VOC2012/
│   │   ├── GroundingDINO+SAM/
│   │   └── Ferret+SAM/
│   ├── COCO2014/
│   │   ├── GroundingDINO+SAM/
│   │   └── Ferret+SAM/
│   ├── PascalContext/
│   │   ├── GroundingDINO+SAM/
│   │   └── Ferret+SAM/
│   ├── ADE2016/
│   │   ├── GroundingDINO+SAM/
│   │   └── Ferret+SAM/
│   └── COCO-Stuff/
│       ├── GroundingDINO+SAM/
│       └── Ferret+SAM/
│
├── VOC2012/                     # PASCAL VOC 2012
│   ├── train_aug/
│   │   ├── image/
│   │   ├── mask/
│   │   └── xml/
│   ├── validation/
│   │   ├── image/
│   │   ├── mask/
│   │   └── xml/
│   └── test/
│       └── image/
│
├── COCO2014/                    # MS COCO 2014
│   ├── train/
│   │   ├── image/
│   │   ├── mask/
│   │   └── xml/
│   └── validation/
│       ├── image/
│       ├── mask/
│       └── xml/
│
├── PascalContext/               # PascalContext
│   ├── train/
│   │   ├── image/
│   │   ├── mask/
│   │   └── xml/
│   └── validation/
│       ├── image/
│       ├── mask/
│       └── xml/
│
├── ADE2016/                     # ADE2016
│   ├── train/
│   │   ├── image/
│   │   ├── mask/
│   │   └── xml/
│   └── validation/
│       ├── image/
│       ├── mask/
│       └── xml/
│
└── COCO-Stuff/                  # COCO-Stuff
    ├── train/
    │   ├── image/
    │   ├── mask/
    │   └── xml/
    └── validation/
        ├── image/
        ├── mask/
        └── xml/
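Before running any experiments, you can sanity-check the layout with a small helper like the one below (our own sketch, not included in this repository; it only checks the dataset folders, not the WSS/ and OVSeg/ trees):

# verify_layout.py: confirm the datasets follow the directory structure above.
from pathlib import Path

ROOT = Path("../")  # parent directory from the tree above
EXPECTED = {
    "VOC2012": ["train_aug/image", "train_aug/mask", "train_aug/xml",
                "validation/image", "validation/mask", "validation/xml", "test/image"],
    "COCO2014": ["train/image", "train/mask", "train/xml",
                 "validation/image", "validation/mask", "validation/xml"],
    "PascalContext": ["train/image", "train/mask", "train/xml",
                      "validation/image", "validation/mask", "validation/xml"],
    "ADE2016": ["train/image", "train/mask", "train/xml",
                "validation/image", "validation/mask", "validation/xml"],
    "COCO-Stuff": ["train/image", "train/mask", "train/xml",
                   "validation/image", "validation/mask", "validation/xml"],
}

for dataset, subdirs in EXPECTED.items():
    for subdir in subdirs:
        path = ROOT / dataset / subdir
        status = "ok" if path.is_dir() else "MISSING"
        print(f"[{status:>7}] {path}")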
Please download the CAUSE weights trained from scratch on the other datasets (CAUSE weights). We follow the official CAUSE to train CAUSE from scratch on the five datasets.
Please download and prepare the WSS masks (WSS labels). You can replace the provided WSS masks with those of other WSS methods by following the directory structure above.
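As a quick consistency check (this assumes masks are stored as PNG files named after their image stems, which may differ from your setup), you can verify that the DHR masks cover the training images:

# Check that every VOC training image has a corresponding DHR mask
# (assumes masks are PNGs named after the image stems).
from pathlib import Path

images = {p.stem for p in Path("../VOC2012/train_aug/image").iterdir()}
masks = {p.stem for p in Path("../WSS/VOC2012/DHR").glob("*.png")}
print(f"{len(masks)} DHR masks found; {len(images - masks)} training images without a mask")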
Our full training code is coming soon. Below, we release our checkpoint and official VOC results (anonymous links).
| Method | Backbone   | Checkpoints  | VOC val | VOC test |
|--------|------------|--------------|---------|----------|
| DHR    | ResNet-101 | Google Drive | link    | link     |
The following commands reproduce our testing results (a sketch of the reported metrics follows the commands). Additionally, we follow the official Mask2Former to train Swin-L+Mask2Former with our DHR masks on the five datasets.
# Generate the final segmentation outputs with CRF
python3 produce_wss_masks.py --gpus 0 --cpus 64 --root ../ --data VOC2012 --domain validation \
--backbone resnet101 --decoder deeplabv3+ --tag "ResNet-101@VOC2012@DeepLabv3+@DHR" --checkpoint "last"
# Calculate the mIoU
python3 evaluate.py --fix --data VOC2012 --gt ../VOC2012/validation/mask/ \
--tag "DHR" --pred "./experiments/results/VOC2012/ResNet-101@VOC2012@DeepLabv3+@DHR@last/validation/"
# Reproduce WSS performance related to official VOC results
# DHR (Ours, DeepLabv3+) | mIoU: 79.6%, mFPR: 0.127, mFNR: 0.077
# DHR (Ours, Mask2Former) | mIoU: 81.7%, mFPR: 0.131, mFNR: 0.052
python3 evaluate.py --fix --data VOC2012 --gt ../VOC2012/validation/mask/ \
--tag "DHR (Ours, DeepLabv3+)" --pred "./submissions_DHR@DeepLabv3+/validation/results/VOC2012/Segmentation/comp5_val_cls/"
python3 evaluate.py --fix --data VOC2012 --gt ../VOC2012/validation/mask/ \
--tag "DHR (Ours, Mask2Former)" --pred "./submissions_DHR@Mask2Former/validation/results/VOC2012/Segmentation/comp5_val_cls/"