The code repository for "Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning" (Accepted by IJCV) in PyTorch. If you use any content of this repo for your work, please cite the following bib entry:
@article{YeHZS2021Learning,
author = {Han-Jia Ye and
Hexiang Hu and
De-Chuan Zhan},
title = {Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning},
journal = {International Journal of Computer Vision},
volume = {129},
number = {6},
pages = {1930--1953},
year = {2021}
}
Object recognition in the real world requires handling long-tailed or even open-ended data. An ideal visual system needs to recognize the populated head visual concepts reliably and, at the same time, efficiently learn about emerging new tail categories from a few training instances. Class-balanced many-shot learning and few-shot learning each tackle one side of this problem, by either learning strong classifiers for the head classes or learning to learn few-shot classifiers for the tail. We investigate the problem of generalized few-shot learning (GFSL): during deployment, a model is required to learn about tail categories from a few shots and simultaneously classify the head classes.
We propose the ClAssifier SynThesis LEarning (Castle), a learning framework that learns how to synthesize calibrated few-shot classifiers in addition to the multi-class classifiers of head classes with a shared neural dictionary, shedding light upon the inductive GFSL. Furthermore, we propose an adaptive version of Castle (ACastle) that adapts the head classifiers conditioned on the incoming tail training examples, yielding a framework that allows effective backward knowledge transfer. As a consequence, ACastle can handle GFSL with classes from heterogeneous domains.
Experimental results on MiniImageNet with ResNet-12 backbone (same as this repo). We report average results over 10,000 randomly sampled few-shot learning episodes for stabilized evaluation. Unlike standard few-shot learning, the model is required to discern over the novel classes and all SEEN classes.
5-Way Harmonic Mean Accuracy (Classification over 64 head and 5 tail categories)
Setups | 1-Shot 5-Way | 5-Shot 5-Way | Link to Weights |
---|---|---|---|
MC | 0.00 | 0.00 | 1-Shot, 5-Shot |
ProtoNet | 19.26 | 67.73 | 1-Shot, 5-Shot |
Castle | 66.22 | 76.32 | 1-Shot, 5-Shot |
ACastle | 66.24 | 78.33 | 1-Shot, 5-Shot |
20-Way Harmonic Mean Accuracy (Classification over 64 head and 20 tail categories)
Setups | 1-Shot 20-Way | 5-Shot 20-Way | Link to Weights |
---|---|---|---|
MC | 0.00 | 0.00 | 1-Shot, 5-Shot |
ProtoNet | 17.71 | 55.51 | 1-Shot, 5-Shot |
Castle | 43.06 | 55.65 | 1-Shot, 5-Shot |
ACastle | 43.63 | 56.33 | 1-Shot, 5-Shot |
Experimental results on MiniImageNet with ResNet-12 backbone (same as this repo). We report average results over 10,000 randomly sampled few-shot learning episodes for stabilized evaluation.
Accuracy on MiniImageNet Dataset
Setups | 1-Shot 5-Way | 5-Shot 5-Way | Link to Weights |
---|---|---|---|
ProtoNet | 62.39 | 80.53 | 1-Shot, 5-Shot |
Castle | 66.75 | 81.98 | 1-Shot, 5-Shot |
ACastle | 66.83 | 82.08 | 1-Shot, 5-Shot |
The following packages are required to run the scripts:
- Package tensorboardX
- Dataset: please download the dataset and put images into the folder data/[name of the dataset, miniimagenet or cub]/images
- Pre-trained weights: please download the pre-trained weights of the encoder if needed. The pre-trained weights can be downloaded in a zip file.
The MiniImageNet dataset is a subset of the ImageNet that includes a total number of 100 classes and 600 examples per class. We follow the previous setup, and use 64 classes as SEEN categories, 16 and 20 as two sets of UNSEEN categories for model validation and evaluation, respectively. The auxiliary images for evaluating SEEN categories could be downloaded from here.
TieredImageNet is a large-scale dataset with more categories, which contains 351, 97, and 160 categories for model training, validation, and evaluation, respectively. The dataset can also be downloaded from here. We only test TieredImageNet with ResNet backbone in our work. The auxiliary images for evaluating SEEN categories could be downloaded from here.
To reproduce our experiments with FEAT, please use train_gfsl.py. eval_gfsl.py can evaluate a GFSL model directly. There are four parts in the code.
- `model`: It contains the main files of the code, including the few-shot learning trainer, evaluator, the dataloader, the network architectures, baselines, and comparison models.
- `data`: Images and splits for the data sets.
- `saves`: The pre-trained weights of different networks.
- `checkpoints`: To save the trained models.
The train_gfsl.py and eval_gfsl.py scripts take similar command line options (details are in `model/utils.py`):
Task Related Arguments
- `dataset`: Option for the dataset (`MiniImageNet`, `TieredImageNet`, or `CUB`), default to `MiniImageNet`
- `way`: The number of classes in a few-shot task during meta-training, default to `5`
- `eval_way`: The number of classes in a few-shot task during meta-test, default to `5`
- `shot`: Number of instances in each class in a few-shot task during meta-training, default to `1`
- `eval_shot`: Number of instances in each class in a few-shot task during meta-test, default to `1`
- `query`: Number of instances in each class to evaluate the performance during meta-training, default to `15`
- `eval_query`: Number of instances in each class to evaluate the performance during meta-test, default to `15`
- `test_mode`: Evaluate the model with `FSL` or `GFSL` during the meta-training, default to `FSL`
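To make the `way`, `shot`, and `query` options concrete, the following minimal sketch samples one few-shot episode from a toy label table. It is not the sampler used in this repo: the function `sample_episode` and the toy data are hypothetical and only illustrate how the three numbers determine the size of an episode.

```python
import random

def sample_episode(class_to_items, way, shot, query, rng):
    """Pick `way` classes, then `shot` support and `query` query items per class."""
    classes = rng.sample(sorted(class_to_items), way)
    support, queries = [], []
    for label in classes:
        # Draw support and query items together so they never overlap.
        items = rng.sample(class_to_items[label], shot + query)
        support += [(item, label) for item in items[:shot]]
        queries += [(item, label) for item in items[shot:]]
    return support, queries

# Toy data: 20 classes with 600 item ids each (hypothetical stand-in for images).
toy = {c: [f"img_{c}_{i}" for i in range(600)] for c in range(20)}
support, queries = sample_episode(toy, way=5, shot=1, query=15, rng=random.Random(0))
print(len(support), len(queries))  # 5 support items and 75 query items
```

With the default values (`way=5`, `shot=1`, `query=15`), an episode therefore contains 5 support instances and 75 query instances.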
Optimization Related Arguments
- `sample_class`: The number of classes we sample in each episode which act as pseudo UNSEEN classes, default to `8`
- `num_tasks`: The number of partitions we resplit from `sample_class` in each episode, default to `256`. It means we resample 256 5-way tasks based on the extracted features of the previous 8 classes, which constructs 256 GFSL tasks in each episode efficiently.
- `batch_size`: The batch size when sampling the instances during meta-training, default to `128`
- `max_epoch`: The maximum number of training epochs, default to `200`
- `episodes_per_epoch`: The number of tasks sampled in each epoch, default to `100`
- `num_eval_episodes`: The number of tasks sampled from the meta-val set to evaluate the performance of the model (note that we fix sampling 10,000 tasks from the meta-test set during final evaluation), default to `200`
- `lr`: Learning rate for the model, default to `0.0001` with pre-trained weights
- `lr_mul`: This is specially designed for set-to-set functions like FEAT. The learning rate for the top layer will be multiplied by this value (usually giving it a faster learning rate). Default to `10`
- `lr_scheduler`: The scheduler to set the learning rate (`step`, `multistep`, or `cosine`), default to `step`
- `step_size`: The step size used by the scheduler to decrease the learning rate. Provide a single value when choosing the `step` scheduler and multiple values when choosing the `multistep` scheduler. Default to `20`
- `gamma`: Learning rate ratio for the `step` or `multistep` scheduler, default to `0.2`
- `augment`: Whether to do data augmentation or not during meta-training, default to `False`
- `mom`: The momentum value for the SGD optimizer, default to `0.9`
- `weight_decay`: The weight_decay value for the SGD optimizer, default to `0.0005`
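The interaction between `sample_class` and `num_tasks` can be sketched as follows: the sampled classes are processed once, and many tasks are then formed by re-drawing class subsets from them. This is a minimal illustration that works on class indices only. `resplit_tasks` is a hypothetical helper, not a function from this repo, and the repo's actual partitioning may differ.

```python
import random

def resplit_tasks(sample_class, way, num_tasks, rng):
    """Draw `num_tasks` class subsets of size `way` from `sample_class` classes."""
    pool = list(range(sample_class))
    # Each task reuses the same pool, so features need to be extracted only once.
    return [sorted(rng.sample(pool, way)) for _ in range(num_tasks)]

tasks = resplit_tasks(sample_class=8, way=5, num_tasks=256, rng=random.Random(0))
print(len(tasks))  # 256 tasks built from the same 8 classes
```

In this sketch, every task is a 5-class subset of the same 8 classes, which is what makes it cheap to construct many tasks per episode.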
Model Related Arguments
- `model_class`: The model to use during meta-learning. We provide our Castle variants (`Castle` and `ACastle`). Default to `Castle`. We also provide the choice for `CLS` and `ProtoNet` when evaluating the model.
- `backbone_class`: Types of the encoder, i.e., the convolution network (`ConvNet`), ResNet-12 (`Res12`), or Wide ResNet (`WRN`), default to `ConvNet`
- `head`: The number of heads of the neural dictionary module, default to `1`
- `dp_rate`: The dropout rate of the neural dictionary module, default to `0.1`
- `temperature`: Temperature over the logits; we divide the logits by this value. It is useful when meta-learning with pre-trained weights. Default to `1`
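The effect of `temperature` can be illustrated with a small sketch. It assumes the scaled logits are fed to a softmax, as is usual with a cross-entropy loss. The function name `softmax_with_temperature` is hypothetical and the code is independent of this repo's implementation.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by `temperature`, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=1.0)
smooth = softmax_with_temperature(logits, temperature=16.0)
print(max(sharp), max(smooth))  # the larger temperature gives a flatter distribution
```

A temperature above 1 flattens the predicted distribution, while a value below 1 sharpens it. The ranking of the classes is unchanged in both cases.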
Model Evaluation Criteria
`criteria`: We implement various criteria to evaluate GFSL performance during model evaluation, i.e., mean accuracy (`Acc`), harmonic mean accuracy (`HMeanAcc`), harmonic mean MAP (`HMeanMAP`), Delta Value (`Delta`), and Area Under the Seen-Unseen Curve (`AUSUC`). Please provide a list of criteria to measure during the evaluation. Default to `Acc, HMeanAcc, HMeanMAP, Delta, AUSUC`. Note that warnings may appear when evaluating HMeanMAP; in that case, please change `recall = tps / tps[-1]` in sklearn/metrics/ranking.py to `recall = np.ones(tps.size) if tps[-1] == 0 else tps / tps[-1]`.
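As a reading aid for `HMeanAcc`, here is a minimal sketch of a harmonic mean over two accuracies. It assumes the criterion combines the accuracy on SEEN-class queries with the accuracy on UNSEEN-class queries, which is the common definition in generalized few-shot learning. The function name is hypothetical and the repo's implementation may differ in details.

```python
def harmonic_mean_accuracy(seen_acc, unseen_acc):
    """Harmonic mean of two accuracies; defined as 0 when either one is 0."""
    if seen_acc <= 0 or unseen_acc <= 0:
        return 0.0
    return 2 * seen_acc * unseen_acc / (seen_acc + unseen_acc)

print(harmonic_mean_accuracy(90.0, 0.0))   # 0.0
print(harmonic_mean_accuracy(80.0, 60.0))  # lies between 60 and 80, closer to 60
```

Unlike the arithmetic mean, the harmonic mean is dominated by the weaker of the two values, so a model that ignores one of the two groups scores 0 regardless of its accuracy on the other.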
Other Arguments
- `orig_imsize`: Whether to resize the images before loading the data into the memory. `-1` means we do not resize the images and do not read all images into the memory. Default to `-1`
- `multi_gpu`: Whether to use multiple gpus during meta-training, default to `False`
- `gpu`: The index of GPU to use. Please provide multiple indexes if you choose `multi_gpu`. Default to `0`
- `log_interval`: How often to log the meta-training information, default to every `50` tasks
- `eval_interval`: How often to validate the model over the meta-val set, default to every `1` epoch
- `save_dir`: The path to save the learned models, default to `./checkpoints`
Running the command without arguments will train the models with the default hyper-parameter values. Loss changes will be recorded as a tensorboard file.
To train the 1-shot/5-shot 5-way ACastle model with ResNet-12 backbone on MiniImageNet:
$ python train_gfsl.py --max_epoch 50 --batch_size 128 --model_class ACastle --backbone_class Res12 --dataset MiniImageNet --way 5 --eval_way 5 --shot 1 --eval_shot 1 --query 15 --eval_query 15 --temperature 1 --lr 0.0002 --lr_mul 1 --lr_scheduler step --step_size 10 --gamma 0.1 --gpu 0 --init_weights ./saves/initialization/miniimagenet/Res12-pre.pth --dp_rate 0.5 --test_mode GFSL --num_tasks 256 --sample_class 16
$ python train_gfsl.py --max_epoch 50 --batch_size 128 --model_class ACastle --backbone_class Res12 --dataset MiniImageNet --way 5 --eval_way 5 --shot 5 --eval_shot 5 --query 15 --eval_query 15 --temperature 1 --lr 0.0002 --lr_mul 10 --lr_scheduler step --step_size 10 --gamma 0.5 --gpu 0 --init_weights ./saves/initialization/miniimagenet/Res12-pre.pth --dp_rate 0.5 --test_mode GFSL --num_tasks 256 --sample_class 16
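The learning-rate options in the first command above (`--lr 0.0002 --lr_scheduler step --step_size 10 --gamma 0.1`) can be read through the following sketch. It assumes the `step` scheduler follows the usual step-decay rule, as in PyTorch's `torch.optim.lr_scheduler.StepLR`, where the learning rate is multiplied by `gamma` every `step_size` epochs. The helper `step_lr` is hypothetical and not part of this repo.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate in effect at `epoch` under step decay."""
    return base_lr * gamma ** (epoch // step_size)

# Values taken from the 1-shot training command: lr=0.0002, gamma=0.1, step_size=10.
for epoch in (0, 10, 20, 49):
    print(epoch, step_lr(base_lr=0.0002, gamma=0.1, step_size=10, epoch=epoch))
```

Under this assumption, the learning rate drops by a factor of 10 at epochs 10, 20, 30, and 40 over the 50 training epochs.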
To evaluate the GFSL performance of the 1-shot/5-shot 20-way models with ResNet-12 backbone on MiniImageNet (change `eval_way` to `5` when evaluating the 5-way GFSL performance):
$ python eval_gfsl.py --dataset MiniImageNet --eval_way 20 --eval_shot 1 --model_path './saves/initialization/miniimagenet/Res12-pre.pth' --model_class CLS --gpu 0 --num_workers 4 --num_eval_episodes 10000
$ python eval_gfsl.py --dataset MiniImageNet --eval_way 20 --eval_shot 1 --model_path './saves/learned/miniimagenet/protonet-1-shot.pth' --model_class ProtoNet --gpu 0 --num_workers 4 --num_eval_episodes 10000
$ python eval_gfsl.py --dataset MiniImageNet --eval_way 20 --eval_shot 1 --model_path './saves/learned/miniimagenet/castle-1-shot-GFSL.pth' --model_class Castle --gpu 0 --num_workers 4 --num_eval_episodes 10000
$ python eval_gfsl.py --dataset MiniImageNet --eval_way 20 --eval_shot 1 --model_path './saves/learned/miniimagenet/acastle-1-shot-GFSL.pth' --model_class ACastle --gpu 0 --num_workers 4 --num_eval_episodes 10000
We thank the following repos for providing helpful components/functions in our work.