# Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency (ICLR 2024)
This is an official implementation of the paper *Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency*.
## Installation

Let's start by installing all the dependencies. Please install the dependencies listed in requirement.txt. Additionally, please install ffcv and its dependencies (following the instructions at https://ffcv.io/).
## Overview

The full implementation consists of several stages: creating a poisoned dataset, training a model on that dataset, and finally running our algorithm to identify the backdoor samples. We use FFCV (https://ffcv.io/) in our experiments for faster data loading. This requires storing each dataset in FFCV's custom .beton format and using an FFCV dataloader. These steps are elucidated below.
## Creating a poisoned dataset

The implementations of various backdoor attacks are given in the folder datasets. Each file contains a dataset class for a backdoor attack, so that we can create an indexable dataset object. Note that the triggers required by the attacks are present in data/triggers/.
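As a concrete illustration of such a class, here is a minimal, hypothetical sketch of a BadNets-style indexable poisoned dataset. The class name, the white patch trigger, and the poisoning scheme are illustrative assumptions for this sketch only; the repo's actual classes use the trigger images shipped in data/triggers/.

```python
import numpy as np

# Hypothetical sketch of an indexable backdoor dataset (not the repo's actual API).
class PoisonedDataset:
    def __init__(self, images, labels, poison_ratio=0.1, target=0, patch=3):
        self.images = images            # (N, H, W, C) uint8 array
        self.labels = labels
        # poison the first n_poison indices (a real implementation would sample randomly)
        n_poison = int(len(images) * poison_ratio)
        self.poison_idx = set(range(n_poison))
        self.target = target            # target class index of the attack
        self.patch = patch              # side length of the trigger patch

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        img, label = self.images[i], self.labels[i]
        if i in self.poison_idx:
            img = img.copy()
            img[-self.patch:, -self.patch:, :] = 255  # stamp a white corner trigger
            label = self.target                       # relabel to the target class
        return img, label

# usage: 10 dummy images, 20% poison ratio, target class 1
imgs = np.zeros((10, 32, 32, 3), dtype=np.uint8)
labs = np.arange(10) % 5
ds = PoisonedDataset(imgs, labs, poison_ratio=0.2, target=1)
x, y = ds[0]   # poisoned sample: trigger stamped, label forced to 1
```

An indexable object of this shape is what the .beton writing step below consumes.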
We use write_dataset_ffcv.py to write such a dataset as a .beton file in data. The key arguments and their usage are listed below:
- `--dataset`: `cifar10` | `tinyimagenet` | `imagenet200`
- `--poison_ratio`: the poison ratio of the backdoor attack, e.g. `0.1` or `0.05`
- `--attack`: `Badnet` | `Blend` | `LabelConsistent` | `CleanLabel` | `Trojan` | `Wanet` | `DFST` | `AdaptiveBlend`
- `--save_samples`: `True` | `False`; determines whether to save a number of samples after the backdoor attack
- `--target`: the target class index for the backdoor attack
See write.sh for recommended usage.
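A hypothetical invocation using the arguments above (the values shown are examples only; see write.sh for the authors' recommended settings):

```shell
# Example only: write a BadNets-poisoned CIFAR-10 dataset
# (10% poison ratio, target class 0) to a .beton file under data.
python write_dataset_ffcv.py \
    --dataset cifar10 \
    --poison_ratio 0.1 \
    --attack Badnet \
    --save_samples False \
    --target 0
```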
## Training a model

We use trainnew.py to train a model on a .beton dataset created in the previous step. Key arguments and their usage are as follows:
- `--dataset`: please see above
- `--arch`: `res18` | `vit_tiny`
- `--poison_ratio`: please see above
- `--attack`: please see above
- `--target`: please see above
Note that, based on the trial number of a particular setting (say x), the model is saved under Results in a folder called `Trial x`.
See train.sh for recommended usage.
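A hypothetical invocation using the arguments above (the values are examples matching the dataset-writing example; see train.sh for the authors' recommended settings):

```shell
# Example only: train a ResNet-18 on the poisoned CIFAR-10 .beton written above.
python trainnew.py \
    --dataset cifar10 \
    --arch res18 \
    --poison_ratio 0.1 \
    --attack Badnet \
    --target 0
```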
## Identifying backdoor samples

This is the core part of our contribution: identifying backdoor samples in a dataset. We use bilevel_full.py to implement our proposed algorithm. The key arguments and their usage are as follows:
- `--dataset`: please see above
- `--arch`: please see above
- `--batch_size`: the batch size used in each iteration of our algorithm
- `--poison_ratio`: please see above
- `--attack`: please see above
- `--epoch_inner`: the number of epochs for the inner-level optimization of our bilevel formulation
- `--outer_epoch`: the number of epochs for the whole bilevel (both inner and outer) optimization
- `--scales`: a list of integers specifying the scalar values used to multiply the input when calculating the SPC / MSPC loss
- `--trialno`: the trial number under which the poisoned model is stored. For example, to run a second trial of a particular setting, we first train a poisoned model, which is stored under `Trial 2` (please see the previous section), and then pass `2` for this argument when running the bilevel optimization.
- `--target`: please see above
See bilevel.sh for recommended usage.
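For intuition, the scaled prediction consistency (SPC) signal that the bilevel algorithm builds on can be sketched as follows: a sample is suspicious when the model's prediction is unchanged as the input is multiplied by the scales. The affine toy classifier and the plain consistency score below are illustrative stand-ins, not the actual model or the MSPC loss optimized in bilevel_full.py.

```python
import numpy as np

def predict(w, b, x):
    # logits of an affine toy classifier standing in for the trained network
    return int(np.argmax(w @ x + b))

def spc_score(w, b, x, scales=(2, 3, 4, 5)):
    # fraction of scales at which the prediction matches the unscaled one;
    # backdoored inputs tend to score close to 1
    base = predict(w, b, x)
    return float(np.mean([predict(w, b, n * x) == base for n in scales]))

rng = np.random.default_rng(0)
w = rng.normal(size=(5, 8))   # 5 classes, 8 input features
b = rng.normal(size=5)
x = rng.normal(size=8)
score = spc_score(w, b, x)    # a value in [0, 1]
```

Note that for a purely linear classifier (zero bias) the score is trivially 1 for every input, since positive scaling preserves the argmax of the logits; it is the nonlinearity of real networks that makes SPC discriminative.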
## Baseline defenses

Here we point to the implementations of the baseline defenses mentioned in the paper:
- SD-FCT: https://github.com/SCLBD/Effective_backdoor_defense/
- ABL : https://github.com/bboylyg/ABL
- STRIP : https://github.com/garrisongys/STRIP
## Citation

If this code base helps you, please consider citing our paper:
```bibtex
@inproceedings{
  pal2024backdoor,
  title={Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency},
  author={Soumyadeep Pal and Yuguang Yao and Ren Wang and Bingquan Shen and Sijia Liu},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=1OfAO2mes1}
}
```