Replies: 2 comments 1 reply
-
@starmemda I've moved this to a discussion since it's not an issue. You might want to look closer at the original Swin repository: https://github.com/microsoft/Swin-Transformer/blob/main/config.py . Their training code derives from a number of components here, so everything should be reproducible with this train script, but your hparams differ from their defaults. Some of their values look related to the DeiT paper/impl, so you might look at that too. A few differences: they don't use dropout (drop_rate), and they use a drop_path (drop_connect) rate that varies per model. Their learning rates are normalized per 512 batch size, so you should probably look at their train script. Also, you don't need to specify decay epochs or a decay rate with a cosine schedule.
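For reference, a rough sketch of what a timm train command following those suggestions might look like. The dataset path, the per-GPU batch size, and the assumption that Swin's base LR works out to about 5e-4 per 512 images are mine, not verified settings, so treat this as a starting point rather than a recipe:

./distributed_train.sh 8 /path/to/imagenet --model swin_tiny_patch4_window7_224 -b 128 --opt adamw --opt-eps 1e-8 --weight-decay 0.05 --sched cosine --epochs 300 --warmup-epochs 20 --warmup-lr 1e-6 --lr 1e-3 --drop-path 0.2 --aa rand-m9-mstd0.5 --reprob 0.25 --remode pixel --mixup 0.8 --cutmix 1.0 --amp -j 8

With 8 GPUs at 128 images each the total batch is 1024, so a 5e-4-per-512 base rate scales to roughly 1e-3. Note there is no --drop, the decay-epochs/decay-rate flags are gone since the cosine schedule doesn't use them, and the per-model regularization comes from --drop-path instead.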
-
Is there code for Swin-Transformer in TensorFlow 2.0? Could anyone point me to it? Thank you.
-
Hello, I have tried to train Swin Transformer myself, aiming to reproduce the performance reported in the paper, but I failed with the following configuration:
nohup ./distributed_train.sh 8 /share/wuxiangyu/cx/data/CLS-LOC --model swin_tiny_patch4_window7_224 -b 300 --sched cosine --epochs 300 --decay-epochs 2.4 --decay-rate .97 --opt adamw --opt-eps .001 -j 8 --warmup-lr 1e-3 --warmup-epochs 20 --weight-decay 0.05 --drop 0.3 --drop-connect 0.2 --model-ema --model-ema-decay 0.9999 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .001 >./logs/swin_tiny_patch4_window7_224_0417.log 2>&1&
Could anyone tell me how to modify my command so training works correctly?