Replies: 2 comments 1 reply
-
@starmemda I've moved this to a discussion since it's not an issue. You might want to look closer at the original Swin repository: https://github.com/microsoft/Swin-Transformer/blob/main/config.py . Their training code derives from a number of components here, so everything should be reproducible with this train script, but your hparams differ from their defaults. Some of their values look related to the DeiT paper/impl, so you might look at that too. A few differences: they don't use dropout (drop_rate), and they use a drop_path (drop_connect) rate that varies per model. Their learning rates are normalized per 512 batch size, so you should probably look at their train script. Also, you don't need to specify decay epochs or a decay rate with a cosine schedule.
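For reference, a rough sketch of what a timm train command following those suggestions might look like. The dataset path, the per-GPU batch size, and the assumption that Swin's base LR works out to about 5e-4 per 512 images are mine, not verified settings, so treat this as a starting point rather than a recipe:

./distributed_train.sh 8 /path/to/imagenet --model swin_tiny_patch4_window7_224 -b 128 --opt adamw --opt-eps 1e-8 --weight-decay 0.05 --sched cosine --epochs 300 --warmup-epochs 20 --warmup-lr 1e-6 --lr 1e-3 --drop-path 0.2 --aa rand-m9-mstd0.5 --reprob 0.25 --remode pixel --mixup 0.8 --cutmix 1.0 --amp -j 8

With 8 GPUs at 128 images each the total batch is 1024, so a 5e-4-per-512 base rate scales to roughly 1e-3. Note there is no --drop, the decay-epochs/decay-rate flags are gone since the cosine schedule doesn't use them, and the per-model regularization comes from --drop-path instead.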
-
Is there code for Swin-Transformer in TensorFlow 2.0? Could anyone point me to it? Thank you.
-
Hello, I have tried to train Swin Transformer myself, aiming to reproduce the performance reported in the paper, but I failed with the following configuration:
nohup ./distributed_train.sh 8 /share/wuxiangyu/cx/data/CLS-LOC --model swin_tiny_patch4_window7_224 -b 300 --sched cosine --epochs 300 --decay-epochs 2.4 --decay-rate .97 --opt adamw --opt-eps .001 -j 8 --warmup-lr 1e-3 --warmup-epochs 20 --weight-decay 0.05 --drop 0.3 --drop-connect 0.2 --model-ema --model-ema-decay 0.9999 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .001 >./logs/swin_tiny_patch4_window7_224_0417.log 2>&1&
Could anyone tell me how to modify my command so training works correctly?