Syncnet loss does not converge #146
Comments
I have the same issue, it's always around 0.69.
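A note on the 0.69 figure: it is very close to ln 2 ≈ 0.693, which is the binary cross-entropy of a classifier that outputs 0.5 for every pair. The sketch below assumes the sync loss is a binary cross-entropy over an "in sync" probability, as in the original Wav2Lip SyncNet training. The helper name `bce` is illustrative and not taken from the repository.

```python
import math

def bce(p, y):
    """Binary cross-entropy for one predicted probability p and label y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A model that always predicts 0.5 gets the same loss on
# in-sync (y=1) and out-of-sync (y=0) pairs.
chance_loss = 0.5 * (bce(0.5, 1) + bce(0.5, 0))
print(round(chance_loss, 4))  # 0.6931
```

Under that assumption, a loss pinned at 0.69 suggests the network is not yet separating in-sync from out-of-sync pairs at all, rather than converging slowly.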
How long does it take to train 500k steps?
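You need more high-quality training data. I got the loss down to 0.37.
How much data did you use, how many batches did you train for, and roughly how long did it take?
About 20,000 video files, each under 5 seconds, trained for 360k steps. Partway through I changed the training set and added some higher-quality data. According to the author, my training set is probably still far from large enough. I'll keep trying; training models is a bit of black magic, after all.
Thank you very much! What is your batch size? And yes, training models is black magic 😂
16
Then I think your loss progress looks quite similar to the one in issue #30. Hope is in sight.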
@linqiu0-0 it took around 4.6 days to finish 500k steps.
@openalldoors, which dataset are you using for the high-quality data? Are you saying that the loss will converge if I add more high-quality data?
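Could you tell me what your criteria for a high-quality dataset are, and what learning rate you used? Much appreciated.
Whether the video bitrate is high enough, whether the audio is in sync, and, more importantly, whether every frame contains a face, whether it is the same person throughout, and whether multiple people appear.
Thanks. I checked the bitrate and the faces, and everything is 1080p. I skipped the audio-video sync check with syncnet_python. I got past the 0.69 plateau by lowering the learning rate, but training is now very slow: it took 1.6M steps to reach about 0.44, and it seems to be starting to overfit. After reading your reply I suspect the dataset quality is the problem. Could you share your audio-video sync checking steps?
Don't skip the syncnet audio-video sync check unless you are 100% sure. Try looking at the eval log: if the eval data is abnormal, the loss will be abnormal too, noticeably greater than 1 (even above 6 is possible), and you need to track that down.
Can the author's source code train at 384 directly, without changing the network architecture or the loss function?
Just run syncnet_python, an open-source project, on the clips; an AV offset of 0 means they are in sync. I also have a question: can the author's source code really train at 288, 384, or 512 without modification? Aren't the network architecture and the loss function supposed to differ from the 96*96 version? Do you know anything about this?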
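The "AV offset of 0 means in sync" rule above could be applied as a filter over the dataset. This is only a minimal sketch: the `offsets` dict and the `keep_synced` helper are hypothetical, and reading the offsets out of syncnet_python's output is not shown here.

```python
def keep_synced(offsets, max_abs_offset=0):
    """Return clip names whose measured AV offset is within tolerance.

    offsets: dict mapping clip name -> AV offset in frames.
    This structure is hypothetical; fill it from however you record
    the syncnet_python result for each clip.
    """
    return sorted(name for name, off in offsets.items()
                  if abs(off) <= max_abs_offset)

# Made-up offsets, for illustration only.
offsets = {"clip_a.mp4": 0, "clip_b.mp4": -3, "clip_c.mp4": 1}
print(keep_synced(offsets))                    # ['clip_a.mp4']
print(keep_synced(offsets, max_abs_offset=1))  # ['clip_a.mp4', 'clip_c.mp4']
```

The default keeps only clips with an offset of exactly 0, as suggested in the thread. Whether a small tolerance is acceptable is a judgment call that the thread does not settle.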
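Thanks for sharing your experience. I'm currently at a train loss of 0.42, with eval fluctuating between 0.43 and 0.45. My GPU memory is small and training is slow, so it's hard to judge yet.
Look at the eval output for each individual sample; you can't spot problems from the mean.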
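To illustrate why the mean can hide a bad sample, here is a small sketch that lists per-sample eval losses above a threshold. The clip names and loss values are made up, and `flag_outliers` is a hypothetical helper, not part of the training script. The default threshold of 1.0 follows the "noticeably greater than 1" remark earlier in the thread.

```python
from statistics import mean

def flag_outliers(per_sample_loss, threshold=1.0):
    """Return (sample_id, loss) pairs whose loss exceeds threshold, worst first."""
    bad = [(k, v) for k, v in per_sample_loss.items() if v > threshold]
    return sorted(bad, key=lambda kv: kv[1], reverse=True)

# Made-up per-sample eval losses, for illustration only.
losses = {
    "clip_001": 0.40,
    "clip_002": 0.42,
    "clip_003": 0.38,
    "clip_004": 0.41,
    "clip_005": 1.30,
}
print(round(mean(losses.values()), 2))  # 0.58, looks only mildly elevated
print(flag_outliers(losses))            # [('clip_005', 1.3)]
```

A single suspicious clip moves the mean only a little, while the per-sample listing points straight at it.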
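What causes SyncNet training to overfit? My data passed the audio-video sync check without problems.
Does the AVSpeech dataset count as high quality? It's at https://hyper.ai/datasets/8754 and is over 800 GB.
Video sharpness, audio-video sync, frame rate, and dataset size: all of these have to meet the bar.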
I am training syncnet on the AVSpeech dataset with train_syncnet_sam.py. My training loss is stuck at 0.69 even after 500k steps. The learning rate (lr) and batch size (bs) are 5e-5 and 64, respectively. I have gone through all the issues, but I haven't found a workable solution. Any suggestions would be a great help.
For preprocessing, I followed all the steps suggested here except the video split part. The average video length is 7.1 s (videos range from 0 to 15 s), and the total length of the training dataset is roughly 30.5 hours.