Tacotron 2 inference always produces only 12 seconds of audio #1325
-
Tacotron has a param named If you see this message even for short sentences (a few words, for example), then your model isn't well trained. If there is no such message on your console right after the synthesis completes, then it might be another problem.
-
It's just that the model isn't trained enough. There is also a stop-decoding config param in hparams.py that you can lower, but most of the time it's just because the model was not trained enough.
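To make the point above concrete, here is a rough sketch of why an undertrained model gives the same audio length for every input. It is not the actual Tacotron 2 code: `frames_until_stop`, `stop_threshold`, `max_steps`, and the values 1000, 256, and 22050 are all assumptions picked for illustration, so check your own hparams.py for the real names and values. The idea is that the decoder emits frames until its predicted stop probability crosses a threshold, and if that never happens it runs into a hard step cap.

```python
def frames_until_stop(stop_probs, stop_threshold=0.5, max_steps=1000):
    """Simulate an autoregressive decoder loop (hypothetical names and defaults).

    stop_probs: per-step stop probabilities the model would predict.
    Returns (frames_emitted, hit_cap).
    """
    frames = 0
    for p in stop_probs:
        if frames >= max_steps:
            # The model never asked to stop, so the hard cap ends decoding.
            return frames, True
        frames += 1
        if p > stop_threshold:
            # The model predicted "end of utterance" at this step.
            return frames, False
    return frames, frames >= max_steps


def seconds(frames, hop_length=256, sample_rate=22050):
    """Convert mel frames to audio duration (assumed hop length and sample rate)."""
    return frames * hop_length / sample_rate


# Undertrained model: the stop probability never rises, so every input hits the cap.
undertrained = [0.01] * 5000
print(frames_until_stop(undertrained))        # (1000, True)
print(round(seconds(1000), 2))                # 11.61

# Weak stop signal: peaks at 0.3, which a threshold of 0.5 misses but 0.2 catches.
weak = [0.01] * 200 + [0.3] + [0.01] * 5000
print(frames_until_stop(weak))                        # (1000, True)
print(frames_until_stop(weak, stop_threshold=0.2))    # (201, False)
```

With these assumed values the cap works out to roughly 12 seconds, which would match a fixed-length output for every input. Lowering the threshold only helps when the model already produces some stop signal; if the stop probability stays flat, more training is the real fix.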
-
I tried inference with a Tacotron 2 model using Griffin-Lim and WaveGrad, but it produces only 12 seconds of audio regardless of the length of the input text.
For example, a short text input produces 12 seconds of audio. A long text input also produces 12 seconds of audio, and the text is not spoken in full.
How do I get audio output whose length depends on the text input?