Hi, I'm trying to train a streaming Zipformer transducer model for Indonesian. I'm training on Colab and need some guidance on how to train this model. These are the 4 datasets I'll be using:
Common Voice 17 (7h train, 4h dev, 4h test)
Google Fleurs ID (9h train, 1h dev, 2h test)
Librivox Indonesia (6h train, 1h test)
TITML IDN (14h train)
for a total of 37 hours of training data, 5 hours of validation data, and 7 hours of test data.
I normalized the text, prepared the manifests for all of the cuts, and combined the cuts by concatenating the datasets. I also prepared the MUSAN cuts and computed the features for all of the cuts, so the next step is training the model. I customized the LibriSpeech pruned_transducer_stateless7_streaming recipe a little bit by modifying the LibriSpeechAsrDataModule class and the dataloader.
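For context, the data preparation looks roughly like this (a simplified sketch of what I do with lhotse; the manifest paths and num_jobs below are placeholders, not my exact setup):

```python
# Simplified sketch: combine the per-dataset training cuts and compute fbank
# features. Paths are hypothetical placeholders.
from lhotse import CutSet, Fbank, FbankConfig, combine

# Load each dataset's training cuts and concatenate them into one CutSet.
cuts_train = combine(
    CutSet.from_file("data/manifests/commonvoice_cuts_train.jsonl.gz"),
    CutSet.from_file("data/manifests/fleurs_cuts_train.jsonl.gz"),
    CutSet.from_file("data/manifests/librivox_cuts_train.jsonl.gz"),
    CutSet.from_file("data/manifests/titml_cuts_train.jsonl.gz"),
)

# Compute 80-dim fbank features (the usual setting in the icefall recipes)
# and store them next to the manifests.
cuts_train = cuts_train.compute_and_store_features(
    extractor=Fbank(FbankConfig(num_mel_bins=80)),
    storage_path="data/fbank/train_feats",
    num_jobs=4,  # placeholder
)
cuts_train.to_file("data/fbank/cuts_train.jsonl.gz")
```

The MUSAN cuts are prepared the same way and only used for noise augmentation in the dataset.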
I use a shuffled SimpleCutSampler, and the model and training arguments are all left at their defaults, except:
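For reference, the sampler/dataloader part of my data module looks roughly like this (a simplified sketch; the max_duration and num_workers values here are just placeholders, not the parameters I changed):

```python
# Rough sketch of wiring a shuffled SimpleCutSampler into a DataLoader.
import torch
from lhotse import CutSet
from lhotse.dataset import K2SpeechRecognitionDataset, SimpleCutSampler

cuts_train = CutSet.from_file("data/fbank/cuts_train.jsonl.gz")

train_sampler = SimpleCutSampler(
    cuts_train,
    max_duration=200.0,  # placeholder: seconds of audio per batch
    shuffle=True,
)
train_dataset = K2SpeechRecognitionDataset()

# With lhotse samplers the sampler itself forms the batches, so batch_size=None.
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    sampler=train_sampler,
    batch_size=None,
    num_workers=2,  # placeholder
)
```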
The training script runs fine, but I don't really know whether these are the best parameters for the model:
My validation loss is much higher than my training loss; is that because my dev set is small? I decoded the model and got a WER of 39%, which is not great. I also noticed after training that the "zipformer" recipe is the newer Zipformer and this pruned transducer recipe is the older one. Should I switch to the new one, and if so, which parameters should I change for the new model?
Thank you!