Thank you for your great work! I've read your paper, but I'm having trouble understanding how sequences of different lengths are generated. Since you fix n=64 in the experiments, it seems n cannot be changed afterward, because the hidden size d'=n*d in the Transformer is fixed. As a result, shouldn't it be impossible to generate sequences of any length other than 64 at inference time?
We can generate sentences shorter than length 64 via padding. If the training script sets --padding_mode pad, then the sequence format is [BOS][SENTENCE][EOS][PAD][PAD][PAD]...
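To illustrate the idea, here is a minimal sketch of padding a token sequence into the fixed-length [BOS][SENTENCE][EOS][PAD]... format. The special-token ids (`BOS`, `EOS`, `PAD`) and the helper name are hypothetical; the actual values depend on the tokenizer used in the repo:

```python
# Hypothetical special-token ids; actual values depend on the tokenizer.
BOS, EOS, PAD = 0, 1, 2
N = 64  # fixed sequence length used in the experiments

def pad_to_fixed_length(tokens, n=N, bos=BOS, eos=EOS, pad=PAD):
    """Wrap a token sequence as [BOS][SENTENCE][EOS][PAD]... of length n."""
    if len(tokens) > n - 2:
        # BOS and EOS each occupy one slot, so the sentence itself
        # can be at most n - 2 tokens long.
        raise ValueError(f"sentence too long: {len(tokens)} > {n - 2}")
    seq = [bos] + list(tokens) + [eos]
    return seq + [pad] * (n - len(seq))

padded = pad_to_fixed_length([10, 11, 12])
# len(padded) == 64; the sequence starts [BOS, 10, 11, 12, EOS, PAD, ...]
```

At generation time, everything after the first EOS is simply discarded, so sentences of any length up to n - 2 can be produced even though the model's input length is fixed.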