Improving position embedding #2223
base: master
Conversation
What is the cause of this error? I need help.
Thanks! Just a question for now. What models use this?
@@ -61,6 +61,7 @@ def __init__(
         self,
         sequence_length,
         initializer="glorot_uniform",
+        hierarchical_alpha=0.4,
We'd need a docstring to add this. But before we do, is this used by more than one model architecture? If this is just used in one, we probably would prefer to keep the layer simple and just leave the customization per model arch.
The main function of this feature is to non-destructively extend the maximum sequence length of models such as BERT and ViT.
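For context, here is a minimal NumPy sketch of the hierarchical decomposition that `hierarchical_alpha` presumably controls (the trick popularized by bert4keras for extending learned position embeddings). The function name `extend_position_embeddings` and the details below are assumptions for illustration, not the PR's actual code:

```python
import numpy as np

def extend_position_embeddings(table, new_length, alpha=0.4):
    """Hierarchically extend a learned position-embedding table.

    `table` has shape (max_len, dim). Each position beyond max_len is
    composed from two existing rows, and positions below max_len
    reproduce the original rows exactly (non-destructive).
    """
    max_len = table.shape[0]
    # Rescale so that alpha * u[i // max_len] + (1 - alpha) * u[i % max_len]
    # equals table[i] exactly for every i < max_len.
    u = (table - alpha * table[:1]) / (1 - alpha)
    positions = np.arange(new_length)
    return alpha * u[positions // max_len] + (1 - alpha) * u[positions % max_len]

# Doubling the usable length of a 512-position table:
table = np.random.rand(512, 768).astype("float32")
extended = extend_position_embeddings(table, 1024)
assert np.allclose(extended[:512], table, atol=1e-5)  # original rows preserved
```

The subtraction of `alpha * table[:1]` is what keeps the extension non-destructive: for positions below the original maximum, the reconstructed embedding equals the stored row exactly.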
@mattdangerw
@pass-lin how would you expect to use this feature with BERT currently? It seems tricky to actually use, given that BERT will be loaded from a config without this option.
I hope it can be used transparently, without the user noticing anything. It would only be activated when the input length exceeds the preset maximum length. If you think this is unnecessary, you can close this PR.
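For example, that "activate only when needed" behavior could look roughly like the sketch below. Again this is a hedged NumPy sketch, not the PR's implementation; `position_embeddings` is a hypothetical helper standing in for the layer's lookup path:

```python
import numpy as np

def position_embeddings(table, seq_len, alpha=0.4):
    """Return embeddings for `seq_len` positions from a (max_len, dim) table."""
    max_len = table.shape[0]
    if seq_len <= max_len:
        # Within the preset maximum: behave exactly like the current layer.
        return table[:seq_len]
    # Only past the preset maximum does the hierarchical decomposition kick in.
    u = (table - alpha * table[:1]) / (1 - alpha)
    pos = np.arange(seq_len)
    return alpha * u[pos // max_len] + (1 - alpha) * u[pos % max_len]
```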
from #2192
Removed the commit about fixing the RoFormer export.