
How to use a custom tokenizer after training the model? #193

Open
uraibeefnn opened this issue Sep 20, 2024 · 0 comments

Comments

@uraibeefnn
I hope you're doing well. I have a question regarding the usage of a custom tokenizer after training a model using your library.

I have trained a model using the GLiNER framework, but now I want to use a custom tokenizer, specifically one for the Thai language. I've noticed issues when the tokenizer's vocabulary size doesn't match the model's expected vocab size, and I'm trying to figure out the best way to resolve this.

Could you please guide me on the following:

1. How do I load and use a custom tokenizer with a trained model?
2. Is there a recommended way to update or resize the token embeddings if additional tokens are added to the tokenizer?
I would appreciate any advice on how to ensure the tokenizer and model are compatible after training.
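For anyone hitting the same mismatch: with Hugging Face-backed models the usual pattern is to load the new tokenizer and then grow the model's embedding matrix to match its vocabulary size. The snippet below is a minimal, dependency-free sketch of what that resize does conceptually — existing rows are preserved and new rows are freshly initialized. The function name mirrors the `transformers` method but is a toy reimplementation, not GLiNER's API.

```python
import random

def resize_token_embeddings(embeddings, new_vocab_size, dim):
    """Grow (or truncate) an embedding matrix to new_vocab_size rows.

    Conceptually mimics transformers' model.resize_token_embeddings():
    existing rows are kept as-is, and any extra rows are newly
    initialized (small random values here, as a stand-in).
    """
    old_vocab_size = len(embeddings)
    if new_vocab_size <= old_vocab_size:
        return embeddings[:new_vocab_size]
    new_rows = [
        [random.gauss(0.0, 0.02) for _ in range(dim)]
        for _ in range(new_vocab_size - old_vocab_size)
    ]
    return embeddings + new_rows

# Toy example: a 3-token vocab with 4-dim embeddings, grown to 5 tokens.
emb = [[float(i)] * 4 for i in range(3)]
resized = resize_token_embeddings(emb, new_vocab_size=5, dim=4)
print(len(resized))            # 5 rows: original 3 kept, 2 added
print(resized[:3] == emb)      # True: old embeddings are unchanged
```

With an actual Hugging Face model the equivalent is `tokenizer = AutoTokenizer.from_pretrained(...)` followed by `model.resize_token_embeddings(len(tokenizer))` on the underlying transformer. How to reach that transformer through GLiNER's wrapper depends on the GLiNER version, so treat the exact attribute path as something to verify against the source.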
