
How to use a custom tokenizer after training the model? #193

Open
uraibeefnn opened this issue Sep 20, 2024 · 0 comments

Comments

@uraibeefnn
I hope you're doing well. I have a question regarding the usage of a custom tokenizer after training a model using your library.

I have trained a model using the GLiNER framework, but now I want to use a custom tokenizer, specifically one for the Thai language. I've noticed issues when the tokenizer's vocabulary size doesn't match the model's expected vocab size, and I'm trying to figure out the best way to resolve this.

Could you please guide me on the following:

1. How do I load and use a custom tokenizer with a trained model?
2. Is there a recommended way to update or resize the token embeddings if additional tokens are added to the tokenizer?
I would appreciate any advice on how to ensure the tokenizer and model are compatible after training.
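For anyone hitting the same mismatch: with Hugging Face-backed models the usual pattern is to load the new tokenizer and then grow the model's embedding matrix to match its vocabulary size. The snippet below is a minimal, dependency-free sketch of what that resize does conceptually — existing rows are preserved and new rows are freshly initialized. The function name mirrors the `transformers` method but is a toy reimplementation, not GLiNER's API.

```python
import random

def resize_token_embeddings(embeddings, new_vocab_size, dim):
    """Grow (or truncate) an embedding matrix to new_vocab_size rows.

    Conceptually mimics transformers' model.resize_token_embeddings():
    existing rows are kept as-is, and any extra rows are newly
    initialized (small random values here, as a stand-in).
    """
    old_vocab_size = len(embeddings)
    if new_vocab_size <= old_vocab_size:
        return embeddings[:new_vocab_size]
    new_rows = [
        [random.gauss(0.0, 0.02) for _ in range(dim)]
        for _ in range(new_vocab_size - old_vocab_size)
    ]
    return embeddings + new_rows

# Toy example: a 3-token vocab with 4-dim embeddings, grown to 5 tokens.
emb = [[float(i)] * 4 for i in range(3)]
resized = resize_token_embeddings(emb, new_vocab_size=5, dim=4)
print(len(resized))            # 5 rows: original 3 kept, 2 added
print(resized[:3] == emb)      # True: old embeddings are unchanged
```

With an actual Hugging Face model the equivalent is `tokenizer = AutoTokenizer.from_pretrained(...)` followed by `model.resize_token_embeddings(len(tokenizer))` on the underlying transformer. How to reach that transformer through GLiNER's wrapper depends on the GLiNER version, so treat the exact attribute path as something to verify against the source.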
