Is adding extra tokens supported? #3614

namgyaaal · 2026-01-19T17:04:06Z

Hello!

I have fine-tuned Whisper small via transformers and want to use it with whisper.cpp. The model has extra tokens, added with tokenizer.add_tokens() and model.resize_token_embeddings(), which are tags output alongside the transcribed speech. Testing it in transformers shows that it works.

Currently, generating the model with python models/convert-h5-to-ggml.py safetensors_export_dir ./whisper ./out_model and running the resulting ggml bin file on a test audio file produces unintelligible output.
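In case it helps narrow things down, here is a minimal sketch of a sanity check that could be run on the export directory before conversion. It only compares the token count on the tokenizer side with vocab_size in config.json. It assumes the export contains the usual Hugging Face files config.json, vocab.json and, optionally, added_tokens.json; newer tokenizer exports may keep added tokens only in tokenizer.json, in which case this undercounts. The function name check_vocab_consistency is made up for illustration and is not part of transformers or whisper.cpp.

```python
import json
from pathlib import Path


def check_vocab_consistency(export_dir):
    """Compare the tokenizer's token count with config.json's vocab_size.

    Assumption: export_dir holds config.json, vocab.json and, optionally,
    added_tokens.json, each mapping as Hugging Face usually writes them.
    """
    export_dir = Path(export_dir)

    config = json.loads((export_dir / "config.json").read_text(encoding="utf-8"))
    vocab = json.loads((export_dir / "vocab.json").read_text(encoding="utf-8"))

    # added_tokens.json is optional; treat a missing file as "no added tokens".
    added_path = export_dir / "added_tokens.json"
    added = {}
    if added_path.exists():
        added = json.loads(added_path.read_text(encoding="utf-8"))

    # Merge both token -> id maps; a token present in both is counted once.
    all_tokens = {**vocab, **added}
    ids = sorted(all_tokens.values())
    max_id = ids[-1] if ids else -1

    return {
        "config_vocab_size": config.get("vocab_size"),
        "tokenizer_token_count": len(all_tokens),
        "max_token_id": max_id,
        # Every token id must index a row of the (resized) embedding matrix.
        "ids_fit_in_embedding": max_id < config.get("vocab_size", 0),
        # Gaps or duplicate ids would suggest the files are out of sync.
        "ids_are_contiguous": ids == list(range(len(ids))),
    }
```

Calling check_vocab_consistency("safetensors_export_dir") returns a small dict; if ids_fit_in_embedding or ids_are_contiguous is False, the export itself is inconsistent before the converter is involved. If both are True, the mismatch is more likely in how the extra tokens are handled during or after conversion.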

Is there something I'm missing in this process, or are additional tokens currently not supported?

Thanks.

Replies: 0 comments
