
Error thrown by chat_template if multiple system messages (llama.cpp / Huggingface models) #1783

@awijshoff

Description


Hi,

first of all, thank you for the awesome project — really cool!

Describe the bug
I'm currently experimenting with the new @blocknote/xl-ai package, using the Vercel AI SDK's OpenAI-compatible provider (@ai-sdk/openai-compatible) to connect to a local llama.cpp server. I'm running smaller models from Hugging Face, e.g. to summarize personal notes.

These HF models typically include a chat_template in tokenizer_config.json, which is carried over when converting to .gguf format during quantization. The default chat_template often expects strictly alternating user/assistant messages. With default settings, this causes the llama.cpp server to return an HTTP 500 when the request contains multiple system-role messages, as in the request sketched below.


The error seems to originate from this section of the code:
https://github.com/TypeCellOS/BlockNote/blob/main/packages/xl-ai/src/api/LLMRequest.ts#L168-L180

The comments around that code suggest the issue is already known.

I'm currently fiddling with a workaround via a custom fetch function in the OpenAI-compatible provider: it rewrites the request by merging all system messages into a single system message before it is sent to the llama-server (sketch below).

Another workaround is to start the llama-server with the flags --jinja --chat-template chatml, which seems to work, but I noticed that this breaks compatibility with the default llama.cpp web UI due to BOS/EOS token issues.

To Reproduce

  1. Download a GGUF model that includes a strict chat_template, such as Gemma 3B.
  2. Start the llama.cpp server with:
    ./llama-server.exe --model gemma-3b.gguf --port 8000 --jinja --cache-reuse 256 --ctx-size 8192
  3. Use the @blocknote/xl-ai package with the OpenAI-compatible provider @ai-sdk/openai-compatible (provider wiring sketched after this list).
  4. Observe the 500 error from the server due to chat_template validation.

Misc
I'm not sure whether other LLM engines also run into this issue when using the default tokenizer_config.json or chat_template shipped with these models.

  • Node version:
  • Package manager:
  • Browser:
  • I'm a sponsor and would appreciate if you could look into this sooner than later 💖
