Replies: 1 comment
-
|
I believe it's a bit more complex than just adding the kwarg: #814
Waiting for the PR to be merged 👀 |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Hello everyone,
Qwen team introduced this feature in Qwen 3.6, this is quote from their huggingface card:
To avoid constantly repeating loops and failed tool calls for example in Claude Code, when using Qwen3.6-35B-A3B please do this step in oMLX.
In "Chat Template Kwargs" click "Add" and select "Custom" then:
in key write: preserve_thinking
in value write: True
click "Save".
After this you should have clear and smooth responses in Claude Code without any stuck reasoning loops or failed tool calls.
@jundot Can you somehow make it as a default option for every quant/variation of Qwen 3.6 in oMLX?
Without that "preserve_thinking" setting after some context it gets stuck in reasoning loops and it is not enabled by default in model card, and most people don't know about this, but that improve LLM output quality a lot!
Thanks
Beta Was this translation helpful? Give feedback.
All reactions