GrayXu/openwebui-hybrid-thinking

openwebui-hybrid-thinking

A Pipe and a Filter for open-webui that implement hybrid thinking: a cheap, fast reasoning model such as DeepSeek R1 or QwQ 32B does the thinking, and a stronger, more expensive model such as claude-3.7-Sonnet writes the final summarized output, giving a better balance between inference cost and performance.

The Aider LLM Leaderboards and DeepClaude show how efficient hybrid thinking can be: deepseek-r1 + claude-3.5-sonnet achieves results very close to claude-3.7-sonnet-thinking at 1/3 of the cost.

You can install those scripts by importing:

The filter version can be used with multiple derived models, but it doesn't support streaming the thought process (possibly because a filter doesn't support multiple stream requests).
The pipe version ties each pipe function to one hybrid-thinking model, but it can stream all data.
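
To make the streaming difference concrete, here is a minimal sketch of the two-stage flow that a pipe can stream end to end. It is not the code from this repository: `hybrid_stream`, `fake_reasoner`, and `fake_writer` are hypothetical names, the two "models" are stand-in generators rather than real API calls, and wrapping the reasoning in `<think>` tags is an assumption about how the thought process might be marked up.

```python
from typing import Callable, Iterator

# Hypothetical type: a "model" takes a prompt and yields text chunks.
ChunkStream = Callable[[str], Iterator[str]]


def hybrid_stream(reasoner: ChunkStream, writer: ChunkStream, prompt: str) -> Iterator[str]:
    """Stream the reasoning model's chunks, then the output model's chunks."""
    thoughts = []
    yield "<think>"  # assumed marker for the thought process
    for chunk in reasoner(prompt):
        thoughts.append(chunk)
        yield chunk
    yield "</think>"
    # Hand the collected reasoning to the output model as extra context.
    writer_prompt = f"{prompt}\n\nReasoning:\n{''.join(thoughts)}"
    yield from writer(writer_prompt)


def fake_reasoner(prompt: str) -> Iterator[str]:
    # Stand-in for a cheap reasoning model such as DeepSeek R1.
    yield "step 1; "
    yield "step 2"


def fake_writer(prompt: str) -> Iterator[str]:
    # Stand-in for the stronger output model.
    yield "final answer"


if __name__ == "__main__":
    print("".join(hybrid_stream(fake_reasoner, fake_writer, "What is 2 + 2?")))
```

Because both stages are generators, the caller sees reasoning chunks as they arrive and then the final answer, which is the behavior the pipe version offers and the filter version does not.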

You can set two booleans, REASONING_CONTENT_AS_CONTEXT and CONTENT_AS_CONTEXT, to control whether the reasoning and actual output are passed as context to the output model:

  • The default behavior is consistent with DeepClaude, where only the reasoning content is used as context. However, be aware that this might confuse the output model.
  • In practice, Aider only uses the final output of the reasoning model as context for the output model. If you want the same approach, set REASONING_CONTENT_AS_CONTEXT=False and CONTENT_AS_CONTEXT=True.
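
The effect of the two flags can be sketched as a small helper that assembles the context passed to the output model. This is an illustration under stated assumptions, not the repository's implementation: `build_context` and the `<reasoning>` / `<draft_answer>` wrappers are hypothetical, and the default values are inferred from the description above (reasoning only).

```python
def build_context(
    reasoning: str,
    content: str,
    reasoning_content_as_context: bool = True,  # assumed default (DeepClaude-style)
    content_as_context: bool = False,           # assumed default
) -> str:
    """Return the extra context handed to the output model."""
    parts = []
    if reasoning_content_as_context and reasoning:
        # Pass the reasoning model's thought process.
        parts.append(f"<reasoning>\n{reasoning}\n</reasoning>")
    if content_as_context and content:
        # Pass the reasoning model's final output.
        parts.append(f"<draft_answer>\n{content}\n</draft_answer>")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # DeepClaude-style: reasoning only.
    print(build_context("think step by step", "the answer is 4"))
    # Aider-style: final output only.
    print(build_context("think step by step", "the answer is 4",
                        reasoning_content_as_context=False,
                        content_as_context=True))
```

Setting both flags to true would pass the thought process and the final output together; whether that helps or confuses the output model is something to check against your own prompts.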

