A Pipe and a Filter for open-webui that enable hybrid thinking: use DeepSeek R1 or QwQ 32B for cheap, fast reasoning, then a stronger and more expensive model such as claude-3.7-Sonnet for the final summarized output, achieving a better balance between inference cost and performance.
The Aider LLM Leaderboards and DeepClaude show the effectiveness of hybrid thinking: deepseek-r1 + claude-3.5-sonnet achieves results very close to claude-3.7-sonnet-thinking at roughly 1/3 of the cost.
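For illustration, here is a minimal sketch of the two-stage flow outside of open-webui. The endpoints, keys, and model names are placeholders, and appending the reasoning as an assistant turn is just one simple choice; DeepSeek's API does return the chain of thought in a separate `reasoning_content` field:

```python
# Minimal sketch of hybrid thinking with OpenAI-compatible APIs.
# Endpoints, keys, and model names below are placeholders.
from openai import OpenAI

reasoner = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")
# Any OpenAI-compatible gateway that serves the stronger output model.
writer = OpenAI(base_url="https://example-gateway/v1", api_key="sk-...")

messages = [{"role": "user", "content": "Why is the sky blue?"}]

# Stage 1: cheap, fast thinking. deepseek-reasoner returns the chain of
# thought separately in `reasoning_content`.
r1 = reasoner.chat.completions.create(model="deepseek-reasoner",
                                      messages=messages)
reasoning = r1.choices[0].message.reasoning_content

# Stage 2: the stronger model writes the final answer, seeing the
# reasoning as extra context (one possible way to pass it along).
final = writer.chat.completions.create(
    model="claude-3.7-sonnet",  # placeholder name at the gateway
    messages=messages + [
        {"role": "assistant", "content": f"<think>\n{reasoning}\n</think>"}
    ],
)
print(final.choices[0].message.content)
```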
You can install these scripts by importing them from:
- Pipe (recommended): https://openwebui.com/f/grayxu/hybrid_thinking_pipe
- Filter: https://openwebui.com/f/grayxu/hybrid_thinking
The Filter version can be attached to multiple derived models, but it does not support streaming the thought process (likely because a Filter cannot issue multiple streaming requests).
The Pipe version registers a pipe function that corresponds to one hybrid-thinking model, and it can stream everything, including the thought process.
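For reference, an open-webui pipe function is a Python class with a `pipe` method and user-configurable pydantic `Valves`. The skeleton below is a simplified sketch of that shape, not the published function; only the two context valves documented below mirror the real settings, and the model-name valves are illustrative placeholders:

```python
# Simplified sketch of the Pipe function's shape in open-webui.
from pydantic import BaseModel, Field

class Pipe:
    class Valves(BaseModel):
        REASONING_MODEL: str = Field(default="deepseek-r1")      # placeholder
        OUTPUT_MODEL: str = Field(default="claude-3.7-sonnet")   # placeholder
        REASONING_CONTENT_AS_CONTEXT: bool = Field(default=True)
        CONTENT_AS_CONTEXT: bool = Field(default=False)

    def __init__(self):
        self.valves = self.Valves()

    def pipe(self, body: dict):
        # 1. Stream the reasoning model and forward its thought process.
        # 2. Build context from reasoning/content according to the valves.
        # 3. Stream the output model's final answer back to the client.
        ...
```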
You can set two booleans, REASONING_CONTENT_AS_CONTEXT and CONTENT_AS_CONTEXT, to control whether the reasoning content and the actual output are passed as context to the output model (a sketch after this list illustrates the selection):
- The default behavior is consistent with DeepClaude, where only the reasoning content is used as context. Be aware that this might confuse the output model.
- In practice, Aider only uses the final output of the reasoning model as context for the output model. If you want the same approach, set REASONING_CONTENT_AS_CONTEXT=False and CONTENT_AS_CONTEXT=True.
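Concretely, the two valves select which parts of the reasoning model's reply are forwarded. A minimal sketch of that selection (the helper name and the `<think>` wrapper are illustrative, not the function's actual internals):

```python
# Sketch of how the two valves shape the context for the output model.
def build_context(reasoning_content: str, content: str,
                  reasoning_as_context: bool, content_as_context: bool) -> str:
    parts = []
    if reasoning_as_context:  # default: DeepClaude-style, thoughts only
        parts.append(f"<think>\n{reasoning_content}\n</think>")
    if content_as_context:    # Aider-style: the reasoner's final answer
        parts.append(content)
    return "\n\n".join(parts)

# Aider-style settings: forward only the reasoning model's final answer.
print(build_context("step-by-step thoughts...", "final answer...",
                    reasoning_as_context=False, content_as_context=True))
```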
Ref:
- Part of the pipe code is adapted from charleskanp.