Add support for async pipeline wrapper methods (#122)
* Add async pipeline wrapper methods support ; ensure current logic doesn't break
* fixed all mypy warnings
* add missing yield
* remove docstring ; reformat
* refactor
* Add test for async pipeline
* remove unneeded fixtures
* fix some lint errors
* Add support for AsyncGenerator as pipeline result in chat_endpoint
* add async_streaming_generator and expose it
* refactor
* refactor
* divide unit / it tests
* cleanup
* do not mark mcp tests as integration
* add async QA pipeline for some IT tests
* add test for streaming utilities
* Fix test
* refactor
* Better testing of streaming generators (sync/async)
* Add it tests for streaming generators ; Further improve exceptions handling
* Fix tests for python 3.9
* Update README adding docs about new async features
* Add a simple AsyncPipeline + async_streaming_generator example
* Align example pipeline with test one
* Add comment
* Update README
README.md (+108 −3)
@@ -1,8 +1,13 @@
# Hayhooks

- **Hayhooks** makes it easy to deploy and serve [Haystack](https://haystack.deepset.ai/) pipelines as REST APIs.
+ **Hayhooks** makes it easy to deploy and serve [Haystack](https://haystack.deepset.ai/) pipelines.

- It provides a simple way to wrap your Haystack pipelines with custom logic and expose them via HTTP endpoints, including OpenAI-compatible chat completion endpoints. With Hayhooks, you can quickly turn your Haystack pipelines into API services with minimal boilerplate code.
+ With Hayhooks, you can:
+
+ - **Deploy your Haystack pipelines as REST APIs** with maximum flexibility and minimal boilerplate code.
+ - **Expose your Haystack pipelines over the MCP protocol**, making them available as tools in AI dev environments like [Cursor](https://cursor.com) or [Claude Desktop](https://claude.ai/download). Under the hood, Hayhooks runs as an [MCP Server](https://modelcontextprotocol.io/docs/concepts/architecture), exposing each pipeline as an [MCP Tool](https://modelcontextprotocol.io/docs/concepts/tools).
+ - **Expose your Haystack pipelines as OpenAI-compatible chat completion backends** with streaming support (to be used with [open-webui](https://openwebui.com) or any other OpenAI-compatible client).
+ - **Control Hayhooks core APIs through chat** - deploy, undeploy, list, or run Haystack pipelines by chatting with [Claude Desktop](https://claude.ai/download), [Cursor](https://cursor.com), or any other MCP client.
@@ -232,6 +240,30 @@ The input arguments will be used to generate a Pydantic model that will be used
**NOTE**: Since Hayhooks will _dynamically_ create the Pydantic models, you need to make sure that the input arguments are JSON-serializable.

+ #### run_api_async(...)
+
+ This method is the asynchronous version of `run_api`. It will be used to run the pipeline in API mode when you call the `{pipeline_name}/run` endpoint, but handles requests asynchronously for better performance under high load.
+
+ **You can define the input arguments of the method according to your needs**, just like with `run_api`:
+
+ ```python
+ async def run_api_async(self, urls: List[str], question: str) -> str:
+     # Use async/await with AsyncPipeline or async operations
+     result = await self.pipeline.run_async({"fetcher": {"urls": urls}, "prompt": {"query": question}})
+     return result["llm"]["replies"][0]
+ ```
+
+ This is particularly useful when:
+
+ - Working with `AsyncPipeline` instances that support async execution
+ - Integrating with async-compatible Haystack components (e.g., `OpenAIChatGenerator` with async support)
+ - Handling I/O-bound operations more efficiently
+ - Deploying pipelines that need to handle many concurrent requests
+
+ **NOTE**: You can implement either `run_api`, `run_api_async`, or both. Hayhooks will automatically detect which methods are implemented and route requests accordingly.
+
+ You can find complete working examples of async pipeline wrappers in the [test files](tests/test_files/files/async_question_answer) and [async streaming examples](tests/test_files/files/async_chat_with_website_streaming).
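For context, here is a minimal end-to-end wrapper sketch built around the snippet above. The YAML-loading `setup()`, the fetcher/prompt/llm component names, and the `BasePipelineWrapper` import path are assumptions based on the repository's sync examples, not lines from this diff.

```python
# Hypothetical sketch of a complete async pipeline wrapper.
# Assumptions: a pipeline.yml next to this file defining fetcher/prompt/llm
# components, and BasePipelineWrapper importable from the hayhooks package.
from pathlib import Path
from typing import List

from haystack import AsyncPipeline
from hayhooks import BasePipelineWrapper


class PipelineWrapper(BasePipelineWrapper):
    def setup(self) -> None:
        # Runs once at deploy time; load the pipeline definition from YAML.
        yaml_def = (Path(__file__).parent / "pipeline.yml").read_text()
        self.pipeline = AsyncPipeline.loads(yaml_def)

    async def run_api_async(self, urls: List[str], question: str) -> str:
        # Awaiting run_async keeps the event loop free for other requests,
        # instead of blocking a worker thread as the sync run() would.
        result = await self.pipeline.run_async(
            {"fetcher": {"urls": urls}, "prompt": {"query": question}}
        )
        return result["llm"]["replies"][0]
```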
[...]

+ #### run_chat_completion_async(...)
+
+ This method is the asynchronous version of `run_chat_completion`. It handles OpenAI-compatible chat completion requests asynchronously, which is particularly useful for streaming responses and high-concurrency scenarios.
+
+ Like `run_chat_completion`, this method has a **fixed signature** and will be called with the same arguments. The key differences are:
+
+ - It's declared as `async` and can use `await` for asynchronous operations
+ - It can return an `AsyncGenerator` for streaming responses using `async_streaming_generator`
+ - It provides better performance for concurrent chat requests
+ - It's required when using async streaming with components that support async streaming callbacks
+
+ **NOTE**: You can implement either `run_chat_completion`, `run_chat_completion_async`, or both. When both are implemented, Hayhooks will prefer the async version for better performance.
+
+ You can find complete working examples combining async chat completion with streaming in the [async streaming test examples](tests/test_files/files/async_question_answer).
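To make the fixed signature concrete, here is a rough sketch of an async chat completion method that streams via `async_streaming_generator`. The `get_last_user_message` helper and the exact keyword arguments are assumptions carried over from the sync examples elsewhere in the README.

```python
# Hypothetical sketch; helper names and kwargs assumed, not taken from this diff.
from typing import AsyncGenerator, List, Union

from hayhooks import BasePipelineWrapper, async_streaming_generator, get_last_user_message


class PipelineWrapper(BasePipelineWrapper):
    # setup() omitted; assume self.pipeline holds an async-capable pipeline.

    async def run_chat_completion_async(
        self, model: str, messages: List[dict], body: dict
    ) -> Union[str, AsyncGenerator]:
        question = get_last_user_message(messages)
        # Returning the async generator lets Hayhooks stream chunks back
        # to the client in OpenAI-compatible format as they are produced.
        return async_streaming_generator(
            pipeline=self.pipeline,
            pipeline_run_args={"prompt": {"query": question}},
        )
```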
### Streaming responses in OpenAI-compatible endpoints

- Hayhooks now provides a `streaming_generator` utility function that can be used to stream the pipeline output to the client.
+ Hayhooks provides `streaming_generator` and `async_streaming_generator` utility functions that can be used to stream the pipeline output to the client.

Let's update the `run_chat_completion` method of the previous example:
@@ -634,10 +695,54 @@ You will see the pipeline output being streamed [in OpenAI-compatible format](ht
Since output will be streamed to `open-webui` there's **no need to change `Stream Chat Response`** chat setting (leave it as `Default` or `On`).

+ You can find a complete working example of `streaming_generator` usage in the [examples/pipeline_wrappers/chat_with_website_streaming](examples/pipeline_wrappers/chat_with_website_streaming) directory.
[...]

+ The `async_streaming_generator` utility:
+
+ - Works with both `Pipeline` and `AsyncPipeline` instances
+ - Requires **components that support async streaming callbacks** (e.g., `OpenAIChatGenerator` instead of `OpenAIGenerator`)
+ - Provides better performance for concurrent streaming requests
+ - Returns an `AsyncGenerator` that yields chunks asynchronously
+ - Automatically handles async pipeline execution and cleanup
+
+ **NOTE**: The streaming component in your pipeline must support async streaming callbacks. If you get an error about async streaming support, either use the sync `streaming_generator` or switch to async-compatible components.
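As a sketch of what "async-compatible" means in practice when assembling the pipeline: the component names below come from the Haystack docs, and the wiring is an assumption, not part of this diff.

```python
# Hypothetical pipeline assembly illustrating the async-callback requirement.
from haystack import AsyncPipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator


def build_pipeline() -> AsyncPipeline:
    pipe = AsyncPipeline()
    pipe.add_component("prompt", ChatPromptBuilder())
    # Chat generators accept async streaming callbacks, which
    # async_streaming_generator needs; the plain OpenAIGenerator only
    # supports sync callbacks and would raise an error here.
    pipe.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))
    pipe.connect("prompt.prompt", "llm.messages")
    return pipe
```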
### Integration with haystack OpenAIChatGenerator

Since Hayhooks is OpenAI-compatible, it can be used as a backend for the [haystack OpenAIChatGenerator](https://docs.haystack.deepset.ai/docs/openaichatgenerator).
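For instance, one might point the generator's `api_base_url` at a locally running Hayhooks instance. The deployed pipeline name and key handling below are assumptions; 1416 is Hayhooks' default port.

```python
# Hypothetical client-side usage: Hayhooks serves the OpenAI-compatible API,
# and the deployed pipeline's name plays the role of the model name.
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

client = OpenAIChatGenerator(
    model="chat_with_website",                # assumed deployed pipeline name
    api_base_url="http://localhost:1416/v1",  # Hayhooks' default port
    api_key=Secret.from_token("placeholder"), # assumed not validated by default
)
result = client.run(messages=[ChatMessage.from_user("Where can I see the docs?")])
print(result["replies"][0].text)
```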