Support streaming from multiple pipeline components #178
base: main
I was wondering whether to keep "streaming all components" as the default, or whether it's worth letting the user choose between "stream all capable components" and "stream only the last capable one".
I'm not sure... Maybe I would let the user choose.
On the other hand, if streaming isn't needed, the user could simply avoid providing a `streaming_callback`.
Curious to hear @sjrl's thoughts as well...
examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py
I'd be in favor of letting the user choose. E.g. I could imagine a user providing a list or set of component names where streaming should be enabled. Turning it on for all by default could cause problems in edge cases where two LLMs in two branches are running simultaneously (as could happen when using …
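To make the "list or set of component names" idea concrete, here is a minimal sketch of how such an option could be resolved. It is plain Python and does not use any Hayhooks or Haystack API: the function name `select_streaming_components`, the `"all"` / `"last"` keywords, and the component names are all hypothetical, and it assumes the caller already knows which components are streaming-capable, in execution order.

```python
from typing import Iterable, List, Union


def select_streaming_components(
    streaming_capable: List[str],
    selection: Union[str, Iterable[str]] = "last",
) -> List[str]:
    """Return the component names that should receive a streaming callback.

    `streaming_capable` is assumed to be ordered by pipeline execution order.
    `selection` is either "all", "last", or an explicit collection of names.
    """
    if selection == "all":
        return list(streaming_capable)
    if selection == "last":
        # Slicing keeps this safe when no component is streaming-capable.
        return streaming_capable[-1:]
    if isinstance(selection, str):
        # Guard against a bare string being treated as a collection of characters.
        raise ValueError(f"Unknown selection keyword: {selection!r}")

    requested = set(selection)
    unknown = requested - set(streaming_capable)
    if unknown:
        raise ValueError(f"Not streaming-capable or not in pipeline: {sorted(unknown)}")
    # Preserve pipeline order rather than the order the user listed the names in.
    return [name for name in streaming_capable if name in requested]
```

With something like this, "stream only the last capable one" stays available as `selection="last"`, while `select_streaming_components(["llm_1", "llm_2"], {"llm_1"})` would enable streaming for the first LLM only.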
Speaking with @tstadel, this usually doesn't happen thanks to XOR branches (so only one generator is streaming at a time). But assuming it does happen, we could get the source component from the streaming chunk and "fix" the output stream accordingly.
@anakin87 Probably a better solution would be:
This should cover all use cases. For YAML pipelines, the same streaming configuration may be read from … WDYT?
For simplicity, I would:
Ofc, your original idea would also work...
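On the earlier point about getting the source component from the streaming chunk, here is a rough sketch of what "fixing" the output stream could look like. The `Chunk` dataclass and its `component` field are stand-ins invented for this example, not the actual chunk type used by Hayhooks or Haystack. The sketch only assumes that each chunk can somehow be attributed to the component that produced it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for a streaming chunk tagged with its source."""

    component: str  # name of the component that produced this chunk (assumed field)
    content: str    # text carried by the chunk


def label_stream(chunks: Iterable[Chunk]) -> Iterator[str]:
    """Yield chunk text, emitting a header whenever the source component changes."""
    current: Optional[str] = None
    for chunk in chunks:
        if chunk.component != current:
            current = chunk.component
            yield f"\n[{current}]\n"
        yield chunk.content


if __name__ == "__main__":
    demo = [Chunk("llm_1", "Hel"), Chunk("llm_1", "lo"), Chunk("llm_2", "Hi")]
    print("".join(label_stream(demo)))
```

In the XOR-branch case this just adds one header per generator. If two generators really did stream at the same time, the headers would alternate with the interleaved chunks, so the output stays attributable even though it would be noisier.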
Should close #144 and #133.
Currently, Hayhooks supports streaming only from the last streaming-capable component.
With this PR, we want to support streaming from all streaming-capable components in a pipeline. This can enable support for some specific use cases.
Example where we have a pipeline with 2 LLM-based components:
Full example: https://github.com/deepset-ai/hayhooks/tree/multi-component-streaming/examples/pipeline_wrappers/multi_llm_streaming
Sample pipeline and `PipelineWrapper`: https://github.com/deepset-ai/hayhooks/blob/multi-component-streaming/examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py
NOTE: I was wondering whether to keep "streaming all components" as the default, or whether it's worth letting the user choose between "stream all capable components" and "stream only the last capable one".