Contributor

@mpangrazzi mpangrazzi commented Oct 17, 2025

Should close #144 and #133.

Currently, Hayhooks supports streaming only from the last streaming-capable component in a pipeline.

This PR adds support for streaming from all streaming-capable components in a pipeline, which enables some specific use cases.

Example where we have a pipeline with 2 LLM-based components:

[Image: multi_stream]

Full example: https://github.com/deepset-ai/hayhooks/tree/multi-component-streaming/examples/pipeline_wrappers/multi_llm_streaming

Sample pipeline and PipelineWrapper: https://github.com/deepset-ai/hayhooks/blob/multi-component-streaming/examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py

NOTE: I was wondering whether to keep "streaming all components" as the default, or whether it's worth letting the user choose between "stream all capable components" and "stream only the last capable one".

Member

@anakin87 anakin87 left a comment

> I was wondering whether to keep "streaming all components" as the default, or whether it's worth letting the user choose between "stream all capable components" and "stream only the last capable one".

I'm not sure... Maybe I would let the user choose.

On the other hand, if streaming isn't needed, the user could simply avoid providing a streaming_callback.

Curious to hear @sjrl's thoughts as well...

Contributor

sjrl commented Oct 17, 2025

> > I was wondering whether to keep "streaming all components" as the default, or whether it's worth letting the user choose between "stream all capable components" and "stream only the last capable one".
>
> I'm not sure... Maybe I would let the user choose.
>
> On the other hand, if streaming isn't needed, the user could simply avoid providing a streaming_callback.
>
> Curious to hear @sjrl's thoughts as well...

I'd be in favor of letting the user choose. For example, I could imagine a user providing a list or set of component names where streaming should be enabled.

Turning it on for all components by default could cause problems in edge cases where two LLMs in two branches run simultaneously (as could happen when using AsyncPipeline).
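To make the concern concrete, here is a minimal sketch of two concurrent branches writing to one shared callback. It uses plain asyncio rather than Haystack; `Chunk`, `fake_llm`, and `run_branches` are hypothetical names invented for this illustration, not Hayhooks or Haystack APIs.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a streaming chunk (not a Haystack class)."""
    component_name: str
    content: str


async def fake_llm(name: str, tokens: list[str], callback) -> None:
    """Emit tokens one at a time, yielding to the event loop between them."""
    for token in tokens:
        callback(Chunk(component_name=name, content=token))
        await asyncio.sleep(0)  # give the other branch a chance to run


async def run_branches() -> list[Chunk]:
    received: list[Chunk] = []
    # Both branches share one callback, as they would with a single output stream.
    await asyncio.gather(
        fake_llm("llm_a", ["Hello", " world"], received.append),
        fake_llm("llm_b", ["Bonjour", " monde"], received.append),
    )
    return received


chunks = asyncio.run(run_branches())
# Naively concatenating the shared stream mixes the two answers together.
merged = "".join(chunk.content for chunk in chunks)
```

In this toy setup the chunks arrive alternately from the two branches, so a client that simply concatenates them would see one garbled answer.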

@mpangrazzi
Contributor Author

@sjrl

> Turning it on for all components by default could cause problems in edge cases where two LLMs in two branches run simultaneously (as could happen when using AsyncPipeline).

According to @tstadel, this usually doesn't happen because of XOR branches (so only one generator streams at a time). But assuming it does happen, we could read the source component from each streaming chunk and "fix" the output stream accordingly.
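A minimal sketch of that "fix", assuming each chunk exposes the name of the component that emitted it. The `Chunk` dataclass and `group_by_component` helper are hypothetical and only illustrate the idea; they are not the code in this PR.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a streaming chunk (not a Haystack class)."""
    component_name: str  # assumption: the chunk identifies its source component
    content: str


def group_by_component(chunks: list[Chunk]) -> dict[str, str]:
    """Rebuild one coherent text per component from an interleaved stream."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for chunk in chunks:
        grouped[chunk.component_name].append(chunk.content)
    # Chunk order within each component is preserved, so joining restores its text.
    return {name: "".join(parts) for name, parts in grouped.items()}


interleaved = [
    Chunk("llm_a", "Hello"),
    Chunk("llm_b", "Bonjour"),
    Chunk("llm_a", " world"),
    Chunk("llm_b", " monde"),
]
per_component = group_by_component(interleaved)
```

A real implementation would more likely route chunks to separate output channels as they arrive rather than buffer them, but the routing key would be the same.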

@anakin87 Probably a better solution would be:

  • Stream only the last capable component by default
  • Accept a param to enable streaming on all capable components (assuming they stream serially)
  • Accept a param to exclude specific components from streaming (e.g. by name)

This should cover all use cases. For YAML pipelines, the same streaming configuration could be read from a streaming_config field (as with inputs / outputs).
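The three rules above could be sketched as a small selection function. The function name and the `stream_all` and `exclude` parameters are hypothetical, chosen only for illustration. The sketch also assumes that `capable` lists streaming-capable component names in execution order, and that "last capable" means the last one that has not been excluded.

```python
from typing import Iterable, Sequence


def select_streaming_components(
    capable: Sequence[str],
    stream_all: bool = False,
    exclude: Iterable[str] = (),
) -> list[str]:
    """Return the names of the components that should stream.

    `capable` is assumed to hold streaming-capable component names
    in pipeline execution order.
    """
    excluded = set(exclude)
    candidates = [name for name in capable if name not in excluded]
    if stream_all:
        return candidates
    # Default: only the last capable component that was not excluded.
    return candidates[-1:]


default_choice = select_streaming_components(["llm_1", "llm_2"])
all_choice = select_streaming_components(["llm_1", "llm_2"], stream_all=True)
filtered_choice = select_streaming_components(
    ["llm_1", "llm_2"], stream_all=True, exclude=["llm_2"]
)
```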

WDYT?

@anakin87
Member

For simplicity, I would:

  • Stream only the last capable component by default
  • Accept a streaming_config param (e.g. streaming_config={"component_a": True, "component_b": False})

Of course, your original idea would also work...
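A minimal sketch of how such a streaming_config mapping might be resolved, under stated assumptions: `resolve_streaming_config` is a hypothetical helper, components missing from an explicit config are treated as disabled, and unknown names are rejected. None of this is confirmed behavior of the PR.

```python
from typing import Mapping, Optional, Sequence


def resolve_streaming_config(
    capable: Sequence[str],
    streaming_config: Optional[Mapping[str, bool]] = None,
) -> dict[str, bool]:
    """Map every streaming-capable component name to an on/off flag."""
    if streaming_config is None:
        # Default: stream only the last capable component.
        last_index = len(capable) - 1
        return {name: index == last_index for index, name in enumerate(capable)}

    unknown = set(streaming_config) - set(capable)
    if unknown:
        raise ValueError(f"Not streaming-capable components: {sorted(unknown)}")

    # Assumption: components not mentioned in the config do not stream.
    return {name: bool(streaming_config.get(name, False)) for name in capable}


default_flags = resolve_streaming_config(["component_a", "component_b"])
explicit_flags = resolve_streaming_config(
    ["component_a", "component_b"],
    {"component_a": True, "component_b": False},
)
```

Compared with separate "stream all" and "exclude" params, a single mapping keeps the configuration in one place and maps naturally onto a YAML field.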
