diff --git a/docs/concepts/pipeline-wrapper.md b/docs/concepts/pipeline-wrapper.md
index 0372d85d..63280348 100644
--- a/docs/concepts/pipeline-wrapper.md
+++ b/docs/concepts/pipeline-wrapper.md
@@ -176,6 +176,69 @@ async def run_chat_completion_async(self, model: str, messages: List[dict], body
)
```
+## Streaming from Multiple Components
+
+!!! info "Automatic Multi-Component Streaming"
+ Hayhooks automatically enables streaming for **all** streaming-capable components in your pipeline - no special configuration needed!
+
+When your pipeline contains multiple components that support streaming (e.g., multiple LLMs), all of them stream their outputs automatically as the pipeline executes.
+
+### Example: Sequential LLMs with Streaming
+
+```python
+class MultiLLMWrapper(BasePipelineWrapper):
+ def setup(self) -> None:
+ from haystack.components.builders import ChatPromptBuilder
+ from haystack.components.generators.chat import OpenAIChatGenerator
+ from haystack.dataclasses import ChatMessage
+
+ self.pipeline = Pipeline()
+
+ # First LLM - initial answer
+ self.pipeline.add_component(
+ "prompt_1",
+ ChatPromptBuilder(
+ template=[
+ ChatMessage.from_system("You are a helpful assistant."),
+ ChatMessage.from_user("{{query}}")
+ ]
+ )
+ )
+ self.pipeline.add_component("llm_1", OpenAIChatGenerator(model="gpt-4o-mini"))
+
+ # Second LLM - refines the answer using Jinja2 to access ChatMessage attributes
+ self.pipeline.add_component(
+ "prompt_2",
+ ChatPromptBuilder(
+ template=[
+ ChatMessage.from_system("You are a helpful assistant that refines responses."),
+ ChatMessage.from_user(
+ "Previous response: {{previous_response[0].text}}\n\nRefine this."
+ )
+ ]
+ )
+ )
+ self.pipeline.add_component("llm_2", OpenAIChatGenerator(model="gpt-4o-mini"))
+
+ # Connect components - LLM 1's replies go directly to prompt_2
+ self.pipeline.connect("prompt_1.prompt", "llm_1.messages")
+ self.pipeline.connect("llm_1.replies", "prompt_2.previous_response")
+ self.pipeline.connect("prompt_2.prompt", "llm_2.messages")
+
+ def run_chat_completion(self, model: str, messages: List[dict], body: dict) -> Generator:
+ question = get_last_user_message(messages)
+
+ # Both LLMs will stream automatically!
+ return streaming_generator(
+ pipeline=self.pipeline,
+ pipeline_run_args={"prompt_1": {"template_variables": {"query": question}}}
+ )
+```
+
+**What happens:** Both LLMs **automatically stream** their responses token by token as the pipeline executes. The second prompt builder uses Jinja2 syntax (`{{previous_response[0].text}}`) to access the text of the first LLM's `ChatMessage` reply. **No custom extraction components are needed**: streaming works for any number of components.
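+
+If you need to tell the two streams apart - for example, to label each model's output - every `StreamingChunk` carries a `component_info` attribute naming the component that emitted it. Here is a minimal sketch of a `run_chat_completion` that labels each component's output the first time it streams; the nested `labelled_stream` helper is illustrative, not part of the Hayhooks API:
+
+```python
+    # requires: from haystack.dataclasses import StreamingChunk
+    def run_chat_completion(self, model: str, messages: List[dict], body: dict) -> Generator:
+        question = get_last_user_message(messages)
+
+        def labelled_stream():
+            seen = set()
+            for chunk in streaming_generator(
+                pipeline=self.pipeline,
+                pipeline_run_args={"prompt_1": {"template_variables": {"query": question}}},
+            ):
+                # component_info identifies the component that emitted this chunk
+                info = getattr(chunk, "component_info", None)
+                if info is not None and info.name not in seen:
+                    seen.add(info.name)
+                    # Label each component's output once (e.g. "llm_1", "llm_2")
+                    yield StreamingChunk(content=f"\n\n**[{info.name}]**\n\n")
+                yield chunk
+
+        return labelled_stream()
+```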
+
+See the [Multi-LLM Streaming Example](https://github.com/deepset-ai/hayhooks/tree/main/examples/pipeline_wrappers/multi_llm_streaming) for a complete working implementation.
+
## File Upload Support
Hayhooks can handle file uploads by adding a `files` parameter:
diff --git a/examples/README.md b/examples/README.md
index 2eafe761..c0921929 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -6,6 +6,7 @@ This directory contains various examples demonstrating different use cases and f
| Example | Description | Key Features | Use Case |
|---------|-------------|--------------|----------|
+| [multi_llm_streaming](./pipeline_wrappers/multi_llm_streaming/) | Multiple LLM components with automatic streaming | • Two sequential LLMs<br>• Automatic multi-component streaming<br>• No special configuration needed<br>• Shows default streaming behavior | Demonstrating how hayhooks automatically streams from all components in a pipeline |
| [async_question_answer](./pipeline_wrappers/async_question_answer/) | Async question-answering pipeline with streaming support | • Async pipeline execution<br>• Streaming responses<br>• OpenAI Chat Generator<br>• Both API and chat completion interfaces | Building conversational AI systems that need async processing and real-time streaming responses |
| [chat_with_website](./pipeline_wrappers/chat_with_website/) | Answer questions about website content | • Web content fetching<br>• HTML to document conversion<br>• Content-based Q&A<br>• Configurable URLs | Creating AI assistants that can answer questions about specific websites or web-based documentation |
| [chat_with_website_mcp](./pipeline_wrappers/chat_with_website_mcp/) | MCP-compatible website chat pipeline | • MCP (Model Context Protocol) support<br>• Website content analysis<br>• API-only interface<br>• Simplified deployment | Integrating website analysis capabilities into MCP-compatible AI systems and tools |
diff --git a/examples/pipeline_wrappers/multi_llm_streaming/README.md b/examples/pipeline_wrappers/multi_llm_streaming/README.md
new file mode 100644
index 00000000..a134cbbb
--- /dev/null
+++ b/examples/pipeline_wrappers/multi_llm_streaming/README.md
@@ -0,0 +1,98 @@
+# Multi-LLM Streaming Example
+
+This example demonstrates hayhooks' automatic multi-component streaming support.
+
+## Overview
+
+The pipeline contains **two LLM components in sequence**:
+
+1. **LLM 1** (`gpt-5-nano` with `reasoning_effort: low`): Provides a short, concise initial answer to the user's question
+2. **LLM 2** (`gpt-5-nano` with `reasoning_effort: medium`): Refines and expands the answer into a detailed, professional response
+
+Both LLMs automatically stream their responses - no special configuration needed!
+
+![Multi-LLM streaming demo](./multi_stream.gif)
+
+## How It Works
+
+Hayhooks automatically enables streaming for **all** streaming-capable components. Both LLMs stream their responses serially (one after another) without any special configuration.
+
+The pipeline connects LLM 1's replies directly to the second prompt builder. Because `ChatPromptBuilder` templates are Jinja2, the second prompt builder can access `ChatMessage` attributes right in the template: `{{previous_response[0].text}}`. No custom extraction components are needed.
+
+This example also demonstrates injecting a visual separator (`**[LLM 2 - Refining the response]**`) between the two LLM outputs using `StreamingChunk.component_info` to detect component transitions.
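+
+A condensed sketch of that detection logic (the full version lives in [`pipeline_wrapper.py`](./pipeline_wrapper.py); the `stream_with_separator` helper and its arguments are illustrative):
+
+```python
+from haystack.dataclasses import StreamingChunk
+from hayhooks import streaming_generator
+
+
+def stream_with_separator(pipeline, run_args):
+    """Yield every chunk, inserting a separator before LLM 2's first token."""
+    llm2_started = False
+    for chunk in streaming_generator(pipeline=pipeline, pipeline_run_args=run_args):
+        # component_info names the component that produced this chunk
+        info = getattr(chunk, "component_info", None)
+        if info and info.name == "llm_2" and not llm2_started:
+            llm2_started = True
+            yield StreamingChunk(content="\n\n**[LLM 2 - Refining the response]**\n\n")
+        yield chunk
+```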
+
+## Usage
+
+### Deploy with Hayhooks
+
+```bash
+# Set your OpenAI API key
+export OPENAI_API_KEY=your_api_key_here
+
+# Deploy the pipeline
+hayhooks deploy examples/pipeline_wrappers/multi_llm_streaming
+
+# Test it via OpenAI-compatible API
+curl -X POST http://localhost:1416/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "multi_llm_streaming",
+ "messages": [{"role": "user", "content": "What is machine learning?"}],
+ "stream": true
+ }'
+```
+
+### Use Directly in Code
+
+```python
+from haystack import Pipeline
+from haystack.components.builders import ChatPromptBuilder
+from haystack.dataclasses import ChatMessage
+from hayhooks import streaming_generator
+
+# Create your pipeline with multiple streaming components
+pipeline = Pipeline()
+# ... add LLM 1 and prompt_builder_1 ...
+
+# Add second prompt builder that accesses ChatMessage attributes via Jinja2
+pipeline.add_component(
+ "prompt_builder_2",
+ ChatPromptBuilder(
+ template=[
+ ChatMessage.from_system("You are a helpful assistant."),
+ ChatMessage.from_user("Previous: {{previous_response[0].text}}\n\nRefine this.")
+ ]
+ )
+)
+# ... add LLM 2 ...
+
+# Connect: LLM 1 replies directly to prompt_builder_2
+pipeline.connect("llm_1.replies", "prompt_builder_2.previous_response")
+
+# streaming_generator automatically streams from ALL components
+for chunk in streaming_generator(
+ pipeline=pipeline,
+ pipeline_run_args={"prompt_builder_1": {"template_variables": {"query": "Your question"}}}
+):
+ print(chunk.content, end="", flush=True)
+```
+
+## Integration with OpenWebUI
+
+This pipeline works seamlessly with OpenWebUI:
+
+1. Configure OpenWebUI to connect to hayhooks (see [OpenWebUI Integration docs](../../../docs/features/openwebui-integration.md))
+2. Deploy this pipeline
+3. Select it as a model in OpenWebUI
+4. Watch both LLMs stream their responses in real-time
+
+## Technical Details
+
+- **Pipeline Flow**: `Prompt Builder 1 → LLM 1 → Prompt Builder 2 → LLM 2`
+- **Jinja2 Templates**: `ChatPromptBuilder` uses Jinja2, allowing direct access to `ChatMessage` attributes in templates
+- **Template Variables**: LLM 1's `List[ChatMessage]` replies are passed directly as `previous_response` to the second prompt builder
+- **Accessing ChatMessage Content**: Use `{{previous_response[0].text}}` in templates to access the text content
+- **Streaming**: Serial execution with automatic callback management for all components
+- **Transition Detection**: Uses `StreamingChunk.component_info.name` to detect when LLM 2 starts
+- **Visual Separator**: Injects a `StreamingChunk` between LLM outputs
+- **Error Handling**: Stream terminates gracefully if any component fails
diff --git a/examples/pipeline_wrappers/multi_llm_streaming/multi_stream.gif b/examples/pipeline_wrappers/multi_llm_streaming/multi_stream.gif
new file mode 100644
index 00000000..673d1f6b
Binary files /dev/null and b/examples/pipeline_wrappers/multi_llm_streaming/multi_stream.gif differ
diff --git a/examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py b/examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py
new file mode 100644
index 00000000..d1abe1c4
--- /dev/null
+++ b/examples/pipeline_wrappers/multi_llm_streaming/pipeline_wrapper.py
@@ -0,0 +1,129 @@
+from collections.abc import Generator
+from typing import Any, List, Union # noqa: UP035
+
+from haystack import Pipeline
+from haystack.components.builders import ChatPromptBuilder
+from haystack.components.generators.chat import OpenAIChatGenerator
+from haystack.dataclasses import ChatMessage, StreamingChunk
+from haystack.utils import Secret
+
+from hayhooks import BasePipelineWrapper, get_last_user_message, streaming_generator
+
+
+class PipelineWrapper(BasePipelineWrapper):
+ """
+ A pipeline with two sequential LLM components that both stream.
+
+ The first LLM (low reasoning) provides a concise answer, and the second LLM
+ (medium reasoning) refines and expands it with more detail.
+ Both automatically stream their responses - this is the default behavior in hayhooks.
+ """
+
+ def setup(self) -> None:
+ """Initialize the pipeline with two streaming LLM components."""
+ self.pipeline = Pipeline()
+
+ # First stage: Initial answer
+ self.pipeline.add_component(
+ "prompt_builder_1",
+ ChatPromptBuilder(
+ template=[
+ ChatMessage.from_system(
+ "You are a helpful assistant. \nAnswer the user's question in a short and concise manner."
+ ),
+ ChatMessage.from_user("{{query}}"),
+ ]
+ ),
+ )
+ self.pipeline.add_component(
+ "llm_1",
+ OpenAIChatGenerator(
+ api_key=Secret.from_env_var("OPENAI_API_KEY"),
+ model="gpt-5-nano",
+ generation_kwargs={
+ "reasoning_effort": "low",
+ },
+ ),
+ )
+
+ # Second stage: Refinement
+ # The prompt builder can directly access ChatMessage attributes via Jinja2
+ self.pipeline.add_component(
+ "prompt_builder_2",
+ ChatPromptBuilder(
+ template=[
+ ChatMessage.from_system("You are a helpful assistant that refines and improves responses."),
+ ChatMessage.from_user(
+ "Here is the previous response:\n\n{{previous_response[0].text}}\n\n"
+ "Please refine and improve this response. "
+ "Make it a bit more detailed, clear, and professional. "
+ "Please state that you're refining the response in the beginning of your answer."
+ ),
+ ]
+ ),
+ )
+ self.pipeline.add_component(
+ "llm_2",
+ OpenAIChatGenerator(
+ api_key=Secret.from_env_var("OPENAI_API_KEY"),
+ model="gpt-5-nano",
+ generation_kwargs={
+ "reasoning_effort": "medium",
+ },
+ ),
+ )
+
+ # Connect the components
+ self.pipeline.connect("prompt_builder_1.prompt", "llm_1.messages")
+ self.pipeline.connect("llm_1.replies", "prompt_builder_2.previous_response")
+ self.pipeline.connect("prompt_builder_2.prompt", "llm_2.messages")
+
+ def run_api(self, query: str) -> dict[str, Any]:
+ """Run the pipeline in non-streaming mode."""
+ result = self.pipeline.run(
+ {
+ "prompt_builder_1": {"template_variables": {"query": query}},
+ }
+ )
+ return {"reply": result["llm_2"]["replies"][0].text if result["llm_2"]["replies"] else ""}
+
+ def run_chat_completion(self, model: str, messages: List[dict], body: dict) -> Union[str, Generator]: # noqa: ARG002, UP006
+ """
+ Run the pipeline in streaming mode.
+
+ Both LLMs will automatically stream their responses thanks to
+ hayhooks' built-in multi-component streaming support.
+
+ We inject a visual separator between LLM 1 and LLM 2 outputs.
+ """
+ question = get_last_user_message(messages)
+
+ def custom_streaming():
+ """
+ Enhanced streaming that injects a visual separator between LLM outputs.
+
+ Uses StreamingChunk.component_info.name to reliably detect which component
+ is streaming, avoiding fragile chunk counting or heuristics.
+
+            NOTE: This wrapper exists only to inject a visual separator between the two LLM outputs.
+ """
+ llm2_started = False
+
+ for chunk in streaming_generator(
+ pipeline=self.pipeline,
+ pipeline_run_args={
+ "prompt_builder_1": {"template_variables": {"query": question}},
+ },
+ ):
+ # Use component_info to detect which LLM is streaming
+ if hasattr(chunk, "component_info") and chunk.component_info:
+ component_name = chunk.component_info.name
+
+ # When we see llm_2 for the first time, inject a visual separator
+ if component_name == "llm_2" and not llm2_started:
+ llm2_started = True
+ yield StreamingChunk(content="\n\n**[LLM 2 - Refining the response]**\n\n")
+
+ yield chunk
+
+ return custom_streaming()
diff --git a/src/hayhooks/server/pipelines/utils.py b/src/hayhooks/server/pipelines/utils.py
index 9d84792a..06c7c78a 100644
--- a/src/hayhooks/server/pipelines/utils.py
+++ b/src/hayhooks/server/pipelines/utils.py
@@ -40,33 +40,32 @@ def get_last_user_message(messages: list[Union[Message, dict]]) -> Union[str, No
return None
-def find_streaming_component(pipeline: Union[Pipeline, AsyncPipeline]) -> tuple[Component, str]:
+def find_all_streaming_components(pipeline: Union[Pipeline, AsyncPipeline]) -> list[tuple[Component, str]]:
"""
- Finds the component in the pipeline that supports streaming_callback
+ Finds all components in the pipeline that support streaming_callback.
Returns:
- The first component that supports streaming
+ A list of tuples containing (component, component_name) for all streaming components
"""
- streaming_component = None
- streaming_component_name = ""
+ streaming_components = []
for name, component in pipeline.walk():
if hasattr(component, "streaming_callback"):
log.trace(f"Streaming component found in '{name}' with type {type(component)}")
- streaming_component = component
- streaming_component_name = name
- if not streaming_component:
- msg = "No streaming-capable component found in the pipeline"
+ streaming_components.append((component, name))
+
+ if not streaming_components:
+ msg = "No streaming-capable components found in the pipeline"
raise ValueError(msg)
- return streaming_component, streaming_component_name
+ return streaming_components
def _setup_streaming_callback_for_pipeline(
pipeline: Union[Pipeline, AsyncPipeline], pipeline_run_args: dict[str, Any], streaming_callback: Any
) -> dict[str, Any]:
"""
- Sets up streaming callback for pipeline components.
+ Sets up streaming callbacks for all streaming-capable components in the pipeline.
Args:
pipeline: The pipeline to configure
@@ -76,16 +75,17 @@ def _setup_streaming_callback_for_pipeline(
Returns:
Updated pipeline run arguments
"""
- _, streaming_component_name = find_streaming_component(pipeline)
+ streaming_components = find_all_streaming_components(pipeline)
- # Ensure component args exist in pipeline run args
- if streaming_component_name not in pipeline_run_args:
- pipeline_run_args[streaming_component_name] = {}
+ for _, component_name in streaming_components:
+ # Ensure component args exist in pipeline run args
+ if component_name not in pipeline_run_args:
+ pipeline_run_args[component_name] = {}
- # Set the streaming callback on the component
- streaming_component = pipeline.get_component(streaming_component_name)
- assert hasattr(streaming_component, "streaming_callback")
- streaming_component.streaming_callback = streaming_callback
+ # Set the streaming callback on the component
+ streaming_component = pipeline.get_component(component_name)
+ assert hasattr(streaming_component, "streaming_callback")
+ streaming_component.streaming_callback = streaming_callback
return pipeline_run_args
@@ -157,7 +157,8 @@ def streaming_generator( # noqa: C901, PLR0912
"""
Creates a generator that yields streaming chunks from a pipeline or agent execution.
- Automatically finds the streaming-capable component in pipelines or uses the agent's streaming callback.
+ Automatically finds all streaming-capable components in pipelines and sets up streaming for all of them.
+ For agents, uses the agent's streaming callback.
Args:
pipeline: The Pipeline, AsyncPipeline, or Agent to execute
@@ -171,8 +172,9 @@ def streaming_generator( # noqa: C901, PLR0912
OpenWebUIEvent: Event for tool call
str: Tool name or stream content
- NOTE: This generator works with sync/async pipelines and agents, but pipeline components
- which support streaming must have a _sync_ `streaming_callback`.
+ NOTE: This generator works with sync/async pipelines and agents. Pipeline components
+ which support streaming must have a _sync_ `streaming_callback`. All streaming-capable
+ components in the pipeline will stream their outputs serially as the pipeline executes.
"""
if pipeline_run_args is None:
pipeline_run_args = {}
@@ -247,26 +249,27 @@ def run_pipeline() -> None:
def _validate_async_streaming_support(pipeline: Union[Pipeline, AsyncPipeline]) -> None:
"""
- Validates that the pipeline supports async streaming callbacks.
+ Validates that all streaming components in the pipeline support async streaming callbacks.
Args:
pipeline: The pipeline to validate
Raises:
- ValueError: If the pipeline doesn't support async streaming
+ ValueError: If any streaming component doesn't support async streaming
"""
- streaming_component, streaming_component_name = find_streaming_component(pipeline)
-
- # Check if the streaming component supports async streaming callbacks
- # We check for run_async method as an indicator of async support
- if not hasattr(streaming_component, "run_async"):
- component_type = type(streaming_component).__name__
- msg = (
- f"Component '{streaming_component_name}' of type '{component_type}' seems to not support async streaming "
- "callbacks. Use the sync 'streaming_generator' function instead, or switch to a component that supports "
- "async streaming callbacks (e.g., OpenAIChatGenerator instead of OpenAIGenerator)."
- )
- raise ValueError(msg)
+ streaming_components = find_all_streaming_components(pipeline)
+
+ for streaming_component, streaming_component_name in streaming_components:
+ # Check if the streaming component supports async streaming callbacks
+ # We check for run_async method as an indicator of async support
+ if not hasattr(streaming_component, "run_async"):
+ component_type = type(streaming_component).__name__
+ msg = (
+ f"Component '{streaming_component_name}' of type '{component_type}' seems to not support async "
+ "streaming callbacks. Use the sync 'streaming_generator' function instead, or switch to a component "
+ "that supports async streaming callbacks (e.g., OpenAIChatGenerator instead of OpenAIGenerator)."
+ )
+ raise ValueError(msg)
async def _execute_pipeline_async(
@@ -352,7 +355,8 @@ async def async_streaming_generator( # noqa: C901, PLR0912
"""
Creates an async generator that yields streaming chunks from a pipeline or agent execution.
- Automatically finds the streaming-capable component in pipelines or uses the agent's streaming callback.
+ Automatically finds all streaming-capable components in pipelines and sets up streaming for all of them.
+ For agents, uses the agent's streaming callback.
Args:
pipeline: The Pipeline, AsyncPipeline, or Agent to execute
@@ -366,8 +370,9 @@ async def async_streaming_generator( # noqa: C901, PLR0912
OpenWebUIEvent: Event for tool call
str: Tool name or stream content
- NOTE: This generator works with sync/async pipelines and agents. For pipelines, the streaming component
+ NOTE: This generator works with sync/async pipelines and agents. For pipelines, the streaming components
must support an _async_ `streaming_callback`. Agents have built-in async streaming support.
+ All streaming-capable components in the pipeline will stream their outputs serially as the pipeline executes.
"""
# Validate async streaming support for pipelines (not needed for agents)
if pipeline_run_args is None:
diff --git a/tests/test_it_pipeline_utils.py b/tests/test_it_pipeline_utils.py
index 6f238044..91c00667 100644
--- a/tests/test_it_pipeline_utils.py
+++ b/tests/test_it_pipeline_utils.py
@@ -14,7 +14,11 @@
from hayhooks import callbacks
from hayhooks.open_webui import OpenWebUIEvent, create_notification_event
-from hayhooks.server.pipelines.utils import async_streaming_generator, find_streaming_component, streaming_generator
+from hayhooks.server.pipelines.utils import (
+ async_streaming_generator,
+ find_all_streaming_components,
+ streaming_generator,
+)
QUESTION = "Is Haystack a framework for developing AI applications? Answer Yes or No"
@@ -141,33 +145,10 @@ def mocked_pipeline_with_streaming_component(mocker):
return streaming_component, pipeline
-def test_find_streaming_component_no_streaming_component():
- pipeline = Pipeline()
-
- with pytest.raises(ValueError, match="No streaming-capable component found in the pipeline"):
- find_streaming_component(pipeline)
-
-
-def test_find_streaming_component_finds_streaming_component(mocker):
- streaming_component = MockComponent(has_streaming=True)
- non_streaming_component = MockComponent(has_streaming=False)
-
- pipeline = mocker.Mock(spec=Pipeline)
- pipeline.walk.return_value = [
- ("component1", non_streaming_component),
- ("streaming_component", streaming_component),
- ("component2", non_streaming_component),
- ]
-
- component, name = find_streaming_component(pipeline)
- assert component == streaming_component
- assert name == "streaming_component"
-
-
def test_streaming_generator_no_streaming_component():
pipeline = Pipeline()
- with pytest.raises(ValueError, match="No streaming-capable component found in the pipeline"):
+ with pytest.raises(ValueError, match="No streaming-capable components found in the pipeline"):
list(streaming_generator(pipeline))
@@ -219,7 +200,7 @@ def test_streaming_generator_empty_output(mocked_pipeline_with_streaming_compone
async def test_async_streaming_generator_no_streaming_component():
pipeline = Pipeline()
- with pytest.raises(ValueError, match="No streaming-capable component found in the pipeline"):
+ with pytest.raises(ValueError, match="No streaming-capable components found in the pipeline"):
_ = [chunk async for chunk in async_streaming_generator(pipeline)]
@@ -961,3 +942,114 @@ def custom_on_pipeline_end(output):
logger.add(lambda msg: messages.append(msg), level="ERROR")
_ = [chunk async for chunk in generator]
assert "Callback error" in messages[0]
+
+
+def test_find_all_streaming_components_finds_multiple(mocker):
+ streaming_component1 = MockComponent(has_streaming=True)
+ streaming_component2 = MockComponent(has_streaming=True)
+ non_streaming_component = MockComponent(has_streaming=False)
+
+ pipeline = mocker.Mock(spec=Pipeline)
+ pipeline.walk.return_value = [
+ ("component1", streaming_component1),
+ ("non_streaming", non_streaming_component),
+ ("component2", streaming_component2),
+ ]
+
+ components = find_all_streaming_components(pipeline)
+ assert len(components) == 2
+ assert components[0] == (streaming_component1, "component1")
+ assert components[1] == (streaming_component2, "component2")
+
+
+def test_find_all_streaming_components_raises_when_none_found():
+ pipeline = Pipeline()
+
+ with pytest.raises(ValueError, match="No streaming-capable components found in the pipeline"):
+ find_all_streaming_components(pipeline)
+
+
+@pytest.fixture
+def pipeline_with_multiple_streaming_components(mocker):
+ streaming_component1 = MockComponent(has_streaming=True)
+ streaming_component2 = MockComponent(has_streaming=True)
+ non_streaming_component = MockComponent(has_streaming=False)
+
+ pipeline = mocker.Mock(spec=AsyncPipeline)
+ pipeline._spec_class = AsyncPipeline
+ pipeline.walk.return_value = [
+ ("component1", streaming_component1),
+ ("non_streaming", non_streaming_component),
+ ("component2", streaming_component2),
+ ]
+
+ def mock_get_component(name):
+ if name == "component1":
+ return streaming_component1
+ elif name == "component2":
+ return streaming_component2
+ return non_streaming_component
+
+ pipeline.get_component.side_effect = mock_get_component
+
+ return streaming_component1, streaming_component2, pipeline
+
+
+def test_streaming_generator_with_multiple_components(pipeline_with_multiple_streaming_components):
+ streaming_component1, streaming_component2, pipeline = pipeline_with_multiple_streaming_components
+
+ mock_chunks = [
+ StreamingChunk(content="chunk1_from_component1"),
+ StreamingChunk(content="chunk2_from_component1"),
+ StreamingChunk(content="chunk1_from_component2"),
+ StreamingChunk(content="chunk2_from_component2"),
+ ]
+
+ def mock_run(data):
+ # Simulate both components streaming
+ if streaming_component1.streaming_callback:
+ streaming_component1.streaming_callback(mock_chunks[0])
+ streaming_component1.streaming_callback(mock_chunks[1])
+ if streaming_component2.streaming_callback:
+ streaming_component2.streaming_callback(mock_chunks[2])
+ streaming_component2.streaming_callback(mock_chunks[3])
+
+ pipeline.run.side_effect = mock_run
+
+ generator = streaming_generator(pipeline)
+ chunks = list(generator)
+
+ assert chunks == mock_chunks
+ # Verify both components had their callbacks set
+ assert streaming_component1.streaming_callback is not None
+ assert streaming_component2.streaming_callback is not None
+
+
+@pytest.mark.asyncio
+async def test_async_streaming_generator_with_multiple_components(mocker, pipeline_with_multiple_streaming_components):
+ streaming_component1, streaming_component2, pipeline = pipeline_with_multiple_streaming_components
+
+ mock_chunks = [
+ StreamingChunk(content="async_chunk1_from_component1"),
+ StreamingChunk(content="async_chunk2_from_component1"),
+ StreamingChunk(content="async_chunk1_from_component2"),
+ StreamingChunk(content="async_chunk2_from_component2"),
+ ]
+
+ async def mock_run_async(data):
+ # Simulate both components streaming
+ if streaming_component1.streaming_callback:
+ await streaming_component1.streaming_callback(mock_chunks[0])
+ await streaming_component1.streaming_callback(mock_chunks[1])
+ if streaming_component2.streaming_callback:
+ await streaming_component2.streaming_callback(mock_chunks[2])
+ await streaming_component2.streaming_callback(mock_chunks[3])
+
+ pipeline.run_async = mocker.AsyncMock(side_effect=mock_run_async)
+
+ chunks = [chunk async for chunk in async_streaming_generator(pipeline)]
+
+ assert chunks == mock_chunks
+ # Verify both components had their callbacks set
+ assert streaming_component1.streaming_callback is not None
+ assert streaming_component2.streaming_callback is not None