Enhance YAML pipeline deployments using `inputs` / `outputs` fields #161

mpangrazzi · 2025-09-02T12:02:01Z

Fixes #156.

Note: initially I wanted to remove the old YAML logic in another PR, but that would have ended up being quite confusing. Better to remove it now and update the README accordingly.

mpangrazzi · 2025-09-02T14:29:36Z

I'll put a question here as a reminder. Consider this situation (coming from a complex Haystack pipeline):

inputs:
  query:
  - bm25_retriever.query
  - query_embedder.text
  - ConditionalRouter.question
  filters:
  - bm25_retriever.filters
  - embedding_retriever.filters

Is it always safe to assume that bm25_retriever.query, query_embedder.text and ConditionalRouter.question will have the same input type? (The same question applies to filters.) I assume yes of course 😉
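To make the fan-out in this mapping concrete, here is a minimal sketch of how one declared input could be routed to every `component.socket` target it lists. The helper name `expand_inputs` and the shape of the returned dict are assumptions for illustration, not the code in this PR.

```python
from typing import Any


def expand_inputs(
    mapping: dict[str, list[str]], values: dict[str, Any]
) -> dict[str, dict[str, Any]]:
    """Fan each declared input out to every `component.socket` target it lists."""
    run_args: dict[str, dict[str, Any]] = {}
    for name, targets in mapping.items():
        if name not in values:
            continue  # input not supplied by the caller; leave its targets unset
        for target in targets:
            # Split only on the first dot: "bm25_retriever.query" -> component, socket
            component, socket = target.split(".", 1)
            run_args.setdefault(component, {})[socket] = values[name]
    return run_args


# Same mapping as in the YAML snippet above
mapping = {
    "query": ["bm25_retriever.query", "query_embedder.text", "ConditionalRouter.question"],
    "filters": ["bm25_retriever.filters", "embedding_retriever.filters"],
}
run_args = expand_inputs(mapping, {"query": "What is Hayhooks?"})
```

Because a single value is copied to all targets, the scheme only works if those targets accept the same type, which is what the question is about.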

sjrl · 2025-09-09T09:13:02Z

@mpangrazzi yes I'd say based on the provided mapping here we can assume they will all have the same input type. Technically you could inspect it but that would require creating the pipeline first.
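If one did want to verify the assumption once the pipeline has been created, the check could look roughly like the sketch below. `socket_types` is a hypothetical dict of already-resolved input types (component name, then socket name, then type); how it gets populated from a loaded pipeline is left open here.

```python
def inconsistent_inputs(mapping, socket_types):
    """Return the declared inputs whose targets do not all share one type."""
    problems = {}
    for name, targets in mapping.items():
        types = set()
        for target in targets:
            component, socket = target.split(".", 1)
            types.add(socket_types[component][socket])
        if len(types) > 1:
            problems[name] = types
    return problems


mapping = {"query": ["bm25_retriever.query", "query_embedder.text"]}

# Hypothetical resolved types: both targets take a string
matching = {"bm25_retriever": {"query": str}, "query_embedder": {"text": str}}
# Hypothetical mismatch, for contrast
mismatched = {"bm25_retriever": {"query": str}, "query_embedder": {"text": list}}

ok = inconsistent_inputs(mapping, matching)
bad = inconsistent_inputs(mapping, mismatched)
```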

anakin87

I left a few comments.

Some general notes:

I would improve the PR title (for release notes) to make it clear that the change is breaking and that we are introducing a non-backward compatible way to deploy YAML pipelines.
Related to the previous point: are you considering releasing a major version?

anakin87 · 2025-09-16T08:36:05Z

README.md

@@ -162,8 +163,9 @@ CLI commands are basically wrappers around the HTTP API of the server. The full
 hayhooks run     # Start the server
 hayhooks status  # Check the status of the server and show deployed pipelines

-hayhooks pipeline deploy-files <path_to_dir>   # Deploy a pipeline using PipelineWrapper
-hayhooks pipeline deploy <pipeline_name>       # Deploy a pipeline from a YAML file
+hayhooks pipeline deploy-yaml <path_to_yaml>   # Deploy a pipeline from a YAML file (preferred)


Why is this option preferred?

It seems a convenient option if you are used to working with YAML, but is less flexible (no OpenAI/OpenWebUI compatibility), so I would not indicate it as preferred. But maybe I am missing something. WDYT?

Also had the same question :)

That's probably a typo, sorry. deploy-files should obviously be the preferred one, exactly because of what you said ;)

There are still some mentions of "preferred" here and there

anakin87 · 2025-09-16T08:41:30Z

README.md

+tools = await client.list_tools()
+# Find YAML tool by name, e.g., "calc" (the pipeline name)
+result = await client.call_tool("calc", {"value": 3})
+assert result.content[0].text == '{"double": {"value": 10}}'


I can only understand this example by looking at the calc pipeline code.
Perhaps we can use something easier to grasp

What do you have in mind? Or maybe we can reference the calc pipeline better?

Something like

tools = await client.list_tools()
# Find YAML tool by name, e.g., "multiply" (the pipeline name)
result = await client.call_tool("multiply", {"x": 3, "y": 4})
assert result.content[0].text == '{"product": 12}'

I just find it hard to understand the example without knowing the Pipeline.

anakin87 · 2025-09-16T08:53:27Z

src/hayhooks/cli/pipeline.py

@@ -91,6 +119,10 @@ def deploy_files(
    _deploy_with_progress(ctx=ctx, name=name, endpoint="deploy_files", payload=payload)


+# Register alias: `deploy` -> `deploy-files`
+pipeline.command(name="deploy")(deploy_files)


Just two thoughts:

this reinforces the idea expressed above, that the preferred option is using pipeline wrappers

even if the overall impact of this PR is highly breaking, I find it confusing that now deploy is an alias for deploy-files while previously it was used to deploy YAML pipelines in the old way. For me, it would be clearer to remove deploy altogether at the moment.

WDYT?

Yeah, maybe we can remove the deploy command completely, since this PR already contains breaking changes. Or maybe add a message telling users to use deploy-files or deploy-yaml?

Makes sense. Let's add an error message explaining that to deploy YAML pipelines you now need to use deploy-yaml with a new YAML structure.
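A minimal sketch of that behaviour, written with the standard library's `argparse` rather than the CLI framework the project actually uses, so that it runs on its own. The program name, the message wording and the `main` helper are assumptions for illustration only.

```python
import argparse

# Hypothetical wording; the real message is up to the PR author.
REMOVED_DEPLOY_MESSAGE = (
    "'deploy' has been removed. Use 'deploy-files' for PipelineWrapper-based "
    "pipelines, or 'deploy-yaml' for YAML pipelines that declare inputs/outputs."
)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pipeline-cli-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("deploy-files").add_argument("path")
    sub.add_parser("deploy-yaml").add_argument("path")
    # The old command is still parsed, but only so we can explain the migration.
    sub.add_parser("deploy").add_argument("args", nargs="*")
    return parser


def main(argv: list[str]) -> str:
    parser = build_parser()
    ns = parser.parse_args(argv)
    if ns.command == "deploy":
        # parser.error prints the message to stderr and exits with status 2
        parser.error(REMOVED_DEPLOY_MESSAGE)
    return ns.command
```

Failing loudly here avoids the situation described above, where `deploy` silently changes meaning between releases.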

src/hayhooks/server/utils/yaml_utils.py

tests/test_deploy_at_startup.py

tests/test_files/yaml/inputs_outputs_pipeline.yml

anakin87 · 2025-09-16T09:34:32Z

tests/test_files/yaml/working_pipelines/minimal_retriever.yml

@@ -1,79 +0,0 @@
-components:


Can you confirm that most of these removed files are not used in tests?
(My impression is that they were present for manual tests before proper tests were put in place.)

Yes, I confirm! Those files were used when we were iterating on improving the type handling of component inputs in the old YAML logic (i.e. without inputs/outputs fields).

tests/test_it_deploy.py

src/hayhooks/server/pipelines/models.py

sjrl · 2025-09-16T12:27:12Z

src/hayhooks/server/utils/deploy_utils.py

+        msg = f"Failed to save YAML pipeline file: {e!s}"
+        raise PipelineFilesError(msg) from e


Would it be worth including the pipeline_name as part of the error message?
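For illustration, a sketch of what the suggested message could look like. `PipelineFilesError` below is a local stand-in for the project's exception, and the `save_yaml` helper with its signature is hypothetical.

```python
from pathlib import Path


class PipelineFilesError(Exception):
    """Local stand-in for the project's exception of the same name."""


def save_yaml(pipeline_name: str, source_code: str, pipelines_dir: Path) -> Path:
    """Write the YAML source to disk, naming the pipeline in any error raised."""
    path = pipelines_dir / f"{pipeline_name}.yml"
    try:
        path.write_text(source_code)
    except OSError as e:
        # Including the name tells the caller which deployment failed
        msg = f"Failed to save YAML pipeline file for '{pipeline_name}': {e!s}"
        raise PipelineFilesError(msg) from e
    return path
```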

sjrl · 2025-09-16T12:30:26Z

src/hayhooks/server/utils/deploy_utils.py

+    # Ensure the registered object is a Haystack Pipeline, not a wrapper
+    if not isinstance(pipeline_instance, AsyncPipeline):
+        msg = f"Pipeline '{pipeline_name}' is not a Haystack AsyncPipeline instance"
+        raise PipelineYamlError(msg)


The dev comment doesn't quite line up with the actual check, which is that it must be an AsyncPipeline and not a normal Pipeline?

Does this also mean that deploying with YAML only works with AsyncPipeline?
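One possible way to make the comment match the check is sketched below. The three classes are local stand-ins so the snippet runs without Haystack installed; it is assumed here that `AsyncPipeline` does not subclass `Pipeline`, and `ensure_async_pipeline` is a hypothetical helper name.

```python
class Pipeline:
    """Stand-in for the synchronous Haystack pipeline class."""


class AsyncPipeline:
    """Stand-in for the async Haystack pipeline class (assumed unrelated to Pipeline)."""


class PipelineYamlError(Exception):
    """Stand-in for the project's exception of the same name."""


def ensure_async_pipeline(pipeline_name: str, pipeline_instance: object) -> AsyncPipeline:
    # Only an AsyncPipeline is accepted here: a synchronous Pipeline or a
    # wrapper object is rejected, since YAML pipelines are run asynchronously.
    if not isinstance(pipeline_instance, AsyncPipeline):
        msg = f"Pipeline '{pipeline_name}' is not a Haystack AsyncPipeline instance"
        raise PipelineYamlError(msg)
    return pipeline_instance
```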

sjrl · 2025-09-16T12:35:41Z

src/hayhooks/server/utils/deploy_utils.py

+        clog.error(f"Failed creating request/response models for YAML pipeline: {e!s}")
+        raise


Here as well, should we include pipeline_name in the error message?

sjrl · 2025-09-16T12:37:19Z

src/hayhooks/server/utils/deploy_utils.py

+    # NOTE: We want to create an AsyncPipeline here so we can avoid using
+    #       run_in_threadpool when running the pipeline.


I think we should make it clearer in the docstrings that only AsyncPipeline is supported when deploying with YAML. E.g. use "Haystack AsyncPipeline.run_async" instead of "Haystack Pipeline.run".

sjrl · 2025-09-16T12:39:07Z

README.md

+Limitations:
+
+- YAML-deployed pipelines do not support OpenAI-compatible chat completion endpoints, so they cannot be used with Open WebUI. If you need chat completion/streaming, use a `PipelineWrapper` and implement `run_chat_completion` or `run_chat_completion_async` (see the OpenAI compatibility section below).


I might also add here that YAML-deployed pipelines only work with AsyncPipeline.

resolve inputs and outputs from a Haystack pipeline YAML definition

3efcda8

mpangrazzi self-assigned this Sep 2, 2025

mpangrazzi added 2 commits September 2, 2025 15:48

Add method to add YAML pipeline to registry

a855065

Better types handling

d2ad833

sjrl self-requested a review September 9, 2025 07:45

mpangrazzi added 18 commits September 10, 2025 21:53

Add deploy_pipeline_yaml ; Add route for YAML pipeline ; Add /deploy-yaml ; refactoring

d55195f

Fix types

28d6ca7

Fix lint

bf52b0d

Fix for python 3.9

787bd37

Fix for last ruff version

28d0840

Fix tests

1cf600f

Add route for YAML if app is present

5b7fe4b

Introduced InvalidYamlIOError 422 error for YAML deployments with missing inputs/outputs

fd9c44d

Add deploy-yaml CLI command

924bd3a

Ensure inputs / outputs YAML pipelines are deployed at startup (so not using old YAML deploy logic)

eee69a3

Skip tests with old YAML logic

0176170

Cleanup of old YAML handling logic to avoid confusion

9aadb12

Remove old CLI deploy command

48e7428

Add CLI alias: deploy -> deploy-files

487cb63

Update README

b1af4d0

Use AsyncPipeline when loading YAML pipelines (to avoid using run_in_threadpool when running it)

6ca670e

Fix lint

c94d0ef

Add a section for loading pipelines or agents at startup

4446dd8

mpangrazzi requested a review from anakin87 September 15, 2025 14:38

mpangrazzi added 2 commits September 15, 2025 16:55

Enable YAML pipelines as MCP tools

d9420c2

Update README

1593afd

mpangrazzi marked this pull request as ready for review September 16, 2025 07:44

anakin87 reviewed Sep 16, 2025
