Disable reflection after tool use #1024

Closed
WorldInnovationsDepartment opened this issue Mar 1, 2025 · 8 comments

Comments

@WorldInnovationsDepartment
WorldInnovationsDepartment commented Mar 1, 2025

Description

Hello, Pydantic AI Team 👋

I’d like to propose a feature that allows for the creation of highly specialized agents that only execute tools, without streaming textual output from the LLM (i.e., disabling reflection after tool use).

Use Case:

Consider a Search Agent that solely runs tool calls, passing their outputs to an Analyst Agent for processing. The Analyst Agent then collaborates with a Critique Agent to construct the final response. This setup enables modular and efficient agent interactions where streaming text from intermediate agents is unnecessary.

Current Limitation:

From my understanding, the framework does not currently support an agent that exclusively executes tools without attempting to stream a final response. I attempted to implement a workaround using .iter() and breaking the stream when an answer starts forming, but this approach was unsuccessful.

Questions & Contribution:

  1. Does this feature align with your roadmap or design philosophy?
  2. Is there a way to implement this behavior using an existing but undocumented approach?
  3. If this makes sense to you, I’d be happy to contribute a PR! Could you provide guidance on how best to implement it?

Looking forward to your thoughts! 🚀

References

I like how Autogen handles this: their AssistantAgent (similar to Agent in pydantic-ai) has a reflect_on_tool_use param, and when it is False, the FinalResult is just the tool outputs.
See their docs for more details, particularly the last block in the diagram: https://microsoft.github.io/autogen/dev/reference/python/autogen_agentchat.agents.html#autogen_agentchat.agents.AssistantAgent
[Image: AssistantAgent flow diagram]

@WorldInnovationsDepartment

@dmontagu @Kludex, can we schedule a call, or could you give me some tips for implementing this so I can create a PR? It’s a pretty useful feature that keeps coming up in different requests.

@WorldInnovationsDepartment

@Kludex @dmontagu @samuelcolvin
I was able to implement reflect_on_tool_use=False with PydAI.
I specified a structured output with the following format:

```python
from pydantic import BaseModel, Field

class ResearchResponse(BaseModel):
    is_final: bool = Field(...)
```

This forces PydAI to call the final_result tool to return a structured output.
I created a script that asks 20 questions, gathers latency statistics, and did the same for the Researcher (reflect_on_tool_use=False) + Analyst agent (no tools) in Autogen. Now, I have some latency stats.

The first screenshot shows PydAI, and the second one is Autogen. The difference is significant. As I understand, there is no way in PydAI to avoid calling final_result when generating text.

PydAI: [screenshot of latency stats]
Autogen: [screenshot of latency stats]

@Finndersen

Is this issue similar/related? #142

@WorldInnovationsDepartment

WorldInnovationsDepartment commented Mar 4, 2025

@Finndersen
I have created PR #1040 for my issue. It is similar, but as I understand it, it will not be fixed by #142.

My approach is interrupting the agent run once all tools have completed execution.

Example:
A Search Agent runs five tool calls to gather information for a task. If I am forced to call an unnecessary tool just to avoid triggering text generation, I will waste time on that extra call. My approach is to terminate the agent’s run when it no longer needs to invoke any tool calls.

This enables the creation of agents that work exclusively with tools, eliminating unnecessary tool calls and reducing latency—something currently missing.

cc @Kludex @dmontagu
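To make the proposed control flow concrete, here is a minimal sketch in plain Python. This is not pydantic-ai's actual API; the agent loop, ModelResponse class, and reflect_on_tool_use flag are all assumptions modeled on the Autogen behavior described above. With reflect_on_tool_use=False, the run ends as soon as the pending tool calls finish, and their outputs become the final result, skipping the extra model turn.

```python
from dataclasses import dataclass, field

@dataclass
class ModelResponse:
    """Stand-in for an LLM response: free text plus requested tool calls."""
    text: str = ""
    tool_calls: list = field(default_factory=list)  # [(tool_name, kwargs), ...]

def run_agent(model, tools, prompt, reflect_on_tool_use=True):
    """Minimal agent loop (hypothetical, for illustration only)."""
    history = [("user", prompt)]
    while True:
        response = model(history)
        if not response.tool_calls:
            return response.text  # plain text answer, run ends normally
        results = [(name, tools[name](**kwargs))
                   for name, kwargs in response.tool_calls]
        if not reflect_on_tool_use:
            # Proposed behavior: the tool outputs ARE the final answer;
            # no extra LLM turn to "reflect" on them.
            return results
        history.extend(("tool", r) for r in results)

# Toy model: first turn requests a search, second turn answers in text.
def toy_model(history):
    if len(history) == 1:
        return ModelResponse(tool_calls=[("search", {"query": "pydantic-ai"})])
    return ModelResponse(text="summary of search results")

tools = {"search": lambda query: f"results for {query!r}"}

print(run_agent(toy_model, tools, "find docs", reflect_on_tool_use=False))
print(run_agent(toy_model, tools, "find docs", reflect_on_tool_use=True))
```

With reflection disabled, the toy model is called once; with it enabled, the loop makes a second model call just to turn the tool output into text, which is exactly the latency overhead measured above.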

@Finndersen

Does that change mean that all tool calls will always return their results directly instead of to the LLM? We would probably want to make it a bit more flexible than that.

@WorldInnovationsDepartment

WorldInnovationsDepartment commented Mar 4, 2025

@Finndersen You are right. I think we can use the Autogen approach: just create a message with all the tool_call outputs so they can be reused later. In Autogen, there is a specific message class to distinguish tool-call summaries from LLM-generated text. It’s called ToolCallSummaryMessage, and it stores the tool call results in string format. A very useful feature for speed optimization.
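The idea of collapsing tool outputs into a single reusable message can be sketched in a few lines. This helper is hypothetical, only loosely modeled on Autogen's summary-message concept, not its actual implementation:

```python
def summarize_tool_calls(results, separator="\n"):
    """Collapse (tool_name, output) pairs into one string message, so later
    agents or turns can reuse the results without re-running the tools."""
    return separator.join(f"{name}: {output}" for name, output in results)

summary = summarize_tool_calls([
    ("search", "3 papers found"),
    ("fetch_page", "page text..."),
])
print(summary)
# search: 3 papers found
# fetch_page: page text...
```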

@aristideubertas

You could probably also just use a decorator on a tool to signal that the tool's output should always be returned directly, without any interpretation by the model.
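A decorator like that could be a one-liner. Everything here is hypothetical (direct_result is not part of pydantic-ai); it just shows how a per-tool marker plus an agent-side check could work:

```python
def direct_result(fn):
    """Hypothetical decorator: mark a tool so the agent loop returns its
    output directly, skipping the reflection turn."""
    fn.__direct_result__ = True
    return fn

@direct_result
def get_weather(city: str) -> str:
    return f"sunny in {city}"

def dispatch_tool(fn, *args, **kwargs):
    """Agent-side check: if the tool is marked, its output is the final
    answer; otherwise it goes back to the model for interpretation."""
    output = fn(*args, **kwargs)
    is_final = getattr(fn, "__direct_result__", False)
    return output, is_final

print(dispatch_tool(get_weather, "Paris"))  # ('sunny in Paris', True)
```

This would give per-tool granularity, addressing the flexibility concern raised earlier: unmarked tools still feed their results back to the LLM.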

This is pretty core functionality and it is very odd that it hasn't been implemented yet. It is arguably more important than Graphs or other advanced features.

@WorldInnovationsDepartment WorldInnovationsDepartment changed the title Agent Capable of Running Tools Only (No Streaming Output) Disable reflection after tool use Mar 4, 2025
@DouweM

DouweM commented Apr 30, 2025

This should be addressed by #1463, which @dmontagu is working on. I'll close this until we determine it's not sufficient.

@DouweM closed this as not planned Apr 30, 2025