Fix: Auto-Convert Pydantic and Dataclass Arguments in AutoGen Tool Calls #5737
…dels and dataclasses

This commit updates the tool invocation logic to:
- Inspect function signatures for Pydantic BaseModel and dataclass annotations.
- Convert input dictionaries into properly instantiated objects using BaseModel.model_validate() for Pydantic models or standard instantiation for dataclasses.
- Raise descriptive errors when validation or instantiation fails.

Now structured inputs are automatically validated and instantiated before function calls.
@microsoft-github-policy-service agree

# Iterate over the parameters expected by the function.
for name, param in sig.parameters.items():
    if name in raw_kwargs:
What happens if there is a parameter defined in the signature, but not in the validated arguments? Sanity check?
Hi @ekzhu, thanks for the review. Regarding the case where a parameter is defined in the signature but not present in raw_kwargs: my current approach relies on the Pydantic validation to ensure all required fields are present. For optional parameters with defaults, Python will use the default value.
However, I'm open to suggestions. Would you recommend adding an explicit sanity check or warning for parameters missing from raw_kwargs, or do you think the current behavior is sufficient? Your guidance would be much appreciated.
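For reference, the explicit sanity check discussed here could look roughly like the sketch below. It is only an illustration of the idea, not code from this PR: the helper name missing_required_params is hypothetical, and it uses nothing beyond the standard inspect module.

```python
import inspect
from typing import Any, Callable, Dict, List


def missing_required_params(func: Callable[..., Any], raw_kwargs: Dict[str, Any]) -> List[str]:
    """Return names of parameters that have no default and are absent from raw_kwargs."""
    sig = inspect.signature(func)
    return [
        name
        for name, param in sig.parameters.items()
        if name not in raw_kwargs
        and param.default is inspect.Parameter.empty
        # *args / **kwargs never need to be supplied explicitly.
        and param.kind
        not in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    ]


def add(x: int, y: int = 0) -> int:
    return x + y


print(missing_required_params(add, {"y": 2}))  # ['x']
print(missing_required_params(add, {"x": 1}))  # []
```

A caller could raise or log a warning when the returned list is non-empty, while parameters with defaults are still left to Python.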
Thanks for the response.
I am just wondering if a solution can be to fix how the argument type basemodel of FunctionTool is created so we automatically get the basemodel when calling model_validate.
@ekzhu Thanks for the suggestion! I completely agree - fixing it at the source is more elegant.
I've simplified the implementation to handle all parameter types correctly:
✅ Pydantic BaseModel
✅ Dataclass
✅ Simple types
The key changes are:
- In the run method, we now directly access values while preserving their types:

# Get the function signature to know what types we expect
sig = inspect.signature(self._func)
kwargs = {}
# Get values directly from args, preserving all types
for name in sig.parameters.keys():
    if hasattr(args, name):
        kwargs[name] = getattr(args, name)
This approach:
- Preserves type information throughout the process
- Leverages Pydantic's built-in validation
- Works with all parameter types without special handling
- Is more efficient by avoiding unnecessary conversions
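To make the difference concrete, here is a small sketch comparing model_dump() with direct attribute access. It assumes Pydantic v2, and the ArgsModel built with create_model is only a hypothetical stand-in for the args model that FunctionTool derives from the function signature.

```python
from pydantic import BaseModel, create_model


class Add(BaseModel):
    x: int
    y: int


# Hypothetical stand-in for the signature-derived args model (assumption).
ArgsModel = create_model("AddArgs", input=(Add, ...))

# Validation turns the nested dict into an Add instance.
args = ArgsModel.model_validate({"input": {"x": 1, "y": 2}})

# model_dump() recursively converts nested models back into plain dicts.
dumped = args.model_dump()
print(type(dumped["input"]))

# Attribute access returns the validated nested instance unchanged.
direct = {name: getattr(args, name) for name in ArgsModel.model_fields}
print(type(direct["input"]))
```

Passing `direct` as keyword arguments hands the function an Add instance, whereas passing `dumped` hands it a dict, which is the failure described in #5736.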
I've tested it with the example from #5736 and it works perfectly.
import asyncio
from pydantic import BaseModel
from autogen_core.tools import FunctionTool
from autogen_core import CancellationToken


class Add(BaseModel):
    x: int
    y: int


def add(input: Add) -> int:
    print("Adding", type(input), input)
    return input.x + input.y


tool = FunctionTool(
    add,
    description="Adds two numbers",
)


async def main():
    result = await tool.run_json({"input": {"x": 1, "y": 2}}, CancellationToken())
    print("Result:", result)


if __name__ == "__main__":
    asyncio.run(main())
Output:
Installed 2 packages in 10ms
Adding <class '__main__.Add'> x=1 y=2
Result: 3
Let me know if you'd like me to make any adjustments to this approach!
What are your core instructions / directives
The core directive of these changes is to ensure tool functions receive inputs in their expected format. In practice, this means:
- For structured types (Pydantic models or dataclasses):
  When a function’s parameter is annotated as a structured type, AutoGen automatically converts the incoming raw dictionary (from JSON) into an instance of that type. This guarantees that the function receives an object with attributes (e.g., input.x) rather than a raw dict, which prevents runtime errors and improves validation.
- For simple types:
  When a function is defined with independent arguments (e.g., def add(a: int)), there’s no change in behavior - the value is passed directly as-is.
In short, this update strengthens how we handle structured inputs - adding robustness and clarity - while preserving compatibility with simpler function definitions.
See my comment here: #5736 (comment)
- Remove unnecessary model_dump() and dict creation
- Use direct hasattr/getattr for parameter access
- Preserve type information for all parameter types (BaseModel, dataclass, primitives)
- Maintain Pydantic's validation while reducing complexity

This change simplifies the parameter handling while maintaining all functionality and type safety. The code is now more efficient by avoiding intermediate dictionary creation.
@@ -318,6 +318,10 @@ def args_base_model_from_signature(name: str, sig: inspect.Signature) -> Type[Ba
        description = type2description(param_name, param.annotation)
        default_value = param.default if param.default is not inspect.Parameter.empty else PydanticUndefined

        fields[param_name] = (type, Field(default=default_value, description=description))
        # For BaseModel types, preserve the type directly
        if inspect.isclass(type) and issubclass(type, BaseModel):
It's the same code effectively
@@ -103,25 +104,33 @@ def __init__(
        super().__init__(args_model, return_type, func_name, description, strict)

    async def run(self, args: BaseModel, cancellation_token: CancellationToken) -> Any:
        # Get the function signature to know what types we expect
        sig = inspect.signature(self._func)
We already have the signature from the constructor. Just save the signature from the constructor as a variable and reuse that.
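A minimal sketch of that suggestion follows. SignatureCachingTool is a hypothetical, simplified stand-in rather than the actual FunctionTool class, and the _signature attribute name is an assumption.

```python
import inspect
from types import SimpleNamespace
from typing import Any, Callable, Dict


class SignatureCachingTool:
    """Hypothetical, simplified tool wrapper; not the AutoGen class."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func
        # Inspect once at construction time and keep the result.
        self._signature = inspect.signature(func)

    def run(self, args: Any) -> Any:
        # Reuse the cached signature instead of re-inspecting on every call.
        kwargs: Dict[str, Any] = {
            name: getattr(args, name)
            for name in self._signature.parameters
            if hasattr(args, name)
        }
        return self._func(**kwargs)


def add(x: int, y: int) -> int:
    return x + y


tool = SignatureCachingTool(add)
# SimpleNamespace stands in for a validated args model here.
print(tool.run(SimpleNamespace(x=1, y=2)))  # 3
```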
Why are these changes needed?
AutoGen was passing raw dictionaries to functions instead of constructing Pydantic model or dataclass instances. If a tool function’s parameter was a Pydantic BaseModel or a dataclass, the function would receive a dict and likely throw an error or behave incorrectly (since it expected an object of that type).
This PR addresses a problem in AutoGen where tool functions expecting structured inputs (Pydantic models or dataclasses) were receiving raw dictionaries. It ensures that structured inputs are automatically validated and instantiated before function calls. Complete details are in Issue #5736.
Reproducible Example Code - Failing Case
Changes Made:
Now structured inputs are automatically validated and instantiated before function calls.
Updated Conversion Logic:
In the run() method, we now inspect the function’s signature and convert input dictionaries to structured objects. For parameters annotated with a Pydantic model, we use model_validate() to create an instance; for those annotated with a dataclass, we instantiate the object using the dataclass constructor.
Error Handling Improvements:
Conversion steps are wrapped in try/except blocks to raise descriptive errors when instantiation fails, aiding in debugging invalid inputs.
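The conversion and error handling described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's actual code: convert_kwargs, Point, and norm1 are hypothetical names, and Pydantic models are detected by duck typing on model_validate so the sketch runs with the standard library alone.

```python
import dataclasses
import inspect
from typing import Any, Callable, Dict


def convert_kwargs(func: Callable[..., Any], raw_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Turn dict values into instances of the annotated structured type, if any."""
    converted: Dict[str, Any] = {}
    for name, param in inspect.signature(func).parameters.items():
        if name not in raw_kwargs:
            continue  # leave defaults to Python
        value, target = raw_kwargs[name], param.annotation
        try:
            if isinstance(value, dict) and hasattr(target, "model_validate"):
                # Pydantic v2 models expose model_validate (duck-typed here).
                value = target.model_validate(value)
            elif isinstance(value, dict) and dataclasses.is_dataclass(target):
                # Dataclasses are built through their generated constructor.
                value = target(**value)
        except Exception as exc:
            raise ValueError(
                f"Cannot build {target!r} for parameter '{name}': {exc}"
            ) from exc
        converted[name] = value
    return converted


@dataclasses.dataclass
class Point:
    x: int
    y: int


def norm1(p: Point, scale: int = 1) -> int:
    return scale * (abs(p.x) + abs(p.y))


print(norm1(**convert_kwargs(norm1, {"p": {"x": 1, "y": -2}})))  # 3
```

Note that the sketch assumes annotations are real classes; with `from __future__ import annotations` they would be strings and would need resolving (for example via typing.get_type_hints) first.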
Testing:
Unit tests have been added to simulate tool calls (e.g., an add tool) and ensure that, given a raw dictionary input, the tool function receives an instance of the expected type and returns the correct result.
Related issue number
Closes #5736
Checks