-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve tool call message processing #3036
Conversation
Thank you for this change! I'll test it out and update here |
I've tested this change, it is working well! |
router/src/lib.rs
Outdated
@@ -1221,9 +1223,15 @@ pub struct TextMessage { | |||
|
|||
impl From<Message> for TextMessage { | |||
fn from(value: Message) -> Self { | |||
let content = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could add an additional check here to prevent this type of messages from anything other than the assistant
, as well as if both content
and tool_calls
are provided, WDYT? Maybe even a custom new error as you suggested recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point! I've updated the PR to prefer a none optional enum that ensure either a content or tools are provided - looking into how errors may be improved now
router/src/server.rs
Outdated
("application/json" = Vec<GenerateResponse>), | ||
("text/event-stream" = StreamResponse), | ||
(Vec<GenerateResponse> = "application/json"), | ||
(Vec<GenerateResponse> = "application/json"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated here!
(Vec<GenerateResponse> = "application/json"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, avoided these issues/changes, by avoiding bumping the utopia version
router/src/lib.rs
Outdated
pub content: Option<MessageContent>, | ||
#[serde(default, skip_serializing_if = "Option::is_none")] | ||
#[schema(example = "\"David\"")] | ||
name: Option<String>, | ||
#[serde(default, skip_serializing_if = "Option::is_none")] | ||
tool_calls: Option<Vec<ToolCall>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we agree there can never be both a content and a tool_calls ?
Also never neither.
If true, then this begs to become an Enum of Either content or tool calling.
And that should simplify all the underlying code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Untagged enum are not great for the error message so we should probably implement our own deserialize method instead. (Since there doesn't seem to be a type
field equivalent serde could use to know which enum to deserialize)
"content": "I can't access real-time data, but I can provide you with current conditions and forecast for Paris, France:\n\nThe current conditions in Paris are mostly cloudy with a temperature of 6.7°C (44.1°F). \n\nPlease note that the actual weather may differ from this information, and I recommend checking the forecast on a reliable weather website for the most up-to-date information.", | ||
"name": null, | ||
"role": "assistant", | ||
"tool_calls": null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we skip that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name and tool_calls are actually skipped in the response, but the Python client library adds None when deserializing the response.
the actually response payload is
{
"object": "chat.completion",
"id": "",
"created": 1740011163,
"model": "meta-llama/Llama-3.1-8B-Instruct",
"system_fingerprint": "3.1.1-dev0-native",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "I can't access real-time data, but I can provide you with current conditions and forecast for Paris, France:\n\nThe current conditions in Paris are mostly cloudy with a temperature of 6.7\u00b0C (44.1\u00b0F). \n\nPlease note that the actual weather may differ from this information, and I recommend checking the forecast on a reliable weather website for the most up-to-date information."
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 103,
"completion_tokens": 79,
"total_tokens": 182
}
}
backends/v3/Cargo.toml
Outdated
@@ -16,7 +16,7 @@ path = "src/main.rs" | |||
[dependencies] | |||
async-trait = "0.1.74" | |||
async-stream = "0.3.5" | |||
axum = { version = "0.7", features = ["json"] } | |||
axum = { version = "0.8", features = ["json"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need upgrades ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only needed to generate valid openapi docs for the previous approach of Option<MessageContent>
this is no longer needed with improved typing (using an enum instead of two optionals) in the latest commits
{ | ||
"content": "", | ||
"role": "assistant", | ||
"tool_calls": [ | ||
{ | ||
"id": "0", | ||
"function": { | ||
"arguments": '{"longitude": 2.2945, "latitude": 48.8567}', | ||
"name": "get_weather", | ||
"description": None, | ||
}, | ||
"type": "function", | ||
} | ||
], | ||
}, | ||
{"role": "tool", "tool_call_id": "0", "content": "6.7"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't get that structure.
Assistant is saying it's calling a tool, and the tool is another participant in the conversation saying it has the answer for the call, is that correct ?
How does this play with chat templates ?? Are they ready for that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relevant internal discussion : https://huggingface.slack.com/archives/C06JKEMK6BZ/p1739982025532149
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to include tool_call_id
in the chat template as some models will expect this template input in the latest commits
db23337
to
bcc4489
Compare
this PR allows incoming chat requests to specify
tool_calls
and nocontent
, which allows tools to be passed from a tool response call into the messages of a subsequent requestfor example if the following set of messages is sent without these changes an error is thrown attempting to deserialize the message
missing field 'content'
response
this PR allows messages without tools to be sent without error.
notes: