Pr 2954 ci branch #3006
base: main
Conversation
Hi @drbh, thanks for moving forward with my PR! However, there's something I'm not understanding. Your changes seem to remove my fixes related to streaming. I haven't had time to test the branch, but from reading your changes it seems tool calls will still be impossible to use with streaming via an OpenAI client.
Force-pushed 5f88bc4 to 3b09662
Hi @Trofleb, thank you again for opening this PR. I've made some small changes, namely to avoid attempting to deserialize the string as JSON at each generation (plus some other tweaks for tests/CI). Additionally, I've added a small test that includes the OpenAI client. Would you kindly take a look at the PR and let me know if these changes resolve your issue? Thanks!
Hi @drbh, there's just one thing missing: the name should appear only in the first event. FYI, my test case:
The last call ends with the following error:
Not sure why langchain does that, but it concatenates the names at the end of the stream, which means it doesn't know which function to call.
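The concatenation issue described above can be sketched in a few lines. This is a hypothetical illustration (not langchain's actual code) of how OpenAI-compatible clients merge streamed tool-call deltas: string fields from each chunk are appended, so if the server repeats the function name in every chunk, the aggregated name becomes the names glued together.

```python
def aggregate_tool_call(chunks):
    """Merge streamed tool-call deltas the way OpenAI-compatible
    clients typically do: string fields are concatenated."""
    name, arguments = "", ""
    for chunk in chunks:
        name += chunk.get("name", "")
        arguments += chunk.get("arguments", "")
    return {"name": name, "arguments": arguments}

# Server repeats the name in every chunk (the reported problem):
buggy = [
    {"name": "GetWeather", "arguments": '{ "city"'},
    {"name": "GetWeather", "arguments": ': "zurich"}'},
]

# Server sends the name only in the first chunk (the requested fix):
fixed = [
    {"name": "GetWeather", "arguments": '{ "city"'},
    {"name": "", "arguments": ': "zurich"}'},
]

print(aggregate_tool_call(buggy)["name"])  # GetWeatherGetWeather
print(aggregate_tool_call(fixed)["name"])  # GetWeather
```

With the repeated name, the client ends up looking for a function called `GetWeatherGetWeather`, which explains the "doesn't know which function to call" failure.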
@Trofleb thanks for the information. I've just made a small update to only send the name in the first message of the stream. I've tested with the example provided and receive reasonable output.
TGI started with: text-generation-launcher --model-id meta-llama/Meta-Llama-3.1-8B-Instruct
Example output of python tool-langchain-repro.py (note: the only change made to the example script was to add output):
{'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': '0', 'function': {'arguments': '{"city":"zurich"}', 'name': 'GetWeather', 'description': None}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 213, 'total_tokens': 233, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama', 'system_fingerprint': '3.1.1-dev0-native', 'finish_reason': 'stop', 'logprobs': None}, id='run-30c7503d-ed3d-4453-92a5-2a6160ba8694-0', tool_calls=[{'name': 'GetWeather', 'args': {'city': 'zurich'}, 'id': '0', 'type': 'tool_call'}], usage_metadata={'input_tokens': 213, 'output_tokens': 20, 'total_tokens': 233, 'input_token_details': {}, 'output_token_details': {}}), 'parsed': GetWeather(city='zurich'), 'parsing_error': None}
{'raw': AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': '', 'function': {'arguments': '{ "city": "zurich"}', 'name': 'GetWeather'}, 'type': 'function'}]}, response_metadata={}, id='run-280b6234-3dc1-4896-b2bf-53896b708053', tool_calls=[{'name': 'GetWeather', 'args': {'city': 'zurich'}, 'id': '', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'GetWeather', 'args': '{ "city": "zurich"}', 'id': '', 'index': 0, 'type': 'tool_call_chunk'}])}
{'parsed': GetWeather(city='zurich')}
{'parsing_error': None}
Necessary to keep compatibility with OpenAI: using TGI with OpenAI-compatible libraries for function calling was broken. The streaming API for tool calling now starts when the name is parsed, then sends arguments as tokens are generated, and stops properly.
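The streaming behavior described above (name emitted once when parsed, then only argument fragments) can be sketched as follows. This is a minimal illustration, not TGI's actual server code; the function name and token split are assumptions for the example.

```python
def stream_tool_call_deltas(name, argument_tokens):
    """Yield OpenAI-style tool-call deltas: the function name is
    included only in the first delta, later deltas carry only
    argument fragments."""
    first = True
    for tok in argument_tokens:
        delta = {"function": {"arguments": tok}}
        if first:
            delta["function"]["name"] = name
            first = False
        yield delta

# Simulated generation: arguments arrive token by token.
deltas = list(stream_tool_call_deltas("GetWeather", ['{"city"', ': "zurich"}']))

print(deltas[0])  # first delta carries the name
print(deltas[1])  # later deltas carry only argument fragments
```

Concatenating the `arguments` fields across deltas reconstructs the full JSON argument string, which is exactly what OpenAI-compatible clients do on their side.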
…ction definition type
Force-pushed 22fa0e1 to 3dd0128
This PR reopens #2954 and adds some small changes to rely on serde where possible. Thank you @Trofleb for the changes!
This PR aligns the tool calling output to return an array of tool calls as well as serialize the tool arguments as a JSON string.
Important
This PR contains breaking changes and aligns the tool choice output to match OpenAI.
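To illustrate the breaking change: in the OpenAI-aligned shape, the response carries an array of tool calls, and each call's `arguments` field is a JSON string rather than an object, so clients must parse it themselves. The values below are taken from the example output earlier in the thread.

```python
import json

# OpenAI-aligned tool-call shape: tool_calls is an array, and
# function.arguments is a JSON *string*, not a nested object.
tool_calls = [
    {
        "id": "0",
        "type": "function",
        "function": {"name": "GetWeather", "arguments": '{"city": "zurich"}'},
    }
]

# Clients parse the arguments string into a dict themselves.
args = json.loads(tool_calls[0]["function"]["arguments"])
print(args["city"])  # zurich
```

Code written against the pre-#2954 shape that indexed into `arguments` as a dict will need to add a `json.loads` step after this change.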