python/agents/tau2-benchmark-agent/README.md (53 additions & 6 deletions)
@@ -48,6 +48,18 @@ tenacity dependecy version may conflict with that of tau2 repo. Upgrading it bac
pip install --upgrade tenacity
```
**IMPORTANT:** The Gemini 3 Pro model makes sending thought signatures mandatory. Tau2 bench relies on litellm for user simulation and non-ADK agent simulation. Until https://github.com/BerriAI/litellm/pull/16812 is merged into the litellm repository, the PR needs to be applied as shown below:

Optionally, you can run specific examples by using `--task-ids` instead of `--num-tasks`.

**temperature:** Defaults to 1 when `adk_agent` is used. The commands in this document set it explicitly using `llm_args` for both the user and agent models.

**reasoning_level:** Applies only to the Gemini 3 Pro model. It defaults to high for `adk_agent` when this model is used; otherwise, the default is dynamic thinking. Again, this document demonstrates setting it explicitly using `llm_args`.

**NOTE:** It is normal to see `This model isn't mapped yet` error logs. They come from the litellm cost calculation workflow used by `--user-llm`. You can suppress them temporarily by swapping `--user-llm vertex_ai/gemini-3-pro-preview` with `--user-llm vertex_ai/gemini-2.5-pro`.
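As a rough illustration of the `llm_args` settings and the `--user-llm` swap described above, the sketch below assembles the values in shell. It assumes that `llm_args` is passed as a JSON object whose keys match the option names `temperature` and `reasoning_level`. All variable names are hypothetical, and the flags that accept these values are the ones used by the full commands in this README.

```bash
# Sketch only: TEMPERATURE, REASONING_LEVEL, LLM_ARGS and USER_LLM are
# hypothetical variable names, and the JSON shape is an assumption.
TEMPERATURE=1            # the adk_agent default described above
REASONING_LEVEL=high     # only meaningful for the Gemini 3 Pro model

# Assemble the payload once so the same explicit values can be reused.
LLM_ARGS=$(printf '{"temperature": %s, "reasoning_level": "%s"}' \
  "$TEMPERATURE" "$REASONING_LEVEL")

# User-simulator model; swap in vertex_ai/gemini-2.5-pro to silence the
# "This model isn't mapped yet" logs temporarily.
USER_LLM="vertex_ai/gemini-3-pro-preview"

echo "llm_args: $LLM_ARGS"
echo "user llm: $USER_LLM"
```

Keeping the payload in one variable makes it easier to pass identical explicit settings to both the user and agent models.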
### Viewing trajectories
After a run with the default options, you can view the trajectories with the following command:
@@ -149,18 +167,47 @@ Full run requires dropping the arg `--task-ids`.
```bash
# Example: Run complete evaluation for all domains