tool calling AST environment #1

1stprinciple · 2025-09-03T17:01:44Z

observation = json.loads(observation)

Update dataset_registry.json

prepare dataset

with _timer(

only necessary dependencies

16

revert

env

bring them back

clip_advantages

mask_truncated_samples

remove chat_scheduler

cleanup

_target_

critic: enable: False

revert

trainer.device=cuda

Timer

timer

only import ToolASTAgent

from rllm.agents.tool_ast_agent import ToolASTAgent

examples.tool_calling.train_apigen_mt

trainer.experiment_name='rllm-apigen-mt-16k-stage2'

batch_size=128

comment

to the back

back to full finetune

comment

double parse

update ground_truth

ground_truth

update

ground_truth = [tool_call["function"] for tool_call in ground_truth]

\n instead of \\n

lora

remove nulls in tools

tool_calls = [tool_call.to_dict() for tool_call in tool_calls]

tool_call_str

question = json.loads(question)

agent_args = {}

tool calling AST environment
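
Several of the commit titles above ("double parse", `ground_truth = [tool_call["function"] for tool_call in ground_truth]`, "remove nulls in tools") point at a data-normalization step. The sketch below is only a guess at what such a step could look like, not the code from this branch: the helper names `normalize_ground_truth` and `drop_nulls`, and the assumption that the ground truth may arrive as a doubly JSON-encoded string, are illustrative.

```python
import json


def normalize_ground_truth(raw):
    """Hypothetical helper: decode a (possibly double-encoded) JSON list of
    tool calls and keep only each call's "function" payload."""
    parsed = json.loads(raw) if isinstance(raw, str) else raw
    # Assumed meaning of "double parse": a JSON string that itself wraps JSON
    # decodes to a str on the first pass, so decode once more.
    if isinstance(parsed, str):
        parsed = json.loads(parsed)
    return [tool_call["function"] for tool_call in parsed]


def drop_nulls(value):
    """Hypothetical helper: recursively remove None-valued keys from dicts
    (one possible reading of "remove nulls in tools")."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


if __name__ == "__main__":
    calls = [{"type": "function",
              "function": {"name": "get_weather", "arguments": {"city": "Paris"}}}]
    print(normalize_ground_truth(json.dumps(json.dumps(calls))))
    print(drop_nulls({"name": "get_weather", "description": None}))
```

Accepting both the singly and doubly encoded forms keeps the helper tolerant of either dataset layout; whether the branch needs that tolerance is not stated on this page.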

1stprinciple force-pushed the tool-calling-ast-env branch from 0ffccd0 to af921ce, September 5, 2025 03:13

1stprinciple added 5 commits September 5, 2025 05:34

8B without kl loss

5a53ed5

8b in experiment name

9963bfd

same project_name

b1b5260

24576

68a619b

reduce dynamic batch_size

a498e5c

2 participants
