Cost-Aware Tool Usage for LLMs

Requirements

A Hugging Face token and a Weights & Biases (wandb) token are needed to view outputs. Running the first cell in each notebook should install all the requirements. We strongly recommend opening and running the notebooks in Google Colab. If you have any issues doing so, please reach out to jmayhugh@tamu.edu or Janshyam03@tamu.edu. If you choose not to use Google Colab, consult requirements.txt and run pip install -r requirements.txt.
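As a minimal sketch of the token setup outside the notebooks, the snippet below checks that both tokens are available before logging in. It assumes the tokens are stored in the environment variables HF_TOKEN and WANDB_API_KEY (the names those libraries read by default). The helper names missing_tokens and login_all are hypothetical and do not come from the notebooks.

```python
import os

# Environment variable names read by huggingface_hub and wandb by default.
REQUIRED_TOKENS = ("HF_TOKEN", "WANDB_API_KEY")


def missing_tokens(env=None):
    """Return the names of required tokens that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_TOKENS if not env.get(name)]


def login_all(env=None):
    """Log in to Hugging Face and wandb (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = missing_tokens(env)
    if missing:
        raise RuntimeError("Missing tokens: " + ", ".join(missing))
    # Imported lazily so the check above works before requirements are installed.
    from huggingface_hub import login
    import wandb

    login(token=env["HF_TOKEN"])
    wandb.login(key=env["WANDB_API_KEY"])
```

Calling missing_tokens() first gives a clear error message before any notebook cell tries to download a model or start a logging run.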

rl_llms.ipynb

This notebook loads the pretrained model and the xlam tool-calling dataset, then trains the SFT model. All cells can be run sequentially.
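To illustrate the kind of preprocessing SFT on a tool-calling dataset usually involves, here is a hedged sketch that turns one record into a prompt/completion pair. It assumes each record has query, tools, and answers fields, with tools and answers stored as JSON strings. The prompt template and the function name to_sft_example are hypothetical, and the notebook may format its examples differently.

```python
import json


def to_sft_example(record):
    """Build a prompt/completion pair from one tool-calling record.

    Assumes `tools` and `answers` are JSON-encoded strings (an assumption
    about the dataset layout, not taken from the notebook).
    """
    tools = json.loads(record["tools"])
    answers = json.loads(record["answers"])
    prompt = (
        "Available tools:\n"
        + json.dumps(tools, indent=2)
        + "\n\nUser query: "
        + record["query"]
        + "\n"
    )
    # The target is the list of tool calls, serialized back to JSON.
    completion = json.dumps(answers)
    return {"prompt": prompt, "completion": completion}


# Made-up record for illustration only.
example = to_sft_example({
    "query": "What is the weather in Austin?",
    "tools": json.dumps([{"name": "get_weather", "parameters": {"city": "str"}}]),
    "answers": json.dumps([{"name": "get_weather", "arguments": {"city": "Austin"}}]),
})
```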

GRPO.ipynb

This notebook performs the additional fine-tuning step with GRPO. It contains the reward modeling and training. All cells can be run sequentially.
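The sketch below shows one way a cost-aware reward could feed into GRPO-style group-relative advantages. The reward shape (a correctness score minus a per-call penalty) and the cost_per_call value are assumptions made for illustration, not the reward used in the notebook. The advantage step follows the usual GRPO idea of normalizing each reward against the other completions sampled for the same prompt, here with the population standard deviation for simplicity.

```python
from statistics import mean, pstdev


def cost_aware_reward(correct, num_tool_calls, cost_per_call=0.1):
    """Hypothetical reward: 1 for a correct answer, minus a cost per tool call."""
    return (1.0 if correct else 0.0) - cost_per_call * num_tool_calls


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three sampled completions for one prompt: (correct?, number of tool calls).
group = [(True, 1), (True, 3), (False, 0)]
rewards = [cost_aware_reward(c, n) for c, n in group]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, the correct completion that used fewer tool calls receives the largest advantage, which is the behavior a cost-aware objective is meant to encourage.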
