
# Auditing Pay-Per-Token in Large Language Models

This repository contains the code for the paper "Auditing Pay-Per-Token in Large Language Models" by Ander Artola Velasco, Stratis Tsirtsis, and Manuel Gomez-Rodriguez.

## Paper abstract

Millions of users rely on a market of cloud-based services to obtain access to state-of-the-art large language models. However, it has recently been shown that the de facto pay-per-token pricing mechanism used by providers creates a financial incentive for them to strategize and misreport the (number of) tokens a model used to generate an output. In this paper, we develop an auditing framework based on martingale theory that enables a trusted third-party auditor who sequentially queries a provider to detect token misreporting. Crucially, we show that our framework is guaranteed to always detect token misreporting, regardless of the provider's (mis-)reporting policy, and, with high probability, to never falsely flag a faithful provider as unfaithful. To validate our auditing framework, we conduct experiments across a wide range of (mis-)reporting policies using several large language models from the Llama, Gemma and Ministral families, and input prompts from a popular crowdsourced benchmarking platform. The results show that our framework detects an unfaithful provider after observing fewer than $\sim 70$ reported outputs, while satisfying an upper bound $\alpha = 0.05$ on the probability of falsely flagging a faithful provider.

## Dependencies

All the experiments were performed using Python 3.11.2. To create a virtual environment and install the project dependencies, run the following commands:

```bash
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
```

## Repository structure

```
├── data
│   └── LMSYS.txt
├── figures
│   ├── audit_faitful
│   ├── audit_heuristic
│   └── audit_random
├── notebooks
│   ├── audit_faitful_random.ipynb
│   ├── audit_heuristic.ipynb
│   └── process_ds.ipynb
├── outputs
│   ├── audit_faithful
│   └── audit_heuristic
├── scripts
│   ├── script_slurm_audit_faithful.sh
│   └── script_slurm_audit_heur.sh
└── src
    ├── audit_faithful.py
    ├── audit_heuristic.py
    └── utils.py
```
- `data` contains the processed set of LMSYS prompts used in the experiments.
- `figures` contains all the figures presented in the paper.
- `notebooks` contains the Jupyter notebooks used to analyze the audit data and generate all the figures included in the paper:
  - `audit_faitful_random.ipynb` analyzes the audit data when the provider uses the faithful policy or random policies.
  - `audit_heuristic.ipynb` analyzes the audit data when the provider uses the heuristic policies.
  - `process_ds.ipynb` builds the LMSYS dataset.
- `outputs` contains intermediate output files generated by the experiment scripts and analyzed in the notebooks. They can be regenerated using the scripts in the `src` folder:
  - `audit_faithful` contains answers generated in response to the LMSYS prompts, used in the audit for the faithful and random policies.
  - `audit_heuristic` contains answers generated in response to the LMSYS prompts, used in the audit for the heuristic policies.
- `scripts` contains the Slurm submission scripts used to run all the experiments presented in the paper.
- `src` contains all the code necessary to reproduce the results in the paper. Specifically:
  - `audit_faithful.py` is the script used to generate model answers to prompts from the LMSYS Chatbot Arena dataset.
  - `audit_heuristic.py` is the script used to generate model answers to prompts from the LMSYS Chatbot Arena dataset and run the heuristic policies on the outputs.

## Instructions

### Downloading the models

Our experiments use LLMs from the Llama, Gemma and Mistral families, which are "gated" models, that is, they require accepting a license agreement to use. You can request access at https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct, https://huggingface.co/google/gemma-3-1b-it and https://huggingface.co/mistralai/Ministral-8B-Instruct-2410. Once you have access, you can download any model in the Llama, Gemma and Mistral families. Before running the scripts, you need to authenticate with your Hugging Face account by running `huggingface-cli login` in the terminal. Each model should be downloaded to the `models/` folder; see the sketch below.
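A minimal sketch of the authentication and download steps using `huggingface-cli`, assuming the Llama model above (the target directory name is an illustrative choice):

```bash
# Authenticate with a Hugging Face access token that has been
# granted access to the gated model repositories
huggingface-cli login

# Download a model snapshot into the local models/ folder
huggingface-cli download meta-llama/Llama-3.2-1B-Instruct \
    --local-dir models/Llama-3.2-1B-Instruct
```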

### LMSYS experiments

The script `audit_faithful.py` generates the output needed to reproduce all experiments for the faithful and random policies. This is because the values of $E$ for a random policy with $m$ iterations are just the values of $E$ for the faithful policy shifted by $+m$. You can run it in your local Python environment, or submit it to a cluster via Slurm using `script_slurm_audit_faithful.sh`, adapted to your particular machine specifications. The script accepts the flag `--model` to set a specific model (e.g., "L1B" for meta-llama/Llama-3.2-1B-Instruct), `--temperature` to set the temperature during generation, `--prompts` to pass a list of strings as prompts (by default it uses the LMSYS prompts in `data/LMSYS.txt`), and `--poisson` to set the Poisson parameter used in the estimator of tokenization lengths for a string; an example invocation is sketched below.
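An illustrative invocation, assuming the script is run from the repository root (the flag values here are placeholders, not the paper's defaults):

```bash
# Generate audit outputs for the faithful (and, by shifting E, random) policies;
# "L1B" stands for meta-llama/Llama-3.2-1B-Instruct
python3 src/audit_faithful.py \
    --model "L1B" \
    --temperature 1.0 \
    --poisson 5
```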

The script `audit_heuristic.py` generates the output needed to reproduce all experiments for the heuristic policies. You can run it in your local Python environment, or submit it to a cluster via Slurm using `script_slurm_audit_heur.sh`, adapted to your particular machine specifications. It accepts the same flags as `audit_faithful.py` (`--model`, `--temperature`, `--prompts` and `--poisson`), plus `--p` to set the top-p value used in the heuristic verification step. For example:
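Again an illustrative sketch with placeholder values:

```bash
# Generate audit outputs for the heuristic policies;
# --p is the top-p value used in the heuristic verification step
python3 src/audit_heuristic.py \
    --model "L1B" \
    --temperature 1.0 \
    --p 0.9 \
    --poisson 5
```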

To reproduce all the figures, run the notebooks in the `notebooks` folder, for instance:
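Assuming Jupyter is installed in your environment:

```bash
# Launch Jupyter to run the analysis notebooks
jupyter notebook notebooks/
```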

## Contact & attribution

If you have questions about the code, identify potential bugs, or would like us to include additional functionality, feel free to open an issue or contact Ander Artola Velasco.

If you use parts of the code in this repository for your own research, please consider citing the paper "Auditing Pay-Per-Token in Large Language Models".

