Generate LLM code

To run the script, go to the root of this repo and use the following command:

python evaluation/scripts/gencode_json.py [options]

You first need to set up your API keys. To do so, create a keys.cfg file at the root of the repository and add your keys as follows:

OPENAI_KEY = 'your_api_key'
ANTHROPIC_KEY = 'your_api_key'
GOOGLE_KEY = 'your_api_key' 
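If you want to confirm that keys.cfg parses before launching a run, a minimal sketch along these lines may help. It assumes the file contains only `NAME = 'value'` lines with no section header, as shown above. The `load_keys` helper and the `[keys]` section name are hypothetical, introduced only for illustration, and this is not necessarily how gencode_json.py reads the file.

```python
import configparser
from pathlib import Path


def load_keys(path="keys.cfg"):
    """Return the NAME = 'value' pairs from a section-less keys.cfg as a dict.

    Hypothetical helper for illustration; not part of the repository.
    """
    text = Path(path).read_text()
    # Disable interpolation so a '%' inside a key is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep key names case-sensitive (configparser lowercases them by default).
    parser.optionxform = str
    # configparser requires a section header, so prepend a dummy one.
    parser.read_string("[keys]\n" + text)
    # Drop the surrounding quotes used in the file.
    return {name: value.strip().strip("'\"") for name, value in parser["keys"].items()}


if __name__ == "__main__":
    if Path("keys.cfg").is_file():
        for name, value in load_keys().items():
            # Print only whether each key is set, never the key itself.
            print(name, "set" if value else "EMPTY")
```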

For example, to generate model results with gpt-4o and the default settings, run:

python evaluation/scripts/gencode_json.py --model gpt-4o

Command-Line Arguments

  • --model - Specifies the model name used for generating responses.
  • --output-dir - Directory to store the generated code outputs (Default: eval_results/generated_code).
  • --input-path - Path to the JSONL file describing the problems (Default: eval/data/problems_all.jsonl).
  • --prompt-dir - Directory where prompt files are saved (Default: eval_results/prompt).
  • --temperature - Controls the randomness of the generation (Default: 0).
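The options above can be mirrored in a small argparse sketch, which is handy for seeing what a given command line resolves to. This is an illustration built only from the flags and defaults listed above. The actual parser in gencode_json.py may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate LLM code (sketch)")
    # No default is documented for --model, so none is assumed here.
    parser.add_argument("--model", type=str,
                        help="Model name used for generating responses")
    parser.add_argument("--output-dir", type=str,
                        default="eval_results/generated_code",
                        help="Directory to store the generated code outputs")
    parser.add_argument("--input-path", type=str,
                        default="eval/data/problems_all.jsonl",
                        help="File describing the problems")
    parser.add_argument("--prompt-dir", type=str,
                        default="eval_results/prompt",
                        help="Directory where prompt files are saved")
    parser.add_argument("--temperature", type=float, default=0,
                        help="Controls the randomness of the generation")
    return parser


if __name__ == "__main__":
    # Equivalent to: python evaluation/scripts/gencode_json.py --model gpt-4o
    args = build_parser().parse_args(["--model", "gpt-4o"])
    print(vars(args))
```

Note that argparse exposes hyphenated flags with underscores, so `--output-dir` becomes `args.output_dir`.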

Evaluate generated code

Download the numeric test results and save them as ./eval/data/test_data.h5.
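A quick way to confirm the download landed in the right place is to check that the file exists and begins with the 8-byte HDF5 format signature. The sketch below uses only the standard library. `looks_like_hdf5` is a hypothetical helper, and it inspects offset 0 only, so an HDF5 file that carries a user block would be reported as not matching.

```python
from pathlib import Path

# The 8-byte signature that starts an HDF5 file without a user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if `path` is a file whose first 8 bytes are the HDF5 signature."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(8) == HDF5_SIGNATURE


if __name__ == "__main__":
    target = "eval/data/test_data.h5"
    print(target, "OK" if looks_like_hdf5(target) else "missing or not HDF5")
```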

To run the script, go to the root of this repo and use the following command:

python evaluation/scripts/test_generated_code.py

Before running it, edit the test_generated_code.py source file to specify your model name, your results directory, and your problem set (if it is not problems_all.jsonl).