Every evaluation script documents its usage, with example invocations, in the first few lines of the file.
Follow gpt4_eval_script.sh to run inference on the Ferret-Bench data and have GPT-4 rate the outputs. This requires that the openai package is installed and that you provide your own OPENAI_KEY.
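Before starting the GPT-4 rating step, it can help to confirm both prerequisites up front. The following is a minimal sketch, not part of the repository: the helper name `check_gpt4_eval_prereqs` is hypothetical, and it assumes the key is supplied through an environment variable named OPENAI_KEY (check gpt4_eval_script.sh for how the key is actually passed).

```python
import importlib.util
import os


def check_gpt4_eval_prereqs(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # The openai package must be importable for the GPT-4 rating step.
    if importlib.util.find_spec("openai") is None:
        problems.append("openai package is not installed")
    # Assumption: the key is read from an environment variable named OPENAI_KEY.
    if not env.get("OPENAI_KEY"):
        problems.append("OPENAI_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_gpt4_eval_prereqs():
        print("missing:", problem)
```

Running the file prints one line per missing prerequisite and nothing when both are in place.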
Run ferret/eval/model_lvis.py as described in the usage notes at the top of the file, then run ferret/eval/eval_lvis.py.
Run ferret/eval/model_refcoco.py as described in the usage notes at the top of the file, then run ferret/eval/eval_refexp.py.
Run ferret/eval/model_flickr.py as described in the usage notes at the top of the file, then run ferret/eval/eval_flickr_entities.py.
Run ferret/eval/model_pope.py as described in the usage notes at the top of the file, then run ferret/eval/eval_pope.py.
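Each benchmark above is a two-step pipeline: an inference script followed by a scoring script. The sketch below records those pairs and prints the opening lines of each inference script, where the usage notes live. It is an illustrative helper only: the dictionary keys, the name `usage_header`, and the 20-line default are assumptions, and the sketch deliberately does not guess at any command-line arguments.

```python
from pathlib import Path

# (inference script, scoring script) pairs, copied from the instructions above.
# The dictionary keys are hypothetical labels chosen for this sketch.
EVAL_PIPELINES = {
    "lvis": ("ferret/eval/model_lvis.py", "ferret/eval/eval_lvis.py"),
    "refcoco": ("ferret/eval/model_refcoco.py", "ferret/eval/eval_refexp.py"),
    "flickr": ("ferret/eval/model_flickr.py", "ferret/eval/eval_flickr_entities.py"),
    "pope": ("ferret/eval/model_pope.py", "ferret/eval/eval_pope.py"),
}


def usage_header(path, max_lines=20):
    """Return the first max_lines lines of a script as a single string."""
    lines = []
    with Path(path).open(encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            if index >= max_lines:
                break
            lines.append(line.rstrip("\n"))
    return "\n".join(lines)


if __name__ == "__main__":
    for name, (infer_script, score_script) in EVAL_PIPELINES.items():
        print(f"[{name}] 1) {infer_script}  2) {score_script}")
        # Only print the header when run from a checkout that has the file.
        if Path(infer_script).exists():
            print(usage_header(infer_script))
```

Run it from the repository root so the relative paths resolve; outside a checkout it only lists the script order.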