Skip to content

prmbiy/IndiMathBench

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

IndiMathBench is a fully human-verified benchmark for evaluating automated theorem provers on 312 competition-level mathematics Olympiad problems formalized in Lean 4. All problems are produced through a structured Human–AI collaborative pipeline and are fully human-verified. The dataset is sourced from the Indian National Mathematics Olympiad (INMO) and the Regional Mathematics Olympiad (RMO), two long-running annual math competitions in India.

The benchmark contributes 98 new Geometry problems and 45 new Combinatorics problems, two domains that are underrepresented in existing formal-reasoning benchmarks.

For submissions or questions, please contact parambiyani8@gmail.com. Feedback is welcome; please open an issue if you find errors or would like to suggest improvements.

Problem Domains

Category Count
Geometry 98
Algebra 92
Set Theory & Combinatorics 45
Number Theory 77

Evaluation Results

Model Scaffold Resolution
Claude Sonnet 4 ReAct 10 turns 4/312
Gemini 2.5 Pro ReAct 10 turns 12/312
GPT-5 ReAct 10 turns 36/312
DeepSeekProver V2 7B ReAct 10 turns 24/312
GodelProver V2 8B ReAct 10 turns 36/312

Detailed description of the 10 turn ReAct scaffold in our paper!

Citation

The associated paper for IndiMathBench is available here. Please consider citing if you find IndiMathBench useful:

@misc{biyani2025indimathbenchautoformalizingmathematicalreasoning,
      title={IndiMathBench: Autoformalizing Mathematical Reasoning Problems with a Human Touch}, 
      author={Param Biyani and Shashank Kirtania and Yasharth Bajpai and Sumit Gulwani and Ashish Tiwari},
      year={2025},
      eprint={2512.00997},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2512.00997}, 
}

License

This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

About

An evaluation benchmark for Olympiad competition math in Lean4 based on Indian Mathematical Olympiads

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors