About FormationEval
Overview
FormationEval is an MMLU-style multiple-choice question (MCQ) benchmark designed for evaluating language models on oil and gas geoscience knowledge. The benchmark covers subsurface disciplines including petrophysics, petroleum geology, geophysics and reservoir engineering.
The dataset contains 505 questions derived from authoritative textbooks and open courseware, with each question including a rationale and source citation. All questions were generated using a controlled LLM pipeline with human verification to ensure accuracy and coverage.
72 language models have been evaluated on this benchmark, spanning proprietary and open-weight models from major AI providers.
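As a rough illustration of how an MMLU-style benchmark like this one is typically scored, the sketch below models a single question record and computes plain accuracy over a set of predictions. The field names (`question`, `choices`, `answer_index`, `rationale`, `source`), the `accuracy` helper and the sample question are hypothetical and chosen for illustration only. They are not the dataset's actual schema or content, and accuracy as the reported metric is an assumption here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Question:
    """Hypothetical record layout; the real dataset schema may differ."""
    question: str
    choices: Sequence[str]
    answer_index: int   # index into `choices` of the correct option
    rationale: str      # short explanation of why the answer is correct
    source: str         # citation for the material the question derives from


def accuracy(questions: Sequence[Question], predictions: Sequence[int]) -> float:
    """Return the fraction of questions whose predicted index is correct."""
    if len(questions) != len(predictions):
        raise ValueError("questions and predictions must have the same length")
    if not questions:
        return 0.0
    correct = sum(q.answer_index == p for q, p in zip(questions, predictions))
    return correct / len(questions)


# Illustrative placeholder question, not taken from the dataset.
example = Question(
    question="Which log is commonly used to estimate shale volume?",
    choices=("Caliper", "Gamma ray", "Temperature", "Cement bond"),
    answer_index=1,
    rationale="Shales usually show elevated natural gamma radiation.",
    source="placeholder citation",
)

print(accuracy([example], [1]))  # prints 1.0
```

Keeping the rationale and source on the record itself mirrors the description above, where every question carries both.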
Domain distribution
Each question is tagged with one to three domains, so the counts below sum to more than 505 and the percentages to more than 100%.
| Domain | Count | Percentage |
|---|---|---|
| Petrophysics | 272 | 53.9% |
| Petroleum Geology | 151 | 29.9% |
| Sedimentology | 98 | 19.4% |
| Geophysics | 80 | 15.8% |
| Reservoir Engineering | 43 | 8.5% |
| Drilling Engineering | 24 | 4.8% |
| Production Engineering | 14 | 2.8% |
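The percentages in the table can be reproduced from the counts and the 505-question total. The short sketch below does that arithmetic and shows why the column exceeds 100%. The variable and function names are illustrative and are not part of any FormationEval tooling.

```python
TOTAL_QUESTIONS = 505

# Domain tag counts as listed in the table above.
domain_counts = {
    "Petrophysics": 272,
    "Petroleum Geology": 151,
    "Sedimentology": 98,
    "Geophysics": 80,
    "Reservoir Engineering": 43,
    "Drilling Engineering": 24,
    "Production Engineering": 14,
}


def percentage(count: int, total: int = TOTAL_QUESTIONS) -> float:
    """Share of all questions carrying a tag, rounded to one decimal place."""
    return round(100 * count / total, 1)


domain_percentages = {name: percentage(n) for name, n in domain_counts.items()}

# Each question can carry several tags, so tags outnumber questions.
total_tags = sum(domain_counts.values())
average_tags_per_question = total_tags / TOTAL_QUESTIONS

for name, pct in domain_percentages.items():
    print(f"{name}: {pct}%")
print(f"Total tags: {total_tags}")
print(f"Average tags per question: {average_tags_per_question:.2f}")
```

The 682 tags spread over 505 questions work out to about 1.35 domains per question, which is why the percentage column adds up to roughly 135%.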
Difficulty distribution
| Difficulty | Count | Percentage |
|---|---|---|
| Easy | 132 | 26.1% |
| Medium | 274 | 54.3% |
| Hard | 99 | 19.6% |
Source materials
Questions are derived from the following sources. All questions are concept-based derivations, not direct copies.
Citation
If you use FormationEval in your research, please cite our paper:
@misc{ermilov2026formationeval,
title={FormationEval, an open multiple-choice benchmark for petroleum geoscience},
author={Almaz Ermilov},
year={2026},
eprint={2601.02158},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.02158},
doi={10.48550/arXiv.2601.02158}
}
About the author
Almaz Ermilov
Software engineer with a background in petrophysics. Worked as a petrophysicist in the oil and gas industry before transitioning to software development. FormationEval was developed to measure how well language models understand subsurface concepts and technical knowledge in petroleum geoscience.