OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics
Yaoming Zhu, Junxin Wang, Yiyang Li, Lin Qiu, ZongYu Wang, Jun Xu, Xuezhi Cao, Yuhuai Wei, Mingshi Wang, Xunliang Cai, Rong Ma

TL;DR
OIBench is a challenging new informatics benchmark of 250 problems designed to evaluate and advance the reasoning capabilities of models. It reveals current strengths and gaps in AI performance on complex algorithmic tasks.
Contribution
The paper introduces OIBench, a comprehensive, contamination-resistant, olympiad-level dataset for benchmarking AI reasoning, along with Time/Space Completion Curves for finer-grained efficiency analysis and a method for direct human-model comparison.
Findings
Current SOTA models outperform most human participants in both correctness and efficiency.
Open-source models lag behind their closed-source counterparts.
Model solutions remain suboptimal compared to the canonical solutions.
Abstract
As models become increasingly sophisticated, conventional algorithm benchmarks are increasingly saturated, underscoring the need for more challenging benchmarks to guide future improvements in algorithmic reasoning. This paper introduces OIBench, a high-quality, private, and challenging olympiad-level informatics dataset comprising 250 carefully curated original problems. We detail the construction methodology of the benchmark, ensuring a comprehensive assessment across various programming paradigms and complexities, and we demonstrate its contamination-resistant properties via experiments. We propose Time/Space Completion Curves for finer-grained efficiency analysis and enable direct human-model comparisons through high-level participant evaluations. Our experiments reveal that while open-source models lag behind closed-source counterparts, current SOTA models already outperform most…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Advanced Graph Neural Networks
