The Isabelle Community Benchmark
Fabian Huch, Vincent Bode

TL;DR
This study introduces a community benchmark for Isabelle's theorem prover build time, analyzing hardware impacts and establishing correlations with popular benchmarks to optimize performance.
Contribution
It provides the first comprehensive analysis of hardware effects on Isabelle performance and develops a predictive model based on consumer CPU parameters.
Findings
Consumer CPUs with 8-16 threads perform best for Isabelle.
CPU base clock and boost frequency significantly influence performance.
Strong correlation between Isabelle build times and high-performance computing benchmarks.
Abstract
Choosing hardware for theorem proving is no simple task: automated provers are highly complex and optimized programs, often utilizing a parallel computation model, and there is little prior research on the hardware impact on prover performance. To alleviate the problem for Isabelle, we initiated a community benchmark where the build time of HOL-Analysis is measured. On distinct CPUs, a total of runs with different Isabelle configurations were reported by Isabelle users. Results range from s to over h. We found that current consumer CPUs performed best, with an optimal number of to threads, largely independent of heap memory. As for hardware parameters, CPU base clock affected multi-threaded execution most with a linear correlation of , whereas boost frequency was the most influential parameter for single-threaded runs (correlation coefficient );…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Parallel Computing and Optimization Techniques · Software Engineering Research
