Towards Real-World Industrial-Scale Verification: LLM-Driven Theorem Proving on seL4

Jianyu Zhang; Fuyuan Zhang; Jiayi Lu; Jilin Hu; Xiaoyi Yin; Long Zhang; Feng Yang; and Yongwang Zhao

arXiv:2602.08384·cs.CR·February 10, 2026

Towards Real-World Industrial-Scale Verification: LLM-Driven Theorem Proving on seL4

Jianyu Zhang, Fuyuan Zhang, Jiayi Lu, Jilin Hu, Xiaoyi Yin, Long Zhang, Feng Yang, and Yongwang Zhao

PDF

Open Access

TL;DR

This paper introduces AutoReal, an LLM-based theorem proving approach tailored for industrial-scale verification, demonstrating significant success on seL4 and AFP projects with a lightweight, deployable model.

Contribution

The paper presents AutoReal, a novel LLM-driven theorem proving method with chain-of-thought training and context augmentation, enabling effective industrial-scale verification with a compact model.

Findings

01

AutoReal-Prover achieves 51.67% proof success on seL4.

02

AutoReal-Prover outperforms prior methods on seL4 (27.06%).

03

AutoReal-Prover attains 53.88% success on AFP projects.

Abstract

Formal methods (FM) are reliable but costly to apply, often requiring years of expert effort in industrial-scale projects such as seL4, especially for theorem proving. Recent advances in large language models (LLMs) have made automated theorem proving increasingly feasible. However, most prior work focuses on mathematics-oriented benchmarks such as miniF2F, with limited evaluation on real-world verification projects. The few studies that consider industrial-scale verification mostly rely on closed-source models with hundreds of billions of parameters, which cannot be locally deployed and incur substantial usage costs. In this paper, we propose AutoReal, an LLM-driven theorem proving method for real-world industrial-scale systems with support for lightweight local deployment. We evaluate AutoReal on the seL4-Isabelle verification project as a representative and challenging case study.…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsLogic, programming, and type systems · Formal Methods in Verification · Machine Learning in Materials Science