LLMs versus the Halting Problem: Revisiting Program Termination Prediction

Oren Sultan; Jordi Armengol-Estape; Pascal Kesseli; Julien Vanegue; Dafna Shahaf; Yossi Adi; Peter O'Hearn

arXiv:2601.18987·cs.CL·March 31, 2026

LLMs versus the Halting Problem: Revisiting Program Termination Prediction

Oren Sultan, Jordi Armengol-Estape, Pascal Kesseli, Julien Vanegue, Dafna Shahaf, Yossi Adi, Peter O'Hearn

PDF

TL;DR

This paper evaluates large language models' ability to predict program termination, showing they perform well but struggle with complex programs and providing proofs, highlighting their potential and limitations in reasoning about undecidable problems.

Contribution

The study demonstrates that LLMs can effectively predict program termination, approaching the performance of specialized tools, and explores their limitations in proof generation and handling complex programs.

Findings

01

LLMs perform remarkably well at predicting termination, close to top tools.

02

Performance drops with increased program length and complexity.

03

LLMs often fail to produce valid proofs or witnesses for termination.

Abstract

Determining whether a program terminates is a central problem in computer science. Turing's foundational result established the Halting Problem as undecidable, showing that no algorithm can universally determine termination for all programs and inputs. Consequently, automatic verification tools approximate termination, sometimes failing to prove or disprove; these tools rely on problem-specific architectures, and are usually tied to particular programming languages. Recent success and progress in large language models (LLMs) raises the following question: can LLMs reliably predict program termination? In this work, we evaluate LLMs on a diverse set of programs from the Termination category of the International Competition on Software Verification (SV-Comp) 2025. Our results suggest that LLMs perform remarkably well at predicting program termination, where GPT-5 and Claude Sonnet-4.5…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.