BarrierBench: Evaluating Large Language Models for Safety Verification in Dynamical Systems

Ali Taheri; Alireza Taban; Sadegh Soudjani; Ashutosh Trivedi

arXiv:2511.09363·cs.AI·April 17, 2026

BarrierBench: Evaluating Large Language Models for Safety Verification in Dynamical Systems

Ali Taheri, Alireza Taban, Sadegh Soudjani, Ashutosh Trivedi

PDF

1 Repo

TL;DR

This paper introduces BarrierBench, a benchmark for evaluating large language models in synthesizing safety barrier certificates for dynamical systems, demonstrating over 90% success rate with an LLM-based framework.

Contribution

The work presents a novel LLM-driven framework for barrier certificate synthesis and introduces BarrierBench, a comprehensive benchmark for evaluating such methods in diverse dynamical systems.

Findings

01

LLM-based framework achieves over 90% success in generating valid safety certificates.

02

BarrierBench provides a diverse set of 100 dynamical systems for evaluation.

03

Retrieval-augmented generation improves the reliability of LLM-guided synthesis.

Abstract

Safety verification of dynamical systems via barrier certificates is essential for ensuring correctness in autonomous applications. Synthesizing these certificates involves discovering mathematical functions with current methods suffering from poor scalability, dependence on carefully designed templates, and exhaustive or incremental function-space searches. They also demand substantial manual expertise--selecting templates, solvers, and hyperparameters, and designing sampling strategies--requiring both theoretical and practical knowledge traditionally shared through linguistic reasoning rather than formalized methods. This motivates a key question: can such expert reasoning be captured and operationalized by language models? We address this by introducing an LLM-based agentic framework for barrier certificate synthesis. The framework uses natural language reasoning to propose,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

https://hycodev.com/dataset/barrierbench
github

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.