TL;DR
This paper introduces BarrierBench, a benchmark for evaluating large language models in synthesizing safety barrier certificates for dynamical systems, demonstrating over 90% success rate with an LLM-based framework.
Contribution
The work presents a novel LLM-driven framework for barrier certificate synthesis and introduces BarrierBench, a comprehensive benchmark for evaluating such methods in diverse dynamical systems.
Findings
LLM-based framework achieves over 90% success in generating valid safety certificates.
BarrierBench provides a diverse set of 100 dynamical systems for evaluation.
Retrieval-augmented generation improves the reliability of LLM-guided synthesis.
Abstract
Safety verification of dynamical systems via barrier certificates is essential for ensuring correctness in autonomous applications. Synthesizing these certificates involves discovering mathematical functions with current methods suffering from poor scalability, dependence on carefully designed templates, and exhaustive or incremental function-space searches. They also demand substantial manual expertise--selecting templates, solvers, and hyperparameters, and designing sampling strategies--requiring both theoretical and practical knowledge traditionally shared through linguistic reasoning rather than formalized methods. This motivates a key question: can such expert reasoning be captured and operationalized by language models? We address this by introducing an LLM-based agentic framework for barrier certificate synthesis. The framework uses natural language reasoning to propose,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
