Synthesizing Precise Protocol Specs from Natural Language for Effective Test Generation
Kuangxiangzi Liu, Dhiman Chakraborty, Alexander Liggesmeyer, Andreas Zeller

TL;DR
This paper presents a two-stage approach using large language models to convert natural language protocol specifications into formal, human-readable specs, enabling scalable and traceable test generation for safety-critical systems.
Contribution
The paper introduces a pipeline that extracts protocol elements from natural-language specifications and synthesizes formal specifications from them. The resulting specifications are traceable and human-readable, and they enable automated test generation.
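The shape of such a two-stage pipeline can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy protocol text, the names `ProtocolElement`, `extract_elements`, `synthesize_spec`, and `generate_tests`, and the dictionary-based grammar are all hypothetical. A regular expression stands in for the LLM-based extractor, and the refinement step against a protocol implementation is omitted.

```python
import random
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolElement:
    """One message type recovered from the prose (hypothetical structure)."""
    sender: str  # "client" or "server"
    name: str    # message keyword, e.g. "HELLO"
    source: str  # originating sentence, kept for traceability


def extract_elements(spec_text: str) -> list[ProtocolElement]:
    """Stage 1 stand-in: a regex plays the role of the LLM extractor."""
    pattern = re.compile(r"The (client|server) sends ([A-Z]+)")
    elements = []
    for sentence in spec_text.split("."):
        match = pattern.search(sentence)
        if match:
            elements.append(
                ProtocolElement(match.group(1), match.group(2), sentence.strip())
            )
    return elements


def synthesize_spec(elements: list[ProtocolElement]) -> dict[str, list[str]]:
    """Stage 2 stand-in: group message names by sender into a tiny grammar."""
    spec = {"<client_message>": [], "<server_message>": []}
    for element in elements:
        spec[f"<{element.sender}_message>"].append(element.name)
    return spec


def generate_tests(spec: dict[str, list[str]], count: int, seed: int = 0) -> list[str]:
    """Draw client messages from the spec to send to an implementation under test."""
    rng = random.Random(seed)
    return [rng.choice(spec["<client_message>"]) for _ in range(count)]


# Toy natural-language specification, invented for this example.
TOY_SPEC = (
    "The client sends HELLO to open a session. "
    "The server sends WELCOME in reply. "
    "The client sends QUIT to close the session."
)

if __name__ == "__main__":
    elements = extract_elements(TOY_SPEC)
    spec = synthesize_spec(elements)
    print(spec)
    print(generate_tests(spec, count=5))
```

Keeping the intermediate `spec` as a separate artifact is what makes the approach inspectable: a human can read and correct the grammar before any tests are generated, and each element carries the sentence it was derived from.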
Findings
Recovers 92.8% of client message types
Recovers 80.2% of server message types
Achieves 81.5% message acceptance in real-world tests
Abstract
Safety- and security-critical systems have to be thoroughly tested against their specifications. The state of practice is to have _natural language_ specifications, from which test cases are derived manually, a process that is slow, error-prone, and difficult to scale. _Formal_ specifications, on the other hand, are well-suited for automated test generation, but are tedious to write and maintain. In this work, we propose a two-stage pipeline that uses large language models (LLMs) to bridge the gap: first, we extract _protocol elements_ from natural-language specifications; second, leveraging a protocol implementation, we synthesize and refine a formal _protocol specification_ from these elements, which we can then use to massively test further implementations. We consider this two-stage approach superior to end-to-end LLM-based test generation, as 1. it produces an _inspectable…
Taxonomy
Topics: Software Testing and Debugging Techniques · Adversarial Robustness in Machine Learning · Web Application Security Vulnerabilities
