Evaluating the Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases
Kimya Khakzad Shahandashti, Mithila Sivakumar, Mohammad Mahdi Mohajer, Alvine B. Belle, Song Wang, Timothy C. Lethbridge

TL;DR
This paper explores using GPT-4 Turbo to automatically identify defeaters in assurance cases, enhancing the robustness of safety and security verification in critical systems.
Contribution
It introduces a novel method leveraging GPT-4 Turbo to detect defeaters in formalized assurance cases using Eliminative Argumentation notation.
Findings
GPT-4 Turbo effectively understands EA notation
The model can generate various types of defeaters
Initial evaluation shows promising results in automating defeater identification
Abstract
Assurance cases (ACs) are structured arguments that support the verification of the correct implementation of systems' non-functional requirements, such as safety and security, thereby preventing system failures that could lead to catastrophic outcomes, including loss of lives. ACs facilitate the certification of systems in accordance with industrial standards, for example, DO-178C and ISO 26262. Identifying defeaters (arguments that refute these ACs) is essential for improving the robustness of, and confidence in, ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. Our initial evaluation gauges the model's proficiency in understanding and generating arguments within this framework. The…
Taxonomy
Topics: Software Reliability and Analysis Research · Safety Systems Engineering in Autonomy · Risk and Safety Analysis
