Evaluating the Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases

Kimya Khakzad Shahandashti; Mithila Sivakumar; Mohammad Mahdi Mohajer; Alvine B. Belle; Song Wang; Timothy C. Lethbridge

arXiv:2401.17991·cs.SE·February 1, 2024·2 cites

Open Access

TL;DR

This paper explores using GPT-4 Turbo to automatically identify defeaters in assurance cases, enhancing the robustness of safety and security verification in critical systems.

Contribution

It introduces a novel method leveraging GPT-4 Turbo to detect defeaters in formalized assurance cases using Eliminative Argumentation notation.

Findings

01. GPT-4 Turbo effectively understands EA notation.

02. The model can generate various types of defeaters.

03. Initial evaluation shows promising results in automating defeater identification.
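
As a rough illustration of what asking a model for defeaters over an EA-style argument might involve, the sketch below renders a small argument tree as indented text and wraps it in a prompt. The paper's actual prompts and EA encoding are not shown on this page, so everything here is an assumption: `Node`, `render`, and `build_prompt` are hypothetical names, the node labels are illustrative, and the three defeater kinds are the ones commonly described for Eliminative Argumentation. The code only builds a string and does not call any model API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DefeaterKind(Enum):
    # Defeater kinds commonly described for Eliminative Argumentation.
    REBUTTING = "rebutting"        # attacks a claim directly
    UNDERMINING = "undermining"    # attacks the evidence for a claim
    UNDERCUTTING = "undercutting"  # attacks the inference from evidence to claim


@dataclass
class Node:
    # Hypothetical node type; "kind" is an illustrative label such as
    # "claim", "evidence", or "inference".
    node_id: str
    kind: str
    text: str
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Render the argument tree as indented lines, two spaces per level."""
    lines = [f"{'  ' * depth}[{node.kind} {node.node_id}] {node.text}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def build_prompt(root: Node) -> str:
    """Build a plain-text prompt asking for defeaters (assumed wording)."""
    kinds = ", ".join(k.value for k in DefeaterKind)
    body = "\n".join(render(root))
    return (
        "The following assurance case is written as an indented argument tree.\n"
        f"{body}\n"
        f"List possible defeaters ({kinds}) and name the node each one targets."
    )


if __name__ == "__main__":
    case = Node("C1", "claim", "The braking controller is acceptably safe", [
        Node("E1", "evidence", "Hardware-in-the-loop test report"),
    ])
    print(build_prompt(case))
```

The resulting string could then be sent to whichever chat model is under evaluation; how the response is parsed and scored is outside the scope of this sketch.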

Abstract

Assurance cases (ACs) are structured arguments that support the verification of the correct implementation of systems' non-functional requirements, such as safety and security, thereby preventing system failures that could lead to catastrophic outcomes, including loss of lives. ACs facilitate the certification of systems in accordance with industrial standards, for example, DO-178C and ISO 26262. Identifying defeaters (arguments that refute these ACs) is essential for improving the robustness of, and confidence in, ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation. Our initial evaluation gauges the model's proficiency in understanding and generating arguments within this framework. The…

Taxonomy

Topics: Software Reliability and Analysis Research · Safety Systems Engineering in Autonomy · Risk and Safety Analysis