STAMP/STPA Informed Characterization of Factors Leading to Loss of Control in AI Systems
Steve Barrett, Anna Bruvere, Sean P. Fillingham, Catherine Rhodes, Stefano Vergani

TL;DR
This paper introduces a structured framework based on STAMP/STPA to analyze and identify factors leading to loss of control in AI systems, addressing safety concerns across current and future AI developments.
Contribution
It adapts the STAMP/STPA safety analysis methodology to the AI domain, providing a new approach for characterizing and mitigating loss of control risks in AI systems.
Findings
The framework helps identify causal factors leading to loss of control in AI systems
Application of STAMP/STPA clarifies control structure failures
Supports safer design of socio-technical AI systems
Abstract
A major concern amongst AI safety practitioners is the possibility of loss of control, whereby humans lose the ability to exert control over increasingly advanced AI systems. The range of concerns is wide, spanning current-day risks to future existential risks, and covering loss of control pathways that range from rapid AI self-exfiltration scenarios to more gradual disempowerment scenarios. In this work we set out, firstly, to provide a more structured framework for discussing and characterizing loss of control and, secondly, to use this framework to assist those responsible for the safe operation of AI-containing socio-technical systems in identifying causal factors leading to loss of control. We explore how these two needs can be better met by making use of STAMP, a methodology developed within the safety-critical systems community, and its associated hazard analysis technique, STPA. We…
Taxonomy
Topics: Human-Automation Interaction and Safety · Occupational Health and Safety Research · Ethics and Social Impacts of AI
