Sociotechnical Safety Evaluation of Generative AI Systems

Laura Weidinger; Maribeth Rauh; Nahema Marchal; Arianna Manzini; Lisa Anne Hendricks; Juan Mateos-Garcia; Stevie Bergman; Jackie Kay; Conor Griffin; Ben Bariach; Iason Gabriel; Verena Rieser; William Isaac

arXiv:2310.11986·cs.AI·November 2, 2023·40 cites

Open Access

TL;DR

This paper introduces a sociotechnical framework for evaluating the safety of generative AI systems, emphasizing context and systemic impacts, and surveys current evaluation practices to identify gaps and propose improvements.

Contribution

It presents a novel three-layered sociotechnical evaluation framework and provides a comprehensive survey of existing safety assessments, highlighting key gaps and future directions.

Findings

01 Current safety evaluations focus mainly on capabilities.

02 Context and systemic impacts are crucial for safety assessment.

03 The survey identifies three major gaps in existing evaluation practices.

Abstract

Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability may cause harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation. Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from…

Taxonomy

Topics: Ethics and Social Impacts of AI · Law, AI, and Intellectual Property