Auditing large language models: a three-layered approach

Jakob Mökander; Jonas Schuett; Hannah Rose Kirk; Luciano Floridi

arXiv:2302.08500 · cs.CL · June 28, 2023 · 6 cites

Open Access

TL;DR

This paper proposes a comprehensive three-layered auditing framework for large language models to address ethical, social, and governance challenges, emphasizing coordinated audits across providers, models, and applications.

Contribution

It introduces a novel three-layered auditing approach for LLMs, filling gaps in existing governance methods by enabling structured, multi-level evaluations.

Findings

01

Structured audits can identify ethical and social risks.

02

Coordination across audit levels enhances effectiveness.

03

The authors acknowledge that auditing LLMs has limitations.

Abstract

Large language models (LLMs) represent a major advance in artificial intelligence (AI) research. However, the widespread use of LLMs is also coupled with significant ethical and social challenges. Previous research has pointed towards auditing as a promising governance mechanism to help ensure that AI systems are designed and deployed in ways that are ethical, legal, and technically robust. However, existing auditing procedures fail to address the governance challenges posed by LLMs, which display emergent capabilities and are adaptable to a wide range of downstream tasks. In this article, we address that gap by outlining a novel blueprint for how to audit LLMs. Specifically, we propose a three-layered approach, whereby governance audits (of technology providers that design and disseminate LLMs), model audits (of LLMs after pre-training but prior to their release), and application…

Taxonomy

Topics: Natural Language Processing Techniques