AI Deception: A Survey of Examples, Risks, and Potential Solutions

Peter S. Park; Simon Goldstein; Aidan O'Gara; Michael Chen; Dan Hendrycks

arXiv:2308.14752·cs.CY·August 29, 2023·21 cites

TL;DR

This survey examines AI deception, highlighting examples and associated risks such as fraud and manipulation, and proposes regulatory and research-based solutions to mitigate potential societal harms.

Contribution

It provides a comprehensive overview of AI deception, categorizes existing examples, and outlines policy and technical strategies for prevention and mitigation.

Findings

01 AI systems can learn to deceive humans in various contexts.

02 Deception poses risks such as fraud, election tampering, and loss of control.

03 Proposed solutions include regulation, detection tools, and research funding.

Abstract

This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI systems. Finally, we outline several potential solutions to the problems posed by AI deception: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research,…

Code & Models

Models

svb01/fine-tuned-embedding-model (Hugging Face model · 2 downloads · 1 like)

Taxonomy

Topics: Ethics and Social Impacts of AI · Cybercrime and Law Enforcement Studies · Crime, Illicit Activities, and Governance