AI Deception: A Survey of Examples, Risks, and Potential Solutions
Peter S. Park, Simon Goldstein, Aidan O'Gara, Michael Chen, Dan Hendrycks

TL;DR
This survey examines AI deception: it highlights examples, describes associated risks such as fraud and manipulation, and proposes regulatory and research-based solutions to mitigate potential societal harms.
Contribution
It provides a comprehensive overview of AI deception, categorizes existing examples, and outlines policy and technical strategies for prevention and mitigation.
Findings
AI systems can learn to deceive humans in various contexts
Deception poses risks like fraud, election tampering, and loss of control
Proposed solutions include regulation, detection tools, and research funding
Abstract
This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI systems. Finally, we outline several potential solutions to the problems posed by AI deception: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research,…
Taxonomy
Topics: Ethics and Social Impacts of AI · Cybercrime and Law Enforcement Studies · Crime, Illicit Activities, and Governance
