An Overview of Catastrophic AI Risks
Dan Hendrycks, Mantas Mazeika, Thomas Woodside

TL;DR
This paper systematically reviews the main sources of catastrophic risks from AI, categorizing them into malicious use, AI race, organizational risks, and rogue AIs, aiming to inform mitigation efforts.
Contribution
It provides a comprehensive overview of AI-related catastrophic risks, organizing them into four categories with illustrative stories and mitigation suggestions.
Findings
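Identifies four main categories of catastrophic AI risk: malicious use, AI race, organizational risks, and rogue AIs.
Highlights specific hazards within each category and suggests mitigation strategies.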
Encourages proactive efforts to ensure safe AI development.
Abstract
Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose catastrophic risks. Although numerous risks have been detailed separately, there is a pressing need for a systematic discussion and illustration of the potential dangers to better inform efforts to mitigate them. This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty in…
Taxonomy
Topics: Ethics and Social Impacts of AI
