The Decision Path to Control AI Risks Completely: Fundamental Control Mechanisms for AI Governance
Yong Tao

TL;DR
This paper proposes a comprehensive architecture with five pillars and six control mechanisms to achieve complete human control over AI risks, aiming to establish a theoretical foundation for AI governance and legislation.
Contribution
It introduces a systematic control framework in which AI mandates are built into both AI systems and society, addressing major areas of AI risk and, for the first time, physical safeguards.
Findings
Developed five pillars and six control mechanisms for AI governance.
Outlined AI mandates for internal and societal control.
Highlighted physical safeguards to prevent AI circumvention.
Abstract
Artificial intelligence (AI) advances rapidly, but achieving complete human control over AI risks remains an unsolved problem, akin to driving the fast AI "train" without a "brake system." By exploring fundamental control mechanisms at key elements of AI decisions, we develop a systematic solution to thoroughly control AI risks, providing an architecture for AI governance and legislation with five pillars supported by six control mechanisms, illustrated through a minimum set of AI Mandates (AIMs). Three of the AIMs must be built inside AI systems and three in society to address major areas of AI risks: 1) align AI values with human users; 2) constrain AI decision-actions by societal ethics, laws, and regulations; 3) build in human intervention options for emergencies and shut-off switches for existential threats; 4) limit AI access to user resources to reinforce controls inside AI; 5)…
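As a rough illustration of how the first four items listed in the abstract could gate a single AI decision, the following is a minimal sketch and not the paper's implementation. Every name in it (`Action`, `DecisionGate`, `Verdict`, the boolean flags) is hypothetical; in a real system each flag would be replaced by an actual evaluator. The order of the checks is also an assumption, and the items truncated from the abstract are deliberately not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"  # hand the decision to a human


@dataclass
class Action:
    """A proposed AI decision-action (hypothetical structure)."""
    name: str
    resource: str                       # user resource the action would touch
    violates_user_values: bool = False  # stand-in for a value-alignment check
    violates_regulation: bool = False   # stand-in for an ethics/law/regulation check
    is_emergency: bool = False          # stand-in for an emergency detector


@dataclass
class DecisionGate:
    """Applies the four listed controls in an assumed order."""
    allowed_resources: set = field(default_factory=set)
    shut_off: bool = False  # item 3: shut-off switch

    def check(self, action: Action) -> Verdict:
        if self.shut_off:
            return Verdict.BLOCK        # item 3: switch overrides everything
        if action.resource not in self.allowed_resources:
            return Verdict.BLOCK        # item 4: limited access to user resources
        if action.violates_user_values:
            return Verdict.BLOCK        # item 1: alignment with the human user
        if action.violates_regulation:
            return Verdict.BLOCK        # item 2: societal ethics, laws, regulations
        if action.is_emergency:
            return Verdict.ESCALATE     # item 3: human intervention option
        return Verdict.ALLOW


gate = DecisionGate(allowed_resources={"calendar"})
print(gate.check(Action("read_events", "calendar")))
print(gate.check(Action("transfer_funds", "bank_account")))
```

The sketch places the shut-off switch and the resource limit ahead of the value and regulation checks so that they hold even if the softer checks are wrong; that ordering is a design choice made here for illustration.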
Taxonomy
Topics: Ethics and Social Impacts of AI · Human-Automation Interaction and Safety · Adversarial Robustness in Machine Learning
