Aligning Artificial Intelligence with Humans through Public Policy
John Nay, James Daily

TL;DR
This paper explores how public policy data can be used to align AI systems with human values by enabling AI to understand and predict policy impacts, thereby improving human-AI alignment in high-stakes societal contexts.
Contribution
The paper proposes using public policy data as a source from which AI can learn human values, and demonstrates the idea with a case study on predicting the relevance of legislation to companies.
Findings
AI can predict legislation relevance to companies
Policy data helps AI understand human values
Aligning AI with policy improves societal outcomes
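To make the first finding concrete, the following is a minimal sketch of what scoring the relevance of legislation to a company could look like. It is not the paper's method, which this summary does not describe; it substitutes a simple bag-of-words cosine similarity. The function names, bill titles, and text snippets are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance_score(bill_text, company_description):
    """Score in [0, 1]: word overlap between a bill and a company description."""
    return cosine_similarity(
        Counter(tokenize(bill_text)),
        Counter(tokenize(company_description)),
    )


def rank_bills(bills, company_description):
    """Return (title, score) pairs sorted from most to least relevant."""
    scored = [
        (title, relevance_score(text, company_description))
        for title, text in bills.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical bills and company description, invented for this example.
    bills = {
        "Hypothetical Solar Tax Credit Act":
            "tax credit for solar energy panel installation and manufacturers",
        "Hypothetical Dairy Labeling Act":
            "labeling requirements for milk and dairy products",
    }
    company = "manufacturer of solar energy panels for homes"
    for title, score in rank_bills(bills, company):
        print(f"{score:.3f}  {title}")
```

Word overlap is a crude proxy: it misses synonyms and inflected forms ("panel" vs. "panels") and says nothing about how a bill would actually affect a company. A learned model trained on labeled bill-company pairs would be the natural next step, but that lies beyond what this summary documents.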
Abstract
Given that Artificial Intelligence (AI) increasingly permeates our lives, it is critical that we systematically align AI objectives with the goals and values of humans. The human-AI alignment problem stems from the impracticality of explicitly specifying the rewards that AI models should receive for all the actions they could take in all relevant states of the world. One possible solution, then, is to leverage the capabilities of AI models to learn those rewards implicitly from a rich source of data describing human values in a wide range of contexts. The democratic policy-making process produces just such data by developing specific rules, flexible standards, interpretable guidelines, and generalizable precedents that synthesize citizens' preferences over potential actions taken in many states of the world. Therefore, computationally encoding public policies to make them legible to AI…
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: ALIGN
