
TL;DR
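This paper models choice delegated to an agentic AI that may misinterpret a menu before recommending an option. It shows that double monotonicity of the interpretation makes alignment with a decision maker verifiable, and that idempotence is additionally needed for recommendations to be fully rational and to stay within the original feasible set.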
Contribution
It develops a formal framework for AI choice behavior in which the AI interprets, and may misinterpret, a menu before making a recommendation, and it identifies conditions under which alignment with a decision maker's preference can be verified.
Findings
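Double monotonicity ensures full identifiability of preferences and internal consistency.
Idempotence is additionally required to guarantee that recommendations are fully rational and remain grounded within the original feasible set.
A single acyclicity condition is enough to rationalize the AI's recommendations, but the rationalizing preference is generally not unique, so nothing prevents it from being misaligned with the decision maker's preference.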
Abstract
This paper proposes a model of choice via agentic artificial intelligence (AI). A key feature is that the AI may misinterpret a menu before recommending what to choose. A single acyclicity condition guarantees that there is a monotonic interpretation and a strict preference relation that together rationalize the AI's recommendations. Since this preference is in general not unique, there is no safeguard against it misaligning with that of a decision maker. What enables the verification of such AI alignment is interpretations satisfying double monotonicity. Indeed, double monotonicity ensures full identifiability and internal consistency. However, an additional idempotence property is required to guarantee that recommendations are fully rational and remain grounded within the original feasible set.
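The structure described in the abstract (interpret the menu, then pick the best perceived option) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the alternatives `x`, `y`, `z`, the toy interpretation `add_z_if_y`, and the helper names are hypothetical, and the two checks use the standard set-theoretic readings of "monotonic" (A ⊆ B implies I(A) ⊆ I(B)) and "idempotent" (I(I(A)) = I(A)). The paper's own definitions, including double monotonicity, are not given in this summary and may differ.

```python
from itertools import combinations

ALTERNATIVES = ["x", "y", "z"]  # hypothetical alternatives


def all_menus(alternatives):
    """Every non-empty menu (subset) of a finite set of alternatives."""
    return [frozenset(c)
            for r in range(1, len(alternatives) + 1)
            for c in combinations(alternatives, r)]


def add_z_if_y(menu):
    """Toy interpretation (assumed): the AI perceives 'z' whenever 'y' is offered."""
    menu = frozenset(menu)
    return menu | {"z"} if "y" in menu else menu


def recommend(menu, interpret, ranking):
    """Recommend the top-ranked alternative in the *interpreted* menu.

    `ranking` lists alternatives from best to worst (a strict preference).
    """
    perceived = interpret(frozenset(menu))
    return min(perceived, key=ranking.index)


def is_monotonic(interpret, menus):
    """Assumed reading: A subset of B implies I(A) subset of I(B)."""
    return all(interpret(a) <= interpret(b)
               for a in menus for b in menus if a <= b)


def is_idempotent(interpret, menus):
    """Assumed reading: interpreting an interpreted menu changes nothing."""
    return all(interpret(interpret(a)) == interpret(a) for a in menus)


MENUS = all_menus(ALTERNATIVES)
print(is_monotonic(add_z_if_y, MENUS))
print(is_idempotent(add_z_if_y, MENUS))
print(recommend({"x", "y"}, add_z_if_y, ["x", "z", "y"]))
```

Because the checks enumerate every menu, this brute-force approach only suits small sets of alternatives; it is meant to make the two property names concrete, not to reproduce the paper's results.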
Taxonomy
Topics: Ethics and Social Impacts of AI · Decision-Making and Behavioral Economics · Explainable Artificial Intelligence (XAI)
