Formalizing Embeddedness Failures in Universal Artificial Intelligence
Cole Wyeth, Marcus Hutter

TL;DR
This paper attempts to formalize the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency and prove that they occur within the framework of universal artificial intelligence.
Contribution
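It attempts to formalize the embeddedness failure modes commonly attributed to AIXI and prove that they occur, focusing on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution, and it evaluates progress towards a theory of embedded agency based on variants of the AIXI agent.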
Findings
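The embeddedness failures commonly attributed to AIXI are argued to occur within the framework of universal artificial intelligence.
The failure modes are formalized with a focus on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution.
Progress towards a successful theory of embedded agency based on variants of the AIXI agent is assessed.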
Abstract
We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial intelligence, focusing on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution. We also evaluate the progress that has been made towards a successful theory of embedded agency based on variants of the AIXI agent.
