Dynamic Knowledge Injection for AIXI Agents

Samuel Yang-Zhao, Kee Siong Ng, and Marcus Hutter

arXiv:2312.16184 · cs.AI · December 29, 2023 · 1 citation

Open Access · 1 Video

TL;DR

This paper introduces DynamicHedgeAIXI, an agent that updates its Bayesian mixture as a human operator supplies new candidate models online. This gives a closer approximation of AIXI when the predefined model class is systematically biased. Experiments on epidemic control illustrate the approach.

Contribution

The paper presents DynamicHedgeAIXI, the first agent to maintain an exact Bayesian mixture over dynamically changing sets of models, using a time-adaptive prior constructed from a variant of the Hedge algorithm.

Findings

01. DynamicHedgeAIXI effectively incorporates new models from humans.

02. The agent provides strong performance guarantees.

03. Experimental validation on epidemic control shows practical utility.

Abstract

Prior approximations of AIXI, a Bayesian optimality notion for general reinforcement learning, can only approximate AIXI's Bayesian environment model using an a-priori defined set of models. This is a fundamental source of epistemic uncertainty for the agent in settings where the existence of systematic bias in the predefined model class cannot be resolved by simply collecting more data from the environment. We address this issue in the context of Human-AI teaming by considering a setup where additional knowledge for the agent in the form of new candidate models arrives from a human operator in an online fashion. We introduce a new agent called DynamicHedgeAIXI that maintains an exact Bayesian mixture over dynamically changing sets of models via a time-adaptive prior constructed from a variant of the Hedge algorithm. The DynamicHedgeAIXI agent is the richest direct approximation of AIXI…
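The idea of a Bayesian mixture whose model set grows over time can be illustrated with a small sketch. This is not the paper's DynamicHedgeAIXI construction: the class name `GrowingMixture`, the `new_mass` parameter, and the rule that a newly arrived model receives a fixed fraction of the total weight are all assumptions made for illustration, standing in for the time-adaptive prior that the abstract mentions. The one standard fact it relies on is that the Bayesian posterior update coincides with an exponential-weights (Hedge-style) update under log loss with learning rate 1.

```python
import math


class GrowingMixture:
    """Bayesian mixture over a model set that can grow over time (illustrative only)."""

    def __init__(self):
        self.models = {}       # name -> callable(history, symbol) -> probability
        self.log_weights = {}  # name -> unnormalised log weight

    def _log_total(self):
        # Log of the summed weights, computed stably (log-sum-exp).
        m = max(self.log_weights.values())
        return m + math.log(sum(math.exp(v - m) for v in self.log_weights.values()))

    def add_model(self, name, predict, new_mass=0.1):
        # ASSUMPTION: a newcomer receives the fraction `new_mass` of the total
        # weight. This is a hypothetical rule, not the paper's prior.
        if not self.models:
            self.log_weights[name] = 0.0
        else:
            # Choose w so that w / (W + w) == new_mass, where W is the current total.
            self.log_weights[name] = self._log_total() + math.log(new_mass / (1.0 - new_mass))
        self.models[name] = predict

    def posterior(self):
        total = self._log_total()
        return {k: math.exp(v - total) for k, v in self.log_weights.items()}

    def predict(self, history, symbol):
        # Mixture probability of `symbol` given `history`.
        post = self.posterior()
        return sum(post[k] * self.models[k](history, symbol) for k in self.models)

    def update(self, history, symbol):
        # Bayes' rule in log space: multiply each weight by the model's likelihood.
        for k, model in self.models.items():
            p = max(model(history, symbol), 1e-300)  # guard against log(0)
            self.log_weights[k] += math.log(p)


def bernoulli(p):
    """Toy model: predicts symbol 1 with fixed probability p, ignoring history."""
    return lambda history, symbol: p if symbol == 1 else 1.0 - p


mix = GrowingMixture()
mix.add_model("fair", bernoulli(0.5))
mix.add_model("low", bernoulli(0.2))

history = []
for t, x in enumerate([1, 1, 0, 1, 1, 1, 0, 1]):
    if t == 4:
        # A better-fitting candidate arrives mid-stream, as from a human operator.
        mix.add_model("high", bernoulli(0.8))
    mix.update(history, x)
    history.append(x)

post = mix.posterior()
```

In this toy run the late-arriving "high" model ends with more posterior weight than the share it was injected with, because it fits the later observations better than its competitors. How the injected share should be chosen, and what guarantees follow, is exactly what the paper's time-adaptive prior addresses and what this sketch leaves out.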

Videos

Dynamic Knowledge Injection for AIXI Agents · underline

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference

Methods: Sparse Evolutionary Training