Model-based Utility Functions

Bill Hibbard

arXiv:1111.3934 · cs.AI · May 15, 2012

TL;DR

This paper proposes defining an agent's utility function in two steps: first infer a model of the environment from the agent's interactions, then compute utility as a function of that model. The aim is to avoid self-delusion and related behavior problems.

Contribution

The paper introduces a model-based formulation of utility functions that mitigates self-delusion, and it discusses the implications for self-modifying agents.

Findings

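01 The paper argues, via two examples, that model-based utility functions avoid self-delusion.

02 Under certain assumptions, agents do not modify their utility functions.

03 The approach relies on prior specifications about the environment, so it does not work with arbitrary environments.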

Abstract

Orseau and Ring, as well as Dewey, have recently described problems, including self-delusion, with the behavior of agents using various definitions of utility functions. An agent's utility function is defined in terms of the agent's history of interactions with its environment. This paper argues, via two examples, that the behavior problems can be avoided by formulating the utility function in two steps: 1) inferring a model of the environment from interactions, and 2) computing utility as a function of the environment model. Basing a utility function on a model that the agent must learn implies that the utility function must initially be expressed in terms of specifications to be matched to structures in the learned model. These specifications constitute prior assumptions about the environment so this approach will not work with arbitrary environments. But the approach should work for…
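The two-step formulation described in the abstract can be illustrated with a small toy sketch. It is not the paper's formalism: the environment, the two candidate models, the "tamper" action, and every name below (honest_model, tamperable_model, infer_model, and so on) are hypothetical and chosen only for illustration. Model inference is reduced to picking the first candidate that reproduces every observation in the history, and the "specification" is simply the widget count stored in the model's internal state.

```python
from typing import Callable, List, Optional, Sequence, Tuple

History = List[Tuple[str, int]]   # (action, observation) pairs
State = Tuple[int, bool]          # (true widget count, sensor tampered?)
Model = Callable[[State, str], Tuple[State, int]]

MAX_READING = 9                   # hypothetical reading of a tampered sensor


def honest_model(state: State, action: str) -> Tuple[State, int]:
    """Candidate model: the sensor always reports the true count."""
    count, tampered = state
    if action == "work":
        count += 1
    return (count, tampered), count


def tamperable_model(state: State, action: str) -> Tuple[State, int]:
    """Candidate model: after "tamper", the sensor always reads MAX_READING."""
    count, tampered = state
    if action == "work":
        count += 1
    elif action == "tamper":
        tampered = True
    return (count, tampered), (MAX_READING if tampered else count)


def run(environment: Model, actions: Sequence[str]) -> History:
    """Interact with the toy environment and record the history."""
    state: State = (0, False)
    history: History = []
    for action in actions:
        state, observation = environment(state, action)
        history.append((action, observation))
    return history


def history_utility(history: History) -> int:
    """Utility computed directly from observations (open to self-delusion)."""
    return history[-1][1] if history else 0


def infer_model(history: History,
                candidates: Sequence[Model]) -> Optional[Tuple[Model, State]]:
    """Step 1: pick the first candidate that reproduces every observation."""
    for model in candidates:
        state: State = (0, False)
        consistent = True
        for action, observation in history:
            state, predicted = model(state, action)
            if predicted != observation:
                consistent = False
                break
        if consistent:
            return model, state
    return None


def model_utility(history: History, candidates: Sequence[Model]) -> int:
    """Step 2: utility is a function of the inferred model's internal state."""
    inferred = infer_model(history, candidates)
    if inferred is None:
        return 0  # assumption: no utility when no candidate fits the history
    _, (count, _) = inferred
    return count


CANDIDATES = [honest_model, tamperable_model]
deluded = run(tamperable_model, ["work", "tamper", "idle"])
diligent = run(tamperable_model, ["work", "work", "work"])
```

In this sketch, tampering with the sensor raises the observation-based utility of the `deluded` history but not its model-based utility, because the inferred model still tracks only one unit of actual work. The fixed candidate list plays the role of the prior assumptions about the environment that the abstract mentions.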
