A Definition of Happiness for Reinforcement Learning Agents

Mayank Daswani; Jan Leike

arXiv:1505.04497 · cs.AI · May 19, 2015



TL;DR

This paper proposes a formal definition of happiness for reinforcement learning agents: the temporal difference error. The definition is compatible with empirical research on humans and satisfies most of the authors' desiderata.

Contribution

It introduces a novel formal definition of happiness for RL agents based on temporal difference error, bridging AI and human happiness research.

Findings

01. The proposed definition is compatible with empirical research on humans.

02. It satisfies most of the proposed desiderata.

03. The authors state several implications of the definition and discuss examples.

Abstract

What is happiness for reinforcement learning agents? We seek a formal definition satisfying a list of desiderata. Our proposed definition of happiness is the temporal difference error, i.e. the difference between the value of the obtained reward and observation and the agent's expectation of this value. This definition satisfies most of our desiderata and is compatible with empirical research on humans. We state several implications and discuss examples.
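The quantity named in the abstract can be illustrated with the standard one-step temporal difference error from reinforcement learning. The sketch below is an assumption about how that quantity is usually computed, not the authors' exact formalism: the function name `td_error`, the discount factor `gamma`, and all numeric values are hypothetical and chosen only for illustration.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """One-step temporal difference error.

    (reward + gamma * value_next) is the value of what was actually obtained;
    value_current is what the agent expected beforehand.
    Under the abstract's definition, this difference is read as "happiness".
    """
    return reward + gamma * value_next - value_current


# Hypothetical scenarios (all numbers are illustrative assumptions).

# The agent expected nothing and received a reward: positive surprise.
pleasant_surprise = td_error(reward=1.0, value_next=0.0, value_current=0.0)  # 1.0

# The agent expected exactly the reward it received: no surprise.
as_expected = td_error(reward=1.0, value_next=0.0, value_current=1.0)  # 0.0

# The agent expected a reward and received none: negative surprise.
disappointment = td_error(reward=0.0, value_next=0.0, value_current=1.0)  # -1.0
```

In this reading, the same reward of 1.0 yields positive happiness when it was unexpected and zero happiness when it was fully anticipated, which is the sense in which the definition depends on the agent's expectation rather than on the reward alone.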


Datasets

Kylan12/Synthetic-AI-ML-Dataset · dataset · 42 downloads


Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Neural dynamics and brain function