Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

Peter Vamplew; Benjamin J. Smith; Johan Kallstrom; Gabriel Ramos; Roxana Radulescu; Diederik M. Roijers; Conor F. Hayes; Fredrik Heintz; Patrick Mannion; Pieter J.K. Libin; Richard Dazeley; Cameron Foale

arXiv:2112.15422·cs.AI·January 3, 2022·6 cites

Open Access

TL;DR

This paper challenges the idea that scalar reward functions are sufficient for intelligence, arguing for multi-objective models to better capture biological and computational complexity and ensure safety in artificial intelligence development.

Contribution

It critiques the scalar reward assumption in AI, advocates for multi-objective reward models, and discusses safety concerns in artificial general intelligence.

Findings

1. Scalar rewards are insufficient for complex intelligence.
2. Multi-objective reward models better reflect biological intelligence.
3. Scalar reward-based AI poses safety and ethical risks.

Abstract

The recent paper "Reward is Enough" by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial. We contest the underlying assumption of Silver et al. that such reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to account for some aspects of both biological and computational intelligence, and argue in favour of explicitly multi-objective models of reward maximisation. Furthermore, we contend that even if scalar reward functions can trigger intelligent behaviour in specific cases, it is still undesirable to use this approach for the development of artificial general intelligence due to unacceptable risks of unsafe or unethical behaviour.

Taxonomy

Topics: Receptor Mechanisms and Signaling · Computational Drug Discovery Methods