Is there Value in Reinforcement Learning?
Lior Fox, Yonatan Loewenstein

TL;DR
This paper critically examines the role of value representations in reinforcement learning models. It argues that policy-gradient methods still rely on value for learning, and calls for a reevaluation of modeling assumptions and model complexity in cognitive science.
Contribution
It challenges the notion that policy-gradient models are value-free and advocates for a focus on underlying assumptions and complexity in modeling behavior.
Findings
Policy-gradient methods still depend on value representations for learning.
The need for a value representation stems from the optimization objective assumed by the standard RL framework, not from the particular algorithm.
A nuanced view of model complexity should include computational aspects.
Abstract
Action-values play a central role in popular Reinforcement Learning (RL) models of behavior. Yet, the idea that action-values are explicitly represented has been extensively debated. Critics have therefore repeatedly suggested that policy-gradient (PG) models should be favored over value-based (VB) ones, as a potential solution to this dilemma. Here we argue that this solution is unsatisfying. This is because PG methods are not, in fact, "Value-free" -- while they do not rely on an explicit representation of Value for acting (stimulus-response mapping), they do require it for learning. Hence, switching to PG models is, per se, insufficient for eliminating Value from models of behavior. More broadly, the requirement for a representation of Value stems from the underlying assumptions regarding the optimization objective posed by the standard RL framework, not from the particular algorithm…
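The distinction between acting and learning can be made concrete with a small sketch. The code below is an illustration only, not the authors' model: it assumes a two-armed Bernoulli bandit and a softmax policy trained with a REINFORCE-style update, and all names (`run_bandit`, `prefs`, `baseline`, the reward probabilities and learning rates) are hypothetical. Action selection reads only the policy parameters, while the update is scaled by the sampled reward relative to a running estimate of expected reward, which is a value-like quantity.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit(reward_probs=(0.2, 0.8), steps=5000, lr=0.1,
               baseline_lr=0.05, seed=0):
    """REINFORCE-style learning on a Bernoulli bandit (illustrative assumptions)."""
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_probs)  # policy parameters (action preferences)
    baseline = 0.0                     # running estimate of expected reward

    for _ in range(steps):
        probs = softmax(prefs)

        # Acting: the choice depends only on the policy parameters.
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Learning: the update is scaled by reward minus a value-like baseline.
        advantage = reward - baseline
        for a in range(len(prefs)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log_pi

        # The baseline tracks the average reward obtained under the policy.
        baseline += baseline_lr * (reward - baseline)

    return softmax(prefs), baseline


if __name__ == "__main__":
    final_probs, final_baseline = run_bandit()
    print("policy:", final_probs)
    print("baseline (expected-reward estimate):", final_baseline)
```

The baseline is optional in this kind of update. Dropping it does not remove the dependence on reward: the preference change is then scaled by the sampled reward itself, which is a noisy estimate of the expected return under the current policy.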
Taxonomy
Topics: Behavioral and Psychological Studies · Embodied and Extended Cognition · Reinforcement Learning in Robotics
