Weber-Fechner Law in Temporal Difference learning derived from Control as Inference

Keiichiro Takahashi; Taisuke Kobayashi; Tomoya Yamanokuchi; and Takamitsu Matsubara

arXiv:2412.21004 · cs.LG · February 25, 2026

TL;DR

This paper introduces a nonlinear update rule for reinforcement learning that follows the Weber-Fechner law. Motivated by biological findings, the rule biases learning so that reward acquisition is accelerated and punishments are suppressed.

Contribution

The paper derives a novel nonlinear TD-error update rule from control as inference. The rule exhibits the Weber-Fechner law and is intended to improve both RL performance and biological plausibility.

Findings

01 Accelerates reward acquisition in RL tasks

02 Suppresses punishments effectively during learning

03 Validated through simulations and robot experiments

Abstract

This paper investigates a novel nonlinear update rule based on temporal difference (TD) errors in reinforcement learning (RL). In standard RL, the degree of update is linearly proportional to the TD error, so all rewards are treated equally, without any bias. Recent biological studies, on the other hand, have revealed nonlinearities between the TD error and the degree of update, which bias policies toward optimism or pessimism. Such learning biases are expected to be useful features that have been intentionally retained in biological learning. Therefore, this research explores a theoretical framework that can leverage the nonlinearity between the degree of update and the TD error. To this end, we focus on the control as inference framework, since it is known as a generalized formulation encompassing various RL and optimal control methods. In…
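To make the contrast between a linear and a nonlinear update concrete, the following minimal Python sketch compares a standard TD(0) value update with a log-compressed one. The compression `sign(delta) * log(1 + |delta|)` is an assumption chosen only because the Weber-Fechner law relates perceived intensity to the logarithm of stimulus magnitude; it is not the update rule derived in the paper, and because it is symmetric in the sign of the TD error it does not reproduce the optimistic or pessimistic bias the abstract describes. All function names, the learning rate, and the discount factor are hypothetical.

```python
import math


def td_error(reward, value, next_value, gamma=0.99):
    """One-step TD error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


def linear_update(value, delta, lr=0.1):
    """Standard TD(0) update: the step is linearly proportional to delta."""
    return value + lr * delta


def log_compressed_update(value, delta, lr=0.1):
    """Illustrative nonlinear update (an assumption, not the paper's rule).

    The step grows with log(1 + |delta|), so large TD errors are
    compressed in a Weber-Fechner-like way while the sign is preserved.
    """
    step = math.copysign(math.log1p(abs(delta)), delta)
    return value + lr * step


if __name__ == "__main__":
    # Compare how each rule reacts as the TD error grows tenfold.
    for delta in (0.1, 1.0, 10.0):
        lin = linear_update(0.0, delta)
        log = log_compressed_update(0.0, delta)
        print(f"delta={delta:5.1f}  linear={lin:.4f}  log-compressed={log:.4f}")
```

For small TD errors the two rules take nearly the same step, since `log(1 + x)` is close to `x` near zero; they diverge as the error grows, which is the kind of nonlinearity between TD error and degree of update that the abstract refers to.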

Taxonomy

Topics: Neural Networks and Applications

Methods: Focus