Modulation of viability signals for self-regulatory control
Alvaro Ovalle, Simon M. Lucas

TL;DR
This paper investigates how agents can modulate their own viability signals to improve self-regulatory control. The agent learns a preference distribution and minimizes surprisal in a dynamic environment, combining active inference with reinforcement learning.
Contribution
It introduces a self-supervised approach for agents to learn preference distributions and modulate signals, bridging active inference and reinforcement learning in dynamic settings.
Findings
Self-supervised learning of preference distributions improves agent adaptability.
Modulating viability signals enhances self-regulatory control in dynamic environments.
The approach unifies active inference with reinforcement learning methods.
Abstract
We revisit the role of instrumental value as a driver of adaptive behavior. In active inference, instrumental or extrinsic value is quantified by the information-theoretic surprisal of a set of observations, measuring the extent to which those observations conform to prior beliefs or preferences. That is, an agent is expected to seek the type of evidence that is consistent with its own model of the world. For reinforcement learning tasks, the distribution of preferences replaces the notion of reward. We explore a scenario in which the agent learns this distribution in a self-supervised manner. In particular, we highlight the distinction between observations induced by the environment and those pertaining more directly to the continuity of an agent in time. We evaluate our methodology in a dynamic environment with discrete time and actions. First with a surprisal-minimizing model-free…
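To make the role of surprisal concrete, the following is a minimal sketch, not the authors' implementation. It assumes a categorical preference distribution over a small set of discrete observations, estimated from counts with a smoothing pseudocount, and it uses the negative surprisal, that is log p(o), as a reward-like signal. The class name `PreferenceModel`, the function `reward_from_surprisal`, the outcome labels, and the pseudocount are all hypothetical choices made for illustration.

```python
import math


class PreferenceModel:
    """Hypothetical categorical preference distribution learned from counts."""

    def __init__(self, outcomes, pseudocount=1.0):
        # Assumption: every outcome starts with the same pseudocount (smoothing),
        # so no observation ever has zero probability.
        self.counts = {o: float(pseudocount) for o in outcomes}

    def update(self, observation):
        # Self-supervised update: the agent's own observations shape its preferences.
        self.counts[observation] += 1.0

    def prob(self, observation):
        total = sum(self.counts.values())
        return self.counts[observation] / total

    def surprisal(self, observation):
        # Information-theoretic surprisal: -ln p(o).
        return -math.log(self.prob(observation))


def reward_from_surprisal(model, observation):
    # Assumption: the reward-like signal is the negative surprisal, so
    # observations that match the learned preferences score higher.
    return -model.surprisal(observation)


# Illustrative usage with made-up outcome labels.
model = PreferenceModel(["alive", "dead"])
for _ in range(8):
    model.update("alive")

print(model.surprisal("alive"), model.surprisal("dead"))
```

In this toy setting the frequently observed outcome has low surprisal and the rare one has high surprisal, so an agent that maximizes `reward_from_surprisal` is drawn toward observations consistent with its learned preferences.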
