Reinforcement Learning with Information-Theoretic Actuation
Elliot Catt, Marcus Hutter, Joel Veness

TL;DR
This paper proposes a reinforcement learning framework that treats each action as the output of a sequence of internal decisions made with respect to an action model, using information-theoretic techniques so that large sequence models can serve as prior knowledge for multi-task learning.
Contribution
It augments the standard MDP formalism with a sequential notion of internal action, constructed with information-theoretic techniques, so that prior knowledge from large sequence models can be used in multi-task reinforcement learning.
Findings
Formalized internal action sequences within MDPs.
Defined consistent internal and external action value functions.
Positioned large sequence models as prior knowledge for multi-task reinforcement learning.
Abstract
Reinforcement Learning formalises an embodied agent's interaction with the environment through observations, rewards and actions. But where do the actions come from? Actions are often considered to represent something external, such as the movement of a limb, a chess piece, or more generally, the output of an actuator. In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model. This view is particularly well-suited for leveraging the recent advances in large sequence models as prior knowledge for multi-task reinforcement learning problems. Our main contribution in this work is to show how to augment the standard MDP formalism with a sequential notion of internal action using information-theoretic techniques, and that this leads to self-consistent definitions of both…
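As a rough illustration of the idea that an external action can be the output of a sequence of internal choices made with respect to an action model, the sketch below builds a binary prefix code from a toy action distribution and emits an external action once the agent's internal binary choices spell out a codeword. This is not the paper's construction, which this summary does not spell out: the Huffman code is swapped in here for illustration, and `build_prefix_code`, `emit_external_action`, and the toy `action_model` are hypothetical names and values.

```python
import heapq
from itertools import count


def build_prefix_code(action_model):
    """Build a Huffman prefix code mapping each external action to a bit string.

    Assumption: the action model is a dict {action: probability} with at
    least two actions. Likely actions receive short codewords.
    """
    tie = count()  # tie-breaker so heapq never has to compare dicts
    heap = [(p, next(tie), {a: ""}) for a, p in action_model.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {a: "0" + w for a, w in c0.items()}
        merged.update({a: "1" + w for a, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def emit_external_action(code, internal_choices):
    """Consume internal binary choices until they form a complete codeword.

    Returns the selected external action and the number of internal
    choices that were needed to select it.
    """
    decode = {w: a for a, w in code.items()}
    prefix = ""
    for steps, bit in enumerate(internal_choices, start=1):
        prefix += str(bit)
        if prefix in decode:
            return decode[prefix], steps
    raise ValueError("internal choices ended before an action was selected")


# Hypothetical action model over four external actions.
action_model = {"left": 0.5, "right": 0.25, "up": 0.125, "down": 0.125}
code = build_prefix_code(action_model)
action, n_internal = emit_external_action(code, [1, 1, 0])
print(code, action, n_internal)
```

Under this toy scheme, an action the model considers likely takes few internal choices to select, while an unlikely one takes more, which is one simple way an action model can act as prior knowledge over external actions.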
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Formal Methods in Verification
