Action-Driven Processes for Continuous-Time Control
Ruimin He, Shaowei Lin

TL;DR
This paper unifies stochastic processes and reinforcement learning through action-driven processes, demonstrating their application to spiking neural networks and linking control-as-inference with maximum entropy RL.
Contribution
It introduces action-driven processes as a unified framework for stochastic processes and reinforcement learning, applies them to spiking neural networks, and shows that control-as-inference on a suitably defined action-driven process recovers maximum entropy reinforcement learning.
Findings
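Minimizing the KL divergence between a policy-driven true distribution and a reward-driven model distribution is equivalent to maximum entropy reinforcement learning.
Action-driven processes can be applied to spiking neural networks.
Control-as-inference and maximum entropy reinforcement learning are linked theoretically through a suitably defined action-driven process.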
Abstract
At the heart of reinforcement learning are actions -- decisions made in response to observations of the environment. Actions are equally fundamental in the modeling of stochastic processes, as they trigger discontinuous state transitions and enable the flow of information through large, complex systems. In this paper, we unify the perspectives of stochastic processes and reinforcement learning through action-driven processes, and illustrate their application to spiking neural networks. Leveraging ideas from control-as-inference, we show that minimizing the Kullback-Leibler divergence between a policy-driven true distribution and a reward-driven model distribution for a suitably defined action-driven process is equivalent to maximum entropy reinforcement learning.
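The equivalence stated in the abstract can be illustrated with the familiar discrete-time control-as-inference argument. The sketch below is not the paper's continuous-time derivation for action-driven processes. It assumes a finite-horizon trajectory \tau = (s_0, a_0, ..., s_T, a_T), a policy \pi, dynamics p(s_{t+1} | s_t, a_t), a reward r, and a normalizing constant Z, none of which are notation taken from the paper.

```latex
% Assumed (hypothetical) notation: q is the policy-driven "true" distribution,
% p is the reward-driven "model" distribution, Z normalizes p.
\begin{align*}
q(\tau) &= p(s_0) \prod_{t=0}^{T} \pi(a_t \mid s_t)
           \prod_{t=0}^{T-1} p(s_{t+1} \mid s_t, a_t), \\
p(\tau) &= \frac{1}{Z}\, p(s_0) \prod_{t=0}^{T} \exp\bigl(r(s_t, a_t)\bigr)
           \prod_{t=0}^{T-1} p(s_{t+1} \mid s_t, a_t), \\
D_{\mathrm{KL}}(q \,\|\, p)
        &= \mathbb{E}_{q}\bigl[\log q(\tau) - \log p(\tau)\bigr] \\
        &= \mathbb{E}_{q}\Bigl[\sum_{t=0}^{T} \bigl(\log \pi(a_t \mid s_t)
           - r(s_t, a_t)\bigr)\Bigr] + \log Z \\
        &= -\,\mathbb{E}_{q}\Bigl[\sum_{t=0}^{T} \bigl(r(s_t, a_t)
           + \mathcal{H}\bigl(\pi(\cdot \mid s_t)\bigr)\bigr)\Bigr] + \log Z.
\end{align*}
```

The initial-state and dynamics terms cancel in the log ratio, and \log Z does not depend on \pi, so under these assumptions minimizing the KL divergence over \pi amounts to maximizing expected reward plus policy entropy, which is the maximum entropy reinforcement learning objective.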
