Anticipatory Reinforcement Learning: From Generative Path-Laws to Distributional Value Functions
Daniel Bloch

TL;DR
This paper presents Anticipatory Reinforcement Learning, a framework that embeds path history into a signature manifold to enable stable, forward-looking decision-making in complex, non-Markovian environments.
Contribution
It introduces a signature-augmented state space and a self-consistent field approach to improve foresight and stability in reinforcement learning for path-dependent processes.
Findings
Reduces computational complexity and evaluation variance by replacing stochastic branching with a single-pass linear evaluation.
Ensures stable generalisation in heavy-tailed noise environments.
Enables proactive risk management in volatile settings.
Abstract
This paper introduces Anticipatory Reinforcement Learning (ARL), a novel framework designed to bridge the gap between non-Markovian decision processes and classical reinforcement learning architectures, specifically under the constraint of a single observed trajectory. In environments characterised by jump-diffusions and structural breaks, traditional state-based methods often fail to capture the essential path-dependent geometry required for accurate foresight. We resolve this by lifting the state space into a signature-augmented manifold, where the history of the process is embedded as a dynamical coordinate. By utilising a self-consistent field approach, the agent maintains an anticipated proxy of the future path-law, allowing for a deterministic evaluation of expected returns. This transition from stochastic branching to a single-pass linear evaluation significantly reduces…
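To make the idea of a signature-augmented state concrete, here is a minimal sketch of a generic depth-2 truncated path signature appended to the current observation. It is not the paper's implementation: the function names `signature_level2` and `augmented_state`, the truncation depth, and the piecewise-linear interpolation of the observed trajectory are all assumptions chosen for illustration. Each new segment is folded in with Chen's identity, so the history coordinate can be maintained incrementally along a single trajectory.

```python
import numpy as np

def signature_level2(path):
    """Depth-2 signature of a piecewise-linear path of shape (T, d).

    Hypothetical helper for illustration; returns (level 1, level 2) terms.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)        # level 1: total increment of the path
    s2 = np.zeros((d, d))   # level 2: iterated integrals of coordinate pairs
    for dx in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment dx:
        # new_s2 = s2 + s1 (outer) dx + (dx (outer) dx) / 2
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2

def augmented_state(path):
    """Current observation concatenated with the flattened depth-2 signature.

    Assumed layout: [x_T, level-1 terms, level-2 terms].
    """
    path = np.asarray(path, dtype=float)
    s1, s2 = signature_level2(path)
    return np.concatenate([path[-1], s1, s2.ravel()])

if __name__ == "__main__":
    # An L-shaped path in the plane: right one unit, then up one unit.
    path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    s1, s2 = signature_level2(path)
    levy_area = 0.5 * (s2[0, 1] - s2[1, 0])
    print(s1, s2, levy_area)
    print(augmented_state(path).shape)
```

The antisymmetric part of the level-2 term (the Lévy area) depends on the order in which the moves happen, which is the kind of path-dependent information a state built only from the latest observation cannot carry. Higher truncation depths follow the same recursion but grow as d^k in size.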
