Consistent time travel for realistic interactions with historical data: reinforcement learning for market making

Vincent Ragel; Damien Challet

arXiv:2408.02322·q-fin.TR·January 30, 2025

TL;DR

This paper introduces consistent data time travel for offline reinforcement learning, which makes an agent's interactions with historical data more realistic. In a market-making demonstration, it significantly improves the agent's gains.

Contribution

The paper proposes consistent data time travel, a method for offline RL that targets two difficulties in multi-agent systems such as financial markets: the environment is hard to simulate, and anonymous data make it impossible to infer the state of each agent.

Findings

01

Data time travel improves RL agent gains in market making.

02

The approach reduces the overestimation of RL difficulty in complex tasks.

03

It alleviates the need for imperfect models in offline RL.

Abstract

Reinforcement learning works best when the impact of the agent's actions on its environment can be perfectly simulated or fully appraised from available data. Some systems, however, are both hard to simulate and very sensitive to small perturbations. An additional difficulty arises when an RL agent is trained offline to be part of a multi-agent system using only anonymous data, which makes it impossible to infer the state of each agent and thus to use the data directly. Typical examples are competitive systems without agent-resolved data, such as financial markets. We introduce consistent data time travel for offline RL as a remedy for these problems: instead of using historical data in a sequential way, we argue that one needs to perform time travel in historical data, i.e., to adjust the time index so that both the past state and the influence of the RL agent's action on the system coincide with…
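The idea of adjusting the time index can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract is truncated here and does not specify the matching rule. The sketch assumes, purely for illustration, that a state is a (best bid, best ask) pair and that the "consistent" time index is the historical row closest in Euclidean distance to the state the agent's action would produce. The function name `time_travel_index` and the toy data are hypothetical.

```python
import numpy as np


def time_travel_index(history, target_state, exclude=None):
    """Return the index of the historical state closest to target_state.

    Assumption: closeness is plain Euclidean distance. A real matching
    rule would depend on the state definition used in the paper.
    """
    history = np.asarray(history, dtype=float)
    target = np.asarray(target_state, dtype=float)
    dists = np.linalg.norm(history - target, axis=1)
    if exclude is not None:
        # Optionally forbid jumping to a given index (e.g. the current one).
        dists[exclude] = np.inf
    return int(np.argmin(dists))


# Toy history: each row is a hypothetical (best bid, best ask) snapshot.
history = np.array([
    [99.0, 101.0],
    [99.0, 100.0],
    [100.0, 101.0],
    [99.0, 101.0],
])

# The agent is at index 0 and (hypothetically) posts a bid at 100,
# which would change the observed state to (100, 101).
state_after_action = np.array([100.0, 101.0])

# Instead of moving to index 1 sequentially, jump to the index whose
# recorded state agrees with the consequence of the agent's action.
t_next = time_travel_index(history, state_after_action)
```

In this toy history the sequential successor (index 1) shows a state that the agent's action could not have produced, whereas the selected index holds a state consistent with it, which is the intuition the abstract describes.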

Taxonomy

Topics: Complex Systems and Time Series Analysis