Robust Reinforcement Learning under Diffusion Models for Data with Jumps

Chenyang Jiang; Donggyu Kim; Alejandra Quintos; Yazhen Wang

arXiv:2411.11697 · cs.LG · September 19, 2025

Open Access

TL;DR

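This paper introduces the Mean-Square Bipower Variation Error (MSBVE) algorithm for robust continuous-time reinforcement learning when state dynamics follow stochastic differential equations with jumps. In these jump-diffusion settings it estimates values more accurately than the Mean-Square TD Error (MSTDE) algorithm.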

Contribution

The paper proposes the MSBVE algorithm, which minimizes the mean-square quadratic variation error to make continuous-time RL more robust when state dynamics follow jump-diffusion SDEs.

Findings

01. MSBVE outperforms MSTDE in environments with jumps.

02. Simulations confirm MSBVE's improved convergence and robustness.

03. Formal proofs validate the theoretical advantages of MSBVE.

Abstract

Reinforcement Learning (RL) has proven effective in solving complex decision-making tasks across various domains, but challenges remain in continuous-time settings, particularly when state dynamics are governed by stochastic differential equations (SDEs) with jump components. In this paper, we address this challenge by introducing the Mean-Square Bipower Variation Error (MSBVE) algorithm, which enhances robustness and convergence in scenarios involving significant stochastic noise and jumps. We first revisit the Mean-Square TD Error (MSTDE) algorithm, commonly used in continuous-time RL, and highlight its limitations in handling jumps in state dynamics. The proposed MSBVE algorithm minimizes the mean-square quadratic variation error, offering improved performance over MSTDE in environments characterized by SDEs with jumps. Simulations and formal proofs demonstrate that the MSBVE…
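
The abstract does not spell out the MSBVE objective, so the following is not the authors' algorithm. It is a minimal sketch of why jumps matter for variation-based quantities, using two classical estimators: realized variance (a sum of squared increments) and bipower variation (a scaled sum of products of adjacent absolute increments). The process, the parameter values, and the function names `simulate_increments`, `realized_variance`, and `bipower_variation` are all assumptions made for illustration. Jumps are placed at fixed, evenly spaced steps with a fixed magnitude, which is a simplification of a compound Poisson jump component.

```python
import math
import random


def simulate_increments(n_steps, horizon, sigma, n_jumps, jump_size, seed=0):
    """Increments of X_t = sigma * W_t + J_t on an even grid over [0, horizon].

    Hypothetical helper: J_t adds n_jumps jumps of magnitude jump_size
    (random sign) at evenly spaced, well-separated grid steps.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    # Brownian part: independent N(0, sigma^2 * dt) increments.
    increments = [sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    for k in range(n_jumps):
        idx = n_steps * (k + 1) // (n_jumps + 1)
        increments[idx] += jump_size * rng.choice([-1.0, 1.0])
    return increments


def realized_variance(increments):
    """Sum of squared increments; picks up diffusion variance plus squared jumps."""
    return sum(dx * dx for dx in increments)


def bipower_variation(increments):
    """(pi / 2) * sum of |dx_i| * |dx_{i-1}|; a single jump enters only
    multiplied by a small neighboring diffusion increment."""
    pairs = zip(increments[1:], increments[:-1])
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in pairs)


sigma, horizon = 0.2, 1.0
dx = simulate_increments(20000, horizon, sigma, n_jumps=3, jump_size=0.5)
integrated_variance = sigma ** 2 * horizon  # diffusion-only target, 0.04 here
print("integrated variance:", integrated_variance)
print("realized variance:  ", realized_variance(dx))
print("bipower variation:  ", bipower_variation(dx))
```

In this setup the realized variance is inflated by roughly the sum of squared jump sizes, while the bipower variation stays close to the diffusion-only integrated variance. That contrast is the intuition behind preferring a bipower-type quantity when the data contain jumps; how the paper turns it into a training objective is not recoverable from the abstract alone.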

Taxonomy

Topics: Reinforcement Learning in Robotics