Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation

Runze Zhao; Yue Yu; Adams Yiyue Zhu; Chen Yang; Dongruo Zhou

arXiv:2505.14821 · cs.LG · May 22, 2025

TL;DR

This paper introduces a theoretically grounded, sample-efficient, and computationally practical continuous-time reinforcement learning algorithm that leverages general function approximation, optimism-based confidence sets, and structured policy updates, with empirical validation on control tasks.

Contribution

It provides the first sample complexity guarantee for CTRL with general function approximation and proposes structured policy updates to enhance efficiency.

Findings

01

Achieves a near-optimal policy with a suboptimality gap of $\tilde{O}((d_{R} + d_{F}) N^{-1/2})$

02

Reduces number of policy updates and rollouts while maintaining performance

03

Demonstrates competitive results on continuous control and diffusion model tasks
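
To make the $N^{-1/2}$ rate in finding 01 concrete, here is a minimal sketch of how many measurements a target gap would require if the bound held with equality. It ignores constants and logarithmic factors, and the function name `measurements_for_gap` and the single `complexity` argument (standing in for the dimension-dependent factor) are illustrative assumptions, not part of the paper or its code.

```python
import math


def measurements_for_gap(target_gap: float, complexity: float = 1.0) -> int:
    """Smallest N with complexity * N**-0.5 <= target_gap.

    Assumption: the bound is treated as an equality with no constants or
    log factors; `complexity` is a hypothetical stand-in for the
    dimension-dependent term in the stated rate.
    """
    if target_gap <= 0 or complexity <= 0:
        raise ValueError("target_gap and complexity must be positive")
    # Solve complexity / sqrt(N) <= target_gap for N.
    return math.ceil((complexity / target_gap) ** 2)


if __name__ == "__main__":
    for gap in (0.5, 0.25, 0.125):
        print(gap, measurements_for_gap(gap))
```

Under these assumptions, halving the target gap quadruples the number of measurements: with `complexity=1.0`, a gap of 0.5 needs 4 measurements and a gap of 0.25 needs 16.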

Abstract

Continuous-time reinforcement learning (CTRL) provides a principled framework for sequential decision-making in environments where interactions evolve continuously over time. Despite its empirical success, the theoretical understanding of CTRL remains limited, especially in settings with general function approximation. In this work, we propose a model-based CTRL algorithm that achieves both sample and computational efficiency. Our approach leverages optimism-based confidence sets to establish the first sample complexity guarantee for CTRL with general function approximation, showing that a near-optimal policy can be learned with a suboptimality gap of $\tilde{O}((d_{R} + d_{F}) N^{-1/2})$ using $N$ measurements, where $d_{R}$ and $d_{F}$ denote the distributional Eluder dimensions of the reward and dynamic functions, respectively, capturing…

Code & Models

Repositories

MLIUB/PURE
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Muscle activation and electromyography studies · Elevator Systems and Control

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Dropout · Multi-Head Attention · Dense Connections · Layer Normalization · Diffusion · Softmax