Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning

Jongchan Park; Mingyu Park; Donghwan Lee

arXiv:2505.05701 · cs.AI · October 22, 2025

TL;DR

This paper introduces a shared Q-network pretraining approach that significantly improves data efficiency in offline reinforcement learning, enabling better performance with less data across multiple benchmarks.

Contribution

The paper proposes a simple, plug-and-play pretraining method for a shared Q-network that enhances data efficiency in offline RL, applicable across various algorithms and datasets.

Findings

1. Improves performance of offline RL methods on D4RL, Robomimic, and V-D4RL benchmarks.

2. Significantly boosts data efficiency, achieving superior results with only 10% of the dataset.

3. Enhances offline RL performance across diverse data qualities and distributions.

Abstract

Offline reinforcement learning (RL) aims to learn a policy from a static dataset without further interaction with the environment. Collecting sufficiently large datasets for offline RL is costly, since it requires a colossal number of interactions with the environment, and it becomes difficult when interaction with the environment is restricted. Hence, how an agent learns the best policy from a minimal static dataset is a crucial issue in offline RL, similar to the sample efficiency problem in online RL. In this paper, we propose a simple yet effective plug-and-play pretraining method that initializes the features of a Q-network to enhance data efficiency in offline RL. Specifically, we introduce a shared Q-network structure that outputs predictions of both the next state and the Q-value. We pretrain the shared Q-network through a supervised regression task that predicts a next state and trains…
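The abstract is cut off before the training details, so the following is only a minimal sketch of the general idea it describes: one shared feature layer feeding a next-state head and a Q-value head, with the shared layer pretrained by supervised next-state regression. Everything concrete here is an assumption made for illustration and is not taken from the paper: the names `SharedQNetwork`, `pretrain_step`, and `make_toy_transitions`, the single tanh layer, the linear heads, the mean-squared-error loss, plain gradient descent, and the toy linear dynamics.

```python
import numpy as np


class SharedQNetwork:
    """Shared trunk h(s, a) feeding two linear heads (illustrative only)."""

    def __init__(self, state_dim, action_dim, hidden_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + action_dim
        # Shared feature layer, used by both heads.
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden_dim))
        self.b = np.zeros(hidden_dim)
        # Head 1: next-state prediction. Head 2: scalar Q-value.
        self.W_s = rng.normal(0.0, 1.0 / np.sqrt(hidden_dim), (hidden_dim, state_dim))
        self.w_q = rng.normal(0.0, 1.0 / np.sqrt(hidden_dim), (hidden_dim, 1))

    def features(self, s, a):
        x = np.concatenate([s, a], axis=1)
        return np.tanh(x @ self.W + self.b)

    def predict_next_state(self, s, a):
        return self.features(s, a) @ self.W_s

    def q_value(self, s, a):
        return (self.features(s, a) @ self.w_q)[:, 0]

    def pretrain_step(self, s, a, s_next, lr=0.01):
        """One full-batch gradient step on the next-state squared error.

        Updates the shared layer and the next-state head; the Q head is
        left untouched. Returns the loss measured before the update.
        """
        x = np.concatenate([s, a], axis=1)
        h = np.tanh(x @ self.W + self.b)
        err = h @ self.W_s - s_next
        loss = float(np.mean(np.sum(err ** 2, axis=1)))
        g_out = 2.0 * err / len(s)                      # dLoss / dPrediction
        g_pre = (g_out @ self.W_s.T) * (1.0 - h ** 2)   # backprop through tanh
        self.W_s -= lr * (h.T @ g_out)
        self.W -= lr * (x.T @ g_pre)
        self.b -= lr * g_pre.sum(axis=0)
        return loss


def make_toy_transitions(n=256, state_dim=3, action_dim=2, seed=1):
    """Hypothetical static dataset with dynamics s' = s + 0.1 * a @ B."""
    rng = np.random.default_rng(seed)
    s = rng.normal(size=(n, state_dim))
    a = rng.uniform(-1.0, 1.0, size=(n, action_dim))
    B = rng.normal(size=(action_dim, state_dim))
    return s, a, s + 0.1 * (a @ B)


s, a, s_next = make_toy_transitions()
net = SharedQNetwork(state_dim=3, action_dim=2)
losses = [net.pretrain_step(s, a, s_next) for _ in range(300)]
print(f"next-state loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In this sketch an offline RL algorithm would then train `q_value` starting from the pretrained shared layer rather than from a random initialization, which is one way to read "plug-and-play". Whether the shared layer is frozen or fine-tuned afterwards is not stated in the visible part of the abstract.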

Videos

Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques