PIQL: Projective Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning

Xinchen Han; Hossam Afifi; Michel Marot

arXiv:2501.08907 · cs.LG · February 3, 2026

TL;DR

PIQL augments implicit Q-learning with a support constraint and replaces its fixed expectile hyperparameter with a projection-based parameter, reporting state-of-the-art results on the D4RL and NeoRL2 benchmarks.

Contribution

The paper proposes Projective IQL (PIQL), an offline RL method that extends IQL with a support constraint for policy improvement and a multi-step formulation for policy evaluation, aiming at better adaptability and performance.

Findings

1. Achieves state-of-the-art results on D4RL and NeoRL2 benchmarks.

2. Demonstrates robust performance across diverse offline RL domains.

3. Guarantees monotonic policy improvement with theoretical support.

Abstract

Offline Reinforcement Learning (RL) faces a fundamental challenge of extrapolation errors caused by out-of-distribution (OOD) actions. Implicit Q-Learning (IQL) employs expectile regression to achieve in-sample learning. Nevertheless, IQL relies on a fixed expectile hyperparameter and a density-based policy improvement method, both of which impede its adaptability and performance. In this paper, we propose Projective IQL (PIQL), a projective variant of IQL enhanced with a support constraint. In the policy evaluation stage, PIQL substitutes the fixed expectile hyperparameter with a projection-based parameter and extends the one-step value estimation to a multi-step formulation. In the policy improvement stage, PIQL adopts a support constraint instead of a density constraint, ensuring closer alignment with the policy evaluation. Theoretically, we demonstrate that PIQL maintains the…
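For readers unfamiliar with the expectile regression that the abstract attributes to IQL, the following is a minimal sketch of the standard asymmetric squared loss with a fixed expectile hyperparameter. It illustrates only the IQL baseline; the projection-based parameter and multi-step formulation of PIQL are not specified in the abstract and are not reproduced here. The function names `expectile_loss` and `fit_expectile`, the learning rate, and the toy samples are assumptions made for illustration.

```python
import numpy as np


def expectile_loss(diff, tau):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2, averaged.

    In IQL, diff would be Q(s, a) - V(s); here it is any array of residuals.
    """
    diff = np.asarray(diff, dtype=float)
    # Positive residuals are weighted by tau, negative ones by 1 - tau.
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))


def fit_expectile(samples, tau, lr=0.1, steps=2000):
    """Find the scalar v minimizing the expectile loss of (samples - v).

    Plain gradient descent; lr and steps are illustrative choices.
    """
    samples = np.asarray(samples, dtype=float)
    v = float(samples.mean())
    for _ in range(steps):
        diff = samples - v
        weight = np.where(diff < 0, 1.0 - tau, tau)
        # d/dv of mean(weight * (samples - v)**2)
        grad = -2.0 * np.mean(weight * diff)
        v -= lr * grad
    return v


if __name__ == "__main__":
    toy_q_values = [0.0, 1.0, 2.0, 10.0]  # hypothetical Q-values for one state
    print("tau=0.5:", fit_expectile(toy_q_values, 0.5))
    print("tau=0.9:", fit_expectile(toy_q_values, 0.9))
```

With tau = 0.5 the loss is symmetric and the minimizer is the sample mean; as tau moves toward 1 the estimate shifts toward the largest in-sample value, which is how a fixed expectile approximates a maximum over actions seen in the dataset.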

Taxonomy

Topics: Elevator Systems and Control · Reinforcement Learning in Robotics

Methods: Q-Learning · Implicit Q-Learning