QHyer: Q-conditioned Hybrid Attention-mamba Transformer for Offline Goal-conditioned RL

Xing Lei; Jincheng Wang; Xuetao Zhang; Donglin Wang

arXiv:2605.01862·cs.LG·May 11, 2026

TL;DR

QHyer is an offline goal-conditioned RL method that combines a state-conditioned Q-estimator with a hybrid attention-Mamba backbone to handle long-range dependencies and sparse rewards.

Contribution

The paper introduces a flow-parameterized Q-estimator to improve demonstration stitching and a gated hybrid attention-Mamba architecture that compresses history adaptively.
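To make the idea of a gated hybrid backbone concrete, here is a minimal pure-Python sketch. It is not the authors' architecture: the function names (`causal_attention`, `recurrent_scan`, `gated_hybrid`), the identity projections, the fixed `decay`, and the scalar per-step gate are all assumptions chosen for illustration. A real Mamba layer uses input-dependent (selective) state updates, which this fixed-decay recurrence only approximates in spirit.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def causal_attention(seq):
    # Single-head causal self-attention with identity Q/K/V projections
    # (an illustrative simplification, not a trained layer).
    d = len(seq[0])
    out = []
    for t, q in enumerate(seq):
        scores = [
            sum(a * b for a, b in zip(q, seq[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * seq[s][i] for s, w in enumerate(weights)) for i in range(d)]
        )
    return out


def recurrent_scan(seq, decay):
    # Diagonal linear recurrence h_t = decay * h_{t-1} + (1 - decay) * x_t.
    # Hypothetical stand-in for a state-space layer: history is compressed
    # into a fixed-size state instead of being attended to token by token.
    h = [0.0] * len(seq[0])
    out = []
    for x in seq:
        h = [decay * hi + (1.0 - decay) * xi for hi, xi in zip(h, x)]
        out.append(h)
    return out


def gated_hybrid(seq, gate_w, gate_b, decay=0.9):
    # Per-step gate g in (0, 1) computed from the current input;
    # g weights the attention branch, (1 - g) the recurrent branch.
    attn = causal_attention(seq)
    ssm = recurrent_scan(seq, decay)
    mixed = []
    for x, a, s in zip(seq, attn, ssm):
        g = sigmoid(sum(w * xi for w, xi in zip(gate_w, x)) + gate_b)
        mixed.append([g * ai + (1.0 - g) * si for ai, si in zip(a, s)])
    return mixed
```

With a strongly positive `gate_b` the output follows the attention branch, and with a strongly negative one it follows the compressed recurrent state. This is one simple way a model could shift between explicit long-range lookup and compact history summaries.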

Findings

01. Achieves state-of-the-art results on non-Markovian datasets.

02. Effectively handles long-range dependencies and sparse rewards.

03. Validates versatility across diverse offline GCRL scenarios.

Abstract

Offline goal-conditioned RL (GCRL) learns goal-reaching policies from static datasets, but real-world datasets are often partially observable and history-dependent, exhibiting a mix of Markovian and non-Markovian dynamics that violates standard RL assumptions. History-aware sequence models such as Decision Transformer (DT) are a natural fit for modeling long-term dependencies, yet pure attention is inefficient and brittle when it must handle local Markovian structure and long-range context simultaneously. Although recent hybrid architectures (e.g., LSDT) introduce local extractors to improve the modeling of local dependencies, their fixed-window extraction cannot adapt its effective memory to varying dependency lengths in temporally heterogeneous settings, and it often truncates long-range context rather than compressing it adaptively. Moreover, sequential offline GCRL faces a key bottleneck: under sparse…
