CHDP: Cooperative Hybrid Diffusion Policies for Reinforcement Learning in Parameterized Action Space

Bingyi Liu; Jinbo He; Haiyong Shi; Enshu Wang; Weizhen Han; Jingxiang Hao; Peixi Wang; Zhuangzhuang Zhang

arXiv:2601.05675 · cs.AI · January 12, 2026

TL;DR

This paper introduces CHDP, a cooperative diffusion policy framework for hybrid discrete-continuous action spaces in reinforcement learning, improving expressiveness and scalability in complex domains.

Contribution

The paper proposes a novel cooperative diffusion policy approach with sequential updates and a low-dimensional codebook for hybrid action spaces, enhancing performance and scalability.

Findings

01. Outperforms state-of-the-art methods by up to 19.3% in success rate

02. Effectively models complex hybrid action distributions

03. Improves scalability with high-dimensional discrete action spaces

Abstract

Hybrid action space, which combines discrete choices and continuous parameters, is prevalent in domains such as robot control and game AI. However, efficiently modeling and optimizing hybrid discrete-continuous action space remains a fundamental challenge, mainly due to limited policy expressiveness and poor scalability in high-dimensional settings. To address this challenge, we view the hybrid action space problem as a fully cooperative game and propose a Cooperative Hybrid Diffusion Policies (CHDP) framework to solve it. CHDP employs two cooperative agents that leverage a discrete and a continuous diffusion policy, respectively. The continuous policy is conditioned on the discrete action's representation, explicitly modeling the dependency between them. This cooperative design allows the diffusion policies to leverage their expressiveness to capture complex distributions in…
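
The conditioning described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the two policy functions are simple random stand-ins rather than diffusion models, and all names and sizes (`NUM_DISCRETE`, `CODE_DIM`, `PARAM_DIM`, `make_codebook`, `sample_hybrid_action`) are hypothetical. It only shows the data flow in which a discrete action is chosen first, looked up in a low-dimensional codebook, and the resulting vector is passed to the continuous policy.

```python
import random

# Hypothetical sizes, chosen only for illustration.
NUM_DISCRETE = 4   # number of discrete action choices
CODE_DIM = 2       # width of each codebook entry
PARAM_DIM = 3      # number of continuous parameters per action


def make_codebook(rng):
    """One low-dimensional vector per discrete action.

    Random here; a real system would presumably learn these entries.
    """
    return [[rng.uniform(-1.0, 1.0) for _ in range(CODE_DIM)]
            for _ in range(NUM_DISCRETE)]


def discrete_policy(state, rng):
    """Stand-in for the discrete agent: returns an action index."""
    return rng.randrange(NUM_DISCRETE)


def continuous_policy(state, code, rng):
    """Stand-in for the continuous agent.

    Its output depends on both the state and the code of the chosen
    discrete action, then is clamped to [-1, 1].
    """
    center = sum(state) + sum(code)
    return [max(-1.0, min(1.0, center + rng.gauss(0.0, 0.1)))
            for _ in range(PARAM_DIM)]


def sample_hybrid_action(state, codebook, rng):
    """Sample (discrete index, continuous parameters) in that order."""
    k = discrete_policy(state, rng)
    code = codebook[k]                      # representation of the discrete action
    x = continuous_policy(state, code, rng)  # conditioned on that representation
    return k, x


if __name__ == "__main__":
    rng = random.Random(0)
    codebook = make_codebook(rng)
    k, x = sample_hybrid_action([0.1, -0.2], codebook, rng)
    print("discrete action:", k)
    print("continuous parameters:", x)
```

Passing the codebook vector instead of the raw index keeps the continuous policy's input size fixed at `CODE_DIM` regardless of how many discrete actions exist, which is one plausible reading of the scalability claim.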

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Artificial Intelligence in Games