HALyPO: Heterogeneous-Agent Lyapunov Policy Optimization for Human-Robot Collaboration

Hao Zhang; Yaru Niu; Yikai Wang; Ding Zhao; H. Eric Tseng

arXiv:2603.03741 · cs.RO · March 5, 2026

TL;DR

HALyPO stabilizes multi-agent policy learning for human-robot collaboration by enforcing a Lyapunov decrease condition in policy-parameter space. It targets the rationality gap that arises from human-robot heterogeneity, with the aim of improving generalization and robustness.

Contribution

The paper proposes HALyPO, a Lyapunov policy optimization framework that establishes formal stability in policy-parameter space for heterogeneous multi-agent reinforcement learning in human-robot collaboration (HRC).

Findings

01. Improved generalization in human-robot collaboration tasks.

02. Enhanced robustness in open-ended interaction scenarios.

03. Validated effectiveness through simulations and humanoid-robot experiments.

Abstract

To improve generalization and resilience in human-robot collaboration (HRC), robots must handle the combinatorial diversity of human behaviors and contexts, motivating multi-agent reinforcement learning (MARL). However, inherent heterogeneity between robots and humans creates a rationality gap (RG) in the learning process: a variational mismatch between decentralized best-response dynamics and centralized cooperative ascent. The resulting learning problem is a general-sum differentiable game, so independent policy-gradient updates can oscillate or diverge without added structure. We propose heterogeneous-agent Lyapunov policy optimization (HALyPO), which establishes formal stability directly in the policy-parameter space by enforcing a per-step Lyapunov decrease condition on a parameter-space disagreement metric. Unlike Lyapunov-based safe RL, which targets state/trajectory constraints…
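The instability the abstract describes can be seen in a toy differentiable game. The sketch below is an illustration only, not the HALyPO algorithm: it uses the bilinear game f(x, y) = x * y (a zero-sum special case of a general-sum game), where independent gradient steps spiral away from the equilibrium at (0, 0). It then enforces a per-step decrease condition on a Lyapunov candidate by projecting onto a sublevel set. The candidate V = 0.5 * (x^2 + y^2), the `decay` parameter, and all function names are assumptions made for this example; the paper's disagreement metric is not given in the abstract and is not reproduced here.

```python
import math


def simultaneous_step(x, y, lr):
    """Independent gradient steps in the bilinear game f(x, y) = x * y.

    Player 1 descends f in x; player 2 descends -f in y.
    """
    return x - lr * y, y + lr * x


def lyapunov_value(x, y):
    # Hypothetical Lyapunov candidate: half the squared distance to (0, 0).
    return 0.5 * (x * x + y * y)


def lyapunov_filtered_step(x, y, lr, decay=0.01):
    """Take the same step, then enforce V_next <= (1 - decay) * V by projection."""
    target = (1.0 - decay) * lyapunov_value(x, y)
    nx, ny = simultaneous_step(x, y, lr)
    v_next = lyapunov_value(nx, ny)
    if v_next > target:
        # Project onto the sublevel set {V <= target}.
        # For this particular V the projection is a radial rescale.
        scale = math.sqrt(target / v_next)
        nx, ny = nx * scale, ny * scale
    return nx, ny


def final_distance(step_fn, steps=100, lr=0.1, start=(1.0, 0.0)):
    """Run step_fn repeatedly and return the distance to the equilibrium."""
    x, y = start
    for _ in range(steps):
        x, y = step_fn(x, y, lr)
    return math.hypot(x, y)


if __name__ == "__main__":
    print("independent updates:", final_distance(simultaneous_step))
    print("with decrease check:", final_distance(lyapunov_filtered_step))
```

With the independent updates, each step multiplies the squared distance to the equilibrium by 1 + lr^2, so the iterates drift outward. With the decrease check, the candidate V shrinks by at least the factor 1 - decay per step, so the iterates contract. Projection is only one way to enforce such a condition and is chosen here for brevity.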

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Social Robot Interaction and HRI