Implicitly Aligning Humans and Autonomous Agents through Shared Task Abstractions
Stéphane Aroca-Ouellette, Miguel Aroca-Ouellette, Katharina von der Wense, Alessandro Roncone

TL;DR
This paper introduces HA$^2$, a hierarchical reinforcement learning framework that enhances zero-shot coordination between autonomous agents and humans by mimicking human-like shared task abstractions, leading to improved adaptability and performance.
Contribution
The paper proposes HA$^2$, a novel hierarchical RL approach that models shared task abstractions to improve autonomous agents' zero-shot coordination with humans and unseen agents.
Findings
HA$^2$ outperforms existing methods in the Overcooked environment
HA$^2$ shows improved resilience to environmental shifts
HA$^2$ achieves statistically significant gains in coordination metrics
Abstract
In collaborative tasks, autonomous agents fall short of humans in their capability to quickly adapt to new and unfamiliar teammates. We posit that a limiting factor for zero-shot coordination is the lack of shared task abstractions, a mechanism humans rely on to implicitly align with teammates. To address this gap, we introduce HA$^2$: Hierarchical Ad Hoc Agents, a framework leveraging hierarchical reinforcement learning to mimic the structured approach humans use in collaboration. We evaluate HA$^2$ in the Overcooked environment, demonstrating statistically significant improvement over existing baselines when paired with both unseen agents and humans, providing better resilience to environmental shifts, and outperforming all state-of-the-art methods.
Taxonomy
Topics: Reinforcement Learning in Robotics · Social Robot Interaction and HRI · Multimodal Machine Learning Applications
Methods: High-Order Consensuses · ALIGN
