TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size
Stefan Lionar, Gim Hee Lee

TL;DR
TeamHOI introduces a scalable, unified policy framework enabling cooperative human-object interactions with any team size, leveraging Transformer-based coordination and motion priors for realistic, diverse behaviors.
Contribution
The paper presents a decentralized policy architecture that uses a Transformer with teammate tokens to handle variable team sizes, together with a masked Adversarial Motion Prior (AMP) that keeps motions realistic despite the scarcity of cooperative HOI data.
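To make the variable-team-size idea concrete, here is a minimal sketch of attention from one agent over an arbitrary number of teammate tokens. It is not the paper's network: the function name `attend_to_teammates`, the token layout, and the use of raw tokens as both keys and values (no learned projections, a single head) are assumptions made purely for illustration.

```python
import math


def attend_to_teammates(ego_query, teammate_tokens):
    """Scaled dot-product attention from one ego query over teammate tokens.

    Hypothetical helper: tokens serve as both keys and values, which is a
    simplification of a real Transformer layer with learned projections.
    Accepts any number of teammate tokens, including zero.
    """
    d = len(ego_query)
    if not teammate_tokens:
        # No teammates: return a zero summary vector of the same size.
        return [0.0] * d

    scale = 1.0 / math.sqrt(d)
    scores = [
        scale * sum(q * k for q, k in zip(ego_query, token))
        for token in teammate_tokens
    ]

    # Numerically stable softmax over the teammate axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted average of teammate tokens; output size is independent of team size.
    return [
        sum(w * token[i] for w, token in zip(weights, teammate_tokens))
        for i in range(d)
    ]


ego = [1.0, 0.0]
summary_one = attend_to_teammates(ego, [[1.0, 0.0]])
summary_three = attend_to_teammates(ego, [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
summary_shuffled = attend_to_teammates(ego, [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]])
```

The output has the same dimension whether the agent has one teammate or several, and reordering the teammates leaves the result unchanged up to floating-point rounding. Those two properties are what allow one set of weights to serve any team size.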
Findings
Achieves high success rates in cooperative carrying tasks with 2-8 agents.
Demonstrates coherent cooperation across diverse team sizes and object geometries.
Introduces a formation reward for stable team carrying behaviors.
Abstract
Physics-based humanoid control has achieved remarkable progress in enabling realistic and high-performing single-agent behaviors, yet extending these capabilities to cooperative human-object interaction (HOI) remains challenging. We present TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs across any number of cooperating agents. Each agent operates using local observations while attending to other teammates through a Transformer-based policy network with teammate tokens, allowing scalable coordination across variable team sizes. To enforce motion realism while addressing the scarcity of cooperative HOI data, we further introduce a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training. The masked regions are then guided through task rewards to produce…
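The masked AMP idea described in the abstract can be illustrated with a small sketch that hides selected body parts from the features a motion discriminator would see. Everything specific here is an assumption: the body names, the choice of hands as the object-interacting parts, and zeroing features (rather than dropping them) are hypothetical and not taken from the paper.

```python
def mask_body_parts(features, body_names, masked_names):
    """Zero out the feature vectors of masked body parts.

    Hypothetical helper: `features` is a list of per-body feature lists in the
    same order as `body_names`. Masked bodies contribute no signal to the
    discriminator input, so their motion would be shaped by task rewards only.
    """
    return [
        [0.0] * len(feat) if name in masked_names else list(feat)
        for name, feat in zip(body_names, features)
    ]


# Assumed body layout and features, for illustration only.
body_names = ["pelvis", "torso", "left_hand", "right_hand"]
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
masked = mask_body_parts(features, body_names, {"left_hand", "right_hand"})
```

With this kind of masking, single-human reference motions can still constrain the unmasked bodies (here the pelvis and torso), while the masked parts are left free to adapt to the carried object.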
Taxonomy
Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Human Motion and Animation
