RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations
Jingxiao Chen, Xinyao Li, Jiahang Cao, Zhengbang Zhu, Wentao Dong, Minghuan Liu, Ying Wen, Yong Yu, Liqing Zhang, Weinan Zhang

TL;DR
RHINO is a hierarchical framework enabling humanoid robots to learn real-time, reactive interactions from demonstrations, allowing immediate responses to human signals across multiple modalities for versatile assistance.
Contribution
The paper introduces RHINO, a unified, real-time humanoid-human-object interaction framework that learns from demonstrations to enable immediate, multi-modal reactive behaviors.
Findings
Effective real-time reactions demonstrated on a humanoid robot
Flexible handling of diverse human signals and instructions
Enhanced safety and responsiveness in various scenarios
Abstract
Humanoid robots have shown success in locomotion and manipulation. Despite these basic abilities, humanoids are still required to quickly understand human instructions and react based on human interaction signals to become valuable assistants in human daily life. Unfortunately, most existing works only focus on multi-stage interactions, treating each task separately, and neglecting real-time feedback. In this work, we aim to empower humanoid robots with real-time reaction abilities to achieve various tasks, allowing humans to interrupt robots at any time, and making robots respond to humans immediately. To support such abilities, we propose a general humanoid-human-object interaction framework, named RHINO, i.e., Real-time Humanoid-human Interaction and Object manipulation. RHINO provides a unified view of reactive motion, instruction-based manipulation, and safety concerns, over…
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications
