OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

Tairan He; Zhengyi Luo; Xialin He; Wenli Xiao; Chong Zhang; Weinan Zhang; Kris Kitani; Changliu Liu; Guanya Shi

arXiv:2406.08858·cs.RO·June 14, 2024

Open Access

TL;DR

OmniH2O is a versatile system enabling human-to-humanoid teleoperation and autonomous control of full-body humanoids using various interfaces and learning methods, demonstrated across multiple real-world tasks.

Contribution

The paper introduces OmniH2O, a novel learning-based framework for universal, dexterous humanoid teleoperation and autonomy, including a new dataset and an RL-based sim-to-real pipeline.

Findings

01 Successful real-world task execution, including sports and object manipulation

02 Effective learning from teleoperated demonstrations with sparse sensors

03 Versatile control via VR, verbal commands, and RGB cameras

Abstract

We present OmniH2O (Omni Human-to-Humanoid), a learning-based system for whole-body humanoid teleoperation and autonomy. Using kinematic pose as a universal control interface, OmniH2O enables various ways for a human to control a full-sized humanoid with dexterous hands, including using real-time teleoperation through VR headset, verbal instruction, and RGB camera. OmniH2O also enables full autonomy by learning from teleoperated demonstrations or integrating with frontier models such as GPT-4. OmniH2O demonstrates versatility and dexterity in various real-world whole-body tasks through teleoperation or autonomy, such as playing multiple sports, moving and manipulating objects, and interacting with humans. We develop an RL-based sim-to-real pipeline, which involves large-scale retargeting and augmentation of human motion datasets, learning a real-world deployable policy with sparse…
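The idea of a kinematic pose acting as a universal control interface can be illustrated with a minimal sketch: different input sources are converted into one shared pose target, and the downstream controller only ever sees that target. Everything named here is hypothetical and not taken from the paper: the `KinematicPose` type, the choice of head and two hand positions as the sparse target, the keypoint names, and the `track` stand-in for a learned policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class KinematicPose:
    """Hypothetical sparse pose target: positions of head and both hands (metres)."""
    head: Vec3
    left_hand: Vec3
    right_hand: Vec3


def pose_from_vr(headset: Vec3, left_ctrl: Vec3, right_ctrl: Vec3) -> KinematicPose:
    # Assumed mapping: headset and controller positions become the pose target.
    return KinematicPose(headset, left_ctrl, right_ctrl)


def pose_from_keypoints(keypoints: Dict[str, Vec3]) -> KinematicPose:
    # Assumed mapping: keypoints from some RGB pose estimator (names are made up).
    return KinematicPose(
        keypoints["head"], keypoints["left_wrist"], keypoints["right_wrist"]
    )


def track(pose: KinematicPose) -> Tuple[float, ...]:
    # Stand-in for a learned tracking policy: it only flattens the target
    # into a 9-D vector, which is what a real policy might take as input.
    return pose.head + pose.left_hand + pose.right_hand


if __name__ == "__main__":
    vr = pose_from_vr((0.0, 0.0, 1.6), (-0.3, 0.2, 1.2), (0.3, 0.2, 1.2))
    cam = pose_from_keypoints({
        "head": (0.0, 0.0, 1.6),
        "left_wrist": (-0.3, 0.2, 1.2),
        "right_wrist": (0.3, 0.2, 1.2),
    })
    # Both sources yield the same target, so the controller is source-agnostic.
    print(vr == cam, len(track(vr)))
```

The point of the design is that adding another input source (for example, a pose generated from a verbal instruction) only requires another adapter that returns the same pose type; the controller itself does not change.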

Taxonomy

Topics: Context-Aware Activity Recognition Systems · Virtual Reality Applications and Impacts · Social Robot Interaction and HRI

Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer