DASH: Modularized Human Manipulation Simulation with Vision and Language for Embodied AI

Yifeng Jiang; Michelle Guo; Jiangshan Li; Ioannis Exarchos; Jiajun Wu; C. Karen Liu

arXiv:2108.12536·cs.GR·September 2, 2021

TL;DR

DASH is a modular virtual human that follows natural language commands to perform grasp-and-stack tasks in physically simulated, cluttered environments, relying on its own vision, proprioception, and touch rather than on human motion data.

Contribution

The paper introduces DASH, a modular virtual human system driven by vision and language that performs manipulation tasks without human motion data. Its modular structure supports analysis and extension.

Findings

01 High success rate in task performance

02 Modular design enables flexibility and extensibility

03 Performs diverse and fluid manipulation motions

Abstract

Creating virtual humans with embodied, human-like perceptual and actuation constraints has the promise to provide an integrated simulation platform for many scientific and engineering applications. We present Dynamic and Autonomous Simulated Human (DASH), an embodied virtual human that, given natural language commands, performs grasp-and-stack tasks in a physically-simulated cluttered environment solely using its own visual perception, proprioception, and touch, without requiring human motion data. By factoring the DASH system into a vision module, a language module, and manipulation modules of two skill categories, we can mix and match analytical and machine learning techniques for different modules so that DASH is able to not only perform randomly arranged tasks with a high success rate, but also do so under anthropomorphic constraints and with fluid and diverse motions. The modular…
