SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment

Michal Nazarczuk; Tony Ng; Krystian Mikolajczyk

arXiv:2206.10312 · cs.RO · June 22, 2022

TL;DR

SAMPLE-HD is a simulation environment for learning interactive reasoning in manipulation tasks. It combines generated scenes of small household objects, procedurally generated language instructions, and ground-truth paths that serve as training data.

Contribution

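The paper introduces an environment that covers both the visual and the behavioural aspects of simulation for manipulation learning. Existing simulation environments mostly provide data for scene understanding, question answering, space exploration, or visual navigation.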

Findings

01 Enables generation of diverse household scenes.
02 Procedurally generates language instructions for manipulation.
03 Provides ground truth paths for training models.

Abstract

Humans exhibit incredibly high levels of multi-modal understanding: combining visual cues with read or heard knowledge comes easily to us and allows for very accurate interaction with the surrounding environment. Various simulation environments focus on providing data for tasks related to scene understanding, question answering, space exploration, and visual navigation. In this work, we provide a solution that encompasses both the visual and behavioural aspects of simulation in a new environment for learning interactive reasoning in a manipulation setup. The SAMPLE-HD environment makes it possible to generate various scenes composed of small household objects, to procedurally generate language instructions for manipulation, and to generate ground truth paths that serve as training data.
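
The procedural generation of scenes and language instructions described in the abstract could look roughly like the sketch below. This is only an illustration under stated assumptions: the `SceneObject` record, the object and colour vocabularies, the `TEMPLATES` strings, and the functions `sample_scene` and `generate_instruction` are all hypothetical and are not taken from SAMPLE-HD, whose actual interface is not described on this page.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical object record; the attribute set is an assumption."""
    name: str
    colour: str
    position: tuple  # (x, y) table coordinates, arbitrary units


# Hypothetical instruction templates, not taken from SAMPLE-HD.
TEMPLATES = (
    "Pick up the {colour} {name}.",
    "Move the {colour} {name} next to the {ref_colour} {ref_name}.",
)

# Hypothetical vocabularies of small household objects and colours.
NAMES = ("mug", "bowl", "spoon", "bottle")
COLOURS = ("red", "green", "blue")


def sample_scene(rng, n_objects=3):
    """Sample a toy scene whose objects have unique (colour, name) pairs."""
    pairs = [(c, n) for c in COLOURS for n in NAMES]
    chosen = rng.sample(pairs, n_objects)
    return [
        SceneObject(
            name=n,
            colour=c,
            position=(round(rng.uniform(0.0, 1.0), 2),
                      round(rng.uniform(0.0, 1.0), 2)),
        )
        for c, n in chosen
    ]


def generate_instruction(scene, rng):
    """Fill a randomly chosen template with two distinct scene objects."""
    target, reference = rng.sample(scene, 2)
    template = rng.choice(TEMPLATES)
    # Templates that do not mention the reference object ignore those fields.
    return template.format(
        colour=target.colour,
        name=target.name,
        ref_colour=reference.colour,
        ref_name=reference.name,
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    scene = sample_scene(rng)
    print(generate_instruction(scene, rng))
```

Keeping the scene objects unique per (colour, name) pair means every generated instruction refers unambiguously to one object, which is one simple way to pair a scene with a well-defined manipulation goal.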

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Human Pose and Action Recognition