SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment
Michal Nazarczuk, Tony Ng, Krystian Mikolajczyk

TL;DR
SAMPLE-HD is a new simulation environment for learning interactive reasoning in manipulation tasks. It combines visual scenes, language instructions, and ground-truth paths that serve as training data.
Contribution
It introduces an environment that combines visual, linguistic, and behavioral data for manipulation learning, whereas existing simulation environments focus on scene understanding, question answering, space exploration, or visual navigation.
Findings
Enables generation of diverse household scenes.
Procedurally generates language instructions for manipulation.
Provides ground truth paths for training models.
Abstract
Humans exhibit incredibly high levels of multi-modal understanding - combining visual cues with read or heard knowledge comes easily to us and allows for very accurate interaction with the surrounding environment. Various simulation environments focus on providing data for tasks related to scene understanding, question answering, space exploration, and visual navigation. In this work, we provide a solution that encompasses both the visual and behavioural aspects of simulation in a new environment for learning interactive reasoning in a manipulation setup. The SAMPLE-HD environment makes it possible to generate various scenes composed of small household objects, to procedurally generate language instructions for manipulation, and to generate ground truth paths serving as training data.
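To make the three generated components concrete, the following is a minimal sketch of how a scene, a templated instruction, and a ground truth path might fit together. It is not SAMPLE-HD's actual API: the object vocabulary, the templates, and every function name here (`generate_scene`, `generate_instruction`, `straight_line_path`) are assumptions made for illustration, and the straight-line interpolation is only a stand-in for a real motion planner.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    colour: str
    position: tuple  # (x, y) on the table plane, in metres


# Hypothetical vocabulary and templates, not taken from SAMPLE-HD.
NAMES = ["mug", "bowl", "spoon", "bottle"]
COLOURS = ["red", "blue", "green", "yellow"]
TEMPLATES = [
    "Pick up the {colour} {name}.",
    "Move the {colour} {name} next to the {ref_colour} {ref_name}.",
]


def generate_scene(rng, n_objects=3):
    """Sample objects with unique (colour, name) pairs so references are unambiguous."""
    pairs = [(c, n) for c in COLOURS for n in NAMES]
    chosen = rng.sample(pairs, n_objects)
    return [
        SceneObject(n, c, (round(rng.uniform(-0.5, 0.5), 2),
                           round(rng.uniform(-0.5, 0.5), 2)))
        for c, n in chosen
    ]


def generate_instruction(rng, scene):
    """Fill a random template with a target object and a reference object."""
    target, ref = rng.sample(scene, 2)
    template = rng.choice(TEMPLATES)
    text = template.format(colour=target.colour, name=target.name,
                           ref_colour=ref.colour, ref_name=ref.name)
    return text, target


def straight_line_path(start, goal, steps=5):
    """Stand-in for a planner: evenly spaced waypoints from start to goal."""
    return [
        tuple(s + (g - s) * i / (steps - 1) for s, g in zip(start, goal))
        for i in range(steps)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible samples
    scene = generate_scene(rng)
    text, target = generate_instruction(rng, scene)
    path = straight_line_path((0.0, 0.0), target.position)
    print(text)
    print(path)
```

Each sample produced this way pairs an instruction with the object it refers to and a sequence of waypoints, which is the general shape of supervision the abstract describes. A real environment would render the scene and compute collision-free paths for a robot arm.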
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Human Pose and Action Recognition
