MIME: Human-Aware 3D Scene Generation
Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black

TL;DR
MIME is a generative model that creates realistic 3D indoor scenes based on human movement data, improving diversity and plausibility of generated environments by leveraging human motion cues.
Contribution
This work introduces MIME, a novel transformer-based model that generates furniture layouts from human movement data. This reverses the direction of prior work, which generates human poses and motions for a given 3D scene.
Findings
MIME produces more diverse scenes than previous methods.
Scenes generated by MIME are more plausible and consistent with human movement.
The dataset was extended with 3D human annotations for training.
Abstract
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a "scanner" of the 3D world. Intuitively, human movement indicates the free space in a room, and human contact indicates surfaces or objects that support activities such as sitting, lying, or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), a generative model of indoor scenes that produces furniture layouts consistent with the human movement. MIME uses an auto-regressive…
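The intuition in the abstract, that movement marks free space and contact marks supporting surfaces, can be illustrated with a small sketch. This is not MIME's implementation: the function name `floor_masks`, the grid resolution, the room size, and the sample points are all hypothetical, and the example only rasterizes 2D floor-plane positions into two boolean grids.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position on the floor plane, in metres
Grid = List[List[bool]]      # indexed as grid[row][col]


def floor_masks(
    walk_points: Sequence[Point],
    contact_points: Sequence[Point],
    room_size: Tuple[float, float] = (4.0, 4.0),  # hypothetical room, metres
    cell: float = 0.5,                            # hypothetical cell size, metres
) -> Tuple[Grid, Grid]:
    """Rasterize motion cues into a free-space grid and a contact grid."""
    cols = int(round(room_size[0] / cell))
    rows = int(round(room_size[1] / cell))
    free: Grid = [[False] * cols for _ in range(rows)]
    contact: Grid = [[False] * cols for _ in range(rows)]

    def mark(grid: Grid, points: Sequence[Point]) -> None:
        # Set the cell under each point; points outside the room are ignored.
        for x, y in points:
            c, r = int(x // cell), int(y // cell)
            if 0 <= r < rows and 0 <= c < cols:
                grid[r][c] = True

    mark(free, walk_points)        # places the person walked: no furniture here
    mark(contact, contact_points)  # places the person sat or touched: likely furniture
    return free, contact


# Made-up example: a short walk along one wall, then sitting in the far corner.
walk = [(0.2, 0.2), (0.7, 0.2), (1.2, 0.3)]
sit = [(3.6, 3.6)]
free, contact = floor_masks(walk, sit)
```

In this toy setup a layout generator would be expected to keep furniture out of the cells set in `free` and to place a supporting object, such as a chair or bed, near the cells set in `contact`.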
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · 3D Shape Modeling and Analysis
