Pose2Room: Understanding 3D Scenes from Human Activities

Yinyu Nie; Angela Dai; Xiaoguang Han; Matthias Nießner

arXiv:2112.03030 · cs.RO · July 15, 2022

Open Access

TL;DR

This paper introduces P2R-Net, a probabilistic model that infers 3D object structures in environments solely from human trajectory data, capturing multiple plausible configurations without visual input.

Contribution

P2R-Net is the first model to learn multi-modal 3D object distributions from human motion data, enabling scene understanding without visual cues.

Findings

01. P2R-Net outperforms baselines on PROX and VirtualHome datasets.

02. It effectively models multi-modal object configurations from human trajectories.

03. The approach captures diverse plausible scene structures without visual information.

Abstract

With wearable IMU sensors, one can estimate human poses without requiring visual input [von2017sparse]. In this work, we pose the question: Can we reason about object structure in real-world environments solely from human trajectory information? Crucially, we observe that human motion and interactions tend to give strong information about the objects in a scene: for instance, a person sitting indicates the likely presence of a chair or sofa. To this end, we propose P2R-Net to learn a probabilistic 3D model of the objects in a scene, characterized by their class categories and oriented 3D bounding boxes, based on an input observed human trajectory in the environment. P2R-Net models the probability distribution over object classes together with a deep Gaussian mixture model over object boxes, enabling sampling of multiple, diverse, likely modes of object configurations…
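To make the sampling idea in the abstract concrete, the following is a minimal sketch of drawing one object hypothesis from a categorical class distribution and a diagonal Gaussian mixture over box parameters. It is not the authors' implementation: the class list, the seven-value box layout, the mixture weights, means, and standard deviations, and the function names `sample_categorical` and `sample_object` are all hypothetical toy choices. In P2R-Net these quantities would be predicted by a network from the observed trajectory rather than hard-coded.

```python
import random

# Hypothetical toy setup (assumption): two classes and a two-mode box mixture.
CLASSES = ["chair", "sofa"]
CLASS_PROBS = [0.7, 0.3]

# Assumed box layout: (cx, cy, cz, width, depth, height, yaw in radians).
MIX_WEIGHTS = [0.6, 0.4]
MIX_MEANS = [
    [0.0, 0.5, 0.45, 0.6, 0.6, 0.9, 0.0],
    [0.0, -0.5, 0.45, 0.6, 0.6, 0.9, 3.14],
]
MIX_STDS = [
    [0.05, 0.05, 0.02, 0.03, 0.03, 0.03, 0.1],
    [0.05, 0.05, 0.02, 0.03, 0.03, 0.03, 0.1],
]


def sample_categorical(probs, rng):
    """Return index i with probability probs[i] (probs should sum to 1)."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def sample_object(rng):
    """Draw one object hypothesis: a class label and a box from the mixture."""
    label = CLASSES[sample_categorical(CLASS_PROBS, rng)]
    # Pick a mixture component first, then sample each box parameter
    # independently from that component's Gaussian (diagonal covariance).
    mode = sample_categorical(MIX_WEIGHTS, rng)
    box = [rng.gauss(m, s) for m, s in zip(MIX_MEANS[mode], MIX_STDS[mode])]
    return {"label": label, "mode": mode, "box": box}


if __name__ == "__main__":
    rng = random.Random(0)
    # Repeated draws land in different modes, i.e. different plausible layouts.
    for _ in range(3):
        print(sample_object(rng))
```

Sampling the component index before the box values is what lets repeated draws produce distinct configurations (here, an object in front of or behind the person) instead of a single averaged box.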

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Video Surveillance and Tracking Methods