The One Where They Reconstructed 3D Humans and Environments in TV Shows
Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa

TL;DR
This paper introduces an automatic method that reconstructs 3D humans and environments from TV shows by exploiting the fact that the same sets and people recur across a season. The resulting 3D context supports applications such as re-identification, gaze estimation, and image editing.
Contribution
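The authors propose an automatic approach that operates on an entire season of a TV show: it builds a 3D model of the environment and recovers camera information, static 3D scene structure, and body scale, then uses this context to guide the recovery of 3D human pose and position.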
Findings
Successfully reconstructed 3D environments and human models from seven TV shows.
Enhanced 3D human pose and position recovery using environment context.
Demonstrated applications in re-identification, gaze estimation, and image editing.
Abstract
TV shows depict a wide variety of human behaviors and have been studied extensively for their potential to be a rich source of data for many applications. However, the majority of the existing work focuses on 2D recognition tasks. In this paper, we make the observation that there is a certain persistence in TV shows, i.e., repetition of the environments and the humans, which makes possible the 3D reconstruction of this content. Building on this insight, we propose an automatic approach that operates on an entire season of a TV show and aggregates information in 3D; we build a 3D model of the environment, compute camera information, static 3D scene structure and body scale information. Then, we demonstrate how this information acts as rich 3D context that can guide and improve the recovery of 3D human pose and position in these environments. Moreover, we show that reasoning about humans…
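One way to see why camera and body scale information can help localize a person in 3D is the standard pinhole camera relation between real height, image height, and depth. The sketch below is only an illustration of that relation, not the authors' pipeline; the function names `depth_from_height` and `backproject`, and all numeric values, are hypothetical.

```python
def depth_from_height(focal_px: float, height_m: float, height_px: float) -> float:
    """Pinhole-camera depth estimate Z = f * H / h.

    focal_px:  focal length in pixels (assumed known from camera calibration)
    height_m:  real-world height of the person in meters (assumed known body scale)
    height_px: apparent height of the person in the image, in pixels
    """
    if focal_px <= 0 or height_m <= 0 or height_px <= 0:
        raise ValueError("all inputs must be positive")
    return focal_px * height_m / height_px


def backproject(u: float, v: float, depth: float,
                focal_px: float, cx: float, cy: float) -> tuple:
    """Lift pixel (u, v) at a given depth to a 3D point in camera coordinates.

    (cx, cy) is the principal point in pixels (an assumed camera intrinsic).
    """
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)


if __name__ == "__main__":
    # Hypothetical numbers: a 1.7 m person spanning 340 px, focal length 1000 px.
    z = depth_from_height(focal_px=1000.0, height_m=1.7, height_px=340.0)
    point = backproject(u=960.0, v=540.0, depth=z,
                        focal_px=1000.0, cx=960.0, cy=540.0)
    print(z, point)
```

Without a known focal length and body scale, the same image height is consistent with a small person nearby or a tall person far away, which is the ambiguity that scene-level context can help resolve.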
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Gaze Tracking and Assistive Technology
