Replay: Multi-modal Multi-view Acted Videos for Casual Holography
Roman Shapovalov, Yanir Kleiman, Ignacio Rocco, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova

TL;DR
Replay is a comprehensive multi-modal, multi-view dataset of human interactions designed to advance research in novel-view synthesis, 3D reconstruction, and acoustic modeling, with extensive annotations and benchmark evaluations.
Contribution
The paper introduces Replay, a large-scale, high-quality multi-view, multi-modal dataset with benchmarks for novel-view synthesis and related tasks.
Findings
Baseline methods achieve moderate performance on the new benchmark.
Replay dataset enables diverse applications like 3D reconstruction and acoustic synthesis.
High-quality annotations facilitate training advanced generative models.
Abstract
We introduce Replay, a collection of multi-view, multi-modal videos of humans interacting socially. Each scene is filmed in high production quality, from different viewpoints with several static cameras, as well as wearable action cameras, and recorded with a large array of microphones at different positions in the room. Overall, the dataset contains over 4000 minutes of footage and over 7 million timestamped high-resolution frames annotated with camera poses and partially with foreground masks. The Replay dataset has many potential applications, such as novel-view synthesis, 3D reconstruction, novel-view acoustic synthesis, human body and face analysis, and training generative models. We provide a benchmark for training and evaluating novel-view synthesis, with two scenarios of different difficulty. Finally, we evaluate several baseline state-of-the-art methods on the new benchmark.
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Face recognition and analysis
