Volumetric performance capture from minimal camera viewpoints
Andrew Gilbert, Marco Volino, John Collomosse, Adrian Hilton

TL;DR
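This paper introduces a convolutional autoencoder that reconstructs high-fidelity volumetric human performance from only a small set of camera views. Its end-to-end reconstruction error is similar to that of a probabilistic visual hull computed from at least twice as many viewpoints, which opens up volumetric capture in settings where time or cost rules out a high camera count.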
Contribution
The paper presents an autoencoder-based approach that uses a deep prior, learned implicitly from view-ablated multi-view video of a wide range of subjects and actions, to produce accurate volumetric reconstructions from a small set of viewpoints.
Findings
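Achieves end-to-end reconstruction error similar to that of a probabilistic visual hull computed from double or more the number of viewpoints.
Opens up the possibility of high-end volumetric capture in on-set and prosumer scenarios with few witness cameras.
Could reduce the cost and setup time of performance capture by lowering the required camera count.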
Abstract
We present a convolutional autoencoder that enables high fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views. Our method yields similar end-to-end reconstruction error to that of a probabilistic visual hull computed using significantly more (double or more) viewpoints. We use a deep prior implicitly learned by the autoencoder trained over a dataset of view-ablated multi-view video footage of a wide range of subjects and actions. This opens up the possibility of high-end volumetric performance capture in on-set and prosumer scenarios where time or cost prohibit a high witness camera count.
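To make the idea of view ablation concrete, here is a minimal toy sketch, not the authors' pipeline. It assumes a training pair consists of a coarse hull built from a few views (input) and a finer hull built from all views (target). The cameras are orthographic and axis-aligned, the soft hull is taken as a product of back-projected silhouettes, and all function names are hypothetical. The paper's probabilistic visual hull would use calibrated cameras and real silhouette mattes.

```python
import numpy as np

def backproject(silhouette, axis, size):
    """Extrude a 2D silhouette along one grid axis into a size^3 volume."""
    return np.repeat(np.expand_dims(silhouette, axis), size, axis=axis)

def soft_visual_hull(volume, axes):
    """Toy hull: product of back-projected silhouettes (assumed combination rule)."""
    size = volume.shape[0]
    hull = np.ones_like(volume, dtype=float)
    for axis in axes:
        silhouette = volume.max(axis=axis)  # orthographic silhouette along this axis
        hull *= backproject(silhouette, axis, size)
    return hull

def make_training_pair(volume, all_axes=(0, 1, 2), kept_axes=(0,)):
    """View ablation: input hull from the kept views, target hull from all views."""
    coarse = soft_visual_hull(volume, kept_axes)
    target = soft_visual_hull(volume, all_axes)
    return coarse, target

# Synthetic "subject": a solid sphere on a 16^3 voxel grid.
size = 16
grid = np.indices((size, size, size)) - (size - 1) / 2.0
subject = (np.sqrt((grid ** 2).sum(axis=0)) < 5.0).astype(float)

# Two of the three views are kept for the input; all three define the target.
coarse, target = make_training_pair(subject, kept_axes=(0, 1))
print(coarse.sum(), target.sum())
```

In this toy setting the few-view hull always contains the all-view hull, so an autoencoder trained on such pairs would have to learn which of the extra occupied voxels to carve away.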
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
