TL;DR
KeypointNeRF introduces a relative 3D keypoint encoding that improves the generalization and fidelity of image-based volumetric human reconstruction from sparse views. It outperforms state-of-the-art methods on head reconstruction and is comparable to prior work on body reconstruction for unseen subjects.
Contribution
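The paper proposes a simple, effective spatial encoding that expresses 3D information relative to sparse 3D keypoints rather than in global coordinates. This targets two problems of existing encodings: overfitting to the distribution of the training data, and the difficulty of learning multi-view consistent reconstruction from sparse views.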
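To illustrate why a relative encoding can generalize better than a global one, here is a minimal NumPy sketch. It is not the paper's exact formulation: the function name `relative_keypoint_encoding`, the sinusoidal frequency scheme, and the toy keypoints are assumptions made for illustration. It only shows that encoding a query point by its offsets to keypoints is unchanged when the whole scene is translated, whereas raw global coordinates would change.

```python
import numpy as np

def relative_keypoint_encoding(query, keypoints, num_freqs=4):
    """Hypothetical helper: encode a 3D query point by its offsets to K keypoints.

    query:     (3,) query point in world coordinates
    keypoints: (K, 3) sparse 3D keypoints in the same coordinates
    Returns a flat vector of length K * 3 * 2 * num_freqs.
    """
    query = np.asarray(query, dtype=np.float64)
    keypoints = np.asarray(keypoints, dtype=np.float64)
    offsets = query[None, :] - keypoints              # (K, 3) relative positions
    freqs = 2.0 ** np.arange(num_freqs)               # (F,) assumed frequency bands
    angles = offsets[..., None] * freqs               # (K, 3, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (K, 3, 2F)
    return enc.reshape(-1)

# Toy data (made up for the example): three keypoints and one query point.
keypoints = np.array([[0.0, 0.0, 0.0], [0.1, 0.2, 0.0], [-0.1, 0.2, 0.0]])
query = np.array([0.05, 0.1, 0.3])
shift = np.array([2.0, -1.0, 0.5])                    # global translation of the scene

code_a = relative_keypoint_encoding(query, keypoints)
code_b = relative_keypoint_encoding(query + shift, keypoints + shift)

# The relative code is the same before and after the shift.
print(np.allclose(code_a, code_b))
```

Because the code depends only on offsets to the keypoints, a subject placed elsewhere in the capture volume yields the same input to the network, which is one intuition for robustness to cross-dataset domain gaps.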
Findings
Outperforms state-of-the-art head reconstruction methods
Achieves performance comparable to prior work that uses a parametric human body model when reconstructing bodies of unseen subjects
Robust to sparse viewpoints and cross-dataset domain gaps
Abstract
Image-based volumetric humans using pixel-aligned features promise generalization to unseen poses and identities. Prior work leverages global spatial encodings and multi-view geometric consistency to reduce spatial ambiguity. However, global encodings often suffer from overfitting to the distribution of the training data, and it is difficult to learn multi-view consistent reconstruction from sparse views. In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views. One of the key ideas is to encode relative spatial 3D information via sparse 3D keypoints. This approach is robust to the sparsity of viewpoints and cross-dataset domain gap. Our approach outperforms state-of-the-art methods for head reconstruction. On human body reconstruction for unseen subjects,…
