Representing Animatable Avatar via Factorized Neural Fields
Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin

TL;DR
This paper introduces a dual-branch neural network that factorizes pose-independent and pose-dependent components with frequency restrictions to improve 3D human avatar reconstruction from monocular videos, enhancing detail and consistency.
Contribution
It proposes a novel dual-branch network with frequency-based separation for better consistency and detail in animatable 3D human models from monocular videos.
Findings
Outperforms NeRF-based methods in detail preservation.
Ensures consistent large-scale body shapes across frames.
Achieves high-fidelity, photo-realistic 3D human images.
Abstract
For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores the observation that the per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent equivalent to facilitate frame consistency. Pose adaptive textures can be further improved by restricting frequency bands of these two components. In detail, pose-independent outputs are expected to be low-frequency, while high-frequency information is linked to pose-dependent factors. We achieve a coherent preservation of both coarse body contours across the entire input video and fine-grained texture features that are time variant with a dual-branch network with distinct frequency components. The first branch takes coordinates in canonical space as input,…
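To make the idea of "restricting frequency bands" concrete, here is a minimal sketch of a band-limited Fourier feature encoding of 3D coordinates. It is an illustration only: the abstract does not say how the paper limits frequencies, so the NeRF-style sin/cos encoding, the function name `fourier_features`, and the band boundaries `LOW_BANDS` and `HIGH_BANDS` are all assumptions introduced here, not the authors' implementation.

```python
import numpy as np


def fourier_features(x, band_start, band_stop):
    """Encode coordinates with sin/cos at frequencies 2^k * pi, k in [band_start, band_stop).

    Hypothetical helper: restricting k to a sub-range limits the
    frequency content that a downstream network can easily express.
    """
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(band_start, band_stop)) * np.pi  # shape (K,)
    angles = x[..., None] * freqs                              # shape (..., D, K)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (..., D, 2K)
    return feats.reshape(*x.shape[:-1], -1)                    # (..., D * 2K)


# Assumed band split (illustrative values, not taken from the paper).
LOW_BANDS = (0, 4)    # coarse, pose-independent content
HIGH_BANDS = (4, 10)  # fine, pose-dependent content

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(5, 3))  # five sample 3D points

low = fourier_features(points, *LOW_BANDS)    # features for a low-frequency branch
high = fourier_features(points, *HIGH_BANDS)  # features for a high-frequency branch
print(low.shape, high.shape)
```

Under these assumptions, a branch fed only `low` sees slowly varying features, which suits body contours that stay consistent across frames, while a branch fed `high` can represent rapidly varying detail such as wrinkles.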
Taxonomy
Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis
