4DEquine: Disentangling Motion and Appearance for 4D Equine Reconstruction from Monocular Video
Jin Lyu, Liang An, Pujin Cheng, Yebin Liu, Xiaoying Tang

TL;DR
4DEquine reconstructs 4D equine models from monocular video by disentangling motion from appearance. It is trained with the help of synthetic datasets and reports state-of-the-art results on real-world datasets.
Contribution
The paper presents a new framework that separates motion and appearance reconstruction, along with synthetic datasets, enabling efficient and high-quality 4D equine reconstruction from monocular videos.
Findings
State-of-the-art performance on real-world datasets.
Effective disentanglement of motion and appearance.
High-fidelity 3D avatar reconstruction from minimal input.
Abstract
4D reconstruction of the equine family (e.g., horses) from monocular video is important for animal welfare. Previous mainstream 4D animal reconstruction methods require joint optimization of motion and appearance over a whole video, which is time-consuming and sensitive to incomplete observations. In this work, we propose a novel framework called 4DEquine that disentangles the 4D reconstruction problem into two sub-problems: dynamic motion reconstruction and static appearance reconstruction. For motion, we introduce a simple yet effective spatio-temporal transformer with a post-optimization stage to regress smooth and pixel-aligned pose and shape sequences from video. For appearance, we design a novel feed-forward network that reconstructs a high-fidelity, animatable 3D Gaussian avatar from as few as a single image. To assist training, we create a large-scale synthetic motion dataset,…
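The abstract mentions a post-optimization stage that yields smooth, pixel-aligned pose sequences but does not specify its objective. As a rough illustration only, the toy sketch below smooths a one-dimensional per-frame pose parameter by gradient descent on a data term plus a temporal smoothness term. The objective, the names `smooth_sequence` and `roughness`, and the values of `lam`, `steps`, and `lr` are all assumptions made for this example and are not taken from the paper.

```python
from typing import List


def roughness(seq: List[float]) -> float:
    """Sum of squared frame-to-frame differences (lower means smoother)."""
    return sum((b - a) ** 2 for a, b in zip(seq, seq[1:]))


def smooth_sequence(initial: List[float], lam: float = 2.0,
                    steps: int = 200, lr: float = 0.05) -> List[float]:
    """Minimize sum_t (y_t - x_t)^2 + lam * sum_t (y_{t+1} - y_t)^2.

    `initial` (x) stands in for per-frame network predictions; the data term
    keeps the result close to them, the second term penalizes jitter.
    This is a hypothetical toy objective, not the paper's formulation.
    """
    y = list(initial)
    n = len(y)
    for _ in range(steps):
        # Gradient of the data term.
        grad = [2.0 * (y[t] - initial[t]) for t in range(n)]
        # Gradient of the temporal smoothness term.
        for t in range(n - 1):
            d = y[t + 1] - y[t]
            grad[t] -= 2.0 * lam * d
            grad[t + 1] += 2.0 * lam * d
        y = [y[t] - lr * grad[t] for t in range(n)]
    return y


# A jittery per-frame estimate of one pose parameter (made-up numbers).
noisy = [0.0, 1.0, 0.1, 0.9, 0.2, 1.1]
smoothed = smooth_sequence(noisy)
print(roughness(noisy), roughness(smoothed))
```

In this sketch, a larger `lam` trades fidelity to the per-frame estimates for smoother motion. A real pipeline would presumably replace the scalar data term with an image-space alignment loss over full pose and shape parameters.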
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
