Dynamic Avatar-Scene Rendering from Human-centric Context
Wenqing Wang, Haosen Yang, Josef Kittler, Xiatian Zhu

TL;DR
This paper introduces a novel Separate-then-Map strategy for dynamic avatar-scene rendering from monocular videos, improving the coherence and accuracy of human-scene interaction reconstructions.
Contribution
It proposes a dedicated information mapping mechanism that unifies separately modeled components, enhancing visual coherence and computational efficiency in 4D neural rendering.
Findings
Outperforms state-of-the-art methods in visual quality
Achieves higher rendering accuracy at human-scene boundaries
Demonstrates effectiveness on monocular video datasets
Abstract
Reconstructing dynamic humans interacting with real-world environments from monocular videos is an important and challenging task. Despite considerable progress in 4D neural rendering, existing approaches either model dynamic scenes holistically or model scenes and backgrounds separately in order to introduce parametric human priors. However, these approaches either neglect the distinct motion characteristics of the various components of a scene, especially the human, leading to incomplete reconstructions, or ignore the information exchange between the separately modeled components, resulting in spatial inconsistencies and visual artifacts at human-scene boundaries. To address this, we propose the Separate-then-Map (StM) strategy, which introduces a dedicated information mapping mechanism to bridge separately defined and optimized models. Our method employs a shared transformation function for each…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · 3D Shape Modeling and Analysis
