TL;DR
This paper introduces LCVD, a novel diffusion model that enables high-fidelity, relightable portrait animation by separating intrinsic and extrinsic features for nuanced control over lighting, pose, and expression.
Contribution
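It separates intrinsic features (identity and appearance) from extrinsic features (pose and lighting) in dedicated subspaces of a pre-trained image-to-video diffusion model's feature space. This enables relightable portrait animation, which existing portrait animation methods cannot achieve because they do not separate these feature types.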
Findings
Outperforms state-of-the-art methods in lighting realism
Achieves higher image quality and video consistency
Sets a new benchmark in relightable portrait animation
Abstract
Relightable portrait animation aims to animate a static reference portrait to match the head movements and expressions of a driving video while adapting to user-specified or reference lighting conditions. Existing portrait animation methods fail to achieve relightable portraits because they do not separate and manipulate intrinsic (identity and appearance) and extrinsic (pose and lighting) features. In this paper, we present a Lighting Controllable Video Diffusion model (LCVD) for high-fidelity, relightable portrait animation. We address this limitation by distinguishing these feature types through dedicated subspaces within the feature space of a pre-trained image-to-video diffusion model. Specifically, we employ the 3D mesh, pose, and lighting-rendered shading hints of the portrait to represent the extrinsic attributes, while the reference represents the intrinsic attributes. In the…
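To make the idea of a "lighting-rendered shading hint" concrete, the following is a minimal sketch of rendering a grayscale shading map from surface normals under one directional light. It is an illustration only, not the paper's renderer: the Lambertian model, the single directional light, the `ambient` term, and the function names `lambertian_shading_hint` and `hemisphere_normals` are all assumptions introduced here, and a hemisphere stands in for the portrait's 3D mesh.

```python
import numpy as np


def hemisphere_normals(size=64):
    """Hypothetical stand-in for normals rasterized from a 3D face mesh.

    Returns an array of shape (size, size, 3) holding unit normals of a
    hemisphere facing the camera (+z).
    """
    ys, xs = np.mgrid[-1:1:size * 1j, -1:1:size * 1j]
    zs = np.sqrt(np.clip(1.0 - xs**2 - ys**2, 0.0, None))
    normals = np.stack([xs, ys, zs], axis=-1)
    # Normalize so that pixels outside the unit disk also carry unit vectors.
    return normals / np.linalg.norm(normals, axis=-1, keepdims=True)


def lambertian_shading_hint(normals, light_dir, ambient=0.1):
    """Render a shading map in [0, 1] for a single directional light.

    Assumption: simple Lambertian shading, max(0, n . l), plus an ambient
    floor. The paper's actual lighting model may differ.
    """
    light = np.asarray(light_dir, dtype=np.float64)
    light = light / np.linalg.norm(light)
    n_dot_l = np.clip(normals @ light, 0.0, None)  # shape (H, W)
    return np.clip(ambient + (1.0 - ambient) * n_dot_l, 0.0, 1.0)


normals = hemisphere_normals(64)
front = lambertian_shading_hint(normals, [0.0, 0.0, 1.0])  # light from the camera
side = lambertian_shading_hint(normals, [1.0, 0.0, 0.0])   # light from +x
```

Changing `light_dir` changes only the shading map while the geometry stays fixed, which is the sense in which such a hint can carry the lighting condition separately from the identity and appearance supplied by the reference portrait.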
Taxonomy
Methods: Diffusion · Adapter
