TL;DR
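TC-Light is a generative rendering method for realistic, temporally coherent world-to-world transfer (illumination and texture editing) in videos with complex dynamics and long durations. It targets two weaknesses of existing video relighting and conditioned world generation models: poor generalization beyond the training domain, and limited temporal consistency and computational efficiency.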
Contribution
TC-Light introduces a two-stage optimization for improved temporal coherence and realism. It uses an appearance embedding, optimized in the first stage to align global illumination, and a canonical video representation called UVT.
Findings
Enables physically plausible re-rendering of complex dynamic videos.
Achieves superior temporal coherence compared to existing methods.
Operates with low computational cost.
Abstract
Illumination and texture editing are critical dimensions for world-to-world transfer, which is valuable for applications including sim2real and real2real visual data scaling up for embodied AI. Existing techniques generatively re-render the input video to realize the transfer, such as video relighting models and conditioned world generation models. Nevertheless, these models are predominantly limited to the domain of training data (e.g., portrait) or fall into the bottleneck of temporal consistency and computation efficiency, especially when the input video involves complex dynamics and long durations. In this paper, we propose TC-Light, a novel generative renderer to overcome these problems. Starting from the video preliminarily relighted by an inflated video relighting model, it optimizes appearance embedding in the first stage to align global illumination. Then it optimizes the…
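As a rough illustration of what "aligning global illumination" across frames can mean, the sketch below fits one gain and one bias per frame so that every frame's mean and contrast match the clip-wide averages. This is a toy, closed-form stand-in and not the paper's method: the function name `align_global_illumination`, the per-frame `(gain, bias)` parameterization, and the sample pixel values are all assumptions made for illustration, whereas TC-Light optimizes a learned appearance embedding.

```python
from statistics import fmean, pstdev


def align_global_illumination(frames):
    """Fit a hypothetical per-frame (gain, bias) pair.

    Each frame is a flat list of pixel intensities. After applying
    gain * pixel + bias, every frame shares the clip-wide average mean
    and standard deviation, which removes global brightness flicker.
    """
    means = [fmean(f) for f in frames]
    stds = [pstdev(f) for f in frames]
    target_mean = fmean(means)  # clip-wide brightness target
    target_std = fmean(stds)    # clip-wide contrast target

    params, aligned = [], []
    for frame, m, s in zip(frames, means, stds):
        gain = target_std / s if s > 0 else 1.0  # flat frames keep unit gain
        bias = target_mean - gain * m
        params.append((gain, bias))
        aligned.append([gain * p + bias for p in frame])
    return params, aligned


# Three tiny "frames" whose global brightness flickers (assumed sample data).
frames = [
    [0.10, 0.20, 0.30, 0.40],
    [0.30, 0.50, 0.70, 0.90],
    [0.05, 0.10, 0.15, 0.20],
]
params, aligned = align_global_illumination(frames)
```

After the call, `aligned` holds frames with matching global statistics, and `params` holds the per-frame correction. A learned embedding can model richer, spatially varying changes than this two-parameter correction.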
