Intrinsic Temporal Regularization for High-resolution Human Video Synthesis
Lingbo Yang, Zhanning Gao, Peiran Ren, Siwei Ma, Wen Gao

TL;DR
This paper introduces an intrinsic temporal regularization method for high-resolution human video synthesis, improving temporal coherence and visual quality by directly regulating motion estimation during training.
Contribution
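The paper proposes an intrinsic confidence map, estimated by the frame generator, that modulates the temporal loss and thereby regulates motion estimation. This improves temporal consistency in human video generation, where accurate flow estimation is difficult.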
Findings
Generates 512x512 high-resolution human action videos with improved temporal coherence.
Outperforms several competitive baselines in experiments.
Enhances training stability and visual realism in video synthesis.
Abstract
Temporal consistency is crucial for extending image processing pipelines to the video domain, and it is often enforced with a flow-based warping error over adjacent frames. Yet for human video synthesis, such a scheme is less reliable due to the misalignment between source and target video as well as the difficulty of accurate flow estimation. In this paper, we propose an effective intrinsic temporal regularization scheme to mitigate these issues, where an intrinsic confidence map is estimated via the frame generator to regulate motion estimation via temporal loss modulation. This creates a shortcut for back-propagating temporal loss gradients directly to the front-end motion estimator, thus improving training stability and temporal coherence in output videos. We apply our intrinsic temporal regulation to a single-image generator, leading to a powerful "INTERnet" capable of generating…
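The idea of modulating a flow-based warping error with a confidence map can be illustrated with a small NumPy sketch. This is not the paper's formulation: the function names (`warp_nearest`, `modulated_temporal_loss`), the nearest-neighbour warp, the weight `lam`, and the log-confidence penalty that keeps the confidence from collapsing to zero are all assumptions made for illustration.

```python
import numpy as np

def warp_nearest(prev, flow):
    """Backward-warp prev (H, W, C) using flow (H, W, 2) holding (dx, dy) in pixels.

    Nearest-neighbour sampling with border clamping; a simplification chosen
    for brevity (assumption, not the paper's warping operator).
    """
    h, w = prev.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev[src_y, src_x]

def modulated_temporal_loss(curr, prev, flow, conf, lam=0.1, eps=1e-6):
    """Confidence-weighted warping error between adjacent frames.

    conf is an (H, W) map in (0, 1]. The -lam * log(conf) term is a
    hypothetical regularizer that penalizes low confidence, so the loss
    cannot be minimized by simply predicting conf = 0 everywhere.
    """
    warped = warp_nearest(prev, flow)
    err = np.abs(curr - warped).mean(axis=-1)      # per-pixel L1 error, (H, W)
    data_term = (conf * err).mean()                # error modulated by confidence
    reg_term = -lam * np.log(conf + eps).mean()    # discourages trivial zero confidence
    return data_term + reg_term
```

In this sketch, pixels with low confidence contribute less warping error but pay a penalty, so the confidence map is pushed down only where the flow-based alignment is unreliable. In a differentiable framework, the same loss would pass gradients to both the confidence map and the flow, which is the kind of shortcut to the motion estimator that the abstract describes.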
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis
