TL;DR
This paper introduces I2V-GAN, an unpaired infrared-to-visible video translation framework that generates spatially and temporally consistent visible-light videos. It is accompanied by a new dataset, IRVI, and is reported to outperform existing methods in quality and detail.
Contribution
The paper proposes I2V-GAN, an unpaired video translation model built on three types of constraints, and introduces the IRVI dataset for infrared-to-visible video translation.
Findings
I2V-GAN outperforms state-of-the-art (SOTA) methods in fluency and semantic detail.
The model effectively maintains spatial-temporal consistency.
The IRVI dataset provides a valuable resource for future research.
Abstract
Human vision is often adversely affected by complex environmental factors, especially in night vision scenarios. Infrared cameras are therefore often used to enhance visibility by detecting infrared radiation in the surrounding environment, but infrared videos are undesirable because they lack detailed semantic information. An effective video-to-video translation method from the infrared domain to its visible-light counterpart is thus strongly needed, one that overcomes the large intrinsic gap between the infrared and visible fields. To address this challenging problem, we propose an infrared-to-visible (I2V) video translation method, I2V-GAN, which generates fine-grained and spatial-temporally consistent visible-light videos from unpaired infrared videos. Technically, our model capitalizes on three types of constraints: 1) adversarial constraint to generate synthetic…
