TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation
Anh-Dzung Doan, Vu Minh Hieu Phan, Surabhi Gupta, Markus Wagner, Tat-Jun Chin, Ian Reid

TL;DR
This paper introduces TC-PDM, a diffusion-based approach for infrared-to-visible video translation that maintains semantic and temporal consistency, significantly outperforming existing methods in visual quality and object detection accuracy.
Contribution
The paper presents a novel temporally consistent patch diffusion model with semantic-guided denoising and a temporal blending module for improved infrared-to-visible video translation.
Findings
Outperforms state-of-the-art by 35.3% in FVD
Achieves 6.1% improvement in AP50 for object detection
Ensures semantic and temporal consistency in translated videos
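As a small worked example of how a figure such as "35.3% in FVD" can be read: FVD is a lower-is-better metric, so the sketch below assumes the number is a relative reduction against the baseline score. The summary does not say whether the percentages are relative or absolute, and the two scores used here are hypothetical values chosen only to illustrate the arithmetic, not results from the paper.

```python
def relative_reduction(baseline, ours):
    """Percent by which `ours` is lower than `baseline` (lower-is-better metric)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - ours) / baseline


# Hypothetical FVD scores, NOT taken from the paper.
baseline_fvd = 1000.0
ours_fvd = 647.0
print(f"{relative_reduction(baseline_fvd, ours_fvd):.1f}% lower FVD")
```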
Abstract
Infrared imaging offers resilience against changing lighting conditions by capturing object temperatures. Yet, in a few scenarios, its lack of visual detail compared to daytime visible images poses a significant challenge for human and machine interpretation. This paper proposes a novel diffusion method, dubbed Temporally Consistent Patch Diffusion Models (TC-PDM), for infrared-to-visible video translation. Our method, extending the Patch Diffusion Model, consists of two key components. First, we propose semantic-guided denoising, which leverages the strong representations of foundation models so that the method faithfully preserves the semantic structure of generated visible images. Second, we propose a novel temporal blending module to guide the denoising trajectory, ensuring temporal consistency between consecutive frames. Experiments show that TC-PDM outperforms…
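To give a rough feel for what patch-based processing involves, here is a minimal sketch of one common ingredient: running a predictor on overlapping patches and averaging the predictions wherever patches overlap, so that seams between patches are smoothed out. This is an assumption about how a patch-based model might assemble a full frame; the abstract does not describe the mechanism, and this is not the authors' implementation. The function name `blend_overlapping_patches` and the callback `predict_patch` are hypothetical, and the sketch uses plain nested lists in place of image tensors.

```python
def blend_overlapping_patches(height, width, patch, stride, predict_patch):
    """Average per-patch predictions into one full-size frame.

    predict_patch(top, left) is a hypothetical callback that returns a
    patch x patch nested list for the patch whose corner is (top, left).
    Requires patch <= height and patch <= width.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]

    tops = list(range(0, height - patch + 1, stride))
    lefts = list(range(0, width - patch + 1, stride))
    # Add a final row/column of patches so the frame border is covered.
    if tops[-1] != height - patch:
        tops.append(height - patch)
    if lefts[-1] != width - patch:
        lefts.append(width - patch)

    for top in tops:
        for left in lefts:
            pred = predict_patch(top, left)
            for i in range(patch):
                for j in range(patch):
                    total[top + i][left + j] += pred[i][j]
                    count[top + i][left + j] += 1

    # Every pixel is covered at least once, so count is never zero.
    return [[total[r][c] / count[r][c] for c in range(width)]
            for r in range(height)]


# Toy predictor: each patch predicts a constant equal to top + left.
def toy_predictor(top, left, patch=2):
    return [[float(top + left)] * patch for _ in range(patch)]


frame = blend_overlapping_patches(4, 4, 2, 1, toy_predictor)
print(frame[1][1])  # mean over the four patches covering pixel (1, 1)
```

In a diffusion setting the averaged quantity would typically be a per-step estimate (for example the predicted noise) rather than final pixel values, but the bookkeeping is the same.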
Taxonomy
Topics: Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis · Subtitles and Audiovisual Media
Methods: Diffusion
