TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation

Anh-Dzung Doan; Vu Minh Hieu Phan; Surabhi Gupta; Markus Wagner; Tat-Jun Chin; Ian Reid

arXiv:2408.14227·cs.CV·August 27, 2024

TL;DR

This paper introduces TC-PDM, a diffusion-based approach for infrared-to-visible video translation that maintains semantic and temporal consistency, significantly outperforming existing methods in visual quality and object detection accuracy.

Contribution

The paper presents a novel temporally consistent patch diffusion model with semantic-guided denoising and a temporal blending module for improved infrared-to-visible video translation.

Findings

01. Outperforms state-of-the-art by 35.3% in FVD

02. Achieves 6.1% improvement in AP50 for object detection

03. Ensures semantic and temporal consistency in translated videos

Abstract

Infrared imaging offers resilience against changing lighting conditions by capturing object temperatures. Yet in a few scenarios, its lack of visual detail compared to daytime visible images poses a significant challenge for human and machine interpretation. This paper proposes a novel diffusion method, dubbed Temporally Consistent Patch Diffusion Models (TC-PDM), for infrared-to-visible video translation. Our method, which extends the Patch Diffusion Model, consists of two key components. First, we propose semantic-guided denoising, which leverages the strong representations of foundation models so that the method faithfully preserves the semantic structure of the generated visible images. Second, we propose a novel temporal blending module to guide the denoising trajectory, ensuring temporal consistency between consecutive frames. Experiments show that TC-PDM outperforms…
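
The abstract names two ideas that are easier to picture with a toy example: predictions made on overlapping patches have to be merged into one full-frame estimate, and the estimate for a frame can be pulled toward the previous frame to reduce flicker. The sketch below is only an illustration under assumptions and is not the authors' implementation. The function names `average_overlapping_patches` and `blend_with_previous`, the uniform averaging over overlaps, and the fixed blending `weight` are all hypothetical; the paper's actual temporal blending module and semantic guidance are not described on this page.

```python
def average_overlapping_patches(patch_preds, coords, height, width, patch_size):
    """Merge square per-patch predictions into one frame by averaging overlaps.

    patch_preds: list of patch_size x patch_size nested lists (hypothetical
                 per-patch denoiser outputs).
    coords:      list of (top, left) positions, one per patch.
    """
    acc = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for pred, (top, left) in zip(patch_preds, coords):
        for i in range(patch_size):
            for j in range(patch_size):
                acc[top + i][left + j] += pred[i][j]
                count[top + i][left + j] += 1
    # Pixels that no patch covers keep a zero estimate.
    return [
        [acc[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]


def blend_with_previous(current, previous, weight=0.5):
    """Assumed stand-in for temporal blending: a per-pixel convex combination."""
    return [
        [(1.0 - weight) * cur + weight * prev for cur, prev in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


if __name__ == "__main__":
    ones = [[1.0] * 3 for _ in range(3)]
    threes = [[3.0] * 3 for _ in range(3)]
    merged = average_overlapping_patches([ones, threes], [(0, 0), (1, 1)], 4, 4, 3)
    previous = [[0.0] * 4 for _ in range(4)]
    print(blend_with_previous(merged, previous, weight=0.25))
```

In a real video pipeline the previous frame would normally be motion-compensated before any blending; the plain convex combination here is kept only to make the idea concrete.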

Code & Models

Repositories

dzungdoan6/tc-pdm (PyTorch, official)

Taxonomy

Topics: Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis · Subtitles and Audiovisual Media

Methods: Diffusion