When Visible-to-Thermal Facial GAN Beats Conditional Diffusion
Catherine Ordun, Edward Raff, Sanjay Purushotham

TL;DR
This paper introduces VTF-GAN, a model that generates high-resolution thermal facial images from visible-spectrum images and outperforms a conditional diffusion model (VTF-Diff) and several GAN baselines in quality and realism.
Contribution
The paper presents VTF-GAN, a GAN architecture tailored for visible-to-thermal (VT) face translation, and compares it against several popular GAN baselines and the first conditional DDPM for VT face translation (VTF-Diff).
Findings
VTF-GAN produces high-quality, realistic thermal face images.
VTF-GAN outperforms the conditional diffusion model (VTF-Diff) and the GAN baselines in image quality.
The model effectively learns both spatial and frequency domain features.
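To make the idea of learning in both the spatial and frequency domains concrete, here is a minimal sketch of a loss that combines a pixel-wise term with a Fourier-amplitude term. It is not the authors' loss: the function names (`spatial_l1`, `fourier_l1`, `combined_loss`), the weights `w_spatial` and `w_freq`, and the choice of L1 distance on FFT amplitudes are all assumptions made for illustration.

```python
import numpy as np

def spatial_l1(pred, target):
    # Mean absolute pixel difference (spatial-domain term).
    return float(np.mean(np.abs(pred - target)))

def fourier_l1(pred, target):
    # Hypothetical frequency-domain term: compare amplitude spectra
    # from a 2D FFT taken over the last two axes (height, width).
    amp_pred = np.abs(np.fft.fft2(pred, norm="ortho"))
    amp_target = np.abs(np.fft.fft2(target, norm="ortho"))
    return float(np.mean(np.abs(amp_pred - amp_target)))

def combined_loss(pred, target, w_spatial=1.0, w_freq=1.0):
    # Weighted sum of both terms; the weights are illustrative assumptions.
    return w_spatial * spatial_l1(pred, target) + w_freq * fourier_l1(pred, target)

# Usage: two random single-channel "images" standing in for a generated
# and a ground-truth thermal face.
rng = np.random.default_rng(0)
generated = rng.random((64, 64))
ground_truth = rng.random((64, 64))
loss = combined_loss(generated, ground_truth)
```

The frequency term penalizes differences in texture and edge content that a pixel-wise loss can blur over, which is one common motivation for pairing the two.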
Abstract
Thermal facial imagery offers valuable insight into physiological states such as inflammation and stress by detecting emitted radiation in the infrared spectrum, which is unseen in the visible spectrum. Telemedicine applications could benefit from thermal imagery, but conventional computers rely on RGB cameras and lack thermal sensors. As a result, we propose the Visible-to-Thermal Facial GAN (VTF-GAN), which is specifically designed to generate high-resolution thermal faces by learning both the spatial and frequency domains of facial regions across spectra. We compare VTF-GAN against several popular GAN baselines and the first conditional Denoising Diffusion Probabilistic Model (DDPM) for VT face translation (VTF-Diff). Results show that VTF-GAN achieves high-quality, crisp, and perceptually realistic thermal faces using a combined set of patch, temperature, perceptual, and…
Taxonomy
Topics: Infrared Thermography in Medicine
Methods: Diffusion
