Inference-Time Scaling of Diffusion Models for Infrared Data Generation

Kai A. Horstmann; Maxim Clouser; Kia Khezeli

arXiv:2511.07362·cs.CV·November 11, 2025

Open Access

TL;DR

This paper presents an inference-time guidance method that uses a domain-adapted CLIP verifier to improve the quality of infrared images generated by diffusion models, addressing the scarcity of infrared training data.

Contribution

The paper introduces an inference-time scaling approach in which a domain-adapted CLIP-based verifier is used to enhance the quality of infrared images generated by diffusion models.

Findings

01. 10% reduction in FID scores on the KAIST dataset

02. Improved alignment of generated images with text prompts

03. Effective guidance in low-data infrared settings
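
For reference, the FID (Fréchet Inception Distance) cited in the first finding is conventionally defined as below. The notation is the standard one and is not taken from the paper: (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) denote the mean and covariance of Inception features for real and generated images, respectively. Lower values indicate that the generated distribution is closer to the real one.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```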

Abstract

Infrared imagery enables temperature-based scene understanding using passive sensors, particularly under conditions of low visibility where traditional RGB imaging fails. Yet, developing downstream vision models for infrared applications is hindered by the scarcity of high-quality annotated data, due to the specialized expertise required for infrared annotation. While synthetic infrared image generation has the potential to accelerate model development by providing large-scale, diverse training data, training foundation-level generative diffusion models in the infrared domain has remained elusive due to limited datasets. In light of such data constraints, we explore an inference-time scaling approach using a domain-adapted CLIP-based verifier for enhanced infrared image generation quality. We adapt FLUX.1-dev, a state-of-the-art text-to-image diffusion model, to the infrared domain by…
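
The verifier-based idea in the abstract can be illustrated with a best-of-N search, one common form of inference-time scaling: draw several candidates and keep the one the verifier scores highest. The sketch below is a toy under stated assumptions, not the authors' procedure, which the excerpt above does not describe in detail. The generator and verifier are hypothetical stand-ins (random vectors and cosine similarity) for a diffusion sampler and a CLIP-style image-text score, and every name in it is illustrative.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_of_n(
    generate: Callable[[], Vector],
    score: Callable[[Vector], float],
    n_candidates: int,
) -> Tuple[Vector, float]:
    """Draw n_candidates samples and return the one the verifier scores highest."""
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    best_sample = generate()
    best_score = score(best_sample)
    for _ in range(n_candidates - 1):
        candidate = generate()
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_sample, best_score = candidate, candidate_score
    return best_sample, best_score


def make_toy_generator(seed: int, dim: int = 8) -> Callable[[], Vector]:
    """Hypothetical stand-in for a diffusion sampler: returns a random embedding."""
    rng = random.Random(seed)
    return lambda: [rng.gauss(0.0, 1.0) for _ in range(dim)]


# Hypothetical stand-in for the text embedding of the prompt.
PROMPT_EMBEDDING: Vector = [1.0] * 8


def toy_verifier(sample: Vector) -> float:
    """Hypothetical stand-in for a CLIP-style image-text similarity score."""
    return cosine_similarity(sample, PROMPT_EMBEDDING)


if __name__ == "__main__":
    # More candidates means more compute at inference time and, with a fixed
    # seed, a best verifier score that can only stay equal or increase.
    for n in (1, 4, 16):
        _, s = best_of_n(make_toy_generator(seed=0), toy_verifier, n)
        print(f"N={n:2d}  best verifier score={s:.3f}")
```

In a real pipeline the generator would wrap the adapted diffusion model and the verifier would score decoded images against the text prompt. The selection loop stays the same, and the number of candidates is the knob that trades extra inference compute for sample quality.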

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Face recognition and analysis