Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation
Xinyu Bai, Feng Xu

TL;DR
This paper introduces a novel training framework that combines diffusion models and adversarial learning to accelerate SAR-to-optical image translation, achieving 131 times faster inference while maintaining high image quality.
Contribution
The proposed method effectively reduces inference steps through consistency distillation and incorporates adversarial learning to preserve image quality, offering a flexible speed-quality trade-off.
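A minimal sketch of how a consistency distillation term and an adversarial term might be combined into one training objective. The summary does not give the paper's loss formulation, so everything here is an assumption made for illustration: the function names, the mean-squared consistency distance, the non-saturating GAN generator loss, and the `adv_weight` value.

```python
import numpy as np

def consistency_loss(student_pred, target_pred):
    """Mean squared distance between the student's prediction from a noisier
    step and the target prediction from a less noisy step (assumed distance)."""
    student_pred = np.asarray(student_pred, dtype=np.float64)
    target_pred = np.asarray(target_pred, dtype=np.float64)
    return float(np.mean((student_pred - target_pred) ** 2))

def generator_adversarial_loss(disc_logits_fake):
    """Non-saturating GAN generator loss: -log(sigmoid(logit)), written as
    softplus(-logit) for numerical stability. This is a stand-in choice,
    not necessarily the loss used in the paper."""
    logits = np.asarray(disc_logits_fake, dtype=np.float64)
    return float(np.mean(np.logaddexp(0.0, -logits)))

def total_loss(student_pred, target_pred, disc_logits_fake, adv_weight=0.1):
    """Weighted sum of the two terms; adv_weight is a hypothetical hyperparameter."""
    return (consistency_loss(student_pred, target_pred)
            + adv_weight * generator_adversarial_loss(disc_logits_fake))
```

In this sketch a larger `adv_weight` pushes the student toward outputs the discriminator accepts, while a smaller one keeps it closer to the teacher's trajectory.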
Findings
Inference is 131 times faster
Image quality remains high as measured by PSNR, SSIM, and FID
Effective on SEN12 and GF3 datasets
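For readers unfamiliar with the metrics above, PSNR is the simplest of the three to compute. The following is a sketch of the standard definition, assuming 8-bit images with a peak value of 255; the function name and the `max_value` default are illustrative and not taken from the paper's evaluation code.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher PSNR means the translated image is closer, pixel by pixel, to the reference optical image. SSIM and FID capture structural and distributional similarity respectively and need more machinery than fits here.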
Abstract
Synthetic Aperture Radar (SAR) provides all-weather, high-resolution imaging capabilities, but its unique imaging mechanism often requires expert interpretation, limiting its widespread applicability. Translating SAR images into more easily recognizable optical images using diffusion models helps address this challenge. However, diffusion models suffer from high latency due to numerous iterative inferences, while Generative Adversarial Networks (GANs) can achieve image translation with just a single iteration but often at the cost of image quality. To overcome these issues, we propose a new training framework for SAR-to-optical image translation that combines the strengths of both approaches. Our method employs consistency distillation to reduce iterative inference steps and integrates adversarial learning to ensure image clarity and minimize color shifts. Additionally, our approach…
Taxonomy
Topics: Image Processing Techniques and Applications · Advanced Image Processing Techniques · Digital Media Forensic Detection
Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
