FBSDiff++: Improved Frequency Band Substitution of Diffusion Features for Efficient and Highly Controllable Text-Driven Image-to-Image Translation
Xiang Gao, Yunpeng Jia

TL;DR
FBSDiff++ introduces a frequency-domain approach for efficient, controllable, and versatile text-driven image-to-image translation by substituting diffusion features across frequency bands without retraining.
Contribution
The paper presents FBSDiff++, an improved framework that accelerates inference, handles arbitrary image resolutions, and enables localized and style-specific image manipulations in I2I translation.
Findings
Achieves 8.9× inference speedup
Outperforms related methods in visual quality and controllability
Supports arbitrary resolution and localized editing
Abstract
With large-scale text-to-image (T2I) diffusion models achieving significant advancements in open-domain image creation, increasing attention has been focused on their natural extension to the realm of text-driven image-to-image (I2I) translation, where a source image acts as visual guidance to the generated image in addition to the textual guidance provided by the text prompt. We propose FBSDiff, a novel framework adapting off-the-shelf T2I diffusion model into the I2I paradigm from a fresh frequency-domain perspective. Through dynamic frequency band substitution of diffusion features, FBSDiff realizes versatile and highly controllable text-driven I2I in a plug-and-play manner (without need for model training, fine-tuning, or online optimization), allowing appearance-guided, layout-guided, and contour-guided I2I translation by progressively substituting low-frequency band, mid-frequency…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Digital Humanities and Scholarship
