DLSF: Dual-Layer Synergistic Fusion for High-Fidelity Image Synthesis
Zhen-Qi Chen, Yuan-Fu Yang

TL;DR
This paper introduces DLSF, a dual-layer fusion framework that improves feature aggregation in diffusion-based image synthesis models, leading to better semantic alignment and preservation of fine details in complex scenes.
Contribution
It proposes a novel dual-latent integration method with adaptive fusion modules, enhancing feature interaction and detail preservation in high-fidelity image synthesis.
Findings
Enhanced semantic alignment in synthesized images.
Improved preservation of fine-grained details.
Effective feature fusion demonstrated in experiments.
Abstract
With the rapid advancement of diffusion-based generative models, Stable Diffusion (SD) has emerged as a state-of-the-art framework for high-fidelity image synthesis. However, existing SD models suffer from suboptimal feature aggregation, leading to incomplete semantic alignment and loss of fine-grained details, especially in highly textured and complex scenes. To address these limitations, we propose a novel dual-latent integration framework that enhances feature interactions between the base latent and refined latent representations. Our approach employs a feature concatenation strategy followed by an adaptive fusion module, which can be instantiated as either (i) an Adaptive Global Fusion (AGF) for hierarchical feature harmonization, or (ii) a Dynamic Spatial Fusion (DSF) for spatially-aware refinement. This design enables more effective cross-latent communication, preserving…
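The abstract describes concatenating the base and refined latents and then passing them through an adaptive fusion module, but it does not give the internals of AGF or DSF. The sketch below is only an illustration of the general idea under assumed forms: a globally pooled, channel-wise gate stands in for a "global" fusion and a per-pixel gate stands in for a "spatial" fusion. The function names, the gating formulas, and the random weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def global_gate_fusion(base, refined, weight, bias):
    """Hypothetical global fusion: one gate per channel.

    base, refined: arrays of shape (C, H, W)
    weight: (C, 2C), bias: (C,)  -- assumed learned parameters
    """
    stacked = np.concatenate([base, refined], axis=0)  # (2C, H, W)
    pooled = stacked.mean(axis=(1, 2))                 # global average pool -> (2C,)
    gate = sigmoid(weight @ pooled + bias)             # (C,), values in (0, 1)
    gate = gate[:, None, None]                         # broadcast over H and W
    return gate * base + (1.0 - gate) * refined


def spatial_gate_fusion(base, refined, weight, bias):
    """Hypothetical spatial fusion: one gate per pixel.

    weight: (2C,), bias: scalar -- acts like a 1x1 convolution to one channel
    """
    stacked = np.concatenate([base, refined], axis=0)                # (2C, H, W)
    logits = np.tensordot(weight, stacked, axes=([0], [0])) + bias   # (H, W)
    gate = sigmoid(logits)[None, :, :]                               # broadcast over C
    return gate * base + (1.0 - gate) * refined


# Toy latents; the sizes are arbitrary and chosen only for the example.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
base = rng.standard_normal((C, H, W))
refined = rng.standard_normal((C, H, W))

fused_global = global_gate_fusion(
    base, refined, rng.standard_normal((C, 2 * C)), np.zeros(C)
)
fused_spatial = spatial_gate_fusion(
    base, refined, rng.standard_normal(2 * C), 0.0
)
```

In both assumed variants the output is a convex combination of the two latents, so each fused value lies between the corresponding base and refined values. The only difference is whether the mixing weight varies per channel or per spatial location.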
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
