FSATFusion: Frequency-Spatial Attention Transformer for Infrared and Visible Image Fusion
Tianpei Zhang, Jufeng Zhao, Yiming Zhu, Guangmang Cui, Yuhan Lyu

TL;DR
FSATFusion is a novel transformer-based network that enhances infrared and visible image fusion by capturing global context and discriminative features, leading to superior fusion quality and generalization in downstream tasks.
Contribution
The paper introduces a frequency-spatial attention Transformer (FSAT) module and an improved Transformer to strengthen global feature extraction in image fusion.
Findings
Outperforms state-of-the-art methods in fusion quality.
Demonstrates strong generalization across tasks.
Shows superior performance on downstream object detection tasks.
Abstract
Infrared and visible image fusion (IVIF) is receiving increasing attention from both the research community and industry due to its excellent results in downstream applications. Existing deep learning approaches often use convolutional neural networks to extract image features. However, the inherently limited capacity of convolution operations to capture global context can lead to information loss, thereby restricting fusion performance. To address this limitation, we propose an end-to-end fusion network named the Frequency-Spatial Attention Transformer Fusion Network (FSATFusion). FSATFusion contains a frequency-spatial attention Transformer (FSAT) module designed to effectively capture discriminative features from source images. This FSAT module includes a frequency-spatial attention mechanism (FSAM) capable of extracting significant features from feature maps. Additionally, we…
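The abstract does not spell out how the FSAM combines its two branches, so the following is only an illustrative sketch of what a frequency-plus-spatial attention step over a feature map could look like, not the authors' implementation. Every name here (`frequency_attention`, `spatial_attention`, `fsam_sketch`) is hypothetical, and the design choices are assumptions: an amplitude-based sigmoid gate in the Fourier domain, a parameter-free channel-pooled spatial map, and a residual sum to combine them.

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def frequency_attention(feat):
    """Hypothetical frequency branch for a real (C, H, W) feature map.

    Assumption: each frequency bin is gated by a sigmoid of its
    standardized amplitude, so stronger components are emphasized.
    """
    spec = np.fft.fft2(feat, axes=(-2, -1))            # per-channel 2D FFT
    amp = np.abs(spec)
    mean = amp.mean(axis=(-2, -1), keepdims=True)
    std = amp.std(axis=(-2, -1), keepdims=True) + 1e-8
    gate = _sigmoid((amp - mean) / std)                # values in (0, 1)
    # The gate is symmetric for real inputs, so the result is real
    # up to floating-point error.
    return np.real(np.fft.ifft2(spec * gate, axes=(-2, -1)))

def spatial_attention(feat):
    """Hypothetical spatial branch: pool over channels, then gate pixels.

    Assumption: mean and max channel pooling are summed directly;
    a learned convolution would normally sit here.
    """
    avg_map = feat.mean(axis=0)                        # (H, W)
    max_map = feat.max(axis=0)                         # (H, W)
    attn = _sigmoid(avg_map + max_map)                 # values in (0, 1)
    return feat * attn[None, :, :]

def fsam_sketch(feat):
    """Apply the frequency gate, then the spatial gate, with a residual sum (assumed)."""
    return feat + spatial_attention(frequency_attention(feat))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 8, 8))                 # toy (C, H, W) features
    print(fsam_sketch(x).shape)
```

A trained module would replace the fixed gates with learned weights; the sketch only shows how attention in the Fourier domain and attention over pixel locations can act on the same feature map while preserving its shape.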
Taxonomy
Topics: Advanced Image Fusion Techniques · Image Enhancement Techniques · Advanced Neural Network Applications
Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · Attention Is All You Need · Convolution
