Infrared and Visible Image Fusion with Hierarchical Human Perception
Guang Yang, Jie Li, Xin Liu, Zhusi Zhong, and Xinbo Gao

TL;DR
This paper presents HPFusion, a novel image fusion method that uses a large vision-language model to incorporate hierarchical human perception, enhancing the fused image for better information preservation and visual appeal.
Contribution
It introduces a new fusion approach leveraging human semantic priors via vision-language models, addressing the lack of human perception considerations in existing methods.
Findings
High-quality fusion results with improved information preservation
Enhanced visual perception in fused images
Effective incorporation of human semantic priors
Abstract
Image fusion combines images from multiple domains into a single image that contains complementary information from the source domains. Existing methods use pixel intensity, texture, and high-level vision task information as the criteria for deciding which information to preserve, but they lack enhancement for human perception. We introduce an image fusion method, Hierarchical Perception Fusion (HPFusion), which leverages a Large Vision-Language Model to incorporate hierarchical human semantic priors, preserving complementary information that satisfies the human visual system. We propose multiple questions that humans focus on when viewing an image pair, and the Large Vision-Language Model generates answers according to the images. The answer texts are encoded into the fusion network, and the optimization also aims to guide the human semantic distribution of the fused image to be more similar to source…
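The question-and-answer guidance described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The questions, the `stub_lvlm_answer` placeholder, the hashed bag-of-words `embed_text` encoder, and the cosine-based `semantic_alignment_loss` are all assumptions made for illustration. Plain strings also stand in for images here, whereas a real system would pass the image pair to a vision-language model.

```python
import hashlib
import math

# Hypothetical questions; the paper's actual prompts are not given in this summary.
QUESTIONS = [
    "What objects are present in the scene?",
    "Which regions stand out the most?",
]


def stub_lvlm_answer(question, image_description):
    """Placeholder for a vision-language model call (an assumption, not a real API).

    A real model would receive images; here a text description stands in.
    """
    return f"{question} {image_description}"


def embed_text(text, dim=16):
    """Toy hashed bag-of-words embedding standing in for a real text encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket each token by its first hash byte
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_alignment_loss(source_description, fused_description):
    """Mean (1 - cosine similarity) between answer embeddings over all questions."""
    losses = []
    for question in QUESTIONS:
        src = embed_text(stub_lvlm_answer(question, source_description))
        fus = embed_text(stub_lvlm_answer(question, fused_description))
        losses.append(1.0 - cosine_similarity(src, fus))
    return sum(losses) / len(losses)
```

In this sketch, a fused image whose answers match those of the source pair yields a loss near zero, and diverging answers push the loss toward one. A trainable fusion network would need a differentiable encoder in place of the hashed embedding.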
Taxonomy
Topics: Advanced Image Fusion Techniques · Infrared Target Detection Methodologies · Remote-Sensing Image Classification
