MAVFusion: Efficient Infrared and Visible Video Fusion via Motion-Aware Sparse Interaction

Xilai Li; Weijun Jiang; Xiaosong Li; Yang Liu; Hongbin Wang; Tao Ye; Huafeng Li; Haishu Tan

arXiv:2604.01958 · cs.CV · April 3, 2026


TL;DR

MAVFusion is an end-to-end framework for fusing infrared and visible videos. It uses optical flow to find dynamic regions and applies costly cross-modal attention only there, which improves efficiency while maintaining fusion quality.

Contribution

The paper introduces a motion-aware sparse interaction mechanism that uses optical flow to locate dynamic regions and allocates computationally intensive cross-modal attention only to those sparse areas, reducing computation while maintaining fusion quality.

Findings

01. Achieves state-of-the-art performance on multiple benchmarks.

02. Runs at 14.16 FPS at 640×480 resolution.

03. Effectively preserves temporal consistency and details.

Abstract

Infrared and visible video fusion combines the object saliency from infrared images with the texture details from visible images to produce semantically rich fusion results. However, most existing methods are designed for static image fusion and cannot effectively handle frame-to-frame motion in videos. Current video fusion methods improve temporal consistency by introducing interactions across frames, but they often require high computational cost. To mitigate these challenges, we propose MAVFusion, an end-to-end video fusion framework featuring a motion-aware sparse interaction mechanism that enhances efficiency while maintaining superior fusion quality. Specifically, we leverage optical flow to identify dynamic regions in multi-modal sequences, adaptively allocating computationally intensive cross-modal attention to these sparse areas to capture salient transitions and facilitate…
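The idea described in the abstract (use optical flow to find dynamic regions, then spend cross-modal attention only there) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `patch_motion_mask` and `sparse_cross_modal_fusion`, the patch size, the magnitude threshold, the unprojected single-head attention, and the plain averaging used for static tokens are all assumptions made for illustration.

```python
import numpy as np

def patch_motion_mask(flow, patch=8, thresh=1.0):
    """Mark patches whose mean optical-flow magnitude exceeds a threshold.

    flow: (H, W, 2) array of per-pixel displacements (assumed precomputed).
    Returns a boolean array of shape (H // patch, W // patch).
    The patch size and threshold are hypothetical values.
    """
    h, w, _ = flow.shape
    gh, gw = h // patch, w // patch
    # Per-pixel flow magnitude, cropped to a whole number of patches.
    mag = np.linalg.norm(flow[:gh * patch, :gw * patch], axis=-1)
    # Mean magnitude inside each patch.
    per_patch = mag.reshape(gh, patch, gw, patch).mean(axis=(1, 3))
    return per_patch > thresh

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_cross_modal_fusion(ir_tokens, vis_tokens, mask):
    """Fuse two token sets, running attention only on dynamic tokens.

    ir_tokens, vis_tokens: (N, C) arrays, one token per patch.
    mask: (N,) boolean array, True where the patch is dynamic.
    """
    # Cheap path (an assumption): static tokens are simply averaged.
    fused = 0.5 * (ir_tokens + vis_tokens)
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return fused
    # Expensive path: infrared queries attend to visible keys/values,
    # restricted to the dynamic tokens only.
    q, k, v = ir_tokens[idx], vis_tokens[idx], vis_tokens[idx]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    fused[idx] = q + attn @ v  # residual connection
    return fused
```

With M dynamic tokens out of N, the attention matrix in this sketch has M × M entries instead of N × N, which is where the saving comes from when motion covers only a small part of the frame.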


Code & Models

Repositories

ixilai/MAVFusion (GitHub)
