Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation

Zhengwen Shen; Yulian Li; Han Zhang; Yuchen Weng; Jun Wang

arXiv:2501.10958 · cs.CV · January 22, 2025

TL;DR

This paper introduces EFNet, a lightweight multimodal fusion network that employs early fusion and feature clustering to improve RGB-T image segmentation efficiency and accuracy in low-light conditions.

Contribution

The paper proposes a novel early fusion strategy with simple feature clustering and a multi-scale decoder, reducing parameters and computation while enhancing segmentation performance.

Findings

01 Outperforms state-of-the-art methods on multiple datasets.

02 Uses fewer parameters and less computation.

03 Achieves improved segmentation accuracy in low-light conditions.

Abstract

RGB and thermal image fusion has great potential to improve semantic segmentation in low-illumination conditions. Existing methods typically employ a two-branch encoder framework for multimodal feature extraction and design complicated fusion strategies to combine the extracted features for multimodal semantic segmentation. However, these methods require massive parameter updates and computational effort during feature extraction and fusion. To address this issue, we propose a novel multimodal fusion network (EFNet) based on an early fusion strategy and a simple but effective feature clustering for training efficient RGB-T semantic segmentation. In addition, we propose a lightweight and efficient multi-scale feature aggregation decoder based on Euclidean distance. We validate the effectiveness of our method on different datasets and outperform previous…
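The abstract does not say how the early fusion or the feature clustering is implemented, so the following is only a minimal sketch under stated assumptions, not EFNet itself. It assumes early fusion means concatenating the thermal value onto each RGB pixel before any encoder runs, and it uses nearest-center assignment by Euclidean distance as a stand-in for the clustering step. The function names `early_fuse` and `assign_to_nearest`, the pixel values, and the centers are all hypothetical.

```python
import math


def early_fuse(rgb, thermal):
    """Concatenate each RGB pixel (3 values) with its thermal value into a 4-channel pixel.

    Assumption: "early fusion" is modelled here as plain channel concatenation.
    """
    if len(rgb) != len(thermal):
        raise ValueError("rgb and thermal must contain the same number of pixels")
    return [tuple(pixel) + (heat,) for pixel, heat in zip(rgb, thermal)]


def assign_to_nearest(features, centers):
    """Return, for each feature vector, the index of the closest center (Euclidean distance).

    Assumption: a stand-in for feature clustering; the paper's actual rule may differ.
    """
    return [
        min(range(len(centers)), key=lambda k: math.dist(feature, centers[k]))
        for feature in features
    ]


# Three toy pixels: the last two are almost identical in RGB (both dark),
# but differ strongly in the thermal channel.
rgb = [(0.9, 0.1, 0.1), (0.1, 0.1, 0.1), (0.1, 0.1, 0.12)]
thermal = [0.2, 0.9, 0.1]

fused = early_fuse(rgb, thermal)

# Hypothetical cluster centers in the fused 4-channel space.
centers = [
    (0.9, 0.1, 0.1, 0.2),
    (0.1, 0.1, 0.1, 0.9),
    (0.1, 0.1, 0.1, 0.1),
]
labels = assign_to_nearest(fused, centers)
print(labels)
```

In this toy input the two dark pixels are nearly indistinguishable in RGB alone; the thermal channel added by the fusion step is what separates them, which is the intuition behind using RGB-T data in low light.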

Taxonomy

Topics: Image Retrieval and Classification Techniques