EIFNet: Leveraging Event-Image Fusion for Robust Semantic Segmentation

Zhijiang Li; Haoran He

arXiv:2507.21971 · cs.CV · July 30, 2025

TL;DR

EIFNet is a novel multi-modal fusion network that combines event camera data and images using attention mechanisms to improve semantic segmentation in challenging environments.

Contribution

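The paper introduces EIFNet, a fusion architecture that pairs an Adaptive Event Feature Refinement Module (AEFRM) for refining event features with a Modality-Adaptive Recalibration Module (MARM) and a Multi-Head Attention Gated Fusion Module for adaptively integrating event and image features.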

Findings

01 Achieves state-of-the-art results on the DDD17-Semantic dataset

02 Fuses event and image data effectively using attention mechanisms

03 Improves robustness under challenging lighting and in dynamic scenes

Abstract

Event-based semantic segmentation explores the potential of event cameras, which offer high dynamic range and fine temporal resolution, to achieve robust scene understanding in challenging environments. Despite these advantages, the task remains difficult due to two main challenges: extracting reliable features from sparse and noisy event streams, and effectively fusing them with dense, semantically rich image data that differ in structure and representation. To address these issues, we propose EIFNet, a multi-modal fusion network that combines the strengths of both event and frame-based inputs. The network includes an Adaptive Event Feature Refinement Module (AEFRM), which improves event representations through multi-scale activity modeling and spatial attention. In addition, we introduce a Modality-Adaptive Recalibration Module (MARM) and a Multi-Head Attention Gated Fusion Module…
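
To make the idea of gated fusion concrete, here is a minimal sketch of a generic scalar gate that blends an event feature vector with an image feature vector. It is not the paper's Multi-Head Attention Gated Fusion Module, whose details are not given in the truncated abstract above. The function name `gated_fusion`, the weight vectors `w_event` and `w_image`, and the single scalar gate are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(event_feat, image_feat, w_event, w_image, bias=0.0):
    """Blend two equal-length feature vectors with one learned scalar gate.

    Hypothetical illustration only:
        gate  = sigmoid(w_event . event_feat + w_image . image_feat + bias)
        fused = gate * event_feat + (1 - gate) * image_feat
    A gate near 1 trusts the event features; a gate near 0 trusts the image.
    """
    n = len(event_feat)
    if not (len(image_feat) == len(w_event) == len(w_image) == n):
        raise ValueError("all inputs must have the same length")

    # The gate logit is a linear function of both modalities.
    logit = bias
    logit += sum(w * e for w, e in zip(w_event, event_feat))
    logit += sum(w * i for w, i in zip(w_image, image_feat))
    gate = sigmoid(logit)

    # Convex combination of the two modalities, element by element.
    fused = [gate * e + (1.0 - gate) * i for e, i in zip(event_feat, image_feat)]
    return fused, gate


if __name__ == "__main__":
    event = [1.0, 0.0]
    image = [0.0, 1.0]
    zero_weights = [0.0, 0.0]
    fused, gate = gated_fusion(event, image, zero_weights, zero_weights)
    print(gate, fused)
```

With all weights and the bias set to zero, the gate is exactly 0.5 and the output is the element-wise average of the two inputs. In a trained network the weights would be learned, and the gate would typically be computed per channel or per spatial location rather than as a single scalar.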
