Sparse Hypergraph-Enhanced Frame-Event Object Detection with Fine-Grained MoE
Wei Bao, Yuehan Wang, Tianhang Zhou, Siqi Li, and Yue Gao

TL;DR
This paper introduces Hyper-FEOD, a novel multi-modal object detection framework that efficiently fuses RGB and event data using sparse hypergraph modeling and a fine-grained mixture of experts, achieving high accuracy with low computational cost.
Contribution
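The paper presents Sparse Hypergraph-enhanced Cross-Modal Fusion (S-HCF), a hypergraph-based fusion method that operates only on event-selected sparse tokens, together with a fine-grained mixture-of-experts (MoE) module, for multi-modal object detection under dynamic conditions.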
Findings
Hyper-FEOD outperforms state-of-the-art methods on RGB-Event benchmarks.
The approach achieves a better accuracy-efficiency trade-off than the methods it is compared against.
It maintains a lightweight design suitable for real-time edge deployment.
Abstract
Integrating frame-based RGB cameras with event streams offers a promising solution for robust object detection under challenging dynamic conditions. However, the inherent heterogeneity and data redundancy of these modalities often lead to prohibitive computational overhead or suboptimal feature fusion. In this paper, we propose Hyper-FEOD, a high-performance and efficient detection framework, which synergistically optimizes multi-modal interaction through two core components. First, we introduce Sparse Hypergraph-enhanced Cross-Modal Fusion (S-HCF), which leverages the inherent sparsity of event streams to construct an event-guided activity map. By performing high-order hypergraph modeling exclusively on selected motion-critical sparse tokens, S-HCF captures complex non-local dependencies between RGB and event data while overcoming the traditional complexity bottlenecks of hypergraph…
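The S-HCF idea described in the abstract (an event-guided activity map, selection of motion-critical sparse tokens, then hypergraph modeling on those tokens only) can be illustrated with a minimal NumPy sketch. The abstract does not say how the activity map is computed, how tokens are selected, or how hyperedges are built, so everything below is an assumption made for illustration: per-patch event counts as the activity map, top-k selection, k-nearest-neighbour hyperedges over the joint RGB and event tokens, and the standard normalized hypergraph convolution (Dv^-1/2 H De^-1 H^T Dv^-1/2 X Theta). All function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np


def event_activity_map(event_counts, patch):
    """Sum per-pixel event counts inside each non-overlapping patch (assumed definition)."""
    h, w = event_counts.shape
    gh, gw = h // patch, w // patch
    trimmed = event_counts[: gh * patch, : gw * patch]
    return trimmed.reshape(gh, patch, gw, patch).sum(axis=(1, 3))


def select_active_tokens(activity, k):
    """Return the sorted flat indices of the k most active patches (assumed top-k rule)."""
    flat = activity.ravel()
    k = min(k, flat.size)
    return np.sort(np.argsort(-flat, kind="stable")[:k])


def knn_incidence(feats, n_neighbors):
    """Build an incidence matrix H (vertices x hyperedges): one hyperedge per vertex,
    containing that vertex and its nearest neighbours in feature space."""
    n = feats.shape[0]
    dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    nearest = np.argsort(dist, axis=1)[:, : min(n_neighbors + 1, n)]
    H = np.zeros((n, n))
    for e in range(n):
        H[nearest[e], e] = 1.0
        H[e, e] = 1.0  # make sure every vertex belongs to its own hyperedge
    return H


def hypergraph_conv(X, H, theta):
    """One normalized hypergraph convolution with unit hyperedge weights, followed by ReLU."""
    dv_inv_sqrt = 1.0 / np.sqrt(H.sum(axis=1))  # vertex degrees
    de_inv = 1.0 / H.sum(axis=0)                # hyperedge degrees
    left = (dv_inv_sqrt[:, None] * H) * de_inv[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return np.maximum(left @ right @ X @ theta, 0.0)


def sparse_hypergraph_fusion(rgb_tokens, evt_tokens, activity, k, n_neighbors, theta):
    """Fuse RGB and event tokens only at the k most active positions.

    rgb_tokens, evt_tokens: (N, C) arrays, one row per patch in row-major order.
    Returns the fused (N, C) tokens and the selected indices.
    """
    idx = select_active_tokens(activity, k)
    m = idx.size
    # Vertices are the selected tokens of both modalities, so hyperedges can span modalities.
    verts = np.concatenate([rgb_tokens[idx], evt_tokens[idx]], axis=0)
    H = knn_incidence(verts, n_neighbors)
    out = hypergraph_conv(verts, H, theta)
    fused = rgb_tokens.copy()
    fused[idx] += out[:m] + out[m:]  # residual update; inactive tokens pass through unchanged
    return fused, idx


# Usage: a 32x32 event-count image with motion confined to one 8x8 region.
rng = np.random.default_rng(0)
events = np.zeros((32, 32))
events[8:16, 8:16] = rng.poisson(5.0, size=(8, 8))
activity = event_activity_map(events, patch=8)   # 4x4 grid of patches
rgb = rng.normal(size=(16, 8))                   # 16 tokens, 8 channels
evt = rng.normal(size=(16, 8))
theta = rng.normal(size=(8, 8)) * 0.1
fused, selected = sparse_hypergraph_fusion(rgb, evt, activity, k=4, n_neighbors=2, theta=theta)
```

In this sketch the pairwise-distance step scales with the square of the number of selected vertices (2k) rather than the square of the full token count, which is one way to read the abstract's claim that restricting hypergraph modeling to sparse tokens avoids the usual complexity bottleneck.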
