Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection
Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois Knoll

TL;DR
This paper introduces a hierarchical feature refinement network for fusing event data and frame images to improve object detection, especially under challenging conditions, outperforming state-of-the-art methods.
Contribution
The proposed coarse-to-fine fusion module with bidirectional interaction and adaptive refinement effectively combines heterogeneous modalities for enhanced detection.
Findings
Outperforms the state of the art by 8.0% on the DSEC dataset
Shows significantly improved robustness under image corruption
Effective fusion of event and frame data for object detection
Abstract
In frame-based vision, object detection faces substantial performance degradation under challenging conditions due to the limited sensing capability of conventional cameras. Event cameras output sparse and asynchronous events, providing a potential solution to solve these problems. However, effectively fusing two heterogeneous modalities remains an open issue. In this work, we propose a novel hierarchical feature refinement network for event-frame fusion. The core concept is the design of the coarse-to-fine fusion module, denoted as the cross-modality adaptive feature refinement (CAFR) module. In the initial phase, the bidirectional cross-modality interaction (BCI) part facilitates information bridging from two distinct sources. Subsequently, the features are further refined by aligning the channel-level mean and variance in the two-fold adaptive feature refinement (TAFR) part. We…
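The abstract says the TAFR part refines features by aligning channel-level mean and variance, but the truncated text gives no formula. As a rough illustration of what such an alignment can look like, the sketch below re-normalizes one feature map so that each channel takes on the mean and standard deviation of another, in the style of adaptive instance normalization. The function name `align_channel_stats`, the `(C, H, W)` layout, the `eps` value, and the random inputs are all assumptions made for this example; it is not the authors' implementation, and it shows only one direction of what the paper calls a two-fold refinement.

```python
import numpy as np


def align_channel_stats(content, reference, eps=1e-5):
    """Give each channel of `content` the mean and std of `reference`.

    Hypothetical helper for illustration; both arrays have shape (C, H, W).
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    r_mean = reference.mean(axis=(1, 2), keepdims=True)
    r_std = reference.std(axis=(1, 2), keepdims=True)

    # Standardize `content`, then rescale and shift to the reference statistics.
    return (content - c_mean) / (c_std + eps) * r_std + r_mean


# Stand-in feature maps (random data, not real event or frame features).
rng = np.random.default_rng(0)
frame_feat = rng.normal(loc=2.0, scale=3.0, size=(4, 8, 8))
event_feat = rng.normal(loc=-1.0, scale=0.5, size=(4, 8, 8))

# Align the event features to the frame features' channel statistics.
aligned = align_channel_stats(event_feat, frame_feat)
```

Swapping the two arguments gives the opposite direction, aligning frame features to event statistics. How the paper combines the two directions, and whether the statistics are computed or learned, is not stated in the text above.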
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Industrial Vision Systems and Defect Detection · Image Processing and 3D Reconstruction
